Commit graph

304 commits

Author SHA1 Message Date
bibabolynn
7a9d1546d4 [java] Fix getWorker and add create multi actors test (#4472) 2019-03-26 20:26:13 +08:00
Ruifang Chen
59d74d5e92 [Java] Build Java code with Bazel (#4284) 2019-03-22 14:30:05 +08:00
Hao Chen
d03999d01e
Cross-language invocation Part 1: Java calling Python functions and actors (#4166) 2019-03-21 13:34:21 +08:00
Hao Chen
f8d12b0418
[Java] Package native dependencies into jar (#4367) 2019-03-15 12:38:40 +08:00
Hao Chen
f0465bc68c
[Java] Refine tests and fix single-process mode (#4265) 2019-03-07 09:59:13 +08:00
Wang Qing
a116b7f646 [Java] Add runtime context (#4194) 2019-03-05 20:25:29 +08:00
bibabolynn
c73d5086f3 [Java] Single-process mode (#4245) 2019-03-05 13:50:20 +08:00
Hao Chen
a99676e39b [Java] lint unused imports (#4100) 2019-02-20 12:37:04 -08:00
Hao Chen
de17443dc2
Propagate backend error to worker (#4039) 2019-02-16 11:39:15 +08:00
Hao Chen
f31a79f3f7
Implement actor checkpointing (#3839)
* Implement Actor checkpointing

* docs

* fix

* fix

* fix

* move restore-from-checkpoint to HandleActorStateTransition

* Revert "move restore-from-checkpoint to HandleActorStateTransition"

This reverts commit 9aa4447c1e3e321f42a1d895d72f17098b72de12.

* resubmit waiting tasks when actor frontier restored

* add doc about num_actor_checkpoints_to_keep=1

* add num_actor_checkpoints_to_keep to Cython

* add checkpoint_expired api

* check if actor class is abstract

* change checkpoint_ids to long string

* implement java

* Refactor to delay actor creation publish until checkpoint is resumed

* debug, lint

* Erase from checkpoints to restore if task fails

* fix lint

* update comments

* avoid duplicated actor notification log

* fix unintended change

* add actor_id to checkpoint_expired

* small java updates

* make checkpoint info per actor

* lint

* Remove logging

* Remove old actor checkpointing Python code, move new checkpointing code to FunctionActionManager

* Replace old actor checkpointing tests

* Fix test and lint

* address comments

* consolidate kill_actor

* Remove __ray_checkpoint__

* fix non-ascii char

* Loosen test checks

* fix java

* fix sphinx-build
2019-02-13 19:39:02 +08:00
Wang Qing
c523bc04ad Enable redis password in Java worker (#3943)
* Support Java redis password

* Fix

* Refine

* Fix lint.
2019-02-12 13:11:25 +08:00
Wang Qing
bc438ca73b [Java] Refine Java config item (#4014)
* Refine

* Address comment.
2019-02-11 23:55:40 +08:00
Yuhong Guo
3a66d47a3a
Remove RAY_CHECK from JNI code (#3978)
* Remove RAY_CHECK in JNI

* Try to add mvn test to test the exception.

* Refine

* Address comments
2019-02-09 18:10:22 +08:00
bibabolynn
728031a972 [java] when put an object in plasma store, ignore "object alreay exists" exception (#3687)
* distinct plasma client exception

* Update ObjectStoreProxy.java

* Update and rename PlasmaArrowTest.java to PlasmaStoreTest.java

* store put

* Use testng to replace junit to fix test failure
2019-02-09 18:03:17 +08:00
Ion
f987572795 Inline objects (#3756)
* added store_client_ to object_manager and node_manager

* half through...

* all code in, and compiling! Nothing tested though...

* something is working ;-)

* added a few more comments

* now, add only one entry to the in GCS for inlined objects

* more comments

* remove a spurious todo

* some comment updates

* add test

* added support for meta data for inline objects

* avoid some copies

* Initialize plasma client in tests

* Better comments. Enable configuring nline_object_max_size_bytes.

* Update src/ray/object_manager/object_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: istoica <istoica@cs.berkeley.edu>

* fiexed comments

* fixed various typos in comments

* updated comments in object_manager.h and object_manager.cc

* addressed all comments...hopefully ;-)

* Only add eviction entries for objects that are not inlined

* fixed a bunch of comments

* Fix test

* Fix object transfer dump test

* lint

* Comments

* Fix test?

* Fix test?

* lint

* fix build

* Fix build

* lint

* Use const ref

* Fixes, don't let object manager hang

* Increase object transfer retry time for travis?

* Fix test

* Fix test?

* Add internal config to java, fix PlasmaFreeTest
2019-02-07 10:32:39 -08:00
Wang Qing
e1c68a0881 Enable including Java worker for ray start command (#3838) 2019-02-04 16:23:43 +08:00
Wang Qing
3cf59855af [Java] Replace junit with testNG (#3768) 2019-01-14 17:49:17 +08:00
Hao Chen
1bb20badec [Java] Fix bug when actor creation task fails (#3740)
* [Java] Fix bug when actor creation task fails

* remove imports
2019-01-14 11:09:15 +08:00
Wang Qing
8674606e26 Support to auto-generate Java files from flatbuffer (#3749)
* auto gen flatbuffers for Java

* Add auto_gen_tool.py

* Refine

* Add a comment

* address comments.

* Address comments.

* Addressed

* Refine

* Address comments

* Fix typo

* Add exception

* Address comments.

* Refine

* Fix lint

* Fix

* Fix lint and address comment.

* Fix lint error
2019-01-13 11:39:23 -08:00
Wang Qing
a0cf8ee5a8 Refine Java worker code (#3735) 2019-01-12 22:45:33 +08:00
Wang Qing
692fdc6bc3 [Java] Allow actor handle to be serialized without forking (#3686) 2019-01-06 00:29:08 +08:00
Wang Qing
c59b506c6e [Java] Support calling Ray APIs from multiple threads (#3646) 2018-12-28 17:44:31 +08:00
Wang Qing
a971b73bbe [Java] Fix the issue when waiting an empty list or a null pointer (#3632) 2018-12-26 11:29:29 +08:00
Wang Qing
8393df2516 Use BaseTest to instead of TestListener. (#3577) 2018-12-21 16:29:16 -08:00
bibabolynn
e65b8f18f4 [java] change RayLog.core to org.slf4j.Logger (#3579) 2018-12-21 15:58:32 +08:00
bibabolynn
7fd24e384b [java] Pass large args by reference (#3504) 2018-12-14 23:32:35 +08:00
Hao Chen
e7b51cbd1b [xray] Implement Actor Reconstruction (#3332)
* Implement Actor Reconstruction

* fix

* fix actor handle __del__

* fix lint

* add comment

* Remove actorCreationDummyObjectId

* address comments

* fix

* address comments

* avoid copy

* change log to debug

* fix error name
2018-12-13 21:28:58 -08:00
Hao Chen
abd37df41e Add stress test for Java worker (#3424) 2018-12-01 16:11:09 -08:00
Robert Nishihara
658c14282c Remove legacy Ray code. (#3121)
* Remove legacy Ray code.

* Fix cmake and simplify monitor.

* Fix linting

* Updates

* Fix

* Implement some methods.

* Remove more plasma manager references.

* Fix

* Linting

* Fix

* Fix

* Make sure class IDs are strings.

* Some path fixes

* Fix

* Path fixes and update arrow

* Fixes.

* linting

* Fixes

* Java fixes

* Some java fixes

* TaskLanguage -> Language

* Minor

* Fix python test and remove unused method signature.

* Fix java tests

* Fix jenkins tests

* Remove commented out code.
2018-10-26 13:36:58 -07:00
Wang Qing
b410ee0d29 [Java] Support dynamically defining resources when submitting task. (#3070)
## What do these changes do?
Before this PR, if we want to specify some resources, we must do as following codes:
```java
@RayRemote(Resources={ResourceItem("CPU", 10)})
public static void f1() {
// do sth
}

@RayRemote(Resources={ResourceItem("CPU", 10)})
class Demo {
// sth
}
```
Unfortunately, it's no way for us to create another actor or task with different resources required.

After this PR, the thing will be:
```java
ActorCreationOptions option = new ActorCreationOptions(); 
option.resources.put("CPU", 4.0);
RayActor<Echo> echo1 = Ray.createActor(Echo::new, option);
option.resources.put("Res-A", 4.0);
RayActor<Echo> echo2 = Ray.createActor(Echo::new, option);


//if we don't specify resource,  the resources will be `{"cpu":0.0}` by default.
Ray.call(Echo::echo, echo2, 100);
```


## Related issue number
N/A
2018-10-19 06:22:32 -07:00
Wang Qing
4a2ed47b6c [Java] Improve some Java code (#3040)
This PR improves some java codes,  and removes some duplicated code.
2018-10-10 17:30:23 -07:00
Wang Qing
ef1f2fde95 Fix the uniqueId toString format. (#3035) 2018-10-08 13:12:14 -07:00
Wang Qing
84bf5fc8f3 [Java] Load driver resources from local path. (#3001)
## What do these changes do?
1. Add a configuration item `driver.resource-path`.
2. Load driver resources from the local path which is specified in the `ray.conf`.

Before this change, we should add all driver resources(like user's jar package, dependencies package and config files) into `classpath`.

After this change, we should add the driver resources into the mount path which we can configure it in `ray.conf`, and we shouldn't configure `classpath` for driver resources any more.

## Related issue number
N/A
2018-10-08 21:05:26 +01:00
Wang Qing
fcef4edd46 [Java] Fix the required-resources issue of actor member function in Java worker. (#3002)
This fixes a bug in which Java actor methods inherit the resource requirements of the actor creation task.
2018-10-01 12:56:36 -07:00
Hao Chen
4ffe1e3556 [Java] Fix: task spec's resource map should contain CPU (#2987) 2018-09-28 14:23:38 -05:00
Wang Qing
68cf194e90 [fix] Fix ray.home configuration item. (#2977)
If we set `ray.home` configuration item to `""`.
The current `RayConfig` will set it to current work directory, like `/User/My/Ray`.
But the some other configuration items(like `redisServerExecutablePath`) will be set to `/User/My/Ray//build/src/common/thirdparty/redis/src/redis-server` by mistake.
Note: There are 2 `/` between current work directory and `build/src/common....`

This PR will fix this issue.
2018-09-28 00:06:14 -05:00
Wang Qing
8e8e123777 [Java] Simplify Java worker configuration (#2938)
## What do these changes do?
Previously, Java worker configuration is complicated, because it requires setting environment variables as well as command-line arguments.

This PR aims to simplify Java worker's configuration. 
1) Configuration management is now migrated to [lightbend config](https://github.com/lightbend/config), thus doesn't require setting environment variables.
2) Many unused config items are removed.
3) Provide a simple `example.conf` file, so users can get started quickly.
4) All possible options and their default values are declared and documented in `ray.default.conf` file.

This PR also simplifies and refines the following code:
1) The process of `Ray.init()`.
2) `RunManager`.
3) `WorkerContext`. 

### How to use this configuration?
1. Copy `example.conf` into your classpath and rename it to `ray.conf`.
2. Modify/add your configuration items. The all items are declared in `ray.default.conf`.
3. You can also set the items in java system prosperities.

Note: configuration is read in this priority:
System properties > `ray.conf` > `ray.default.conf`

## Related issue number
N/A
2018-09-26 20:14:22 +08:00
Wang Qing
0e552fbb22 [Java] Update maven version to 0.1-SNAPSHOT
Update the version in maven from 0.1 to 0.1-SNAPSHOT, because SNAPSHOT is the conventional version name in dev process. Non-snapshot versions are only used for release.
2018-09-26 18:08:46 +08:00
Hao Chen
971df5ea8a [java] put function meta in task spec and load functions with function meta (#2881)
This PR adds a `function_desc` field into task spec. a function descriptor is a list of strings that can uniquely describe a function.
- For a Python function, it should be: [module_name, class_name, function_name]
- For a Java function, it should be: [class_name, method_name, type_descriptor]

There're a couple of purposes to add this field:

In this PR:
- Java worker needs to know function's class name to load it. Previously, since task spec didn't have such a field to hold this info, we did a hack by appending the class name to the argument list. With this change, we fixed that hack and significantly simplified function management in Java.

Will be done in subsequent PRs:
- Support cross-language invocation (#2576): currently Python worker manages functions by saving them in GCS and pass function id in task spec. However, if we want to call a Python function from Java, we cannot save it in GCS and get the function id. But instead, we can pass the function descriptor (module name, class name, function name) in task spec and use it to load the function.
- Support deployment: one major problem of Python worker's current function management mechanism is #2327. In prod env, we should have a mechanism to deploy code and dependencies to the cluster. And when code is already deployed, we don't need to save functions to GCS any more and can use `function_desc` to manage functions.
2018-09-25 23:05:05 -07:00
Robert Nishihara
f16d33593b Mark worker as blocked and trigger reconstruction in ray.wait. (#2864)
* Trigger reconstruction in ray.wait and mark worker as blocked.

* Add test.

* Linting.

* Don't run new test with legacy Ray.

* Only call HandleClientUnblocked if it actually blocked in ray.wait.

* Reduce time to ray.wait in the test.
2018-09-13 15:28:17 -07:00
Hao Chen
8414e413a2 [java] refine and simplify java worker code structure (#2838) 2018-09-10 10:48:17 -07:00
Wang Qing
7e13e1fd49 [Java] Remove non-raylet code in Java. (#2828) 2018-09-06 14:54:13 +08:00
Yuhong Guo
dfb7c2be1e [Java] Add Plasma Free to Java code path (#2802) 2018-09-04 15:28:23 +08:00
Hao Chen
9d655721e5 [java] support creating an actor with parameters (#2817)
Previously `Ray.createActor` only support creating an actor without any parameter. This PR adds the support for creating an actor with parameters. Moreover, besides using a constructor, it's now also allowed to create an actor with a factory method. For more usage, prefer refer to `ActorTest.java`.
2018-09-03 09:53:03 -07:00
Hao Chen
3b0a2c4197 [Java] improve Java API module (#2783)
API module (`ray/java/api` dir) includes all public APIs provided by Ray, it should be the only module that normal Ray users need to face.

The purpose of this PR to first improve the code quality of the API module. Subsequent PRs will improve other modules later. The changes of this PR include the following aspects: 
1) Only keep interfaces in api module, to hide implementation details from users and fix circular dependencies among modules.
2) Document everything in the api module. 
3) Improve naming.
4) Add more tests for API. 
5) Also fix/improve related code in other modules.
6) Remove some unused code.

(Apologize for posting such a large PR. Java worker code has been lack of maintenance for a while. There're a lot of code quality issues that need to be fixed. We plan to use a couple of large PRs to address them. After that, future changes will come in small PRs.)
2018-09-02 11:51:16 -07:00
Wang Qing
b4cba9a49f [java] Fix the logic of generating TaskID (#2747)
## What do these changes do?
Because the logic of generating `TaskID` in java is different from python's, there are many tests fail when we change the `Ray Core` code.
In this change,  I rewrote the logic of generating `TaskID` in java which is the same as the python's.

In java, we call the native method `_generateTaskId()` to generate a `TaskID` which is also used in python. We change `computePutId()`'s logic too.

## Related issue number
[#2608](https://github.com/ray-project/ray/issues/2608)
2018-08-27 13:11:33 -07:00
Hao Chen
4f4bea086a [java] Remove multi-return API (#2724) 2018-08-26 00:04:54 -07:00
Wang Qing
244337d381 [java] Support resources management in raylet mode. (#2606) 2018-08-10 12:44:18 -07:00
Wang Qing
3845c294c3 [java] Fix java raylet wait (#2553) 2018-08-05 23:49:54 -07:00
Wang Qing
b7088c1010 Clean the pom files (#2350) 2018-07-05 13:36:01 -07:00