Commit graph

534 commits

Author SHA1 Message Date
Wang Qing
7e13e1fd49 [Java] Remove non-raylet code in Java. (#2828) 2018-09-06 14:54:13 +08:00
Yuhong Guo
dfb7c2be1e [Java] Add Plasma Free to Java code path (#2802) 2018-09-04 15:28:23 +08:00
Hao Chen
9d655721e5 [java] support creating an actor with parameters (#2817)
Previously, `Ray.createActor` only supported creating an actor without any parameters. This PR adds support for creating an actor with parameters. Moreover, besides using a constructor, it's now also possible to create an actor with a factory method. For more usage, please refer to `ActorTest.java`.
2018-09-03 09:53:03 -07:00
Hao Chen
3b0a2c4197 [Java] improve Java API module (#2783)
The API module (`ray/java/api` dir) includes all public APIs provided by Ray; it should be the only module that normal Ray users need to face.

The purpose of this PR is to first improve the code quality of the API module. Subsequent PRs will improve other modules. The changes in this PR include the following aspects:
1) Only keep interfaces in the api module, to hide implementation details from users and fix circular dependencies among modules.
2) Document everything in the api module. 
3) Improve naming.
4) Add more tests for API. 
5) Also fix/improve related code in other modules.
6) Remove some unused code.

(Apologies for posting such a large PR. The Java worker code has lacked maintenance for a while, and there are a lot of code quality issues that need to be fixed. We plan to use a couple of large PRs to address them. After that, future changes will come in small PRs.)
2018-09-02 11:51:16 -07:00
Wang Qing
514633456b [Java] Fix out-dated signatures of JNI methods (#2756)
1) Renamed the native JNI methods and some parameters of JNI methods. 
2) Fixed native JNI methods' signatures using the `javah` tool.
3) Removed some useless native methods.
2018-08-30 17:59:29 +08:00
Robert Nishihara
132f133214 Limit number of concurrent workers started by hardware concurrency. (#2753)
* Limit number of concurrent workers started by hardware concurrency.

* Check if std::thread::hardware_concurrency() returns 0.

* Pass in max concurrency from Python.

* Fix Java call to startRaylet.

* Fix typo

* Remove unnecessary cast.

* Fix linting.

* Cleanups on Java side.

* Comment back in actor test.

* Require maximum_startup_concurrency to be at least 1.

* Fix linting and test.

* Improve documentation.

* Fix typo.
2018-08-29 14:53:40 +08:00
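The bullets above describe deriving a startup limit from the hardware, guarding against an undetectable core count, and requiring `maximum_startup_concurrency` to be at least 1. A minimal Python sketch of that kind of clamping follows. The function name `resolve_startup_concurrency` is hypothetical, and how the real code combines a caller-supplied value with the detected core count is an assumption, not something the commit message states.

```python
import os


def resolve_startup_concurrency(requested=None):
    """Return how many workers may be started at once (always >= 1).

    Hypothetical helper: an explicit request is validated, otherwise the
    detected core count is used, falling back to 1 when it is unknown.
    """
    if requested is not None:
        if requested < 1:
            raise ValueError("maximum_startup_concurrency must be at least 1")
        return requested
    # os.cpu_count() may return None when the count cannot be determined,
    # similar to std::thread::hardware_concurrency() returning 0 in C++.
    detected = os.cpu_count()
    return detected if detected else 1
```

Falling back to 1 keeps worker startup possible on platforms where the core count is unavailable, rather than stalling with a limit of zero.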
Wang Qing
b4cba9a49f [java] Fix the logic of generating TaskID (#2747)
## What do these changes do?
Because the logic for generating a `TaskID` in Java differs from Python's, many tests fail when we change the `Ray Core` code.
In this change, I rewrote the Java logic for generating a `TaskID` so that it matches Python's.

In Java, we now call the native method `_generateTaskId()`, which is also used in Python, to generate a `TaskID`. We also changed the logic of `computePutId()`.

## Related issue number
[#2608](https://github.com/ray-project/ray/issues/2608)
2018-08-27 13:11:33 -07:00
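The idea behind the change above is that every language front end should derive a `TaskID` through one shared, deterministic routine, so identical inputs always yield identical IDs. The sketch below illustrates that property only. The inputs (`driver_id`, `parent_task_id`, `parent_counter`), the use of SHA-1, and the 20-byte length are all assumptions for illustration, not the actual scheme behind `_generateTaskId()`.

```python
import hashlib


def generate_task_id(driver_id: bytes, parent_task_id: bytes, parent_counter: int) -> bytes:
    """Derive an ID deterministically from its inputs (illustrative scheme only)."""
    digest = hashlib.sha1()
    digest.update(driver_id)
    digest.update(parent_task_id)
    # Fixed-width, fixed-endianness encoding so every caller hashes the same bytes.
    digest.update(parent_counter.to_bytes(8, "little"))
    return digest.digest()


# Two callers with the same inputs agree on the ID; a different counter does not.
first = generate_task_id(b"driver", b"parent", 0)
second = generate_task_id(b"driver", b"parent", 0)
third = generate_task_id(b"driver", b"parent", 1)
```

Routing both languages through one implementation avoids subtle divergences such as differing byte order or field ordering, which is the kind of mismatch that makes cross-language tests fail.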
Wang Qing
26d3c0655c [java] Improve UniqueID code. (#2723) 2018-08-26 12:32:57 -07:00
Hao Chen
4f4bea086a [java] Remove multi-return API (#2724) 2018-08-26 00:04:54 -07:00
Philipp Moritz
b4c47a5861 Upgrade arrow to include more detailed flushing message (#2706) 2018-08-24 11:44:04 -07:00
Hao Chen
78b6bfb7f9 [Java] Change log dir to /tmp/raylogs (#2677)
Currently, the log directory in Java is a relative path. This PR changes it to `/tmp/raylogs` (with the same format as Python, e.g., `local_scheduler-2018-51-17_17-8-6-05164.err`). It also cleans up some related code.
2018-08-18 23:46:36 -07:00
Wang Qing
06a58016d8 [multi-language part 2] Change the command line arguments to start raylet (#2670) 2018-08-16 21:59:44 -07:00
Hao Chen
a719e089b0 [multi-language part 1] add a 'language' field to task specification (#2639) 2018-08-16 21:26:42 -07:00
Philipp Moritz
f13e3e22f2 Upgrade arrow to include tensorflow op fix (#2607) 2018-08-14 21:47:01 -07:00
Yuhong Guo
4bd98eed45 Support building Java and Python version at the same time. (#2640)
* Support building Java and Python version at the same time.

* Remove duplicated definition.

* Refine the building process of local_scheduler

* Refine

* Add comment for languages

* Modify instructions and add Python and Java builds to CI.

* change according to comment
2018-08-14 11:33:51 -07:00
Wang Qing
244337d381 [java] Support resources management in raylet mode. (#2606) 2018-08-10 12:44:18 -07:00
Stephanie Wang
d49b4bef0a [xray] Basic task reconstruction mechanism (#2526)
## What do these changes do?

This implements basic task reconstruction in raylet. There are two parts to this PR:
1. Reconstruction suppression through the `TaskReconstructionLog`. This prevents two raylets from reconstructing the same task if they decide simultaneously (via the logic in #2497) that reconstruction is necessary.
2. Task resubmission once a raylet becomes responsible for reconstructing a task.

Reconstruction is quite slow in this PR, especially for long chains of dependent tasks. This is mainly due to the lease table mechanism, where nodes may wait too long before trying to reconstruct a task. There are two ways to improve this:
1. Expire entries in the lease table using Redis `PEXPIRE`. This is a WIP and I may include it in this PR.
2. Introduce a "fast path" for reconstructing dependencies of a re-executed task. Normally, we wait for an initial timeout before checking whether a task requires reconstruction. However, if a task requires reconstruction, then it's likely that its dependencies also require reconstruction. In this case, we could skip the initial timeout before checking the GCS to see whether reconstruction is necessary (e.g., if the object has been evicted).

Since handling failures of other raylets is probably not yet complete in master, this only turns back on Python tests for reconstructing evicted objects.
2018-08-09 07:24:37 -07:00
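Reconstruction suppression, as described in part 1 above, needs a way for exactly one raylet to win when several decide at the same time to reconstruct a task. One way to get that property is an append-only log where an append succeeds only if the log still has the length the caller observed. The sketch below is an in-memory stand-in under that assumption. The class `ReconstructionLog` and its `append_at` method are hypothetical and do not reproduce the actual `TaskReconstructionLog` or its GCS storage.

```python
class ReconstructionLog:
    """In-memory stand-in for an append-only, per-task reconstruction log."""

    def __init__(self):
        self._entries = {}  # task_id -> list of node ids that reconstructed it

    def append_at(self, task_id, index, node_id):
        """Append node_id only if the task's log currently holds `index` entries.

        Returns True if this caller won the right to reconstruct the task.
        """
        entries = self._entries.setdefault(task_id, [])
        if len(entries) != index:
            return False  # someone else appended first; suppress this attempt
        entries.append(node_id)
        return True


log = ReconstructionLog()
# Both nodes observed zero prior reconstructions and race to claim attempt 0.
won_a = log.append_at("task-1", 0, "node-a")
won_b = log.append_at("task-1", 0, "node-b")
# A later, legitimate second reconstruction uses the next index.
won_again = log.append_at("task-1", 1, "node-b")
```

Here `node-a` wins the first attempt and `node-b` is suppressed, while a later attempt at the next index is still allowed.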
Mitar
9015e742c4 Update installation instructions with psmisc to enable 'ray stop' (#2550) 2018-08-05 23:58:58 -07:00
Wang Qing
3845c294c3 [java] Fix java raylet wait (#2553) 2018-08-05 23:49:54 -07:00
Wang Qing
e4f68ff8cf [Java Worker] Support raylet on Java (#2479) 2018-08-01 17:52:49 -07:00
Hao Chen
fe65f9fbbc improve java api doc (#2508) 2018-07-29 20:41:11 -07:00
Hao Chen
0ea7a6abf0 add java tutorial (#2491) 2018-07-28 17:09:30 -07:00
Stephanie Wang
6675361684 [xray] Track ray.get calls as task dependencies (#2362) 2018-07-27 11:59:17 -07:00
Hao Chen
5b015f9a79 Remove the check of java primitive types (#2495) 2018-07-27 11:44:19 -07:00
Wang Qing
344e3d2c05 Fix bug: Init RayLog before using it. (#2408) 2018-07-18 00:44:37 -07:00
Hanwei Jin
450b11f1d6 update to slf4j, remove DynamicLog (#2384) 2018-07-09 23:33:59 -07:00
Zhijun Fu
fa33ea5283 [Java] Java worker cluster support (#2359) 2018-07-09 10:20:41 -07:00
Wang Qing
b7088c1010 Clean the pom files (#2350) 2018-07-05 13:36:01 -07:00
Shuo
8e687cbc98 Unify the identity of a process while logging. (#2325) 2018-07-04 14:26:19 -07:00
mylinyuzhi
fa0ade2bc5 [Java] Replace binary rewrite with Remote Lambda Cache (SerdeLambda) (#2245)
* <feature> : serde lambda

* <feature>:fixed CR

with issue #2245

* <feature>: fixed CR
2018-06-13 12:58:07 -07:00
Yujie Liu
3b5e700fd7 [JavaWorker] Java code lint check and binding to CI (#2225)
* add java code lint check and fix the java code lint error

* add java doc lint check and fix the java doc lint error

* add java code and doc lint to the CI
2018-06-09 16:26:54 -07:00
Yuhong Guo
5b0df0eca2 Change surefire version to 2.21.0 to fix test failure on Java10. (#2198) 2018-06-06 10:39:20 -07:00
Hao Chen
ac1e5a7d15 [JavaWorker] Do not kill local-scheduler-forked workers in RunManager.cleanup (#2151)
Local-scheduler-forked workers will be killed by the local scheduler itself,
so they don't need to be killed here. See:
570c3153cd/src/local_scheduler/local_scheduler.cc (L184-L192)

Also, using `ps | grep | kill` might be dangerous, because it
could also kill irrelevant processes, e.g., `vim DefaultWorker.java`.
2018-05-30 00:25:03 -07:00
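A safer alternative to `ps | grep | kill` is to remember the handles of the processes you started and terminate only those, so an unrelated process whose command line happens to match is never touched. The sketch below shows that pattern in Python. The class `ProcessRegistry` is hypothetical and is not the `RunManager` code from this commit.

```python
import subprocess
import sys


class ProcessRegistry:
    """Tracks child processes started by this program so cleanup is precise."""

    def __init__(self):
        self._children = []

    def spawn(self, args):
        """Start a child process and remember its handle."""
        proc = subprocess.Popen(args)
        self._children.append(proc)
        return proc

    def cleanup(self, timeout=5.0):
        """Terminate only the processes this registry started."""
        for proc in self._children:
            if proc.poll() is None:  # still running
                proc.terminate()
        for proc in self._children:
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                proc.kill()  # escalate if the child ignored terminate()
                proc.wait()
        self._children.clear()


registry = ProcessRegistry()
child = registry.spawn([sys.executable, "-c", "import time; time.sleep(60)"])
registry.cleanup()
```

Because cleanup works from stored handles rather than name matching, a process such as `vim DefaultWorker.java` can never be selected by accident.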
Yujie Liu
a8d3c057c1 [JavaWorker] Enable java worker support (#2094)
* Enable java worker support
--------------------------
This commit includes a tailored version of the Java worker implementation from Ant Financial.
The changes for the build system, python module, src module and arrow are in other commits; this commit consists of the following modules:
 - java/api: Ray API definition
 - java/common: utilities
 - java/hook: binary rewrite of the Java byte-code for remote execution
 - java/runtime-common: common implementation of the runtime in worker
 - java/runtime-dev: a pure-java mock implementation of the runtime for fast development
 - java/runtime-native: a native implementation of the runtime
 - java/test: various tests

Contributors for this work:
 Guyang Song, Peng Cao, Senlin Zhu, Xiaoying Chu, Yiming Yu, Yujie Liu, Zhenyu Guo

* change the format of java help document from markdown to RST

* update the version of Arrow for java worker

* adapt to the new version of the plasma java client from arrow, which uses byte[] instead of a custom type

* add java worker test to ci

* add the example module for better usage guide
2018-05-26 14:38:50 -07:00