hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Robert Nishihara	98edf752a9	Note requirement cython==0.27.3 in installation instructions. (#3322 )	2018-11-15 15:27:19 -08:00
Wang Qing	9d4847ad2d	[hot-fix] Fix error when calling Ray.init() twice. (#3314 )	2018-11-13 21:21:54 -05:00
Stephanie Wang	d950e92f63	Allow multiple threads to call ray.get and ray.wait (#3244 ) * Handle multiple threads calling ray.get * Multithreaded ray.wait * Pass in current task ID in java backend * Add multithreaded actor to tests, add warning messages to worker for multithreaded ray.get * Fix test * Some cleanups * Improve error message * Add assertion * Cleanup, throw error in HandleTaskUnblocked if task not actually blocked * lint * Fix python worker reset * Fix references to reconstruct_objects * Linting * java lint * Fix java * Fix iterator	2018-11-07 22:39:28 -08:00
Richard Liaw	0bab8ed95c	Expose internal config parameters for starting Ray (#3246 ) ## What do these changes do? This PR exposes the CL option for using a config parameter. This is important for certain tests (i.e., FT tests that remove nodes) to run quickly. Note that this is bad practice and should be replaced with GFLAGS or some equivalent as soon as possible. #3239 depends on this. TODO: - [x] Add documentation to method arguments before merging. - [x] Add test to verify this works? ## Related issue number	2018-11-07 21:46:02 -08:00
Robert Nishihara	fd854ff090	Allow the node manager port and object manager port to be set through… (#3130 ) * Allow the node manager port and object manager port to be set through ray start. * Linting * Fix Java test * Address comments.	2018-10-28 17:28:41 -07:00
Robert Nishihara	658c14282c	Remove legacy Ray code. (#3121 ) * Remove legacy Ray code. * Fix cmake and simplify monitor. * Fix linting * Updates * Fix * Implement some methods. * Remove more plasma manager references. * Fix * Linting * Fix * Fix * Make sure class IDs are strings. * Some path fixes * Fix * Path fixes and update arrow * Fixes. * linting * Fixes * Java fixes * Some java fixes * TaskLanguage -> Language * Minor * Fix python test and remove unused method signature. * Fix java tests * Fix jenkins tests * Remove commented out code.	2018-10-26 13:36:58 -07:00
bibabolynn	b4614ae69a	[java] customize path of ray.conf (#3100 ) users can add custom path of ray.config by using -Dray.config=/path/to/ray.conf	2018-10-26 13:36:34 +08:00
Hanwei Jin	7c1fd19fd9	[Java] support python worker command in raylet (#3092 ) ## What do these changes do? support raylet, which is started by java runManager, to start python default_worker.py. So when doing local test of java call python task, it helps auto start python worker. ## Related issue number	2018-10-24 20:43:39 +08:00
bibabolynn	9a5c273db7	[java] fix check exception type (#3093 ) ## What do these changes do? remove TaskExecutionException, use RayException instead ## Related issue number	2018-10-19 06:43:42 -07:00
Wang Qing	b410ee0d29	[Java] Support dynamically defining resources when submitting task. (#3070 ) ## What do these changes do? Before this PR, if we want to specify some resources, we must do as following codes: ```java @RayRemote(Resources={ResourceItem("CPU", 10)}) public static void f1() { // do sth } @RayRemote(Resources={ResourceItem("CPU", 10)}) class Demo { // sth } ``` Unfortunately, it's no way for us to create another actor or task with different resources required. After this PR, the thing will be: ```java ActorCreationOptions option = new ActorCreationOptions(); option.resources.put("CPU", 4.0); RayActor<Echo> echo1 = Ray.createActor(Echo::new, option); option.resources.put("Res-A", 4.0); RayActor<Echo> echo2 = Ray.createActor(Echo::new, option); //if we don't specify resource, the resources will be `{"cpu":0.0}` by default. Ray.call(Echo::echo, echo2, 100); ``` ## Related issue number N/A	2018-10-19 06:22:32 -07:00
Wang Qing	64e5eb305e	[Java] Add jvm-parameters in Config. (#3065 )	2018-10-16 15:03:18 -07:00
Wang Qing	828fe24b39	[Java] Fix loading driver resources issue. (#3046 ) ## What do these changes do? Fix the issue how we load driver resources by a specified path. Also this addressed the comments from the related PR [3044](https://github.com/ray-project/ray/pull/3044). ## Related PRs: [#3044](https://github.com/ray-project/ray/pull/3044) and [#3001](https://github.com/ray-project/ray/pull/3001).	2018-10-11 09:45:21 -07:00
Wang Qing	4a2ed47b6c	[Java] Improve some Java code (#3040 ) This PR improves some java codes, and removes some duplicated code.	2018-10-10 17:30:23 -07:00
Wang Qing	ef1f2fde95	Fix the uniqueId toString format. (#3035 )	2018-10-08 13:12:14 -07:00
Wang Qing	84bf5fc8f3	[Java] Load driver resources from local path. (#3001 ) ## What do these changes do? 1. Add a configuration item `driver.resource-path`. 2. Load driver resources from the local path which is specified in the `ray.conf`. Before this change, we should add all driver resources(like user's jar package, dependencies package and config files) into `classpath`. After this change, we should add the driver resources into the mount path which we can configure it in `ray.conf`, and we shouldn't configure `classpath` for driver resources any more. ## Related issue number N/A	2018-10-08 21:05:26 +01:00
Robert Nishihara	faa31ae018	Introduce concept of resources required for placing a task. (#2837 ) * Introduce concept of resources required for placement. * Add placement resources to task spec * Update java worker * Update taskinfo.java	2018-10-04 10:35:39 -07:00
bibabolynn	9c606ea06c	fix bug: (#3000 ) Before the fix, RAY_FUN_CACHE used only the get method, so it could only return null. Fix: put after create.	2018-10-02 22:53:54 -07:00
Wang Qing	fcef4edd46	[Java] Fix the required-resources issue of actor member function in Java worker. (#3002 ) This fixes a bug in which Java actor methods inherit the resource requirements of the actor creation task.	2018-10-01 12:56:36 -07:00
Hao Chen	c5b8840193	[Java] fix java/cleanup.sh (#2989 ) Remove legacy-ray-related stuff from this script, and update temp file locations.	2018-09-28 21:31:47 -05:00
Hao Chen	18173dde26	[Java] update api doc (#2988 ) API doc is kind of out-dated, because of some recent code changes. Update it and add some simple examples.	2018-09-28 19:05:42 -05:00
Hao Chen	4ffe1e3556	[Java] Fix: task spec's resource map should contain CPU (#2987 )	2018-09-28 14:23:38 -05:00
Wang Qing	68cf194e90	[fix] Fix ray.home configuration item. (#2977 ) If we set the `ray.home` configuration item to `""`, the current `RayConfig` will set it to the current working directory, like `/User/My/Ray`. But some other configuration items (like `redisServerExecutablePath`) will then be set to `/User/My/Ray//build/src/common/thirdparty/redis/src/redis-server` by mistake. Note: There are 2 `/` between the current working directory and `build/src/common....` This PR will fix this issue.	2018-09-28 00:06:14 -05:00
Wang Qing	1d9652abf1	[java] fix wrong links in Java readme file.	2018-09-27 11:23:10 +08:00
Wang Qing	8e8e123777	[Java] Simplify Java worker configuration (#2938 ) ## What do these changes do? Previously, Java worker configuration was complicated, because it required setting environment variables as well as command-line arguments. This PR aims to simplify Java worker's configuration. 1) Configuration management is now migrated to [lightbend config](https://github.com/lightbend/config), thus doesn't require setting environment variables. 2) Many unused config items are removed. 3) Provide a simple `example.conf` file, so users can get started quickly. 4) All possible options and their default values are declared and documented in `ray.default.conf` file. This PR also simplifies and refines the following code: 1) The process of `Ray.init()`. 2) `RunManager`. 3) `WorkerContext`. ### How to use this configuration? 1. Copy `example.conf` into your classpath and rename it to `ray.conf`. 2. Modify/add your configuration items. All items are declared in `ray.default.conf`. 3. You can also set the items in Java system properties. Note: configuration is read in this priority: System properties > `ray.conf` > `ray.default.conf` ## Related issue number N/A	2018-09-26 20:14:22 +08:00
Wang Qing	0e552fbb22	[Java] Update maven version to 0.1-SNAPSHOT Update the version in maven from 0.1 to 0.1-SNAPSHOT, because SNAPSHOT is the conventional version name in dev process. Non-snapshot versions are only used for release.	2018-09-26 18:08:46 +08:00
Hao Chen	971df5ea8a	[java] put function meta in task spec and load functions with function meta (#2881 ) This PR adds a `function_desc` field into task spec. a function descriptor is a list of strings that can uniquely describe a function. - For a Python function, it should be: [module_name, class_name, function_name] - For a Java function, it should be: [class_name, method_name, type_descriptor] There're a couple of purposes to add this field: In this PR: - Java worker needs to know function's class name to load it. Previously, since task spec didn't have such a field to hold this info, we did a hack by appending the class name to the argument list. With this change, we fixed that hack and significantly simplified function management in Java. Will be done in subsequent PRs: - Support cross-language invocation (#2576): currently Python worker manages functions by saving them in GCS and pass function id in task spec. However, if we want to call a Python function from Java, we cannot save it in GCS and get the function id. But instead, we can pass the function descriptor (module name, class name, function name) in task spec and use it to load the function. - Support deployment: one major problem of Python worker's current function management mechanism is #2327. In prod env, we should have a mechanism to deploy code and dependencies to the cluster. And when code is already deployed, we don't need to save functions to GCS any more and can use `function_desc` to manage functions.	2018-09-25 23:05:05 -07:00
Hao Chen	3cccb49191	[Java] Implement missing methods in MockRayletClient (#2954 ) Previous changes broke single-process mode in raylet. This PR fixes the hello-world example work in single-process mode. Follow-up diffs will completely fix single-process mode and add tests.	2018-09-25 09:57:32 -07:00
Robert Nishihara	f16d33593b	Mark worker as blocked and trigger reconstruction in ray.wait. (#2864 ) * Trigger reconstruction in ray.wait and mark worker as blocked. * Add test. * Linting. * Don't run new test with legacy Ray. * Only call HandleClientUnblocked if it actually blocked in ray.wait. * Reduce time to ray.wait in the test.	2018-09-13 15:28:17 -07:00
Hanwei Jin	fbf214e408	update ray cmake build process (#2853 ) * use cmake to build ray project, no need to apply build.sh before cmake, fix some abuse of cmake, improve the build performance * support boost external project, avoid using the system or build.sh boost * keep compatible with build.sh, remove boost and arrow build from it. * bugfix: parquet bison version control, plasma_java lib install problem * bugfix: cmake, do not compile plasma java client if no need * bugfix: component failures test timeout mechanism has problem for plasma manager failed case * bugfix: arrow use lib64 in centos, travis check-git-clang-format-output.sh does not support other branches except master * revert some fix * set arrow python executable, fix format error in component_failures_test.py * make clean arrow python build directory * update cmake code style, back to support cmake minimum version 3.4	2018-09-12 11:19:33 -07:00
Hao Chen	8414e413a2	[java] refine and simplify java worker code structure (#2838 )	2018-09-10 10:48:17 -07:00
Wang Qing	7e13e1fd49	[Java] Remove non-raylet code in Java. (#2828 )	2018-09-06 14:54:13 +08:00
Yuhong Guo	dfb7c2be1e	[Java] Add Plasma Free to Java code path (#2802 )	2018-09-04 15:28:23 +08:00
Hao Chen	9d655721e5	[java] support creating an actor with parameters (#2817 ) Previously `Ray.createActor` only supported creating an actor without any parameter. This PR adds the support for creating an actor with parameters. Moreover, besides using a constructor, it's now also allowed to create an actor with a factory method. For more usage, please refer to `ActorTest.java`.	2018-09-03 09:53:03 -07:00
Hao Chen	3b0a2c4197	[Java] improve Java API module (#2783 ) API module (`ray/java/api` dir) includes all public APIs provided by Ray, it should be the only module that normal Ray users need to face. The purpose of this PR to first improve the code quality of the API module. Subsequent PRs will improve other modules later. The changes of this PR include the following aspects: 1) Only keep interfaces in api module, to hide implementation details from users and fix circular dependencies among modules. 2) Document everything in the api module. 3) Improve naming. 4) Add more tests for API. 5) Also fix/improve related code in other modules. 6) Remove some unused code. (Apologize for posting such a large PR. Java worker code has been lack of maintenance for a while. There're a lot of code quality issues that need to be fixed. We plan to use a couple of large PRs to address them. After that, future changes will come in small PRs.)	2018-09-02 11:51:16 -07:00
Wang Qing	514633456b	[Java] Fix out-dated signatures of JNI methods (#2756 ) 1) Renamed the native JNI methods and some parameters of JNI methods. 2) Fixed native JNI methods' signatures by `javah` tool. 3) Removed some useless native methods.	2018-08-30 17:59:29 +08:00
Robert Nishihara	132f133214	Limit number of concurrent workers started by hardware concurrency. (#2753 ) * Limit number of concurrent workers started by hardware concurrency. * Check if std::thread::hardware_concurrency() returns 0. * Pass in max concurrency from Python. * Fix Java call to startRaylet. * Fix typo * Remove unnecessary cast. * Fix linting. * Cleanups on Java side. * Comment back in actor test. * Require maximum_startup_concurrency to be at least 1. * Fix linting and test. * Improve documentation. * Fix typo.	2018-08-29 14:53:40 +08:00
Wang Qing	b4cba9a49f	[java] Fix the logic of generating TaskID (#2747 ) ## What do these changes do? Because the logic of generating `TaskID` in java is different from python's, there are many tests fail when we change the `Ray Core` code. In this change, I rewrote the logic of generating `TaskID` in java which is the same as the python's. In java, we call the native method `_generateTaskId()` to generate a `TaskID` which is also used in python. We change `computePutId()`'s logic too. ## Related issue number [#2608](https://github.com/ray-project/ray/issues/2608)	2018-08-27 13:11:33 -07:00
Wang Qing	26d3c0655c	[java] Improve UniqueID code. (#2723 )	2018-08-26 12:32:57 -07:00
Hao Chen	4f4bea086a	[java] Remove multi-return API (#2724 )	2018-08-26 00:04:54 -07:00
Philipp Moritz	b4c47a5861	Upgrade arrow to include more detailed flushing message (#2706 )	2018-08-24 11:44:04 -07:00
Hao Chen	78b6bfb7f9	[Java] Change log dir to /tmp/raylogs (#2677 ) Currently, log directory in Java is a relative path. This PR changes it to `/tmp/raylogs` (with the same format as Python, e.g., `local_scheduler-2018-51-17_17-8-6-05164.err`). It also cleans up some related code.	2018-08-18 23:46:36 -07:00
Wang Qing	06a58016d8	[multi-language part 2] Change the command line arguments to start raylet (#2670 )	2018-08-16 21:59:44 -07:00
Hao Chen	a719e089b0	[multi-language part 1] add a 'language' field to task specification (#2639 )	2018-08-16 21:26:42 -07:00
Philipp Moritz	f13e3e22f2	Upgrade arrow to include tensorflow op fix (#2607 )	2018-08-14 21:47:01 -07:00
Yuhong Guo	4bd98eed45	Support building Java and Python version at the same time. (#2640 ) * Support building Java and Python version at the same time. * Remove duplicated definition. * Refine the building process of local_scheduler * Refine * Add comment for languages * Modify instruction and add Python, Java building to CI. * change according to comment	2018-08-14 11:33:51 -07:00
Wang Qing	244337d381	[java] Support resources management in raylet mode. (#2606 )	2018-08-10 12:44:18 -07:00
Stephanie Wang	d49b4bef0a	[xray] Basic task reconstruction mechanism (#2526 ) ## What do these changes do? This implements basic task reconstruction in raylet. There are two parts to this PR: 1. Reconstruction suppression through the `TaskReconstructionLog`. This prevents two raylets from reconstructing the same task if they decide simultaneously (via the logic in #2497) that reconstruction is necessary. 2. Task resubmission once a raylet becomes responsible for reconstructing a task. Reconstruction is quite slow in this PR, especially for long chains of dependent tasks. This is mainly due to the lease table mechanism, where nodes may wait too long before trying to reconstruct a task. There are two ways to improve this: 1. Expire entries in the lease table using Redis `PEXPIRE`. This is a WIP and I may include it in this PR. 2. Introduce a "fast path" for reconstructing dependencies of a re-executed task. Normally, we wait for an initial timeout before checking whether a task requires reconstruction. However, if a task requires reconstruction, then it's likely that its dependencies also require reconstruction. In this case, we could skip the initial timeout before checking the GCS to see whether reconstruction is necessary (e.g., if the object has been evicted). Since handling failures of other raylets is probably not yet complete in master, this only turns back on Python tests for reconstructing evicted objects.	2018-08-09 07:24:37 -07:00
Mitar	9015e742c4	Update installation instructions with psmisc to enable 'ray stop' (#2550 )	2018-08-05 23:58:58 -07:00
Wang Qing	3845c294c3	[java] Fix java raylet wait (#2553 )	2018-08-05 23:49:54 -07:00
Wang Qing	e4f68ff8cf	[Java Worker] Support raylet on Java (#2479 )	2018-08-01 17:52:49 -07:00
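
Note on commit 971df5ea8a: its message describes a Python function descriptor as the list [module_name, class_name, function_name], to be used for loading a function instead of fetching it from GCS by function id. The sketch below only illustrates how such a descriptor could be resolved to a callable. It is not Ray's implementation: `load_function` is a hypothetical helper, and treating an empty `class_name` as "module-level function" is an assumption made here for the example.

```python
import importlib


def load_function(descriptor):
    """Resolve a [module_name, class_name, function_name] descriptor to a callable.

    Hypothetical helper for illustration only; not part of Ray.
    Assumption: an empty class_name means the function is defined at module level.
    """
    module_name, class_name, function_name = descriptor
    owner = importlib.import_module(module_name)
    if class_name:
        # The function is an attribute of a class inside the module.
        owner = getattr(owner, class_name)
    return getattr(owner, function_name)


# Module-level function: class_name is left empty.
join = load_function(["posixpath", "", "join"])
print(join("a", "b"))

# Function defined on a class: looked up through the class object.
upper = load_function(["builtins", "str", "upper"])
print(upper("ray"))
```

Because the descriptor is made only of strings, it can travel inside a task spec and be resolved by any worker that already has the code deployed, which is the deployment scenario the commit message mentions.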

64 commits