hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
Qing Wang	a37d9a2ec2	[Core] Support default actor lifetime. (#21283) Support specifying a default lifetime for actors whose lifetime is not set at creation. This is a job-level configuration item. #### API Change The Python API looks like: ```python ray.init(job_config=JobConfig(default_actor_lifetime="detached")) ``` The Java API looks like: ```java System.setProperty("ray.job.default-actor-lifetime", defaultActorLifetime.name()); Ray.init(); ``` One example usage is: ```python ray.init(job_config=JobConfig(default_actor_lifetime="detached")) a1 = A.options(lifetime="non_detached").remote() # a1 is a non-detached actor. a2 = A.remote() # a2 is a detached actor. ``` Co-authored-by: Kai Yang <kfstorm@outlook.com> Co-authored-by: Qing Wang <jovany.wq@antgroup.com>	2022-01-22 12:26:08 +08:00
SangBin Cho	5514711a35	[Part 5] Set actor died error message in ActorDiedError (#20903 ) This is the second last PR to improve `ActorDiedError` exception. This propagates Actor death cause metadata to the ray error object. In this way, we can raise a better actor died error exception. After this PR is merged, I will add more metadata to each error message and write a documentation that explains when each error happens. TODO - [x] Fix test failures - [x] Add unit tests - [x] Fix Java/cpp cases Follow up PRs - Not allowing nullptr for RayErrorInfo input.	2022-01-20 22:11:11 -08:00
Yi Cheng	3c63a8410d	[gcs/ha] Fix java related error when enable redisless ray (#21692 ) This PR enables ray java to be able to run without redis. It also fixes java related tests and updated the pipeline.	2022-01-20 13:56:25 -08:00
Rong Ma	f54282147c	[PlacementGroup] Support using any available bundle in java api (#21496 ) In python or C++, we can specify the bundle index as -1 to use any available bundle in the placement group. We should also enable it in Java to keep the API consistent across all languages.	2022-01-18 01:58:02 +08:00
Qing Wang	2c3be852ab	[Java] Support defining ConcurrencyGroup statically in Java. (#20373) This PR introduces APIs for statically defining ConcurrencyGroups in Java. We introduce 2 APIs: 1. The `@DefConcurrencyGroup` annotation, applied to an actor class to define a concurrency group statically. 2. The `@UseConcurrencyGroup` annotation, applied to an actor method to select the concurrency group that the method runs in. Examples are below: ```java @DefConcurrencyGroup(name = "io", maxConcurrency = 2) @DefConcurrencyGroup(name = "compute", maxConcurrency = 4) private static class MyActor { @UseConcurrencyGroup(name = "io") public long f1() { } @UseConcurrencyGroup(name = "io") public long f2() { } @UseConcurrencyGroup(name = "compute") public long f3(int a, int b) { } @UseConcurrencyGroup(name = "compute") public long f4() { } } ActorHandle<MyActor> myActor = Ray.actor(MyActor::new).remote(); myActor.task(MyActor::f1).remote(); myActor.task(MyActor::f2).remote(); myActor.task(MyActor::f3, 3, 5).remote(); myActor.task(MyActor::f4).remote(); ``` `MyActor` has 3 concurrency groups: `io` with a concurrency of 2, `compute` with a concurrency of 4, and `default` with a concurrency of 1. f1 and f2 will be executed in `io`; f3 and f4 will be executed in `compute`.	2022-01-17 16:23:10 +08:00
Qing Wang	bb647626cf	[Xlang][Java] Fix Java overridden `default` method cannot be invoked. (#21491) In Xlang (Python calling Java), a Java method that overrides a `default` method of its super interface cannot be invoked successfully, because we treat it as an overloaded method instead of an overridden method. This PR handles the case where a method overrides a `default` method correctly. Before this PR, the following usage could not be invoked from Python -> Java. ```Java public interface ExampleInterface { default String echo(String inp) { return inp; } } public class ExampleImpl implements ExampleInterface { @Override public String echo(String inp) { return inp + " echo"; } } ``` ```python # Invoke it in Python. cls = ray.java_actor_class("io.ray.serve.util.ExampleImpl") handle = cls.remote() print(ray.get(handle.echo.remote("hi"))) ```	2022-01-11 23:11:24 +08:00
Qing Wang	57ff13461c	[Java] Use localhost instead of public ip (#21462) Use the localhost IP address instead of the public IP to avoid security popups on macOS. This also reverts commit `e4542be0d1`.	2022-01-11 02:58:22 +08:00
Yi Cheng	8fa9fddaa0	[1/3][kv] move some internal kv py logic into cpp (#21386 ) This PR moves the internal kv namespace logic into cpp to reduce logic in python for the following reasons: - internal kv is used in x-lang so we have to move it to cpp so that all langs can benefit. - for https://github.com/ray-project/ray/issues/8822 we need to delete resource when job finished in gcs One extra field about del is also added so that when delete, we are able to delete by prefix instead of just a key	2022-01-07 17:35:06 -08:00
Qing Wang	340fbf53c0	[Java] Support actor handle reference counting. (#21249 )	2022-01-01 10:26:22 +08:00
Tao Wang	a78baf4075	[Java]Init gcs client in runtime only if necessary (#21072) There's a redis connection in the gcs client, but most of the time the gcs client is never used in a worker. We can make the initialization lazy to reduce redis connections. After that, the number of redis connections drops from 2 to 1 per core worker.	2021-12-30 15:44:06 +08:00
Shawn	4f9aceb3a6	[Java] Native memory support (#21256) This PR provides universal native memory access support in the Java worker, as mentioned in #21234, which will also be the foundation for later zero-copy and serialization work. The main changes include: * Native memory operations based on `sun.misc.Unsafe` * Little-endian native memory buffer. * Native-memory-based IO operations: * InputStream/OutputStream * ReadChannel/WriteChannel * MockReadChannel/MockWriteChannel	2021-12-30 15:31:22 +08:00
Qing Wang	e653d47533	[Java] Shade some widely used dependencies in bazel_jar_jar rule. (#21237) These dependencies are widely used: - com.google.common - com.google.protobuf - com.google.thirdparty We therefore need to shade them to avoid conflicts with jars introduced by users. This PR introduces a `bazel_jar_jar` rule for this and also shades them in the Maven pom files.	2021-12-23 16:54:31 +08:00
WanXing Wang	72bd2d7e09	[Core] Support back pressure for actor tasks. (#20894) Resubmits https://github.com/ray-project/ray/pull/19936. I've figured out that the test case `//rllib:tests/test_gpus::test_gpus_in_local_mode` failed due to a deadlock in local mode. In local mode, if the user code submits another task while the current task is executing, `CoreWorker::actor_task_mutex_` may cause a deadlock. The solution is simple: release the lock before executing the task in local mode. In commit `7c2f61c76c`: 1. Release the lock in local mode to fix the bug. @scv119 2. Added `test_local_mode_deadlock` to cover the case. @rkooo567 3. Left a trivial change in `rllib/tests/test_gpus.py` so that `RAY_CI_RLLIB_DIRECTLY_AFFECTED` takes effect.	2021-12-13 23:56:07 -08:00
Kai Fricke	d4413299c0	Revert "[Core] Support back pressure for actor tasks (#19936 )" (#20880 ) This reverts commit `a4495941c2`.	2021-12-03 17:48:47 -08:00
DK.Pino	4ef0d4a37a	[Java] [Placement Group] Make class `PlacementGroupImpl` serializable (#20759 )	2021-12-03 13:06:17 +08:00
WanXing Wang	a4495941c2	[Core] Support back pressure for actor tasks (#19936 ) Support back pressure in core worker. Job config added for python worker and java worker.	2021-12-02 14:41:30 -08:00
Tao Wang	f481081904	[Java]Get next job id only in driver (#20813 ) ## Why are these changes needed? Job id is only used in driver, we should not get it in WORKER.	2021-12-01 15:48:21 +08:00
Qing Wang	84f7062329	[Java] Cleanup temp file of libcore_worker.so (#20748) Why are these changes needed? Replace the existing temp file, so that a temp file left behind by a worker that died does not prevent subsequent workers from writing a new one.	2021-11-29 16:05:06 +08:00
Guyang Song	53630ee03b	Revert "Revert "[runtime env] redefine runtime env to protobuf"" and fix windows compiling (#20692 ) - Fix windows compiling and revert https://github.com/ray-project/ray/pull/20641 - Seems the pr https://github.com/ray-project/ray/pull/20670 can solve the windows compiling issue.	2021-11-24 09:01:01 -08:00
Alex Wu	9388d28233	Revert "[runtime env] redefine runtime env to protobuf" (#20641 ) Reverts #19511 Breaks windows compilation	2021-11-22 13:11:30 -08:00
Guyang Song	ad56b9b432	[runtime env] redefine runtime env to protobuf (#19511 )	2021-11-20 16:54:42 +08:00
Larry	454db6902c	[Java] Add timeout parameter for Ray.get() API (#20282) Why are these changes needed? Add a timeout (ms) parameter to the Java `Ray.get()`. The API changes are documented under [Ray Core Walkthrough]->[Fetching Results]. E.g.: `ObjectRef<Integer> objRef = Ray.put(1); objRef.get(1000); Ray.get(Ray.task(MyRayApp::slowFunction).remote(), 3000);` Related issue number #20247	2021-11-17 11:02:17 +08:00
Yi Cheng	a4e187c0e7	[gcs] Update function table to use internal kv (#20152 ) ## Why are these changes needed? This is a part of redis removal. This PR remove redis kv in function table. rpush related code is not updated in this PR. ## Related issue number	2021-11-15 23:34:41 -08:00
Qing Wang	1172195571	[Java] Remove global named actor and global pg (#20135) This PR removes global named actors and global PGs. I believe these APIs are not used widely in OSS. @kfstorm @clay4444 @raulchen Please take a look if this change is reasonable. IMPORTANT NOTE: This is a Java API change and breaks backward compatibility for Java global named actor and global PG usage. The CPP part is not included in this PR. INCLUDES: Remove setGlobalName() and getGlobalActor() APIs. Remove getGlobalPlacementGroup() and setGlobalPG. Add getActor(name, namespace) API. Add getPlacementGroup(name, namespace) API. Update doc pages.	2021-11-15 16:28:53 +08:00
Qing Wang	7500f7d88a	Remove deprecated Java PG APIs. (#20219) These APIs have been deprecated for 7+ months and 4+ versions, so it is time to remove them.	2021-11-12 09:29:48 +08:00
Yi Cheng	e54d3117a4	[gcs] Update all redis kv usage in python except function table (#20014 ) ## Why are these changes needed? This is part of redis removal project. In this PR all direct usage of redis got removed except function table. Function table will be migrated in the next PR ## Related issue number #19443	2021-11-10 20:24:53 -08:00
Stephanie Wang	ffcc5935d7	[core] Evict lineage to bound memory usage (#19946 ) * bound lineage * Bound lineage in bytes * test * Lineage evicted error * Lineage evicted * lint * test * test * comment * doc * x * x * x * x	2021-11-08 21:53:40 -08:00
Qing Wang	6d8a7291ab	Add getNamespace API for Java worker (#20057 ) [Java API] Add getNamespace API for Java worker.	2021-11-08 15:51:14 +08:00
Qing Wang	4373aa1e3b	Support generating a UUID string as the anonymous namespace for Java worker. (#19986) Why are these changes needed? For the Java worker, we generate a UUID string as the namespace if the user does not specify a namespace for the job. Related issue number #16474	2021-11-04 11:40:17 +08:00
Jiajun Yao	5de4a38948	[CI] Run Java CI on Mac (#19757 ) Why are these changes needed? Enable Java tests on Mac CI to avoid more breakages. Related issue number Closes #19700	2021-11-03 23:40:05 +08:00
Qing Wang	da6894848d	Support Java namespace APIs (#19468 ) ## Why are these changes needed? ## Related issue number #16474	2021-11-02 11:05:40 +08:00
Tao Wang	7a2e9e00e8	[Tiny]Remove duplicated assignment (#19866 )	2021-11-01 11:44:01 +08:00
Jiajun Yao	e4542be0d1	[Java] Run java on mac with public ip (#19701 )	2021-10-25 11:38:33 -07:00
Jiajun Yao	805ce453dd	[Java] Remove auto-generated pom.xml files. (#19475 )	2021-10-19 17:35:37 +08:00
Gagandeep Singh	d226cbf21a	Added StartupToken to identify a process at startup (#19014) * Added StartupToken to identify a process at startup * Applied linting formats * Addressed reviews * Fixing worker_pool_test * Fixed worker_pool_test * Applied linting formatting * Added documentation for StartupToken * Fixed linting * Reordered initialisation of WorkerPool members * Fixed Python docs * Fixing bugs in cluster_mode_test * Fixing Java tests * Create and set shim process after verifying startup_token * shim_process.GetId() -> worker_shim_pid * Improvements in startup token and modifying java files * update io_ray_runtime_RayNativeRuntime.h * Fixed java tests by adding startup-token to conf * Applied linting * Increased arg count for startup_token * Attempt to fix streaming tests * Type correction * applied linting * Corrected index of startup token arg * Modified mock_worker.cc to accept startup tokens * Applied linting * Applied linting changes from CI * Removed override from worker.h * Applied linting from scripts/format.sh * Addressed reviews and applied scripts/format.sh * Applied linting script from ci/travis * Removed unrequired methods from public scope * Applied linting	2021-10-15 15:13:13 -07:00
Qing Wang	2cc164e616	[Java] Fix incomplete core worker dynamic library. (#19342) * Fix incomplete core worker dynamic library. * Fix lint.	2021-10-14 14:42:05 +08:00
hazeone	c2f0035fd2	[Java]Support getGpuIds API (#19031 ) Add java getGpuIds() API which is the same as get_gpu_ids in python. We can get deviceId if we've allocated a GPU to a worker.	2021-10-13 23:40:26 +08:00
Guyang Song	ab55b808c5	[runtime env] move worker env to runtime env in Java (#19060 )	2021-10-11 17:25:09 +08:00
Stephanie Wang	940f84cedb	[core] Remove unused plasma promotion path (#19122 ) * remove unused * lint * lint * lint	2021-10-07 10:55:50 -07:00
Qing Wang	90d2456ec7	[Java] Support userloggers. (#18846 ) Co-authored-by: Kai Yang <kfstorm@outlook.com>	2021-09-26 16:53:06 +08:00
Stephanie Wang	284dee493e	[core][usability] Disambiguate ObjectLostErrors for better understandability (#18292 ) * Define error types, throw error for ObjectReleased * x * Disambiguate OBJECT_UNRECONSTRUCTABLE and OBJECT_LOST * OwnerDiedError * fix test * x * ObjectReconstructionFailed * ObjectReconstructionFailed * x * x * print owner addr * str * doc * rename * x	2021-09-13 16:16:17 -07:00
Guyang Song	3bc5f0501f	fix WaitPlacementGroupReady API (#18464 )	2021-09-13 14:07:40 +08:00
Qing Wang	371f03fa48	Remove dynamic resource from client side. (#18514 )	2021-09-11 10:39:59 -07:00
Qing Wang	d87441cda7	[Java] ConcurrencyGroup in Java local mode. (#18241 ) * WIP * Fix * Fix test * Refine * Fix lint, * WIP2 * WIP2 * Refine * Put a default concurrency group. * Fix submitting task with concurrency group name. * Remove unnecessary changes. * Update java/runtime/src/main/java/io/ray/runtime/task/LocalModeTaskSubmitter.java Co-authored-by: Kai Yang <kfstorm@outlook.com> Co-authored-by: Kai Yang <kfstorm@outlook.com>	2021-09-07 20:43:31 +08:00
Zhi Lin	d3786ac131	Bump Java version to 2.0.0-SNAPSHOT (#15394 ) * bump java version to 2.0.0-SNAPSHOT * update	2021-08-30 12:25:30 +08:00
Lingxuan Zuo	f2a3085ce2	[Metric]Java metric api enhancement (#17811 ) * Java metric api enhancement: make tagkey transparent for upper level users * add java metric tags test * mark Deprecated	2021-08-16 22:38:27 +08:00
Qing Wang	9d5c68ff55	[Java] Better log message when failed to invoke task. (#17737 )	2021-08-13 17:31:58 +08:00
Kai Yang	ab53c5fc93	[Java] Update rolling logging configuration (#17741 )	2021-08-12 10:15:27 +08:00
Qing Wang	6d6a1ea43e	Support reading system configs from native in Java. (#17703 ) * Support reading system configs from native in Java. * Fix lint * Lint cpp * Fix Java cases. * Address comments. * Address comments.	2021-08-12 10:06:01 +08:00
Qing Wang	4cc34588db	[Core] Support ConcurrentGroup part1 (#16795 ) * Core change and Java change. * Fix void call. * Address comments and fix cases. * Fix asyncio	2021-08-07 22:41:33 +08:00
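
The concurrency-group commits listed above (#16795, #18241, #20373) describe named groups with a per-group concurrency limit, plus a `default` group with a concurrency of 1. The following is a minimal plain-Java sketch of that idea, not Ray's implementation: each group is modeled as a fixed-size thread pool. The class `ConcurrencyGroupSketch` and its methods `define`, `submit`, and `peakConcurrency` are hypothetical names introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: models concurrency groups as named thread pools.
// This is NOT how Ray implements them; it only illustrates the semantics.
public class ConcurrencyGroupSketch {
  private final Map<String, ExecutorService> pools = new HashMap<>();

  public ConcurrencyGroupSketch() {
    // Every actor gets a "default" group with a concurrency of 1.
    define("default", 1);
  }

  // Defines (or redefines) a group that runs at most maxConcurrency tasks at once.
  public void define(String name, int maxConcurrency) {
    ExecutorService old = pools.put(name, Executors.newFixedThreadPool(maxConcurrency));
    if (old != null) {
      old.shutdown();
    }
  }

  // Submits a task to the named group; unknown names fall back to "default".
  public <T> Future<T> submit(String group, Callable<T> task) {
    ExecutorService pool = pools.getOrDefault(group, pools.get("default"));
    return pool.submit(task);
  }

  public void shutdown() {
    pools.values().forEach(ExecutorService::shutdown);
  }

  // Runs `tasks` short sleeping tasks in one group and returns the largest
  // number of tasks that were observed running at the same time.
  public static int peakConcurrency(int maxConcurrency, int tasks) {
    ConcurrencyGroupSketch actor = new ConcurrencyGroupSketch();
    actor.define("io", maxConcurrency);
    AtomicInteger running = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<Future<Integer>> futures = new ArrayList<>();
    for (int i = 0; i < tasks; i++) {
      futures.add(actor.submit("io", () -> {
        int now = running.incrementAndGet();
        peak.accumulateAndGet(now, Math::max);
        Thread.sleep(20);
        running.decrementAndGet();
        return now;
      }));
    }
    try {
      for (Future<Integer> f : futures) {
        f.get();
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      actor.shutdown();
    }
    return peak.get();
  }

  public static void main(String[] args) {
    System.out.println("peak with limit 1: " + peakConcurrency(1, 4));
    System.out.println("peak with limit 2: " + peakConcurrency(2, 6));
  }
}
```

With a limit of 1 the tasks run strictly one after another; with a larger limit the observed peak never exceeds the limit, although the exact peak depends on thread scheduling.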

311 commits
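
Commit 454db6902c (#20282) in the list above adds a millisecond timeout to the Java `Ray.get()` API. The same calling pattern can be sketched with the JDK's `CompletableFuture`, used here as a stand-in for an `ObjectRef`. This is an analogy under stated assumptions, not Ray code: the class `TimedGetSketch` and the method `getWithTimeout` are hypothetical, and returning a fallback value on timeout is a choice made for this sketch, since the commit message does not say what Ray does when the timeout expires.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a timed "get" on a future, analogous in shape to
// objRef.get(1000) from the commit message. Not Ray's implementation.
public class TimedGetSketch {

  // Simulates a slow remote function by sleeping before returning 1.
  public static int slowFunction(long delayMs) {
    try {
      Thread.sleep(delayMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return 1;
  }

  // Waits up to timeoutMs for the result; returns `fallback` if it is not ready.
  public static int getWithTimeout(CompletableFuture<Integer> ref, long timeoutMs, int fallback) {
    try {
      return ref.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      return fallback;
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    CompletableFuture<Integer> fast = CompletableFuture.supplyAsync(() -> slowFunction(10));
    CompletableFuture<Integer> slow = CompletableFuture.supplyAsync(() -> slowFunction(2000));
    System.out.println("fast: " + getWithTimeout(fast, 1000, -1));
    System.out.println("slow: " + getWithTimeout(slow, 100, -1));
  }
}
```

A caller that prefers an error over a fallback value could rethrow the `TimeoutException` instead of catching it.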