161 commits

Author SHA1 Message Date
XiaodongLv
63ab063997
Change_notes_in_setting_async_flag_for_python_actor_in_java (#28282)
Signed-off-by: lvxiaodong <lvxiaodong.lxd@antgroup.com>
2022-09-05 00:16:41 +08:00
XiaodongLv
a31be7cef1
[Ray][xlang] Setting async flag for Python actor in Java (#28149)
We need to be able to set the async flag for a Python actor that is created from Java, so this PR adds the API `PyActorCreator setAsync(boolean enabled)` to `PyActorCreator`.
To guard against misuse, the flag is checked before the ActorCreationTask is executed.
2022-09-03 11:09:19 +08:00
Guyang Song
cf2cb66d29
[runtime env][java] Support runtime env config in Java (#28083)
Support job-level and task/actor-level runtime env config, e.g. `setupTimeoutSeconds` and `eagerInstall`.
2022-08-26 08:37:39 +08:00
Guyang Song
06b0e715c7
[runtime env] plugin refactor [7/n]: support runtime env in C++ API (#27010)
Signed-off-by: 久龙 <guyang.sgy@antfin.com>
2022-07-27 18:24:31 +08:00
Guyang Song
419e78180a
[runtime env] plugin refactor[6/n]: java api refactor (#26783) 2022-07-26 09:00:57 +08:00
Tao Wang
bb6c805bd7
[Java worker][Cpp worker]Support Java call Cpp Task (#26182) 2022-07-12 17:49:22 +08:00
Qing Wang
2d4663d0cd
[Java] Support getCurrentNodeId API for RuntimeContext (#26147)
Add an API to get the ID of the node that this worker is running on. Usage:
```java
UniqueId currNodeId = Ray.getRuntimeContext().getCurrentNodeId();
```
This API is required by Ray Serve.
2022-06-30 16:19:32 +08:00
Qing Wang
cb77209ce1
[Java] Allow to specify zero CPU as resource. (#26148)
This is aligned to the behavior of Python resources validation.
2022-06-29 22:53:00 +08:00
Tao Wang
49cafc6323
[Cpp worker][Java worker]Support Java call Cpp Actor (#25933) 2022-06-29 14:33:32 +08:00
Qing Wang
8884bfb445
[Java] Support starting named actors in different namespace. (#25995)
Allow starting a named actor in a namespace other than the driver's namespace.
Usage:
```java
// The driver runs in namespace "a".
System.setProperty("ray.job.namespace", "a");
Ray.init();
// The named actor "myActor" starts in namespace "b".
ActorHandle<A> a = Ray.actor(A::new).setName("myActor", "b").remote();
```

Co-authored-by: Hao Chen <chenh1024@gmail.com>
2022-06-28 13:49:15 +08:00
Qing Wang
af418fb729
[Java][API CHANGE] Move exception to api module. (#24540)
This PR moves all exception classes from the runtime module to the api module, to eliminate confusion about Ray exceptions. After this PR, Ray users no longer need to touch the runtime module when programming against the API.

Note that this should be merged into 2.0.
2022-05-19 10:18:20 +08:00
Qing Wang
cc621ff08a
[Java][API CHANGE] Rename mode SINGLE_PROCESS to LOCAL (#24714)
To align with the key concept of local mode, this PR renames SINGLE_PROCESS to LOCAL.
2022-05-19 10:17:24 +08:00
Qing Wang
eb29895dbb
[Core] Remove multiple core workers in one process 1/n. (#24147)
This is the first PR to remove the code path for multiple core workers in one process. It removes the flags and APIs related to `num_workers`.
Once this PR is merged, we no longer need to consider multiple core workers.

Follow-up PRs will cover the deeper logic refactoring, such as eliminating the gap between core worker and core worker process, and removing the logic related to multiple workers from the worker pool, GCS, etc.

**BREAKING CHANGE**
This PR removes these APIs:
- Ray.wrapRunnable();
- Ray.wrapCallable();
- Ray.setAsyncContext();
- Ray.getAsyncContext();

And the following APIs can no longer be invoked from a user-created thread in local mode:
- Ray.getRuntimeContext().getCurrentActorId();
- Ray.getRuntimeContext().getCurrentTaskId();

Note that this PR shouldn't be merged to 1.x.
2022-05-19 00:36:22 +08:00
Qing Wang
259661042c
[runtime env] [java] Support jars in runtime env for Java (#24170)
This PR supports setting the jars for an actor in Ray API. The API looks like:
```java
class A {
    public boolean findClass(String className) {
      try {
        Class.forName(className);
      } catch (ClassNotFoundException e) {
        return false;
      }
      return true;
    }
}

RuntimeEnv runtimeEnv = new RuntimeEnv.Builder()
    .addJars(ImmutableList.of("https://github.com/ray-project/test_packages/raw/main/raw_resources/java-1.0-SNAPSHOT.jar"))
    .build();
ActorHandle<A> actor1 = Ray.actor(A::new).setRuntimeEnv(runtimeEnv).remote();
boolean ret = actor1.task(A::findClass, "io.testpackages.Foo").remote().get();
System.out.println(ret); // true
```
2022-05-12 09:34:40 +08:00
Qing Wang
c5252c5ceb
[Java] Support parallel actor in experimental. (#21701)
This PR proposes supporting the parallel actor concept in Java. The purpose is to provide an alternative way to run multiple actor instances in one Java worker process; the eventual goal is to remove the original implementation of multiple worker instances in one worker process. With this feature, users can define a set of homogeneous parallel execution instances in an actor, where each instance has one thread as its execution backend.

### Introduction

In the following example, we define a parallel actor with a parallelism of 10. The backend actor has 10 concurrency groups for the parallel executions, which also means there are 10 threads for them.

We can access the instance by the instance handle, like:
```java
ParallelActorHandle<A> actor = ParallelActor.actor(A::new).setParallelism(10).remote();
ParallelInstance<A> instance = actor.getInstance(/*index=*/ 2);
Preconditions.checkNotNull(instance);
Ray.get(instance.task(A::incr, 1000000).remote()); // returns 1000000

instance = actor.getInstance(/*index=*/ 2);
Preconditions.checkNotNull(instance);
Ray.get(instance.task(A::incr, 2000000).remote()); // returns 3000000

instance = actor.getInstance(/*index=*/ 3);
Preconditions.checkNotNull(instance);
Ray.get(instance.task(A::incr, 2000000).remote()); // returns 2000000
```
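
The threading model described above can be sketched in plain Java without Ray. This is only an illustration under stated assumptions: `ParallelInstancesSketch` and `Counter` are hypothetical names that are not part of the Ray API, and each instance is simply given its own single-thread executor to mirror "one thread per instance".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stateful class, standing in for the actor class `A`.
final class Counter {
  private int value;

  int incr(int delta) {
    value += delta;
    return value;
  }
}

// Hypothetical model: N homogeneous instances, each backed by exactly one thread.
final class ParallelInstancesSketch<A> implements AutoCloseable {
  private final List<A> instances = new ArrayList<>();
  private final List<ExecutorService> threads = new ArrayList<>();

  ParallelInstancesSketch(Supplier<A> factory, int parallelism) {
    for (int i = 0; i < parallelism; i++) {
      instances.add(factory.get());
      threads.add(Executors.newSingleThreadExecutor());
    }
  }

  /** Runs `call` on the instance at `index`, on that instance's own thread, and waits. */
  <R> R call(int index, Function<A, R> call) {
    A target = instances.get(index);
    Callable<R> task = () -> call.apply(target);
    try {
      return threads.get(index).submit(task).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }

  @Override
  public void close() {
    threads.forEach(ExecutorService::shutdown);
  }
}
```

Because state lives in the instance and not in the handle, two calls to index 2 accumulate, while index 3 starts from zero.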


### Limitation
- It doesn't support concurrency group on a parallel actor yet.

Co-authored-by: Kai Yang <kfstorm@outlook.com>
2022-04-21 22:54:33 +08:00
Qing Wang
77b0015ea0
[Java] Add NO_RESTART and INFINITE_RESTART constants. (#23771) 2022-04-12 10:40:44 +08:00
Qing Wang
e0ea7567c4
Add getJobId API for ActorId (#23770) 2022-04-08 11:30:53 +08:00
Larry
d0b324990f
[Java] Add doc for Ray.get api that throws an exception if it times out (#23666)
Add doc for Ray.get api that throws an exception if it times out

![image](https://user-images.githubusercontent.com/11072802/161364231-4337124d-3141-4334-879c-f88cecc0d818.png)

Co-authored-by: 稚鱼 <lianjunwen.ljw@antgroup.com>
2022-04-02 18:29:19 +08:00
Qing Wang
ef5b9b87d3
[Java] Add set runtime env api for normal task. (#23412)
This PR adds the `setRuntimeEnv` API for submitting a normal task. Usage:
```java
RuntimeEnv runtimeEnv =
    new RuntimeEnv.Builder()
        .addEnvVar("KEY1", "A")
        .build();

/// Return `A`
Ray.task(RuntimeEnvTest::getEnvVar, "KEY1").setRuntimeEnv(runtimeEnv).remote().get();
```
2022-03-24 15:57:24 +08:00
Larry
81dcf9ff35
[Placement Group] Make PlacementGroupID generate from JobID (#23175) 2022-03-21 17:09:16 +08:00
Qing Wang
9aa0b4e89e
[Java] Add transient for cached hashcode of IDs to reduce serialized size. (#22766)
Mark the cached hash code of IDs with the `transient` keyword to reduce the serialized size of IDs sent between processes.
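
To illustrate the effect, here is a minimal sketch using standard Java serialization. The classes `PlainHashId` and `TransientId` are hypothetical and are not Ray's ID classes; the sketch only assumes that an ID wraps a byte array and caches its hash code lazily.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical ID type: the cached hash is an ordinary field, so it is serialized.
final class PlainHashId implements Serializable {
  private static final long serialVersionUID = 1L;
  private final byte[] bytes;
  private int cachedHash;

  PlainHashId(byte[] bytes) { this.bytes = bytes.clone(); }

  @Override
  public int hashCode() {
    if (cachedHash == 0) cachedHash = Arrays.hashCode(bytes);
    return cachedHash;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof PlainHashId && Arrays.equals(bytes, ((PlainHashId) o).bytes);
  }
}

// Same type, but the cache is transient: it is skipped when the object is written
// and lazily recomputed after deserialization.
final class TransientId implements Serializable {
  private static final long serialVersionUID = 1L;
  private final byte[] bytes;
  private transient int cachedHash;

  TransientId(byte[] bytes) { this.bytes = bytes.clone(); }

  @Override
  public int hashCode() {
    if (cachedHash == 0) cachedHash = Arrays.hashCode(bytes);
    return cachedHash;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof TransientId && Arrays.equals(bytes, ((TransientId) o).bytes);
  }
}

final class SerializedSizeDemo {
  /** Returns the number of bytes produced by default Java serialization. */
  static int serializedSize(Serializable value) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buffer.size();
  }

  public static void main(String[] args) {
    byte[] raw = {1, 2, 3, 4};
    System.out.println("plain:     " + serializedSize(new PlainHashId(raw)));
    System.out.println("transient: " + serializedSize(new TransientId(raw)));
  }
}
```

The transient variant drops both the field descriptor and the 4-byte value from the stream, at the cost of recomputing the hash once per deserialized object.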
2022-03-08 14:36:08 +08:00
Qing Wang
9572bb717f
[RuntimeEnv] Support setting actor level env vars for Java worker (#22240)
This PR supports setting actor-level env vars for the Java worker in the runtime env.
The general API looks like:
```java
RuntimeEnv runtimeEnv = new RuntimeEnv.Builder()
    .addEnvVar("KEY1", "A")
    .addEnvVar("KEY2", "B")
    .addEnvVar("KEY1", "C")  // This overwrites "KEY1" to "C"
    .build();

ActorHandle<A> actor1 = Ray.actor(A::new).setRuntimeEnv(runtimeEnv).remote();
```

If `num-java-workers-per-process` > 1, a worker process is never reused unless the workers have the same runtime env.

Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2022-02-28 10:58:37 +08:00
Qing Wang
a37d9a2ec2
[Core] Support default actor lifetime. (#21283)
Support specifying a default lifetime for actors that are created without an explicit lifetime. This is a job-level configuration item.
#### API Change
The Python API looks like:
```python
  ray.init(job_config=JobConfig(default_actor_lifetime="detached"))
```

Java API looks like:
```java
  System.setProperty("ray.job.default-actor-lifetime", defaultActorLifetime.name());
  Ray.init();
```

One example usage is:
```python
  ray.init(job_config=JobConfig(default_actor_lifetime="detached"))
  a1 = A.options(lifetime="non_detached").remote()   # a1 is a non-detached actor.
  a2 = A.remote()  # a2 is a detached actor: it takes the job-level default.
```
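
The resolution rule implied by this description (an explicit per-actor lifetime wins, otherwise the job-level default applies) can be sketched as follows. The names `Lifetime` and `LifetimeResolver` are hypothetical and only model the rule; they are not the Ray implementation.

```java
import java.util.Optional;

// Hypothetical enum mirroring the two lifetimes mentioned in this commit.
enum Lifetime {
  DETACHED,
  NON_DETACHED
}

final class LifetimeResolver {
  /**
   * Returns the lifetime an actor ends up with.
   *
   * @param perActor the lifetime given at actor creation, if any
   * @param jobDefault the job-level default lifetime
   */
  static Lifetime resolve(Optional<Lifetime> perActor, Lifetime jobDefault) {
    return perActor.orElse(jobDefault);
  }

  public static void main(String[] args) {
    Lifetime jobDefault = Lifetime.DETACHED;
    // Explicit option overrides the job default.
    System.out.println(resolve(Optional.of(Lifetime.NON_DETACHED), jobDefault));
    // No explicit option: the job default is used.
    System.out.println(resolve(Optional.empty(), jobDefault));
  }
}
```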

Co-authored-by: Kai Yang <kfstorm@outlook.com>
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2022-01-22 12:26:08 +08:00
Rong Ma
f54282147c
[PlacementGroup] Support using any available bundle in java api (#21496)
In Python or C++, we can specify the bundle index as -1 to use any available bundle in the placement group. We should also enable this in Java to keep the API consistent across all languages.
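
As a rough illustration of the "-1 means any available bundle" convention, the sketch below picks a bundle from a list of availability flags. `BundleSelector` and `ANY_BUNDLE` are hypothetical names, and picking the first free bundle is an assumption of this sketch, not a description of how Ray's scheduler chooses.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

final class BundleSelector {
  // Hypothetical constant for the "any available bundle" index.
  static final int ANY_BUNDLE = -1;

  /**
   * Returns the index of the bundle to use, or empty if no suitable bundle is free.
   *
   * @param available availability flag per bundle, indexed by bundle index
   * @param requestedIndex a concrete bundle index, or ANY_BUNDLE
   */
  static OptionalInt select(List<Boolean> available, int requestedIndex) {
    if (requestedIndex == ANY_BUNDLE) {
      // Assumption: take the first free bundle.
      for (int i = 0; i < available.size(); i++) {
        if (available.get(i)) {
          return OptionalInt.of(i);
        }
      }
      return OptionalInt.empty();
    }
    if (requestedIndex < 0 || requestedIndex >= available.size()) {
      throw new IllegalArgumentException("Bundle index out of range: " + requestedIndex);
    }
    return available.get(requestedIndex) ? OptionalInt.of(requestedIndex) : OptionalInt.empty();
  }

  public static void main(String[] args) {
    List<Boolean> available = Arrays.asList(false, true, true);
    System.out.println(select(available, ANY_BUNDLE));
    System.out.println(select(available, 0));
  }
}
```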
2022-01-18 01:58:02 +08:00
Qing Wang
6f82bff7ff
[Java] Change ActorLifetime API: DEFAULT -> NON_DETACHED (#21639)
This PR renames the enum value `ActorLifetime.DEFAULT` to `ActorLifetime.NON_DETACHED`. `ActorLifetime` was not included in any release version up to and including 1.9.2.

Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2022-01-17 18:10:12 +08:00
Qing Wang
2c3be852ab
[Java] Support defining ConcurrencyGroup statically in Java. (#20373)
This PR introduces APIs for defining concurrency groups statically in Java.
We introduce 2 APIs:
1. The `@DefConcurrencyGroup` annotation on an actor class, to define a concurrency group statically.
2. The `@UseConcurrencyGroup` annotation on an actor method, to select the concurrency group that the method runs in.

Examples are below:

```java
@DefConcurrencyGroup(name = "io", maxConcurrency = 2)
@DefConcurrencyGroup(name = "compute", maxConcurrency = 4)
private static class MyActor {
  // The return values below are placeholders.
  @UseConcurrencyGroup(name = "io")
  public long f1() { return 1L; }

  @UseConcurrencyGroup(name = "io")
  public long f2() { return 2L; }

  @UseConcurrencyGroup(name = "compute")
  public long f3(int a, int b) { return a + b; }

  @UseConcurrencyGroup(name = "compute")
  public long f4() { return 4L; }
}

ActorHandle<MyActor> myActor = Ray.actor(MyActor::new).remote();
myActor.task(MyActor::f1).remote();
myActor.task(MyActor::f2).remote();
myActor.task(MyActor::f3, 3, 5).remote();
myActor.task(MyActor::f4).remote();
```
`MyActor` has 3 concurrency groups: `io` with a max concurrency of 2, `compute` with a max concurrency of 4, and `default` with a max concurrency of 1.
f1 and f2 will be executed in `io`; f3 and f4 will be executed in `compute`.
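
One way to picture the execution model is a separate bounded thread pool per group. The sketch below is plain Java and does not use the annotations or any Ray API; `ConcurrencyGroupSketch` is a hypothetical name, and mapping each group to a fixed thread pool is an assumption made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ConcurrencyGroupSketch implements AutoCloseable {
  static final String DEFAULT_GROUP = "default";
  private final Map<String, ExecutorService> pools = new LinkedHashMap<>();

  /** Creates one pool per named group, plus a `default` group with one thread. */
  ConcurrencyGroupSketch(Map<String, Integer> maxConcurrencyByGroup) {
    pools.put(DEFAULT_GROUP, Executors.newFixedThreadPool(1));
    maxConcurrencyByGroup.forEach(
        (name, max) -> pools.put(name, Executors.newFixedThreadPool(max)));
  }

  /** Submits a call to the pool of the given group. */
  <T> Future<T> submit(String group, Callable<T> call) {
    ExecutorService pool = pools.get(group);
    if (pool == null) {
      throw new IllegalArgumentException("Unknown concurrency group: " + group);
    }
    return pool.submit(call);
  }

  int groupCount() {
    return pools.size();
  }

  /** Waits for a result, converting checked exceptions to unchecked ones. */
  static <T> T await(Future<T> future) {
    try {
      return future.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }

  @Override
  public void close() {
    pools.values().forEach(ExecutorService::shutdown);
  }
}
```

With groups `io` (2) and `compute` (4), at most two `io` calls and four `compute` calls run at the same time, and a slow `io` call never blocks a `compute` call.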
2022-01-17 16:23:10 +08:00
Qing Wang
2df27a5f87
[Java] Support ActorLifetime (#21074)
We add an enum class `ActorLifetime` to indicate the lifetime of an actor. This PR also adds the API needed to create an actor with a specified lifetime.
Currently, it has 2 values: detached and default.
2021-12-23 19:48:56 +08:00
Qing Wang
bd502e8bd5
[Java] Remove out of date comment. (#21073)
The semantics of the `setName` API have changed, but its comment was out of date. This PR fixes it.
2021-12-20 20:07:59 +08:00
WanXing Wang
72bd2d7e09
[Core] Support back pressure for actor tasks. (#20894)
Resubmit the PR https://github.com/ray-project/ray/pull/19936

I've figured out that the test case `//rllib:tests/test_gpus::test_gpus_in_local_mode` failed due to a deadlock in local mode.
In local mode, if user code submits another task while the current task is executing, `CoreWorker::actor_task_mutex_` may cause a deadlock.
The solution is simple: release the lock before executing the task in local mode.

In the commit 7c2f61c76c:
1. Release the lock in local mode to fix the bug. @scv119
2. Add `test_local_mode_deadlock` to cover this case. @rkooo567
3. Leave a trivial change in `rllib/tests/test_gpus.py` so that `RAY_CI_RLLIB_DIRECTLY_AFFECTED` takes effect.
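
The shape of the fix can be sketched in Java, although the real change is in the C++ core worker. Everything here is hypothetical: `LocalModeSubmitter` is not Ray code, and the sketch assumes a non-reentrant lock (modeled with a one-permit `Semaphore`) and inline task execution on the caller's thread.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class LocalModeSubmitter {
  // A semaphore with a single permit behaves like a non-reentrant mutex:
  // acquiring it twice on the same thread blocks forever.
  private final Semaphore submitLock = new Semaphore(1);
  private int submitted;

  /** Local mode: the task runs inline on the caller's thread. */
  <T> T submit(Supplier<T> task) {
    submitLock.acquireUninterruptibly();
    try {
      submitted++; // bookkeeping that needs the lock
    } finally {
      submitLock.release(); // release BEFORE running user code
    }
    // User code may call submit() again here. Had the lock still been held,
    // that nested call would block on acquire and never return.
    return task.get();
  }

  int submittedCount() {
    submitLock.acquireUninterruptibly();
    try {
      return submitted;
    } finally {
      submitLock.release();
    }
  }

  public static void main(String[] args) {
    LocalModeSubmitter submitter = new LocalModeSubmitter();
    int result = submitter.submit(() -> submitter.submit(() -> 20) + 1);
    System.out.println(result + " after " + submitter.submittedCount() + " submissions");
  }
}
```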
2021-12-13 23:56:07 -08:00
Kai Fricke
d4413299c0
Revert "[Core] Support back pressure for actor tasks (#19936)" (#20880)
This reverts commit a4495941c2.
2021-12-03 17:48:47 -08:00
WanXing Wang
a4495941c2
[Core] Support back pressure for actor tasks (#19936)
Support back pressure in core worker.
Job config added for python worker and java worker.
2021-12-02 14:41:30 -08:00
Larry
454db6902c
[Java] Add timeout parameter for Ray.get() API (#20282)
Why are these changes needed?

Add a timeout (in milliseconds) parameter to the Java `Ray.get` API. The docs have been updated to cover the API change ([Ray Core Walkthrough] -> [Fetching Results]).

E.g.:
```java
ObjectRef<Integer> objRef = Ray.put(1);
objRef.get(1000);
Ray.get(Ray.task(MyRayApp::slowFunction).remote(), 3000);
```
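
For readers unfamiliar with timed gets, the behavior is analogous to a bounded wait on a standard Java future. The sketch below uses `CompletableFuture` in place of `ObjectRef`; `TimedGetSketch` is a hypothetical name, and the exception type thrown on timeout is an assumption of this sketch, not the type Ray throws.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedGetSketch {
  /**
   * Waits up to timeoutMs milliseconds for the value.
   * Throws IllegalStateException if the value is not ready in time.
   */
  static <T> T get(CompletableFuture<T> ref, long timeoutMs) {
    try {
      return ref.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      throw new IllegalStateException("Timed out after " + timeoutMs + " ms", e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
  }

  public static void main(String[] args) {
    // A value that is already available returns immediately.
    System.out.println(get(CompletableFuture.completedFuture(1), 1000));
    // A value that never arrives triggers the timeout path.
    try {
      get(new CompletableFuture<Integer>(), 50);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```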

Related issue number
#20247
2021-11-17 11:02:17 +08:00
Qing Wang
1172195571
[Java] Remove global named actor and global pg (#20135)
This PR removes the global named actor and global placement group (PG) APIs.

I believe these APIs are not widely used in OSS.
The C++ part is not included in this PR.
@kfstorm @clay4444 @raulchen Please take a look and check whether this change is reasonable.


IMPORTANT NOTE: This is a Java API change and breaks backward compatibility for Java code that uses global named actors or global PGs.

INCLUDES:

- Remove setGlobalName() and getGlobalActor() APIs.
- Remove getGlobalPlacementGroup() and setGlobalPG
- Add getActor(name, namespace) API
- Add getPlacementGroup(name, namespace) API
- Update doc pages.
2021-11-15 16:28:53 +08:00
Qing Wang
7500f7d88a
Remove deprecated Java PG APIs. (#20219)
These APIs have been deprecated for more than 7 months and more than 4 versions, so it is time to remove them.
2021-11-12 09:29:48 +08:00
Qing Wang
6d8a7291ab
Add getNamespace API for Java worker (#20057)
[Java API] Add getNamespace API for Java worker.
2021-11-08 15:51:14 +08:00
Jiajun Yao
805ce453dd
[Java] Remove auto-generated pom.xml files. (#19475) 2021-10-19 17:35:37 +08:00
hazeone
c2f0035fd2
[Java]Support getGpuIds API (#19031)
Add the Java `getGpuIds()` API, which is the equivalent of `get_gpu_ids` in Python. It returns the device ID when a GPU has been allocated to the worker.
2021-10-13 23:40:26 +08:00
Qing Wang
3ad1553b34
[Java] Remove API setJvmOptions(String). (#18664) 2021-09-22 20:00:49 +08:00
Guyang Song
3bc5f0501f
fix WaitPlacementGroupReady API (#18464) 2021-09-13 14:07:40 +08:00
Qing Wang
371f03fa48
Remove dynamic resource from client side. (#18514) 2021-09-11 10:39:59 -07:00
Zhi Lin
d3786ac131
Bump Java version to 2.0.0-SNAPSHOT (#15394)
* bump java version to 2.0.0-SNAPSHOT

* update
2021-08-30 12:25:30 +08:00
Tao Wang
0b5f5890f7
[Named Actor] Throw RayException when getting named actor timed out (#17998)
* [Named Actor]throw RayException when getting named actor timed out

* lint

* correct the message

* lint

* nice catch
2021-08-25 13:50:53 +08:00
Qing Wang
4cc34588db
[Core] Support ConcurrentGroup part1 (#16795)
* Core change and Java change.

* Fix void call.

* Address comments and fix cases.

* Fix asyncio
2021-08-07 22:41:33 +08:00
Zhi Lin
82123123c4
[object store] Java API for Assign the object owner in Ray.put() (#17237)
Co-authored-by: Qing Wang <kingchin1218@126.com>
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-08-06 15:26:59 +08:00
Qing Wang
4bde71ca86
[Java][Core] Support get current actor handle. (#14900) 2021-07-12 15:27:54 -07:00
Qing Wang
89b07572da
[Java] Upgrade log4j (#16657) 2021-06-24 21:01:27 -07:00
Tao Wang
2523072a3d
[large scale]Use gcs client instead of redis client to increase job id (#16190)
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2021-06-17 15:01:32 +08:00
Kai Yang
6278df8604
[Java] refine generation of jvm options (#14931) 2021-03-31 21:04:52 +08:00
DK.Pino
ef59c145e2
[Java][Placement Group] Move related API of Placement Group from Ray to PlacementGroups. (#14729) 2021-03-23 12:34:12 +08:00
Kai Yang
f60bd3afee
[Java] some small improvements (#14565) 2021-03-12 12:26:55 +08:00