[Serve] Java documentation (#26321)

liuyang-my 2022-08-13 00:07:12 +08:00 committed by GitHub
parent 0badbb8b1e
commit 6b886d394c
14 changed files with 696 additions and 0 deletions

@ -182,6 +182,7 @@ parts:
- file: serve/performance
- file: serve/dev-workflow
- file: serve/production
- file: serve/managing-java-deployments
- file: serve/migration
- file: serve/architecture
- file: serve/tutorials/index

@ -0,0 +1,134 @@
# Managing Java Deployments
Java is one of the mainstream programming languages for production services. Ray Serve natively supports a Java API for creating, updating, and managing deployments. You can create Ray Serve deployments in Java and call them from Python, or vice versa.
This section helps you to:
- create, query, update, and configure Java deployments
- configure resources of your Java deployments
- manage Python deployments using the Java API
```{contents}
```
## Creating a Deployment
By passing the fully qualified class name to `setDeploymentDef` on the builder returned by `Serve.deployment()`, as shown in the code below, we can create and deploy a deployment of that class.
```{literalinclude} ../../../java/serve/src/test/java/io/ray/serve/docdemo/ManageDeployment.java
:end-before: docs-create-end
:language: java
:start-after: docs-create-start
```
## Accessing a Deployment
Once a deployment is deployed, you can fetch its instance by name.
```{literalinclude} ../../../java/serve/src/test/java/io/ray/serve/docdemo/ManageDeployment.java
:end-before: docs-query-end
:language: java
:start-after: docs-query-start
```
## Updating a Deployment
We can update the code and the configuration of a deployment and redeploy it. The following example updates the initial value of the deployment 'counter' to 2.
```{literalinclude} ../../../java/serve/src/test/java/io/ray/serve/docdemo/ManageDeployment.java
:end-before: docs-update-end
:language: java
:start-after: docs-update-start
```
## Configuring a Deployment
Serve supports a couple of deployment configuration options:
- scaling out by increasing the number of deployment replicas
- assigning resources such as CPUs and GPUs
The next two sections describe how to configure your deployments.
### Scaling Out
By specifying the `numReplicas` parameter, you can change the number of deployment replicas:
```{literalinclude} ../../../java/serve/src/test/java/io/ray/serve/docdemo/ManageDeployment.java
:end-before: docs-scale-end
:language: java
:start-after: docs-scale-start
```
### Resource Management (CPUs, GPUs)
Through the `rayActorOptions` parameter, you can set the resources of a deployment, for example to use one GPU:
```{literalinclude} ../../../java/serve/src/test/java/io/ray/serve/docdemo/ManageDeployment.java
:end-before: docs-resource-end
:language: java
:start-after: docs-resource-start
```
## Managing a Python Deployment
A Python deployment can also be managed and called through the Java API. Suppose we have a Python file `counter.py` in the directory `/path/to/code/`:
```python
from ray import serve


@serve.deployment
class Counter(object):
    def __init__(self, value):
        self.value = int(value)

    def increase(self, delta):
        self.value += int(delta)
        return str(self.value)
```
We deploy it from Java and call it through a `RayServeHandle`:
```java
import io.ray.api.Ray;
import io.ray.serve.api.Serve;
import io.ray.serve.deployment.Deployment;
import io.ray.serve.generated.DeploymentLanguage;
import java.io.File;

public class ManagePythonDeployment {

  public static void main(String[] args) {

    System.setProperty(
        "ray.job.code-search-path",
        System.getProperty("java.class.path") + File.pathSeparator + "/path/to/code/");

    Serve.start(true, false, null);

    Deployment deployment =
        Serve.deployment()
            .setDeploymentLanguage(DeploymentLanguage.PYTHON)
            .setName("counter")
            .setDeploymentDef("counter.Counter")
            .setNumReplicas(1)
            .setInitArgs(new Object[] {"1"})
            .create();
    deployment.deploy(true);

    System.out.println(Ray.get(deployment.getHandle().method("increase").remote("2")));
  }
}
```
> NOTE: Before calling `Ray.init` or `Serve.start`, we need to set the directory in which Ray looks for the Python code. For details, please refer to [Cross-Language Programming](cross_language).
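The value of `ray.job.code-search-path` in the example above is simply the Java class path joined with the Python code directory by the platform path separator. A minimal sketch of that composition, using only the JDK, is shown below. The class `CodeSearchPathSketch` and the helper `withPythonDir` are hypothetical names introduced for illustration, and `/path/to/code/` is the placeholder directory from this section.

```java
import java.io.File;

// Hypothetical helper, not part of the Ray API.
class CodeSearchPathSketch {

  // Property key taken from the example in this section.
  static final String KEY = "ray.job.code-search-path";

  // Joins the Java class path and the Python code directory with the
  // platform-specific separator (":" on Linux and macOS, ";" on Windows).
  static String withPythonDir(String classPath, String pythonCodeDir) {
    return classPath + File.pathSeparator + pythonCodeDir;
  }

  public static void main(String[] args) {
    String value = withPythonDir(System.getProperty("java.class.path"), "/path/to/code/");
    // Must happen before Ray.init or Serve.start, as the note above says.
    System.setProperty(KEY, value);
    System.out.println(System.getProperty(KEY));
  }
}
```

Keeping the composition in one helper makes it easy to check the resulting path before the Serve runtime starts.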
## Future Roadmap
In the future, we will provide more features for Ray Serve Java, such as:
- an improved API to match the Python version
- HTTP ingress support
- support for bringing your own Java Spring project as a deployment

@ -13,6 +13,7 @@ serve-ml-models
batch
rllib
gradio
java
```
Other Topics:

@ -0,0 +1,145 @@
(serve-java-tutorial)=
# Java Tutorial
To use Ray Serve from Java, add the following dependency to your `pom.xml`.
```xml
<dependency>
  <groupId>io.ray</groupId>
  <artifactId>ray-serve</artifactId>
  <version>${ray.version}</version>
  <scope>provided</scope>
</dependency>
```
> NOTE: After you install Ray via Python, the Ray Serve Java jar is included locally. The `provided` scope ensures that Java code using Ray Serve compiles without causing version conflicts when it is deployed on the cluster.
## Example Model
Our example use case is derived from the production workflow of a financial application. The application needs to compute the best strategy for interacting with different banks for a single task.
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/Strategy.java
:end-before: docs-strategy-end
:language: java
:start-after: docs-strategy-start
```
This `Strategy` class is used to calculate the indicators of a number of banks.
- The `calc` method is the entry point of the calculation. Its input parameters are the time interval of the calculation and a map from each bank to its indicator lists. As we can see, the `calc` method contains a two-level `for` loop that traverses each indicator list of each bank and calls the `calcBankIndicators` method to calculate the indicators of the specified bank.
- The `calcBankIndicators` method contains another `for` loop, which traverses each indicator and calls the `calcIndicator` method to calculate that indicator for the bank.
- The `calcIndicator` method holds the actual calculation logic, based on the bank, the specified time interval, and the indicator.
This is the code that uses the `Strategy` class:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyCalc.java
:end-before: docs-strategy-calc-end
:language: java
:start-after: docs-strategy-calc-start
```
As the number of banks and indicators grows, the three-level nested `for` loop slows the calculation down. Even if a thread pool is used to calculate the indicators in parallel, we may hit the performance limit of a single machine. Moreover, this `Strategy` object cannot be reused as a resident service.
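For reference, the single-machine thread-pool approach mentioned above might look like the following sketch. The class name `ThreadPoolStrategySketch`, the `threads` parameter, and the placeholder body of `calcIndicator` are assumptions made for illustration; only JDK classes are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical single-machine variant of Strategy; not part of Ray Serve.
class ThreadPoolStrategySketch {

  // Placeholder for the real per-indicator calculation.
  static String calcIndicator(long time, String bank, String indicator) {
    return bank + "-" + indicator + "-" + time;
  }

  // Submits one task per (bank, indicator) pair, then collects the results
  // in submission order.
  static List<String> calc(
      long time, Map<String, List<List<String>>> banksAndIndicators, int threads)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (Map.Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
        String bank = e.getKey();
        for (List<String> indicators : e.getValue()) {
          for (String indicator : indicators) {
            futures.add(pool.submit(() -> calcIndicator(time, bank, indicator)));
          }
        }
      }
      List<String> results = new ArrayList<>();
      for (Future<String> future : futures) {
        results.add(future.get()); // blocks until this task has finished
      }
      return results;
    } finally {
      pool.shutdown(); // release the worker threads so the JVM can exit
    }
  }

  public static void main(String[] args) throws Exception {
    Map<String, List<List<String>>> banks = new LinkedHashMap<>();
    banks.put("demo_bank_1", Arrays.asList(Arrays.asList("demo_indicator_1", "demo_indicator_2")));
    System.out.println(calc(System.currentTimeMillis(), banks, 4));
  }
}
```

All tasks in this sketch still share the CPU and memory of one JVM, which is the limit that the rest of this tutorial works around.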
## Converting to a Ray Serve Deployment
Through Ray Serve, the core computing logic of `Strategy` can be deployed as a scalable distributed computing service.
First, we can extract the indicator calculation of each bank into a separate `StrategyOnRayServe` class:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyOnRayServe.java
:end-before: docs-strategy-end
:language: java
:start-after: docs-strategy-start
```
Next, we start the Ray Serve runtime and deploy `StrategyOnRayServe` as a deployment.
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyCalcOnRayServe.java
:end-before: docs-deploy-end
:language: java
:start-after: docs-deploy-start
```
The `create` call builds a `Deployment` object named "strategy". After `Deployment.deploy` is executed, the "strategy" deployment runs in the Ray Serve instance with four replicas, and we can access it for distributed parallel computing.
## Testing the Ray Serve Deployment
Now we can test the "strategy" deployment using `RayServeHandle` inside Ray:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyCalcOnRayServe.java
:end-before: docs-calc-end
:language: java
:start-after: docs-calc-start
```
At present, each indicator of each bank is still calculated serially: every call is sent to Ray, and the caller waits for its result before sending the next one. We can make the calculation concurrent, which improves efficiency and removes the single-machine bottleneck.
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyCalcOnRayServe.java
:end-before: docs-parallel-calc-end
:language: java
:start-after: docs-parallel-calc-start
```
Now, we can use `StrategyCalcOnRayServe` like the example in the `main` method:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/StrategyCalcOnRayServe.java
:end-before: docs-main-end
:language: java
:start-after: docs-main-start
```
## Calling Ray Serve Deployment with HTTP
Another way to test or call a deployment is through HTTP requests. Java deployments currently have two limitations:
- HTTP requests can only be processed by the `call` method of the user class.
- The `call` method can take only one input parameter, and both the input parameter and the return value must be of type `String`.
If we want to call the "strategy" deployment via HTTP, the class can be rewritten like this:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/HttpStrategyOnRayServe.java
:end-before: docs-strategy-end
:language: java
:start-after: docs-strategy-start
```
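One detail is worth checking when the JSON payload is unpacked into a `Map<String, Object>`: by default Gson represents JSON numbers in such a map as `Double`, so reading the value through `Number` is safer than casting it to `long` directly. The sketch below shows the difference with JDK classes only. The class `NumberCastSketch` and its methods are hypothetical, and the hand-built map is an assumed stand-in for what Gson would produce.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; the map stands in for a Gson-parsed payload.
class NumberCastSketch {

  // Works for any numeric box type (Double, Long, Integer, ...).
  static long readTime(Map<String, Object> data) {
    return ((Number) data.get("time")).longValue();
  }

  // Returns true if a direct cast to long is rejected at run time.
  static boolean directCastFails(Map<String, Object> data) {
    try {
      long ignored = (long) data.get("time"); // implies a cast to Long
      return false;
    } catch (ClassCastException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, Object> data = new HashMap<>();
    data.put("time", Double.valueOf(1641038674d)); // numbers arrive as Double
    System.out.println(readTime(data));
    System.out.println(directCastFails(data));
  }
}
```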
After deploying this deployment, we can access it with the `curl` command:
```shell
curl -d '{"time":1641038674, "bank":"test_bank", "indicator":"test_indicator"}' http://127.0.0.1:8000/strategy
```
It can also be accessed using an HTTP client in Java code:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/HttpStrategyCalcOnRayServe.java
:end-before: docs-http-end
:language: java
:start-after: docs-http-start
```
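If adding the HttpClient5 fluent dependency is not an option, roughly the same request can be built with the JDK's built-in `java.net.http` client (Java 11 or later). This is a sketch and not part of the Ray Serve examples: the class `JdkHttpCalcSketch` and its helpers are hypothetical, the endpoint is copied from the `curl` command above, and the hand-written JSON is only safe for values that contain no quotes or backslashes.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical alternative to the fluent HttpClient5 call; JDK 11+ only.
class JdkHttpCalcSketch {

  // Assumed endpoint, copied from the curl example in this section.
  static final String ENDPOINT = "http://127.0.0.1:8000/strategy";

  // Builds the JSON body by hand; values must not contain quotes or backslashes.
  static String toJson(long time, String bank, String indicator) {
    return "{\"time\":" + time
        + ",\"bank\":\"" + bank + "\""
        + ",\"indicator\":\"" + indicator + "\"}";
  }

  // Creates the POST request without sending it.
  static HttpRequest buildRequest(long time, String bank, String indicator) {
    return HttpRequest.newBuilder()
        .uri(URI.create(ENDPOINT))
        .POST(
            HttpRequest.BodyPublishers.ofString(
                toJson(time, bank, indicator), StandardCharsets.UTF_8))
        .build();
  }

  // Sends the request; this needs a running Serve instance behind ENDPOINT.
  static String httpCalc(long time, String bank, String indicator)
      throws IOException, InterruptedException {
    HttpClient client = HttpClient.newHttpClient();
    return client
        .send(buildRequest(time, bank, indicator), HttpResponse.BodyHandlers.ofString())
        .body();
  }

  public static void main(String[] args) {
    // Only prints what would be sent, so it runs without a cluster.
    HttpRequest request = buildRequest(1641038674L, "test_bank", "test_indicator");
    System.out.println(request.method() + " " + request.uri());
    System.out.println(toJson(1641038674L, "test_bank", "test_indicator"));
  }
}
```

Separating `buildRequest` from `httpCalc` lets the request be inspected without a running deployment.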
The strategy calculation that accesses the deployment over HTTP looks like this:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/HttpStrategyCalcOnRayServe.java
:end-before: docs-calc-end
:language: java
:start-after: docs-calc-start
```
This code can also be rewritten to support concurrency:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/HttpStrategyCalcOnRayServe.java
:end-before: docs-parallel-calc-end
:language: java
:start-after: docs-parallel-calc-start
```
Finally, the complete usage of `HttpStrategyCalcOnRayServe` looks like this:
```{literalinclude} ../../../../java/serve/src/test/java/io/ray/serve/docdemo/HttpStrategyCalcOnRayServe.java
:end-before: docs-main-end
:language: java
:start-after: docs-main-start
```

@ -162,6 +162,7 @@ define_java_module(
"@maven//:com_google_protobuf_protobuf_java",
"@maven//:org_apache_commons_commons_lang3",
"@maven//:org_apache_httpcomponents_client5_httpclient5",
"@maven//:org_apache_httpcomponents_client5_httpclient5_fluent",
"@maven//:org_apache_httpcomponents_core5_httpcore5",
"@maven//:org_slf4j_slf4j_api",
"@maven//:org_testng_testng",

@ -30,6 +30,7 @@ def gen_java_deps():
"net.java.dev.jna:jna:5.8.0",
"org.apache.httpcomponents.client5:httpclient5:5.0.3",
"org.apache.httpcomponents.core5:httpcore5:5.0.2",
"org.apache.httpcomponents.client5:httpclient5-fluent:5.0.3",
maven.artifact(
group = "org.testng",
artifact = "testng",

@ -0,0 +1,119 @@
package io.ray.serve.docdemo;
import com.google.gson.Gson;
import io.ray.serve.api.Serve;
import io.ray.serve.deployment.Deployment;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import org.apache.hc.client5.http.fluent.Request;
public class HttpStrategyCalcOnRayServe {
public void deploy() {
Serve.start(true, false, null);
Deployment deployment =
Serve.deployment()
.setName("http-strategy")
.setDeploymentDef(HttpStrategyOnRayServe.class.getName())
.setNumReplicas(4)
.create();
deployment.deploy(true);
}
// docs-http-start
private Gson gson = new Gson();
public String httpCalc(long time, String bank, String indicator) {
Map<String, Object> data = new HashMap<>();
data.put("time", time);
data.put("bank", bank);
data.put("indicator", indicator);
String result;
try {
result =
Request.post("http://127.0.0.1:8000/strategy")
.bodyString(gson.toJson(data), null)
.execute()
.returnContent()
.asString();
} catch (IOException e) {
result = "error";
}
return result;
}
// docs-http-end
// docs-calc-start
public List<String> calc(long time, Map<String, List<List<String>>> banksAndIndicators) {
List<String> results = new ArrayList<>();
for (Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
String bank = e.getKey();
for (List<String> indicators : e.getValue()) {
for (String indicator : indicators) {
results.add(httpCalc(time, bank, indicator));
}
}
}
return results;
}
// docs-calc-end
// docs-parallel-calc-start
private ExecutorService executorService = Executors.newFixedThreadPool(4);
public List<String> parallelCalc(long time, Map<String, List<List<String>>> banksAndIndicators) {
List<String> results = new ArrayList<>();
List<Future<String>> futures = new ArrayList<>();
for (Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
String bank = e.getKey();
for (List<String> indicators : e.getValue()) {
for (String indicator : indicators) {
futures.add(executorService.submit(() -> httpCalc(time, bank, indicator)));
}
}
}
for (Future<String> future : futures) {
try {
results.add(future.get());
} catch (InterruptedException | ExecutionException e1) {
results.add("error");
}
}
return results;
}
// docs-parallel-calc-end
// docs-main-start
public static void main(String[] args) {
long time = System.currentTimeMillis();
String bank1 = "demo_bank_1";
String bank2 = "demo_bank_2";
String indicator1 = "demo_indicator_1";
String indicator2 = "demo_indicator_2";
Map<String, List<List<String>>> banksAndIndicators = new HashMap<>();
banksAndIndicators.put(bank1, Arrays.asList(Arrays.asList(indicator1, indicator2)));
banksAndIndicators.put(
bank2, Arrays.asList(Arrays.asList(indicator1), Arrays.asList(indicator2)));
HttpStrategyCalcOnRayServe strategy = new HttpStrategyCalcOnRayServe();
strategy.deploy();
List<String> results = strategy.parallelCalc(time, banksAndIndicators);
System.out.println(results);
}
// docs-main-end
}

@ -0,0 +1,20 @@
package io.ray.serve.docdemo;
// docs-strategy-start
import com.google.gson.Gson;
import java.util.Map;
public class HttpStrategyOnRayServe {
private Gson gson = new Gson();
public String call(String dataJson) {
Map<String, Object> data = gson.fromJson(dataJson, Map.class);
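// Gson parses JSON numbers in a Map<String, Object> as Double, so go through Number.
long time = ((Number) data.get("time")).longValue();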
String bank = (String) data.get("bank");
String indicator = (String) data.get("indicator");
// do bank data calculation
return bank + "-" + indicator + "-" + time; // Demo;
}
}
// docs-strategy-end

@ -0,0 +1,78 @@
package io.ray.serve.docdemo;
import io.ray.serve.api.Serve;
import io.ray.serve.deployment.Deployment;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
public class ManageDeployment {
// docs-create-start
public static class Counter {
private AtomicInteger value;
public Counter(Integer value) {
this.value = new AtomicInteger(value);
}
public String call(String delta) {
return String.valueOf(value.addAndGet(Integer.valueOf(delta)));
}
}
public void create() {
Serve.deployment()
.setName("counter")
.setDeploymentDef(Counter.class.getName())
.setInitArgs(new Object[] {1})
.setNumReplicas(2)
.create()
.deploy(true);
}
// docs-create-end
// docs-query-start
public void query() {
Deployment deployment = Serve.getDeployment("counter");
}
// docs-query-end
// docs-update-start
public void update() {
Serve.deployment()
.setName("counter")
.setDeploymentDef(Counter.class.getName())
.setInitArgs(new Object[] {2})
.setNumReplicas(2)
.create()
.deploy(true);
}
// docs-update-end
// docs-scale-start
public void scaleOut() {
Deployment deployment = Serve.getDeployment("counter");
// Scale up to 10 replicas.
deployment.options().setNumReplicas(10).create().deploy(true);
// Scale down to 1 replica.
deployment.options().setNumReplicas(1).create().deploy(true);
}
// docs-scale-end
// docs-resource-start
public void manageResource() {
Map<String, Object> rayActorOptions = new HashMap<>();
rayActorOptions.put("num_gpus", 1);
Serve.deployment()
.setName("counter")
.setDeploymentDef(Counter.class.getName())
.setRayActorOptions(rayActorOptions)
.create()
.deploy(true);
}
// docs-resource-end
}

@ -0,0 +1,30 @@
package io.ray.serve.docdemo;
import io.ray.api.Ray;
import io.ray.serve.api.Serve;
import io.ray.serve.deployment.Deployment;
import java.io.File;
public class ManagePythonDeployment {
public static void main(String[] args) {
System.setProperty(
"ray.job.code-search-path",
System.getProperty("java.class.path") + File.pathSeparator + "/path/to/code/");
Serve.start(true, false, null);
Deployment deployment =
Serve.deployment()
// .setDeploymentLanguage(DeploymentLanguage.PYTHON)
.setName("counter")
.setDeploymentDef("counter.Counter")
.setNumReplicas(1)
.setInitArgs(new Object[] {"1"})
.create();
deployment.deploy(true);
System.out.println(Ray.get(deployment.getHandle().method("increase").remote("2")));
}
}

@ -0,0 +1,35 @@
package io.ray.serve.docdemo;
// docs-strategy-start
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
public class Strategy {
public List<String> calc(long time, Map<String, List<List<String>>> banksAndIndicators) {
List<String> results = new ArrayList<>();
for (Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
String bank = e.getKey();
for (List<String> indicators : e.getValue()) {
results.addAll(calcBankIndicators(time, bank, indicators));
}
}
return results;
}
public List<String> calcBankIndicators(long time, String bank, List<String> indicators) {
List<String> results = new ArrayList<>();
for (String indicator : indicators) {
results.add(calcIndicator(time, bank, indicator));
}
return results;
}
public String calcIndicator(long time, String bank, String indicator) {
// do bank data calculation
return bank + "-" + indicator + "-" + time; // Demo;
}
}
// docs-strategy-end

@ -0,0 +1,28 @@
package io.ray.serve.docdemo;
// docs-strategy-calc-start
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class StrategyCalc {
public static void main(String[] args) {
long time = System.currentTimeMillis();
String bank1 = "demo_bank_1";
String bank2 = "demo_bank_2";
String indicator1 = "demo_indicator_1";
String indicator2 = "demo_indicator_2";
Map<String, List<List<String>>> banksAndIndicators = new HashMap<>();
banksAndIndicators.put(bank1, Arrays.asList(Arrays.asList(indicator1, indicator2)));
banksAndIndicators.put(
bank2, Arrays.asList(Arrays.asList(indicator1), Arrays.asList(indicator2)));
Strategy strategy = new Strategy();
List<String> results = strategy.calc(time, banksAndIndicators);
System.out.println(results);
}
}
// docs-strategy-calc-end

@ -0,0 +1,92 @@
package io.ray.serve.docdemo;
import io.ray.api.ObjectRef;
import io.ray.serve.api.Serve;
import io.ray.serve.deployment.Deployment;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
public class StrategyCalcOnRayServe {
// docs-deploy-start
public void deploy() {
Serve.start(true, false, null);
Deployment deployment =
Serve.deployment()
.setName("strategy")
.setDeploymentDef(StrategyOnRayServe.class.getName())
.setNumReplicas(4)
.create();
deployment.deploy(true);
}
// docs-deploy-end
// docs-calc-start
public List<String> calc(long time, Map<String, List<List<String>>> banksAndIndicators) {
Deployment deployment = Serve.getDeployment("strategy");
List<String> results = new ArrayList<>();
for (Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
String bank = e.getKey();
for (List<String> indicators : e.getValue()) {
for (String indicator : indicators) {
results.add(
(String)
deployment
.getHandle()
.method("calcIndicator")
.remote(time, bank, indicator)
.get());
}
}
}
return results;
}
// docs-calc-end
// docs-parallel-calc-start
public List<String> parallelCalc(long time, Map<String, List<List<String>>> banksAndIndicators) {
Deployment deployment = Serve.getDeployment("strategy");
List<String> results = new ArrayList<>();
List<ObjectRef<Object>> refs = new ArrayList<>();
for (Entry<String, List<List<String>>> e : banksAndIndicators.entrySet()) {
String bank = e.getKey();
for (List<String> indicators : e.getValue()) {
for (String indicator : indicators) {
refs.add(deployment.getHandle().method("calcIndicator").remote(time, bank, indicator));
}
}
}
for (ObjectRef<Object> ref : refs) {
results.add((String) ref.get());
}
return results;
}
// docs-parallel-calc-end
// docs-main-start
public static void main(String[] args) {
long time = System.currentTimeMillis();
String bank1 = "demo_bank_1";
String bank2 = "demo_bank_2";
String indicator1 = "demo_indicator_1";
String indicator2 = "demo_indicator_2";
Map<String, List<List<String>>> banksAndIndicators = new HashMap<>();
banksAndIndicators.put(bank1, Arrays.asList(Arrays.asList(indicator1, indicator2)));
banksAndIndicators.put(
bank2, Arrays.asList(Arrays.asList(indicator1), Arrays.asList(indicator2)));
StrategyCalcOnRayServe strategy = new StrategyCalcOnRayServe();
strategy.deploy();
List<String> results = strategy.parallelCalc(time, banksAndIndicators);
System.out.println(results);
}
// docs-main-end
}

@ -0,0 +1,11 @@
package io.ray.serve.docdemo;
// docs-strategy-start
public class StrategyOnRayServe {
public String calcIndicator(long time, String bank, String indicator) {
// do bank data calculation
return bank + "-" + indicator + "-" + time; // Demo;
}
}
// docs-strategy-end