Take the CLI reference out of the Core API subsection. It now follows the same CLI reference pattern as the other libraries (e.g., Serve has Serve CLI under the Serve API section).
Move the code examples to `doc_code`.
Fix the code example so that the batched version actually runs faster than the serial one.
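As a sketch of the intended pattern (the `double` task, sleep times, and chunk size here are illustrative, not the actual doc example): batching amortizes the fixed per-task cost, so the batched run finishes faster than launching one tiny task per item.

```python
import time

import ray

ray.init()

@ray.remote
def double(i):
    time.sleep(0.001)  # fixed per-call cost, paid once per item
    return i * 2

@ray.remote
def double_batch(batch):
    time.sleep(0.001)  # the same fixed cost, paid once per batch
    return [i * 2 for i in batch]

items = list(range(1000))

# One tiny task per item: scheduling and fixed costs dominate.
serial = ray.get([double.remote(i) for i in items])

# One task per 100-item chunk: the fixed cost is amortized.
chunks = [items[i:i + 100] for i in range(0, len(items), 100)]
batched = [y for chunk in ray.get([double_batch.remote(c) for c in chunks]) for y in chunk]

assert serial == batched
```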
Related issue number: #27048
Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>
An attempt at making the docs shorter and sweeter, including various small cleanup items:
- Reorder the TOC on the sidebar for the user guides to be more linear based on a user's journey.
- Put the batching content under the performance guide (a sketch of Serve's batching API follows this list).
- Remove the AIR guide (AIR users already have a serving guide).
- Combine the `ServeHandle` and model composition pages into a single guide. We may want to revisit this in the future, but for now it's better to have the content in a single place than duplicated (with links going to both).
- Fix the index page for the user guides to match the TOC sidebar.
- Rename a few pages for clarity & consistency.
- Remove some now-redundant content (old ML models user guide).
- Add KubeRay information to the production guide.
- Consolidate the two user guides we had related to production deployment.
- Add information about the experimental GCS HA feature.
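For reference, the batching content in question covers Serve's `@serve.batch` decorator; a minimal sketch (the deployment name and parameter values are illustrative):

```python
from ray import serve

@serve.deployment
class BatchedModel:
    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.1)
    async def handle_batch(self, inputs):
        # Invoked with a list of queued inputs; must return one result per input.
        return [x * 2 for x in inputs]

    async def __call__(self, request):
        # Each caller awaits its own result; Serve assembles the batches.
        return await self.handle_batch(request)
```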
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Why are these changes needed?
This PR updates the workflow doc to reflect recent changes, focusing on the positioning change and other updates.
Ray Serve collects different metrics depending on whether deployments are called over HTTP or from Python. This needs to be mentioned in the documentation, with each metric marked accordingly.
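For example, the same deployment can be called both ways (a sketch using the Serve 2.0 `serve.run` API; the `echo` deployment is made up):

```python
import ray
import requests

from ray import serve

@serve.deployment
def echo(*args):
    return "hello"

handle = serve.run(echo.bind())

# Over HTTP: recorded under Serve's HTTP request metrics.
requests.get("http://127.0.0.1:8000/echo")

# From Python via a ServeHandle: recorded under handle metrics.
ray.get(handle.remote())
```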
Enables better usage with GCP.
By default, the head node runs with the `ray-autoscaler-sa-v1` service account, but workers do not. Workers can run with this service account by copying and uncommenting L114-L117 from example-full.
Signed-off-by: Ian <ian.rodney@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
We currently measure end-to-end training time in our benchmarks, which includes setup overhead. This makes for an unequal comparison, since the setup overhead for vanilla training cannot be accurately measured and was instead just disregarded.
By comparing the raw training times in the actual training loop, we get a more accurate picture of any potential overhead or benefit of using Ray vs. vanilla TensorFlow/Torch.
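A minimal sketch of the measurement change (the function names are illustrative, not the actual benchmark code):

```python
import time

def run_benchmark(setup_fn, train_loop_fn):
    setup_fn()  # cluster startup, data download, etc. -- excluded from timing

    start = time.monotonic()
    train_loop_fn()  # only the actual training loop is timed
    elapsed = time.monotonic() - start

    # Comparable across Ray and vanilla TF/Torch runs, since both skip setup.
    return elapsed
```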
Signed-off-by: Kai Fricke <kai@anyscale.com>
This PR restores notes for migration from the legacy Ray operator to the new KubeRay operator.
To avoid disrupting the flow of the Ray documentation, these notes are placed in a README accompanying the old operator's code.
These notes are linked from the new docs.
Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
Went through https://docs.ray.io/en/master/data/examples/nyc_taxi_basic_processing.html and made some minor fixes here:
Fix the `size_bytes()` result (before this PR it was using Parquet sampling, but we disable it later).
Change one `size_bytes()` call to a `count()` call, since the doc's following wording ("That's a lot of rows") was meant to go with `count()`.
The changed places are shown in the screenshots attached to the PR.
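For context, the distinction between the two calls (a sketch with a bundled example dataset standing in for the NYC taxi data):

```python
import ray

ds = ray.data.read_parquet("example://iris.parquet")  # placeholder dataset

ds.count()       # number of rows -- what "That's a lot of rows" refers to
ds.size_bytes()  # estimated in-memory size in bytes, not a row count
```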
Why are these changes needed?
- Promote APIs to `PublicAPI` (alpha); see the annotation sketch below.
- Change pre-alpha -> alpha
- Fix a bug where `ray_logs` was displayed in `ray --help`.
Release test result: #26610
Some APIs are subject to change at the beta stage (e.g., `ray list jobs` or `ray logs`).
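A sketch of the annotation pattern (the function below is hypothetical; `PublicAPI` lives in `ray.util.annotations`):

```python
from ray.util.annotations import PublicAPI

@PublicAPI(stability="alpha")
def list_jobs():
    """Alpha API: publicly exposed, but may still change before beta."""
    ...
```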
Adds a page describing a development workflow for Serve applications.
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>
The "Monitoring Ray Serve" page explains how to inspect your Ray Serve applications. This change updates the page to remove outdated metrics that Serve no longer exposes and to upgrade code samples to use 2.0 APIs. It also improves the content's readability and organization.
Link to updated "Monitoring Ray Serve" page: https://ray--27777.org.readthedocs.build/en/27777/serve/monitoring.html
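As one example of the 2.0-style patterns the page now shows, deployments log through the `"ray.serve"` logger (the deployment below is illustrative):

```python
import logging

from ray import serve

logger = logging.getLogger("ray.serve")

@serve.deployment
class Counter:
    def __init__(self):
        self.count = 0

    def __call__(self, request):
        self.count += 1
        logger.info(f"count: {self.count}")  # shows up in the replica's logs
        return self.count
```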