Commit graph

1788 commits

Author SHA1 Message Date
Amog Kamsetty
5a41fb18bd
[Docs] Automatically render latest ray_lightning docs (#23729)
Automatically pull the latest ray_lightning README to render on Ray docs. (#23505)

Depends on ray-project/ray_lightning#135
2022-04-08 16:57:23 -07:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. (#23739) 2022-04-08 15:33:28 +02:00
Michael (Mike) Gelbart
774b62b3c0
[RLlib; docs] Clarify how MultiDiscrete spaces are encoded by default. (#23777) 2022-04-08 08:39:09 +02:00
Jian Xiao
f737731a5e
Remove dataset pipeline from the Getting Started page (#23756)
1. Dataset pipeline is an advanced usage of Ray Dataset, which should not be crammed into the Getting Started page
2. We already have a separate/dedicated page called Pipelining Compute to cover the same content
2022-04-07 12:52:04 -07:00
matthewdeng
a12f5ff5d6
[train] add FAQ (#22757)
Adds a FAQ page, which currently covers some basic questions that have come up in the past.

Also explains how to use Matplotlib given the threading used in the distributed training function.
2022-04-04 16:14:35 -07:00
shrekris-anyscale
071e1dd20f
[serve] Create deployment.py and deployment_graph.py (#23578)
`api.py` has accumulated classes and functions that aren't purely public APIs, causing circular dependencies. This change pulls `Deployment` and deployment graph-related features out of `api.py` and puts them in two new files: `deployment.py` and `deployment_graph.py`.
2022-04-01 13:40:13 -07:00
Sven Mika
2eaa54bd76
[RLlib] POC: Config objects instead of dicts (PPO only). (#23491) 2022-03-31 18:26:12 +02:00
Antoni Baum
756d08cd31
[docs] Add support for external markdown (#23505)
This PR fixes the issue of documentation diverging between the Ray Docs and the READMEs of ecosystem libraries that live in separate repos (e.g. xgboost_ray). It does so by adding a step before the docs build starts that downloads the READMEs of the specified ecosystem libraries from their GitHub repositories. The files are then preprocessed by a very simple parser to allow for differences between GitHub and Docs markdown.

In summary, this makes the markdown files in ecosystem library repositories single sources of truth and removes the need to manually keep the doc pages up to date, all the while allowing for differences between what's rendered on GitHub and in the Docs.
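
The preprocessing step described above can be sketched roughly as follows. This is a minimal illustration only, not the parser added in this PR: the marker syntax (`<!--docs-only ... -->` and `<!--github-only-->`) and the function name `preprocess_readme` are hypothetical, and the download step is left out.

```python
import re

# Hypothetical marker syntax, chosen only for this sketch:
#   <!--docs-only ... -->                       shown only in the docs build
#   <!--github-only--> ... <!--/github-only-->  shown only on GitHub
DOCS_ONLY = re.compile(r"<!--docs-only\s*(.*?)\s*-->", re.DOTALL)
GITHUB_ONLY = re.compile(r"<!--github-only-->.*?<!--/github-only-->\n?", re.DOTALL)


def preprocess_readme(text: str) -> str:
    """Turn a GitHub README into the variant rendered in the docs."""
    # Drop sections meant for GitHub readers only.
    text = GITHUB_ONLY.sub("", text)
    # Reveal sections that GitHub hides inside HTML comments.
    return DOCS_ONLY.sub(r"\1", text)


readme = (
    "# xgboost_ray\n"
    "<!--github-only-->\n![badge](badge.svg)\n<!--/github-only-->\n"
    "<!--docs-only See the Ray docs for details. -->\n"
    "Usage notes.\n"
)
print(preprocess_readme(readme))
```

Keeping the markers inside HTML comments means GitHub renders the README unchanged, while the docs build can still add or drop sections.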

See ray-project/xgboost_ray#204 & https://ray--23505.org.readthedocs.build/en/23505/ray-more-libs/xgboost-ray.html for an example.

Needs ray-project/xgboost_ray#204 and ray-project/lightgbm_ray#30 to be merged first.
2022-03-31 08:38:14 -07:00
Jiao
d7e77fc9c5
[DAG] Serve Deployment Graph documentation and tutorial. (#23512) 2022-03-30 17:32:16 -07:00
Kai Fricke
b0fc631dea
[docs/tune] Fix PTL multi GPU link (#23589)
Broken in current docs
2022-03-30 09:24:48 -07:00
Simon Mo
cb1919b8d0
[Doc][Serve] Add minimal docs for model wrappers and http adapters (#23536) 2022-03-29 11:33:14 -07:00
Philipp Moritz
005ea36850
[linkcheck] Remove flaky url (#23549) 2022-03-29 08:36:54 -07:00
Matti Picus
77c4c1e48e
WINDOWS: enable and fix failures in test_runtime_env_complicated (#22449) 2022-03-29 00:56:42 -07:00
Chen Shen
1d0fe1e1c3
[doc/linter] fix broken deepmind link #23542 2022-03-28 22:35:53 -07:00
Siyuan (Ryans) Zhuang
6b1b25168f
[workflow][doc] Doc for workflow checkpointing (#23510) 2022-03-27 12:18:14 -07:00
shrekris-anyscale
891301ff54
[serve] [docs] Add tip about serve status (#23481)
The `serve status` command allows users to get their deployments' status info through the CLI. This change adds a tip to the health-checking documentation to inform users about `serve status`.
2022-03-25 13:36:15 -05:00
Sven Mika
7cb86acce2
[RLlib] trainer_template.py: hard deprecation (error when used). (#23488) 2022-03-25 18:25:51 +01:00
Jan Weßling
f78404da4a
[serve] Add ensemble model example to docs (#22771)
Adds ensemble model examples to the documentation. This was prompted by a user request, since nothing in the docs outlined how to create higher-level ensemble models.

Co-authored-by: Jiao Dong <sophchess@gmail.com>
2022-03-25 11:17:54 -05:00
Philipp Moritz
46c1b98b2f
[ci/lint] Fix linkcheck flakiness (#23482)
As seen in https://buildkite.com/ray-project/ray-builders-branch/builds/6736#de2e23c9-3ec1-4c1b-83cb-41ae658ef1f8
2022-03-25 15:58:24 +00:00
Brett Göhre
f5e492ea8a
[Docs] optuna notebook (#23477) 2022-03-25 09:04:53 +01:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI (#23418) 2022-03-24 17:04:02 -07:00
Philipp Moritz
1b0c667061
Make linkcheck less flaky (#23442)
The huggingface links have caused a number of spurious linkcheck errors. This PR fixes that by ignoring them.
2022-03-24 14:51:49 +00:00
Eric Liang
38925f60d2
Add a get_if_exists option for simpler creation of named actors (#23344)
Getting or creating a named actor is a common pattern, but how to achieve it is somewhat esoteric. Add a utility for this and test that it doesn't cause any scary error messages.

Actor.options(name="my_singleton", get_if_exists=True).remote(args)
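
The get-or-create semantics behind this option can be illustrated in plain Python, without a Ray cluster. The sketch below is an analogy only and not Ray's implementation: `NamedRegistry`, `get_or_create`, and `Counter` are hypothetical names introduced for the example.

```python
import threading


class NamedRegistry:
    """Hypothetical in-process stand-in for a named-actor directory."""

    def __init__(self):
        self._lock = threading.Lock()
        self._instances = {}

    def get_or_create(self, name, factory, *args):
        # Lookup and creation happen under one lock, so concurrent callers
        # asking for the same name always receive the same instance.
        with self._lock:
            if name not in self._instances:
                self._instances[name] = factory(*args)
            return self._instances[name]


class Counter:
    def __init__(self, start):
        self.value = start


registry = NamedRegistry()
results = []


def worker():
    results.append(registry.get_or_create("my_singleton", Counter, 0))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every thread ends up holding the same Counter object.
print(all(c is results[0] for c in results))
```

Doing the lookup and the creation as one step is what removes the race that a separate "get, and create on failure" sequence would have.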
2022-03-23 22:02:58 -07:00
Jiajun Yao
ce93bfff7e
Fix broken doc link (#23440)
https://github.com/ray-project/ray/blob/master/benchmarks/README.md has moved to a new location.
2022-03-23 18:54:02 -07:00
Chen Shen
48d456d373
[RFC][Doc] add a page describe actor execution order. (#23406)
* add

* task-orders

* fix

* address comments

* add

* address comments
2022-03-23 11:07:18 -07:00
Kai Fricke
668eade515
[docs] Add oracle to linkcheck ignore list (#23422)
This link currently breaks the linter CI.
2022-03-23 17:14:52 +00:00
Max Pumperla
9b1a3f9f9a
[docs] fix nav (#23417)
Algolia search no longer overflows on mobile devices, making the nav scrollable again.

Signed-off-by: Max Pumperla <max.pumperla@googlemail.com>
2022-03-23 10:38:33 +00:00
mwtian
51feac9868
Clean up dev docs (#23407) 2022-03-22 23:22:56 -07:00
Richard Liaw
1fe110f8f4
[ml] Add a starter page for docstrings (#23312) 2022-03-21 17:20:45 -07:00
Kai Fricke
b64452bc63
[tune] Add multinode sync test (#23229)
This adds a multinode checkpoint/restore test for Ray Tune. This covers some of the functionality of the release tests, but in a more controlled environment. In a follow-up PR, we should test (mocked) cloud checkpointing, too.
2022-03-21 17:02:17 +00:00
Michael (Mike) Gelbart
99d60ef18c
[docs] Fix typos in ray docs contributing guide (#23360)
There are a couple typos in the [Ray contributing guide](https://docs.ray.io/en/master/ray-contribute/docs.html). I fixed the typos, added a relevant link, and reworded a sentence.
2022-03-21 10:01:41 -07:00
Jiajun Yao
d3159f201b
[Doc] Add scheduling doc (#23343) 2022-03-20 16:05:06 -07:00
Philipp Moritz
886cc4d674
Fix broken links in documentation and put linkcheck linter in place on CI (#23340) 2022-03-18 21:02:52 -07:00
Junwen Yao
8fff665455
[Train] Add torch data prefetch benchmark example (#22974)
Add a benchmark example for the auto pipeline functionality for host to device data transfer.
2022-03-18 13:27:26 -07:00
Jian Xiao
0b1a2a44c0
[Dataset GA doc] Decompose the monolith of Getting Started page (and get them under User Guide) (#23311)
Improve the Dataset documentation for GA.
2022-03-18 11:25:43 -07:00
Jialing He
4a83bc3dc2
[runtime env] Support set timeout for runtime env setup (#23082)
Interface example:
```python
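import ray
from ray.runtime_env import RuntimeEnv, RuntimeEnvConfig

# Class-based form (other RuntimeEnv fields omitted).
@ray.remote(runtime_env=RuntimeEnv(config=RuntimeEnvConfig(setup_timeout_s=10)))
def f(): pass

# Equivalent dict-based form (other runtime_env fields omitted).
@ray.remote(runtime_env={"config": {"setup_timeout_s": 10}})
def f(): pass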
```

Adds support for setting a timeout, in seconds, for runtime environment creation.

Co-authored-by: 捕牛 <hejialing.hjl@antgroup.com>
2022-03-18 12:52:59 -05:00
Archit Kulkarni
76bb5396c7
[Doc] [jobs] Add links to Job Submission and improve doc (#23209)
- Adds links to Job Submission from existing library tutorials where `ray submit` is used.  When Jobs becomes GA, we should fully replace the uses of `ray submit` with Ray job submission and ensure this is tested.
- Adds docstrings for the Jobs SDK, which automatically show up in the API reference
- Improves the Job Submission main page
- Adds a "Deployment Guide" landing page explaining when to use Ray Client vs Ray Jobs

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2022-03-18 12:52:13 -05:00
Archit Kulkarni
16fd099b8b
[runtime env] Change pip_check default from True to False (#23306)
@SongGuyang @Catch-Bull @edoakes I know we discussed this earlier, but after thinking about it some more I think a more reasonable default for `pip_check` is `False`. My guess is that a lot of users (including myself) work inside an environment where `python -m pip check` fails, but the environment doesn't cause them any problems otherwise. So a lot of users will hit an error when trying a simple `runtime_env` `pip` example, and possibly give up. Another, less important piece of evidence is that we had to set `pip_check = False` to make some CI tests pass in the original PR.

This also matches the default behavior of pip which allows this situation to occur in the first place:  `pip install` doesn't error when there's a dependency conflict; rather the command succeeds, the package is installed and usable, and it prints a warning (which is confusingly titled "ERROR")
2022-03-18 12:51:41 -05:00
shrekris-anyscale
86169d2452
[docs] Fix malformatted list in "Advanced Pattern: Fault Tolerance with Actor Checkpointing" (#23319) 2022-03-18 10:50:13 -07:00
Eric Liang
08dc31e747
[minor] Fix incorrect link to ray core user guide (#23316) 2022-03-17 20:58:56 -07:00
Guyang Song
1ad019aac3
[C++ API][Doc] Add doc and error log to notice C++ API is not supported on Windows (#23272)
The C++ API does not support Windows for now.
2022-03-18 10:52:57 +08:00
Eric Liang
015181ab9a
Add random access support for Datasets (experimental feature) (#22749)
This PR adds experimental support for random access to datasets. A Dataset can be random access enabled by calling `ds.to_random_access_dataset(key, num_workers=N)`. This creates a RandomAccessDataset.

RandomAccessDataset partitions the dataset across the cluster by the given sort key, providing efficient random access to records via binary search. A number of worker actors are created, each of which has zero-copy access to the underlying sorted data blocks of the Dataset.

Performance-wise, you can expect each worker to provide ~3000 records / second via ``get_async()``, and ~10000 records / second via ``multiget()``.

Since Ray actor calls go direct from worker->worker, throughput scales linearly with the number of workers.
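
The lookup scheme described above (partition by sort key, then binary search) can be sketched in plain Python. This is a toy, single-process illustration and not the RandomAccessDataset implementation: the class `SortedBlocks` and its fields are hypothetical, and there are no worker actors or zero-copy blocks here.

```python
from bisect import bisect_left


class SortedBlocks:
    """Toy stand-in for a dataset partitioned into sorted blocks by key."""

    def __init__(self, records, key, num_blocks):
        ordered = sorted(records, key=lambda r: r[key])
        size = max(1, -(-len(ordered) // num_blocks))  # ceiling division
        self.blocks = [ordered[i:i + size] for i in range(0, len(ordered), size)]
        # Keys of each block, precomputed so lookups stay logarithmic.
        self.block_keys = [[r[key] for r in block] for block in self.blocks]
        # Largest key in each block, used to route a lookup to one block.
        self.upper_bounds = [keys[-1] for keys in self.block_keys]

    def get(self, value):
        # First binary search: find the block that could hold the key.
        b = bisect_left(self.upper_bounds, value)
        if b == len(self.blocks):
            return None
        # Second binary search: find the record inside that block.
        keys = self.block_keys[b]
        i = bisect_left(keys, value)
        if i < len(keys) and keys[i] == value:
            return self.blocks[b][i]
        return None

    def multiget(self, values):
        return [self.get(v) for v in values]


ds = SortedBlocks([{"id": i, "sq": i * i} for i in range(10)], key="id", num_blocks=3)
print(ds.get(5))
print(ds.multiget([0, 9, 42]))
```

In the real feature each block would live with a worker actor, so the first search would pick the worker and the second would run on that worker.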
2022-03-17 15:01:12 -07:00
Archit Kulkarni
684a1821d3
[Doc] [runtime_env] Add limitation about single-file py_modules to doc (#23248)
Until #23151 is fixed, this PR adds it as a known limitation in the documentation.
2022-03-17 16:23:46 -05:00
Simon Mo
f400b4333a
[Serve] Remove legacy pipeline codebase (#23172) 2022-03-17 13:27:16 -07:00
Jian Xiao
8c9e3f6c2e
Move the third-party data integrations (non-Dataset stuff) out of the user guides which is for Dataset (#23162)
Improve documentation of Ray Dataset.
2022-03-17 11:27:40 -07:00
Eric Liang
c8f207f746
[docs] Core docs refactor (#23216)
This PR makes a number of major overhauls to the Ray core docs:

- Add a key-concepts section for {Tasks, Actors, Objects, Placement Groups, Env Deps}.
- Re-org the user guide to align with key concepts.
- Rewrite the walkthrough to link to mini-walkthroughs in the key concept sections.
- Minor tweaks and additional transition material.
2022-03-17 11:26:17 -07:00
Balaji Veeramani
83986a4d83
[Train] Add support for automatic mixed precision (#22227)
Closes #20643

Co-authored-by: Ubuntu <ubuntu@ip-172-31-58-19.us-west-2.compute.internal>
2022-03-16 20:53:02 -07:00
Archit Kulkarni
8707eb6288
[runtime env] Support .whl files in py_modules (#22368)
The `py_modules` field of `runtime_env` supports uploading local Python modules for use on the Ray cluster. One gap is the case where the local Python module is a wheel (a `.whl` file). This PR adds the missing support for uploading and installing the `.whl` file.
2022-03-16 16:37:10 -05:00
Max Pumperla
71c57c619b
[docs] RLlib broken links (fixes #23160) (#23226) 2022-03-16 12:38:18 +01:00
Kai Fricke
b80f79a072
[ci/multinode] Improve multi-node tests (#23196)
The current multi node tests use a hardcoded mapping for local development mounts. With this PR, a new environment variable is introduced to be able to control this dynamically. Additionally, some minor improvements to the test utilities and monitor are added.
2022-03-16 09:59:50 +00:00