2226 commits

Author SHA1 Message Date
Jiao
f6735f90c7
[Ray DAG] Move dag project folder out of experimental (#25532) 2022-06-16 19:15:39 -07:00
Clark Zinzow
3dda4e1d46
[Docs] Add a py:obj default role to Sphinx builds. (#25765)
By setting the [Sphinx `default_role`](https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-default_role) to [`py:obj`](https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#role-py-obj), we can concisely cross-reference other Python APIs (classes or functions) in API docstrings while still maintaining the editor/IDE/terminal readability of the docstrings.

Before this PR, when referencing a class or a function, the relevant role specification is required: :class:`Dataset`, :meth:`Dataset.map`, :func:`.read_parquet`.

After this PR, the raw cross reference will work in most cases: `Dataset`, `Dataset.map`, `read_parquet`.
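
A minimal sketch of the setting described above, as it might appear in a Sphinx `conf.py`. The `transform` function and its docstring are hypothetical and only illustrate the shorter cross-reference style; they are not taken from this PR.

```python
# conf.py (Sphinx configuration)

# Treat bare `text` in docstrings as a reference to a Python object,
# so an explicit :class:, :meth:, or :func: role is not required.
default_role = "py:obj"


def transform(dataset):
    """Hypothetical API docstring written in the shorter style.

    Applies `Dataset.map` to a `Dataset` created by `read_parquet`.
    """
    return dataset
```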

## Checks

- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
2022-06-16 16:33:20 -07:00
Chen Shen
8e7e89a178
[Data] fix broken link (#25867)
Update the broken Spark link.
2022-06-16 14:01:38 -07:00
shrekris-anyscale
d944f7469c
[Serve] [Docs] Remove references to namespaces in the Serve documentation (#25830)
#25575 starts all Serve actors in the `"serve"` namespace. This change updates the Serve documentation to remove now-outdated explanations about namespaces and to specify that all Serve actors start in the `"serve"` namespace.
2022-06-16 10:50:49 -05:00
Antoni Baum
91dd360f9d
[AIR/train] Move predictors to ray.train (#25769) 2022-06-15 17:02:15 -07:00
Antoni Baum
b5fd02af4f
[CI] Print linkcheck summary only in linkcheck (#25781) 2022-06-15 16:21:08 -07:00
zcin
3f91cbd979
[serve][docs] Replaced term 'actor_init_options' with 'ray_actor_options' in documentation (#25808)
Replaced the term `actor_init_options` with `ray_actor_options` in [this documentation section](https://docs.ray.io/en/releases-1.13.0/serve/performance.html#choosing-the-right-hardware) because `actor_init_options` is an outdated variable name. It's been changed to `ray_actor_options` in the [code](2546fbf99d/python/ray/serve/deployment.py (L45)).
2022-06-15 15:21:24 -05:00
Clark Zinzow
526e12074a
[Datasets] Make it clear that read_parquet() does not support multiple directories. (#25747)
Unfortunately, ray.data.read_parquet() doesn't work with multiple directories, since it uses Arrow's Dataset abstraction under the hood, which doesn't accept multiple directories as a source: https://arrow.apache.org/docs/python/generated/pyarrow.dataset.dataset.html

This PR makes this clear in the docs and, as a drive-by, adds ray.data.read_parquet_bulk() to the API docs.
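
One possible workaround, sketched under the assumption that a flat list of file paths is an accepted source: expand each directory into its Parquet files before calling the reader. `list_parquet_files` is a hypothetical helper, not part of Ray, and the directory names in the usage comment are made up.

```python
import os
from typing import Iterable, List


def list_parquet_files(directories: Iterable[str]) -> List[str]:
    """Collect every *.parquet file found under the given directories."""
    files = []
    for directory in directories:
        # os.walk also descends into nested (e.g. partitioned) subdirectories.
        for root, _subdirs, names in os.walk(directory):
            for name in names:
                if name.endswith(".parquet"):
                    files.append(os.path.join(root, name))
    return sorted(files)


# Assumed usage (requires Ray; not executed here):
# import ray
# ds = ray.data.read_parquet(list_parquet_files(["/data/2022-05", "/data/2022-06"]))
```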
2022-06-15 13:19:39 -07:00
clarng
ef866d1e49
exclude doc_code from import sorting (#25772)
Skip sorting the imports in doc_code.
2022-06-15 11:34:45 -07:00
xwjiang2010
88d824d067
[air] remove fully_executed from Tune. (#25750) 2022-06-14 22:32:48 -07:00
Kai Fricke
fdf85ea403
[air] Add tutorial to convert existing pytorch code to Ray AIR (#25723) 2022-06-14 18:11:32 -07:00
Kai Fricke
6313ddc47c
[tune] Refactor Syncer / deprecate Sync client (#25655)
This PR includes / depends on #25709

The two concepts of Syncer and SyncClient are confusing, as is the current API for passing custom sync functions.

This PR refactors Tune's syncing behavior. The Sync client concept is hard deprecated. Instead, we offer a well-defined Syncer API that can be extended to provide custom syncing functionality. However, the default will be to use Ray AIR's file transfer utilities.

New API:
- Users can pass `syncer=CustomSyncer` which implements the `Syncer` API
- Otherwise our off-the-shelf syncing is used
- As before, syncing to cloud disables syncing to driver

Changes:
- Sync client is removed
- Syncer interface introduced
- _DefaultSyncer is a wrapper around the URI upload/download API from Ray AIR
- SyncerCallback only uses remote tasks to synchronize data
- Rsync syncing is fully deprecated and removed
- Docker and kubernetes-specific syncing is fully deprecated and removed
- Testing is improved to use `file://` URIs instead of mock sync clients
2022-06-14 14:46:30 +02:00
kourosh hakhamaneshi
25940cb95b
[RLlib] CRR documentation. (#25667) 2022-06-14 12:45:36 +02:00
sychen52
d5b8a1caab
[docs] actor is not created in driver1 (#25749)
Call .remote() after .options().
2022-06-13 21:41:14 -07:00
Dmitri Gekhtman
e745cd0e7b
[Docs] Note that certain features are community maintained (#25687)
Adds notes explaining that Ray's support on Azure, Aliyun, and SLURM is community-maintained.
Rephrases the mention of K8s support in the intro.

This PR replaces https://github.com/ray-project/ray/pull/25504.
2022-06-13 16:10:32 -07:00
Eric Liang
ff2cfbe351
[air] Add streaming BatchPredictor support (#25693) 2022-06-13 15:22:36 -07:00
Antoni Baum
182f604d32
[docs] Fix bad argument name in PTL docs (#25736)
Fixes bad argument name in PTL docs. This is just a quick fix - we should be testing the code snippet.
2022-06-13 15:20:24 -07:00
Antoni Baum
5e9a8eb5f6
[AIR/data] Move preprocessors to ray.data (#25599)
Moves ray.air.Preprocessor and ray.air.preprocessors to ray.data to converge on the agreed upon package structure discussed internally.
2022-06-13 12:57:59 -07:00
shrekris-anyscale
3278763dd7
[Serve] Start all Serve actors in the "serve" namespace only (#25575) 2022-06-13 10:31:28 -07:00
Sven Mika
ca10530a1a
[Serve; RLlib; Docs] Change terms in Serve+RLlib example (Trainer -> Algorithm). (#25700) 2022-06-13 11:43:38 +02:00
Jiao
f8b0ab7e78
[Ray DAG] Add documentation in more options section (#25528) 2022-06-12 09:47:20 -07:00
Eric Liang
b52cd964cb
[docs] Move the workflows (alpha) library to the more libraries section for now (#25704) 2022-06-11 19:47:45 -07:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Richard Liaw
1dd714e0fa
[rfc][doc] Add clarity to stability guidelines (#25611) 2022-06-10 15:19:21 -07:00
Avnish Narayan
d0f975e00f
[RLlib] Fix broken link replay buffer docs. (#25666) 2022-06-10 21:18:59 +02:00
Simon Mo
271c7d73ac
[AIR][Serve] Add support for multi-modal array input (#25609) 2022-06-10 09:19:42 -07:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. (#24683) 2022-06-10 16:47:51 +02:00
Antoni Baum
445400d727
[CI] Print a summary of broken links in LinkCheck (#25634) 2022-06-09 17:03:53 -07:00
matthewdeng
88524d8b57
[air] add CustomStatefulPreprocessor (#25497) 2022-06-09 16:54:46 -07:00
Eric Liang
a058a98c5d
[docs] Try to clarify some advantages of bulk ingest in the AIR ingest docs (#25616) 2022-06-09 11:47:22 -07:00
Amog Kamsetty
1316a2d05e
[AIR/Train] Move ray.air.train to ray.train (#25570) 2022-06-08 21:34:18 -07:00
Dmitri Gekhtman
836b08597f
[kuberay][autoscaler] Use new autoscaling fields from the KubeRay operator (#25386)
This PR incorporates recent autoscaler config changes from KubeRay.
2022-06-08 20:09:43 -07:00
matthewdeng
ba0a2a022a
[datasets] add Dataset.randomize_block_order (#25568)
This exposes a low-cost way to perform a pseudo global shuffle.

For extremely large datasets that span multiple nodes, contiguous blocks will often be colocated on the same node. This leads to hot spots during iteration of the dataset, in which single nodes (1) must send a lot of data over the network, and (2) must perform lots of disk reads if the dataset is spilled to disk.

Randomizing the block order spreads the workload across the nodes that hold the dataset blocks.
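
The idea can be sketched in plain Python, with lists standing in for blocks. This is an illustration of the technique only, not Ray's implementation; the function below is a hypothetical free-standing helper that shares its name with the new method.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def randomize_block_order(
    blocks: Sequence[List[T]], seed: Optional[int] = None
) -> List[List[T]]:
    """Return the blocks in random order without touching rows inside a block.

    Only the list of block references is shuffled, so no row data is moved
    or copied. That is why this is cheap compared to a full global shuffle.
    """
    rng = random.Random(seed)
    shuffled = list(blocks)
    rng.shuffle(shuffled)
    return shuffled
```

Rows within each block keep their original order, so this is only a pseudo shuffle: it changes which block (and therefore which node) is read next, not the order of rows inside it.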
2022-06-08 18:39:15 -07:00
M Waleed Kadous
9e2e84bc1c
[docs] Add an example for simple highly parallelizable tasks. (#24885)
It's important to show how Ray can be used for easily parallelizable independent tasks. I put this together to demonstrate how to do this.
2022-06-08 18:10:37 -07:00
Antoni Baum
7616435ed0
[Docs] Capitalize Ray AIR (#25597) 2022-06-08 14:37:53 -07:00
Kai Fricke
aa142eb377
[RLlib; CI] Add team:rllib tag for Bazel. (#25589)
Currently, team:ml spans all ML (Tune, Train, AIR) tests and RLlib tests. RLlib tests are much more flaky, and it would be good to split them up in the flaky test tracker. This PR changes RLlib tests from team:ml to team:rllib to enable this separation.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 22:25:59 +01:00
Jian Xiao
50c854b1ad
Fix hyperlink in rst doc (#25427)
Hyperlink not working

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-136.us-west-2.compute.internal>
2022-06-08 13:46:23 -07:00
Clark Zinzow
9dc0bb3d5e
[Datasets] Unrevert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets. (#25031)" (#25531)
Unreverts #24812, skipping the memory releasing tests that are already flaky. We have a separate issue tracking the unskipping of these memory releasing tests, once we find a more reliable way to test them.

* Revert "Revert "Revert "Revert "[Datasets] [Tensor Story - 1/2] Automatically provide tensor views to UDFs and infer tensor blocks for pure-tensor datasets."" (#25031)" (#25057)"

This reverts commit fb2933a78f.

* Skip shuffle memory release test.
2022-06-08 10:33:25 -07:00
Amog Kamsetty
80ae651f25
[Train] Clean up ray.train package (#25566) 2022-06-08 10:22:36 -07:00
Archit Kulkarni
3296345557
Add warning about entrypoint command in quotes (#25519) 2022-06-08 09:38:55 -07:00
Kai Fricke
8affbc7be6
[tune/train] Consolidate checkpoint manager 3: Ray Tune (#24430)
**Update**: This PR is now part 3 of a three PR group to consolidate the checkpoints.

1. Part 1 adds the common checkpoint management class #24771 
2. Part 2 adds the integration for Ray Train #24772
3. This PR builds on #24772 and includes all changes. It moves the Ray Tune integration to use the new common checkpoint manager class.

Old PR description:

This PR consolidates the Ray Train and Tune checkpoint managers. These concepts previously did something very similar but in different modules. To simplify maintenance in the future, we've consolidated the common core.

- This PR keeps full compatibility with the previous interfaces and implementations. This means that for now, Train and Tune will have separate CheckpointManagers that both extend the common core
- This PR prepares Tune to move to a CheckpointStrategy object
- In follow-up PRs, we can further unify interfacing with the common core, possibly removing any train- or tune-specific adjustments (e.g. moving to setup on init rather than at runtime for Ray Train)
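
The structure described above (a shared core, with Train and Tune managers extending it) could look roughly like the following sketch. All class and method names here are hypothetical, and the keep-the-best-N policy is an assumption chosen for illustration; none of it is taken from Ray's code.

```python
import heapq
from dataclasses import dataclass, field
from typing import Any, List


@dataclass(order=True)
class _TrackedCheckpoint:
    score: float
    data: Any = field(compare=False)  # payload is ignored when ordering


class CommonCheckpointManager:
    """Hypothetical shared core: keeps only the best `num_to_keep` checkpoints."""

    def __init__(self, num_to_keep: int):
        self.num_to_keep = num_to_keep
        self._heap: List[_TrackedCheckpoint] = []  # min-heap ordered by score

    def register(self, data: Any, score: float) -> None:
        heapq.heappush(self._heap, _TrackedCheckpoint(score, data))
        if len(self._heap) > self.num_to_keep:
            heapq.heappop(self._heap)  # evict the lowest-scoring checkpoint

    def best(self) -> List[_TrackedCheckpoint]:
        """Return the kept checkpoints, highest score first."""
        return sorted(self._heap, reverse=True)


class TrainCheckpointManager(CommonCheckpointManager):
    """Placeholder for Train-specific adjustments on top of the core."""


class TuneCheckpointManager(CommonCheckpointManager):
    """Placeholder for Tune-specific adjustments on top of the core."""
```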

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-08 12:05:34 +01:00
xwjiang2010
29a063afdf
[air] add feast example (#25417) 2022-06-07 14:55:42 -07:00
xwjiang2010
76b34d4a03
[air] add to_air_checkpoint method for inference only workload. (#25444)
Follows up on our last discussion about supporting AIR users who adopt the library piecemeal.
Only implemented for TensorFlow for now; I want to collect feedback on API naming, package structure, etc., before adding others.
2022-06-07 14:50:39 -07:00
Simon Mo
7471b1fa41
[Serve] [AIR] ModelWrapper improvements and docs (#25003)
* batching collation code and tests

* wip notebook for np and dataframe

* finish content

* reset ray-more-libs changes

* add comments

* run through

* Apply suggestions from code review

Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>

* rename package

* lint

* richard's comment

Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
2022-06-07 08:53:10 -07:00
John B Nelson
e913352bdc
[Doc] Remove trailing period symbol install instruction (#25543) 2022-06-07 08:08:04 -07:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) 2022-06-07 12:52:19 +02:00
Zhe Zhang
6793426a9d
[Docs; RLlib] Remove $ from rllib pip install instructions (#25358) 2022-06-07 08:57:17 +02:00
Philipp Moritz
ec02e78b01
[docs] Use better method to mock ObjectRef (#25535)
Actually fix #25498
2022-06-06 23:50:52 -07:00
Eric Liang
c1afbcb6f4
[air] Enforce API stability annotations for AIR module (#25485) 2022-06-06 22:52:21 -07:00