2279 commits

Author SHA1 Message Date
xwjiang2010
ac831fded4
[air] update documentation to use session.report (#26051)
Update documentation to use `session.report`.

Next steps:
1. Update our internal callers to use `session.report`. Most importantly, CheckpointManager and DataParallelTrainer.
2. Update `get_trial_resources` to use PGF notions to incorporate the requirement of ResourceChangingScheduler. @Yard1 
3. After 2 is done, change all `tune.get_trial_resources` to `session.get_trial_resources`
4. [internal implementation] remove special checkpoint handling logic from huggingface trainer. Optimize the flow for checkpoint conversion with `session.report`.

Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2022-06-30 10:37:31 -07:00
Siyuan (Ryans) Zhuang
ddd63aba77
[workflow] Major refactoring - new async workflow executor (#25618)
* major workflow refactoring
2022-06-29 20:31:40 -07:00
Archit Kulkarni
84be085a5a
[Doc] Fix typo in Serve doc (#26211) 2022-06-29 16:15:26 -07:00
Christy Bergman
541e2ec14c
Add Environments to Key Concepts page (#25791) 2022-06-29 16:10:49 -07:00
matthewdeng
4a21dc31ae
[air] update DummyTrainer to handle DatasetPipelines (#26175)
1. Update `DummyTrainer` to take `num_epochs` instead of `runtime_seconds`.
    1. Ray Train expects an equal number of calls to `train.report()` from every worker. With a time limit, workers may run at different speeds and terminate after different epoch numbers, which causes an error.
2. Add `generate_epochs` to support `DatasetPipeline` when `use_stream_api` is True.
3. Update `__main__` code to support testing different configurations.
2022-06-29 09:32:57 -07:00
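The reasoning in the DummyTrainer commit above (#26175) can be illustrated with a small simulation. This is only a sketch: the per-epoch durations and both helper functions are hypothetical, and no Ray code is involved. It shows why a wall-clock budget yields different report counts per worker while an epoch budget does not.

```python
from typing import List


def reports_with_time_budget(epoch_seconds: float, runtime_seconds: float) -> int:
    """Completed epochs (one report each) that fit into a wall-clock budget."""
    return int(runtime_seconds // epoch_seconds)


def reports_with_epoch_budget(num_epochs: int) -> int:
    """Reports made when every worker runs a fixed number of epochs."""
    return num_epochs


# Hypothetical per-epoch durations (seconds) for three workers of different speed.
worker_epoch_seconds: List[float] = [1.0, 1.5, 2.0]

time_based = [
    reports_with_time_budget(seconds, runtime_seconds=6.0)
    for seconds in worker_epoch_seconds
]
epoch_based = [reports_with_epoch_budget(num_epochs=4) for _ in worker_epoch_seconds]

print(time_based)   # counts differ across workers
print(epoch_based)  # counts are identical across workers
```

With a time budget the slowest worker reports fewer times than the fastest one, which is the mismatch the commit describes; a fixed `num_epochs` removes it by construction.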
Antoni Baum
dc7ed086a5
[AIR] More checkpoint configurability, Result extension (#25943)
This PR:
* Allows the user to set `keep_checkpoints_num` and `checkpoint_score_attr` in `RunConfig` using the `CheckpointStrategy` dataclass
* Adds two new fields to the `Result` object - `best_checkpoints` - a list of saved best checkpoints as determined by `CheckpointingConfig`.
2022-06-29 08:23:29 -07:00
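As a rough illustration of what "keep the N best checkpoints by a score attribute" means in the commit above (#25943), here is a minimal sketch. The dataclass below is a hypothetical stand-in, not Ray's `CheckpointStrategy`: its first two field names are copied from the commit message, and `higher_is_better` is an assumption added for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class CheckpointStrategySketch:
    """Hypothetical stand-in; not the real Ray dataclass."""

    keep_checkpoints_num: Optional[int] = None  # None means keep everything
    checkpoint_score_attr: str = "loss"
    higher_is_better: bool = False  # assumption, not mentioned in the commit


def best_checkpoints(
    results: List[Dict[str, Any]], strategy: CheckpointStrategySketch
) -> List[Dict[str, Any]]:
    """Return the retained checkpoints, best first."""
    ranked = sorted(
        results,
        key=lambda result: result[strategy.checkpoint_score_attr],
        reverse=strategy.higher_is_better,
    )
    if strategy.keep_checkpoints_num is None:
        return ranked
    return ranked[: strategy.keep_checkpoints_num]


# Hypothetical metrics reported alongside three checkpoints.
reported = [
    {"epoch": 0, "loss": 0.9},
    {"epoch": 1, "loss": 0.4},
    {"epoch": 2, "loss": 0.6},
]
kept = best_checkpoints(reported, CheckpointStrategySketch(keep_checkpoints_num=2))
print([result["epoch"] for result in kept])
```

Here the two lowest-loss checkpoints (epochs 1 and 2) are retained, in that order.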
Antoni Baum
128f9e5664
[AIR] Move integration logging callbacks to AIR (#26126)
As the integration logging callbacks are commonly used with AIR Trainers, they should be moved from the tune package to the air package. The old imports will still work, but raise a deprecation warning.
2022-06-28 17:25:19 -07:00
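The commit above (#26126) keeps old imports working while emitting a deprecation warning. One common way to build such a shim is sketched below using only the standard library. The class name, the module paths, and the choice to warn at instantiation time are all assumptions made for illustration; the commit does not say how Ray implements it.

```python
import warnings


class ExampleLoggerCallback:
    """Placeholder for a callback that now lives in the new package."""


def deprecated_alias(new_cls: type, old_path: str, new_path: str) -> type:
    """Build a subclass that behaves like new_cls but warns when instantiated."""

    class _Deprecated(new_cls):
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{old_path} is deprecated; import from {new_path} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the shim
            )
            super().__init__(*args, **kwargs)

    _Deprecated.__name__ = new_cls.__name__
    return _Deprecated


# What the old module would export under the old name (paths are hypothetical).
OldExampleLoggerCallback = deprecated_alias(
    ExampleLoggerCallback, "old_pkg.integration", "new_pkg.integration"
)
```

Because the alias subclasses the new class, existing `isinstance` checks against the new class keep passing for objects created through the old import path.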
Stephanie Wang
c9be251b7a
Revert "[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962)" (#26176)
This reverts commit 68692b3464.
2022-06-28 17:07:07 -07:00
matthewdeng
68315b34b4
[docs] move troubleshooting section to source development page (#26166) 2022-06-28 16:06:46 -07:00
Simon Mo
68692b3464
[AIR][Serve] Rename ModelWrapperDeployment -> PredictorDeployment (#25962) 2022-06-28 10:26:10 -07:00
Dmitri Gekhtman
3af3269b8e
[KubeRay][docs] Warning about kubectl apply, update feature state wording. (#25685)
This PR:
* Adds a warning about a known issue to the KubeRay section of the Ray docs.
* Updates the description of the feature state of KubeRay integration.
* Adds some links to the KubeRay docs.
2022-06-27 14:11:00 -07:00
Ruben Berenguel
7f8d3c3dc7
Update installation instructions (#23121)
Currently, an unqualified `conda install` installs `grpcio` 1.44.0, whereas `ray` requires 1.43.0 under `pip install`. The instructions therefore cancel each other out, and you end up with an unusable installation because of missing `grpcio` symbols on ARM.

Co-authored-by: Simon Mo <simon.mo@hey.com>
2022-06-27 13:55:50 -07:00
Kai Fricke
75d08b0632
[tune/structure] Refactor suggest into search package (#26074)
This PR renames the `suggest` package to `search` and alters the layout slightly.

In the new package, the higher-level abstractions are on the top level and the search algorithms have their own subdirectories.

In a future refactor, we can turn algorithms such as PBT into actual `SearchAlgorithm` classes and move them into the `search` package. 

The main reason to keep algorithms and searchers in the same directory is to avoid user confusion - for a user, `Bayesopt` is as much a search algorithm as e.g. `PBT`, so it doesn't make sense to split them up.
2022-06-25 14:55:30 +01:00
shrekris-anyscale
6092869ff3
[Serve] [Docs] Create end-to-end documentation example for Serve REST API and CLI (#25936) 2022-06-24 14:44:39 -07:00
Antoni Baum
0ec198acc2
[AIR] Remove unnecessary pandas from examples (#26009)
Removes unnecessary pandas usage from AIR examples. Helps ensure users do not follow bad practices.
2022-06-24 14:38:23 -07:00
shrekris-anyscale
97a9a20f74
[Serve] [Docs] Add Serve REST API Schema to Serve API Docs (#25786) 2022-06-24 14:06:26 -07:00
Chen Shen
95fe3271ec
[Core][Doc] remove cython section from advanced doc. #26062
the example is removed.
2022-06-24 10:39:45 -07:00
Kai Fricke
012306da68
[hotfix] Fix linkcheck (#26070) 2022-06-24 13:38:01 +01:00
Artur Niederfahrenhorst
a3f1323457
[RLlib] Make QMix use the ReplayBufferAPI (#25560) 2022-06-23 22:55:22 -07:00
Antoni Baum
94492c2b49
[AIR/Docs] Improve user guide gallery (#26016)
Moves logging to examples, reorders to match TOC, adds missing entry.
2022-06-23 17:51:01 -07:00
Kai Fricke
b21314fac2
[tune/structure] Introduce trainable package (#26046)
Introduce a `trainable` package to house Trainable, FunctionTrainable (renamed), Session, and utilities.
2022-06-23 21:50:55 +01:00
Guyang Song
2934efe502
[runtime_env] move 'eager_install' to 'config' (#26004) 2022-06-23 13:16:52 -05:00
Guyang Song
26ae3a0239
[Doc] [C++ API] Add note about ABI issue (#26030) 2022-06-23 13:14:50 -05:00
Kai Fricke
0959f44b6f
[tune/structure] Introduce execution package (#26015)
Execution-specific packages are moved to tune.execution.

Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
2022-06-23 11:13:19 +01:00
Sihan Wang
c0cf9b8098
[Serve][Doc] Autoscaling (#25646)
- new section of doc for autoscaling (introduction of serve autoscaling and config parameter)
- Remove the version requirement note inside the doc

Co-authored-by: Simon Mo <simon.mo@hey.com>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>
Co-authored-by: Archit Kulkarni <architkulkarni@users.noreply.github.com>
2022-06-22 15:32:18 -05:00
sychen52
84401bb616
add missing brackets (#25992) 2022-06-22 15:30:55 -05:00
Sven Mika
464ac82207
[RLlib] Small docs fixes for evaluation + training. (#25957) 2022-06-22 13:11:18 +02:00
Rina Ueno
a29eeaa1f6
[Workflows] Explain workflow_id and task_name in the docs (#25800) 2022-06-21 15:24:16 -07:00
Eric Liang
43aa2299e6
[api] Annotate as public / move ray-core APIs to _private and add enforcement rule (#25695)
Enable checking of the ray core module, excluding serve, workflows, and tune, in ./ci/lint/check_api_annotations.py. This required moving many files to ray._private and associated fixes.
2022-06-21 15:13:29 -07:00
sychen52
5c58d43df2
[docs][minor] Change one of the because to therefore. (#25921) 2022-06-21 10:41:40 -05:00
matthewdeng
fe4185974a
[docs] fix swapped pattern docs (#25948)
The contents of the two docs were switched.

The "Unnecessary Ray Get" images were correctly placed in `unnecessary-ray-get.rst`, which made the swap noticeable beyond the URL.
2022-06-21 10:37:37 -05:00
Philipp Moritz
c604bc23c7
[Docs] Fix documentation building instructions (#25942)
It is often a bit challenging to get the full documentation to build (there are external packages that can make this challenging). This changes the instructions to treat warnings as warnings and not errors, which should improve the workflow.

`make develop` is the same as `make html` except it doesn't treat warnings as errors.
2022-06-20 18:04:25 -07:00
Myeongju Kim
a1a78077ca
Fix a broken link in Ray Dataset doc (#25927)
Co-authored-by: Myeong Kim <myeongki@amazon.com>
2022-06-20 13:17:46 -07:00
Sven Mika
1499af945b
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
Zhe Zhang
216aede789
Remove RL Summit announcement (#25354) 2022-06-18 16:01:23 -07:00
Clark Zinzow
1701b923bc
[Datasets] [Tensor Story - 2/2] Add "numpy" batch format for batch mapping and batch consumption. (#24870)
This PR adds a NumPy "numpy" batch format for batch transformations and batch consumption that works with all block types. See #24811.
2022-06-17 16:01:02 -07:00
clarng
2b270fd9cb
apply isort uniformly for a subset of directories (#25824)
Simplify isort filters and move it into isort cfg file.

With this change, isort will no longer apply to diffs, only to files that are in a whitelisted directory (isort only supports a blacklist, so we implement the whitelist in terms of that instead). This is much simpler than building our own whitelist logic, since our formatter runs multiple codepaths depending on whether it is formatting a single file, a PR, or the entire repo in CI.
2022-06-17 13:40:32 -07:00
Archit Kulkarni
b24c736bb8
[Doc] [runtime env] Add note that excludes paths are relative to working_dir (#25874)
Users' intuition might lead them to fill out `excludes` with absolute paths, e.g. `/Users/working_dir/subdir/`. However, the `excludes` field uses `gitignore` syntax. In `gitignore` syntax, paths that start with `/` are interpreted relative to the directory where the `gitignore` file resides, and in our case this is the `working_dir` directory (conceptually, since there's no actual `.gitignore` file). So the correct thing to put in `excludes` would be `/subdir/`. As long as we support `gitignore` syntax, we should have a note in the docs for this. This PR adds the note.
2022-06-17 10:50:04 -05:00
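To make the `excludes` note in the commit above (#25874) concrete, here is a small sketch that turns an absolute path into a pattern anchored at `working_dir`. The helper `to_exclude_pattern` is hypothetical (it is not part of Ray); the `working_dir` and `excludes` keys and the example paths come from the commit message.

```python
from pathlib import PurePosixPath


def to_exclude_pattern(working_dir: str, absolute_path: str, is_dir: bool = True) -> str:
    """Convert an absolute path under working_dir into a gitignore-style
    pattern anchored at working_dir. Raises ValueError if the path lies
    outside working_dir."""
    relative = PurePosixPath(absolute_path).relative_to(PurePosixPath(working_dir))
    pattern = "/" + relative.as_posix()
    return pattern + "/" if is_dir else pattern


runtime_env = {
    "working_dir": "/Users/working_dir",
    "excludes": [
        to_exclude_pattern("/Users/working_dir", "/Users/working_dir/subdir/"),
    ],
}
print(runtime_env["excludes"])  # ['/subdir/']
```

The leading `/` in the result anchors the pattern at the top of `working_dir`; without it, gitignore syntax would match a `subdir/` at any depth.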
sychen52
edf16b8e2c
[docs] Edit the output of the script to match the code (#25855) 2022-06-17 10:48:28 -05:00
sychen52
ce02ac0311
[docs] Fix example actor indentation (#25882) 2022-06-16 22:06:21 -07:00
Jiao
f6735f90c7
[Ray DAG] Move dag project folder out of experimental (#25532) 2022-06-16 19:15:39 -07:00
Clark Zinzow
3dda4e1d46
[Docs] Add a py:obj default role to Sphinx builds. (#25765)
By setting the [Sphinx `default_role`](https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-default_role) to [`py:obj`](https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#role-py-obj), we can concisely cross-reference other Python APIs (classes or functions) in API docstrings while still maintaining the editor/IDE/terminal readability of the docstrings.

Before this PR, when referencing a class or a function, the relevant role specification is required: :class:`Dataset`, :meth:`Dataset.map`, :func:`.read_parquet`.

After this PR, the raw cross reference will work in most cases: `Dataset`, `Dataset.map`, `read_parquet`.

## Checks

- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [x] I've included any doc changes needed for https://docs.ray.io/en/master/.
- [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [x] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(
2022-06-16 16:33:20 -07:00
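For the Sphinx change described in the commit above (#25765), the configuration side is a single setting in `conf.py`. The fragment below is a sketch: `default_role` is a documented Sphinx option, while `load_example` is a hypothetical function included only to show the docstring markup before and after.

```python
# conf.py (fragment)
# Interpret bare `single-backtick` text as a Python object cross-reference.
default_role = "py:obj"


# Elsewhere in the code base (hypothetical function, docstring markup only):
def load_example(path):
    """Load a dataset from ``path``.

    Before: returns a :class:`Dataset`; see also :meth:`Dataset.map`.
    After:  returns a `Dataset`; see also `Dataset.map`.
    """
```

Literal text that is not meant to be a cross-reference still needs double backticks, as in ``path`` above.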
Chen Shen
8e7e89a178
[Data] fix broken link (#25867)
update the broken spark link.
2022-06-16 14:01:38 -07:00
shrekris-anyscale
d944f7469c
[Serve] [Docs] Remove references to namespaces in the Serve documentation (#25830)
#25575 starts all Serve actors in the `"serve"` namespace. This change updates the Serve documentation to remove now-outdated explanations about namespaces and to specify that all Serve actors start in the `"serve"` namespace.
2022-06-16 10:50:49 -05:00
Antoni Baum
91dd360f9d
[AIR/train] Move predictors to ray.train (#25769) 2022-06-15 17:02:15 -07:00
Antoni Baum
b5fd02af4f
[CI] Print linkcheck summary only in linkcheck (#25781) 2022-06-15 16:21:08 -07:00
zcin
3f91cbd979
[serve][docs] Replaced term 'actor_init_options' with 'ray_actor_options' in documentation (#25808)
Replaced the term `actor_init_options` with `ray_actor_options` in [this documentation section](https://docs.ray.io/en/releases-1.13.0/serve/performance.html#choosing-the-right-hardware) because `actor_init_options` is an outdated variable name. It's been changed to `ray_actor_options` in the [code](2546fbf99d/python/ray/serve/deployment.py (L45)).
2022-06-15 15:21:24 -05:00
Clark Zinzow
526e12074a
[Datasets] Make it clear that read_parquet() does not support multiple directories. (#25747)
Unfortunately, ray.data.read_parquet() doesn't work with multiple directories since it uses Arrow's Dataset abstraction under the hood, which doesn't accept multiple directories as a source: https://arrow.apache.org/docs/python/generated/pyarrow.dataset.dataset.html

This PR makes this clear in the docs, and as a driveby, adds ray.data.read_parquet_bulk() to the API docs.
2022-06-15 13:19:39 -07:00
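Given the limitation described in the commit above (#25747), one possible workaround is to expand the directories into an explicit list of files before reading. The sketch below uses only the standard library; `list_parquet_files` and the directory names are hypothetical, and passing the resulting list to `ray.data.read_parquet()` assumes the reader accepts a list of file paths, which should be checked against the API docs for the Ray version in use.

```python
from pathlib import Path
from typing import Iterable, List


def list_parquet_files(directories: Iterable[str]) -> List[str]:
    """Collect every *.parquet file under the given directories.

    Files are sorted within each directory so the result is deterministic.
    """
    files: List[str] = []
    for directory in directories:
        files.extend(str(path) for path in sorted(Path(directory).rglob("*.parquet")))
    return files


# Assumed usage (not executed here, requires Ray):
#   ds = ray.data.read_parquet(list_parquet_files(["data/2021", "data/2022"]))
```

Note that flattening directories this way discards any meaning encoded in the directory layout, so it is only suitable when the files share one schema.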
clarng
ef866d1e49
exclude doc_code from import sorting (#25772)
Skip sorting the imports in doc_code.
2022-06-15 11:34:45 -07:00