2208 commits

Author SHA1 Message Date
Tao Wang
cd521ed132
[Doc][namespaces][C++ worker]add document for c++ worker namespace and specifying namespace while creating/getting named actors (#26498)
Namespace support for the C++ worker was added in https://github.com/ray-project/ray/pull/26327. This change documents its usage and also strengthens the Java and Python documentation, for example by explaining how to specify a namespace when creating named actors.

- [x] add doc for basic c++ worker namespace usage
- [x] add explanation for specifying namespace while creating named actors, in Python, Java and C++
2022-07-20 10:58:41 +08:00
Dmitri Gekhtman
fdd5c53bfd
[KubeRay] Documentation structure and skeleton (#26589)
Adds outline and structure for new KubeRay-based Ray-on-Kubernetes docs.
2022-07-19 13:28:04 -07:00
Richard Liaw
6563c2762d
[air] add pytorch benchmark number (#26719) 2022-07-19 09:51:13 -07:00
Richard Liaw
7e62e1187c
[air/benchmark] Torch benchmarks for 4x4 (#26692)
Add benchmark data for 4x4 GPU setup.

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Jimmy Yao <jiahaoyao.math@gmail.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2022-07-19 17:06:37 +01:00
Siyuan (Ryans) Zhuang
5b937167d3
[Workflow] Fix typo in workflow event doc (#26686)
Signed-off-by: Siyuan Zhuang <suquark@gmail.com>
2022-07-18 23:26:50 -07:00
Siyuan (Ryans) Zhuang
eb4ed49c1f
[Workflow] Unify the semantics of max_retries of workflow task and Ray task (#26350)
* workflow task retry

Signed-off-by: Siyuan Zhuang <suquark@gmail.com>

* move and enhance tests

Signed-off-by: Siyuan Zhuang <suquark@gmail.com>

* use "max_retries" of Ray task

Signed-off-by: Siyuan Zhuang <suquark@gmail.com>

* add test for disabling lineage reconstruction in workflow

Signed-off-by: Siyuan Zhuang <suquark@gmail.com>
2022-07-18 23:25:44 -07:00
Sumanth Ratna
759966781f
[air] Allow users to use instances of ScalingConfig (#25712)
Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2022-07-18 15:46:58 -07:00
matthewdeng
6670708010
[air] add placement group max CPU to data benchmark (#26649)
Set experimental `_max_cpu_fraction_per_node` to prevent deadlock.

This should technically be a no-op with the SPREAD strategy.
2022-07-18 10:34:40 -07:00
Chen Shen
b20f5f51df
[Air][Data] Don't promote locality_hints for split (#26647)
Why are these changes needed?
Since locality_hints is an experimental feature, we stop promoting it in the docs and don't enable it in AIR. See #26641 for more context.
2022-07-17 22:18:30 -07:00
Jiao
98a07920d3
[AIR][CUJ] Make distributing training benchmark at silver tier (#26640) 2022-07-17 22:07:09 -07:00
Jules S. Damji
55368402ee
added summary why and when to use bulk vs streaming data ingest (#26637) 2022-07-17 18:46:58 -07:00
Eric Liang
12825fc5aa
[air] Add a warning if no CPUs are reserved for dataset execution (#26643) 2022-07-17 16:33:51 -07:00
Clark Zinzow
864af14f41
[Datasets] [Local Shuffle - 1/N] Add local shuffling option. (#26094)
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
Co-authored-by: Matthew Deng <matt@anyscale.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-07-17 16:21:14 -07:00
Eric Liang
400330e9c0
[air] Add _max_cpu_fraction_per_node to ScalingConfig and documentation (#26634) 2022-07-16 21:55:51 -07:00
Amog Kamsetty
3a345a470c
[AIR/Docs] Add Predictor Docs (#25833) 2022-07-16 21:14:21 -07:00
Jiao
77e2ef2eb6
[AIR] Update Torch benchmarks with documentation (#26631)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-07-16 17:58:21 -07:00
Eric Liang
0855bcb77e
[air] Use SPREAD strategy by default and don't special case it in benchmarks (#26633) 2022-07-16 17:37:06 -07:00
M Waleed Kadous
7c32993c15
[core/docs]Add a new section under Ray Core called Ray Gotchas (#26624)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-07-16 16:53:01 -07:00
Antoni Baum
fb6f3cf708
[AIR/Docs] Small improvements to Train user guide (#26577)
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2022-07-16 16:51:17 -07:00
Eric Liang
6217138eb0
[docs] Move AIR benchmarks to top level (#26632) 2022-07-16 15:34:31 -07:00
Philipp Moritz
081bbfbff1
[Examples] Test OCR example in documentation tests (#26482)
Make sure the OCR example is tested in documentation after we discovered that example notebooks are not tested in CI.

Signed-off-by: Philipp Moritz <pcmoritz@gmail.com>
2022-07-16 10:51:28 -07:00
Richard Liaw
799311b2f7
[air/docs] update examples to remove pandas again (#26598) 2022-07-16 08:40:44 -07:00
Balaji Veeramani
34cf1f17ea
[Datasets] Add ImageFolderDatasource (#24641)
Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-07-15 22:43:23 -07:00
matthewdeng
e3a096f412
[air] add bulk ingest benchmarks (#26618) 2022-07-15 22:01:23 -07:00
Richard Liaw
5ad4e75831
[air] Add initial benchmark section (#26608) 2022-07-15 15:33:48 -07:00
Jiao
647e12b6c7
[AIR] Fix convert_existing_pytorch_code_to_ray_air notebook (#26523) 2022-07-14 14:30:55 -07:00
Tim Gates
e42dc7943e
docs: Fix a few typos (#26556)
There are small typos in:
- doc/source/data/faq.rst
- python/ray/serve/replica.py

Fixes:
- Should read `successfully` rather than `succssifully`.
- Should read `pseudo` rather than `psuedo`.
2022-07-14 12:38:33 -07:00
Jiajun Yao
60dd77a2d3
Enable usage stats collection for ray.init iff nightly wheels (#26461)
For nightly wheels, we want to collect usage stats for local clusters started via ray.init() as well.
2022-07-14 12:14:01 -07:00
Amog Kamsetty
6595bd6e2d
[AIR] Introduce better scoring API for BatchPredictor (#26451)
Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com>

As discussed offline, allow configurability for feature columns and keep columns in BatchPredictor for better scoring UX on test datasets.
2022-07-14 11:26:12 -07:00
Richard Liaw
a0ce3c111b
[air/data] Concatenator preprocessor (#26526) 2022-07-14 10:26:14 -07:00
Eric Liang
5f18c67ba3
Fix LINT (#26554)
Signed-off-by: Eric Liang <ekhliang@gmail.com>
2022-07-13 23:28:02 -07:00
Jiao
15dbc0362a
[AIR][Docs] Fix torch_image_example (#26453) 2022-07-13 21:59:24 -07:00
Scott Cheng
1bc44c13fb
Update Python3.10 in docs (#26463)
Make it clear to users that Ray supports Python 3.10.
2022-07-13 20:08:56 -07:00
Eric Liang
31c8c908f9
[docs] Improve AIR API ref organization (#26530) 2022-07-13 18:05:17 -07:00
Sihan Wang
b606169cb5
[Serve] Promote autoscaling feature (#26393)
1. get rid of the private attribute
2. fix unit test
3. docs and workflows
2022-07-13 14:38:38 -05:00
Antoni Baum
cc7115f6a2
[Tune/CI] Fix tune-sklearn notebook example (#26470)
Fixes the tune-sklearn notebook example as found in #26410

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2022-07-13 18:14:36 +01:00
Antoni Baum
5ed10ef921
[AIR/CI] Fix Hugging Face notebook example (#26475) 2022-07-13 09:16:42 -07:00
Antoni Baum
ddb5572040
[Tune/CI] Fix Hyperopt notebook example (#26469)
Fixes failing hyperopt notebook in CI (as found in #26410). The cause was a mismatch between keys in points to evaluate and the search space - now, an informative exception will be raised.
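
The kind of check described above can be sketched as follows. This is a minimal illustration, not Ray Tune's actual code: the function name `validate_points_to_evaluate` and the error wording are hypothetical, and the search space is assumed to be a flat dict.

```python
def validate_points_to_evaluate(search_space, points_to_evaluate):
    """Raise a descriptive error if any point's keys differ from the search space keys.

    Hypothetical helper for illustration only; assumes a flat (non-nested) search space.
    """
    expected = set(search_space)
    for index, point in enumerate(points_to_evaluate):
        keys = set(point)
        if keys != expected:
            missing = sorted(expected - keys)
            unexpected = sorted(keys - expected)
            raise ValueError(
                f"points_to_evaluate[{index}] does not match the search space: "
                f"missing keys {missing}, unexpected keys {unexpected}"
            )


# Usage: the second point uses "learning_rate" where the search space has "lr".
space = {"lr": (1e-4, 1e-1), "momentum": (0.1, 0.9)}
good = [{"lr": 0.01, "momentum": 0.5}]
bad = good + [{"learning_rate": 0.01, "momentum": 0.5}]

validate_points_to_evaluate(space, good)  # passes silently
try:
    validate_points_to_evaluate(space, bad)
except ValueError as err:
    message = str(err)
```

Failing early with the offending keys named tends to be easier to debug than an error raised later from inside the searcher.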

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2022-07-13 16:50:11 +01:00
Antoni Baum
9b2cd29511
[CI] Install Horovod in doc tests to fix notebook (#26476)
Fixes the Horovod notebook example as found in #26410 by installing Horovod in doc tests jobs.

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2022-07-13 16:27:20 +01:00
Antoni Baum
67a7ffa6b4
[Tune/CI] Fix BOHB notebook example (#26473)
Fixes the BOHB notebook example as found in #26410

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2022-07-13 10:35:38 +01:00
Avnish Narayan
5df66b917d
[Lint Check] Remove broken link (#26505)
The paper is not available anymore.
2022-07-13 10:30:20 +01:00
Antoni Baum
e48d381926
[Tune/CI] Fix Tune-Pytorch-CIFAR notebook example (#26474)
Fixes the Tune-Pytorch-CIFAR notebook example as found in #26410

Signed-off-by: Antoni Baum <antoni.baum@protonmail.com>
2022-07-13 10:28:30 +01:00
Christy Bergman
7c925fe99f
[RLlib; docs] Re-organize algorithms so TOC matches README. (#26339) 2022-07-13 10:46:36 +02:00
Eric Liang
9de1add073
[Datasets] Autodetect dataset parallelism based on available resources and data size (#25883)
This PR defaults the parallelism of Dataset reads to `-1`. The parallelism is determined according to the following rule in this case:
- The number of available CPUs is estimated. If in a placement group, the number of CPUs in the cluster is scaled by the size of the placement group compared to the cluster size. If not in a placement group, this is the number of CPUs in the cluster. If the estimated number of CPUs is less than 8, it is set to 8.
- The parallelism is set to the estimated number of CPUs multiplied by 2.
- The in-memory data size is estimated. If the parallelism would create in-memory blocks larger than the target block size (512MiB), the parallelism is increased until the blocks are < 512MiB in size.

These rules fix two common user problems:
1. Insufficient parallelism in a large cluster, or too much parallelism on a small cluster.
2. Overly large block sizes leading to OOMs when processing a single block.
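
The parallelism rule described above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the actual Ray Datasets implementation: the names `estimate_cpus`, `autodetect_parallelism`, and `placement_group_fraction` are hypothetical, and the placement group scaling is modeled as a single fraction of the cluster.

```python
TARGET_MAX_BLOCK_SIZE = 512 * 1024 * 1024  # 512 MiB target block size, from the message
MIN_ESTIMATED_CPUS = 8                     # lower bound on the CPU estimate


def estimate_cpus(cluster_cpus, placement_group_fraction=None):
    """Estimate available CPUs.

    placement_group_fraction is a hypothetical parameter: the size of the
    placement group relative to the cluster, or None when not in one.
    """
    cpus = cluster_cpus
    if placement_group_fraction is not None:
        cpus = cluster_cpus * placement_group_fraction
    return max(MIN_ESTIMATED_CPUS, int(cpus))


def autodetect_parallelism(cluster_cpus, data_size_bytes, placement_group_fraction=None):
    """Pick a read parallelism: 2x estimated CPUs, raised until blocks are < 512 MiB."""
    parallelism = estimate_cpus(cluster_cpus, placement_group_fraction) * 2
    if data_size_bytes:
        # Smallest parallelism that makes each block strictly smaller than the target.
        min_for_block_size = data_size_bytes // TARGET_MAX_BLOCK_SIZE + 1
        parallelism = max(parallelism, min_for_block_size)
    return parallelism


GiB = 1024 ** 3
small_cluster = autodetect_parallelism(4, 1 * GiB)        # CPU floor of 8 applies -> 16
large_dataset = autodetect_parallelism(4, 100 * GiB)      # block size rule applies -> 201
in_placement_group = autodetect_parallelism(100, 1 * GiB, placement_group_fraction=0.5)  # -> 100
```

In this sketch the CPU-based value acts as a floor, and the data size only ever increases the parallelism, which matches the two problems listed above.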

TODO:
- [x] Unit tests
- [x] Docs update

Supersedes part of: https://github.com/ray-project/ray/pull/25708

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-136.us-west-2.compute.internal>
2022-07-12 21:08:49 -07:00
Pamphile Roy
53ecc28f9f
[docs] Install ray from conda-forge instead of PyPI when using conda (#25296) 2022-07-12 16:59:44 -07:00
Eric Liang
4c04c8d92c
[doc] Rename toc entry for libraries back to "Ray Libraries" (#26485) 2022-07-12 14:23:36 -07:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config (#25911) 2022-07-12 11:04:34 -07:00
Richard Liaw
92efc85b3b
[air/docs] checkpoints (#25901) 2022-07-11 20:40:23 -07:00
Richard Liaw
1abe908c22
[air/docs] improve consistency of getting started (#26247) 2022-07-11 20:16:37 -07:00
Richard Liaw
191921f4ec
[docs] Fix pytest and add stacklevel (#26340) 2022-07-11 19:43:37 -07:00