Commit graph

1560 commits

Author SHA1 Message Date
Sven Mika
9d5c4a9d21
[RLlib] API reference pages: rllib/env package only. (#20486) 2021-11-19 10:06:40 +01:00
Alex Wu
88266a6fce
Revert "Revert "[Docs] More detailed M1 Mac installation instructions"" (#20549)
Reverts ray-project/ray#20547
2021-11-18 20:18:37 -08:00
Eric Liang
65a8698e82
Raise the dataset block size limit to 2GiB (#20551)
The default block size of 500MiB seems too low for some common workloads, e.g. shuffling 500GB. This creates 1000 blocks which means 1 million intermediate shuffle objects until we implement #20500.
2021-11-18 19:36:10 -08:00
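The block-count arithmetic in the commit message above can be checked with a short sketch. It rests on two assumptions not stated in the log: "500GB" is read as 500 GiB, and an all-to-all shuffle over N blocks produces N x N intermediate objects (one per map-block and reduce-partition pair). The helper name `shuffle_object_count` is hypothetical and is not part of Ray.

```python
import math
from typing import Tuple

MIB = 1024 ** 2
GIB = 1024 ** 3


def shuffle_object_count(dataset_bytes: int, block_bytes: int) -> Tuple[int, int]:
    """Return (num_blocks, num_intermediate_objects).

    Assumption: every input block emits one piece per output partition,
    and the number of output partitions equals the number of blocks.
    """
    num_blocks = math.ceil(dataset_bytes / block_bytes)
    return num_blocks, num_blocks * num_blocks


# 500 GiB dataset with the old ~500 MiB limit versus the new 2 GiB limit.
old = shuffle_object_count(500 * GIB, 500 * MIB)
new = shuffle_object_count(500 * GIB, 2 * GIB)
print(old)  # (1024, 1048576)
print(new)  # (250, 62500)
```

Under these assumptions, raising the limit cuts the block count by about 4x and the intermediate object count by about 16x, which is roughly the "1000 blocks, 1 million objects" figure quoted in the message.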
Richard Liaw
c964455642
Revert "[Docs] More detailed M1 Mac installation instructions" (#20547)
Reverts ray-project/ray#20512 due to lint errors.
2021-11-18 12:06:57 -08:00
Antoni Baum
0b14f38ac7
[tune] Multi-objective support for Optuna (#20489)
This PR adds multi-objective support for Optuna searchers, including a test and example.

Co-authored-by: gjoliver <jungong@anyscale.com>
2021-11-18 18:47:29 +00:00
Alex Wu
540c9e35d1
[Docs] More detailed M1 Mac installation instructions (#20512)
This PR adds more detail to the M1 Mac installation instructions following the bug bash.
2021-11-18 09:35:43 -08:00
Sven Mika
7a585fb275
[RLlib; Documentation] RLlib README overhaul. (#20249) 2021-11-18 18:08:40 +01:00
shrekris-anyscale
65a023ef71
[runtime_env][docs] Add documentation on using remote URIs for runtime environments (#20352) 2021-11-17 23:17:48 -06:00
Amog Kamsetty
9796ae56d5
[Train][Data] Change usages of iter_datasets to iter_epochs (#20487) 2021-11-17 18:05:51 -08:00
Yi Cheng
cbf5826040
[workflow] Fix workflow event doc typo (#20465)
In the example, it says `after_checkpoint`, but this should be `event_checkpointed`
2021-11-17 16:18:20 -08:00
Qing Wang
e01f14d7df
[DOC] Add namespace doc for Java part. (#20428)
Add namespace doc for Java part.
2021-11-17 23:02:47 +08:00
Simon Mo
18d605fa7c
[Serve] Add experimental CLI for serve deploy (#20371) 2021-11-16 20:22:09 -08:00
Larry
454db6902c
[Java] Add timeout parameter for Ray.get() API (#20282)
Why are these changes needed?

Add a timeout (ms) parameter to the Java Ray.get API. The docs have been updated to cover the API change ([Ray Core Walkthrough]->[Fetching Results]).

e.g.:
ObjectRef<Integer> objRef = Ray.put(1);
objRef.get(1000);
Ray.get(Ray.task(MyRayApp::slowFunction).remote(), 3000);

Related issue number
#20247
2021-11-17 11:02:17 +08:00
Simon Mo
5fccad4cc9
[Serve] Add experimental pipeline docs (#20292) 2021-11-16 16:13:55 -08:00
Richard Liaw
cf357f6bce
[docs] Add a talks section for ray.data (#20444) 2021-11-16 14:30:08 -08:00
Antoni Baum
3f9ded55f7
[tune] Merge Analysis into ExperimentAnalysis (#20197)
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-11-16 16:47:12 +00:00
Amog Kamsetty
4f88796d5a
[Train] Move to beta (#20378) 2021-11-16 08:19:30 -08:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py." (#20285)
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Eric Liang
460cf86858
Split blocks automatically into 500MB chunks on file read and transformation (#20235)
This PR adds support for automatic block splitting on read and map transforms, to keep block size bounded to ~500MiB. This avoids potential OOM situations where a map task may consume too much intermediate Python heap memory, or too much object store shared memory for one block.
2021-11-15 22:25:11 -08:00
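The idea of keeping block size bounded can be illustrated with a minimal sketch. This is not Ray's implementation: `split_into_blocks` and `TARGET_MAX_BLOCK_BYTES` are hypothetical names, and the sketch assumes rows arrive with known byte sizes and are packed greedily into blocks that stay at or under the limit.

```python
from typing import Iterable, List

# ~500 MiB, the bound named in the commit message (hypothetical constant name).
TARGET_MAX_BLOCK_BYTES = 500 * 1024 ** 2


def split_into_blocks(
    row_sizes: Iterable[int], max_bytes: int = TARGET_MAX_BLOCK_BYTES
) -> List[List[int]]:
    """Greedily group row sizes into blocks whose total stays <= max_bytes.

    A single row larger than max_bytes still gets a block of its own.
    """
    blocks: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in row_sizes:
        # Start a new block when adding this row would exceed the bound.
        if current and current_bytes + size > max_bytes:
            blocks.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size
    if current:
        blocks.append(current)
    return blocks


# Small numbers stand in for byte counts to keep the example readable.
blocks = split_into_blocks([300, 300, 300, 700, 100], max_bytes=600)
print(blocks)  # [[300, 300], [300], [700], [100]]
```

Bounding each block this way limits how much heap or object store memory any one task must hold at a time, which is the OOM concern the commit message describes.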
Antoni Baum
ec81f52061
[Docs] Fix typo in C++ Placement Group example (#20386) 2021-11-16 08:19:09 +09:00
Will Drevo
fa878e2d4d
Added example to user guide for cloud checkpointing (#20045)
Co-authored-by: will <will@anyscale.com>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-11-15 15:43:06 +00:00
Amog Kamsetty
a74cf7ff1c
[Train] Torch Prepare utilities (#20254)
* update

* formatting

* fix failures

* fix session tests

* address comments

* add to api docs

* package refactor

* wip

* wip

* wip

* finish

* finish

* fix

* comment

* fix

* install horovod for docs

* address comment

* Update python/ray/train/session.py

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* Update python/ray/train/torch.py

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>

* address comments

* try fix docs

* fix doc build failure

* fix

* fix

* fix

* try fix doc highlighting

* fix docs

Co-authored-by: matthewdeng <matthew.j.deng@gmail.com>
2021-11-15 07:34:17 -08:00
Qing Wang
1172195571
[Java] Remove global named actor and global pg (#20135)
This PR removes global named actor and global PGs.

I believe these APIs are not used widely in OSS.
CPP part is not included in this PR.
@kfstorm @clay4444 @raulchen Please take a look if this change is reasonable.


IMPORTANT NOTE: This is a Java API change and will lead to backward incompatibility in Java global named actor and global PG usage.

INCLUDES:

* Remove setGlobalName() and getGlobalActor() APIs.
* Remove getGlobalPlacementGroup() and setGlobalPG
* Add getActor(name, namespace) API
* Add getPlacementGroup(name, namespace) API
* Update doc pages.
2021-11-15 16:28:53 +08:00
Sven Mika
e5ead6a4b0
[RLlib; Documentation] Minor fixes "rllib in 60s" and per-feature sigils. (#20248) 2021-11-13 22:10:47 +01:00
Amog Kamsetty
65a17da2ec
[Train] Refactor Backends (#20312)
* wip

* finish

* comment

* fix

* install horovod for docs

* address comment

* fix doc build failure
2021-11-13 11:05:53 -08:00
matthewdeng
e77cc926be
[train] minor doc updates (#20271) 2021-11-12 17:20:23 -08:00
Tricia Fu
e59c14117f
[Doc] [Serve] Add summary sub header to each page (#20231) 2021-11-12 14:18:42 -08:00
xwjiang2010
cdf70c2900
[Tune] Remove legacy resources implementations in Runner and Executor. (#19773) 2021-11-12 12:33:39 -08:00
Siyuan (Ryans) Zhuang
3b62388a9a
[Workflow] Workflow tail recursion optimization (#19928)
* tail recursion optimization
2021-11-12 09:13:40 -08:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. (#20055)" (#20284)
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Kai Fricke
d88fdd6e38
[tune] refactor SyncConfig (#20155) 2021-11-12 09:36:15 +00:00
Michael Galarnyk
dbeb2e2f73
Add Ray Serve Blogs to Doc (#19846)
The Serving ML Models in Production blog link is in line with the latest Ray Summit talk on Ray Serve.
2021-11-11 15:10:36 -08:00
Edward Oakes
59698aa89c
[Serve] add survey link (#20230) 2021-11-11 15:10:10 -08:00
Jules S. Damji
71a162d8ab
Fixed code snippet to include config parameter and a minor typo (#20193)
Signed-off-by: Jules S.Damji <jules@anyscale.com>

Co-authored-by: Jules S.Damji <jules@anyscale.com>
2021-11-11 18:37:03 +00:00
Dmitri Gekhtman
8971422d8f
[autoscaler] Use drain node api in autoscaler before terminating nodes (#20013)
* wip

* Draft

* Use bytes for node id

* remove stray helm change

* fix autoscaler init arg

* don't forget to instantiate new load metrics dict

* remove extraneous diff

* Timeout, comments, function signature.

* typo

* another comment

* tweak

* docstring

* shorter timeout

* Use a better error code

* missing self

* Dedent example

* Add drain node prometheus metric.

* comment

* Update tests part 1: test_autoscaler.py

* Update tests part 2: test_resource_demand_scheduler

* lint

* Update tests part 3: test_autoscaling_policy

* Unit tests for new Prometheus metric and DrainNode error handling.

* comment

* removed unused function

* Try adding ability to mock out process termination to fake node provider

* Add integration test.

* fix

* fix

* lint

* Improve log message

* fix

* Simplify test

* Fix doc example

* remove unused dict

* Mock out process termination in a subclass

* Add doc string and comment explaining prune active ips.

* Comment: wtf is use_node_id_as_ip

* one more comment

* more explanation

* period

* tweak
2021-11-11 08:31:40 -08:00
Sven Mika
6f85af435f
[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. (#20055) 2021-11-11 12:16:20 +01:00
Will Drevo
2fdb1c46c7
[RLlib; Documentation] Added atari pip installs to Pong-v0 example. (#20225)
* Added imports to Pongv0 example

* Added comment

* Apply suggestions from code review

Co-authored-by: will <will@anyscale.com>
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-11-11 09:08:02 +01:00
Tobias Kaymak
893f57591d
[serve] Add Google Cloud Storage as a backend (#20104) 2021-11-10 19:45:19 -08:00
Edward Oakes
082a4af3e6
[serve] Remove lingering backend/endpoint wording in docs (#20229) 2021-11-10 16:49:29 -08:00
Sven Mika
ebd56b57db
[RLlib; documentation] "RLlib in 60sec" overhaul. (#20215) 2021-11-10 22:20:06 +01:00
matthewdeng
790e22f9ad
[tune] move force_on_current_node to ml_utils (#20211) 2021-11-10 10:21:24 -08:00
Sven Mika
143d23a278
[RLlib] Issue 20062: Action inference examples missing (#20144) 2021-11-10 18:49:06 +01:00
Kim Pevey
82a5bf68fa
[Docs] Add note for multi-node on Windows (#20184)
* add note for multi-node on Windows

* update message

Co-authored-by: Philipp Moritz <pcmoritz@gmail.com>
2021-11-09 16:02:01 -08:00
Kai Fricke
9c2b8c8501
[tune] Deprecate DurableTrainable (#19880) 2021-11-08 20:56:07 +00:00
Amog Kamsetty
b1f24768a1
[Tune] More fixes to PTL Tutorial (#20065)
* ptl-fix-2

* improve

* fix
2021-11-08 09:13:44 -08:00
Jules S. Damji
e6343f0e69
Fixed a broken code snippet with a missing method (#20130)
Signed-off-by: Jules S.Damji <jules@anyscale.com>

Co-authored-by: Jules S.Damji <jules@anyscale.com>
2021-11-08 07:56:32 +09:00
Alex Wu
81194f5660
[workflow][docs] Fix api comparison formatting (#20069)
## Why are these changes needed?

The API comparison formatting uses \`code\`, which is rendered as italics, not code. This PR puts the code in code blocks instead of italics.
## Related issue number

## Checks
2021-11-05 17:05:35 -07:00
Chen Shen
320f9dc234
[Core][CoreWorker] increase the default port range (#19541)
* increase the port range

* Update doc/source/configure.rst

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2021-11-05 09:25:44 -07:00
Eric Liang
6102912494
Dataset doc updates (#19815) 2021-11-04 18:13:40 -07:00
javi-redondo
11371768c1
Update Ray client docs with working_dir explanation (#18294) 2021-11-04 14:52:28 -07:00