1413 commits

Author SHA1 Message Date
Clark Zinzow
a86277a93c
[dask-on-ray] Fix Dask-on-Ray examples in docs (#14461) 2021-03-17 10:37:32 -07:00
Edward Oakes
aab7ccc466
[serve] Deprecate client-based API in favor of process-wide singleton (#14696) 2021-03-17 09:39:54 -05:00
Ian Rodney
8a936ad64d
[Autoscaler Docs] Use worker_run_options (#14721)
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-03-16 18:04:27 -07:00
Edward Oakes
72615ae590
[metrics] Improve custom metrics docs, add an example on how to use them (#14690) 2021-03-15 17:37:02 -05:00
architkulkarni
8b17ec7c6d
[Serve] Disable auto conda env setting if using Ray Client (#14672) 2021-03-15 14:36:46 -05:00
DK.Pino
ef0c91f605
[Placement Group] [Doc] Fix PG doc display problem. (#14665) 2021-03-15 11:56:05 -07:00
Edward Oakes
d90cd545d1
[serve] Deprecate system-level batching with warning, update the docs (#14648) 2021-03-15 13:47:01 -05:00
Edward Oakes
dda3ab0161
[metrics] Cleanup package ref (#14658) 2021-03-15 13:00:57 -05:00
Brian Yu
a65002514c
[Doc] Update Slurm documentation examples (#14673) 2021-03-15 00:27:13 -07:00
Richard Liaw
c2aeccaf14
[tune] revert all mnist tests (#14677)
This reverts commit 3f557348a2.
2021-03-14 23:58:13 -07:00
Richard Liaw
3f557348a2
[tune] re-enable MNIST tests! (#14561) 2021-03-12 13:35:43 -08:00
Michael Luo
020c9439dd
[RLlib] CQL Documentation + Tests (#14531) 2021-03-11 18:51:39 +01:00
architkulkarni
9b6d2ca345
[Core] Add runtime_env option to actor and task options, with conda_env (#14430) 2021-03-11 10:09:38 -06:00
Stephanie Wang
b187693121
[docs] Fix links for installing wheels for a specific commit (#14572)
* Fix doc

* version
2021-03-09 16:55:03 -08:00
Alex Wu
e1fbb8489e
[core] Supress infeasible warning (#14068) 2021-03-09 16:37:56 -08:00
SongGuyang
134152937a
fix doc (#14555) 2021-03-09 18:57:03 +08:00
Qing Wang
29d5b110de
Update doc about installing Ray Java (#14383)
* Fix

* Update doc/source/installation.rst

Co-authored-by: Kai Yang <kfstorm@outlook.com>

* Update doc/source/installation.rst

Co-authored-by: Kai Yang <kfstorm@outlook.com>

* Update doc/source/walkthrough.rst

Co-authored-by: Kai Yang <kfstorm@outlook.com>

* Address comments.

* lint

Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-03-09 18:03:13 +08:00
Kai Fricke
43e098402a
[tune] make tune.with_parameters() work with the class API (#14532)
* [tune] make `tune.with_parameters()` work with the class API

* Update python/ray/tune/utils/trainable.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-09 09:36:17 +01:00
architkulkarni
505d2b6abe
[Serve] [Doc] Add small dashboard section under Serve Monitoring (#14328) 2021-03-08 20:41:42 -06:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
Kai Fricke
b0bf44b154
[tune/docs] Add high level trial runner flow to documentation (#14468)
* [tune/docs] Add high level trial runner flow to documentation

* Apply suggestions from code review
2021-03-08 10:35:54 +01:00
Dmitri Gekhtman
3f6c23e3cc
[doc][autoscaler][minor] Fix quickstart guide: ray.init(address='auto') (#14459) 2021-03-03 17:58:52 -08:00
Richard Liaw
dba533dd84
Disable more torch (#14480)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-03 15:46:32 -08:00
Richard Liaw
60a8b67488
Disable mnist tests (#14474) 2021-03-03 13:25:01 -08:00
Dmitri Gekhtman
1675156a8b
[autoscaler][interface] Use multi node types in defaults.yaml and example-full.yaml (#14239)
* random doc typo

* example-full-multi

* left off max workers

* wip

* address comments, modify defaults, wip

* fix

* wip

* reformat more things

* undo useless diff

* space

* max workers

* space

* copy-paste mishaps

* space

* More copy-paste mishaps

* copy-paste issues, space, max_workers

* head_node_type

* legacy yamls

* line undeleted

* correct-gpu

* Remove redundant GPU example.

* Extraneous comment

* whitespace

* example-java.yaml

* Revert "example-java.yaml"

This reverts commit 1e9c0124b9d97e651aaeeb6ec5bf7a4ef2a2df17.

* tests and other things

* doc

* doc

* revert max worker default

* Kubernetes comment

* wip

* wip

* tweak

* Address comments

* test_resource_demand_scheduler fixes

* Head type min/max workers, aws resources

* fix example_cluster2.yaml

* Fix external node type test (compatibility with legacy-style external node types)

* fix test_autoscaler_aws

* gcp-images

* gcp node type names

* fix gcp defaults

* doc format

* typo

* Skip failed Windows tests

* doc string and comment

* assert

* remove contents of default external head and worker

* legacy external failed validation test

* Readability -- define the minimal external config at the top of the file.

* Remove default worker type min worker

* Remove extraneous global min_workers comment.

* per-node-type docker in aws/example-gpu-docker

* ray.worker.small -> ray.worker.default

* fix-docker

* fix gpu docker again

* undo kubernetes experiment

* fix doc

* remove worker max_worker from kubernetes

* remove max_worker from local worker node type

* fix doc again

* py38

* eric-comment

* fix cluster name

* fix-test-autoscaler

* legacy config logic

* pop resources

* Remove min_workers AFTER merge

* comment, warning message

* warning, comment
2021-03-03 06:16:19 +02:00
Dmitri Gekhtman
58c0959ea7
[kubernetes][docs][minor] Move Kubernetes example scripts to docs (#14412) 2021-03-01 20:17:16 -08:00
Eric Liang
dbaa28f81e
Add links to new rllib paper (#14432) 2021-03-01 20:11:40 -08:00
Eric Liang
eab53a8808
Update Ray client docs (#14422) 2021-03-01 14:08:34 -08:00
Micah Yong
db0c16824c
[Dashboard][CLI] Ray memory parity with dashboard 2 (#13444)
* Minor improvements in Ray Core Walkthrough as seen in https://github.com/ray-project/ray/issues/12472

* Define node_stats() to return NodeStats object from cluster

* Add --group-by and --sort-by capabilities to ray memory script

* Resolve merge conflict

* Add helper functions for group by and sorting type in memory_utils.py

* Reformat

* Format

* Compartmentalize memory script into get_memory_summary and get_store_stats_summary

* Modify unit tests in test_mem_stat

* Lint and format

* Test cases for group_by sort_by

* Lint and format

* Fix actor handle failing test case

* Update test_memstat.py

* Resolve merge conflicts

* Adjust ray memory output based on terminal size

* Formatting and linting

* Use constant for callsite length

* Switch from OS to shutil for querying terminal size (official python support)

* Linting and formatting

* Lint and format

* Resolve lint issue in walkthrough.rst

* Revert to python 3.6

* Delete visitor.py

It was accidentally included in most recent commit

* Delete .eggs

It was accidentally included in most recent commit

* Resolve test_object_spilling.py test case

* Add stats only argument

* revert changes on this file

* Remove package-lock.json

* Add back npm installation

* Sync package-lock.json

* Linting and formatting

* Sync with package-lock

* Sync with package-lock pt 2

* Update documentation in https://docs.ray.io/en/master/memory-management.html

* Add include_memory_info as argument for node_stats

* Switch object ref and call site positions

* Linting and formatting

* Change from MiB to B

* Change from stats-only to store-true

* Add memory test case

* Add memory test case

* Lint and format

* Correct test in memstat

* Change line wrap and stats only to flags

* Clarify --stats-only and --no-format in ray memory

* --stats-only description modified

Co-authored-by: Micah Yong <micahyong@Micahs-MacBook-Pro.local>
2021-03-01 09:27:22 -08:00
niole
be9a584a94
[Docs] Remove version reference in dashboard proxy docs (#14359) 2021-02-27 21:06:25 -08:00
architkulkarni
f9364b1d5c
[Serve] Add logger with backend and replica tags (#14251) 2021-02-26 12:46:19 -08:00
Simon Mo
af085ed8aa
[Serve] Add Perf Tuning Doc (#14334)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: architkulkarni <architkulkarni@users.noreply.github.com>
2021-02-26 10:28:02 -08:00
Kai Fricke
4014168928
[tune] Introduce durable() wrapper to convert trainables into durable trainables (#14306)
* [tune] Introduce `durable()` wrapper to convert trainables into durable trainables

* Fix wrong check

* Improve docs, add FAQ for tackling overhead

* Fix bugs in `tune.with_parameters`

* Update doc/source/tune/api_docs/trainable.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/tune/_tutorials/_faq.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-26 13:59:28 +01:00
Clark Zinzow
b844548b57
[dask-on-ray] Adds support for dask.persist() with inlined Ray futures. (#14294)
* Adds support for dask.persist() with inlined Ray futures.

* Update persist test.

* Add patched dask.persist() documentation.
2021-02-25 17:48:47 -08:00
Richard Liaw
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
This reverts commit 6cd0cd3bd9.
2021-02-25 14:27:35 -08:00
architkulkarni
ba4b7ccfe8
[Serve] [Doc] Add basic Serve tutorial (#14256) 2021-02-25 14:10:08 -06:00
Guy Khazma
e3f3269b15
[doc] Fixes to RayDP docs (#14309)
* minor fix to raydp docs

* fix pytorch and tensorflow samples

* fix: minor fixes
2021-02-25 11:23:10 -08:00
Sven Mika
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. (#14289) 2021-02-25 16:01:31 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. (#13933) 2021-02-25 12:18:11 +01:00
niole
488f63efe3
[Dashboard] Make requests sent by the dashboard reverse proxy compatible (#14012) 2021-02-24 18:31:59 -08:00
SangBin Cho
be68a78b3f
[Object Spilling] Support multiple directories for spilling. (#14240)
* Finish the initial implementation.

* Improve the doc.

* Addressed comment.

* lint.

* f
2021-02-23 11:51:57 -08:00
Kai Fricke
757866ec01
[tune] enable placement groups per default (#13906)
* Refactor placement group factory object to accept placement_group arguments instead of callables

* Convert resources to pgf

* Enable placement groups per default

* Fix tests WIP

* Fix stop/resume with placement groups

* Fix progress reporter test

* Fix trial executor tests

* Check resource for trial, not resource object

* Move ENV vars into class

* Fix tests

* Sphinx

* Wait for trial start in PBT

* Revert merge errors

* Support trial reuse with placement groups

* Better check for just staged trials

* Fix trial queuing

* Wait for pg after trial termination

* Clean up PGs before tune run

* No PG settings in pbt scheduler

* Fix buffering tests

* Skip test if ray reports erroneous available resources

* Disable PG for cluster resource counting test

* Debug output for tests

* Output in-use resources for placement groups

* Don't start new trial on trial start failure

* Add docs

* Cleanup PGs once futures returned

* Fix placement group shutdown

* Use updated_queue flag

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs

* Reuse placement groups independently from actors

* Do not remove placement groups for paused trials

* Only continue enqueueing trials if it didn't fail the first time

* Rename parameter

* Fix pause trial

* Code review + try_recover

* Update python/ray/tune/utils/placement_groups.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Move placement group lifecycle management

* Move total used resources to pg manager

* Update FAQ example

* Requeue trial if start was unsuccessful

* Do not cleanup pgs at start of run

* Revert "Do not cleanup pgs at start of run"

This reverts commit 933d9c4c

* Delayed PG removal

* Fix trial requeue test

* Trigger pg cleanup on status update

* Fix tests

* Fix docs

* fix-test

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-23 18:46:02 +01:00
javi-redondo
0408fe6a69
Small improvements to the Ray Cluster docs (#14241)
* Small improvements to the Ray Cluster docs

* Update quickstart.rst

Changed title for quick start

Co-authored-by: Javier Redondo <javier@Anyscale-MacBook-Pro.local>
2021-02-23 13:44:28 +02:00
Simon Mo
f6a8a9be59
[Serve] Add RLlib tutorial (#14194) 2021-02-22 13:23:12 -08:00
Ryan Sander
8b5310a4e6
Fixed "multit-threaded" --> "multi-threaded" (#14236) 2021-02-21 19:25:51 -08:00
Dmitri Gekhtman
090970bdf5
[autoscaler] Max worker default infinity (#14201)
* random doc typo

* max-worker-default-inf

* fix

* -1 means infinity

* doc

* comment tweak

* fix random typo

* Cluster max-worker default

* fix

* typo

* test

* Git add the test

* doc-tweak

* rest of the test logistics

* periods in doc

* Address comments

* docstring
2021-02-22 05:14:00 +02:00
chaokunyang
f8a36eb350
[Java] Add java api overload doc and test (#14204) 2021-02-19 19:46:35 +08:00
Antoni Baum
58d7398246
[Tune] Add HEBOSearch Searcher (#13863)
* HEBO first pass

* Fix bad quotes

* Fixes

* Reproductibility

* Update python/ray/tune/suggest/hebo.py

Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>

* Add hebo_example.py to BUILD

* Nit

* Update to pypi package

* Alphabetical HEBO requirement

* Fix syntax error

* Fix wrong space in hebo example

* Move validate_warmstart to utils

* Space assertion in HEBO

* Comment

* Apply suggestions from code review

Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>

* Formatting

Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-02-17 22:53:10 +01:00
Sumanth Ratna
c1d68d7dd0
[docs] Remove sphinx-gallery example runtimes (#14141)
e7f65d9b21/doc/conf.py (L340)
2021-02-17 11:07:16 -08:00
Alex Wu
753083c617
[docs][autoscaler] Update AWS node config link (#14125) 2021-02-17 10:44:10 -08:00