142 commits

Author SHA1 Message Date
Yi Cheng
3d9c973861
[doc] Update ray client documentation to include multi-client. (#18891) 2021-09-24 18:35:52 -07:00
Antoni Baum
4c95ea6d0a
[client] Improve Ray Client connection timeout information (#18281)
* Improve Ray Client connection timeout information

* fix lint issue.

Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-09-02 16:34:11 +03:00
xwjiang2010
63f00843f3
[Tune] Inform users of the setup needed for uploading results to cloud. (#18220) 2021-08-31 10:27:50 -07:00
Sasha Sobol
fcb044d47c
[autoscaler] make 0 default min/max workers for head node (#17757)
* make 0 default min/max workers for head node

* fix helm charts, test, defaults for head

* fix test, docs

* make 0 default min/max workers for head node

* fix helm charts, test, defaults for head

* fix test, docs

* comments. logging

* better wording (logs)

Co-authored-by: Dmitri Gekhtman <62982571+DmitriGekhtman@users.noreply.github.com>

* fix logging message

* fix max workers in raycluster.yaml

* use default values of 0 for min/max workers in a helm chart

* add missing line back

Co-authored-by: Dmitri Gekhtman <62982571+DmitriGekhtman@users.noreply.github.com>
2021-08-25 14:56:20 -04:00
Antoni Baum
88163c4755
[docs] Add a TPU example to the docs (#17959)
* Add a TPU example to the docs

* Add a line about TPU API

* Add link to TPU pods

* Clarify
2021-08-24 10:08:26 -07:00
77loopin
c6b24fcb5d
[RayClient] Add the guide for k8s Ingress (#17736)
Co-authored-by: Dmitri Gekhtman <62982571+DmitriGekhtman@users.noreply.github.com>
Co-authored-by: seungjaebaek <seungjaebaek@linecorp.com>
2021-08-20 18:31:03 -07:00
Guyang Song
5713a0be6c
[C++ API] add C++ API docs (#17743) 2021-08-12 22:40:09 +08:00
Chris K. W
a33cbec12a
[client][docs] update docs for new client support in init (#17333)
* start

* check formatting

* undo changes from base branch

* Client builder API docs

* indent

* 8

* minor fixes

* absolute path to runtime env docs

* fix runtime_env link

* Update worker.init docs

* drop clientbuilder docs, link to 1.4.1 docs instead. Specify local:// behavior when address passed

* add debug info for ray.init("local")

* local:// attaches a driver directly

* update ray.init return wording

* remote init.connect() from example

* drop local:// docs, add section on when to use ray client

* link to 1.4.1 docs in code example instead of mentioning clientbuilder

* fix backticks, doc mentions of ray.util.connect

* remove ray.util.connect mentions from examples and comments

* update tune example

* wording

* localhost:<port> also works if you're on the head node

* add quotes

* drop mentions of ray client from ray.init docstring

* local->remote

* fix section ref

* update ray start output

* fix section link

* try to fix doc again

* fix link wording

* drop local:// from docs and special handling from code

* update ray start message

* lint

* doc lint

* remove local:// codepath

* remove 'internal_config'

* Update doc/source/cluster/ray-client.rst

Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>

* doc suggestion

* Update doc/source/cluster/ray-client.rst

Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-08-04 05:31:44 +03:00
architkulkarni
756a4e7a90
[Core] [runtime env] update tests to use ray.init(runtime_env=...) and add e2e test (#17232) 2021-07-26 11:21:30 -05:00
Chris K. W
bd9d7bbbaa
[client] Add support for protocol (ray://, local://, custom://) to ray.init (#16946) 2021-07-14 21:45:46 -07:00
Tao Wang
34422ef53f
[Doc]Add statement for supporting remote redis (#16869)
* [Doc]Add statement for supporting remote redis

* Update doc/source/cluster/cloud.rst

Co-authored-by: Alex Wu <itswu.alex@gmail.com>

Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2021-07-07 00:37:06 -07:00
Dmitri Gekhtman
a27a8172cc
[autoscaler] Handle node type key change/deletion (#16691) 2021-07-06 09:06:58 -07:00
Dmitri Gekhtman
ea23382919
[autoscaler][docs] Doc tweak (#16663)
* doc-tweak

* fix
2021-06-24 16:25:00 -07:00
Dmitri Gekhtman
cb878b6514
[doc][kubernetes] K8s doc updates (#16570) 2021-06-20 19:38:34 -07:00
Brandon
2ab1c74032
[docs] Add link for launching ray manually in quickstart (#15384)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-20 17:47:12 -07:00
Alex Wu
197dab0e2f
[docs] Deploying Ray (#16538)
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-19 10:07:15 -07:00
Ian Rodney
16d762aed0
[DocSprint] Ray Client Docs (#16497)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-19 10:05:37 -07:00
Abhishek Malvankar
85bc1b2979
[docs] ray LSF integration (#16438) 2021-06-17 01:35:55 -07:00
Dmitri Gekhtman
e58ba66681
[gcp][doc][minor] project_id is required (#16266) 2021-06-05 01:00:11 -07:00
Dmitri Gekhtman
a60ee3a8b2
[autoscaler][kubernetes][minor] latest images everywhere (#16205)
* latest images everywhere

* add back some documentation on the images

* Doc update
2021-06-04 16:01:39 -07:00
Alex Wu
fa292a4edf
[Doc] Document memory and object store memory as autoscaler resources (#16210) 2021-06-03 10:10:03 -07:00
Travis Addair
050a076de9
[k8s] Refactored k8s operator to use kopf for controller logic (#15787)
Co-authored-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
2021-06-01 12:00:55 -07:00
Dmitri Gekhtman
27c2f570f1
[kubernetes] pin the K8s config yamls to ray:latest instead of ray1.3 (#15988) 2021-06-01 19:12:35 +03:00
zhuangzhuang131419
0429882bbf
[autoscaler] Implement node provider for aliyun (#15712)
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: zhuang <zhengchicheng.zcc@alibaba-inc.com>
Co-authored-by: chenk008 <kongchen28@gmail.com>
Co-authored-by: wuhua.ck <wuhua.ck@alibaba-inc.com>
2021-05-29 00:56:32 -07:00
Xiang Xu
ec8b591f32
[docs] typo fix on the Doc for helm (#16036) 2021-05-25 12:59:39 -07:00
Dmitri Gekhtman
95c3d88cac
[autoscaler][kubernetes] Helm chart (#15614) 2021-05-17 16:55:10 -07:00
wzl
5247c0a5b8
[doc] Fix typo (#15828) 2021-05-16 16:08:14 -07:00
Dmitri Gekhtman
052d2acaee
[autoscaler][kubernetes] Restart after head failure, more consistent operator restart behavior. (#15655) 2021-05-12 11:49:11 -05:00
Dmitri Gekhtman
8f83053e35
[autoscaler][Kubernetes] Operator subprocess error reporting, configuration fixes (#15526) 2021-05-04 16:45:37 -05:00
Dmitri Gekhtman
de897673c5
[kubernetes][autoscaler] Kubernetes operator basic fixes (#15469) 2021-04-29 10:45:52 -05:00
Dmitri Gekhtman
6b0673f207
[doc][Kubernetes][minor] Restructure section labels for operator launch (#14962) 2021-04-23 09:50:58 -07:00
Dmitri Gekhtman
fd43e9e6f8
[kubernetes][doc][minor] Add namespace to job creation command (#15442) 2021-04-23 09:44:51 -07:00
Dmitri Gekhtman
e6864523cf
[autoscaler] Do not divide by zero in resource demand scheduler (#15323)
* Do not divide by zero

* Don't take min or mean of an empty list

* max workers 0 for head node in distributed benchmark

* test

* Correct the type annotation

* comment grammar tweak

* message

* docs

* test

* Move test cli to large tests.
2021-04-16 10:20:05 -07:00
Richard Liaw
59bf3a7b22
ray[cluster] -> ray[default] (#15251) 2021-04-14 09:37:04 -07:00
Richard Liaw
e72f6b0377
Fix ray[full] -> ray[cluster] #15112
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-04-05 09:55:00 -07:00
Dmitri Gekhtman
474fb6bf0c
[kubernetes][client][docs] Note requirement for matching Ray versions (#15068) 2021-04-01 15:08:25 -07:00
Ian Rodney
73fb5d6022
[Autoscaler][Docker] Make disable_shm_size_detection more usable (#14913) 2021-03-30 18:10:09 -07:00
Richard Liaw
c1c9649671
Set up things to remove dependencies in later release (#14793)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-19 13:54:52 -07:00
Ian Rodney
eb12033612
[Code Cleanup] Switch to use ray.util.get_node_ip_address() (#14741)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 13:10:57 -07:00
Michael Schock
42dcacd888
[k8s] Minor doc fix (#14732) 2021-03-17 16:15:38 -07:00
Ian Rodney
8a936ad64d
[Autoscaler Docs] Use worker_run_options (#14721)
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-03-16 18:04:27 -07:00
Brian Yu
a65002514c
[Doc] Update Slurm documentation examples (#14673) 2021-03-15 00:27:13 -07:00
Dmitri Gekhtman
3f6c23e3cc
[doc][autoscaler][minor] Fix quickstart guide: ray.init(address='auto') (#14459) 2021-03-03 17:58:52 -08:00
Dmitri Gekhtman
1675156a8b
[autoscaler][interface] Use multi node types in defaults.yaml and example-full.yaml (#14239)
* random doc typo

* example-full-multi

* left off max workers

* wip

* address comments, modify defaults, wip

* fix

* wip

* reformat more things

* undo useless diff

* space

* max workers

* space

* copy-paste mishaps

* space

* More copy-paste mishaps

* copy-paste issues, space, max_workers

* head_node_type

* legacy yamls

* line undeleted

* correct-gpu

* Remove redundant GPU example.

* Extraneous comment

* whitespace

* example-java.yaml

* Revert "example-java.yaml"

This reverts commit 1e9c0124b9d97e651aaeeb6ec5bf7a4ef2a2df17.

* tests and other things

* doc

* doc

* revert max worker default

* Kubernetes comment

* wip

* wip

* tweak

* Address comments

* test_resource_demand_scheduler fixes

* Head type min/max workers, aws resources

* fix example_cluster2.yaml

* Fix external node type test (compatibility with legacy-style external node types)

* fix test_autoscaler_aws

* gcp-images

* gcp node type names

* fix gcp defaults

* doc format

* typo

* Skip failed Windows tests

* doc string and comment

* assert

* remove contents of default external head and worker

* legacy external failed validation test

* Readability -- define the minimal external config at the top of the file.

* Remove default worker type min worker

* Remove extraneous global min_workers comment.

* per-node-type docker in aws/example-gpu-docker

* ray.worker.small -> ray.worker.default

* fix-docker

* fix gpu docker again

* undo kubernetes experiment

* fix doc

* remove worker max_worker from kubernetes

* remove max_worker from local worker node type

* fix doc again

* py38

* eric-comment

* fix cluster name

* fix-test-autoscaler

* legacy config logic

* pop resources

* Remove min_workers AFTER merge

* comment, warning message

* warning, comment
2021-03-03 06:16:19 +02:00
Dmitri Gekhtman
58c0959ea7
[kubernetes][docs][minor] Move Kubernetes example scripts to docs (#14412) 2021-03-01 20:17:16 -08:00
javi-redondo
0408fe6a69
Small improvements to the Ray Cluster docs (#14241)
* Small improvements to the Ray Cluster docs

* Update quickstart.rst

Changed title for quick start

Co-authored-by: Javier Redondo <javier@Anyscale-MacBook-Pro.local>
2021-02-23 13:44:28 +02:00
Dmitri Gekhtman
090970bdf5
[autoscaler] Max worker default infinity (#14201)
* random doc typo

* max-worker-default-inf

* fix

* -1 means infinity

* doc

* comment tweak

* fix random typo

* Cluster max-worker default

* fix

* typo

* test

* Git add the test

* doc-tweak

* rest of the test logistics

* periods in doc

* Address comments

* docstring
2021-02-22 05:14:00 +02:00
Alex Wu
753083c617
[docs][autoscaler] Update AWS node config link (#14125) 2021-02-17 10:44:10 -08:00
javi-redondo
b8b2d6410d
[docs] new Ray Cluster documentation (#13839)
Co-authored-by: Javier Redondo <javier@anyscale.com>
Co-authored-by: AmeerHajAli <ameerh@berkeley.edu>
2021-02-15 00:47:14 -08:00
Dmitri Gekhtman
6644a0fe50
[autoscaler][kubernetes][docs] Updated Kubernetes Documentation (#14016)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-11 23:00:25 -08:00