193 commits

Author SHA1 Message Date
Dmitri Gekhtman
6644a0fe50
[autoscaler][kubernetes][docs] Updated Kubernetes Documentation (#14016)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-11 23:00:25 -08:00
Dmitri Gekhtman
1187d1dd3e
[autoscaler][kubernetes][operator] Rudimentary error handling, make "MODIFIED" -> update event work. (#13756) 2021-02-03 20:07:11 -06:00
Ameer Haj Ali
1fbb752f42
[autoscaler] remove worker_default_node_type that is useless. (#13588) 2021-01-21 17:04:38 -08:00
PENG Zhenghao
e63da54931
[docs] Add more guideline on using ray in slurm cluster (#12819)
Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
Co-authored-by: PENG Zhenghao <pengzh@ie.cuhk.edu.hk>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-14 12:17:53 -08:00
Simon Mo
8e0a2f669b
[Doc] Remove trailing whitespaces (#13390) 2021-01-12 20:35:38 -08:00
Dmitri Gekhtman
7166949194
[Kubernetes][Docs] GPU usage (#13325)
* gpu-note

* gpu-note

* More info

* lint?

* Update doc/source/cluster/kubernetes.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/cluster/kubernetes.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/cluster/kubernetes.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/cluster/kubernetes.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* GKE->Kubernetes

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-11 21:36:31 -08:00
Dmitri Gekhtman
31453621ef
[kubernetes][docs][minor] Kubernetes version warning (#13161) 2021-01-04 10:29:17 -06:00
Gekho457
8cebe5cbe9
[docs][autoscaler][k8s][minor] quotes #12866 2020-12-14 18:24:13 -08:00
Gekho457
44f5be04ca
[autoscaler][k8s][doc][minor] Fix typo in k8s doc. (#12865) 2020-12-14 17:30:43 -08:00
Gekho457
11ce1dc743
Ray cluster CRD and example CR + multi-ray-cluster operator (#12098) 2020-12-14 10:26:01 -06:00
Eric Liang
4ad4463be6
Add comments to clarify purpose of new scheduler queues (#12730)
* update

* clarify

* update
2020-12-11 11:53:09 -08:00
Kai Yang
e3b5deb741
[Multi-tenancy] Delete flag enable_multi_tenancy and remove old code path (#10573) 2020-12-10 19:01:40 +08:00
Ian Rodney
e2a147d5fb
[docs] Remove DL AMi reference (#12120) 2020-11-18 12:40:19 -08:00
Ameer Haj Ali
85197deece
[autoscaler] Remove legacy autoscaler (#11802) 2020-11-11 13:36:48 -08:00
Eric Liang
9b8218aabd
[docs] Move all /latest links to /master (#11897)
* use master link

* remae

* revert non-ray

* more

* mre
2020-11-10 10:53:28 -08:00
Eric Liang
a9cf0141a0
[autoscaler] Fix semantics of request_resources (#11820) 2020-11-09 14:57:40 -08:00
dHannasch
6147b6a1a3
[docs] Note that the printed IP address can be incorrect. (#11804)
* If the head node is on a subnet with NAT, then you will need a different IP address.

* Specify what you are checking firewall settings and network configuration *for*.

* reword following @amogkam

* Give the full error message.
2020-11-04 13:48:03 -08:00
dHannasch
e7f7cb29c4
[docs] Show expected terminal output for manual cluster setup (#11752)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-02 20:59:14 -08:00
Scott Graham
c4ae94d60b
[autoscaler] Azure deployment fixes (#11613)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-27 15:27:18 -07:00
Richard Liaw
a4b418d30c
[docs] update cloud docs (#11262)
* update-cloud-docs

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/cluster/config.rst

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>

* fix

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

* fix

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2020-10-21 16:37:26 -07:00
Ameer Haj Ali
6b86d4d280
Automatically detect CPU, GPU, accelerator_type for AWS (#11147) 2020-10-02 21:16:43 -07:00
Ian Rodney
0d5b09f426
[Docker] Automagically add "runtime=nvidia" (#11125) 2020-10-01 17:04:19 -07:00
Ameer Haj Ali
0d36e4c025
[autoscaler] Support min_workers for multi node type (#11041)
* prepare for head node

* move command runner interface outside _private

* remove space

* Eric

* flake

* min_workers in multi node type

* fixing edge cases

* eric not idle

* fix target_workers to consider min_workers of node types

* idle timeout

* minor

* minor fix

* test

* lint

* eric v2

* eric 3

* min_workers constraint before bin packing

* Update resource_demand_scheduler.py

* Revert "Update resource_demand_scheduler.py"

This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.

* reducing diff

Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
2020-09-28 22:02:01 -07:00
Richard Liaw
a563344bc2
[docs] remove ref to google groups -> github discussions (#11019) 2020-09-24 18:09:51 -07:00
Ian Rodney
4c3f09094a
[docs] redis-port -> port (#10937) 2020-09-23 17:04:13 -07:00
Lee moon soo
df4c3abe30
[autoscaler] Staroid node provider (#10956) 2020-09-22 21:25:29 -07:00
Richard Liaw
b0ca70f628
[tune+core] tune lifecycle and starting ray guide (#10813) 2020-09-21 11:27:50 -07:00
rkube
cd7351f6a3
Streamlined slurm script and removed references to redis_password (#10827)
Co-authored-by: Ralph Kube <ralph.kube@uit.not>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-18 14:55:56 -07:00
Keqiu Hu
8a77cf925a
[cli][ray] update ray cli message (#10823) 2020-09-17 09:26:55 -07:00
architkulkarni
940f02913a
[Docs] update docs readme and fix typo (#10807) 2020-09-15 12:51:18 -07:00
Richard Liaw
1a4023387d
[docs] slurm + progress_bar example (#10782) 2020-09-15 00:16:36 -07:00
Alex Wu
d0b73647b4
[Autoscaler] Unmanaged nodes (#10513) 2020-09-13 11:58:47 -07:00
Ian Rodney
b9633a2b67
[docker] Support multiple node types (#10504) 2020-09-02 18:27:59 -07:00
Eric Liang
e5d089384b
[1.0] Ray whitepaper link and tagline update (#10455) 2020-09-01 09:48:35 -07:00
Eric Liang
f6a1698bab
[autoscaler] Add documentation for multi node type autoscaling (#10405) 2020-08-28 19:57:21 -07:00
Ian Rodney
a079f46c25
[autoscaler]/[docker] Cleanup YAMLs & Use RAY docker images (#10108) 2020-08-17 09:49:28 -07:00
krfricke
8f0f7371a0
[tune] Added Kubernetes syncer and sync client (#10097)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
PidgeyBE
6ad2fc4831
[autoscaler] Service and Ingress per worker pod (#9359) 2020-08-10 14:13:52 -05:00
Ameer Haj Ali
65a2886b0a
Docs For on Prem Cluster manager (#9873) 2020-08-04 11:31:09 -07:00
Bill Chambers
2e9d748100
[Cluster Launcher] Re Org the cluster launcher pages. (#9687) 2020-07-27 13:47:06 -07:00
Patrick Ames
dc51b08c36
[autoscaler] Allow users to disable the cluster config cache (#8117)
* [autoscaler] Remove autoscaler config cache.

* [autoscaler] Add flag allowing users to explicitly disable the config cache.
2020-07-09 15:47:58 -07:00
Ian Rodney
6fecd3cfce
[autoscaler] Run initialization_commands without a persistent connection (#9020)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-07-06 16:34:59 -07:00
Richard Liaw
56d934bc18
[docs] Revised Cluster documentation (#9062)
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-06-26 09:29:22 -07:00