Commit graph

3922 commits

Author SHA1 Message Date
Richard Liaw
c2aeccaf14
[tune] revert all mnist tests (#14677)
This reverts commit 3f557348a2.
2021-03-14 23:58:13 -07:00
Eric Liang
3bdcca7ee5
Add basic concurrency test for ray client (#14630) 2021-03-13 11:24:57 -08:00
Edward Oakes
66be4801c6
Add deprecation warning to Counter.record() (#14622) 2021-03-12 17:04:28 -06:00
Eric Liang
b47036d014
Bump Ray client protocol version; fix dataclasses dependency for py 3.6 (#14654) 2021-03-12 14:58:34 -08:00
Richard Liaw
3f557348a2
[tune] re-enable MNIST tests! (#14561) 2021-03-12 13:35:43 -08:00
Dmitri Gekhtman
a90cffe26c
[dashboard][k8s] Better CPU reporting when running on K8s (#14593) 2021-03-12 12:02:15 -06:00
Raphael CHEN
c93961e070
[tune] Enable list of tuning hyperparameters in BOHB (#14487)
* [tune] Enable list of tuning hyperparameters in BOHB

* More concise code

Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>

* Add comment to `unflatten_list_dict`

* Fix lint

* Fix lint

* Add test for `unflatten_list_dict`

Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com>
2021-03-12 09:22:44 -08:00
Edward Oakes
9cf328d616
[serve] Application-level batching initial commit (#14610) 2021-03-11 21:16:08 -06:00
Eric Liang
ee2bf0f989
Improved object store memory behavior with respect to /dev/shm size (#14629) 2021-03-11 17:29:06 -08:00
Edward Oakes
5e2a3df7cd
Allow returning an actor handle from a remote call (#13476) 2021-03-11 16:52:09 -06:00
Edward Oakes
8e778d6f42
[serve] Remove more Counter.record()s (#14628) 2021-03-11 12:54:38 -06:00
Eric Liang
4c1df378bb
Point load_package tests to ray-project GH instead of personal (#14605) 2021-03-11 10:46:36 -08:00
architkulkarni
9b6d2ca345
[Core] Add runtime_env option to actor and task options, with conda_env (#14430) 2021-03-11 10:09:38 -06:00
Clark Zinzow
5a788474aa
[Core] First pass at privatizing non-public Python APIs. (#14607)
* async_compat

* utils

* cluster_utils

* compat

* function_manager

* import_thread

* memory_monitor

* monitor, log_monitor, ray_process_reaper

* metrics_agent

* parameter

* prometheus_exporter

* ray_logging

* signature
2021-03-10 22:47:28 -08:00
Eric Liang
081c960b59
Fix missing __init__ for wheels (#14615) 2021-03-10 18:13:58 -08:00
Eric Liang
4e8b53b3d0
Add an experimental load_package API (#14552) 2021-03-10 13:13:49 -08:00
Edward Oakes
8111ff5c3f
[serve] Use placement groups to bypass autoscaler throttling (#13844) 2021-03-10 13:33:44 -06:00
Edward Oakes
55a28cee52
[serve] Count -> Counter (#14571) 2021-03-10 11:59:44 -06:00
Eric Liang
dcb22af50d
Use vendored cloudpickle (#14576) 2021-03-09 22:08:45 -08:00
burglarralgrub
dfcb9c356e
Remove the --java-worker-options parameter (#14563) 2021-03-10 10:49:31 +08:00
Alex Wu
e1fbb8489e
[core] Suppress infeasible warning (#14068) 2021-03-09 16:37:56 -08:00
Richard Liaw
ea7d4c6607
[placement groups] fix gpu ids for bundles (#14574) 2021-03-09 15:11:59 -08:00
Hao Zhang
2505bc8aa9
[Collective] Ray CPU collectives now available (#14277)
Co-authored-by: YLJALDC <dal177@ucsd.edu>
Co-authored-by: Ezra-H <huangrh9@gmail.com>
Co-authored-by: Ezra-H <44772185+Ezra-H@users.noreply.github.com>
2021-03-09 15:02:16 -08:00
Yi Cheng
ed8935406b
[core] Minimal support for runtime env (#14270) 2021-03-09 11:53:58 -08:00
Ian Rodney
6d5511cf80
Revert "reset memory for tasks and actors to 5% when cached memory ad…" (#14556)
This reverts commit 6f151ad510.
2021-03-09 08:19:55 -08:00
Kai Fricke
43e098402a
[tune] make tune.with_parameters() work with the class API (#14532)
* [tune] make `tune.with_parameters()` work with the class API

* Update python/ray/tune/utils/trainable.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-09 09:36:17 +01:00
Yiran Wang
a06dc39d9f
[Autoscaler] Check if SSH is available every 5 sec, not 10 (#14484) 2021-03-08 20:58:21 -08:00
Edward Oakes
59221b2f31
[metrics] Standardize metrics.Count API to prometheus counter (#14498) 2021-03-08 20:47:46 -06:00
Edward Oakes
04c009712d
Revert "Revert "Support accessing underlying attributes in RayTaskErr… (#14449) 2021-03-08 11:04:10 -06:00
Kai Yang
7977474899
[Core] Filter out dead nodes when getting address info from redis (#14440) 2021-03-08 15:48:26 +08:00
Edward Oakes
8e139046b9
[metrics] Remove unused unit field from cython classes (#14497) 2021-03-07 20:06:02 -06:00
Richard Liaw
dec3aa3453
Split tests for timeout (#14516) 2021-03-07 16:46:52 -08:00
Eric Liang
3fab5e2ada
Switch memory units to bytes (#14433) 2021-03-06 19:32:35 -08:00
Richard Liaw
5fc761c562
Fix test_advanced_3 timeout (#14509)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-06 10:59:06 -08:00
EscapeReality846089495
33b271aa97
[tune] Fixed save_to_dir w/ os.replace (#14510)
The save_to_dir method of the Searcher class in ray.tune.suggest.suggestion.py uses os.rename to move tmp_search_ckpt over the current checkpoint. When the destination already exists, os.rename raises [WinError 183] on Windows, or a file-exists error on other operating systems. os.replace is the correct way.
2021-03-06 01:14:56 -08:00
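
The os.rename versus os.replace point in #14510 above can be illustrated with a minimal sketch. This is not Ray's actual save_to_dir code: the function name save_checkpoint_atomically, its parameters, and the temporary-file prefix are hypothetical and chosen only for illustration. The sketch writes to a temporary file in the same directory and then swaps it into place with os.replace, which overwrites an existing destination file on both Windows and POSIX systems.

```python
import os
import tempfile


def save_checkpoint_atomically(data: bytes, checkpoint_path: str) -> None:
    """Write `data` to `checkpoint_path`, overwriting any existing file.

    Hypothetical helper for illustration only; not part of Ray Tune.
    """
    directory = os.path.dirname(os.path.abspath(checkpoint_path))
    # Create the temporary file next to the target so the final move
    # stays on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp_search_ckpt_")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace overwrites an existing destination file on every
        # platform; os.rename fails on Windows if the destination exists.
        os.replace(tmp_path, checkpoint_path)
    except BaseException:
        # Clean up the temporary file if anything went wrong before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling the helper twice with the same path leaves only the second payload on disk and no leftover temporary file, which is the behavior a repeated checkpoint save needs.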
Alex Wu
2395e25fc0
[hotfix][core] Load balancing spillback feature flag (#14457) 2021-03-05 16:45:33 -08:00
Antoni Baum
2002cff42e
[Tune] HEBO concurrency fix after discussion with authors (#14504) 2021-03-05 14:05:37 -08:00
Sven Mika
ef944bc5f0
[RLlib] Re-enable placement group support for RLlib. (#14384) 2021-03-05 08:16:24 +01:00
Qstar
6f151ad510
reset memory for tasks and actors to 5% when cached memory added (#14345) 2021-03-05 10:36:29 +08:00
Dmitri Gekhtman
736c99fadb
[kubernetes][test][minor] Operator test modification (#14488) 2021-03-04 14:38:58 -08:00
Edward Oakes
be974a6596
[metrics] Only put live nodes in prometheus service discovery file (#14495) 2021-03-04 16:17:00 -06:00
Eric Liang
2cf4c7253c
[ray client] Fix ctrl-c for ray.get() by setting a short-server side timeout (#14425) 2021-03-04 10:36:42 -08:00
Ian Rodney
759892740a
[Autoscaler] chown Ray_bootstrap Files in DockerCommandRunner (#14380) 2021-03-03 19:13:20 -08:00
Antoine Galataud
460c2757a3
Allow assigning weight to var with close name (#14109)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-03 19:11:34 -08:00
Eric Liang
99a63b3dd1
Remove old scheduler and friends (#14184) 2021-03-03 18:29:15 -08:00
Richard Liaw
dba533dd84
Disable more torch (#14480)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-03 15:46:32 -08:00
tchordia
e40dc3a3e9
[serve] Better validation for arguments to client.start() (#14327) 2021-03-03 14:33:36 -08:00
Richard Liaw
60a8b67488
Disable mnist tests (#14474) 2021-03-03 13:25:01 -08:00
Hao Zhang
4135b0eb4a
[Collective] Supporting multistream, stream pool, and CUDA events. (#14127)
Co-authored-by: fustinose <fustinosej@gmail.com>
2021-03-03 09:53:45 -08:00
SangBin Cho
a04ab9b472
[Core] Fix ray memory bug (#14452)
* ray memory bug

* Fix ray memory issue.

* done.
2021-03-03 09:20:00 -08:00