Maxime RICHE
9a7fbd3cdf
[RLlib] Add coin game env. Matrix social dilemma env. With tests and examples. ( #14208 )
2021-03-09 17:26:20 +01:00
Ian Rodney
6d5511cf80
Revert "reset memory for tasks and actors to 5% when cached memory ad…" ( #14556 )
This reverts commit 6f151ad510.
2021-03-09 08:19:55 -08:00
SongGuyang
134152937a
fix doc ( #14555 )
2021-03-09 18:57:03 +08:00
Qing Wang
29d5b110de
Update doc about installing Ray Java ( #14383 )
* Fix
* Update doc/source/installation.rst
Co-authored-by: Kai Yang <kfstorm@outlook.com>
* Update doc/source/installation.rst
Co-authored-by: Kai Yang <kfstorm@outlook.com>
* Update doc/source/walkthrough.rst
Co-authored-by: Kai Yang <kfstorm@outlook.com>
* Address comments.
* lint
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-03-09 18:03:13 +08:00
Kai Fricke
43e098402a
[tune] make tune.with_parameters() work with the class API ( #14532 )
* [tune] make `tune.with_parameters()` work with the class API
* Update python/ray/tune/utils/trainable.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-09 09:36:17 +01:00
qicosmos
f2348a5456
[C++ worker] Add ray register part1 ( #14436 )
2021-03-09 13:57:17 +08:00
Yiran Wang
a06dc39d9f
[Autoscaler] Check if SSH is available every 5 sec, not 10 ( #14484 )
2021-03-08 20:58:21 -08:00
Dmitri Gekhtman
4a7d9e71bb
[dashboard][kubernetes] Show container's memory info on K8s, not the physical host's. ( #14499 )
* random doc typo
* more reasonable memory output
* no if
* get rid of comment
2021-03-08 18:59:41 -08:00
Edward Oakes
59221b2f31
[metrics] Standardize metrics.Count API to prometheus counter ( #14498 )
2021-03-08 20:47:46 -06:00
architkulkarni
505d2b6abe
[Serve] [Doc] Add small dashboard section under Serve Monitoring ( #14328 )
2021-03-08 20:41:42 -06:00
fyrestone
3616424f10
Disable dashboard tune module if pandas version is incorrect ( #14381 )
2021-03-08 20:40:59 -06:00
fyrestone
2da58bb021
[Dashboard] Fix reporter agent ( #14378 )
2021-03-08 13:12:34 -06:00
Ian Rodney
b6c4f21fda
fix docker build ( #14536 )
2021-03-08 09:33:26 -08:00
Edward Oakes
04c009712d
Revert "Revert "Support accessing underlying attributes in RayTaskErr… ( #14449 )
2021-03-08 11:04:10 -06:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. ( #13393 )
2021-03-08 15:41:27 +01:00
Kai Fricke
b0bf44b154
[tune/docs] Add high level trial runner flow to documentation ( #14468 )
* [tune/docs] Add high level trial runner flow to documentation
* Apply suggestions from code review
2021-03-08 10:35:54 +01:00
Kai Yang
7977474899
[Core] Filter out dead nodes when getting address info from redis ( #14440 )
2021-03-08 15:48:26 +08:00
Edward Oakes
8e139046b9
[metrics] Remove unused unit field from cython classes ( #14497 )
2021-03-07 20:06:02 -06:00
Richard Liaw
dec3aa3453
Split tests for timeout ( #14516 )
2021-03-07 16:46:52 -08:00
Eric Liang
3fab5e2ada
Switch memory units to bytes ( #14433 )
2021-03-06 19:32:35 -08:00
Richard Liaw
5fc761c562
Fix test_advanced_3 timeout ( #14509 )
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-06 10:59:06 -08:00
EscapeReality846089495
33b271aa97
[tune] Fixed save_to_dir w/ os.replace ( #14510 )
The save_to_dir method of the Searcher class in ray.tune.suggest.suggestion.py uses os.rename to replace the current checkpoint with tmp_search_ckpt. On Windows, os.rename raises [WinError 183] (a file-exists error) when the destination already exists. os.replace is the correct way.
2021-03-06 01:14:56 -08:00
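The write-then-swap pattern this commit message describes can be sketched roughly as follows. This is a minimal illustration, not the actual Ray code: the helper name save_atomically and the temp-file prefix are hypothetical, and only standard-library calls are used. The point is that os.replace overwrites an existing destination on every platform, whereas os.rename raises FileExistsError on Windows when the destination already exists.

```python
import os
import tempfile


def save_atomically(data: bytes, dest_path: str) -> None:
    """Write data to a temporary file, then swap it into place.

    Hypothetical helper for illustration only; not Ray's implementation.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temp file in the destination directory so the final
    # swap stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp_ckpt_")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # os.replace overwrites dest_path if it exists, on Windows too.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the temp file if the write or the swap failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Calling the helper twice with the same destination path succeeds both times and leaves only the second payload on disk.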
Alex Wu
2395e25fc0
[hotfix][core] Load balancing spillback feature flag ( #14457 )
2021-03-05 16:45:33 -08:00
Antoni Baum
2002cff42e
[Tune] HEBO concurrency fix after discussion with authors ( #14504 )
2021-03-05 14:05:37 -08:00
Sven Mika
ef944bc5f0
[RLlib] Re-enable placement group support for RLlib. ( #14384 )
2021-03-05 08:16:24 +01:00
Qstar
6f151ad510
reset memory for tasks and actors to 5% when cached memory added ( #14345 )
2021-03-05 10:36:29 +08:00
DK.Pino
26907b7708
Support placement group for normal task in Java API ( #14342 )
* support pg for normal task
* fix lint
* fix comment
* fix comment
* update comment
* fix java typo
2021-03-05 10:21:37 +08:00
Dmitri Gekhtman
736c99fadb
[kubernetes][test][minor] Operator test modification ( #14488 )
2021-03-04 14:38:58 -08:00
Edward Oakes
be974a6596
[metrics] Only put live nodes in prometheus service discovery file ( #14495 )
2021-03-04 16:17:00 -06:00
Eric Liang
2cf4c7253c
[ray client] Fix ctrl-c for ray.get() by setting a short-server side timeout ( #14425 )
2021-03-04 10:36:42 -08:00
SangBin Cho
190ab40645
[Core] Display ip address when node dies ( #14489 )
* done.
* Addressed code review.
2021-03-04 10:27:00 -08:00
Kai Yang
1d7bd990b6
[Java] Update System.gc() log to debug level ( #14490 )
2021-03-04 18:54:10 +08:00
qicosmos
d77e25b4b1
[C++ worker] Avoid recursive inclusion of API header files ( #14414 )
2021-03-04 18:51:02 +08:00
Kai Yang
5d79821e69
[Core] Initialize system config in CoreWorkerProcess constructor ( #14439 )
2021-03-04 16:34:54 +08:00
Qing Wang
07e619f404
Remove unused script. ( #14462 )
2021-03-04 11:24:00 +08:00
Ian Rodney
759892740a
[Autoscaler] chown Ray_bootstrap Files in DockerCommandRunner ( #14380 )
2021-03-03 19:13:20 -08:00
Antoine Galataud
460c2757a3
Allow assigning weight to var with close name ( #14109 )
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-03 19:11:34 -08:00
Eric Liang
99a63b3dd1
Remove old scheduler and friends ( #14184 )
2021-03-03 18:29:15 -08:00
Dmitri Gekhtman
3f6c23e3cc
[doc][autoscaler][minor] Fix quickstart guide: ray.init(address='auto') ( #14459 )
2021-03-03 17:58:52 -08:00
Richard Liaw
dba533dd84
Disable more torch ( #14480 )
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-03 15:46:32 -08:00
tchordia
e40dc3a3e9
[serve] Better validation for arguments to client.start() ( #14327 )
2021-03-03 14:33:36 -08:00
Richard Liaw
60a8b67488
Disable mnist tests ( #14474 )
2021-03-03 13:25:01 -08:00
Hao Zhang
4135b0eb4a
[Collective] Supporting multistream, stream pool, and CUDA events. ( #14127 )
Co-authored-by: fustinose <fustinosej@gmail.com>
2021-03-03 09:53:45 -08:00
ZhuSenlin
dcff25aed6
remove invalid code inside NodeManager::NodeAdded ( #14273 )
Co-authored-by: senlin.zsl <senlin.zsl@antgroup.com>
2021-03-03 09:20:21 -08:00
SangBin Cho
a04ab9b472
[Core] Fix ray memory bug ( #14452 )
* ray memory bug
* Fix ray memory issue.
* done.
2021-03-03 09:20:00 -08:00
SangBin Cho
1d2136959f
[Core] Fix port issue ( #14435 )
* Initial impl.
* Update.
* fixed a bug.
* Fix all the issues.
* Addressed code review.
* Addressed code review.
* Fix a test failure.
2021-03-03 09:16:00 -08:00
Xianyang Liu
fc9182e63c
Fixes autoscaling monitor when environment has set http_proxy or https_proxy ( #14351 )
2021-03-03 18:22:53 +02:00
Sven Mika
5637d89ecc
[RLlib] Serve + RLlib example script. ( #14416 )
2021-03-03 14:33:03 +01:00
Sven Mika
7718ec70fb
[RLlib] Remove old SegmentTree from tests dir and unflake respective segment tree test. ( #14450 )
2021-03-03 14:31:30 +01:00
Kai Yang
d653394c7f
[Java] Some bug fixes about Java UT workflow ( #14444 )
2021-03-03 19:32:14 +08:00