Ameer Haj Ali
|
11f34f72d8
|
[autoscaler] Do not count head node with min_workers constraint. (#12980)
|
2020-12-20 14:54:46 -08:00 |
|
Barak Michener
|
7ab9164f1b
|
[ray_client] Integrate with test_basic, test_basic_2 and test_actor (#12964)
|
2020-12-20 14:54:18 -08:00 |
|
Philipp Moritz
|
bf6577c8f4
|
Switch debugger to sockets and support unicode (#13004)
|
2020-12-20 12:10:28 -08:00 |
|
Ian Rodney
|
d6e243ad46
|
[serve] Refactor to full control loop design (#12537)
|
2020-12-20 13:03:57 -06:00 |
|
Sven Mika
|
407a3523f3
|
[RLlib] eval_workers after restore not generated in Trainer due to unintuitive config handling. (#12844)
|
2020-12-20 09:37:31 -05:00 |
|
fangfengbin
|
3fab93b61b
|
Fix scheduling_resources comment errors (#12991)
* Fix scheduling_resources comment error
* add part code
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
|
2020-12-20 20:20:07 +08:00 |
|
Richard Liaw
|
038a50af52
|
[tune] skopt fix-extra-import (#12970)
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-20 01:01:09 -08:00 |
|
Philipp Moritz
|
ec9ad4a56b
|
Documentation for Ray debugger stepping (#12845)
|
2020-12-20 00:43:27 -08:00 |
|
Amog Kamsetty
|
4c63917439
|
[Queue] Add options and shutdown to Queue (#12932)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-20 00:42:21 -08:00 |
|
Amog Kamsetty
|
51139ed37c
|
[SGD] Fix process group timeout units (#12477)
|
2020-12-19 21:46:33 -08:00 |
|
Dmitri Gekhtman
|
4832b39066
|
Suggest mounting into home. Note non-root user. (#12987)
|
2020-12-19 16:09:24 -08:00 |
|
Eric Liang
|
64c97d25d3
|
Enable by default new scheduler (#12735)
|
2020-12-19 13:22:24 -08:00 |
|
Amog Kamsetty
|
5d3c9c8861
|
[Tune] Mlflow Integration (#12840)
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-19 00:40:02 -08:00 |
|
Eric Liang
|
5d987f5988
|
Revert "Increase the number of unique bits for actors to avoid handle collisions (#12894)" (#12988)
This reverts commit 3e492a79ec.
|
2020-12-18 23:51:44 -08:00 |
|
dHannasch
|
a092433bc8
|
[core] Use the ConnectWithoutRetries error message (#12732)
|
2020-12-18 22:34:34 -08:00 |
|
SangBin Cho
|
9d939e6674
|
[Object Spilling] Implement level triggered logic to make streaming shuffle work + additional cleanup (#12773)
|
2020-12-18 19:31:14 -08:00 |
|
Alex Wu
|
404161a3ff
|
[Autoscaler/Core] Remove autoscaler spam (#12952)
|
2020-12-18 18:22:45 -08:00 |
|
Kai Yang
|
ac5ea2c13d
|
[Java] Fix output parsing in RunManager (#12968)
* Fix output parsing in RunManager
* change log level
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
|
2020-12-18 18:22:12 -08:00 |
|
Eric Liang
|
6ece291f35
|
Clean up block/unblock handling of resources in new scheduler (#12963)
|
2020-12-18 16:00:54 -08:00 |
|
Eric Liang
|
3e492a79ec
|
Increase the number of unique bits for actors to avoid handle collisions (#12894)
|
2020-12-18 15:59:03 -08:00 |
|
Edward Oakes
|
3521e74f3a
|
[serve] Support for imported backends (#12923)
|
2020-12-18 15:49:24 -06:00 |
|
Eric Liang
|
92812f2e8a
|
Implement resource deadlock detection for new scheduler (#12961)
|
2020-12-18 12:17:54 -08:00 |
|
Barak Michener
|
5cfa1934e4
|
[ray_client]: Implement object retain/release and Data Streaming API (#12818)
|
2020-12-18 11:47:38 -08:00 |
|
Kai Fricke
|
55ae567f7a
|
[tune] Fix and enable SigOpt tests (#12877)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-18 01:33:12 -08:00 |
|
Gekho457
|
bff50cfc37
|
[k8s] Read gpu resources properly (#12942)
* Read gpu resources properly
* Comments and docstrings
* Comment formatting
|
2020-12-18 01:32:12 -08:00 |
|
Kai Fricke
|
426f8a8d15
|
[tune] Fix tutorial training on GPU (#12914)
|
2020-12-18 01:31:40 -08:00 |
|
fangfengbin
|
a442cd17e0
|
[GCS]Optimize gcs client reconnection (#12878)
* [GCS]Optimize gcs client reconnection
* fix review comment
* fix review comment
* add part code
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
|
2020-12-17 21:57:37 -08:00 |
|
dHannasch
|
cfefd7c70e
|
Test PingPort (#12954)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-17 21:15:42 -08:00 |
|
DK.Pino
|
6404f1e609
|
[Placement Group][New scheduler] New scheduler pg implementation (#12910)
|
2020-12-18 11:56:45 +08:00 |
|
Tao Wang
|
17152c84a7
|
[Tiny]Print raylet info after register (#12566)
|
2020-12-18 11:22:13 +08:00 |
|
Farzan Taj
|
53378170e0
|
[tune] Change pickle to ray.cloudpickle -- support large models (#12958)
* Change pickle to ray.cloudpickle
* Change pickle import to ray.cloudpickle
|
2020-12-17 19:17:08 -08:00 |
|
Kai Fricke
|
3d72000826
|
[tune] Add points_to_evaluate to BasicVariantGenerator (#12916)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
|
2020-12-17 19:16:03 -08:00 |
|
Sven Mika
|
124c8318a8
|
[RLlib] Fix broken test_distributions.py (test_categorical) (#12915)
|
2020-12-17 17:44:26 -06:00 |
|
dHannasch
|
d747071dd9
|
Test shard_context on already-created boost::asio::io_service. (#12917)
|
2020-12-17 14:26:30 -08:00 |
|
Edward Oakes
|
c7a59b239f
|
Remove unused endpoints_to_remove (#12946)
|
2020-12-17 15:04:11 -06:00 |
|
Gekho457
|
82f9c7014e
|
[K8s] Retry getting home directory in command runner. (#12925)
|
2020-12-17 09:41:48 -08:00 |
|
Allen
|
e6cb4f4bd7
|
[Core] Add log of address and port (#12908)
Co-authored-by: Allen Yin <allenyin@anyscale.io>
|
2020-12-17 00:25:29 -08:00 |
|
Yi Cheng
|
40032541dc
|
[core] Introduce fetch_local to ray.wait (#12526)
|
2020-12-16 23:44:28 -08:00 |
|
Tao Wang
|
12231ec2a6
|
Optimize heartbeat manager initialization (#12911)
|
2020-12-17 14:24:23 +08:00 |
|
SangBin Cho
|
057687e534
|
[New Scheduler] Fix test_failure.py by supporting infeasible tasks (#12738)
* Fix the first issue.
* ip
* In Progress.
* In progress.
* done.
* Remove unnecessary logs.
* Addressed code review + fix some test failures.
* Try fixing issues.
* Fix issues.
* Fix test issues.
* Fix issues.
* done.
|
2020-12-16 21:27:50 -08:00 |
|
Philipp Moritz
|
ad036fd564
|
Fix continue for debugger (#12862)
|
2020-12-16 16:09:13 -08:00 |
|
Amog Kamsetty
|
dd522a71a1
|
[SGD] Disable Elastic Training by default when using with Tune (#12927)
|
2020-12-16 15:37:44 -08:00 |
|
Alex Wu
|
8b783ecafa
|
Fix pull manager retry (#12907)
|
2020-12-16 14:18:43 -08:00 |
|
Ameer Haj Ali
|
c677b9e201
|
[autoscaler] Fix flaky autoscaler test (#12918)
|
2020-12-16 14:18:27 -08:00 |
|
Edward Oakes
|
aedcf0c9d9
|
Disable test_distributions (#12919)
|
2020-12-16 14:17:49 -08:00 |
|
Edward Oakes
|
fdb4c6eb1c
|
Better message for too little /dev/shm memory (#12896)
|
2020-12-16 10:30:20 -06:00 |
|
fangfengbin
|
91878d18b5
|
[PlacementGroup]Fix placement group wait api disorder bug (#12827)
* [PlacementGroup]Fix placement group wait api disorder bug
* fix review comment
* fix review comment
* fix review comment
* fix review comments
* increase num_heartbeats_timeout
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
|
2020-12-16 18:45:53 +08:00 |
|
Eric Liang
|
7ff314a5df
|
[New scheduler] Also unsubscribe get dependencies on unblock
|
2020-12-15 20:29:44 -08:00 |
|
Richard Liaw
|
a7caa14d3d
|
[k8s] avoid bad error messages (#12871)
|
2020-12-15 15:00:02 -08:00 |
|
Edward Oakes
|
f4b5a8b2f7
|
[serve] Re-enable test_failure.py (#12891)
|
2020-12-15 16:02:04 -06:00 |
|