Ian Rodney
857408874c
[Autoscaler][Azure] check if 'update' is available in ( #14787 )
2021-03-19 08:39:46 -07:00
Chris Bamford
cd89f0dc55
[RLLib] Episode media logging support ( #14767 )
2021-03-19 09:17:09 +01:00
Amog Kamsetty
47300d5a53
[SGD] Worker Startup Fault Tolerance ( #14724 )
2021-03-18 22:53:56 -07:00
Eric Liang
c30d5f445c
Nonblocking release for ray client to deflake tests ( #14782 )
...
* fix
* update
* fix
2021-03-18 21:49:36 -07:00
Ian Rodney
00aceaae37
[Client] Test Serialization in a platform independent way. ( #14786 )
2021-03-18 18:24:44 -07:00
Alex Wu
62214f1b80
Delete WIP in scalability envelope ( #14791 )
2021-03-18 17:53:53 -07:00
Amog Kamsetty
7ee2e4185b
[Tune] PTL Fractional GPUs ( #14781 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 17:07:51 -07:00
Lixin Wei
9227a83b59
Print ERROR log of actor creation task ( #14764 )
2021-03-18 16:56:55 -07:00
Richard Liaw
ebc71339fe
[client] fix multi-threading bugs ( #14701 )
2021-03-18 16:25:55 -07:00
Dmitri Gekhtman
da56a863f9
[Kubernetes][autoscaler] Deep copy in K8s Node Provider to fix scaling issues ( #14773 )
2021-03-18 18:17:57 -05:00
Ian Rodney
0495d6af15
[autoscaler] fix azure config issues ( #14750 )
2021-03-18 16:00:25 -07:00
Yi Cheng
881a46e1d6
[core] RuntimeEnv GC in local node ( #14594 )
2021-03-18 14:55:11 -07:00
Ian Rodney
eb12033612
[Code Cleanup] Switch to use ray.util.get_node_ip_address() ( #14741 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 13:10:57 -07:00
Sven Mika
c3a15ecc0f
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. ( #14033 )
2021-03-18 20:27:41 +01:00
Richard Liaw
1d033fb552
[client] Fix serialization of RayTaskError ( #14698 )
2021-03-18 12:26:33 -07:00
Richard Liaw
8201e4ea11
[client] fix refcounting for named actors ( #14753 )
...
* max-workers
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 12:20:29 -07:00
Eric Liang
ef249c98b1
[flaky test] Fix test_cli by disabling config cache for dashboard test ( #14755 )
2021-03-18 12:02:25 -07:00
SangBin Cho
351540e17e
[Test] Fix flaky object spilling test linux ( #14757 )
...
* Fix.
* done.
2021-03-18 11:37:09 -07:00
Edward Oakes
90f5ebac72
[serve] Add backend_state tests for updating backend config ( #14772 )
2021-03-18 12:58:39 -05:00
Edward Oakes
de598149d1
[serve] Add tests for backend_state versioning ( #14748 )
2021-03-18 11:08:45 -05:00
Ian Rodney
971855a353
[Serve] Disable Final Standalone Test on Windows ( #14761 )
2021-03-18 09:26:55 -05:00
Tao Wang
5305dbb639
[large scale]Always disable sync/subscribe context in sharding context ( #14706 )
2021-03-18 19:31:36 +08:00
Edward Oakes
91308b9b52
[serve] Refactor to add basic unit tests for BackendState ( #14740 )
2021-03-17 22:35:28 -05:00
Tao Wang
44a7ce3d35
[large scale]Disable async/subscribe context in global state accessor ( #14705 )
2021-03-18 11:07:33 +08:00
Tao Wang
ea7c9171e9
[large scale]Disable async context in raylets' gcs client ( #14704 )
2021-03-18 10:50:09 +08:00
Ian Rodney
50e95ad6dd
[Serve] Disable More test::standalone on windows ( #14751 )
2021-03-17 16:51:02 -07:00
Clark Zinzow
6a28cf4add
[Core] Event loop instrumentation concurrency fixes. ( #14719 )
...
* Moved global stats member to a shared pointer explicitly captured by-value by handler lambdas, fixed handler stats copy outside of lock, ported to generalized lambda capture.
* Reenabled event loop instrumentation by default.
* Remove explicit inline specifier from non-member functions, move into anonymous namespace.
* Revert "Reenabled event loop instrumentation by default."
This reverts commit 949215269f79a1ab5ddc1ce0285c3ff4477ee6e0.
2021-03-17 16:49:25 -07:00
Michael Schock
42dcacd888
[k8s] Minor doc fix ( #14732 )
2021-03-17 16:15:38 -07:00
Edward Oakes
34b5781ae0
[serve] Add basic support for a declarative deploy() API call ( #14720 )
2021-03-17 16:00:23 -05:00
Edward Oakes
f2013a0586
[serve] Skip test_standalone::test_connect on windows ( #14747 )
2021-03-17 13:50:34 -07:00
Lixin Wei
72d87093b9
[Core] Make Actor DEAD and Save Exceptions in GCS When Error Happens in Constructor ( #14211 )
2021-03-17 12:50:28 -07:00
Alex Wu
534846a1d2
[Autoscaler] Track failed nodes ( #14608 )
2021-03-17 12:49:31 -07:00
Ian Rodney
99861f5302
[JAR Build] Prevent MacOS Jar Builds from Timing Out ( #14738 )
2021-03-17 12:05:37 -07:00
Siyuan (Ryans) Zhuang
6d346e74a6
cleanup python code ( #14691 )
...
* cleanup python code
2021-03-17 10:45:05 -07:00
Clark Zinzow
a86277a93c
[dask-on-ray] Fix Dask-on-Ray examples in docs ( #14461 )
2021-03-17 10:37:32 -07:00
Ian Rodney
10250d737f
[Autoscaler] Add tests around docker run options ( #14713 )
2021-03-17 10:13:51 -07:00
Edward Oakes
c781197755
[serve] Temporarily disable ray client test ( #14733 )
2021-03-17 08:48:05 -07:00
Edward Oakes
aab7ccc466
[serve] Deprecate client-based API in favor of process-wide singleton ( #14696 )
2021-03-17 09:39:54 -05:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. ( #13065 )
2021-03-17 08:18:15 +01:00
Akash Patel
6e326cc239
upgrade setproctitle dep ( #14538 )
2021-03-16 21:58:36 -07:00
Ian Rodney
8a936ad64d
[Autoscaler Docs] Use worker_run_options ( #14721 )
...
Co-authored-by: Ameer Haj Ali <ameerh@berkeley.edu>
2021-03-16 18:04:27 -07:00
Siyuan (Ryans) Zhuang
f30ac73640
update cloudpickle to commit 6e0f571 ( #14693 )
2021-03-16 12:36:43 -07:00
Ian Rodney
bd641a5e71
Revert "[Core] Added event loop metrics for posts. ( #14546 )" ( #14692 )
2021-03-16 10:38:45 -07:00
Edward Oakes
5a45e3351f
add Serve service by default ( #14711 )
2021-03-16 10:34:30 -07:00
Eric Liang
b240f5f0c9
Incremental refactor of runtime_env for consistency ( #14632 )
2021-03-16 10:11:50 -07:00
Sven Mika
78a134efa2
[RLlib] Add HowTo set env seed to our custom env example script. ( #14471 )
2021-03-16 08:12:27 +01:00
Tao Wang
897b84b300
[large scale]Add option for disable/enable context connection and disable asynchro… ( #14596 )
2021-03-16 15:09:13 +08:00
Edward Oakes
ae2c20c1ac
[serve] Include required and available resources in slow startup message ( #14695 )
2021-03-15 21:32:07 -05:00
Kathryn Zhou
01dda99b8c
Export cluster statistics to Prometheus ( #14612 )
2021-03-15 19:28:13 -07:00
Ian Rodney
d251bb676d
[Autoscaler] Get_Head_Node should return an up-to-date node ( #14579 )
2021-03-15 17:48:18 -07:00