architkulkarni
5ed3f0ce35
[Serve] [Dashboard] Add end times and DELETED state for endpoints ( #17898 )
2021-08-19 11:10:42 -05:00
Clark Zinzow
d958457d07
[Core] Second pass at privatizing APIs. ( #17885 )
...
* gcs_utils
* resource_spec
* profiling
* ray_perf and ray_cluster_perf
* test_utils
2021-08-18 20:56:33 -07:00
architkulkarni
fcac416933
[Serve] [Dashboard] Add start times and replica tags to cluster snapshot ( #17749 )
2021-08-13 09:49:12 -07:00
architkulkarni
00f6b30684
[Serve] [Dashboard] Support nondetached and multiple Serve instances in cluster snapshot ( #17747 )
2021-08-11 22:26:54 -05:00
Jiao
e38db5875b
Add serve external kv store ( #17622 )
2021-08-11 12:06:14 -07:00
architkulkarni
0c2c99b951
[Dashboard] [Serve] Make serve import conditional ( #17713 )
2021-08-10 17:06:00 -07:00
architkulkarni
febe54f422
[serve] [dashboard] Change empty serve cluster snapshot from empty list to empty dict ( #17655 )
2021-08-10 13:35:00 -05:00
architkulkarni
6d975b821b
[Serve] [Dashboard] Initial PR for exporting Serve data to cluster snapshot ( #17489 )
2021-08-06 15:03:29 -07:00
architkulkarni
ac9a1a20df
[core] [runtime_env] Use per-env async lock in agent ( #17542 )
...
Co-authored-by: Ed Oakes <ed.nmi.oakes@gmail.com>
2021-08-06 11:11:37 -05:00
Amog Kamsetty
add6ceb3ec
[Dependencies] Fix missing dependency UX ( #17420 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-05 20:18:42 -07:00
Edward Oakes
7007c6271d
[runtime_env] Gracefully fail tasks when an environment fails to be set up ( #17249 )
2021-07-28 15:25:02 -05:00
Simon Mo
4a4210a083
Support streaming output of runtime env setup to logger/driver ( #17306 )
2021-07-27 16:39:15 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive ( #16330 )
...
* Dashboard check gcs is alive
* Fix dashboard hangs at exit
* ray health-check call GCS CheckAlive
* Minor fixes
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
Richard Liaw
597dc08dfe
Revert "Revert "[core] remove opencensus/prometheus_exporter dependencies"" ( #17254 )
...
* Revert "Revert "[core] remove opencensus/prometheus_exporter dependencies" (#17251 )"
This reverts commit 7b44dd8ecb.
* Lint
* Fix more imports
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-07-26 21:09:25 -07:00
architkulkarni
bcb3a6789b
[Core] [runtime env] Cache created runtime envs ( #17342 )
2021-07-26 14:37:40 -05:00
Amog Kamsetty
cb74053ee5
Retry remove gpustat dependency ( #17115 )
...
* remove gpustat
* move psutil imports
2021-07-19 11:14:10 -07:00
Amog Kamsetty
8dfd471823
Revert "Revert "[Dashboard][event] Basic event module ( #16985 )" ( #17068 )" ( #17107 )
...
This reverts commit c17e171f92.
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-18 12:59:04 +08:00
fyrestone
e2808a35cf
Dashboard job module uses attrs instead of pydantic for job description ( #17116 )
2021-07-16 22:26:00 +08:00
Amog Kamsetty
caa78a3cff
Revert "[Core] Remove gpustat from core dependencies ( #17059 )" ( #17106 )
...
This reverts commit 7ec18f671a.
2021-07-14 20:19:33 -07:00
Amog Kamsetty
c17e171f92
Revert "[Dashboard][event] Basic event module ( #16985 )" ( #17068 )
...
This reverts commit f1faa79a04.
2021-07-13 23:18:43 -07:00
Amog Kamsetty
7ec18f671a
[Core] Remove gpustat from core dependencies ( #17059 )
2021-07-13 21:22:02 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module ( #16985 )
...
* Basic event module
* Fix comments
* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2
* Fix lint
* Fix lint
* Clean code
* Try to fix flaky
* Fix test
* Disable event module by default
* Make monitor events task cancellable
* Fix error
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module ( #16698 )" ( #17004 )
...
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module ( #16698 )
...
* Basic event module
* Fix comments
* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2
* Fix lint
* Fix lint
* Clean code
* Try to fix flaky
* Fix test
* Disable event module by default
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
architkulkarni
06dfd8dddb
Revert "[Dashboard][event] Basic event module ( #16283 )" ( #16676 )
...
This reverts commit 5afa53aa64.
2021-06-25 09:38:18 -07:00
SongGuyang
e74d9d3ded
[runtime env] Download runtime env(conda) in agent instead of setup_worker ( #16525 )
2021-06-25 19:39:05 +08:00
fyrestone
5afa53aa64
[Dashboard][event] Basic event module ( #16283 )
2021-06-25 13:59:02 +08:00
SongGuyang
874e947d6f
[runtime env] support create or delete runtime envs in agent ( #15904 )
2021-06-09 20:22:25 +08:00
fyrestone
4ca316a0f4
Move test_snapshot from test_dashboard.py to modules/snapshot/tests/test_snapshot.py ( #16306 )
...
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-06-08 10:26:03 -07:00
fyrestone
dfadf33a94
[Dashboard] Reorganize dashboard modules - node ( #16217 )
2021-06-07 19:50:46 -07:00
Alex Wu
e1da31f149
[dashboard] Include ray session name in dashboard snapshot ( #16199 )
...
* .
* .
* .
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-02 15:07:06 -07:00
fyrestone
c53893cb13
[Dashboard] Reorganize dashboard modules - actor ( #16170 )
2021-06-02 06:58:30 -07:00
Alex Wu
f080911d9b
[dashboard] include worker id in actor snapshot ( #15967 )
...
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-05-21 09:26:37 -07:00
Alex Wu
cd2fc7792f
[dashboard] Snapshot of cluster state ( #15868 )
2021-05-20 08:10:32 -07:00
fyrestone
56c309416e
[Job submission] Basic job submission structure ( #15103 )
2021-05-12 15:08:20 +08:00
Ian Rodney
90ce25cb35
[dashboard] Avoid global min_workers ( #15660 )
2021-05-10 15:47:51 -07:00
SongGuyang
b8ff86adb9
Add objectStore stats to dashboard API. ( #15677 )
2021-05-10 11:32:14 -05:00
Ian Rodney
546e5f6f13
[API] Remove non-API top Level function imports ( #15440 )
2021-04-27 12:33:59 -07:00
Dmitri Gekhtman
410f768046
[Kubernetes] [Dashboard] Remove disk data from dashboard when running on K8s. ( #14676 )
2021-04-05 17:16:20 -07:00
Clark Zinzow
1a9ba19012
[Core] Adds deprecation decorator and fixes privatization of a few APIs. ( #14811 )
2021-03-22 10:31:50 -07:00
Ian Rodney
eb12033612
[Code Cleanup] Switch to use ray.util.get_node_ip_address() ( #14741 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 13:10:57 -07:00
Lixin Wei
72d87093b9
[Core] Make Actor DEAD and Save Exceptions in GCS When Error Happens in Constructor ( #14211 )
2021-03-17 12:50:28 -07:00
Kathryn Zhou
01dda99b8c
Export cluster statistics to Prometheus ( #14612 )
2021-03-15 19:28:13 -07:00
Dmitri Gekhtman
6babd1928c
[Kubernetes][dashboard][minor] Fix uptime ( #14655 )
2021-03-12 18:30:13 -06:00
Dmitri Gekhtman
a90cffe26c
[dashboard][k8s] Better CPU reporting when running on K8s ( #14593 )
2021-03-12 12:02:15 -06:00
Clark Zinzow
5a788474aa
[Core] First pass at privatizing non-public Python APIs. ( #14607 )
...
* async_compat
* utils
* cluster_utils
* compat
* function_manager
* import_thread
* memory_monitor
* monitor, log_monitor, ray_process_reaper
* metrics_agent
* parameter
* prometheus_exporter
* ray_logging
* signature
2021-03-10 22:47:28 -08:00
Dmitri Gekhtman
4a7d9e71bb
[dashboard][kubernetes] Show container's memory info on K8s, not the physical host's. ( #14499 )
...
* random doc typo
* more reasonable memory output
* no if
* get rid of comment
2021-03-08 18:59:41 -08:00
fyrestone
3616424f10
Disable dashboard tune module if pandas version is incorrect ( #14381 )
2021-03-08 20:40:59 -06:00
fyrestone
2da58bb021
[Dashboard] Fix reporter agent ( #14378 )
2021-03-08 13:12:34 -06:00
fyrestone
5e76a51d56
[Dashboard] Select port in dashboard ( #13763 )
...
* Dashboard select port; Fix dashboard may hangs when exit
* Add test case
* Fix
* Fix test_stats_collector.py::test_get_all_node_details
* Refine dashboard error messages
* Refine code
* Refine code
* Show last 10 lines of dashboard log if start dashboard failed
* Fix ValueError: too many values to unpack (expected 2) when getsockname
* Fix test_multi_node_3.py::test_calling_start_ray_head may fail
* Fix Windows CI
* Disable dashboard in C++ test
* Refine code
* Fix issue 7084
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-02-23 16:27:48 -08:00