Enable API-annotation checking of the Ray core module (excluding serve, workflows, and tune) in ./ci/lint/check_api_annotations.py. This required moving many files into ray._private, along with associated fixes.
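For context, the checker asserts that symbols reachable from the public `ray` namespace carry an API annotation, while anything under ray._private is skipped, which is why the files had to move. Below is a minimal sketch of the annotation pattern being enforced, using the real decorators from ray.util.annotations; the class and function names are made up for illustration:

```python
from ray.util.annotations import DeveloperAPI, PublicAPI

@PublicAPI(stability="beta")  # exported, user-facing: satisfies the check
class ObjectStats:
    """Hypothetical user-facing helper."""

@DeveloperAPI  # exported for advanced users: also satisfies the check
def dump_raylet_stats() -> dict:
    """Hypothetical introspection hook; may change between releases."""
    return {}

# Anything exported from `ray.*` without one of these annotations fails
# the lint; internal-only code belongs under ray._private, which the
# checker ignores.
```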
Add visibility into the following to help Ray users and developers debug performance and OOM issues (see the sketch after this list):
- Raylet memory usage, broken down by USS vs. the remaining RSS.
- Total worker count, CPU percentage usage, and memory usage.
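A minimal sketch of how such a breakdown can be computed with psutil, which the dashboard's reporter builds on; the function names and exact aggregation are illustrative, not Ray's actual reporter code:

```python
import psutil

def raylet_memory_breakdown(raylet_pid: int) -> dict:
    # memory_full_info() exposes USS, the pages unique to this process.
    # The rest of RSS is mostly memory shared with other processes,
    # e.g. the mapped plasma object store.
    mem = psutil.Process(raylet_pid).memory_full_info()
    return {"uss": mem.uss, "remaining_rss": mem.rss - mem.uss}

def workers_summary(worker_pids: list) -> dict:
    # Aggregate worker count, CPU percentage, and memory usage.
    # Note: cpu_percent(interval=None) returns 0.0 on the first call;
    # a real reporter samples it periodically.
    procs = [psutil.Process(pid) for pid in worker_pids]
    return {
        "count": len(procs),
        "cpu_percent": sum(p.cpu_percent(interval=None) for p in procs),
        "mem_rss": sum(p.memory_info().rss for p in procs),
    }
```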
https://github.com/ray-project/ray/pull/14676 disabled the disk usage/total display for Ray nodes on K8s, because Ray nodes on K8s run as pods, which in general do not use the entire machine.
However, it is sometimes useful to run one Ray pod per K8s node, in which case reporting disk usage is meaningful again.
This PR adds a flag that enables the disk usage display in those situations.
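As an illustration of the gating logic only; the actual flag name is defined in the PR, and RAY_REPORT_K8S_DISK_USAGE below is a hypothetical stand-in:

```python
import os
import psutil

def disk_usage_for_dashboard(path: str = "/"):
    # KUBERNETES_SERVICE_HOST is set inside every K8s pod. The opt-in
    # variable below is hypothetical; the real flag comes from the PR.
    on_k8s = "KUBERNETES_SERVICE_HOST" in os.environ
    opted_in = os.environ.get("RAY_REPORT_K8S_DISK_USAGE") == "1"
    if on_k8s and not opted_in:
        return None  # a pod usually does not own the whole node's disk
    usage = psutil.disk_usage(path)
    return {"used": usage.used, "total": usage.total, "percent": usage.percent}
```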
Adds some metrics useful for object-intensive workloads (illustrated in the sketch after this list):
Per raylet/object manager:
- Number of bytes pending restore in the spill manager.
- Cumulative number of requests in the PullManager.
- Cumulative number of bytes pushed to/pulled from other nodes.
- Histograms of request latencies in the PullManager:
  - total lifetime of a request, from start to cancel
  - request satisfaction time, from start until the object is local
  - pull time, from object activation until the object is local
Per node:
- Disk read/write speed and IOPS.
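The raylet emits these metrics from C++, but their shape is easy to illustrate with Ray's Python metrics API (ray.util.metrics, which requires a running Ray session to export). The metric names, bucket boundaries, and helper functions below are illustrative, not the identifiers the raylet actually exports:

```python
import time
import psutil
from ray.util import metrics

# Illustrative counterparts of the new PullManager/object manager metrics.
pull_request_lifetime = metrics.Histogram(
    "pull_manager_request_lifetime_s",
    description="Total lifetime of a pull request, start to cancel.",
    boundaries=[0.1, 1, 5, 10, 30, 60, 300],
)
bytes_pulled = metrics.Counter(
    "object_manager_bytes_pulled",
    description="Cumulative bytes pulled from other nodes.",
)

def record_request_cancelled(start_s: float) -> None:
    pull_request_lifetime.observe(time.time() - start_s)

def record_pull(num_bytes: int) -> None:
    bytes_pulled.inc(num_bytes)

def disk_io_rates(interval_s: float = 1.0) -> dict:
    # Per-node disk throughput and IOPS, derived from two samples of
    # psutil.disk_io_counters() taken interval_s seconds apart.
    a = psutil.disk_io_counters()
    time.sleep(interval_s)
    b = psutil.disk_io_counters()
    return {
        "read_Bps": (b.read_bytes - a.read_bytes) / interval_s,
        "write_Bps": (b.write_bytes - a.write_bytes) / interval_s,
        "read_iops": (b.read_count - a.read_count) / interval_s,
        "write_iops": (b.write_count - a.write_count) / interval_s,
    }
```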