Commit graph

30 commits

Author SHA1 Message Date
Edward Oakes
7736cdd91d
[dashboard] Rename "new_dashboard" -> "dashboard" (#18214) 2021-09-15 11:17:15 -05:00
Clark Zinzow
d958457d07
[Core] Second pass at privatizing APIs. (#17885)
* gcs_utils

* resource_spec

* profiling

* ray_perf and ray_cluster_perf

* test_utils
2021-08-18 20:56:33 -07:00
Amog Kamsetty
add6ceb3ec
[Dependencies] Fix missing dependency UX (#17420)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-08-05 20:18:42 -07:00
Amog Kamsetty
cb74053ee5
Retry remove gpustat dependency (#17115)
* remove gpustat

* move psutil imports
2021-07-19 11:14:10 -07:00
Amog Kamsetty
caa78a3cff
Revert "[Core] Remove gpustat from core dependencies (#17059)" (#17106)
This reverts commit 7ec18f671a.
2021-07-14 20:19:33 -07:00
Amog Kamsetty
7ec18f671a
[Core] Remove gpustat from core dependencies (#17059) 2021-07-13 21:22:02 -07:00
Dmitri Gekhtman
410f768046
[Kubernetes] [Dashboard] Remove disk data from dashboard when running on K8s. (#14676) 2021-04-05 17:16:20 -07:00
Ian Rodney
eb12033612
[Code Cleanup] Switch to use ray.util.get_node_ip_address() (#14741)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 13:10:57 -07:00
Kathryn Zhou
01dda99b8c
Export cluster statistics to Prometheus (#14612) 2021-03-15 19:28:13 -07:00
Dmitri Gekhtman
6babd1928c
[Kubernetes][dashboard][minor] Fix uptime (#14655) 2021-03-12 18:30:13 -06:00
Dmitri Gekhtman
a90cffe26c
[dashboard][k8s] Better CPU reporting when running on K8s (#14593) 2021-03-12 12:02:15 -06:00
Clark Zinzow
5a788474aa
[Core] First pass at privatizing non-public Python APIs. (#14607)
* async_compat

* utils

* cluster_utils

* compat

* function_manager

* import_thread

* memory_monitor

* monitor, log_monitor, ray_process_reaper

* metrics_agent

* parameter

* prometheus_exporter

* ray_logging

* signature
2021-03-10 22:47:28 -08:00
Dmitri Gekhtman
4a7d9e71bb
[dashboard][kubernetes] Show container's memory info on K8s, not the physical host's. (#14499)
* random doc typo

* more reasonable memory output

* no if

* get rid of comment
2021-03-08 18:59:41 -08:00
fyrestone
2da58bb021
[Dashboard] Fix reporter agent (#14378) 2021-03-08 13:12:34 -06:00
Kathryn Zhou
d6521be7ef
Export GPU metrics, CPU count, and additional Memory metrics to Prometheus (#14170) 2021-02-22 10:04:18 -08:00
Kathryn Zhou
f6b5e838fe
Add disk and network metrics to Prometheus and fix dashboard (#14144) 2021-02-17 10:27:14 -08:00
Simon Mo
33316d4f8f
Revert "Export additional metrics to Prometheus (#14061)" (#14134)
This reverts commit 82539f2da4.
2021-02-16 12:49:12 -08:00
Kathryn Zhou
82539f2da4
Export additional metrics to Prometheus (#14061) 2021-02-14 23:16:26 -08:00
SangBin Cho
32dc5676b4
[Metrics] Record per node and raylet cpu / mem usage (#12982)
* Record per node and raylet cpu / mem usage

* Add comments.

* Addressed code review.
2021-01-05 21:57:21 -08:00
Max Fitton
caf3b04b27
[Dashboard] Turn on new dashboard by default pt 2 (#11510) 2020-10-23 15:52:14 -05:00
Max Fitton
cdca5af53b
Revert "[Dashboard] Turn on New Dashboard by Default (#11321)" (#11502)
This reverts commit f500292d41.
2020-10-20 10:53:10 -05:00
Max Fitton
f500292d41
[Dashboard] Turn on New Dashboard by Default (#11321) 2020-10-19 12:31:11 -05:00
Eric Liang
609c1b8acd
Start moving ray internal files to _private module (#10994) 2020-09-24 22:46:35 -07:00
fyrestone
50784e2496
[Dashboard] Dashboard node grouping (#10528)
* Add RAY_NODE_ID environment var to agent

* Node ralated data use node id as key

* ray.init() return node id; Pass test_reporter.py

* Fix lint & CI

* Fix comments

* Minor fixes

* Fix CI

* Add const to ClientID in AgentManager::Options

* Use fstring

* Add comments

* Fix lint

* Add test_multi_nodes_info

Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-09-16 10:17:29 -07:00
fyrestone
e9b046306a
[Dashboard] Dashboard basic modules (#10303)
* Improve reporter module

* Add test_node_physical_stats to test_reporter.py

* Add test_class_method_route_table to test_dashboard.py

* Add stats_collector module for dashboard

* Subscribe actor table data

* Add log module for dashboard

* Only enable test module in some test cases

* CI run all dashboard tests

* Reduce test timeout to 10s

* Use fstring

* Remove unused code

* Remove blank line

* Fix dashboard tests

* Fix asyncio.create_task not available in py36; Fix lint

* Add format_web_url to ray.test_utils

* Update dashboard/modules/reporter/reporter_head.py

Co-authored-by: Max Fitton <mfitton@berkeley.edu>

* Add DictChangeItem type for Dict change

* Refine logger.exception

* Refine GET /api/launch_profiling

* Remove disable_test_module fixture

* Fix test_basic may fail

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
Ian Rodney
d6f2b0d933
[docker] Run profiling without sudo (#10388)
* fix profiling for docker

* small fixes

* use name

* do not import pwd on windows
2020-08-28 21:25:10 -07:00
fyrestone
05c103af94
[Dashboard] Start the new dashboard (#10131)
* Use new dashboard if environment var RAY_USE_NEW_DASHBOARD exists; new dashboard startup

* Make fake client/build/static directory for dashboard

* Add test_dashboard.py for new dashboard

* Travis CI enable new dashboard test

* Update new dashboard

* Agent manager service

* Add agent manager

* Register agent to agent manager

* Add a new line to the end of agent_manager.cc

* Fix merge; Fix lint

* Update dashboard/agent.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Update dashboard/head.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Fix bug

* Add tests for dashboard

* Fix

* Remove const from Process::Kill() & Fix bugs

* Revert error check of execute_after

* Raise exception from DashboardAgent.run

* Add more tests.

* Fix compile on Linux

* Use dict comprehension instead of dict(generator)

* Fix lint

* Fix windows compile

* Fix lint

* Test Windows CI

* Revert "Test Windows CI"

This reverts commit 945e01051ec95cff5fcc1c0bc37045b46e7ad9a6.

* Fix ParseWindowsCommandLine bug

* Update src/ray/util/util.cc

Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
2020-08-24 13:24:23 -07:00
Robert Nishihara
36e626e95d
Revert "[Dashboard] Start the new dashboard (#9860)" (#10116)
This reverts commit 739933e5b8.
2020-08-14 14:06:57 -07:00
fyrestone
739933e5b8
[Dashboard] Start the new dashboard (#9860) 2020-08-13 11:01:46 +08:00
fyrestone
4d08ddbf24
[Dashboard] New dashboard skeleton (#9099) 2020-07-27 11:34:47 +08:00