16 commits

Author SHA1 Message Date
Alan Guo
50b20809b8
[Dashboard] Stop caching logs in memory. Use state observability api to fetch on demand. (#26818)
Signed-off-by: Alan Guo <aguo@anyscale.com>

## Why are these changes needed?
Reduces memory footprint of the dashboard.
Also adds some cleanup to the errors data.

Also cleans up actor cache by removing dead actors from the cache.

Dashboard UI no longer allows you to see logs for all workers in a node. You must click into each worker's logs individually.
<img width="1739" alt="Screen Shot 2022-07-20 at 9 13 00 PM" src="https://user-images.githubusercontent.com/711935/180128633-1633c187-39c9-493e-b694-009fbb27f73b.png">


## Related issue number
fixes #23680 
fixes #22027
fixes #24272
2022-07-26 03:10:57 -07:00
Kai Yang
4a999777fa
[Core] Allow accepting gRPC HTTP proxy via env variable (#23526)
2022-05-10 11:30:46 +08:00
Guyang Song
119318932a
remove the env config 'RAY_DASHBOARD_MODULE_EVENT' (#19629)
2021-10-28 16:51:59 +09:00
Simon Mo
e61160d514
[Dashboard] Move gcs health check to a separate thread to avoid crashing due to excessive CPU usage. (#18236)
2021-09-03 14:23:56 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive (#16330)
* Dashboard check gcs is alive

* Fix dashboard hangs at exit

* ray health-check call GCS CheckAlive

* Minor fixes

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
Amog Kamsetty
8dfd471823
Revert "Revert "[Dashboard][event] Basic event module (#16985)" (#17068)" (#17107)
This reverts commit c17e171f92.

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-18 12:59:04 +08:00
Amog Kamsetty
c17e171f92
Revert "[Dashboard][event] Basic event module (#16985)" (#17068)
This reverts commit f1faa79a04.
2021-07-13 23:18:43 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module (#16985)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

* Make monitor events task cancellable

* Fix error

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module (#16698)" (#17004)
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module (#16698)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Xianyang Liu
4ecd29ea2b
[dashboard] Fixes dashboard issues when environments have set http_proxy (#12598)
* fixes ray start with http_proxy

* format

* fixes

* fixes

* increase timeout

* address comments
2021-01-21 20:10:01 -08:00
fyrestone
defd41aad7
[Dashboard] http route handler cache (#10921)
* Add aiohttp_cache to dashboard

* Add comments; Refine code

* Keep NODE_STATS_UPDATE_INTERVAL_SECONDS 1 second; Change AIOHTTP_CACHE_TTL_SECONDS to 2 seconds

* Update merge

Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-10-09 22:27:05 -07:00
fyrestone
e9b046306a
[Dashboard] Dashboard basic modules (#10303)
* Improve reporter module

* Add test_node_physical_stats to test_reporter.py

* Add test_class_method_route_table to test_dashboard.py

* Add stats_collector module for dashboard

* Subscribe actor table data

* Add log module for dashboard

* Only enable test module in some test cases

* CI run all dashboard tests

* Reduce test timeout to 10s

* Use fstring

* Remove unused code

* Remove blank line

* Fix dashboard tests

* Fix asyncio.create_task not available in py36; Fix lint

* Add format_web_url to ray.test_utils

* Update dashboard/modules/reporter/reporter_head.py

Co-authored-by: Max Fitton <mfitton@berkeley.edu>

* Add DictChangeItem type for Dict change

* Refine logger.exception

* Refine GET /api/launch_profiling

* Remove disable_test_module fixture

* Fix test_basic may fail

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: Max Fitton <mfitton@berkeley.edu>
2020-08-29 23:09:34 -07:00
fyrestone
05c103af94
[Dashboard] Start the new dashboard (#10131)
* Use new dashboard if environment var RAY_USE_NEW_DASHBOARD exists; new dashboard startup

* Make fake client/build/static directory for dashboard

* Add test_dashboard.py for new dashboard

* Travis CI enable new dashboard test

* Update new dashboard

* Agent manager service

* Add agent manager

* Register agent to agent manager

* Add a new line to the end of agent_manager.cc

* Fix merge; Fix lint

* Update dashboard/agent.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Update dashboard/head.py

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Fix bug

* Add tests for dashboard

* Fix

* Remove const from Process::Kill() & Fix bugs

* Revert error check of execute_after

* Raise exception from DashboardAgent.run

* Add more tests.

* Fix compile on Linux

* Use dict comprehension instead of dict(generator)

* Fix lint

* Fix windows compile

* Fix lint

* Test Windows CI

* Revert "Test Windows CI"

This reverts commit 945e01051ec95cff5fcc1c0bc37045b46e7ad9a6.

* Fix ParseWindowsCommandLine bug

* Update src/ray/util/util.cc

Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>

Co-authored-by: 刘宝 <po.lb@antfin.com>
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
Co-authored-by: Robert Nishihara <robertnishihara@gmail.com>
2020-08-24 13:24:23 -07:00
Robert Nishihara
36e626e95d
Revert "[Dashboard] Start the new dashboard (#9860)" (#10116)
This reverts commit 739933e5b8.
2020-08-14 14:06:57 -07:00
fyrestone
739933e5b8
[Dashboard] Start the new dashboard (#9860)
2020-08-13 11:01:46 +08:00