Commit graph

128 commits

Author SHA1 Message Date
Amog Kamsetty
caa78a3cff
Revert "[Core] Remove gpustat from core dependencies (#17059)" (#17106)
This reverts commit 7ec18f671a.
2021-07-14 20:19:33 -07:00
Amog Kamsetty
c17e171f92
Revert "[Dashboard][event] Basic event module (#16985)" (#17068)
This reverts commit f1faa79a04.
2021-07-13 23:18:43 -07:00
Amog Kamsetty
7ec18f671a
[Core] Remove gpustat from core dependencies (#17059)
2021-07-13 21:22:02 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module (#16985)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

* Make monitor events task cancellable

* Fix error

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module (#16698)" (#17004)
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module (#16698)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Amog Kamsetty
39d60f62d2
[hotfix] fix material-ui version once more (#16901)
2021-07-06 13:57:34 -07:00
Simon Mo
b11b35aa45
hotfix material-ui version again (#16897)
2021-07-06 11:08:57 -07:00
Amog Kamsetty
d5ac5c45ea
[Dashboard] Pin material-ui/lab dependency (#16890)
2021-07-06 10:49:10 -07:00
architkulkarni
06dfd8dddb
Revert "[Dashboard][event] Basic event module (#16283)" (#16676)
This reverts commit 5afa53aa64.
2021-06-25 09:38:18 -07:00
SongGuyang
e74d9d3ded
[runtime env] Download runtime env(conda) in agent instead of setup_worker (#16525)
2021-06-25 19:39:05 +08:00
fyrestone
5afa53aa64
[Dashboard][event] Basic event module (#16283)
2021-06-25 13:59:02 +08:00
SongGuyang
874e947d6f
[runtime env] support create or delete runtime envs in agent (#15904)
2021-06-09 20:22:25 +08:00
fyrestone
4ca316a0f4
Move test_snapshot from test_dashboard.py to modules/snapshot/tests/test_snapshot.py (#16306)
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-06-08 10:26:03 -07:00
fyrestone
dfadf33a94
[Dashboard] Reorganize dashboard modules - node (#16217)
2021-06-07 19:50:46 -07:00
Alex Wu
e1da31f149
[dashboard] Include ray session name in dashboard snapshot (#16199)
* .

* .

* .

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-02 15:07:06 -07:00
fyrestone
c53893cb13
[Dashboard] Reorganize dashboard modules - actor (#16170)
2021-06-02 06:58:30 -07:00
Simon Mo
677514b3ff
Revert "[Dashboard] Actor Table UI Optimize (#15802)" (#15981)
This reverts commit 43be599a9a.
2021-05-21 10:56:15 -07:00
Alex Wu
f080911d9b
[dashboard] include worker id in actor snapshot (#15967)
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-05-21 09:26:37 -07:00
Dominic Ming
43be599a9a
[Dashboard] Actor Table UI Optimize (#15802)
2021-05-21 09:23:32 -07:00
Alex Wu
cd2fc7792f
[dashboard] Snapshot of cluster state (#15868)
2021-05-20 08:10:32 -07:00
Ian Rodney
7b1c5dbe0a
[Hotfix][Lint] Pin other ESlint Deps (#15816)
2021-05-14 09:18:43 -07:00
fyrestone
56c309416e
[Job submission] Basic job submission structure (#15103)
2021-05-12 15:08:20 +08:00
Ashwin Hegde
4d8ed6dd5c
#13890 [new-dashboard] add object store memory column (#15697)
2021-05-11 15:36:16 -05:00
Ian Rodney
90ce25cb35
[dashboard] Avoid global min_workers (#15660)
2021-05-10 15:47:51 -07:00
Ian Rodney
c50490ccef
[Lint] Pin Prettier to 2.3.0 (#15721)
2021-05-10 11:46:29 -07:00
Ian Rodney
11b5c6c702
[HotFix][Lint] Fix Lint because of Prettier update (#15720)
2021-05-10 09:51:41 -07:00
SongGuyang
b8ff86adb9
Add objectStore stats to dashboard API. (#15677)
2021-05-10 11:32:14 -05:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 (#15527)
* formatting

* format util

* format release

* format rllib/agents

* format rllib/env

* format rllib/execution

* format rllib/evaluation

* format rllib/examples

* format rllib/policy

* format rllib utils and tests

* format streaming

* more formatting

* update requirements files

* fix rllib type checking

* updates

* update

* fix circular import

* Update python/ray/tests/test_runtime_env.py

* noqa
2021-05-03 14:23:28 -07:00
Ian Rodney
546e5f6f13
[API] Remove non-API top Level function imports (#15440)
2021-04-27 12:33:59 -07:00
fyrestone
43de7f48a7
Fix reported dashboard ip when using 0.0.0.0 (#15506)
2021-04-27 23:48:22 +08:00
Dmitri Gekhtman
410f768046
[Kubernetes] [Dashboard] Remove disk data from dashboard when running on K8s. (#14676)
2021-04-05 17:16:20 -07:00
Micah Yong
b3089b31f2
[RFC] Ray memory improvements: format and summary (#14520)
* Better formatting when terminal size doesn't support tabular

* Summary now displays size of reference types

* Add unit conversion support (e.g. b, kb, mb, gb)

* Format and test

* Add ability to specify the number of sorted entries

* Linting

* Clean up group summary, move import defaultdict, comment num entries counter, n

* Format and lint
2021-03-28 21:03:06 -07:00
fyrestone
52cfa1cdd7
Fix load code from local (#12102)
2021-03-24 11:49:58 +08:00
Clark Zinzow
1a9ba19012
[Core] Adds deprecation decorator and fixes privatization of a few APIs. (#14811)
2021-03-22 10:31:50 -07:00
Ian Rodney
eb12033612
[Code Cleanup] Switch to use ray.util.get_node_ip_address() (#14741)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 13:10:57 -07:00
Lixin Wei
72d87093b9
[Core] Make Actor DEAD and Save Exceptions in GCS When Error Happens in Constructor (#14211)
2021-03-17 12:50:28 -07:00
Kathryn Zhou
01dda99b8c
Export cluster statistics to Prometheus (#14612)
2021-03-15 19:28:13 -07:00
Dmitri Gekhtman
6babd1928c
[Kubernetes][dashboard][minor] Fix uptime (#14655)
2021-03-12 18:30:13 -06:00
Dmitri Gekhtman
a90cffe26c
[dashboard][k8s] Better CPU reporting when running on K8s (#14593)
2021-03-12 12:02:15 -06:00
Clark Zinzow
5a788474aa
[Core] First pass at privatizing non-public Python APIs. (#14607)
* async_compat

* utils

* cluster_utils

* compat

* function_manager

* import_thread

* memory_monitor

* monitor, log_monitor, ray_process_reaper

* metrics_agent

* parameter

* prometheus_exporter

* ray_logging

* signature
2021-03-10 22:47:28 -08:00
Dmitri Gekhtman
4a7d9e71bb
[dashboard][kubernetes] Show container's memory info on K8s, not the physical host's. (#14499)
* random doc typo

* more reasonable memory output

* no if

* get rid of comment
2021-03-08 18:59:41 -08:00
fyrestone
3616424f10
Disable dashboard tune module if pandas version is incorrect (#14381)
2021-03-08 20:40:59 -06:00
fyrestone
2da58bb021
[Dashboard] Fix reporter agent (#14378)
2021-03-08 13:12:34 -06:00
SangBin Cho
a04ab9b472
[Core] Fix ray memory bug (#14452)
* ray memory bug

* Fix ray memory issue.

* done.
2021-03-03 09:20:00 -08:00
SangBin Cho
09fd38ede1
[Multi node shuffle] More efficient ray memory --stats-only (#14423)
* Done.

* Fix all the issues.
2021-03-01 23:14:06 -08:00
Eric Liang
9db000ff2c
Auto report object store memory usage; remove some deprecated code (#14260)
2021-03-01 13:19:44 -08:00
Micah Yong
db0c16824c
[Dashboard][CLI] Ray memory parity with dashboard 2 (#13444)
* Minor improvements in Ray Core Walkthrough as seen in https://github.com/ray-project/ray/issues/12472

* Define node_stats() to return NodeStats object from cluster

* Add --group-by and --sort-by capabilities to ray memory script

* Resolve merge conflict

* Add helper functions for group by and sorting type in memory_utils.py

* Reformat

* Format

* Compartmentalize memory script into get_memory_summary and get_store_stats_summary

* Modify unit tests in test_mem_stat

* Lint and format

* Test cases for group_by sort_by

* Lint and format

* Fix actor handle failing test case

* Update test_memstat.py

* Resolve merge conflicts

* Adjust ray memory output based on terminal size

* Formatting and linting

* Use constant for callsite length

* Switch from OS to shutil for querying terminal size (official python support)

* Linting and formatting

* Lint and format

* Resolve lint issue in walkthrough.rst

* Revert to python 3.6

* Delete visitor.py

It was accidentally included in most recent commit

* Delete .eggs

It was accidentally included in most recent commit

* Resolve test_object_spilling.py test case

* Add stats only argument

* revert changes on this file

* Remove package-lock.json

* Add back npm installation

* Sync package-lock.json

* Linting and formatting

* Sync with package-lock

* Sync with package-lock pt 2

* Update documentation in https://docs.ray.io/en/master/memory-management.html

* Add include_memory_info as argument for node_stats

* Switch object ref and call site positions

* Linting and formatting

* Change from MiB to B

* Change from stats-only to store-true

* Add memory test case

* Add memory test case

* Lint and format

* Correct test in memstat

* Change line wrap and stats only to flags

* Clarify --stats-only and --no-format in ray memory

* --stats-only description modified

Co-authored-by: Micah Yong <micahyong@Micahs-MacBook-Pro.local>
2021-03-01 09:27:22 -08:00
Kathryn Zhou
456d9aab47
Add Cypress test for Ray Dashboard (#14253)
2021-02-24 20:41:52 -08:00
niole
488f63efe3
[Dashboard] Make requests sent by the dashboard reverse proxy compatible (#14012)
2021-02-24 18:31:59 -08:00