* Minor improvements in Ray Core Walkthrough as seen in https://github.com/ray-project/ray/issues/12472
* Define node_stats() to return NodeStats object from cluster
* Add --group-by and --sort-by capabilities to ray memory script
* Resolve merge conflict
* Add helper functions for group by and sorting type in memory_utils.py
* Reformat
* Format
* Compartmentalize memory script into get_memory_summary and get_store_stats_summary
* Modify unit tests in test_mem_stat
* Lint and format
* Test cases for group_by sort_by
* Lint and format
* Fix actor handle failing test case
* Update test_memstat.py
* Resolve merge conflicts
* Adjust ray memory output based on terminal size
* Formatting and linting
* Use constant for callsite length
* Switch from os to shutil for querying terminal size (official Python support)
* Linting and formatting
* Lint and format
* Resolve lint issue in walkthrough.rst
* Revert to Python 3.6
* Delete visitor.py
It was accidentally included in the most recent commit
* Delete .eggs
It was accidentally included in the most recent commit
* Resolve test_object_spilling.py test case
* Add stats only argument
* revert changes on this file
* Remove package-lock.json
* Add back npm installation
* Sync package-lock.json
* Linting and formatting
* Sync with package-lock
* Sync with package-lock pt 2
* Update documentation in https://docs.ray.io/en/master/memory-management.html
* Add include_memory_info as argument for node_stats
* Switch object ref and call site positions
* Linting and formatting
* Change from MiB to B
* Change from stats-only to store-true
* Add memory test case
* Add memory test case
* Lint and format
* Correct test in memstat
* Change line wrap and stats only to flags
* Clarify --stats-only and --no-format in ray memory
* --stats-only description modified
Co-authored-by: Micah Yong <micahyong@Micahs-MacBook-Pro.local>
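The terminal-size items above (adjusting `ray memory` output to the terminal and querying the size through shutil) can be sketched roughly as follows. This is a minimal illustration, not the code from this change: `DEFAULT_WIDTH`, `get_line_width`, and `fit_to_width` are hypothetical names, and only `shutil.get_terminal_size` is a real standard-library call.

```python
import shutil

# Hypothetical fallback width, used when no terminal is attached (e.g. in CI).
DEFAULT_WIDTH = 100


def get_line_width(default: int = DEFAULT_WIDTH) -> int:
    """Return the terminal width in columns, or `default` if it is unknown."""
    size = shutil.get_terminal_size(fallback=(default, 24))
    # Some environments report a width of 0; use the default in that case too.
    return size.columns if size.columns > 0 else default


def fit_to_width(text: str, width: int) -> str:
    """Truncate `text` to at most `width` characters, marking the cut with '...'."""
    if len(text) <= width:
        return text
    if width <= 3:
        return text[:width]
    return text[: width - 3] + "..."
```

A caller could pass `get_line_width()` as the width when truncating long fields such as call sites, so rows stay on one line in narrow terminals.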
* Dashboard port selection; fix dashboard hang on exit
* Add test case
* Fix
* Fix test_stats_collector.py::test_get_all_node_details
* Refine dashboard error messages
* Refine code
* Refine code
* Show last 10 lines of dashboard log if the dashboard fails to start
* Fix ValueError: too many values to unpack (expected 2) when calling getsockname
* Fix test_multi_node_3.py::test_calling_start_ray_head, which may fail
* Fix Windows CI
* Disable dashboard in C++ test
* Refine code
* Fix issue 7084
Co-authored-by: 刘宝 <po.lb@antfin.com>
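The getsockname ValueError mentioned above is the kind of error that appears when a socket address has more than two fields: in Python, `getsockname()` returns `(host, port)` for AF_INET sockets but `(host, port, flowinfo, scope_id)` for AF_INET6 sockets. A minimal sketch of a tolerant unpacking, assuming that is the cause here; `bound_host_port` and `pick_free_port` are hypothetical helpers, not the dashboard's actual code:

```python
import socket


def bound_host_port(sock: socket.socket):
    """Return (host, port) for a bound socket, for both IPv4 and IPv6."""
    # AF_INET yields (host, port); AF_INET6 yields
    # (host, port, flowinfo, scope_id). Slicing keeps the first two fields.
    host, port = sock.getsockname()[:2]
    return host, port


def pick_free_port(host: str = "127.0.0.1"):
    """Bind to port 0 so the OS picks a free port, then report it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return bound_host_port(sock)
```

Note that the port returned by `pick_free_port` is released when the socket closes, so another process could claim it before it is reused.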
* prepare for head node
* move command runner interface outside _private
* remove space
* Eric
* flake
* min_workers in multi node type
* fixing edge cases
* eric not idle
* fix target_workers to consider min_workers of node types
* idle timeout
* minor
* minor fix
* test
* lint
* eric v2
* eric 3
* min_workers constraint before bin packing
* Update resource_demand_scheduler.py
* Revert "Update resource_demand_scheduler.py"
This reverts commit 818a63a2c86d8437b3ef21c5035d701c1d1127b5.
* reducing diff
* make get_nodes_to_launch return a dict
* merge
* weird merge fix
* auto fill instance types for AWS
* Alex/Eric
* Update doc/source/cluster/autoscaling.rst
* merge autofill and input from user
* logger.exception
* make the yaml use the default autofill
* docs Eric
* remove test_autoscaler_yaml from windows tests
* let's try changing the test a bit
* return test
* let's see
* edward
* Limit max launch concurrency
* commenting frac TODO
* move to resource demand scheduler
* use STATUS UP TO DATE
* Eric
* make logger of gc freed refs debug instead of info
* add cluster name to docker mount prefix directory
* grrR
* fix tests
* moving docker directory to sdk
* move the import to prevent circular dependency
* small fix
* ian
* fix max launch concurrency bug to assume failing nodes as pending and consider only load_metric's connected nodes as running
* small fix
* deflake test_joblib
* lint
* placement groups bypass
* remove space
* Eric
* first commit
* lint
* example
* documentation
* hmm
* file path fix
* fix test
* some format issue in docs
* modified docs
* joblib strikes again on windows
* add ability to not start autoscaler/monitor
* a
* remove worker_default
* Remove default pod type from operator
* Remove worker_default_node_type from rewrite_legacy_yaml_to_available_node_types
* deprecate useless fields
Co-authored-by: Ameer Haj Ali <ameerhajali@ameers-mbp.lan>
Co-authored-by: Alex Wu <alex@anyscale.io>
Co-authored-by: Alex Wu <itswu.alex@gmail.com>
Co-authored-by: Eric Liang <ekhliang@gmail.com>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
Co-authored-by: root <root@ip-172-31-56-188.us-west-2.compute.internal>
Co-authored-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>
* Fix bug where None was passed as the empty value for ActorInfo.gpu_stats instead of an empty list
* lint
* dashboard/modules/logical_view
* fix test
* trigger build
* In Progress.
* Done.
* Fix the issue.
* Add wait for condition because logs are not written right away now.
* debug string.
* lint.
* Fix flaky test.
* Fix issues.
* Fix test.
* lint.
* Fix duplicate node total rows in dashboard by changing the react key of the NodeTotalRow component from the node IP to the node ID (node IP can be duplicated in the case of docker).
* simplify a piece of test code and fix a flaky time out
* lint