* Update ActorState in dashboard to support new actor states
* Update dashboard documentation for new states
* Add missing state to doc
Co-authored-by: Max Fitton <max@semprehealth.com>
* Add `log_to_file` parameter, pass to Trainable config, redirect stdout/stderr.
* Add logging handler to root ray logger
* Added test for `log_to_file` parameter
* Added logs, reuse test
* Revert debug change
* Update logdir on reset, flush streams after each train() step
* Remove magic keys from visible config
Co-authored-by: Kai Fricke <kai@anyscale.com>
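For illustration only, the stdout/stderr redirection and per-step flushing described above could look roughly like the sketch below. This is not the Tune implementation: `run_with_log_files`, `step_fn`, and the `stdout`/`stderr` file names are hypothetical and stand in for whatever `log_to_file` actually configures.

```python
import contextlib
import os


def run_with_log_files(step_fn, logdir, num_steps=3):
    """Run step_fn repeatedly with stdout/stderr redirected to files in logdir.

    Hypothetical helper: the name, arguments, and file names are assumptions.
    """
    stdout_path = os.path.join(logdir, "stdout")
    stderr_path = os.path.join(logdir, "stderr")
    with open(stdout_path, "a") as out, open(stderr_path, "a") as err:
        with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
            for i in range(num_steps):
                step_fn(i)
                # Flush after every step so the log files stay current
                # even if the process is interrupted mid-run.
                out.flush()
                err.flush()
    return stdout_path, stderr_path
```

Opening the files in append mode keeps earlier output if the same log directory is reused.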
* on prem server first commit
* minor fix
* verify error on autoscaling in on prem mode
* lint
* lint
* Tests complete
* add tests to check for backward compatibility
* Fixing comments and autoscaling
* minor fixes
* coordinating server mode
* tests
* lint
* remove unnecessary import
* Resolving Comments
* separating coordinator and local node provider
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
* python_test: fix cython_examples in doc/ and tests/
* update setup.py to parse the bazel version string better
* all: centralize all python deps into stackable requirements files in python/
* format
* Move cython test into the proper package
* Add cross-reference dependency comments for requirements and setup.py
* re-enable version pinning on CI, fix formatting
* fix up torchvision version
* fix case in shell
* Separate out file_mounts contents hashing into its own separate hash
Add an option to continuously sync file_mounts from head node to worker nodes:
monitor.py will re-sync file mounts whenever their contents change, but will only run setup_commands if the config also changes
* add test and default value for file_mounts_sync_continuously
* format code
* Update comments
* Add param to skip setup commands when only file_mounts content changed during monitor.py's update tick
Fixed so that setup commands run when `ray up` is run and file_mounts content changes
* Refactor so that runtime_hash retains previous behavior
runtime_hash is almost identical to what it was before this PR. It is used to determine whether setup_commands need to run
file_mounts_contents_hash is an additional hash of the file_mounts content that is used to detect when only file syncing has to occur.
Note: runtime_hash value will have changed from before the PR because we hash the hash of the contents of the file_mounts as a performance optimization
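A minimal sketch of how two separate hashes can drive the update decision, assuming hypothetical helpers `hash_file_mounts_contents` and `decide_update` (neither name is taken from the autoscaler code). It also assumes `file_mounts` maps remote paths to local paths and that the contents hash is `None` when it was not generated. How runtime_hash itself is computed is left out on purpose.

```python
import hashlib
import os


def hash_file_mounts_contents(file_mounts):
    """Hash the contents of every local path in file_mounts (hypothetical helper)."""
    hasher = hashlib.sha1()
    for remote_path, local_path in sorted(file_mounts.items()):
        hasher.update(remote_path.encode("utf-8"))
        if os.path.isdir(local_path):
            for dirpath, dirnames, filenames in os.walk(local_path):
                dirnames.sort()  # make the traversal order deterministic
                for name in sorted(filenames):
                    full_path = os.path.join(dirpath, name)
                    rel_path = os.path.relpath(full_path, local_path)
                    hasher.update(rel_path.encode("utf-8"))
                    with open(full_path, "rb") as f:
                        hasher.update(f.read())
        else:
            with open(local_path, "rb") as f:
                hasher.update(f.read())
    return hasher.hexdigest()


def decide_update(node_runtime_hash, node_contents_hash,
                  runtime_hash, contents_hash):
    """Pick an action by comparing a node's stored hashes with the current ones."""
    if node_runtime_hash != runtime_hash:
        # Config changed: sync files and run setup commands.
        return "sync_and_setup"
    if contents_hash is not None and node_contents_hash != contents_hash:
        # Only file contents changed: sync files, skip setup commands.
        return "sync_only"
    return "up_to_date"
```

Treating a missing contents hash as "nothing to compare" means a node is not re-synced on every tick when continuous syncing is off.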
* fix issue with hashing a hash
* fix bug that tried to set the contents hash when it wasn't generated
* Fix lint error
Fix bug in command_runner where check_output was no longer returning the output of the command
* clear out provider between tests to get rid of flakiness
* reduce chance of race condition from node_launcher launching a node in the middle of an autoscaler.update call