Adds a Population-Based Training scheduler (as described in https://arxiv.org/abs/1711.09846) to Ray.tune. It currently mutates hyperparameters either by sampling from a user-defined list of possible values (necessary when a hyperparameter can only take certain values, e.g. sgd_batch_size) or by perturbing the current value by a factor of 0.8 or 1.2.
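For illustration, the mutation rule can be sketched as below; this is not the scheduler's actual code, and the function and argument names are made up for the example.

```python
import random

def perturb(config, hyperparam_mutations):
    """Illustrative sketch of the mutation rule described above.

    `hyperparam_mutations` maps a hyperparameter name to the list of values
    it is allowed to take; other numeric hyperparameters are perturbed by a
    factor of 0.8 or 1.2.
    """
    new_config = dict(config)
    for key, value in config.items():
        if key in hyperparam_mutations:
            new_config[key] = random.choice(hyperparam_mutations[key])
        elif isinstance(value, (int, float)):
            new_config[key] = value * random.choice([0.8, 1.2])
    return new_config

# e.g. sgd_batch_size may only take one of the listed values, while lr is
# perturbed by 0.8x or 1.2x:
perturb({"sgd_batch_size": 128, "lr": 1e-3},
        {"sgd_batch_size": [64, 128, 256]})
```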
* Bring cloudpickle version 0.5.2 inside the repo.
* Use internal copy of cloudpickle everywhere.
* Fix linting.
* Import ordering.
* Change __init__.py.
* Set pickler in serialization context.
* Don't check ray location.
Remove the RLlib dependency: Trainable is now a standalone abstract class that can be easily subclassed.
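For illustration, subclassing might look roughly like the sketch below; the import path and hook names are assumptions made for the example, not necessarily the exact interface introduced here.

```python
from ray.tune.trainable import Trainable  # import path is an assumption


class MyTrainable(Trainable):
    # Hook names are illustrative assumptions about the abstract interface,
    # not necessarily the exact method names.
    def _train(self):
        # Run one iteration of training and return a result dict.
        return {"episode_reward_mean": 0.0, "timesteps_this_iter": 1}

    def _save(self, checkpoint_dir):
        # Persist state under checkpoint_dir and return the checkpoint path.
        return checkpoint_dir

    def _restore(self, checkpoint_path):
        # Restore state from a previously saved checkpoint.
        pass
```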
Clean up hyperband: fix debug string and add an example.
Remove the YAML API / ScriptRunner: this was never really used.
Move ray.init() out of run_experiments(): this provides greater flexibility and avoids confusion, since there is no longer an implicit init() inside run_experiments(). Note that this is a breaking API change for Tune.
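A minimal sketch of the new calling pattern (the experiment spec below is illustrative, not taken from this change):

```python
import ray
from ray.tune import run_experiments

# run_experiments() no longer calls ray.init() implicitly, so the caller
# must initialize Ray first (possibly with custom arguments).
ray.init()

run_experiments({
    "my_experiment": {           # experiment name (illustrative)
        "run": "my_trainable",   # a registered trainable (illustrative)
        "config": {"lr": 0.01},
    },
})
```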
* Add failing unit test for nondeterministic reconstruction
* Retry scheduling actor tasks if reassigned to local scheduler
* Update execution edges asynchronously upon dispatch for nondeterministic reconstruction
* Fix bug for updating checkpoint task execution dependencies
* Update comments for deterministic reconstruction
* Cleanup
* Add (and skip) failing test case for nondeterministic reconstruction
* Suppress test output
* working multi action distribution and multiagent model
* currently working but the splits aren't done in the right place
* added shared models
* added categorical support and mountain car example
* now compatible with generalized advantage estimation
* working multiagent code with discrete and continuous example
* moved reshaper to utils
* code review changes made, ppo action placeholder moved to model catalog, all multiagent code moved out of fcnet
* added examples in
* added PEP8 compliance
* examples are mostly PEP8 compliant
* removed all flake errors
* added examples to jenkins tests
* fixed custom options bug
* added lines to let the Docker file find multiagent tests
* shortened example run length
* corrected nits
* fixed flake errors
* some autoscaling config tweaks
* increase backoff
* check boto version
* wip
* log video
* video doesn't work well
* scenario integration
* fix train
* wip models
* reward bonus
* test prep
* fix train
* kill
* add tuple preprocessor
* switch to euclidean dists
* fix env path
* merge Richard's fix
* fix hash
* simplified reward function
* add framestack
* add env configs
* simplify speed reward
* add lane keeping simple mode
* ppo lane keep
* simplify discrete actions
* update dqn reward
* Update train_dqn.py
* fix
* docs
* Update README.rst
* comments
This adds (experimental) auto-scaling support for Ray clusters based on GCS load metrics. The auto-scaling algorithm is as follows:

* Based on the current (instantaneous) load information, we compute the approximate number of "used workers". This is based on the bottleneck resource: e.g., if 8/8 GPUs are used in an 8-node cluster but all the CPUs are idle, the number of used nodes is still counted as 8. This number can also be fractional.
* We scale that number by 1 / target_utilization_fraction and round up to determine the target cluster size (subject to the max_workers constraint), as sketched below. The autoscaler control loop takes care of launching new nodes until the target cluster size is met.
* When a node has been idle for more than idle_timeout_minutes, we remove it from the cluster, provided that doing so would not drop the cluster size below min_workers.
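A minimal sketch of the scaling rule above (not the actual autoscaler code; the function name and arguments are made up for the example):

```python
import math

def target_cluster_size(used_workers, target_utilization_fraction, max_workers):
    # Scale the bottleneck-based usage estimate by
    # 1 / target_utilization_fraction, round up, and cap at max_workers.
    return min(max_workers,
               int(math.ceil(used_workers / target_utilization_fraction)))

# e.g. 8.0 used workers at a target utilization of 0.8 -> 10 workers
target_cluster_size(8.0, 0.8, max_workers=20)  # -> 10
```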
Note that we'll need to update the wheel in the example yaml file after this PR is merged.