* Convert multi_node_test.py to pytest.
* Convert array_test.py to pytest.
* Convert failure_test.py to pytest.
* Convert microbenchmarks to pytest.
* Convert component_failures_test.py to pytest and some minor quotes changes.
* Convert tensorflow_test.py to pytest.
* Convert actor_test.py to pytest.
* Fix.
* Fix
* Added checkpoint_at_end option, to fix #2740
* Added ability to checkpoint at the end of trials if the option is set to True
* checkpoint_at_end option added; consistent with Experiment and TrialRunner
* checkpoint_at_end option mentioned in the tune usage guide
* Moved the redundant checkpoint criteria check out of the if-elif
* Added note that checkpoint_at_end is enabled only when checkpoint_freq is not 0
* Added test case for checkpoint_at_end
* Made checkpoint_at_end have an effect regardless of checkpoint_freq
* Removed comment from the test case
* Fixed the indentation
* Fixed pep8 E231
* Handled cases when trainable does not have _save implemented
* Constrained test case to a particular exp using the MockAgent
* Revert "Constrained test case to a particular exp using the MockAgent"
This reverts commit e965a9358ec7859b99a3aabb681286d6ba3c3906.
* Revert "Handled cases when trainable does not have _save implemented"
This reverts commit 0f5382f996ff0cbf3d054742db866c33494d173a.
* Simpler test case for checkpoint_at_end
* Prevented bools from losing their actual value
* Revert "Moved the redundant checkpoint criteria check out of the if-elif"
This reverts commit 783005122902240b0ee177e9e206e397356af9c5.
* Fix linting error.
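For illustration, a minimal sketch of how the checkpoint_at_end option from the commits above could be used, assuming the `tune.run_experiments` experiment-spec API of this period; the experiment name, trainable, and stopping condition are placeholders.

```python
import ray
from ray import tune

ray.init()
tune.run_experiments({
    "checkpoint_at_end_demo": {
        "run": "PPO",
        "stop": {"training_iteration": 10},
        "checkpoint_freq": 0,        # no periodic checkpoints,
        "checkpoint_at_end": True,   # but still checkpoint once when the trial ends
    },
})
```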
* Limit number of concurrent workers started by hardware concurrency.
* Check if std::thread::hardware_concurrency() returns 0.
* Pass in max concurrency from Python.
* Fix Java call to startRaylet.
* Fix typo
* Remove unnecessary cast.
* Fix linting.
* Cleanups on Java side.
* Comment back in actor test.
* Require maximum_startup_concurrency to be at least 1.
* Fix linting and test.
* Improve documentation.
* Fix typo.
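For illustration, a hedged Python-side sketch of the idea in the commits above: derive the maximum startup concurrency from the detected hardware concurrency, handle the case where detection reports 0, and require the result to be at least 1. The function name is illustrative, not the actual Ray internal.

```python
import multiprocessing

def maximum_startup_concurrency(requested=None):
    """Illustrative only: how many workers the raylet may start at once."""
    try:
        detected = multiprocessing.cpu_count()
    except NotImplementedError:
        # Mirrors std::thread::hardware_concurrency() returning 0 when unknown.
        detected = 0
    value = requested if requested is not None else detected
    # Require the concurrency to be at least 1, as described above.
    return max(1, value)
```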
## What do these changes do?
* distribute load and resource information on a heartbeat
* for each raylet, maintain total and available resource capacity as well as measure of current load
* this PR introduces a new notion of load, defined as the sum of all resource demand induced by queued ready tasks on the local raylet (see the sketch after this list). This provides a heterogeneity-aware measure of load that supersedes legacy Ray's task count as a proxy for load.
* modify the scheduling policy to perform *capacity-based*, *load-aware*, *optimistically concurrent* resource allocation
* perform task spillover to the heartbeating node in response to a heartbeat, implementing heterogeneity-aware late-binding/work-stealing.
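A small Python sketch of the load notion described above (the real implementation lives in the C++ raylet); the data layout is illustrative only.

```python
from collections import defaultdict

def local_load(queued_ready_tasks):
    """Sum per-resource demand over queued ready tasks (illustrative only)."""
    load = defaultdict(float)
    for demand in queued_ready_tasks:          # e.g. {"CPU": 2, "GPU": 1}
        for resource, amount in demand.items():
            load[resource] += amount
    return dict(load)

# local_load([{"CPU": 1}, {"CPU": 2, "GPU": 1}]) -> {"CPU": 3.0, "GPU": 1.0}
```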
The function get_node_ip_address will catch an exception and return '127.0.0.1' when the external network is forbidden. Instead, we can get the IP address from the hostname (sketched below).
https://github.com/ray-project/ray/issues/2721
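A hedged sketch of that fallback using only the standard library; this is not necessarily the exact code added to Ray.

```python
import socket

def node_ip_address():
    """Illustrative fallback: derive the IP from the hostname instead of
    silently returning '127.0.0.1' when the external network is blocked."""
    try:
        # Usual trick: connect a UDP socket toward an external address and
        # read the local endpoint; this raises when there is no route out.
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.connect(("8.8.8.8", 80))
        ip = s.getsockname()[0]
        s.close()
        return ip
    except socket.error:
        return socket.gethostbyname(socket.gethostname())
```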
## What do these changes do?
1. Separate the log related code to logger.py from services.py.
2. Allow users to modify the logging formatter in `ray start` (a sketch follows below).
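A minimal sketch of what the separated logger setup might look like, with a user-overridable format string; the function name and wiring are assumptions, not necessarily what this PR implements.

```python
# logger.py (illustrative): configure Ray's logging with a custom formatter.
import logging

def setup_logger(logging_level=logging.INFO,
                 logging_format="%(asctime)s %(levelname)s %(name)s: %(message)s"):
    logger = logging.getLogger("ray")
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(logging_format))
    logger.addHandler(handler)
    logger.setLevel(logging_level)

# A `ray start` flag such as --logging-format could forward its value here.
```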
## Related issue number
https://github.com/ray-project/ray/pull/2664
* add noisy network
* distributional q-learning in dev
* add distributional q-learning
* validated rainbow module
* add some comments
* supply some comments
* remove redundant argument to pass CI test
* async replay optimizer does NOT need annealing beta
* ignore rainbow specific arguments for DDPG and Apex
* formatted by yapf
* Update dqn_policy_graph.py
* Update dqn_policy_graph.py
* added ars
* functioning ars with regression test
* added regression tests for ARS
* fixed default config for ARS
* ARS code runs, now time to test
* ARS working and tested, changed std deviation of meanstd filter to initialize to 1
* ARS working and tested, changed std deviation of meanstd filter to initialize to 1
* pep8 fixes
* removed unused linear model
* address comments
* addressed more comments
* post yapf
* fixed support failure
* Update LICENSE
* Update policies.py
* Update test_supported_spaces.py
* Update policies.py
* Update LICENSE
* Update test_supported_spaces.py
* Update policies.py
* Update policies.py
* Update filter.py
* + Compatibility fix under py2 on ray.tune
* + Revert changes on master branch
* + Use default JsonEncoder in ray.tune.logger
* + Add UT for infinity support
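For context, a hedged sketch of the behavior likely being tested: Python's default JSON encoder already serializes infinities as the non-standard `Infinity`/`-Infinity` literals, so no custom encoder is required for them.

```python
import json

# The default encoder already emits the non-standard Infinity literals.
print(json.dumps({"score": float("inf"), "loss": float("-inf")}))
# -> {"score": Infinity, "loss": -Infinity}
```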
A bunch of minor rllib fixes:
* pull in latest baselines atari wrapper changes (and use deepmind wrapper by default)
* move reward clipping to policy evaluator
* add a2c variant of a3c
* reduce vision network fc layer size to 256 units
* switch to 84x84 images
* doc tweaks
* print timesteps in tune status
This PR makes it so that when Ray is started via ray.init() (as opposed to via ray start) the Redis servers will be started in "protected mode" (which means that clients can only connect by connecting to localhost).
In practice, we actually connect redis clients by passing in the node IP address (not localhost), so I need to create a redis config file on the fly to allow both localhost and the node's actual IP address (it would have been nice to find a way to do this from the Python redis client, but I couldn't find one).
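A hedged sketch of generating such a config file on the fly; the exact directives Ray writes may differ.

```python
import subprocess
import tempfile

def start_redis_protected(node_ip_address, port=6379):
    """Illustrative: bind Redis to localhost plus the node's own IP so that
    local clients connecting via the node IP still work in protected mode."""
    conf = tempfile.NamedTemporaryFile(mode="w", suffix=".conf", delete=False)
    conf.write("bind 127.0.0.1 {}\n".format(node_ip_address))
    conf.write("protected-mode yes\n")
    conf.write("port {}\n".format(port))
    conf.close()
    return subprocess.Popen(["redis-server", conf.name])
```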
This adds some experimental (undocumented) support for launching Ray on existing nodes. You have to provide the head ip, and the list of worker ips.
There are also a couple of additional utils added for rsyncing files and port forwarding.
This PR introduces the following changes:
* Ray Tune -> Tune
* [breaking] Creation of `schedulers/`, moving PBT, HyperBand into a submodule
* [breaking] Search Algorithms now must take in experiment configurations via `add_configurations` rather than through initialization (see the sketch below)
* Support `"run": (function | class | str)` with automatic registering of trainable
* Documentation Changes
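A hedged sketch of two of the points above, i.e. passing a plain function as `"run"` and feeding experiment configurations to a search algorithm via `add_configurations`; the names and wiring are assumptions about the Tune API of this era rather than a verified example.

```python
from ray import tune

def my_trainable(config, reporter):
    # A plain function given as "run" is registered as a trainable automatically.
    reporter(episode_reward_mean=config["lr"] * 100, done=True)

experiments = {
    "demo": {
        "run": my_trainable,                 # function | class | str all work
        "config": {"lr": 0.01},
        "stop": {"training_iteration": 1},
    },
}

# Search algorithms now receive experiment configurations through
# add_configurations(experiments) instead of their constructors.
tune.run_experiments(experiments)
```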
The goal of this PR is to allow custom policies to perform model-based rollouts. In the multi-agent setting, this requires access to not only policies of other agents, but also their current observations.
Also, you might want to return the model-based trajectories as part of the rollout for efficiency.
* compute_actions() now takes a new keyword arg `episodes` (sketched below)
* pull out internal episode class into a top-level file
* add function to return extra trajectories from an episode that will be appended to the sample batch
* documentation
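A hedged sketch of a custom policy graph that uses the new `episodes` keyword argument; the base class and signature are abbreviated and may not match RLlib exactly at this point in time.

```python
from ray.rllib.evaluation.policy_graph import PolicyGraph

class ModelBasedPolicy(PolicyGraph):
    def compute_actions(self, obs_batch, state_batches, episodes=None, **kwargs):
        actions = []
        for i, obs in enumerate(obs_batch):
            episode = episodes[i] if episodes else None
            # The episode object exposes the other agents' policies and their
            # latest observations, which is what model-based rollouts need.
            actions.append(self._plan(obs, episode))
        return actions, [], {}

    def _plan(self, obs, episode):
        # Placeholder for a model-based rollout using the episode state.
        return 0
```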
ray exec CLUSTER CMD [--screen] [--start] [--stop]
ray attach CLUSTER [--start]
Example:
ray exec sgd.yaml 'source activate tensorflow_p27 && cd ~/ray/python/ray/rllib && ./train.py --run=PPO --env=CartPole-v0' --screen --start --stop
This will, in one command, create a cluster and run the given command on it in a screen session. The screen can later be attached to via `ray attach`. After the command finishes, the cluster workers will be terminated and the head node stopped.
to support TF version < 1.5
to support rmsprop optimizer in Impala
Before TF 1.5, tf.reduce_sum() and tf.reduce_max() have an argument keep_dims, which has been renamed to keepdims in later versions.
In the original IMPALA paper, they use the RMSProp algorithm to optimize the model. We should also support it so that users can reproduce their experiments. Without any tuning, i.e., using the same hyperparameters as AdamOptimizer, it reaches "episode_reward_mean": 19.083333333333332 in Pong after consuming 3,610,350 samples.
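A hedged sketch of the kind of compatibility shim the keep_dims/keepdims note implies; the wrapper name is illustrative.

```python
from distutils.version import LooseVersion
import tensorflow as tf

# Pick the right keyword name depending on the installed TF version.
if LooseVersion(tf.__version__) < LooseVersion("1.5.0"):
    def reduce_sum(x, axis=None, keepdims=False):
        return tf.reduce_sum(x, axis=axis, keep_dims=keepdims)
else:
    def reduce_sum(x, axis=None, keepdims=False):
        return tf.reduce_sum(x, axis=axis, keepdims=keepdims)
```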
This PR adds a driver table for the new GCS, which enables cleanup functionality associated with monitoring driver death.
Some testing in `monitor_test.py` is restored, but redis sharding for xray is needed to enable remaining tests.