Cleanup: TFPolicyGraph now automatically adds loss input entries for state_in_*, so that graph sub-classes don't need to worry about it.
Multi-GPU support:
Allow setting up model tower replicas with existing state input tensors
Truncate the per-device minibatch slices so that they are always a multiple of max_seq_len.
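As a rough illustration of the truncation rule above, here is a minimal sketch. The function name and arguments are hypothetical and are not taken from the RLlib source; it only shows the rounding arithmetic.

```python
def truncated_slice_size(batch_size, num_devices, max_seq_len):
    """Per-device slice size, rounded down to a multiple of max_seq_len.

    Hypothetical helper for illustration only.
    """
    # Split the batch evenly across devices first.
    per_device = batch_size // num_devices
    # Drop the remainder so that no sequence is cut in the middle.
    return (per_device // max_seq_len) * max_seq_len


# 1000 rows over 4 devices gives 250 per device, truncated to 240
# so that each slice holds exactly 12 sequences of length 20.
slice_size = truncated_slice_size(1000, 4, 20)
```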
* Fix one of the stress tests, fix ray.global_state.client_table when called early on.
* Re-enable testWait.
* Convert stress_tests.py to pytest.
* Fix
* Add profile table and store profiling information there.
* Code for dumping timeline.
* Improve color scheme.
* Push timeline events on driver only for raylet.
* Improvements to profiling and timeline visualization
* Some linting
* Small fix.
* Linting
* Propagate node IP address through profiling events.
* Fix test.
* object_id.hex() should return byte string in python 2.
* Include gcs.fbs in node_manager.fbs.
* Remove flatbuffer definition duplication.
* Decode to unicode in Python 3 and bytes in Python 2.
* Minor
* Submit profile events in a batch. Revert some CMake changes.
* Fix
* Workaround test failure.
* Fix linting
* Linting
* Don't return anything from chrome_tracing_dump when filename is provided.
* Remove some redundancy from profile table.
* Linting
* Move TODOs out of docstring.
* Minor
Specifically, subtracts 1 from the target number of workers, taking into
account that the head node has some computational resources.
Do not kill an idle node if it would drop us below the target number of
nodes (in which case we just immediately relaunch).
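The two rules above can be sketched roughly as follows. This is an illustrative sketch, not the autoscaler's actual code: the function names, the `ideal_num_nodes` input, and the clamping to `min_workers`/`max_workers` are assumptions.

```python
def target_num_workers(ideal_num_nodes, min_workers, max_workers):
    """Hypothetical: turn an ideal node count into a worker target."""
    # The head node contributes resources, so it counts as one node.
    ideal_workers = ideal_num_nodes - 1
    return max(min_workers, min(max_workers, ideal_workers))


def idle_nodes_to_terminate(idle_nodes, num_workers, target):
    """Hypothetical: pick idle nodes to kill without going below target."""
    to_terminate = []
    remaining = num_workers
    for node in idle_nodes:
        # Killing this node would only force an immediate relaunch.
        if remaining - 1 < target:
            break
        to_terminate.append(node)
        remaining -= 1
    return to_terminate


target = target_num_workers(ideal_num_nodes=5, min_workers=0, max_workers=10)
victims = idle_nodes_to_terminate(["a", "b", "c"], num_workers=5, target=target)
```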
* Fix documentation indentation.
* Add error table to GCS and push error messages through node manager.
* Add type to error data.
* Linting
* Fix failure_test bug.
* Linting.
* Enable one more test.
* Attempt to fix doc building.
* Restructuring
* Fixes
* More fixes.
* Move current_time_ms function into util.h.
* build_credis.sh: use an up-to-date credis commit.
* build_credis.sh: leveldb is updated, so update build cmds for it
* WIP: make monitor.py issue flush; switch gcs client to use credis
* Experimental: enable automatic GCS flushing with configurable policy.
* Fix linux compilation error
* Fix leveldb build
* Use optimized build for credis
* Address comments
* Attempt to fix tests
## What do these changes do?
**Vectorized envs**: Users can either implement `VectorEnv`, or alternatively set `num_envs=N` to auto-vectorize gym envs (this vectorizes just the action computation part).
```
# CartPole-v0 on single core with 64x64 MLP:
# vector_width=1:
Actions per second 2720.1284458322966
# vector_width=8:
Actions per second 13773.035334888269
# vector_width=64:
Actions per second 37903.20472563333
```
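To make the idea of auto-vectorization concrete, here is a minimal sketch that wraps N copies of an env so that one batched policy call produces actions for all of them. `CountingEnv`, `SimpleVectorEnv`, and `batched_policy` are hypothetical stand-ins for illustration, not the RLlib classes.

```python
class CountingEnv:
    """Toy stand-in for a gym env: the observation is the step count."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 3
        return self.t, float(action), done, {}


class SimpleVectorEnv:
    """Hypothetical wrapper that steps several env copies together."""

    def __init__(self, make_env, num_envs):
        self.envs = [make_env() for _ in range(num_envs)]

    def vector_reset(self):
        return [env.reset() for env in self.envs]

    def vector_step(self, actions):
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        # Transpose the list of (obs, reward, done, info) tuples.
        obs, rewards, dones, infos = (list(x) for x in zip(*results))
        return obs, rewards, dones, infos


def batched_policy(obs_batch):
    # One call computes the actions for every env in the batch.
    return [o % 2 for o in obs_batch]


venv = SimpleVectorEnv(CountingEnv, num_envs=4)
obs = venv.vector_reset()
actions = batched_policy(obs)
obs, rewards, dones, infos = venv.vector_step(actions)
```

Only the action computation is batched here; the envs themselves are still stepped one at a time, which matches the description above.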
**Async envs**: The more general form of `VectorEnv` is `AsyncVectorEnv`, which allows agents to execute out of lockstep. We use this as an adapter to support `ServingEnv`. Since we can convert any other form of env to `AsyncVectorEnv`, utils.sampler has been rewritten to run against this interface.
**Policy serving**: This provides an env which is not stepped. Rather, the env executes in its own thread, querying the policy for actions via `self.get_action(obs)`, and reporting results via `self.log_returns(rewards)`. We also support logging of off-policy actions via `self.log_action(obs, action)`. This is a more convenient API for some use cases, and also provides parallelizable support for policy serving (for example, if you start an HTTP server in the env) and ingestion of offline logs (if the env reads from serving logs).
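The control flow can be sketched with a toy env that runs in its own thread and asks for actions instead of being stepped. `ToyServingEnv`, its queues, and the `serve` loop are assumptions made for illustration; only the method names `get_action` and `log_returns` come from the description above.

```python
import queue
import threading


class ToyServingEnv(threading.Thread):
    """Hypothetical env that drives itself and queries a policy."""

    def __init__(self):
        super().__init__(daemon=True)
        self.obs_queue = queue.Queue()
        self.action_queue = queue.Queue()
        self.returns = []

    def get_action(self, obs):
        # Hand the observation to the policy side and wait for a reply.
        self.obs_queue.put(obs)
        return self.action_queue.get()

    def log_returns(self, reward):
        self.returns.append(reward)

    def run(self):
        # The env decides when to ask for actions; nobody calls step().
        for obs in range(3):
            action = self.get_action(obs)
            self.log_returns(float(action))


def serve(env, policy, num_requests):
    """Answer a fixed number of action requests coming from the env."""
    for _ in range(num_requests):
        obs = env.obs_queue.get()
        env.action_queue.put(policy(obs))


env = ToyServingEnv()
env.start()
serve(env, policy=lambda obs: obs * 2, num_requests=3)
env.join()
```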
Any of these types of envs can be passed to RLlib agents. RLlib handles conversions internally in CommonPolicyEvaluator, for example:
```
gym.Env => rllib.VectorEnv => rllib.AsyncVectorEnv
rllib.ServingEnv => rllib.AsyncVectorEnv
```
* Print warning when defining very large remote function or actor.
* Add weak test.
* Check that warnings appear in test.
* Make wait_for_errors actually fail in failure_test.py.
* Use constants for error types.
* Fix
* Google Cloud Platform scaffolding
* Add minimal gcp config example
* Add googleapiclient discoveries, update gcp.config constants
* Rename and update gcp.config key pair name function
* Implement gcp.config._configure_project
* Fix the create project get project flow
* Implement gcp.config._configure_iam_role
* Implement service account iam binding
* Implement gcp.config._configure_key_pair
* Implement rsa key pair generation
* Implement gcp.config._configure_subnet
* Save work-in-progress gcp.config._configure_firewall_rules.
These are likely not needed at all. Saving them in case we need them
later.
* Remove unnecessary firewall configuration
* Update example-minimal.yaml configuration
* Add new wait_for_compute_operation, rename old wait_for_operation
* Temporarily rename autoscaler tags due to gcp incompatibility
* Implement initial gcp.node_provider.nodes
* Still missing filter support
* Implement initial gcp.node_provider.create_node
* Implement another compute wait
operation (wait_for_compute_zone_operation). TODO: figure out if we
can remove the function.
* Implement initial gcp.node_provider._node and node status functions
* Implement initial gcp.node_provider.terminate_node
* Implement node tagging and ip getter methods for nodes
* Temporarily rename tags due to gcp incompatibility
* Tiny tweaks for autoscaler.updater
* Remove unused config from gcp node_provider
* Add new example-full example to gcp, update load_gcp_example_config
* Implement label filtering for gcp.node_provider.nodes
* Revert unnecessary change in ssh command
* Revert "Temporarily rename tags due to gcp incompatibility"
This reverts commit e2fe634c5d11d705c0f5d3e76c80c37394bb23fb.
* Revert "Temporarily rename autoscaler tags due to gcp incompatibility"
This reverts commit c938ee435f4b75854a14e78242ad7f1d1ed8ad4b.
* Refactor autoscaler tagging to support multiple tag specs
* Remove missing cryptography imports
* Update quote function import
* Fix threading issue in gcp.config with the compute discovery object
* Add gcs support for log_sync
* Fix the labels/tags naming discrepancy
* Add expanduser to file_mounts hashing
* Fix gcp.node_provider.internal_ip
* Add uuid to node name
* Remove 'set -i' from updater ssh command
* Also add TODO with the context and reason for the change.
* Update ssh key creation in autoscaler.gcp.config
* Fix wait_for_compute_zone_operation's threading issue
The Google discovery API's compute object is not thread safe, and thus
needs to be recreated for each thread. This moves
`wait_for_compute_zone_operation` under `autoscaler.gcp.config`, and
adds compute as its argument.
* Address pr feedback from @ericl
* Expand local file mount paths in NodeUpdater
* Add ssh_user name to key names
* Update updater ssh to attempt 'set -i' and fall back if that fails
* Update gcp/example-full.yaml
* Fix wait crm operation in gcp.config
* Update gcp/example-minimal.yaml to match aws/example-minimal.yaml
* Fix gcp/example-full.yaml comment indentation
* Add gcp/example-full.yaml to setup files
* Update example-full.yaml command
* Revert "Refactor autoscaler tagging to support multiple tag specs"
This reverts commit 9cf48409ca2e5b66f800153853072c706fa502f6.
* Update tag spec to only use characters [0-9a-z_-]
* Change the tag values to conform to the gcp spec
* Add project_id in the ssh key name
* Replace '_' with '-' in autoscaler tag names
* Revert "Update updater ssh to attempt 'set -i' and fall back if that fails"
This reverts commit 23a0066c5254449e49746bd5e43b94b66f32bfb4.
* Revert "Remove 'set -i' from updater ssh command"
This reverts commit 5fa034cdf79fa7f8903691518c0d75699c630172.
* Add fallback to `set -i` in force_interactive command
* Update autoscaler tests to match current implementation
* Update GCPNodeProvider.create_node to include hash in instance name
* Add support for creating multiple instances on one create_node call
* Clean TODOs
* Update styles
* Replace single quotes with double quotes
* Some minor indentation fixes etc.
* Remove unnecessary comment. Fix indentation.
* Yapfify files that fail flake8 test
* Yapfify more files
* Update project_id handling in gcp node provider
* temporary yapf mod
* Revert "temporary yapf mod"
This reverts commit b6744e4e15d4d936d1a14f4bf155ed1d3bb14126.
* Fix autoscaler/updater.py lint error, remove unused variable
* Use F.softmax instead of a pointless network layer
Stateless functions should not be network layers.
* Use correct pytorch functions
* Rename argument name to out_size
Matches in_size and makes more sense.
* Fix shapes of tensors
Advantages and rewards both should be scalars, and therefore a list of them
should be 1D.
* Fmt
* replace deprecated function
* rm unnecessary Variable wrapper
* rm all use of torch Variables
Torch does this for us now.
* Ensure that values are flat list
* Fix shape error in conv nets
* fmt
* Fix shape errors
Reshaping the action before stepping in the env fixes a few errors.
* Add TODO
* Use correct filter size
Works when `self.config['model']['channel_major'] = True`.
* Add missing channel major
* Revert reshape of action
This should be handled by the agent or at least in a cleaner way that doesn't
break existing envs.
* Squeeze action
* Squeeze actions along first dimension
This should deal with some cases such as cartpole where actions are scalars
while leaving alone cases where actions are arrays (some robotics tasks).
* try adding pytorch tests
* typo
* fixup docker messages
* Fix A3C for some envs
Pendulum doesn't work since it's an edge case (expects singleton arrays, which
`.squeeze()` collapses to scalars).
* fmt
* nit flake
* small lint
* Implement global state API for xray.
* Fix object table.
* Fixes for log structure.
* Implement cluster_resources.
* Add driver task to task table.
* Remove python flatbuffers code
* Get some global state API tests running.
* Python linting.
* Fix linting.
* Fix mock modules for doc
* Copy over flatbuffer bindings.
* Fix for tests.
* Linting
* Fix monitor crash.
* Add flake8 to Travis
* Add flake8-comprehensions
[flake8 plugin](https://github.com/adamchainz/flake8-comprehensions) that
checks for useless constructions.
* Use generators instead of lists where appropriate
A lot of the builtins can take in generators instead of lists.
This commit applies `flake8-comprehensions` to find them.
* Fix lint error
* Fix some string formatting
The rest can be fixed in another PR
* Fix compound literals syntax
This should probably be merged after #1963.
* dict() -> {}
* Use dict literal syntax
dict(...) -> {...}
* Rewrite nested dicts
* Fix hanging indent
* Add missing import
* Add missing quote
* fmt
* Add missing whitespace
* rm duplicate pip install
This is already installed in another file.
* Fix indent
* move `merge_dicts` into utils
* Bring up to date with `master`
* Add automatic syntax upgrade
* rm pyupgrade
In case users still want to use it on their own, the upgrade-syn.sh script was
left in the `.travis` dir.
* Use pep8 style
The original style file is actually just pep8 style, but with everything
spelled out. It's easier to use the `based_on_style` feature. Any overrides are
clearer that way.
* Improve yapf script
1. Do formatting in parallel
2. Lint RLlib
3. Use .style.yapf file
* Pull out expressions into variables
* Don't format rllib
* Don't allow splits in dicts
* Apply yapf
* Disallow single line if-statements
* Use arithmetic comparison
* Simplify checking for changed files
* Pull out expr into var
* Run xray tests in travis.
* Comment out TaskTests.testSubmittingManyTasks.
* Comment out failing tests.
* Comment out hanging test.
* Linting
* Comment out failing test.
* Comment out failing test.
* Ignore test_dataframe.py for now.
* Comment out testDriverExitingQuickly.