Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer
Add AsyncSamplesOptimizer that implements the IMPALA architecture
integrate V-trace with a3c policy graph
audit V-trace integration
benchmark compare vs A3C and with V-trace on/off
PongNoFrameskip-v4 on IMPALA scaling from 16 to 128 workers, solving Pong in <10 min. For reference, solving this env takes ~40 minutes for Ape-X and several hours for A3C.
This also removes the async resetting code in VectorEnv. While that improves benchmark performance slightly, it substantially complicates env configuration and probably isn't worth it for most envs.
This makes it easy to efficiently support setups like Joint PPO: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/retro-contest/gotta_learn_fast_report.pdf
For example, for 188 envs, you could do something like num_envs: 10, num_envs_per_worker: 19.
This adds a simple DQN+PPO example for multi-agent. We don't do anything fancy here, just syncing weights between two separate trainers. This potentially is wasting some compute, but is very simple to set up.
It might be nice to share experience collection between the top-level trainers in the future.
* Ray documentation - created new section 'Profiling for Ray Users', opposed to current Profiling section for Ray developers. Completed three sections 'A Basic Profiling Example', 'Timing Performance Using Python's Timestamps', and 'Profiling Using An External Profiler (Line_Profiler).' Left to-do two sections on CProfile and Ray Timeline Visualization.'
* Ray documentation - Fixed rst codeblock linebreaks in 'User Profiling'
* Ray documentation - For User Profiling, added section on cProfile
* Ray documentation - For User Profiling, completed Ray Timeline Visualization section, including graphical images
* Ray documentation - made User Profiling timeline image larger, minor wording edits
* Ray documentation - minor wording edits to User Profiling
* Ray documentation - User Profiling- fixed broken link
* Minor wording changes requested by Philipp Moritz addressed. Still need to address (1) compressing the image files, (2) correcting ex 3 to not be remote, and (3) using cProfile on an actor
* Ray documentation - For user-profiling.rst, revised example 3 to show a semi-parallelized example. Compressed timeline example image to be under 50 KB, removed view timeline GUI image. Updated timeline example image to reflect revised example 3. cProfile actor example left
* Ray documentation - in user-profiling.rst, added a new example including actors in the cProfile section
* Ray documentation - For user-profiling.rst, added section header for the Ray actor cProfile example
* Update user-profiling.rst
* Update user-profiling.rst
* 4 space indentation
* Update user-profiling.rst
* Update user-profiling.rst
* Update user-profiling.rst
* corrections
* Add profile table and store profiling information there.
* Code for dumping timeline.
* Improve color scheme.
* Push timeline events on driver only for raylet.
* Improvements to profiling and timeline visualization
* Some linting
* Small fix.
* Linting
* Propagate node IP address through profiling events.
* Fix test.
* object_id.hex() should return byte string in python 2.
* Include gcs.fbs in node_manager.fbs.
* Remove flatbuffer definition duplication.
* Decode to unicode in Python 3 and bytes in Python 2.
* Minor
* Submit profile events in a batch. Revert some CMake changes.
* Fix
* Workaround test failure.
* Fix linting
* Linting
* Don't return anything from chrome_tracing_dump when filename is provided.
* Remove some redundancy from profile table.
* Linting
* Move TODOs out of docstring.
* Minor
* Fix documentation indentation.
* Add error table to GCS and push error messages through node manager.
* Add type to error data.
* Linting
* Fix failure_test bug.
* Linting.
* Enable one more test.
* Attempt to fix doc building.
* Restructuring
* Fixes
* More fixes.
* Move current_time_ms function into util.h.
* removed ddpg2
* removed ddpg2 from codebase
* added tests used in ddpg vs ddpg2 comparison
* added notes about training timesteps to yaml files
* removed ddpg2 yaml files
* removed unnecessary configs from yaml files
* removed unnecessary configs from yaml files
* moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples
* moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples
* added more configuration details to yaml files
* removed random starts from halfcheetah
* Allow yapf to lint individual files
* Add tip for using yapf
* Update doc
* Update script to autoformat changed py files
The new default is for the script to only updated changed files to encourage
using it as a pre-push hook. Travis still checks all since it's not that big an
increase to runtime.
* Exclude formatting thirdparty/autogen py files
* Symlink .travis -> scripts
Hidden directories may get glossed over otherwise.
* .travis -> scripts in docs
They are symlinks to the same thing, but `scripts` is more dev-friendly, while
`.travis` is really only for Travis CI.
* Document different yapf format functions
Most devs will only need `format_changed`, and this is run by default.
`format_changed` should be fast enough in most cases to work as a pre-commit
hook.
* Speed up yapf by only formatting changed files
* Update docs
1. Mention how yapf can be used a pre-commit hook
2. rm `bash`, script is executable
* Update yapf.sh
* Update development.rst
* Update yapf.sh
* Use bash arrays for correct argument splitting
Playing fast and loose with whitespace in bash is a terrible idea.
* Only format non-excluded by default
* Check changes against master
Normally, the remote is called `origin`, but naming it explicit
* Adding missing directory to `format_all`
* Cleanup YAPF code
Remove unused function and move around code to make clearer and adding lines
give cleaner diffs.
* Ensure correct files are autoformatted
* Fix cmd line arg splitting
Each arg has to be in its own set of quotes.
* Diff against mergebase
TIL there's a clean syntax for doing that, but it's too clever to belong in a
shell script.
We use `mapfile -t` to ensure no problems down the line with weird filenames.
* Implement global state API for xray.
* Fix object table.
* Fixes for log structure.
* Implement cluster_resources.
* Add driver task to task table.
* Remove python flatbuffers code
* Get some global state API tests running.
* Python linting.
* Fix linting.
* Fix mock modules for doc
* Copy over flatbuffer bindings.
* Fix for tests.
* Linting
* Fix monitor crash.
* Integrate worker with raylet.
* Begin allowing worker to attach to cluster.
* Fix linting and documentation.
* Fix linting.
* Comment tests back in.
* Fix type of worker command.
* Remove xray python files and tests.
* Fix from rebase.
* Add test.
* Copy over raylet executable.
* Small cleanup.