Eric Liang
5cebee68d6
[rllib] Add scaling guide to documentation, improve bandit docs ( #7780 )
...
* update
* reword
* update
* ms
* multi node sgd
* reorder
* improve bandit docs
* contrib
* update
* ref
* improve refs
* fix build
* add pillow dep
* add pil
* update pil
* pillow
* remove false
2020-03-27 22:05:43 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. ( #7482 )
2020-03-23 12:19:30 -07:00
Eric Liang
9392cdbf74
[rllib] Add high-performance external application connector ( #7641 )
2020-03-20 12:43:57 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length ( #7503 )
...
* bulk rename
* deprecation warn
* update doc
* update fig
* line length
* rename
* make pytest comptaible
* fix test
* fi sys
* rename
* wip
* fix more
* lint
* update svg
* comments
* lint
* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
52cf77f5a9
[rllib] SAC no_done_at_end should default to False ( #7594 )
...
* update
* update doc
* stochastic
* cleanu
2020-03-14 11:16:54 -07:00
Sven Mika
2d97650b1e
[RLlib] Add Exploration API documentation. ( #7373 )
...
* Add Exploration API documentation.
* Add Exploration API documentation.
* Add Exploration API documentation.
* Update exporation docs.
2020-03-01 16:55:41 -08:00
Eric Liang
5df801605e
Add ray.util package and move libraries from experimental ( #7100 )
2020-02-18 13:43:19 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation ( #6981 )
...
* eval api
* update
* sync eval filters
* sync fix
* docs
* update
* docs
* update
* link
* nit
* doc updates
* format
2020-02-01 22:12:12 -08:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression ( #6839 )
2020-01-27 15:53:40 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch
flag to train.py ( #6807 )
2020-01-17 18:48:44 -08:00
Maltimore
0ec613c95a
[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj ( #6438 )
2019-12-11 15:00:53 -08:00
Eric Liang
bc5e259264
[rllib] Add a doc section on computing actions ( #6326 )
...
* options doc
* add note
* hint shr
* doc update
2019-12-03 00:10:50 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions ( #6065 )
...
* Rollout improvements
* Make info-saving optional, to avoid breaking change.
* Store generating ray version in checkpoint metadata
* Keep the linter happy
* Add small rollout test
* Terse.
* Update test_io.py
2019-11-05 20:34:18 -08:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function
( #5705 )
...
* Added tracing of eager policies with `tf.function`
* lint
* add config option
* add docs
* wip
* tracing now works with a3c
* typo
* none
* file doc
* returns
* syntax error
* syntax error
2019-09-17 01:44:20 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs ( #5656 )
...
* [rllib] better model docs
* fix
* s
2019-09-08 23:01:26 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation ( #5645 )
2019-09-07 11:50:18 -07:00
Richard Liaw
34f6d2fc5c
[tune] Update trainable docs and support hparams ( #5558 )
2019-09-04 12:44:42 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function ( #5601 )
...
* remove tune function
* remove examples
* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Eric Liang
550c96b965
[rllib] Add docs on policy.model ( #5597 )
2019-08-30 21:10:42 -07:00
Eric Liang
7d28bbbdbb
[rllib] Document on traj postprocess ( #5532 )
...
* document on traj postprocess
* shorten it
2019-08-24 20:37:45 -07:00
gehring
b520f6141e
[rllib] Adds eager support with a generic TFEagerPolicy
class ( #5436 )
2019-08-23 14:21:11 +08:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir ( #5324 )
2019-08-05 23:25:49 -07:00
Richard Liaw
1eaa57c98f
[tune] Distributed example + walkthrough ( #5157 )
2019-08-02 09:17:20 -07:00
Kristian Hartikainen
13fb9fe3db
[rllib] Feature/soft actor critic v2 ( #5328 )
...
* Add base for Soft Actor-Critic
* Pick changes from old SAC branch
* Update sac.py
* First implementation of sac model
* Remove unnecessary SAC imports
* Prune unnecessary noise and exploration code
* Implement SAC model and use that in SAC policy
* runs but doesn't learn
* clear state
* fix batch size
* Add missing alpha grads and vars
* -200 by 2k timesteps
* doc
* lazy squash
* one file
* ignore tfp
* revert done
2019-08-01 23:37:36 -07:00
Eric Liang
20450a4e82
[rllib] Add rock paper scissors multi-agent example ( #5336 )
2019-08-01 13:03:59 -07:00
Eric Liang
9e328fbe6f
[rllib] Add docs on how to use TF eager execution ( #4927 )
2019-06-07 16:42:37 -07:00
Eric Liang
7501ee51db
[rllib] Rename PolicyEvaluator => RolloutWorker ( #4820 )
2019-06-03 06:49:24 +08:00
Eric Liang
4f46d3e9bf
[rllib] Add multi-agent examples for hand-coded policy, centralized VF ( #4554 )
2019-04-09 00:36:49 -07:00
Eric Liang
37208216ae
[rllib] Rename Agent to Trainer ( #4556 )
2019-04-07 00:36:18 -07:00
Eric Liang
fce0062380
[rllib] Switch to tune.run() instead of run_experiments() ( #4515 )
2019-03-30 14:07:50 -07:00
Eric Liang
cff08e19ff
[rllib] Print out intermediate data shapes on the first iteration ( #4426 )
2019-03-26 00:27:59 -07:00
Eric Liang
4b8b703561
[rllib] Some API cleanups and documentation improvements ( #4409 )
2019-03-21 21:34:22 -07:00
Eric Liang
05d96ce81b
[rllib] Raise an error if multi-agent envs terminate without a last observation for agents ( #4139 )
...
* fix it
* lint
* Update rllib-training.rst
2019-02-23 21:23:40 -08:00
Eric Liang
c4182463f6
[rllib] Add helper to iterate over envs in a vectorized environment ( #4001 )
...
* add foreach env func
* fix
* add test
2019-02-11 10:40:47 -08:00
Eric Liang
fb73cedf70
[rllib] Add examples page, add hierarchical training example, delete SC2 examples ( #3815 )
...
* wip
* lint
* wip
* up
* wip
* update examples
* wip
* remove carla
* update
* improve envspec
* link to custom
* Update rllib-env.rst
* update
* fix
* fn
* lint
* ds
* ssd games
* desc
* fix up docs
* fix
2019-01-29 21:06:09 -08:00
Eric Liang
e78562b2e8
[rllib] Misc fixes: set lr for PG, better error message for LSTM/PPO, fix multi-agent/APEX ( #3697 )
...
* fix
* update test
* better error
* compute
* eps fix
* add get_policy() api
* Update agent.py
* better err msg
* fix
* pass in rew
2019-01-06 19:37:35 -08:00
Eric Liang
b8a9e3f106
[rllib] Remove uses of sgd_stepsize => lr ( #3667 )
...
* lr
* Update example-evolution-strategies.rst
2019-01-01 12:01:27 +08:00
Richard Liaw
e046a5c767
[tune] resources_per_trial from trial_resources ( #3580 )
...
Renaming variable due to user errors.
2018-12-20 19:00:47 -08:00
Eric Liang
db0dee573e
[rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) ( #3548 )
2018-12-18 10:40:01 -08:00
Eric Liang
d864f299d7
[rllib] fixes from dogfooding multi-agent ( #3456 )
...
auto wrap multi-agent dict and tuple spaces by keeping a policy -> preprocessor in the sampler
add some Q-learning debug stats
report min, max of custom metrics
better errors
2018-12-05 23:31:45 -08:00
Eric Liang
93a9d32288
[docs] Switch docs to use rllib train instead of train.py
2018-12-04 17:36:06 -08:00
Eric Liang
ce355d13d4
[rllib] Allow envs to be auto-registered; add on_train_result callback with curriculum example ( #3451 )
...
* train step and docs
* debug
* doc
* doc
* fix examples
* fix code
* integration test
* fix
* ...
* space
* instance
* Update .travis.yml
* fix test
2018-12-03 23:15:43 -08:00
Eric Liang
f0df97db6f
[rllib] example and docs on how to use parametric actions with DQN / PG algorithms ( #3384 )
2018-11-27 23:35:19 -08:00
Eric Liang
abdc3b592e
[rllib] Update multi-gpu impala numbers ( #3327 )
2018-11-19 20:55:27 -08:00
Eric Liang
65c27c70cf
[rllib] Clean up agent resource configurations ( #3296 )
...
Closes #3284
2018-11-13 18:00:03 -08:00
Eric Liang
bd0dbde149
[rllib] Rename ServingEnv => ExternalEnv ( #3302 )
2018-11-12 16:31:27 -08:00
eugenevinitsky
344b4ef0ff
[rllib] Fix filter sync for ES and ARS ( #2918 )
2018-11-06 19:09:34 -08:00
Eric Liang
369cb833fe
[rllib] Implement custom metrics ( #3144 )
2018-11-03 18:48:32 -07:00
Eric Liang
af0c1174cd
[sgd] Merge sharded param server based SGD implementation ( #3033 )
...
This includes most of the TF code used for the OSDI experiment. Perf sanity check on p3.16xl instances: Overall scaling looks ok, with the multi-node results within 5% of OSDI final numbers. This seems reasonable given that hugepages are not enabled here, and the param server shards are placed randomly.
$ RAY_USE_XRAY=1 ./test_sgd.py --gpu --batch-size=64 --num-workers=N \
--devices-per-worker=M --strategy=<simple|ps> \
--warmup --object-store-memory=10000000000
Images per second total
gpus total | simple | ps
========================================
1 | 218
2 (1 worker) | 388
4 (1 worker) | 759
4 (2 workers) | 176 | 623
8 (1 worker) | 985
8 (2 workers) | 349 | 1031
16 (2 nodes, 2 workers) | 600 | 1661
16 (2 nodes, 4 workers) | 468 | 1712 <--- OSDI perf was 1817
2018-10-27 21:25:02 -07:00