Commit graph

89 commits

Author SHA1 Message Date
Eric Liang
026f6884b5
[rllib] Add Decentralized DDPPO trainer and documentation (#7088) 2020-02-10 15:28:27 -08:00
Eric Liang
dc7e78deb4
[rllib] Add -v and --torch flags to first page of docs (#7032)
* add verbose doc

* torch
2020-02-04 10:17:51 -08:00
Eric Liang
6bb30c9f1b fix links (#6883) 2020-01-22 01:06:07 -08:00
Eric Liang
14016535a5
[rllib] Add TF and Torch icons to show which are available for each algo (#6869) 2020-01-20 15:22:21 -08:00
Eric Liang
5ecb02fb80
Release 0.7.5 updates (#5727) 2019-09-26 10:30:37 -07:00
Eric Liang
249ca2cf9e
[rllib] add blog posts to examples list (#5762)
* add blog post

* remove

* link
2019-09-23 10:42:21 -07:00
gehring
8903bcd0c3 [rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Eric Liang
fe5bd09b46
Fix rllib image in readme and doc typo (#5579)
* fix

* rlllig
2019-08-29 16:02:16 -07:00
gehring
b520f6141e [rllib] Adds eager support with a generic TFEagerPolicy class (#5436) 2019-08-23 14:21:11 +08:00
Eric Liang
79949fb8a0
[rllib] RLlib in 60 seconds documentation (#5430) 2019-08-12 17:39:02 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions (#5304) 2019-08-10 14:05:12 -07:00
Wonseok Jeon
281829e712 MADDPG implementation in RLlib (#5348) 2019-08-06 16:22:06 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
Kristian Hartikainen
13fb9fe3db [rllib] Feature/soft actor critic v2 (#5328)
* Add base for Soft Actor-Critic

* Pick changes from old SAC branch

* Update sac.py

* First implementation of sac model

* Remove unnecessary SAC imports

* Prune unnecessary noise and exploration code

* Implement SAC model and use that in SAC policy

* runs but doesn't learn

* clear state

* fix batch size

* Add missing alpha grads and vars

* -200 by 2k timesteps

* doc

* lazy squash

* one file

* ignore tfp

* revert done
2019-08-01 23:37:36 -07:00
Eric Liang
20450a4e82
[rllib] Add rock paper scissors multi-agent example (#5336) 2019-08-01 13:03:59 -07:00
Eric Liang
a62c5f40f6
[rllib] Document ModelV2 and clean up the models/ directory (#5277) 2019-07-27 02:08:16 -07:00
Eric Liang
9e328fbe6f
[rllib] Add docs on how to use TF eager execution (#4927) 2019-06-07 16:42:37 -07:00
Eric Liang
1c073e92e4
[rllib] Fix documentation on custom policies (#4910)
* wip

* add docs

* lint

* todo sections

* fix doc
2019-06-01 16:13:21 +08:00
Eric Liang
a45c61e19b
[rllib] Update concepts docs and add "Building Policies in Torch/TensorFlow" section (#4821)
* wip

* fix index

* fix bugs

* todo

* add imports

* note on get ph

* note on get ph

* rename to building custom algs

* add rnn state info
2019-05-27 14:17:32 -07:00
Eric Liang
02583a8598 [rllib] Rename PolicyGraph => Policy, move from evaluation/ to policy/ (#4819)
This implements some of the renames proposed in #4813
We leave behind backwards-compatibility aliases for *PolicyGraph and SampleBatch.
2019-05-20 16:46:05 -07:00
Eric Liang
6e7680bf21
[rllib] Clean up concepts documentation and policy optimizer creation (#4592) 2019-04-12 21:03:26 -07:00
Eric Liang
37208216ae
[rllib] Rename Agent to Trainer (#4556) 2019-04-07 00:36:18 -07:00
Eric Liang
2871609296
[rllib] Report sampler performance metrics (#4427) 2019-03-27 13:24:23 -07:00
Eric Liang
4b8b703561
[rllib] Some API cleanups and documentation improvements (#4409) 2019-03-21 21:34:22 -07:00
Eric Liang
c7f74dbdc7
[rllib] Add async remote workers (#4253) 2019-03-08 15:39:48 -08:00
Eric Liang
78ad9c4cbb Add "ray timeline" command to auto-dump Chrome trace for the current Ray instance (#4239) 2019-03-05 16:28:00 -08:00
Eric Liang
3896b726dd Dynamically adjust redis memory usage (#4152)
* f

* Update services.py
2019-02-25 16:21:37 -08:00
Eric Liang
d9da183c7d
[rllib] Custom supervised loss API (#4083) 2019-02-24 15:36:13 -08:00
Eric Liang
152375aa8a
[rllib] Add evaluation option to DQN agent (#3835)
* add eval

* interval

* multiagent minor fix

* Update rllib.rst

* Update ddpg.py

* Update qmix.py
2019-01-29 21:19:53 -08:00
Eric Liang
fb73cedf70
[rllib] Add examples page, add hierarchical training example, delete SC2 examples (#3815)
* wip

* lint

* wip

* up

* wip

* update examples

* wip

* remove carla

* update

* improve envspec

* link to custom

* Update rllib-env.rst

* update

* fix

* fn

* lint

* ds

* ssd games

* desc

* fix up docs

* fix
2019-01-29 21:06:09 -08:00
Michael Luo
16f7ca45e4 Appo (#3779)
* Deleted old fork, updated new ray and moved PPO-impala to APPO in ppo folder

* Deleted unneccesary vtrace.py file

* Update pong-impala.yaml

* Cleaned PPO Code

* Update pong-impala.yaml

* Update pong-impala.yaml

* wip

* new ifle

* refactor

* add vtrace off option

* revert

* support any space

* docs

* fix comment

* remove kl

* Update cartpole-appo-vtrace.yaml
2019-01-18 13:40:26 -08:00
Jones Wong
319c1340cb [rllib] Develop MARWIL (#3635)
*  add marvil policy graph

*  fix typo

*  add offline optimizer and enable running marwil

*  fix loss function

*  add maintaining the moving average of advantage norm

*  use sync replay optimizer for unifying

*  remove offline optimizer and use sync replay optimizer

*  format by yapf

*  add imitation learning objective

*  fix according to eric's review

*  format by yapf

* revise

* add test data

* marwil
2019-01-16 19:00:43 -08:00
Eric Liang
401e656b95 [rllib] Sync filters at end of iteration not start; hierarchical docs (#3769) 2019-01-15 16:25:25 -08:00
Eric Liang
03fe760616
[rllib] Model self loss isn't included in all algorithms (#3679) 2019-01-04 22:30:35 -08:00
Eric Liang
ca864faece
[rllib] Documentation for I/O API and multi-agent support / cleanup (#3650) 2019-01-03 15:15:36 +08:00
Eric Liang
47d36d7bd6
[rllib] Refactor pytorch custom model support (#3634) 2019-01-03 13:48:33 +08:00
Eric Liang
9f63119a83
[rllib] Allow development without needing to compile Ray (#3623)
* wip

* lint

* wip

* wip

* rename

* wip

* Cleaner handling of cli prompt
2018-12-24 18:08:23 +09:00
Eric Liang
303883a3b6 [rllib] [rfc] add contrib module and guideline for merging (#3565)
This adds guidelines for merging code into `rllib/contrib` vs `rllib/agents`. Also, clean up the agent import code to make registration easier.
2018-12-20 10:44:34 -08:00
Eric Liang
db0dee573e
[rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) (#3548) 2018-12-18 10:40:01 -08:00
Eric Liang
f0df97db6f
[rllib] example and docs on how to use parametric actions with DQN / PG algorithms (#3384) 2018-11-27 23:35:19 -08:00
Eric Liang
0d56fc10cc Move setproctitle to ray[debug] package (#3415) 2018-11-27 09:50:59 -08:00
Eric Liang
8b76bab25c
[rllib] docs for td3 (#3381)
* td3 doc

* Update rllib-env.rst
2018-11-22 13:36:47 -08:00
Eric Liang
706dc1d473
[rllib] Add test for multi-agent support and fix IMPALA multi-agent (#3289)
IMPALA support for multiagent was broken since IMPALA has a requirement that batch sizes be of a certain length. However multi-agent envs can create variable-length batches.

Fix this by adding zero-padding as needed (similar to the RNN case).
2018-11-14 14:14:07 -08:00
Eric Liang
bd0dbde149
[rllib] Rename ServingEnv => ExternalEnv (#3302) 2018-11-12 16:31:27 -08:00
Eric Liang
9dd3eedbac [rllib] rollout.py should reduce num workers (#3263)
## What do these changes do?

Don't create an excessive amount of workers for rollout.py, and also fix up the env wrapping to be consistent with the internal agent wrapper.

## Related issue number

Closes #3260.
2018-11-09 12:29:16 -08:00
Eric Liang
369cb833fe
[rllib] Implement custom metrics (#3144) 2018-11-03 18:48:32 -07:00
Eric Liang
9a0f0db070 Add ray stack tool for debugging (#3213) 2018-11-03 13:13:02 -07:00
Eric Liang
a9e454f6fd
[rllib] Include config dicts in the sphinx docs (#3064) 2018-10-16 15:55:11 -07:00
Eric Liang
b45bed4bce
[rllib] Propagate model options correctly in ARS / ES, to action dist of PPO (#2974)
* fix

* fix

* fix it

* propagate conf to action dist

* move carla example too

* rr

* Update policies.py

* wip

* lint
2018-10-01 12:49:39 -07:00
Eric Liang
69d1354016
[rllib] Document ARS & rainbow (#2744)
* wip

* rainbow doc too

* e not used

* fix ppo doc

* clean list

* use same title
2018-08-28 18:13:36 -07:00