Commit graph

32 commits

Author SHA1 Message Date
Richard Liaw
f19decb848
[docs] Update RLlib install to not include Tensorflow (#2178) 2018-06-10 10:29:12 -07:00
andrewztan
1475600c81 [rllib] Merge DDPG and DDPG2 implementations (#2202)
* removed ddpg2

* removed ddpg2 from codebase

* added tests used in ddpg vs ddpg2 comparison

* added notes about training timesteps to yaml files

* removed ddpg2 yaml files

* removed unnecessary configs from yaml files

* removed unnecessary configs from yaml files

* moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples

* moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples

* added more configuration details to yaml files

* removed random starts from halfcheetah
2018-06-09 16:46:23 -07:00
Eric Liang
f37e2e5d2f
[rllib] [doc] Broken link in ddpg doc 2018-05-20 00:10:59 -07:00
Eric Liang
b55f4a7f04 [rllib] Fix broken link in docs (#1967)
* Update README.rst

* Update rllib.rst
2018-04-30 16:02:48 -07:00
Eric Liang
47bc4c3009
[rllib] Add DDPG documentation, rename DDPG2 <=> DDPG (#1946)
* updates

* updates

* updates

* updates

* updates

* updates

* Update rllib.rst

* Update policy-optimizers.rst
2018-04-30 00:18:15 -07:00
Eric Liang
7ab890f4a1 [tune] [rllib] Automatically determine RLlib resources and add queueing mechanism for autoscaling (#1848) 2018-04-16 16:58:15 -07:00
Eric Liang
72595cca0d [tune] Change tune resource request syntax to be less confusing (#1764)
* update

* update examples

* Wed Mar 21 15:19:56 PDT 2018

* Wed Mar 21 15:21:32 PDT 2018

* Update train_a3c.py

* Update train.py

* fix resources accounting
2018-03-23 06:25:01 -07:00
Eric Liang
e3685fca5e
[rllib] remove redundant docs (#1728)
* wip

* more work

* fix apex

* docs

* apex doc

* pool comment

* clean up

* make wrap stack pluggable

* Mon Mar 12 21:45:50 PDT 2018

* clean up comment

* table

* Mon Mar 12 22:51:57 PDT 2018

* Mon Mar 12 22:53:05 PDT 2018

* Mon Mar 12 22:55:03 PDT 2018

* Mon Mar 12 22:56:18 PDT 2018

* Mon Mar 12 22:59:54 PDT 2018

* Update apex_optimizer.py

* Update index.rst

* Update README.rst

* Update README.rst

* comments

* Wed Mar 14 19:01:02 PDT 2018

* Fri Mar 16 15:44:27 PDT 2018
2018-03-17 14:45:04 -07:00
Eric Liang
882a649f0c
[rllib] [docs] Cleanup RLlib API and make docs consistent with upcoming blog post (#1708)
* wip

* more work

* fix apex

* docs

* apex doc

* pool comment

* clean up

* make wrap stack pluggable

* Mon Mar 12 21:45:50 PDT 2018

* clean up comment

* table

* Mon Mar 12 22:51:57 PDT 2018

* Mon Mar 12 22:53:05 PDT 2018

* Mon Mar 12 22:55:03 PDT 2018

* Mon Mar 12 22:56:18 PDT 2018

* Mon Mar 12 22:59:54 PDT 2018

* Update apex_optimizer.py

* Update index.rst

* Update README.rst

* Update README.rst

* comments

* Wed Mar 14 19:01:02 PDT 2018
2018-03-15 15:57:31 -07:00
Richard Liaw
061e435411
[rllib] Fix eval.py -> rollout.py (#1650) 2018-03-04 14:59:16 -08:00
eugenevinitsky
639df85fda updated multiagent docs (#1523)
* updated multiagent docs

* Update rllib.rst

* Update multiagent_mountaincar_env.py

* Update multiagent_pendulum_env.py
2018-02-11 16:35:03 -08:00
Eric Liang
b948405532
[tune] clean up population based training prototype (#1478)
* patch up pbt

* Sat Jan 27 01:00:03 PST 2018

* Sat Jan 27 01:04:14 PST 2018

* Sat Jan 27 01:04:21 PST 2018

* Sat Jan 27 01:15:15 PST 2018

* Sat Jan 27 01:15:42 PST 2018

* Sat Jan 27 01:16:14 PST 2018

* Sat Jan 27 01:38:42 PST 2018

* Sat Jan 27 01:39:21 PST 2018

* add pbt

* Sat Jan 27 01:41:19 PST 2018

* Sat Jan 27 01:44:21 PST 2018

* Sat Jan 27 01:45:46 PST 2018

* Sat Jan 27 16:54:42 PST 2018

* Sat Jan 27 16:57:53 PST 2018

* clean up test

* Sat Jan 27 18:01:15 PST 2018

* Sat Jan 27 18:02:54 PST 2018

* Sat Jan 27 18:11:18 PST 2018

* Sat Jan 27 18:11:55 PST 2018

* Sat Jan 27 18:14:09 PST 2018

* review

* try out a ppo example

* some tweaks to ppo example

* add postprocess hook

* Sun Jan 28 15:00:40 PST 2018

* clean up custom explore fn

* Sun Jan 28 15:10:21 PST 2018

* Sun Jan 28 15:14:53 PST 2018

* Sun Jan 28 15:17:04 PST 2018

* Sun Jan 28 15:33:13 PST 2018

* Sun Jan 28 15:56:40 PST 2018

* Sun Jan 28 15:57:36 PST 2018

* Sun Jan 28 16:00:35 PST 2018

* Sun Jan 28 16:02:58 PST 2018

* Sun Jan 28 16:29:50 PST 2018

* Sun Jan 28 16:30:36 PST 2018

* Sun Jan 28 16:31:44 PST 2018

* improve tune doc

* concepts

* update humanoid

* Fri Feb  2 18:03:33 PST 2018

* fix example

* show error file
2018-02-02 23:03:12 -08:00
Eric Liang
173f1d629a
[tune] Ray Tune API cleanup (#1454)
Remove rllib dep: trainable is now a standalone abstract class that can be easily subclassed.

Clean up hyperband: fix debug string and add an example.

Remove YAML api / ScriptRunner: this was never really used.

Move ray.init() out of run_experiments(): This provides greater flexibility and should be less confusing since there isn't an implicit init() done there. Note that this is a breaking API change for tune.
2018-01-24 16:55:17 -08:00
Eric Liang
ee36effd8e
[rllib] Add n-step Q learning for DQN (#1439)
* n-step

* add sample adjustm

* Oops

* fix nstep

* metric adjustment

* Sat Jan 20 23:30:34 PST 2018

* Sun Jan 21 16:40:46 PST 2018

* Mon Jan 22 22:24:57 PST 2018
2018-01-23 10:31:19 -08:00
Richard Liaw
04a50aa9ae
[tune] Standardize Ray Tune on documentation (#1448) 2018-01-21 12:07:15 -08:00
Eric Liang
424bd7f74d
[rllib] improve custom env docs (#1447)
* env docs

* add env

* update env

* Fri Jan 19 18:55:34 PST 2018
2018-01-19 21:36:18 -08:00
Eric Liang
e216766bbc [rllib] Update docs with api and components overview figures (#1443) 2018-01-19 10:08:45 -08:00
Eric Liang
5a2f85048d [rllib] Fix incorrect documentation on how to use custom models #1405 2018-01-09 18:09:05 -08:00
Eric Liang
c60ccbad46 [carla] [rllib] Add support for carla nav planner and scenarios from paper (#1382)
* wip

* Sat Dec 30 15:07:28 PST 2017

* log video

* video doesn't work well

* scenario integration

* Sat Dec 30 17:30:22 PST 2017

* Sat Dec 30 17:31:05 PST 2017

* Sat Dec 30 17:31:32 PST 2017

* Sat Dec 30 17:32:16 PST 2017

* Sat Dec 30 17:34:11 PST 2017

* Sat Dec 30 17:34:50 PST 2017

* Sat Dec 30 17:35:34 PST 2017

* Sat Dec 30 17:38:49 PST 2017

* Sat Dec 30 17:40:39 PST 2017

* Sat Dec 30 17:43:00 PST 2017

* Sat Dec 30 17:43:04 PST 2017

* Sat Dec 30 17:45:56 PST 2017

* Sat Dec 30 17:46:26 PST 2017

* Sat Dec 30 17:47:02 PST 2017

* Sat Dec 30 17:51:53 PST 2017

* Sat Dec 30 17:52:54 PST 2017

* Sat Dec 30 17:56:43 PST 2017

* Sat Dec 30 18:27:07 PST 2017

* Sat Dec 30 18:27:52 PST 2017

* fix train

* Sat Dec 30 18:41:51 PST 2017

* Sat Dec 30 18:54:11 PST 2017

* Sat Dec 30 18:56:22 PST 2017

* Sat Dec 30 19:05:04 PST 2017

* Sat Dec 30 19:05:23 PST 2017

* Sat Dec 30 19:11:53 PST 2017

* Sat Dec 30 19:14:31 PST 2017

* Sat Dec 30 19:16:20 PST 2017

* Sat Dec 30 19:18:05 PST 2017

* Sat Dec 30 19:18:45 PST 2017

* Sat Dec 30 19:22:44 PST 2017

* Sat Dec 30 19:24:41 PST 2017

* Sat Dec 30 19:26:57 PST 2017

* Sat Dec 30 19:40:37 PST 2017

* wip models

* reward bonus

* test prep

* Sun Dec 31 18:45:25 PST 2017

* Sun Dec 31 18:58:28 PST 2017

* Sun Dec 31 18:59:34 PST 2017

* Sun Dec 31 19:03:33 PST 2017

* Sun Dec 31 19:05:05 PST 2017

* Sun Dec 31 19:09:25 PST 2017

* fix train

* kill

* add tuple preprocessor

* Sun Dec 31 20:38:33 PST 2017

* Sun Dec 31 22:51:24 PST 2017

* Sun Dec 31 23:14:13 PST 2017

* Sun Dec 31 23:16:04 PST 2017

* Mon Jan  1 00:08:35 PST 2018

* Mon Jan  1 00:10:48 PST 2018

* Mon Jan  1 01:08:31 PST 2018

* Mon Jan  1 14:45:44 PST 2018

* Mon Jan  1 14:54:56 PST 2018

* Mon Jan  1 17:29:29 PST 2018

* switch to euclidean dists

* Mon Jan  1 17:39:27 PST 2018

* Mon Jan  1 17:41:47 PST 2018

* Mon Jan  1 17:44:18 PST 2018

* Mon Jan  1 17:47:09 PST 2018

* Mon Jan  1 20:31:02 PST 2018

* Mon Jan  1 20:39:33 PST 2018

* Mon Jan  1 20:40:55 PST 2018

* Mon Jan  1 20:55:06 PST 2018

* Mon Jan  1 21:05:52 PST 2018

* fix env path

* merge richards fix

* fix hash

* Mon Jan  1 22:04:00 PST 2018

* Mon Jan  1 22:25:29 PST 2018

* Mon Jan  1 22:30:42 PST 2018

* simplified reward function

* add framestack

* add env configs

* simplify speed reward

* Tue Jan  2 17:36:15 PST 2018

* Tue Jan  2 17:49:16 PST 2018

* Tue Jan  2 18:10:38 PST 2018

* add lane keeping simple mode

* Tue Jan  2 20:25:26 PST 2018

* Tue Jan  2 20:30:30 PST 2018

* Tue Jan  2 20:33:26 PST 2018

* Tue Jan  2 20:41:42 PST 2018

* ppo lane keep

* simplify discrete actions

* Tue Jan  2 21:41:05 PST 2018

* Tue Jan  2 21:49:03 PST 2018

* Tue Jan  2 22:12:23 PST 2018

* Tue Jan  2 22:14:42 PST 2018

* Tue Jan  2 22:20:59 PST 2018

* Tue Jan  2 22:23:43 PST 2018

* Tue Jan  2 22:26:27 PST 2018

* Tue Jan  2 22:27:20 PST 2018

* Tue Jan  2 22:44:00 PST 2018

* Tue Jan  2 22:57:58 PST 2018

* Tue Jan  2 23:08:51 PST 2018

* Tue Jan  2 23:11:32 PST 2018

* update dqn reward

* Thu Jan  4 12:29:40 PST 2018

* Thu Jan  4 12:30:26 PST 2018

* Update train_dqn.py

* fix
2018-01-05 21:32:41 -08:00
Eric Liang
6e6674a824
[rllib] Split docs into user and development guide (#1377)
* docs

* Update README.rst

* Sat Dec 30 15:23:49 PST 2017

* comments

* Sun Dec 31 23:33:30 PST 2017

* Sun Dec 31 23:33:38 PST 2017

* Sun Dec 31 23:37:46 PST 2017

* Sun Dec 31 23:39:28 PST 2017

* Sun Dec 31 23:43:05 PST 2017

* Sun Dec 31 23:51:55 PST 2017

* Sun Dec 31 23:52:51 PST 2017
2018-01-01 11:10:44 -08:00
Eric Liang
22c7c87e14 [rllib] [tune] Custom preprocessors and models, various fixes (#1372) 2017-12-28 13:19:04 -08:00
Eric Liang
47b1f02d3e [rllib] Pull out multi-gpu optimizer as a generic class (#1313) 2017-12-17 15:59:57 -08:00
Eric Liang
fbf1806b8a
[tune] Clean up result logging: move out of /tmp, add timestamp (#1297) 2017-12-15 14:19:08 -08:00
Richard Liaw
b6a35e0395 [rllib] Introduce pip install rllib (#1310)
* update setup

* more dependencies
2017-12-12 13:58:28 -08:00
Peter Schafhalter
20d6b74aa6 [rllib] Added evaluation script to RLLib (#1295) 2017-12-11 11:59:44 -08:00
Richard Liaw
2e0eb0e4c7
[rllib] Adding dependencies (#1298) 2017-12-08 01:57:19 -08:00
Eric Liang
35f7398666
[rllib] Update RLlib docs and README (#1288)
Updates the rllib docs and README.
2017-12-06 18:17:51 -08:00
Richard Liaw
f34d705178 [rllib] Update Docs for RLLib (#1248)
* init_changes

* last_changes

* addressing comments

* fix comments

* update

* nit
2017-11-24 10:36:57 -08:00
Eric Liang
316f9e2bb7 [tune] Support user-defined trainable functions / classes / envs with a shared object registry (#1226) 2017-11-20 17:52:43 -08:00
Eric Liang
5a50e0e1d7 [rllib] Add the ability to run arbitrary Python scripts with ray.tune (#1132)
* fix yaml bug

* add ext agent

* gpus

* update

* tuning

* docs

* Sun Oct 15 21:09:25 PDT 2017

* lint

* update

* Sun Oct 15 22:39:55 PDT 2017

* Sun Oct 15 22:40:17 PDT 2017

* Sun Oct 15 22:43:06 PDT 2017

* Sun Oct 15 22:46:06 PDT 2017

* Sun Oct 15 22:46:21 PDT 2017

* Sun Oct 15 22:48:11 PDT 2017

* Sun Oct 15 22:48:44 PDT 2017

* Sun Oct 15 22:49:23 PDT 2017

* Sun Oct 15 22:50:21 PDT 2017

* Sun Oct 15 22:53:00 PDT 2017

* Sun Oct 15 22:53:34 PDT 2017

* Sun Oct 15 22:54:33 PDT 2017

* Sun Oct 15 22:54:50 PDT 2017

* Sun Oct 15 22:55:20 PDT 2017

* Sun Oct 15 22:56:56 PDT 2017

* Sun Oct 15 22:59:03 PDT 2017

* fix

* Update tune_mnist_ray.py

* remove script trial

* fix

* reorder

* fix ex

* py2 support

* upd

* comments

* comments

* cleanup readme

* fix trial

* annotate

* Update rllib.rst
2017-10-18 11:49:28 -07:00
Eric Liang
b1660c4edf [rllib] Refactor to support passing custom env_creator function (#1096)
* refactor to use env creator

* doc

* lint
2017-10-10 12:49:42 -07:00
Philipp Moritz
1eb8c83314 [rllib] Initial RLLib documentation (#969)
* initial documentation for RLLib

* more RL documentation

* fix linting

* fix comments

* update

* fix
2017-09-12 23:38:21 -07:00