Commit graph

71 commits

Author SHA1 Message Date
Avnish Narayan
5df66b917d
[Lint Check] Remove broken link (#26505)
The paper is not available anymore.
2022-07-13 10:30:20 +01:00
Christy Bergman
7c925fe99f
[RLlib; docs] Re-organize algorithms so TOC matches README. (#26339) 2022-07-13 10:46:36 +02:00
Rohan Potdar
09ce4711fd
[RLlib]: Move OPE to evaluation config (#25911) 2022-07-12 11:04:34 -07:00
Christy Bergman
5b44afe9c1
[RLlib] Some Docs fixes (2). (#26265) 2022-07-05 15:46:32 +02:00
Christy Bergman
541e2ec14c
Add Environments to Key Concepts page (#25791) 2022-06-29 16:10:49 -07:00
Kai Fricke
012306da68
[hotfix] Fix linkcheck (#26070) 2022-06-24 13:38:01 +01:00
Artur Niederfahrenhorst
a3f1323457
[RLlib] Make QMix use the ReplayBufferAPI (#25560) 2022-06-23 22:55:22 -07:00
Sven Mika
464ac82207
[RLlib] Small docs fixes for evaluation + training. (#25957) 2022-06-22 13:11:18 +02:00
Sven Mika
1499af945b
[RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924) 2022-06-20 19:53:47 +02:00
Sven Mika
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869) 2022-06-20 15:54:00 +02:00
kourosh hakhamaneshi
25940cb95b
[RLlib] CRR documentation. (#25667) 2022-06-14 12:45:36 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Avnish Narayan
d0f975e00f
[RLlib] Fix broken link replay buffer docs. (#25666) 2022-06-10 21:18:59 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076) 2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. (#24683) 2022-06-10 16:47:51 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) 2022-06-07 12:52:19 +02:00
Zhe Zhang
6793426a9d
[Docs; RLlib] Remove $ from rllib pip install instructions (#25358) 2022-06-07 08:57:17 +02:00
Sven Mika
a559efb7e4
[CI; LinkCheck] 3 RLlib fixes. (#25476) 2022-06-04 11:54:56 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)" (#25420)
This reverts commit e4ceae19ef.

Reverts #25346

linux://python/ray/tests:test_client_library_integration never failed before this PR.

In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128). So it is highly likely that this PR is the cause.

The test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b)
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346) 2022-06-02 16:47:05 +02:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). (#25314) 2022-06-01 09:29:16 +02:00
Sven Mika
30f6fc340b
[RLlib] AlphaZero TrainerConfig objects. (#25256) 2022-05-30 15:37:58 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. (#25107) 2022-05-27 13:14:54 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. (#24579) 2022-05-24 12:53:53 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 (#24171) 2022-05-23 08:18:44 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059) 2022-05-22 19:58:47 +02:00
Lucas Alava Peña
2a7ebd4dcf
[RLlib] Fix minor typos in docs (#24845) 2022-05-20 12:19:49 -07:00
Max Pumperla
c4aa5a4347
[RLlib] Fix broken links in docs. (#25013) 2022-05-20 11:06:25 +02:00
Michael (Mike) Gelbart
8d6548a74a
[docs] Refactor (some of) RLlib training API docs using literalinclude (#24436)
Per the [Ray docs contributing guide](https://docs.ray.io/en/master/ray-contribute/docs.html), code chunks should be in `.py` files and pulled in via `literalinclude` rather than placed directly in `.rst` files. This PR takes a small step in doing this for the RLlib docs, specifically for the training API doc page. 

Note that I had to make some changes to the code itself so that it would run, namely adding missing numpy imports and changing `model.from_batch(...)` to `model(...)` in a couple places.

Co-authored-by: Max Pumperla <max.pumperla@googlemail.com>
2022-05-20 09:52:04 +01:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896) 2022-05-19 18:30:42 +02:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. (#24507) 2022-05-18 09:58:25 +02:00
Kai Fricke
96da5dc776
[rllib] Fix some missing agent->algorithm doc changes (#24841)
#24797 missed some doc changes that popped up in broken linkcheck. Note that there could be others that were not caught by this.
2022-05-16 11:52:49 +01:00
Jun Gong
68a9a33386
[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797) 2022-05-16 09:45:32 +02:00
Kai Fricke
3f9eea00af
[ci/linkcheck] Fix broken gym envs link (#24817)
These are currently broken in CI.
2022-05-15 18:59:31 +01:00
kourosh hakhamaneshi
69055f556d
[RLlib] Move agents.ars to algorithms.ars. (#24516) 2022-05-06 19:11:15 +02:00
kourosh hakhamaneshi
f48f1b252c
[RLlib] Moved agents.es to algorithms.es (#24511) 2022-05-06 14:54:22 +02:00
Sven Mika
7ab19ddc32
[RLlib] MADDPG: Move into agents folder (from contrib) and use training_iteration method. (#24502) 2022-05-06 12:35:21 +02:00
Christy Bergman
76eb47e226
[RLlib; docs] Rename UCB -> LinUCB. (#24348) 2022-05-05 10:20:16 +02:00
Sven Mika
5b61a00792
[RLlib] Feed all values in COMMON_CONFIG directly from TrainerConfig() (removes duplicate values and comments). (#24433) 2022-05-04 16:28:12 +02:00
Sven Mika
ba14f0a41b
[RLlib] PGTrainer config object class (PGConfig). (#24295) 2022-04-28 22:25:16 +02:00
Jeroen Bédorf
1263015931
[RLlib] Add support for writing env 'info' dicts to output datasets for TFPolicies (for TorchPolicies, these are part of the view-requirements by default and thus written either way). (#24041) 2022-04-25 11:17:50 +02:00
Chen Shen
cb02e2f713
[linter] fix broken link in rllib examples #23959
fix broken link in rllib examples
2022-04-17 19:34:38 -07:00
Eric Liang
1ff874e8e8
[spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817) 2022-04-10 16:12:53 -07:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. (#23739) 2022-04-08 15:33:28 +02:00
Michael (Mike) Gelbart
774b62b3c0
[RLlib; docs] Clarify how MultiDiscrete spaces are encoded by default. (#23777) 2022-04-08 08:39:09 +02:00
Sven Mika
2eaa54bd76
[RLlib] POC: Config objects instead of dicts (PPO only). (#23491) 2022-03-31 18:26:12 +02:00
Sven Mika
7cb86acce2
[RLlib] trainer_template.py: hard deprecation (error when used). (#23488) 2022-03-25 18:25:51 +01:00
Philipp Moritz
886cc4d674
Fix broken links in documentation and put linkcheck linter in place on CI (#23340) 2022-03-18 21:02:52 -07:00
Max Pumperla
71c57c619b
[docs] RLlib broken links (fixes #23160) (#23226) 2022-03-16 12:38:18 +01:00