Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Avnish Narayan
d0f975e00f
[RLlib] Fix broken link replay buffer docs. ( #25666 )
2022-06-10 21:18:59 +02:00
Sven Mika
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. ( #25076 )
2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst
94d6c212df
[RLlib] Replay Buffer API documentation. ( #24683 )
2022-06-10 16:47:51 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. ( #25056 )
2022-06-07 12:52:19 +02:00
Zhe Zhang
6793426a9d
[Docs; RLlib] Remove $ from rllib pip install instructions ( #25358 )
2022-06-07 08:57:17 +02:00
Sven Mika
a559efb7e4
[CI; LinkCheck] 3 RLlib fixes. ( #25476 )
2022-06-04 11:54:56 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before this PR.
In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128), so the failure is highly likely caused by this PR.
The test failure output seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b).
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Sven Mika
18c03f8d93
[RLlib] A2C + A3C move to algorithms folder and re-name into A2C/A3C (from ...Trainer). ( #25314 )
2022-06-01 09:29:16 +02:00
Sven Mika
30f6fc340b
[RLlib] AlphaZero TrainerConfig objects. ( #25256 )
2022-05-30 15:37:58 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. ( #25107 )
2022-05-27 13:14:54 +02:00
Sven Mika
e73c37cc17
[RLlib] MADDPG: Move into main algorithms folder and add proper unit and learning tests. ( #24579 )
2022-05-24 12:53:53 +02:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 ( #24171 )
2022-05-23 08:18:44 +02:00
Steven Morad
501d932449
[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects ( #25059 )
2022-05-22 19:58:47 +02:00
Lucas Alava Peña
2a7ebd4dcf
[RLlib] Fix minor typos in docs ( #24845 )
2022-05-20 12:19:49 -07:00
Max Pumperla
c4aa5a4347
[RLlib] Fix broken links in docs. ( #25013 )
2022-05-20 11:06:25 +02:00
Michael (Mike) Gelbart
8d6548a74a
[docs] Refactor (some of) RLlib training API docs using literalinclude ( #24436 )
Per the [Ray docs contributing guide](https://docs.ray.io/en/master/ray-contribute/docs.html ), code chunks should be in `.py` files and pulled in via `literalinclude` rather than placed directly in `.rst` files. This PR takes a small step in doing this for the RLlib docs, specifically for the training API doc page.
Note that I had to make some changes to the code itself so that it would run, namely adding missing numpy imports and changing `model.from_batch(...)` to `model(...)` in a couple places.
Co-authored-by: Max Pumperla <max.pumperla@googlemail.com>
2022-05-20 09:52:04 +01:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits ( #24896 )
2022-05-19 18:30:42 +02:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. ( #24507 )
2022-05-18 09:58:25 +02:00
Kai Fricke
96da5dc776
[rllib] Fix some missing agent->algorithm doc changes ( #24841 )
#24797 missed some doc changes that popped up in broken linkcheck. Note that there could be others that were not caught by this.
2022-05-16 11:52:49 +01:00
Jun Gong
68a9a33386
[RLlib] Retry agents -> algorithms. with proper doc changes this time. ( #24797 )
2022-05-16 09:45:32 +02:00
Kai Fricke
3f9eea00af
[ci/linkcheck] Fix broken gym envs link ( #24817 )
These are currently broken in CI.
2022-05-15 18:59:31 +01:00
kourosh hakhamaneshi
69055f556d
[RLlib] Move agents.ars to algorithms.ars. ( #24516 )
2022-05-06 19:11:15 +02:00
kourosh hakhamaneshi
f48f1b252c
[RLlib] Moved agents.es to algorithms.es ( #24511 )
2022-05-06 14:54:22 +02:00
Sven Mika
7ab19ddc32
[RLlib] MADDPG: Move into agents folder (from contrib) and use training_iteration method. ( #24502 )
2022-05-06 12:35:21 +02:00
Christy Bergman
76eb47e226
[RLlib; docs] Rename UCB -> LinUCB. ( #24348 )
2022-05-05 10:20:16 +02:00
Sven Mika
5b61a00792
[RLlib] Feed all values in COMMON_CONFIG directly from TrainerConfig() (removes duplicate values and comments). ( #24433 )
2022-05-04 16:28:12 +02:00
Sven Mika
ba14f0a41b
[RLlib] PGTrainer config object class (PGConfig). ( #24295 )
2022-04-28 22:25:16 +02:00
Jeroen Bédorf
1263015931
[RLlib] Add support for writing env 'info' dicts to output datasets for TFPolicies (for TorchPolicies, these are part of the view-requirements by default and thus written either way). ( #24041 )
2022-04-25 11:17:50 +02:00
Chen Shen
cb02e2f713
[linter] fix broken link in rllib examples #23959
fix broken link in rllib examples
2022-04-17 19:34:38 -07:00
Eric Liang
1ff874e8e8
[spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib ( #23817 )
2022-04-10 16:12:53 -07:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. ( #23739 )
2022-04-08 15:33:28 +02:00
Michael (Mike) Gelbart
774b62b3c0
[RLlib; docs] Clarify how MultiDiscrete spaces are encoded by default. ( #23777 )
2022-04-08 08:39:09 +02:00
Sven Mika
2eaa54bd76
[RLlib] POC: Config objects instead of dicts (PPO only). ( #23491 )
2022-03-31 18:26:12 +02:00
Sven Mika
7cb86acce2
[RLlib] trainer_template.py: hard deprecation (error when used). ( #23488 )
2022-03-25 18:25:51 +01:00
Philipp Moritz
886cc4d674
Fix broken links in documentation and put linkcheck linter in place on CI ( #23340 )
2022-03-18 21:02:52 -07:00
Max Pumperla
71c57c619b
[docs] RLlib broken links ( fixes #23160 ) ( #23226 )
2022-03-16 12:38:18 +01:00
Max Pumperla
11c40e363d
[docs] external promo content ( #22823 )
2022-03-10 11:39:44 -08:00
Sven Mika
8e00537b65
[RLlib] SlateQ: framework=tf fixes and SlateQ documentation update ( #22543 )
2022-02-23 13:03:45 +01:00
Max Pumperla
9482f03134
[docs] RLlib concepts consolidation, user guide, RL conf prep ( #22496 )
2022-02-18 09:35:20 -08:00
Sven Mika
e03606f0b3
[RLlib] Bandit documentation enhancements. ( #22427 )
2022-02-17 13:25:50 +01:00
Jun Gong
b729a9390f
[RLlib] Add example commands for using setup-dev.py with RLlib for improved dev setup stability and developer experience. ( #22380 )
2022-02-15 12:00:36 +01:00
Jun Gong
6f5afcbce9
[RLlib] Docs enhancements: Setup-dev instructions; Ray datasets integration. ( #22239 )
2022-02-15 09:09:24 +01:00
Sven Mika
c73e0597fa
[RLlib] Discussion 2022: Fix batch_mode="complete_episodes" documentation inaccuracy. ( #22074 )
2022-02-10 02:57:27 +01:00
Max Pumperla
5cc9355303
[Docs] Tune docs overhaul (first part) ( #22112 )
Continuing docs overhaul, tune now has:
- [x] better landing page
- [x] a getting started guide
- [x] user guide was cut down, partially merged with FAQ, and partially integrated with tutorials
- [x] the new user guide contains guides to tune features and practical integrations
- [x] we rewrote some of the feature guides for clarity
- [x] we got rid of sphinx-gallery for this sub-project (only data and core left), as it looks bad and is unnecessarily complicated anyway (plus, makes the build slower)
- [x] sphinx-gallery examples are now moved to markdown notebook, as started in #22030 .
- [x] Examples are tested in the new framework, of course.
There's still a lot one can do, but this is already getting too large. Will follow up with more fine-tuning next week.
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2022-02-07 15:47:03 +00:00
Sven Mika
893536ebd9
[RLlib] Move bandits into main agents folder; Make RecSim adapter more accessible; ( #21773 )
2022-01-27 13:58:12 +01:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train more flexible via callable option. ( #20735 )
2022-01-27 12:17:34 +01:00
Max Pumperla
b34099e764
[docs] landing page ( fixes #21750 ) ( #21859 )
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2022-01-26 17:14:25 -08:00