Artur Niederfahrenhorst
|
bed9083f35
|
[RLlib] Add timeout to filter synchronization. (#25959)
|
2022-06-24 14:37:43 +02:00 |
|
Artur Niederfahrenhorst
|
e10876604d
|
[RLlib] Include SampleBatch.T column in all collected batches. (#25926)
|
2022-06-21 13:20:22 +02:00 |
|
Sven Mika
|
96693055bd
|
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)
|
2022-06-20 15:54:00 +02:00 |
|
Sven Mika
|
130b7eeaba
|
[RLlib] Trainer to Algorithm renaming. (#25539)
|
2022-06-11 15:10:39 +02:00 |
|
Sven Mika
|
b5bc2b93c3
|
[RLlib] Move all remaining algos into algorithms directory. (#25366)
|
2022-06-04 07:35:24 +02:00 |
|
Yi Cheng
|
fd0f967d2e
|
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)" (#25420)
This reverts commit e4ceae19ef .
Reverts #25346
linux://python/ray/tests:test_client_library_integration never fail before this PR.
In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128). So high likely it's because of this PR.
And test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b)
|
2022-06-02 20:38:44 -07:00 |
|
Sven Mika
|
e4ceae19ef
|
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)
|
2022-06-02 16:47:05 +02:00 |
|
kourosh hakhamaneshi
|
3815e52a61
|
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)
|
2022-05-19 18:30:42 +02:00 |
|
Steven Morad
|
00922817b6
|
[RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673)
|
2022-04-11 08:39:10 +02:00 |
|
Balaji Veeramani
|
7f1bacc7dc
|
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
|
2022-01-29 18:41:57 -08:00 |
|
Sven Mika
|
70fe25055a
|
[RLlib] Issue: Get single step input dict incorrect. (#20217)
|
2021-11-12 08:38:51 +01:00 |
|
Sven Mika
|
61a1274619
|
[RLlib] No Preprocessors (part 2). (#18468)
|
2021-09-23 12:56:45 +02:00 |
|
Sven Mika
|
a96dbd885b
|
[RLlib] Reinstate trajectory view API tests. (#18809)
|
2021-09-23 08:31:51 +02:00 |
|
Sven Mika
|
494ddd98c1
|
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)
|
2021-08-21 17:05:48 +02:00 |
|
Sven Mika
|
be6db06485
|
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)
|
2021-06-21 13:46:01 +02:00 |
|
Amog Kamsetty
|
bd3cbfc56a
|
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9 .
|
2021-06-18 12:21:49 -07:00 |
|
Sven Mika
|
e78ec370a9
|
[RLlib] Allow policies to be added/deleted on the fly. (#16359)
|
2021-06-18 10:31:30 +02:00 |
|
Sven Mika
|
e973b726c2
|
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)
|
2021-04-30 19:26:30 +02:00 |
|
Sven Mika
|
8b3554e37e
|
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335)
|
2021-04-15 19:19:51 +02:00 |
|
Sven Mika
|
04bc0a9828
|
[RLlib] Remove all non-trajectory view API code. (#14860)
|
2021-03-23 09:50:18 -07:00 |
|
Sven Mika
|
c3a15ecc0f
|
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. (#14033)
|
2021-03-18 20:27:41 +01:00 |
|
Sven Mika
|
eb0038612f
|
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584)
|
2021-02-08 15:02:19 +01:00 |
|
Sven Mika
|
9eba1871bb
|
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698)
|
2021-01-01 14:06:23 -05:00 |
|
Sven Mika
|
391cdfae8c
|
[RLlib] Trajectory view API docs. (#12718)
|
2020-12-30 17:32:21 -08:00 |
|
Sven Mika
|
c524f86785
|
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)
|
2020-12-27 09:46:03 -05:00 |
|
Sven Mika
|
d5604eaba3
|
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029)
|
2020-12-21 18:38:34 -08:00 |
|
Sven Mika
|
e40b14d255
|
[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. (#12420)
|
2020-12-08 16:41:45 -08:00 |
|
Sven Mika
|
99c81c6795
|
[RLlib] Attention Net prep PR #3. (#12450)
|
2020-12-07 13:08:17 +01:00 |
|
Sven Mika
|
3ad9365e1d
|
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449)
|
2020-12-01 08:21:45 +01:00 |
|
Sven Mika
|
0df55a139c
|
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.
* Fix.
* Fix.
* Fix.
|
2020-11-27 16:25:47 -08:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
95175a822f
|
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. (#12407)
* WIP.
* Fix and LINT.
|
2020-11-25 11:26:29 -08:00 |
|
Sven Mika
|
dab241dcc6
|
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063)
|
2020-11-19 19:01:14 +01:00 |
|
Sven Mika
|
b6b54f1c81
|
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)
|
2020-11-16 10:54:35 -08:00 |
|
Sven Mika
|
414041c6dd
|
[RLlib] Do not create env on driver iff num_workers > 0. (#11307)
|
2020-10-15 18:21:30 +02:00 |
|
Sven Mika
|
36bda8432b
|
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
2020-10-01 16:57:10 +02:00 |
|
Sven Mika
|
e968b52cb7
|
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)
|
2020-08-21 12:35:16 +02:00 |
|