xwjiang2010
|
fcf897ee72
|
[air] update rllib example to use Tuner API. (#26987)
update rllib example to use Tuner API.
Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>
|
2022-07-27 12:12:59 +01:00 |
|
kourosh hakhamaneshi
|
3815e52a61
|
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)
|
2022-05-19 18:30:42 +02:00 |
|
Artur Niederfahrenhorst
|
fb2915d26a
|
[RLlib] Replay Buffer API and Ape-X. (#24506)
|
2022-05-17 13:43:49 +02:00 |
|
Jun Gong
|
68a9a33386
|
[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797)
|
2022-05-16 09:45:32 +02:00 |
|
Simon Mo
|
9f23affdc0
|
[Hotfix] Unbreak lint in master (#24794)
|
2022-05-13 15:05:05 -07:00 |
|
Sven Mika
|
8fe3fd8f7b
|
[RLlib] QMix TrainerConfig objects. (#24775)
|
2022-05-13 18:50:28 +02:00 |
|
Max Pumperla
|
6a6c58b5b4
|
[RLlib] Config objects for DDPG and SimpleQ. (#24339)
|
2022-05-12 16:12:42 +02:00 |
|
Sven Mika
|
7ab19ddc32
|
[RLlib] MADDPG: Move into agents folder (from contrib) and use training_iteration method. (#24502)
|
2022-05-06 12:35:21 +02:00 |
|
Balaji Veeramani
|
7f1bacc7dc
|
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
|
2022-01-29 18:41:57 -08:00 |
|
Sven Mika
|
abd3bef63b
|
[RLlib] QMIX better defaults + added to CI learning tests (#21332)
|
2022-01-04 08:54:41 +01:00 |
|
Sven Mika
|
698b4eeed3
|
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
|
2021-09-21 22:00:14 +02:00 |
|
Sven Mika
|
649580d735
|
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)
|
2021-07-15 05:51:24 -04:00 |
|
Amog Kamsetty
|
38b5b6d24c
|
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
This reverts commit e4123fff27 .
|
2021-07-13 09:57:15 -07:00 |
|
Sven Mika
|
e4123fff27
|
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)
|
2021-07-13 06:38:14 -04:00 |
|
Kai Fricke
|
10fd7111b3
|
[rllib] Improve test learning check, fix flaky two step qmix (#16843)
|
2021-07-06 19:39:12 +01:00 |
|
Sven Mika
|
53206dd440
|
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)
|
2021-06-30 12:32:11 +02:00 |
|
Sven Mika
|
be6db06485
|
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)
|
2021-06-21 13:46:01 +02:00 |
|
Sven Mika
|
169ddabae7
|
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429)
|
2021-06-19 22:42:00 +02:00 |
|
Amog Kamsetty
|
bd3cbfc56a
|
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9 .
|
2021-06-18 12:21:49 -07:00 |
|
Sven Mika
|
e78ec370a9
|
[RLlib] Allow policies to be added/deleted on the fly. (#16359)
|
2021-06-18 10:31:30 +02:00 |
|
Sven Mika
|
d2c755ccef
|
[RLlib] Examples scripts add argparse help and replace --torch with --framework . (#15832)
|
2021-05-18 13:18:12 +02:00 |
|
Sven Mika
|
c17169dc11
|
[RLlib] Fix all example scripts to run on GPUs. (#11105)
|
2020-10-02 23:07:44 +02:00 |
|
Sven Mika
|
78dfed2683
|
[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)
|
2020-07-17 12:14:34 +02:00 |
|