Artur Niederfahrenhorst | e9a8f7d9ae | [RLlib] Unify gnorm mixin for tf and torch policies. (#26102) | 2022-07-24 15:31:09 +02:00
Jun Gong | 6b6d3017ba | [RLlib] more connector polishes and fixes. (#26645) | 2022-07-19 08:50:28 -07:00
Ishant Mrinal | 57244aeee3 | [RLlib] Make DQN update_target use only trainable variables. (#25226) | 2022-07-15 09:17:06 +02:00
Sven Mika | d90c6cfbd6 | [RLlib] SimpleQ PolicyV2 (sub-classing). (#25871) | 2022-06-17 20:12:16 +02:00
Sven Mika | 130b7eeaba | [RLlib] Trainer to Algorithm renaming. (#25539) | 2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst | 5133978adc | [RLlib] PG policy subclassing conversion. (#25288) | 2022-06-06 13:07:47 +02:00
Sven Mika | ab6c3027e5 | [RLlib] A2/3C policy sub-classing schema. (#25078) | 2022-05-28 09:54:47 +02:00
Jun Gong | eaf9c941ae | [RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. (#25117) | 2022-05-25 14:38:03 +02:00
Jun Gong | dea134a472 | [RLlib] Clean up Policy mixins. (#24746) | 2022-05-17 17:16:08 +02:00