Sven Mika
|
59a967a3a0
|
[RLlib] Cleanup some deprecated metric keys and classes. (#26036)
|
2022-06-23 21:30:01 +02:00 |
|
Sven Mika
|
96693055bd
|
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)
|
2022-06-20 15:54:00 +02:00 |
|
Sven Mika
|
ec89fe5203
|
[RLlib] APEX-DQN and R2D2 config objects. (#25067)
|
2022-05-23 12:15:45 +02:00 |
|
Sven Mika
|
04a5c72ea3
|
Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708)
|
2022-02-10 13:44:22 +01:00 |
|
Alex Wu
|
b122f093c1
|
Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." (#22250)
Reverts ray-project/ray#22126
Breaks rllib:tests/test_io
|
2022-02-09 09:26:36 -08:00 |
|
Sven Mika
|
ac3e6ab411
|
[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test. (#22126)
|
2022-02-08 19:04:13 +01:00 |
|
Sven Mika
|
ee41800c16
|
[RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02. (#21649)
|
2022-01-27 22:07:05 +01:00 |
|
Sven Mika
|
62dbf26394
|
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984)
|
2021-12-21 08:39:05 +01:00 |
|
Sven Mika
|
ed85f59194
|
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879)
|
2021-09-30 16:39:05 +02:00 |
|