gjoliver | 9385b6c1be | [RLlib] Make a few LRSchedule and EntropyCoeffSchedule tests more reliable. (#19934) | 2021-11-02 16:52:56 +01:00
roireshef | 9b0352f363 | [RLlib] Added LearningRateSchedule and EntropyCoeffSchedule to TF and Torch versions of A3C and PPO (#19276) | 2021-10-25 09:39:35 +02:00
Sven Mika | ed85f59194 | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 2021-09-30 16:39:05 +02:00
Sven Mika | f3bbe4ea44 | [RLlib] Test cases/BUILD cleanup; split "everything else" (longest running one rn) tests in 2. (#17640) | 2021-08-16 22:01:01 +02:00
Sven Mika | 1f00f834ac | [RLlib] Solve PyTorch/TF-eager A3C async race condition between calling model and its value function. (#13467) | 2021-01-18 10:29:03 -08:00
Sven Mika | 244aafdcf8 | [RLlib] Curiosity enhancements. (#10373) | 2020-09-05 13:14:24 +02:00
Sven Mika | 4da0e542d5 | [RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204) | 2020-07-08 16:12:20 +02:00
Sven Mika | 4ed796a7d6 | [RLlib] Add testing Policy.compute_single_action() for all agents. (#8903) | 2020-06-13 17:51:50 +02:00
Sven Mika | 2746fc0476 | [RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) | 2020-05-27 16:19:13 +02:00