Balaji Veeramani | 7f1bacc7dc | [CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes. | 2022-01-29 18:41:57 -08:00
Jun Gong | 8ebc50f844 | [RLlib] Issue 21334: Fix APPO when kl_loss is enabled. (#21855) | 2022-01-27 20:08:58 +01:00
Sven Mika | 371fbb17e4 | [RLlib] Make policies_to_train more flexible via callable option. (#20735) | 2022-01-27 12:17:34 +01:00
Artur Niederfahrenhorst | d07e50e957 | [RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) (#20552) | 2021-11-19 11:57:37 +01:00
Sven Mika | c3e3fc7637 | [RLlib] Issue 18280: A3C/IMPALA multi-agent not working. (#19100) | 2021-10-07 23:57:53 +02:00
Sven Mika | ed85f59194 | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 2021-09-30 16:39:05 +02:00
Sven Mika | 3803e796ff | [RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540) | 2021-09-13 19:27:53 +02:00
Sven Mika | 5a313ba3d6 | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 2021-07-20 14:58:13 -04:00