Sven Mika | 59a967a3a0 | [RLlib] Cleanup some deprecated metric keys and classes. (#26036) | 2022-06-23 21:30:01 +02:00
Eric Liang | 905258dbc1 | Clean up docstyle in python modules and add LINT rule (#25272) | 2022-06-01 11:27:54 -07:00
Artur Niederfahrenhorst | fb2915d26a | [RLlib] Replay Buffer API and Ape-X. (#24506) | 2022-05-17 13:43:49 +02:00
Sven Mika | 25001f6d8d | [RLlib] APPO Training iteration fn. (#24545) | 2022-05-17 10:31:07 +02:00
Sven Mika | 924adcf402 | [RLlib] Issue 24074: multi-GPU learner thread key error in MA-scenarios. (#24382) | 2022-05-02 18:30:46 +02:00
Balaji Veeramani | 7f1bacc7dc | [CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes. | 2022-01-29 18:41:57 -08:00
Jun Gong | 8ebc50f844 | [RLlib] Issue 21334: Fix APPO when kl_loss is enabled. (#21855) | 2022-01-27 20:08:58 +01:00
Sven Mika | 371fbb17e4 | [RLlib] Make policies_to_train more flexible via callable option. (#20735) | 2022-01-27 12:17:34 +01:00
Artur Niederfahrenhorst | d07e50e957 | [RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) (#20552) | 2021-11-19 11:57:37 +01:00
Sven Mika | c3e3fc7637 | [RLlib] Issue 18280: A3C/IMPALA multi-agent not working. (#19100) | 2021-10-07 23:57:53 +02:00
Sven Mika | ed85f59194 | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 2021-09-30 16:39:05 +02:00
Sven Mika | 3803e796ff | [RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540) | 2021-09-13 19:27:53 +02:00
Sven Mika | 5a313ba3d6 | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 2021-07-20 14:58:13 -04:00