Jun Gong | dea134a472 | [RLlib] Clean up Policy mixins. (#24746) | 2022-05-17 17:16:08 +02:00
Sven Mika | f891a2b6f1 | [RLlib] SlateQ + tf; release test fixes, related to TD-error not properly being formatted. (#24521) | 2022-05-06 08:50:30 +02:00
Sven Mika | 539832f2c5 | [RLlib] SlateQ training iteration function. (#24151) | 2022-04-29 18:38:17 +02:00
Sven Mika | 7b687e6cd8 | [RLlib] SlateQ: Add a hard-task learning test to weekly regression suite. (#22544) | 2022-02-25 21:58:16 +01:00
Sven Mika | 8e00537b65 | [RLlib] SlateQ: framework=tf fixes and SlateQ documentation update (#22543) | 2022-02-23 13:03:45 +01:00
Sven Mika | 6522935291 | [RLlib] Slate-Q tf implementation and tests/benchmarks. (#22389) | 2022-02-22 09:36:44 +01:00