Sven Mika | f066180ed5 | [RLlib] Deprecate timesteps_per_iteration config key (in favor of min_[sample|train]_timesteps_per_reporting . (#24372) | 2022-05-02 12:51:14 +02:00
Jun Gong | d12977c4fb | [RLlib] TF2 Bandit Agent (#22838) | 2022-03-21 16:55:55 +01:00
Jun Gong | a385c9b127 | [RLlib] Update bandit_envs_recommender_system (#22421) | 2022-02-24 22:43:41 +01:00
Sven Mika | 6522935291 | [RLlib] Slate-Q tf implementation and tests/benchmarks. (#22389) | 2022-02-22 09:36:44 +01:00
Sven Mika | c58cd90619 | [RLlib] Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1). (#22465) | 2022-02-17 22:32:26 +01:00
Jun Gong | 9c95b9a5fa | [RLlib] Add an env wrapper so RecSim works with our Bandits agent. (#22028) | 2022-02-02 12:15:38 +01:00
Jun Gong | a55258eb9c | [RLlib] Move bandit example scripts into examples folder. (#21949) | 2022-02-02 09:20:47 +01:00