kourosh hakhamaneshi | 3815e52a61 | [RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896) | 2022-05-19 18:30:42 +02:00
Sven Mika | 628ee4b5f0 | [RLlib] Bandit tf2 fix (+ add tf2 to test cases). (#24908) | 2022-05-18 18:58:42 +02:00
Steven Morad | ebe6ab0afc | [RLlib] Bandits use TrainerConfig objects. (#24687) | 2022-05-12 22:02:15 +02:00
Sven Mika | 0c5ac3b9e8 | [RLlib] Issue 24075: Better error message for Bandit MultiDiscrete (suggest using our wrapper). (#24385) | 2022-05-02 21:14:08 +02:00
Sven Mika | f066180ed5 | [RLlib] Deprecate timesteps_per_iteration config key (in favor of min_[sample|train]_timesteps_per_reporting . (#24372) | 2022-05-02 12:51:14 +02:00
kourosh hakhamaneshi | c38a29573f | [RLlib] Removed deprecated code with error=True (#23916) | 2022-04-15 13:51:12 +02:00
Eric Liang | 1ff874e8e8 | [spelling] Add linter rule for mis-capitalizations of RLLib -> RLlib (#23817) | 2022-04-10 16:12:53 -07:00
Avnish Narayan | 5134e0dc12 | [RLlib] Change type to tensortype for cql policies. (#23438) | 2022-03-24 12:32:29 +01:00
Jun Gong | d12977c4fb | [RLlib] TF2 Bandit Agent (#22838) | 2022-03-21 16:55:55 +01:00
Sven Mika | b1cda46681 | [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) | 2022-03-18 13:45:16 +01:00
Sven Mika | 3fe6f3b3eb | [RLlib] 2 bug fixes: Bandit registration not working if torch not installed. Env checker for MA envs. (#22821) | 2022-03-04 19:16:30 +01:00
Sven Mika | c58cd90619 | [RLlib] Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1). (#22465) | 2022-02-17 22:32:26 +01:00
Balaji Veeramani | 31ed9e5d02 | [CI] Replace YAPF disables with Black disables (#21982) | 2022-02-08 16:29:25 -08:00
Jun Gong | a55258eb9c | [RLlib] Move bandit example scripts into examples folder. (#21949) | 2022-02-02 09:20:47 +01:00
Balaji Veeramani | 7f1bacc7dc | [CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes. | 2022-01-29 18:41:57 -08:00
Sven Mika | 7fc1683bab | [RLlib] Some more bandit cleanup/tests. (#21932) | 2022-01-28 12:03:26 +01:00
Sven Mika | 893536ebd9 | [RLlib] Move bandits into main agents folder; Make RecSim adapter more accessible; (#21773) | 2022-01-27 13:58:12 +01:00