ray/rllib/agents at 0c9e5db9cbc2c47cdf3c480cafcdf32425c378ca - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Sven Mika	0c9e5db9cb	Fix SAC bug (twin Q not used for min'ing over both Q-nets in loss func). (#7354)	2020-02-27 12:49:08 -08:00
a3c	[RLlib] PPO torch memory leak and unnecessary torch.Tensor creation and gc'ing. (#7238)	2020-02-22 11:02:31 -08:00
ars	[RLlib] Add `torch` flag to train.py (#6807)	2020-01-17 18:48:44 -08:00
ddpg	[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107) (#7124)	2020-02-22 14:19:49 -08:00
dqn	[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107) (#7124)	2020-02-22 14:19:49 -08:00
es	[RLlib] Add `torch` flag to train.py (#6807)	2020-01-17 18:48:44 -08:00
impala	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)	2020-02-19 12:18:45 -08:00
marwil	[rllib] implemented compute_advantages without gae (#6941)	2020-01-31 22:25:45 -08:00
pg	[rllib] [experimental] custom RL training pipelines (PG_pl, A2C_pl) (#7213)	2020-02-19 16:07:37 -08:00
ppo	[RLlib] PPO torch memory leak and unnecessary torch.Tensor creation and gc'ing. (#7238)	2020-02-22 11:02:31 -08:00
qmix	[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107) (#7124)	2020-02-22 14:19:49 -08:00
sac	Fix SAC bug (twin Q not used for min'ing over both Q-nets in loss func). (#7354)	2020-02-27 12:49:08 -08:00
__init__.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
agent.py	Remove future imports (#6724)	2020-01-09 00:15:48 -08:00
mock.py	Remove future imports (#6724)	2020-01-09 00:15:48 -08:00
registry.py	[rllib] [experimental] custom RL training pipelines (PG_pl, A2C_pl) (#7213)	2020-02-19 16:07:37 -08:00
trainer.py	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)	2020-02-19 12:18:45 -08:00
trainer_template.py	[rllib] [experimental] custom RL training pipelines (PG_pl, A2C_pl) (#7213)	2020-02-19 16:07:37 -08:00