hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Sven Mika ec89fe5203 [RLlib] APEX-DQN and R2D2 config objects. (#25067)		2022-05-23 12:15:45 +02:00
tests	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00
__init__.py	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00
default_config.py	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00
pg.py	[RLlib] APEX-DQN and R2D2 config objects. (#25067)	2022-05-23 12:15:45 +02:00
pg_tf_policy.py	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00
pg_torch_policy.py	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00
README.md	[RLlib] Fix broken links from agent -> algo conversion. (#25014)	2022-05-20 11:37:11 +02:00
utils.py	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00

README.md

Policy Gradient (PG)

An implementation of a vanilla policy gradient algorithm for TensorFlow and PyTorch.
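
As a rough illustration of what "vanilla policy gradient" usually means, the sketch below computes a REINFORCE-style surrogate loss for a single episode in plain Python. It is not taken from `pg.py` or the policy files in this directory. The function names `discounted_returns` and `pg_loss` and the `gamma` parameter are hypothetical, chosen only for this example, and the choice of discounted reward-to-go as the weighting term is an assumption.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return the discounted reward-to-go G_t for each step of one episode.

    Hypothetical helper for illustration only; not part of RLlib.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each G_t reuses G_{t+1}.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def pg_loss(log_probs, returns):
    """Surrogate loss -mean(log pi(a_t | s_t) * G_t) over one episode.

    Minimizing this value with respect to the policy parameters follows
    the policy gradient. Here the log-probabilities are plain floats, so
    the function only evaluates the loss; it does not differentiate it.
    """
    if len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must have the same length")
    weighted = sum(lp * g for lp, g in zip(log_probs, returns))
    return -weighted / len(returns)


# Usage: a three-step episode in which every taken action had probability 0.5.
rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
log_probs = [math.log(0.5)] * len(rewards)
loss = pg_loss(log_probs, returns)
```

In a real TensorFlow or PyTorch policy the log-probabilities would be tensors produced by the model, so that the framework can backpropagate through the same expression.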

Detailed Documentation

Implementation