hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Sven Mika 36bda8432b [RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)	2020-10-01 16:57:10 +02:00
tests	ci: Redo `format.sh --all` script & backfill lint fixes (#9956)	2020-08-07 16:49:49 -07:00
__init__.py	[RLlib] Examples folder restructuring (models) part 1 (#8353)	2020-05-08 08:20:18 +02:00
pg.py	[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)	2020-10-01 16:57:10 +02:00
pg_tf_policy.py	[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420)	2020-09-02 14:03:01 +02:00
pg_torch_policy.py	[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)	2020-10-01 16:57:10 +02:00
README.md	[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115)	2020-08-20 17:05:57 +02:00
utils.py	[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115)	2020-08-20 17:05:57 +02:00

README.md

Policy Gradient (PG)

An implementation of a vanilla policy gradient algorithm for TensorFlow and PyTorch.
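
For orientation, here is a minimal, framework-free sketch of what a vanilla (REINFORCE-style) policy gradient objective typically computes. It is not the code in `pg_tf_policy.py` or `pg_torch_policy.py`. The function names `discounted_returns` and `pg_loss`, the `gamma` default, and the use of plain Python lists instead of tensors are all assumptions made for illustration.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Discounted reward-to-go for each timestep of a single episode.

    Hypothetical helper for illustration; not part of RLlib.
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def pg_loss(log_probs, returns):
    """Surrogate loss: -mean(log pi(a_t | s_t) * R_t).

    Minimizing this value with gradient descent follows the vanilla policy
    gradient. Hypothetical helper for illustration; not part of RLlib.
    """
    if len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must have the same length")
    if not returns:
        raise ValueError("need at least one timestep")
    weighted = [lp * ret for lp, ret in zip(log_probs, returns)]
    return -sum(weighted) / len(weighted)


# Toy episode: three steps, reward 1 each, every action taken with prob 0.5.
rewards = [1.0, 1.0, 1.0]
log_probs = [math.log(0.5)] * 3
returns = discounted_returns(rewards, gamma=0.5)
loss = pg_loss(log_probs, returns)
```

In a real TensorFlow or PyTorch policy, `log_probs` would be differentiable tensors produced by the model's action distribution, and the loss would be averaged over a batch of timesteps rather than one episode.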

Detailed Documentation

Implementation