mirror of https://github.com/vale981/ray synced 2025-03-12 14:16:39 -04:00

Kai Fricke 3e053c85ee

[RLlib] Fix broken links from agent -> algo conversion. (#25014)

2022-05-20 11:37:11 +02:00

Policy Gradient (PG)

An implementation of a vanilla policy gradient algorithm for TensorFlow and PyTorch.
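For orientation, here is a minimal sketch of the quantity a vanilla policy gradient method optimizes. It is not RLlib's code: the helper names `discounted_returns` and `pg_loss` are hypothetical, and it uses plain Python lists in place of TensorFlow or PyTorch tensors. It assumes the common formulation in which each action's log-probability is weighted by the discounted reward-to-go of its episode.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return the discounted reward-to-go G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so that G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def pg_loss(log_probs, returns):
    """Surrogate loss: minimizing it follows the policy gradient estimate."""
    assert len(log_probs) == len(returns)
    weighted = sum(lp * g for lp, g in zip(log_probs, returns))
    return -weighted / len(returns)


# Toy episode (illustrative numbers only): three steps, with the
# probabilities the policy assigned to the actions it actually took.
rewards = [1.0, 0.0, 2.0]
log_probs = [math.log(0.5), math.log(0.25), math.log(0.5)]

returns = discounted_returns(rewards, gamma=0.5)
loss = pg_loss(log_probs, returns)
print(returns, loss)
```

In a real implementation the log-probabilities come from a differentiable policy network, so taking the gradient of this loss with respect to the network parameters yields the update direction.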

Detailed Documentation

Implementation