hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	ff9c1dac88	[RLlib] Issue 9667 DDPG Torch bugs and enhancements. (#9680 )	2020-07-28 14:15:03 +02:00
Sven Mika	57544b1ff9	[RLlib] Examples folder restructuring (Model examples; final part). (#8278 ) - This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models. - It also makes sure, all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (this is often currently not checked) using a --as-test flag in connection with a --stop-reward.	2020-05-12 08:23:10 +02:00
Sven Mika	a00144f746	[RLlib] Fix issue 8135 (DDPG inf actions when using [-inf,inf] action space). (#8302 )	2020-05-04 22:27:30 +02:00
Sven Mika	7ec2223c84	[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188 ) Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).	2020-04-26 23:08:13 +02:00
Sven Mika	d0fab84e4d	[RLlib] DDPG PyTorch version. (#7953 ) The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib. This PR: - Depends on the re-factor PR for DDPG (Functional Algorithm API). - Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch) - Updates the documentation to reflect that DDPG and TD3 now support PyTorch. * Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20% than tf). * Fix GPU target model problem.	2020-04-16 10:20:01 +02:00

5 commits