ray/rllib/models/tf at 793e616a2de32866ace7201296b3ccab077bd1bb - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Sven Mika 165a86f1ab [RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063) SAC (both torch and tf versions) are showing issues (crashes) due to numeric instabilities in the SquashedGaussian distribution (sampling + logp after extreme NN outputs). This PR fixes these. Stable MuJoCo learning (HalfCheetah) has been confirmed on both tf and torch versions. A Distribution stability test (using extreme NN outputs) has been added for SquashedGaussian (can be used for any other type of distribution as well).	2020-04-19 10:20:23 +02:00
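
The commit message above describes crashes from numeric instabilities in a squashed (tanh) Gaussian when the network emits extreme outputs. The following is a minimal sketch of one common way to keep the log-probability finite in that situation. It is not the code from PR #8063. The function names, the clipping bounds, and the scalar pure-Python form are all assumptions made for illustration.

```python
import math

# Assumed clipping bounds for the log standard deviation (illustrative only,
# not taken from the PR).
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0


def softplus(x):
    """Numerically stable log(1 + exp(x)); exp() only sees non-positive inputs."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def squashed_gaussian_logp(u, mean, log_std):
    """Log-density of a = tanh(u), where u ~ Normal(mean, exp(log_std)).

    `u` is the pre-squash sample. Working with u instead of the squashed
    action avoids atanh(+-1) = +-inf when tanh saturates.
    """
    # Clip extreme network outputs so that std stays in a representable range.
    log_std = min(max(log_std, LOG_STD_MIN), LOG_STD_MAX)
    std = math.exp(log_std)
    z = (u - mean) / std
    gauss_logp = -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)
    # Change-of-variables term. Uses the identity
    #   log(1 - tanh(u)^2) = 2 * (log 2 - u - softplus(-2u)),
    # which stays finite where 1 - tanh(u)^2 would underflow to 0.
    log_det = 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    return gauss_logp - log_det
```

For moderate inputs this agrees with the naive formula `gauss_logp - log(1 - tanh(u)**2)`. For very large `|u|` the naive form evaluates `log(0)` and fails, while the version above returns a finite value.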
__init__.py	[RLlib] Working/learning example: PPO + torch + LSTM. (#7797)	2020-03-31 22:00:28 -07:00
fcnet_v1.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00
fcnet_v2.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00
lstm_v1.py	Remove future imports (#6724)	2020-01-09 00:15:48 -08:00
misc.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00
modelv1_compat.py	[RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890)	2020-01-24 10:29:35 -08:00
recurrent_tf_modelv2.py	[RLlib] Working/learning example: PPO + torch + LSTM. (#7797)	2020-03-31 22:00:28 -07:00
tf_action_dist.py	[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)	2020-04-19 10:20:23 +02:00
tf_modelv2.py	[RLlib] DQN torch version. (#7597)	2020-04-06 11:56:16 -07:00
visionnet_v1.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00
visionnet_v2.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00