ray/rllib/models
Sven Mika 165a86f1ab
[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)
Both the torch and tf versions of SAC show issues (crashes) due to numeric instabilities in the SquashedGaussian distribution (sampling and logp computation after extreme NN outputs).
This PR fixes these instabilities. Stable MuJoCo learning (HalfCheetah) has been confirmed on both the tf and torch versions. A distribution stability test (using extreme NN outputs) has been added for SquashedGaussian; it can be used for any other type of distribution as well.
2020-04-19 10:20:23 +02:00
Name | Last commit | Date
tests | [RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063) | 2020-04-19 10:20:23 +02:00
tf | [RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063) | 2020-04-19 10:20:23 +02:00
torch | [RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063) | 2020-04-19 10:20:23 +02:00
__init__.py | [rllib] Try moving RLlib to top level dir (#5324) | 2019-08-05 23:25:49 -07:00
action_dist.py | [RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) | 2020-02-19 12:18:45 -08:00
catalog.py | [RLlib] SAC Torch (incl. Atari learning) (#7984) | 2020-04-15 13:25:16 +02:00
extra_spaces.py | [rllib] Try moving RLlib to top level dir (#5324) | 2019-08-05 23:25:49 -07:00
model.py | Remove future imports (#6724) | 2020-01-09 00:15:48 -08:00
modelv2.py | [RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934) | 2020-04-09 14:04:21 -07:00
preprocessors.py | [RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021) | 2020-02-06 09:44:08 -08:00
README.txt | [rllib] Try moving RLlib to top level dir (#5324) | 2019-08-05 23:25:49 -07:00

Shared neural network models for RLlib.
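
The latest commit message for this directory mentions numeric instabilities in the SquashedGaussian distribution when the network produces extreme outputs. The sketch below illustrates that kind of failure and one common remedy. It is not the code from PR #8063: the function names, the clipping bounds MIN_LOG_STD and MAX_LOG_STD, and the choice of a softplus-based identity are all assumptions made for illustration. It uses only the Python standard library and works on scalars.

```python
import math

# Hypothetical clipping bounds for the log standard deviation (assumed values).
MIN_LOG_STD = -20.0
MAX_LOG_STD = 2.0


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gaussian_logp(u, mean, log_std):
    """Log-density of Normal(mean, exp(log_std)) at u."""
    z = (u - mean) / math.exp(log_std)
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)


def naive_squashed_logp(u, mean, log_std):
    """Log-density of a = tanh(u), computed directly.

    For large |u|, tanh(u) rounds to +/-1.0, so 1 - tanh(u)**2 becomes 0.0
    and math.log raises ValueError (array libraries would return -inf).
    """
    return gaussian_logp(u, mean, log_std) - math.log(1.0 - math.tanh(u) ** 2)


def stable_squashed_logp(u, mean, log_std):
    """Log-density of a = tanh(u) using a rearranged Jacobian term.

    Uses the identity
        log(1 - tanh(u)^2) = 2 * (log(2) - u - softplus(-2u)),
    which never evaluates log(0). The log-std is clipped first (assumed bounds).
    """
    log_std = min(max(log_std, MIN_LOG_STD), MAX_LOG_STD)
    log_det = 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))
    return gaussian_logp(u, mean, log_std) - log_det
```

For moderate inputs both functions agree; for an extreme pre-squash value such as u = 30.0 the naive version fails while the rearranged version stays finite. A stability test in the spirit of the one described in the commit message could feed such extreme values through a distribution and check that every result is finite.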