ray/rllib/models at 793e616a2de32866ace7201296b3ccab077bd1bb - hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Sven Mika 165a86f1ab [RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)	2020-04-19 10:20:23 +02:00

SAC (both the torch and tf versions) crashes because of numeric instabilities in the SquashedGaussian distribution (sampling and logp after extreme NN outputs). This PR fixes these. Stable MuJoCo learning (HalfCheetah) has been confirmed on both the tf and torch versions. A distribution stability test (using extreme NN outputs) has been added for SquashedGaussian; it can be used for any other type of distribution as well.
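
The kind of instability described in the commit message above can be illustrated with a minimal sketch of a tanh-squashed Gaussian. This is not the code from the PR: the function names, the clip bounds LOG_STD_MIN/LOG_STD_MAX, and the EPS margin are assumptions chosen for illustration, and the sketch uses only the Python standard library on scalar actions. It shows two common guards: clipping the log standard deviation before exponentiating, and keeping the un-squashed action away from exactly -1 and +1 so that atanh and the log-determinant stay finite.

```python
import math
import random

# Hypothetical bounds, chosen for illustration only.
LOG_STD_MIN, LOG_STD_MAX = -20.0, 2.0
EPS = 1e-6


def _clip(x, lo, hi):
    return min(max(x, lo), hi)


def squashed_gaussian_sample(mean, log_std, rng, low=-1.0, high=1.0):
    """Draw u ~ N(mean, std), squash with tanh, rescale to [low, high]."""
    std = math.exp(_clip(log_std, LOG_STD_MIN, LOG_STD_MAX))
    u = rng.gauss(mean, std)
    return low + (math.tanh(u) + 1.0) / 2.0 * (high - low)


def squashed_gaussian_logp(action, mean, log_std, low=-1.0, high=1.0):
    """Log-density of an action in [low, high] under the squashed Gaussian."""
    log_std = _clip(log_std, LOG_STD_MIN, LOG_STD_MAX)
    std = math.exp(log_std)
    # Map the action back to (-1, 1) and keep it off the boundary,
    # where atanh would fail and log(1 - x^2) would be -inf.
    squashed = 2.0 * (action - low) / (high - low) - 1.0
    squashed = _clip(squashed, -1.0 + EPS, 1.0 - EPS)
    u = math.atanh(squashed)
    gauss_logp = (-0.5 * ((u - mean) / std) ** 2
                  - log_std - 0.5 * math.log(2.0 * math.pi))
    # Change of variables: da/du = (1 - tanh(u)^2) * (high - low) / 2.
    log_det = math.log(1.0 - squashed ** 2) + math.log((high - low) / 2.0)
    return gauss_logp - log_det
```

A stability check in the spirit of the test mentioned in the commit message would feed extreme "network outputs" (for example a mean of 1e6 and a log standard deviation of 100) into both functions and assert that the sampled action stays within bounds and that the log-probability is finite.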
tests	[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)	2020-04-19 10:20:23 +02:00
tf	[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)	2020-04-19 10:20:23 +02:00
torch	[RLlib] SAC MuJoCo instability issues (tf and torch versions). (#8063)	2020-04-19 10:20:23 +02:00
__init__.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
action_dist.py	[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)	2020-02-19 12:18:45 -08:00
catalog.py	[RLlib] SAC Torch (incl. Atari learning) (#7984)	2020-04-15 13:25:16 +02:00
extra_spaces.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
model.py	Remove future imports (#6724)	2020-01-09 00:15:48 -08:00
modelv2.py	[RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934)	2020-04-09 14:04:21 -07:00
preprocessors.py	[RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021)	2020-02-06 09:44:08 -08:00
README.txt	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00

README.txt

Shared neural network models for RLlib.