Mirror of https://github.com/vale981/ray (synced 2025-03-06 02:21:39 -05:00)
Latest commit:

* Add base for Soft Actor-Critic
* Pick changes from old SAC branch
* Update sac.py
* First implementation of sac model
* Remove unnecessary SAC imports
* Prune unnecessary noise and exploration code
* Implement SAC model and use that in SAC policy
* runs but doesn't learn
* clear state
* fix batch size
* Add missing alpha grads and vars
* -200 by 2k timesteps
* doc
* lazy squash
* one file
* ignore tfp
* revert done

||
---|---|---|
base-deps | ||
deploy | ||
examples | ||
stress_test | ||
tune_test |