mirror of
https://github.com/vale981/ray
synced 2025-03-06 10:31:39 -05:00
![]() Fix CQL getting stuck when deprecated timesteps_per_iteration is used (use min_train_timesteps_per_reporting instead). CQL does not perform sampling timesteps and the deprecated timesteps_per_iteration is automatically translated into the new min_sample_timesteps_per_reporting, but should be translated (only for CQL and other purely offline RL algos) into min_train_timesteps_per_reporting. If timesteps_per_iteration, CQL lever leaves the first iteration as it thinks it's not done yet (sample timesteps always remain at 0). |
||
---|---|---|
.. | ||
halfcheetah-bc.yaml | ||
halfcheetah-cql.yaml | ||
hopper-bc.yaml | ||
hopper-cql.yaml | ||
pendulum-cql.yaml |