hiro/ray

Mirror of https://github.com/vale981/ray, synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Artur Niederfahrenhorst	a3f1323457	[RLlib] Make QMix use the ReplayBufferAPI (#25560)	2022-06-23 22:55:22 -07:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)	2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst	c3645928ca	[RLlib] Fix no gradient clipping happening in QMix. (#25656)	2022-06-10 13:51:26 +02:00
Artur Niederfahrenhorst	9226643433	[RLlib] Issue 4965: Fixes PyTorch grad clipping logic and adds grad clipping to QMIX. (#25584)	2022-06-08 19:40:57 +02:00
Sven Mika	b5bc2b93c3	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
Steven Morad	501d932449	[RLlib] SAC, RNNSAC, and CQL TrainerConfig objects (#25059)	2022-05-22 19:58:47 +02:00
kourosh hakhamaneshi	3815e52a61	[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)	2022-05-19 18:30:42 +02:00