hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)	2022-06-10 17:09:18 +02:00
Sven Mika	f066180ed5	[RLlib] Deprecate `timesteps_per_iteration` config key (in favor of `min_[sample|train]_timesteps_per_reporting`. (#24372)	2022-05-02 12:51:14 +02:00
Jun Gong	d12977c4fb	[RLlib] TF2 Bandit Agent (#22838)	2022-03-21 16:55:55 +01:00
Jun Gong	a385c9b127	[RLlib] Update bandit_envs_recommender_system (#22421)	2022-02-24 22:43:41 +01:00
Sven Mika	6522935291	[RLlib] Slate-Q tf implementation and tests/benchmarks. (#22389)	2022-02-22 09:36:44 +01:00
Jun Gong	9c95b9a5fa	[RLlib] Add an env wrapper so RecSim works with our Bandits agent. (#22028)	2022-02-02 12:15:38 +01:00