1499af945b | Sven Mika | 2022-06-20 19:53:47 +02:00
    [RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924)

7b8b0f8e03 | Yi Cheng | 2022-06-14 13:59:15 -07:00
    Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
    This reverts commit 804719876b.

804719876b | Avnish Narayan | 2022-06-14 10:57:27 +02:00
    [RLlib] Remove execution plan code no longer used by RLlib. (#25624)

eaed256d68 | Avnish Narayan | 2022-05-25 17:54:08 +02:00
    [RLlib] Async parallel execution manager. (#24423)

477b9d22d2 | Avnish Narayan | 2022-04-20 17:56:18 +02:00
    [RLlib][Training iteration fn] APEX conversion (#22937)

00922817b6 | Steven Morad | 2022-04-11 08:39:10 +02:00
    [RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673)

60054995e6 | Max Pumperla | 2022-03-24 17:04:02 -07:00
    [docs] fix doctests and activate CI (#23418)

04a5c72ea3 | Sven Mika | 2022-02-10 13:44:22 +01:00
    Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708)

b122f093c1 | Alex Wu | 2022-02-09 09:26:36 -08:00
    Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." (#22250)
    Reverts ray-project/ray#22126
    Breaks rllib:tests/test_io

ac3e6ab411 | Sven Mika | 2022-02-08 19:04:13 +01:00
    [RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test. (#22126)

7f1bacc7dc | Balaji Veeramani | 2022-01-29 18:41:57 -08:00
    [CI] Format Python code with Black (#21975)
    See #21316 and #21311 for the motivation behind these changes.

ee41800c16 | Sven Mika | 2022-01-27 22:07:05 +01:00
    [RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02. (#21649)