1499af945b | Sven Mika | 2022-06-20 19:53:47 +02:00
    [RLlib] Algorithm step() fixes: evaluation should NOT be part of timed training_step loop. (#25924)

7b8b0f8e03 | Yi Cheng | 2022-06-14 13:59:15 -07:00
    Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
    This reverts commit 804719876b.

804719876b | Avnish Narayan | 2022-06-14 10:57:27 +02:00
    [RLlib] Remove execution plan code no longer used by RLlib. (#25624)

eaed256d68 | Avnish Narayan | 2022-05-25 17:54:08 +02:00
    [RLlib] Async parallel execution manager. (#24423)

477b9d22d2 | Avnish Narayan | 2022-04-20 17:56:18 +02:00
    [RLlib][Training iteration fn] APEX conversion (#22937)

00922817b6 | Steven Morad | 2022-04-11 08:39:10 +02:00
    [RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673)

60054995e6 | Max Pumperla | 2022-03-24 17:04:02 -07:00
    [docs] fix doctests and activate CI (#23418)

04a5c72ea3 | Sven Mika | 2022-02-10 13:44:22 +01:00
    Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708)

b122f093c1 | Alex Wu | 2022-02-09 09:26:36 -08:00
    Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." (#22250)
    Reverts ray-project/ray#22126
    Breaks rllib:tests/test_io

ac3e6ab411 | Sven Mika | 2022-02-08 19:04:13 +01:00
    [RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test. (#22126)

7f1bacc7dc | Balaji Veeramani | 2022-01-29 18:41:57 -08:00
    [CI] Format Python code with Black (#21975)
    See #21316 and #21311 for the motivation behind these changes.

ee41800c16 | Sven Mika | 2022-01-27 22:07:05 +01:00
    [RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02. (#21649)