Steven Morad
00922817b6
[RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. ( #23673 )
2022-04-11 08:39:10 +02:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. ( #23739 )
2022-04-08 15:33:28 +02:00
Sven Mika
0af100ffae
[RLlib] Fix tree.flatten dict ordering bug: flatten_space([obs_space]) should produce same struct as tree.flatten([obs]). ( #22731 )
2022-03-01 21:24:24 +01:00
Avnish Narayan
740def0a13
[RLlib] Put env-checker on critical path. ( #22191 )
2022-02-17 14:06:14 +01:00
Sven Mika
04a5c72ea3
Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" ( #18708 )
2022-02-10 13:44:22 +01:00
Alex Wu
b122f093c1
Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." ( #22250 )
...
Reverts ray-project/ray#22126
Breaks rllib:tests/test_io
2022-02-09 09:26:36 -08:00
Sven Mika
ac3e6ab411
[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test. ( #22126 )
2022-02-08 19:04:13 +01:00
Jun Gong
87fe033f7b
[RLlib] Request CPU resources in Trainer.default_resource_request() if using dataset input. ( #21948 )
2022-02-02 10:20:37 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train more flexible via callable option. ( #20735 )
2022-01-27 12:17:34 +01:00
Jun Gong
099c170ab4
[RLlib] Dataset Reader/Writer for RLlib ( #21808 )
2022-01-26 16:00:46 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 ( #21652 )
2022-01-25 14:16:58 +01:00
Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. ( #21420 )
2022-01-10 11:22:55 +01:00
Sven Mika
b10d5533be
[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. ( #21452 )
2022-01-10 11:19:40 +01:00
Sven Mika
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). ( #20984 )
2021-12-21 08:39:05 +01:00
Sven Mika
60b2219d72
[RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. ( #20757 )
2021-12-04 13:26:33 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. ( #19784 )
...
* wip.
* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.
* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
Sven Mika
a2a077b874
[RLlib] Faster remote worker space inference (don't infer if not required). ( #18805 )
2021-09-23 10:54:37 +02:00
Sven Mika
9a8ca6a69d
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. ( #18306 )
2021-09-03 13:29:57 +02:00
Sven Mika
7bc4376466
[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). ( #17077 )
2021-07-22 10:59:13 -04:00
Sven Mika
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). ( #17046 )
2021-07-15 05:51:24 -04:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )" ( #17036 )
...
This reverts commit e4123fff27.
2021-07-13 09:57:15 -07:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). ( #16565 )
2021-07-13 06:38:14 -04:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets ( #16957 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Julius Frost
ada0552f16
[rllib] d4rl: fix for paths with multiple periods ( #16721 )
2021-07-01 18:35:50 -07:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
c95dea51e9
[RLlib] External env enhancements + more examples. ( #16583 )
2021-06-23 09:09:01 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. ( #16569 )
2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )" ( #16543 )
...
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )
2021-06-18 10:31:30 +02:00
Sven Mika
d89fb82bfb
[RLlib] Add simple curriculum learning API and example script. ( #15740 )
2021-05-16 17:35:10 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
...
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00
Sven Mika
f859ebb99f
[RLlib] Fix env rendering and recording options (for non-local mode; >0 workers; +evaluation-workers). ( #14796 )
2021-03-23 10:06:06 +01:00
Eric Liang
9db000ff2c
Auto report object store memory usage; remove some deprecated code ( #14260 )
2021-03-01 13:19:44 -08:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark ( #13550 )
2021-01-21 16:43:55 +01:00
Sven Mika
abb1eefdc2
[RLlib] Issue 12483: Discrete observation space error: "ValueError: ('Observation ({}) outside given space ..." when doing Trainer.compute_action. ( #12787 )
2020-12-11 22:43:30 +01:00
Sven Mika
ea25482f6a
WIP. ( #12706 )
2020-12-09 11:49:21 -08:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3 . ( #12450 )
2020-12-07 13:08:17 +01:00
mvindiola1
9e68b77796
[RLLIB] Wait for remote_workers to finish closing environments before terminating ( #11476 )
2020-10-28 14:23:06 -07:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. ( #11307 )
2020-10-15 18:21:30 +02:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). ( #11033 )
2020-10-06 20:28:16 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. ( #10825 )
2020-09-20 11:27:02 +02:00
Alex Wu
a699f6a4d8
[Core] Fix override memory and object_store_memory in decorator ( #10563 )
2020-09-06 20:56:48 -07:00
Sven Mika
715ee8dfc9
[RLlib] Issue 10469: Callbacks should receive env idx ... ( #10477 )
2020-09-03 17:27:05 +02:00
raoul-khour-ts
c8c4832794
Prevent Local Worker creation from blocking remote worker creation by creating remote workers before local worker ( #10245 )
...
* create remote workers before local worker
* reformatted
2020-08-24 12:29:55 -07:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. ( #10115 )
2020-08-20 17:05:57 +02:00