Balaji Veeramani
31ed9e5d02
[CI] Replace YAPF disables with Black disables ( #21982 )
2022-02-08 16:29:25 -08:00
Sven Mika
ac3e6ab411
[RLlib] Speedup A3C up to 3x (new training_iteration
function instead of execution_plan
) and re-instate Pong learning test. ( #22126 )
2022-02-08 19:04:13 +01:00
Sven Mika
8b678ddd68
[RLlib] Issue 22036: Client should handle concurrent episodes with one being training_enabled=False
. ( #22076 )
2022-02-06 12:35:03 +01:00
Sven Mika
f6617506a2
[RLlib] Add on_sub_environment_created
to DefaultCallbacks class. ( #21893 )
2022-02-04 22:22:47 +01:00
Rodrigo de Lazcano
a258f9c692
[RLlib] Neural-MMO keep_per_episode_custom_metrics
patch (toward making Neuro-MMO RLlib's default massive-multi-agent learning test environment). ( #22042 )
2022-02-02 17:28:42 +01:00
Jun Gong
87fe033f7b
[RLlib] Request CPU resources in Trainer.default_resource_request()
if using dataset input. ( #21948 )
2022-02-02 10:20:37 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train
more flexible via callable option. ( #20735 )
2022-01-27 12:17:34 +01:00
Jun Gong
099c170ab4
[RLlib] Dataset Reader/Writer for RLlib ( #21808 )
2022-01-26 16:00:46 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 ( #21652 )
2022-01-25 14:16:58 +01:00
Avnish Narayan
12b087acb8
[RLlib] Base env pre-checker. ( #21569 )
2022-01-18 16:34:06 +01:00
Sven Mika
90c6b10498
[RLlib] Decentralized multi-agent learning; PR #01 ( #21421 )
2022-01-13 10:52:55 +01:00
Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. ( #21420 )
2022-01-10 11:22:55 +01:00
Sven Mika
b10d5533be
[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. ( #21452 )
2022-01-10 11:19:40 +01:00
Sven Mika
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). ( #20984 )
2021-12-21 08:39:05 +01:00
Alexis DUBURCQ
6c3e63bc9c
[RLlib] Fix view requirements. ( #21043 )
2021-12-15 11:59:04 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Tomasz Wrona
39c202fa66
[RLlib] Allow extra keys in info in multi-agent ( #20793 )
2021-12-09 14:44:33 +01:00
Avnish Narayan
b8c64480d8
[RLlib] Change return type of try_reset to MultiEnvDict ( #20868 )
2021-12-06 14:15:33 +01:00
Sven Mika
60b2219d72
[RLlib] Allow for evaluation to run by timesteps
(alternative to episodes
) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. ( #20757 )
2021-12-04 13:26:33 +01:00
Amog Kamsetty
611bfc1352
[ML] Move find_free_port
to ml_utils
( #20828 )
...
Small refactoring of common utility used by Train, Tune, and Rllib.
2021-12-03 13:38:42 -08:00
Avnish Narayan
74dd0e4085
[RLlib] Make to_base_env()
a method of all RLlib-supported Env classes ( #20811 )
2021-12-01 09:01:02 +01:00
Avnish Narayan
3ddc09544d
[rllib] Env to base env refactor ( #20785 )
2021-11-30 17:02:10 -08:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy
(policy template) for torch only; PPOTorchPolicy ( #20061 ) ( #20399 )" ( #20417 )
...
This reverts commit 90dc5460d4
.
2021-11-16 14:49:41 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy
(policy template) for torch only; PPOTorchPolicy ( #20061 )" ( #20399 )
...
This reverts commit 5b1c8e46e1
.
2021-11-15 16:11:35 -08:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy
(policy template) for torch only; PPOTorchPolicy ( #20061 )
2021-11-15 10:41:54 +01:00
Sven Mika
70fe25055a
[RLlib] Issue: Get single step input dict incorrect. ( #20217 )
2021-11-12 08:38:51 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as framework=tf2
. ( #19273 )
...
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment in favor. The same applies to frozen
lake v0 and frozen lake v1
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7
* fix tune test_sampler::testSampleBoundsAx
* fix re-install ray for py3.7 tests
Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
Sven Mika
ea2bea7e30
[RLlib; Docs overhaul] Docstring cleanup: Offline. ( #19808 )
2021-11-01 10:59:53 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. ( #19784 )
...
* wip.
* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.
* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
Antoine Galataud
edb338ff7c
[RLlib] Check training_enabled
on PolicyServer ( #19007 )
2021-10-12 16:21:02 +02:00
Sven Mika
d439fd7f17
[RLlib] TF2/eager memory leak fixes. ( #19198 )
2021-10-09 00:11:53 +02:00
Sven Mika
c3e3fc7637
[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. ( #19100 )
2021-10-07 23:57:53 +02:00
Jiajun Yao
7588bfd315
[Lint] Add flake8-bugbear ( #19053 )
...
* Add flake8-bugbear
* Add flake8-bugbear
2021-10-03 23:24:11 -07:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
mvindiola1
62f5da0b65
[RLlib] Add unit tests for updating episode data in base_env ( #17137 )
2021-09-24 16:08:11 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
a2a077b874
[RLlib] Faster remote worker space inference (don't infer if not required). ( #18805 )
2021-09-23 10:54:37 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. ( #18809 )
2021-09-23 08:31:51 +02:00
Sven Mika
fd13bac9b3
[RLlib] Add worker
arg (optional) to policy_mapping_fn
. ( #18184 )
2021-09-17 12:07:11 +02:00
Sven Mika
8a72824c63
[RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). ( #18591 )
2021-09-15 22:16:48 +02:00
Sven Mika
3f89f35e52
[RLlib] Better error messages and hints; + failure-mode tests; ( #18466 )
2021-09-10 16:52:47 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy()
( #18428 )
2021-09-09 07:10:06 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. ( #18358 )
2021-09-06 12:14:20 +02:00
Sven Mika
a772c775cd
[RLlib] Set random seed (if provided) to Trainer process as well. ( #18307 )
2021-09-04 11:02:30 +02:00