Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )" ( #25776 )
...
This reverts commit 804719876b
.
2022-06-14 13:59:15 -07:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. ( #25624 )
2022-06-14 10:57:27 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer
to Algorithm
renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. ( #25056 )
2022-06-07 12:52:19 +02:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. ( #25288 )
2022-06-06 13:07:47 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). ( #24967 )
2022-05-28 10:50:03 +02:00
Sven Mika
163fa81976
[RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. ( #24649 )
2022-05-27 14:56:24 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation
to off_policy_estimation_methods
. ( #25107 )
2022-05-27 13:14:54 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. ( #25117 )
2022-05-25 14:38:03 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 ( #24171 )
2022-05-23 08:18:44 +02:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. ( #24853 )
2022-05-21 12:50:20 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits ( #24896 )
2022-05-19 18:30:42 +02:00
Sven Mika
7cca7782f1
[RLlib] OPE (off policy estimator) API. ( #24384 )
2022-05-02 21:15:50 +02:00
Jun Gong
ec636dcb29
[RLlib] Do not print warning message during env pre-checking, if there is nothing wrong with user envs. ( #24289 )
2022-04-29 10:41:19 +02:00
Pavel C
de0c6f6132
[RLlib] Fix policy_map
always loading all policies from disk due to (not always needed) global_vars
update. ( #22010 )
2022-04-29 10:38:05 +02:00
Noon van der Silk
3589c21924
[RLlib] Fix some missing f-strings and a f-string related bug in tf eager policy. ( #24148 )
2022-04-25 11:25:28 +02:00
Avnish Narayan
3bf907bcf8
[RLlib] Don't modify environments via the env checker utilities. ( #24083 )
2022-04-22 18:39:47 +02:00
Sven Mika
92781c603e
[RLlib] A2C training_iteration
method implementation (_disable_execution_plan_api=True
) ( #23735 )
2022-04-15 18:36:13 +02:00
Sven Mika
a8494742a3
[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. ( #15412 )
2022-04-12 07:50:09 +02:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. ( #23739 )
2022-04-08 15:33:28 +02:00
Sven Mika
434265edd0
[RLlib] Examples folder: All training_iteration
translations. ( #23712 )
2022-04-05 16:33:50 +02:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI ( #23418 )
2022-03-24 17:04:02 -07:00
Siyuan (Ryans) Zhuang
0c74ecad12
[Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). ( #23128 )
2022-03-15 17:34:21 +01:00
Avnish Narayan
740def0a13
[RLlib] Put env-checker on critical path. ( #22191 )
2022-02-17 14:06:14 +01:00
Sven Mika
04a5c72ea3
Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" ( #18708 )
2022-02-10 13:44:22 +01:00
Sven Mika
44d09c2aa5
[RLlib] Filter.clear_buffer() deprecated (use Filter.reset_buffer() instead). ( #22246 )
2022-02-10 02:58:43 +01:00
Sven Mika
f6617506a2
[RLlib] Add on_sub_environment_created
to DefaultCallbacks class. ( #21893 )
2022-02-04 22:22:47 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train
more flexible via callable option. ( #20735 )
2022-01-27 12:17:34 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 ( #21652 )
2022-01-25 14:16:58 +01:00
Sven Mika
90c6b10498
[RLlib] Decentralized multi-agent learning; PR #01 ( #21421 )
2022-01-13 10:52:55 +01:00
Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. ( #21420 )
2022-01-10 11:22:55 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Amog Kamsetty
611bfc1352
[ML] Move find_free_port
to ml_utils
( #20828 )
...
Small refactoring of common utility used by Train, Tune, and Rllib.
2021-12-03 13:38:42 -08:00
Avnish Narayan
74dd0e4085
[RLlib] Make to_base_env()
a method of all RLlib-supported Env classes ( #20811 )
2021-12-01 09:01:02 +01:00
Avnish Narayan
3ddc09544d
[rllib] Env to base env refactor ( #20785 )
2021-11-30 17:02:10 -08:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
Sven Mika
ea2bea7e30
[RLlib; Docs overhaul] Docstring cleanup: Offline. ( #19808 )
2021-11-01 10:59:53 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. ( #19784 )
...
* wip.
* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.
* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
Sven Mika
c3e3fc7637
[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. ( #19100 )
2021-10-07 23:57:53 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
fd13bac9b3
[RLlib] Add worker
arg (optional) to policy_mapping_fn
. ( #18184 )
2021-09-17 12:07:11 +02:00
Sven Mika
8a72824c63
[RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). ( #18591 )
2021-09-15 22:16:48 +02:00
Sven Mika
3f89f35e52
[RLlib] Better error messages and hints; + failure-mode tests; ( #18466 )
2021-09-10 16:52:47 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy()
( #18428 )
2021-09-09 07:10:06 +02:00
Sven Mika
a772c775cd
[RLlib] Set random seed (if provided) to Trainer process as well. ( #18307 )
2021-09-04 11:02:30 +02:00