159 commits

Author SHA1 Message Date
Yi Cheng
7b8b0f8e03
Revert "[RLlib] Remove execution plan code no longer used by RLlib. (#25624)" (#25776)
This reverts commit 804719876b.
2022-06-14 13:59:15 -07:00
Avnish Narayan
804719876b
[RLlib] Remove execution plan code no longer used by RLlib. (#25624) 2022-06-14 10:57:27 +02:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. (#25539) 2022-06-11 15:10:39 +02:00
Rohan Potdar
a9d8da0100
[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056) 2022-06-07 12:52:19 +02:00
Artur Niederfahrenhorst
5133978adc
[RLlib] PG policy subclassing conversion. (#25288) 2022-06-06 13:07:47 +02:00
Sven Mika
d95009a3ac
[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). (#24967) 2022-05-28 10:50:03 +02:00
Sven Mika
163fa81976
[RLlib] Discussion 6060 and 5120: auto-infer different agents' spaces in multi-agent env. (#24649) 2022-05-27 14:56:24 +02:00
Rohan Potdar
ab81c8e9ca
[RLlib]: Rename input_evaluation to off_policy_estimation_methods. (#25107) 2022-05-27 13:14:54 +02:00
Jun Gong
eaf9c941ae
[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. (#25117) 2022-05-25 14:38:03 +02:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060) 2022-05-24 22:14:25 -07:00
Sven Mika
09886d7ab8
[RLlib] Upgrade gym 0.23 (#24171) 2022-05-23 08:18:44 +02:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script (#24999)
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Rohan Potdar
5a70b732e8
[RLlib] MARWIL and BC Config. (#24853) 2022-05-21 12:50:20 +02:00
kourosh hakhamaneshi
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896) 2022-05-19 18:30:42 +02:00
Sven Mika
7cca7782f1
[RLlib] OPE (off policy estimator) API. (#24384) 2022-05-02 21:15:50 +02:00
Jun Gong
ec636dcb29
[RLlib] Do not print warning message during env pre-checking, if there is nothing wrong with user envs. (#24289) 2022-04-29 10:41:19 +02:00
Pavel C
de0c6f6132
[RLlib] Fix policy_map always loading all policies from disk due to (not always needed) global_vars update. (#22010) 2022-04-29 10:38:05 +02:00
Noon van der Silk
3589c21924
[RLlib] Fix some missing f-strings and a f-string related bug in tf eager policy. (#24148) 2022-04-25 11:25:28 +02:00
Avnish Narayan
3bf907bcf8
[RLlib] Don't modify environments via the env checker utilities. (#24083) 2022-04-22 18:39:47 +02:00
Sven Mika
92781c603e
[RLlib] A2C training_iteration method implementation (_disable_execution_plan_api=True) (#23735) 2022-04-15 18:36:13 +02:00
Sven Mika
a8494742a3
[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412) 2022-04-12 07:50:09 +02:00
Sven Mika
c82f6c62c8
[RLlib] Make RolloutWorkers (optionally) recoverable after failure. (#23739) 2022-04-08 15:33:28 +02:00
Sven Mika
434265edd0
[RLlib] Examples folder: All training_iteration translations. (#23712) 2022-04-05 16:33:50 +02:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI (#23418) 2022-03-24 17:04:02 -07:00
Siyuan (Ryans) Zhuang
0c74ecad12
[Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). (#23128) 2022-03-15 17:34:21 +01:00
Avnish Narayan
740def0a13
[RLlib] Put env-checker on critical path. (#22191) 2022-02-17 14:06:14 +01:00
Sven Mika
04a5c72ea3
Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708) 2022-02-10 13:44:22 +01:00
Sven Mika
44d09c2aa5
[RLlib] Filter.clear_buffer() deprecated (use Filter.reset_buffer() instead). (#22246) 2022-02-10 02:58:43 +01:00
Sven Mika
f6617506a2
[RLlib] Add on_sub_environment_created to DefaultCallbacks class. (#21893) 2022-02-04 22:22:47 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
371fbb17e4
[RLlib] Make policies_to_train more flexible via callable option. (#20735) 2022-01-27 12:17:34 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2022-01-25 14:16:58 +01:00
Sven Mika
90c6b10498
[RLlib] Decentralized multi-agent learning; PR #01 (#21421) 2022-01-13 10:52:55 +01:00
Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420) 2022-01-10 11:22:55 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008) 2021-12-13 12:04:23 +01:00
Amog Kamsetty
611bfc1352
[ML] Move find_free_port to ml_utils (#20828)
Small refactoring of common utility used by Train, Tune, and Rllib.
2021-12-03 13:38:42 -08:00
Avnish Narayan
74dd0e4085
[RLlib] Make to_base_env() a method of all RLlib-supported Env classes (#20811) 2021-12-01 09:01:02 +01:00
Avnish Narayan
3ddc09544d
[rllib] Env to base env refactor (#20785) 2021-11-30 17:02:10 -08:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) 2021-11-01 21:46:02 +01:00
Sven Mika
ea2bea7e30
[RLlib; Docs overhaul] Docstring cleanup: Offline. (#19808) 2021-11-01 10:59:53 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783) 2021-10-29 12:03:56 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. (#19784)
* wip.

* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.

* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
Sven Mika
c3e3fc7637
[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. (#19100) 2021-10-07 23:57:53 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
fd13bac9b3
[RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) 2021-09-17 12:07:11 +02:00
Sven Mika
8a72824c63
[RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
Sven Mika
3f89f35e52
[RLlib] Better error messages and hints; + failure-mode tests; (#18466) 2021-09-10 16:52:47 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 (#18367) 2021-09-09 08:10:42 +02:00
Sven Mika
1520c3d147
[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to Trainer.add_policy() (#18428) 2021-09-09 07:10:06 +02:00
Sven Mika
a772c775cd
[RLlib] Set random seed (if provided) to Trainer process as well. (#18307) 2021-09-04 11:02:30 +02:00