Commit graph

227 commits

Author SHA1 Message Date
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika
d49c3fae0b
[RLlib] Trajectory View API: Atari framestacking. (#13315) 2021-01-13 08:53:34 +01:00
Sven Mika
6f342a2221
[RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika
d5604eaba3
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
abb1eefdc2
[RLlib] Issue 12483: Discrete observation space error: "ValueError: ('Observation ({}) outside given space ..." when doing Trainer.compute_action. (#12787) 2020-12-11 22:43:30 +01:00
Sven Mika
ea25482f6a
WIP. (#12706) 2020-12-09 11:49:21 -08:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
95175a822f
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. (#12407)
* WIP.

* Fix and LINT.
2020-11-25 11:26:29 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063) 2020-11-19 19:01:14 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827) 2020-11-16 10:54:35 -08:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717) 2020-11-03 12:53:34 -08:00
mvindiola1
9e68b77796
[RLLIB] Wait for remote_workers to finish closing environments before terminating (#11476) 2020-10-28 14:23:06 -07:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Sven Mika
2aec77e305
[RLlib] Fix two test cases that only fail on Travis. (#11435) 2020-10-16 13:53:30 -05:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. (#11307) 2020-10-15 18:21:30 +02:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika
47eb6613b5
[RLlib] Remove unnecessary copies in compute_advantages. (#10897) 2020-09-29 12:25:20 +02:00
Eric Liang
609c1b8acd
Start moving ray internal files to _private module (#10994) 2020-09-24 22:46:35 -07:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Eric Liang
f83c588f08
[rllib] Remove broken no eager on workers mode (#10745)
* remove no eager

* Update trainer.py
2020-09-15 17:25:20 -07:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Alex Wu
a699f6a4d8
[Core] Fix override memory and object_store_memory in decorator (#10563) 2020-09-06 20:56:48 -07:00
Sven Mika
244aafdcf8
[RLlib] Curiosity enhancements. (#10373) 2020-09-05 13:14:24 +02:00
architkulkarni
6ae9e76b81
[RLlib] Fix seeding issue (#10589) 2020-09-04 17:17:53 -07:00
Sven Mika
715ee8dfc9
[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477) 2020-09-03 17:27:05 +02:00
Eric Liang
2a204260a8
[api] Second round of 1.0 API changes: exceptions, num_return_vals (#10377) 2020-08-28 19:57:02 -07:00
Eric Liang
519354a39a
[api] Initial API deprecations for Ray 1.0 (#10325) 2020-08-28 15:03:50 -07:00
raoul-khour-ts
c8c4832794
Prevent Local Worker creation from blocking remote worker creation by creating remote workers before local worker (#10245)
* create remote workers before local worker

* reformatted
2020-08-24 12:29:55 -07:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115) 2020-08-20 17:05:57 +02:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143) 2020-08-19 17:49:50 +02:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. (#10112) 2020-08-15 15:09:00 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
yncxcw
32cd94b750
[Core] Do not convert gpu id to int (#9744)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Sven Mika
57690a3a9f
[RLlib] Trajectory view API - 02 actual API scaffold (#9753) 2020-08-06 10:54:20 +02:00
Michael Luo
4d7bd8c892
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409) 2020-08-02 18:12:09 +02:00
Miguel Morales
372114b4ed
Update sampler.py (#9805)
Minor fix for warning string
2020-07-29 22:58:35 -07:00