Commit graph

93 commits

Author SHA1 Message Date
Sven Mika
8a72824c63
[RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
Sven Mika
08c09737fa
[RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550) 2021-09-14 19:58:10 +02:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381) 2021-09-07 11:48:41 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358) 2021-09-06 12:14:20 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 2021-08-31 14:56:53 +02:00
Sven Mika
8acb469b04
[RLlib; Testing] Green all RLlib nightly tests. (#18073) 2021-08-26 14:09:20 +02:00
Sven Mika
90b21ce27e
[RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321) 2021-07-27 14:39:06 -04:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
Sven Mika
e80095591c
[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937) 2021-05-20 18:15:10 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762) 2021-05-20 09:27:03 +02:00
Sven Mika
d2c755ccef
[RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
Sven Mika
6f4d988713
[RLlib] Issue 15556: Fix R2D2 using chunks from previous episodes in the "burn-in" window. (#15737) 2021-05-18 11:05:42 +02:00
Michael Luo
4cbe13cdfd
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-05-04 19:06:19 +02:00
Sven Mika
bdda73e2dd
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!
2021-04-22 11:29:42 +02:00
Michael Luo
b84575c092
[RLlib] 2 RLlib Flaky Tests (#14930) 2021-03-30 19:21:13 +02:00
Sven Mika
e98808ce11
[RLlib] Fix 2 flakey test cases. (#14892) 2021-03-29 17:20:29 +02:00
Eric Liang
af8a93f2a4
Deflake some RLlib tests (#14947)
* fix

* update

* 100

* flake
2021-03-26 11:45:17 -07:00
Sven Mika
4e17f95927
[RLlib] Unflake 2 test cases (SAC cont. cartpole). (#14620) 2021-03-15 14:03:54 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. (#13933) 2021-02-25 12:18:11 +01:00
Michael Luo
ec2c10309b
[RLlib] CQL for HalfCheetah-Random-v0 + Hopper-Random-v0 + CQL Bug Fixes (#14243) 2021-02-22 17:30:18 +01:00
Sven Mika
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Michael Luo
587f207c2f
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550) 2021-01-21 16:43:55 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 2021-01-19 14:22:36 +01:00
Sven Mika
e74947cc94
[RLlib] Env directory cleanup and tests. (#13082) 2021-01-19 10:09:39 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397) 2021-01-19 09:51:35 +01:00
Michael Luo
42cd414e5b
[RLlib] New Offline RL Algorithm: CQL (based on SAC) (#13118) 2020-12-30 10:11:57 -05:00
Sven Mika
deb33bce84
[RLlib] Add DQN SoftQ learning test case. (#12712) 2020-12-10 14:55:19 +01:00
Sven Mika
bb03e2499b
[RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209) 2020-11-30 12:41:24 +01:00
Sven Mika
4afaa46028
[RLlib] Increase the scope of RLlib's regression tests. (#12200) 2020-11-24 22:18:31 +01:00
Edward Oakes
32d159a2ed
Fix release directory & RELEASE_PROCESS.md (#12269) 2020-11-23 14:28:59 -06:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827) 2020-11-16 10:54:35 -08:00
Michael Luo
59bc1e6c09
[RLLib] MAML extension for all models except RNNs (#11337) 2020-11-12 16:51:40 -08:00
Michael Luo
6e6c680f14
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done

* Added doc
2020-11-12 10:30:41 -08:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717) 2020-11-03 12:53:34 -08:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Michael Luo
47b499d899
Cartpole MAML + Discrete (#11028) 2020-10-02 12:56:34 +02:00
Sven Mika
4b278c36fc
[RLlib] Behavioral Cloning (from MARWIL). (#10619) 2020-09-09 17:33:21 +02:00
Michael Luo
8e613652af
[RLLib] MBMPO Fixes (#10296) 2020-09-09 09:34:34 +02:00
Sven Mika
8a891b3c30
[RLlib] SAC n_step > 1. (#10567) 2020-09-05 22:26:42 +02:00
Michael Luo
4e9888ce2f
[RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
Michael Luo
4d7bd8c892
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409) 2020-08-02 18:12:09 +02:00
Sven Mika
617eb8f279
[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429) 2020-07-14 05:07:16 +02:00
Sven Mika
b4c0b942fe
[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) (#9237) 2020-07-01 17:31:44 +02:00
Michael Luo
cf0894d396
[rllib] MAML Agent (#8862)
* Halfway done with transferring MAML to new Ray

* MAML Beta Out

* Debugging MAML atm

* Distributed Execution

* Pendulum Mass Working

* All experiments complete

* Cleaned up codebase

* Travis CI

* Travis CI

* Tests

* Merged conflicts

* Fixed variance bug conflict

* Comment resolved

* Apply suggestions from code review

fixed test_maml

* Update rllib/agents/maml/tests/test_maml.py

* asdf

* Fix testing

Co-authored-by: Sven Mika <sven@anyscale.io>
2020-06-23 09:48:23 -07:00
Sven Mika
2589309cf0
[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785) 2020-06-20 00:05:19 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. (#8932) 2020-06-16 08:52:20 +02:00
Sven Mika
8d1ccfd0f7
[RLlib] Issue 8889: action clipping bug ppo not learning mujoco (#8898) 2020-06-11 19:17:43 +02:00
Sven Mika
a90cd0fcbb
[RLlib] Unity3d soccer benchmarks (#8834) 2020-06-11 14:29:57 +02:00