Sven Mika
|
cabaa3b3c6
|
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. (#18381)
|
2021-09-07 11:48:41 +02:00 |
|
Sven Mika
|
e3e6ed7aaa
|
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358)
|
2021-09-06 12:14:20 +02:00 |
|
Sven Mika
|
599e589481
|
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065)
|
2021-08-31 14:56:53 +02:00 |
|
Sven Mika
|
8acb469b04
|
[RLlib; Testing] Green all RLlib nightly tests. (#18073)
|
2021-08-26 14:09:20 +02:00 |
|
Sven Mika
|
90b21ce27e
|
[RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321)
|
2021-07-27 14:39:06 -04:00 |
|
Sven Mika
|
5a313ba3d6
|
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
2021-07-20 14:58:13 -04:00 |
|
Sven Mika
|
53206dd440
|
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)
|
2021-06-30 12:32:11 +02:00 |
|
Sven Mika
|
e80095591c
|
[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937)
|
2021-05-20 18:15:10 +02:00 |
|
Sven Mika
|
2d34216660
|
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762)
|
2021-05-20 09:27:03 +02:00 |
|
Sven Mika
|
d2c755ccef
|
[RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832)
|
2021-05-18 13:18:12 +02:00 |
|
Sven Mika
|
839fc59224
|
[RLlib] CQL TensorFlow support (#15841)
|
2021-05-18 11:10:46 +02:00 |
|
Sven Mika
|
6f4d988713
|
[RLlib] Issue 15556: Fix R2D2 using chunks from previous episodes in the "burn-in" window. (#15737)
|
2021-05-18 11:05:42 +02:00 |
|
Michael Luo
|
4cbe13cdfd
|
[RLlib] CQL loss fn fixes, MuJoCo + Pendulum benchmarks, offline-RL example script w/ json file. (#15603)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
|
2021-05-04 19:06:19 +02:00 |
|
Sven Mika
|
bdda73e2dd
|
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
Thanks a lot @Bam4d for raising this and your help on fixing the worker GPU issue for torch!
|
2021-04-22 11:29:42 +02:00 |
|
Michael Luo
|
b84575c092
|
[RLlib] 2 RLlib Flaky Tests (#14930)
|
2021-03-30 19:21:13 +02:00 |
|
Sven Mika
|
e98808ce11
|
[RLlib] Fix 2 flakey test cases. (#14892)
|
2021-03-29 17:20:29 +02:00 |
|
Eric Liang
|
af8a93f2a4
|
Deflake some RLlib tests (#14947)
* fix
* update
* 100
* flake
|
2021-03-26 11:45:17 -07:00 |
|
Sven Mika
|
4e17f95927
|
[RLlib] Unflake 2 test cases (SAC cont. cartpole). (#14620)
|
2021-03-15 14:03:54 +01:00 |
|
Sven Mika
|
8000258333
|
[RLlib] R2D2 Implementation. (#13933)
|
2021-02-25 12:18:11 +01:00 |
|
Michael Luo
|
ec2c10309b
|
[RLlib] CQL for HalfCheetah-Random-v0 + Hopper-Random-v0 + CQL Bug Fixes (#14243)
|
2021-02-22 17:30:18 +01:00 |
|
Sven Mika
|
52c94b7ee9
|
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)
|
2021-02-02 13:05:58 +01:00 |
|
Michael Luo
|
587f207c2f
|
[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550)
|
2021-01-21 16:43:55 +01:00 |
|
Sven Mika
|
2e3655e8a9
|
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238)
|
2021-01-19 14:22:36 +01:00 |
|
Sven Mika
|
e74947cc94
|
[RLlib] Env directory cleanup and tests. (#13082)
|
2021-01-19 10:09:39 +01:00 |
|
Sven Mika
|
93c0a5549b
|
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. (#13397)
|
2021-01-19 09:51:35 +01:00 |
|
Michael Luo
|
42cd414e5b
|
[RLlib] New Offline RL Algorithm: CQL (based on SAC) (#13118)
|
2020-12-30 10:11:57 -05:00 |
|
Sven Mika
|
deb33bce84
|
[RLlib] Add DQN SoftQ learning test case. (#12712)
|
2020-12-10 14:55:19 +01:00 |
|
Sven Mika
|
bb03e2499b
|
[RLlib] PyBullet Env native support via env str-specifier (if installed). (#12209)
|
2020-11-30 12:41:24 +01:00 |
|
Sven Mika
|
4afaa46028
|
[RLlib] Increase the scope of RLlib's regression tests. (#12200)
|
2020-11-24 22:18:31 +01:00 |
|
Edward Oakes
|
32d159a2ed
|
Fix release directory & RELEASE_PROCESS.md (#12269)
|
2020-11-23 14:28:59 -06:00 |
|
Sven Mika
|
b6b54f1c81
|
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)
|
2020-11-16 10:54:35 -08:00 |
|
Michael Luo
|
59bc1e6c09
|
[RLLib] MAML extension for all models except RNNs (#11337)
|
2020-11-12 16:51:40 -08:00 |
|
Michael Luo
|
6e6c680f14
|
MBMPO Cartpole (#11832)
* MBMPO Cartpole Done
* Added doc
|
2020-11-12 10:30:41 -08:00 |
|
Sven Mika
|
5b788ccb13
|
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717)
|
2020-11-03 12:53:34 -08:00 |
|
Sven Mika
|
ce96b03b07
|
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)
|
2020-10-06 20:28:16 +02:00 |
|
Michael Luo
|
47b499d899
|
Cartpole MAML + Discrete (#11028)
|
2020-10-02 12:56:34 +02:00 |
|
Sven Mika
|
4b278c36fc
|
[RLlib] Behavioral Cloning (from MARWIL). (#10619)
|
2020-09-09 17:33:21 +02:00 |
|
Michael Luo
|
8e613652af
|
[RLLib] MBMPO Fixes (#10296)
|
2020-09-09 09:34:34 +02:00 |
|
Sven Mika
|
8a891b3c30
|
[RLlib] SAC n_step > 1. (#10567)
|
2020-09-05 22:26:42 +02:00 |
|
Michael Luo
|
4e9888ce2f
|
[RLlib] Dreamer (#10172)
|
2020-08-26 13:24:05 +02:00 |
|
Michael Luo
|
4d7bd8c892
|
[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409)
|
2020-08-02 18:12:09 +02:00 |
|
Sven Mika
|
617eb8f279
|
[RLlib] Issue 9402 MARWIL producing nan rewards. (#9429)
|
2020-07-14 05:07:16 +02:00 |
|
Sven Mika
|
b4c0b942fe
|
[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) (#9237)
|
2020-07-01 17:31:44 +02:00 |
|
Michael Luo
|
cf0894d396
|
[rllib] MAML Agent (#8862)
* Halfway done with transferring MAML to new Ray
* MAML Beta Out
* Debugging MAML atm
* Distributed Execution
* Pendulum Mass Working
* All experiments complete
* Cleaned up codebase
* Travis CI
* Travis CI
* Tests
* Merged conflicts
* Fixed variance bug conflict
* Comment resolved
* Apply suggestions from code review
fixed test_maml
* Update rllib/agents/maml/tests/test_maml.py
* asdf
* Fix testing
Co-authored-by: Sven Mika <sven@anyscale.io>
|
2020-06-23 09:48:23 -07:00 |
|
Sven Mika
|
2589309cf0
|
[RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785)
|
2020-06-20 00:05:19 +02:00 |
|
Sven Mika
|
7008902cff
|
[RLlib] Minor rllib.utils cleanup. (#8932)
|
2020-06-16 08:52:20 +02:00 |
|
Sven Mika
|
8d1ccfd0f7
|
[RLlib] Issue 8889: action clipping bug ppo not learning mujoco (#8898)
|
2020-06-11 19:17:43 +02:00 |
|
Sven Mika
|
a90cd0fcbb
|
[RLlib] Unity3d soccer benchmarks (#8834)
|
2020-06-11 14:29:57 +02:00 |
|
Sven Mika
|
c74dc58f8b
|
[RLlib] Fix use_lstm flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)
|
2020-06-05 15:40:30 +02:00 |
|
Sven Mika
|
97d524c075
|
[RLlib] Issue 8769 broken OOM tests_dir cases (R & S). (#8770)
|
2020-06-05 08:34:21 +02:00 |
|