863 commits

Author SHA1 Message Date
mvindiola1
82a3ff795c
[RLlib] ensure curiosity exploration actions are passed in as tf tens… (#15704) 2021-06-21 10:03:17 -07:00
Benjamin D. Killeen
50049f86d0
[rllib] check if self.env is not None explicitly (#15634)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-21 10:02:13 -07:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
Sven Mika
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429) 2021-06-19 22:42:00 +02:00
Sven Mika
79a9d6d517
[RLlib] Issues 16287 and 16200: RLlib not rendering custom multi-agent Envs. (#16428) 2021-06-19 08:57:53 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
2900a06dd7
[RLlib] Issue 14503: SAC not allowing custom action distributions. (#16427) 2021-06-18 17:27:29 +02:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. (#16359) 2021-06-18 10:31:30 +02:00
Sven Mika
a5831f9429
[RLlib] Fix bandit example scripts and add all scripts to CI testing suite. 2021-06-15 13:30:31 +02:00
Sven Mika
d0014cd351
[RLlib] Policies get/set_state fixes and enhancements. (#16354) 2021-06-15 13:08:43 +02:00
Chris Bamford
fd1a97e39f
[RLlib] Memory leak docs (#15908) 2021-06-10 18:10:21 +02:00
Sven Mika
3d4dc60e2e
[RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. (#16332) 2021-06-10 07:49:17 +02:00
matthewdeng
138b273136
[rllib] Add tests for examples using ray client (#16271)
* [rllib] add tests for examples using ray client

* rename test_client to test_ray_client
2021-06-09 10:39:14 -07:00
Sven Mika
4b8dadccbd
[RLlib] Fix PR 16162: Having added sleep to _NextValueNotReady causes TD3 tests to become flakey. (#16309) 2021-06-08 07:27:02 -07:00
Gerges Dib
f8cf4a1985
[RLlib] Fixed import tensorflow when module not available (#16171) 2021-06-04 10:07:59 +02:00
Sven Mika
e2be41b407
[RLlib] MARWIL + BC: Various fixes and enhancements. (#16218) 2021-06-03 22:29:00 +02:00
Sven Mika
c9d220bcda
[RLlib] Upgrade RLlib regression test scripts to new testing tool - RLlib release logs for 1.4. (#16080) 2021-06-01 17:39:18 +02:00
Chris Bamford
1e3721ef4a
[RLlib] Remove bad spinlocks to allow pytorch GPU scheduler to interrupt. (#16162) 2021-06-01 16:40:28 +02:00
Sven Mika
5fe34862ce
[RLlib] DDPG torch GPU bug. (#16133) 2021-05-28 22:09:25 +02:00
Sven Mika
33a69135cb
[RLlib] Issue 16117: DQN/APEX torch not working on GPU. (#16118) 2021-05-28 09:12:53 +02:00
Sven Mika
f6302d81be
[RLlib] Discussion 2210: BC algo broken, if "advantages" missing in offline data. (#16019) 2021-05-25 08:47:17 +02:00
Eric Liang
810f5c803a
Disable flaky object spilling test on OSX & adjust test timeouts (#15986)
* blacklist

* move it

* adjust according to bazel timeouts

* fix build

* move to large

* Update BUILD
2021-05-24 09:49:59 -07:00
Steven Morad
581d63e607
[RLlib] Fix dnc input shape (#15939)
Co-authored-by: Steven Morad <sm2558@cam.ac.uk>
2021-05-20 19:06:02 -07:00
Sven Mika
e80095591c
[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937) 2021-05-20 18:15:10 +02:00
Sven Mika
03c7c530a9
[RLlib] Issue 15483: Wrong init states (should be non-zero if ModelV2.get_initial_state returns non-zero values). (#15733) 2021-05-20 09:28:09 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762) 2021-05-20 09:27:03 +02:00
Sven Mika
eaa7f6696d
[RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. (#15898) 2021-05-19 15:44:11 -07:00
Stefan Schneider
55709bac7a
[RLlib] Examples for training, saving, loading, testing an agent with SB & RLlib (#15897) 2021-05-19 16:36:59 +02:00
Michael Luo
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707) 2021-05-19 16:32:29 +02:00
Steven Morad
d8eed68af2
[RLlib] Add differentiable neural computer example (#14844) 2021-05-19 09:15:39 +02:00
Rick Lan
3b1b1d74fe
[rllib] Read "logger_config" first before "prefix". (#15871) 2021-05-18 10:50:46 -07:00
Sven Mika
7e260edb07
[RLlib] Fix small memory leak in SimpleListCollector (already superseded by Bam4d's PR + small fix in error message). (#15783) 2021-05-18 16:02:03 +02:00
Chris Bamford
0be83d9a95
[RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815) 2021-05-18 13:23:00 +02:00
Sven Mika
d2c755ccef
[RLlib] Examples scripts add argparse help and replace --torch with --framework. (#15832) 2021-05-18 13:18:12 +02:00
Sven Mika
2303851c3c
[RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492) 2021-05-18 11:51:05 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
Sven Mika
a36b9305d4
[RLlib] Better error message when deep-learning framework not installed. (#15735) 2021-05-18 11:06:05 +02:00
Sven Mika
6f4d988713
[RLlib] Issue 15556: Fix R2D2 using chunks from previous episodes in the "burn-in" window. (#15737) 2021-05-18 11:05:42 +02:00
Sven Mika
308ea62430
[RLlib] Fix "seed" setting to work in all frameworks and w/ all CUDA versions. (#15682) 2021-05-18 11:00:24 +02:00
Sven Mika
f25d58492d
[Testing] Dependabot for RLlib. (#15812) 2021-05-17 18:24:13 +02:00
Sven Mika
d89fb82bfb
[RLlib] Add simple curriculum learning API and example script. (#15740) 2021-05-16 17:35:10 +02:00
Sven Mika
ebc6d8692a
[RLlib] Docs: Example scripts and blogs documentation update. (#15763) 2021-05-16 15:24:38 +02:00
Sven Mika
469f5227da
[RLlib] CQL bug fix: Normalize actions for atanh in BC part of the CQL loss. (#15814) 2021-05-16 15:21:06 +02:00
Sven Mika
bc09e75b78
[RLlib] Fix 3 flakey test cases. (#15785) 2021-05-16 12:20:33 +02:00
Ian Rodney
00c913cbc6
[Flaky] Mark test_nested_observation_spaces as Flaky (#15794) 2021-05-14 12:08:52 -07:00
Ian Rodney
82876ecc2a
[rllib] [testing] make kill failure non fatal (#15771) 2021-05-13 12:24:49 -07:00
Sven Mika
c4a3e1589b
[RLlib] CQL: Bug fixes and OPE example added to test and offline_rl.py example. (#15761) 2021-05-13 09:17:23 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. (#15591) 2021-05-12 12:16:00 +02:00
Sven Mika
a495759f06
[RLlib] Discussion 2022: PPO should auto-adjust rollout_fragment_length if other settings do not align with train_batch_size. (#15611) 2021-05-10 16:16:02 +02:00
Sven Mika
461d73ddf1
[RLlib] simple_optimizer should not be used by default for tf+MA. (#15365) 2021-05-10 16:10:44 +02:00