Commit graph

146 commits

Author SHA1 Message Date
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
74c98ac38e
[RLlib] Issue 12244: Unable to restore multi-agent PPOTFPolicy's Model (from exported). (#12786) 2020-12-11 16:13:38 +01:00
Sven Mika
a082ea18b8
[RLlib] Issue 12212: "TFEagerPolicy has no attribute action_sampler_fn". 2020-12-11 12:57:33 +01:00
Sven Mika
28108c905b
[RLlib] Tf-eager policy bug fix: Duplicate model call in compute_gradients. (#12682) 2020-12-09 08:03:58 +01:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
9021f15b2a
[RLlib] Fix setup-dev.py error when creating a softlink for new_dashboard. (#12442) 2020-12-01 11:46:59 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
fb318addcb
[RLlib] Curiosity exploration module: tf/tf2.x/tf-eager support. (#11945) 2020-11-29 12:31:24 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
6475297bd3
[RLlib] Torch LR schedule not working. Fix and added test case. (#12396) 2020-11-26 13:14:11 +01:00
Sven Mika
6da4342822
[RLlib] Add on_learn_on_batch (Policy) callback to DefaultCallbacks. (#12070) 2020-11-18 15:39:23 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827) 2020-11-16 10:54:35 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
mvindiola1
4518fe790f
[RLLIB] Convert torch state arrays to tensors during compute log likelihoods (#11708) 2020-11-04 09:33:56 +01:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717) 2020-11-03 12:53:34 -08:00
Sven Mika
54d85a6c2a
[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720) 2020-11-02 11:18:41 +01:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Sven Mika
0c0f67c14d
[RLlib] ARS/ES eval workers not working: Issue 9933. (#11308) 2020-10-12 13:49:48 -07:00
Sven Mika
8ea1bc5ff9
[RLlib] Allow for more than 2^31 policy timesteps. (#11301) 2020-10-12 13:49:11 -07:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Philsik Chang
2b26d2ca1b
[rllib] Fix for Torch checkpoint taken on GPU fails to deserialize on CPU (#11071) (#11208) 2020-10-05 22:01:55 -07:00
Sven Mika
c17169dc11
[RLlib] Fix all example scripts to run on GPUs. (#11105) 2020-10-02 23:07:44 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Eric Liang
ecdaaffc67
add large data warning (#10957) 2020-09-23 15:46:06 -07:00
Sven Mika
d7c42d6d92
[RLlib] Unity blogpost final fixes. (#10894) 2020-09-20 14:13:20 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika
5c7b35d694
[RLlib] Issue 10833 TorchPolicy GPU. (#10834) 2020-09-17 09:04:46 +02:00
desktable
799318d7d7
[RLlib] Add type annotations for agents/dqn (#10626) 2020-09-09 18:55:26 +02:00
Sven Mika
28ab797cf5
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420) 2020-09-02 14:03:01 +02:00
Michael Luo
4e9888ce2f
[RLlib] Dreamer (#10172) 2020-08-26 13:24:05 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115) 2020-08-20 17:05:57 +02:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143) 2020-08-19 17:49:50 +02:00
Eric Liang
ca133e2699
[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055) 2020-08-17 11:12:20 -07:00
Olli Huotari
9ff599cbb8
torch policy now includes model.metrics (#10121)
* torch policy now includes model.metrics

* Fixed tests to work with custom metrics

* Forgot to run format.sh
2020-08-15 10:43:11 -07:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. (#10112) 2020-08-15 15:09:00 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
Tanay Wakhare
1826b29757
[RLlib] Curiosity (intrinsic motivation) Exploration module. (#9912) 2020-08-13 20:14:16 +02:00
yncxcw
32cd94b750
[Core] Do not convert gpu id to int (#9744)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes (#9956) 2020-08-07 16:49:49 -07:00
Eric Liang
668f555755
[rllib] Clean up outdated docs #9915 2020-08-06 18:29:04 -07:00
Sven Mika
57690a3a9f
[RLlib] Trajectory view API - 02 actual API scaffold (#9753) 2020-08-06 10:54:20 +02:00
Sven Mika
9b90f7db67
[RLlib] Missing type annotations policy templates. (#9846) 2020-08-06 05:33:24 +02:00
Sven Mika
b0b0463161
[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678) 2020-07-29 21:15:09 +02:00
Eric Liang
590943a499
[rllib] Type annotations for model classes (#9646) 2020-07-24 12:01:46 -07:00
Eric Liang
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates (#9477)
* update

* add test

* lint

* fix super call

* speed es test up
2020-07-23 12:50:25 -07:00
Raphael Avalos
440c9c42be
[RLlib] Fix combination of lockstep and multiple agents controlled by the same policy. (#9521)
* Change aggregation when lockstep is activated.

Modification of MultiAgentBatch.timeslices to support the combination of lockstep and multiple agents controlled by the same policy.

fix ray-project/ray#9295

* Line too long.
2020-07-19 23:03:12 -07:00