Eric Liang
|
905258dbc1
|
Clean up docstyle in python modules and add LINT rule (#25272)
|
2022-06-01 11:27:54 -07:00 |
|
Balaji Veeramani
|
7f1bacc7dc
|
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
|
2022-01-29 18:41:57 -08:00 |
|
Sven Mika
|
9e6b871739
|
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330)
|
2022-01-05 11:29:44 +01:00 |
|
Sven Mika
|
cf21c634a3
|
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982)
|
2021-11-03 10:00:46 +01:00 |
|
Sven Mika
|
9eba1871bb
|
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698)
|
2021-01-01 14:06:23 -05:00 |
|
Sven Mika
|
391cdfae8c
|
[RLlib] Trajectory view API docs. (#12718)
|
2020-12-30 17:32:21 -08:00 |
|
Sven Mika
|
c524f86785
|
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)
|
2020-12-27 09:46:03 -05:00 |
|
Sven Mika
|
99c81c6795
|
[RLlib] Attention Net prep PR #3. (#12450)
|
2020-12-07 13:08:17 +01:00 |
|
Sven Mika
|
3ad9365e1d
|
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449)
|
2020-12-01 08:21:45 +01:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
62c7ab5182
|
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
2020-11-12 16:27:34 +01:00 |
|
Michael Luo
|
b2984d1c34
|
[RLlib] Model Annotations to Torch Models (#9749)
|
2020-11-12 12:16:12 +01:00 |
|
Jiajie Xiao
|
0b07af374a
|
allow tuple action space (#11429)
Co-authored-by: Jiajie Xiao <jj@Jiajies-MBP-2.attlocal.net>
|
2020-10-29 16:05:38 +01:00 |
|
Sven Mika
|
36bda8432b
|
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
2020-10-01 16:57:10 +02:00 |
|
Sven Mika
|
e968b52cb7
|
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)
|
2020-08-21 12:35:16 +02:00 |
|
Sven Mika
|
2cbe29a7fa
|
[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143)
|
2020-08-19 17:49:50 +02:00 |
|
Sven Mika
|
57690a3a9f
|
[RLlib] Trajectory view API - 02 actual API scaffold (#9753)
|
2020-08-06 10:54:20 +02:00 |
|
Sven Mika
|
5c6d5d4ab1
|
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970)
|
2020-06-27 20:50:01 +02:00 |
|
Sven Mika
|
0ba7472da9
|
[Testing] Fix LINT/sphinx errors. (#8874)
|
2020-06-10 15:41:59 +02:00 |
|
Sven Mika
|
c74dc58f8b
|
[RLlib] Fix use_lstm flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)
|
2020-06-05 15:40:30 +02:00 |
|
Sven Mika
|
796a834c48
|
[RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371)
|
2020-05-18 17:26:40 +02:00 |
|