Steven Morad
|
d0a8e3c36f
|
[RLlib] User-friendly RNN sequencing. (#27087)
|
2022-08-01 15:32:22 +02:00 |
|
Olaf Lipinski
|
8271406a04
|
[RLLib] Fix MultiDiscrete not being one-hotted correctly (#26558)
Co-authored-by: Jun Gong <jungong@anyscale.com>
|
2022-07-20 15:25:53 -07:00 |
|
kourosh hakhamaneshi
|
4cdd508f70
|
[RLlib] Added CRR implementation. (#25499)
|
2022-06-08 11:42:02 +02:00 |
|
Artur Niederfahrenhorst
|
243038d00a
|
[RLlib] Issue 25401: Faulty usage of get_filter_config in ComplexInputNetworks (#25493)
|
2022-06-06 13:04:17 +02:00 |
|
Eric Liang
|
905258dbc1
|
Clean up docstyle in python modules and add LINT rule (#25272)
|
2022-06-01 11:27:54 -07:00 |
|
Eric Liang
|
4963dfaae0
|
[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060)
|
2022-05-24 22:14:25 -07:00 |
|
Nathan Matare
|
012a4c8667
|
[RLlib] Allow passing **kwargs to action distribution. (#24692)
|
2022-05-18 09:22:37 +02:00 |
|
Sven Mika
|
4d285a00a4
|
[RLlib] Issue 23689: tf Initializer has hard-coded float32 dtype. (#23741)
|
2022-04-07 21:35:02 +02:00 |
|
simonsays1980
|
9ca9c67bc9
|
[RLlib] Added dtype safeguards to the 'required_model_output_shape()' methods… (#23490)
|
2022-03-31 13:52:00 +02:00 |
|
Jun Gong
|
d12977c4fb
|
[RLlib] TF2 Bandit Agent (#22838)
|
2022-03-21 16:55:55 +01:00 |
|
Sven Mika
|
8e00537b65
|
[RLlib] SlateQ: framework=tf fixes and SlateQ documentation update (#22543)
|
2022-02-23 13:03:45 +01:00 |
|
Balaji Veeramani
|
7f1bacc7dc
|
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
|
2022-01-29 18:41:57 -08:00 |
|
Sven Mika
|
9e6b871739
|
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330)
|
2022-01-05 11:29:44 +01:00 |
|
Sven Mika
|
daa4304a91
|
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008)
|
2021-12-13 12:04:23 +01:00 |
|
Sven Mika
|
596c8e2772
|
[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918)
|
2021-12-11 14:57:58 +01:00 |
|
Sven Mika
|
56619b955e
|
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250)
|
2021-11-17 21:40:16 +01:00 |
|
Sven Mika
|
0b308719f8
|
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829)
|
2021-11-01 21:46:02 +01:00 |
|
Sven Mika
|
ac3371a148
|
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917)
|
2021-09-30 16:39:38 +02:00 |
|
o0olele
|
ff6730f903
|
[RLlib] Attention Nets + MultiDiscrete spaces: Fix range() takes no keyword args error! (#17502)
|
2021-09-24 13:43:58 +02:00 |
|
Sven Mika
|
61a1274619
|
[RLlib] No Preprocessors (part 2). (#18468)
|
2021-09-23 12:56:45 +02:00 |
|
Sven Mika
|
8a066474d4
|
[RLlib] No Preprocessors; preparatory PR #1 (#18367)
|
2021-09-09 08:10:42 +02:00 |
|
Sven Mika
|
9a8ca6a69d
|
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306)
|
2021-09-03 13:29:57 +02:00 |
|
Kai Fricke
|
34cf5db109
|
[tune] Fix hyperopt points to evaluate for nested lists (#18113)
|
2021-08-26 14:34:22 +02:00 |
|
Sven Mika
|
9883505e84
|
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017)
|
2021-08-24 21:55:27 +02:00 |
|
Sven Mika
|
494ddd98c1
|
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)
|
2021-08-21 17:05:48 +02:00 |
|
Sven Mika
|
839fc59224
|
[RLlib] CQL TensorFlow support (#15841)
|
2021-05-18 11:10:46 +02:00 |
|
Sven Mika
|
e973b726c2
|
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)
|
2021-04-30 19:26:30 +02:00 |
|
Sven Mika
|
bb8a286cbc
|
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)
|
2021-04-27 10:44:54 +02:00 |
|
Sven Mika
|
cecfc3b43b
|
[RLlib] Multi-GPU support for Torch algorithms. (#14709)
|
2021-04-16 09:16:24 +02:00 |
|
Sven Mika
|
b267f1f1ba
|
[RLlib] Add support for Int-Box action spaces. (#15012)
|
2021-04-11 13:16:01 +02:00 |
|
Jack Parsons
|
3df7a010b1
|
[RLlib] Fixing conv filters config for ComplexInputNetwork (#14749)
|
2021-03-24 16:15:36 +01:00 |
|
Sven Mika
|
ee4b6e7e3b
|
[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569)
|
2021-03-12 18:27:25 +01:00 |
|
Sven Mika
|
52c94b7ee9
|
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)
|
2021-02-02 13:05:58 +01:00 |
|
Sven Mika
|
56878221ed
|
[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363)
|
2021-01-14 14:44:33 +01:00 |
|
Sven Mika
|
d49c3fae0b
|
[RLlib] Trajectory View API: Atari framestacking. (#13315)
|
2021-01-13 08:53:34 +01:00 |
|
Kai Fricke
|
25f10a947a
|
Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
|
2021-01-12 12:33:57 +01:00 |
|
Sven Mika
|
e2b2abb88b
|
[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)
|
2021-01-11 22:42:30 +01:00 |
|
Sven Mika
|
bcaff63909
|
[RLlib] SquashedGaussians should throw error when entropy or kl are called. (#13126)
|
2021-01-07 15:07:35 +01:00 |
|
Sven Mika
|
9eba1871bb
|
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698)
|
2021-01-01 14:06:23 -05:00 |
|
Sven Mika
|
8726521604
|
[RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). (#13091)
|
2020-12-30 22:30:52 -05:00 |
|
Sven Mika
|
391cdfae8c
|
[RLlib] Trajectory view API docs. (#12718)
|
2020-12-30 17:32:21 -08:00 |
|
Sven Mika
|
c524f86785
|
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)
|
2020-12-27 09:46:03 -05:00 |
|
Sven Mika
|
b2bcab711d
|
[RLlib] Attention Nets: tf (#12753)
|
2020-12-20 20:22:32 -05:00 |
|
Sven Mika
|
99c81c6795
|
[RLlib] Attention Net prep PR #3. (#12450)
|
2020-12-07 13:08:17 +01:00 |
|
Sven Mika
|
19c8033df2
|
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.
* LINT and fixes.
MB-MPO and MAML not working yet.
* wip
* update
* update
* remove
* remove dep
* higher
* Update requirements_rllib.txt
* Update requirements_rllib.txt
* relpos
* no mbmpo
Co-authored-by: Eric Liang <ekhliang@gmail.com>
|
2020-12-01 17:41:10 -08:00 |
|
Sven Mika
|
3ad9365e1d
|
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449)
|
2020-12-01 08:21:45 +01:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
62c7ab5182
|
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
2020-11-12 16:27:34 +01:00 |
|
Michael Luo
|
59ccbc0fc7
|
[RLlib] Model Annotations: Tensorflow (#11964)
|
2020-11-12 12:18:50 +01:00 |
|
Sven Mika
|
291c172d83
|
[RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909)
|
2020-11-11 18:45:28 +01:00 |
|