Commit graph

105 commits

Author SHA1 Message Date
Sven Mika
4d285a00a4
[RLlib] Issue 23689: tf Initializer has hard-coded float32 dtype. (#23741) 2022-04-07 21:35:02 +02:00
simonsays1980
9ca9c67bc9
[RLlib] Added dtype safeguards to the 'required_model_output_shape()' methods… (#23490) 2022-03-31 13:52:00 +02:00
Jun Gong
d12977c4fb
[RLlib] TF2 Bandit Agent (#22838) 2022-03-21 16:55:55 +01:00
Sven Mika
8e00537b65
[RLlib] SlateQ: framework=tf fixes and SlateQ documentation update (#22543) 2022-02-23 13:03:45 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
9e6b871739
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330) 2022-01-05 11:29:44 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008) 2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918) 2021-12-11 14:57:58 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250) 2021-11-17 21:40:16 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) 2021-11-01 21:46:02 +01:00
Sven Mika
ac3371a148
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917) 2021-09-30 16:39:38 +02:00
o0olele
ff6730f903
[RLlib] Attention Nets + MultiDiscrete spaces: Fix range() takes no keyword args error! (#17502) 2021-09-24 13:43:58 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 (#18367) 2021-09-09 08:10:42 +02:00
Sven Mika
9a8ca6a69d
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306) 2021-09-03 13:29:57 +02:00
Kai Fricke
34cf5db109
[tune] Fix hyperopt points to evaluate for nested lists (#18113) 2021-08-26 14:34:22 +02:00
Sven Mika
9883505e84
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017) 2021-08-24 21:55:27 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) 2021-08-21 17:05:48 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support (#15841) 2021-05-18 11:10:46 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) 2021-04-30 19:26:30 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) 2021-04-27 10:44:54 +02:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709) 2021-04-16 09:16:24 +02:00
Sven Mika
b267f1f1ba
[RLlib] Add support for Int-Box action spaces. (#15012) 2021-04-11 13:16:01 +02:00
Jack Parsons
3df7a010b1
[RLlib] Fixing conv filters config for ComplexInputNetwork (#14749) 2021-03-24 16:15:36 +01:00
Sven Mika
ee4b6e7e3b
[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569) 2021-03-12 18:27:25 +01:00
Sven Mika
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika
56878221ed
[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363) 2021-01-14 14:44:33 +01:00
Sven Mika
d49c3fae0b
[RLlib] Trajectory View API: Atari framestacking. (#13315) 2021-01-13 08:53:34 +01:00
Kai Fricke
25f10a947a
Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361)
This reverts commit e2b2abb88b.
2021-01-12 12:33:57 +01:00
Sven Mika
e2b2abb88b
[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339) 2021-01-11 22:42:30 +01:00
Sven Mika
bcaff63909
[RLlib] SquashedGaussians should throw error when entropy or kl are called. (#13126) 2021-01-07 15:07:35 +01:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika
8726521604
[RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). (#13091) 2020-12-30 22:30:52 -05:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* rmeove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Michael Luo
59ccbc0fc7
[RLlib] Model Annotations: Tensorflow (#11964) 2020-11-12 12:18:50 +01:00
Sven Mika
291c172d83
[RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909) 2020-11-11 18:45:28 +01:00
Jiajie Xiao
0b07af374a
allow tuple action space (#11429)
Co-authored-by: Jiajie Xiao <jj@Jiajies-MBP-2.attlocal.net>
2020-10-29 16:05:38 +01:00
Sven Mika
d3bc20b727
[RLlib] ConvTranspose2D module (#11231) 2020-10-12 15:00:42 +02:00
Sven Mika
957877ad3f
Tf version of VisionNet (ray/rllib/model/tf/vision_net.py) crashes iff len(conv-filters)=1. (#11330) 2020-10-11 12:49:47 +02:00
Sven Mika
ce96b03b07
[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) 2020-10-06 20:28:16 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika
28ab797cf5
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544) 2020-09-06 10:58:00 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00