Commit graph

59 commits

Author SHA1 Message Date
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. (#25366) 2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346)" (#25420)
This reverts commit e4ceae19ef.

Reverts #25346

linux://python/ray/tests:test_client_library_integration never fail before this PR.

In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128). So high likely it's because of this PR.

And test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b)
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. (#25346) 2022-06-02 16:47:05 +02:00
Fabian Witter
56bc90ca72
[RLlib] Remove Unnecessary List Conversion of Complex Observations in SAC Models (torch and tf). (#24106) 2022-04-25 11:21:34 +02:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI (#23418) 2022-03-24 17:04:02 -07:00
Fabien Couthouis
e575ed3350
[RLlib] Fix AttributeError with None obs shape + tf in _unpack_obs() utility (#22428) 2022-03-15 16:34:31 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
simonsays1980
1a8aa2da1f
[RLlib] Added `tensorlib=numpy' to 'restore_original_dimensions()' such that … (#20342) 2021-12-15 14:03:18 +01:00
Sven Mika
f814c2af89
[RLlib; Docs] Docs API reference pages: rllib/execution, rllib/evaluation, rllib/models, rllib/offline. (#20538) 2021-12-10 09:41:29 +01:00
Jun Gong
2317c693cf
[RLlib] Use SampleBrach instead of input dict whenever possible (#20746) 2021-12-02 13:11:26 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981) 2021-11-05 16:10:00 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) 2021-11-01 21:46:02 +01:00
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) 2021-10-25 15:00:00 +02:00
gjoliver
c3c42278e4
[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings (#19652)
* [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings.

* wip
2021-10-25 09:38:56 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 (#18367) 2021-09-09 08:10:42 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) 2021-08-21 17:05:48 +02:00
Sven Mika
7eb1a29426
[RLlib] Fix ModelV2 custom metrics for torch. (#16734) 2021-07-01 13:01:40 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) 2021-04-30 19:26:30 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) 2021-04-27 10:44:54 +02:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709) 2021-04-16 09:16:24 +02:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. (#13065) 2021-03-17 08:18:15 +01:00
Sven Mika
ee4b6e7e3b
[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569) 2021-03-12 18:27:25 +01:00
Sven Mika
95ef04b71a
[RLlib] Implement TorchPolicy.export_model. (#13989) 2021-02-22 17:09:40 +01:00
Sven Mika
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 2021-02-02 13:05:58 +01:00
Sven Mika
6f342a2221
[RLlib] Preparatory PR for: Documentation on Model Building. (#13260) 2021-01-08 10:56:09 +01:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
a5318961de
[RLlib] Preprocessor fixes (multi-discrete) and tests. (#13083) 2020-12-26 20:14:36 -05:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf (#12753) 2020-12-20 20:22:32 -05:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 2020-11-12 16:27:34 +01:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 2020-10-27 10:00:24 +01:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. (#10825) 2020-09-20 11:27:02 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) 2020-08-15 13:24:22 +02:00
Sven Mika
57690a3a9f
[RLlib] Trajectory view API - 02 actual API scaffold (#9753) 2020-08-06 10:54:20 +02:00
Eric Liang
590943a499
[rllib] Type annotations for model classes (#9646) 2020-07-24 12:01:46 -07:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). (#9269) 2020-07-14 04:27:49 +02:00
Sven Mika
01125b8fcf
[RLlib] DQN rainbow eager-mode (keras style NoisyLayer) (preparation for native tf2.x support). (#9304) 2020-07-09 10:44:10 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
0d37103f84
[RLlib] Prototype: Model Trajectory View API, part 0 (#9171) 2020-06-30 05:33:19 +02:00
Sven Mika
5c6d5d4ab1
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970) 2020-06-27 20:50:01 +02:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
Eric Liang
be26a7b1b0
[rllib] Support for complex / variable-length observation spaces (#8393) 2020-06-06 12:22:19 +02:00
Eric Liang
baadbdf8d4
[rllib] Execute PPO using training workflow (#8206)
* wip

* add kl

* kl

* works now

* doc update

* reorg

* add ddppo

* add stats

* fix fetch

* comment

* fix learner stat regression

* test fixes

* fix test
2020-04-30 01:18:09 -07:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. (#8146)
Deprecate all Model(v1) usage.
2020-04-29 12:12:59 +02:00
Sven Mika
1b31c11806
[RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934) 2020-04-09 14:04:21 -07:00