mirror of https://github.com/vale981/ray synced 2025-03-13 06:36:39 -04:00

25 commits

Author SHA1 Message Date
Sven Mika
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. () 2021-08-03 18:30:02 -04:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support () 2021-05-18 11:10:46 +02:00
Sven Mika
4b3add0066
[RLlib] Discussion 2021: PPO does not learn vf, iff use_gae=False (ignores use_critic setting). () 2021-05-04 14:17:00 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). () 2021-04-27 10:44:54 +02:00
Sven Mika
bbfa8ffec9
[RLlib] Minor release 1.3 warnings cleanups. () 2021-04-14 14:03:15 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. () 2021-03-23 09:50:18 -07:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. () 2021-03-17 08:18:15 +01:00
Sven Mika
8000258333
[RLlib] R2D2 Implementation. () 2021-02-25 12:18:11 +01:00
Sven Mika
d629292d63
[RLlib] Add grad_clip config option to MARWIL and stabilize grad clipping against inf global_norms. () 2021-01-22 19:36:02 +01:00
Sven Mika
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. () 2021-01-19 14:22:36 +01:00
Sven Mika
93c0a5549b
[RLlib] Deprecate vf_share_layers in top-level PPO/MAML/MB-MPO configs. () 2021-01-19 09:51:35 +01:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf () 2020-12-20 20:22:32 -05:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR . () 2020-12-07 13:08:17 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR : Smaller cleanups. ()
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
c17169dc11
[RLlib] Fix all example scripts to run on GPUs. () 2020-10-02 23:07:44 +02:00
Sven Mika
805dad3bc4
[RLlib] SAC algo cleanup. () 2020-09-20 11:27:02 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. () 2020-09-02 14:03:01 +02:00
Sven Mika
b0b0463161
[RLlib] Trajectory View API (preparatory cleanup and enhancements). () 2020-07-29 21:15:09 +02:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. () 2020-07-11 22:06:35 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). ()
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4fd8977eaf
[RLlib] Minor cleanup in preparation to tf2.x support. ()
* WIP.

* Fixes.

* LINT.

* Fixes.

* Fixes and LINT.

* WIP.
2020-06-25 19:01:32 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. () 2020-06-16 08:52:20 +02:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. ()
Deprecate all Model(v1) usage.
2020-04-29 12:12:59 +02:00
Sven Mika
e153e3179f
[RLlib] Exploration API: Policy changes needed for forward pass noisifications. ()
* Rollback.

* WIP.

* WIP.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does currently not support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-01 00:43:21 -07:00
Sven Mika
c957ed58ed
[RLlib] Implement PPO torch version. () 2020-01-20 23:06:50 -08:00
Renamed from rllib/agents/ppo/ppo_policy.py