Charles Sun
61880591e9
[RLlib] Add DTTorchModel ( #27872 )
2022-08-16 18:18:29 -07:00
kourosh hakhamaneshi
5520a96ce0
[RLlib] Fix get_init_state annotation in torch and define more specific TensorType. ( #27791 )
2022-08-11 20:02:17 +02:00
Artur Niederfahrenhorst
a598458c46
[RLlib] Fix complex torch one-hot and flattened layers not being added to module list. ( #27304 )
2022-08-01 15:52:28 +02:00
Steven Morad
d0a8e3c36f
[RLlib] User-friendly RNN sequencing. ( #27087 )
2022-08-01 15:32:22 +02:00
Eric Liang
a4434fac7f
[docs] Fix the remaining style violations in docstrings and add lint rule ( #27033 )
2022-07-27 22:24:20 -07:00
Fabian Witter
dc2ad6c8b4
[RLlib] Fix ModelCatalog for nested complex inputs ( #25620 )
2022-07-22 00:45:25 -07:00
Olaf Lipinski
8271406a04
[RLLib] Fix MultiDiscrete not being one-hotted correctly ( #26558 )
...
Co-authored-by: Jun Gong <jungong@anyscale.com>
2022-07-20 15:25:53 -07:00
Ram Rachum
14800e5ac7
Fix exception cause in preprocessors.py ( #26322 )
2022-07-12 20:15:04 -07:00
Sven Mika
130b7eeaba
[RLlib] Trainer to Algorithm renaming. ( #25539 )
2022-06-11 15:10:39 +02:00
kourosh hakhamaneshi
4cdd508f70
[RLlib] Added CRR implementation. ( #25499 )
2022-06-08 11:42:02 +02:00
Artur Niederfahrenhorst
243038d00a
[RLlib] Issue 25401: Faulty usage of get_filter_config in ComplexInputNetworks ( #25493 )
2022-06-06 13:04:17 +02:00
Sven Mika
b5bc2b93c3
[RLlib] Move all remaining algos into algorithms directory. ( #25366 )
2022-06-04 07:35:24 +02:00
Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
...
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before this PR.
In the CI of the reverted PR, it also fails (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128 ). So it is highly likely caused by this PR.
The test output failure seems related as well (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b )
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. ( #24507 )
2022-05-18 09:58:25 +02:00
Nathan Matare
012a4c8667
[RLlib] Allow passing **kwargs to action distribution. ( #24692 )
2022-05-18 09:22:37 +02:00
HJasperson
5f12c62226
[RLlib] Fix "tf variable is unhashable" Error. ( #24273 )
2022-04-29 10:07:02 +02:00
Fabian Witter
56bc90ca72
[RLlib] Remove Unnecessary List Conversion of Complex Observations in SAC Models (torch and tf). ( #24106 )
2022-04-25 11:21:34 +02:00
xwjiang2010
d7da0d706e
[rllib] Only conditionally import JaxCategorical in catalog.py ( #24086 )
...
* Experiment with less imports in catalog.py
* lint
2022-04-22 14:51:35 -07:00
Sven Mika
4d285a00a4
[RLlib] Issue 23689: tf Initializer has hard-coded float32 dtype. ( #23741 )
2022-04-07 21:35:02 +02:00
simonsays1980
9ca9c67bc9
[RLlib] Added dtype safeguards to the 'required_model_output_shape()' methods… ( #23490 )
2022-03-31 13:52:00 +02:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI ( #23418 )
2022-03-24 17:04:02 -07:00
Jun Gong
d12977c4fb
[RLlib] TF2 Bandit Agent ( #22838 )
2022-03-21 16:55:55 +01:00
Siyuan (Ryans) Zhuang
0c74ecad12
[Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). ( #23128 )
2022-03-15 17:34:21 +01:00
Fabien Couthouis
e575ed3350
[RLlib] Fix AttributeError with None obs shape + tf in _unpack_obs() utility ( #22428 )
2022-03-15 16:34:31 +01:00
Sven Mika
8e00537b65
[RLlib] SlateQ: framework=tf fixes and SlateQ documentation update ( #22543 )
2022-02-23 13:03:45 +01:00
Sven Mika
6522935291
[RLlib] Slate-Q tf implementation and tests/benchmarks. ( #22389 )
2022-02-22 09:36:44 +01:00
Balaji Veeramani
31ed9e5d02
[CI] Replace YAPF disables with Black disables ( #21982 )
2022-02-08 16:29:25 -08:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
ee41800c16
[RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02 . ( #21649 )
2022-01-27 22:07:05 +01:00
Jun Gong
55f3bcfb2d
[RLlib] Add a logstd term to MARWIL's loss func to encourage exploration. ( #21493 )
2022-01-26 16:00:17 +01:00
mickelliu
75078f965d
[Rllib] Fix range() (no keyword args supported!) in torch version of attention_net.py. ( #21598 )
2022-01-18 16:11:16 +01:00
Sven Mika
3ac4daba07
[RLlib] Discussion 4351: Conv2d default filter tests and add default setting for 96x96 image obs space. ( #21560 )
2022-01-13 18:50:42 +01:00
Avnish Narayan
f7a5fc36eb
[rllib] Give rnnsac_stateless cartpole gpu, increase timeout ( #21407 )
...
Increase test_preprocessors runtimes.
2022-01-06 11:54:19 -08:00
Sven Mika
9e6b871739
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. ( #21330 )
2022-01-05 11:29:44 +01:00
brulu
8b77fc0aef
[RLlib] Updating Repeated space. Allowing numpy arrays and adding representation. ( #20799 )
2021-12-16 08:27:55 +01:00
simonsays1980
1a8aa2da1f
[RLlib] Added 'tensorlib=numpy' to 'restore_original_dimensions()' such that … ( #20342 )
2021-12-15 14:03:18 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Sven Mika
f814c2af89
[RLlib; Docs] Docs API reference pages: rllib/execution, rllib/evaluation, rllib/models, rllib/offline. ( #20538 )
2021-12-10 09:41:29 +01:00
Jun Gong
2317c693cf
[RLlib] Use SampleBatch instead of input dict whenever possible ( #20746 )
2021-12-02 13:11:26 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
Avnish Narayan
026bf01071
[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. ( #19535 )
...
* Fix QMix, SAC, and MADDPG too.
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment in favor. The same applies to frozen
lake v0 and frozen lake v1
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7
* Add gym installation based on python version.
Pin python<= 3.6 to gym 0.19 due to install
issues with atari roms in gym 0.20
* Reformatting
* Fixing tests
* Move atari-py install conditional to req.txt
* migrate to new ale install method
Make parametric_actions_cartpole return float32 actions/obs
Adding type conversions if obs/actions don't match space
Add utils to make elements match gym space dtypes
Co-authored-by: Jun Gong <jungong@anyscale.com>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-03 16:24:00 +01:00
Sven Mika
cf21c634a3
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). ( #19982 )
2021-11-03 10:00:46 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as framework=tf2. ( #19273 )
...
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment in favor. The same applies to frozen
lake v0 and frozen lake v1
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7
* fix tune test_sampler::testSampleBoundsAx
* fix re-install ray for py3.7 tests
Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00