Yi Cheng
fd0f967d2e
Revert "[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )" ( #25420 )
...
This reverts commit e4ceae19ef.
Reverts #25346
linux://python/ray/tests:test_client_library_integration never failed before this PR.
It also fails in the CI of the reverted PR (https://buildkite.com/ray-project/ray-builders-pr/builds/34079#01812442-c541-4145-af22-2a012655c128 ), so the failure is very likely caused by this PR.
The test failure output also appears to be related (https://buildkite.com/ray-project/ray-builders-branch/builds/7923#018125c2-4812-4ead-a42f-7fddb344105b ).
2022-06-02 20:38:44 -07:00
Sven Mika
e4ceae19ef
[RLlib] Move (A/DD)?PPO and IMPALA algos to algorithms dir and rename policy and trainer classes. ( #25346 )
2022-06-02 16:47:05 +02:00
Eric Liang
905258dbc1
Clean up docstyle in python modules and add LINT rule ( #25272 )
2022-06-01 11:27:54 -07:00
Eric Liang
4963dfaae0
[api] Add API stability annotations for all RLlib symbols and add to LINT ( #25060 )
2022-05-24 22:14:25 -07:00
Eric Liang
55d039af32
Annotate datasources and add API annotation check script ( #24999 )
...
Why are these changes needed?
Add API stability annotations for datasource classes, and add a linter to check all data classes have appropriate annotations.
2022-05-21 15:05:07 -07:00
Sven Mika
8f50087908
[RLlib] AlphaZero uses training_iteration API. ( #24507 )
2022-05-18 09:58:25 +02:00
Nathan Matare
012a4c8667
[RLlib] Allow passing **kwargs to action distribution. ( #24692 )
2022-05-18 09:22:37 +02:00
HJasperson
5f12c62226
[RLlib] Fix "tf variable is unhashable" Error. ( #24273 )
2022-04-29 10:07:02 +02:00
Fabian Witter
56bc90ca72
[RLlib] Remove Unnecessary List Conversion of Complex Observations in SAC Models (torch and tf). ( #24106 )
2022-04-25 11:21:34 +02:00
xwjiang2010
d7da0d706e
[rllib] Only conditionally import JaxCategorical in catalog.py ( #24086 )
...
* Experiment with fewer imports in catalog.py
* lint
2022-04-22 14:51:35 -07:00
Sven Mika
4d285a00a4
[RLlib] Issue 23689: tf Initializer has hard-coded float32 dtype. ( #23741 )
2022-04-07 21:35:02 +02:00
simonsays1980
9ca9c67bc9
[RLlib] Added dtype safeguards to the 'required_model_output_shape()' methods… ( #23490 )
2022-03-31 13:52:00 +02:00
Max Pumperla
60054995e6
[docs] fix doctests and activate CI ( #23418 )
2022-03-24 17:04:02 -07:00
Jun Gong
d12977c4fb
[RLlib] TF2 Bandit Agent ( #22838 )
2022-03-21 16:55:55 +01:00
Siyuan (Ryans) Zhuang
0c74ecad12
[Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). ( #23128 )
2022-03-15 17:34:21 +01:00
Fabien Couthouis
e575ed3350
[RLlib] Fix AttributeError with None obs shape + tf in _unpack_obs() utility ( #22428 )
2022-03-15 16:34:31 +01:00
Sven Mika
8e00537b65
[RLlib] SlateQ: framework=tf fixes and SlateQ documentation update ( #22543 )
2022-02-23 13:03:45 +01:00
Sven Mika
6522935291
[RLlib] Slate-Q tf implementation and tests/benchmarks. ( #22389 )
2022-02-22 09:36:44 +01:00
Balaji Veeramani
31ed9e5d02
[CI] Replace YAPF disables with Black disables ( #21982 )
2022-02-08 16:29:25 -08:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
ee41800c16
[RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02 . ( #21649 )
2022-01-27 22:07:05 +01:00
Jun Gong
55f3bcfb2d
[RLlib] Add a logstd term to MARWIL's loss func to encourage exploration. ( #21493 )
2022-01-26 16:00:17 +01:00
mickelliu
75078f965d
[Rllib] Fix range() (no keyword args supported!) in torch version of attention_net.py. ( #21598 )
2022-01-18 16:11:16 +01:00
Sven Mika
3ac4daba07
[RLlib] Discussion 4351: Conv2d default filter tests and add default setting for 96x96 image obs space. ( #21560 )
2022-01-13 18:50:42 +01:00
Avnish Narayan
f7a5fc36eb
[rllib] Give rnnsac_stateless cartpole gpu, increase timeout ( #21407 )
...
Increase test_preprocessors runtimes.
2022-01-06 11:54:19 -08:00
Sven Mika
9e6b871739
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. ( #21330 )
2022-01-05 11:29:44 +01:00
brulu
8b77fc0aef
[RLlib] Updating Repeated space. Allowing numpy arrays and adding representation. ( #20799 )
2021-12-16 08:27:55 +01:00
simonsays1980
1a8aa2da1f
[RLlib] Added 'tensorlib=numpy' to 'restore_original_dimensions()' such that … ( #20342 )
2021-12-15 14:03:18 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Sven Mika
f814c2af89
[RLlib; Docs] Docs API reference pages: rllib/execution, rllib/evaluation, rllib/models, rllib/offline. ( #20538 )
2021-12-10 09:41:29 +01:00
Jun Gong
2317c693cf
[RLlib] Use SampleBatch instead of input dict whenever possible ( #20746 )
2021-12-02 13:11:26 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
Avnish Narayan
026bf01071
[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. ( #19535 )
...
* Fix QMix, SAC, and MADDPG too.
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so we will potentially have to rerun
all of the pendulum v1 benchmarks, or use another
environment instead. The same applies to frozen
lake v0 and frozen lake v1.
Lastly, all of the RLlib tests have
been moved to python 3.7.
* Add gym installation based on python version.
Pin python<= 3.6 to gym 0.19 due to install
issues with atari roms in gym 0.20
* Reformatting
* Fixing tests
* Move atari-py install conditional to req.txt
* migrate to new ale install method
* Make parametric_actions_cartpole return float32 actions/obs
* Adding type conversions if obs/actions don't match space
* Add utils to make elements match gym space dtypes
Co-authored-by: Jun Gong <jungong@anyscale.com>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-03 16:24:00 +01:00
Sven Mika
cf21c634a3
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). ( #19982 )
2021-11-03 10:00:46 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as framework=tf2. ( #19273 )
...
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so we will potentially have to rerun
all of the pendulum v1 benchmarks, or use another
environment instead. The same applies to frozen
lake v0 and frozen lake v1.
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7.
* fix tune test_sampler::testSampleBoundsAx
* fix re-install ray for py3.7 tests
Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ( #19693 )
2021-10-25 15:00:00 +02:00
gjoliver
c3c42278e4
[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings ( #19652 )
...
* [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings.
* wip
2021-10-25 09:38:56 +02:00
Sven Mika
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. ( #18937 )
2021-10-04 13:29:00 +02:00
Sven Mika
ac3371a148
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. ( #18917 )
2021-09-30 16:39:38 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
o0olele
ff6730f903
[RLlib] Attention Nets + MultiDiscrete spaces: Fix range() takes no keyword args error! ( #17502 )
2021-09-24 13:43:58 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
8a72824c63
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). ( #18591 )
2021-09-15 22:16:48 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. ( #18381 )
2021-09-07 11:48:41 +02:00
Sven Mika
9a8ca6a69d
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. ( #18306 )
2021-09-03 13:29:57 +02:00
Kai Fricke
34cf5db109
[tune] Fix hyperopt points to evaluate for nested lists ( #18113 )
2021-08-26 14:34:22 +02:00