Commit graph

52 commits

Author SHA1 Message Date
Balaji Veeramani
31ed9e5d02
[CI] Replace YAPF disables with Black disables (#21982) 2022-02-08 16:29:25 -08:00
Sven Mika
c17a44cdfa
Revert "Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni…" (#22153) 2022-02-08 16:43:00 +01:00
SangBin Cho
a887763b38
Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni… (#22105)
This reverts commit 3f03ef8ba8.
2022-02-04 00:54:50 -08:00
Sven Mika
3f03ef8ba8
[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learning via league-based self-play. (#21356) 2022-02-03 09:32:09 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black (#21975)
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2022-01-25 14:16:58 +01:00
Sven Mika
b10d5533be
[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452) 2022-01-10 11:19:40 +01:00
Sven Mika
b4790900f5
[RLlib] Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer. (#20725) 2021-12-04 22:05:26 +01:00
Sven Mika
49cd7ea6f9
[RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). (#20571) 2021-11-23 23:01:05 +01:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py." (#20285)
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. (#20055)" (#20284)
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Sven Mika
6f85af435f
[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. (#20055) 2021-11-11 12:16:20 +01:00
Avnish Narayan
026bf01071
[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535)
* Fix QMix, SAC, and MADDPG too.

* Unpin gym and deprecate pendulum v0

Many tests in rllib depended on pendulum v0;
however, in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so we may have to rerun
all of the pendulum v1 benchmarks or use another
environment instead. The same applies to frozen
lake v0 and frozen lake v1.

Lastly, all of the RLlib tests have
been moved to python 3.7.

* Add gym installation based on python version.

Pin python<= 3.6 to gym 0.19 due to install
issues with atari roms in gym 0.20

* Reformatting

* Fixing tests

* Move atari-py install conditional to req.txt

* migrate to new ale install method

* Make parametric_actions_cartpole return float32 actions/obs

* Adding type conversions if obs/actions don't match space

* Add utils to make elements match gym space dtypes

Co-authored-by: Jun Gong <jungong@anyscale.com>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-03 16:24:00 +01:00
Sven Mika
cf21c634a3
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982) 2021-11-03 10:00:46 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) 2021-11-01 21:46:02 +01:00
Sven Mika
45f60e51a9
[RLlib] DDPPO fixes and benchmarks. (#18390) 2021-09-08 19:39:01 +02:00
Sven Mika
f3bbe4ea44
[RLlib] Test cases/BUILD cleanup; split "everything else" (longest running one rn) tests in 2. (#17640) 2021-08-16 22:01:01 +02:00
Sven Mika
e0640ad0dc
[RLlib] Fix seeding for ES and ARS. (#16744) 2021-07-19 13:13:05 -04:00
Sven Mika
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709) 2021-04-16 09:16:24 +02:00
Sven Mika
b267f1f1ba
[RLlib] Add support for Int-Box action spaces. (#15012) 2021-04-11 13:16:01 +02:00
Sven Mika
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) 2021-03-08 15:41:27 +01:00
Sven Mika
99ae7bae05
[RLlib] JAXPolicy prep. PR #1. (#13077) 2020-12-26 20:14:18 -05:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
* WIP.

* LINT and fixes.
MB-MPO and MAML not working yet.

* wip

* update

* update

* remove

* remove dep

* higher

* Update requirements_rllib.txt

* Update requirements_rllib.txt

* relpos

* no mbmpo

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
0bd69edd71
[RLlib] Trajectory view API: enable by default for ES and ARS (#11826) 2020-11-12 10:33:10 -08:00
Sven Mika
0c0f67c14d
[RLlib] ARS/ES eval workers not working: Issue 9933. (#11308) 2020-10-12 13:49:48 -07:00
Eric Liang
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates (#9477)
* update

* add test

* lint

* fix super call

* speed es test up
2020-07-23 12:50:25 -07:00
Sven Mika
935d8308fb
[RLlib] Issue #9437 (PyTorch converts to CPU tensor, even if on GPU). (#9497) 2020-07-16 14:55:50 +02:00
Sven Mika
5b2a97597b
[RLlib] Retire try_import_tree (should be installed along with other requirements). (#9211)
- Retire try_import_tree.
- Stabilize test_supported_multi_agent.py.
2020-07-02 13:06:34 +02:00
Sven Mika
c4ccbfdfa9
[RLlib] tf-eager support for ES and ARS (tf2.x preparation). (#9207) 2020-07-02 13:03:10 +02:00
Richard Liaw
d35f0e40d0
[tune] Use public methods for trainable (#9184) 2020-07-01 11:00:00 -07:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action() for all agents. (#8903) 2020-06-13 17:51:50 +02:00
Sven Mika
25c0974543
[RLlib] Issue 8412 (Adam vars not stored in ModelV2). (#8480) 2020-06-05 21:07:02 +02:00
Sven Mika
97d524c075
[RLlib] Issue 8769 broken OOM tests_dir cases (R & S). (#8770) 2020-06-05 08:34:21 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Sven Mika
6d196197bc
[RLlib] utils/spaces ... (#8608) 2020-05-27 10:21:30 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). (#8514) 2020-05-20 22:29:08 +02:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356) 2020-05-08 16:31:31 +02:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. (#8146)
Deprecate all Model(v1) usage.
2020-04-29 12:12:59 +02:00
Sven Mika
1775e89f26
[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143)
Deprecate TupleActions and support arbitrarily nested action spaces.
Closes issue #8143.
2020-04-28 14:59:16 +02:00
Sven Mika
d15609ba2a
[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106)
This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.
2020-04-21 09:47:52 +02:00
Sven Mika
3812bfedda
[RLlib] PyTorch version of ES (Evolution Strategies). (#8104)
PyTorch version of Evolution Strategies (ES) Algo.
2020-04-20 21:47:28 +02:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest compatible

* fix test

* fix sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Sven Mika
dded5b6d22
[RLlib] ES env_config is not a EnvContext object (e.g. does not contain worker_index). (#7560) 2020-03-11 20:33:20 -07:00
Eric Liang
1989eed3bf
[RLlib] Issue 7136: rollout not working for ES and ARS. (#7444)
* Fix.

* Fix issue #7136.

* ARS fix.
2020-03-04 23:57:44 -08:00
Sven Mika
e6227082bd
[RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Sven
60d4d5e1aa
Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions (#5304) 2019-08-10 14:05:12 -07:00