Commit graph

251 commits

Author SHA1 Message Date
Sven Mika
893536ebd9
[RLlib] Move bandits into main agents folder; Make RecSim adapter more accessible; (#21773) 2022-01-27 13:58:12 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2022-01-25 14:16:58 +01:00
Sven Mika
3ac4daba07
[RLlib] Discussion 4351: Conv2d default filter tests and add default setting for 96x96 image obs space. (#21560) 2022-01-13 18:50:42 +01:00
Avnish Narayan
f7a5fc36eb
[rllib] Give rnnsac_stateless cartpole gpu, increase timeout (#21407)
Increase test_preprocessors runtimes.
2022-01-06 11:54:19 -08:00
Sven Mika
9e6b871739
[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330) 2022-01-05 11:29:44 +01:00
Sven Mika
abd3bef63b
[RLlib] QMIX better defaults + added to CI learning tests (#21332) 2022-01-04 08:54:41 +01:00
Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. (#21008) 2021-12-13 12:04:23 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918) 2021-12-11 14:57:58 +01:00
Eric Liang
6f93ea437e
Remove the flaky test tag (#21006) 2021-12-11 01:03:17 -08:00
Avnish Narayan
6996eaa986
[RLlib] Add necessary fields to Base Envs, and BaseEnv wrapper classes (#20832) 2021-12-09 14:40:40 +01:00
Ishant Mrinal
2868d1a2cf
[RLlib] Support for RE3 exploration algorithm (for tf) (#19551) 2021-12-07 13:26:34 +01:00
Sven Mika
60b2219d72
[RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) 2021-12-04 13:26:33 +01:00
Jun Gong
65bd8e29f8
[RLlib] Update a few things to get rid of the remote_vector_env deprecation warning. (#20753) 2021-12-02 13:10:44 +01:00
mvindiola1
8cee0c03bf
[RLlib] Update max_seq_len in pad_batch_to_sequences_of_same_size (#20743) 2021-11-30 18:00:07 +01:00
Sven Mika
7a585fb275
[RLlib; Documentation] RLlib README overhaul. (#20249) 2021-11-18 18:08:40 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250) 2021-11-17 21:40:16 +01:00
Avnish Narayan
dc17f0a241
Add error messages for missing tf and torch imports (#20205)
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 16:30:53 -08:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061) (#20399)" (#20417)
This reverts commit 90dc5460d4.
2021-11-16 14:49:41 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061)" (#20399)
This reverts commit 5b1c8e46e1.
2021-11-15 16:11:35 -08:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061) 2021-11-15 10:41:54 +01:00
Sven Mika
ebd56b57db
[RLlib; documentation] "RLlib in 60sec" overhaul. (#20215) 2021-11-10 22:20:06 +01:00
Sven Mika
143d23a278
[RLlib] Issue 20062: Action inference examples missing (#20144) 2021-11-10 18:49:06 +01:00
Sven Mika
76f8a9f125
[RLlib; testing] Increase size of two time-out'ing test cases from medium to large. (#20128) 2021-11-06 21:48:28 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981) 2021-11-05 16:10:00 +01:00
Sven Mika
4cb23d1c95
[Tune; Testing] Revert to 3.7 (undone by accident by previous PR); + some minor comment cleanups. (#20031) 2021-11-04 10:58:34 +01:00
gjoliver
2c1fa459d4
[RLlib] Add an RLlib Tune experiment to UserTest suite. (#19807)
* Add an RLlib Tune experiment to UserTest suite.

* Add ray.init()

* Move example script to example/tune/, so it can be imported as module.

* add __init__.py so our new module will get included in python wheel.

* Add block device to RLlib test instances.

* Reduce disk size a little bit.

* Add metrics reporting

* Allow max of 5 workers to accommodate all the worker tasks.

* revert disk size change.

* Minor updates

* Trigger build

* set max num workers

* Add a compute cfg for autoscaled cpu and gpu nodes.

* use 1gpu instance.

* install tblib for debugging worker crashes.

* Manually upgrade to pytorch 1.9.0

* -y

* torch=1.9.0

* install torch on driver

* bump timeout

* Write a more informational result dict.

* Revert changes to compute config files that are not used.

* add smoke test

* update

* reduce timeout

* Reduce the # of env per worker to 1.

* Small fix for getting trial_states

* Trigger build

* simplify result dict

* lint

* more lint

* fix smoke test

Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>
2021-11-03 17:04:27 -07:00
Avnish Narayan
026bf01071
[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535)
* Fix QMix, SAC, and MADDPG too.

* Unpin gym and deprecate pendulum v0

Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment instead. The same applies to frozen
lake v0 and frozen lake v1.

Lastly, all of the RLlib tests have
been moved to python 3.7

* Add gym installation based on python version.

Pin python<= 3.6 to gym 0.19 due to install
issues with atari roms in gym 0.20

* Reformatting

* Fixing tests

* Move atari-py install conditional to req.txt

* migrate to new ale install method

Make parametric_actions_cartpole return float32 actions/obs

Adding type conversions if obs/actions don't match space

Add utils to make elements match gym space dtypes

Co-authored-by: Jun Gong <jungong@anyscale.com>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-03 16:24:00 +01:00
Sven Mika
e6ae08f416
[RLlib] Optionally don't drop last ts in v-trace calculations (APPO and IMPALA). (#19601) 2021-11-03 10:01:34 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as framework=tf2. (#19273)
* Unpin gym and deprecate pendulum v0

Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment instead. The same applies to frozen
lake v0 and frozen lake v1.

Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7

* fix tune test_sampler::testSampleBoundsAx

* fix re-install ray for py3.7 tests

Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
4d945fe651
[RLlib] Issue 19878: Re-instate bare_metal_policy example script (#19881) 2021-10-30 12:50:39 -07:00
Rohan138
b9c9cc5946
[RLlib] Updated PettingZoo+RLlib tutorial; Removed pettingzoo example script (#19069)
* Updated PettingZoo+RLlib tutorial

Updated the tutorial and added link to the blog post by the PettingZoo team.

* Ran linting

* Converted link to tinyurl for linting

* fixed line lengths

* Decrease num_workers to 1

* Added comments

* Decreased num_workers

* Decreased timesteps

* Increased num_workers

* Update links and remove pettingzoo_env.py

* remove pettingzoo.py script from tests

Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-29 10:57:10 +02:00
Sven Mika
902e854af2
[RLlib; Docs overhaul] Docstring cleanup: Environments. (#19784)
* wip.

* Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes.

* remove bare_metal_policy_with_custom_view_reqs from tests
2021-10-29 10:46:52 +02:00
gjoliver
d81885c1f1
[RLlib] Fix all the CI tests that were broken by is_training and replay buffer changes; re-comment-in the failing RLlib tests (#19809)
* Fix DDPG, since it is based on GenericOffPolicyTrainer.

* Fix QMix, SAC, and MADDPG too.

* Undo QMix change.

* Fix DQN input batch type. Always use SampleBatch.

* apex ddpg should not use replay_buffer_config yet.

* Make eager tf policy to use SampleBatch.

* lint

* LINT.

* Re-enable RLlib broken tests to make sure things work ok now.

* fixes.

Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-28 18:06:47 +02:00
Simon Mo
5e927b01ad
Revert "[CI] Remove config that disables Bazel test result cache" (#19818)
* Revert "[CI] Remove config that disables Bazel test result cache (#18701)"

This reverts commit 098ff36faa.

* Remove all RLlib tests from BUILD that currently fail.

Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-28 15:54:53 +02:00
Avnish Narayan
ad87ddf93e
[rllib] Add deterministic test to gpu (#19306)
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-26 10:11:39 -07:00
Sven Mika
fd438d5630
[RLlib] Issue 18104: Cannot set remote_worker_envs=True for non local-mode and MultiAgentEnv. (#19133) 2021-10-07 22:39:21 +02:00
Sven Mika
ac3371a148
[RLlib] Discussion 3644: Fix bug for complex obs spaces containing Box([2D shape]) and discrete component. (#18917) 2021-09-30 16:39:38 +02:00
Sven Mika
05a55a9335
[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942) 2021-09-30 08:30:20 +02:00
Sven Mika
9c9b482661
[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. (#18939) 2021-09-29 21:31:34 +02:00
mvindiola1
62f5da0b65
[RLlib] Add unit tests for updating episode data in base_env (#17137) 2021-09-24 16:08:11 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. (#18809) 2021-09-23 08:31:51 +02:00
Sven Mika
93208bb087
[RLlib] Increase size of (very flakey) action_masking example script test. (#18816) 2021-09-22 21:48:01 +02:00
Sven Mika
8a72824c63
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591) 2021-09-15 22:16:48 +02:00
Sven Mika
c5d20849ae
[RLlib] Rename rllib rollout into rllib evaluate (backward compatible) to match Trainer API. (#18467) 2021-09-15 08:45:17 +02:00
Sven Mika
08c09737fa
[RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550) 2021-09-14 19:58:10 +02:00
Ameer Haj Ali
e6807ecb43
Change tests owners for ml tests (#18417) 2021-09-14 01:04:52 -07:00
Sven Mika
ea4a22249c
[RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494) 2021-09-11 23:08:09 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 (#18367) 2021-09-09 08:10:42 +02:00
Sven Mika
56f142cac1
[RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). (#18380) 2021-09-07 08:08:37 +02:00