Sven Mika
04a5c72ea3
Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" ( #18708 )
2022-02-10 13:44:22 +01:00
Alex Wu
b122f093c1
Revert "[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test." ( #22250 )
...
Reverts ray-project/ray#22126
Breaks rllib:tests/test_io
2022-02-09 09:26:36 -08:00
Sven Mika
ac3e6ab411
[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test. ( #22126 )
2022-02-08 19:04:13 +01:00
Sven Mika
c17a44cdfa
Revert "Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni…" ( #22153 )
2022-02-08 16:43:00 +01:00
SangBin Cho
a887763b38
Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni… ( #22105 )
...
This reverts commit 3f03ef8ba8.
2022-02-04 00:54:50 -08:00
Sven Mika
3f03ef8ba8
[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learning via league-based self-play. ( #21356 )
2022-02-03 09:32:09 +01:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Jun Gong
7517aefe05
[RLlib] Bring back BC and Marwil learning tests. ( #21574 )
2022-01-14 14:35:32 +01:00
Sven Mika
188324c5c7
[RLlib] Issue 21552: `unsquash_action` and `clip_action` (when None) cause wrong actions computed by `Trainer.compute_single_action`. ( #21553 )
2022-01-12 18:56:51 +01:00
Sven Mika
f94bd99ce4
[RLlib] Issue 21044: Improve error message for "multiagent" dict checks. ( #21448 )
2022-01-11 19:50:03 +01:00
Sven Mika
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. ( #21420 )
2022-01-10 11:22:55 +01:00
Sven Mika
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). ( #20984 )
2021-12-21 08:39:05 +01:00
gjoliver
e7f9e8ceec
[RLlib] Report total_train_steps correctly for offline agents like CQL. ( #20541 )
...
* Fix trainer timestep reporting for offline agents like CQL.
* wip.
* extend timesteps_total to 200K for learning_tests_pendulum_cql test
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-22 21:46:45 +01:00
gjoliver
724a140795
[rllib] Make sure json can serialize result dict ( #20439 )
...
We may have fields in the result dict that are or None.
Make sure our results are json serializable.
2021-11-17 10:27:00 -08:00
gjoliver
6e787f70e0
[Rllib/release] Disable throughput check ( #20387 )
...
Throughput check was enabled by d8a61f801f prematurely.
E.g., see state before the commit:
a931076f59/rllib/utils/test_utils.py (L740-L741)
2021-11-16 11:05:51 -08:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." ( #20285 )
...
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )" (#20284 )"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. ( #20055 )" ( #20284 )
...
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Sven Mika
6f85af435f
[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. ( #20055 )
2021-11-11 12:16:20 +01:00
gjoliver
d8a61f801f
[RLlib] Create a set of performance benchmark tests to run nightly. ( #19945 )
...
* Create a core set of algorithms tests to run nightly.
* Run release tests under tf, tf2, and torch frameworks.
* Fix
* Add eager_tracing option for tf2 framework.
* make sure core tests can run in parallel.
* cql
* Report progress while running nightly/weekly tests.
* Include SAC in nightly lineup.
* Revert changes to learning_tests
* rebrand to performance test.
* update build_pipeline.py with new performance_tests name.
* Record stats.
* bug fix, need to populate experiments dict.
* Alphabetize yaml files.
* Allow specifying frameworks. And do not run tf2 by default.
* remove some debugging code.
* fix
* Undo testing changes.
* Do not run CQL regression for now.
* LINT.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-08 18:15:13 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as `framework=tf2`. ( #19273 )
...
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment in favor. The same applies to frozen
lake v0 and frozen lake v1
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7
* fix tune test_sampler::testSampleBoundsAx
* fix re-install ray for py3.7 tests
Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
Carlo Grisetti
5cee8a1985
[release tests] Switch from yaml.load to yaml.safe_load ( #19365 )
2021-10-13 17:27:25 -07:00
Sven Mika
d439fd7f17
[RLlib] TF2/eager memory leak fixes. ( #19198 )
2021-10-09 00:11:53 +02:00
Sven Mika
c3e3fc7637
[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. ( #19100 )
2021-10-07 23:57:53 +02:00
Sven Mika
73f5c4039b
[RLlib] Fix flakey test_a3c, test_maml, test_apex_dqn. ( #19035 )
2021-10-04 13:23:51 +02:00
Sven Mika
16ad46a654
[RLlib] Fix broken test_r2d2.py. ( #19017 )
2021-09-30 21:19:37 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with `compute_single_action` and `compute_actions_from_input_dict`. ( #18921 )
2021-09-30 15:03:37 +02:00
Sven Mika
e6aae61487
[RLlib; testing] Fix bug in stress tests not handling >1 trials per experiment (due to grid-search in IMPALA stress tests). ( #18705 )
2021-09-20 15:31:57 +02:00
Sven Mika
ba1c489b79
[RLlib Testing] Lower `--smoke-test` "time_total_s" to make sure it doesn't time out. ( #18670 )
2021-09-16 18:22:23 +02:00
Sven Mika
8a72824c63
[RLlib Testing] Split and unflake more CI tests (make sure all jobs are < 30min). ( #18591 )
2021-09-15 22:16:48 +02:00
Sven Mika
45f60e51a9
[RLlib] DDPPO fixes and benchmarks. ( #18390 )
2021-09-08 19:39:01 +02:00
Sven Mika
cabaa3b3c6
[RLlib Testing] Add A3C/APPO/BC/DDPPO/MARWIL/CQL/ES/ARS/TD3 to weekly learning tests. ( #18381 )
2021-09-07 11:48:41 +02:00
Sven Mika
5292b70fc6
[RLlib] Add multi-GPU attention net tests to nightly test suite (+ R2D2 tests for LSTM and attention nets). ( #18368 )
2021-09-06 17:48:05 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. ( #18358 )
2021-09-06 12:14:20 +02:00
Sven Mika
9a8ca6a69d
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. ( #18306 )
2021-09-03 13:29:57 +02:00
Sven Mika
a7670d9fab
[RLlib; Testing] Fix smoke-test settings for nightly `learning_tests` and `stress_test`; Add `pybullet_envs` to app-config. ( #18274 )
2021-09-01 21:46:06 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. ( #17999 )
2021-08-31 12:21:49 +02:00
Sven Mika
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. ( #17778 )
2021-08-18 17:21:01 +02:00
Kai Fricke
10fd7111b3
[rllib] Improve test learning check, fix flaky two step qmix ( #16843 )
2021-07-06 19:39:12 +01:00
Sven Mika
bc09e75b78
[RLlib] Fix 3 flakey test cases. ( #15785 )
2021-05-16 12:20:33 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. ( #15273 )
2021-04-30 19:26:30 +02:00
Sven Mika
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. ( #13522 )
2021-02-02 13:05:58 +01:00
Sven Mika
d49c3fae0b
[RLlib] Trajectory View API: Atari framestacking. ( #13315 )
2021-01-13 08:53:34 +01:00
Sven Mika
8726521604
[RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). ( #13091 )
2020-12-30 22:30:52 -05:00
Sven Mika
340b1e99fc
[RLlib] Fix JAX import bug. ( #12621 )
2020-12-07 11:05:08 -08:00
Sven Mika
3f4bc16276
[RLlib] Add a minimal JAX ModelV2 (FCNet) to RLlib. ( #12502 )
2020-12-03 15:51:30 +01:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. ( #12063 )
2020-11-19 19:01:14 +01:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). ( #11609 )
2020-10-27 10:00:24 +01:00