Sven Mika
63db0e3a7c
[RLlib] Fix SAC learning test flakiness introduced in PR: "Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer." ( #20985 )
2021-12-09 14:24:27 +01:00
Ishant Mrinal
2868d1a2cf
[RLlib] Support for RE3 exploration algorithm (for tf) ( #19551 )
2021-12-07 13:26:34 +01:00
Avnish Narayan
b8c64480d8
[RLlib] Change return type of try_reset to MultiEnvDict ( #20868 )
2021-12-06 14:15:33 +01:00
Sven Mika
b4790900f5
[RLlib] Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer. ( #20725 )
2021-12-04 22:05:26 +01:00
Sven Mika
60b2219d72
[RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. ( #20757 )
2021-12-04 13:26:33 +01:00
Amog Kamsetty
611bfc1352
[ML] Move find_free_port to ml_utils ( #20828 )
...
Small refactoring of common utility used by Train, Tune, and Rllib.
2021-12-03 13:38:42 -08:00
Sven Mika
0de41e4a6b
[RLlib] Trainer sub-class QMIX/MAML/MB-MPO (instead of build_trainer). ( #20639 )
2021-12-02 13:17:10 +01:00
Jun Gong
2317c693cf
[RLlib] Use SampleBatch instead of input dict whenever possible ( #20746 )
2021-12-02 13:11:26 +01:00
Jun Gong
65bd8e29f8
[RLlib] Update a few things to get rid of the remote_vector_env deprecation warning. ( #20753 )
2021-12-02 13:10:44 +01:00
Sven Mika
9e38f6f613
[RLlib] Trainer sub-class DDPG/TD3/APEX-DDPG (instead of build_trainer). ( #20636 )
2021-12-01 10:52:12 +01:00
Avnish Narayan
74dd0e4085
[RLlib] Make to_base_env() a method of all RLlib-supported Env classes ( #20811 )
2021-12-01 09:01:02 +01:00
Avnish Narayan
3ddc09544d
[rllib] Env to base env refactor ( #20785 )
2021-11-30 17:02:10 -08:00
Sven Mika
bec719d823
[RLlib] Trainer sub-class IMPALA (instead of using build_trainer()). ( #20570 )
2021-11-30 19:08:36 +01:00
Sven Mika
3d2e27485b
[RLlib] Trainer sub-class DQN/SimpleQ/APEX-DQN/R2D2 (instead of using build_trainer). ( #20633 )
2021-11-30 18:05:44 +01:00
Carlo Grisetti
514ed27f63
[RLlib] Fix deprecation message for rllib.env.remote_vector_env (now RemoteBaseEnv) and migrate import ( #20750 )
2021-11-30 18:01:21 +01:00
mvindiola1
8cee0c03bf
[RLlib] Update max_seq_len in pad_batch_to_sequences_of_same_size ( #20743 )
2021-11-30 18:00:07 +01:00
mvindiola1
eadc7669c5
[RLlib] SampleBatch.concat_samples fix incorrect max_seq_len calculation ( #20704 )
2021-11-29 12:01:40 +01:00
Sven Mika
e37afe0425
[RLlib; Docs] Auto API reference pages overhaul: rllib/policy and rllib/agents packages. ( #20537 )
2021-11-25 09:35:19 +01:00
Sven Mika
c07d8c4c22
[RLlib] Trainer sub-class A2C/A3C (instead of build_trainer). ( #20635 )
2021-11-24 22:07:13 +01:00
Sven Mika
49cd7ea6f9
[RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). ( #20571 )
2021-11-23 23:01:05 +01:00
Sven Mika
9d2fe5756c
[RLlib] Trainer sub-class for APPO (instead of using build_trainer()). ( #20424 )
2021-11-22 22:14:21 +01:00
gjoliver
e7f9e8ceec
[RLlib] Report total_train_steps correctly for offline agents like CQL. ( #20541 )
...
* Fix trainer timestep reporting for offline agents like CQL.
* wip.
* extend timesteps_total to 200K for learning_tests_pendulum_cql test
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-22 21:46:45 +01:00
Artur Niederfahrenhorst
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) ( #20552 )
2021-11-19 11:57:37 +01:00
gjoliver
18862f9f44
[RLlib] Add a comment in the doc string of on_learn_on_batch callback function. ( #20456 )
2021-11-19 10:49:07 +01:00
Avnish Narayan
b6077a36d4
[RLlib; Pre-checks/better failure behavior]: Env Checker for Gym Environments ( #20481 )
2021-11-19 09:41:03 +01:00
Sven Mika
7a585fb275
[RLlib; Documentation] RLlib README overhaul. ( #20249 )
2021-11-18 18:08:40 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
gjoliver
724a140795
[rllib] Make sure json can serialize result dict ( #20439 )
...
We may have fields in the result dict that are or None.
Make sure our results are json serializable.
2021-11-17 10:27:00 -08:00
Avnish Narayan
dc17f0a241
Add error messages for missing tf and torch imports ( #20205 )
...
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 16:30:53 -08:00
Kai Fricke
05d21497db
[rllib/tune] Fix durable trainable in trainer template, add release test ( #20422 )
2021-11-16 20:52:42 +00:00
gjoliver
6e787f70e0
[Rllib/release] Disable throughput check ( #20387 )
...
Throughput check was enabled by d8a61f801f prematurely.
E.g., see state before the commit:
a931076f59/rllib/utils/test_utils.py (L740-L741)
2021-11-16 11:05:51 -08:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 ) ( #20399 )" ( #20417 )
...
This reverts commit 90dc5460d4.
2021-11-16 14:49:41 +01:00
Stefan Schneider
2b3d0c691f
[RLlib] Document and extend action mask example. ( #20390 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 13:20:41 +01:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py." ( #20285 )
...
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )" (#20284 )"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )" ( #20399 )
...
This reverts commit 5b1c8e46e1.
2021-11-15 16:11:35 -08:00
Sven Mika
6ff4061f3a
[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. ( #20366 )
...
* wip.
* Apply suggestions from code review
2021-11-15 16:41:08 +01:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )
2021-11-15 10:41:54 +01:00
xwjiang2010
cdf70c2900
[Tune] Remove legacy resources implementations in Runner and Executor. ( #19773 )
2021-11-12 12:33:39 -08:00
Sven Mika
38c456b6f4
[RLlib; Tune] Fix rllib/train.py script after tune.Experiment c'tor change. ( #20283 )
2021-11-12 15:25:50 +01:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. ( #20055 )" ( #20284 )
...
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Sven Mika
70fe25055a
[RLlib] Issue: Get single step input dict incorrect. ( #20217 )
2021-11-12 08:38:51 +01:00
Sven Mika
6f85af435f
[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. ( #20055 )
2021-11-11 12:16:20 +01:00
Sven Mika
ebd56b57db
[RLlib; documentation] "RLlib in 60sec" overhaul. ( #20215 )
2021-11-10 22:20:06 +01:00
Sven Mika
143d23a278
[RLlib] Issue 20062: Action inference examples missing ( #20144 )
2021-11-10 18:49:06 +01:00
Sungho Joo
dc51af798c
[RLlib] Minor fix on json encoding during worker sampling ( #20134 )
...
* import custom json encoder from util and improve encoder default function
* linting
2021-11-09 16:46:41 -08:00
Kai Fricke
9c2b8c8501
[tune] Deprecate DurableTrainable ( #19880 )
2021-11-08 20:56:07 +00:00
gjoliver
d8a61f801f
[RLlib] Create a set of performance benchmark tests to run nightly. ( #19945 )
...
* Create a core set of algorithms tests to run nightly.
* Run release tests under tf, tf2, and torch frameworks.
* Fix
* Add eager_tracing option for tf2 framework.
* make sure core tests can run in parallel.
* cql
* Report progress while running nightly/weekly tests.
* Include SAC in nightly lineup.
* Revert changes to learning_tests
* rebrand to performance test.
* update build_pipeline.py with new performance_tests name.
* Record stats.
* bug fix, need to populate experiments dict.
* Alphabetize yaml files.
* Allow specifying frameworks. And do not run tf2 by default.
* remove some debugging code.
* fix
* Undo testing changes.
* Do not run CQL regression for now.
* LINT.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-08 18:15:13 +01:00
Sven Mika
eea6b40a3e
[RLlib] Minor cleanups in Trainer; better tf/tf2 info messages about possible tracing speedups. ( #20109 )
2021-11-08 15:37:27 +01:00
Sven Mika
76f8a9f125
[RLlib; testing] Increase size of two time-out'ing test cases from medium to large. ( #20128 )
2021-11-06 21:48:28 +01:00
Amog Kamsetty
3408b60d2b
[Release] Refactor User Tests ( #20028 )
...
* wip
* add directory
* wip
* try again
* Revert "try again"
This reverts commit 82d33ccea6f92848df025e019b87df73cea49e5d.
* finish
* formatting
* fix merge
* fix path
* chmod
* check
* sudo
* wip
* update
* fix horovod
* try
* typo
* reduce num workers
2021-11-05 17:28:37 -07:00