Sven Mika
daa4304a91
[RLlib] Switch off preprocessors by default for PGTrainer. ( #21008 )
2021-12-13 12:04:23 +01:00
Sven Mika
db058d0fb3
[RLlib] Rename metrics_smoothing_episodes into metrics_num_episodes_for_smoothing for clarity. ( #20983 )
2021-12-11 20:33:35 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Eric Liang
6f93ea437e
Remove the flaky test tag ( #21006 )
2021-12-11 01:03:17 -08:00
Sven Mika
f814c2af89
[RLlib; Docs] Docs API reference pages: rllib/execution, rllib/evaluation, rllib/models, rllib/offline. ( #20538 )
2021-12-10 09:41:29 +01:00
kk-55
9acf2f954d
[RLlib] Example containing a proposal for computing an adapted (time-dependent) GAE used by the PPO algorithm (via callback on_postprocess_trajectory) ( #20850 )
2021-12-09 14:48:56 +01:00
Tomasz Wrona
39c202fa66
[RLlib] Allow extra keys in info in multi-agent ( #20793 )
2021-12-09 14:44:33 +01:00
Carlo Grisetti
a8286c55af
[RLLib] Fix deprecated convert_to_non_torch_type ( #20751 )
2021-12-09 14:42:12 +01:00
Avnish Narayan
6996eaa986
[RLlib] Add necessary fields to Base Envs, and BaseEnv wrapper classes ( #20832 )
2021-12-09 14:40:40 +01:00
Sven Mika
63db0e3a7c
[RLlib] Fix SAC learning test flakiness introduced in PR: "Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer." ( #20985 )
2021-12-09 14:24:27 +01:00
Ishant Mrinal
2868d1a2cf
[RLlib] Support for RE3 exploration algorithm (for tf) ( #19551 )
2021-12-07 13:26:34 +01:00
Avnish Narayan
b8c64480d8
[RLlib] Change return type of try_reset to MultiEnvDict ( #20868 )
2021-12-06 14:15:33 +01:00
Sven Mika
b4790900f5
[RLlib] Sub-class Trainer (instead of build_trainer()): All remaining classes; soft-deprecate build_trainer. ( #20725 )
2021-12-04 22:05:26 +01:00
Sven Mika
60b2219d72
[RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. ( #20757 )
2021-12-04 13:26:33 +01:00
Amog Kamsetty
611bfc1352
[ML] Move find_free_port to ml_utils ( #20828 )
...
Small refactoring of common utility used by Train, Tune, and RLlib.
2021-12-03 13:38:42 -08:00
Sven Mika
0de41e4a6b
[RLlib] Trainer sub-class QMIX/MAML/MB-MPO (instead of build_trainer). ( #20639 )
2021-12-02 13:17:10 +01:00
Jun Gong
2317c693cf
[RLlib] Use SampleBatch instead of input dict whenever possible ( #20746 )
2021-12-02 13:11:26 +01:00
Jun Gong
65bd8e29f8
[RLlib] Update a few things to get rid of the remote_vector_env deprecation warning. ( #20753 )
2021-12-02 13:10:44 +01:00
Sven Mika
9e38f6f613
[RLlib] Trainer sub-class DDPG/TD3/APEX-DDPG (instead of build_trainer). ( #20636 )
2021-12-01 10:52:12 +01:00
Avnish Narayan
74dd0e4085
[RLlib] Make to_base_env() a method of all RLlib-supported Env classes ( #20811 )
2021-12-01 09:01:02 +01:00
Avnish Narayan
3ddc09544d
[rllib] Env to base env refactor ( #20785 )
2021-11-30 17:02:10 -08:00
Sven Mika
bec719d823
[RLlib] Trainer sub-class IMPALA (instead of using build_trainer()). ( #20570 )
2021-11-30 19:08:36 +01:00
Sven Mika
3d2e27485b
[RLlib] Trainer sub-class DQN/SimpleQ/APEX-DQN/R2D2 (instead of using build_trainer). ( #20633 )
2021-11-30 18:05:44 +01:00
Carlo Grisetti
514ed27f63
[RLlib] Fix deprecation message for rllib.env.remote_vector_env (now RemoteBaseEnv) and migrate import ( #20750 )
2021-11-30 18:01:21 +01:00
mvindiola1
8cee0c03bf
[RLlib] Update max_seq_len in pad_batch_to_sequences_of_same_size ( #20743 )
2021-11-30 18:00:07 +01:00
mvindiola1
eadc7669c5
[RLlib] SampleBatch.concat_samples fix incorrect max_seq_len calculation ( #20704 )
2021-11-29 12:01:40 +01:00
Sven Mika
e37afe0425
[RLlib; Docs] Auto API reference pages overhaul: rllib/policy and rllib/agents packages. ( #20537 )
2021-11-25 09:35:19 +01:00
Sven Mika
c07d8c4c22
[RLlib] Trainer sub-class A2C/A3C (instead of build_trainer). ( #20635 )
2021-11-24 22:07:13 +01:00
Sven Mika
49cd7ea6f9
[RLlib] Trainer sub-class PPO/DDPPO (instead of build_trainer()). ( #20571 )
2021-11-23 23:01:05 +01:00
Sven Mika
9d2fe5756c
[RLlib] Trainer sub-class for APPO (instead of using build_trainer()). ( #20424 )
2021-11-22 22:14:21 +01:00
gjoliver
e7f9e8ceec
[RLlib] Report total_train_steps correctly for offline agents like CQL. ( #20541 )
...
* Fix trainer timestep reporting for offline agents like CQL.
* wip.
* extend timesteps_total to 200K for learning_tests_pendulum_cql test
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-22 21:46:45 +01:00
Artur Niederfahrenhorst
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into rllib/execution/buffers dir) ( #20552 )
2021-11-19 11:57:37 +01:00
gjoliver
18862f9f44
[RLlib] Add a comment in the doc string of on_learn_on_batch callback function. ( #20456 )
2021-11-19 10:49:07 +01:00
Avnish Narayan
b6077a36d4
[RLlib; Pre-checks/better failure behavior]: Env Checker for Gym Environments ( #20481 )
2021-11-19 09:41:03 +01:00
Sven Mika
7a585fb275
[RLlib; Documentation] RLlib README overhaul. ( #20249 )
2021-11-18 18:08:40 +01:00
Sven Mika
56619b955e
[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. ( #20250 )
2021-11-17 21:40:16 +01:00
gjoliver
724a140795
[rllib] Make sure json can serialize result dict ( #20439 )
...
We may have fields in the result dict that are or None.
Make sure our results are json serializable.
2021-11-17 10:27:00 -08:00
Avnish Narayan
dc17f0a241
Add error messages for missing tf and torch imports ( #20205 )
...
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 16:30:53 -08:00
Kai Fricke
05d21497db
[rllib/tune] Fix durable trainable in trainer template, add release test ( #20422 )
2021-11-16 20:52:42 +00:00
gjoliver
6e787f70e0
[Rllib/release] Disable throughput check ( #20387 )
...
Throughput check was enabled by d8a61f801f prematurely.
E.g., see state before the commit:
a931076f59/rllib/utils/test_utils.py (L740-L741)
2021-11-16 11:05:51 -08:00
Sven Mika
f82880eda1
Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 ) ( #20399 )" ( #20417 )
...
This reverts commit 90dc5460d4.
2021-11-16 14:49:41 +01:00
Stefan Schneider
2b3d0c691f
[RLlib] Document and extend action mask example. ( #20390 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Sven Mika <sven@anyscale.io>
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 13:20:41 +01:00
Kai Fricke
3e6ba5d6d2
Revert "Revert [RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py." ( #20285 )
...
* Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )" (#20284 )"
This reverts commit 246787cdd9.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-11-16 12:26:47 +01:00
Amog Kamsetty
90dc5460d4
Revert "[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )" ( #20399 )
...
This reverts commit 5b1c8e46e1.
2021-11-15 16:11:35 -08:00
Sven Mika
6ff4061f3a
[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. ( #20366 )
...
* wip.
* Apply suggestions from code review
2021-11-15 16:41:08 +01:00
Sven Mika
5b1c8e46e1
[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy ( #20061 )
2021-11-15 10:41:54 +01:00
xwjiang2010
cdf70c2900
[Tune] Remove legacy resources implementations in Runner and Executor. ( #19773 )
2021-11-12 12:33:39 -08:00
Sven Mika
38c456b6f4
[RLlib; Tune] Fix rllib/train.py script after tune.Experiment c'tor change. ( #20283 )
2021-11-12 15:25:50 +01:00
Kai Fricke
246787cdd9
Revert "[RLlib] POC: PGTrainer class that works by sub-classing, not trainer_template.py. ( #20055 )" ( #20284 )
...
This reverts commit 6f85af435f.
2021-11-12 13:09:43 +00:00
Sven Mika
70fe25055a
[RLlib] Issue: Get single step input dict incorrect. ( #20217 )
2021-11-12 08:38:51 +01:00