Sven Mika
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils ( #19829 )
2021-11-01 21:46:02 +01:00
gjoliver
d81885c1f1
[RLlib] Fix all the CI tests that were broken by is_training and replay buffer changes; re-comment-in the failing RLlib tests ( #19809 )
...
* Fix DDPG, since it is based on GenericOffPolicyTrainer.
* Fix QMix, SAC, and MADDPA too.
* Undo QMix change.
* Fix DQN input batch type. Always use SampleBatch.
* apex ddpg should not use replay_buffer_config yet.
* Make eager tf policy to use SampleBatch.
* lint
* LINT.
* Re-enable RLlib broken tests to make sure things work ok now.
* fixes.
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-10-28 18:06:47 +02:00
gjoliver
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents ( #19627 )
...
* [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents.
This change
1. Get rid of the try...except clause when we call execution_plan(),
and get rid of the Deprecation warning as a result.
2. Fix the execution_plan() call in Trainer._try_recover() too.
3. Most importantly, makes it much easier to create and use different types
of local replay buffers for all our agents.
E.g., allow us to easily create a reservoir sampling replay buffer for
APPO agent for Riot in the near future.
* Introduce explicit configuration for replay buffer types.
* Fix is_training key error.
* actually deprecate buffer_size field.
2021-10-26 20:56:02 +02:00
Sven Mika
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). ( #19693 )
2021-10-25 15:00:00 +02:00
gjoliver
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). ( #19623 )
2021-10-25 09:42:39 +02:00
gjoliver
c3c42278e4
[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings ( #19652 )
...
* [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings.
* wip
2021-10-25 09:38:56 +02:00
gjoliver
9226f9bddc
[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. ( #19264 )
...
* Report timesteps_this_iter to Tune, so it can track/checkpoint/restore
total timesteps trained.
* Trigger Build
* lint
2021-10-12 16:03:41 +02:00
Sven Mika
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. ( #18937 )
2021-10-04 13:29:00 +02:00
Sven Mika
73f5c4039b
[RLlib] Fix flakey test_a3c, test_maml, test_apex_dqn. ( #19035 )
2021-10-04 13:23:51 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
828f5d26b7
[RLlib] Custom view requirements (e.g. for prev-n-obs) work with compute_single_action
and compute_actions_from_input_dict
. ( #18921 )
2021-09-30 15:03:37 +02:00
Sven Mika
9c9b482661
[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. ( #18939 )
2021-09-29 21:31:34 +02:00
Sven Mika
8a00154038
[RLlib] Bump tf version in ML docker to tf==2.5.0; add tfp to ML-docker. ( #18544 )
2021-09-15 08:46:37 +02:00
Sven Mika
08c09737fa
[RLlib] Fix R2D2 (torch) multi-GPU issue. ( #18550 )
2021-09-14 19:58:10 +02:00
Sven Mika
8a066474d4
[RLlib] No Preprocessors; preparatory PR #1 ( #18367 )
2021-09-09 08:10:42 +02:00
gjoliver
808b683f81
[RLlib] Add a unittest for learning rate schedule used with APEX agent. ( #18389 )
2021-09-08 23:29:40 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. ( #18358 )
2021-09-06 12:14:20 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. ( #18065 )
2021-08-31 14:56:53 +02:00
Sven Mika
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. ( #17999 )
2021-08-31 12:21:49 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. ( #17928 )
2021-08-21 17:05:48 +02:00
Sven Mika
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. ( #17778 )
2021-08-18 17:21:01 +02:00
Thomas Lecat
c02f91fa2d
[RLlib] Ape-X doesn't take the value of prioritized_replay
into account ( #17541 )
2021-08-16 22:18:08 +02:00
Sven Mika
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). ( #17371 )
2021-08-03 11:35:49 -04:00
Sven Mika
8a844ff840
[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids()
(b/c no GPUs assigned by ray). ( #17444 )
2021-08-02 17:29:59 -04:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. ( #17169 )
2021-07-20 14:58:13 -04:00
Grzegorz Bartyzel
d553d4da6c
[RLlib] DQN (Rainbow): Fix torch noisy layer support and loss ( #16716 )
2021-07-13 16:48:06 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). ( #17014 )
2021-07-13 14:01:30 -04:00
Antoine Galataud
16f1011c07
[RLlib] Issue 15910: APEX current learning rate not updated on local worker ( #15911 )
2021-07-13 14:01:00 -04:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action
, not normalize_action
." ( #17002 )
...
This reverts commit 7862dd64ea
.
2021-07-12 11:09:14 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action
, not normalize_action
. ( #16774 )
2021-07-08 17:31:34 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
AnnaKosiorek
1e709771b2
[rllib][minor] clarification of the softmax axis in dqn_torch_policy ( #16311 )
...
pytorch nn.functional.softmax (unlike tf.nn.softmax) calculates softmax along zeroth dimension by default
2021-06-26 11:19:54 -07:00
Sven Mika
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. ( #16429 )
2021-06-19 22:42:00 +02:00
Sven Mika
d0014cd351
[RLlib] Policies get/set_state fixes and enhancements. ( #16354 )
2021-06-15 13:08:43 +02:00
Sven Mika
33a69135cb
[RLlib] Issue 16117: DQN/APEX torch not working on GPU. ( #16118 )
2021-05-28 09:12:53 +02:00
Sven Mika
2d34216660
[RLlib] APEX-DQN: Bug fix for torch and add learning test. ( #15762 )
2021-05-20 09:27:03 +02:00
Michael Luo
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup ( #14707 )
2021-05-19 16:32:29 +02:00
Sven Mika
839fc59224
[RLlib] CQL TensorFlow support ( #15841 )
2021-05-18 11:10:46 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. ( #15591 )
2021-05-12 12:16:00 +02:00
Antoine Galataud
ce1c001b1d
[RLlib] DQN: Place LearningRateSchedule mixin at the right moment ( #15558 )
2021-05-04 13:21:40 +02:00
Amog Kamsetty
ebc44c3d76
[CI] Upgrade flake8 to 3.9.1 ( #15527 )
...
* formatting
* format util
* format release
* format rllib/agents
* format rllib/env
* format rllib/execution
* format rllib/evaluation
* format rllib/examples
* format rllib/policy
* format rllib utils and tests
* format streaming
* more formatting
* update requirements files
* fix rllib type checking
* updates
* update
* fix circular import
* Update python/ray/tests/test_runtime_env.py
* noqa
2021-05-03 14:23:28 -07:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. ( #15273 )
2021-04-30 19:26:30 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). ( #14684 )
2021-04-27 10:44:54 +02:00
Sven Mika
7318439c3d
[RLlib] DQN native_ratio (for training intensity) incorrect (discussion 1763). ( #15436 )
...
Thanks @Manuscrit !
2021-04-22 11:06:29 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data
from code. ( #15335 )
2021-04-15 19:19:51 +02:00
Sven Mika
c90de315e5
[RLlib] APEX returns incorrect default resources (PleacementGroupFactory) colocated missing replay actors. ( #15295 )
2021-04-15 16:50:42 +01:00
Sven Mika
4f66309e19
[RLlib] Redo issue 14533 tf enable eager exec ( #14984 )
2021-03-29 20:07:44 +02:00
SangBin Cho
fa5f961d5e
Revert "[RLlib] Issue 14533: tf.enable_eager_execution()
must be called at beginning. ( #14737 )" ( #14918 )
...
This reverts commit 3e389d5812
.
2021-03-25 00:42:01 -07:00
Sven Mika
3e389d5812
[RLlib] Issue 14533: tf.enable_eager_execution()
must be called at beginning. ( #14737 )
2021-03-24 12:54:27 +01:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00