Sven Mika
70fe25055a
[RLlib] Issue: Get single step input dict incorrect. ( #20217 )
2021-11-12 08:38:51 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Antoine Galataud
edb338ff7c
[RLlib] Check training_enabled on PolicyServer ( #19007 )
2021-10-12 16:21:02 +02:00
mvindiola1
62f5da0b65
[RLlib] Add unit tests for updating episode data in base_env ( #17137 )
2021-09-24 16:08:11 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. ( #18809 )
2021-09-23 08:31:51 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. ( #18358 )
2021-09-06 12:14:20 +02:00
gjoliver
336e79956a
[RLlib] Make MultiAgentEnv inherit gym.Env to avoid direct class type manipulation ( #18156 )
2021-09-03 08:02:05 +02:00
Sven Mika
2357bbc0c8
[RLlib] Issue 18231: Better (earlier) env validation and error message improvement. ( #18249 )
2021-09-02 09:28:16 +02:00
gjoliver
6621bb5611
[RLlib] Minor renaming and cleanups related to last rollout worker seed fix. ( #18155 )
2021-09-02 06:57:46 +02:00
gjoliver
a8813675f4
[RLlib] Issue 17900: Set seed in single vectorized sub-envs properly, if num_envs_per_worker > 1 ( #18110 )
* In case a worker runs multiple envs, make sure a different seed can be deterministically set on all of them.
* Revert a couple of whitespace changes.
* Fix a few style errors.
Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-08-26 11:32:58 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. ( #17928 )
2021-08-21 17:05:48 +02:00
Sven Mika
0d8fce8fd8
[RLlib] Discussion 2294: Custom vector env example and fix. ( #16083 )
2021-07-28 10:40:04 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes ( #16531 )
2021-06-30 12:32:11 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. ( #16569 )
2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )" ( #16543 )
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )
2021-06-18 10:31:30 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. ( #15591 )
2021-05-12 12:16:00 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. ( #15273 )
2021-04-30 19:26:30 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. ( #15335 )
2021-04-15 19:19:51 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00
Sven Mika
c3a15ecc0f
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. ( #14033 )
2021-03-18 20:27:41 +01:00
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. ( #13584 )
2021-02-08 15:02:19 +01:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. ( #11698 )
2021-01-01 14:06:23 -05:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. ( #13064 )
2020-12-27 09:46:03 -05:00
Sven Mika
d5604eaba3
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). ( #12029 )
2020-12-21 18:38:34 -08:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. ( #12450 )
2020-12-07 13:08:17 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. ( #12449 )
2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. ( #12447 )
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. ( #12397 )
* WIP.
* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
95175a822f
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. ( #12407 )
* WIP.
* Fix and LINT.
2020-11-25 11:26:29 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. ( #12063 )
2020-11-19 19:01:14 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ ( #11827 )
2020-11-16 10:54:35 -08:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. ( #11307 )
2020-10-15 18:21:30 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic ( #11056 )
2020-10-01 16:57:10 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards ( #9950 )
2020-08-21 12:35:16 +02:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. ( #10112 )
2020-08-15 15:09:00 +02:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). ( #9269 )
2020-07-14 04:27:49 +02:00