Commit graph

41 commits

Author SHA1 Message Date
Sven Mika
70fe25055a
[RLlib] Issue: Get single step input dict incorrect. (#20217) 2021-11-12 08:38:51 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783) 2021-10-29 12:03:56 +02:00
Antoine Galataud
edb338ff7c
[RLlib] Check training_enabled on PolicyServer (#19007) 2021-10-12 16:21:02 +02:00
mvindiola1
62f5da0b65
[RLlib] Add unit tests for updating episode data in base_env (#17137) 2021-09-24 16:08:11 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). (#18468) 2021-09-23 12:56:45 +02:00
Sven Mika
a96dbd885b
[RLlib] Reinstate trajectory view API tests. (#18809) 2021-09-23 08:31:51 +02:00
Sven Mika
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358) 2021-09-06 12:14:20 +02:00
gjoliver
336e79956a
[RLlib] Make MultiAgentEnv inherit gym.Env to avoid direct class type manipulation (#18156) 2021-09-03 08:02:05 +02:00
Sven Mika
2357bbc0c8
[RLlib] Issue 18231: Better (earlier) env validation and error message improvement. (#18249) 2021-09-02 09:28:16 +02:00
gjoliver
6621bb5611
[RLlib] Minor renaming and cleanups related to last rollout worker seed fix. (#18155) 2021-09-02 06:57:46 +02:00
gjoliver
a8813675f4
[RLlib] Issue 17900: Set seed in single vectorized sub-envs properly, if num_envs_per_worker > 1 (#18110)
* In case a worker runs multiple envs, make sure a different seed can be deterministically set on all of them.

* Revert a couple of whitespace changes.

* Fix a few style errors.

Co-authored-by: Jun Gong <jungong@mbpro.local>
2021-08-26 11:32:58 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) 2021-08-21 17:05:48 +02:00
Sven Mika
0d8fce8fd8
[RLlib] Discussion 2294: Custom vector env example and fix. (#16083) 2021-07-28 10:40:04 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) 2021-07-19 13:16:03 -04:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. (#16359) 2021-06-18 10:31:30 +02:00
Sven Mika
16ddab49f5
[RLlib] Trainer._evaluate -> Trainer.evaluate; Also make evaluation possible w/o evaluation worker set. (#15591) 2021-05-12 12:16:00 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) 2021-04-30 19:26:30 +02:00
Sven Mika
8b3554e37e
[RLlib] Remove all (already soft-deprecated) SampleBatch.data from code. (#15335) 2021-04-15 19:19:51 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. (#14860) 2021-03-23 09:50:18 -07:00
Sven Mika
c3a15ecc0f
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. (#14033) 2021-03-18 20:27:41 +01:00
Sven Mika
eb0038612f
[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) 2021-02-08 15:02:19 +01:00
Sven Mika
9eba1871bb
[RLlib] Support easy use_attention=True flag for using the GTrXL model. (#11698) 2021-01-01 14:06:23 -05:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. (#12718) 2020-12-30 17:32:21 -08:00
Sven Mika
c524f86785
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064) 2020-12-27 09:46:03 -05:00
Sven Mika
d5604eaba3
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029) 2020-12-21 18:38:34 -08:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. (#12420) 2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3. (#12450) 2020-12-07 13:08:17 +01:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2: Smaller cleanups. (#12449) 2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1: Smaller cleanups. (#12447)
* WIP.

* Fix.

* Fix.

* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
592c161032
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.

* Fix and LINT.
2020-11-25 11:27:46 -08:00
Sven Mika
95175a822f
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. (#12407)
* WIP.

* Fix and LINT.
2020-11-25 11:26:29 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063) 2020-11-19 19:01:14 +01:00
Sven Mika
b6b54f1c81
[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827) 2020-11-16 10:54:35 -08:00
Sven Mika
414041c6dd
[RLlib] Do not create env on driver iff num_workers > 0. (#11307) 2020-10-15 18:21:30 +02:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056) 2020-10-01 16:57:10 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950) 2020-08-21 12:35:16 +02:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. (#10112) 2020-08-15 15:09:00 +02:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). (#9269) 2020-07-14 04:27:49 +02:00