Alexis DUBURCQ
6c3e63bc9c
[RLlib] Fix view requirements. ( #21043 )
2021-12-15 11:59:04 +01:00
Sven Mika
596c8e2772
[RLlib] Experimental no-flatten option for actions/prev-actions. ( #20918 )
2021-12-11 14:57:58 +01:00
Sven Mika
2d24ef0d32
[RLlib] Add all simple learning tests as framework=tf2
. ( #19273 )
...
* Unpin gym and deprecate pendulum v0
Many tests in rllib depended on pendulum v0,
however in gym 0.21, pendulum v0 was deprecated
in favor of pendulum v1. This may change reward
thresholds, so will have to potentially rerun
all of the pendulum v1 benchmarks, or use another
environment in favor. The same applies to frozen
lake v0 and frozen lake v1
Lastly, all of the RLlib tests and Tune tests have
been moved to python 3.7
* fix tune test_sampler::testSampleBoundsAx
* fix re-install ray for py3.7 tests
Co-authored-by: avnishn <avnishn@uw.edu>
2021-11-02 12:10:17 +01:00
Sven Mika
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation ( #19783 )
2021-10-29 12:03:56 +02:00
Sven Mika
d439fd7f17
[RLlib] TF2/eager memory leak fixes. ( #19198 )
2021-10-09 00:11:53 +02:00
Sven Mika
61a1274619
[RLlib] No Preprocessors (part 2). ( #18468 )
2021-09-23 12:56:45 +02:00
Sven Mika
9a8ca6a69d
[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. ( #18306 )
2021-09-03 13:29:57 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. ( #17928 )
2021-08-21 17:05:48 +02:00
simonsays1980
60aee4a330
[RLlib] Add example script for bare metal Policy with custom view_requirements
. ( #17896 )
2021-08-20 12:17:13 +02:00
Chris Bamford
29768a7c01
[RLLib] (P1 regression) Fixing view requirements in compute actions ( #15856 )
2021-07-25 14:25:07 -04:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. ( #17031 )
2021-07-19 13:16:03 -04:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. ( #16569 )
2021-06-21 13:46:01 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )" ( #16543 )
...
This reverts commit e78ec370a9
.
2021-06-18 12:21:49 -07:00
Sven Mika
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. ( #16359 )
2021-06-18 10:31:30 +02:00
Chris Bamford
fd1a97e39f
[RLlib] Memory leak docs ( #15908 )
2021-06-10 18:10:21 +02:00
Sven Mika
03c7c530a9
[RLlib] Issue 15483: Wrong init states (should be non-zero if ModelV2.get_initial_state
returns non-zero values). ( #15733 )
2021-05-20 09:28:09 +02:00
Sven Mika
7e260edb07
[RLlib] Fix small memory leak in SimpleListCollector (already superseeded by Bam4d's PR + small fix in error message). ( #15783 )
2021-05-18 16:02:03 +02:00
Chris Bamford
0be83d9a95
[RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. ( #15815 )
2021-05-18 13:23:00 +02:00
Sven Mika
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. ( #15273 )
2021-04-30 19:26:30 +02:00
Sven Mika
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). ( #14684 )
2021-04-27 10:44:54 +02:00
Sven Mika
c8ca4d03ad
[RLlib] Issue with agent-id -> pol-id mapping not required to be fixed across different episodes. ( #15020 )
2021-03-30 19:25:52 +02:00
Sven Mika
04bc0a9828
[RLlib] Remove all non-trajectory view API code. ( #14860 )
2021-03-23 09:50:18 -07:00
Sven Mika
3e7899d251
[RLlib] Issue 14653: Empty env steps cause key error in SimpleListCollector. ( #14765 )
2021-03-23 10:30:53 +01:00
Sven Mika
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. ( #13065 )
2021-03-17 08:18:15 +01:00
Sven Mika
d49c3fae0b
[RLlib] Trajectory View API: Atari framestacking. ( #13315 )
2021-01-13 08:53:34 +01:00
Sven Mika
6f342a2221
[RLlib] Preparatory PR for: Documentation on Model Building. ( #13260 )
2021-01-08 10:56:09 +01:00
Sven Mika
391cdfae8c
[RLlib] Trajectory view API docs. ( #12718 )
2020-12-30 17:32:21 -08:00
Sven Mika
d5604eaba3
[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). ( #12029 )
2020-12-21 18:38:34 -08:00
Sven Mika
b2bcab711d
[RLlib] Attention Nets: tf ( #12753 )
2020-12-20 20:22:32 -05:00
Sven Mika
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. ( #12420 )
2020-12-08 16:41:45 -08:00
Sven Mika
99c81c6795
[RLlib] Attention Net prep PR #3 . ( #12450 )
2020-12-07 13:08:17 +01:00
Sven Mika
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. ( #12366 )
...
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* WIP.
* LINT and fixes.
MB-MPO and MAML not working yet.
* wip
* update
* update
* rmeove
* remove dep
* higher
* Update requirements_rllib.txt
* Update requirements_rllib.txt
* relpos
* no mbmpo
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-12-01 17:41:10 -08:00
Sven Mika
3ad9365e1d
[RLlib] Attention Net prep PR #2 : Smaller cleanups. ( #12449 )
2020-12-01 08:21:45 +01:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1 : Smaller cleanups. ( #12447 )
...
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
95175a822f
[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. ( #12407 )
...
* WIP.
* Fix and LINT.
2020-11-25 11:26:29 -08:00
Sven Mika
dab241dcc6
[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. ( #12063 )
2020-11-19 19:01:14 +01:00
Sven Mika
5b788ccb13
[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) ( #11717 )
2020-11-03 12:53:34 -08:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). ( #11609 )
2020-10-27 10:00:24 +01:00
Sven Mika
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic ( #11056 )
2020-10-01 16:57:10 +02:00