hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Alexis DUBURCQ	6c3e63bc9c	[RLlib] Fix view requirements. (#21043 )	2021-12-15 11:59:04 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918 )	2021-12-11 14:57:58 +01:00
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
Sven Mika	d439fd7f17	[RLlib] TF2/eager memory leak fixes. (#19198 )	2021-10-09 00:11:53 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468 )	2021-09-23 12:56:45 +02:00
Sven Mika	9a8ca6a69d	[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306 )	2021-09-03 13:29:57 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
simonsays1980	60aee4a330	[RLlib] Add example script for bare metal Policy with custom `view_requirements`. (#17896 )	2021-08-20 12:17:13 +02:00
Chris Bamford	29768a7c01	[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856 )	2021-07-25 14:25:07 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031 )	2021-07-19 13:16:03 -04:00
Sven Mika	be6db06485	[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569 )	2021-06-21 13:46:01 +02:00
Amog Kamsetty	bd3cbfc56a	Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359 )" (#16543 ) This reverts commit `e78ec370a9`.	2021-06-18 12:21:49 -07:00
Sven Mika	e78ec370a9	[RLlib] Allow policies to be added/deleted on the fly. (#16359 )	2021-06-18 10:31:30 +02:00
Chris Bamford	fd1a97e39f	[RLlib] Memory leak docs (#15908 )	2021-06-10 18:10:21 +02:00
Sven Mika	03c7c530a9	[RLlib] Issue 15483: Wrong init states (should be non-zero if `ModelV2.get_initial_state` returns non-zero values). (#15733 )	2021-05-20 09:28:09 +02:00
Sven Mika	7e260edb07	[RLlib] Fix small memory leak in SimpleListCollector (already superseeded by Bam4d's PR + small fix in error message). (#15783 )	2021-05-18 16:02:03 +02:00
Chris Bamford	0be83d9a95	[RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815 )	2021-05-18 13:23:00 +02:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273 )	2021-04-30 19:26:30 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684 )	2021-04-27 10:44:54 +02:00
Sven Mika	c8ca4d03ad	[RLlib] Issue with agent-id -> pol-id mapping not required to be fixed across different episodes. (#15020 )	2021-03-30 19:25:52 +02:00
Sven Mika	04bc0a9828	[RLlib] Remove all non-trajectory view API code. (#14860 )	2021-03-23 09:50:18 -07:00
Sven Mika	3e7899d251	[RLlib] Issue 14653: Empty env steps cause key error in SimpleListCollector. (#14765 )	2021-03-23 10:30:53 +01:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065 )	2021-03-17 08:18:15 +01:00
Sven Mika	d49c3fae0b	[RLlib] Trajectory View API: Atari framestacking. (#13315 )	2021-01-13 08:53:34 +01:00
Sven Mika	6f342a2221	[RLlib] Preparatory PR for: Documentation on Model Building. (#13260 )	2021-01-08 10:56:09 +01:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718 )	2020-12-30 17:32:21 -08:00
Sven Mika	d5604eaba3	[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029 )	2020-12-21 18:38:34 -08:00
Sven Mika	b2bcab711d	[RLlib] Attention Nets: tf (#12753 )	2020-12-20 20:22:32 -05:00
Sven Mika	e40b14d255	[RLlib] Batch-size for truncate_episode batch_mode should be confgurable in agent-steps (rather than env-steps), if needed. (#12420 )	2020-12-08 16:41:45 -08:00
Sven Mika	99c81c6795	[RLlib] Attention Net prep PR #3 . (#12450 )	2020-12-07 13:08:17 +01:00
Sven Mika	19c8033df2	[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366 ) * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * LINT and fixes. MB-MPO and MAML not working yet. * wip * update * update * rmeove * remove dep * higher * Update requirements_rllib.txt * Update requirements_rllib.txt * relpos * no mbmpo Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-12-01 17:41:10 -08:00
Sven Mika	3ad9365e1d	[RLlib] Attention Net prep PR #2 : Smaller cleanups. (#12449 )	2020-12-01 08:21:45 +01:00
Sven Mika	0df55a139c	[RLlib] Attention Net prep PR #1 : Smaller cleanups. (#12447 ) * WIP. * Fix. * Fix. * Fix.	2020-11-27 16:25:47 -08:00
Sven Mika	95175a822f	[RLlib] Issue 11974: Traj view API next-action (shift=+1) not working. (#12407 ) * WIP. * Fix and LINT.	2020-11-25 11:26:29 -08:00
Sven Mika	dab241dcc6	[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063 )	2020-11-19 19:01:14 +01:00
Sven Mika	5b788ccb13	[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717 )	2020-11-03 12:53:34 -08:00
Sven Mika	d9f1874e34	[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609 )	2020-10-27 10:00:24 +01:00
Sven Mika	36bda8432b	[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056 )	2020-10-01 16:57:10 +02:00

39 commits