hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
Avnish Narayan	55209692ee	[RLlib] Deflake MARWIL and BC and remove memory leak from torch MARWIL policy (#27406)	2022-08-03 16:53:12 -07:00
Jun Gong	acf2bf9b2f	[RLlib] Get rid of all these deprecation warnings. (#27085)	2022-07-27 10:48:54 -07:00
Rohan Potdar	a53bbe49bf	[RLlib]: Raise deprecation warning in MARWIL OPE methods. (#26893)	2022-07-23 13:55:40 +02:00
Jun Gong	b383d987d1	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
Rohan Potdar	09ce4711fd	[RLlib]: Move OPE to evaluation config (#25911)	2022-07-12 11:04:34 -07:00
Avnish Narayan	1243ed62bf	[RLlib] Make Dataset reader default reader and enable CRR to use dataset (#26304) Co-authored-by: avnish <avnish@avnishs-MBP.local.meter>	2022-07-08 12:43:35 -07:00
Sven Mika	59a967a3a0	[RLlib] Cleanup some deprecated metric keys and classes. (#26036)	2022-06-23 21:30:01 +02:00
Sven Mika	96693055bd	[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)	2022-06-20 15:54:00 +02:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
Sven Mika	7c39aa5fac	[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)	2022-06-10 17:09:18 +02:00
Artur Niederfahrenhorst	94d6c212df	[RLlib] Replay Buffer API documentation. (#24683)	2022-06-10 16:47:51 +02:00
Rohan Potdar	a9d8da0100	[RLlib]: Doubly Robust Off-Policy Evaluation. (#25056)	2022-06-07 12:52:19 +02:00
Sven Mika	b5bc2b93c3	[RLlib] Move all remaining algos into `algorithms` directory. (#25366)	2022-06-04 07:35:24 +02:00
Jun Gong	1d24d6af98	[RLlib] Fix MARWIL tf policy. (#25384)	2022-06-03 10:50:36 +02:00
Rohan Potdar	ab81c8e9ca	[RLlib]: Rename `input_evaluation` to `off_policy_estimation_methods`. (#25107)	2022-05-27 13:14:54 +02:00
Jun Gong	eaf9c941ae	[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. (#25117)	2022-05-25 14:38:03 +02:00
Artur Niederfahrenhorst	d76ef9add5	[RLLib] Fix RNNSAC example failing on CI + fixes for recurrent models for other Q Learning Algos. (#24923)	2022-05-24 14:39:43 +02:00
Rohan Potdar	5a70b732e8	[RLlib] MARWIL and BC Config. (#24853)	2022-05-21 12:50:20 +02:00
Jun Gong	d5a6d46049	[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. (#24914)	2022-05-20 14:10:59 +02:00
Jun Gong	dea134a472	[RLlib] Clean up Policy mixins. (#24746)	2022-05-17 17:16:08 +02:00
Artur Niederfahrenhorst	fb2915d26a	[RLlib] Replay Buffer API and Ape-X. (#24506)	2022-05-17 13:43:49 +02:00
Sven Mika	0cd7bc4054	[RLlib] Re-establish dashboard performance tests. (#24728)	2022-05-16 13:13:49 +02:00
Jun Gong	68a9a33386	[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797)	2022-05-16 09:45:32 +02:00
Simon Mo	9f23affdc0	[Hotfix] Unbreak lint in master (#24794)	2022-05-13 15:05:05 -07:00
kourosh hakhamaneshi	ffcbb30552	[RLlib] Move from `agents` to `algorithms` - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. (#24739)	2022-05-13 18:43:36 +02:00

25 commits