hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-10 05:16:49 -04:00

Author	SHA1	Message	Date
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	b4300dd532	[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937 )	2021-10-04 13:29:00 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	599e589481	[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065 )	2021-08-31 14:56:53 +02:00
Sven Mika	a428f10ebe	[RLlib] Add multi-GPU learning tests to nightly. (#17778 )	2021-08-18 17:21:01 +02:00
Sven Mika	c2ea2c01bb	[RLlib] Redo: Add support for multi-GPU to DDPG. (#17789 ) * wip. * wip. * wip. * wip. * wip. * wip.	2021-08-13 18:01:24 -07:00
Amog Kamsetty	0b8489dcc6	Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586 )" (#17707 ) This reverts commit `0eb0e0ff58`.	2021-08-10 10:50:21 -07:00
Sven Mika	0eb0e0ff58	[RLlib] Add support for multi-GPU to DDPG. (#17586 )	2021-08-05 11:39:51 -04:00
Sven Mika	53206dd440	[RLlib] CQL BC loss fixes; PPO/PG/A2\|3C action normalization fixes (#16531 )	2021-06-30 12:32:11 +02:00
Sven Mika	be6db06485	[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569 )	2021-06-21 13:46:01 +02:00
Sven Mika	5fe34862ce	[RLlib] DDPG torch GPU bug. (#16133 )	2021-05-28 22:09:25 +02:00
Sven Mika	bc09e75b78	[RLlib] Fix 3 flakey test cases. (#15785 )	2021-05-16 12:20:33 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684 )	2021-04-27 10:44:54 +02:00
mvindiola1	5e350ceaa2	[RLlib] Issue 14119: Fix TD3 policy delay for torch. (#14840 )	2021-03-24 16:26:22 +01:00
Sven Mika	37c7daa3c0	[RLlib] DDPG: Support simplex action space. (#14011 )	2021-02-10 15:10:01 +01:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033 )	2020-10-06 20:28:16 +02:00
mvindiola1	2b893d1bb5	fix incorrect critic loss in TD3 (#10775 ) Co-authored-by: Manny Vindiola <manuel.m.vindiola.civ@mail.mil>	2020-09-20 20:01:51 -07:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956 )	2020-08-07 16:49:49 -07:00
Sven Mika	fcdf410ae1	[RLlib] Tf2.x native. (#8752 )	2020-07-11 22:06:35 +02:00
Sven Mika	4da0e542d5	[RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204 )	2020-07-08 16:12:20 +02:00
Sven Mika	b4c0b942fe	[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) (#9237 )	2020-07-01 17:31:44 +02:00
Sven Mika	4ed796a7d6	[RLlib] Add testing `Policy.compute_single_action()` for all agents. (#8903 )	2020-06-13 17:51:50 +02:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520 )	2020-05-27 16:19:13 +02:00
Sven Mika	baa053496a	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414 )	2020-05-26 11:10:27 +02:00
Eric Liang	9a83908c46	[rllib] Deprecate policy optimizers (#8345 )	2020-05-21 10:16:18 -07:00
Sven Mika	754290daad	[RLlib] Add light-weight `Trainer.compute_action()` tests for all Algos. (#8356 )	2020-05-08 16:31:31 +02:00
Sven Mika	d7eaacb5fe	[RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. (#8324 )	2020-05-08 08:26:32 +02:00
Eric Liang	b14cc16616	[rllib] Enable functional execution workflow API by default (#8221 )	2020-05-05 12:36:42 -07:00
Sven Mika	7ec2223c84	[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188 ) Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).	2020-04-26 23:08:13 +02:00
Sven Mika	d0fab84e4d	[RLlib] DDPG PyTorch version. (#7953 ) The DDPG/TD3 algorithms currently do not have a PyTorch implementation. This PR adds PyTorch support for DDPG/TD3 to RLlib. This PR: - Depends on the re-factor PR for DDPG (Functional Algorithm API). - Adds learning regression tests for the PyTorch version of DDPG and a DDPG (torch) - Updates the documentation to reflect that DDPG and TD3 now support PyTorch. * Learning Pendulum-v0 on torch version (same config as tf). Wall time a little slower (~20%) than tf. * Fix GPU target model problem.	2020-04-16 10:20:01 +02:00
Sven Mika	1b31c11806	[RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934 )	2020-04-09 14:04:21 -07:00
Sven Mika	1d4823c0ec	[RLlib] Add testing framework_iterator. (#7852 ) * Add testing framework_iterator. * LINT. * WIP. * Fix and LINT. * LINT fix.	2020-04-03 12:24:25 -07:00
Sven Mika	20ef4a8603	[RLlib] Cleanup/unify all test cases. (#7533 )	2020-03-11 20:39:47 -07:00
Sven Mika	83e06cd30a	[RLlib] DDPG refactor and Exploration API action noise classes. (#7314 ) * WIP. * WIP. * WIP. * WIP. * WIP. * Fix * WIP. * Add TD3 quick Pendulum regression. * Cleanup. * Fix. * LINT. * Fix. * Sort quick_learning test cases, add TD3. * Sort quick_learning test cases, add TD3. * Revert test_checkpoint_restore.py (debugging) changes. * Fix old soft_q settings in documentation and test configs. * More doc fixes. * Fix test case. * Fix test case. * Lower test load. * WIP.	2020-03-01 11:53:35 -08:00

34 commits