hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	ee41800c16	[RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02 . (#21649 )	2022-01-27 22:07:05 +01:00
Sven Mika	893536ebd9	[RLlib] Move bandits into main agents folder; Make RecSim adapter more accessible; (#21773 )	2022-01-27 13:58:12 +01:00
Sven Mika	371fbb17e4	[RLlib] Make `policies_to_train` more flexible via callable option. (#20735 )	2022-01-27 12:17:34 +01:00
Sven Mika	d5bfb7b7da	[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652 )	2022-01-25 14:16:58 +01:00
Avnish Narayan	12b087acb8	[RLlib] Base env pre-checker. (#21569 )	2022-01-18 16:34:06 +01:00
Jun Gong	7517aefe05	[RLlib] Bring back BC and Marwil learning tests. (#21574 )	2022-01-14 14:35:32 +01:00
Avnish Narayan	c0f1202278	[RLlib] `MultiAgentEnv` pre-checker (#21476 )	2022-01-13 11:31:22 +01:00
Sven Mika	90c6b10498	[RLlib] Decentralized multi-agent learning; PR #01 (#21421 )	2022-01-13 10:52:55 +01:00
Sven Mika	188324c5c7	[RLlib] Issue 21552: `unsquash_action` and `clip_action` (when None) cause wrong actions computed by `Trainer.compute_single_action`. (#21553 )	2022-01-12 18:56:51 +01:00
Sven Mika	f94bd99ce4	[RLlib] Issue 21044: Improve error message for "multiagent" dict checks. (#21448 )	2022-01-11 19:50:03 +01:00
Sven Mika	92f030331e	[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420 )	2022-01-10 11:22:55 +01:00
Sven Mika	35af30a446	[RLlib] Issue 21109: Action unsquashing causes inf/NaN actions for unbounded action spaces. (#21110 )	2022-01-10 11:20:37 +01:00
Sven Mika	853d10871c	[RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376 )	2022-01-05 18:22:33 +01:00
Sven Mika	9e6b871739	[RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330 )	2022-01-05 11:29:44 +01:00
Sven Mika	62dbf26394	[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984 )	2021-12-21 08:39:05 +01:00
brulu	8b77fc0aef	[RLlib] Updating Repeated space. Allowing numpy arrays and adding representation. (#20799 )	2021-12-16 08:27:55 +01:00
Sven Mika	e485aa846a	[RLlib; Docs overhaul] Overhaul of auto-API reference pages (via sphinx autoclass/automodule). (#19786 )	2021-12-15 22:32:52 +01:00
Sven Mika	daa4304a91	[RLlib] Switch off preprocessors by default for PGTrainer. (#21008 )	2021-12-13 12:04:23 +01:00
Ishant Mrinal	2868d1a2cf	[RLlib] Support for RE3 exploration algorithm (for tf) (#19551 )	2021-12-07 13:26:34 +01:00
mvindiola1	8cee0c03bf	[RLlib] Update `max_seq_len` in pad_batch_to_sequences_of_same_size (#20743 )	2021-11-30 18:00:07 +01:00
gjoliver	e7f9e8ceec	[RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541 ) * Fix trainer timestep reporting for offline agents like CQL. * wip. * extend timesteps_total to 200K for learning_tests_pendulum_cql test Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-22 21:46:45 +01:00
Avnish Narayan	b6077a36d4	[RLlib; Pre-checks/better failure behavior]: Env Checker for Gym Environments (#20481 )	2021-11-19 09:41:03 +01:00
Sven Mika	56619b955e	[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250 )	2021-11-17 21:40:16 +01:00
gjoliver	724a140795	[rllib] Make sure json can serialize result dict (#20439 ) We may have fields in the result dict that are or None. Make sure our results are json serializable.	2021-11-17 10:27:00 -08:00
gjoliver	6e787f70e0	[Rllib/release] Disable throughput check (#20387 ) Throughput check was enabled by `d8a61f801f` prematurely. E.g., see state before the commit: `a931076f59/rllib/utils/test_utils.py (L740-L741)`	2021-11-16 11:05:51 -08:00
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 ) (#20399 )" (#20417 ) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Kai Fricke	3e6ba5d6d2	Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." (#20285 ) * Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)" This reverts commit `246787cdd9`. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-16 12:26:47 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )" (#20399 ) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )	2021-11-15 10:41:54 +01:00
Kai Fricke	246787cdd9	Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )" (#20284 ) This reverts commit `6f85af435f`.	2021-11-12 13:09:43 +00:00
Sven Mika	6f85af435f	[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055 )	2021-11-11 12:16:20 +01:00
gjoliver	d8a61f801f	[RLlib] Create a set of performance benchmark tests to run nightly. (#19945 ) * Create a core set of algorithms tests to run nightly. * Run release tests under tf, tf2, and torch frameworks. * Fix * Add eager_tracing option for tf2 framework. * make sure core tests can run in parallel. * cql * Report progress while running nightly/weekly tests. * Include SAC in nightly lineup. * Revert changes to learning_tests * rebrand to performance test. * update build_pipeline.py with new performance_tests name. * Record stats. * bug fix, need to populate experiments dict. * Alphabetize yaml files. * Allow specifying frameworks. And do not run tf2 by default. * remove some debugging code. * fix * Undo testing changes. * Do not run CQL regression for now. * LINT. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-08 18:15:13 +01:00
Sven Mika	a931076f59	[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981 )	2021-11-05 16:10:00 +01:00
Sven Mika	f3397b6f48	[RLlib] Minor fixes/cleanups; chop_into_sequences now handles nested data. (#19408 )	2021-11-05 14:39:28 +01:00
Avnish Narayan	026bf01071	[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535 ) * Fix QMix, SAC, and MADDPG too. * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * Add gym installation based on python version. Pin python<= 3.6 to gym 0.19 due to install issues with atari roms in gym 0.20 * Reformatting * Fixing tests * Move atari-py install conditional to req.txt * migrate to new ale install method * Make parametric_actions_cartpole return float32 actions/obs * Adding type conversions if obs/actions don't match space * Add utils to make elements match gym space dtypes Co-authored-by: Jun Gong <jungong@anyscale.com> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-03 16:24:00 +01:00
Sven Mika	cf21c634a3	[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982 )	2021-11-03 10:00:46 +01:00
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
Sven Mika	b213565783	[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693 )	2021-10-25 15:00:00 +02:00
Carlo Grisetti	5cee8a1985	[release tests] Switch from yaml.load to yaml.safe_load (#19365 )	2021-10-13 17:27:25 -07:00
Sven Mika	d439fd7f17	[RLlib] TF2/eager memory leak fixes. (#19198 )	2021-10-09 00:11:53 +02:00
Sven Mika	c3e3fc7637	[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. (#19100 )	2021-10-07 23:57:53 +02:00
Sven Mika	b4300dd532	[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937 )	2021-10-04 13:29:00 +02:00
Sven Mika	73f5c4039b	[RLlib] Fix flakey test_a3c, test_maml, test_apex_dqn. (#19035 )	2021-10-04 13:23:51 +02:00
Jiajun Yao	7588bfd315	[Lint] Add flake8-bugbear (#19053 ) * Add flake8-bugbear * Add flake8-bugbear	2021-10-03 23:24:11 -07:00
Sven Mika	16ad46a654	[RLlib] Fix broken test_r2d2.py. (#19017 )	2021-09-30 21:19:37 +02:00
Sven Mika	ac3371a148	[RLlib] Discussion 3644: Fix bug for complex obs spaces containing `Box([2D shape])` and discrete component. (#18917 )	2021-09-30 16:39:38 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	828f5d26b7	[RLlib] Custom view requirements (e.g. for prev-n-obs) work with `compute_single_action` and `compute_actions_from_input_dict`. (#18921 )	2021-09-30 15:03:37 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468 )	2021-09-23 12:56:45 +02:00

336 commits