hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 ) (#20399 )" (#20417 ) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )" (#20399 ) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )	2021-11-15 10:41:54 +01:00
Sven Mika	70fe25055a	[RLlib] Issue: Get single step input dict incorrect. (#20217 )	2021-11-12 08:38:51 +01:00
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
Sven Mika	ea2bea7e30	[RLlib; Docs overhaul] Docstring cleanup: Offline. (#19808 )	2021-11-01 10:59:53 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
Sven Mika	902e854af2	[RLlib; Docs overhaul] Docstring cleanup: Environments. (#19784 ) * wip. * Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes. * remove bare_metal_policy_with_custom_view_reqs from tests	2021-10-29 10:46:52 +02:00
Antoine Galataud	edb338ff7c	[RLlib] Check `training_enabled` on PolicyServer (#19007 )	2021-10-12 16:21:02 +02:00
Sven Mika	d439fd7f17	[RLlib] TF2/eager memory leak fixes. (#19198 )	2021-10-09 00:11:53 +02:00
Sven Mika	c3e3fc7637	[RLlib] Issue 18280: A3C/IMPALA multi-agent not working. (#19100 )	2021-10-07 23:57:53 +02:00
Jiajun Yao	7588bfd315	[Lint] Add flake8-bugbear (#19053 ) * Add flake8-bugbear * Add flake8-bugbear	2021-10-03 23:24:11 -07:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
mvindiola1	62f5da0b65	[RLlib] Add unit tests for updating episode data in base_env (#17137 )	2021-09-24 16:08:11 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468 )	2021-09-23 12:56:45 +02:00
Sven Mika	a2a077b874	[RLlib] Faster remote worker space inference (don't infer if not required). (#18805 )	2021-09-23 10:54:37 +02:00
Sven Mika	a96dbd885b	[RLlib] Reinstate trajectory view API tests. (#18809 )	2021-09-23 08:31:51 +02:00
Sven Mika	fd13bac9b3	[RLlib] Add `worker` arg (optional) to `policy_mapping_fn`. (#18184 )	2021-09-17 12:07:11 +02:00
Sven Mika	8a72824c63	[RLlib Testig] Split and unflake more CI tests (make sure all jobs are < 30min). (#18591 )	2021-09-15 22:16:48 +02:00
Sven Mika	3f89f35e52	[RLlib] Better error messages and hints; + failure-mode tests; (#18466 )	2021-09-10 16:52:47 +02:00
Sven Mika	8a066474d4	[RLlib] No Preprocessors; preparatory PR #1 (#18367 )	2021-09-09 08:10:42 +02:00
Sven Mika	1520c3d147	[RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker-option to `Trainer.add_policy()` (#18428 )	2021-09-09 07:10:06 +02:00
Sven Mika	e3e6ed7aaa	[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358 )	2021-09-06 12:14:20 +02:00
Sven Mika	a772c775cd	[RLlib] Set random seed (if provided) to Trainer process as well. (#18307 )	2021-09-04 11:02:30 +02:00
Sven Mika	9a8ca6a69d	[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306 )	2021-09-03 13:29:57 +02:00
gjoliver	336e79956a	[RLlib] Make MultiAgentEnv inherit gym.Env to avoid direct class type manipulation (#18156 )	2021-09-03 08:02:05 +02:00
Sven Mika	2357bbc0c8	[RLlib] Issue 18231: Better (earlier) env validation and error message improvement. (#18249 )	2021-09-02 09:28:16 +02:00
gjoliver	6621bb5611	[RLlib] Minor renaming and cleanups related to last rollout worker seed fix. (#18155 )	2021-09-02 06:57:46 +02:00
Joseph Suarez	8136d2912b	[RLlib] Add `policies` arg to callback: `on_episode_step` (already exists in all other episode-related callbacks) (#18119 )	2021-08-27 16:12:19 +02:00
gjoliver	a8813675f4	[RLlib] Issue 17900: Set `seed` in single vectorized sub-envs properly, if `num_envs_per_worker > 1` (#18110 ) * In case a worker runs multiple envs, make sure a different seed can be deterministically set on all of them. * Revert a couple of whitespace changes. * Fix a few style errors. Co-authored-by: Jun Gong <jungong@mbpro.local>	2021-08-26 11:32:58 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
simonsays1980	60aee4a330	[RLlib] Add example script for bare metal Policy with custom `view_requirements`. (#17896 )	2021-08-20 12:17:13 +02:00
Kai Fricke	bf3eaa9264	[RLlib] Dreamer fixes and reinstate Dreamer test. (#17821 ) Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-08-18 18:47:08 +02:00
Sven Mika	f18213712f	[RLlib] Redo: "fix self play example scripts" PR (17566) (#17895 ) * wip. * wip. * wip. * wip. * wip. * wip. * wip. * wip. * wip.	2021-08-17 09:13:35 -07:00
Sven Mika	2bd2ee7a73	[RLlib] SampleBatch: Docstring- and API cleanups; Add support for nested data. (#17485 )	2021-08-16 06:08:14 +02:00
akern40	0cb2c602db	[rllib] Fixes typo in RolloutWorker.__init__ (#17583 ) Fixes the typo in RolloutWorker.__init__, closes #17582	2021-08-13 13:17:36 -07:00
Sven Mika	29f20cccb6	[RLlib] Issue 17706: AttributeError: 'numpy.ndarray' object has no attribute 'items'" on certain turn-based MultiAgentEnvs with Dict obs space. (#17735 )	2021-08-11 12:33:35 +02:00
Amog Kamsetty	77f28f1c30	Revert "[RLlib] Fix `Trainer.add_policy` for num_workers>0 (self play example scripts). (#17566 )" (#17709 ) This reverts commit `3b447265d8`.	2021-08-10 10:50:01 -07:00
Sven Mika	3b447265d8	[RLlib] Fix `Trainer.add_policy` for num_workers>0 (self play example scripts). (#17566 )	2021-08-05 11:41:18 -04:00
Kai Fricke	5d56a8aac5	[RLlib] Fix ignoring "sample_collector" config key (#17460 )	2021-08-04 10:27:35 -04:00
Sven Mika	5107d16ae5	[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530 )	2021-08-03 18:30:02 -04:00
Sven Mika	8a844ff840	[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444 )	2021-08-02 17:29:59 -04:00
Sven Mika	0d8fce8fd8	[RLlib] Discussion 2294: Custom vector env example and fix. (#16083 )	2021-07-28 10:40:04 -04:00
Sven Mika	0c5c70b584	[RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. (#17118 )	2021-07-25 16:55:51 -04:00
Chris Bamford	29768a7c01	[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856 )	2021-07-25 14:25:07 -04:00
Sven Mika	7bc4376466	[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077 )	2021-07-22 10:59:13 -04:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169 )	2021-07-20 14:58:13 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031 )	2021-07-19 13:16:03 -04:00
Sven Mika	649580d735	[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046 )	2021-07-15 05:51:24 -04:00

225 commits