hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	627b9f2e88	[RLlib] QMIX training iteration function and new replay buffer API. (#24164 )	2022-04-27 14:24:20 +02:00
Steven Morad	00922817b6	[RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673 )	2022-04-11 08:39:10 +02:00
Sven Mika	c82f6c62c8	[RLlib] Make RolloutWorkers (optionally) recoverable after failure. (#23739 )	2022-04-08 15:33:28 +02:00
Sven Mika	0af100ffae	[RLlib] Fix tree.flatten dict ordering bug: `flatten_space([obs_space])` should produce same struct as `tree.flatten([obs])`. (#22731 )	2022-03-01 21:24:24 +01:00
Avnish Narayan	740def0a13	[RLlib] Put env-checker on critical path. (#22191 )	2022-02-17 14:06:14 +01:00
Sven Mika	04a5c72ea3	Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708 )	2022-02-10 13:44:22 +01:00
Alex Wu	b122f093c1	Revert "[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test." (#22250 ) Reverts ray-project/ray#22126 Breaks rllib:tests/test_io	2022-02-09 09:26:36 -08:00
Sven Mika	ac3e6ab411	[RLlib] Speedup A3C up to 3x (new `training_iteration` function instead of `execution_plan`) and re-instate Pong learning test. (#22126 )	2022-02-08 19:04:13 +01:00
Jun Gong	87fe033f7b	[RLlib] Request CPU resources in `Trainer.default_resource_request()` if using dataset input. (#21948 )	2022-02-02 10:20:37 +01:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975 ) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Sven Mika	371fbb17e4	[RLlib] Make `policies_to_train` more flexible via callable option. (#20735 )	2022-01-27 12:17:34 +01:00
Jun Gong	099c170ab4	[RLlib] Dataset Reader/Writer for RLlib (#21808 )	2022-01-26 16:00:46 +01:00
Sven Mika	d5bfb7b7da	[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652 )	2022-01-25 14:16:58 +01:00
Sven Mika	92f030331e	[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420 )	2022-01-10 11:22:55 +01:00
Sven Mika	b10d5533be	[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452 )	2022-01-10 11:19:40 +01:00
Sven Mika	62dbf26394	[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984 )	2021-12-21 08:39:05 +01:00
Sven Mika	60b2219d72	[RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757 )	2021-12-04 13:26:33 +01:00
Sven Mika	56619b955e	[RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250 )	2021-11-17 21:40:16 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
Sven Mika	902e854af2	[RLlib; Docs overhaul] Docstring cleanup: Environments. (#19784 ) * wip. * Test: Make a change in tune to trigger tune tests, which are not run otherwise, but seem to fail nevertheless with this PR's changes. * remove bare_metal_policy_with_custom_view_reqs from tests	2021-10-29 10:46:52 +02:00
Sven Mika	a2a077b874	[RLlib] Faster remote worker space inference (don't infer if not required). (#18805 )	2021-09-23 10:54:37 +02:00
Sven Mika	9a8ca6a69d	[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306 )	2021-09-03 13:29:57 +02:00
Sven Mika	7bc4376466	[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077 )	2021-07-22 10:59:13 -04:00
Sven Mika	649580d735	[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046 )	2021-07-15 05:51:24 -04:00
Amog Kamsetty	38b5b6d24c	Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565 )" (#17036 ) This reverts commit `e4123fff27`.	2021-07-13 09:57:15 -07:00
Sven Mika	e4123fff27	[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565 )	2021-07-13 06:38:14 -04:00
Julius Frost	a88b217d3f	[rllib] Enhancements to Input API for customizing offline datasets (#16957 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-07-10 15:05:25 -07:00
Julius Frost	ada0552f16	[rllib] d4rl: fix for paths with multiple periods (#16721 )	2021-07-01 18:35:50 -07:00
Sven Mika	53206dd440	[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531 )	2021-06-30 12:32:11 +02:00
Sven Mika	c95dea51e9	[RLlib] External env enhancements + more examples. (#16583 )	2021-06-23 09:09:01 +02:00
Sven Mika	be6db06485	[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569 )	2021-06-21 13:46:01 +02:00
Amog Kamsetty	bd3cbfc56a	Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359 )" (#16543 ) This reverts commit `e78ec370a9`.	2021-06-18 12:21:49 -07:00
Sven Mika	e78ec370a9	[RLlib] Allow policies to be added/deleted on the fly. (#16359 )	2021-06-18 10:31:30 +02:00
Sven Mika	d89fb82bfb	[RLlib] Add simple curriculum learning API and example script. (#15740 )	2021-05-16 17:35:10 +02:00
Amog Kamsetty	ebc44c3d76	[CI] Upgrade flake8 to 3.9.1 (#15527 ) * formatting * format util * format release * format rllib/agents * format rllib/env * format rllib/execution * format rllib/evaluation * format rllib/examples * format rllib/policy * format rllib utils and tests * format streaming * more formatting * update requirements files * fix rllib type checking * updates * update * fix circular import * Update python/ray/tests/test_runtime_env.py * noqa	2021-05-03 14:23:28 -07:00
Sven Mika	04bc0a9828	[RLlib] Remove all non-trajectory view API code. (#14860 )	2021-03-23 09:50:18 -07:00
Sven Mika	f859ebb99f	[RLlib] Fix env rendering and recording options (for non-local mode; >0 workers; +evaluation-workers). (#14796 )	2021-03-23 10:06:06 +01:00
Eric Liang	9db000ff2c	Auto report object store memory usage; remove some deprecated code (#14260 )	2021-03-01 13:19:44 -08:00
Michael Luo	587f207c2f	[RLlib] Support for D4RL + Semi-working CQL Benchmark (#13550 )	2021-01-21 16:43:55 +01:00
Sven Mika	abb1eefdc2	[RLlib] Issue 12483: Discrete observation space error: "ValueError: ('Observation ({}) outside given space ..." when doing Trainer.compute_action. (#12787 )	2020-12-11 22:43:30 +01:00
Sven Mika	ea25482f6a	WIP. (#12706 )	2020-12-09 11:49:21 -08:00
Sven Mika	e40b14d255	[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420 )	2020-12-08 16:41:45 -08:00
Sven Mika	99c81c6795	[RLlib] Attention Net prep PR #3 . (#12450 )	2020-12-07 13:08:17 +01:00
mvindiola1	9e68b77796	[RLLIB] Wait for remote_workers to finish closing environments before terminating (#11476 )	2020-10-28 14:23:06 -07:00
Sven Mika	414041c6dd	[RLlib] Do not create env on driver iff num_workers > 0. (#11307 )	2020-10-15 18:21:30 +02:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033 )	2020-10-06 20:28:16 +02:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825 )	2020-09-20 11:27:02 +02:00
Alex Wu	a699f6a4d8	[Core] Fix override memory and object_store_memory in decorator (#10563 )	2020-09-06 20:56:48 -07:00
Sven Mika	715ee8dfc9	[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477 )	2020-09-03 17:27:05 +02:00
raoul-khour-ts	c8c4832794	Prevent Local Worker creation from blocking remote worker creation by creating remote workers before local worker (#10245 ) * create remote workers before local worker * reformatted	2020-08-24 12:29:55 -07:00

72 commits