Sven Mika | 92781c603e | [RLlib] A2C training_iteration method implementation (_disable_execution_plan_api=True) (#23735) | 2022-04-15 18:36:13 +02:00
Sven Mika | a8494742a3 | [RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412) | 2022-04-12 07:50:09 +02:00
Jun Gong | 500cf7dcef | [RLlib] Run test_policy_client_server_setup.sh tests on different ports. (#23787) | 2022-04-11 22:07:07 +02:00
Sven Mika | c82f6c62c8 | [RLlib] Make RolloutWorkers (optionally) recoverable after failure. (#23739) | 2022-04-08 15:33:28 +02:00
Sven Mika | 4d285a00a4 | [RLlib] Issue 23689: tf Initializer has hard-coded float32 dtype. (#23741) | 2022-04-07 21:35:02 +02:00
Sven Mika | 0b3a79ca41 | [RLlib] Issue 23639: Error in client/server setup when using LSTMs (#23740) | 2022-04-07 10:16:22 +02:00
Sven Mika | e391b624f0 | [RLlib] Re-enable (for CI-testing) our two self_play example scripts. (#23742) | 2022-04-07 08:20:48 +02:00
Sven Mika | 434265edd0 | [RLlib] Examples folder: All training_iteration translations. (#23712) | 2022-04-05 16:33:50 +02:00
Sven Mika | b1cda46681 | [RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276) | 2022-03-18 13:45:16 +01:00
Artur Niederfahrenhorst | 37d129a965 | [RLlib] ReplayBuffer API: Test cases. (#22390) | 2022-03-08 16:54:12 +01:00
Artur Niederfahrenhorst | c0ade5f0b7 | [RLlib] Issue 22625: MultiAgentBatch.timeslices() does not behave as expected. (#22657) | 2022-03-08 14:25:48 +01:00
Jiajun Yao | 4801e57c77 | [Test] Add missing tests to bazel BUILD (#22827) | 2022-03-07 19:54:49 -08:00
Sven Mika | e50bd212a1 | [RLlib] Disable flakey Pendulum-v1 tests (until further investigation). (#22686) | 2022-03-01 16:44:17 +01:00
Sven Mika | 8e00537b65 | [RLlib] SlateQ: framework=tf fixes and SlateQ documentation update (#22543) | 2022-02-23 13:03:45 +01:00
Sven Mika | 6522935291 | [RLlib] Slate-Q tf implementation and tests/benchmarks. (#22389) | 2022-02-22 09:36:44 +01:00
Sven Mika | c58cd90619 | [RLlib] Enable Bandits to work in batches mode(s) (vector envs + multiple workers + train_batch_sizes > 1). (#22465) | 2022-02-17 22:32:26 +01:00
Sven Mika | 04a5c72ea3 | Revert "Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test."" (#18708) | 2022-02-10 13:44:22 +01:00
Alex Wu | b122f093c1 | Revert "[RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test." (#22250) | 2022-02-09 09:26:36 -08:00
    Reverts ray-project/ray#22126
    Breaks rllib:tests/test_io
Sven Mika | ac3e6ab411 | [RLlib] Speedup A3C up to 3x (new training_iteration function instead of execution_plan) and re-instate Pong learning test. (#22126) | 2022-02-08 19:04:13 +01:00
Sven Mika | c17a44cdfa | Revert "Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni…" (#22153) | 2022-02-08 16:43:00 +01:00
Sven Mika | 8b678ddd68 | [RLlib] Issue 22036: Client should handle concurrent episodes with one being training_enabled=False. (#22076) | 2022-02-06 12:35:03 +01:00
Sven Mika | f6617506a2 | [RLlib] Add on_sub_environment_created to DefaultCallbacks class. (#21893) | 2022-02-04 22:22:47 +01:00
Sven Mika | 38d75ce058 | [RLlib] Cleanup SlateQ algo; add test + add target Q-net (#21827) | 2022-02-04 17:01:12 +01:00
Avnish Narayan | 0d2ba41e41 | [RLlib] [CI] Deflake longer running RLlib learning tests for off policy algorithms. Fix seeding issue in TransformedAction Environments (#21685) | 2022-02-04 14:59:56 +01:00
SangBin Cho | a887763b38 | Revert "[RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learni… (#22105) | 2022-02-04 00:54:50 -08:00
    This reverts commit 3f03ef8ba8.
Sven Mika | 3f03ef8ba8 | [RLlib] AlphaStar: Parallelized, multi-agent/multi-GPU learning via league-based self-play. (#21356) | 2022-02-03 09:32:09 +01:00
Jun Gong | 9c95b9a5fa | [RLlib] Add an env wrapper so RecSim works with our Bandits agent. (#22028) | 2022-02-02 12:15:38 +01:00
Jun Gong | a55258eb9c | [RLlib] Move bandit example scripts into examples folder. (#21949) | 2022-02-02 09:20:47 +01:00
Sven Mika | 893536ebd9 | [RLlib] Move bandits into main agents folder; Make RecSim adapter more accessible; (#21773) | 2022-01-27 13:58:12 +01:00
Sven Mika | d5bfb7b7da | [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) | 2022-01-25 14:16:58 +01:00
Sven Mika | 3ac4daba07 | [RLlib] Discussion 4351: Conv2d default filter tests and add default setting for 96x96 image obs space. (#21560) | 2022-01-13 18:50:42 +01:00
Avnish Narayan | f7a5fc36eb | [rllib] Give rnnsac_stateless cartpole gpu, increase timeout (#21407) | 2022-01-06 11:54:19 -08:00
    Increase test_preprocessors runtimes.
Sven Mika | 9e6b871739 | [RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330) | 2022-01-05 11:29:44 +01:00
Sven Mika | abd3bef63b | [RLlib] QMIX better defaults + added to CI learning tests (#21332) | 2022-01-04 08:54:41 +01:00
Sven Mika | daa4304a91 | [RLlib] Switch off preprocessors by default for PGTrainer. (#21008) | 2021-12-13 12:04:23 +01:00
Sven Mika | 596c8e2772 | [RLlib] Experimental no-flatten option for actions/prev-actions. (#20918) | 2021-12-11 14:57:58 +01:00
Eric Liang | 6f93ea437e | Remove the flaky test tag (#21006) | 2021-12-11 01:03:17 -08:00
Avnish Narayan | 6996eaa986 | [RLlib] Add necessary fields to Base Envs, and BaseEnv wrapper classes (#20832) | 2021-12-09 14:40:40 +01:00
Ishant Mrinal | 2868d1a2cf | [RLlib] Support for RE3 exploration algorithm (for tf) (#19551) | 2021-12-07 13:26:34 +01:00
Sven Mika | 60b2219d72 | [RLlib] Allow for evaluation to run by timesteps (alternative to episodes) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2021-12-04 13:26:33 +01:00
Jun Gong | 65bd8e29f8 | [RLlib] Update a few things to get rid of the remote_vector_env deprecation warning. (#20753) | 2021-12-02 13:10:44 +01:00
mvindiola1 | 8cee0c03bf | [RLlib] Update max_seq_len in pad_batch_to_sequences_of_same_size (#20743) | 2021-11-30 18:00:07 +01:00
Sven Mika | 7a585fb275 | [RLlib; Documentation] RLlib README overhaul. (#20249) | 2021-11-18 18:08:40 +01:00
Sven Mika | 56619b955e | [RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250) | 2021-11-17 21:40:16 +01:00
Avnish Narayan | dc17f0a241 | Add error messages for missing tf and torch imports (#20205) | 2021-11-16 16:30:53 -08:00
    Co-authored-by: Sven Mika <sven@anyscale.io>
    Co-authored-by: sven1977 <svenmika1977@gmail.com>
Sven Mika | f82880eda1 | Revert "Revert [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061) (#20399)" (#20417) | 2021-11-16 14:49:41 +01:00
    This reverts commit 90dc5460d4.
Amog Kamsetty | 90dc5460d4 | Revert "[RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061)" (#20399) | 2021-11-15 16:11:35 -08:00
    This reverts commit 5b1c8e46e1.
Sven Mika | 5b1c8e46e1 | [RLlib] POC: Deprecate build_policy (policy template) for torch only; PPOTorchPolicy (#20061) | 2021-11-15 10:41:54 +01:00
Sven Mika | ebd56b57db | [RLlib; documentation] "RLlib in 60sec" overhaul. (#20215) | 2021-11-10 22:20:06 +01:00
Sven Mika | 143d23a278 | [RLlib] Issue 20062: Action inference examples missing (#20144) | 2021-11-10 18:49:06 +01:00