hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	ee4b6e7e3b	[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569 )	2021-03-12 18:27:25 +01:00
Clark Zinzow	5a788474aa	[Core] First pass at privatizing non-public Python APIs. (#14607 ) * async_compat * utils * cluster_utils * compat * function_manager * import_thread * memory_monitor * monitor, log_monitor, ray_process_reaper * metrics_agent * parameter * prometheus_exporter * ray_logging * signature	2021-03-10 22:47:28 -08:00
Maxime RICHE	9a7fbd3cdf	[RLlib] Add coin game env. Matrix social dilemma env. With tests and examples. (#14208 )	2021-03-09 17:26:20 +01:00
Sven Mika	732197e23a	[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393 )	2021-03-08 15:41:27 +01:00
Sven Mika	5637d89ecc	[RLlib] Serve + RLlib example script. (#14416 )	2021-03-03 14:33:03 +01:00
Sven Mika	4cd5c1da2c	[RLlib] Remove flaky test case for mixed (tf+torch) policies trainer. (#14357 )	2021-02-25 14:07:05 -08:00
Sven Mika	3d20d58c90	[RLlib] Tune trial + checkpoint selection example. (#14209 )	2021-02-22 12:52:37 +01:00
Sven Mika	929946271d	[RLlib] Issue #14022 : Trajectory View API fails in MA-env where a new agent terminates right away (done=True right after initial obs). (#14031 )	2021-02-18 14:07:49 +01:00
Sven Mika	4db86404ad	[RLlib] Issue #13507 : Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037 )	2021-02-11 18:58:46 +01:00
Sven Mika	eb0038612f	[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584 )	2021-02-08 15:02:19 +01:00
Sven Mika	d001af3e59	[RLlib] Allow `rllib rollout` to run distributed via evaluation workers. (#13718 )	2021-02-08 12:05:16 +01:00
Sven Mika	0a0d9183fe	[RLlib] Trajectory view API example script (enhancements and tf2 support). (#13786 )	2021-02-02 18:42:18 +01:00
Sven Mika	52c94b7ee9	[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522 )	2021-02-02 13:05:58 +01:00
Sven Mika	9423930bcc	[RLlib] MAML: Add cartpole mass test for PyTorch. (#13679 )	2021-01-25 12:32:41 +01:00
Sven Mika	e74947cc94	[RLlib] Env directory cleanup and tests. (#13082 )	2021-01-19 10:09:39 +01:00
Sven Mika	93c0a5549b	[RLlib] Deprecate `vf_share_layers` in top-level PPO/MAML/MB-MPO configs. (#13397 )	2021-01-19 09:51:35 +01:00
Sven Mika	d98235cc84	[RLlib] Deflake 2x remote & local inference tests (external env). (#13459 )	2021-01-14 20:44:26 +01:00
Sven Mika	56878221ed	[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363 )	2021-01-14 14:44:33 +01:00
Kai Fricke	25f10a947a	Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339 )" (#13361 ) This reverts commit `e2b2abb88b`.	2021-01-12 12:33:57 +01:00
Sven Mika	e2b2abb88b	[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339 )	2021-01-11 22:42:30 +01:00
Sven Mika	9dd9f72111	[RLlib] Add more detailed Documentation on Model building API (#13261 )	2021-01-09 12:38:29 +01:00
Sven Mika	6f342a2221	[RLlib] Preparatory PR for: Documentation on Model Building. (#13260 )	2021-01-08 10:56:09 +01:00
Basu Jindal	4e569ee20b	Update multi_agent_independent_learning.py (#13196 ) pettingzoo.utils.error.DeprecatedEnv: waterworld_v0 is now depreciated, use waterworld_v2 instead	2021-01-05 13:44:54 -08:00
Sven Mika	9eba1871bb	[RLlib] Support easy `use_attention=True` flag for using the GTrXL model. (#11698 )	2021-01-01 14:06:23 -05:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718 )	2020-12-30 17:32:21 -08:00
Sven Mika	c524f86785	[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064 )	2020-12-27 09:46:03 -05:00
Sven Mika	99ae7bae05	[RLlib] JAXPolicy prep. PR #1 . (#13077 )	2020-12-26 20:14:18 -05:00
Sven Mika	670d083a56	[RLlib] Fix broken unity3d_env import in example server script. (#13040 )	2020-12-23 11:29:58 -05:00
Sven Mika	d5604eaba3	[RLlib] Attention nets PyTorch support and cleanup (using traj. view API). (#12029 )	2020-12-21 18:38:34 -08:00
Sven Mika	b2bcab711d	[RLlib] Attention Nets: tf (#12753 )	2020-12-20 20:22:32 -05:00
Sven Mika	e40b14d255	[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420 )	2020-12-08 16:41:45 -08:00
Sven Mika	99c81c6795	[RLlib] Attention Net prep PR #3 . (#12450 )	2020-12-07 13:08:17 +01:00
Sven Mika	3ad9365e1d	[RLlib] Attention Net prep PR #2 : Smaller cleanups. (#12449 )	2020-12-01 08:21:45 +01:00
Sven Mika	0df55a139c	[RLlib] Attention Net prep PR #1 : Smaller cleanups. (#12447 ) * WIP. * Fix. * Fix. * Fix.	2020-11-27 16:25:47 -08:00
Sven Mika	592c161032	[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397 ) * WIP. * Fix and LINT.	2020-11-25 11:27:46 -08:00
Sven Mika	841d93d366	[RLlib] Issue 12233 shared tf layers example not really shared (only works for tf1.x, not tf2.x). (#12399 )	2020-11-25 11:27:19 -08:00
Raoul Khouri	d07ffc152b	[rllib] Rrk/12079 custom filters (#12095 ) * travis reformatted	2020-11-19 13:20:20 -08:00
Sven Mika	dab241dcc6	[RLlib] Fix inconsistency wrt batch size in SampleCollector (traj. view API). Makes DD-PPO work with traj. view API. (#12063 )	2020-11-19 19:01:14 +01:00
Michael Luo	6e6c680f14	MBMPO Cartpole (#11832 ) * MBMPO Cartpole Done * Added doc	2020-11-12 10:30:41 -08:00
Sven Mika	62c7ab5182	[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747 )	2020-11-12 16:27:34 +01:00
Benjamin Black	1999266bba	Updated pettingzoo env to accommodate api changes and fixes (#11873 ) * Updated pettingzoo env to accommodate api changes and fixes * fixed test failure * fixed linting issue * fixed test failure	2020-11-09 16:09:49 -08:00
Pierre TASSEL	66605cfcbd	[RLLib] Random Parametric Trainer (#11366 )	2020-11-04 11:12:51 +01:00
desktable	5af745c90d	[RLlib] Implement the SlateQ algorithm (#11450 )	2020-11-03 09:52:04 +01:00
Lara Codeca	e735add268	[RLlib] Integration with SUMO Simulator (#11710 )	2020-11-03 09:45:03 +01:00
Sven Mika	54d85a6c2a	[RLlib] Fix RNN learning for tf-eager/tf2.x. (#11720 )	2020-11-02 11:18:41 +01:00
Sven Mika	8ea1bc5ff9	[RLlib] Allow for more than 2^31 policy timesteps. (#11301 )	2020-10-12 13:49:11 -07:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033 )	2020-10-06 20:28:16 +02:00
Sven Mika	c17169dc11	[RLlib] Fix all example scripts to run on GPUs. (#11105 )	2020-10-02 23:07:44 +02:00
Sven Mika	36bda8432b	[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056 )	2020-10-01 16:57:10 +02:00
Eric Liang	daa03ba6e6	[rllib] Add execution module to package ref (#10941 ) * add init * add * update	2020-09-21 23:03:06 -07:00

202 commits