hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
kourosh hakhamaneshi	8d848890f1	[RLlib] Fix default view_requirement in policy.py (#27255)	2022-08-02 10:44:07 -07:00
Jun Gong	54df8bfe42	[RLlib] Try to checkpoint a durable policy name (#27016)	2022-07-27 00:01:14 -07:00
kourosh hakhamaneshi	8ddcf89096	[RLlib] Implemented ViewRequirementConnector (#26998)	2022-07-26 21:52:14 -07:00
Jun Gong	6b6d3017ba	[RLlib] more connector polishes and fixes. (#26645)	2022-07-19 08:50:28 -07:00
Jun Gong	b383d987d1	[RLlib] Fix a bunch of issues related to connectors. (#26510)	2022-07-13 18:55:20 +02:00
Jun Gong	0c469e490e	[RLlib] Checkpoint and restore connectors. (#26253)	2022-07-09 01:06:24 -07:00
Jun Gong	d83bbda281	[RLlib] Save serialized PolicySpec. Extract `num_gpus` related logics into a util function. (#25954)	2022-06-30 11:38:21 +02:00
Jun Gong	52bb8e47d4	[RLlib] EnvRunnerV2 and EpisodeV2 that support Connectors. (#25922)	2022-06-30 08:44:10 +02:00
Artur Niederfahrenhorst	e10876604d	[RLlib] Include SampleBatch.T column in all collected batches. (#25926)	2022-06-21 13:20:22 +02:00
Sven Mika	130b7eeaba	[RLlib] `Trainer` to `Algorithm` renaming. (#25539)	2022-06-11 15:10:39 +02:00
Artur Niederfahrenhorst	71a8a443ce	[RLlib] Fix Policy global timesteps being off by init sample batch size. (#25349)	2022-06-02 10:19:21 +02:00
Eric Liang	905258dbc1	Clean up docstyle in python modules and add LINT rule (#25272)	2022-06-01 11:27:54 -07:00
Sven Mika	d95009a3ac	[RLlib] Vectorized envs: Gracefully handle sub-environments failing by restarting them (if configured so). (#24967)	2022-05-28 10:50:03 +02:00
Jun Gong	eaf9c941ae	[RLlib] Migrate PPO Impala and APPO policies to use sub-classing implementation. (#25117)	2022-05-25 14:38:03 +02:00
Eric Liang	4963dfaae0	[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060)	2022-05-24 22:14:25 -07:00
Jun Gong	d5a6d46049	[RLlib] Migrate MAML, MB-MPO, MARWIL, and BC to use Policy sub-classing implementation. (#24914)	2022-05-20 14:10:59 +02:00
Sven Mika	6551922c21	[RLlib] Fix AlphaStar for tf2+tracing; smaller cleanups around avoiding to wrap a TFPolicy `as_eager()` or `with_tracing` more than once. (#24271)	2022-04-28 13:43:21 +02:00
Sven Mika	a8494742a3	[RLlib] Memory leak finding toolset using tracemalloc + CI memory leak tests. (#15412)	2022-04-12 07:50:09 +02:00
Max Pumperla	60054995e6	[docs] fix doctests and activate CI (#23418)	2022-03-24 17:04:02 -07:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Sven Mika	ee41800c16	[RLlib] Preparatory PR for multi-agent, multi-GPU learning agent (alpha-star style) #02. (#21649)	2022-01-27 22:07:05 +01:00
Sven Mika	92f030331e	[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420)	2022-01-10 11:22:55 +01:00
Sven Mika	daa4304a91	[RLlib] Switch off preprocessors by default for PGTrainer. (#21008)	2021-12-13 12:04:23 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918)	2021-12-11 14:57:58 +01:00
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061) (#20399)" (#20417) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061)" (#20399) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	6ff4061f3a	[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. (#20366) * wip. * Apply suggestions from code review	2021-11-15 16:41:08 +01:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061)	2021-11-15 10:41:54 +01:00
Sven Mika	a931076f59	[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981)	2021-11-05 16:10:00 +01:00
Sven Mika	f3397b6f48	[RLlib] Minor fixes/cleanups; chop_into_sequences now handles nested data. (#19408)	2021-11-05 14:39:28 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829)	2021-11-01 21:46:02 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783)	2021-10-29 12:03:56 +02:00
Sven Mika	f2cb2ed203	[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759)	2021-10-27 19:14:39 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879)	2021-09-30 16:39:05 +02:00
Sven Mika	828f5d26b7	[RLlib] Custom view requirements (e.g. for prev-n-obs) work with `compute_single_action` and `compute_actions_from_input_dict`. (#18921)	2021-09-30 15:03:37 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468)	2021-09-23 12:56:45 +02:00
Sven Mika	a96dbd885b	[RLlib] Reinstate trajectory view API tests. (#18809)	2021-09-23 08:31:51 +02:00
Sven Mika	8a066474d4	[RLlib] No Preprocessors; preparatory PR #1 (#18367)	2021-09-09 08:10:42 +02:00
Sven Mika	4888d7c9af	[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999)	2021-08-31 12:21:49 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)	2021-08-21 17:05:48 +02:00
simonsays1980	7b33dc21dc	[RLlib] Fix update model view requirements from init state for bare-metal policies with custom view-reqs. (#17867) * Changed '_update_model_view_requirements_from_init_state()' to adopt the 'shift' in view_requirements from a user-defined policy that inherits directly from Policy. * Added slightly modifed version of Sven's suggestion. Like this any user-defined attributes of the ViewRequirement of the state get conserved. * I saw that the code in _update_model_view_requirements_from_init_state() had changed and is not identical to my locally installed version. In the new version view_requirements from the model and the policy get united and therefore a loop runs through this unified list. Code should run now in the present version * Apply suggestions from code review	2021-08-17 11:49:24 +02:00
Sven Mika	5107d16ae5	[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530)	2021-08-03 18:30:02 -04:00
Sven Mika	58da5c1c9b	[RLlib] Discussion 3001: Fix comment on internal state shape (must be [B x S=state dim]). (#17341)	2021-07-27 21:41:53 -04:00
Chris Bamford	29768a7c01	[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856)	2021-07-25 14:25:07 -04:00
Sven Mika	7bc4376466	[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077)	2021-07-22 10:59:13 -04:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)	2021-07-20 14:58:13 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)	2021-07-19 13:16:03 -04:00
Sven Mika	649580d735	[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)	2021-07-15 05:51:24 -04:00
Sven Mika	1fd0eb805e	[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)	2021-07-13 14:01:30 -04:00
Amog Kamsetty	38b5b6d24c	Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036) This reverts commit `e4123fff27`.	2021-07-13 09:57:15 -07:00

116 commits