hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 02:21:39 -05:00

Author	SHA1	Message	Date
Sven Mika	b10d5533be	[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452 )	2022-01-10 11:19:40 +01:00
Sven Mika	b4790900f5	[RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725 )	2021-12-04 22:05:26 +01:00
Jun Gong	2317c693cf	[RLlib] Use SampleBrach instead of input dict whenever possible (#20746 )	2021-12-02 13:11:26 +01:00
Artur Niederfahrenhorst	d07e50e957	[RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552 )	2021-11-19 11:57:37 +01:00
Sven Mika	cf21c634a3	[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982 )	2021-11-03 10:00:46 +01:00
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
gjoliver	d81885c1f1	[RLlib] Fix all the CI tests that were broken by is_training and replay buffer changes; re-comment-in the failing RLlib tests (#19809 ) * Fix DDPG, since it is based on GenericOffPolicyTrainer. * Fix QMix, SAC, and MADDPA too. * Undo QMix change. * Fix DQN input batch type. Always use SampleBatch. * apex ddpg should not use replay_buffer_config yet. * Make eager tf policy to use SampleBatch. * lint * LINT. * Re-enable RLlib broken tests to make sure things work ok now. * fixes. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-10-28 18:06:47 +02:00
gjoliver	99a0088233	[RLlib] Unify the way we create local replay buffer for all agents (#19627 ) * [RLlib] Unify the way we create and use LocalReplayBuffer for all the agents. This change 1. Get rid of the try...except clause when we call execution_plan(), and get rid of the Deprecation warning as a result. 2. Fix the execution_plan() call in Trainer._try_recover() too. 3. Most importantly, makes it much easier to create and use different types of local replay buffers for all our agents. E.g., allow us to easily create a reservoir sampling replay buffer for APPO agent for Riot in the near future. * Introduce explicit configuration for replay buffer types. * Fix is_training key error. * actually deprecate buffer_size field.	2021-10-26 20:56:02 +02:00
gjoliver	c3c42278e4	[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings (#19652 ) * [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings. * wip	2021-10-25 09:38:56 +02:00
Sven Mika	1f0646f658	[RLlib] Issue 18418: SAC w/ dict space not working. (#19101 )	2021-10-06 09:05:50 +02:00
Sven Mika	b4300dd532	[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937 )	2021-10-04 13:29:00 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	9c9b482661	[RLlib] Allow n-step > 1 and prio. replay for R2D2 and RNNSAC. (#18939 )	2021-09-29 21:31:34 +02:00
Sven Mika	ba1c489b79	[RLlib Testing] Lower `--smoke-test` "time_total_s" to make sure it doesn't time out. (#18670 )	2021-09-16 18:22:23 +02:00
Sven Mika	8a00154038	[RLlib] Bump tf version in ML docker to tf==2.5.0; add tfp to ML-docker. (#18544 )	2021-09-15 08:46:37 +02:00
Sven Mika	e3e6ed7aaa	[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358 )	2021-09-06 12:14:20 +02:00
Sven Mika	599e589481	[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065 )	2021-08-31 14:56:53 +02:00
Sven Mika	4888d7c9af	[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999 )	2021-08-31 12:21:49 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
Sven Mika	a428f10ebe	[RLlib] Add multi-GPU learning tests to nightly. (#17778 )	2021-08-18 17:21:01 +02:00
Sven Mika	924f11cd45	[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371 )	2021-08-03 11:35:49 -04:00
Sven Mika	8a844ff840	[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch\|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444 )	2021-08-02 17:29:59 -04:00
Julius Frost	d7a5ec1830	[RLlib] SAC tuple observation space fix (#17356 )	2021-07-28 12:39:28 -04:00
Sven Mika	90b21ce27e	[RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321 )	2021-07-27 14:39:06 -04:00
ddworak94	fba8461663	[RLlib] Add RNN-SAC agent (#16577 ) Shoutout to @ddworak94 :)	2021-07-25 10:04:52 -04:00
Julius Frost	0b1b6222bc	[rllib] Add merge_trainer_config arguments to trainer template (#17160 )	2021-07-21 15:43:06 -07:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169 )	2021-07-20 14:58:13 -04:00
Sven Mika	1fd0eb805e	[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014 )	2021-07-13 14:01:30 -04:00
Amog Kamsetty	bc33dc7e96	Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`." (#17002 ) This reverts commit `7862dd64ea`.	2021-07-12 11:09:14 -07:00
Sven Mika	7862dd64ea	[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774 )	2021-07-08 17:31:34 +02:00
Sven Mika	53206dd440	[RLlib] CQL BC loss fixes; PPO/PG/A2\|3C action normalization fixes (#16531 )	2021-06-30 12:32:11 +02:00
Sven Mika	2900a06dd7	[RLlib] Issue 14503: SAC not allowing custom action distributions. (#16427 )	2021-06-18 17:27:29 +02:00
Sven Mika	839fc59224	[RLlib] CQL TensorFlow support (#15841 )	2021-05-18 11:10:46 +02:00
Sven Mika	bc09e75b78	[RLlib] Fix 3 flakey test cases. (#15785 )	2021-05-16 12:20:33 +02:00
SebastianBo1995	f5be8d8f74	[Rllib] Offline Learning Bug, different shapes (#15132 )	2021-04-27 17:18:17 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684 )	2021-04-27 10:44:54 +02:00
Sven Mika	cecfc3b43b	[RLlib] Multi-GPU support for Torch algorithms. (#14709 )	2021-04-16 09:16:24 +02:00
Sven Mika	9c5a0cfd7a	[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386 )	2021-04-11 18:20:04 +02:00
Raphael CHEN	93d4244d9c	[RLlib] Correctly get bytes size of SampleBatch (#14801 )	2021-03-30 19:24:58 +02:00
Sven Mika	4f66309e19	[RLlib] Redo issue 14533 tf enable eager exec (#14984 )	2021-03-29 20:07:44 +02:00
SangBin Cho	fa5f961d5e	Revert "[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737 )" (#14918 ) This reverts commit `3e389d5812`.	2021-03-25 00:42:01 -07:00
astronauti	8874ccec2d	[RLlib] Update sac_tf_policy.py (add tf.cast to float32 for rewards) (#14843 )	2021-03-24 16:12:55 +01:00
Sven Mika	3e389d5812	[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737 )	2021-03-24 12:54:27 +01:00
Sven Mika	04bc0a9828	[RLlib] Remove all non-trajectory view API code. (#14860 )	2021-03-23 09:50:18 -07:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065 )	2021-03-17 08:18:15 +01:00
Sven Mika	732197e23a	[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393 )	2021-03-08 15:41:27 +01:00
Sven Mika	ef944bc5f0	[RLlib] Re-enable placement group support for RLlib. (#14384 )	2021-03-05 08:16:24 +01:00
Richard Liaw	a2d2275ee1	Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289 )" (#14360 ) This reverts commit `6cd0cd3bd9`.	2021-02-25 14:27:35 -08:00

1 2 3

126 commits