hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
brulu	8b77fc0aef	[RLlib] Updating Repeated space. Allowing numpy arrays and adding representation. (#20799 )	2021-12-16 08:27:55 +01:00
Sven Mika	daa4304a91	[RLlib] Switch off preprocessors by default for PGTrainer. (#21008 )	2021-12-13 12:04:23 +01:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918 )	2021-12-11 14:57:58 +01:00
Sven Mika	f814c2af89	[RLlib; Docs] Docs API reference pages: `rllib/execution`, `rllib/evaluation`, `rllib/models`, `rllib/offline`. (#20538 )	2021-12-10 09:41:29 +01:00
Carlo Grisetti	a8286c55af	[RLLib] Fix deprecated convert_to_non_torch_type (#20751 )	2021-12-09 14:42:12 +01:00
Ishant Mrinal	2868d1a2cf	[RLlib] Support for RE3 exploration algorithm (for tf) (#19551 )	2021-12-07 13:26:34 +01:00
Jun Gong	2317c693cf	[RLlib] Use SampleBatch instead of input dict whenever possible (#20746 )	2021-12-02 13:11:26 +01:00
mvindiola1	8cee0c03bf	[RLlib] Update `max_seq_len` in pad_batch_to_sequences_of_same_size (#20743 )	2021-11-30 18:00:07 +01:00
mvindiola1	eadc7669c5	[RLlib] SampleBatch.concat_samples fix incorrect max_seq_len calculation (#20704 )	2021-11-29 12:01:40 +01:00
Sven Mika	e37afe0425	[RLlib; Docs] Auto API reference pages overhaul: `rllib/policy` and `rllib/agents` packages. (#20537 )	2021-11-25 09:35:19 +01:00
Sven Mika	f82880eda1	Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 ) (#20399 )" (#20417 ) This reverts commit `90dc5460d4`.	2021-11-16 14:49:41 +01:00
Kai Fricke	3e6ba5d6d2	Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." (#20285 ) * Revert "Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)" This reverts commit `246787cdd9`. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-16 12:26:47 +01:00
Amog Kamsetty	90dc5460d4	Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )" (#20399 ) This reverts commit `5b1c8e46e1`.	2021-11-15 16:11:35 -08:00
Sven Mika	6ff4061f3a	[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. (#20366 ) * wip. * Apply suggestions from code review	2021-11-15 16:41:08 +01:00
Sven Mika	5b1c8e46e1	[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061 )	2021-11-15 10:41:54 +01:00
Sven Mika	70fe25055a	[RLlib] Issue: Get single step input dict incorrect. (#20217 )	2021-11-12 08:38:51 +01:00
Sven Mika	a931076f59	[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981 )	2021-11-05 16:10:00 +01:00
Sven Mika	f3397b6f48	[RLlib] Minor fixes/cleanups; chop_into_sequences now handles nested data. (#19408 )	2021-11-05 14:39:28 +01:00
Avnish Narayan	026bf01071	[RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535 ) * Fix QMix, SAC, and MADDPG too. * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * Add gym installation based on python version. Pin python<= 3.6 to gym 0.19 due to install issues with atari roms in gym 0.20 * Reformatting * Fixing tests * Move atari-py install conditional to req.txt * migrate to new ale install method * Make parametric_actions_cartpole return float32 actions/obs * Adding type conversions if obs/actions don't match space * Add utils to make elements match gym space dtypes Co-authored-by: Jun Gong <jungong@anyscale.com> Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-11-03 16:24:00 +01:00
Sven Mika	cf21c634a3	[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982 )	2021-11-03 10:00:46 +01:00
Sven Mika	2d24ef0d32	[RLlib] Add all simple learning tests as `framework=tf2`. (#19273 ) * Unpin gym and deprecate pendulum v0 Many tests in rllib depended on pendulum v0, however in gym 0.21, pendulum v0 was deprecated in favor of pendulum v1. This may change reward thresholds, so will have to potentially rerun all of the pendulum v1 benchmarks, or use another environment in favor. The same applies to frozen lake v0 and frozen lake v1 Lastly, all of the RLlib tests and Tune tests have been moved to python 3.7 * fix tune test_sampler::testSampleBoundsAx * fix re-install ray for py3.7 tests Co-authored-by: avnishn <avnishn@uw.edu>	2021-11-02 12:10:17 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829 )	2021-11-01 21:46:02 +01:00
Sven Mika	9c73871da0	[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783 )	2021-10-29 12:03:56 +02:00
gjoliver	d81885c1f1	[RLlib] Fix all the CI tests that were broken by is_training and replay buffer changes; re-comment-in the failing RLlib tests (#19809 ) * Fix DDPG, since it is based on GenericOffPolicyTrainer. * Fix QMix, SAC, and MADDPG too. * Undo QMix change. * Fix DQN input batch type. Always use SampleBatch. * apex ddpg should not use replay_buffer_config yet. * Make eager tf policy to use SampleBatch. * lint * LINT. * Re-enable RLlib broken tests to make sure things work ok now. * fixes. Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-10-28 18:06:47 +02:00
Sven Mika	f2cb2ed203	[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759 )	2021-10-27 19:14:39 +02:00
Avnish Narayan	ad87ddf93e	[rllib] Add deterministic test to gpu (#19306 ) Co-authored-by: sven1977 <svenmika1977@gmail.com>	2021-10-26 10:11:39 -07:00
Sven Mika	b213565783	[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693 )	2021-10-25 15:00:00 +02:00
gjoliver	c3c42278e4	[RLlib] clean up all the SampleBatch['is_training'] deprecation warnings (#19652 ) * [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings. * wip	2021-10-25 09:38:56 +02:00
Sven Mika	bd2d2079d2	[RLlib] Support >1 loss terms and optimizers for framework=tf2 (already supported for framework=[tf|torch]) (#19269 )	2021-10-10 12:19:47 +02:00
Sven Mika	d439fd7f17	[RLlib] TF2/eager memory leak fixes. (#19198 )	2021-10-09 00:11:53 +02:00
Sven Mika	b4300dd532	[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937 )	2021-10-04 13:29:00 +02:00
Sven Mika	ac3371a148	[RLlib] Discussion 3644: Fix bug for complex obs spaces containing `Box([2D shape])` and discrete component. (#18917 )	2021-09-30 16:39:38 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	828f5d26b7	[RLlib] Custom view requirements (e.g. for prev-n-obs) work with `compute_single_action` and `compute_actions_from_input_dict`. (#18921 )	2021-09-30 15:03:37 +02:00
Sven Mika	05a55a9335	[RLlib] Issue 18668: Unity3D env client/server example not working (fix + add to test cases). (#18942 )	2021-09-30 08:30:20 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468 )	2021-09-23 12:56:45 +02:00
Sven Mika	a96dbd885b	[RLlib] Reinstate trajectory view API tests. (#18809 )	2021-09-23 08:31:51 +02:00
Sven Mika	698b4eeed3	[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669 )	2021-09-21 22:00:14 +02:00
Sven Mika	08c09737fa	[RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550 )	2021-09-14 19:58:10 +02:00
Sven Mika	8a066474d4	[RLlib] No Preprocessors; preparatory PR #1 (#18367 )	2021-09-09 08:10:42 +02:00
Sven Mika	cd22a7d1bb	[RLlib] Add locking to PolicyMap in case it is accessed by a RolloutWorker and the same worker's AsyncSampler or the main LearnerThread. (#18444 )	2021-09-08 23:32:23 +02:00
Sven Mika	ba58f5edb1	[RLlib] Strictly run `evaluation_num_episodes` episodes each evaluation run (no matter the other eval config settings). (#18335 )	2021-09-05 15:37:05 +02:00
Sven Mika	9a8ca6a69d	[RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306 )	2021-09-03 13:29:57 +02:00
Sven Mika	4888d7c9af	[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999 )	2021-08-31 12:21:49 +02:00
Sven Mika	9883505e84	[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017 )	2021-08-24 21:55:27 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
Sven Mika	a428f10ebe	[RLlib] Add multi-GPU learning tests to nightly. (#17778 )	2021-08-18 17:21:01 +02:00
Sven Mika	f18213712f	[RLlib] Redo: "fix self play example scripts" PR (17566) (#17895 ) * wip. * wip. * wip. * wip. * wip. * wip. * wip. * wip. * wip.	2021-08-17 09:13:35 -07:00
simonsays1980	7b33dc21dc	[RLlib] Fix update model view requirements from init state for bare-metal policies with custom view-reqs. (#17867 ) * Changed '_update_model_view_requirements_from_init_state()' to adopt the 'shift' in view_requirements from a user-defined policy that inherits directly from Policy. * Added slightly modifed version of Sven's suggestion. Like this any user-defined attributes of the ViewRequirement of the state get conserved. * I saw that the code in _update_model_view_requirements_from_init_state() had changed and is not identical to my locally installed version. In the new version view_requirements from the model and the policy get united and therefore a loop runs through this unified list. Code should run now in the present version * Apply suggestions from code review	2021-08-17 11:49:24 +02:00
Sven Mika	f3bbe4ea44	[RLlib] Test cases/BUILD cleanup; split "everything else" (longest running one rn) tests in 2. (#17640 )	2021-08-16 22:01:01 +02:00

272 commits