hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Sven Mika	e7557ae433	[RLlib] Issue 13132: DQN does not update target net after restore (#14838)	2021-03-23 08:30:37 +01:00
Sven Mika	ee4b6e7e3b	[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569)	2021-03-12 18:27:25 +01:00
Sven Mika	52c94b7ee9	[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)	2021-02-02 13:05:58 +01:00
Sven Mika	4bc257f4fb	[RLlib] Fix custom multi action distr (#13681)	2021-01-28 19:28:48 +01:00
Jan Blumenkamp	964689b280	[RLlib] Fix bug in ModelCatalog when using custom action distribution (#12846) * return tuple returned from _get_multi_action_distribution when using custom action dict * Always return dst_class and required_model_output_shape in _get_multi_action_distribution * pass model config to _get_multi_action_distribution	2021-01-25 12:42:39 +01:00
Sven Mika	56878221ed	[RLlib] Redo: Make TFModelV2 fully modular like TorchModelV2 (soft-deprecate register_variables, unify var names wrt torch). (#13363)	2021-01-14 14:44:33 +01:00
Sven Mika	d49c3fae0b	[RLlib] Trajectory View API: Atari framestacking. (#13315)	2021-01-13 08:53:34 +01:00
Kai Fricke	25f10a947a	Revert "[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)" (#13361) This reverts commit `e2b2abb88b`.	2021-01-12 12:33:57 +01:00
Sven Mika	e2b2abb88b	[RLlib] Make TFModelV2 behave more like TorchModelV2: Obsolete register_variables. Unify variable dicts. (#13339)	2021-01-11 22:42:30 +01:00
Sven Mika	5d50d37f45	[RLlib] Issue 13330: No TF installed causes crash in `ModelCatalog.get_action_shape()` (#13332)	2021-01-11 13:19:46 +01:00
Sven Mika	6f342a2221	[RLlib] Preparatory PR for: Documentation on Model Building. (#13260)	2021-01-08 10:56:09 +01:00
Sven Mika	9eba1871bb	[RLlib] Support easy `use_attention=True` flag for using the GTrXL model. (#11698)	2021-01-01 14:06:23 -05:00
Sven Mika	c524f86785	[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)	2020-12-27 09:46:03 -05:00
Sven Mika	3f4bc16276	[RLlib] Add a minimal JAX ModelV2 (FCNet) to RLlib. (#12502)	2020-12-03 15:51:30 +01:00
Sven Mika	592c161032	[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397) * WIP. * Fix and LINT.	2020-11-25 11:27:46 -08:00
Michael Luo	b2984d1c34	[RLlib] Model Annotations to Torch Models (#9749)	2020-11-12 12:16:12 +01:00
Sven Mika	d9f1874e34	[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609)	2020-10-27 10:00:24 +01:00
Sven Mika	1ebcdf236f	[RLlib] Add support for custom MultiActionDistributions. (#11311)	2020-10-12 13:50:43 -07:00
Sumanth Ratna	14d8826e43	Fix overriden typo (#11227)	2020-10-07 19:11:07 -07:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033)	2020-10-06 20:28:16 +02:00
Sven Mika	c17169dc11	[RLlib] Fix all example scripts to run on GPUs. (#11105)	2020-10-02 23:07:44 +02:00
internetcoffeephone	840fb5543b	Change get_action_shape so that it uses the dtype of the Discrete object, rather than overwriting it with tf.int64. (#8424)	2020-09-21 17:08:31 -07:00
Sven Mika	28ab797cf5	[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). (#10544)	2020-09-06 10:58:00 +02:00
Sven Mika	e968b52cb7	[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)	2020-08-21 12:35:16 +02:00
Sven Mika	2cbe29a7fa	[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143)	2020-08-19 17:49:50 +02:00
Eric Liang	ca133e2699	[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055)	2020-08-17 11:12:20 -07:00
Sven Mika	2256047876	[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114)	2020-08-15 13:24:22 +02:00
Eric Liang	590943a499	[rllib] Type annotations for model classes (#9646)	2020-07-24 12:01:46 -07:00
Sven Mika	fcdf410ae1	[RLlib] Tf2.x native. (#8752)	2020-07-11 22:06:35 +02:00
Sven Mika	5b2a97597b	[RLlib] Retire `try_import_tree` (should be installed along with other requirements). (#9211) - Retire try_import_tree. - Stabilize test_supported_multi_agent.py.	2020-07-02 13:06:34 +02:00
Sven Mika	43043ee4d5	[RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136) * WIP. * Fixes. * LINT. * WIP. * WIP. * Fixes. * Fixes. * Fixes. * Fixes. * WIP. * Fixes. * Test * Fix. * Fixes and LINT. * Fixes and LINT. * LINT.	2020-06-30 10:13:20 +02:00
Sven Mika	4fd8977eaf	[RLlib] Minor cleanup in preparation to tf2.x support. (#9130) * WIP. * Fixes. * LINT. * Fixes. * Fixes and LINT. * WIP.	2020-06-25 19:01:32 +02:00
Sven Mika	14405b90d5	[RLlib] Prototype of a DynaTrainer (for env dynamics learning in upcoming MBMPO algo). (#8860)	2020-06-16 09:01:20 +02:00
Sven Mika	0ba7472da9	[Testing] Fix LINT/sphinx errors. (#8874)	2020-06-10 15:41:59 +02:00
Eric Liang	831b2fe51d	[rllib] Set framework to tf by default and remove import checks; "Auto" option (#8748) * tf by default * Update rllib/agents/trainer.py Co-authored-by: Sven Mika <sven@anyscale.io> * remove it * fix * remove * fix * lint Co-authored-by: Sven Mika <sven@anyscale.io>	2020-06-08 23:04:50 -07:00
Sven Mika	c74dc58f8b	[RLlib] Fix `use_lstm` flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)	2020-06-05 15:40:30 +02:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)	2020-05-27 16:19:13 +02:00
Sven Mika	6d196197bc	[RLlib] utils/spaces ... (#8608)	2020-05-27 10:21:30 +02:00
Sven Mika	0422e9c5a8	[RLlib] Add 2 Transformer learning test cases on StatelessCartPole (PPO and IMPALA). (#8624)	2020-05-27 10:19:47 +02:00
Sven Mika	796a834c48	[RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371)	2020-05-18 17:26:40 +02:00
Eric Liang	7ce138a6dc	[rllib] Support free_log_std in ModelV2 (#8380) * update * factor * update * fix test failures * fix torch net	2020-05-12 10:14:05 -07:00
Sven Mika	5f278c6411	[RLlib] Examples folder restructuring (models) part 1 (#8353)	2020-05-08 08:20:18 +02:00
Sven Mika	166bb5d690	[RLlib] IMPALA PyTorch (#8287) This PR adds an IMPALA PyTorch implementation. - adds compilation tests for LSTM and w/o LSTM. - adds learning test for CartPole.	2020-05-03 13:44:25 +02:00
Sven Mika	bf25aee392	[RLlib] Deprecate all Model(v1) usage. (#8146) Deprecate all Model(v1) usage.	2020-04-29 12:12:59 +02:00
Sven Mika	1775e89f26	[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143) Deprecate TupleActions and support arbitrarily nested action spaces. Closes issue #8143.	2020-04-28 14:59:16 +02:00
Sven Mika	3812bfedda	[RLlib] PyTorch version of ES (Evolution Strategies). (#8104) PyTorch version of Evolution Strategies (ES) Algo.	2020-04-20 21:47:28 +02:00
Sven Mika	428516056a	[RLlib] SAC Torch (incl. Atari learning) (#7984) * Policy-classes cleanup and torch/tf unification. - Make Policy abstract. - Add `action_dist` to call to `extra_action_out_fn` (necessary for PPO torch). - Move some methods and vars to base Policy (from TFPolicy): num_state_tensors, ACTION_PROB, ACTION_LOGP and some more. * Fix `clip_action` import from Policy (should probably be moved into utils altogether). * - Move `is_recurrent()` and `num_state_tensors()` into TFPolicy (from DynamicTFPolicy). - Add config to all Policy c'tor calls (as 3rd arg after obs and action spaces). * Add `config` to c'tor call to TFPolicy. * Add missing `config` to c'tor call to TFPolicy in marvil_policy.py. * Fix test_rollout_worker.py::MockPolicy and BadPolicy classes (Policy base class is now abstract). * Fix LINT errors in Policy classes. * Implement StatefulPolicy abstract methods in test cases: test_multi_agent_env.py. * policy.py LINT errors. * Create a simple TestPolicy to sub-class from when testing Policies (reduces code in some test cases). * policy.py - Remove abstractmethod from `apply_gradients` and `compute_gradients` (these are not required iff `learn_on_batch` implemented). - Fix docstring of `num_state_tensors`. * Make QMIX torch Policy a child of TorchPolicy (instead of Policy). * QMixPolicy add empty implementations of abstract Policy methods. * Store Policy's config in self.config in base Policy c'tor. * - Make only compute_actions in base Policy's an abstractmethod and provide pass implementation to all other methods if not defined. - Fix state_batches=None (most Policies don't have internal states). * Cartpole tf learning. * Cartpole tf AND torch learning (in ~ same ts). * Cartpole tf AND torch learning (in ~ same ts). 2 * Cartpole tf (torch syntax-broken) learning (in ~ same ts). 3 * Cartpole tf AND torch learning (in ~ same ts). 4 * Cartpole tf AND torch learning (in ~ same ts). 5 * Cartpole tf AND torch learning (in ~ same ts). 6 * Cartpole tf AND torch learning (in ~ same ts). Pendulum tf learning. * WIP. * WIP. * SAC torch learning Pendulum. * WIP. * SAC torch and tf learning Pendulum and Cartpole after cleanup. * WIP. * LINT. * LINT. * SAC: Move policy.target_model to policy.device as well. * Fixes and cleanup. * Fix data-format of tf keras Conv2d layers (broken for some tf-versions which have data_format="channels_first" as default). * Fixes and LINT. * Fixes and LINT. * Fix and LINT. * WIP. * Test fixes and LINT. * Fixes and LINT. Co-authored-by: Sven Mika <sven@Svens-MacBook-Pro.local>	2020-04-15 13:25:16 +02:00
Sven Mika	22ccc43670	[RLlib] DQN torch version. (#7597) * Fix. * Rollback. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * Fix. * Fix. * Fix. * Fix. * Fix. * WIP. * WIP. * Fix. * Test case fixes. * Test case fixes and LINT. * Test case fixes and LINT. * Rollback. * WIP. * WIP. * Test case fixes. * Fix. * Fix. * Fix. * Add regression test for DQN w/ param noise. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Fixes and LINT. * Comment * Regression test case. * WIP. * WIP. * LINT. * LINT. * WIP. * Fix. * Fix. * Fix. * LINT. * Fix (SAC does currently not support eager). * Fix. * WIP. * LINT. * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * WIP. * Fix. * LINT. * LINT. * Fix and LINT. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Fix. * Fix and LINT. * Update rllib/utils/exploration/exploration.py * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Fixes. * WIP. * LINT. * Fixes and LINT. * LINT and fixes. * LINT. * Move action_dist back into torch extra_action_out_fn and LINT. * Working SimpleQ learning cartpole on both torch AND tf. * Working Rainbow learning cartpole on tf. * Working Rainbow learning cartpole on tf. * WIP. * LINT. * LINT. * Update docs and add torch to APEX test. * LINT. * Fix. * LINT. * Fix. * Fix. * Fix and docstrings. * Fix broken RLlib tests in master. * Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier). * Fix error_outputs option in BAZEL for RLlib regression tests. * Fix. * Tune param-noise tests. * LINT. * Fix. * Fix. * test * test * test * Fix. * Fix. * WIP. * WIP. * WIP. * WIP. * LINT. * WIP. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-04-06 11:56:16 -07:00
Sven Mika	4198db5038	Torch multicat support (7419)	2020-03-04 00:41:40 -08:00
Matthew Brulhardt	75f683eec6	[rllib] Fix error in shape calculation. (#7301)	2020-02-25 14:16:29 -08:00

64 commits