hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	f2cb2ed203	[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759 )	2021-10-27 19:14:39 +02:00
Sven Mika	ac3371a148	[RLlib] Discussion 3644: Fix bug for complex obs spaces containing `Box([2D shape])` and discrete component. (#18917 )	2021-09-30 16:39:38 +02:00
Sven Mika	ed85f59194	[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879 )	2021-09-30 16:39:05 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468 )	2021-09-23 12:56:45 +02:00
Sven Mika	698b4eeed3	[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669 )	2021-09-21 22:00:14 +02:00
Sven Mika	9883505e84	[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017 )	2021-08-24 21:55:27 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928 )	2021-08-21 17:05:48 +02:00
Sven Mika	5107d16ae5	[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530 )	2021-08-03 18:30:02 -04:00
Sven Mika	8a844ff840	[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch\|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444 )	2021-08-02 17:29:59 -04:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169 )	2021-07-20 14:58:13 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031 )	2021-07-19 13:16:03 -04:00
Sven Mika	1fd0eb805e	[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014 )	2021-07-13 14:01:30 -04:00
Kai Fricke	27d80c4c88	[RLlib] ONNX export for tensorflow (1.x) and torch (#16805 )	2021-07-13 12:38:11 -04:00
Amog Kamsetty	bc33dc7e96	Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`." (#17002 ) This reverts commit `7862dd64ea`.	2021-07-12 11:09:14 -07:00
Sven Mika	7862dd64ea	[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774 )	2021-07-08 17:31:34 +02:00
Sven Mika	7eb1a29426	[RLlib] Fix ModelV2 custom metrics for torch. (#16734 )	2021-07-01 13:01:40 +02:00
Sven Mika	d0014cd351	[RLlib] Policies get/set_state fixes and enhancements. (#16354 )	2021-06-15 13:08:43 +02:00
Sven Mika	e80095591c	[RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937 )	2021-05-20 18:15:10 +02:00
Amog Kamsetty	ebc44c3d76	[CI] Upgrade flake8 to 3.9.1 (#15527 ) * formatting * format util * format release * format rllib/agents * format rllib/env * format rllib/execution * format rllib/evaluation * format rllib/examples * format rllib/policy * format rllib utils and tests * format streaming * more formatting * update requirements files * fix rllib type checking * updates * update * fix circular import * Update python/ray/tests/test_runtime_env.py * noqa	2021-05-03 14:23:28 -07:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273 )	2021-04-30 19:26:30 +02:00
Sven Mika	78b776942f	[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538 )	2021-04-27 17:19:52 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684 )	2021-04-27 10:44:54 +02:00
Sven Mika	41968512ca	[RLlib] Partial GPU examples (for learner and workers). (#15334 )	2021-04-20 08:46:05 +02:00
Sven Mika	bbfa8ffec9	[RLlib] Minor release 1.3 warnings cleanups. (#15272 )	2021-04-14 14:03:15 +02:00
Sven Mika	9c5a0cfd7a	[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386 )	2021-04-11 18:20:04 +02:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065 )	2021-03-17 08:18:15 +01:00
Sven Mika	8000258333	[RLlib] R2D2 Implementation. (#13933 )	2021-02-25 12:18:11 +01:00
Sven Mika	81e7434091	[RLlib] TFPolicy.export_model: Add timestep placeholder to model's signature, if needed. (#13988 )	2021-02-10 15:21:46 +01:00
Sven Mika	eb0038612f	[RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584 )	2021-02-08 15:02:19 +01:00
Sven Mika	6f342a2221	[RLlib] Preparatory PR for: Documentation on Model Building. (#13260 )	2021-01-08 10:56:09 +01:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718 )	2020-12-30 17:32:21 -08:00
Sven Mika	b2bcab711d	[RLlib] Attention Nets: tf (#12753 )	2020-12-20 20:22:32 -05:00
Sven Mika	74c98ac38e	[RLlib] Issue 12244: Unable to restore multi-agent PPOTFPolicy's Model (from exported). (#12786 )	2020-12-11 16:13:38 +01:00
Sven Mika	99c81c6795	[RLlib] Attention Net prep PR #3 . (#12450 )	2020-12-07 13:08:17 +01:00
Sven Mika	9021f15b2a	[RLlib] Fix setup-dev.py error when creating a softlink for new_dashboard. (#12442 )	2020-12-01 11:46:59 +01:00
Sven Mika	6da4342822	[RLlib] Add on_learn_on_batch (Policy) callback to DefaultCallbacks. (#12070 )	2020-11-18 15:39:23 +01:00
Sven Mika	62c7ab5182	[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747 )	2020-11-12 16:27:34 +01:00
Sven Mika	d9f1874e34	[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609 )	2020-10-27 10:00:24 +01:00
Sven Mika	8ea1bc5ff9	[RLlib] Allow for more than 2^31 policy timesteps. (#11301 )	2020-10-12 13:49:11 -07:00
Sven Mika	ce96b03b07	[RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033 )	2020-10-06 20:28:16 +02:00
Sven Mika	2256047876	[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114 )	2020-08-15 13:24:22 +02:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956 )	2020-08-07 16:49:49 -07:00
Sven Mika	fcdf410ae1	[RLlib] Tf2.x native. (#8752 )	2020-07-11 22:06:35 +02:00
Sven Mika	f43d934817	[RLlib] Type annotations for policy. (#9248 )	2020-07-05 13:09:51 +02:00
Sven Mika	43043ee4d5	[RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136 ) * WIP. * Fixes. * LINT. * WIP. * WIP. * Fixes. * Fixes. * Fixes. * Fixes. * WIP. * Fixes. * Test * Fix. * Fixes and LINT. * Fixes and LINT. * LINT.	2020-06-30 10:13:20 +02:00
Sven Mika	25c0974543	[RLlib] Issue 8412 (Adam vars not stored in ModelV2). (#8480 )	2020-06-05 21:07:02 +02:00
Sven Mika	6c2b9a4cfa	[RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304 ) Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304)	2020-05-04 23:53:38 +02:00
Sven Mika	5537fe13b0	[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. (#7814 )	2020-04-03 10:44:25 -07:00
Sven Mika	e153e3179f	[RLlib] Exploration API: Policy changes needed for forward pass noisifications. (#7798 ) * Rollback. * WIP. * WIP. * LINT. * WIP. * Fix. * Fix. * Fix. * LINT. * Fix (SAC does currently not support eager). * Fix. * WIP. * LINT. * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/evaluation/sampler.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/utils/exploration/exploration.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * WIP. * WIP. * Fix. * LINT. * LINT. * Fix and LINT. * WIP. * WIP. * WIP. * WIP. * Fix. * LINT. * Fix. * Fix and LINT. * Update rllib/utils/exploration/exploration.py * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Update rllib/policy/dynamic_tf_policy.py Co-Authored-By: Eric Liang <ekhliang@gmail.com> * Fixes. * LINT. * WIP. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-04-01 00:43:21 -07:00
Sven Mika	66df8b8c35	[RLlib] Working/learning example: PPO + torch + LSTM. (#7797 )	2020-03-31 22:00:28 -07:00

1 2

70 commits