hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Eric Liang	4963dfaae0	[api] Add API stability annotations for all RLlib symbols and add to LINT (#25060)	2022-05-24 22:14:25 -07:00
Ishant Mrinal	0248c60387	[RLlib] Add additional return values to `action_sampler_fn`. (#22721)	2022-04-29 10:34:48 +02:00
Sven Mika	b1cda46681	[RLlib] SlateQ (tf GPU + multi-GPU) + Bandit fixes (#23276)	2022-03-18 13:45:16 +01:00
Siyuan (Ryans) Zhuang	0c74ecad12	[Lint] Cleanup incorrectly formatted strings (Part 1: RLLib). (#23128)	2022-03-15 17:34:21 +01:00
Balaji Veeramani	7f1bacc7dc	[CI] Format Python code with Black (#21975) See #21316 and #21311 for the motivation behind these changes.	2022-01-29 18:41:57 -08:00
Sven Mika	596c8e2772	[RLlib] Experimental no-flatten option for actions/prev-actions. (#20918)	2021-12-11 14:57:58 +01:00
Ishant Mrinal	2868d1a2cf	[RLlib] Support for RE3 exploration algorithm (for tf) (#19551)	2021-12-07 13:26:34 +01:00
Sven Mika	6ff4061f3a	[RLlib] Issue 20269: Offline RL example not working due to new_obs not being written to file. (#20366) * wip. * Apply suggestions from code review	2021-11-15 16:41:08 +01:00
Sven Mika	a931076f59	[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981)	2021-11-05 16:10:00 +01:00
Sven Mika	0b308719f8	[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829)	2021-11-01 21:46:02 +01:00
Sven Mika	f2cb2ed203	[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759)	2021-10-27 19:14:39 +02:00
Sven Mika	b213565783	[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693)	2021-10-25 15:00:00 +02:00
Sven Mika	61a1274619	[RLlib] No Preprocessors (part 2). (#18468)	2021-09-23 12:56:45 +02:00
Sven Mika	698b4eeed3	[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)	2021-09-21 22:00:14 +02:00
Sven Mika	9883505e84	[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017)	2021-08-24 21:55:27 +02:00
Sven Mika	494ddd98c1	[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)	2021-08-21 17:05:48 +02:00
Sven Mika	a428f10ebe	[RLlib] Add multi-GPU learning tests to nightly. (#17778)	2021-08-18 17:21:01 +02:00
Sven Mika	924f11cd45	[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371)	2021-08-03 11:35:49 -04:00
Sven Mika	90b21ce27e	[RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321)	2021-07-27 14:39:06 -04:00
Chris Bamford	29768a7c01	[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856)	2021-07-25 14:25:07 -04:00
Sven Mika	5a313ba3d6	[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)	2021-07-20 14:58:13 -04:00
Sven Mika	18d173b172	[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)	2021-07-19 13:16:03 -04:00
Sven Mika	be6db06485	[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)	2021-06-21 13:46:01 +02:00
Sven Mika	e973b726c2	[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)	2021-04-30 19:26:30 +02:00
Sven Mika	bb8a286cbc	[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)	2021-04-27 10:44:54 +02:00
Sven Mika	9c5a0cfd7a	[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386)	2021-04-11 18:20:04 +02:00
Sven Mika	4f66309e19	[RLlib] Redo issue 14533 tf enable eager exec (#14984)	2021-03-29 20:07:44 +02:00
SangBin Cho	fa5f961d5e	Revert "[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737)" (#14918) This reverts commit `3e389d5812`.	2021-03-25 00:42:01 -07:00
Sven Mika	3e389d5812	[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737)	2021-03-24 12:54:27 +01:00
Sven Mika	04bc0a9828	[RLlib] Remove all non-trajectory view API code. (#14860)	2021-03-23 09:50:18 -07:00
Sven Mika	69202c6a7d	[RLlib] Obsolete usage tracking dict via sample batch. (#13065)	2021-03-17 08:18:15 +01:00
Sven Mika	ee4b6e7e3b	[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569)	2021-03-12 18:27:25 +01:00
Sven Mika	732197e23a	[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393)	2021-03-08 15:41:27 +01:00
Sven Mika	8000258333	[RLlib] R2D2 Implementation. (#13933)	2021-02-25 12:18:11 +01:00
Sven Mika	4db86404ad	[RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037)	2021-02-11 18:58:46 +01:00
Sven Mika	d7301a51f4	[RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. (#13555)	2021-02-09 17:05:26 +01:00
Sven Mika	391cdfae8c	[RLlib] Trajectory view API docs. (#12718)	2020-12-30 17:32:21 -08:00
Sven Mika	b2bcab711d	[RLlib] Attention Nets: tf (#12753)	2020-12-20 20:22:32 -05:00
Sven Mika	74c98ac38e	[RLlib] Issue 12244: Unable to restore multi-agent PPOTFPolicy's Model (from exported). (#12786)	2020-12-11 16:13:38 +01:00
Sven Mika	99c81c6795	[RLlib] Attention Net prep PR #3. (#12450)	2020-12-07 13:08:17 +01:00
Sven Mika	19c8033df2	[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366) * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * WIP. * LINT and fixes. MB-MPO and MAML not working yet. * wip * update * update * rmeove * remove dep * higher * Update requirements_rllib.txt * Update requirements_rllib.txt * relpos * no mbmpo Co-authored-by: Eric Liang <ekhliang@gmail.com>	2020-12-01 17:41:10 -08:00
Sven Mika	b6b54f1c81	[RLlib] Trajectory view API: enable by default for SAC, DDPG, DQN, SimpleQ (#11827)	2020-11-16 10:54:35 -08:00
Sven Mika	62c7ab5182	[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)	2020-11-12 16:27:34 +01:00
Sven Mika	5b788ccb13	[RLlib] Trajectory view API (prep PR for switching on by default across all RLlib; plumbing only) (#11717)	2020-11-03 12:53:34 -08:00
Sven Mika	8ea1bc5ff9	[RLlib] Allow for more than 2^31 policy timesteps. (#11301)	2020-10-12 13:49:11 -07:00
Sven Mika	805dad3bc4	[RLlib] SAC algo cleanup. (#10825)	2020-09-20 11:27:02 +02:00
Eric Liang	ca133e2699	[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055)	2020-08-17 11:12:20 -07:00
Sven Mika	2256047876	[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114)	2020-08-15 13:24:22 +02:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956)	2020-08-07 16:49:49 -07:00
Sven Mika	9b90f7db67	[RLlib] Missing type annotations policy templates. (#9846)	2020-08-06 05:33:24 +02:00

73 commits