hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Jun Gong	68a9a33386	[RLlib] Retry agents -> algorithms. with proper doc changes this time. (#24797 )	2022-05-16 09:45:32 +02:00
Artur Niederfahrenhorst	b1bc435adc	[RLlib] Policy Server/Client metrics reporting fix (#24783 )	2022-05-15 17:25:25 +02:00
Steven Morad	6321c3a85c	[RLlib] Simple-Q TrainerConfig (#24583 )	2022-05-15 17:24:01 +02:00
Steven Morad	5c96e7223b	[RLlib] SimpleQ (minor cleanups) and DQN TrainerConfig objects. (#24584 )	2022-05-15 16:14:43 +02:00
Simon Mo	9f23affdc0	[Hotfix] Unbreak lint in master (#24794 )	2022-05-13 15:05:05 -07:00
Jun Gong	bc3a1d35cf	[RLlib] Introduce new policy base classes. (#24742 )	2022-05-13 21:48:30 +02:00
Sven Mika	8fe3fd8f7b	[RLlib] QMix TrainerConfig objects. (#24775 )	2022-05-13 18:50:28 +02:00
kourosh hakhamaneshi	ffcbb30552	[RLlib] Move from `agents` to `algorithms` - CQL, MARWIL, AlphaStar, MAML, Dreamer, MBMPO. (#24739 )	2022-05-13 18:43:36 +02:00
Steven Morad	ebe6ab0afc	[RLlib] Bandits use TrainerConfig objects. (#24687 )	2022-05-12 22:02:15 +02:00
Max Pumperla	6a6c58b5b4	[RLlib] Config objects for DDPG and SimpleQ. (#24339 )	2022-05-12 16:12:42 +02:00
Artur Niederfahrenhorst	95d4a83a87	[RLlib] R2D2 Replay Buffer API integration. (#24473 )	2022-05-10 20:36:14 +02:00
Sven Mika	44a51610c2	[RLlib] SlateQ config objects. (#24577 )	2022-05-10 20:07:18 +02:00
Sven Mika	f243895ebb	[RLlib] Dreamer ConfigObject class. (#24650 )	2022-05-10 16:19:42 +02:00
Sven Mika	6d94b2acbe	[RLlib] AlphaStar config objects. (#24576 )	2022-05-10 14:01:00 +02:00
Amog Kamsetty	b5b48f6cc7	[RLlib] Switch `Dreamer` to `training_iteration` API. (#24488 )	2022-05-10 08:37:34 +02:00
Artur Niederfahrenhorst	8d906f9bf8	[RLlib] SAC with new Replay Buffer API. (#24156 )	2022-05-09 14:33:02 +02:00
Artur Niederfahrenhorst	bd2fdf4752	[RLlib] Automate sequences in `timeslice_along_seq_lens_with_overlap()`. (#24561 )	2022-05-09 11:55:06 +02:00
Steven Morad	b76273357b	[RLlib] APEX-DQN replay buffer config validation fix. (#24588 )	2022-05-09 09:59:04 +02:00
kourosh hakhamaneshi	69055f556d	[RLlib] Move `agents.ars` to `algorithms.ars`. (#24516 )	2022-05-06 19:11:15 +02:00
Daewoo Lee	fee35444ab	[RLlib] Issue 24530: Fix `add_time_dimension` (#24531 ) Co-authored-by: Daewoo Lee <dwlee@rtst.co.kr>	2022-05-06 15:21:42 +02:00
kourosh hakhamaneshi	f48f1b252c	[RLlib] Moved `agents.es` to `algorithms.es` (#24511 )	2022-05-06 14:54:22 +02:00
Antoni Baum	c5e1851ab9	[Tune] Improve `JupyterNotebookReporter` (#24444 ) Improves Tune Jupyter notebook experience by modifying the `JupyterNotebookReporter` in two ways: * Previously, the `overwrite` flag controlled whether the entire cell would be overwritten with the updated table. This caused all the other logs to be cleared. Now, we use IPython display handle functionality to create a table at the top of the cell and update only that, preserving the rest of the output. The `overwrite` flag now controls whether the cell output prior to the initialization of `JupyterNotebookReporter` is overwritten or not. * The Ray Client detection was not working unless the user specifically passed a `JupyterNotebookReporter` as the `progress_reporter`. Now, the default value allows for correct detection of the environment while running Ray Client. Furthermore, the progress reporter detection logic in `rllib/train.py` has been replaced to make use of the `detect_reporter` function for consistency with Tune (the sign in the overwrite condition was similarly flipped).	2022-05-06 11:52:47 +01:00
Sven Mika	7ab19ddc32	[RLlib] MADDPG: Move into agents folder (from contrib) and use `training_iteration` method. (#24502 )	2022-05-06 12:35:21 +02:00
Sven Mika	f54557073e	[RLlib] Remove `execution_plan` API code no longer needed. (#24501 )	2022-05-06 12:29:53 +02:00
Sven Mika	f891a2b6f1	[RLlib] SlateQ + tf; release test fixes, related to TD-error not properly being formatted. (#24521 )	2022-05-06 08:50:30 +02:00
Avnish Narayan	f2bb6f6806	[RLlib] Impala training iteration fn (#23454 )	2022-05-05 16:11:08 +02:00
Christy Bergman	76eb47e226	[RLlib; docs] Rename UCB -> LinUCB. (#24348 )	2022-05-05 10:20:16 +02:00
Artur Niederfahrenhorst	86bc9ecce2	[RLlib] DDPG Training iteration fn & Replay Buffer API (#24212 )	2022-05-05 09:41:38 +02:00
Sven Mika	5b61a00792	[RLlib] Feed all values in COMMON_CONFIG directly from TrainerConfig() (removes duplicate values and comments). (#24433 )	2022-05-04 16:28:12 +02:00
Sven Mika	b48f63113b	[RLlib] SlateQ fixes: Release learning tests wrong yaml structure + TD-error torch issue (#24429 )	2022-05-04 13:37:14 +02:00
Sven Mika	1bc6419e0e	[RLlib] R2D2 training iteration fn AND switch off `execution_plan` API by default. (#24165 )	2022-05-03 07:59:26 +02:00
Sven Mika	7cca7782f1	[RLlib] OPE (off policy estimator) API. (#24384 )	2022-05-02 21:15:50 +02:00
Sven Mika	0c5ac3b9e8	[RLlib] Issue 24075: Better error message for Bandit MultiDiscrete (suggest using our wrapper). (#24385 )	2022-05-02 21:14:08 +02:00
Sven Mika	296e2ebc46	[RLlib] Issue 24082: WorkerSet.policies_to_train (deprecated) - if still used - returns wrong values. (#24386 )	2022-05-02 18:33:52 +02:00
Sven Mika	924adcf402	[RLlib] Issue 24074: multi-GPU learner thread key error in MA-scenarios. (#24382 )	2022-05-02 18:30:46 +02:00
Sven Mika	f53ca1cacb	[RLlib] ES + ARS TrainerConfig objects. (#24374 )	2022-05-02 16:55:28 +02:00
Edward Oakes	11954e6798	Issue 24143: Fix a few f-strings missing the f. (#24232 )	2022-05-02 16:11:33 +02:00
Sven Mika	026849cd27	[RLlib] APPO TrainerConfig objects. (#24376 )	2022-05-02 15:06:23 +02:00
Sven Mika	f066180ed5	[RLlib] Deprecate `timesteps_per_iteration` config key (in favor of `min_[sample|train]_timesteps_per_reporting`). (#24372 )	2022-05-02 12:51:14 +02:00
Sven Mika	950bd3fc3f	[RLlib] IMPALA TrainerConfig objects. (#24375 )	2022-05-02 12:05:30 +02:00
Jiajun Yao	cfc192ebc4	Collect library usage (#24312 ) Collect which libraries are used for usage stats purpose.	2022-04-30 07:51:01 -07:00
Sven Mika	b2b1c95aa5	[RLlib] A2/3C Config objects (A2CConfig and A3CConfig). (#24332 )	2022-04-30 09:51:09 +02:00
Sven Mika	3052193c9e	[RLlib] Fix CQL getting stuck when deprecated `timesteps_per_iteration` is used (use `min_train_timesteps_per_reporting` instead). (#24345 ) Fix CQL getting stuck when deprecated timesteps_per_iteration is used (use min_train_timesteps_per_reporting instead). CQL does not perform sampling timesteps and the deprecated timesteps_per_iteration is automatically translated into the new min_sample_timesteps_per_reporting, but should be translated (only for CQL and other purely offline RL algos) into min_train_timesteps_per_reporting. If timesteps_per_iteration is used, CQL never leaves the first iteration as it thinks it's not done yet (sample timesteps always remain at 0).	2022-04-29 21:02:34 +01:00
Kai Fricke	7a4d58d80f	[rllib] Fix doctest failure (#24343 ) Lint was still failing (but only caught with doctest): ``` File "../../python/ray/rllib/utils/numpy.py", line ?, in default Failed example: tree.traverse(make_action_immutable, d, top_down=False) Exception raised: Traceback (most recent call last): File "/opt/miniconda/lib/python3.6/doctest.py", line 1330, in __run compileflags, 1), test.globs) File "<doctest default[4]>", line 1, in <module> tree.traverse(make_action_immutable, d, top_down=False) NameError: name 'make_action_immutable' is not defined ```	2022-04-29 19:13:24 +01:00
Sven Mika	539832f2c5	[RLlib] SlateQ training iteration function. (#24151 )	2022-04-29 18:38:17 +02:00
Kai Fricke	242706922b	[rllib] Fix linting (#24335 ) #24262 broke linting. This fixes this.	2022-04-29 15:21:11 +01:00
Jun Gong	ec636dcb29	[RLlib] Do not print warning message during env pre-checking, if there is nothing wrong with user envs. (#24289 )	2022-04-29 10:41:19 +02:00
Xuehai Pan	377a522ce2	[RLlib] Fix time dimension shaping for PyTorch RNN models. (#21735 )	2022-04-29 10:39:03 +02:00
Pavel C	de0c6f6132	[RLlib] Fix `policy_map` always loading all policies from disk due to (not always needed) `global_vars` update. (#22010 )	2022-04-29 10:38:05 +02:00
Ishant Mrinal	0248c60387	[RLlib] Add additional return values to `action_sampler_fn`. (#22721 )	2022-04-29 10:34:48 +02:00

1187 commits