hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	93120e0347	Unity3D API Fixes (recent changes in Unity's MLAgents API caused errors on RLlib side). (#10285)	2020-08-26 14:16:08 +02:00
Michael Luo	4e9888ce2f	[RLlib] Dreamer (#10172)	2020-08-26 13:24:05 +02:00
Olli Huotari	0dae50b5eb	Fixed num_atoms>1 in pytorch (#10330)	2020-08-25 23:10:20 -07:00
Eric Liang	deea1861ab	[rllib] Try fixing torch GPU and masking errors (#10168)	2020-08-25 18:34:19 -07:00
Benjamin Black	2689fb439c	Fixed pettingzoo env example (#9973)	2020-08-25 13:22:25 +02:00
raoul-khour-ts	c8c4832794	Prevent Local Worker creation from blocking remote worker creation by creating remote workers before local worker (#10245) * create remote workers before local worker * reformatted	2020-08-24 12:29:55 -07:00
Michael Luo	48a39d7cb9	[RLlib] Deepmind Control Suite Examples (#9751)	2020-08-23 12:53:08 +02:00
krfricke	c31876002d	[tune/rllib] made wandb compatible with rllib trainables (#10252)	2020-08-21 17:25:52 -07:00
Sven Mika	e968b52cb7	[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards (#9950)	2020-08-21 12:35:16 +02:00
Sven Mika	d14b501692	[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115)	2020-08-20 17:05:57 +02:00
Raphael Avalos	8b704eb419	Small fix for Cuda Torch DQN. (#10177)	2020-08-19 13:28:05 -07:00
Sven Mika	2cbe29a7fa	[RLlib] Curiosity minor fixes, do-overs, and testing. (#10143)	2020-08-19 17:49:50 +02:00
Tomasz Wrona	aff7f19360	[tune] Added logger_config field (#8521) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-08-18 11:10:22 -07:00
Eric Liang	ca133e2699	[rllib] Remove extra model config kwargs passed incorrectly for Torch models (#10055)	2020-08-17 11:12:20 -07:00
Julius Frost	dc659ae89a	make action probabilities a numpy array (#10122)	2020-08-16 11:25:12 -07:00
Olli Huotari	9ff599cbb8	torch policy now includes model.metrics (#10121) * torch policy now includes model.metrics * Fixed tests to work with custom metrics * Forgot to run format.sh	2020-08-15 10:43:11 -07:00
Sven Mika	aeb5be7733	[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. (#10112)	2020-08-15 15:09:00 +02:00
Sven Mika	2256047876	[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114)	2020-08-15 13:24:22 +02:00
Chua Cheow Huan	ea51e94729	[rllib] Learning rate schedule for DDPPO. (#10006) * Get shared metrics, increment counter & set global vars for remote workers. * Add unit test to test lr_schedule for DDPPO. * Broadcast the local set of global vars to remote workers instead of independently setting the global vars on each rollout worker.	2020-08-15 00:51:45 -07:00
Tanay Wakhare	1826b29757	[RLlib] Curiosity (intrinsic motivation) Exploration module. (#9912)	2020-08-13 20:14:16 +02:00
Sven Mika	66d204e078	[RLlib] Model documentation enhancements. (#10011)	2020-08-13 13:36:40 +02:00
Sven Mika	0effcda3e4	Add missing int-casts for all shape calculating code (using np.product([some shape])). (#10092)	2020-08-13 12:04:22 +02:00
Julius Frost	6d9d2b320a	[RLlib] Support windows drives other than C drive for the offline json API (#9909)	2020-08-13 11:57:54 +02:00
yncxcw	32cd94b750	[Core] Do not convert gpu id to int (#9744) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-08-11 12:09:46 -07:00
Sven Mika	4b10bdf8fc	[RLlib] rollout.py - Add multi-agent test case. (#9981)	2020-08-10 19:44:23 +02:00
Barak Michener	8e76796fd0	ci: Redo `format.sh --all` script & backfill lint fixes (#9956)	2020-08-07 16:49:49 -07:00
Sven Mika	5d5643e633	[RLlib] Add informative error message when bad Conv2D stack is used with fixed `num_outputs` (no flattening at end). (#9966)	2020-08-07 12:04:17 +02:00
Eric Liang	668f555755	[rllib] Clean up outdated docs #9915	2020-08-06 18:29:04 -07:00
Sven Mika	57690a3a9f	[RLlib] Trajectory view API - 02 actual API scaffold (#9753)	2020-08-06 10:54:20 +02:00
Sven Mika	19d785b947	[LINT] Except RLlib from checking for flake8 error F821 (#9946)	2020-08-06 10:44:37 +02:00
Sven Mika	9b90f7db67	[RLlib] Missing type annotations policy templates. (#9846)	2020-08-06 05:33:24 +02:00
Michael Luo	4d7bd8c892	[RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409)	2020-08-02 18:12:09 +02:00
Sven Mika	e540e425e4	[RLlib] `rllib rollout` test and bug fixes. (#9779)	2020-07-30 16:17:03 +02:00
Sven Mika	f6bd12eb18	[RLlib] Add tensor-based tests for Schedules and fix some bugs related to using Schedules with tensor time input. (#9782)	2020-07-30 12:49:32 +02:00
Miguel Morales	372114b4ed	Update sampler.py (#9805) Minor fix for warning string	2020-07-29 22:58:35 -07:00
Sven Mika	b0b0463161	[RLlib] Trajectory View API (preparatory cleanup and enhancements). (#9678)	2020-07-29 21:15:09 +02:00
Sven Mika	ff9c1dac88	[RLlib] Issue 9667 DDPG Torch bugs and enhancements. (#9680)	2020-07-28 14:15:03 +02:00
Sven Mika	e6ea33a03c	[RLlib] Enhance reward clipping test; add action_clipping tests. (#9684)	2020-07-28 10:44:54 +02:00
Michael Luo	b51ab2af66	[RLlib] Offline Type Annotations (#9676) * Offline Annotations * Modifications * Fixed circular dependencies * Linter fix	2020-07-27 14:01:17 -07:00
Sven Mika	5dc4b6686e	[RLlib] Implement DQN PyTorch distributional head. (#9589)	2020-07-25 09:29:24 +02:00
Petros Christodoulou	46c64c90d0	fixed simplex initialisation seeding bug (#9660) Co-authored-by: Petros Christodoulou <petrochr@amazon.com>	2020-07-24 14:22:41 -07:00
Sven Mika	e4c5d3526f	Issue 9631: Tf1.14 does not have tf.config.list_physical_devices. (#9681)	2020-07-24 21:48:58 +02:00
Eric Liang	590943a499	[rllib] Type annotations for model classes (#9646)	2020-07-24 12:01:46 -07:00
Eric Liang	5acd3e66dd	[rllib] Fix torch TD error, IMPALA LR updates (#9477) * update * add test * lint * fix super call * speed es test up	2020-07-23 12:50:25 -07:00
Raphael Avalos	5303c3abe3	Fix TorchDeterministic (#9241)	2020-07-23 10:43:20 -07:00
Sven Mika	75592e664f	Issue 9568: `rllib train` framework in config gets overridden with tf. (#9572)	2020-07-21 22:02:24 +02:00
Raphael Avalos	440c9c42be	[RLlib] Fix combination of lockstep and multiple agnts controlled by the same policy. (#9521) * Change aggregation when lockstep is activated. Modification of MultiAgentBatch.timeslices to support the combination of lockstep and multiple agents controlled by the same policy. fix ray-project/ray#9295 * Line too long.	2020-07-19 23:03:12 -07:00
Sven Mika	887cf5eca7	MADDPG learning confirmation test. (#9538)	2020-07-17 20:18:02 +02:00
Sven Mika	78dfed2683	[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)	2020-07-17 12:14:34 +02:00
Sven Mika	8204717eed	[RLlib] Issue 9218: PyTorch Policy places Model on GPU even with num_gpus=0 (#9516)	2020-07-17 05:53:25 +02:00

380 commits