hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Sven Mika	3a234ed9e3	[RLlib] Error: "Unknown trainable [some rllib algo name]" (#8525 )	2020-05-21 08:59:32 +02:00
Sven Mika	d76578700d	[RLlib] `Policy.compute_single_action()` broken for nested actions (Issue 8411). (#8514 )	2020-05-20 22:29:08 +02:00
mehrdadn	ebf060d484	Make more tests run on Windows (#8446 ) * Remove worker Wait() call due to SIGCHLD being ignored * Port _pid_alive to Windows * Show PID as well as TID in glog * Update TensorFlow version for Python 3.8 on Windows * Handle missing Pillow on Windows * Work around dm-tree PermissionError on Windows * Fix some lint errors on Windows with Python 3.8 * Simplify torch requirements * Quiet git clean * Handle finalizer issues * Exit with the signal number * Get rid of wget * Fix some Windows compatibility issues with tests Co-authored-by: Mehrdad <noreply@github.com>	2020-05-20 12:25:04 -07:00
Eric Liang	aa7a58e92f	[rllib] Support training intensity for dqn / apex (#8396 )	2020-05-20 11:22:30 -07:00
Sven Mika	796a834c48	[RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371 )	2020-05-18 17:26:40 +02:00
Dennis van der Hoff	be1f158747	Added Done to MultiAgentExternalEnv. (#8478 ) Co-authored-by: devanderhoff <devanderhoff@hotmail.com>	2020-05-17 16:29:47 -07:00
Edward Oakes	16f48078d9	Remove use of ObjectID transport flag (#7699 )	2020-05-17 11:29:49 -05:00
Sven Mika	c9435cad43	WIP. (#8456 ) Fix multi-GPU histogram metrics for > 0D tensors.	2020-05-15 21:43:27 +02:00
Sven Mika	5f4c196fed	[RLlib] Make PyTorch Model forward pass faster in vf-case. (#8422 )	2020-05-14 10:15:50 +02:00
Eric Liang	6bf1dc0888	[rllib] [hotfix] Build broken due to merge conflict: MixInReplay has no attribute buffer	2020-05-13 12:21:04 -07:00
Eric Liang	96f4d82cc3	[rllib] Qmix replay ratio is wrong	2020-05-12 13:07:19 -07:00
Eric Liang	7ce138a6dc	[rllib] Support free_log_std in ModelV2 (#8380 ) * update * factor * update * fix test failures * fix torch net	2020-05-12 10:14:05 -07:00
Sven Mika	57544b1ff9	[RLlib] Examples folder restructuring (Model examples; final part). (#8278 ) - This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models. - It also makes sure, all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (this is often currently not checked) using a --as-test flag in connection with a --stop-reward.	2020-05-12 08:23:10 +02:00
Eric Liang	9d012626e5	[rllib] Distributed exec workflow for impala (#8321 )	2020-05-11 20:24:43 -07:00
Sven Mika	c7cb2f5416	[RLlib] IMPALA PyTorch GPU fixes (#8397 )	2020-05-11 22:03:27 +02:00
A Kharitonov	304e31b7e5	Fixed: contrib/MADDPG MADDPGTFPolicy missing self.config assignment (#8343 )	2020-05-08 12:05:06 -07:00
Sven Mika	754290daad	[RLlib] Add light-weight `Trainer.compute_action()` tests for all Algos. (#8356 )	2020-05-08 16:31:31 +02:00
Sven Mika	d946f58fd0	LINT fixes. (#8370 )	2020-05-08 16:24:20 +02:00
gehring	7f14fb577d	[RLlib] Added TransformerXL and "stabilized for RL" variant, GTrXL (#6470 )	2020-05-08 14:10:23 +02:00
Eric Liang	2c599dbf05	[rllib] Port QMIX, MADDPG to new execution API (#8344 )	2020-05-07 23:41:10 -07:00
Eric Liang	9f04a65922	[rllib] Add PPO+DQN two trainer multiagent workflow example (#8334 )	2020-05-07 23:40:29 -07:00
Sven Mika	d7eaacb5fe	[RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. (#8324 )	2020-05-08 08:26:32 +02:00
Sven Mika	5f278c6411	[RLlib] Examples folder restructuring (models) part 1 (#8353 )	2020-05-08 08:20:18 +02:00
Eric Liang	30db920787	[rllib] Fix centralized critic example to use right policy (#8341 ) * update * update	2020-05-07 10:47:55 -07:00
Eric Liang	b14cc16616	[rllib] Enable functional execution workflow API by default (#8221 )	2020-05-05 12:36:42 -07:00
Eric Liang	ee0eb44a32	Rename async_queue_depth -> num_async (#8207 ) * rename * lint	2020-05-05 01:38:10 -07:00
Eric Liang	f48da50e1c	[rllib] observation function api for multi-agent (#8236 )	2020-05-04 22:13:49 -07:00
Sven Mika	6c2b9a4cfa	[RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304)	2020-05-04 23:53:38 +02:00
Sven Mika	a00144f746	[RLlib] Fix issue 8135 (DDPG inf actions when using [-inf,inf] action space). (#8302 )	2020-05-04 22:27:30 +02:00
Sven Mika	b95e28faea	[RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288)	2020-05-04 09:36:27 +02:00
Sven Mika	166bb5d690	[RLlib] IMPALA PyTorch (#8287 ) This PR adds an IMPALA PyTorch implementation. - adds compilation tests for LSTM and w/o LSTM. - adds learning test for CartPole.	2020-05-03 13:44:25 +02:00
Sven Mika	76e1a4df9e	Fix TD3 torch via GaussianNoise torch bug. (#8276 )	2020-05-02 08:12:21 +02:00
Sven Mika	42991d723f	[RLlib] rllib/examples folder restructuring (#8250) Cleans up the rllib/examples folder by moving all example Envs into rllib/examples/env (so they can be used by other scripts and tests as well).	2020-05-01 22:59:34 +02:00
Eric Liang	2a0ad0b8ce	[rllib] [hotfix] Remove assert that trips on pytorch multiagent (#8241 )	2020-05-01 06:32:54 +02:00
Sven Mika	c593fb09b7	[RLlib] Remove all f-strings to keep py3.5 compatibility.	2020-04-30 11:10:16 -07:00
Sven Mika	eea75ac623	[RLlib] Beta distribution. (#8229 )	2020-04-30 11:09:33 -07:00
Sven Mika	b23b6addfc	[RLlib] Stabilize Pendulum-v0 regression test cases. (#8232)	2020-04-30 15:48:11 +02:00
Eric Liang	baadbdf8d4	[rllib] Execute PPO using training workflow (#8206 ) * wip * add kl * kl * works now * doc update * reorg * add ddppo * add stats * fix fetch * comment * fix learner stat regression * test fixes * fix test	2020-04-30 01:18:09 -07:00
Eric Liang	ae54e0dc0a	[rllib] Copy plasma memory before adding data to replay buffer	2020-04-29 14:17:54 -07:00
Sven Mika	bf25aee392	[RLlib] Deprecate all Model(v1) usage. (#8146)	2020-04-29 12:12:59 +02:00
Sven Mika	eb91619175	Fix release 0.8.5 tests for PPO torch Breakout. (#8226 )	2020-04-29 10:36:41 +02:00
Sven Mika	1775e89f26	[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143 ) Deprecate TupleActions and support arbitrarily nested action spaces. Closes issue #8143.	2020-04-28 14:59:16 +02:00
Sven Mika	4e713152e9	[RLlib] Fix for issue https://github.com/ray-project/ray/issues/8191 (#8200 ) Fix attribute error when missing exploration in Policy. Issue #8191	2020-04-27 23:19:26 +02:00
Sven Mika	7ec2223c84	[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188 ) Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).	2020-04-26 23:08:13 +02:00
Tomasz Wrona	b508166419	Copy initial state of an RNN to a CPU before converting it to a NumPy array (#8097 )	2020-04-25 18:49:09 -07:00
Eric Liang	2298f6fb40	[rllib] Port DQN/Ape-X to training workflow api (#8077 )	2020-04-23 12:39:19 -07:00
Sven Mika	499ad5fbe4	[RLlib] PyTorch version of APPO. (#8120 ) - Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases. - Add learning test cases for APPO torch (both w/ and w/o v-trace). - Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).	2020-04-23 09:11:12 +02:00
Sven Mika	e9ee5c4e5f	[RLlib] Nested action space PR (minimally invasive; torch only + test). (#8101 ) - Add TorchMultiActionDistribution class. - Add framework-agnostic test cases for TorchMultiActionDistribution.	2020-04-23 09:09:22 +02:00
Sven Mika	d15609ba2a	[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106 ) This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.	2020-04-21 09:47:52 +02:00
Eric Liang	17e3c545d9	[rllib] Fix truncate episodes mode in central critic example (#8073 )	2020-04-20 12:58:01 -07:00

356 commits