hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 19:41:38 -05:00

Author	SHA1	Message	Date
Eric Liang	831b2fe51d	[rllib] Set framework to tf by default and remove import checks; "Auto" option (#8748) * tf by default * Update rllib/agents/trainer.py Co-authored-by: Sven Mika <sven@anyscale.io> * remove it * fix * remove * fix * lint Co-authored-by: Sven Mika <sven@anyscale.io>	2020-06-08 23:04:50 -07:00
Sven Mika	25c0974543	[RLlib] Issue 8412 (Adam vars not stored in ModelV2). (#8480)	2020-06-05 21:07:02 +02:00
Sven Mika	c74dc58f8b	[RLlib] Fix `use_lstm` flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)	2020-06-05 15:40:30 +02:00
Sven Mika	97d524c075	[RLlib] Issue 8769 broken OOM tests_dir cases (R & S). (#8770)	2020-06-05 08:34:21 +02:00
Eric Liang	1e4a1360fd	[rllib] Add type annotations to Trainer class (#8642) * type trainer * type it * fxi	2020-06-03 12:47:35 -07:00
Sven Mika	b37a162076	[RLlib] Make envs specifiable in configs by their class path. (#8750)	2020-06-03 08:14:29 +02:00
Sven Mika	d8a081a185	[RLlib] Unity3D integration (n Unity3D clients vs learning server). (#8590)	2020-05-30 22:48:34 +02:00
Sven Mika	d483ed28ba	[RLlib] Fix broken tune tests in master due to framework=auto errors. (#8672)	2020-05-29 11:55:47 +02:00
Tomasz Wrona	f266318a01	[rllib] Do not store torch tensors when using grad clipping (#8509)	2020-05-28 12:06:27 -07:00
Sven Mika	2746fc0476	[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)	2020-05-27 16:19:13 +02:00
Sven Mika	c7a2e3f309	[RLlib] Removed config["sample_async"] restriction for A3C-torch. (#8617)	2020-05-27 10:22:49 +02:00
Sven Mika	6d196197bc	[RLlib] utils/spaces ... (#8608)	2020-05-27 10:21:30 +02:00
Sven Mika	baa053496a	[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)	2020-05-26 11:10:27 +02:00
Jan Blumenkamp	d6f78f58dc	Fix missing learning rate and entropy coeff schedule for torch PPO (#8572)	2020-05-23 10:54:18 -07:00
Sven Mika	8870270164	[RLlib] Add QMIX support for complex obs spaces (Issue 8523). (#8533)	2020-05-22 10:17:51 +02:00
Eric Liang	9a83908c46	[rllib] Deprecate policy optimizers (#8345)	2020-05-21 10:16:18 -07:00
Sven Mika	d76578700d	[RLlib] `Policy.compute_single_action()` broken for nested actions (Issue 8411). (#8514)	2020-05-20 22:29:08 +02:00
mehrdadn	ebf060d484	Make more tests run on Windows (#8446) * Remove worker Wait() call due to SIGCHLD being ignored * Port _pid_alive to Windows * Show PID as well as TID in glog * Update TensorFlow version for Python 3.8 on Windows * Handle missing Pillow on Windows * Work around dm-tree PermissionError on Windows * Fix some lint errors on Windows with Python 3.8 * Simplify torch requirements * Quiet git clean * Handle finalizer issues * Exit with the signal number * Get rid of wget * Fix some Windows compatibility issues with tests Co-authored-by: Mehrdad <noreply@github.com>	2020-05-20 12:25:04 -07:00
Eric Liang	aa7a58e92f	[rllib] Support training intensity for dqn / apex (#8396)	2020-05-20 11:22:30 -07:00
Sven Mika	796a834c48	[RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371)	2020-05-18 17:26:40 +02:00
Eric Liang	96f4d82cc3	[rllib] Qmix replay ratio is wrong	2020-05-12 13:07:19 -07:00
Eric Liang	7ce138a6dc	[rllib] Support free_log_std in ModelV2 (#8380) * update * factor * update * fix test failures * fix torch net	2020-05-12 10:14:05 -07:00
Sven Mika	57544b1ff9	[RLlib] Examples folder restructuring (Model examples; final part). (#8278) - This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models. - It also makes sure all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (this is often currently not checked) using a --as-test flag in connection with a --stop-reward.	2020-05-12 08:23:10 +02:00
Eric Liang	9d012626e5	[rllib] Distributed exec workflow for impala (#8321)	2020-05-11 20:24:43 -07:00
Sven Mika	c7cb2f5416	[RLlib] IMPALA PyTorch GPU fixes (#8397)	2020-05-11 22:03:27 +02:00
Sven Mika	754290daad	[RLlib] Add light-weight `Trainer.compute_action()` tests for all Algos. (#8356)	2020-05-08 16:31:31 +02:00
Eric Liang	2c599dbf05	[rllib] Port QMIX, MADDPG to new execution API (#8344)	2020-05-07 23:41:10 -07:00
Eric Liang	9f04a65922	[rllib] Add PPO+DQN two trainer multiagent workflow example (#8334)	2020-05-07 23:40:29 -07:00
Sven Mika	d7eaacb5fe	[RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. (#8324)	2020-05-08 08:26:32 +02:00
Sven Mika	5f278c6411	[RLlib] Examples folder restructuring (models) part 1 (#8353)	2020-05-08 08:20:18 +02:00
Eric Liang	b14cc16616	[rllib] Enable functional execution workflow API by default (#8221)	2020-05-05 12:36:42 -07:00
Eric Liang	ee0eb44a32	Rename async_queue_depth -> num_async (#8207) * rename * lint	2020-05-05 01:38:10 -07:00
Eric Liang	f48da50e1c	[rllib] observation function api for multi-agent (#8236)	2020-05-04 22:13:49 -07:00
Sven Mika	6c2b9a4cfa	[RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304)	2020-05-04 23:53:38 +02:00
Sven Mika	a00144f746	[RLlib] Fix issue 8135 (DDPG inf actions when using [-inf,inf] action space). (#8302)	2020-05-04 22:27:30 +02:00
Sven Mika	b95e28faea	[RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288)	2020-05-04 09:36:27 +02:00
Sven Mika	166bb5d690	[RLlib] IMPALA PyTorch (#8287) This PR adds an IMPALA PyTorch implementation. - adds compilation tests for LSTM and w/o LSTM. - adds learning test for CartPole.	2020-05-03 13:44:25 +02:00
Sven Mika	76e1a4df9e	Fix TD3 torch via GaussianNoise torch bug. (#8276)	2020-05-02 08:12:21 +02:00
Sven Mika	42991d723f	[RLlib] rllib/examples folder restructuring (#8250) Cleans up the rllib/examples folder by moving all example Envs into rllib/examples/env (so they can be used by other scripts and tests as well).	2020-05-01 22:59:34 +02:00
Sven Mika	eea75ac623	[RLlib] Beta distribution. (#8229)	2020-04-30 11:09:33 -07:00
Eric Liang	baadbdf8d4	[rllib] Execute PPO using training workflow (#8206) * wip * add kl * kl * works now * doc update * reorg * add ddppo * add stats * fix fetch * comment * fix learner stat regression * test fixes * fix test	2020-04-30 01:18:09 -07:00
Sven Mika	bf25aee392	[RLlib] Deprecate all Model(v1) usage. (#8146)	2020-04-29 12:12:59 +02:00
Sven Mika	eb91619175	Fix release 0.8.5 tests for PPO torch Breakout. (#8226)	2020-04-29 10:36:41 +02:00
Sven Mika	1775e89f26	[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143) Deprecate TupleActions and support arbitrarily nested action spaces. Closes issue #8143.	2020-04-28 14:59:16 +02:00
Sven Mika	7ec2223c84	[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188) Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).	2020-04-26 23:08:13 +02:00
Eric Liang	2298f6fb40	[rllib] Port DQN/Ape-X to training workflow api (#8077)	2020-04-23 12:39:19 -07:00
Sven Mika	499ad5fbe4	[RLlib] PyTorch version of APPO. (#8120) - Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases. - Add learning test cases for APPO torch (both w/ and w/o v-trace). - Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).	2020-04-23 09:11:12 +02:00
Sven Mika	d15609ba2a	[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106) This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.	2020-04-21 09:47:52 +02:00
Sven Mika	3812bfedda	[RLlib] PyTorch version of ES (Evolution Strategies). (#8104) PyTorch version of Evolution Strategies (ES) Algo.	2020-04-20 21:47:28 +02:00
Sven Mika	d6cb7d865e	[RLlib] Torch DQN (APEX) TD-Error/prio. replay fixes. (#8082) PyTorch APEX_DQN with Prioritized Replay enabled would not work properly due to the td_error not being retrievable by the AsyncReplayOptimizer.	2020-04-20 10:03:25 +02:00

306 commits