hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Eric Liang	8fc2272f43	[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181 )	2019-11-18 10:39:07 -08:00
Philipp Moritz	fc655acfee	Fix linting on master branch (#6174 )	2019-11-16 10:02:58 -08:00
Eric Liang	243b1b7281	[rllib] Add microbatch optimizer with A2C example (#6161 )	2019-11-14 12:14:00 -08:00
Eric Liang	e4565c9cc6	Reduce RLlib log verbosity (#6154 )	2019-11-13 18:50:45 -08:00
Eric Liang	16891e9379	[rllib] Don't use flat weights in non-eager mode (#6001 )	2019-10-31 15:16:02 -07:00
Eric Liang	a0dcb45dc3	[rllib] Fix APEX priorities returning zero all the time (#5980 ) * fix * move example tests to end * level err * guard against none * no trace test * ignore thumbs * np * fix multi node * fix	2019-10-26 13:23:42 -07:00
Matthew A. Wright	0110941de5	rllib: use pytorch's fn to see if gpu is available (#5890 )	2019-10-12 00:13:00 -07:00
Matthew A. Wright	4aa06918ae	Qmix on gpu and with non-stacked-obs environment state support (#5751 )	2019-10-08 13:18:07 -07:00
Eric Liang	c6919d315d	[rllib] Remove TorchPolicy locks (#5764 ) * remove torch lock * remove lock	2019-09-24 17:52:16 -07:00
Vince Jankovics	7e214fd95e	[tune] TensorBoard HParams for TF2.0 (#5678 )	2019-09-21 11:06:34 -07:00
Kilian Batzner	79b9c70ad6	Add local_tf_session_args to unknown subkeys whitelist (#5742 ) * Add local_tf_session_args to unknown subkeys whitelist * Remove trailing whitespace	2019-09-20 10:32:49 -07:00
Matthew A. Wright	3131e1742d	[rllib] Qmix off by 1 in double Q calculation (#5731 ) * Qmix fix. -Current version of double Q learning is incorrect; it selects actions at timestep t instead of t+1 when computing the t+1 Q value. * Allow extra obs dict keys * Move Q-value-computing replay code to own function * Run the autoformatter * use better terms in comments ("policy" network instead of "live" network)	2019-09-18 18:12:30 -07:00
gehring	8903bcd0c3	[rllib] Tracing for eager tensorflow policies with `tf.function` (#5705 ) * Added tracing of eager policies with `tf.function` * lint * add config option * add docs * wip * tracing now works with a3c * typo * none * file doc * returns * syntax error * syntax error	2019-09-17 01:44:20 -07:00
Ashwinee Panda	946ebfaa3c	[rllib] Validate that entropy coeff is not an integer (#5687 ) * Validate that entropy coeff is not an integer Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float. * Cast to float instead Also move this check after the negative value check	2019-09-11 14:35:42 -07:00
Eric Liang	bc6a95deb0	[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683 )	2019-09-11 12:15:34 -07:00
Eric Liang	cf90394a09	[rllib] Fix TF2 import of EagerVariableStore (#5625 )	2019-09-07 12:10:03 -07:00
Eric Liang	19bbf1eb4d	[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern (#5626 )	2019-09-04 21:39:22 -07:00
Eric Liang	daf38c8723	[tune] Deprecate tune.function (#5601 ) * remove tune function * remove examples * Update tune-usage.rst	2019-08-31 16:00:10 -07:00
Philipp Moritz	747daff2cb	Fix impala stress test (#5596 )	2019-08-31 01:20:53 -07:00
Eric Liang	38231907f3	[rllib] Forgot to register param noise layer variables	2019-08-29 18:12:31 -07:00
Eric Liang	03a1b75852	[rllib] Fix some eager execution regressions with 1.13 (#5537 ) * fix bugs with 1.13 * allow disable	2019-08-26 23:23:35 -07:00
Eric Liang	97ccd75952	[rllib] Enable object store memory limit by default (#5534 )	2019-08-26 01:37:28 -07:00
gehring	b520f6141e	[rllib] Adds eager support with a generic `TFEagerPolicy` class (#5436 )	2019-08-23 14:21:11 +08:00
Eric Liang	e2e30ca507	Ray, Tune, and RLlib support for memory, object_store_memory options (#5226 )	2019-08-21 23:01:10 -07:00
Eric Liang	a1d2e17623	[rllib] Autoregressive action distributions (#5304 )	2019-08-10 14:05:12 -07:00
Eric Liang	592f313210	[rllib] Centralized critic / PPO example on TwoStepGame (#5392 )	2019-08-08 14:03:28 -07:00
Matthew A. Wright	e3c9f7e83a	Custom action distributions (#5164 ) * custom action dist wip * Test case for custom action dist * ActionDistribution.get_parameter_shape_for_action_space pattern * Edit exception message to also suggest using a custom action distribution * Clean up ModelCatalog.get_action_dist * Pass model config to ActionDistribution constructors * Update custom action distribution test case * Name fix * Autoformatter * parameter shape static methods for torch distributions * Fix docstring * Generalize fake array for graph initialization * Fix action dist constructors * Correct parameter shape static methods for multicategorical and gaussian * Make suggested changes to custom action dist's * Correct instances of not passing model config to action dist * Autoformatter * fix tuple distribution constructor * bugfix	2019-08-06 11:13:16 -07:00
Eric Liang	5d7afe8092	[rllib] Try moving RLlib to top level dir (#5324 )	2019-08-05 23:25:49 -07:00

... 10 11 12 13 14

678 commits