Sven
8b16847c02
Get utils ready for better Agent torch support. ( #6561 )
2019-12-30 12:27:32 -08:00
Zhongxia Yan
98689bd263
Changed foreach_policy to foreach_trainable_policy ( #6564 )
...
Changed foreach_policy to foreach_trainable_policy in DQN when disabling exploration. This makes it consistent with the rest of the file
2019-12-26 19:50:48 -08:00
Michael Luo
548df014ec
SAC Performance Fixes ( #6295 )
...
* SAC Performance Fixes
* Small Changes
* Update sac_model.py
* fix normalize wrapper
* Update test_eager_support.py
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2019-12-20 10:51:25 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO ( #6181 )
2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch ( #6174 )
2019-11-16 10:02:58 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example ( #6161 )
2019-11-14 12:14:00 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode ( #6001 )
2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time ( #5980 )
...
* fix
* move example tests to end
* level err
* guard against none
* no trace test
* ignore thumbs
* np
* fix multi node
* fix
2019-10-26 13:23:42 -07:00
Matthew A. Wright
0110941de5
rllib: use pytorch's fn to see if gpu is available ( #5890 )
2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae
Qmix on gpu and with non-stacked-obs environment state support ( #5751 )
2019-10-08 13:18:07 -07:00
Eric Liang
c6919d315d
[rllib] Remove TorchPolicy locks ( #5764 )
...
* remove torch lock
* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics
7e214fd95e
[tune] TensorBoard HParams for TF2.0 ( #5678 )
2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6
Add local_tf_session_args to unknown subkeys whitelist ( #5742 )
...
* Add local_tf_session_args to unknown subkeys whitelist
* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Matthew A. Wright
3131e1742d
[rllib] Qmix off by 1 in double Q calculation ( #5731 )
...
* Qmix fix.
- Current version of double Q learning is incorrect; it selects actions at timestep t instead of t+1 when computing the t+1 Q value.
* Allow extra obs dict keys
* Move Q-value-computing replay code to own function
* Run the autoformatter
* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function ( #5705 )
...
* Added tracing of eager policies with `tf.function`
* lint
* add config option
* add docs
* wip
* tracing now works with a3c
* typo
* none
* file doc
* returns
* syntax error
* syntax error
2019-09-17 01:44:20 -07:00
Ashwinee Panda
946ebfaa3c
[rllib] Validate that entropy coeff is not an integer ( #5687 )
...
* Validate that entropy coeff is not an integer
Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.
* Cast to float instead
Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang
bc6a95deb0
[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent ( #5683 )
2019-09-11 12:15:34 -07:00
Eric Liang
cf90394a09
[rllib] Fix TF2 import of EagerVariableStore ( #5625 )
2019-09-07 12:10:03 -07:00
Eric Liang
19bbf1eb4d
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern ( #5626 )
2019-09-04 21:39:22 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function ( #5601 )
...
* remove tune function
* remove examples
* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Philipp Moritz
747daff2cb
Fix impala stress test ( #5596 )
2019-08-31 01:20:53 -07:00
Eric Liang
38231907f3
[rllib] Forgot to register param noise layer variables
2019-08-29 18:12:31 -07:00
Eric Liang
03a1b75852
[rllib] Fix some eager execution regressions with 1.13 ( #5537 )
...
* fix bugs with 1.13
* allow disable
2019-08-26 23:23:35 -07:00
Eric Liang
97ccd75952
[rllib] Enable object store memory limit by default ( #5534 )
2019-08-26 01:37:28 -07:00
gehring
b520f6141e
[rllib] Adds eager support with a generic TFEagerPolicy class ( #5436 )
2019-08-23 14:21:11 +08:00
Eric Liang
e2e30ca507
Ray, Tune, and RLlib support for memory, object_store_memory options ( #5226 )
2019-08-21 23:01:10 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions ( #5304 )
2019-08-10 14:05:12 -07:00
Eric Liang
592f313210
[rllib] Centralized critic / PPO example on TwoStepGame ( #5392 )
2019-08-08 14:03:28 -07:00
Matthew A. Wright
e3c9f7e83a
Custom action distributions ( #5164 )
...
* custom action dist wip
* Test case for custom action dist
* ActionDistribution.get_parameter_shape_for_action_space pattern
* Edit exception message to also suggest using a custom action distribution
* Clean up ModelCatalog.get_action_dist
* Pass model config to ActionDistribution constructors
* Update custom action distribution test case
* Name fix
* Autoformatter
* parameter shape static methods for torch distributions
* Fix docstring
* Generalize fake array for graph initialization
* Fix action dist constructors
* Correct parameter shape static methods for multicategorical and gaussian
* Make suggested changes to custom action dists
* Correct instances of not passing model config to action dist
* Autoformatter
* fix tuple distribution constructor
* bugfix
2019-08-06 11:13:16 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir ( #5324 )
2019-08-05 23:25:49 -07:00