Eric Liang
7559fdb141
[rllib/tune] Cache get_preprocessor() calls, default max_failur… ( #6211 )
2019-11-21 15:55:56 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO ( #6181 )
2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch ( #6174 )
2019-11-16 10:02:58 -08:00
Eric Liang
a68cda0a33
[rllib] remove exists call ( #6168 )
2019-11-15 21:59:40 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example ( #6161 )
2019-11-14 12:14:00 -08:00
waldroje
e4c0843f60
Allow EntropyCoeffSchedule to accept custom schedule ( #6158 )
...
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation
* Update custom_metrics_and_callbacks.py
* Update tf_policy.py
2019-11-14 00:45:43 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00
Eric Liang
b924299833
Add large scale regression test for RLlib ( #6093 )
2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning ( #6142 )
2019-11-11 17:49:15 -08:00
Miguel Morales
d17ae5ad7a
Update hyperband-cartpole.yaml ( #6121 )
...
Typo
2019-11-09 19:39:03 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config
2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions ( #6065 )
...
* Rollout improvements
* Make info-saving optional, to avoid breaking change.
* Store generating ray version in checkpoint metadata
* Keep the linter happy
* Add small rollout test
* Terse.
* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 ( #6087 )
2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode ( #6001 )
2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time ( #5980 )
...
* fix
* move example tests to end
* level err
* guard against none
* no trace test
* ignore thumbs
* np
* fix multi node
* fix
2019-10-26 13:23:42 -07:00
Eric Liang
34fbc7fb4c
[rllib] Fix leak of TensorFlow assign operations in DQN/DDPG
2019-10-23 00:28:15 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data ( #5939 )
...
* fix shape
* add test
* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte
d70abcfd70
Fix typo in examples/centralized_critic.py ( #5943 )
...
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright
0110941de5
rllib: use pytorch's fn to see if gpu is available ( #5890 )
2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae
Qmix on gpu and with non-stacked-obs environment state support ( #5751 )
2019-10-08 13:18:07 -07:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test ( #5846 )
2019-10-07 14:25:16 -07:00
Eric Liang
fb33160df8
Fix obs space lo/hi ( #5826 )
2019-10-04 09:28:06 -07:00
Eric Liang
c6919d315d
[rllib] Remove TorchPolicy locks ( #5764 )
...
* remove torch lock
* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics
7e214fd95e
[tune] TensorBoard HParams for TF2.0 ( #5678 )
2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6
Add local_tf_session_args to unknown subkeys whitelist ( #5742 )
...
* Add local_tf_session_args to unknown subkeys whitelist
* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang
fb3b232c0e
[rllib] Properly flatten 2-d observations as input to FCnet ( #5733 )
2019-09-19 12:10:31 -07:00
Matthew A. Wright
3131e1742d
[rllib] Qmix off by 1 in double Q calculation ( #5731 )
...
* Qmix fix.
- Current version of double Q learning is incorrect; it selects actions at timestep t instead of t+1 when computing the t+1 Q value.
* Allow extra obs dict keys
* Move Q-value-computing replay code to own function
* Run the autoformatter
* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function ( #5705 )
...
* Added tracing of eager policies with `tf.function`
* lint
* add config option
* add docs
* wip
* tracing now works with a3c
* typo
* none
* file doc
* returns
* syntax error
* syntax error
2019-09-17 01:44:20 -07:00
Edward Oakes
07c4c6367a
[core worker] Python core worker object interface ( #5272 )
2019-09-12 23:07:46 -07:00
Ashwinee Panda
946ebfaa3c
[rllib] Validate that entropy coeff is not an integer ( #5687 )
...
* Validate that entropy coeff is not an integer
Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.
* Cast to float instead
Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang
bc6a95deb0
[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent ( #5683 )
2019-09-11 12:15:34 -07:00
Richard Liaw
0010f54378
Update Cloudpickle ( #5643 )
2019-09-09 17:17:29 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs ( #5656 )
...
* [rllib] better model docs
* fix
* s
2019-09-08 23:01:26 -07:00
Eric Liang
cf90394a09
[rllib] Fix TF2 import of EagerVariableStore ( #5625 )
2019-09-07 12:10:03 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation ( #5645 )
2019-09-07 11:50:18 -07:00
Eric Liang
19bbf1eb4d
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern ( #5626 )
2019-09-04 21:39:22 -07:00
Eric Liang
a101812b9f
Replace --redis-address with --address in test, docs, tune, rllib ( #5602 )
...
* wip
* add tests and tune
* add ci
* test fix
* lint
* fix tests
* wip
* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function ( #5601 )
...
* remove tune function
* remove examples
* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Philipp Moritz
747daff2cb
Fix impala stress test ( #5596 )
2019-08-31 01:20:53 -07:00
Eric Liang
38231907f3
[rllib] Forgot to register param noise layer variables
2019-08-29 18:12:31 -07:00
Eric Liang
03a1b75852
[rllib] Fix some eager execution regressions with 1.13 ( #5537 )
...
* fix bugs with 1.13
* allow disable
2019-08-26 23:23:35 -07:00
Eric Liang
97ccd75952
[rllib] Enable object store memory limit by default ( #5534 )
2019-08-26 01:37:28 -07:00
gehring
b520f6141e
[rllib] Adds eager support with a generic TFEagerPolicy class ( #5436 )
2019-08-23 14:21:11 +08:00
Eric Liang
e2e30ca507
Ray, Tune, and RLlib support for memory, object_store_memory options ( #5226 )
2019-08-21 23:01:10 -07:00
jon-chuang
658e002cdf
[rllib] Add autoregressive KL ( #5469 )
2019-08-19 14:34:50 +08:00
Neil Lugovoy
1376f1ae60
[tune] Reporter crash fix ( #5426 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2019-08-13 14:10:22 -07:00
Eric Liang
79949fb8a0
[rllib] RLlib in 60 seconds documentation ( #5430 )
2019-08-12 17:39:02 -07:00
Eric Liang
cc86271cf8
[hotfix] fix Travis action dist test ( #5428 )
2019-08-10 17:59:54 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions ( #5304 )
2019-08-10 14:05:12 -07:00
Eric Liang
8b6f0d3224
[rllib] Fix output API when lz4 not installed ( #5421 )
2019-08-10 13:53:27 -07:00