Commit graph

55 commits

Author SHA1 Message Date
Eric Liang
7559fdb141 [rllib/tune] Cache get_preprocessor() calls, default max_failur… (#6211) 2019-11-21 15:55:56 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181) 2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch (#6174) 2019-11-16 10:02:58 -08:00
Eric Liang
a68cda0a33
[rllib] remove exists call (#6168) 2019-11-15 21:59:40 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example (#6161) 2019-11-14 12:14:00 -08:00
waldroje
e4c0843f60 Allow EntropyCoeffSchedule to accept custom schedule (#6158)
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation

* Update custom_metrics_and_callbacks.py

* Update tf_policy.py
2019-11-14 00:45:43 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity (#6154) 2019-11-13 18:50:45 -08:00
Eric Liang
b924299833
Add large scale regression test for RLlib (#6093) 2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning (#6142) 2019-11-11 17:49:15 -08:00
Miguel Morales
d17ae5ad7a Update hyperband-cartpole.yaml (#6121)
Typo
2019-11-09 19:39:03 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config 2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9 [rllib] Rollout extensions (#6065)
* Rollout improvements

* Make info-saving optional, to avoid breaking change.

* Store generating ray version in checkpoint metadata

* Keep the linter happy

* Add small rollout test

* Terse.

* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 (#6087) 2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode (#6001) 2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time (#5980)
* fix

* move example tests to end

* level err

* guard against none

* no trace test

* ignore thumbs

* np

* fix multi node

* fix
2019-10-26 13:23:42 -07:00
Eric Liang
34fbc7fb4c
rllib] Fix leak of TensorFlow assign operations in DQN/DDPG 2019-10-23 00:28:15 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data (#5939)
* fix shape

* add test

* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte
d70abcfd70 Fix typo in examples/centralized_critic.py (#5943)
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright
0110941de5 rllib: use pytorch's fn to see if gpu is available (#5890) 2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae Qmix on gpu and with non-stacked-obs environment state support (#5751) 2019-10-08 13:18:07 -07:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test (#5846) 2019-10-07 14:25:16 -07:00
Eric Liang
fb33160df8
Fix obs space lo/hi (#5826) 2019-10-04 09:28:06 -07:00
Eric Liang
c6919d315d
[rllib] Remove TorchPolicy locks (#5764)
* remove torch lock

* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics
7e214fd95e [tune] TensorBoard HParams for TF2.0 (#5678) 2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6 Add local_tf_session_args to unknown subkeys whitelist (#5742)
* Add local_tf_session_args to unknown subkeys whitelist

* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang
fb3b232c0e
[rllib] Properly flatten 2-d observations as input to FCnet (#5733) 2019-09-19 12:10:31 -07:00
Matthew A. Wright
3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)
* Qmix fix.

-Current version of double Q learning is incorrect; it selects actions
at timestep t instead of t+1 when computing the t+1 Q value.

* Allow extra obs dict keys

* Move Q-value-computing replay code to own function

* Run the autoformatter

* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3 [rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Edward Oakes
07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Ashwinee Panda
946ebfaa3c [rllib] Validate that entropy coeff is not an integer (#5687)
* Validate that entropy coeff is not an integer

Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.

* Cast to float instead

Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang
bc6a95deb0
[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683) 2019-09-11 12:15:34 -07:00
Richard Liaw
0010f54378 Update Cloudpickle (#5643) 2019-09-09 17:17:29 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs (#5656)
* [rllib] better model docs

* fix

* s
2019-09-08 23:01:26 -07:00
Eric Liang
cf90394a09
[rllib] Fix TF2 import of EagerVariableStore (#5625) 2019-09-07 12:10:03 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation (#5645) 2019-09-07 11:50:18 -07:00
Eric Liang
19bbf1eb4d [rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern (#5626) 2019-09-04 21:39:22 -07:00
Eric Liang
a101812b9f
Replace --redis-address with --address in test, docs, tune, rllib (#5602)
* wip

* add tests and tune

* add ci

* test fix

* lint

* fix tests

* wip

* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function (#5601)
* remove tune function

* remove examples

* Update tune-usage.rst
2019-08-31 16:00:10 -07:00
Philipp Moritz
747daff2cb
Fix impala stress test (#5596) 2019-08-31 01:20:53 -07:00
Eric Liang
38231907f3
[rllib] Forgot to register param noise layer variables 2019-08-29 18:12:31 -07:00
Eric Liang
03a1b75852
[rllib] Fix some eager execution regressions with 1.13 (#5537)
* fix bugs with 1.13

* allow disable
2019-08-26 23:23:35 -07:00
Eric Liang
97ccd75952
[rllib] Enable object store memory limit by default (#5534) 2019-08-26 01:37:28 -07:00
gehring
b520f6141e [rllib] Adds eager support with a generic TFEagerPolicy class (#5436) 2019-08-23 14:21:11 +08:00
Eric Liang
e2e30ca507 Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
jon-chuang
658e002cdf [rllib] Add autoregressive KL (#5469) 2019-08-19 14:34:50 +08:00
Neil Lugovoy
1376f1ae60 [tune] Reporter crash fix (#5426)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2019-08-13 14:10:22 -07:00
Eric Liang
79949fb8a0
[rllib] RLlib in 60 seconds documentation (#5430) 2019-08-12 17:39:02 -07:00
Eric Liang
cc86271cf8
[hotfix] fix Travis action dist test (#5428) 2019-08-10 17:59:54 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions (#5304) 2019-08-10 14:05:12 -07:00
Eric Liang
8b6f0d3224 [rllib] Fix output API when lz4 not installed (#5421) 2019-08-10 13:53:27 -07:00