Eyal Sela
7b955881f3
Initializing default saver inside the function ( #6540 )
2019-12-19 12:29:45 -08:00
Eric Liang
2530eb90dc
Move tf.test.is_gpu_available() to after session init ( #6515 )
...
* move to after session init
* script fixes
2019-12-17 14:55:39 -08:00
Eugene Vinitsky
3cb499632e
(Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy ( #6475 )
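For reference, a minimal sketch of the textbook closed-form entropy of a Gaussian with diagonal covariance, H = sum_i (log_std_i + 0.5 * log(2 * pi * e)). The function name `diag_gaussian_entropy` is hypothetical, and the code is not taken from this commit's diff; the log does not say which term carried the extra 0.5.

```python
import math


def diag_gaussian_entropy(log_std):
    """Entropy of a diagonal Gaussian, given per-dimension log standard deviations.

    Hypothetical helper for illustration; uses the standard closed form
    H = sum_i (log_std_i + 0.5 * log(2 * pi * e)).
    """
    const = 0.5 * math.log(2.0 * math.pi * math.e)
    # log_std enters with coefficient 1; only the constant term carries a 0.5.
    return sum(ls + const for ls in log_std)


if __name__ == "__main__":
    # Entropy of a 2-D Gaussian with unit std in the first dimension
    # and std = e in the second.
    print(diag_gaussian_entropy([0.0, 1.0]))
```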
2019-12-13 14:42:30 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default ( #6367 )
...
* wip
* add
* timeout fix
* const ref
* comments
* fix
* fix
* Move actor state into actor handle
* comments 2
* enable by default
* temp reorder
* some fixes
* add debug code
* tmp
* fix
* wip
* remove dbg
* fix compile
* fix
* fix check
* remove non direct tests
* Increment ref count before resolving value
* rename
* fix another bug
* tmp
* tmp
* Fix object pinning
* build change
* lint
* ActorManager
* tmp
* ActorManager
* fix test component failures
* Remove old code
* Remove unused
* fix
* fix
* fix resources
* fix advanced
* eric's diff
* blacklist
* blacklist
* cleanup
* annotate
* disable tests for now
* remove
* fix
* fix
* clean up verbosity
* fix test
* fix concurrency test
* Update .travis.yml
* Update .travis.yml
* Update .travis.yml
* split up analysis suite
* split up trial runner suite
* fix detached direct actors
* fix
* split up advanced test
* lint
* fix core worker test hang
* fix bad check fail which breaks test_cluster.py in tune
* fix some minor diffs in test_cluster
* less workers
* make less stressful
* split up test
* retry flaky tests
* remove old test flags
* fixes
* lint
* Update worker_pool.cc
* fix race
* fix
* fix bugs in node failure handling
* fix race condition
* fix bugs in node failure handling
* fix race condition
* nits
* fix test
* disable heartbeats
* disable heartbeats
* fix
* fix
* use worker id
* fix max fail
* debug exit
* fix merge, and apply [PATCH] fix concurrency test
* [patch] fix core worker test hang
* remove NotifyActorCreation, and return worker on completion of actor creation task
* remove actor died callback
* Update core_worker.cc
* lint
* use task manager
* fix merge
* fix deadlock
* wip
* merge conflicts
* fix
* better sysexit handling
* better sysexit handling
* better sysexit handling
* check id
* better debug
* task failed msg
* task failed msg
* retry failed tasks with delay
* retry failed tasks with delay
* clip deps
* fix
* fix core worker tests
* fix task manager test
* fix all tests
* cleanup
* set to 0 for direct tests
* dont check worker id for ownership rpc
* dont check worker id for ownership rpc
* debug messages
* add comment
* remove debug statements
* nit
* check worker id
* fix test
* owner
* fix tests
2019-12-13 13:58:04 -08:00
Zack Polizzi
9e9c524823
Update pong-apex tuned example ( #6462 )
2019-12-12 10:57:55 -08:00
Victor Le
4e24c805ee
AlphaZero and Ranked reward implementation ( #6385 )
2019-12-07 12:08:40 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() ( #6365 )
2019-12-05 10:13:54 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py ( #6306 )
...
* multinode failures direct
* Add number of retries allowed for tasks
* Retry tasks
* Add failing test for object reconstruction
* Handle return status and debug
* update
* Retry task unit test
* update
* update
* todo
* Fix max_retries decorator, fix test
* Fix test that flaked
* lint
* comments
2019-12-02 10:20:57 -08:00
Eric Liang
77b5098e7d
[rllib] Warn about dict action spaces
2019-11-27 12:57:38 -08:00
Eric Liang
ddc8855f41
Fix wrap ( #6293 )
2019-11-26 17:47:47 -08:00
Ameer Haj Ali
71316fa8d0
wrap models with DistributionalQModel when running DQN ( #6258 )
...
* wrap models with DistributionalQModel when running DQN
* wrap only for tensorflow models
* Update custom_keras_model.py
2019-11-25 00:11:24 -08:00
Eric Liang
53641f1f74
Move more unit tests to bazel ( #6250 )
...
* move more unit tests to bazel
* move to avoid conflict
* fix lint
* fix deps
* separate
* fix failing tests
* show tests
* ignore mismatch
* try combining bazel runs
* build lint
* remove tests from install
* fix test utils
* better config
* split up
* exclusive
* fix verbosity
* fix tests class
* cleanup
* remove flaky
* fix metrics test
* Update .travis.yml
* no retry flaky
* split up actor
* split basic test
* split up trial runner test
* split stress
* fix basic test
* fix tests
* switch to pytest runner for main
* make microbench not fail
* move load code to py3
* test is no longer package
* bazel to end
2019-11-24 11:43:34 -08:00
Eric Liang
7559fdb141
[rllib/tune] Cache get_preprocessor() calls, default max_failur… ( #6211 )
2019-11-21 15:55:56 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO ( #6181 )
2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch ( #6174 )
2019-11-16 10:02:58 -08:00
Eric Liang
a68cda0a33
[rllib] remove exists call ( #6168 )
2019-11-15 21:59:40 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example ( #6161 )
2019-11-14 12:14:00 -08:00
waldroje
e4c0843f60
Allow EntropyCoeffSchedule to accept custom schedule ( #6158 )
...
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation
* Update custom_metrics_and_callbacks.py
* Update tf_policy.py
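A rough sketch of what a list-based coefficient schedule with a non-negative floor could look like. It assumes the list holds `(timestep, value)` pairs that are linearly interpolated; that format and the name `piecewise_coeff` are assumptions for illustration, not something this log confirms about `EntropyCoeffSchedule`.

```python
def piecewise_coeff(schedule, t):
    """Linearly interpolate a coefficient from sorted (timestep, value) pairs.

    Hypothetical helper: the pair format is an assumption. The result is
    clamped at 0.0 so the coefficient never goes negative.
    """
    if t <= schedule[0][0]:
        return max(0.0, float(schedule[0][1]))
    for (t0, v0), (t1, v1) in zip(schedule, schedule[1:]):
        if t0 <= t < t1:
            frac = (t - t0) / (t1 - t0)
            return max(0.0, v0 + frac * (v1 - v0))
    # Past the last breakpoint: hold the final value.
    return max(0.0, float(schedule[-1][1]))


if __name__ == "__main__":
    schedule = [(0, 0.01), (100, 0.0)]
    for step in (0, 50, 100, 200):
        print(step, piecewise_coeff(schedule, step))
```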
2019-11-14 00:45:43 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00
Eric Liang
b924299833
Add large scale regression test for RLlib ( #6093 )
2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning ( #6142 )
2019-11-11 17:49:15 -08:00
Miguel Morales
d17ae5ad7a
Update hyperband-cartpole.yaml ( #6121 )
...
Typo
2019-11-09 19:39:03 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config
2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions ( #6065 )
...
* Rollout improvements
* Make info-saving optional, to avoid breaking change.
* Store generating ray version in checkpoint metadata
* Keep the linter happy
* Add small rollout test
* Terse.
* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 ( #6087 )
2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode ( #6001 )
2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time ( #5980 )
...
* fix
* move example tests to end
* level err
* guard against none
* no trace test
* ignore thumbs
* np
* fix multi node
* fix
2019-10-26 13:23:42 -07:00
Eric Liang
34fbc7fb4c
[rllib] Fix leak of TensorFlow assign operations in DQN/DDPG
2019-10-23 00:28:15 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data ( #5939 )
...
* fix shape
* add test
* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte
d70abcfd70
Fix typo in examples/centralized_critic.py ( #5943 )
...
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright
0110941de5
rllib: use pytorch's fn to see if gpu is available ( #5890 )
2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae
Qmix on gpu and with non-stacked-obs environment state support ( #5751 )
2019-10-08 13:18:07 -07:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test ( #5846 )
2019-10-07 14:25:16 -07:00
Eric Liang
fb33160df8
Fix obs space lo/hi ( #5826 )
2019-10-04 09:28:06 -07:00
Eric Liang
c6919d315d
[rllib] Remove TorchPolicy locks ( #5764 )
...
* remove torch lock
* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics
7e214fd95e
[tune] TensorBoard HParams for TF2.0 ( #5678 )
2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6
Add local_tf_session_args to unknown subkeys whitelist ( #5742 )
...
* Add local_tf_session_args to unknown subkeys whitelist
* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang
fb3b232c0e
[rllib] Properly flatten 2-d observations as input to FCnet ( #5733 )
2019-09-19 12:10:31 -07:00
Matthew A. Wright
3131e1742d
[rllib] Qmix off by 1 in double Q calculation ( #5731 )
...
* Qmix fix.
  The current version of double Q-learning is incorrect: it selects actions at timestep t instead of t+1 when computing the t+1 Q value.
* Allow extra obs dict keys
* Move Q-value-computing replay code to own function
* Run the autoformatter
* use better terms in comments ("policy" network instead of "live" network)
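The off-by-one described above can be illustrated with a generic double Q-learning target: the policy network picks the greedy action from its Q-values at t+1, and the target network evaluates that action at t+1. This is a plain-Python sketch with hypothetical names (`double_q_targets`, `policy_q_next`, `target_q_next`); it is not the Qmix code from this commit.

```python
def double_q_targets(rewards, dones, policy_q_next, target_q_next, gamma=0.99):
    """Compute double Q-learning targets for a batch of transitions.

    policy_q_next[i] and target_q_next[i] are per-action Q-values for the
    state at t+1 (not t) of transition i. All names are hypothetical.
    """
    targets = []
    for r, done, q_pol, q_tgt in zip(rewards, dones, policy_q_next, target_q_next):
        # Select the action with the policy network at timestep t+1 ...
        best_action = max(range(len(q_pol)), key=q_pol.__getitem__)
        # ... and evaluate that action with the target network at t+1.
        bootstrap = 0.0 if done else q_tgt[best_action]
        targets.append(r + gamma * bootstrap)
    return targets


if __name__ == "__main__":
    print(double_q_targets(
        rewards=[1.0, 0.0],
        dones=[False, True],
        policy_q_next=[[0.1, 0.9], [0.3, 0.2]],
        target_q_next=[[5.0, 2.0], [4.0, 4.0]],
        gamma=0.5,
    ))
```

Selecting `best_action` from Q-values at t instead would pair an action chosen in one state with a value estimated in another, which is the bug the commit message describes.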
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3
[rllib] Tracing for eager tensorflow policies with tf.function ( #5705 )
...
* Added tracing of eager policies with `tf.function`
* lint
* add config option
* add docs
* wip
* tracing now works with a3c
* typo
* none
* file doc
* returns
* syntax error
* syntax error
2019-09-17 01:44:20 -07:00
Edward Oakes
07c4c6367a
[core worker] Python core worker object interface ( #5272 )
2019-09-12 23:07:46 -07:00
Ashwinee Panda
946ebfaa3c
[rllib] Validate that entropy coeff is not an integer ( #5687 )
...
* Validate that entropy coeff is not an integer
Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.
* Cast to float instead
Also move this check after the negative value check
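The ordering described above (reject negative values first, then cast to float) might look roughly like the following. The function name `validate_entropy_coeff` and the error message are hypothetical; this is a sketch of the idea, not the code from the commit.

```python
def validate_entropy_coeff(entropy_coeff):
    """Return entropy_coeff as a float, rejecting negative values.

    Hypothetical helper: the negative-value check runs first, then the
    value is cast so that an integer such as 0 becomes 0.0.
    """
    if entropy_coeff < 0:
        raise ValueError("entropy_coeff must be >= 0")
    return float(entropy_coeff)


if __name__ == "__main__":
    print(validate_entropy_coeff(0))
    print(validate_entropy_coeff(0.01))
```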
2019-09-11 14:35:42 -07:00
Eric Liang
bc6a95deb0
[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent ( #5683 )
2019-09-11 12:15:34 -07:00
Richard Liaw
0010f54378
Update Cloudpickle ( #5643 )
2019-09-09 17:17:29 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs ( #5656 )
...
* [rllib] better model docs
* fix
* s
2019-09-08 23:01:26 -07:00
Eric Liang
cf90394a09
[rllib] Fix TF2 import of EagerVariableStore ( #5625 )
2019-09-07 12:10:03 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation ( #5645 )
2019-09-07 11:50:18 -07:00
Eric Liang
19bbf1eb4d
[rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern ( #5626 )
2019-09-04 21:39:22 -07:00
Eric Liang
a101812b9f
Replace --redis-address with --address in test, docs, tune, rllib ( #5602 )
...
* wip
* add tests and tune
* add ci
* test fix
* lint
* fix tests
* wip
* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function ( #5601 )
...
* remove tune function
* remove examples
* Update tune-usage.rst
2019-08-31 16:00:10 -07:00