Commit graph

1117 commits

Author SHA1 Message Date
Eyal Sela
7b955881f3 Initializing default saver inside the function (#6540) 2019-12-19 12:29:45 -08:00
Eric Liang
2530eb90dc
Move tf.test.is_gpu_available() to after session init (#6515)
* move to after session init

* script fixes
2019-12-17 14:55:39 -08:00
Eugene Vinitsky
3cb499632e (Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy (#6475) 2019-12-13 14:42:30 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced tesT

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeatS

* disable heartbeatS

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor diied callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflits

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Zack Polizzi
9e9c524823 Update pong-apex tuned example (#6462) 2019-12-12 10:57:55 -08:00
Victor Le
4e24c805ee AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py (#6306)
* multinode failures direct

* Add number of retries allowed for tasks

* Retry tasks

* Add failing test for object reconstruction

* Handle return status and debug

* update

* Retry task unit test

* update

* update

* todo

* Fix max_retries decorator, fix test

* Fix test that flaked

* lint

* comments
2019-12-02 10:20:57 -08:00
Eric Liang
77b5098e7d
[rllib] Warn about dict action spaces 2019-11-27 12:57:38 -08:00
Eric Liang
ddc8855f41
Fix wrap (#6293) 2019-11-26 17:47:47 -08:00
Ameer Haj Ali
71316fa8d0 wrap models with DistributionalQModel when running DQN (#6258)
* wrap models with DistributionalQModel when running DQN

* wrap only for tensorflow models

* Update custom_keras_model.py
2019-11-25 00:11:24 -08:00
Eric Liang
53641f1f74
Move more unit tests to bazel (#6250)
* move more unit tests to bazel

* move to avoid conflict

* fix lint

* fix deps

* seprate

* fix failing tests

* show tests

* ignore mismatch

* try combining bazel runs

* build lint

* remove tests from install

* fix test utils

* better config

* split up

* exclusive

* fix verbosity

* fix tests class

* cleanup

* remove flaky

* fix metrics test

* Update .travis.yml

* no retry flaky

* split up actor

* split basic test

* split up trial runner test

* split stress

* fix basic test

* fix tests

* switch to pytest runner for main

* make microbench not fail

* move load code to py3

* test is no longer package

* bazel to end
2019-11-24 11:43:34 -08:00
Eric Liang
7559fdb141 [rllib/tune] Cache get_preprocessor() calls, default max_failur… (#6211) 2019-11-21 15:55:56 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181) 2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch (#6174) 2019-11-16 10:02:58 -08:00
Eric Liang
a68cda0a33
[rllib] remove exists call (#6168) 2019-11-15 21:59:40 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example (#6161) 2019-11-14 12:14:00 -08:00
waldroje
e4c0843f60 Allow EntropyCoeffSchedule to accept custom schedule (#6158)
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation

* Update custom_metrics_and_callbacks.py

* Update tf_policy.py
2019-11-14 00:45:43 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity (#6154) 2019-11-13 18:50:45 -08:00
Eric Liang
b924299833
Add large scale regression test for RLlib (#6093) 2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning (#6142) 2019-11-11 17:49:15 -08:00
Miguel Morales
d17ae5ad7a Update hyperband-cartpole.yaml (#6121)
Typo
2019-11-09 19:39:03 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config 2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9 [rllib] Rollout extensions (#6065)
* Rollout improvements

* Make info-saving optional, to avoid breaking change.

* Store generating ray version in checkpoint metadata

* Keep the linter happy

* Add small rollout test

* Terse.

* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 (#6087) 2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode (#6001) 2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time (#5980)
* fix

* move example tests to end

* level err

* guard against none

* no trace test

* ignore thumbs

* np

* fix multi node

* fix
2019-10-26 13:23:42 -07:00
Eric Liang
34fbc7fb4c
rllib] Fix leak of TensorFlow assign operations in DQN/DDPG 2019-10-23 00:28:15 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data (#5939)
* fix shape

* add test

* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte
d70abcfd70 Fix typo in examples/centralized_critic.py (#5943)
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright
0110941de5 rllib: use pytorch's fn to see if gpu is available (#5890) 2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae Qmix on gpu and with non-stacked-obs environment state support (#5751) 2019-10-08 13:18:07 -07:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test (#5846) 2019-10-07 14:25:16 -07:00
Eric Liang
fb33160df8
Fix obs space lo/hi (#5826) 2019-10-04 09:28:06 -07:00
Eric Liang
c6919d315d
[rllib] Remove TorchPolicy locks (#5764)
* remove torch lock

* remove lock
2019-09-24 17:52:16 -07:00
Vince Jankovics
7e214fd95e [tune] TensorBoard HParams for TF2.0 (#5678) 2019-09-21 11:06:34 -07:00
Kilian Batzner
79b9c70ad6 Add local_tf_session_args to unknown subkeys whitelist (#5742)
* Add local_tf_session_args to unknown subkeys whitelist

* Remove trailing whitespace
2019-09-20 10:32:49 -07:00
Eric Liang
fb3b232c0e
[rllib] Properly flatten 2-d observations as input to FCnet (#5733) 2019-09-19 12:10:31 -07:00
Matthew A. Wright
3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)
* Qmix fix.

-Current version of double Q learning is incorrect; it selects actions
at timestep t instead of t+1 when computing the t+1 Q value.

* Allow extra obs dict keys

* Move Q-value-computing replay code to own function

* Run the autoformatter

* use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
gehring
8903bcd0c3 [rllib] Tracing for eager tensorflow policies with tf.function (#5705)
* Added tracing of eager policies with `tf.function`

* lint

* add config option

* add docs

* wip

* tracing now works with a3c

* typo

* none

* file doc

* returns

* syntax error

* syntax error
2019-09-17 01:44:20 -07:00
Edward Oakes
07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Ashwinee Panda
946ebfaa3c [rllib] Validate that entropy coeff is not an integer (#5687)
* Validate that entropy coeff is not an integer

Passing an integer value for entropy coeff such as 0 raises an error somewhere inside the TF policy graph, so this checks to make sure the entropy coeff is a float.

* Cast to float instead

Also move this check after the negative value check
2019-09-11 14:35:42 -07:00
Eric Liang
bc6a95deb0
[rllib] Eager execution for centralized critic example, fix simple optimizer for multiagent (#5683) 2019-09-11 12:15:34 -07:00
Richard Liaw
0010f54378 Update Cloudpickle (#5643) 2019-09-09 17:17:29 -07:00
Eric Liang
74abeab057
[rllib] Improve accessing model state docs (#5656)
* [rllib] better model docs

* fix

* s
2019-09-08 23:01:26 -07:00
Eric Liang
cf90394a09
[rllib] Fix TF2 import of EagerVariableStore (#5625) 2019-09-07 12:10:03 -07:00
Eric Liang
1455a19c85
Consolidate and clean up documentation (#5645) 2019-09-07 11:50:18 -07:00
Eric Liang
19bbf1eb4d [rllib] Revert [rllib] Port DDPG to the build_tf_policy pattern (#5626) 2019-09-04 21:39:22 -07:00
Eric Liang
a101812b9f
Replace --redis-address with --address in test, docs, tune, rllib (#5602)
* wip

* add tests and tune

* add ci

* test fix

* lint

* fix tests

* wip

* sugar dep
2019-09-01 16:53:02 -07:00
Eric Liang
daf38c8723
[tune] Deprecate tune.function (#5601)
* remove tune function

* remove examples

* Update tune-usage.rst
2019-08-31 16:00:10 -07:00