Justin Terry
97bf79917c
[RLlib] Update MADDPG example repo to maintained fork ( #6831 )
2020-01-18 13:08:27 -08:00
Sven Mika
303547f119
[RLlib] Policy-classes cleanup and torch/tf unification. ( #6770 )
2020-01-17 22:26:28 -08:00
Sven Mika
e6227082bd
[RLlib] Add `torch` flag to train.py ( #6807 )
2020-01-17 18:48:44 -08:00
Sven Mika
2bcf72e306
DQN distributional model: Replace all legacy tf.contrib imports with tf.keras.layers.xyz or tf.initializers.xyz. ( #6772 )
...
- This fixes a test case in test_evaluators.py.
2020-01-13 21:48:16 -08:00
Sven
60d4d5e1aa
Remove future imports ( #6724 )
...
* Remove all __future__ imports from RLlib.
* Remove (object) again from tf_run_builder.py::TFRunBuilder.
* Fix 2xLINT warnings.
* Fix broken appo_policy import (must be appo_tf_policy)
* Remove future imports from all other ray files (not just RLlib).
* Remove future imports from all other ray files (not just RLlib).
* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).
* Add two empty lines before Schedule class.
* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Ujval Misra
20ba7ef647
[tune] Move util to utils package ( #6682 )
...
* Move util.py to utils
* Fix import
2020-01-06 18:11:02 -08:00
Robert Nishihara
39a3459886
Remove (object) from class declarations. ( #6658 )
2020-01-02 17:42:13 -08:00
Sven
f1b56fa5ee
PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). ( #6650 )
...
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).
* Fix LINT line-len errors.
* Fix LINT errors.
* Fix `tf_pg_policy` imports (formerly: `pg_policy`).
* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).
* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
then built into the Bazel/Travis test suite.
* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.
* Fix remaining import errors for agents/pg/...
* Fix circular dependency in pg imports.
* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Robert Nishihara
480206eef8
Remove some Python 2 compatibility code. ( #6624 )
2019-12-31 17:14:58 -08:00
Michael Luo
1cb335487e
SAC for Mujoco Environments ( #6642 )
2019-12-31 00:16:54 -08:00
Sven
8b16847c02
Get utils ready for better Agent torch support. ( #6561 )
2019-12-30 12:27:32 -08:00
Eric Liang
7c1e0e5715
Implement wait_local for wait ( #6524 )
2019-12-28 17:40:49 -08:00
Eric Liang
022954ac09
[rllib] Tuple action dist tensors not reduced properly in eager mode ( #6615 )
2019-12-28 09:51:09 -08:00
Eric Liang
3af84ada47
Revert "[rllib] remove exists call ( #6168 )" ( #6616 )
...
This reverts commit a68cda0a33.
2019-12-26 22:44:26 -08:00
Zhongxia Yan
98689bd263
Changed foreach_policy to foreach_trainable_policy ( #6564 )
...
Changed foreach_policy to foreach_trainable_policy in DQN when disabling exploration. This makes it consistent with the rest of the file
2019-12-26 19:50:48 -08:00
gehring
b40869d0e4
Wrapper for the dm_env interface ( #6468 )
2019-12-26 13:22:17 -08:00
Michael Luo
548df014ec
SAC Performance Fixes ( #6295 )
...
* SAC Performance Fixes
* Small Changes
* Update sac_model.py
* fix normalize wrapper
* Update test_eager_support.py
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2019-12-20 10:51:25 -08:00
Eyal Sela
7b955881f3
Initializing default saver inside the function ( #6540 )
2019-12-19 12:29:45 -08:00
Eric Liang
2530eb90dc
Move tf.test.is_gpu_available() to after session init ( #6515 )
...
* move to after session init
* script fixes
2019-12-17 14:55:39 -08:00
Eugene Vinitsky
3cb499632e
(Bug Fix): Remove the extra 0.5 in the Diagonal Gaussian entropy ( #6475 )
2019-12-13 14:42:30 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default ( #6367 )
...
* wip
* add
* timeout fix
* const ref
* comments
* fix
* fix
* Move actor state into actor handle
* comments 2
* enable by default
* temp reorder
* some fixes
* add debug code
* tmp
* fix
* wip
* remove dbg
* fix compile
* fix
* fix check
* remove non direct tests
* Increment ref count before resolving value
* rename
* fix another bug
* tmp
* tmp
* Fix object pinning
* build change
* lint
* ActorManager
* tmp
* ActorManager
* fix test component failures
* Remove old code
* Remove unused
* fix
* fix
* fix resources
* fix advanced
* eric's diff
* blacklist
* blacklist
* cleanup
* annotate
* disable tests for now
* remove
* fix
* fix
* clean up verbosity
* fix test
* fix concurrency test
* Update .travis.yml
* Update .travis.yml
* Update .travis.yml
* split up analysis suite
* split up trial runner suite
* fix detached direct actors
* fix
* split up advanced test
* lint
* fix core worker test hang
* fix bad check fail which breaks test_cluster.py in tune
* fix some minor diffs in test_cluster
* less workers
* make less stressful
* split up test
* retry flaky tests
* remove old test flags
* fixes
* lint
* Update worker_pool.cc
* fix race
* fix
* fix bugs in node failure handling
* fix race condition
* fix bugs in node failure handling
* fix race condition
* nits
* fix test
* disable heartbeats
* disable heartbeats
* fix
* fix
* use worker id
* fix max fail
* debug exit
* fix merge, and apply [PATCH] fix concurrency test
* [patch] fix core worker test hang
* remove NotifyActorCreation, and return worker on completion of actor creation task
* remove actor died callback
* Update core_worker.cc
* lint
* use task manager
* fix merge
* fix deadlock
* wip
* merge conflicts
* fix
* better sysexit handling
* better sysexit handling
* better sysexit handling
* check id
* better debug
* task failed msg
* task failed msg
* retry failed tasks with delay
* retry failed tasks with delay
* clip deps
* fix
* fix core worker tests
* fix task manager test
* fix all tests
* cleanup
* set to 0 for direct tests
* dont check worker id for ownership rpc
* dont check worker id for ownership rpc
* debug messages
* add comment
* remove debug statements
* nit
* check worker id
* fix test
* owner
* fix tests
2019-12-13 13:58:04 -08:00
Zack Polizzi
9e9c524823
Update pong-apex tuned example ( #6462 )
2019-12-12 10:57:55 -08:00
Victor Le
4e24c805ee
AlphaZero and Ranked reward implementation ( #6385 )
2019-12-07 12:08:40 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() ( #6365 )
2019-12-05 10:13:54 -08:00
Stephanie Wang
da41180dc0
[direct task] Retry tasks on failure and turn on RAY_FORCE_DIRECT for test_multinode_failures.py ( #6306 )
...
* multinode failures direct
* Add number of retries allowed for tasks
* Retry tasks
* Add failing test for object reconstruction
* Handle return status and debug
* update
* Retry task unit test
* update
* update
* todo
* Fix max_retries decorator, fix test
* Fix test that flaked
* lint
* comments
2019-12-02 10:20:57 -08:00
Eric Liang
77b5098e7d
[rllib] Warn about dict action spaces
2019-11-27 12:57:38 -08:00
Eric Liang
ddc8855f41
Fix wrap ( #6293 )
2019-11-26 17:47:47 -08:00
Ameer Haj Ali
71316fa8d0
wrap models with DistributionalQModel when running DQN ( #6258 )
...
* wrap models with DistributionalQModel when running DQN
* wrap only for tensorflow models
* Update custom_keras_model.py
2019-11-25 00:11:24 -08:00
Eric Liang
53641f1f74
Move more unit tests to bazel ( #6250 )
...
* move more unit tests to bazel
* move to avoid conflict
* fix lint
* fix deps
* separate
* fix failing tests
* show tests
* ignore mismatch
* try combining bazel runs
* build lint
* remove tests from install
* fix test utils
* better config
* split up
* exclusive
* fix verbosity
* fix tests class
* cleanup
* remove flaky
* fix metrics test
* Update .travis.yml
* no retry flaky
* split up actor
* split basic test
* split up trial runner test
* split stress
* fix basic test
* fix tests
* switch to pytest runner for main
* make microbench not fail
* move load code to py3
* test is no longer package
* bazel to end
2019-11-24 11:43:34 -08:00
Eric Liang
7559fdb141
[rllib/tune] Cache get_preprocessor() calls, default max_failur… ( #6211 )
2019-11-21 15:55:56 -08:00
Eric Liang
8fc2272f43
[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO ( #6181 )
2019-11-18 10:39:07 -08:00
Philipp Moritz
fc655acfee
Fix linting on master branch ( #6174 )
2019-11-16 10:02:58 -08:00
Eric Liang
a68cda0a33
[rllib] remove exists call ( #6168 )
2019-11-15 21:59:40 -08:00
Eric Liang
243b1b7281
[rllib] Add microbatch optimizer with A2C example ( #6161 )
2019-11-14 12:14:00 -08:00
waldroje
e4c0843f60
Allow EntropyCoeffSchedule to accept custom schedule ( #6158 )
...
* modify tf_policy to enable EntropyCoeffSchedule to handle list, and avoid negative values under current implementation
* Update custom_metrics_and_callbacks.py
* Update tf_policy.py
2019-11-14 00:45:43 -08:00
Eric Liang
e4565c9cc6
Reduce RLlib log verbosity ( #6154 )
2019-11-13 18:50:45 -08:00
Eric Liang
b924299833
Add large scale regression test for RLlib ( #6093 )
2019-11-13 12:22:55 -08:00
Siyuan (Ryans) Zhuang
f48293f96d
Fix deprecated warning ( #6142 )
2019-11-11 17:49:15 -08:00
Miguel Morales
d17ae5ad7a
Update hyperband-cartpole.yaml ( #6121 )
...
Typo
2019-11-09 19:39:03 -08:00
Eric Liang
1f043daf69
[rllib] Fix and add test for LR annealing config
2019-11-07 12:17:27 -08:00
David Bignell
3f83b2daa9
[rllib] Rollout extensions ( #6065 )
...
* Rollout improvements
* Make info-saving optional, to avoid breaking change.
* Store generating ray version in checkpoint metadata
* Keep the linter happy
* Add small rollout test
* Terse.
* Update test_io.py
2019-11-05 20:34:18 -08:00
Eric Liang
2a0225dd25
[rllib] RLlib chooses wrong neural network model for Atari in 0.7.5 ( #6087 )
2019-11-05 11:36:29 -08:00
Eric Liang
16891e9379
[rllib] Don't use flat weights in non-eager mode ( #6001 )
2019-10-31 15:16:02 -07:00
Eric Liang
a0dcb45dc3
[rllib] Fix APEX priorities returning zero all the time ( #5980 )
...
* fix
* move example tests to end
* level err
* guard against none
* no trace test
* ignore thumbs
* np
* fix multi node
* fix
2019-10-26 13:23:42 -07:00
Eric Liang
34fbc7fb4c
[rllib] Fix leak of TensorFlow assign operations in DQN/DDPG
2019-10-23 00:28:15 -07:00
Eric Liang
f7bda0abad
[rllib] Fix rnn shape with multi-dimensional data ( #5939 )
...
* fix shape
* add test
* Update rnn_sequencing.py
2019-10-22 11:07:26 -07:00
Stefan Otte
d70abcfd70
Fix typo in examples/centralized_critic.py ( #5943 )
...
`opp_ops` should be `opp_obs`.
2019-10-17 08:42:50 -07:00
Matthew A. Wright
0110941de5
rllib: use pytorch's fn to see if gpu is available ( #5890 )
2019-10-12 00:13:00 -07:00
Matthew A. Wright
4aa06918ae
Qmix on gpu and with non-stacked-obs environment state support ( #5751 )
2019-10-08 13:18:07 -07:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test ( #5846 )
2019-10-07 14:25:16 -07:00