Commit graph

1117 commits

Author SHA1 Message Date
Eric Liang
399424c418
[rllib] Fix broken check in eval mode for IMPALA #7217 2020-02-19 11:54:30 -08:00
Simon Mo
e8941b1b79
Revert "Revert "Removing Pyarrow dependency (#7146)" (#7209) (#7214) 2020-02-19 10:08:52 -08:00
Eric Liang
0aa9373d62
Revert "Removing Pyarrow dependency (#7146)" (#7209)
This reverts commit 2116fd3bca.
2020-02-18 14:12:06 -08:00
Eric Liang
5df801605e
Add ray.util package and move libraries from experimental (#7100) 2020-02-18 13:43:19 -08:00
ijrsvt
2116fd3bca
Removing Pyarrow dependency (#7146) 2020-02-17 18:00:13 -08:00
mehrdadn
3bd82d0bcd
Fix various issues/warnings that come up on Jenkins (#7147)
* Avoid warning about swap being unlimited

Currently we get the following message on Jenkins:
"Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap."

Since we're not limiting swap anyway, we might as well avoid trying to.
https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details

* Fix escaping in re.search()

* Fix escaping in _noisy_layer()

* Raise a more descriptive error when dashboard data isn't found

* Don't error on dashboard files not being found when webui isn't required

* Change dashboard error to a warning instead
2020-02-17 16:08:55 -08:00
Eric Liang
42aea966ff
[rllib] Convert torch state arrays to tensors during compute actions (#7162)
* convert to tensor

* normalize fix
2020-02-17 10:26:58 -08:00
Eric Liang
b6233dff3c
[rllib] Fix bad sample count assert 2020-02-15 17:22:23 -08:00
Sven Mika
2e60f0d4d8
[RLlib] Move all jenkins RLlib-tests into bazel (rllib/BUILD). (#7178)
* commit

* comment
2020-02-15 14:50:44 -08:00
Adrian O'Grady
fe6ce714a0
[rllib] - TaskPool.completed_prefetch() no longer returns stale object ids after an error (#7139) 2020-02-13 22:30:44 -08:00
Sven Mika
5518a738b3
[RLlib] Fix erroneous use of LinearSchedule (in DDPG's exploration annealing). (#7125)
* Fix erroneous use of LinearSchedule (in DDPG's exploration annealing).
Erase schedules_obsoleted.py.

* Trigger re-test.

* Re-test.
2020-02-12 23:46:49 -08:00
Sven Mika
f41a9b9813
[RLlib] Fix KL method of MultiCategorial tf distribution (issue #7009). (#7119)
* Fix KL method of MultiCategorial tf distribution.

* Fix KL method of MultiCategorial tf distribution.

* Merge AsyncReplayOptimizer fixes into this branch.
2020-02-12 12:46:15 -08:00
Sven Mika
2a0e4d94aa
[RLlib] Fix AsyncReplayOptimizer bug where it swallows all good worker tasks … (#7111) 2020-02-11 12:51:44 -08:00
Eric Liang
026f6884b5
[rllib] Add Decentralized DDPPO trainer and documentation (#7088) 2020-02-10 15:28:27 -08:00
Sven Mika
6e1c3ea824
[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974) 2020-02-10 15:22:07 -08:00
Sven Mika
5ac5ac9560
[RLlib] Fix broken example: tf-eager with custom-RNN (#6732). (#7021)
* WIP.

* Fix float32 conversion in OneHot preprocessor (would cause float64 in eager, then NN-matmul-failure).
Add proper seq-len + state-in construction in eager_tf_policy.py::_compute_gradients().

* LINT.

* eager_tf_policy.py: Only set samples["seq_lens"] if RNN. Otherwise, eager-tracing will throw flattened-dict key-mismatch error.

* Move issue code to examples folder.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-02-06 09:44:08 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation (#6981)
* eval api

* update

* sync eval filters

* sync fix

* docs

* update

* docs

* update

* link

* nit

* doc updates

* format
2020-02-01 22:12:12 -08:00
Sven Mika
b9ad79d66f
Add cartpole PPO torch to regression (besides tf). (#7005) 2020-02-01 17:41:38 -08:00
roireshef
3c60caa448
[rllib] implemented compute_advantages without gae (#6941) 2020-01-31 22:25:45 -08:00
Jaroslaw Rzepecki
67319bc887
[RLlib] Update MARWIL to use tf policy template (#6975)
* update MARWIL to use tf policy template

* formatting fixes
2020-01-31 12:57:52 -08:00
Sven Mika
211a9be9a5
[RLlib] Bug fix: PR anneals beta parameter beyond final given value. (#6973)
* Bug fix: PR anneals beta parameter beyond final given value.

* LINT.

* Trigger travis re-test.
2020-01-31 09:55:03 -08:00
Sven Mika
2ccf08ad10
[RLlib] Bug fix: DQN goes into negative epsilon values after reaching explora… (#6971)
* Bug fix: DQN goes into negative epsilon values after reaching exploration percentage.

* Add `epsilon_initial_eps` to SAC to pass test_nested_spaces.py.

* Add `exploration_initial_eps` to QMIX default config.
2020-01-31 09:54:12 -08:00
roireshef
dc7a555260
[rllib] Feature/histograms in tensorboard (#6942)
* Added histogram functionality to custom metrics infrastructure (another tab in tensorboard)

* updated example to include histogram metric

* added histograms to TBXLogger

* add episode rewards

* lint

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-01-30 22:02:53 -08:00
Sven Mika
136ada5fb9
[RLlib] Experiment with py_func as a means to further unify tf and torch (Schedule classes). (#6951) 2020-01-30 11:27:57 -08:00
Sven Mika
4c97348cb6 [RLlib] Schedule-classes multi-framework support. (#6926) 2020-01-28 11:07:55 -08:00
Eric Liang
e659699ca9
[tune] Fix directory naming regression (#6839) 2020-01-27 15:53:40 -08:00
Eric Liang
2fb53396ad [rllib] [experimental] Decentralized Distributed PPO for torch (DD-PPO) (#6918) 2020-01-25 22:36:43 -08:00
Sven Mika
446cbdf2e0 [RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890)
* Add `RandomEnv` example to examples folder.
Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash).

* LINT.

* Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space.

* LINT

* Change warning message for Model: shared_vf=False, LSTM=True cases.

* Bug fix.

* Add examples/random_env.py test to Jenkins.
2020-01-24 10:29:35 -08:00
AnanthHari
aa2a0cb6da Fixes empty state argument in compute_single_action method (#6894)
* Fixes empty `state` parameter in compute_single_action method

* Fixed style
2020-01-23 00:42:52 -08:00
Sven Mika
ae9a3a2237 [RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00
Sven Mika
c957ed58ed [RLlib] Implement PPO torch version. (#6826) 2020-01-20 23:06:50 -08:00
Eric Liang
a229bdf272
[rllib] Deprecate custom preprocessors (#6833)
* deprecation warnings

* add log warn

* fix test
2020-01-18 23:30:09 -08:00
Sven Mika
7659cae3ba [RLlib] Add PG torch regression test (#6828)
* Add PG torch regression test to tuned_examples/regression_tests dir.

* Rename cartpole-pg.yaml into cartpole-pg-tf.yaml

* cartpole-pg-tf.yaml: Change cartpole-pg name of tuned_example to cartpole-pg-tf.
2020-01-18 15:57:12 -08:00
Justin Terry
97bf79917c [RLlib] Update MADDPG example repo to maintained fork (#6831) 2020-01-18 13:08:27 -08:00
Sven Mika
303547f119 [RLlib] Policy-classes cleanup and torch/tf unification. (#6770) 2020-01-17 22:26:28 -08:00
Sven Mika
e6227082bd [RLlib] Add torch flag to train.py (#6807) 2020-01-17 18:48:44 -08:00
Sven Mika
2bcf72e306 DQN distributional model: Replace all legacy tf.contrib imports with tf.keras.layers.xyz or tf.initializers.xyz. (#6772)
- This fixes a test case in test_evaluators.py.
2020-01-13 21:48:16 -08:00
Sven
60d4d5e1aa Remove future imports (#6724)
* Remove all __future__ imports from RLlib.

* Remove (object) again from tf_run_builder.py::TFRunBuilder.

* Fix 2xLINT warnings.

* Fix broken appo_policy import (must be appo_tf_policy)

* Remove future imports from all other ray files (not just RLlib).

* Remove future imports from all other ray files (not just RLlib).

* Remove future import blocks that contain `unicode_literals` as well.
Revert appo_tf_policy.py to appo_policy.py (belongs to another PR).

* Add two empty lines before Schedule class.

* Put back __future__ imports into determine_tests_to_run.py. Fails otherwise on a py2/print related error.
2020-01-09 00:15:48 -08:00
Ujval Misra
20ba7ef647 [tune] Move util to utils package (#6682)
* Move util.py to utils

* Fix import
2020-01-06 18:11:02 -08:00
Robert Nishihara
39a3459886 Remove (object) from class declarations. (#6658) 2020-01-02 17:42:13 -08:00
Sven
f1b56fa5ee PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Robert Nishihara
480206eef8
Remove some Python 2 compatibility code. (#6624) 2019-12-31 17:14:58 -08:00
Michael Luo
1cb335487e SAC for Mujoco Environments (#6642) 2019-12-31 00:16:54 -08:00
Sven
8b16847c02 Get utils ready for better Agent torch support. (#6561) 2019-12-30 12:27:32 -08:00
Eric Liang
7c1e0e5715
Implement wait_local for wait (#6524) 2019-12-28 17:40:49 -08:00
Eric Liang
022954ac09 [rllib] Tuple action dist tensors not reduced properly in eager mode (#6615) 2019-12-28 09:51:09 -08:00
Eric Liang
3af84ada47
Revert "[rllib] remove exists call (#6168)" (#6616)
This reverts commit a68cda0a33.
2019-12-26 22:44:26 -08:00
Zhongxia Yan
98689bd263 Changed foreach_policy to foreach_trainable_policy (#6564)
Changed foreach_policy to foreach_trainable_policy in DQN when disabling exploration. This makes it consistent with the rest of the file
2019-12-26 19:50:48 -08:00
gehring
b40869d0e4 Wrapper for the dm_env interface (#6468) 2019-12-26 13:22:17 -08:00
Michael Luo
548df014ec SAC Performance Fixes (#6295)
* SAC Performance Fixes

* Small Changes

* Update sac_model.py

* fix normalize wrapper

* Update test_eager_support.py

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2019-12-20 10:51:25 -08:00