Commit graph

75 commits

Author SHA1 Message Date
Sven Mika
8204717eed
[RLlib] Issue 9218: PyTorch Policy places Model on GPU even with num_gpus=0 (#9516) 2020-07-17 05:53:25 +02:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). (#9269) 2020-07-14 04:27:49 +02:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. (#8752) 2020-07-11 22:06:35 +02:00
Hao Chen
d49dadf891
Change Python's ObjectID to ObjectRef (#9353) 2020-07-10 17:49:04 +08:00
Sven Mika
5b2a97597b
[RLlib] Retire try_import_tree (should be installed along with other requirements). (#9211)
- Retire try_import_tree.
- Stabilize test_supported_multi_agent.py.
2020-07-02 13:06:34 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). (#9136)
* WIP.

* Fixes.

* LINT.

* WIP.

* WIP.

* Fixes.

* Fixes.

* Fixes.

* Fixes.

* WIP.

* Fixes.

* Test

* Fix.

* Fixes and LINT.

* Fixes and LINT.

* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
5c6d5d4ab1
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970) 2020-06-27 20:50:01 +02:00
Eric Liang
1e0e1a45e6
[rllib] Add type annotations for evaluation/, env/ packages (#9003) 2020-06-19 13:09:05 -07:00
Ian Rodney
2e972c2a77
RLLIB and pylintrc (#8995) 2020-06-17 18:14:25 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. (#8932) 2020-06-16 08:52:20 +02:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length (#8893) 2020-06-12 20:17:27 -07:00
Sven Mika
8d1ccfd0f7
[RLlib] Issue 8889: action clipping bug ppo not learning mujoco (#8898) 2020-06-11 19:17:43 +02:00
Sven Mika
a90cd0fcbb
[RLlib] Unity3d soccer benchmarks (#8834) 2020-06-11 14:29:57 +02:00
mehrdadn
f93bb008bb
Change os.uname()[1] and socket.gethostname() to the portable and faster platform.node_ip() (#8839)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-08 21:29:46 -07:00
Sven Mika
368088be85
[RLlib] Sample batch docs and cleanup. (#8778) 2020-06-04 22:47:32 +02:00
Sven Mika
d8a081a185
[RLlib] Unity3D integration (n Unity3D clients vs learning server). (#8590) 2020-05-30 22:48:34 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520) 2020-05-27 16:19:13 +02:00
Sven Mika
6d196197bc
[RLlib] utils/spaces ... (#8608) 2020-05-27 10:21:30 +02:00
Eric Liang
9a83908c46
[rllib] Deprecate policy optimizers (#8345) 2020-05-21 10:16:18 -07:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). (#8514) 2020-05-20 22:29:08 +02:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Eric Liang
f48da50e1c
[rllib] observation function api for multi-agent (#8236) 2020-05-04 22:13:49 -07:00
Sven Mika
b95e28faea
[RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288)
APEX_DDPG (PyTorch) test case and docs.
2020-05-04 09:36:27 +02:00
Eric Liang
2a0ad0b8ce
[rllib] [hotfix] Remove assert that trips on pytorch multiagent (#8241) 2020-05-01 06:32:54 +02:00
Eric Liang
baadbdf8d4
[rllib] Execute PPO using training workflow (#8206)
* wip

* add kl

* kl

* works now

* doc update

* reorg

* add ddppo

* add stats

* fix fetch

* comment

* fix learner stat regression

* test fixes

* fix test
2020-04-30 01:18:09 -07:00
Sven Mika
1775e89f26
[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143)
Deprecate TupleActions and support arbitrarily nested action spaces.
Closes issue #8143.
2020-04-28 14:59:16 +02:00
Sven Mika
4e713152e9
[RLlib] Fix for issue https://github.com/ray-project/ray/issues/8191 (#8200)
Fix attribute error when missing exploration in Policy.
Issue #8191
2020-04-27 23:19:26 +02:00
Sven Mika
e9ee5c4e5f
[RLlib] Nested action space PR (minimally invasive; torch only + test). (#8101)
- Add TorchMultiActionDistribution class.
- Add framework-agnostic test cases for TorchMultiActionDistribution.
2020-04-23 09:09:22 +02:00
roireshef
dbcad35022
[RLlib] Added DefaultCallbacks which replaces old callbacks dict interface (#6972) 2020-04-16 16:06:42 -07:00
Xianyang Liu
e1d3f7eba6
[rllib]Add config for rllib to support set python environments (#8026)
* support set extra python environments

* wrap value with str

* Apply suggestions from code review

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* addresses comments

* fix lint errors

* remove unrelated changes due to format.sh

* remove unrelated changes due to format.sh

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-16 01:13:45 -07:00
Sven Mika
1b31c11806
[RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934) 2020-04-09 14:04:21 -07:00
Sven Mika
22ccc43670
[RLlib] DQN torch version. (#7597)
* Fix.

* Rollback.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* Fix.

* Fix.

* Fix.

* Fix.

* WIP.

* WIP.

* Fix.

* Test case fixes.

* Test case fixes and LINT.

* Test case fixes and LINT.

* Rollback.

* WIP.

* WIP.

* Test case fixes.

* Fix.

* Fix.

* Fix.

* Add regression test for DQN w/ param noise.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Fixes and LINT.

* Comment

* Regression test case.

* WIP.

* WIP.

* LINT.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does currently not support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* WIP.

* LINT.

* Fixes and LINT.

* LINT and fixes.

* LINT.

* Move action_dist back into torch extra_action_out_fn and LINT.

* Working SimpleQ learning cartpole on both torch AND tf.

* Working Rainbow learning cartpole on tf.

* Working Rainbow learning cartpole on tf.

* WIP.

* LINT.

* LINT.

* Update docs and add torch to APEX test.

* LINT.

* Fix.

* LINT.

* Fix.

* Fix.

* Fix and docstrings.

* Fix broken RLlib tests in master.

* Split BAZEL learning tests into cartpole and pendulum (reached the 60min barrier).

* Fix error_outputs option in BAZEL for RLlib regression tests.

* Fix.

* Tune param-noise tests.

* LINT.

* Fix.

* Fix.

* test

* test

* test

* Fix.

* Fix.

* WIP.

* WIP.

* WIP.

* WIP.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-06 11:56:16 -07:00
Eric Liang
630b3b1752
[rllib] set daemon status for PolicyServerInput thread (#7862) 2020-04-04 16:08:51 -07:00
Sven Mika
5537fe13b0
[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. (#7814) 2020-04-03 10:44:25 -07:00
Sven Mika
e153e3179f
[RLlib] Exploration API: Policy changes needed for forward pass noisifications. (#7798)
* Rollback.

* WIP.

* WIP.

* LINT.

* WIP.

* Fix.

* Fix.

* Fix.

* LINT.

* Fix (SAC does currently not support eager).

* Fix.

* WIP.

* LINT.

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/evaluation/sampler.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/utils/exploration/exploration.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* WIP.

* WIP.

* Fix.

* LINT.

* LINT.

* Fix and LINT.

* WIP.

* WIP.

* WIP.

* WIP.

* Fix.

* LINT.

* Fix.

* Fix and LINT.

* Update rllib/utils/exploration/exploration.py

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Update rllib/policy/dynamic_tf_policy.py

Co-Authored-By: Eric Liang <ekhliang@gmail.com>

* Fixes.

* LINT.

* WIP.

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-04-01 00:43:21 -07:00
Sven Mika
e356e97eb2
[RLlib] Assert correct policy class being used in Worker. (#7769) 2020-03-30 14:03:29 -07:00
Sven Mika
e4bd5db4d8
[RLlib] Minimal ParamNoise PR. (#7772) 2020-03-28 16:16:30 -07:00
Sven Mika
1138f2ebed
[RLlib] Issue 7046 cannot restore keras model from h5 file. (#7482) 2020-03-23 12:19:30 -07:00
Robert Nishihara
ee8c9ff732
Remove six and cloudpickle from setup.py. (#7694) 2020-03-23 11:42:05 -07:00
Eric Liang
9392cdbf74
[rllib] Add high-performance external application connector (#7641) 2020-03-20 12:43:57 -07:00
Eric Liang
797e6cfc2a
[rllib][tune] fix some nans (#7611) 2020-03-16 11:19:58 -07:00
Eric Liang
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
* bulk rename

* deprecation warn

* update doc

* update fig

* line length

* rename

* make pytest comptaible

* fix test

* fi sys

* rename

* wip

* fix more

* lint

* update svg

* comments

* lint

* fix use of batch steps
2020-03-14 12:05:04 -07:00
Eric Liang
c3a8ba399f
[rllib] Enable distributed exec api for A2C, A3C, PG by default (#7580) 2020-03-13 18:48:41 -07:00
Sven Mika
f165766813
[RLlib] Bug: If trainer config horizon is provided, should try to increase env steps to that value. (#7531) 2020-03-12 11:03:37 -07:00
Eric Liang
be48e1964b
[rllib] Fix per-worker exploration in Ape-X; make more kwargs required for future safety (#7504)
* fix sched

* lintc

* lint

* fix

* add unit test

* fix

* format

* fix test

* fix test
2020-03-10 11:14:14 -07:00
Eric Liang
a644060daa
[rllib] First pass at pipeline implementation of DQN (#7433)
* wip iters

* add test

* speed up

* update docs

* document it

* support serial sampling

* add test

* spacing

* annotate it

* update

* rename to pipeline

* comment

* iter2 wip

* update

* update

* context test

* update

* fix

* fix

* a3c pipeline

* doc

* update

* move timer

* comment

* add piepline test

* fix

* clean up

* document

* iter s

* wip dqn

* wip

* wip

* metrics

* metrics rename

* metrics ctx

* wip

* constants

* add todo

* suppport .union

* wip

* support union

* remove prints

* add todo

* remove auto timer

* fix up

* fix pipeline test

* typing

* fix breakage

* remove bad assert

* wip

* fix multiagent example

* fixapply

* update a3c

* remove a2c pl

* 0 workers

* wip

* wip

* share metrics

* wip

* wip

* doc

* fix weight sync and global var updates

* mode

* fix

* fix

* doc

* fix
2020-03-07 14:47:58 -08:00
Eric Liang
fddeb6809c
[RLlib] Issue 7401: In eval mode (if evaluation_episodes > 0), agent hangs if Env does not terminate. (#7448)
* Fix.

* Rollback.

* Fix issue 7421.

* Fix.
2020-03-04 12:58:34 -08:00
Sven Mika
357232d124
[Core/RLlib] Move log_once from rllib to ray.util. (#7273)
* Move log_once from rllib to tune.

* Move log_once from rllib to tune.

* LINT.

* Move to ray.util.debug.
2020-02-27 10:40:44 -08:00
Eric Liang
46af992efd
[rllib] [experimental] custom RL training pipelines (PG_pl, A2C_pl) (#7213) 2020-02-19 16:07:37 -08:00
Sven Mika
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155) 2020-02-19 12:18:45 -08:00