Commit graph

356 commits

Author SHA1 Message Date
Sven Mika
3a234ed9e3
[RLlib] Error: "Unknown trainable [some rllib algo name]" (#8525) 2020-05-21 08:59:32 +02:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). (#8514) 2020-05-20 22:29:08 +02:00
mehrdadn
ebf060d484
Make more tests run on Windows (#8446)
* Remove worker Wait() call due to SIGCHLD being ignored

* Port _pid_alive to Windows

* Show PID as well as TID in glog

* Update TensorFlow version for Python 3.8 on Windows

* Handle missing Pillow on Windows

* Work around dm-tree PermissionError on Windows

* Fix some lint errors on Windows with Python 3.8

* Simplify torch requirements

* Quiet git clean

* Handle finalizer issues

* Exit with the signal number

* Get rid of wget

* Fix some Windows compatibility issues with tests

Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Eric Liang
aa7a58e92f
[rllib] Support training intensity for dqn / apex (#8396) 2020-05-20 11:22:30 -07:00
Sven Mika
796a834c48
[RLlib] Attention Net integration into ModelV2 and learning RL example. (#8371) 2020-05-18 17:26:40 +02:00
Dennis van der Hoff
be1f158747
Added Done to MultiAgentExternalEnv. (#8478)
Co-authored-by: devanderhoff <devanderhoff@hotmail.com>
2020-05-17 16:29:47 -07:00
Edward Oakes
16f48078d9
Remove use of ObjectID transport flag (#7699) 2020-05-17 11:29:49 -05:00
Sven Mika
c9435cad43
WIP. (#8456)
Fix multi-GPU histogram metrics for > 0D tensors.
2020-05-15 21:43:27 +02:00
Sven Mika
5f4c196fed
[RLlib] Make PyTorch Model forward pass faster in vf-case. (#8422) 2020-05-14 10:15:50 +02:00
Eric Liang
6bf1dc0888
[rllib] [hotfix] Build broken due to merge conflict: MixInReplay has no attribute buffer 2020-05-13 12:21:04 -07:00
Eric Liang
96f4d82cc3
[rllib] Qmix replay ratio is wrong 2020-05-12 13:07:19 -07:00
Eric Liang
7ce138a6dc
[rllib] Support free_log_std in ModelV2 (#8380)
* update

* factor

* update

* fix test failures

* fix torch net
2020-05-12 10:14:05 -07:00
Sven Mika
57544b1ff9
[RLlib] Examples folder restructuring (Model examples; final part). (#8278)
- This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models.
- It also makes sure, all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (this is often currently not checked) using a --as-test flag in connection with a --stop-reward.
2020-05-12 08:23:10 +02:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala (#8321) 2020-05-11 20:24:43 -07:00
Sven Mika
c7cb2f5416
[RLlib] IMPALA PyTorch GPU fixes (#8397) 2020-05-11 22:03:27 +02:00
A Kharitonov
304e31b7e5
Fixed: contrib/MADDPG MADDPGTFPolicy missing self.config assignment (#8343) 2020-05-08 12:05:06 -07:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356) 2020-05-08 16:31:31 +02:00
Sven Mika
d946f58fd0
LINT fixes. (#8370) 2020-05-08 16:24:20 +02:00
gehring
7f14fb577d
[RLlib] Added TransformerXL and "stabilized for RL" variant, GTrXL (#6470) 2020-05-08 14:10:23 +02:00
Eric Liang
2c599dbf05
[rllib] Port QMIX, MADDPG to new execution API (#8344) 2020-05-07 23:41:10 -07:00
Eric Liang
9f04a65922
[rllib] Add PPO+DQN two trainer multiagent workflow example (#8334) 2020-05-07 23:40:29 -07:00
Sven Mika
d7eaacb5fe
[RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. (#8324) 2020-05-08 08:26:32 +02:00
Sven Mika
5f278c6411
[RLlib] Examples folder restructuring (models) part 1 (#8353) 2020-05-08 08:20:18 +02:00
Eric Liang
30db920787
[rllib] Fix centralized critic example to use right policy (#8341)
* update

* update
2020-05-07 10:47:55 -07:00
Eric Liang
b14cc16616
[rllib] Enable functional execution workflow API by default (#8221) 2020-05-05 12:36:42 -07:00
Eric Liang
ee0eb44a32
Rename async_queue_depth -> num_async (#8207)
* rename

* lint
2020-05-05 01:38:10 -07:00
Eric Liang
f48da50e1c
[rllib] observation function api for multi-agent (#8236) 2020-05-04 22:13:49 -07:00
Sven Mika
6c2b9a4cfa
[RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304)
Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304)
2020-05-04 23:53:38 +02:00
Sven Mika
a00144f746
[RLlib] Fix issue 8135 (DDPG inf actions when using [-inf,inf] action space). (#8302) 2020-05-04 22:27:30 +02:00
Sven Mika
b95e28faea
[RLlib] APEX_DDPG (PyTorch) test case and docs. (#8288)
APEX_DDPG (PyTorch) test case and docs.
2020-05-04 09:36:27 +02:00
Sven Mika
166bb5d690
[RLlib] IMPALA PyTorch (#8287)
This PR adds an IMPALA PyTorch implementation.

- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
2020-05-03 13:44:25 +02:00
Sven Mika
76e1a4df9e
Fix TD3 torch via GaussianNoise torch bug. (#8276) 2020-05-02 08:12:21 +02:00
Sven Mika
42991d723f
[RLlib] rllib/examples folder restructuring (#8250)
Cleans up of the rllib/examples folder by moving all example Envs into rllibexamples/env (so they can be used by other scripts and tests as well).
2020-05-01 22:59:34 +02:00
Eric Liang
2a0ad0b8ce
[rllib] [hotfix] Remove assert that trips on pytorch multiagent (#8241) 2020-05-01 06:32:54 +02:00
Sven Mika
c593fb09b7
[RLlib] Remove all f-strings to keep py3.5 compatibility. 2020-04-30 11:10:16 -07:00
Sven Mika
eea75ac623
[RLlib] Beta distribution. (#8229) 2020-04-30 11:09:33 -07:00
Sven Mika
b23b6addfc
[RLlib] Stabilize Pendulum-v0 regression test cases. (#8232)
Stabilize Pendulum regression test cases.
2020-04-30 15:48:11 +02:00
Eric Liang
baadbdf8d4
[rllib] Execute PPO using training workflow (#8206)
* wip

* add kl

* kl

* works now

* doc update

* reorg

* add ddppo

* add stats

* fix fetch

* comment

* fix learner stat regression

* test fixes

* fix test
2020-04-30 01:18:09 -07:00
Eric Liang
ae54e0dc0a
[rllib] Copy plasma memory before adding data to replay buffer 2020-04-29 14:17:54 -07:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. (#8146)
Deprecate all Model(v1) usage.
2020-04-29 12:12:59 +02:00
Sven Mika
eb91619175
Fix release 0.8.5 tests for PPO torch Breakout. (#8226) 2020-04-29 10:36:41 +02:00
Sven Mika
1775e89f26
[RLlib] Remove TupleActions and support arbitrarily nested action spaces. (#8143)
Deprecate TupleActions and support arbitrarily nested action spaces.
Closes issue #8143.
2020-04-28 14:59:16 +02:00
Sven Mika
4e713152e9
[RLlib] Fix for issue https://github.com/ray-project/ray/issues/8191 (#8200)
Fix attribute error when missing exploration in Policy.
Issue #8191
2020-04-27 23:19:26 +02:00
Sven Mika
7ec2223c84
[RLlib] DDPG PyTorch actor-model was missing sigmoid layer (#8188)
Fix DDPG PyTorch (missing sigmoid layer (to squash action outputs) after deterministic action outputs).
2020-04-26 23:08:13 +02:00
Tomasz Wrona
b508166419
Copy initial state of an RNN to a CPU before converting it to a NumPy array (#8097) 2020-04-25 18:49:09 -07:00
Eric Liang
2298f6fb40
[rllib] Port DQN/Ape-X to training workflow api (#8077) 2020-04-23 12:39:19 -07:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. (#8120)
- Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00
Sven Mika
e9ee5c4e5f
[RLlib] Nested action space PR (minimally invasive; torch only + test). (#8101)
- Add TorchMultiActionDistribution class.
- Add framework-agnostic test cases for TorchMultiActionDistribution.
2020-04-23 09:09:22 +02:00
Sven Mika
d15609ba2a
[RLlib] PyTorch version of ARS (Augmented Random Search). (#8106)
This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.
2020-04-21 09:47:52 +02:00
Eric Liang
17e3c545d9
[rllib] Fix truncate episodes mode in central critic example (#8073) 2020-04-20 12:58:01 -07:00