Sven Mika
3a234ed9e3
[RLlib] Error: "Unknown trainable [some rllib algo name]" ( #8525 )
2020-05-21 08:59:32 +02:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). ( #8514 )
2020-05-20 22:29:08 +02:00
mehrdadn
ebf060d484
Make more tests run on Windows ( #8446 )
...
* Remove worker Wait() call due to SIGCHLD being ignored
* Port _pid_alive to Windows
* Show PID as well as TID in glog
* Update TensorFlow version for Python 3.8 on Windows
* Handle missing Pillow on Windows
* Work around dm-tree PermissionError on Windows
* Fix some lint errors on Windows with Python 3.8
* Simplify torch requirements
* Quiet git clean
* Handle finalizer issues
* Exit with the signal number
* Get rid of wget
* Fix some Windows compatibility issues with tests
Co-authored-by: Mehrdad <noreply@github.com>
2020-05-20 12:25:04 -07:00
Eric Liang
aa7a58e92f
[rllib] Support training intensity for dqn / apex ( #8396 )
2020-05-20 11:22:30 -07:00
Sven Mika
796a834c48
[RLlib] Attention Net integration into ModelV2 and learning RL example. ( #8371 )
2020-05-18 17:26:40 +02:00
Dennis van der Hoff
be1f158747
Added Done to MultiAgentExternalEnv. ( #8478 )
...
Co-authored-by: devanderhoff <devanderhoff@hotmail.com>
2020-05-17 16:29:47 -07:00
Edward Oakes
16f48078d9
Remove use of ObjectID transport flag ( #7699 )
2020-05-17 11:29:49 -05:00
Sven Mika
c9435cad43
WIP. ( #8456 )
...
Fix multi-GPU histogram metrics for > 0D tensors.
2020-05-15 21:43:27 +02:00
Sven Mika
5f4c196fed
[RLlib] Make PyTorch Model forward pass faster in vf-case. ( #8422 )
2020-05-14 10:15:50 +02:00
Eric Liang
6bf1dc0888
[rllib] [hotfix] Build broken due to merge conflict: MixInReplay has no attribute buffer
2020-05-13 12:21:04 -07:00
Eric Liang
96f4d82cc3
[rllib] Qmix replay ratio is wrong
2020-05-12 13:07:19 -07:00
Eric Liang
7ce138a6dc
[rllib] Support free_log_std in ModelV2 ( #8380 )
...
* update
* factor
* update
* fix test failures
* fix torch net
2020-05-12 10:14:05 -07:00
Sven Mika
57544b1ff9
[RLlib] Examples folder restructuring (Model examples; final part). ( #8278 )
...
- This PR completes any previously missing PyTorch Model counterparts to TFModels in examples/models.
- It also makes sure that all example scripts in the rllib/examples folder are tested for both frameworks and learn the given task (which is currently often not checked), using an --as-test flag in connection with --stop-reward.
2020-05-12 08:23:10 +02:00
Eric Liang
9d012626e5
[rllib] Distributed exec workflow for impala ( #8321 )
2020-05-11 20:24:43 -07:00
Sven Mika
c7cb2f5416
[RLlib] IMPALA PyTorch GPU fixes ( #8397 )
2020-05-11 22:03:27 +02:00
A Kharitonov
304e31b7e5
Fixed: contrib/MADDPG MADDPGTFPolicy missing self.config assignment ( #8343 )
2020-05-08 12:05:06 -07:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. ( #8356 )
2020-05-08 16:31:31 +02:00
Sven Mika
d946f58fd0
LINT fixes. ( #8370 )
2020-05-08 16:24:20 +02:00
gehring
7f14fb577d
[RLlib] Added TransformerXL and "stabilized for RL" variant, GTrXL ( #6470 )
2020-05-08 14:10:23 +02:00
Eric Liang
2c599dbf05
[rllib] Port QMIX, MADDPG to new execution API ( #8344 )
2020-05-07 23:41:10 -07:00
Eric Liang
9f04a65922
[rllib] Add PPO+DQN two trainer multiagent workflow example ( #8334 )
2020-05-07 23:40:29 -07:00
Sven Mika
d7eaacb5fe
[RLlib] Issue 8319 DDPG (MA or num_envs_per_worker > 1) broken. ( #8324 )
2020-05-08 08:26:32 +02:00
Sven Mika
5f278c6411
[RLlib] Examples folder restructuring (models) part 1 ( #8353 )
2020-05-08 08:20:18 +02:00
Eric Liang
30db920787
[rllib] Fix centralized critic example to use right policy ( #8341 )
...
* update
* update
2020-05-07 10:47:55 -07:00
Eric Liang
b14cc16616
[rllib] Enable functional execution workflow API by default ( #8221 )
2020-05-05 12:36:42 -07:00
Eric Liang
ee0eb44a32
Rename async_queue_depth -> num_async ( #8207 )
...
* rename
* lint
2020-05-05 01:38:10 -07:00
Eric Liang
f48da50e1c
[rllib] observation function api for multi-agent ( #8236 )
2020-05-04 22:13:49 -07:00
Sven Mika
6c2b9a4cfa
[RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). ( #8304 )
2020-05-04 23:53:38 +02:00
Sven Mika
a00144f746
[RLlib] Fix issue 8135 (DDPG inf actions when using [-inf,inf] action space). ( #8302 )
2020-05-04 22:27:30 +02:00
Sven Mika
b95e28faea
[RLlib] APEX_DDPG (PyTorch) test case and docs. ( #8288 )
2020-05-04 09:36:27 +02:00
Sven Mika
166bb5d690
[RLlib] IMPALA PyTorch ( #8287 )
...
This PR adds an IMPALA PyTorch implementation.
- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
2020-05-03 13:44:25 +02:00
Sven Mika
76e1a4df9e
Fix TD3 torch via GaussianNoise torch bug. ( #8276 )
2020-05-02 08:12:21 +02:00
Sven Mika
42991d723f
[RLlib] rllib/examples folder restructuring ( #8250 )
...
Cleans up the rllib/examples folder by moving all example Envs into rllib/examples/env (so they can be used by other scripts and tests as well).
2020-05-01 22:59:34 +02:00
Eric Liang
2a0ad0b8ce
[rllib] [hotfix] Remove assert that trips on pytorch multiagent ( #8241 )
2020-05-01 06:32:54 +02:00
Sven Mika
c593fb09b7
[RLlib] Remove all f-strings to keep py3.5 compatibility.
2020-04-30 11:10:16 -07:00
Sven Mika
eea75ac623
[RLlib] Beta distribution. ( #8229 )
2020-04-30 11:09:33 -07:00
Sven Mika
b23b6addfc
[RLlib] Stabilize Pendulum-v0 regression test cases. ( #8232 )
...
Stabilize Pendulum regression test cases.
2020-04-30 15:48:11 +02:00
Eric Liang
baadbdf8d4
[rllib] Execute PPO using training workflow ( #8206 )
...
* wip
* add kl
* kl
* works now
* doc update
* reorg
* add ddppo
* add stats
* fix fetch
* comment
* fix learner stat regression
* test fixes
* fix test
2020-04-30 01:18:09 -07:00
Eric Liang
ae54e0dc0a
[rllib] Copy plasma memory before adding data to replay buffer
2020-04-29 14:17:54 -07:00
Sven Mika
bf25aee392
[RLlib] Deprecate all Model(v1) usage. ( #8146 )
2020-04-29 12:12:59 +02:00
Sven Mika
eb91619175
Fix release 0.8.5 tests for PPO torch Breakout. ( #8226 )
2020-04-29 10:36:41 +02:00
Sven Mika
1775e89f26
[RLlib] Remove TupleActions and support arbitrarily nested action spaces. ( #8143 )
...
Deprecate TupleActions and support arbitrarily nested action spaces.
Closes issue #8143.
2020-04-28 14:59:16 +02:00
Sven Mika
4e713152e9
[RLlib] Fix for issue https://github.com/ray-project/ray/issues/8191 ( #8200 )
...
Fix attribute error when missing exploration in Policy.
Issue #8191
2020-04-27 23:19:26 +02:00
Sven Mika
7ec2223c84
[RLlib] DDPG PyTorch actor-model was missing sigmoid layer ( #8188 )
...
Fix DDPG PyTorch: the actor model was missing a sigmoid layer (to squash action outputs) after the deterministic action outputs.
2020-04-26 23:08:13 +02:00
Tomasz Wrona
b508166419
Copy initial state of an RNN to a CPU before converting it to a NumPy array ( #8097 )
2020-04-25 18:49:09 -07:00
Eric Liang
2298f6fb40
[rllib] Port DQN/Ape-X to training workflow api ( #8077 )
2020-04-23 12:39:19 -07:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. ( #8120 )
...
- Translate all vtrace functionality to torch and add torch to the framework_iterator loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00
Sven Mika
e9ee5c4e5f
[RLlib] Nested action space PR (minimally invasive; torch only + test). ( #8101 )
...
- Add TorchMultiActionDistribution class.
- Add framework-agnostic test cases for TorchMultiActionDistribution.
2020-04-23 09:09:22 +02:00
Sven Mika
d15609ba2a
[RLlib] PyTorch version of ARS (Augmented Random Search). ( #8106 )
...
This PR implements a PyTorch version of RLlib's ARS algorithm using RLlib's functional algo builder API. It also adds a regression test for ARS (torch) on CartPole.
2020-04-21 09:47:52 +02:00
Eric Liang
17e3c545d9
[rllib] Fix truncate episodes mode in central critic example ( #8073 )
2020-04-20 12:58:01 -07:00