Sven Mika
5c7b35d694
[RLlib] Issue 10833 TorchPolicy GPU. ( #10834 )
2020-09-17 09:04:46 +02:00
desktable
799318d7d7
[RLlib] Add type annotations for agents/dqn ( #10626 )
2020-09-09 18:55:26 +02:00
Sven Mika
28ab797cf5
[RLlib] Deprecate old classes, methods, functions, config keys (in prep for RLlib 1.0). ( #10544 )
2020-09-06 10:58:00 +02:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. ( #10420 )
2020-09-02 14:03:01 +02:00
Michael Luo
4e9888ce2f
[RLlib] Dreamer ( #10172 )
2020-08-26 13:24:05 +02:00
Sven Mika
e968b52cb7
[RLlib] Trajectory view API - 03 Fast LSTM + prev actions/rewards ( #9950 )
2020-08-21 12:35:16 +02:00
Sven Mika
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. ( #10115 )
2020-08-20 17:05:57 +02:00
Sven Mika
2cbe29a7fa
[RLlib] Curiosity minor fixes, do-overs, and testing. ( #10143 )
2020-08-19 17:49:50 +02:00
Eric Liang
ca133e2699
[rllib] Remove extra model config kwargs passed incorrectly for Torch models ( #10055 )
2020-08-17 11:12:20 -07:00
Olli Huotari
9ff599cbb8
torch policy now includes model.metrics ( #10121 )
...
* torch policy now includes model.metrics
* Fixed tests to work with custom metrics
* Forgot to run format.sh
2020-08-15 10:43:11 -07:00
Sven Mika
aeb5be7733
[RLlib] Trajectory View API (part 2.5): Actual implementations (not used yet) of a SampleCollector. ( #10112 )
2020-08-15 15:09:00 +02:00
Sven Mika
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. ( #10114 )
2020-08-15 13:24:22 +02:00
Tanay Wakhare
1826b29757
[RLlib] Curiosity (intrinsic motivation) Exploration module. ( #9912 )
2020-08-13 20:14:16 +02:00
yncxcw
32cd94b750
[Core] Do not convert gpu id to int ( #9744 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-11 12:09:46 -07:00
Barak Michener
8e76796fd0
ci: Redo format.sh --all script & backfill lint fixes ( #9956 )
2020-08-07 16:49:49 -07:00
Eric Liang
668f555755
[rllib] Clean up outdated docs #9915
2020-08-06 18:29:04 -07:00
Sven Mika
57690a3a9f
[RLlib] Trajectory view API - 02 actual API scaffold ( #9753 )
2020-08-06 10:54:20 +02:00
Sven Mika
9b90f7db67
[RLlib] Missing type annotations policy templates. ( #9846 )
2020-08-06 05:33:24 +02:00
Sven Mika
b0b0463161
[RLlib] Trajectory View API (preparatory cleanup and enhancements). ( #9678 )
2020-07-29 21:15:09 +02:00
Eric Liang
590943a499
[rllib] Type annotations for model classes ( #9646 )
2020-07-24 12:01:46 -07:00
Eric Liang
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates ( #9477 )
...
* update
* add test
* lint
* fix super call
* speed es test up
2020-07-23 12:50:25 -07:00
Raphael Avalos
440c9c42be
[RLlib] Fix combination of lockstep and multiple agents controlled by the same policy. ( #9521 )
...
* Change aggregation when lockstep is activated.
Modification of MultiAgentBatch.timeslices to support the combination of lockstep and multiple agents controlled by the same policy.
fix ray-project/ray#9295
* Line too long.
2020-07-19 23:03:12 -07:00
Sven Mika
8204717eed
[RLlib] Issue 9218: PyTorch Policy places Model on GPU even with num_gpus=0 ( #9516 )
2020-07-17 05:53:25 +02:00
Sven Mika
935d8308fb
[RLlib] Issue #9437 (PyTorch converts to CPU tensor, even if on GPU). ( #9497 )
2020-07-16 14:55:50 +02:00
Sven Mika
03ab86567f
[RLlib] Layout of Trajectory View API (new class: Trajectory; not used yet). ( #9269 )
2020-07-14 04:27:49 +02:00
Sven Mika
fcdf410ae1
[RLlib] Tf2.x native. ( #8752 )
2020-07-11 22:06:35 +02:00
Sven Mika
01125b8fcf
[RLlib] DQN rainbow eager-mode (keras style NoisyLayer) (preparation for native tf2.x support). ( #9304 )
2020-07-09 10:44:10 +02:00
Sven Mika
4da0e542d5
[RLlib] DDPG and SAC eager support (preparation for tf2.x) ( #9204 )
2020-07-08 16:12:20 +02:00
Sven Mika
f43d934817
[RLlib] Type annotations for policy. ( #9248 )
2020-07-05 13:09:51 +02:00
Sven Mika
5b2a97597b
[RLlib] Retire try_import_tree (should be installed along with other requirements). ( #9211 )
...
- Retire try_import_tree.
- Stabilize test_supported_multi_agent.py.
2020-07-02 13:06:34 +02:00
Sven Mika
b4c0b942fe
[RLlib] Remove requirement for dataclasses in rllib (not supported in py3.5) ( #9237 )
2020-07-01 17:31:44 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()). ( #9136 )
...
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
0d37103f84
[RLlib] Prototype: Model Trajectory View API, part 0 ( #9171 )
2020-06-30 05:33:19 +02:00
Sven Mika
5c6d5d4ab1
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). ( #8970 )
2020-06-27 20:50:01 +02:00
Sven Mika
af1203b9df
[RLlib] Issue 8507 (PyTorch does not support custom loss). ( #9142 )
2020-06-26 09:52:22 +02:00
Sven Mika
4fd8977eaf
[RLlib] Minor cleanup in preparation to tf2.x support. ( #9130 )
...
* WIP.
* Fixes.
* LINT.
* Fixes.
* Fixes and LINT.
* WIP.
2020-06-25 19:01:32 +02:00
Eric Liang
1e0e1a45e6
[rllib] Add type annotations for evaluation/, env/ packages ( #9003 )
2020-06-19 13:09:05 -07:00
Sven Mika
14405b90d5
[RLlib] Prototype of a DynaTrainer (for env dynamics learning in upcoming MBMPO algo). ( #8860 )
2020-06-16 09:01:20 +02:00
Sven Mika
7008902cff
[RLlib] Minor rllib.utils cleanup. ( #8932 )
2020-06-16 08:52:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action() for all agents. ( #8903 )
2020-06-13 17:51:50 +02:00
Eric Liang
34bae27ac7
[rllib] Flexible multi-agent replay modes and replay_sequence_length ( #8893 )
2020-06-12 20:17:27 -07:00
Sven Mika
25c0974543
[RLlib] Issue 8412 (Adam vars not stored in ModelV2). ( #8480 )
2020-06-05 21:07:02 +02:00
Sven Mika
c74dc58f8b
[RLlib] Fix use_lstm flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. ( #8734 )
2020-06-05 15:40:30 +02:00
Sven Mika
368088be85
[RLlib] Sample batch docs and cleanup. ( #8778 )
2020-06-04 22:47:32 +02:00
Victor Le
aee01133cd
Fix dict/tuple hybrid action space for tensorflow eager execution ( #8781 )
2020-06-04 13:28:46 -07:00
Sven Mika
d8a081a185
[RLlib] Unity3D integration (n Unity3D clients vs learning server). ( #8590 )
2020-05-30 22:48:34 +02:00
Sven Mika
2746fc0476
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... ( #8520 )
2020-05-27 16:19:13 +02:00
Sven Mika
6d196197bc
[RLlib] utils/spaces ... ( #8608 )
2020-05-27 10:21:30 +02:00
Sven Mika
0422e9c5a8
[RLlib] Add 2 Transformer learning test cases on StatelessCartPole (PPO and IMPALA). ( #8624 )
2020-05-27 10:19:47 +02:00
Sven Mika
d76578700d
[RLlib] Policy.compute_single_action() broken for nested actions (Issue 8411). ( #8514 )
2020-05-20 22:29:08 +02:00