Sven Mika
|
8a844ff840
|
[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (b/c no GPUs assigned by ray). (#17444)
|
2021-08-02 17:29:59 -04:00 |
|
Sven Mika
|
5a313ba3d6
|
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
2021-07-20 14:58:13 -04:00 |
|
Sven Mika
|
18d173b172
|
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)
|
2021-07-19 13:16:03 -04:00 |
|
Sven Mika
|
78b776942f
|
[RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538)
|
2021-04-27 17:19:52 +02:00 |
|
Sven Mika
|
c3a15ecc0f
|
[RLlib] Issue #13802: Enhance metrics for multiagent->count_steps_by=agent_steps setting. (#14033)
|
2021-03-18 20:27:41 +01:00 |
|
Sven Mika
|
ef944bc5f0
|
[RLlib] Re-enable placement group support for RLlib. (#14384)
|
2021-03-05 08:16:24 +01:00 |
|
Richard Liaw
|
a2d2275ee1
|
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
This reverts commit 6cd0cd3bd9 .
|
2021-02-25 14:27:35 -08:00 |
|
Sven Mika
|
6cd0cd3bd9
|
[RLlib + Tune] Add placement group support to RLlib. (#14289)
|
2021-02-25 16:01:31 +01:00 |
|
Sven Mika
|
c524f86785
|
[RLlib] BC/MARWIL/recurrent nets minor cleanups and bug fixes. (#13064)
|
2020-12-27 09:46:03 -05:00 |
|
Sven Mika
|
592c161032
|
[RLlib] Issue 12118: LSTM prev-a/r should be separately configurable. Fix missing prev-a one-hot encoding. (#12397)
* WIP.
* Fix and LINT.
|
2020-11-25 11:27:46 -08:00 |
|
Sven Mika
|
d9f1874e34
|
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609)
|
2020-10-27 10:00:24 +01:00 |
|
Eric Liang
|
5acd3e66dd
|
[rllib] Fix torch TD error, IMPALA LR updates (#9477)
* update
* add test
* lint
* fix super call
* speed es test up
|
2020-07-23 12:50:25 -07:00 |
|
Sven Mika
|
fcdf410ae1
|
[RLlib] Tf2.x native. (#8752)
|
2020-07-11 22:06:35 +02:00 |
|
Sven Mika
|
43043ee4d5
|
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf() ). (#9136)
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
|
2020-06-30 10:13:20 +02:00 |
|
Sven Mika
|
5c6d5d4ab1
|
This PR fixes the currently broken lstm_use_prev_action_reward flag for default lstm models (model.use_lstm=True). (#8970)
|
2020-06-27 20:50:01 +02:00 |
|
Sven Mika
|
4ed796a7d6
|
[RLlib] Add testing Policy.compute_single_action() for all agents. (#8903)
|
2020-06-13 17:51:50 +02:00 |
|
Sven Mika
|
c74dc58f8b
|
[RLlib] Fix use_lstm flag for ModelV2 (w/o ModelV1 wrapping) and add it for PyTorch. (#8734)
|
2020-06-05 15:40:30 +02:00 |
|
Sven Mika
|
2746fc0476
|
[RLlib] Auto-framework, retire use_pytorch in favor of framework=... (#8520)
|
2020-05-27 16:19:13 +02:00 |
|
Eric Liang
|
9d012626e5
|
[rllib] Distributed exec workflow for impala (#8321)
|
2020-05-11 20:24:43 -07:00 |
|
Sven Mika
|
754290daad
|
[RLlib] Add light-weight Trainer.compute_action() tests for all Algos. (#8356)
|
2020-05-08 16:31:31 +02:00 |
|
Sven Mika
|
166bb5d690
|
[RLlib] IMPALA PyTorch (#8287)
This PR adds an IMPALA PyTorch implementation.
- adds compilation tests for LSTM and w/o LSTM.
- adds learning test for CartPole.
|
2020-05-03 13:44:25 +02:00 |
|