0b8489dcc6  2021-08-10 10:50:21 -07:00  Amog Kamsetty
    Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586)" (#17707)
    This reverts commit 0eb0e0ff58.

77f28f1c30  2021-08-10 10:50:01 -07:00  Amog Kamsetty
    Revert "[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). (#17566)" (#17709)
    This reverts commit 3b447265d8.

3b447265d8  2021-08-05 11:41:18 -04:00  Sven Mika
    [RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). (#17566)

0eb0e0ff58  2021-08-05 11:39:51 -04:00  Sven Mika
    [RLlib] Add support for multi-GPU to DDPG. (#17586)

5107d16ae5  2021-08-03 18:30:02 -04:00  Sven Mika
    [RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530)

924f11cd45  2021-08-03 11:35:49 -04:00  Sven Mika
    [RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371)

8a844ff840  2021-08-02 17:29:59 -04:00  Sven Mika
    [RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (b/c no GPUs assigned by ray). (#17444)

d7a5ec1830  2021-07-28 12:39:28 -04:00  Julius Frost
    [RLlib] SAC tuple observation space fix (#17356)

f30b444bac  2021-07-28 11:48:41 +01:00  Rohan138
    [Rllib] set self._allow_unknown_config (#17335)
    Co-authored-by: Sven Mika <sven@anyscale.io>

90b21ce27e  2021-07-27 14:39:06 -04:00  Sven Mika
    [RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321)

5231fdd996  2021-07-26 10:52:55 -04:00  Sven Mika
    [Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331)

0c5c70b584  2021-07-25 16:55:51 -04:00  Sven Mika
    [RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. (#17118)

fba8461663  2021-07-25 10:04:52 -04:00  ddworak94
    [RLlib] Add RNN-SAC agent (#16577)
    Shoutout to @ddworak94 :)

7bc4376466  2021-07-22 10:59:13 -04:00  Sven Mika
    [RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077)

0b1b6222bc  2021-07-21 15:43:06 -07:00  Julius Frost
    [rllib] Add merge_trainer_config arguments to trainer template (#17160)

5a313ba3d6  2021-07-20 14:58:13 -04:00  Sven Mika
    [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)

cb74053ee5  2021-07-19 11:14:10 -07:00  Amog Kamsetty
    Retry remove gpustat dependency (#17115)
    * remove gpustat
    * move psutil imports

18d173b172  2021-07-19 13:16:03 -04:00  Sven Mika
    [RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)

e0640ad0dc  2021-07-19 13:13:05 -04:00  Sven Mika
    [RLlib] Fix seeding for ES and ARS. (#16744)

649580d735  2021-07-15 05:51:24 -04:00  Sven Mika
    [RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)

d553d4da6c  2021-07-13 16:48:06 -04:00  Grzegorz Bartyzel
    [RLlib] DQN (Rainbow): Fix torch noisy layer support and loss (#16716)

1fd0eb805e  2021-07-13 14:01:30 -04:00  Sven Mika
    [RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)

16f1011c07  2021-07-13 14:01:00 -04:00  Antoine Galataud
    [RLlib] Issue 15910: APEX current learning rate not updated on local worker (#15911)

38b5b6d24c  2021-07-13 09:57:15 -07:00  Amog Kamsetty
    Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
    This reverts commit e4123fff27.

27d80c4c88  2021-07-13 12:38:11 -04:00  Kai Fricke
    [RLlib] ONNX export for tensorflow (1.x) and torch (#16805)

e4123fff27  2021-07-13 06:38:14 -04:00  Sven Mika
    [RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)

bc33dc7e96  2021-07-12 11:09:14 -07:00  Amog Kamsetty
    Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." (#17002)
    This reverts commit 7862dd64ea.

55a90e670a  2021-07-11 23:41:38 +02:00  Sven Mika
    [RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927)

a88b217d3f  2021-07-10 15:05:25 -07:00  Julius Frost
    [rllib] Enhancements to Input API for customizing offline datasets (#16957)
    Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

7862dd64ea  2021-07-08 17:31:34 +02:00  Sven Mika
    [RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774)

53206dd440  2021-06-30 12:32:11 +02:00  Sven Mika
    [RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)

1e709771b2  2021-06-26 11:19:54 -07:00  AnnaKosiorek
    [rllib][minor] clarification of the softmax axis in dqn_torch_policy (#16311)
    pytorch nn.functional.softmax (unlike tf.nn.softmax) calculates softmax along zeroth dimension by default

c95dea51e9  2021-06-23 09:09:01 +02:00  Sven Mika
    [RLlib] External env enhancements + more examples. (#16583)

be6db06485  2021-06-21 13:46:01 +02:00  Sven Mika
    [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)

169ddabae7  2021-06-19 22:42:00 +02:00  Sven Mika
    [RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429)

bd3cbfc56a  2021-06-18 12:21:49 -07:00  Amog Kamsetty
    Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
    This reverts commit e78ec370a9.

2900a06dd7  2021-06-18 17:27:29 +02:00  Sven Mika
    [RLlib] Issue 14503: SAC not allowing custom action distributions. (#16427)

e78ec370a9  2021-06-18 10:31:30 +02:00  Sven Mika
    [RLlib] Allow policies to be added/deleted on the fly. (#16359)

d0014cd351  2021-06-15 13:08:43 +02:00  Sven Mika
    [RLlib] Policies get/set_state fixes and enhancements. (#16354)

fd1a97e39f  2021-06-10 18:10:21 +02:00  Chris Bamford
    [RLlib] Memory leak docs (#15908)

3d4dc60e2e  2021-06-10 07:49:17 +02:00  Sven Mika
    [RLlib] CQL iteration count fixes: Remove dummy buffer and unnecessary store op from exec_plan. (#16332)

e2be41b407  2021-06-03 22:29:00 +02:00  Sven Mika
    [RLlib] MARWIL + BC: Various fixes and enhancements. (#16218)

5fe34862ce  2021-05-28 22:09:25 +02:00  Sven Mika
    [RLlib] DDPG torch GPU bug. (#16133)

33a69135cb  2021-05-28 09:12:53 +02:00  Sven Mika
    [RLlib] Issue 16117: DQN/APEX torch not working on GPU. (#16118)

f6302d81be  2021-05-25 08:47:17 +02:00  Sven Mika
    [RLlib] Discussion 2210: BC algo broken, if "advantages" missing in offline data. (#16019)

e80095591c  2021-05-20 18:15:10 +02:00  Sven Mika
    [RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937)

2d34216660  2021-05-20 09:27:03 +02:00  Sven Mika
    [RLlib] APEX-DQN: Bug fix for torch and add learning test. (#15762)

eaa7f6696d  2021-05-19 15:44:11 -07:00  Sven Mika
    [RLlib] Issue 15887: MARWIL adv norm update mismatch for tf (static-graph) vs torch versions. (#15898)

474f04e322  2021-05-19 16:32:29 +02:00  Michael Luo
    [RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707)

0be83d9a95  2021-05-18 13:23:00 +02:00  Chris Bamford
    [RLlib] Fixing Memory Leak In Multi-Agent environments. Adding tooling for finding memory leaks in workers. (#15815)