Sven Mika | b99943806e | [RLlib] Add support for IMPALA to handle more than one loss/optimizer (analogous to recent enhancement for APPO). (#18971) | 2021-09-29 21:30:04 +02:00
Sven Mika | 61a1274619 | [RLlib] No Preprocessors (part 2). (#18468) | 2021-09-23 12:56:45 +02:00
Sven Mika | a2a077b874 | [RLlib] Faster remote worker space inference (don't infer if not required). (#18805) | 2021-09-23 10:54:37 +02:00
Sven Mika | 698b4eeed3 | [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) | 2021-09-21 22:00:14 +02:00
Sven Mika | fd13bac9b3 | [RLlib] Add worker arg (optional) to policy_mapping_fn. (#18184) | 2021-09-17 12:07:11 +02:00
Sven Mika | ba1c489b79 | [RLlib Testing] Lower --smoke-test "time_total_s" to make sure it doesn't time out. (#18670) | 2021-09-16 18:22:23 +02:00
Sven Mika | 8a00154038 | [RLlib] Bump tf version in ML docker to tf==2.5.0; add tfp to ML-docker. (#18544) | 2021-09-15 08:46:37 +02:00
Sven Mika | 08c09737fa | [RLlib] Fix R2D2 (torch) multi-GPU issue. (#18550) | 2021-09-14 19:58:10 +02:00
Sven Mika | 3803e796ff | [RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540) | 2021-09-13 19:27:53 +02:00
Sven Mika | ea4a22249c | [RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494) | 2021-09-11 23:08:09 +02:00
Sven Mika | 3f89f35e52 | [RLlib] Better error messages and hints; add failure-mode tests. (#18466) | 2021-09-10 16:52:47 +02:00
Sven Mika | 8a066474d4 | [RLlib] No Preprocessors; preparatory PR #1. (#18367) | 2021-09-09 08:10:42 +02:00
Sven Mika | 1520c3d147 | [RLlib] Deepcopy env_ctx for vectorized sub-envs AND add eval-worker option to Trainer.add_policy(). (#18428) | 2021-09-09 07:10:06 +02:00
gjoliver | 808b683f81 | [RLlib] Add a unittest for learning rate schedule used with APEX agent. (#18389) | 2021-09-08 23:29:40 +02:00
Sven Mika | 45f60e51a9 | [RLlib] DDPPO fixes and benchmarks. (#18390) | 2021-09-08 19:39:01 +02:00
Sven Mika | 56f142cac1 | [RLlib] Add support for evaluation_num_episodes=auto (run eval for as long as the parallel train step takes). (#18380) | 2021-09-07 08:08:37 +02:00
Sven Mika | e3e6ed7aaa | [RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358) | 2021-09-06 12:14:20 +02:00
Sven Mika | ba58f5edb1 | [RLlib] Strictly run evaluation_num_episodes episodes each evaluation run (no matter the other eval config settings). (#18335) | 2021-09-05 15:37:05 +02:00
Sven Mika | a772c775cd | [RLlib] Set random seed (if provided) on the Trainer process as well. (#18307) | 2021-09-04 11:02:30 +02:00
Sven Mika | 9a8ca6a69d | [RLlib] Fix Atari learning test regressions (2 bugs) and 1 minor attention net bug. (#18306) | 2021-09-03 13:29:57 +02:00
Sven Mika | 82465f9342 | [RLlib] Better PolicyServer example (with or without tune); print the actual listen port address at log-level=INFO. (#18254) | 2021-08-31 22:03:23 +02:00
Sven Mika | 599e589481 | [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) | 2021-08-31 14:56:53 +02:00
Sven Mika | 4888d7c9af | [RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999) | 2021-08-31 12:21:49 +02:00
Joseph Suarez | 8136d2912b | [RLlib] Add policies arg to callback on_episode_step (already exists in all other episode-related callbacks). (#18119) | 2021-08-27 16:12:19 +02:00
Sven Mika | b6aa8223bc | [RLlib] Fix final_scale's default value to 0.02 (see OrnsteinUhlenbeck exploration). (#18070) | 2021-08-25 14:22:09 +02:00
Sven Mika | 9883505e84 | [RLlib] Add [LSTM=True + multi-GPU] tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017) | 2021-08-24 21:55:27 +02:00
Sven Mika | 494ddd98c1 | [RLlib] Replace "seq_lens" with SampleBatch.SEQ_LENS. (#17928) | 2021-08-21 17:05:48 +02:00
Sven Mika | 8248ba531b | [RLlib] Redo #17410: Example script: Remote worker envs with inference done on main node. (#17960) | 2021-08-20 08:02:18 +02:00
Alex Wu | 318ba6fae0 | Revert "[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. (#17410)" (#17951) (reverts commit 8fc16b9a18) | 2021-08-19 07:55:10 -07:00
Sven Mika | 8fc16b9a18 | [RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. (#17410) | 2021-08-19 12:14:50 +02:00
Kai Fricke | bf3eaa9264 | [RLlib] Dreamer fixes and reinstate Dreamer test. (#17821) (co-authored-by: sven1977 <svenmika1977@gmail.com>) | 2021-08-18 18:47:08 +02:00
Sven Mika | a428f10ebe | [RLlib] Add multi-GPU learning tests to nightly. (#17778) | 2021-08-18 17:21:01 +02:00
Sven Mika | f18213712f | [RLlib] Redo: "Fix self-play example scripts" PR (#17566). (#17895) | 2021-08-17 09:13:35 -07:00
Thomas Lecat | c02f91fa2d | [RLlib] Ape-X doesn't take the value of prioritized_replay into account. (#17541) | 2021-08-16 22:18:08 +02:00
Sven Mika | f3bbe4ea44 | [RLlib] Test cases/BUILD cleanup; split "everything else" (currently the longest-running) tests in 2. (#17640) | 2021-08-16 22:01:01 +02:00
Sven Mika | c2ea2c01bb | [RLlib] Redo: Add support for multi-GPU to DDPG. (#17789) | 2021-08-13 18:01:24 -07:00
Sven Mika | 7f2b3c0824 | [RLlib] Issue 17667: CQL-torch + GPU not working (due to simple_optimizer=False; must use simple optimizer!). (#17742) | 2021-08-11 18:30:21 +02:00
Sven Mika | 811d71b368 | [RLlib] Issue 17653: Torch multi-GPU (>1) broken for LSTMs. (#17657) | 2021-08-11 12:44:35 +02:00
Amog Kamsetty | 0b8489dcc6 | Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586)" (#17707) (reverts commit 0eb0e0ff58) | 2021-08-10 10:50:21 -07:00
Amog Kamsetty | 77f28f1c30 | Revert "[RLlib] Fix Trainer.add_policy for num_workers>0 (self-play example scripts). (#17566)" (#17709) (reverts commit 3b447265d8) | 2021-08-10 10:50:01 -07:00
Sven Mika | 3b447265d8 | [RLlib] Fix Trainer.add_policy for num_workers>0 (self-play example scripts). (#17566) | 2021-08-05 11:41:18 -04:00
Sven Mika | 0eb0e0ff58 | [RLlib] Add support for multi-GPU to DDPG. (#17586) | 2021-08-05 11:39:51 -04:00
Sven Mika | 5107d16ae5 | [RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530) | 2021-08-03 18:30:02 -04:00
Sven Mika | 924f11cd45 | [RLlib] Torch algos now use framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) | 2021-08-03 11:35:49 -04:00
Sven Mika | 8a844ff840 | [RLlib] Issues 17397, 17425, 16715, 17174: When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (because no GPUs are assigned by ray). (#17444) | 2021-08-02 17:29:59 -04:00
Julius Frost | d7a5ec1830 | [RLlib] SAC tuple observation space fix. (#17356) | 2021-07-28 12:39:28 -04:00
Rohan138 | f30b444bac | [RLlib] Set self._allow_unknown_config. (#17335) (co-authored-by: Sven Mika <sven@anyscale.io>) | 2021-07-28 11:48:41 +01:00
Sven Mika | 90b21ce27e | [RLlib] De-flake 3 test cases; fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321) | 2021-07-27 14:39:06 -04:00
Sven Mika | 5231fdd996 | [Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331) | 2021-07-26 10:52:55 -04:00
Sven Mika | 0c5c70b584 | [RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. (#17118) | 2021-07-25 16:55:51 -04:00