Commit graph

422 commits

Author SHA1 Message Date
Sven Mika
b6aa8223bc
[RLlib] Fix final_scale's default value to 0.02 (see OrnsteinUhlenbeck exploration). (#18070) 2021-08-25 14:22:09 +02:00
Sven Mika
9883505e84
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017) 2021-08-24 21:55:27 +02:00
Sven Mika
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) 2021-08-21 17:05:48 +02:00
Sven Mika
8248ba531b
[RLlib] Redo #17410: Example script: Remote worker envs with inference done on main node. (#17960) 2021-08-20 08:02:18 +02:00
Alex Wu
318ba6fae0
Revert "[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. (#17410)" (#17951)
This reverts commit 8fc16b9a18.
2021-08-19 07:55:10 -07:00
Sven Mika
8fc16b9a18
[RLlib] Add example script for how to have n remote (parallel) envs with inference happening on "main" (possibly GPU) node. (#17410) 2021-08-19 12:14:50 +02:00
Kai Fricke
bf3eaa9264
[RLlib] Dreamer fixes and reinstate Dreamer test. (#17821)
Co-authored-by: sven1977 <svenmika1977@gmail.com>
2021-08-18 18:47:08 +02:00
Sven Mika
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. (#17778) 2021-08-18 17:21:01 +02:00
Sven Mika
f18213712f
[RLlib] Redo: "fix self play example scripts" PR (17566) (#17895)
2021-08-17 09:13:35 -07:00
Thomas Lecat
c02f91fa2d
[RLlib] Ape-X doesn't take the value of prioritized_replay into account (#17541) 2021-08-16 22:18:08 +02:00
Sven Mika
f3bbe4ea44
[RLlib] Test cases/BUILD cleanup; split "everything else" (longest running one rn) tests in 2. (#17640) 2021-08-16 22:01:01 +02:00
Sven Mika
c2ea2c01bb
[RLlib] Redo: Add support for multi-GPU to DDPG. (#17789)
2021-08-13 18:01:24 -07:00
Sven Mika
7f2b3c0824
[RLlib] Issue 17667: CQL-torch + GPU not working (due to simple_optimizer=False; must use simple optimizer!). (#17742) 2021-08-11 18:30:21 +02:00
Sven Mika
811d71b368
[RLlib] Issue 17653: Torch multi-GPU (>1) broken for LSTMs. (#17657) 2021-08-11 12:44:35 +02:00
Amog Kamsetty
0b8489dcc6
Revert "[RLlib] Add support for multi-GPU to DDPG. (#17586)" (#17707)
This reverts commit 0eb0e0ff58.
2021-08-10 10:50:21 -07:00
Amog Kamsetty
77f28f1c30
Revert "[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). (#17566)" (#17709)
This reverts commit 3b447265d8.
2021-08-10 10:50:01 -07:00
Sven Mika
3b447265d8
[RLlib] Fix Trainer.add_policy for num_workers>0 (self play example scripts). (#17566) 2021-08-05 11:41:18 -04:00
Sven Mika
0eb0e0ff58
[RLlib] Add support for multi-GPU to DDPG. (#17586) 2021-08-05 11:39:51 -04:00
Sven Mika
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530) 2021-08-03 18:30:02 -04:00
Sven Mika
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) 2021-08-03 11:35:49 -04:00
Sven Mika
8a844ff840
[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use ray.get_gpu_ids() (b/c no GPUs assigned by ray). (#17444) 2021-08-02 17:29:59 -04:00
Julius Frost
d7a5ec1830
[RLlib] SAC tuple observation space fix (#17356) 2021-07-28 12:39:28 -04:00
Rohan138
f30b444bac
[Rllib] set self._allow_unknown_config (#17335)
Co-authored-by: Sven Mika <sven@anyscale.io>
2021-07-28 11:48:41 +01:00
Sven Mika
90b21ce27e
[RLlib] De-flake 3 test cases; Fix config.simple_optimizer and SampleBatch.is_training warnings. (#17321) 2021-07-27 14:39:06 -04:00
Sven Mika
5231fdd996
[Testing] Split RLlib example scripts CI tests into 4 jobs (from 2). (#17331) 2021-07-26 10:52:55 -04:00
Sven Mika
0c5c70b584
[RLlib] Discussion 247: Allow remote sub-envs (within vectorized) to be used with custom APIs. (#17118) 2021-07-25 16:55:51 -04:00
ddworak94
fba8461663
[RLlib] Add RNN-SAC agent (#16577)
Shoutout to @ddworak94 :)
2021-07-25 10:04:52 -04:00
Sven Mika
7bc4376466
[RLlib] Example script: Simple league-based self-play w/ open spiel env (markov soccer or connect-4). (#17077) 2021-07-22 10:59:13 -04:00
Julius Frost
0b1b6222bc
[rllib] Add merge_trainer_config arguments to trainer template (#17160) 2021-07-21 15:43:06 -07:00
Sven Mika
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 2021-07-20 14:58:13 -04:00
Amog Kamsetty
cb74053ee5
Retry remove gpustat dependency (#17115)
* remove gpustat

* move psutil imports
2021-07-19 11:14:10 -07:00
Sven Mika
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) 2021-07-19 13:16:03 -04:00
Sven Mika
e0640ad0dc
[RLlib] Fix seeding for ES and ARS. (#16744) 2021-07-19 13:13:05 -04:00
Sven Mika
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046) 2021-07-15 05:51:24 -04:00
Grzegorz Bartyzel
d553d4da6c
[RLlib] DQN (Rainbow): Fix torch noisy layer support and loss (#16716) 2021-07-13 16:48:06 -04:00
Sven Mika
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014) 2021-07-13 14:01:30 -04:00
Antoine Galataud
16f1011c07
[RLlib] Issue 15910: APEX current learning rate not updated on local worker (#15911) 2021-07-13 14:01:00 -04:00
Amog Kamsetty
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
This reverts commit e4123fff27.
2021-07-13 09:57:15 -07:00
Kai Fricke
27d80c4c88
[RLlib] ONNX export for tensorflow (1.x) and torch (#16805) 2021-07-13 12:38:11 -04:00
Sven Mika
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565) 2021-07-13 06:38:14 -04:00
Amog Kamsetty
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action." (#17002)
This reverts commit 7862dd64ea.
2021-07-12 11:09:14 -07:00
Sven Mika
55a90e670a
[RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927) 2021-07-11 23:41:38 +02:00
Julius Frost
a88b217d3f
[rllib] Enhancements to Input API for customizing offline datasets (#16957)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-10 15:05:25 -07:00
Sven Mika
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call unsquash_action, not normalize_action. (#16774) 2021-07-08 17:31:34 +02:00
Sven Mika
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531) 2021-06-30 12:32:11 +02:00
AnnaKosiorek
1e709771b2
[rllib][minor] clarification of the softmax axis in dqn_torch_policy (#16311)
pytorch nn.functional.softmax (unlike tf.nn.softmax) calculates softmax along zeroth dimension by default
2021-06-26 11:19:54 -07:00
Sven Mika
c95dea51e9
[RLlib] External env enhancements + more examples. (#16583) 2021-06-23 09:09:01 +02:00
Sven Mika
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) 2021-06-21 13:46:01 +02:00
Sven Mika
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429) 2021-06-19 22:42:00 +02:00
Amog Kamsetty
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
This reverts commit e78ec370a9.
2021-06-18 12:21:49 -07:00