Sven Mika
026849cd27
[RLlib] APPO TrainerConfig objects. ( #24376 )
2022-05-02 15:06:23 +02:00
Sven Mika
f066180ed5
[RLlib] Deprecate timesteps_per_iteration
config key (in favor of min_[sample|train]_timesteps_per_reporting
. ( #24372 )
2022-05-02 12:51:14 +02:00
Balaji Veeramani
7f1bacc7dc
[CI] Format Python code with Black ( #21975 )
...
See #21316 and #21311 for the motivation behind these changes.
2022-01-29 18:41:57 -08:00
Jun Gong
8ebc50f844
[RLlib] Issue 21334: Fix APPO when kl_loss is enabled. ( #21855 )
2022-01-27 20:08:58 +01:00
Sven Mika
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 ( #21652 )
2022-01-25 14:16:58 +01:00
Sven Mika
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. ( #19981 )
2021-11-05 16:10:00 +01:00
gjoliver
9385b6c1be
[RLlib] Make a few LRSchedule and EntropyCoeffSchedule tests more reliable. ( #19934 )
2021-11-02 16:52:56 +01:00
gjoliver
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). ( #19623 )
2021-10-25 09:42:39 +02:00
gjoliver
44a4e42172
[rllib] Add entropy_coeff_schedule support for APPO. ( #19544 )
...
* Add entropy_coeff_schedule support for APPO.
* lint
2021-10-20 14:18:01 -07:00
Sven Mika
bd2d2079d2
[RLlib] Support >1 loss terms and optimizers for framework=tf2 (already supported for framework=[tf|torch]) ( #19269 )
2021-10-10 12:19:47 +02:00
Sven Mika
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. ( #18879 )
2021-09-30 16:39:05 +02:00
Sven Mika
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). ( #18669 )
2021-09-21 22:00:14 +02:00
Sven Mika
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. ( #18065 )
2021-08-31 14:56:53 +02:00
Sven Mika
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. ( #17778 )
2021-08-18 17:21:01 +02:00
Sven Mika
0df55a139c
[RLlib] Attention Net prep PR #1 : Smaller cleanups. ( #12447 )
...
* WIP.
* Fix.
* Fix.
* Fix.
2020-11-27 16:25:47 -08:00
Sven Mika
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). ( #11609 )
2020-10-27 10:00:24 +01:00
Sven Mika
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. ( #10420 )
2020-09-02 14:03:01 +02:00
Sven Mika
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading try_import_tf()
). ( #9136 )
...
* WIP.
* Fixes.
* LINT.
* WIP.
* WIP.
* Fixes.
* Fixes.
* Fixes.
* Fixes.
* WIP.
* Fixes.
* Test
* Fix.
* Fixes and LINT.
* Fixes and LINT.
* LINT.
2020-06-30 10:13:20 +02:00
Sven Mika
4ed796a7d6
[RLlib] Add testing Policy.compute_single_action()
for all agents. ( #8903 )
2020-06-13 17:51:50 +02:00
Sven Mika
754290daad
[RLlib] Add light-weight Trainer.compute_action()
tests for all Algos. ( #8356 )
2020-05-08 16:31:31 +02:00
Sven Mika
499ad5fbe4
[RLlib] PyTorch version of APPO. ( #8120 )
...
- Translate all vtrace functionality to torch and added torch to the framework_iterator-loop in all existing vtrace test cases.
- Add learning test cases for APPO torch (both w/ and w/o v-trace).
- Add quick compilation tests for APPO (tf and torch, v-trace and no v-trace).
2020-04-23 09:11:12 +02:00