ray/rllib/agents/qmix at 3131e1742d2ec6cf139186bbab845925fe288734 - hiro/ray

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-07 02:51:39 -05:00

Matthew A. Wright 3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)	2019-09-18 18:12:30 -07:00

* Qmix fix. Current version of double Q learning is incorrect; it selects actions at timestep t instead of t+1 when computing the t+1 Q value.
* Allow extra obs dict keys
* Move Q-value-computing replay code to own function
* Run the autoformatter
* use better terms in comments ("policy" network instead of "live" network)
__init__.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
apex.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
mixers.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
model.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
qmix.py	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00
qmix_policy.py	[rllib] Qmix off by 1 in double Q calculation (#5731)	2019-09-18 18:12:30 -07:00
README.md	[rllib] Try moving RLlib to top level dir (#5324)	2019-08-05 23:25:49 -07:00

README.md

Code in this package is adapted from https://github.com/oxwhirl/pymarl.
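
The off-by-one described in the commit message for #5731 can be illustrated with a minimal sketch of a one-step double Q target. This is not the code in `qmix_policy.py`: the function names `argmax` and `double_q_targets`, the plain-list data layout, and the default `gamma` are assumptions made for illustration, and the sketch leaves out terminal-state masking and the QMIX mixing network. It only shows that the greedy action is chosen by the policy network at timestep t+1 and then evaluated by the target network at that same timestep.

```python
def argmax(values):
    """Return the index of the largest entry in a list of floats."""
    return max(range(len(values)), key=values.__getitem__)


def double_q_targets(policy_q, target_q, rewards, gamma=0.99):
    """One-step double Q targets for a single agent over one episode.

    Hypothetical layout, chosen for this sketch only:
      policy_q, target_q: T + 1 rows of per-action Q-values (timesteps 0..T)
      rewards: T rewards, where rewards[t] follows the action taken at t
    """
    targets = []
    for t in range(len(rewards)):
        # Select the greedy action with the policy network at t + 1.
        # Indexing policy_q[t] here instead is the off-by-one the commit
        # message describes.
        next_action = argmax(policy_q[t + 1])
        # Evaluate that action with the target network, also at t + 1.
        targets.append(rewards[t] + gamma * target_q[t + 1][next_action])
    return targets
```

Calling `double_q_targets` with T + 1 rows of Q-values and T rewards returns T targets, one per transition. Selecting the action from `policy_q[t]` would still run without error, which is why this kind of bug is easy to miss.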