ray/rllib/agents/qmix
Matthew A. Wright 3131e1742d [rllib] Qmix off by 1 in double Q calculation (#5731)
* Qmix fix.

The current version of double Q-learning is incorrect: it selects actions
at timestep t instead of t+1 when computing the t+1 Q-value.

* Allow extra obs dict keys

* Move Q-value-computing replay code to own function

* Run the autoformatter

* Use better terms in comments ("policy" network instead of "live" network)
2019-09-18 18:12:30 -07:00
__init__.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
apex.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
mixers.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
model.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
qmix.py [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
qmix_policy.py [rllib] Qmix off by 1 in double Q calculation (#5731) 2019-09-18 18:12:30 -07:00
README.md [rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00

Code in this package is adapted from https://github.com/oxwhirl/pymarl.
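
The off-by-one issue described in the commit message for #5731 can be illustrated with a minimal sketch of a double Q-learning target. This is not the code in `qmix_policy.py`: it is a single-agent NumPy simplification that leaves out QMIX's mixing network, and the function name `double_q_targets`, its arguments, and the array shapes are all assumptions made for illustration. The point it shows is that the greedy action must be chosen from the policy network's Q-values at t+1, and then evaluated with the target network at t+1.

```python
import numpy as np


def double_q_targets(policy_q, target_q, rewards, dones, gamma=0.99):
    """Compute double Q-learning targets for one trajectory (illustrative only).

    policy_q, target_q: arrays of shape [T + 1, n_actions] holding Q-values
        from the policy and target networks for timesteps 0..T.
    rewards, dones: arrays of shape [T] for the transitions t -> t + 1.
    """
    # Select the greedy action at t+1 with the policy network.
    # Using policy_q[:-1] here instead would be the off-by-one error:
    # it would pick actions at t rather than t+1.
    next_actions = np.argmax(policy_q[1:], axis=1)

    # Evaluate the selected actions with the target network at t+1.
    next_values = target_q[1:][np.arange(len(next_actions)), next_actions]

    # One-step bootstrapped target; terminal transitions do not bootstrap.
    return rewards + gamma * (1.0 - dones) * next_values
```

In the actual QMIX setting, the per-agent values chosen this way would additionally be passed through the target mixer before the bootstrapped target is formed; that step is omitted here.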