Implementation of deep deterministic policy gradients (https://arxiv.org/abs/1509.02971), including an Ape-X variant.