Commit graph

5 commits

Author SHA1 Message Date
Eric Liang
9ea57c2a93 [rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504)
* Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer
* Add AsyncSamplesOptimizer that implements the IMPALA architecture
* integrate V-trace with a3c policy graph
* audit V-trace integration
* benchmark compare vs A3C and with V-trace on/off
IMPALA on PongNoFrameskip-v4, scaling from 16 to 128 workers, solves Pong in <10 min. For reference, solving this env takes ~40 minutes with Ape-X and several hours with A3C.
2018-08-01 20:53:53 -07:00
Eric Liang
f012e597c2 [rllib] Basic port of baselines/deepq to rllib (#709)
* rllib v0
* fix imports
* lint
* comments
* update docs
* a3c wip
* a3c wip
* report stats
* update doc
* add common logdir attr
* name is too long
* fix small bug
* propagate exception on error
* fetch metrics
* initial port
* fix lint
* add right license
* port to common alg format
* fix lint
* rename dqn
* add imports from future
* fix lint
2017-07-07 18:37:00 +00:00
Robert Nishihara
d56c1a0b9c Change license to Apache 2 (#20)
2016-11-01 23:19:06 -07:00
Robert Nishihara
9d7e417b9c switching to BSD (#90)
2016-06-06 12:07:36 -07:00
Philipp Moritz
3655dfbc23 Initial commit
2016-02-07 14:18:40 -08:00