49 commits

Author SHA1 Message Date
Sven Mika
6e1c3ea824
[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974) 2020-02-10 15:22:07 -08:00
Eric Liang
fbc545c03b
[rllib] Support parallel, parameterized evaluation (#6981)
* eval api

* update

* sync eval filters

* sync fix

* docs

* update

* docs

* update

* link

* nit

* doc updates

* format
2020-02-01 22:12:12 -08:00
Sven Mika
446cbdf2e0 [RLlib] Fix issue (bug): LSTM + non-shared vf + PPO + tuple actions (#6890)
* Add `RandomEnv` example to examples folder.
Convert warning into Error message when using an LSTM in a non-shared-vf network (after the warning, the program would crash).

* LINT.

* Fix issue #6884. LSTM + non-shared vf NN + PPO crashes when using a Tuple action space.

* LINT

* Change warning message for Model: shared_vf=False, LSTM=True cases.

* Bug fix.

* Add examples/random_env.py test to Jenkins.
2020-01-24 10:29:35 -08:00
Sven Mika
ae9a3a2237 [RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865) 2020-01-22 17:02:58 -08:00
Sven Mika
c957ed58ed [RLlib] Implement PPO torch version. (#6826) 2020-01-20 23:06:50 -08:00
Sven
f1b56fa5ee PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
* Unifying the code for PGTrainer/Policy wrt tf vs torch.
Adding loss function test cases for the PGAgent (confirm equivalence of tf and torch).

* Fix LINT line-len errors.

* Fix LINT errors.

* Fix `tf_pg_policy` imports (formerly: `pg_policy`).

* Rename tf_pg_... into pg_tf_... following <alg>_<framework>_... convention, where ...=policy/loss/agent/trainer.
Retire `PGAgent` class (use PGTrainer instead).

* - Move PG test into agents/pg/tests directory.
- All test cases will be located near the classes that are tested and
  then built into the Bazel/Travis test suite.

* Moved post_process_advantages into pg.py (from pg_tf_policy.py), b/c
the function is not a tf-specific one.

* Fix remaining import errors for agents/pg/...

* Fix circular dependency in pg imports.

* Add pg tests to Jenkins test suite.
2020-01-02 16:08:03 -08:00
Eric Liang
be5dd8eb5e
Enable direct calls by default (#6367)
* wip

* add

* timeout fix

* const ref

* comments

* fix

* fix

* Move actor state into actor handle

* comments 2

* enable by default

* temp reorder

* some fixes

* add debug code

* tmp

* fix

* wip

* remove dbg

* fix compile

* fix

* fix check

* remove non direct tests

* Increment ref count before resolving value

* rename

* fix another bug

* tmp

* tmp

* Fix object pinning

* build change

* lint

* ActorManager

* tmp

* ActorManager

* fix test component failures

* Remove old code

* Remove unused

* fix

* fix

* fix resources

* fix advanced

* eric's diff

* blacklist

* blacklist

* cleanup

* annotate

* disable tests for now

* remove

* fix

* fix

* clean up verbosity

* fix test

* fix concurrency test

* Update .travis.yml

* Update .travis.yml

* Update .travis.yml

* split up analysis suite

* split up trial runner suite

* fix detached direct actors

* fix

* split up advanced test

* lint

* fix core worker test hang

* fix bad check fail which breaks test_cluster.py in tune

* fix some minor diffs in test_cluster

* less workers

* make less stressful

* split up test

* retry flaky tests

* remove old test flags

* fixes

* lint

* Update worker_pool.cc

* fix race

* fix

* fix bugs in node failure handling

* fix race condition

* fix bugs in node failure handling

* fix race condition

* nits

* fix test

* disable heartbeats

* disable heartbeats

* fix

* fix

* use worker id

* fix max fail

* debug exit

* fix merge, and apply [PATCH] fix concurrency test

* [patch] fix core worker test hang

* remove NotifyActorCreation, and return worker on completion of actor creation task

* remove actor died callback

* Update core_worker.cc

* lint

* use task manager

* fix merge

* fix deadlock

* wip

* merge conflicts

* fix

* better sysexit handling

* better sysexit handling

* better sysexit handling

* check id

* better debug

* task failed msg

* task failed msg

* retry failed tasks with delay

* retry failed tasks with delay

* clip deps

* fix

* fix core worker tests

* fix task manager test

* fix all tests

* cleanup

* set to 0 for direct tests

* dont check worker id for ownership rpc

* dont check worker id for ownership rpc

* debug messages

* add comment

* remove debug statements

* nit

* check worker id

* fix test

* owner

* fix tests
2019-12-13 13:58:04 -08:00
Victor Le
4e24c805ee AlphaZero and Ranked reward implementation (#6385) 2019-12-07 12:08:40 -08:00
Eric Liang
4c6739476b
[rllib] Raise an error if GPUs are enabled but not tf.test.is_gpu_available() (#6365) 2019-12-05 10:13:54 -08:00
Eric Liang
64a3a7239e
Set RAY_FORCE_DIRECT=1 for run_rllib_tests, test_basic (#6171) 2019-11-25 14:12:11 -08:00
Eric Liang
04e997fe0d
Fix TF2 / rllib test (#5846) 2019-10-07 14:25:16 -07:00
Edward Oakes
443feb75f0 Fix test (#5810) 2019-09-30 19:39:53 -07:00
Eric Liang
97ccd75952
[rllib] Enable object store memory limit by default (#5534) 2019-08-26 01:37:28 -07:00
gehring
b520f6141e [rllib] Adds eager support with a generic TFEagerPolicy class (#5436) 2019-08-23 14:21:11 +08:00
Robert Nishihara
851c5b2dae Add a script for benchmarking performance for Ray developers. (#5472) 2019-08-19 23:41:23 -07:00
Eric Liang
a1d2e17623
[rllib] Autoregressive action distributions (#5304) 2019-08-10 14:05:12 -07:00
Eric Liang
592f313210
[rllib] Centralized critic / PPO example on TwoStepGame (#5392) 2019-08-08 14:03:28 -07:00
Wonseok Jeon
281829e712 MADDPG implementation in RLlib (#5348) 2019-08-06 16:22:06 -07:00
Eric Liang
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324) 2019-08-05 23:25:49 -07:00
Eric Liang
3bdd114282
[rllib] Better example rnn envs (#5300) 2019-07-28 14:07:18 -07:00
Eric Liang
a62c5f40f6
[rllib] Document ModelV2 and clean up the models/ directory (#5277) 2019-07-27 02:08:16 -07:00
Eric Liang
f9043cc49a
[rllib] Remove experimental eager support 2019-07-21 12:27:17 -07:00
Jones Wong
0af07bd493 Enable seeding actors for reproducible experiments (#5197)
*  enable graph-level worker-specific seed

*  lint checked

*  revised according to eric's suggestions

*  revised accordingly and added a test case

*  formatted

* Update test_reproducibility.py

* Update trainer.py

* Update rollout_worker.py

* Update run_rllib_tests.sh

* Update worker_set.py
2019-07-17 23:31:34 -07:00
Eric Liang
34d054ff19
[rllib] ModelV2 API (#4926) 2019-07-03 15:59:47 -07:00
Eric Liang
9e328fbe6f
[rllib] Add docs on how to use TF eager execution (#4927) 2019-06-07 16:42:37 -07:00
Eric Liang
7501ee51db
[rllib] Rename PolicyEvaluator => RolloutWorker (#4820) 2019-06-03 06:49:24 +08:00
Eric Liang
1c073e92e4
[rllib] Fix documentation on custom policies (#4910)
* wip

* add docs

* lint

* todo sections

* fix doc
2019-06-01 16:13:21 +08:00
Eric Liang
d7be5a5d36
[rllib] Fix error getting kl when simple_optimizer: True in multi-agent PPO 2019-05-27 17:24:45 -07:00
Eric Liang
351753aae5
[rllib] Remove dependency on TensorFlow (#4764)
* remove hard tf dep

* add test

* comment fix

* fix test
2019-05-10 20:36:18 -07:00
Eric Liang
3fd9dea721
[rllib] Fix tune.run(Agent class) (#4630)
* update

* Update __init__.py
2019-04-15 09:12:23 -07:00
cfan
bb207a205b [rllib] Support torch device and distributions. (#4553) 2019-04-12 11:39:14 -07:00
Eric Liang
4f46d3e9bf
[rllib] Add multi-agent examples for hand-coded policy, centralized VF (#4554) 2019-04-09 00:36:49 -07:00
ctombumila37
7746d20d30 [rllib] ExternalMultiAgentEnv (#4200) 2019-04-06 19:58:14 -07:00
Eric Liang
0d94f3eeef
[rllib] Improve datapath throughput of IMPALA / APPO (#4324) 2019-03-31 12:25:52 -07:00
bjg2
77005d1814 [rllib] Make batch timeout for remote workers tunable (#4435) 2019-03-29 13:19:42 -07:00
Eric Liang
2ffe67c5c3
[rllib] Minor cleanups to TFPolicyGraph: add init args, constants for loss inputs (#4478) 2019-03-29 12:44:23 -07:00
Eric Liang
8ee240f40e
[rllib] Use 64-byte aligned memory when concatenating arrays (#4408) 2019-03-25 23:56:51 -07:00
Eric Liang
57c1aeb427
[rllib] Use suppress_output instead of run_silent.sh script for tests (#4386)
* fix

* enable custom loss

* Update run_rllib_tests.sh

* enable tests

* fix action prob

* Update suppress_output

* fix example

* fix
2019-03-21 00:15:24 -07:00
Eric Liang
a45019d98c
[rllib] Add option to proceed even if some workers crashed (#4376) 2019-03-16 13:34:09 -07:00
Stefan Pantic
2202a81773 Fix multi discrete (#4338)
* Revert "Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)"

This reverts commit 3c41cb9b60.

* Fix a bug with log rhos for vtrace

* Reformat

* lint
2019-03-12 20:32:11 -07:00
Eric Liang
3c41cb9b60
Revert "[wingman -> rllib] IMPALA MultiDiscrete changes (#3967)" (#4332)
This reverts commit 962b17f567.
2019-03-11 22:51:26 -07:00
Eric Liang
c7f74dbdc7
[rllib] Add async remote workers (#4253) 2019-03-08 15:39:48 -08:00
Yuhong Guo
d5fb7b70a9
Update arrow version to fix plasma bugs (#4127)
* Update arrow

* Change to 2c511979b13b230e73a179dab1d55b03cd81ec02 which is rebased on Arrow 46f75d7

* Update to fix comment

* disable tests which use python/ray/rllib/tests/data/cartpole_small

* Fix get order of meta and data in MockObjectStore.java
2019-03-08 18:03:58 +08:00
Eric Liang
b0332551dd
[rllib] Fix APPO + continuous spaces, feed prev_rew/act to A3C properly (#4286) 2019-03-06 21:36:26 -08:00
Eric Liang
30bf8e46c7
[rllib] Use nested scope in custom loss example 2019-03-04 18:29:22 -08:00
Richard Liaw
a27cb225b6
Modularize Tune tests from multi-node tests (#4204) 2019-03-02 19:21:08 -08:00
Robert Nishihara
4b89eebfc7 Move test folders under rllib/tune from test -> tests. (#4214) 2019-03-02 13:37:16 -08:00
bjg2
962b17f567 [wingman -> rllib] IMPALA MultiDiscrete changes (#3967) 2019-03-01 19:47:06 -08:00
Eric Liang
b809ef0107
[rllib] Silent tests (#4151) 2019-02-28 16:32:22 -08:00