Commit graph

289 commits

Author SHA1 Message Date
Richard Liaw
0537508106 Bump strings for 0.6.2 (#3801) 2019-01-17 19:03:27 -08:00
Jones Wong
319c1340cb [rllib] Develop MARWIL (#3635)
*  add marvil policy graph

*  fix typo

*  add offline optimizer and enable running marwil

*  fix loss function

*  add maintaining the moving average of advantage norm

*  use sync replay optimizer for unifying

*  remove offline optimizer and use sync replay optimizer

*  format by yapf

*  add imitation learning objective

*  fix according to eric's review

*  format by yapf

* revise

* add test data

* marwil
2019-01-16 19:00:43 -08:00
Eric Liang
401e656b95 [rllib] Sync filters at end of iteration not start; hierarchical docs (#3769) 2019-01-15 16:25:25 -08:00
jhpenger
3adffe6a4e [docs] Add example showing how to use Ray on Kubernetes. (#3126)
Closes #1353.
2019-01-13 13:56:47 -08:00
Robert Nishihara
1480f309c3 [doc] Replace runtest.py with mini_test.py in documentation. (#3750)
Rename `xray_test.py` to `mini_test.py` and use that in the documentation. Right now we suggest that people run `runtest.py`, but that often doesn't succeed and takes too long.
2019-01-12 14:05:28 -08:00
Eric Liang
e78562b2e8
[rllib] Misc fixes: set lr for PG, better error message for LSTM/PPO, fix multi-agent/APEX (#3697)
* fix

* update test

* better error

* compute

* eps fix

* add get_policy() api

* Update agent.py

* better err msg

* fix

* pass in rew
2019-01-06 19:37:35 -08:00
Eric Liang
03fe760616
[rllib] Model self loss isn't included in all algorithms (#3679) 2019-01-04 22:30:35 -08:00
Eric Liang
7db1f3be2a [tune] resume=False by default but print a tip to set resume="prompt" + jenkins fix (#3681) 2019-01-04 17:23:19 -08:00
Robert Nishihara
586a5c9ffa Limit default redis max memory to 10GB. (#3630)
* Limit Redis max memory to 10GB/shard by default.

* Update stress tests.

* Reorganize

* Update

* Add minimum cap size for object store and redis.

* Small test update.
2019-01-03 13:23:54 -08:00
Eric Liang
ca864faece
[rllib] Documentation for I/O API and multi-agent support / cleanup (#3650) 2019-01-03 15:15:36 +08:00
Eric Liang
47d36d7bd6
[rllib] Refactor pytorch custom model support (#3634) 2019-01-03 13:48:33 +08:00
Robert Nishihara
b6bcd18d65 Split profile table among many keys in the GCS. (#3676)
* Divide profile table among many keys in GCS.

* Fix, and remove --collect-profiling-data arg.

* Remove reference in doc.
2019-01-02 21:33:01 -08:00
Eric Liang
b8a9e3f106
[rllib] Remove uses of sgd_stepsize => lr (#3667)
* lr

* Update example-evolution-strategies.rst
2019-01-01 12:01:27 +08:00
Richard Liaw
aad3c50e2d
[tune] Cluster Fault Tolerance (#3309)
This PR introduces cluster-level fault tolerance for Tune by checkpointing global state. This occurs with relatively high frequency and allows users to easily resume experiments when the cluster crashes.

Note that this PR may affect automated workflows due to auto-prompting, but this is resolvable.
2018-12-29 11:42:25 +08:00
Robert Nishihara
5426234cd8 Update documentation to reflect 0.6.1 release. (#3622) 2018-12-24 11:10:04 -08:00
Eric Liang
9f63119a83
[rllib] Allow development without needing to compile Ray (#3623)
* wip

* lint

* wip

* wip

* rename

* wip

* Cleaner handling of cli prompt
2018-12-24 18:08:23 +09:00
Si-Yuan
a1995ff3b0 Resize logo in README. (#3619) 2018-12-23 22:59:23 -08:00
Eric Liang
ddc97864df [rllib] Add requested clarifications to test requirement of contrib docs (#3589) 2018-12-21 11:02:02 -08:00
Richard Liaw
e046a5c767
[tune] resources_per_trial from trial_resources (#3580)
Renaming variable due to user errors.
2018-12-20 19:00:47 -08:00
Eric Liang
303883a3b6 [rllib] [rfc] add contrib module and guideline for merging (#3565)
This adds guidelines for merging code into `rllib/contrib` vs `rllib/agents`. Also, clean up the agent import code to make registration easier.
2018-12-20 10:44:34 -08:00
Yuhong Guo
fb33fa9097 Enable function_descriptor in backend to replace the function_id (#3028) 2018-12-18 18:53:59 -05:00
Alexey Tumanov
3822b20319 [doc] update testing and dev instructions (#3562)
* [doc] update python testing command

* update installation/dev instructions
2018-12-18 14:45:24 -08:00
Eric Liang
db0dee573e
[rllib] Q-Mix implementation (Q-Mix, VDN, IQN, and Ape-X variants) (#3548) 2018-12-18 10:40:01 -08:00
YifengHuang
bc4aa85ea3 fix link in doc (#3567) 2018-12-18 00:10:55 -08:00
Tianming Xu
7767aba637 Note requirement cython==0.29.0 in installation instructions (#3555) 2018-12-17 20:43:47 +08:00
Chunyang Wen
5dcc333199 [sgd] Modify: add interface for model (#3458)
* Modify: add interface for model

* Modify: remove single quota and build; add metrics

* Modify: flatten into list of dict

* Update distributed_sgd.rst

* Modify: update format with scripts/format.sh

* Update sgd_worker.py
2018-12-12 21:23:25 -08:00
Richard Liaw
cc8f7db246
[docs] Improve cluster/docker docs (#3517)
- Surfaces local cluster usage
 - Increases visability of these instructions
 - Removes some docker docs (that are really out of scope for Ray
 documentation IMO)

Closes #3517.
2018-12-12 10:40:54 -08:00
Richard Liaw
e0fbb68e47
[tune] Custom Logging, Trial Name (#3465)
Adds support for custom loggers, custom trial strings, and custom sync commands. Closes #3034, #2985, and #3390.
2018-12-11 13:41:59 -08:00
Eric Liang
ce388a45cf
[rllib] Learner should not see clipped actions (#3496) 2018-12-09 21:57:11 -08:00
Eric Liang
cffe8f9806 Add option to evict keys LRU from the sharded redis tables (#3499)
* wip

* wip

* format

* wip

* note

* lint

* fix

* flag

* typo

* raise timeout

* fix

* optional get

* fix flag

* increase timeout in test

* update docs

* format
2018-12-09 05:48:52 -08:00
Si-Yuan
c2c501bbe6 Experimental asyncio support (#2015)
* Init commit for async plasma client

* Create an eventloop model for ray/plasma

* Implement a poll-like selector base on `ray.wait`. Huge improvements.

* Allow choosing workers & selectors

* remove original design

* initial implementation of epoll-like selector for plasma

* Add a param for `worker` used in `PlasmaSelectorEventLoop`

* Allow accepting a `Future` which returns object_id

* Do not need `io.py` anymore

* Create a basic testing model

* fix: `ray.wait` returns tuple of lists

* fix a few bugs

* improving performance & bug fixing

* add test

* several improvements & fixing

* fix relative import

* [async] change code format, remove old files

* [async] Create context wrapper for the eventloop

* [async] fix: context should return a value

* [async] Implement futures grouping

* [async] Fix bugs & replace old functions

* [async] Fix bugs found in tests

* [async] Implement `PlasmaEpoll`

* [async] Make test faster, add tests for epoll

* [async] Fix code format

* [async] Add comments for main code.

* [async] Fix import path.

* [async] Fix test.

* [async] Compatibility.

* [async] less verbose to not annoy the CI.

* [async] Add test for new API

* [async] Allow showing debug info in some of the test.

* [async] Fix test.

* [async] Proper shutdown.

* [async] Lint~

* [async] Move files to experimental and create API

* [async] Use async/await syntax

* [async] Fix names & styles

* [async] comments

* [async] bug fixing & use pytest

* [async] bug fixing & change tests

* [async] use logger

* [async] add tests

* [async] lint

* [async] type checking

* [async] add more tests

* [async] fix bugs on waiting a future while timeout. Add more docs.

* [async] Formal docs.

* [async] Add typing info since these codes are compatible with py3.5+.

* [async] Documents.

* [async] Lint.

* [async] Fix deprecated call.

* [async] Fix deprecated call.

* [async] Implement a more reasonable way for dealing with pending inputs.

* [async] Fix docs

* [async] Lint

* [async] Fix bug: Type for time

* [async] Set our eventloop as the default eventloop so that we can get it through `asyncio.get_event_loop()`.

* [async] Update test & docs.

* [async] Lint.

* [async] Temporarily print more debug info.

* [async] Use `Poll` as a default option.

* [async] Limit resources.

* new async implementation for Ray

* implement linked list

* bug fix

* update

* support seamless async operations

* update

* update API

* fix tests

* lint

* bug fix

* refactor names

* improve doc

* properly shutdown async_api

* doc

* Change the table on the index page.

* Adjust table size.

* Only keeps `as_future`.

* change how we init connection

* init connection in `ray.worker.connect`

* doc

* fix

* Move initialization code into the module.

* Fix docs & code

* Update pyarrow version.

* lint

* Restore index.rst

* Add known issues.

* Apply suggestions from code review

Co-Authored-By: suquark <suquark@gmail.com>

* rename

* Update async_api.rst

* Update async_api.py

* Update async_api.rst

* Update async_api.py

* Update worker.py

* Update async_api.rst

* fix tests

* lint

* lint

* replace the magic number
2018-12-06 17:39:05 -08:00
Eric Liang
412aaa5195
[tune] Deprecate ambiguous function values (use tune.function / tune.sample_from instead) (#3457)
* wip

* exclude
2018-12-06 11:35:20 -08:00
Eric Liang
d864f299d7
[rllib] fixes from dogfooding multi-agent (#3456)
auto wrap multi-agent dict and tuple spaces by keeping a policy -> preprocessor in the sampler
add some Q-learning debug stats
report min, max of custom metrics
better errors
2018-12-05 23:31:45 -08:00
Eric Liang
93a9d32288
[docs] Switch docs to use rllib train instead of train.py 2018-12-04 17:36:06 -08:00
Eric Liang
ce355d13d4
[rllib] Allow envs to be auto-registered; add on_train_result callback with curriculum example (#3451)
* train step and docs

* debug

* doc

* doc

* fix examples

* fix code

* integration test

* fix

* ...

* space

* instance

* Update .travis.yml

* fix test
2018-12-03 23:15:43 -08:00
Eric Liang
d8205976e8
[rllib] Auto clip actions to Box space range; deprecate squash_to_range (#3426)
* fix clip

* tweak wording

* remove squash entirely

* Update rllib-models.rst

* fix argument order

* Apply suggestions from code review

Co-Authored-By: ericl <ekhliang@gmail.com>
2018-12-03 19:55:25 -08:00
Eric Liang
13c8ce4d84 Update README.rst with 0.6.0 version number. (#3453) 2018-12-01 19:16:45 -08:00
Devin Petersohn
57512616e1 Update readme to contain logo (#3443)
* Adding logo to readme

* Updating link

* Add badge

* Addressing comments

* Moving logo

* Change align

* Move image
2018-11-30 18:28:35 -08:00
GiliR4t1qbit
454d3aa07d [docs] Snippet did not have a code-block tag above it (#3442) 2018-11-30 16:39:40 -08:00
Eric Liang
07d8cbf414
[rllib] Support batch norm layers (#3369)
* batch norm

* lint

* fix dqn/ddpg update ops

* bn model

* Update tf_policy_graph.py

* Update multi_gpu_impl.py

* Apply suggestions from code review

Co-Authored-By: ericl <ekhliang@gmail.com>
2018-11-29 13:33:39 -08:00
Robert Nishihara
82863b5251
[autoscaler] Update autoscaler to use heartbeat batches. (#3409) 2018-11-27 23:46:27 -08:00
Eric Liang
f0df97db6f
[rllib] example and docs on how to use parametric actions with DQN / PG algorithms (#3384) 2018-11-27 23:35:19 -08:00
Eric Liang
0d56fc10cc Move setproctitle to ray[debug] package (#3415) 2018-11-27 09:50:59 -08:00
Eric Liang
8b76bab25c
[rllib] docs for td3 (#3381)
* td3 doc

* Update rllib-env.rst
2018-11-22 13:36:47 -08:00
Richard Liaw
c24d87b4d1 [autoscaler] Submit command (#3312) 2018-11-20 14:03:34 -08:00
Eric Liang
abdc3b592e
[rllib] Update multi-gpu impala numbers (#3327) 2018-11-19 20:55:27 -08:00
Eric Liang
61e3bbbfee
Update stale example links 2018-11-17 15:40:38 -08:00
Robert Nishihara
98edf752a9 Note requirement cython==0.27.3 in installation instructions. (#3322) 2018-11-15 15:27:19 -08:00
Eric Liang
706dc1d473
[rllib] Add test for multi-agent support and fix IMPALA multi-agent (#3289)
IMPALA support for multiagent was broken since IMPALA has a requirement that batch sizes be of a certain length. However multi-agent envs can create variable-length batches.

Fix this by adding zero-padding as needed (similar to the RNN case).
2018-11-14 14:14:07 -08:00
Eric Liang
65c27c70cf [rllib] Clean up agent resource configurations (#3296)
Closes #3284
2018-11-13 18:00:03 -08:00