Commit graph

3093 commits

Author SHA1 Message Date
Hao Chen
0131353d42 [gRPC] Migrate gcs data structures to protobuf (#5024) 2019-06-25 14:31:19 -07:00
Richard Liaw
bd8aceb896 [ci] Change Jenkins to py3 (#5022)
* conda3

* integration

* add nevergrad, remotedata

* pytest 0.3.1

* otherdockers

* setup

* tune
2019-06-24 21:50:37 -07:00
Ashwinee Panda
11ccf66346 [docs] docs for running Tensorboard without sudo (#5015)
* Instructions for running Tensorboard without sudo

When we run Tensorboard to visualize the results of Ray outputs on multi-user clusters where we don't have sudo access, such as RISE clusters, a few commands first need to be run to make sure Tensorboard can edit the tmp directory. This is a pretty common use case, so I figured we may as well put it in the documentation for Tune.

* Update tune-usage.rst
2019-06-24 11:26:53 -07:00
Qing Wang
e33d0eac68 Add dynamic worker options for worker command. (#4970)
* Add fields for fbs

* WIP

* Fix compilation errors

* Add java part

* Fix

* Fix

* Fix

* Fix lint

* Refine API

* address comments and add test

* Fix

* Address comment.

* Address comments.

* Fix linting

* Refine

* Fix lint

* WIP: address comment.

* Fix java

* Fix py

* Refine

* Fix

* Fix

* Fix linting

* Fix lint

* Address comments

* WIP

* Fix

* Fix

* minor refine

* Fix lint

* Fix raylet test.

* Fix lint

* Update src/ray/raylet/worker_pool.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/runtime/src/main/java/org/ray/runtime/AbstractRayRuntime.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Fix test.

* Update src/ray/raylet/worker_pool.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Fix

* Fix lint

* Fix lint

* Fix

* Address comments.

* Fix linting
2019-06-23 18:08:33 +08:00
Philipp Moritz
2e342ef71f Fix tensorflow-1.14 installation in jenkins (#5007) 2019-06-21 11:04:40 -07:00
Joey Jiang
a7f84b536f Fix no cpus test (#5009) 2019-06-21 17:08:25 +08:00
Philipp Moritz
3b23d94cb8 Fix valgrind build by installing new version of valgrind (#5008) 2019-06-20 22:22:37 -07:00
Richard Liaw
31b6da12f9 [tune] Tutorial UX Changes (#4990)
* add integration, iris, ASHA, recursive changes, set reuse_actors=True, and enable Analysis as a return object

* docstring

* fix up example

* fix

* cleanup tests

* experiment analysis
2019-06-21 12:59:49 +08:00
Eric Liang
1d17125333 temp fix for build (#5006) 2019-06-20 18:07:44 -07:00
Andrew Berger
e59e8074dd fix handling of non-integral timeout values in signal.receive (#5002) 2019-06-20 15:33:40 -07:00
Qing Wang
7bda5edc16 Fix Java CI failure (#4995) 2019-06-19 11:36:21 +08:00
Hao Chen
2bf92e02e2 [gRPC] Use gRPC for inter-node-manager communication (#4968) 2019-06-17 19:00:50 +08:00
Qing Wang
b08765a08b Fix a crash when unknown worker registering to raylet (#4992) 2019-06-17 13:34:23 +08:00
Simon Mo
05e2748070 Inherit Function Docstrings and other metadata (#4985) 2019-06-15 11:01:27 -07:00
Tianhong Dai
1b86e551fb Fix bugs in the a3c code template. (#4984) 2019-06-14 17:22:36 -07:00
Zhijun Fu
37abdb283f [Core worker] add store & task provider (#4966) 2019-06-14 18:35:32 +08:00
Hao Chen
3c92b2ee4d Upgrade CI clang-format to 6.0 (#4976) 2019-06-14 14:52:32 +08:00
Eric Liang
fa1d4c9807 [rllib] Fix DDPG example (#4973) 2019-06-13 15:07:46 -07:00
Qing Wang
ef1af49efd [Java] Fix bug of BaseID in multi-threading case. (#4974) 2019-06-13 20:52:41 +08:00
Robert Nishihara
d2f5b71c3b Remove typing from setup.py install_requirements. (#4971) 2019-06-12 15:02:12 -07:00
Stephanie Wang
89ca5eeb29 Flush all tasks from local lineage cache after a node failure (#4964) 2019-06-12 11:13:39 -07:00
Peter Schafhalter
e0e52f1871 [sgd] Add non-distributed PyTorch runner (#4933)
* Add non-distributed PyTorch runner

* use dist.is_available() instead of checking OS

* Nicer exception

* Fix bug in choosing port

* Refactor some code

* Address comments

* Address comments
2019-06-11 22:38:34 -07:00
Zhijun Fu
472c36ed1e [core worker] add task submission & execution interface (#4922) 2019-06-12 10:10:12 +08:00
Philipp Moritz
ebb3b3b928 [core] Use int64_t instead of int to keep track of fractional resources (#4959) 2019-06-10 23:49:04 -07:00
Philipp Moritz
1e2b649580 Use proper session directory for debug_string.txt (#4960) 2019-06-10 23:46:37 -07:00
Robert Nishihara
6f48992322 Make release stress tests work and improve them. (#4955) 2019-06-10 23:04:01 -07:00
Qing Wang
e6baffba56 [Java] Add inner class Builder to build call options. (#4956)
* Add Builder class

* format

* Refactor by IDE

* Remove unnecessary dependency
2019-06-10 23:52:08 +08:00
Eric Liang
4f8e100fe0 fix (#4950) 2019-06-10 10:20:55 +08:00
Qing Wang
671c0f769e [Java] Fix serializing issues of RaySerializer (#4887)
* Fix

* Address comment.
2019-06-08 22:56:00 +08:00
Robert Nishihara
ec8aaf011b Upload wheels on Travis to branchname/commit_id. (#4949) 2019-06-07 23:20:29 -07:00
Robert Nishihara
85b82b2454 Update aws keys for uploading wheels to s3. (#4948) 2019-06-07 23:19:10 -07:00
Robert Nishihara
a82e8118a0 Fix resource bookkeeping bug with acquiring unknown resource. (#4945) 2019-06-07 21:07:27 -07:00
Eric Liang
77689d1116 [rllib] Port remainder of algorithms to build_trainer() pattern (#4920) 2019-06-07 16:45:36 -07:00
Eric Liang
9e328fbe6f [rllib] Add docs on how to use TF eager execution (#4927) 2019-06-07 16:42:37 -07:00
Stephanie Wang
873d45b467 Flush lineage cache on task submission instead of execution (#4942) 2019-06-07 11:35:18 -07:00
Yuhong Guo
5eff47b657 [C++] Add hash table to Redis-Module (#4911) 2019-06-07 16:11:37 +08:00
Stephanie Wang
cbc67fc750 [doc] Update developer docs with bazel instructions (#4944) 2019-06-06 18:18:24 -07:00
Robert Nishihara
c3f8fc1c44 Update version number in documentation after release 0.7.0 -> 0.7.1 and 0.8.0.dev0 -> 0.8.0.dev1. (#4941) 2019-06-06 17:22:45 -07:00
Robert Nishihara
a0f14e9e6c Bump version from 0.7.1 to 0.8.0.dev1. (#4937) 2019-06-06 11:20:05 -07:00
Timon Ruban
2702b15b04 [tune] Add requirements-dev.txt and update docs for contributing (#4925)
* Add requirements-dev.txt and update docs.

* Update doc/source/tune-contrib.rst

Co-Authored-By: Richard Liaw <rliaw@berkeley.edu>

* Unpin everything except for yapf.
2019-06-05 09:04:36 -07:00
Hao Chen
d106283769 Better organize ray_common module (#4898) 2019-06-04 23:19:09 -07:00
Timon Ruban
c2253d2313 [tune] Make PBT Quantile fraction configurable (#4912) 2019-06-03 18:45:15 -07:00
Zhijun Fu
b674c4a5ba [Core Worker] implement ObjectInterface and add test framework (#4899) 2019-06-03 19:59:43 +08:00
Hersh Godse
89722ff003 [tune] Directional metrics for components (#4120) (#4915) 2019-06-02 22:13:40 -07:00
Richard Liaw
084b22181e Fix local cluster yaml (#4918) 2019-06-03 08:45:57 +08:00
Eric Liang
7501ee51db [rllib] Rename PolicyEvaluator => RolloutWorker (#4820) 2019-06-03 06:49:24 +08:00
Eric Liang
99eae05cf6 [tune] Disallow setting resources_per_trial when it is already configured (#4880)
* disallow it

* import fix

* fix example

* fix test

* fix tests

* Update mock.py

* fix

* make less convoluted

* fix tests
2019-06-03 06:47:39 +08:00
Akshat Gokhale
d86ee8c83e fetching objects in parallel in _get_arguments_for_execution (#4775) 2019-06-01 23:35:48 -07:00
Eric Liang
665d081fe9 [rllib] Rough port of DQN to build_tf_policy() pattern (#4823) 2019-06-02 14:14:31 +08:00
Peter Schafhalter
c2ade075a3 [sgd] Distributed Training via PyTorch (#4797)
Implements distributed SGD using distributed PyTorch.
2019-06-01 21:39:22 -07:00