Commit graph

719 commits

Author SHA1 Message Date
Zhijun Fu
54d5969cea [grpc] Add grpc server to worker (#5054)
* refactor grpc server

* format

* change GetTask() to PushTask()

* change PushTask to AssignTask

* format

* update

* fix test

* format

* Update src/ray/rpc/worker_client.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update BUILD.bazel

* Update src/ray/core_worker/task_execution.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* update

* format

* address comments

* format

* Update src/ray/rpc/worker/worker_server.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/protobuf/worker.proto

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* format

* fix

* format
2019-07-04 20:16:42 +08:00
Stephanie Wang
71d4637b75 [core worker] Refactor CoreWorker member classes (#5062)
* Move store client mutex inside CoreWorkerPlasmaStoreProvider

* Move PlasmaClient inside CoreWorkerStoreProvider

* Remove CoreWorkerObjectInterface's ref to CoreWorker

* Remove WorkerLanguage

* Remove CoreWorkerTaskInterface's ref to CoreWorker

* Remove CoreWorkerTaskExecutionInterface's ref to CoreWorker

* lint

* move comment

* Fix build

* Fix build
2019-07-02 15:30:30 -07:00
Kai Yang
1cf7728f35 [Core worker] Serialize ActorHandle in core worker. Make ActorHandle thread safe. (#5034)
* Serialize ActorHandle in core worker. Make ActorHandle thread safe.

* Address comments

* Address comments

* Address comments

* Address comments

* lint

* Address comments

* Address comments

* Address comments

* Address comments

* Minor update

* Address comments

* lint
2019-07-02 16:48:43 +08:00
Qing Wang
247f95b3ff Refine RegisterClientRequest message to make it clearer. (#5057)
* Transfer driver task id explicitly

* Refine

* Fix and add comment.

* add more

* Fix

* Fix

* Add comments

* Fix
2019-07-02 14:26:19 +08:00
Simon Mo
6c4c1d444d Update VersionKey in stats (#5070) 2019-06-30 18:23:12 +08:00
Kai Yang
4ccb7b05cc [Core worker] Add metadata support in object interface (#5031) 2019-06-28 11:35:03 -07:00
Hao Chen
cefbb0c94c Fix driver id in TaskInfo (#5055) 2019-06-28 12:56:48 +08:00
Kai Yang
a39982e676 [Core worker] Task execution passes TaskInfo struct to executor (#5032) 2019-06-28 10:59:45 +08:00
Joey Jiang
d6bbbdef35 Use gRPC to handle communication and data transmission between object manager (#4996) 2019-06-28 10:56:34 +08:00
Qing Wang
62e4b591e3 [ID Refactor] Rename DriverID to JobID (#5004)
* WIP

WIP

WIP

Rename Driver -> Job

Fix compilation

Fix

Rename in Java

In py

WIP

Fix

WIP

Fix

Fix test

Fix

Fix C++ linting

Fix

* Update java/runtime/src/main/java/org/ray/runtime/config/RayConfig.java

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/core_worker/core_worker.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Fix

* Fix CI

* Fix cpp linting

* Fix py lint

* Fix

* Address comments and fix

* Address comments

* Address

* Fix import_threading
2019-06-28 00:44:51 +08:00
Hao Chen
469ae41013 Fix memory leak in rpc ServerCall and ClientCall (#5046) 2019-06-27 13:19:47 +08:00
Stephanie Wang
1a8d0af814 Remove debug check for uncommitted lineage (#5038) 2019-06-26 11:21:00 -07:00
Zhijun Fu
bb8e75b532 [grpc] refactor rpc server to support multiple io services (#5023) 2019-06-25 19:08:09 -07:00
Hao Chen
0131353d42 [gRPC] Migrate gcs data structures to protobuf (#5024) 2019-06-25 14:31:19 -07:00
Qing Wang
e33d0eac68 Add dynamic worker options for worker command. (#4970)
* Add fields for fbs

* WIP

* Fix compilation errors

* Add java part

* Fix

* Fix

* Fix

* Fix lint

* Refine API

* address comments and add test

* Fix

* Address comment.

* Address comments.

* Fix linting

* Refine

* Fix lint

* WIP: address comment.

* Fix java

* Fix py

* Refine

* Fix

* Fix

* Fix linting

* Fix lint

* Address comments

* WIP

* Fix

* Fix

* minor refine

* Fix lint

* Fix raylet test.

* Fix lint

* Update src/ray/raylet/worker_pool.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/runtime/src/main/java/org/ray/runtime/AbstractRayRuntime.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Fix test.

* Update src/ray/raylet/worker_pool.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Fix

* Fix lint

* Fix lint

* Fix

* Address comments.

* Fix linting
2019-06-23 18:08:33 +08:00
Hao Chen
2bf92e02e2 [gRPC] Use gRPC for inter-node-manager communication (#4968) 2019-06-17 19:00:50 +08:00
Qing Wang
b08765a08b Fix a crash when unknown worker registering to raylet (#4992) 2019-06-17 13:34:23 +08:00
Zhijun Fu
37abdb283f [Core worker] add store & task provider (#4966) 2019-06-14 18:35:32 +08:00
Hao Chen
3c92b2ee4d Upgrade CI clang-format to 6.0 (#4976) 2019-06-14 14:52:32 +08:00
Stephanie Wang
89ca5eeb29 Flush all tasks from local lineage cache after a node failure (#4964) 2019-06-12 11:13:39 -07:00
Zhijun Fu
472c36ed1e [core worker] add task submission & execution interface (#4922) 2019-06-12 10:10:12 +08:00
Philipp Moritz
ebb3b3b928 [core] Use int64_t instead of int to keep track of fractional resources (#4959) 2019-06-10 23:49:04 -07:00
Philipp Moritz
1e2b649580 Use proper session directory for debug_string.txt (#4960) 2019-06-10 23:46:37 -07:00
Robert Nishihara
a82e8118a0 Fix resource bookkeeping bug with acquiring unknown resource. (#4945) 2019-06-07 21:07:27 -07:00
Stephanie Wang
873d45b467 Flush lineage cache on task submission instead of execution (#4942) 2019-06-07 11:35:18 -07:00
Yuhong Guo
5eff47b657 [C++] Add hash table to Redis-Module (#4911) 2019-06-07 16:11:37 +08:00
Robert Nishihara
c3f8fc1c44 Update version number in documentation after release 0.7.0 -> 0.7.1 and 0.8.0.dev0 -> 0.8.0.dev1. (#4941) 2019-06-06 17:22:45 -07:00
Hao Chen
d106283769 Better organize ray_common module (#4898) 2019-06-04 23:19:09 -07:00
Zhijun Fu
b674c4a5ba [Core Worker] implement ObjectInterface and add test framework (#4899) 2019-06-03 19:59:43 +08:00
Yuhong Guo
0066d7cf2a Hotfix for change of from_random to FromRandom (#4909) 2019-05-31 16:41:31 +08:00
Yuhong Guo
1f0809e2b4 Refactor ID Serial 2: change all ID functions to CamelCase (#4896) 2019-05-31 11:31:18 +08:00
Hao Chen
2912a7cb86 Initial high-level code structure of CoreWorker. (#4875) 2019-05-30 02:43:17 -07:00
Qing Wang
b7c284aaa3 Refactor redis callback handling (#4841)
* Add CallbackReply

* Fix

* fix linting by format.sh

* Fix linting

* Address comments.

* Fix
2019-05-30 11:54:30 +08:00
Yuhong Guo
fa0892f285 Replace ReturnIds with NumReturns in TaskInfo to reduce the size (#4854)
* Refine TaskInfo

* Fix

* Add a test to print task info size

* Lint

* Refine
2019-05-28 13:30:41 +08:00
Yuhong Guo
1a39fee9c6 Refactor ID Serial 1: Separate ObjectID and TaskID from UniqueID (#4776)
* Enable BaseId.

* Change TaskID and make python test pass

* Remove unnecessary functions and fix test failure and change TaskID to 16 bytes.

* Java code change draft

* Refine

* Lint

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/BaseId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ObjectId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comment

* Lint

* Fix SINGLE_PROCESS

* Fix comments

* Refine code

* Refine test

* Resolve conflict
2019-05-22 14:46:30 +08:00
Qing Wang
081708bdef [Java] Dynamic resource API in Java (#4824) 2019-05-21 17:13:48 +08:00
Stephanie Wang
cb1a195ca2 Queue tasks in the raylet in between async callbacks (#4766)
* Add a SWAP TaskQueue so that we can keep track of tasks that are temporarily dequeued

* Fix bug where tasks that fail to be forwarded don't appear to be local by adding them to SWAP queue

* cleanups

* updates

* updates
2019-05-15 10:23:25 -07:00
Stephanie Wang
1622fc21fc Fatal check if object store dies (#4763) 2019-05-13 11:59:12 -07:00
Romil Bhardwaj
004440f526 Dynamic Custom Resources - create and delete resources (#3742) 2019-05-11 20:06:04 +08:00
ashione
ccc540adf1 Remove mutable data function in id(UniqueID and its subclass) (#4696)
* remove mutable data in jni
fix flatbuffer string to ID check

* replace sizeof(ID) by ID.size()

sizeof(ID) = 20 if no other members in class

* fix new string unbounded

* code polished according to comments

* lazy hash eval
2019-05-09 16:41:48 +08:00
Yuhong Guo
481bfbde58 [c++] Allow RayConfig to have items other than integer (#4701)
* Allow RayConfig to have items other than integer

* Fix a small bug
2019-05-09 11:18:28 +08:00
Romil Bhardwaj
686d4caefe Updates to scheduling objects to support dynamic custom resources (#4465) 2019-04-27 18:45:23 -07:00
Qing Wang
c26f24ab9f Integrate metric items into raylet (#4602) 2019-04-25 11:40:24 +08:00
Qing Wang
f39b6747e5 Refactor command line argument parsing with gflags (#4676) 2019-04-24 14:53:07 +08:00
William Ma
c99e3caaca Change resource bookkeeping to account for machine precision. (#4533) 2019-04-23 11:59:53 -07:00
justinwyang
8dfc833a8b Change all instances of JobID to DriverID. (#4431) 2019-04-22 16:28:09 -07:00
Wang Qing
d951eb740f [Metrics] Add a flag to disable stdout exporter (#4634) 2019-04-19 19:06:30 -07:00
Hao Chen
d52b080081 [Java] Avoid unnecessary memory copy and add a benchmark (#4611) 2019-04-14 00:17:04 +08:00
Romil Bhardwaj
0f42f87ebc Updating zero capacity resource semantics (#4555) 2019-04-12 16:53:57 -07:00
Wang Qing
fe07a5b4b1 Add delete_creating_tasks option for internal.free() (#4588)
* add delete creating task objects.

* format code style

* Fix lint

* add tests and address comments.

* Refine test

* Refine java test

* Fix CI

* Refine

* Fix lint

* Fix CI
2019-04-12 13:38:31 +08:00