Commit graph

969 commits

Author SHA1 Message Date
Zhijun Fu
eab595777f Support multiple store providers in ObjectInterface (#5452) 2019-08-21 11:16:48 +08:00
micafan
52a7c1d673 modify ActorStateAccessor::AsyncGet callback (#5417) 2019-08-21 10:54:33 +08:00
Hao Chen
f2b3c273db Fix direct actor transport not treating some tasks as failed (#5464) 2019-08-20 12:44:38 -07:00
micafan
da7bdacea5 support for subscription to an actor (#5269) 2019-08-20 20:32:53 +08:00
Philipp Moritz
599cc2be60 Revert raylet to worker GRPC communication back to asio (#5450) 2019-08-17 19:11:32 -07:00
micafan
47aa2b110d Make GCS Client thread-safe. (#5413) 2019-08-17 17:21:09 +08:00
Kai Yang
b1aae0e398 [Java worker] Migrate task execution and submission on top of core worker (#5370) 2019-08-16 13:52:13 +08:00
Zhijun Fu
b1e010feec Fix TestDirectActorTaskCrossNodesFailure test (#5406) 2019-08-11 11:15:52 +08:00
Philipp Moritz
8d6c50c821 Fix compiler warnings and make warnings fatal (#5375) 2019-08-07 14:04:05 -07:00
Qing Wang
d372f24e3c [ID Refactor] Refactor ActorID, TaskID and ObjectID (#5286)
* Refactor ActorID, TaskID on the Java side.

Left a TODO comment

WIP for ObjectID

ADD test

Fix

Add java part

Fix Java test

Fix

Refine test.

Enable test in CI

* Extra a helper function.

* Resolve TODOs

* Fix Python CI

* Fix Java lint

* Update .travis.yml

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address some comments.

Address some comments.

Add id_specification.rst

Reanme id_specification.rst to id_specification.md

typo

Address zhijun's comments.

Fix test

Address comments.

Fix lint

Address comments

* Fix test

* Address comments.

* Fix build error

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Address comments.

* Update C++ part to make sure task id is generated determantic

* WIP

* Fix core worker

* Fix Java part

* Fix comments.

* Add Python side

* Fix python

* Address comments

* Fix linting

* Fix

* Fix C++ linting

* Add JobId() method to TaskID

* Fix linting

* Update src/ray/common/id.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ActorId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments

* Add DriverTaskId embeding job id

* Fix tests

* Add python dor_fake_driver_id

* Address comments and fix linting

* Fix CI
2019-08-07 11:04:51 +08:00
Hao Chen
3ad2fe76e0 Cap concurrent requests (#5341) 2019-08-06 13:56:19 -07:00
Joey Jiang
02c5d2be20 Add common preprocessing for each request in node manager. (#5296) 2019-08-06 20:48:58 +08:00
Eric Liang
0a3ff489fa Send raylet error logs through the log monitor (#5351) 2019-08-05 23:35:09 -07:00
Kai Yang
384cbfb211 Fix duplicated timeout logic in AbstractRayRuntime.get() (#5338) 2019-08-06 13:36:49 +08:00
Qing Wang
cc5c78b1da Fix the issue of not initializing GLOG 2019-08-05 13:26:09 -07:00
Zhijun Fu
134c6bd128 [direct call] In memory store (#5303) 2019-08-05 13:14:45 -07:00
Stephanie Wang
e218e615df Lineage cache performance optimization to avoid duplicate GCS requests #5327 2019-07-31 10:43:29 -07:00
Hao Chen
991e71dde6 Submit task asynchronously from raylet client (#5313) 2019-07-30 12:58:57 -07:00
Zhijun Fu
eb307f93f8 Support direct actor call (#5183) 2019-07-30 17:47:17 +08:00
micafan
b3bcf59148 Rename ClientTableData to GcsNodeInfo (#5251) 2019-07-30 11:22:47 +08:00
Simon Mo
3ba8680963 Bump version to 0.8.0.dev3 (#5308) 2019-07-29 18:28:38 -07:00
Simon Mo
3b00144e7d Bump version to 0.7.3 (#5301) 2019-07-29 10:25:32 -07:00
Qing Wang
1465a30ea9 Fix releasing CPUs incorrectly when actor creation task blocked. (#5271)
* Fix

* Remove useless log

* Address

* Fix typo

* sleep
2019-07-28 15:46:17 +08:00
micafan
6f682db99d avoid copying ActorTableData when NodeMananger updates an actor to GCS (#5244) 2019-07-26 11:17:24 +08:00
Joey Jiang
40395acadf [gRPC] Migrate raylet client implementation to grpc (#5120) 2019-07-25 14:48:56 +08:00
Eric Liang
5b76238bce Fix two types of eviction hangs (#5225) 2019-07-23 21:20:17 -07:00
Stephanie Wang
15959b0f0d Leave ray.wait calls open until the task or actor exits (#5234)
* Regression test

* Split TaskDependencyManager::SubscribeDependencies into ray.get and ray.wait dependencies
- Some initial implementation

* unit test

* Improve unit tests for TaskDependencyManager

* Implement SubscribeWaitDependencies and UnsubscribeWaitDependencies, unit tests passing

* Add ray.wait python test for drivers that exit early

* Add WorkerID to Worker

* Update test to use two nodes

* Regression test for ray.wait passes

* Extend regression test to include ray.wait from an actor

* Fix ClientID and WorkerIDs

* lint

* lint

* Remove unnecessary ray_get argument

* fix build
2019-07-23 11:55:28 -07:00
Qing Wang
a3d4f9f16d Fix the issue when passing multiple options in one string (#5241)
* Fix

* Fix linting

* Fix linting

* Address

* Fix test
2019-07-23 12:28:54 +08:00
Zhijun Fu
aa42328874 [direct call] add local plasma provider (#5184) 2019-07-19 11:29:12 +08:00
micafan
b5b8c1d361 [GCS] introduce new gcs client and refactor actor table (#5058) 2019-07-19 11:28:34 +08:00
Richard Liaw
3e0ad11ae0 Add heartbeat test + Fix monitor.py (#5191) 2019-07-16 21:59:48 -07:00
Kai Yang
806524384b [Java worker] Refactor object store and worker context on top of core worker (#5079) 2019-07-16 20:58:02 +08:00
Edward Oakes
e5be5fd46d Remove dependencies from TaskExecutionSpecification (#5166) 2019-07-15 18:15:21 -07:00
Hao Chen
ea6aa6409a Reconstruct failed actors without sending tasks. (#5161)
* fast reconstruct dead actors

* add test

* fix typos

* remove debug print

* small fix

* fix typos

* Update test_actor.py
2019-07-15 10:25:09 -07:00
Hao Chen
7342117710 Fix a multithreading bug in grpc ClientCall (#5196) 2019-07-15 14:49:53 +08:00
Philipp Moritz
322b5166ad Update arrow to include user defined status for plasma (#5156) 2019-07-12 22:51:14 -07:00
Hao Chen
f5a87b88a3 Fix: ServerCallFactory's destructor not marked as virtual (#5185) 2019-07-13 09:38:47 +08:00
Stephanie Wang
f46c555e9e Only get actor ID if actor task (#5180) 2019-07-12 14:31:21 +08:00
vipulharsh
3b42d5ccb1 Track newly created actor's parent actor (#5098)
* Track parent actor of actor

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/raylet/node_manager.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* fixing a comment

* Fixing typo in a comment

* capturing task_spec instead of actor_data

* adding const for some local variables

* changing an if else to else

* Linted version

* use updated method to create task from task_data

Change-Id: I9c1a65134dc23a2d175047e96b86ab9d9cf61971

* fixing linter issues

Change-Id: I1def06218130b399d2527b999258aecf9abb98dd
2019-07-11 14:52:04 -07:00
Philipp Moritz
ccee77aafd fix node_failures.py (#5167) 2019-07-11 11:40:13 -07:00
Zhijun Fu
1649f1370e [direct call] changes raylet to push tasks to worker (#5140)
* refactor grpc server

* format

* change GetTask() to PushTask()

* change PushTask to AssignTask

* format

* add resource_ids

* move done_callback to server call

* remove SetTaskHandler and initialize it in task receiver's constructor

* format

* resolve comments

* update

* update

* Update src/ray/core_worker/core_worker.cc

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* resolve comments

* format

* Update src/ray/core_worker/transport/raylet_transport.cc

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* resolve comments

* resolve comments

* fix build

* format

* fix

* format

* noop
2019-07-11 11:01:32 -07:00
Hao Chen
fd835d107e Move task to common module and add checks in getter methods (#5147) 2019-07-11 17:07:04 +08:00
Qing Wang
f2293243cc [ID Refactor] Shorten the length of JobID to 4 bytes (#5110)
* WIP

* Fix

* Add jobid test

* Fix

* Add python part

* Fix

* Fix tes

* Remove TODOs

* Fix C++ tests

* Lint

* Fix

* Fix exporting functions in multiple ray.init

* Fix java test

* Fix lint

* Fix linting

* Address comments.

* FIx

* Address and fix linting

* Refine and fix

* Fix

* address

* Address comments.

* Fix linting

* Fix

* Address

* Address comments.

* Address

* Address

* Fix

* Fix

* Fix

* Fix lint

* Fix

* Fix linting

* Address comments.

* Fix linting

* Address comments.

* Fix linting

* address comments.

* Fix
2019-07-11 14:25:16 +08:00
Kai Yang
43b6513d19 [GCS] Move node resource info from client table to resource table (#5050) 2019-07-11 13:17:19 +08:00
Philipp Moritz
e6a81d40a5 [stability] Make task result for RemoveTask optional (#5146)
* make task result for RemoveTask optional

* lint

* update

* update

* update

* rename

* lint
2019-07-10 13:33:41 -07:00
Joey Jiang
e55c8ca165 Fix crash because of the reference to deleted variable in grpc server call (#5158) 2019-07-10 14:06:21 +08:00
Joey Jiang
5733690aa6 Add success and fail callback of grpc sending reply (#5141) 2019-07-09 17:03:57 +08:00
Hao Chen
8a30b93e42 Define common data structures with protobuf. (#5121) 2019-07-08 22:41:37 +08:00
Joey Jiang
274233962f Remove unused connection file in object manager (#5123) 2019-07-08 10:59:36 +08:00
Philipp Moritz
c5253cc300 Add job table to state API (#5076) 2019-07-06 00:05:48 -07:00