1199 commits

Author SHA1 Message Date
Edward Oakes
4e049232a8
shared_ptr (#5830) 2019-10-02 16:29:04 -07:00
Edward Oakes
963bbe8bbd
Move profiling to c++ (#5771)
* Move profiling to c++

* comments

* Fix tests

* Start after constructor

* fix comment

* always init logging

* Fix logging

* fix logging issue

* shared_ptr for profiler

* DEBUG -> WARNING

* fix killed_ init

* Fix flaky checkpointing tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix import

* fix build

* use boost::asio

* fix double const

* Properly reset async_wait

* remove SIGINT

* Change error message

* increase timeout

* small nits

* Don't trap on SIGINT

* -v for tune

* Fix test
2019-10-01 10:06:25 -07:00
Edward Oakes
86610a30c9
[flaky test] Fix flaky checkpointing tests (#5791)
* Fix flaky checkpointing tests

* Fix checkpoint test logic

* Fix exception matching

* timeout exception

* Fix import

* fix build
2019-09-27 11:03:07 -07:00
Eric Liang
b5da32df78 Bump Ray version in documentation to dev5 (#5794) 2019-09-27 00:19:17 -07:00
Edward Oakes
8a33891a40
Include object size in full error (#5782) 2019-09-25 17:04:17 -07:00
Zhijun Fu
ea9376c9ce Fix flaky core worker tests because of race condition in gcs client subscription (#5735) 2019-09-24 22:47:38 +08:00
Edward Oakes
61e5d674be
Push driver task in core worker (#5752) 2019-09-23 10:53:55 -05:00
Philipp Moritz
f4deecb5ab Fix travis error in direct_actor_transport.cc (#5710) 2019-09-15 22:19:20 -07:00
Eric Liang
4bf7de084d Speed up TaskSpecification copy (#5709) 2019-09-15 19:57:34 -07:00
Eric Liang
4979b8c4d9
Ordered execution of tasks per actor handle (#5664) 2019-09-14 22:31:33 -07:00
Edward Oakes
a5d7de6aaf [core worker] Python core worker normal task submission (#5566) 2019-09-14 13:02:53 -07:00
Edward Oakes
07c4c6367a [core worker] Python core worker object interface (#5272) 2019-09-12 23:07:46 -07:00
Edward Oakes
0bf79cfbde Properly short circuit core worker Get() on exception (#5672) 2019-09-11 18:38:14 -07:00
Eric Liang
2fdefe19b7
Take into account queue length in autoscaling (#5684) 2019-09-11 11:31:35 -07:00
Kai Yang
ed761900f6 [Java] Support direct actor call in Java worker (#5504) 2019-09-09 14:29:20 +08:00
Kai Yang
d8f5804690 Support metadata for passing by value task arguments (#5527) 2019-09-08 11:07:48 +08:00
Kai Yang
732336fc4f [Java] Support multiple workers in Java worker process (#5505) 2019-09-07 22:52:05 +08:00
Edward Oakes
f38bb288e2 Clean up Wait() in the core worker (#5628) 2019-09-04 21:31:34 -07:00
Zhijun Fu
bb5609afb3 ignore object exists error for memory store provider (#5607) 2019-09-05 11:45:35 +08:00
micafan
8236936189 Fix code style in unit test of GCS. (#5634) 2019-09-04 19:36:44 +07:00
Edward Oakes
0c68b4cc30 Clean up Wait() and Get() in the core worker (#5556) 2019-09-03 14:45:15 -07:00
micafan
378757eb88 fix CallbackReply resize (#5589) 2019-09-03 13:48:18 +08:00
Eric Liang
3e70daba74
Warn on resource deadlock; improve object store error messages (#5555)
* wip

* wip

* wip

* wip

* wip

* add impl

* second

* warn once
2019-08-30 16:45:54 -07:00
Philipp Moritz
85a92bcb8b Bump version string to 0.8.0.dev4 (#5523) 2019-08-29 21:25:28 -07:00
Philipp Moritz
e9d2d0432a
Make RAY_CHECK for actor re-creation non-fatal (#5553) 2019-08-28 21:07:52 -07:00
Kai Yang
fadfa5f30b [Java] ObjectID::fromRandom sets proper flags (#5548) 2019-08-28 11:31:06 +08:00
Philipp Moritz
dbf7089c79
Bump version to 0.7.4 (#5474) 2019-08-23 17:08:16 -07:00
Kai Yang
7812dd5636 [Java] Fix getCurrentActorId in multi-threading scenario. (#5506) 2019-08-23 17:56:10 +08:00
Edward Oakes
f359333933 Batch fetch requests in core worker get (#5342) 2019-08-22 11:16:46 -07:00
Eric Liang
e2e30ca507 Ray, Tune, and RLlib support for memory, object_store_memory options (#5226) 2019-08-21 23:01:10 -07:00
Zhijun Fu
eab595777f Support multiple store providers in ObjectInterface (#5452) 2019-08-21 11:16:48 +08:00
micafan
52a7c1d673 modify ActorStateAccessor::AsyncGet callback (#5417) 2019-08-21 10:54:33 +08:00
Hao Chen
f2b3c273db Fix direct actor transport not treating some tasks as failed (#5464) 2019-08-20 12:44:38 -07:00
micafan
da7bdacea5 support for subscription to an actor (#5269) 2019-08-20 20:32:53 +08:00
Philipp Moritz
599cc2be60
Revert raylet to worker GRPC communication back to asio (#5450) 2019-08-17 19:11:32 -07:00
micafan
47aa2b110d Make GCS Client thread-safe. (#5413) 2019-08-17 17:21:09 +08:00
Kai Yang
b1aae0e398 [Java worker] Migrate task execution and submission on top of core worker (#5370) 2019-08-16 13:52:13 +08:00
Zhijun Fu
b1e010feec Fix TestDirectActorTaskCrossNodesFailure test (#5406) 2019-08-11 11:15:52 +08:00
Philipp Moritz
8d6c50c821
Fix compiler warnings and make warnings fatal (#5375) 2019-08-07 14:04:05 -07:00
Qing Wang
d372f24e3c
[ID Refactor] Refactor ActorID, TaskID and ObjectID (#5286)
* Refactor ActorID, TaskID on the Java side.

Left a TODO comment

WIP for ObjectID

ADD test

Fix

Add java part

Fix Java test

Fix

Refine test.

Enable test in CI

* Extract a helper function.

* Resolve TODOs

* Fix Python CI

* Fix Java lint

* Update .travis.yml

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address some comments.

Address some comments.

Add id_specification.rst

Rename id_specification.rst to id_specification.md

typo

Address zhijun's comments.

Fix test

Address comments.

Fix lint

Address comments

* Fix test

* Address comments.

* Fix build error

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Address comments

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/common/id.h

Co-Authored-By: Stephanie Wang <swang@cs.berkeley.edu>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update src/ray/design_docs/id_specification.md

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments.

* Address comments.

* Address comments.

* Update C++ part to make sure task id is generated deterministically

* WIP

* Fix core worker

* Fix Java part

* Fix comments.

* Add Python side

* Fix python

* Address comments

* Fix linting

* Fix

* Fix C++ linting

* Add JobId() method to TaskID

* Fix linting

* Update src/ray/common/id.h

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/TaskId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Update java/api/src/main/java/org/ray/api/id/ActorId.java

Co-Authored-By: Hao Chen <chenh1024@gmail.com>

* Address comments

* Add DriverTaskId embedding job id

* Fix tests

* Add python dor_fake_driver_id

* Address comments and fix linting

* Fix CI
2019-08-07 11:04:51 +08:00
Hao Chen
3ad2fe76e0 Cap concurrent requests (#5341) 2019-08-06 13:56:19 -07:00
Joey Jiang
02c5d2be20 Add common preprocessing for each request in node manager. (#5296) 2019-08-06 20:48:58 +08:00
Eric Liang
0a3ff489fa
Send raylet error logs through the log monitor (#5351) 2019-08-05 23:35:09 -07:00
Kai Yang
384cbfb211 Fix duplicated timeout logic in AbstractRayRuntime.get() (#5338) 2019-08-06 13:36:49 +08:00
Qing Wang
cc5c78b1da Fix the issue of not initializing GLOG 2019-08-05 13:26:09 -07:00
Zhijun Fu
134c6bd128 [direct call] In memory store (#5303) 2019-08-05 13:14:45 -07:00
Stephanie Wang
e218e615df
Lineage cache performance optimization to avoid duplicate GCS requests (#5327) 2019-07-31 10:43:29 -07:00
Hao Chen
991e71dde6 Submit task asynchronously from raylet client (#5313) 2019-07-30 12:58:57 -07:00
Zhijun Fu
eb307f93f8 Support direct actor call (#5183) 2019-07-30 17:47:17 +08:00
micafan
b3bcf59148 Rename ClientTableData to GcsNodeInfo (#5251) 2019-07-30 11:22:47 +08:00