Siyuan (Ryans) Zhuang
993ff5fd81
[Core] Fix concurrency issues in plasma store runner ( #9642 )
2020-07-23 00:12:10 -07:00
SangBin Cho
ca391ed052
[Core] GCS Actor management on by default. ( #8845 )
...
* GCS Actor management on by default.
* Fix travis config.
* Change condition.
* Remove unnecessary CI.
2020-07-22 14:31:41 -07:00
Kai Yang
bfa0605282
[Java] Avoid data copy from C++ to Java for ByteBuffer type ( #9033 )
2020-07-22 16:25:32 +08:00
mehrdadn
b14728d999
Shellcheck quoting ( #9596 )
...
* Fix SC2006: Use $(...) notation instead of legacy backticked `...`.
* Fix SC2016: Expressions don't expand in single quotes, use double quotes for that.
* Fix SC2046: Quote this to prevent word splitting.
* Fix SC2053: Quote the right-hand side of == in [[ ]] to prevent glob matching.
* Fix SC2068: Double quote array expansions to avoid re-splitting elements.
* Fix SC2086: Double quote to prevent globbing and word splitting.
* Fix SC2102: Ranges can only match single chars (mentioned due to duplicates).
* Fix SC2140: Word is of the form "A"B"C" (B indicated). Did you mean "ABC" or "A\"B\"C"?
* Fix SC2145: Argument mixes string and array. Use * or separate argument.
* Fix SC2209: warning: Use var=$(command) to assign output (or quote to assign string).
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-21 21:56:41 -05:00
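The quoting rules listed in the ShellCheck commit above can be illustrated with a minimal sketch. It is not taken from the commit itself: the variable names, the `count_args` helper, and the file names are hypothetical, chosen only to show how quoting changes the number of words the shell produces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical sample data containing a space.
name="a b.txt"
files=("a b.txt" "c.txt")

# SC2006: prefer $(...) over legacy backticks for command substitution.
greeting="$(echo hi)"

# SC2086: an unquoted expansion is split on whitespace (left unquoted on purpose).
unquoted="$(count_args $name)"
quoted="$(count_args "$name")"

# SC2068: an unquoted array expansion re-splits its elements (left unquoted on purpose).
array_unquoted="$(count_args ${files[@]})"
array_quoted="$(count_args "${files[@]}")"

echo "$greeting: scalar $unquoted vs $quoted, array $array_unquoted vs $array_quoted"
```

The unquoted forms are kept only to show the difference. ShellCheck would flag them, and the quoted forms are the ones the rules recommend.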
ZhuSenlin
382b314241
[GCS] fix the fault tolerance about gcs node manager ( #9380 )
2020-07-22 10:55:51 +08:00
Lingxuan Zuo
cd42450fc1
[Metrics] Java metric API ( #9377 )
2020-07-22 10:35:08 +08:00
kisuke95
4e2e3bd348
[New scheduler] Fix new scheduler bug ( #9467 )
...
* fix new scheduler bug
* add testcase for soft resource allocation
* modify RemoveNode
2020-07-20 13:09:53 -07:00
mehrdadn
f3ef9060e4
Handle warnings in core ( #9575 )
2020-07-20 12:55:07 -07:00
mehrdadn
dcec26ac7b
Fix log losses ( #9559 )
...
* Close log on shutdown
* Disable log buffering
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-20 11:03:58 -07:00
fangfengbin
8605b594f1
Fix Java named actor bug ( #9580 )
2020-07-20 19:51:00 +08:00
Siyuan (Ryans) Zhuang
4accc16995
[Core] Replace the Plasma eventloop with boost::asio ( #9431 )
2020-07-20 02:52:51 -07:00
fangfengbin
0cee75c86a
GCS client add fetch operation before subscribe ( #9564 )
2020-07-20 10:16:42 +08:00
mehrdadn
2554a1a997
Bazel fixes ( #9519 )
2020-07-19 12:53:08 -07:00
Lingxuan Zuo
ce3f542739
[Metric] new cython interface for python worker metric ( #9469 )
2020-07-19 10:43:21 +08:00
Siyuan (Ryans) Zhuang
7edd1e6694
[Core] Remove socket pair exchange in Plasma Store ( #9565 )
...
* try use boost::asio for notification processing
2020-07-18 15:47:52 -07:00
fangfengbin
b12b8e1324
[GCS] Fix lease worker leak bug when gcs server restarts ( #9315 )
...
* add part code
* fix compile bug
* fix review comments
* fix review comments
* fix review comments
* fix review comments
* fix review comment
* fix ut bug
* fix lint error
* fix review comment
* fix review comments
* add testcase
* add testcase
* fix bug
* fix review comments
* fix review comment
* fix review comment
* refine comments
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2020-07-18 15:49:41 +08:00
Alex Wu
a78c5d5ef2
[New scheduler] Queueing refactor ( #9491 )
...
* .
* test_args passes
* .
* test_basic.py::test_many_fractional_resources causes ray to hang
* test_basic.py::test_many_fractional_resources causes ray to hang
* .
* .
* useful
* test_many_fractional_resources fails instead of hanging now :)
* Passes test_fractional_resources
* .
* .
* Some cleanup
* git is hard
* cleanup
* .
* .
* .
* .
* .
* .
* .
* cleanup
* address reviews
* address reviews
* more refactor
* :)
* travis pls
* .
* travis pls
* .
2020-07-17 11:08:03 -07:00
Gabriele Oliaro
026c009086
Pipelining task submission to workers ( #9363 )
...
* first step of pipelining
* pipelining tests & default configs
- added pipelining unit tests in direct_task_transport_test.cc
- added an entry in ray_config_def.h, ray_config.pxi, and ray_config.pxd to configure the parameter controlling the maximum number of tasks that can be in flight to each worker
- consolidated worker_to_lease_client_ and worker_to_lease_client_ hash maps in direct_task_transport.h into a single one called worker_to_lease_entry_
* post-review revisions
* linting, following naming/style convention
* linting
2020-07-17 10:45:13 -07:00
Stephanie Wang
b351d13940
[core] Add flag to enable object reconstruction during ray start ( #9488 )
...
* Add flag
* doc
* Fix tests
2020-07-17 10:13:14 -07:00
Alisa
f080aa6ce3
Add placement group manager and some code in core_worker ( #9120 )
...
Co-authored-by: Lingxuan Zuo <skyzlxuan@gmail.com>
2020-07-17 20:49:51 +08:00
mehrdadn
37942ea1e7
Windows cleanup ( #9508 )
...
* Remove unneeded code for Windows
* Get rid of usleep()
* Make platform_shims includes non-transitive
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-17 02:08:15 -07:00
Lingxuan Zuo
3a74164289
[Stats] Fix metric exporter test ( #9376 )
2020-07-17 14:38:24 +08:00
SangBin Cho
94e94ae0c3
[Core] Fix Java detached error ( #9526 )
2020-07-16 16:39:42 -07:00
Siyuan (Ryans) Zhuang
d61d92afc7
Cleanup Plasma Store (hash utilities) ( #9524 )
2020-07-16 14:52:14 -07:00
SangBin Cho
63e052a5f3
Fix. ( #9464 )
2020-07-16 11:51:32 -05:00
SangBin Cho
2f674728a6
[GCS Actor Management] Gcs actor management broken detached actor ( #9473 )
2020-07-16 15:41:18 +08:00
mehrdadn
06ed2313e2
Fix clang-cl build ( #9494 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-15 22:17:11 -07:00
chaokunyang
9318e76b81
[Java] Named java actor ( #9037 )
2020-07-16 11:31:18 +08:00
Stephanie Wang
4e81804cba
[core] Replace task resubmission in raylet with ownership protocol ( #9394 )
...
* Add intended worker ID to GetObjectStatus, tests
* Remove TaskID owner_id
* lint
* Add owner address to task args
* Make TaskArg a virtual class, remove multi args
* Set owner address for task args
* merge
* Fix tests
* Add ObjectRefs to task dependency manager, pass from task spec args
* tmp
* tmp
* Fix
* Add ownership info for task arguments
* Convert WaitForDirectActorCallArgs
* lint
* build
* update
* build
* java
* Move code
* build
* Revert "Fix Google log directory again (#9063)"
This reverts commit 275da2e400.
* Fix free
* Regression tests - shorten timeouts in reconstruction unit tests
* Remove timeout for non-actor tasks
* Modify tests using ray.internal.free
* Clean up future resolution code
* Raylet polls the owner
* todo
* comment
* Update src/ray/core_worker/core_worker.cc
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
* Drop stale actor table notifications
* Fix bug where actor restart hangs
* Revert buggy code for duplicate tasks
* build
* Fix errors for lru_evict and internal.free
* Revert "Drop stale actor table notifications"
This reverts commit 193c5d20e5577befd43f166e16c972e2f9247c91.
* Revert "build"
This reverts commit 5644edbac906ff6ef98feb40b6f62c9e63698c29.
* Fix free test
* Fixes for freed objects
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-07-15 14:55:51 -07:00
Kai Yang
005ea1e125
Add job configs to gcs ( #9374 )
2020-07-15 15:18:48 +08:00
mehrdadn
33e400998c
Fix name clash on Windows ( #9412 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-14 23:14:53 -07:00
Stephanie Wang
6d99aa34a5
[core] Handle out-of-order actor table notifications ( #9449 )
...
* Drop stale actor table notifications
* build
* Add num_restarts to disconnect handler
* Unit test and increment num_restarts on ALIVE, not RESTARTING
* Wait for pid to exit
2020-07-14 22:55:04 -07:00
Zhuohan Li
003518619f
[Core] remove create_and_seal and create_and_seal_batch ( #9457 )
2020-07-14 14:37:13 -07:00
SangBin Cho
539c51a003
[Core] Support GCS server port assignment. ( #8962 )
2020-07-14 11:49:56 -05:00
SangBin Cho
f6eb47fc1f
[Stats] metrics agent exporter ( #9361 )
2020-07-14 11:49:16 -05:00
Siyuan (Ryans) Zhuang
d57ff5e2af
Remove legacy C++ code ( #9459 )
2020-07-14 00:57:42 -07:00
kisuke95
276fe109c5
change error code name of boost timer ( #9417 )
2020-07-14 11:50:58 +08:00
fangfengbin
3c90f960fb
Fix gcs_pubsub_test bug ( #9438 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-07-14 11:34:50 +08:00
Siyuan (Ryans) Zhuang
4da97a7c99
[Core] Build raylet client as an independent component ( #9434 )
2020-07-13 16:00:32 -07:00
Hao Chen
e6225bdfa1
[GCS] Fix the bug about raylet receiving duplicate actor creation tasks ( #9422 )
2020-07-13 11:34:02 -07:00
Siyuan (Ryans) Zhuang
381c242f6b
[Core] Simplify Raylet Client ( #9420 )
2020-07-12 12:42:54 -07:00
Siyuan (Ryans) Zhuang
1798deae94
[Core] Plasma RAII support ( #9370 )
2020-07-10 09:22:29 -07:00
SangBin Cho
d8a0d76d02
Fix macOS compilation bug ( #9391 )
...
* Fix.
2020-07-10 09:18:09 -07:00
Kai Yang
c89b59cf48
Remove the RAY_CHECK in Worker::Port() ( #9348 )
2020-07-10 18:06:25 +08:00
Kai Yang
a98cd0670e
[Java] Improve JNI performance when submitting and executing tasks ( #9032 )
2020-07-10 17:51:07 +08:00
Hao Chen
d49dadf891
Change Python's ObjectID to ObjectRef ( #9353 )
2020-07-10 17:49:04 +08:00
Tao Wang
6311e5a947
[HOTFIX] Fix compile direct_actor_transport_test on mac ( #9403 )
2020-07-10 17:19:34 +08:00
fangfengbin
35861f17a3
Fix gcs_table_storage testcase bug ( #9393 )
...
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-07-10 16:16:28 +08:00
mehrdadn
dd2cc6eb48
Update hiredis and remove Windows patches ( #9289 )
...
Co-authored-by: Mehrdad <noreply@github.com>
2020-07-09 18:45:44 -07:00
Zhuohan Li
8a76f4cbb5
[Core] put small objects in memory store ( #8972 )
...
* remove the put in memory store
* put small objects directly in memory store
* cast data type
* fix another place that uses Put to spill to plasma store
* fix multiple tests related to memory limits
* partially fix test_metrics
* remove not functioning codes
* fix core_worker_test
* refactor put to plasma codes
* add a flag for the new feature
* add flag to more places
* do a warmup round for the plasma store
* lint
* lint again
* fix warmup store
* Update _raylet.pyx
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-07-09 15:39:40 -07:00