Commit graph

1250 commits

Author SHA1 Message Date
Siyuan (Ryans) Zhuang
613abdf1b6
Remove arrow macros in plasma store (#9115) 2020-06-23 23:34:44 -07:00
Siyuan (Ryans) Zhuang
acb7270bd7
Adopt upstream plasma changes (#9061)
* adopt upstream plasma changes
2020-06-23 14:19:57 -07:00
Simon Mo
b6d425526d
Move actor task submission to io service (#9093) 2020-06-23 10:07:33 -07:00
Siyuan (Ryans) Zhuang
306ca75737
Fix ray arrow logs (#9097)
* convert arrow logs to ray logs

* remove extra plasma tests and modules
2020-06-23 10:02:30 -07:00
Zhilei Chen
8f2564f1a6
Fix a bug that moved a const variable (#9080) 2020-06-23 11:54:18 +08:00
Siyuan (Ryans) Zhuang
7a110b9401
[Core] Remove digests in plasma (4x performance improvement) (#8980)
* remove digest in plasma

* totally remove list
2020-06-22 14:24:32 -07:00
mehrdadn
275da2e400
Fix Google log directory again (#9063) 2020-06-22 14:56:28 -05:00
mehrdadn
1a40d24174
Handle loop_ NULL case (#9067)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-22 11:05:29 -07:00
SangBin Cho
e254dd3115
Do not add reference count when it is local mode. (#8979) 2020-06-21 16:01:06 -05:00
mehrdadn
981f67bfb0
Fix more Windows issues (#9011)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-19 18:51:45 -07:00
mehrdadn
f8d49d69c1
Fix and merge asio client read/write operations (#9026)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-19 18:49:55 -07:00
Gabriele Oliaro
311c55132c
redefined SchedulingClass to avoid including the FunctionDescriptor (#9022)
* redefined SchedulingClass to avoid including the FunctionDescriptor

* updated TestSchedulingKeys test in DirectTaskTransportTest
2020-06-19 13:12:48 -07:00
mehrdadn
f43cad6371
Fix Google log directory (#9025) 2020-06-19 11:00:02 -05:00
Zhilei Chen
d8a9247448
Remove gcs_service_disabled ci jobs and code (#8854) 2020-06-19 11:32:27 +08:00
SangBin Cho
110f88ff61
Improve raylet failure msg. (#8986)
* Use better error messages.

* Improve message.

* Fix message based on code review.
2020-06-18 13:36:32 -07:00
Siyuan (Ryans) Zhuang
4654b3c07a
Use raylet signal handling in plasma store when running plasma store as a thread (#9007) 2020-06-18 11:52:06 -07:00
Edward Oakes
8a99fd205e
[dashboard] Pipe resource assignments to dashboard (#8998) 2020-06-18 11:14:59 -05:00
Zhilei Chen
0de2efd330
Fix a bug where 'remote_avaiable' was used after it was moved (#9002) 2020-06-18 18:04:42 +08:00
fangfengbin
c295284370
Optimize gcs server resubscribe (#8896) 2020-06-17 20:05:50 +08:00
Tao Wang
9f0f542660
Remove actor table info from storage when a driver exits (#8761)
* delete contents of table related to specified job when the job is dead

* check status

* implement GetByJobId in gcs table storage

* add test case

* add test case

* fix test cases

* expose MGET and make match_pattern only related with SCAN

* add test case for table storage

* delete checkpoint

* make MGetValues static

* add most test case

* add object test case

* avoid accessing storage when getting matched object ids per job id

* rename job info handler

* use listener to sense job finished

* clear actor state

* add comments, remove actions in task handler

* let raylet do object cleaning. only remove non-detached actors

* only remove information of non-detached actors

* remove unused methods
2020-06-16 18:43:08 -07:00
Stephanie Wang
fa16c7666a
Fix possible deadlock in CoreWorkerDirectActorTaskSubmitter (#8973) 2020-06-16 15:30:15 -07:00
fangfengbin
4facac023f
Fix heap-use-after-free bug of gcs pub sub testcase (#8968) 2020-06-16 21:00:37 +08:00
Siyuan (Ryans) Zhuang
b68fede30b
Convert include guard to pragma once (#8957) 2020-06-16 01:29:43 -07:00
mehrdadn
101c215125
Get more tests running on Windows (#6537)
* Get rid of system() calls

* Work around '/usr/share/mini' showing up on GitHub Actions (probably due to psutil truncation)

https://github.com/ray-project/ray/runs/722480047?check_suite_focus=true

* Don't check for socket max path length on Windows

* Don't check for socket existence on Windows

* Fix race condition in Windows fate-sharing

* Work around missing .exe extension for Redis tests

* Add more tests to GitHub Actions

Co-authored-by: Mehrdad <noreply@github.com>
2020-06-12 21:32:10 -07:00
Siyuan (Ryans) Zhuang
ed77c8b16c
[Core] Use global variable to eliminate force thread termination in plasma (#8912)
* use global variable to eliminate force thread termination
2020-06-12 14:20:53 -07:00
Siyuan (Ryans) Zhuang
4b31b383f3
[Core] Run Plasma Store as a Raylet thread (with a feature flag) (#8897)
* integrate plasma store as a thread (C++)

* integrate plasma store as a thread (Python)

* fix config issues

* remove plasma component fail tests

* stop the plasma store thread without forcefully killing it
2020-06-11 22:54:08 -07:00
mehrdadn
cae475c46a
Fix Windows build (#8905)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-11 14:54:37 -07:00
Stephanie Wang
05010caed2
[core] Fix race condition for object reconstruction (#8791)
* Fix

* doc

* Unit test

* Update src/ray/core_worker/task_manager.h

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/task_manager.h

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>

* Update src/ray/core_worker/task_manager.h

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>

* lint

Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
2020-06-10 19:49:12 -07:00
Ian Rodney
2cf3d8c92c
[core] Check that port is unused before assigning to worker (#8773) 2020-06-10 18:35:38 -05:00
fangfengbin
a5bebd4408
Fix create actor rpc reconnect bug (#8855) 2020-06-10 10:53:53 +08:00
Siyuan (Ryans) Zhuang
3d473600a8
[Core] Use Ray ObjectID in Plasma (#8852)
* Use Ray ObjectIDs instead

* remove unused code
2020-06-09 10:10:49 -07:00
chaokunyang
31a4d07bc4
[Java] Rename java ObjectRef/ActorHandle (#8799) 2020-06-09 11:40:43 +08:00
Siyuan (Ryans) Zhuang
c1e6813cea
[core] Move plasma store under object_manager (#8832)
* move plasma under object directory

* update include paths

* cleanup

* disable lint of third-party libraries

* lint
2020-06-08 18:21:41 -07:00
SangBin Cho
3388864768
[Core] Clean up detached actors (#8759) 2020-06-08 11:22:01 -05:00
fangfengbin
68718b33b4
GCS Server add SIGTERM signal handler (#8795) 2020-06-08 17:26:36 +08:00
mehrdadn
3ee2e9f7e5
Make #include consistent (#8666)
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-07 15:43:24 +02:00
mehrdadn
f68183d778
Error-checking for a couple of corruption issues (#8059)
* Extra error handling
* Handle connection closed in Redis monitor
Co-authored-by: Mehrdad <noreply@github.com>
2020-06-07 15:43:00 +02:00
Siyuan (Ryans) Zhuang
a0247ffe55
Build plasma store as a library (#8817)
* build plasma store as a library

* remove unused headers

* windows support
2020-06-06 22:11:37 -07:00
Stephanie Wang
b160b83d3e
[core] Queue subscription/unsubscription commands in the GCS (#8756)
* Only remove callback index if in map

* test

* Queue subscription commands

* lint

* Check status

* update

* update

* update

* Disable GCS restart tests

* lint
2020-06-05 19:49:19 -07:00
mehrdadn
d78757623d
bazel build --compilation_mode=debug (#6457) 2020-06-05 14:36:10 +02:00
Tao Wang
41072fbcc8
Implement GetByJobId in gcs table storage (#8727) 2020-06-04 20:51:43 +08:00
fangfengbin
84a8f2ccb5
Support reloading storage data when gcs server restarts (#8650) 2020-06-04 14:53:20 +08:00
Siyuan (Ryans) Zhuang
ea05ebe89e
Ship plasma store with Ray (#7901) 2020-06-03 17:44:34 -07:00
Stephanie Wang
aa06c3b15a
Eager eviction even when object pinning is disabled (#8561)
* Eager eviction even when object pinning is disabled, add regression test

* Make test more robust

* lint
2020-06-02 11:48:03 -07:00
Lingxuan Zuo
64a98e4447
Fix sum aggregator in its metric (#8724) 2020-06-02 17:36:25 +08:00
Lingxuan Zuo
4cbbc15ca7
[GCS] Global state accessor from node resource table (#8658) 2020-06-02 14:01:00 +08:00
acxz
8b924a4846
[gcs] add missing templated log classes (#8690)
Resolves #8535
2020-06-01 13:39:59 -07:00
Tao Wang
1df408d6ed
Resubscribe object table info when gcs service restart (#8639) 2020-06-01 10:42:26 +08:00
fangfengbin
016337d4eb
Heartbeat table uses gcs pub-sub instead of redis accessor (#8655) 2020-05-30 23:17:25 +08:00
fangfengbin
10c87063be
merge actor info handler into actor manager (#8682) 2020-05-30 21:56:29 +08:00