Commit graph

2157 commits

Author SHA1 Message Date
SongGuyang
79bec61e12
[event] support WithField option in RAY_EVENT api (#17476) 2021-08-05 20:45:55 +08:00
Eric Liang
6db63990af
Don't capture child tasks in placement groups by default (#17527) 2021-08-04 16:09:45 -07:00
Chen Shen
53a0c74413
[nightly-test] fix non_streaming_shuffle_1tb_5000_partitions 2021-08-04 16:06:53 -07:00
architkulkarni
63708468df
[runtime env] [Doc] Runtime env doc and messaging improvements (#17547) 2021-08-04 12:28:42 -07:00
Yi Cheng
521457b51b
[gcs] Fix actor killing race condition (#17456) 2021-08-04 10:37:56 -07:00
Lixin Wei
a2b0d2f99f
[Core] Add Back Pressure to GCS's gRPC Server (#17427) 2021-08-04 10:36:39 -07:00
SongGuyang
3e42f54910
Support copyright format for c++ files (#14348) 2021-08-04 17:19:38 +08:00
Eric Liang
cb48f3a712
Be more conservative in warning about too many workers (#17531) 2021-08-03 22:30:18 -07:00
Eric Liang
fbd3f11533
OBOD log source error properly 2021-08-02 20:57:01 -07:00
Lixin Wei
6f4c8ebdb2
[Core] Rmove the GetActorIfno RPC for Current Actor When Creating Actors (#17334) 2021-08-01 22:10:40 -07:00
Chen Shen
1b89fa8624
[object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests 2021-08-01 22:10:03 -07:00
Chen Shen
96c69f8c77
[object store refactor 1/n] Introduce IAllocator and PlasmaAllocator (#17307)
* initial commit

* address comments
2021-07-30 19:08:20 -07:00
Stephanie Wang
c9a2046287
[core] Update error message for hanging ray.get (#17449)
* Update error message

* x
2021-07-30 17:57:10 -07:00
Jiao
d67c57007b
change placement group report size to 1k (#17216)
Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
2021-07-30 11:29:41 -07:00
Chen Shen
32803b53b0
Fix potential dead-lock (#17396) 2021-07-30 11:28:49 -07:00
wanxing
705248f4ee
[CoreWorker]Remove plasma_objects_only parameter (#17384) 2021-07-30 14:48:36 +08:00
Tao Wang
411c49746d
Remove deprecated HEARTBEAT table (#17405)
* Remove deprecated HEARTBEAT table

* incr by 1
2021-07-29 10:14:59 -07:00
Edward Oakes
7007c6271d
[runtime_env] Gracefully fail tasks when an environment fails to be set up (#17249) 2021-07-28 15:25:02 -05:00
Yi Cheng
72abf81900
[gcs] Fix GCS related issues: ByteSizeLong and redis connection (#17373) 2021-07-28 13:01:54 -07:00
Simon Mo
4a4210a083
Support streaming output of runtime env setup to logger/driver (#17306) 2021-07-27 16:39:15 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive (#16330)
* Dashboard check gcs is alive

* Fix dashboard hangs at exit

* ray health-check call GCS CheckAlive

* Minor fixes

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
DK.Pino
684e2b28e9
Placement group bug fix (#17320) 2021-07-26 21:03:35 -07:00
Tao Wang
d98ec7fc4d
Remove libray_redis_module (#17283) 2021-07-25 23:15:29 -07:00
Lixin Wei
ded239205f
[Core] Close RPC Server After GcsHearbeatManager (#17238) 2021-07-23 09:12:13 -07:00
Edward Oakes
811eb4b092
[debugger] Enable attaching to breakpoints on remote nodes (off by default) (#17275) 2021-07-23 09:37:40 -05:00
Clark Zinzow
cff7596ea1
[Core] Update locality protocol comment. (#17267) 2021-07-22 11:43:01 -07:00
Chen Shen
c691f73d87
[core][usability] fix noisy push related log (#17250) 2021-07-22 09:33:08 -07:00
Chen Shen
7736d06399
[core][easy] remove unused code in buffer_pool 2021-07-22 09:31:20 -07:00
Chen Shen
70ab8aa1d4
Revert "[core] Do not spill back tasks blocked on args to blocked nodes (#16488)" (#17247)
This reverts commit dad8db46e1.
2021-07-21 19:41:35 -07:00
Chen Shen
edb80d6122
[core][rfc] Fix race condition between write chunk and abort object. (#17234)
* fix

* address comments

* sang's comment
2021-07-21 17:39:06 -07:00
chenk008
afd59be8ca
[Core] Add worker resource limit (#17179)
* add resource restricted

* fix test

* lint

* lint
2021-07-21 22:00:34 +08:00
Stephanie Wang
dad8db46e1
[core] Do not spill back tasks blocked on args to blocked nodes (#16488) 2021-07-20 17:13:02 -07:00
Jialing He
492076806d
[object store] Assign the object owner in ray.put() (#16833) 2021-07-20 11:06:00 -07:00
Chen Shen
055a90374c
[Core] fix erase iterator while iterating over a map. (#17204) 2021-07-20 11:02:55 -07:00
Simon Mo
908aa2c7f3
Fix runtime env and dispatch queue take 2 (#17163) 2021-07-20 10:24:08 -07:00
Kai Yang
f0c148b158
[Core] Simplify the code to read env variables in RayConfig (#16775)
* Simplify the code to read env variables in RayConfig

* simplify

* Correctly print config type

* Change to lower case

* fix template specialization

* lint
2021-07-20 08:40:16 -07:00
SangBin Cho
d6b6356173
[Core] Properly call shutdown instead of deleting a reference (#17096)
* Properly call shutdown instead of deleting a reference

* Add unit tests

* Add test ray shutdown

* Formatting

* format2

* Revert main logic to see if windows issue still fail

* Skip tests for windows.

* formatting

* Try fixing flakiness

* Remove node removed code path

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-07-20 08:22:33 -07:00
Siyuan (Ryans) Zhuang
8efc04a8a6
[Core] Actor namespace (#17178)
* set actor namespace in Python on creation

* get actor with namespace in Python

* update message
2021-07-19 21:51:04 -07:00
Chen Shen
b26fcd3fce
fix spill bug (#17187) 2021-07-19 17:44:12 -07:00
Chen Shen
80e013f342
[core] Fix SIGABRT on erase call (#17140) 2021-07-19 11:42:38 -07:00
SangBin Cho
bfc9e5c36f
[Logs] Clean core worker logs (#17033)
* Ready

* Formatting

* Fix

* addressed review.
2021-07-19 11:25:41 -07:00
Qing Wang
195cdcf5b8
Fix memory leak in JNI. (#17177)
Co-authored-by: Qing Wang <jovany.wq@antgroup.com>
2021-07-19 14:06:30 +08:00
Amog Kamsetty
8dfd471823
Revert "Revert "[Dashboard][event] Basic event module (#16985)" (#17068)" (#17107)
This reverts commit c17e171f92.

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-18 12:59:04 +08:00
Clark Zinzow
8302b5a335
[Core] Reverts full dispatch queue iteration PRs. (#17127)
* Revert "[Core] iterate over entire dispatch queue instead of returning when worker unavailable (#16535)"

This reverts commit 54d66ac637.

* Revert "[Core] [runtime env] [Tests] Add C++ unit test for dispatch queue nonblocking behavior (#16751)"

This reverts commit 13a133817b.

* Revert failing runtime_env test.
2021-07-16 10:28:00 -07:00
SongGuyang
dcb1baabd7
[C++ API] support loading C++ dynamic libraries from code search path (#16828) 2021-07-16 13:02:45 +08:00
Chen Shen
c39571a1f2
Fix GCS shutdown order (#17135) 2021-07-15 17:41:19 -07:00
Yi Cheng
138676295f
[core] Add bundle id as a label; (#16819)
* check

* up

* up

* up

* up

* up

* up

* format

* up

* up

* add test

* format

* up

* format

* up

* format

* up

* up

* up

* rollback

* uncomment

* format

* fix comments

* fix mac build
2021-07-15 16:05:42 -07:00
Lixin Wei
06f6f4e0ec
[Core] Limit Batch Size When Broadcasting Resources (#17072) 2021-07-15 14:28:57 -07:00
Stephanie Wang
bdaa96bf43
[core] Fix bugs in worker cleanup on driver exit (#17049)
* unit test

* cleanup test

* Don't kill workers when job finishes

* better test

* lint

* lint

* comment

* check
2021-07-15 12:53:51 -07:00
Chen Shen
ba70d8dbc6
[RFC] Fix object size inconsistency caused by object-marked-failed. (#16976) 2021-07-14 23:33:36 -07:00