SangBin Cho
6160c06c69
[Core] Fix a bug where get_actor crashes gcs if the actor is already killed. ( #17670 )
...
* Fix a bug where get_actor crashes gcs if the actor is already killed.
* Test the restart code path.
* Add an additional test
* Add a comment
* addressed code review.
2021-08-10 09:58:09 -07:00
Yi Cheng
473740b739
[gcs] Fix actor killing hang due to race condition ( #17634 )
...
* Revert "Revert "[gcs] Fix actor killing race condition (#17456 )" (#17599 )"
This reverts commit 381ffdb6d0.
* update
* format
* up
2021-08-09 21:11:26 -07:00
qicosmos
05da724521
[C++ Worker] Replace Ray::xxx with ray::xxx and update namespaces ( #17388 )
2021-08-10 11:17:59 +08:00
wanxing
8312628c30
Remove unused Spill function ( #17607 )
2021-08-09 10:10:03 -07:00
Tao Wang
5990b60f8b
[Core]Cache named actor in local in case of getting them from GCS frequently. ( #17339 )
...
* [Core]Cache named actor in local in case of getting them from GCS frequently
* lint
* fix nullptr
* typo
* add namespace to cache
* lint
* lock, reference and others
* lint
* fix comments and add test
* lint
* lint
* optimize test
* add necessary fields in pub for caching
* add removing test
* fix test
2021-08-09 14:01:57 +08:00
Hao Chen
0858f0e4f2
Change core worker C++ namespace to ray::core ( #17610 )
2021-08-08 23:34:25 +08:00
SangBin Cho
654718902f
Fix ( #17660 )
2021-08-07 18:07:27 -07:00
Qing Wang
4cc34588db
[Core] Support ConcurrentGroup part1 ( #16795 )
...
* Core change and Java change.
* Fix void call.
* Address comments and fix cases.
* Fix asyncio
2021-08-07 22:41:33 +08:00
SangBin Cho
4616e8a03c
Fix wrong invariant pubsub ( #17620 )
...
* ip
* loose check failure
* Fix the bug properly.
* Fix comments.
2021-08-06 14:14:54 -07:00
liuyang-my
12bd904594
[Serve] Define BackendConfig protobuf and adapt it in Java ( #17201 )
2021-08-06 09:50:45 -07:00
Zhi Lin
82123123c4
[object store] Java API for Assign the object owner in Ray.put() ( #17237 )
...
Co-authored-by: Qing Wang <kingchin1218@126.com>
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-08-06 15:26:59 +08:00
Stephanie Wang
a06d71477f
[core] Do not spill back tasks blocked on args to blocked nodes ( #17550 )
...
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-08-05 20:43:32 -07:00
Chen Shen
920a4e3d56
[core] Improve fatal message for fallback allocation ( #17595 )
2021-08-05 17:58:45 -07:00
Chen Shen
4ff35d43b3
[object store refactor 3/n] introduce object_store ( #17332 )
...
refactor-allocator
add object_store
2021-08-05 17:36:27 -07:00
SangBin Cho
8bc9286296
Remove an unused profile event code from object manager. ( #17529 )
...
* Remove an unused profile event code from object manager.
* Addressed code review.
* Temporarily skip a test
* lint
2021-08-05 17:13:16 -07:00
SangBin Cho
381ffdb6d0
Revert "[gcs] Fix actor killing race condition ( #17456 )" ( #17599 )
...
This reverts commit 521457b51b.
2021-08-05 15:54:03 -07:00
architkulkarni
e84ae6caa5
[Core] [runtime env] Avoid spurious worker startup ( #17422 )
2021-08-05 15:46:23 -05:00
Eric Liang
8ff3fce4ba
Add a warning if the number of queued tasks to an actor exceeds 5k ( #17581 )
2021-08-05 12:03:48 -07:00
SongGuyang
79bec61e12
[event] support WithField option in RAY_EVENT api ( #17476 )
2021-08-05 20:45:55 +08:00
Eric Liang
6db63990af
Don't capture child tasks in placement groups by default ( #17527 )
2021-08-04 16:09:45 -07:00
Chen Shen
53a0c74413
[nightly-test] fix non_streaming_shuffle_1tb_5000_partitions
2021-08-04 16:06:53 -07:00
architkulkarni
63708468df
[runtime env] [Doc] Runtime env doc and messaging improvements ( #17547 )
2021-08-04 12:28:42 -07:00
Yi Cheng
521457b51b
[gcs] Fix actor killing race condition ( #17456 )
2021-08-04 10:37:56 -07:00
Lixin Wei
a2b0d2f99f
[Core] Add Back Pressure to GCS's gRPC Server ( #17427 )
2021-08-04 10:36:39 -07:00
SongGuyang
3e42f54910
Support copyright format for c++ files ( #14348 )
2021-08-04 17:19:38 +08:00
Eric Liang
cb48f3a712
Be more conservative in warning about too many workers ( #17531 )
2021-08-03 22:30:18 -07:00
Eric Liang
fbd3f11533
OBOD log source error properly
2021-08-02 20:57:01 -07:00
Lixin Wei
6f4c8ebdb2
[Core] Remove the GetActorInfo RPC for Current Actor When Creating Actors ( #17334 )
2021-08-01 22:10:40 -07:00
Chen Shen
1b89fa8624
[object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests
2021-08-01 22:10:03 -07:00
Chen Shen
96c69f8c77
[object store refactor 1/n] Introduce IAllocator and PlasmaAllocator ( #17307 )
...
* initial commit
* address comments
2021-07-30 19:08:20 -07:00
Stephanie Wang
c9a2046287
[core] Update error message for hanging ray.get ( #17449 )
...
* Update error message
* x
2021-07-30 17:57:10 -07:00
Jiao
d67c57007b
change placement group report size to 1k ( #17216 )
...
Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
2021-07-30 11:29:41 -07:00
Chen Shen
32803b53b0
Fix potential dead-lock ( #17396 )
2021-07-30 11:28:49 -07:00
wanxing
705248f4ee
[CoreWorker]Remove plasma_objects_only parameter ( #17384 )
2021-07-30 14:48:36 +08:00
Tao Wang
411c49746d
Remove deprecated HEARTBEAT table ( #17405 )
...
* Remove deprecated HEARTBEAT table
* incr by 1
2021-07-29 10:14:59 -07:00
Edward Oakes
7007c6271d
[runtime_env] Gracefully fail tasks when an environment fails to be set up ( #17249 )
2021-07-28 15:25:02 -05:00
Yi Cheng
72abf81900
[gcs] Fix GCS related issues: ByteSizeLong and redis connection ( #17373 )
2021-07-28 13:01:54 -07:00
Simon Mo
4a4210a083
Support streaming output of runtime env setup to logger/driver ( #17306 )
2021-07-27 16:39:15 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive ( #16330 )
...
* Dashboard check gcs is alive
* Fix dashboard hangs at exit
* ray health-check call GCS CheckAlive
* Minor fixes
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
DK.Pino
684e2b28e9
Placement group bug fix ( #17320 )
2021-07-26 21:03:35 -07:00
Tao Wang
d98ec7fc4d
Remove libray_redis_module ( #17283 )
2021-07-25 23:15:29 -07:00
Lixin Wei
ded239205f
[Core] Close RPC Server After GcsHeartbeatManager ( #17238 )
2021-07-23 09:12:13 -07:00
Edward Oakes
811eb4b092
[debugger] Enable attaching to breakpoints on remote nodes (off by default) ( #17275 )
2021-07-23 09:37:40 -05:00
Clark Zinzow
cff7596ea1
[Core] Update locality protocol comment. ( #17267 )
2021-07-22 11:43:01 -07:00
Chen Shen
c691f73d87
[core][usability] fix noisy push related log ( #17250 )
2021-07-22 09:33:08 -07:00
Chen Shen
7736d06399
[core][easy] remove unused code in buffer_pool
2021-07-22 09:31:20 -07:00
Chen Shen
70ab8aa1d4
Revert "[core] Do not spill back tasks blocked on args to blocked nodes ( #16488 )" ( #17247 )
...
This reverts commit dad8db46e1.
2021-07-21 19:41:35 -07:00
Chen Shen
edb80d6122
[core][rfc] Fix race condition between write chunk and abort object. ( #17234 )
...
* fix
* address comments
* sang's comment
2021-07-21 17:39:06 -07:00
chenk008
afd59be8ca
[Core] Add worker resource limit ( #17179 )
...
* add resource restricted
* fix test
* lint
* lint
2021-07-21 22:00:34 +08:00
Stephanie Wang
dad8db46e1
[core] Do not spill back tasks blocked on args to blocked nodes ( #16488 )
2021-07-20 17:13:02 -07:00