Lixin Wei
d287fc941b
[Core] Add Running Count to instrumented_io_context ( #17664 )
2021-08-12 13:56:40 -07:00
Chen Shen
9565fa549e
[Core][RFC] limit the total number of inlined bytes in task request rpc
...
Co-authored-by: Clark Zinzow <clarkzinzow@gmail.com>
2021-08-12 13:55:54 -07:00
Eric Liang
ce171f10a1
Remove legacy plasma unlimited and pull manager pinning flag ( #17753 )
2021-08-11 20:19:12 -07:00
Qing Wang
6d6a1ea43e
Support reading system configs from native in Java. ( #17703 )
...
* Support reading system configs from native in Java.
* Fix lint
* Lint cpp
* Fix Java cases.
* Address comments.
* Address comments.
2021-08-12 10:06:01 +08:00
Yi Cheng
02e79f3fe5
Revert "[Observability] Export useful metrics ( #17578 )" ( #17752 )
...
This reverts commit bd4db53df2.
2021-08-11 12:21:50 -07:00
SongGuyang
4176e43ef2
Remove binary printing from RAY_CHECK log ( #17728 )
2021-08-11 18:32:12 +08:00
Yi Cheng
bd4db53df2
[Observability] Export useful metrics ( #17578 )
...
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* checkpoint
* up
* up
* up
* up
* fix
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* add comments
* up
* up
* up
* up
* add tests
2021-08-10 17:14:42 -07:00
SongGuyang
63c15d7ced
[core] make 'PopWorker' to be an async function ( #17202 )
...
* make 'PopWorker' to be an async function
* pop worker async works
* fix
* address comments
* bugfix
* fix cluster_task_manager_test
* fix
* bugfix of detached actor
* address comments
* fix
* address comments
* fix aioredis
* Revert "fix aioredis"
This reverts commit 041b983eac95b105ab0e853e84c4cf2647008431.
* bug fix
* fix
* fix test_step_resources test
* format
* add unit test
* fix
* add test case PopWorkerStatus
* address commit
* fix lint
* address comments
* add python test
* address comments
* make an independent function
* Update test_basic_3.py
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2021-08-10 17:03:17 -07:00
SangBin Cho
6160c06c69
[Core] Fix a bug where get_actor crashes gcs if the actor is already killed. ( #17670 )
...
* Fix a bug where get_actor crashes gcs if the actor is already killed.
* Test the restart code path.
* Add an additional test
* Add a comment
* addressed code review.
2021-08-10 09:58:09 -07:00
Yi Cheng
473740b739
[gcs] Fix actor killing hang due to race condition ( #17634 )
...
* Revert "Revert "[gcs] Fix actor killing race condition (#17456 )" (#17599 )"
This reverts commit 381ffdb6d0.
* update
* format
* up
2021-08-09 21:11:26 -07:00
qicosmos
05da724521
[C++ Worker] Replace Ray::xxx with ray::xxx and update namespaces ( #17388 )
2021-08-10 11:17:59 +08:00
wanxing
8312628c30
Remove unused Spill function ( #17607 )
2021-08-09 10:10:03 -07:00
Tao Wang
5990b60f8b
[Core]Cache named actor in local in case of getting them from GCS frequently. ( #17339 )
...
* [Core]Cache named actor in local in case of getting them from GCS frequently
* lint
* fix nullptr
* typo
* add namespace to cache
* lint
* lock, reference and others
* lint
* fix comments and add test
* lint
* lint
* optimize test
* add necessary fields in pub for caching
* add removing test
* fix test
2021-08-09 14:01:57 +08:00
Hao Chen
0858f0e4f2
Change core worker C++ namespace to ray::core ( #17610 )
2021-08-08 23:34:25 +08:00
SangBin Cho
654718902f
Fix ( #17660 )
2021-08-07 18:07:27 -07:00
Qing Wang
4cc34588db
[Core] Support ConcurrentGroup part1 ( #16795 )
...
* Core change and Java change.
* Fix void call.
* Address comments and fix cases.
* Fix asyncio
2021-08-07 22:41:33 +08:00
SangBin Cho
4616e8a03c
Fix wrong invariant pubsub ( #17620 )
...
* ip
* loose check failure
* Fix the bug properly.
* Fix comments.
2021-08-06 14:14:54 -07:00
liuyang-my
12bd904594
[Serve] Define BackendConfig protobuf and adapt it in Java ( #17201 )
2021-08-06 09:50:45 -07:00
Zhi Lin
82123123c4
[object store] Java API for Assign the object owner in Ray.put() ( #17237 )
...
Co-authored-by: Qing Wang <kingchin1218@126.com>
Co-authored-by: Kai Yang <kfstorm@outlook.com>
2021-08-06 15:26:59 +08:00
Stephanie Wang
a06d71477f
[core] Do not spill back tasks blocked on args to blocked nodes ( #17550 )
...
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-08-05 20:43:32 -07:00
Chen Shen
920a4e3d56
[core] Improve fatal message for fallback allocation ( #17595 )
2021-08-05 17:58:45 -07:00
Chen Shen
4ff35d43b3
[object store refactor 3/n] introduce object_store ( #17332 )
...
refactor-allocator
add object_store
2021-08-05 17:36:27 -07:00
SangBin Cho
8bc9286296
Remove an unused profile event code from object manager. ( #17529 )
...
* Remove an unused profile event code from object manager.
* Addressed code review.
* Temporarily skip a test
* lint
2021-08-05 17:13:16 -07:00
SangBin Cho
381ffdb6d0
Revert "[gcs] Fix actor killing race condition ( #17456 )" ( #17599 )
...
This reverts commit 521457b51b.
2021-08-05 15:54:03 -07:00
architkulkarni
e84ae6caa5
[Core] [runtime env] Avoid spurious worker startup ( #17422 )
2021-08-05 15:46:23 -05:00
Eric Liang
8ff3fce4ba
Add a warning if the number of queued tasks to an actor exceeds 5k ( #17581 )
2021-08-05 12:03:48 -07:00
SongGuyang
79bec61e12
[event] support WithField option in RAY_EVENT api ( #17476 )
2021-08-05 20:45:55 +08:00
Eric Liang
6db63990af
Don't capture child tasks in placement groups by default ( #17527 )
2021-08-04 16:09:45 -07:00
Chen Shen
53a0c74413
[nightly-test] fix non_streaming_shuffle_1tb_5000_partitions
2021-08-04 16:06:53 -07:00
architkulkarni
63708468df
[runtime env] [Doc] Runtime env doc and messaging improvements ( #17547 )
2021-08-04 12:28:42 -07:00
Yi Cheng
521457b51b
[gcs] Fix actor killing race condition ( #17456 )
2021-08-04 10:37:56 -07:00
Lixin Wei
a2b0d2f99f
[Core] Add Back Pressure to GCS's gRPC Server ( #17427 )
2021-08-04 10:36:39 -07:00
SongGuyang
3e42f54910
Support copyright format for c++ files ( #14348 )
2021-08-04 17:19:38 +08:00
Eric Liang
cb48f3a712
Be more conservative in warning about too many workers ( #17531 )
2021-08-03 22:30:18 -07:00
Eric Liang
fbd3f11533
OBOD log source error properly
2021-08-02 20:57:01 -07:00
Lixin Wei
6f4c8ebdb2
[Core] Remove the GetActorInfo RPC for Current Actor When Creating Actors ( #17334 )
2021-08-01 22:10:40 -07:00
Chen Shen
1b89fa8624
[object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests
2021-08-01 22:10:03 -07:00
Chen Shen
96c69f8c77
[object store refactor 1/n] Introduce IAllocator and PlasmaAllocator ( #17307 )
...
* initial commit
* address comments
2021-07-30 19:08:20 -07:00
Stephanie Wang
c9a2046287
[core] Update error message for hanging ray.get ( #17449 )
...
* Update error message
* x
2021-07-30 17:57:10 -07:00
Jiao
d67c57007b
change placement group report size to 1k ( #17216 )
...
Co-authored-by: Jiao Dong <jiaodong@anyscale.com>
2021-07-30 11:29:41 -07:00
Chen Shen
32803b53b0
Fix potential dead-lock ( #17396 )
2021-07-30 11:28:49 -07:00
wanxing
705248f4ee
[CoreWorker]Remove plasma_objects_only parameter ( #17384 )
2021-07-30 14:48:36 +08:00
Tao Wang
411c49746d
Remove deprecated HEARTBEAT table ( #17405 )
...
* Remove deprecated HEARTBEAT table
* incr by 1
2021-07-29 10:14:59 -07:00
Edward Oakes
7007c6271d
[runtime_env] Gracefully fail tasks when an environment fails to be set up ( #17249 )
2021-07-28 15:25:02 -05:00
Yi Cheng
72abf81900
[gcs] Fix GCS related issues: ByteSizeLong and redis connection ( #17373 )
2021-07-28 13:01:54 -07:00
Simon Mo
4a4210a083
Support streaming output of runtime env setup to logger/driver ( #17306 )
2021-07-27 16:39:15 -07:00
fyrestone
57b9b1bb0f
[Dashboard] Use a dedicated RPC to check the GCS is alive ( #16330 )
...
* Dashboard check gcs is alive
* Fix dashboard hangs at exit
* ray health-check call GCS CheckAlive
* Minor fixes
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-27 14:05:44 +08:00
DK.Pino
684e2b28e9
Placement group bug fix ( #17320 )
2021-07-26 21:03:35 -07:00
Tao Wang
d98ec7fc4d
Remove libray_redis_module ( #17283 )
2021-07-25 23:15:29 -07:00
Lixin Wei
ded239205f
[Core] Close RPC Server After GcsHearbeatManager ( #17238 )
2021-07-23 09:12:13 -07:00