DK.Pino
fb89f9c2c8
[Placement Group] Support named placement group ( #13755 )
2021-02-05 11:04:51 +08:00
Tao Wang
44aa9c173f
Rename timeout to period with heartbeat interval ( #13872 )
2021-02-04 10:37:28 +08:00
Tao Wang
e0d9c8f0a8
Always replace DEL with UNLINK ( #13832 )
2021-02-04 10:30:00 +08:00
Clark Zinzow
407302f93a
[Core] Ownership-based Object Directory - Changed infinite short-poll location subscription to long-poll. ( #13841 )
2021-02-03 14:16:42 -08:00
SangBin Cho
cb9fa90203
[Object Spilling] Add consumed bytes to detect thrashing. ( #13853 )
2021-02-03 14:16:26 -08:00
Alex Wu
f14171ced9
[Core] Put raylet ip's in resource usage report ( #13871 )
...
* .
* done?
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-02-03 11:28:56 -08:00
Gabriele Oliaro
79310452e7
Enabling the cancellation of non-actor tasks in a worker's queue 2 ( #13244 )
...
* wrote code to enable cancellation of queued non-actor tasks
* minor changes
* bug fixes
* added comments
* rev1
* linting
* making ActorSchedulingQueue::CancelTaskIfFound raise a fatal error
* bug fix
* added two unit tests
* linting
* iterating through pending_normal_tasks starting from end
* fixup! iterating through pending_normal_tasks starting from end
* fixup! fixup! iterating through pending_normal_tasks starting from end
* post merge fixes
* added debugging instructions, pulled Accept() out of guarded loop
* removed debugging instructions, linting
* first commit
* lint
* lint
* added hack to avoid race condition in test stress
* moved hack
* fix test cancel
* removed hack (hopefully no longer needed)
* Revert "removed hack (hopefully no longer needed)"
This reverts commit 99d0e7c91539f290700f50aaaed805dcde04a5ee.
* added sleep in mock_worker.cc
* sleep function fixup to work on windows
* sleep in test_fast both for force=true and force=false
* linting
Co-authored-by: Ian <ian.rodney@gmail.com>
2021-02-03 10:20:12 -08:00
fangfengbin
b4684cf37a
Fix bug that total_commands_queued_ is not initialized ( #13852 )
2021-02-03 10:00:15 +08:00
Eric Liang
fa4290090d
Add Ray client protocol version ( #13846 )
2021-02-02 00:19:08 -08:00
SangBin Cho
886217c333
[Object Spilling] Skip normal ray.get path when spilling objects. ( #13831 )
2021-02-01 16:03:34 -08:00
Stephanie Wang
754bee9282
[core][object spilling] Fix bugs in admission control ( #13781 )
2021-02-01 10:48:21 -08:00
Tao Wang
1d2ab018b0
Use right reserve size ( #13829 )
2021-02-01 15:49:34 +08:00
Lingxuan Zuo
b5f0aed974
[Log] Use default stderr logger if RayLog has not been started ( #13762 )
2021-02-01 11:13:06 +08:00
Stephanie Wang
30f82329e3
[core] Add debug information for the PullManager and LocalObjectManager ( #13782 )
...
* Add debug info
* Formatting.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2021-01-29 17:55:46 -08:00
Hao Chen
0f3a3e14aa
Only delete local object in CoreWorkerPlasmaStoreProvider::WarmupStore ( #13788 )
2021-01-29 20:24:09 +08:00
Stephanie Wang
42d501d747
[core] Pin arguments during task execution ( #13737 )
...
* tmp
* Pin task args
* unit tests
* update
* test
* Fix
2021-01-28 19:07:10 -08:00
Tao Wang
56ee6ef55f
[GCS] Only update state-related fields when publishing actor table data ( #13448 )
2021-01-28 11:12:57 +08:00
Simon Mo
4f1f558802
[Core] Hotfix Windows Compilation Error for ClusterTaskManager ( #13754 )
...
* [Core] Hotfix Windows Compilation Error for ClusterTaskManager
* fix
2021-01-27 19:01:56 -08:00
Alex Wu
c0fe816466
[Core/Autoscaler] Properly clean up resource backlog from ( #13727 )
2021-01-27 15:30:58 -08:00
Eric Liang
56a9523020
Fix high CPU usage in object manager due to O(n^2) iteration over active pulls list ( #13724 )
2021-01-27 14:02:22 -08:00
DK.Pino
7f6d326ad8
[Placement Group] Add detached support for placement group. ( #13582 )
2021-01-27 18:51:26 +08:00
SangBin Cho
8baafacb1e
[Logging] Log rotation config ( #13375 )
...
* In Progress.
* formatting.
* in progress.
* linting.
* Done.
* Fix typo.
* Fixed the issue.
2021-01-26 20:15:55 -08:00
Lingxuan Zuo
f9f2bfa778
[Metric] Fix crash when registering metric view from multiple threads ( #13485 )
...
* Fix crash when registering metric view from multiple threads
* fix comments
* fix
2021-01-25 20:32:08 +08:00
SangBin Cho
edbb2937d3
[Object Spilling] Multi node file spilling V2. ( #13542 )
...
* done.
* done.
* Fix a mistake.
* Ready.
* Fix issues.
* fix.
* Finished the first round of code review.
* formatting.
* In progress.
* Formatting.
* Addressed code review.
* Formatting
* Fix tests.
* fix bugs.
* Skip flaky tests for now.
2021-01-23 23:15:32 -08:00
Qing Wang
8ef835ff03
Remove idle actor from worker pool. ( #13523 )
2021-01-23 13:57:30 +08:00
Kai Yang
90f1e408de
[Java] Add fetchLocal parameter in Ray.wait() ( #13604 )
2021-01-22 17:55:00 +08:00
Stephanie Wang
0998d69968
[core] Admission control for pulling objects to the local node ( #13514 )
...
* Admission control, TODO: tests, object size
* Unit tests for admission control and some bug fixes
* Add object size to object table, only activate pull if object size is known
* Some fixes, reset timer on eviction
* doc
* update
* Trigger OOM from the pull manager
* don't spam
* doc
* Update src/ray/object_manager/pull_manager.cc
Co-authored-by: Eric Liang <ekhliang@gmail.com>
* Remove useless tests
* Fix test
* osx build
* Skip broken test
* tests
* Skip failing tests
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-01-21 16:46:42 -08:00
Amog Kamsetty
20acc3b05e
Revert "Inline small objects in GetObjectStatus response. ( #13309 )" ( #13615 )
...
This reverts commit a82fa80f7b.
2021-01-21 16:10:34 -08:00
Clark Zinzow
a82fa80f7b
Inline small objects in GetObjectStatus response. ( #13309 )
2021-01-21 09:15:18 -08:00
Siyuan (Ryans) Zhuang
a09997dc9e
[Core] Remove 'PlasmaBuffer' in the buffer header ( #13188 )
2021-01-20 12:01:44 -08:00
ZhuSenlin
2e7c2b774f
[Core] add thread name to help performance profiling ( #13506 )
2021-01-20 20:34:28 +08:00
Tao Wang
b2a6e55289
[GCS] Only publish fields used by sub clients in WorkerTableData ( #13508 )
2021-01-20 16:14:59 +08:00
Keqiu Hu
6c9088eb62
[core] Refactor disconnect message processing and enrich WorkerExitType ( #13527 )
...
* [core] Refactor disconnect message processing and enrich WorkerExitType
add changes from refactor pr
fix type typo
fix typo
fix
* address comments
* also update WorkerTableData
* fix tests
2021-01-19 22:09:46 -08:00
SangBin Cho
e544c008df
Fix restoration request dedup issues. ( #13546 )
2021-01-19 15:28:54 -08:00
Stephanie Wang
bfe147a6a8
Debug info to GCS pub sub ( #13564 )
2021-01-19 14:55:23 -08:00
SangBin Cho
99375c4cfc
[Object Spilling] Remove retries and use a timer instead. ( #13175 )
2021-01-19 11:01:45 -08:00
fyrestone
86d5000047
Fix passing env on windows ( #13253 )
2021-01-19 10:04:38 -06:00
Tao Wang
516eb77080
[GCS] Remove task info publish as nowhere uses it ( #13509 )
...
* Remove task info publish as nowhere uses it
* simplify right publish channel
2021-01-18 01:15:03 -08:00
Tao Wang
3a0710130c
[GCS] Only publish changed fields when a node is dead ( #13364 )
...
* Only update changed field when node dead
* node_id missed
2021-01-17 21:28:35 -08:00
ZhuSenlin
a4ebdbd7da
Refactor node manager to eliminate new_scheduler_enabled_ ( #12936 )
2021-01-18 00:15:35 +08:00
ZhuSenlin
2cd51ce608
Synchronously write internal config in GCS ( #13197 )
2021-01-17 12:00:01 +08:00
Eric Liang
ee6332dbb0
Bump dev branch to 2.0 to avoid endless version bump toil ( #13497 )
...
* wip
* fix
* fix
2021-01-15 17:41:17 -08:00
SangBin Cho
d09df55b14
Update ID specification doc ( #13356 )
2021-01-15 15:15:51 -08:00
Eric Liang
4aeb0ea550
Return version info from Ray client connect, to allow for discovering version mismatches
2021-01-15 14:27:26 -08:00
SangBin Cho
f6d9996874
[Object Spilling] Dedup restore objects ( #13470 )
...
* done.
* Addressed code review.
2021-01-14 23:51:11 -08:00
fangfengbin
ce1b208e41
[GCS] Remove unused class variable ( #13454 )
2021-01-15 14:48:18 +08:00
Barak Michener
84e110a949
[ray_client]: Support runtime_context as metadata ( #13428 )
2021-01-14 14:37:00 -08:00
Clark Zinzow
9a658b568f
[Core] Ownership-based Object Directory: Consolidate location table and reference table. ( #13220 )
...
* Added owned object reference before Plasma put on Create() + Seal() path.
* Consolidated location table and reference table in reference counter.
* Restore type in definition.
* Clean up owned reference on failed Seal().
* Added RemoveOwnedObject test for reference counter.
* Guard against ref going out of scope before location RPCs.
* Add 'owner must have ref in scope' precondition to documentation for object location methods.
* Move to separate Create() + Seal() methods for existing objects.
* Clearer distinction between Create() and Seal() methods.
* Make it clear that references will normally be cleaned up by reference counting.
2021-01-14 13:48:10 -08:00
fangfengbin
4a6c53da46
[Core] Fix raylet scheduling bug ( #13452 )
...
* [Core] Fix raylet scheduling bug
* fix lint error
* fix lint error
Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-01-14 14:50:32 +01:00
fangfengbin
33b092de28
[GCS] Add GCS resource scheduler ( #13072 )
2021-01-14 20:05:55 +08:00