1910 commits

Author SHA1 Message Date
SangBin Cho
3965310f93
[Core] Fix the check failure from object manager (#15070) 2021-04-01 21:21:42 -07:00
Alex Wu
f52c855704
[core] Fix placement group GPU assignment bug (#15049) 2021-04-01 17:46:09 -07:00
Yi Cheng
d4c20c970b
[core] Fix UTIL worker issue (#14925)
* Fix

* format

* more

* format

* fix

* fix

* fix comment

* fix test failure
2021-04-01 17:36:45 -07:00
Siyuan (Ryans) Zhuang
6ad379864e
[doc] Fix inconsistent doc about ObjectID bytes (#15072) 2021-04-01 17:14:30 -07:00
Alex Wu
4fba05ae4d
[core] Hybrid scheduling policy. (#14790) 2021-04-01 16:59:59 -07:00
fangfengbin
18728b2b7e
Fix c++ gcs test bug (#15063)
* fix ut bug

* fix bug

Co-authored-by: 灵洵 <fengbin.ffb@antgroup.com>
2021-04-01 09:19:24 -07:00
SangBin Cho
005cff0092
Revert "Revert "[Core] Implement long polling-based pubsub to reduce … (#14909) 2021-04-01 09:03:15 -07:00
Hao Chen
3e1a0439b7
Fix concurrent actor starting too many threads. (#14927) 2021-04-01 19:58:18 +08:00
Stephanie Wang
a86a7a6a98
[core] Cap total memory used by executing tasks' arguments (#15027)
* Task dependency map

* Pinned args threshold

* Unit test and fix

* no leaks

* update

* update

* remove assertion
2021-03-31 15:38:40 -07:00
SangBin Cho
79a6aa97b7
[Core] Optimize get core worker Stats (#15008)
* in progress.

* Optimize get core worker stats.

* Fix a segfault.

* Addressed code review.

* Update comments.

* Addressed code review.
2021-03-31 12:21:53 -07:00
Yi Cheng
4480132229
[core] Integration runtime_env with ray client (#14881)
* server side ready

* client size

* py

* fix

* up

* format

* add files

* add pyx

* up

* up

* up

* add keys

* format

* update

* format

* add unittests

* add files

* up

* up

* fix

* up

* fix thread issue

* format

* fix

* update proto

* Fix

* format

* fix

* more

* fix conflict

* fix

* fix order

* format

* add

* up

* compiling fix

* lint

* fix

* format

* fix some

* some fix

* fix comment

* test cases

* add test

* comments

* fix name

* format

* fix

* revert gcs-kv

* fix comments

* fix failure

* fix test

* format

* fix timeout

* fix

* fix

* fix

* format

* format

* fix flaky test

Co-authored-by: Yi Cheng <singye888@gmail.com>
2021-03-31 11:39:34 -07:00
Kai Yang
6278df8604
[Java] refine generation of jvm options (#14931) 2021-03-31 21:04:52 +08:00
Siyuan (Ryans) Zhuang
3aa39142db
[Core] Remove code paths that run plasma store as a process (#14924)
* enable plasma store as thread by default

remove unused code path that runs plasma store as a process
2021-03-30 16:19:03 -07:00
SangBin Cho
4edcaa8870
[Stats] Basic implementation for the periodic asio stats printing support. (#14982)
* Basic implementation for the periodic asio stats printing support.

* hacky way to count grpc stats.

* lint

* Fix an issue.

* Revert the request/reply.
2021-03-29 21:51:16 -07:00
Alex Wu
1f4d4dfeb0
Gcs pull resource reports (#14336) 2021-03-29 11:36:30 -07:00
Siyuan (Ryans) Zhuang
87c79553e9
[Core] Remove code paths that contains plasma store executable (#14950)
* remove plasma store executable & never used tests

* set default behavior

* fix tests
2021-03-28 21:22:14 -07:00
qicosmos
de7ee75d27
[C++ worker] Ray normal task for RAY_REMOTE (#14599) 2021-03-27 09:56:40 +08:00
SangBin Cho
839cd1e0a2
[Core] Remove unnecessary redis connection (#14511)
* remove unnecessary stuff.

* test in progress.

* Fix tests.

* lint

* fix.

* Remove tests that were not working properly before.
2021-03-26 10:29:12 -07:00
Eric Liang
2157021fd3
Refactor object restoration path (#14821) 2021-03-25 22:46:50 -07:00
Yi Cheng
f427801c10
Revert "[core] Fix worker type in python (#14823)" (#14910)
This reverts commit 9ccf291f4d.
2021-03-24 13:27:56 -07:00
SangBin Cho
ec3cfef883
Revert "[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. (#14638)" (#14905)
This reverts commit 35ec91c4e0.
2021-03-24 11:22:48 -07:00
Clark Zinzow
ed46d8bf45
[Core] Added ownership-based object directory metrics, fixed raylet metric bug. (#14855)
* Added ownership-based object directory metrics.

* Updated OBOD metric descriptions.

* Dump OBOD metrics in debug string.

* Added e2e tests for metrics.
2021-03-24 10:53:22 -07:00
SangBin Cho
35ec91c4e0
[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. (#14638)
* in progress.

* IN progress.

* lint.

* Updated code

* lint.

* In progress of writing tests.

* Finished implementation. Need cleanup & refactoring.

* fixing tests...

* Finish the impl.

* Fix typo.

* impl done. Only cleanup left.

* done.

* Finished clean up.

* Fix issues.

* Add a stronger consistency check.

* Addressed code review.

* lint.

* done.

* Addressed more.

* addressed all reviews.

* Addressed code review.

* lint.

* Added unit tests to assert no leak.
2021-03-23 23:47:08 -07:00
Stephanie Wang
201ebc3f92
Revert "[core] Set a configurable max memory for fetched objects (#14817)" (#14887)
This reverts commit 8769953474.
2021-03-23 21:58:11 -07:00
fyrestone
52cfa1cdd7
Fix load code from local (#12102) 2021-03-24 11:49:58 +08:00
Stephanie Wang
8769953474
[core] Set a configurable max memory for fetched objects (#14817)
* Set threshold, tests

* comment

* move max to pull manager

* unit test

* fix plasma

* comment
2021-03-23 13:55:02 -07:00
Yi Cheng
9ccf291f4d
[core] Fix worker type in python (#14823)
* Fix

* format

* more

* format
2021-03-23 00:58:57 -07:00
SangBin Cho
87877cdfbf
[Test] Fix flaky object spilling test (#14722)
* start

* done.

* d

* d

* Push the fix.

* done.

* Enable test.
2021-03-22 12:51:47 -07:00
Yi Cheng
881a46e1d6
[core] RuntimeEnv GC in local node (#14594) 2021-03-18 14:55:11 -07:00
Tao Wang
5305dbb639
[large scale]Always disable sync/subscribe context in sharding context (#14706) 2021-03-18 19:31:36 +08:00
Tao Wang
44a7ce3d35
[large scale]Disable async/subscribe context in global state accessor (#14705) 2021-03-18 11:07:33 +08:00
Tao Wang
ea7c9171e9
[large scale]Disable async context in raylets' gcs client (#14704) 2021-03-18 10:50:09 +08:00
Clark Zinzow
6a28cf4add
[Core] Event loop instrumentation concurrency fixes. (#14719)
* Moved global stats member to a shared pointer explicitly captured by-value by handler lambdas, fixed handler stats copy outside of lock, ported to generalized lambda capture.

* Reenabled event loop instrumentation by default.

* Remove explicit inline specifier from non-member functions, move into anonymous namespace.

* Revert "Reenabled event loop instrumentation by default."

This reverts commit 949215269f79a1ab5ddc1ce0285c3ff4477ee6e0.
2021-03-17 16:49:25 -07:00
Lixin Wei
72d87093b9
[Core] Make Actor DEAD and Save Exceptions in GCS When Error Happens in Constructor (#14211) 2021-03-17 12:50:28 -07:00
Ian Rodney
bd641a5e71
Revert "[Core] Added event loop metrics for posts. (#14546)" (#14692) 2021-03-16 10:38:45 -07:00
Tao Wang
897b84b300
[large scale]Add option for disable/enable context connection and disable asynchro… (#14596) 2021-03-16 15:09:13 +08:00
Tao Wang
c572563e1e
[large scale]Add enable sharding option and disable sharding for gcs client (#14600) 2021-03-15 19:35:00 +08:00
Siyuan (Ryans) Zhuang
b92531918e
Make use of C++14 'make_unique' (#14663) 2021-03-15 03:00:52 -07:00
Tao Wang
3402b1752f
[GCS]Report job error to gcs instead of direct publishing (#14617)
* [GCS]Report job error to gcs instead of direct publishing

* fix compile
2021-03-12 14:54:08 -08:00
Eric Liang
2ba49c2701
Distinguish between grpc client and server events in asio metrics (#14637) 2021-03-12 11:13:59 -08:00
Clark Zinzow
7b3102dd32
Add resource report lag warning. (#14611) 2021-03-11 17:29:45 -08:00
Yi Cheng
ad8e35b919
[ray] Update cpp to std14 (#14441) 2021-03-10 14:05:52 -08:00
Clark Zinzow
566dcea56a
[Core] Added event loop metrics for posts. (#14546)
* Added event loop metrics for posts.

* io_context_proxy --> instrumented_io_context

* Fix feature flag, chrono-->absl, trim the stats, inline functions, reformat stats string.

* Make stats struct mutex plain lock instead of reader-writer lock.

* Mutex reader locking, std::array double braces initialization.

* Fix Bazel BUILD formatting.
2021-03-10 11:52:45 -08:00
Stephanie Wang
0f3530da3b
[core] Only consider actual workers when killing idle workers (#14578) 2021-03-10 09:30:19 -08:00
Alex Wu
e1fbb8489e
[core] Suppress infeasible warning (#14068) 2021-03-09 16:37:56 -08:00
Yi Cheng
ed8935406b
[core] Minimal support for runtime env (#14270) 2021-03-09 11:53:58 -08:00
Alex Wu
ba6cebe30f
Raylet request resource report endpoint (#14291)
* .

* done?

* raylet side done?

* .

* .

* .

* client

* .

* fix tests

* make ci happy

* lint

* cleanup

* clang sucks

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-03-09 09:50:50 -08:00
Eric Liang
3fab5e2ada
Switch memory units to bytes (#14433) 2021-03-06 19:32:35 -08:00
Alex Wu
2395e25fc0
[hotfix][core] Load balancing spillback feature flag (#14457) 2021-03-05 16:45:33 -08:00
DK.Pino
26907b7708
Support placement group for normal task in Java API (#14342)
* support pg for normal task

* fix lint

* fix comment

* fix comment

* update comment

* fix java typo
2021-03-05 10:21:37 +08:00