Commit graph

314 commits

Author SHA1 Message Date
Chong-Li
5e22257cec
[GCS] Fix: GCS Based Actor Scheduler (#17944) 2021-08-18 23:40:35 -07:00
Simon Mo
b573864928
[CI] Add test owners (#17893) 2021-08-18 18:38:31 -07:00
Chen Shen
89d83228f6
[Core][Plasma-store] add stats-collector that eagerly collect stats 2021-08-18 13:47:50 -07:00
Chong-Li
a9b4545502
[GCS] GCS Based Actor Scheduler (#16580) 2021-08-18 13:44:59 -07:00
Guyang Song
8227e24424
[event] event framework integration in raylet, gcs server and core worker (#17671) 2021-08-17 11:21:23 +08:00
Yi Cheng
03a82d733a
Revert "Revert "Export useful metrics"" (#17755)
* Revert "Revert "[Observability] Export useful metrics (#17578)" (#17752)"

This reverts commit 02e79f3fe5.

* Update metric.h

* up

* up

* Update server_call.h

* Update test_metrics_agent.py

* up

* fix comment
2021-08-16 17:05:56 -07:00
Chen Shen
b349c6bc4f
[object store refactor 4/n] object lifecycle manager (#17344)
* lifecycle

* address comments
2021-08-16 09:58:35 -07:00
Eric Liang
ce171f10a1
Remove legacy plasma unlimited and pull manager pinning flag (#17753) 2021-08-11 20:19:12 -07:00
Yi Cheng
02e79f3fe5
Revert "[Observability] Export useful metrics (#17578)" (#17752)
This reverts commit bd4db53df2.
2021-08-11 12:21:50 -07:00
Yi Cheng
bd4db53df2
[Observability] Export useful metrics (#17578)
* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* checkpoint

* up

* up

* up

* up

* fix

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* add comments

* up

* up

* up

* up

* add tests
2021-08-10 17:14:42 -07:00
Chen Shen
4ff35d43b3
[object store refactor 3/n] introduce object_store (#17332)
refactor-allocator

add object_store
2021-08-05 17:36:27 -07:00
SongGuyang
79bec61e12
[event] support WithField option in RAY_EVENT api (#17476) 2021-08-05 20:45:55 +08:00
Chen Shen
1b89fa8624
[object store refactor 2/n] More refactor on PlasmaAllocator, and add unit tests 2021-08-01 22:10:03 -07:00
Chen Shen
96c69f8c77
[object store refactor 1/n] Introduce IAllocator and PlasmaAllocator (#17307)
* initial commit

* address comments
2021-07-30 19:08:20 -07:00
Tao Wang
d98ec7fc4d
Remove libray_redis_module (#17283) 2021-07-25 23:15:29 -07:00
Edward Oakes
f6375cbb7c
[core] Fix bazel test sizes for C++ unit tests (#17272) 2021-07-22 17:38:56 -05:00
Amog Kamsetty
8dfd471823
Revert "Revert "[Dashboard][event] Basic event module (#16985)" (#17068)" (#17107)
This reverts commit c17e171f92.

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-18 12:59:04 +08:00
Amog Kamsetty
c17e171f92
Revert "[Dashboard][event] Basic event module (#16985)" (#17068)
This reverts commit f1faa79a04.
2021-07-13 23:18:43 -07:00
Chen Shen
645d8fcaf0
[logging][rfc] add RAY_LOG_EVERY_N and RAY_LOG_EVERY_MS (#17018)
* introduce log-every-n

* add n

* linter

* add license
2021-07-13 19:14:28 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module (#16985)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

* Make monitor events task cancellable

* Fix error

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module (#16698)" (#17004)
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
qicosmos
298d2afc35
[Ray Log] remove glog dependency (#16077) 2021-07-12 17:06:52 +08:00
Scott Graham
3334357c58
[autoscaler] [azure] Fix Azure Autoscaling Failures (#16640)
Co-authored-by: Scott Graham <scgraham@microsoft.com>
2021-07-10 11:55:00 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module (#16698)
* Basic event module

* Fix comments

* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2

* Fix lint

* Fix lint

* Clean code

* Try to fix flaky

* Fix test

* Disable event module by default

Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Kai Yang
e925051ce4
[Core] Get node to connect for driver in global state accessor (#16810) 2021-07-08 11:21:12 +08:00
Chen Shen
dbd3260141
[core] Deprecate QuotaAwareEvictionPolicy (#16911) 2021-07-07 13:44:41 -07:00
Kai Yang
7c21be5450
[Object spilling] Clean up spilled objects on disk when Raylet starts (#16669) 2021-07-05 12:01:25 +08:00
Alex Wu
d89f148fbf
[Pubsub] Don't depend on subscriber address (#16752)
* remove subscriber address

* .

* lint

* test

* done

* lint

* .

* Update BUILD.bazel

Co-authored-by: Alex <alex@anyscale.com>
2021-06-29 17:34:37 -07:00
architkulkarni
06dfd8dddb
Revert "[Dashboard][event] Basic event module (#16283)" (#16676)
This reverts commit 5afa53aa64.
2021-06-25 09:38:18 -07:00
fyrestone
5afa53aa64
[Dashboard][event] Basic event module (#16283) 2021-06-25 13:59:02 +08:00
Alex Wu
8ffaa8d3fa
Refactor pubsub to support GCS publisher/raylet client (#16624)
* .

* .

* .

* .

* .

* import error :(

* boop

* .

* fix tests

* fix tests

* .

* cleanup

Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-24 15:30:42 -07:00
Chen Shen
54f9aef35b
[spilled object push optimization 1/3] create a SpilledObject that reads data in chunks. 2021-06-10 10:08:51 -07:00
SongGuyang
874e947d6f
[runtime env] support create or delete runtime envs in agent (#15904) 2021-06-09 20:22:25 +08:00
Lixin Wei
3d37e3a315
[Refactor] Replace FractionalResourceQuantity with FixedPoint (#16052)
* refactor

* fix

* fix compilation

* fix

* fix cross-platform compilation

* lint

* fix test

* Revert "fix test"

This reverts commit 0ff23b125ce4159b91cc170dbc17b5ed70c9ab11.

* change rounding to truncating

* Update BUILD.bazel

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-05-28 09:32:51 -07:00
Yi Cheng
5d0b302121
[core] Trigger global gc when plasma store is under pressure. (#15775) 2021-05-27 10:07:59 -07:00
fyrestone
56c309416e
[Job submission] Basic job submission structure (#15103) 2021-05-12 15:08:20 +08:00
Eric Liang
ff36ae594b
Remove flaky tag from newly unflaky tests (#15639) 2021-05-05 12:15:46 -07:00
Alex Wu
18d85d2de9
Grpc based resource broadcast (#15466) 2021-05-05 11:20:08 -07:00
Eric Liang
a482034916
Flaky test builder for tests tagged "flaky" (#15408) 2021-04-20 00:19:07 -07:00
SangBin Cho
61d120557d
[Pubsub] Generalize pubsub, Move pubsub code to pubsub_lib module (#15164)
* cherry-pick-1

* cherry-pick-2

* cherry-pick-part-3

* Should work.

* Lint fix.

* Fix lint 2.
2021-04-07 20:40:39 -07:00
Siyuan (Ryans) Zhuang
7fd86f7e15
[Core] Use static callback instead of dynamic notification listener (#15059)
* static callback & remove outdated protocol

* address comments

* fix

* make fields constant

* fix windows compilation error
2021-04-02 22:33:41 -07:00
Alex Wu
4fba05ae4d
[core] Hybrid scheduling policy. (#14790) 2021-04-01 16:59:59 -07:00
SangBin Cho
005cff0092
Revert "Revert "[Core] Implement long polling-based pubsub to reduce … (#14909) 2021-04-01 09:03:15 -07:00
Alex Wu
1f4d4dfeb0
Gcs pull resource reports (#14336) 2021-03-29 11:36:30 -07:00
Siyuan (Ryans) Zhuang
87c79553e9
[Core] Remove code paths that contains plasma store executable (#14950)
* remove plasma store executable & never used tests

* set default behavior

* fix tests
2021-03-28 21:22:14 -07:00
SangBin Cho
ec3cfef883
Revert "[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. (#14638)" (#14905)
This reverts commit 35ec91c4e0.
2021-03-24 11:22:48 -07:00
SangBin Cho
35ec91c4e0
[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. (#14638)
* in progress.

* IN progress.

* lint.

* Updated code

* lint.

* In progress of writing tests.

* Finished implementation. Need cleanup & refactoring.

* fixing tests...

* Finish the impl.

* Fix typo.

* impl done. Only cleanup left.

* done.

* Finished clean up.

* Fix issues.

* Add a stronger consistency check.

* Addressed code review.

* lint.

* done.

* Addressed more.

* addressed all reviews.

* Addressed code review.

* lint.

* Added unit tests to assert no leak.
2021-03-23 23:47:08 -07:00
Yi Cheng
881a46e1d6
[core] RuntimeEnv GC in local node (#14594) 2021-03-18 14:55:11 -07:00
Clark Zinzow
566dcea56a
[Core] Added event loop metrics for posts. (#14546)
* Added event loop metrics for posts.

* io_context_proxy --> instrumented_io_context

* Fix feature flag, chrono-->absl, trim the stats, inline functions, reformat stats string.

* Make stats struct mutex plain lock instead of reader-writer lock.

* Mutex reader locking, std::array double braces initialization.

* Fix Bazel BUILD formatting.
2021-03-10 11:52:45 -08:00
Eric Liang
99a63b3dd1
Remove old scheduler and friends (#14184) 2021-03-03 18:29:15 -08:00