Amog Kamsetty
8dfd471823
Revert "Revert "[Dashboard][event] Basic event module ( #16985 )" ( #17068 )" ( #17107 )
...
This reverts commit c17e171f92.
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-18 12:59:04 +08:00
Amog Kamsetty
c17e171f92
Revert "[Dashboard][event] Basic event module ( #16985 )" ( #17068 )
...
This reverts commit f1faa79a04.
2021-07-13 23:18:43 -07:00
Chen Shen
645d8fcaf0
[logging][rfc] add RAY_LOG_EVERY_N and RAY_LOG_EVERY_MS ( #17018 )
...
* introduce log-every-n
* add n
* linter
* add license
2021-07-13 19:14:28 -07:00
fyrestone
f1faa79a04
[Dashboard][event] Basic event module ( #16985 )
...
* Basic event module
* Fix comments
* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2
* Fix lint
* Fix lint
* Clean code
* Try to fix flaky
* Fix test
* Disable event module by default
* Make monitor events task cancellable
* Fix error
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-13 19:08:39 -07:00
Amog Kamsetty
a14342ce6f
Revert "[Dashboard][event] Basic event module ( #16698 )" ( #17004 )
...
This reverts commit 66ea099897.
2021-07-12 11:22:46 -07:00
qicosmos
298d2afc35
[Ray Log] remove glog dependency ( #16077 )
2021-07-12 17:06:52 +08:00
Scott Graham
3334357c58
[autoscaler] [azure] Fix Azure Autoscaling Failures ( #16640 )
...
Co-authored-by: Scott Graham <scgraham@microsoft.com>
2021-07-10 11:55:00 -07:00
fyrestone
66ea099897
[Dashboard][event] Basic event module ( #16698 )
...
* Basic event module
* Fix comments
* Set the SCAN_EVENT_DIR_INTERVAL_SECONDS defaults to 2
* Fix lint
* Fix lint
* Clean code
* Try to fix flaky
* Fix test
* Disable event module by default
Co-authored-by: 刘宝 <po.lb@antfin.com>
2021-07-09 10:25:30 -07:00
Kai Yang
e925051ce4
[Core] Get node to connect for driver in global state accessor ( #16810 )
2021-07-08 11:21:12 +08:00
Chen Shen
dbd3260141
[core] Deprecate QuotaAwareEvictionPolicy ( #16911 )
2021-07-07 13:44:41 -07:00
Kai Yang
7c21be5450
[Object spilling] Clean up spilled objects on disk when Raylet starts ( #16669 )
2021-07-05 12:01:25 +08:00
Alex Wu
d89f148fbf
[Pubsub] Don't depend on subscriber address ( #16752 )
...
* remove subscriber address
* .
* lint
* test
* done
* lint
* .
* Update BUILD.bazel
Co-authored-by: Alex <alex@anyscale.com>
2021-06-29 17:34:37 -07:00
architkulkarni
06dfd8dddb
Revert "[Dashboard][event] Basic event module ( #16283 )" ( #16676 )
...
This reverts commit 5afa53aa64.
2021-06-25 09:38:18 -07:00
fyrestone
5afa53aa64
[Dashboard][event] Basic event module ( #16283 )
2021-06-25 13:59:02 +08:00
Alex Wu
8ffaa8d3fa
Refactor pubsub to support GCS publisher/raylet client ( #16624 )
...
* .
* .
* .
* .
* .
* import error :(
* boop
* .
* fix tests
* fix tests
* .
* cleanup
Co-authored-by: Alex Wu <alex@anyscale.com>
2021-06-24 15:30:42 -07:00
Chen Shen
54f9aef35b
[spilled object push optimization 1/3] create a SpilledObject that reads data in chunks.
2021-06-10 10:08:51 -07:00
SongGuyang
874e947d6f
[runtime env] support create or delete runtime envs in agent ( #15904 )
2021-06-09 20:22:25 +08:00
Lixin Wei
3d37e3a315
[Refactor] Replace FractionalResourceQuantity with FixedPoint ( #16052 )
...
* refactor
* fix
* fix compilation
* fix
* fix cross-platform compilation
* lint
* fix test
* Revert "fix test"
This reverts commit 0ff23b125ce4159b91cc170dbc17b5ed70c9ab11.
* change rounding to truncating
* Update BUILD.bazel
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-05-28 09:32:51 -07:00
Yi Cheng
5d0b302121
[core] Trigger global gc when plasma store is under pressure. ( #15775 )
2021-05-27 10:07:59 -07:00
fyrestone
56c309416e
[Job submission] Basic job submission structure ( #15103 )
2021-05-12 15:08:20 +08:00
Eric Liang
ff36ae594b
Remove flaky tag from newly unflaky tests ( #15639 )
2021-05-05 12:15:46 -07:00
Alex Wu
18d85d2de9
Grpc based resource broadcast ( #15466 )
2021-05-05 11:20:08 -07:00
Eric Liang
a482034916
Flaky test builder for tests tagged "flaky" ( #15408 )
2021-04-20 00:19:07 -07:00
SangBin Cho
61d120557d
[Pubsub] Generalize pubsub, Move pubsub code to pubsub_lib module ( #15164 )
...
* cherry-pick-1
* cherry-pick-2
* cherry-pick-part-3
* Should work.
* Lint fix.
* Fix lint 2.
2021-04-07 20:40:39 -07:00
Siyuan (Ryans) Zhuang
7fd86f7e15
[Core] Use static callback instead of dynamic notification listener ( #15059 )
...
* static callback & remove outdated protocol
* address comments
* fix
* make fields constant
* fix windows compilation error
2021-04-02 22:33:41 -07:00
Alex Wu
4fba05ae4d
[core] Hybrid scheduling policy. ( #14790 )
2021-04-01 16:59:59 -07:00
SangBin Cho
005cff0092
Revert "Revert "[Core] Implement long polling-based pubsub to reduce … ( #14909 )
2021-04-01 09:03:15 -07:00
Alex Wu
1f4d4dfeb0
Gcs pull resource reports ( #14336 )
2021-03-29 11:36:30 -07:00
Siyuan (Ryans) Zhuang
87c79553e9
[Core] Remove code paths that contains plasma store executable ( #14950 )
...
* remove plasma store executable & never used tests
* set default behavior
* fix tests
2021-03-28 21:22:14 -07:00
SangBin Cho
ec3cfef883
Revert "[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. ( #14638 )" ( #14905 )
...
This reverts commit 35ec91c4e0.
2021-03-24 11:22:48 -07:00
SangBin Cho
35ec91c4e0
[Core] Implement long polling-based pubsub to reduce number of WaitForObjectEviction requests in flight. ( #14638 )
...
* in progress.
* IN progress.
* lint.
* Updated code
* lint.
* In progress of writing tests.
* Finished implementation. Need cleanup & refactoring.
* fixing tests...
* Finish the impl.
* Fix typo.
* impl done. Only cleanup left.
* done.
* Finished clean up.
* Fix issues.
* Add a stronger consistency check.
* Addressed code review.
* lint.
* done.
* Addressed more.
* addressed all reviews.
* Addressed code review.
* lint.
* Added unit tests to assert no leak.
2021-03-23 23:47:08 -07:00
Yi Cheng
881a46e1d6
[core] RuntimeEnv GC in local node ( #14594 )
2021-03-18 14:55:11 -07:00
Clark Zinzow
566dcea56a
[Core] Added event loop metrics for posts. ( #14546 )
...
* Added event loop metrics for posts.
* io_context_proxy --> instrumented_io_context
* Fix feature flag, chrono-->absl, trim the stats, inline functions, reformat stats string.
* Make stats struct mutex plain lock instead of reader-writer lock.
* Mutex reader locking, std::array double braces initialization.
* Fix Bazel BUILD formatting.
2021-03-10 11:52:45 -08:00
Eric Liang
99a63b3dd1
Remove old scheduler and friends ( #14184 )
2021-03-03 18:29:15 -08:00
Stephanie Wang
5c6c9d5b91
[core] Spill tasks from waiting queue ( #14288 )
...
* Spill back waiting tasks
* test
* test
* todo
* Avoid iterating over args
* update
* lint
* Fix test
* test
* Test force spillback
* Unit test resource scheduler
* test
* travis?
* rename
* debug
* revert flaky test
* lint
* fix test
* fix
2021-03-02 22:30:02 -08:00
Stephanie Wang
a24ac13671
[core] Randomize actor ID to avoid collisions ( #14358 )
...
* Randomize actor ID
* Mix index and current time, add python test
* test
* nanos
2021-03-02 10:00:28 -08:00
Eric Liang
cc156f7b3c
Fix deadlock in unhandled exception handler and re-merge ( #3 ) ( #14192 )
2021-02-19 11:52:09 -08:00
SangBin Cho
66f93a3d63
Revert "Fix OSX error and re-merge unhandled exceptions handling ( #14138 )" ( #14180 )
...
This reverts commit ee584e8328.
2021-02-18 10:35:38 -08:00
Eric Liang
ee584e8328
Fix OSX error and re-merge unhandled exceptions handling ( #14138 )
2021-02-17 13:35:07 -08:00
architkulkarni
3ce03a52bc
Revert "Revert "Revert "Unhandled exception handler based on local ref counti… ( #14113 )" ( #14136 )
...
This reverts commit e457872fe1.
2021-02-16 11:47:09 -08:00
Eric Liang
e457872fe1
Revert "Revert "Unhandled exception handler based on local ref counti… ( #14113 )
...
* Revert "Revert "Unhandled exception handler based on local ref counting (#14049 )" (#14099 )"
This reverts commit b45ae76765.
* remove test
* fix
* fix
2021-02-15 14:11:11 -08:00
SangBin Cho
b45ae76765
Revert "Unhandled exception handler based on local ref counting ( #14049 )" ( #14099 )
...
This reverts commit 9dc671ae02.
2021-02-14 22:08:32 -08:00
Eric Liang
9dc671ae02
Unhandled exception handler based on local ref counting ( #14049 )
2021-02-12 22:58:38 -08:00
Stephanie Wang
0998d69968
[core] Admission control for pulling objects to the local node ( #13514 )
...
* Admission control, TODO: tests, object size
* Unit tests for admission control and some bug fixes
* Add object size to object table, only activate pull if object size is known
* Some fixes, reset timer on eviction
* doc
* update
* Trigger OOM from the pull manager
* don't spam
* doc
* Update src/ray/object_manager/pull_manager.cc
Co-authored-by: Eric Liang <ekhliang@gmail.com>
* Remove useless tests
* Fix test
* osx build
* Skip broken test
* tests
* Skip failing tests
Co-authored-by: Eric Liang <ekhliang@gmail.com>
2021-01-21 16:46:42 -08:00
fangfengbin
33b092de28
[GCS]Add gcs resource scheduler ( #13072 )
2021-01-14 20:05:55 +08:00
Siyuan (Ryans) Zhuang
46cf433f0e
[Core] Remove Arrow dependencies ( #13157 )
...
* remove arrow ubsan
* remove arrow build depend
* remove arrow buffer
2021-01-04 11:19:09 -08:00
Clark Zinzow
c2bff64699
[Core] Locality-aware leasing: Milestone 1 - Owned refs, pinned location ( #12817 )
...
* Locality-aware leasing for owned refs (pinned locations).
* LessorPicker --> LeasePolicy.
* Consolidate GetBestNodeIdForTask and GetBestNodeIdForObjects.
* Update comments.
* Turn on locality-aware leasing feature flag by default.
* Move local fallback logic to LeasePolicy, move feature flag check to CoreWorker constructor, add local-only lease policy.
* Add lease policy consulting assertions to the direct task submitter tests.
* Add lease policy tests.
* LocalityLeasePolicy --> LocalityAwareLeasePolicy.
* Add missing const declarations.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Add RAY_CHECK for raylet address nullptr when creating lease client.
* Make the fact that LocalLeasePolicy always returns the local node more explicit.
* Flatten GetLocalityData conditionals to make it more readable.
* Add ReferenceCounter::GetLocalityData() unit test.
* Add data-intensive microbenchmarks for single-node perf testing.
* Add data-intensive microbenchmarks for simulated cluster perf testing.
* Remove redundant comment.
* Remove data-intensive benchmarks.
* Add locality-aware leasing Python test.
* Formatting changes in ray_perf.py.
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2021-01-04 09:49:08 -08:00
Siyuan (Ryans) Zhuang
cf9952a028
[Core] Remove outdated external store ( #13080 )
...
* remove outdated external store
2020-12-24 17:30:06 -08:00
Stephanie Wang
4461f9980a
Refactor TaskDependencyManager, allow passing bundles of objects to ObjectManager ( #13006 )
...
* New dependency manager
* Switch raylet to new DependencyManager
* PullManager accepts bundles
* Cleanup, remove old task dependency manager
* x
* PullManager unit tests
* lint
* Unit tests
* Rename
* lint
* test
* Update src/ray/raylet/dependency_manager.cc
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* Update src/ray/raylet/dependency_manager.cc
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
* x
* lint
Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-12-23 18:36:00 -08:00
DK.Pino
6e19facc7f
[GCS] Delete redis gcs client and redis_xxx_accessor ( #12996 )
2020-12-23 20:31:46 +08:00