1494 commits

Author SHA1 Message Date
Stephanie Wang
552ebdbeda
[Core] Announce worker port at end of constructor (#11036) 2020-09-25 21:56:00 -07:00
SangBin Cho
29663d89f1
[Placement Group] Remove warning msg for placement groups. (#11034)
* Done.

* Addressed code review.

* Fixed typo.

* Addressed code review.
2020-09-25 20:53:42 -07:00
SangBin Cho
8abe13023f
[Metric] Fix issue 10634 (#10940)
* Fix.

* Revert "Fix."

This reverts commit 52c9c1ee646b551a4dd2b639c78be67683db2b1c.

* Addressed code review.

* Addressed code review.
2020-09-25 09:11:05 -07:00
Alex Wu
0f168bf2ef
[hotfix] Use ref in WorkerPool::TryKillingIdleWorkers (#11017) 2020-09-24 17:23:56 -07:00
SangBin Cho
5e6b887f2d
[Placement Group] Capture Child Task Part 1 (#10968)
* In progress.

* In progress.

* Done.

* Addressed code review.

* Increase timeout to make a test less flaky.

* Addressed code review.

* Addressed code review.
2020-09-24 09:02:03 -07:00
DK.Pino
4fa6523e4e
[Core] Remove unnecessary if judgment (#10971)
* Remove unnecessary if judgment

* format code style
2020-09-23 21:24:11 -07:00
fangfengbin
2a79571c29
[Placement Group] Optimize log (#10974)
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-23 20:28:08 -07:00
Kai Yang
b251a445dd
[Core] Fix maximum_startup_concurrency caused by AnnounceWorkerPort (#10853)
* Fix maximum_startup_concurrency caused by AnnounceWorkerPort

* Address comment

* Update src/ray/raylet/worker_pool.h

Co-authored-by: Eric Liang <ekhliang@gmail.com>
2020-09-23 20:27:44 -07:00
Alex Wu
295782d411
[New Scheduler] Refactor cluster resource scheduler (#10938) 2020-09-23 15:46:31 -07:00
SangBin Cho
7931b6ce2e
Fix placement group bug failing in release test (#10944) 2020-09-23 12:37:28 -07:00
fangfengbin
a260e66016
[Placement Group] Fix CommitResources crash bug (#10951) 2020-09-23 17:24:53 +08:00
SangBin Cho
390107b6cb
[Core] Allow to pass node ip address to gcs server. (#10946)
* Allow to pass node ip address to gcs server.

* Fix.

* Addressed code review.

* Fixed an error.

* Addressed code review.
2020-09-23 01:52:26 -07:00
Kai Yang
864d1d2b59
[Core] Multi-tenancy: Kill idle workers in FIFO order (#10597)
* Kill idle workers in FIFO order

* Update test

* minor update

* Address comments

* fix after merge

* fix worker_pool_test
2020-09-22 10:59:11 -07:00
SangBin Cho
e3b4850224
[Placement group] Release test (#10924)
* Done.

* Lint.

* Addressed code review.
2020-09-22 00:49:04 -07:00
fangfengbin
1cc4543048
[GCS] Limit the number of profile table (#10888)
* add part code

* add part code

* fix compile bug

* fix compile bug

* fix review comments

Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-21 21:53:42 -07:00
fangfengbin
3e94c690c7
Fix flaky placement group test bug (#10915)
Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>
2020-09-20 19:50:55 -07:00
fangfengbin
3f90ec5963
[GCS] Fix actor idempotent bug (#10856) 2020-09-20 12:35:45 +08:00
fangfengbin
890fa6704f
[GCS] Fix MGetValues Command to send is too large bug (#10877) 2020-09-19 12:22:20 +08:00
SangBin Cho
bc74a10748
[Core] Fix Flaky GCS actor manager test (#10600)
* Try.

* Fix the issue.

* Fix.
2020-09-17 16:10:57 -07:00
SangBin Cho
fe4c6ab778
[Core] Remove unused credis related code. (#10849)
* Done.

* Lint.
2020-09-16 23:34:54 -07:00
fyrestone
50784e2496
[Dashboard] Dashboard node grouping (#10528)
* Add RAY_NODE_ID environment var to agent

* Node related data uses node id as key

* ray.init() return node id; Pass test_reporter.py

* Fix lint & CI

* Fix comments

* Minor fixes

* Fix CI

* Add const to ClientID in AgentManager::Options

* Use fstring

* Add comments

* Fix lint

* Add test_multi_nodes_info

Co-authored-by: 刘宝 <po.lb@antfin.com>
2020-09-16 10:17:29 -07:00
Basasuya
5e030db8a5
[EVENT] add log reporter (#10419) 2020-09-16 11:54:05 +08:00
Kai Yang
4c03f7ca2f
[Core] Multi-tenancy: Reject worker registration if job has finished (#10569) 2020-09-14 14:49:31 +08:00
Kai Yang
a43817f34b
[Java] Attach owner address for pass-by-reference task arguments (#9634) 2020-09-14 11:46:59 +08:00
Xianyang Liu
8166d71bde
[Java] Support exchange ObjectRef between processes (#10729) 2020-09-13 11:54:45 +08:00
SangBin Cho
517e164fb7
[Core] Update the object manager pulling objects error message to warning. (#10657)
* Update the message to expose less implementation details and make the severity WARNING.

* Fix formatting.
2020-09-11 15:53:04 -07:00
Stephanie Wang
dbca2f9889
Fix segfault in network utils (#10741) 2020-09-11 15:35:03 -07:00
Kai Yang
23051385a4
Fix Java CI crash caused by incorrect destruction order in core worker (#10709) 2020-09-11 17:33:09 +08:00
Barak Michener
c6b1ed7f8f
release process: bump version number to 1.1.0.dev0 everywhere (#10686) 2020-09-10 16:00:21 -07:00
Max Fitton
3e8164ff8a
[Dashboard] Logical View Actor Class Grouping Details (#10453)
* wip

* wip

* wip

* wip

* Need to track the timestamp actors are created for the dashboard. This adds that functionality back in and deletes unused code

* Add the materialui lab packages to get access to the Alert component and fix up some vulnerabilities with npm audit.

* Finish supporting information on a per-actor-class basis in the logical view, add bug fixes around timestamps and infeasible task names, and add a new warning popup that shows if there are infeasible actors around.

* lint and add seconds annotation to actor lifetime values

* real lint

* remove typo

* Somehow missed something last lint

* Add new comments for actor states

* Add underscores to some private functions

* Add tooltips to the actor states on the logical view

* change test metrics to be aligned with new changes.

* lint

* Remove some unnecessary log lines and catch error that happens when we try to decode data from an unexpected source

* Re-add a function I had removed. It is used in the Java codebase.

Co-authored-by: Max Fitton <max@semprehealth.com>
2020-09-09 10:34:54 -07:00
Kai Yang
afa0216280
Remove the '--include-java' option (#10594) 2020-09-09 17:01:17 +08:00
chaokunyang
ccf27a9ad2
[Streaming] Fix streaming ci (#10665) 2020-09-09 16:53:43 +08:00
Alex Wu
d9c68fca5c
[Core] Logging improvements (#10625)
* other stuff

* lint

* .

* .

* lint

* comment

* lint

* .
2020-09-08 20:58:05 -07:00
SangBin Cho
b7040f1310
Revert "[Streaming] fix streaming ci (#9675)" (#10656)
This reverts commit 3645a05644.
2020-09-08 19:07:21 -07:00
SangBin Cho
dcb9e03fde
[Placement Group] Atomic Creation using 2 phase protocol part 2. (#10599)
* In progress.

* In Progress

* Basic done.

* Fix build issues.

* Addressed code review.

* Change the confusing test name.

* Fix comments.

* Addressed code review.
2020-09-08 13:11:11 -07:00
chaokunyang
bbfbc98a41
[Core] Allow users to specify the classpath and import path (#10560)
* move job resource path to job config

* job resource path support list

* job resource path support for python

* fix job_resource_path support

* fix worker command

* fix job config

* use jar file instead of parent path

* fix job resource path

* add test to test.sh

* lint

* Update java/runtime/src/main/resources/ray.default.conf

Co-authored-by: Kai Yang <kfstorm@outlook.com>

* fix testGetFunctionFromLocalResource

* lint

* fix rebase

* add jars in resource path to classloader

* add job_resource_path to worker

* add ray stop

* rename job_resource_path to resource_path

* fix resource_path

* refine resource_path comments

* rename job resource path to code search path

* Add instruction about starting a cross-language cluster

* fix ClassLoaderTest.java

* add code-search-path to RunManager

* refine comments for code-search-path

* rename resourcePath to codeSearchPath

* Update doc

* fix

* rename resourcePath to codeSearchPath

* update doc

* filter out empty path

* fix comments

* fix comments

* fix tests

* revert pom

* lint

* fix doc

* update doc

* Apply suggestions from code review

* lint

Co-authored-by: Kai Yang <kfstorm@outlook.com>
Co-authored-by: Hao Chen <chenh1024@gmail.com>
2020-09-09 00:46:32 +08:00
chaokunyang
3645a05644
[Streaming] fix streaming ci (#9675) 2020-09-08 22:20:58 +08:00
Kai Yang
ca8792e4ff
[Java] Disable the multi-worker feature by default (#10593) 2020-09-08 13:10:46 +08:00
kisuke95
b7003839bd
[Core] Use core worker options to initialize (#10467)
* fix

* fix

* .
2020-09-07 16:36:43 -07:00
Stephanie Wang
4f02ad4ef9
[core] Disable GCS reconnect (#10579)
* Set default GCS retries to 1

* Fix cc test
2020-09-05 13:14:07 -07:00
Kai Yang
5f5160ead9
[Core] Multi-tenancy: Worker capping (#10500) 2020-09-04 20:34:06 +08:00
SangBin Cho
2a7f56e429
[Placement group] Fix Logging issues. (#10557) 2020-09-03 23:55:10 -07:00
chaokunyang
cf3875bd8c
[Java] add exitActor API for java (#10496) 2020-09-04 10:11:42 +08:00
Edward Oakes
ead30ca655
[Core] fix named actor bug (#10550) 2020-09-03 17:48:31 -07:00
Clark Zinzow
0c0b0d0a73
[Core] Added support for submission-time task names. (#10449)
* Added support for submission-time task names.

* Suggestions from code review: add missing consts

Co-authored-by: SangBin Cho <rkooo567@gmail.com>

* Add num_returns arg to actor method options docstring example.

* Add process name line and proctitle assertion to submission-time task name section of advanced docs.

* Add submission-time task name --> proctitle test for Python worker.

* Added Python actor options tests for num_returns and name.

* Added Java test for submission-time task names.

* Add dashboard image to task name docs section.

* Move to fstrings.

Co-authored-by: SangBin Cho <rkooo567@gmail.com>
2020-09-03 11:45:24 -07:00
Edward Oakes
71274954d1
Remove unnecessary output when connecting to a cluster. (#10512) 2020-09-03 13:30:33 -05:00
Sven Mika
715ee8dfc9
[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477) 2020-09-03 17:27:05 +02:00
SangBin Cho
dc7fe1a4c5
[Placement Group] Atomic Placement Group Part 1, Basic Structure. (#10482)
* Write a test.

* Basic structure done.

* Reduce flakiness of tests.

* Addressed code review.

* Skipping tests because it is flaky for now.

* Fix linting issues.

* Increase sleep time to see lint messages.

* Lint issue fixed.
2020-09-02 18:14:46 -07:00
chaokunyang
f10a5a40b0
[Java] Simplify ray cmd params (#10394) 2020-09-02 19:47:52 +08:00
Ian Rodney
283f4d1060
[docker] Use tmp paths for rsync and fix file_mounts on docker (#10368) 2020-09-01 13:14:35 -07:00