hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
Alex Wu	295782d411	[New Scheduler] Refactor cluster resource scheduler (#10938 )	2020-09-23 15:46:31 -07:00
SangBin Cho	7931b6ce2e	Fix placement group bug failing in release test (#10944 )	2020-09-23 12:37:28 -07:00
fangfengbin	a260e66016	[Placement Group]Fix CommitResources crash bug (#10951 )	2020-09-23 17:24:53 +08:00
SangBin Cho	390107b6cb	[Core] Allow to pass node ip address to gcs server. (#10946 ) * Allow to pass node ip address to gcs server. * Fix. * Addressed code review. * Fixed an error. * Addressed code review.	2020-09-23 01:52:26 -07:00
Kai Yang	864d1d2b59	[Core] Multi-tenancy: Kill idle workers in FIFO order (#10597 ) * Kill idle workers in FIFO order * Update test * minor update * Address comments * fix after merge * fix worker_pool_test	2020-09-22 10:59:11 -07:00
SangBin Cho	e3b4850224	[Placement group] Release test (#10924 ) * Done. * Lint. * Addressed code review.	2020-09-22 00:49:04 -07:00
fangfengbin	1cc4543048	[GCS]Limit the number of profile table (#10888 ) * add part code * add part code * fix compile bug * fix compile bug * fix review comments Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>	2020-09-21 21:53:42 -07:00
fangfengbin	3e94c690c7	Fix flaky placement group test bug (#10915 ) Co-authored-by: 灵洵 <fengbin.ffb@antfin.com>	2020-09-20 19:50:55 -07:00
fangfengbin	3f90ec5963	[GCS]Fix actor idempotent bug (#10856 )	2020-09-20 12:35:45 +08:00
fangfengbin	890fa6704f	[GCS]Fix MGetValues Command to send is too large bug (#10877 )	2020-09-19 12:22:20 +08:00
SangBin Cho	bc74a10748	[Core] Fix Flaky GCS actor manager test (#10600 ) * Try. * Fix the issue. * Fix.	2020-09-17 16:10:57 -07:00
SangBin Cho	fe4c6ab778	[Core] Remove unused credis related code. (#10849 ) * Done. * Lint.	2020-09-16 23:34:54 -07:00
fyrestone	50784e2496	[Dashboard] Dashboard node grouping (#10528 ) * Add RAY_NODE_ID environment var to agent * Node related data use node id as key * ray.init() return node id; Pass test_reporter.py * Fix lint & CI * Fix comments * Minor fixes * Fix CI * Add const to ClientID in AgentManager::Options * Use fstring * Add comments * Fix lint * Add test_multi_nodes_info Co-authored-by: 刘宝 <po.lb@antfin.com>	2020-09-16 10:17:29 -07:00
Basasuya	5e030db8a5	[EVENT] add log reporter (#10419 )	2020-09-16 11:54:05 +08:00
Kai Yang	4c03f7ca2f	[Core] Multi-tenancy: Reject worker registration if job has finished (#10569 )	2020-09-14 14:49:31 +08:00
Kai Yang	a43817f34b	[Java] Attach owner address for pass-by-reference task arguments (#9634 )	2020-09-14 11:46:59 +08:00
Xianyang Liu	8166d71bde	[Java] Support exchange ObjectRef between processes (#10729 )	2020-09-13 11:54:45 +08:00
SangBin Cho	517e164fb7	[Core] Update the object manager pulling objects error message to warning. (#10657 ) * Update the message to expose less implementation details and make the severity WARNING. * Fix formatting.	2020-09-11 15:53:04 -07:00
Stephanie Wang	dbca2f9889	Fix segfault in network utils (#10741 )	2020-09-11 15:35:03 -07:00
Kai Yang	23051385a4	Fix Java CI crash caused by incorrect destruction order in core worker (#10709 )	2020-09-11 17:33:09 +08:00
Barak Michener	c6b1ed7f8f	release process: bump version number to 1.1.0.dev0 everywhere (#10686 )	2020-09-10 16:00:21 -07:00
Max Fitton	3e8164ff8a	[Dashboard] Logical View Actor Class Grouping Details (#10453 ) * wip * wip * wip * wip * Need to track the timestamp actors are created for the dashboard. This adds that functionality back in and deletes unused code * Add the materialui lab packages to get access to the Alert component and fix up some vulnerabilities with npm audit. * Finish supporting information on a per-actor-class basis in the logical view, add bug fixes around timestamps and infeasible task names, and add a new warning popup that shows if there are infeasible actors around. * lint and add seconds annotation to actor lifetime values * real lint * remove typo * Somehow missed something last lint * Add new comments for actor states * Add underscores to some private functions * Add tooltips to the actor states on the logical view * change test metrics to be aligned with new changes. * lint * Remove some unnecessary log lines and catch error that happens when we try to decode data from an unexpected source * Re-add a function I had removed. It is used in the Java codebase. Co-authored-by: Max Fitton <max@semprehealth.com>	2020-09-09 10:34:54 -07:00
Kai Yang	afa0216280	Remove the '--include-java' option (#10594 )	2020-09-09 17:01:17 +08:00
chaokunyang	ccf27a9ad2	[Streaming] Fix streaming ci (#10665 )	2020-09-09 16:53:43 +08:00
Alex Wu	d9c68fca5c	[Core] Logging improvements (#10625 ) * other stuff : * lint * . * . * lint * comment * lint * .	2020-09-08 20:58:05 -07:00
SangBin Cho	b7040f1310	Revert "[Streaming] fix streaming ci (#9675 )" (#10656 ) This reverts commit `3645a05644`.	2020-09-08 19:07:21 -07:00
SangBin Cho	dcb9e03fde	[Placement Group] Atomic Creation using 2 phase protocol part 2. (#10599 ) * In progress. * In Progress * Basic done. * Fix build issues. * Addressed code review. * Change the confusing test name. * Fix comments. * Addressed code review.	2020-09-08 13:11:11 -07:00
chaokunyang	bbfbc98a41	[Core] Allow users to specify the classpath and import path (#10560 ) * move job resource path to job config * job resource path support list * job resource path support for python * fix job_resource_path support * fix worker command * fix job config * use jar file instead of parent path * fix job resource path * add test to test.sh * lint * Update java/runtime/src/main/resources/ray.default.conf Co-authored-by: Kai Yang <kfstorm@outlook.com> * fix testGetFunctionFromLocalResource * lint * fix rebase * add jars in resource path to classloader * add job_resource_path to worker * add ray stop * rename job_resource_path to resource_path * fix resource_path * refine resource_path comments * rename job resource path to code search path * Add instruction about starting a cross-language cluster * fix ClassLoaderTest.java * add code-search-path to RunManager * refine comments for code-search-path * rename resourcePath to codeSearchPath * Update doc * fix * rename resourcePath to codeSearchPath * update doc * filter out empty path * fix comments * fix comments * fix tests * revert pom * lint * fix doc * update doc * Apply suggestions from code review * lint Co-authored-by: Kai Yang <kfstorm@outlook.com> Co-authored-by: Hao Chen <chenh1024@gmail.com>	2020-09-09 00:46:32 +08:00
chaokunyang	3645a05644	[Streaming] fix streaming ci (#9675 )	2020-09-08 22:20:58 +08:00
Kai Yang	ca8792e4ff	[Java] Disable the multi-worker feature by default (#10593 )	2020-09-08 13:10:46 +08:00
kisuke95	b7003839bd	[Core] Use core worker options to initialize (#10467 ) * fix * fix * .	2020-09-07 16:36:43 -07:00
Stephanie Wang	4f02ad4ef9	[core] Disable GCS reconnect (#10579 ) * Set default GCS retries to 1 * Fix cc test	2020-09-05 13:14:07 -07:00
Kai Yang	5f5160ead9	[Core] Multi-tenancy: Worker capping (#10500 )	2020-09-04 20:34:06 +08:00
SangBin Cho	2a7f56e429	[Placement group] Fix Logging issues. (#10557 )	2020-09-03 23:55:10 -07:00
chaokunyang	cf3875bd8c	[Java] add exitActor API for java (#10496 )	2020-09-04 10:11:42 +08:00
Edward Oakes	ead30ca655	[Core] fix named actor bug (#10550 )	2020-09-03 17:48:31 -07:00
Clark Zinzow	0c0b0d0a73	[Core] Added support for submission-time task names. (#10449 ) * Added support for submission-time task names. * Suggestions from code review: add missing consts Co-authored-by: SangBin Cho <rkooo567@gmail.com> * Add num_returns arg to actor method options docstring example. * Add process name line and proctitle assertion to submission-time task name section of advanced docs. * Add submission-time task name --> proctitle test for Python worker. * Added Python actor options tests for num_returns and name. * Added Java test for submission-time task names. * Add dashboard image to task name docs section. * Move to fstrings. Co-authored-by: SangBin Cho <rkooo567@gmail.com>	2020-09-03 11:45:24 -07:00
Edward Oakes	71274954d1	Remove unnecessary output when connecting to a cluster. (#10512 )	2020-09-03 13:30:33 -05:00
Sven Mika	715ee8dfc9	[RLlib] Issue 10469: Callbacks should receive env idx ... (#10477 )	2020-09-03 17:27:05 +02:00
SangBin Cho	dc7fe1a4c5	[Placement Group] Atomic Placement Group Part 1, Basic Structure. (#10482 ) * Write a test. * Basic structure done. * Reduce flakiness of tests. * Addressed code review. * Skipping tests because it is flaky for now. * Fix linting issues. * Increase sleep time to see lint messages. * Lint issue fixed.	2020-09-02 18:14:46 -07:00
chaokunyang	f10a5a40b0	[Java] Simplify ray cmd params (#10394 )	2020-09-02 19:47:52 +08:00
Ian Rodney	283f4d1060	[docker] Use tmp paths for rsync and fix file_mounts on docker (#10368 )	2020-09-01 13:14:35 -07:00
chaokunyang	d584a4e5c4	Fix java ci break (#10472 )	2020-09-01 19:57:03 +08:00
chaokunyang	ba3bd6b225	Fix java ci break (#10470 )	2020-09-01 19:33:23 +08:00
SangBin Cho	a0c7907d88	[Placement Group] Leasing context refactoring part 2 (#10413 ) * In progress. * Refactoring done, but still failing tests. * Fix issues. * Addressed code review. * Addressed code review.	2020-08-31 15:54:34 -07:00
Gabriele Oliaro	05fe6dc278	Keeping pipelines full (#10225 ) * requesting new workers only when pipelines to existing ones are full * linting * added unit testing & linting * finished refactoring to consolidate all the fields that belong to a SchedulingKey into a single hashmap * linting * fixed bugs introduced by rebasing from new upstream master * changes as part of the PR review process * Fix typo in src/ray/core_worker/transport/direct_task_transport.cc Co-authored-by: fangfengbin <869218239a@zju.edu.cn> * Fixed comment in src/ray/core_worker/transport/direct_task_transport.cc Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu> * second revision, with linting. all tests are passing locally * Renamed SafeToDeleteEntry method in SchedulingKeyEntry Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu> * all new revisions but the memory leak check. performed linting. * added checks to make sure scheduling_key_entries does not leak memory * linting. all checks passing locally * edited CheckNoSchedulingKeyEntries function * linting * fixed build error on mac * created public version of CheckNoSchedulingKeyEntries to acquire the lock * linting Co-authored-by: fangfengbin <869218239a@zju.edu.cn> Co-authored-by: Stephanie Wang <swang@cs.berkeley.edu>	2020-08-30 18:49:25 -07:00
fyrestone	e9b046306a	[Dashboard] Dashboard basic modules (#10303 ) * Improve reporter module * Add test_node_physical_stats to test_reporter.py * Add test_class_method_route_table to test_dashboard.py * Add stats_collector module for dashboard * Subscribe actor table data * Add log module for dashboard * Only enable test module in some test cases * CI run all dashboard tests * Reduce test timeout to 10s * Use fstring * Remove unused code * Remove blank line * Fix dashboard tests * Fix asyncio.create_task not available in py36; Fix lint * Add format_web_url to ray.test_utils * Update dashboard/modules/reporter/reporter_head.py Co-authored-by: Max Fitton <mfitton@berkeley.edu> * Add DictChangeItem type for Dict change * Refine logger.exception * Refine GET /api/launch_profiling * Remove disable_test_module fixture * Fix test_basic may fail Co-authored-by: 刘宝 <po.lb@antfin.com> Co-authored-by: Max Fitton <mfitton@berkeley.edu>	2020-08-29 23:09:34 -07:00
Stephanie Wang	9a31166050	Option to disable profiling and task timeline (#10414 )	2020-08-29 11:35:22 -07:00
Lixin Wei	eb66db3199	[Build] bug fixed for logging (#10364 )	2020-08-28 09:17:08 -07:00
SangBin Cho	d206fbbc99	[Placement group] Scheduler map refactoring part 1. (#10381 ) * In Progress * done. * Address code review.	2020-08-28 00:57:09 -07:00

1486 commits