hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 10:01:43 -05:00

Author	SHA1	Message	Date
Qing Wang	1172195571	[Java] Remove global named actor and global pg (#20135 ) This PR removes global named actors and global PGs. I believe these APIs are not widely used in OSS. The CPP part is not included in this PR. @kfstorm @clay4444 @raulchen Please take a look and check whether this change is reasonable. IMPORTANT NOTE: This is a Java API change and will break backward compatibility for Java global named actor and global PG usage. INCLUDES: Remove the setGlobalName() and getGlobalActor() APIs. Remove getGlobalPlacementGroup() and setGlobalPG. Add the getActor(name, namespace) API. Add the getPlacementGroup(name, namespace) API. Update doc pages.	2021-11-15 16:28:53 +08:00
Yi Cheng	e54d3117a4	[gcs] Update all redis kv usage in python except function table (#20014 ) ## Why are these changes needed? This is part of the redis removal project. In this PR, all direct usage of redis is removed except for the function table, which will be migrated in the next PR. ## Related issue number #19443	2021-11-10 20:24:53 -08:00
Tao Wang	60df705b4e	[Cpp]Get next job id globally instead of random selecting (#20102 ) ## Why are these changes needed? ## Related issue number Final part of #13984 ## Checks - [x] I've run `scripts/format.sh` to lint the changes in this PR. - [x] I've included any doc changes needed for https://docs.ray.io/en/master/. - [x] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [x] Unit tests - [ ] Release tests - [ ] This PR is not tested :(	2021-11-09 15:46:57 +08:00
Kai Yang	e84391d1d3	[Core] Encode job ID in randomized task IDs for user-created threads (#19320 ) ## Why are these changes needed? Currently, `WorkerContext::GetCurrentTaskID()` returns a random task ID in user-created threads, and the returned task ID doesn't include the job ID. In this case, subsequent non-actor tasks and return values, and objects created by `ray.put()`, don't include the job ID either. This makes it hard to find the correct job ID from a task or object ID. This PR updates the task ID generation code to always encode the job ID. A side effect of this PR is a change in the probability of task ID collisions in user-created threads due to the fixed job ID part. w/o this PR: `sqrt(pi * 256 ^ 12 / 2)` ~= 352 trillion tasks. w/ this PR: `sqrt(pi * 256 ^ 8 / 2)` ~= 5 billion tasks. But this should be OK because the job ID part of task IDs in non-user-created threads is always fixed, so it won't be worse than in non-user-created threads. ## Related issue number ## Checks - [ ] I've run `scripts/format.sh` to lint the changes in this PR. - [ ] I've included any doc changes needed for https://docs.ray.io/en/master/. - [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/ - Testing Strategy - [ ] Unit tests - [ ] Release tests - [ ] This PR is not tested :(	2021-11-08 21:00:40 +08:00
Alex Wu	146b3d6bcc	[scheduler] Include depth and function descriptor in scheduling class (#20004 )	2021-11-05 08:19:48 -07:00
qicosmos	246a901aea	[C++ API] Support object ref args (#19550 )	2021-10-29 17:36:53 +08:00
qicosmos	efef38f240	[C++ Worker] Add basic ref counting test cases (#17768 )	2021-10-29 11:22:19 +08:00
Qing Wang	048e7f7d5d	[Core] Port concurrency groups with asyncio (#18567 )
## Why are these changes needed?
This PR aims to port concurrency groups functionality with asyncio for Python.
### API
```python
@ray.remote(concurrency_groups={"io": 2, "compute": 4})
class AsyncActor:
    def __init__(self):
        pass

    @ray.method(concurrency_group="io")
    async def f1(self):
        pass

    @ray.method(concurrency_group="io")
    def f2(self):
        pass

    @ray.method(concurrency_group="compute")
    def f3(self):
        pass

    @ray.method(concurrency_group="compute")
    def f4(self):
        pass

    def f5(self):
        pass
```
The annotation above the actor class `AsyncActor` declares that this actor has 2 concurrency groups and sets their max concurrencies; the actor also has a default concurrency group. Every concurrency group has an async eventloop and a pythread to execute the methods defined on it. Method `f1` will be invoked in the `io` concurrency group, `f2` in `io`, `f3` in `compute`, and so on. Note that `f5` and `__init__` will be invoked in the default concurrency group. In the following call, method `f2` will be invoked in the concurrency group `compute`, since a group specified dynamically at call time has higher priority.
```python
a.f2.options(concurrency_group="compute").remote()
```
### Implementation
The straightforward implementation details are:
- Before, we only had 1 eventloop bound to 1 pythread for an asyncio actor. Now we create 1 eventloop bound to 1 pythread for every concurrency group of the asyncio actor.
- Before, we had 1 fiber state for every caller in the asyncio actor. Now we create a FiberStateManager for every caller in the asyncio actor, and the FiberStateManager manages the fiber states for the concurrency groups.
## Related issue number
#16047	2021-10-21 21:46:56 +08:00
Guyang Song	c04fb62f1d	[C++ worker] set native library path for shared library search (#19376 )	2021-10-18 16:03:49 +08:00
Gagandeep Singh	d226cbf21a	Added StartupToken to identify a process at startup (#19014 ) * Added StartupToken to identify a process at startup * Applied linting formats * Addressed reviews * Fixing worker_pool_test * Fixed worker_pool_test * Applied linting formatting * Added documentation for StartupToken * Fixed linting * Reordered initialisation of WorkerPool members * Fixed Python docs * Fixing bugs in cluster_mode_test * Fixing Java tests * Create and set shim process after verifying startup_token * shim_process.GetId() -> worker_shim_pid * Improvements in startup token and modifying java files * update io_ray_runtime_RayNativeRuntime.h * Fixed java tests by adding startup-token to conf * Applied linting * Increased arg count for startup_token * Attempt to fix streaming tests * Type correction * applied linting * Corrected index of startup token arg * Modified, mock_worker.cc to accept startup tokens * Applied linting * Applied linting changes from CI * Removed override from worker.h * Applied linting from scripts/format.sh * Addressed reviews and applied scripts/format.sh * Applied linting script from ci/travis * Removed unrequired methods from public scope * Applied linting	2021-10-15 15:13:13 -07:00
Guyang Song	ab55b808c5	[runtime env] move worker env to runtime env in Java (#19060 )	2021-10-11 17:25:09 +08:00
gjoliver	635010d460	Update build rules and patches for darwin_arm64 platform. (#19037 ) * Update build rules and patches for darwin_arm64 platform. Changes include: Update nelhage/rules_boost package from current version (08/5/2020) to 5/27/2021 version. Remove rules_boost-undefine-boost_fallthrough.patch, since BOOST_FALLTHROUGH seems to be defined now. Minor changes to rules_boost-windows-linkopts.patch to use default condition to add -lpthread flag for all platforms. Add darwin_arm64 config to BUILD files for lib civetweb pulled in via prometheus dependency. * upgrade boost to 1.74.0 from 1.71.0 to match the updated build file for windows. * Fix ray_cpp_pkg * Use boost/bind/bind.hpp, since boost/bind.hpp and global namespace placeholders are deprecated. * lint * Use absl::bind_front when possible. Otherwise, NOLINT * lint * lint * lint * lint * more lint * final lint * trigger build	2021-10-09 18:48:35 -07:00
Jiajun Yao	ed9118393c	Listen to 127.0.0.1 by default on mac osx (#18904 )	2021-09-29 11:40:19 -07:00
Guyang Song	337005d5a5	[C++ API][hotfix] fix C++ worker dynamic library loading issue on macOS (#18877 ) * fix C++ worker in macOS * fix	2021-09-24 23:39:00 +08:00
Guyang Song	739cf64115	[C++ API] support head_args config in C++ API (#18709 )	2021-09-23 19:30:53 +08:00
qicosmos	64c25987f3	[C++ Worker]Simple kv store example (#18613 )	2021-09-18 16:02:44 +08:00
Jiajun Yao	ffe7108eae	Fix cpp api doc (#18671 )	2021-09-17 14:01:23 -07:00
Guyang Song	187e4a86ca	[C++ API] expose C++ task failure event (#18596 )	2021-09-16 19:20:16 +08:00
qicosmos	d7c631209b	[C++ Worker]Add api get placement group (#18535 )	2021-09-15 14:11:31 +08:00
Stephanie Wang	284dee493e	[core][usability] Disambiguate ObjectLostErrors for better understandability (#18292 ) * Define error types, throw error for ObjectReleased * x * Disambiguate OBJECT_UNRECONSTRUCTABLE and OBJECT_LOST * OwnerDiedError * fix test * x * ObjectReconstructionFailed * ObjectReconstructionFailed * x * x * print owner addr * str * doc * rename * x	2021-09-13 16:16:17 -07:00
qicosmos	ac0a153b06	[C++ Worker]Add some api of placement group (#18431 )	2021-09-13 15:10:54 +08:00
qicosmos	dd096c8e73	[C++ Worker]Fix abi issue (#18273 )	2021-09-10 11:53:05 +08:00
qicosmos	ba0084e9c7	[C++ Worker]Add gcs global state accessor (#17976 )	2021-09-09 12:08:08 +08:00
qicosmos	1da05209b9	[C++ Worker]Add get actor API. (#17897 ) * linkopts shared * add get actor api * fix * improve * reduce some duplicate code * improve some	2021-09-06 11:46:46 +08:00
qicosmos	72739462a9	[C++ Worker]Add some api of placement group part1. (#17925 ) * linkopts shared * add some pg api * add Wait for PlacementGroup	2021-09-03 13:32:28 +08:00
Stephanie Wang	d43d297d9a	[core] Attach call site to ObjectRefs, print on error (#17971 ) * Attach call site to ObjectRef * flag * Fix build * build * build * build * x * x * skip on windows * lint	2021-09-01 15:29:05 -07:00
Jiajun Yao	fbb3ac6a86	Retry application-level errors (#18176 ) * Retry application-level errors * Retry application-level errors * Push retry message to the driver	2021-09-01 10:53:06 -07:00
Stephanie Wang	8e06db7280	Revert "[Core] revert: revert Unified worker starter (#18008 )" (#18228 ) This reverts commit `b9978dd02b`.	2021-08-30 17:28:41 -07:00
Eric Liang	1adce7da4e	Revert "Auto discover dashboard agent port (#17855 )" (#18217 ) This reverts commit `53ddb551d5`.	2021-08-30 10:46:37 -07:00
fyrestone	53ddb551d5	Auto discover dashboard agent port (#17855 )	2021-08-30 12:06:28 +08:00
Jiajun Yao	25ef452b15	[Core] Fix typo in local_mode_task_submitter.cc (#18046 )	2021-08-24 13:03:05 -07:00
chenk008	b9978dd02b	[Core] revert: revert Unified worker starter (#18008 )	2021-08-23 13:34:32 -07:00
Stephanie Wang	b8fe776638	[core] Fix inlined nested ids (#17834 ) * test * Use ObjectRef instead of ObjectID in nested refs * java * doc * java * build * build * x * lint * simplify * fix	2021-08-20 08:58:29 -07:00
Eric Liang	661ac4e37b	Remove last traces of ref-counting flag (#17932 )	2021-08-19 21:08:13 -07:00
Simon Mo	b573864928	[CI] Add test owners (#17893 )	2021-08-18 18:38:31 -07:00
Eric Liang	a9073d16f4	Revert "[Core] Unified worker initiators (#17401 )" (#17935 ) This reverts commit `c3764ffd7d`.	2021-08-18 18:06:24 -07:00
chenk008	c3764ffd7d	[Core] Unified worker initiators (#17401 ) * use setup_worker as starter * use setup_worker as starter * add java test * fix * fix * lint * sleep in ci * sleep in ci * fix ut * fix * fix * fix * fix * fix * fix * change test size * test * fix * fix * fix ut * restore sgd test * change test size * fix merge conflict * restore cpp worker flag * fix * fix * add worker-languange in setup_runtime_env.py * lint * fix java command Co-authored-by: root <chenk008>	2021-08-17 19:37:26 +08:00
qicosmos	a2a1c46c83	[C++ Worker]Fix for mac (#17633 ) * linkopts shared * replace gflags with absl flags * fix * add test option * fix * add cpp worker to mac ci * fix * support empty redis password;mod arc argv * add encoding * test * ignore example test on mac * support mac * fix * fix and update doc * fix * fix run.sh * fix init * fix typo * fix run.sh * fix lint Co-authored-by: 久龙 <guyang.sgy@antfin.com>	2021-08-13 12:22:37 +08:00
Guyang Song	b97027ec64	[C++ API] support cpu gpu num 0 (#17783 ) * support cpu gpu num 0 * support cpu gpu num 0 * fix	2021-08-13 08:45:33 +08:00
Guyang Song	88b8de5904	[C++ API] support ray::IsInitialized (#17780 ) * support ray::IsInitialized * address comments * fix	2021-08-13 00:51:26 +08:00
Guyang Song	e53aeca6bb	[C++ API]support set resources in RayConfig (#17779 )	2021-08-12 22:53:42 +08:00
Guyang Song	63f9ba2858	[C++ API][Fix] support ray::Init without RayConfig (#17733 )	2021-08-12 10:59:21 +08:00
qicosmos	05da724521	[C++ Worker] Replace `Ray::xxx` with `ray::xxx` and update namespaces (#17388 )	2021-08-10 11:17:59 +08:00
SongGuyang	c62ce78be8	make C++ example simpler (#17609 )	2021-08-09 19:39:16 +08:00
Hao Chen	0858f0e4f2	Change core worker C++ namespace to ray::core (#17610 )	2021-08-08 23:34:25 +08:00
qicosmos	f1f7d4a085	[C++ Worker]Add some APIs for task call part one (#16499 )	2021-08-05 17:25:36 +08:00
Chen Shen	53a0c74413	[nightly-test] fix non_streaming_shuffle_1tb_5000_partitions	2021-08-04 16:06:53 -07:00
SongGuyang	3e42f54910	Support copyright format for c++ files (#14348 )	2021-08-04 17:19:38 +08:00
Siyuan (Ryans) Zhuang	8efc04a8a6	[Core] Actor namespace (#17178 ) * set actor namespace in Python on creation * get actor with namespace in Python * update message	2021-07-19 21:51:04 -07:00
SongGuyang	21b464ae9d	[C++ API] support get ray address from env (#17144 )	2021-07-16 17:17:43 +08:00
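
The collision estimates quoted in commit e84391d1d3 (#19320) match the usual birthday-bound approximation: for an ID space of N equally likely values, roughly sqrt(pi * N / 2) random draws are expected before the first collision. The sketch below only reproduces that arithmetic. The function name is illustrative, and the byte counts 12 and 8 are simply the exponents that appear in the commit message, not values read from the Ray source.

```python
import math


def expected_draws_before_collision(random_bytes: int) -> float:
    """Birthday-bound estimate sqrt(pi * N / 2) with N = 256 ** random_bytes."""
    space = 256 ** random_bytes  # number of distinct random suffixes
    return math.sqrt(math.pi * space / 2)


# Exponents taken from the commit message (12 random bytes before, 8 after).
without_pr = expected_draws_before_collision(12)
with_pr = expected_draws_before_collision(8)

print(f"w/o this PR: about {without_pr:.3e} tasks")
print(f"w/  this PR: about {with_pr:.3e} tasks")
```

The two results land near the "352 trillion" and "5 billion" figures given in the commit message.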

117 commits
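
The implementation notes in commit 048e7f7d5d (#18567) describe one event loop bound to one thread per concurrency group, with a per-call override taking priority over the group declared on the method. The following is a toy model of that dispatch idea using only the Python standard library. It is not Ray's implementation: `GroupExecutor`, `ToyActor`, the thread names, and the default group's limit of 1 are all hypothetical choices made for illustration.

```python
import asyncio
import threading


class GroupExecutor:
    """One event loop on one dedicated thread, with a cap on in-flight calls."""

    def __init__(self, name, max_concurrency):
        self._max = max_concurrency
        self._sem = None  # created lazily on the loop's own thread
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(
            target=self._run, name=f"group-{name}", daemon=True
        )
        self.thread.start()

    def _run(self):
        asyncio.set_event_loop(self.loop)
        self.loop.run_forever()

    async def _guarded(self, coro):
        # Runs on the group's loop, so lazy creation is free of races.
        if self._sem is None:
            self._sem = asyncio.Semaphore(self._max)
        async with self._sem:
            return await coro

    def submit(self, coro):
        # Hand the coroutine to the group's loop from any thread.
        return asyncio.run_coroutine_threadsafe(self._guarded(coro), self.loop)

    def stop(self):
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join()
        self.loop.close()


class ToyActor:
    """Routes calls to a group; an explicit override wins over the declared group."""

    def __init__(self, groups, method_groups):
        self._executors = {n: GroupExecutor(n, limit) for n, limit in groups.items()}
        self._executors["default"] = GroupExecutor("default", 1)  # arbitrary limit
        self._method_groups = method_groups

    def call(self, method_name, coro_fn, concurrency_group=None):
        group = concurrency_group or self._method_groups.get(method_name, "default")
        return self._executors[group].submit(coro_fn())

    def shutdown(self):
        for executor in self._executors.values():
            executor.stop()


async def where_am_i():
    # Report which group's thread is executing this call.
    return threading.current_thread().name


actor = ToyActor({"io": 2, "compute": 4}, {"f1": "io", "f2": "io", "f3": "compute"})
results = {
    "f1": actor.call("f1", where_am_i).result(timeout=5),
    "f2_override": actor.call("f2", where_am_i, concurrency_group="compute").result(timeout=5),
    "f5": actor.call("f5", where_am_i).result(timeout=5),
}
actor.shutdown()
print(results)
```

In this sketch the declared group is looked up per method name, a call-time `concurrency_group` argument replaces it, and anything undeclared falls through to the default group, which mirrors the priority order the commit message describes.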