hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 10:31:39 -05:00

Author	SHA1	Message	Date
SangBin Cho	39b9c44c8d	[State Observability] pre-alpha documentation (#26560 ) Adds Documentation for state APIs API reference	2022-07-26 05:49:28 -07:00
Kai Fricke	8fe439998e	[air/tuner/docs] Update docs for Tuner() API 1: RSTs, docs, move reuse_actors (#26930 ) Signed-off-by: Kai Fricke coding@kaifricke.com Why are these changes needed? Splitting up #26884: This PR includes changes to use Tuner() instead of tune.run() for most docs files (rst and py), and a change to move reuse_actors to the TuneConfig	2022-07-24 07:45:24 -07:00
Stephanie Wang	55a0f7bb2d	[core] ray.init defaults to an existing Ray instance if there is one (#26678 ) ray.init() will currently start a new Ray instance even if one already exists, which is very confusing if you are a new user trying to go from local development to a cluster. This PR changes it so that, when no address is specified, we first try to find an existing Ray cluster that was created through `ray start`. If none is found, we will start a new one. This makes two changes to the ray.init() resolution order: 1. When `ray start` is called, the started cluster's address is already written to a file called `/tmp/ray/ray_current_cluster`. For ray.init() and ray.init(address="auto"), we will first check this local file for an existing cluster address. The file is deleted on `ray stop`. If the file is empty, we will autodetect any running cluster (legacy behavior) if address="auto", or start a new local Ray instance if address=None. 2. When ray.init(address="local") is called, we will create a new local Ray instance, even if one already exists. This behavior seems to be necessary mainly for `ray.client` use cases. This also surfaces the logs about which Ray instance we are connecting to. Previously these were hidden because we didn't set up the log until after connecting to Ray. So now Ray will log one of the following messages during ray.init: ``` (Connecting to existing Ray cluster at address: <IP>...) ...connection... (Started a local Ray cluster.| Connected to Ray Cluster.)( View the dashboard at <URL>) ``` Note that this changes the dashboard URL to be printed with `ray.init()` instead of when the dashboard is first started. Co-authored-by: Eric Liang <ekhliang@gmail.com>	2022-07-23 11:27:22 -07:00
Siyuan (Ryans) Zhuang	0063d94166	[Core] Make "GetTimeoutError" a subclass of "TimeoutError" (#26771 ) I am surprised by the fact that `GetTimeoutError` is not a subclass of `TimeoutError`, which is counter-intuitive and may discourage users from trying the timeout feature in `ray.get`, because you have to "guess" the correct error type. For most people, I believe the first error type in their mind would be `TimeoutError`. This PR fixes this.	2022-07-20 14:37:39 -05:00
tomsunelite	d915529e9e	Add doc for custom lifetime of java actor (#26706 ) Custom lifetime of java Actor is already supported, but the related document is not updated Co-authored-by: sunkunjian1 <sunkunjian1@jd.com>	2022-07-20 22:19:44 +08:00
Tao Wang	4f2747f12a	[Core][C++ worker] Add GetNamespace api (#26509 )	2022-07-20 11:17:14 +08:00
Tao Wang	cd521ed132	[Doc][namespaces][C++ worker]add document for c++ worker namespace and specifying namespace while creating/getting named actors (#26498 ) We've supported namespace in c++ worker in https://github.com/ray-project/ray/pull/26327. Here we add doc for usage and also reinforce the documents of Java and Python, like adding explanation of specifying namespace while creating named actors. - [x] add doc for basic c++ worker namespace usage - [x] add explanation for specifying namespace while creating named actors, in Python, Java and C++	2022-07-20 10:58:41 +08:00
Sumanth Ratna	759966781f	[air] Allow users to use instances of `ScalingConfig` (#25712 ) Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com> Co-authored-by: matthewdeng <matthew.j.deng@gmail.com> Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2022-07-18 15:46:58 -07:00
M Waleed Kadous	7c32993c15	[core/docs]Add a new section under Ray Core called Ray Gotchas (#26624 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2022-07-16 16:53:01 -07:00
Archit Kulkarni	61c9e761f3	[runtime_env] [doc] Remove outdated info about "isolated" environment (#26314 )	2022-07-08 14:15:12 -05:00
Philipp Moritz	60c5ed7bfd	[Doc] Fix actor example (#26381 )	2022-07-07 21:19:51 -07:00
Antoni Baum	ea94cda1f3	[AIR] Replace `train.` with `session.` (#26303 ) This PR replaces legacy API calls to `train.` with AIR `session.` in Train code, examples and docs. Depends on https://github.com/ray-project/ray/pull/25735	2022-07-07 16:29:04 -07:00
Chen Shen	29358f9677	[Core][Doc] add docs for out of disk prevention. (#26291 ) Update docs to reflect the out of disk prevention feature.	2022-07-06 07:37:54 -07:00
Guyang Song	cf7305a2c9	Revert "[Core] Add retry exception allowlist for user-defined filteri… (#26289 ) Closes #26287.	2022-07-05 15:17:36 -07:00
Clark Zinzow	2a4d22fbd2	[Core] Add retry exception allowlist for user-defined filtering of retryable application-level errors. (#25896 ) This PR adds support for specifying an exception allowlist (List[Exception]) as the retry_exceptions argument, such that an application-level exception will only be retried if it is in the allowlist.	2022-07-01 20:06:02 -07:00
Chen Shen	95fe3271ec	[Core][Doc] remove cython section from advanced doc. #26062 the example is removed.	2022-06-24 10:39:45 -07:00
Guyang Song	2934efe502	[runtime_env] move 'eager_install' to 'config' (#26004 )	2022-06-23 13:16:52 -05:00
Guyang Song	26ae3a0239	[Doc] [C++ API] Add note about ABI issue (#26030 )	2022-06-23 13:14:50 -05:00
sychen52	84401bb616	add missing brackets (#25992 )	2022-06-22 15:30:55 -05:00
sychen52	5c58d43df2	[docs][minor] Change one of the because to therefore. (#25921 )	2022-06-21 10:41:40 -05:00
matthewdeng	fe4185974a	[docs] fix swapped pattern docs (#25948 ) Content of the two docs were switched. Unnecessary Ray Get images were correctly in `unnecessary-ray-get.rst`, which made this noticeable beyond the URL.	2022-06-21 10:37:37 -05:00
Archit Kulkarni	b24c736bb8	[Doc] [runtime env] Add note that `excludes` paths are relative to working_dir (#25874 ) Users' intuition might lead them to fill out `excludes` with absolute paths, e.g. `/Users/working_dir/subdir/`. However, the `excludes` field uses `gitignore` syntax. In `gitignore` syntax, paths that start with `/` are interpreted relative to the level of the directory where the `gitignore` file resides, and in our case this is the `working_dir` directory (morally speaking, since there's no actual `.gitignore` file.) So the correct thing to put in `excludes` would be `/subdir/`. As long as we support `gitignore` syntax, we should have a note in the docs for this. This PR adds the note.	2022-06-17 10:50:04 -05:00
sychen52	edf16b8e2c	[docs] Edit the output of the script to match the code (#25855 )	2022-06-17 10:48:28 -05:00
sychen52	ce02ac0311	[docs] Fix example actor indentation (#25882 )	2022-06-16 22:06:21 -07:00
Jiao	f6735f90c7	[Ray DAG] Move `dag` project folder out of `experimental` (#25532 )	2022-06-16 19:15:39 -07:00
clarng	ef866d1e49	exclude doc_code from import sorting (#25772 ) Skip sorting the imports in doc_code.	2022-06-15 11:34:45 -07:00
sychen52	d5b8a1caab	[docs] actor is not created in driver1 (#25749 ) call .remote() after .option	2022-06-13 21:41:14 -07:00
Antoni Baum	182f604d32	[docs] Fix bad argument name in PTL docs (#25736 ) Fixes bad argument name in PTL docs. This is just a quick fix - we should be testing the code snippet.	2022-06-13 15:20:24 -07:00
Jiao	f8b0ab7e78	[Ray DAG] Add documentation in `more options` section (#25528 )	2022-06-12 09:47:20 -07:00
M Waleed Kadous	9e2e84bc1c	[docs] Add an example for simple highly parallelizable tasks. (#24885 ) It's important to show how Ray can be used for easily parallelizable independent tasks. I put this together to demonstrate how to do this.	2022-06-08 18:10:37 -07:00
Eric Liang	48acbf0d69	[hotfix] Revert "[runtime env] runtime env inheritance refactor (#24538 )" (#25487 ) This reverts commit `eb2692c`. This is a temporary mitigation for #25484	2022-06-05 14:55:38 -07:00
Stephanie Wang	473a962d89	[Datasets] [Docs] Add docs about fault tolerance in Datasets (#25371 ) Adds description of fault tolerance guarantees for Datasets. Related issue number Closes #24856.	2022-06-02 15:53:50 -07:00
Stephanie Wang	ab8785ca5c	Revert "Revert "[core] Support generators for tasks with multiple return values (#25247 )" (#25380 )" (#25383 ) Duplicate for #25247. Adds a fix for Dask-on-Ray. Previously, for tasks with multiple return values, we implicitly allowed returning a dict with the return index as the key. This was used by Dask-on-Ray, but this is not documented behavior, and we now require task returns to be iterable instead.	2022-06-02 10:50:11 -07:00
Yi Cheng	80168a09a6	Revert "[core] Support generators for tasks with multiple return values (#25247 )" (#25380 ) This reverts commit `1f9488724a`.	2022-06-01 15:31:59 -07:00
Stephanie Wang	961b875ab8	[core] Allow user to override global default for max_retries (#25189 ) This PR allows the user to override the global default for max_retries for non-actor tasks. It adds an OS env called RAY_task_max_retries which can be passed to the driver or set with runtime envs. Any future tasks submitted by that worker will default to this value instead of 3, the hard-coded default. It would be nicer if we could have a standard way of setting these defaults, but I think this is fine as a one-off for now (not a clear need for overriding defaults of other @ray.remote options yet). Related issue number Closes #24854.	2022-06-01 14:42:18 -07:00
Stephanie Wang	1f9488724a	[core] Support generators for tasks with multiple return values (#25247 ) Adds support for Python generators instead of just normal return functions when a task has multiple return values. This will allow developers to cut down on total memory usage for tasks, as they can free previous return values before allocating the next one on the heap. The semantics for num_returns are about the same as usual tasks - the function will throw an error if the number of values returned by the generator does not match the number of return values specified by the user. The one difference is that if num_returns=1, the task will throw the usual Python exception that the generator cannot be pickled. As an example, this feature will allow us to reduce memory usage in Datasets shuffle operations (see #25200 for a prototype).	2022-06-01 13:30:52 -07:00
Eric Liang	905258dbc1	Clean up docstyle in python modules and add LINT rule (#25272 )	2022-06-01 11:27:54 -07:00
Sven Mika	18c03f8d93	[RLlib] A2C + A3C move to `algorithms` folder and re-name into A2C/A3C (from ...Trainer). (#25314 )	2022-06-01 09:29:16 +02:00
Edward Oakes	4ad55f640d	[runtime_env] Clarify in docs that python and ray versions must match cluster (#25245 ) Follow up from a few users who were confused by this.	2022-05-27 14:24:48 -05:00
Guyang Song	1bc91a4129	[doc] Add info about eager_install to runtime_env FAQ (#25008 )	2022-05-23 10:26:57 -05:00
Guyang Song	c6edfdd2a0	[script] expose options of xxx_port in 'ray start' command (#24919 )	2022-05-23 17:18:09 +08:00
Guyang Song	99d25d4d4e	[Doc] Fix ray core doc (#25006 )	2022-05-20 14:51:59 +08:00
Guyang Song	eb2692cb32	[runtime env] runtime env inheritance refactor (#24538 ) * [runtime env] runtime env inheritance refactor (#22244) Runtime Environments is already GA in Ray 1.6.0. The latest doc is [here](https://docs.ray.io/en/master/ray-core/handling-dependencies.html#runtime-environments). We already support an [inheritance](https://docs.ray.io/en/master/ray-core/handling-dependencies.html#inheritance) behavior as follows (copied from the doc): - The runtime_env["env_vars"] field will be merged with the runtime_env["env_vars"] field of the parent. This allows for environment variables set in the parent’s runtime environment to be automatically propagated to the child, even if new environment variables are set in the child’s runtime environment. - Every other field in the runtime_env will be overridden by the child, not merged. For example, if runtime_env["py_modules"] is specified, it will replace the runtime_env["py_modules"] field of the parent. We think this runtime env merging logic is too complex and confusing because users can't know the final runtime env before the jobs are run. This PR refactors and changes the behavior of Runtime Environments inheritance. Here is the new behavior: - If no runtime env option is given when an actor is created, inherit the parent runtime env. - Otherwise, use the given runtime env directly and don't do any merging. It also adds a new API named `ray.runtime_env.get_current_runtime_env()` to get the parent runtime env, so that you can modify this dict yourself. Like: ```Actor.options(runtime_env={**ray.runtime_env.get_current_runtime_env(), "X": "Y"})``` This new API can also be used in ray client.	2022-05-20 10:53:54 +08:00
Qing Wang	af418fb729	[Java][API CHANGE] Move exception to api module. (#24540 ) This PR moves all exception classes from runtime module to api module. It's aiming to eliminate the confusion about ray exceptions. It means that Ray users don't need to touch runtime module when API programming after this PR. Note that this should be merged onto 2.0.	2022-05-19 10:18:20 +08:00
SangBin Cho	f228245520	[Placement group] Update the old placement group API usage to the new scheduling_strategy based API (#24544 ) Documentation should use the new API, not the old one that will be deprecated	2022-05-18 09:41:51 -07:00
Max Pumperla	3ffcb81bd3	[docs] remove non-functional lbfgs example (#24727 ) This example simply doesn't run as is. We can bring it back up again later, if it makes sense. But it's not clear what the variables used there, like actor are. Fixes #21328 Signed-off-by: Max Pumperla <max.pumperla@googlemail.com>	2022-05-18 10:53:14 +01:00
Simon Mo	c3ac6fcf3f	Bump Ray Version from 2.0.0.dev0 to 3.0.0.dev0 (#24894 )	2022-05-17 19:31:05 -07:00
Eric Liang	437df9431c	[docs] Remove bad suggestions to use local_mode or num_cpus in init (#24827 )	2022-05-17 12:55:04 -07:00
Ofey Chan	c6c72a6f89	[Doc] [Core] Enhance actor queue doc code (#24532 ) Why are these changes needed? The current documentation code in "Message passing using Ray Queue" can be enhanced to better demonstrate the message queue. It creates 10 tasks but only 2 consumers, and each consumer consumes one task and then exits. Therefore, the output is a bit vague: (consumer pid=1022727) got work 0 (consumer pid=1022595) got work 1 So I made each consumer keep working until the queue is empty. The output now shows consumers 0 and 1 working in parallel: (consumer pid=1030876) consumer 0 got work 0 (consumer pid=1030876) consumer 0 got work 1 (consumer pid=1030876) consumer 0 got work 3 (consumer pid=1030876) consumer 0 got work 5 (consumer pid=1030876) consumer 0 got work 7 (consumer pid=1030876) consumer 0 got work 9 (consumer pid=1030949) consumer 1 got work 2 (consumer pid=1030949) consumer 1 got work 4 (consumer pid=1030949) consumer 1 got work 6 (consumer pid=1030949) consumer 1 got work 8 P.S. Also fixes a typo in the doc.	2022-05-15 17:38:21 -07:00
Archit Kulkarni	738da639d9	[runtime env] Add FAQ for runtime_env (#24412 ) Adds some frequently asked user questions to the docs. Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>	2022-05-13 11:03:58 -05:00
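
The ray.init() resolution order described in commit 55a0f7bb2d (#26678) can be sketched roughly as follows. This is a plain-Python model of the steps in that commit message, not Ray's actual implementation: `resolve_address`, its `autodetect` parameter, and the `(action, address)` return value are hypothetical names introduced here for illustration. Only the file path `/tmp/ray/ray_current_cluster` and the special values `None`, `"auto"`, and `"local"` come from the message.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple

# File path quoted in the commit message for #26678.
CLUSTER_FILE = Path("/tmp/ray/ray_current_cluster")


def resolve_address(
    address: Optional[str],
    cluster_file: Path = CLUSTER_FILE,
    autodetect: Callable[[], Optional[str]] = lambda: None,
) -> Tuple[str, Optional[str]]:
    """Model of the resolution order; returns (action, address).

    Hypothetical helper for illustration only, not part of Ray.
    """
    if address == "local":
        # Always start a fresh local instance, even if a cluster exists.
        return ("start_local", None)
    if address not in (None, "auto"):
        # An explicit address is used as given.
        return ("connect", address)
    # address is None or "auto": first check the file written by `ray start`.
    recorded = cluster_file.read_text().strip() if cluster_file.exists() else ""
    if recorded:
        return ("connect", recorded)
    if address == "auto":
        # Legacy behavior: look for any running cluster. The commit message
        # does not say what happens when none is found, so that case is left
        # to the (hypothetical) autodetect callable.
        return ("connect", autodetect())
    # address is None and no cluster was recorded: start a new local instance.
    return ("start_local", None)
```

Under these assumptions, `resolve_address(None)` returns `("start_local", None)` on a machine where `ray start` has not been run, and `resolve_address("local")` does so regardless of what the file contains.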

118 commits