hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 18:11:42 -05:00

Author	SHA1	Message	Date
xwjiang2010	ff2b728e9a	[air] add tuner user guide (#26837 ) Co-authored-by: Kai Fricke <kai@anyscale.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2022-08-03 09:43:42 -07:00
Archit Kulkarni	a12c04a2fe	[Serve] [Doc] Update key concepts for 2.0, remove deprecated APIs (#26965 ) Removes deprecated APIs: - serve.start() - get_handle() Rewrites the ServeHandle doc snippet to use the recommended workflow for ServeHandles (only access them from other deployments, pass Deployments in as input args to `.bind()`, which get resolved to ServeHandles at runtime) Co-authored-by: shrekris-anyscale <92341594+shrekris-anyscale@users.noreply.github.com>	2022-08-03 11:27:23 -05:00
Jiajun Yao	8b7e4ac701	[Doc] Test ray core doc code (#27334 ) - Currently not all code under ray-core/doc_code is covered by CI. - tf_example.py and torch_example.py are not used anywhere. Signed-off-by: Jiajun Yao <jeromeyjj@gmail.com>	2022-08-02 20:51:47 -07:00
Dmitri Gekhtman	4d87e8112a	[docs][kubernetes] GPU user guide (#27360 ) Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com> This PR adds a page of guidance on GPU deployment with Ray/K8s. This page is a modified and slightly expanded version of the existing page https://docs.ray.io/en/latest/cluster/kubernetes-gpu.html. The PR also moves managed K8s service intro links to their own page.	2022-08-02 15:58:23 -07:00
Avnish Narayan	00f9438101	[RLlib] Training step docs. (#27344 )	2022-08-02 23:41:45 +02:00
Archit Kulkarni	e02b072939	[Doc] [Serve] Edit grammar/usage/organization for HTTP adapters page (#26969 ) Moves FastAPI into its own section instead of appearing in a duplicated note. Co-authored-by: simon-mo <simon.mo@hey.com>	2022-08-02 15:08:05 -05:00
Richard Liaw	c8561071f3	[air/train/docs] gbdt trainer user guide (#27362 ) Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-08-02 13:02:42 -07:00
clarng	84674fa868	[docs] ray core namespace docs: edit pass & move python code into doc_code dir (#27341 )	2022-08-02 12:52:30 -07:00
clarng	34385b8136	[docs] ray core cross-lang docs: edit pass & move python code into doc_code dir (#27350 ) Edit pass. Move code into doc_code dir. Code in doc_code is verified by CI	2022-08-02 12:50:05 -07:00
Jiajun Yao	cd2e590567	Support placement_group=None in PlacementGroupSchedulingStrategy (#27370 ) We decided to allow escaping the parent pg via `PlacementGroupSchedulingStrategy(placement_group=None)` instead of using "DEFAULT". Our doc is updated with that but in the code it's still not allowed.	2022-08-02 12:49:41 -07:00
Ricky Xu	82a24f9319	[Doc][Core][State Observability] Adding Python SDK doc and docstring (#26997 ) 1. Add doc for python SDK and docstrings on public SDK 2. Rename list -> ray_list and get -> ray_get for better naming 3. Fix some typos 4. Auto translate address to api server url. Co-authored-by: SangBin Cho <rkooo567@gmail.com>	2022-08-02 11:24:59 -05:00
xwjiang2010	36cf1baa82	[air doc] checkpoint_freq --> checkpoint_frequency (#27325 ) Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com>	2022-08-02 11:34:10 +01:00
Jules S. Damji	4045ba4841	[DOC Ray AIR] minor editorial tweaks for clarity and usage (#27128 ) Co-authored-by: Jules Damji <jules@anyscale.com>	2022-08-01 21:09:04 -07:00
Dmitri Gekhtman	6efca71c35	[docs][kubernetes] XGBoost ML example (#27313 ) Adds a guide on running an XGBoost-Ray workload using KubeRay. Signed-off-by: Dmitri Gekhtman <dmitri.m.gekhtman@gmail.com>	2022-08-01 19:30:41 -07:00
shrekris-anyscale	324d8e4bca	[Serve] Serialize `user_config` with JSON instead of Pickle (#26235 )	2022-08-01 17:53:43 -07:00
Eric Liang	f7ae8923f6	[docs] Reorganize the tensor data support docs; general editing (#26952 ) Why are these changes needed? Editing pass over the tensor support docs for clarity: make heavy use of tabbed guides to condense the content; rewrite examples to be more organized around creating vs reading tensors; use doc_code for testing.	2022-08-01 17:31:41 -07:00
clarng	fffcae1cb4	[docs] ray core dag docs: edit pass & move code into separate dir (#27318 )	2022-08-01 17:05:36 -07:00
shrekris-anyscale	cc84953da3	[Serve] [Docs] Update "Getting Started" documentation (#26745 )	2022-08-01 16:31:48 -07:00
matthewdeng	fedfaddb3f	[docs] add k8s docs to toc (#27310 )	2022-07-30 15:26:30 -07:00
clarng	a61478fb73	import style (#25755 )	2022-07-30 09:43:09 -07:00
Dmitri Gekhtman	059895ab5b	[docs][kubernetes] Shift docs into new structure (#27239 ) This PR shifts KubeRay docs into the structure introduced in #27036. There are no content changes.	2022-07-29 14:19:51 -07:00
Siyuan (Ryans) Zhuang	1bcd3e41d1	[Workflow] Cleanup workflow docs (#27197 ) * cleanup workflow docs Signed-off-by: Siyuan Zhuang <suquark@gmail.com>	2022-07-29 13:03:50 -07:00
Kai Fricke	1f097e9d12	[tune/docs] Update custom syncer example (#27252 ) There is a small bug in the docs example for custom command based syncers. This PR fixes it and adds a test covering the change. Signed-off-by: Kai Fricke <kai@anyscale.com>	2022-07-29 16:09:19 +01:00
xwjiang2010	d331489a9d	[ air ] clean up some more `tune.run` (#27117 ) More replacements of tune.run() in examples/docstrings for Tuner.fit() Signed-off-by: xwjiang2010 <xwjiang2010@gmail.com> Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-07-29 10:43:45 +01:00
Jimmy Yao	749d313dcd	hot fix ray lightning (#27235 )	2022-07-28 22:41:28 -07:00
Cade Daniel	0374637e53	Adding --keep-going flag to sphinx-build so all lint failures are listed in CI (#27068 ) This PR adds --keep-going flag to the make html target for building the Ray docs. This means that when there is a lint failure in CI, the BuildKite log will show all lint failures instead of just the first one. Despite continuing past the first lint error, it will still fail the build. Signed-off-by: Cade Daniel <cade@anyscale.com>	2022-07-28 16:24:27 -07:00
Jimmy Yao	73e1632599	Hot fix again ray lightning docs (#27229 )	2022-07-28 16:19:30 -07:00
Clark Zinzow	df124d0ad5	[AIR - Datasets] Hide tensor extension from UDFs. (#27019 ) We previously added automatic tensor extension casting on Datasets transformation outputs to allow the user to not have to worry about tensor column casting; however, this current state creates several issues: 1. Not all tensors are supported, which means that we’ll need to have an opaque object dtype (i.e. ndarray of ndarray pointers) fallback for the Pandas-only case. Known unsupported tensor use cases: a. Heterogeneous-shaped (i.e. ragged) tensors b. Struct arrays 2. UDFs will expect a NumPy column and won’t know what to do with our TensorArray type. E.g., torchvision transforms don’t respect the array protocol (which they should), and instead only support Torch tensors and NumPy ndarrays; passing a TensorArray column or a TensorArrayElement (a single item in the TensorArray column) fails. 3. Implicit casting with object dtype fallback on UDF outputs can make the input type to downstream UDFs nondeterministic, where the user won’t know if they’ll get a TensorArray column or an object dtype column. 4. The tensor extension cast fallback warning spams the logs. This PR: 1. Adds automatic casting of tensor extension columns to NumPy ndarray columns for Datasets UDF inputs, meaning the UDFs will never have to see tensor extensions and that the UDF input column types will be consistent and deterministic; this fixes both (2) and (3). 2. No longer implicitly falls back to an opaque object dtype when TensorArray casting fails (e.g. for ragged tensors), and instead raises an error; this fixes (4) but removes our support for (1). 3. Adds a global enable_tensor_extension_casting config flag, which is True by default, that controls whether we perform this automatic casting. Turning off the implicit casting provides a path for (1), where the tensor extension can be avoided if working with ragged tensors in Pandas land. 
Turning off this flag also allows the user to explicitly control their tensor extension casting, if they want to work with it in their UDFs in order to reap the benefits of fewer data copies, more efficient slicing, stronger column typing, etc.	2022-07-28 10:37:45 -07:00
shrekris-anyscale	510a0e038c	[Serve] Add `host` and `port` options to the Serve config file (#27026 ) The Serve CLI and REST API always set the host to `0.0.0.0` and the port to Serve's default. This change adds `host` and `port` as top-level options in the Serve config file, so users can manually set the host and port of their Serve application to different values. This change introduces a new Serve config file format: ```yaml import_path: ... runtime_env: ... host: ... port: ... deployments: ... ... ``` `host` and `port` are optional and can be omitted. A running Serve application's `host` and `port` cannot be changed. If a user tries to `serve deploy` a config file with different `host` and `port` options than an already-running Serve application, `serve deploy` will fail without making any changes to the application. The user must `serve shutdown` their application and restart it with `serve deploy` to change their `host` and `port`. Follow-Up Items * The following CLI commands should not start Serve automatically. They should check whether Serve is running and perform some sort of no-op if it's not. That would alleviate the concern that the user starts Serve by accident through a `GET` request and needs to deal with default `host`/`port` options. Corresponding docs should also be updated. * `serve status` * `serve config` * `serve shutdown`	2022-07-28 11:26:46 -05:00
Jiao	0dbb18a87d	[AIR][Data] Fix nyc_taxi_basic_processing notebook (#26983 )	2022-07-27 21:37:04 -07:00
Cade Daniel	db26c779a0	[Ray clusters] [docs] Copying all Ray Clusters doc content to new structure (#27062 )	2022-07-27 14:22:44 -07:00
xwjiang2010	eb69c1ca28	[air] Add annotation for Tune module. (#27060 ) Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-07-27 13:53:46 -07:00
Kai Fricke	3924a4b7cc	[air/train] Rename BaseWorkerMixin, only log info torch loop for rank 0 (#27098 ) This PR - only prints train_loop info strings (e.g. `train_loop_utils.py:298 -- Moving model to device: cpu`) for rank 0 workers for torch - renames `BaseWorkerMixin` to `RayTrainWorker` as the name comes up often in output and is more meaningful Signed-off-by: Kai Fricke <kai@anyscale.com>	2022-07-27 20:11:59 +01:00
matthewdeng	113c4d7fab	[air][data] move train_test_split to ray.data.Dataset (#27065 )	2022-07-27 09:53:37 -07:00
Simon Mo	e5a8b1dd55	[Serve] Add API Annotations And Move to _private (#27058 )	2022-07-27 09:08:26 -07:00
Amog Kamsetty	862d10c162	[AIR] Remove ML code from `ray.util` (#27005 ) Removes all ML related code from `ray.util` Removes: - `ray.util.xgboost` - `ray.util.lightgbm` - `ray.util.horovod` - `ray.util.ray_lightning` Moves `ray.util.ml_utils` to other locations Closes #23900 Signed-off-by: Amog Kamsetty <amogkamsetty@yahoo.com> Signed-off-by: Kai Fricke <kai@anyscale.com> Co-authored-by: Kai Fricke <kai@anyscale.com>	2022-07-27 14:24:19 +01:00
Cade Daniel	7a817ad364	Moving Ray Clusters restructuring section to be subpage under existing Ray Clusters. (#27036 ) This PR puts the Ray Clusters (under construction) docs section (see #26754) under Ray Clusters as a subpage. This makes the master branch docs clean and presentable for users. Ray Clusters doc writers can use existing CI to iterate on the docs, without having a massive PR once we're done. Signed-off-by: Cade Daniel <cade@anyscale.com>	2022-07-26 15:52:06 -07:00
Balaji Veeramani	89f7f2a567	[Datasets] Add `size` parameter to `ImageFolderDatasource` (#26975 ) If you read a folder with differently-sized images, `ImageFolderDatasource` errors. This PR fixes the issue by resizing images to a user-specified size.	2022-07-26 14:57:38 -07:00
Rohan Potdar	deccf33912	[RLlib]: Add Off-Policy Estimation docs (#26809 ) Co-authored-by: Kourosh Hakhamaneshi <kourosh@anyscale.com>	2022-07-26 13:57:56 -07:00
Cade Daniel	0427add12b	Adding Ray Clusters (Under Construction) doc section with new structure (#26754 ) This PR creates a new chapter in the docs titled "Ray Clusters (Under Construction)". The new chapter makes the Ray Clusters docs follow the same structure as the other docs (https://diataxis.fr/). The new chapter will eventually replace the old chapter. I want to merge this now so that @DmitriGekhtman can put his Kuberay docs into the new structure. Signed-off-by: Cade Daniel <cade@anyscale.com>	2022-07-26 12:00:20 -07:00
Siyuan (Ryans) Zhuang	e1db8fb382	[Workflow] Workflow client integration (#26702 ) ## Why are these changes needed? This PR ensures that workflow can work properly with Ray client. Regular workflow tests will (also) be running under client mode (as a pytest parameter). Some tests are moved and reorganized, because the Ray client tests require starting the cluster, so some tests require isolation or related changes. Tests that literally take down the cluster are not tested with Ray client, since Ray client would fail in this scenario. Limitations of Ray Workflow under Ray client are noted in the doc. ## Related issue number Closes #21595	2022-07-26 11:15:47 -07:00
Balaji Veeramani	8bc836d9fb	[AIR] Remove `CustomStatefulPreprocessor` (#26981 )	2022-07-26 10:10:57 -07:00
Balaji Veeramani	55988992b9	[AIR] Rename `limit` parameter as `max_categories` (#26977 )	2022-07-26 10:10:40 -07:00
SangBin Cho	39b9c44c8d	[State Observability] pre-alpha documentation (#26560 ) Adds Documentation for state APIs API reference	2022-07-26 05:49:28 -07:00
Dmitri Gekhtman	a70ada7341	[kubernetes][docs] Implement landing page and getting started guide (#26912 ) Implements a landing page for the new KubeRay-based deployment guide. Implements a "Getting started" Jupyter notebook	2022-07-26 00:41:56 -07:00
Archit Kulkarni	084f06f49a	[Doc] [Job submission] [Dashboard] Add tip for long runtime_env installation and improve error (#26911 ) # Why are these changes needed? The dashboard can display the message "<actor> cannot be created because the Ray cluster cannot satisfy its resource requirements" in the case where the runtime env setup is stalled. This PR updates this message to include the possibility of the runtime env setup failing. This PR adds a tip to the Job Submission doc saying that if a job is stalled in PENDING, the runtime env setup may have stalled. It adds a pointer to the log files which should have more information. The runtime env cannot stall forever; it fails after 10 minutes. This is a new feature added after the Ray 1.13 branch cut. In Ray <= 1.13, the runtime env can still stall forever. # Related issue number Closes #26332	2022-07-25 23:32:27 -07:00
Sihan Wang	8ecd928c34	[Serve] Make the checkpoint and recover only from GCS (#26753 )	2022-07-25 14:24:53 -07:00
Jules S. Damji	193e824bc1	[AIR DOC] minor tweaks to checkpoint user guide for clarity and consistency subheadings (#26937 ) Co-authored-by: Jules Damji <jules@anyscale.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2022-07-25 14:21:29 -07:00
Jiao	5315f1e643	[AIR] Enable other notebooks previously marked with # REGRESSION (#26896 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2022-07-25 13:40:21 -07:00
matthewdeng	df638b3f0f	[Datasets] Automatically cast tensor columns when building Pandas blocks. (#26924 ) This PR just applies the changes from the following PRs: [Datasets] Automatically cast tensor columns when building Pandas blocks. #26684 (reverted by Revert "[Datasets] Automatically cast tensor columns when building Pandas blocks." #26921); [AIR - Datasets] Fix TensorDtype construction from string and fix example. #26904. This fixes the test failures introduced in the originally reverted PRs.	2022-07-25 12:12:10 -07:00
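
The `host`/`port` rule described in the #27026 entry above (both keys optional, and a deploy rejected when they differ from the running application) can be sketched roughly as follows. This is a minimal illustration and not Ray Serve's code: `HttpOptions`, `options_from_config`, and `check_deploy` are hypothetical names, and the default port `8000` is an assumption standing in for "Serve's default".

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HOST = "0.0.0.0"  # host named in the commit message
DEFAULT_PORT = 8000       # assumption: placeholder for "Serve's default" port


@dataclass(frozen=True)
class HttpOptions:
    """Hypothetical holder for the two top-level config options."""
    host: str = DEFAULT_HOST
    port: int = DEFAULT_PORT


def options_from_config(config: dict) -> HttpOptions:
    """Read the optional `host` and `port` keys from a parsed config dict."""
    return HttpOptions(
        host=config.get("host", DEFAULT_HOST),
        port=config.get("port", DEFAULT_PORT),
    )


def check_deploy(running: Optional[HttpOptions], config: dict) -> HttpOptions:
    """Return the options to use, or raise if they conflict with a running app.

    `running` is None when no application is running yet.
    """
    requested = options_from_config(config)
    if running is not None and requested != running:
        # Mirrors the described behavior: fail without changing anything.
        raise ValueError(
            f"host/port {requested.host}:{requested.port} differ from the "
            f"running application's {running.host}:{running.port}; "
            "shut the application down and redeploy to change them"
        )
    return requested
```

In this sketch a first deploy with no `host` or `port` falls back to the defaults, a redeploy with the same values passes, and a redeploy with a different port raises before any state is touched.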

2576 commits