hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 18:11:42 -05:00

Cheng Su 7c7828f818 [Datasets] Improve size estimation of image folder data source (#27219)		2022-08-12 11:26:03 -07:00

This PR improves the in-memory data size estimation of the image folder data source. Before this PR, we used the on-disk file size as the estimate of the in-memory data size. This can be inaccurate due to image compression and in-memory image resizing.

Because `size` and `mode` were made optional in https://github.com/ray-project/ray/pull/27295, this PR tackles only the simple case where `size` and `mode` are both provided:

* `size` and `mode` are provided: calculate the in-memory size from the dimensions, with no need to read any image (this PR).
* `size` or `mode` is not provided: sampling is needed to determine the in-memory size (will be done in a follow-up PR).

Here is an example of the estimated size for our test image folder:

```
>>> import ray
>>> from ray.data.datasource.image_folder_datasource import ImageFolderDatasource
>>> root = "example://image-folders/different-sizes"
>>> ds = ray.data.read_datasource(ImageFolderDatasource(), root=root, size=(64, 64), mode="RGB")
>>> ds.size_bytes()
40310
>>> ds.fully_executed().size_bytes()
37428
```

Without this PR:

```
>>> ds.size_bytes()
18978
```
.buildkite	Revert "[serve] Integrate and Document Bring-Your-Own Gradio Applications (#26403 )" (#27587 )	2022-08-06 21:38:55 -07:00
.github	[docs] Add codeowners for subdirectories (#27569 )	2022-08-05 11:37:15 -07:00
.gitpod	[CI] Check test files for `if __name__...` snippet (#25322 )	2022-06-02 10:30:00 +01:00
bazel	[runtime env] plugin refactor [7/n]: support runtime env in C++ API (#27010 )	2022-07-27 18:24:31 +08:00
binder	run code in browser (#22727 )	2022-03-02 10:27:00 +01:00
ci	[serve] Make serve agent not blocking when GCS is down. (#27526 )	2022-08-08 16:29:42 -07:00
cpp	[Core] Unrevert "Add retry exception allowlist for user-defined filtering of retryable application-level errors." (#26449 )	2022-08-05 16:07:13 -07:00
dashboard	[Dashboard] Fix edge cases for log file names in the dashboard log viewer (#27772 )	2022-08-12 09:39:54 -07:00
deploy	[K8s/Autoscaler] Added a field for the service account name (#27004 )	2022-07-26 19:47:18 -07:00
doc	[Serve] [Docs] Revise "Serving Ray AIR Checkpoints" Header (#27824 )	2022-08-12 11:18:33 -07:00
docker	[Docker] Add Cuda 11.6 support (#26695 )	2022-07-26 10:15:53 -07:00
java	[Serve] Java documentation (#26321 )	2022-08-12 09:07:12 -07:00
python	[Datasets] Improve size estimation of image folder data source (#27219 )	2022-08-12 11:26:03 -07:00
release	Spread the actors in data ingest benchmark, which 2x the throughput (#27620 )	2022-08-11 11:47:54 -07:00
rllib	[RLlib] Fix `get_init_state` annotation in torch and define more specific `TensorType`. (#27791 )	2022-08-11 20:02:17 +02:00
scripts	[CI] Add bazel py_test checking for Serve (#25509 )	2022-06-07 10:54:10 -07:00
src	Fix out-of-band deserialization of actor handle (#27700 )	2022-08-09 14:25:14 -07:00
thirdparty	Revert "Revert "[grpc] Upgrade grpc to 1.45.2"" (#24201 )	2022-04-26 10:49:54 -07:00
.bazelrc	[runtime env] plugin refactor[6/n]: java api refactor (#26783 )	2022-07-26 09:00:57 +08:00
.clang-format	[Lint] One parameter/argument per line for C++ code (#22725 )	2022-03-13 17:05:44 +08:00
.clang-tidy	[Lint] Disable `modernize-use-override` (#19368 )	2021-10-13 20:20:08 -07:00
.editorconfig	Improve .editorconfig entries (#7344 )	2020-02-26 19:05:36 -08:00
.flake8	[Streaming]Farewell : remove all of streaming related from ray repo. (#21770 )	2022-01-23 17:53:41 +08:00
.git-blame-ignore-revs	Create `.git-blame-ignore-revs` for black formatting (#25118 )	2022-05-23 21:55:57 -07:00
.gitignore	[Core] Unrevert "Add retry exception allowlist for user-defined filtering of retryable application-level errors." (#26449 )	2022-08-05 16:07:13 -07:00
.gitpod.yml	[dev] Enable gitpod (#15420 )	2021-04-21 13:26:46 -07:00
.isort.cfg	Update import sorting blacklist, enable sorting for experimental dir (#26101 )	2022-07-12 21:25:58 -07:00
build-docker.sh	Bump Ray Version from 2.0.0.dev0 to 3.0.0.dev0 (#24894 )	2022-05-17 19:31:05 -07:00
BUILD.bazel	Replace boost::filesystem with std::filesystem (#27522 )	2022-08-04 21:33:51 -07:00
build.sh	Get rid of build shell scripts and move them to Python (#6082 )	2020-07-16 11:26:47 -05:00
CONTRIBUTING.rst	Link to the documentation on contributing from CONTRIBUTING.rst (#19396 )	2021-11-15 15:34:18 -08:00
LICENSE	[State Observability] Use a table format by default (#26159 )	2022-07-19 00:54:16 -07:00
pylintrc	RLLIB and pylintrc (#8995 )	2020-06-17 18:14:25 +02:00
README.rst	[docs] Minor polish on AIR getting started page (#27696 )	2022-08-09 11:24:18 -07:00
SECURITY.md	Create SECURITY.md (#21521 )	2022-01-11 08:54:51 -08:00
setup_hooks.sh	[ci] Clean up ci/ directory (refactor ci/travis) (#23866 )	2022-04-13 18:11:30 +01:00
WORKSPACE	[CI] Bump Bazel version to 4.2.2 (#24242 )	2022-05-26 17:09:40 -07:00

README.rst

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png

.. image:: https://readthedocs.org/projects/ray/badge/?version=master
    :target: http://docs.ray.io/en/master/?badge=master

.. image:: https://img.shields.io/badge/Ray-Join%20Slack-blue
    :target: https://forms.gle/9TSdDYUgxYs8SA9e8

.. image:: https://img.shields.io/badge/Discuss-Ask%20Questions-blue
    :target: https://discuss.ray.io/

.. image:: https://img.shields.io/twitter/follow/raydistributed.svg?style=social&logo=twitter
    :target: https://twitter.com/raydistributed

|

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a toolkit of libraries (Ray AIR) for simplifying ML compute:

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/what-is-ray-padded.svg

..
  https://docs.google.com/drawings/d/1Pl8aCYOsZCo61cmp57c7Sja6HhIygGCvSZLi_AuBuqo/edit

Learn more about `Ray AIR`_ and its libraries:

- `Datasets`_: Distributed Data Preprocessing
- `Train`_: Distributed Training
- `Tune`_: Scalable Hyperparameter Tuning
- `RLlib`_: Scalable Reinforcement Learning
- `Serve`_: Scalable and Programmable Serving

Or learn more about `Ray Core`_ and its key abstractions:

- `Tasks`_: Stateless functions executed in the cluster.
- `Actors`_: Stateful worker processes created in the cluster.
- `Objects`_: Immutable values accessible across the cluster.

Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing
`ecosystem of community integrations`_.

Install Ray with: ``pip install ray``. For nightly wheels, see the
`Installation page <https://docs.ray.io/en/latest/installation.html>`__.

.. _`Serve`: https://docs.ray.io/en/latest/serve/index.html
.. _`Datasets`: https://docs.ray.io/en/latest/data/dataset.html
.. _`Workflow`: https://docs.ray.io/en/latest/workflows/concepts.html
.. _`Train`: https://docs.ray.io/en/latest/train/train.html
.. _`Tune`: https://docs.ray.io/en/latest/tune/index.html
.. _`RLlib`: https://docs.ray.io/en/latest/rllib/index.html
.. _`ecosystem of community integrations`: https://docs.ray.io/en/latest/ray-overview/ray-libraries.html


Why Ray?
--------

Today's ML workloads are increasingly compute-intensive. As convenient as they are, single-node development environments such as your laptop cannot scale to meet these demands.

Ray is a unified way to scale Python and AI applications from a laptop to a cluster.

With Ray, you can seamlessly scale the same code from a laptop to a cluster. Ray is designed to be general-purpose, meaning that it can run any kind of workload efficiently. If your application is written in Python, you can scale it with Ray; no other infrastructure is required.

More Information
----------------

- `Documentation`_
- `Ray Architecture whitepaper`_
- `Exoshuffle: large-scale data shuffle in Ray`_
- `Ownership: a distributed futures system for fine-grained tasks`_
- `RLlib paper`_
- `Tune paper`_

*Older documents:*

- `Ray paper`_
- `Ray HotOS paper`_

.. _`Ray AIR`: https://docs.ray.io/en/latest/ray-air/getting-started.html
.. _`Ray Core`: https://docs.ray.io/en/latest/ray-core/walkthrough.html
.. _`Tasks`: https://docs.ray.io/en/latest/ray-core/tasks.html
.. _`Actors`: https://docs.ray.io/en/latest/ray-core/actors.html
.. _`Objects`: https://docs.ray.io/en/latest/ray-core/objects.html
.. _`Documentation`: http://docs.ray.io/en/latest/index.html
.. _`Ray Architecture whitepaper`: https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview
.. _`Exoshuffle: large-scale data shuffle in Ray`: https://arxiv.org/abs/2203.05072
.. _`Ownership: a distributed futures system for fine-grained tasks`: https://www.usenix.org/system/files/nsdi21-wang.pdf
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
.. _`Ray HotOS paper`: https://arxiv.org/abs/1703.03924
.. _`RLlib paper`: https://arxiv.org/abs/1712.09381
.. _`Tune paper`: https://arxiv.org/abs/1807.05118

Getting Involved
----------------

.. list-table::
   :widths: 25 50 25 25
   :header-rows: 1

   * - Platform
     - Purpose
     - Estimated Response Time
     - Support Level
   * - `Discourse Forum`_
     - For discussions about development and questions about usage.
     - < 1 day
     - Community
   * - `GitHub Issues`_
     - For reporting bugs and filing feature requests.
     - < 2 days
     - Ray OSS Team
   * - `Slack`_
     - For collaborating with other Ray users.
     - < 2 days
     - Community
   * - `StackOverflow`_
     - For asking questions about how to use Ray.
     - 3-5 days
     - Community
   * - `Meetup Group`_
     - For learning about Ray projects and best practices.
     - Monthly
     - Ray DevRel
   * - `Twitter`_
     - For staying up-to-date on new features.
     - Daily
     - Ray DevRel

.. _`Discourse Forum`: https://discuss.ray.io/
.. _`GitHub Issues`: https://github.com/ray-project/ray/issues
.. _`StackOverflow`: https://stackoverflow.com/questions/tagged/ray
.. _`Meetup Group`: https://www.meetup.com/Bay-Area-Ray-Meetup/
.. _`Twitter`: https://twitter.com/raydistributed
.. _`Slack`: https://forms.gle/9TSdDYUgxYs8SA9e8