---
layout: post
title: "Ray: 0.4 Release"
excerpt: "This post announces the release of Ray 0.4."
date: 2018-03-27 14:00:00
---

We are pleased to announce the 0.4 release of [Ray][1]. This release introduces improvements to Ray's scheduling, substantial backend improvements, and the start of [Pandas on Ray][2], as well as many improvements to [RLlib][3] and [Tune][4] (you can read more about the improvements to RLlib in [this blog post][5]).

To upgrade to the latest version, run

```
pip install -U ray
```

## Scheduling

This release includes two major changes to scheduling behavior: spillback scheduling and support for custom resources.

### Spillback Scheduling

Ray takes a bottom-up hierarchical approach to scheduling in which scheduling decisions are made by local schedulers on each machine (in order to avoid a centralized scheduling bottleneck). As a consequence, scheduling decisions are often made with a slightly stale view of the system state, so race conditions can occur and too many tasks can be assigned to a single machine. For example, suppose there are 100 GPUs scattered around the cluster and workers on the various machines submit a total of exactly 100 GPU tasks. If these tasks take a while to execute, the desired behavior is for one task to be assigned to each GPU. However, without a single scheduling bottleneck like a centralized scheduler, too many tasks may be assigned to a single machine, resulting in delays.

Spillback scheduling provides a mechanism for correcting bad scheduling decisions. At a high level, if a local scheduler decides that it does not want to execute a task, it can spill the task back to the global scheduler (or, in principle, to another local scheduler). This mechanism allows us to achieve perfect load balancing along with high task throughput (a toy sketch of the spillback logic appears at the end of this post).

### Custom Resources

Remote functions and actors now support scheduling with [arbitrary custom resource requirements][6]. We can specify that a remote function requires 1 CPU, 2 GPUs, and 3 units of some custom resource with syntax like the following.

```python
@ray.remote(num_cpus=1, num_gpus=2, resources={'Custom': 3})
def f():
    pass
```

To tell Ray that a node has 6 units of the custom resource, start Ray on that machine with a command like the following.

```
ray start ... --resources='{"Custom": 6}'
```

Custom resources can be used for a variety of purposes. For example, they can be used to do bookkeeping for a concrete resource like memory, to indicate that a particular dataset lives on a particular machine, or to give machines certain roles (such as a "parameter server" machine or a "worker" machine). A minimal end-to-end sketch also appears at the end of this post.

## Libraries

This release also includes the start of [Pandas on Ray][2], a project aimed at speeding up [Pandas][7] DataFrames. It also includes substantial improvements to [RLlib][3], such as high-quality implementations of algorithms like [Ape-X][8], and substantial improvements to [Tune][4], such as implementations of state-of-the-art algorithms like [Population Based Training][9].
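To give a feel for Pandas on Ray, the sketch below changes a single import and leaves the rest of the Pandas code untouched. It assumes the early-stage `ray.dataframe` drop-in module described in the [Pandas on Ray blog post][2], so the exact API may change; the file name is a placeholder.

```python
# A minimal sketch, assuming the early-stage ray.dataframe module
# described in the Pandas on Ray blog post (the API may change).
import ray.dataframe as pd  # drop-in replacement for `import pandas as pd`

# The read is parallelized, and the resulting DataFrame is partitioned
# across Ray workers rather than living in a single process.
df = pd.read_csv('example.csv')  # 'example.csv' is a placeholder

# Familiar Pandas operations work as before.
print(df.head())
```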
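As promised above, here is a toy sketch of the spillback logic. It is a simplified model for intuition, not Ray's actual implementation: each "local scheduler" accepts a task only while it has a free GPU and otherwise spills the task back to the "global scheduler", which retries it elsewhere.

```python
# Toy model of spillback scheduling (not Ray's implementation).
import random

NUM_NODES = 10
GPUS_PER_NODE = 10
free_gpus = {node: GPUS_PER_NODE for node in range(NUM_NODES)}

def local_scheduler_accepts(node):
    """Accept a task only if this node still has a free GPU."""
    if free_gpus[node] > 0:
        free_gpus[node] -= 1
        return True
    return False  # Spill the task back to the global scheduler.

def global_scheduler(task):
    """Retry spilled tasks on other nodes until one accepts."""
    while True:
        node = random.randrange(NUM_NODES)
        if local_scheduler_accepts(node):
            return node

# 100 GPU tasks submitted with a stale view of cluster load still end
# up one per GPU, because bad placements are spilled back and retried.
placements = [global_scheduler(task) for task in range(100)]
assert all(count == 0 for count in free_gpus.values())
```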
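And here is the minimal end-to-end custom resource sketch promised above. It pins a task to whichever machine advertises a hypothetical `"Dataset"` resource; the resource name, Redis address, and return value are all placeholders.

```python
import ray

# Connect to an existing cluster in which one node was started with
# something like `ray start ... --resources='{"Dataset": 1}'`.
ray.init(redis_address='<redis-address>')

# Requiring one unit of "Dataset" means this task can only be scheduled
# on the machine that advertises that custom resource.
@ray.remote(resources={'Dataset': 1})
def process_local_data():
    # Do work on the machine where the dataset lives.
    return 'processed'

result = ray.get(process_local_data.remote())
```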
[1]: https://github.com/ray-project/ray
[2]: https://rise.cs.berkeley.edu/blog/pandas-on-ray/
[3]: http://ray.readthedocs.io/en/latest/rllib.html
[4]: http://ray.readthedocs.io/en/latest/tune.html
[5]: https://rise.cs.berkeley.edu/blog/distributed-policy-optimizers-for-scalable-and-reproducible-deep-rl/
[6]: http://ray.readthedocs.io/en/latest/resources.html
[7]: https://pandas.pydata.org/
[8]: https://arxiv.org/abs/1803.00933
[9]: http://ray.readthedocs.io/en/latest/pbt.html