Stephanie Wang 61904c4c3e Object hashes (#104)
* factoring out object_info for general use by several Ray components

* addressing comments

* Replace SHA256 task hash with MD5

Add object hash to object table (always overwrites)

Support for table operations that span multiple asynchronous Redis
commands

Add a new object location in a transaction, using Redis's optimistic
concurrency

Use Redis GETSET instead of transactions and Python frontend code for object hashing

Remove spurious log message

Fix for object_table_add

Revert "Replace SHA256 task hash with MD5"

This reverts commit e599de473c8dad9189ccb0600429534b469b76a2.

Revert to sha256

Test case for illegal puts

Use SETNX to set object hashes

Initialize digest with zeros

Initialize plasma_request with zeros

* Fixes

* replace SHA256 with a faster hash in the object store

* Fix valgrind

* Address Robert's comments

* Check that plasma_compute_object_hash succeeds.

* Don't run test_illegal_put test with valgrind because it causes an intentional crash which causes valgrind to complain.

* Debugging after rebase.

* handling Robert's comments

* Fix bugs after rebase.

* final fixes for Stephanie's PR

* fix
2016-12-08 20:57:08 -08:00
.travis Changes to make tests pass on Travis. (#3) 2016-10-25 22:39:21 -07:00
cmake/Modules help cmake find right python interpreter on mac (#251) 2016-07-11 12:16:10 -07:00
doc Remove numbuf from requirements for setup.py. (#54) 2016-11-21 14:30:17 -08:00
docker Migrate repositories to ray-project. (#438) 2016-09-17 00:52:05 -07:00
examples Implement repr, hash, and richcompare for ObjectIDs. (#33) 2016-11-11 09:18:36 -08:00
lib/python Enable fetching objects from remote object stores. (#87) 2016-12-06 15:47:31 -08:00
numbuf When searching for Python with cmake, try custom tricks before using default cmake behavior. (#67) 2016-11-29 14:32:54 -08:00
scripts Update worker.py and services.py to use plasma and the local scheduler. (#19) 2016-11-02 00:39:35 -07:00
src Object hashes (#104) 2016-12-08 20:57:08 -08:00
test Enable fetching objects from remote object stores. (#87) 2016-12-06 15:47:31 -08:00
vsprojects Windows compatibility (#57) 2016-11-22 17:04:24 -08:00
webui Integration of Webui with Ray (#32) 2016-11-17 22:33:29 -08:00
.clang-format Changes to make tests pass on Travis. (#3) 2016-10-25 22:39:21 -07:00
.editorconfig Update Windows support (#317) 2016-07-28 13:11:13 -07:00
.gitignore Update gitignore. (#94) 2016-12-07 11:54:16 -08:00
.gitmodules Windows compatibility (#57) 2016-11-22 17:04:24 -08:00
.travis.yml Fix pip install hanging by moving C tests out of build.sh. (#52) 2016-11-20 21:02:54 -08:00
build-docker.sh Migrate repositories to ray-project. (#438) 2016-09-17 00:52:05 -07:00
build-webui.sh Global scheduler skeleton (#45) 2016-11-18 19:57:51 -08:00
build.sh Remove Redis version from Linux scripts (#56) 2016-11-21 15:02:40 -08:00
install-dependencies.sh switch from submodule to cloning arrow, travis fixes & Robert's comments 2016-11-19 17:38:36 -08:00
LICENSE Change license to Apache 2 (#20) 2016-11-01 23:19:06 -07:00
pylintrc adding pylint (#233) 2016-07-08 12:39:11 -07:00
Ray.sln Windows compatibility (#57) 2016-11-22 17:04:24 -08:00
README.md Remove out-of-date documentation. (#40) 2016-11-12 19:34:22 -08:00

Ray


Ray is an experimental distributed extension of Python. It is under development and not ready to be used.

The goal of Ray is to make it easy to write machine learning applications that run on a cluster while providing the development and debugging experience of working on a single machine.

Before jumping into the details, here's a simple Python example for doing a Monte Carlo estimation of pi (using multiple cores or potentially multiple machines).

import ray
import numpy as np

# Start a scheduler, an object store, and some workers.
ray.init(start_ray_local=True, num_workers=10)

# Define a remote function for estimating pi.
@ray.remote
def estimate_pi(n):
  x = np.random.uniform(size=n)
  y = np.random.uniform(size=n)
  return 4 * np.mean(x ** 2 + y ** 2 < 1)

# Launch 10 tasks, each of which estimates pi.
result_ids = []
for _ in range(10):
  result_ids.append(estimate_pi.remote(100))

# Fetch the results of the tasks and print their average.
estimate = np.mean(ray.get(result_ids))
print("Pi is approximately {}.".format(estimate))

Within the for loop, each call to estimate_pi.remote(100) sends a message to the scheduler asking it to schedule the task of running estimate_pi with the argument 100. This call returns right away without waiting for the actual estimation of pi to take place. Instead of returning a float, it returns an object ID, which represents the eventual output of the computation (this is similar to a future).

The call to ray.get(result_ids) takes the list of object IDs and returns the corresponding estimates of pi (waiting until the computations have finished if necessary).
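For readers who have not used futures before, the same submit-then-wait pattern can be sketched with Python's standard library. This is only an analogy and not how Ray is implemented: it uses `concurrent.futures` threads on a single machine, and the `seed` argument and the `run_estimates` helper are assumptions introduced for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def estimate_pi(n, seed):
    """Monte Carlo estimate of pi from n random points in the unit square."""
    rng = random.Random(seed)  # hypothetical per-task seed, for reproducibility
    inside = sum(
        1 for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 < 1
    )
    return 4.0 * inside / n


def run_estimates(num_tasks=10, n=100):
    """Hypothetical helper: launch num_tasks estimates and collect the results."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # submit() returns a Future right away, loosely analogous to how
        # estimate_pi.remote(100) returns an object ID without waiting.
        futures = [pool.submit(estimate_pi, n, seed) for seed in range(num_tasks)]
        # result() blocks until the value is available, loosely analogous
        # to ray.get on an object ID.
        return [future.result() for future in futures]


estimates = run_estimates()
print("Pi is approximately {}.".format(sum(estimates) / len(estimates)))
```

The key point is the split between launching work (which returns a handle immediately) and fetching the value (which blocks only when the result is actually needed), so all ten tasks can be in flight before the first result is read.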

Next Steps

Example Applications