* Add manylinux setup
* Switch to cp27mu
* python/MANIFEST.in
* Fix MANIFEST.in
* Add build-wheel-manylinux1.sh
* Update readme
* Install correct version of numpy
* Fix typo in README-manylinux1.md
* Don't install cmake
* Remove commented line from setup.py
* Delete unused manylinux1.sh
* Run setup.py bdist_wheel twice
* Don't use package_data and MANIFEST.in.
* Small aesthetic change.
* Trigger build_ext in setup.py.
* Remove nonexistent file from MANIFEST.in.
* Manually copy files in MANIFEST.in to where Python expects them in order to prevent setup.py from having to be run twice.
* Only run setup.py once when building wheels.
* Aesthetic change to readme.
* Copy generated flatbuffer Python files in build_ext.
* Fix permission denied error by making sure to preserve executable permissions when copying files.
* Remove unnecessary argument to setup.py.
* Remove MANIFEST.in and move files to include into list in setup.py.
* Fix numpy version when building wheels and replace rm with git clean.
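
A minimal sketch of the packaging approach the wheel-building commits above converge on, assuming hypothetical file paths and a `build.sh` entry point: a custom `build_ext` builds the native code and copies generated files to where Python expects them, so `setup.py` only needs to run once and no MANIFEST.in is needed.

```python
import shutil
import subprocess

from setuptools import setup
from setuptools.command.build_ext import build_ext

# Hypothetical locations; the real generated files live elsewhere.
GENERATED_FILES = [
    ("build/flatbuffers/format_pb.py", "ray/core/generated/format_pb.py"),
]

class BuildExt(build_ext):
    def run(self):
        # Build the native extensions first.
        subprocess.check_call(["bash", "build.sh"])
        # Copy generated files to where Python expects them.
        # shutil.copy preserves the mode bits, so executables
        # stay executable after the copy.
        for src, dst in GENERATED_FILES:
            shutil.copy(src, dst)
        build_ext.run(self)

setup(name="ray", cmdclass={"build_ext": BuildExt})
```
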
* Enable remote function and actor definitions to close over actor definitions.
* Give better error message if actor objects are pickled.
* Add tests for closing over actor definitions.
* Fix linting.
* Implement sharding in the Ray core
* Single node Python modifications to do sharding
* Do the sharding in redis.cc
* Pipe num_redis_shards through start_ray.py and worker.py.
* Use multiple redis shards in multinode tests.
* first steps for sharding ray.global_state
* Fix problem in multinode docker test.
* fix runtest.py
* fix some tests
* fix redis shard startup
* fix redis sharding
* fix
* fix bug introduced by the map-iterator being consumed
* fix sharding bug
* shard event table
* update the maximum number of Redis clients to 64K
* Fix object table tests by flushing shards in between unit tests
* Fix local scheduler tests
* Documentation
* Register shard locations in the primary shard
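
A rough sketch of what registering shard locations in the primary shard might look like with redis-py; the key names `NumRedisShards` and `RedisShards` are assumptions here.

```python
import redis

# On startup, the process launching the shards records their
# addresses in the primary shard (key names are assumptions).
primary = redis.StrictRedis(host="127.0.0.1", port=6379)
shard_addresses = ["127.0.0.1:6380", "127.0.0.1:6381"]
primary.set("NumRedisShards", len(shard_addresses))
for address in shard_addresses:
    primary.rpush("RedisShards", address)

# Clients then discover the shards through the primary shard.
num_shards = int(primary.get("NumRedisShards"))
addresses = primary.lrange("RedisShards", 0, -1)
assert len(addresses) == num_shards
```
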
* Add plasma unit tests back to build
* lint
* lint and fix build
* Fix
* Address Robert's comments
* Refactor start_ray_processes to start Redis shard
* lint
* Fix global scheduler python tests
* Fix redis module test
* Fix plasma test
* Fix component failure test
* Fix local scheduler test
* Fix runtest.py
* Fix global scheduler test for python3
* Fix task_table_test_and_update bug caused by a race in actor task table submission
* Fix jenkins tests.
* Retry Redis shard connections
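
The shard connection retry can be sketched as follows (retry count and delay are made up):

```python
import time
import redis

def connect_to_shard(host, port, num_retries=5, delay=1.0):
    """Connect to a Redis shard, retrying while it comes up."""
    for _ in range(num_retries):
        client = redis.StrictRedis(host=host, port=port)
        try:
            client.ping()
            return client
        except redis.ConnectionError:
            time.sleep(delay)
    raise RuntimeError("Unable to connect to shard at %s:%d" % (host, port))
```
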
* Fix test cases
* Convert database clients to DBClient struct
* Fix race condition when subscribing to db client table
* Remove unused lines, add APITest for sharded Ray
* Fix
* Fix memory leak
* Suppress ReconstructionTests output
* Suppress output for APITestSharded
* Reissue task table add/update commands if initial command does not publish to any subscribers.
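
In redis-py terms, `PUBLISH` returns the number of subscribers that received the message, which is what makes the reissue logic above possible; the channel name below is illustrative, and the real commands live in the C Redis module.

```python
import redis

client = redis.StrictRedis()

def publish_task_update(payload, num_retries=5):
    """Publish a task table update, retrying until someone hears it.

    PUBLISH returns the number of subscribers that received the
    message; zero means no one has subscribed yet.
    """
    for _ in range(num_retries):
        num_receivers = client.publish("task_table_updates", payload)
        if num_receivers > 0:
            return
    raise RuntimeError("Task table update was never received.")
```
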
* fix
* Fix linting.
* fix tests
* fix linting
* fix python test
* fix linting
* Perform ray.register_class under the hood.
* Fix bug.
* Release worker lock when waiting for imports to arrive in get.
* Remove calls to register_class from examples and tests.
* Clear serialization state between tests.
* Fix bug and add test for multiple custom classes with same name.
* Fix failure test.
* Fix linting and cleanups to python code.
* Fixes to documentation.
* Implement recursion depth for recursively registering classes.
* Fix linting.
* Push warning to user if waiting for class for too long.
* Fix typos.
* Don't export FunctionToRun if pickling the function fails.
* Don't broadcast class definition when pickling class.
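
The visible effect of performing `ray.register_class` under the hood is that custom classes cross the worker boundary without any explicit registration call; a sketch under the Ray API of the time (init arguments omitted):

```python
import ray

ray.init()

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

# Previously this required an explicit ray.register_class(Point);
# now the class is registered automatically the first time an
# instance is serialized.

@ray.remote
def norm_squared(p):
    return p.x ** 2 + p.y ** 2

result = ray.get(norm_squared.remote(Point(3, 4)))  # 25
```
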
* Direct substitution of @ray.remote -> @ray.task.
* Changes to make '@ray.task' work.
* Instantiate actors with Class.remote() instead of Class().
* Convert actor instantiation in tests and examples from Class() to Class.remote().
* Change actor method invocation from object.method() to object.method.remote().
* Update tests and examples to invoke actor methods with .remote().
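
The actor API change described in the commits above, in brief: instantiation moves from `Class()` to `Class.remote()`, and method invocation from `object.method()` to `object.method.remote()`.

```python
import ray

ray.init()

@ray.remote
class Counter(object):
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

# New style: .remote() for both instantiation and method calls.
counter = Counter.remote()           # was: counter = Counter()
object_id = counter.increment.remote()  # was: counter.increment()
assert ray.get(object_id) == 1
```
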
* Fix bugs in jenkins tests.
* Fix example applications.
* Change @ray.task back to @ray.remote.
* Changes to make @ray.actor -> @ray.remote work.
* Direct substitution of @ray.actor -> @ray.remote.
* Fixes.
* Raise exception if @ray.actor decorator is used.
* Simplify ActorMethod class.
* Make note about bug in which actor creation notification message is not received.
* Prevent actors from being created on removed nodes.
* Prevent actors from being created on nodes with no CPUs.
* Fix linting.
* Add test for scheduling actors on local schedulers with no CPUs.
* Improve error message when actors created before ray.init called.
* Change local scheduler bookkeeping to use GPU IDs.
* Update actor test.
* Add tests for actors and tasks simultaneously using GPUs.
* Add additional task GPU ID test.
* Fix linting.
* Make redis GPU assignment ignore GPU IDs.
* Small fix.
* copy task specifications put into the actor task cache so they don't get overwritten when the scheduler receives the next task
* cleanup
* cleanup and fix
* linting
* fix jenkins test
* fix linting
* Serialize lambdas with pickle by default.
* Serialize sets with pickle by default.
* Serialize types with pickle by default.
* Small update to documentation.
* Update tests.
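
With the pickle fallbacks above, lambdas, sets, and types serialize without registration boilerplate; a sketch (not a test from the repo):

```python
import ray

ray.init()

@ray.remote
def apply(f, xs):
    return {f(x) for x in xs}

# Lambdas and sets now serialize via pickle by default, so this
# round-trips with no registration needed.
result = ray.get(apply.remote(lambda x: x * x, {1, 2, 3}))
assert result == {1, 4, 9}
```
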
* Augment test to verify that relevant workers and actors are killed during driver cleanup.
* Fix bug in which we were only killing one worker when a driver exited.
* Fix remove driver test.
* Fix and augment test.
* Clean up state when drivers exit.
* Remove unnecessary field in ActorMapEntry struct.
* Have monitor release GPU resources in Redis when driver exits.
* Enable multiple drivers in multi-node tests and test driver cleanup.
* Make redis GPU allocation a redis transaction and small cleanups.
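
What making the GPU allocation a Redis transaction means, sketched with redis-py's WATCH/MULTI pattern; the key name is invented and the real logic lives in the monitor.

```python
import redis

client = redis.StrictRedis()

def acquire_gpus(node_key, num_gpus):
    """Atomically claim GPUs from a node's free count (illustrative)."""
    with client.pipeline() as pipe:
        while True:
            try:
                pipe.watch(node_key)          # abort if someone else writes
                free = int(pipe.get(node_key) or 0)
                if free < num_gpus:
                    pipe.unwatch()
                    return False
                pipe.multi()
                pipe.set(node_key, free - num_gpus)
                pipe.execute()                # fails if node_key changed
                return True
            except redis.WatchError:
                continue                      # retry on contention
```
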
* Fix multi-node test.
* Small cleanups.
* Make global scheduler take node_ip_address so it appears in the right place in the client table.
* Cleanups.
* Fix linting and cleanups in local scheduler.
* Fix removed_driver_test.
* Fix bug related to vector -> list.
* Fix linting.
* Cleanup.
* Fix multi node tests.
* Fix jenkins tests.
* Add another multi node test with many drivers.
* Fix linting.
* Make the actor creation notification a flatbuffer message.
* Revert "Make the actor creation notification a flatbuffer message."
This reverts commit af99099c8084dbf9177fb4e34c0c9b1a12c78f39.
* Add comment explaining flatbuffer problems.
* Ignore deleted clients when reading address info from Redis
* Remove self from db_client table when exiting cleanly
* Fix valgrind test
* Do not call plasma_perform_release when disconnecting
* Fix worker blocked bug
* tmp
* Push an error to the driver on ray.put for non-driver tasks
* Fix result table tests
* Fix test, logging
* Address comments
* Fix suppression bug
* Fix redis module test
* Edit error message
* Get values in chunks during reconstruction
* Test case for driver ray.put errors
* Error for evicting ray.put objects from the driver
* Fix tests
* Reduce verbosity
* Documentation
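
One commit above gets values in chunks during reconstruction to bound how much is fetched at once; roughly (chunk size is arbitrary, and `get_object` stands in for the worker's internal fetch):

```python
def get_in_chunks(worker, object_ids, chunk_size=100):
    """Fetch object values a chunk at a time instead of all at once."""
    values = []
    for i in range(0, len(object_ids), chunk_size):
        chunk = object_ids[i:i + chunk_size]
        values.extend(worker.get_object(chunk))
    return values
```
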
* Failing test case
* Local scheduler exits cleanly after plasma store dies
* Tolerate one plasma store failure
* Tolerate plasma store failures on all nodes except head node
* Plasma manager heartbeats
* Component failure tests
* Don't run the helper for Python testing
* Fix C test
* Fix hanging plasma transfer test
* Fix python3
* Consolidate ClientConnection code
* Fix valgrind test
* fix c test
* We can restart worker nodes!
* Fix flatbuffers bug
* Address comments
* Only register actual workers with the local scheduler
* Fix bug
* Fix segfaults
* Add test case that tests for driver liveness, fix local scheduler bug
* Clean up after tests
* Allocate retry info on the stack
* Send SIGKILL before waiting
* Relax unit test conditions
* Driver liveness test case and documentation
* Start process for monitoring log files and push changes to redis.
* Display log files in UI.
* Bug fix for recent tasks.
* Use flatbuffers to parse local scheduler heartbeats.
* Compile the Ray redis module with C++.
* Redo parsing of object table notifications with flatbuffers.
* Update redis module python tests.
* Redo parsing of task table notifications with flatbuffers.
* Fix linting.
* Redo parsing of db client notifications with flatbuffers.
* Redo publishing of local scheduler heartbeats with flatbuffers.
* Fix linting.
* Remove usage of fixed-width formatting of scheduling state in channel name.
* Reply with flatbuffer object to task table queries, also simplify redis string to flatbuffer string conversion.
* Fix linting and tests.
* fix
* cleanup
* simplify logic in ReplyWithTask
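
For flavor, this is what building and parsing a heartbeat with flatbuffers looks like in Python, given a toy schema; the real schema and generated module names differ.

```python
import flatbuffers
# Assuming a toy schema like:
#   table Heartbeat { client_id:string; }
# flatc generates a Heartbeat module with the names used below.
from Heartbeat import (Heartbeat, HeartbeatStart, HeartbeatAddClientId,
                       HeartbeatEnd)

def build_heartbeat(client_id):
    builder = flatbuffers.Builder(0)
    cid = builder.CreateString(client_id)
    HeartbeatStart(builder)
    HeartbeatAddClientId(builder, cid)
    builder.Finish(HeartbeatEnd(builder))
    return bytes(builder.Output())

def parse_heartbeat(data):
    heartbeat = Heartbeat.GetRootAsHeartbeat(bytearray(data), 0)
    return heartbeat.ClientId()
```
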
* Initial conversion
* Further changes
* fixes
* some changes
* Fixes
* Added data pipeline
* Added updates to cifar
* Currently broken, needs a separate PR
* Added test for retrieving variables from an optimizer
* Removed FLAG reference in environment variables
* Added comments to test
* Addressed comments
* Added updates
* Made further changes for tfutils
* Fixed bug caused by the TensorFlow graph being finalized
* Removed ipython
* Added accuracy printing
* Temp commit
* added fixes
* changes
* Added writing to file
* Fixes for GPUs
* Cleaned up code
* Temp commit
* GPU support fully implemented
* Updated to use num_gpus for actors
* Finished testing gpus implementation
* Changed to be more in line with original implementation
* Updated test to use actors
* Added support for CPU-only systems
* Now works with no CPUs
* Minor changes and some documentation.
* WARN instead of FATAL for object hash mismatches, push error to driver
* Document the callback signature for object_table_add/remove
* Error table
* Wait for all errors in python test
* Fix doc
* Fix state test
* Clean up plasma subscribers on EPIPE
* First pass at a monitoring script - monitor can detect local scheduler death
* Clean up task table upon local scheduler death in monitoring script
* Don't schedule to dead local schedulers in global scheduler
* Have global scheduler update the db clients table, monitor script cleans up state
* Documentation
* Monitor script should scan tables before beginning to read from subscription channel
* Fix for python3
* Redirect monitor output to redis logs, fix hanging in multinode tests
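
The ordering in the monitor matters: subscribe first, then scan the existing table state, so no death notification can slip in between. A rough sketch with invented channel and key names:

```python
import redis

client = redis.StrictRedis()

def handle_db_client(data):
    """Placeholder: clean up state for dead local schedulers."""

def run_monitor():
    # Subscribe before scanning so nothing is lost in between.
    pubsub = client.pubsub()
    pubsub.subscribe("db_clients")          # channel name is invented
    for key in client.scan_iter(match=b"CL:*"):
        handle_db_client(client.get(key))   # process pre-existing clients
    for message in pubsub.listen():
        if message["type"] == "message":
            handle_db_client(message["data"])
```
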
* Publish auxiliary addresses as part of db_client deletion notifications
* Fix test case?
* Small changes.
* Use SCAN instead of KEYS
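
The KEYS -> SCAN change avoids blocking the Redis server on large keyspaces; in redis-py the replacement is a one-liner (the key pattern below is made up):

```python
import redis

client = redis.StrictRedis()

# Before: KEYS blocks the server while it walks the entire keyspace.
# task_keys = client.keys(b"TT:*")

# After: SCAN iterates incrementally and never blocks for long.
task_keys = [key for key in client.scan_iter(match=b"TT:*")]
```
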
* Address comments
* Address more comments
* Free redis module strings
* Availability after a killed worker
* Workers exit cleanly
* Memory cleanup in photon C tests
* Worker failure in multinode
* Consolidate worker cleanup handlers
* Update the result table before handling a task submission
* KILL_WORKER_TIMEOUT -> KILL_WORKER_TIMEOUT_MILLISECONDS
* Log a warning instead of crashing if no result table entry found