Commit graph

34 commits

Author SHA1 Message Date
Philipp Moritz
12a68e84d2 Implement a first pass at actors in the API. (#242)
* Implement actor field for tasks

* Implement actor management in local scheduler.

* initial python frontend for actors

* import actors on worker

* IPython code completion and tests

* prepare creating actors through local schedulers

* add actor id to PyTask

* submit actor calls to local scheduler

* starting to integrate

* simple fix

* Fixes from rebasing.

* more work on python actors

* Improve local scheduler actor handlers.

* Pass actor ID to local scheduler when connecting a client.

* first working version of actors

* fixing actors

* fix creating two copies of the same actor

* fix actors

* remove sleep

* get rid of export synchronization

* update

* insert actor methods into the queue in the right order

* remove print statements

* make it compile again after rebase

* Minor updates.

* fix python actor ids

* Pass actor_id to start_worker.

* add test

* Minor changes.

* Update actor tests.

* Temporary plan for import counter.

* Temporarily fix import counters.

* Fix some tests.

* Fixes.

* Make actor creation non-blocking.

* Fix test?

* Fix actors on Python 2.

* fix rare case.

* Fix python 2 test.

* More tests.

* Small fixes.

* Linting.

* Revert tensorflow version to 0.12.0 temporarily.

* Small fix.

* Enhance inheritance test.
2017-02-15 00:10:05 -08:00
Stephanie Wang
2b8e6485e3 Start and clean up workers from the local scheduler. (#250)
* Start and clean up workers from the local scheduler

Ability to kill workers in photon scheduler

Test for old method of starting workers

Common codepath for killing workers

Common codepath for killing workers

Photon test case for starting and killing workers

fix build

Fix component failure test

Register a worker's pid as part of initial connection

Address comments and revert photon_connect

Set PATH during travis install

Fix

* Fix photon test case to accept clients on plasma manager fd
2017-02-10 12:46:23 -08:00
Robert Nishihara
ab8c3432f7 Add driver ID to task spec and add driver ID to Python error handling. (#225)
* Add driver ID to task spec and add driver ID to Python error handling.

* Make constants global variables.

* Add test for error isolation.
2017-01-25 22:53:48 -08:00
Robert Nishihara
303d0fed3e Prevent plasma store and manager from dying when a client dies. (#203)
* Prevent plasma store and manager from dying when a worker dies.

* Check errno inside of warn_if_sigpipe. Passing in errno doesn't work because the arguments to warn_if_sigpipe can be evaluated out of order.
2017-01-17 20:34:31 -08:00
Philipp Moritz
a708e36225 Switch build system to use CMake completely. (#200)
* switch to CMake completely

...

* cleanup

* Run C tests, update installation instructions.
2017-01-17 16:56:40 -08:00
Philipp Moritz
ab3448a9b4 Plasma Optimizations (#190)
* bypass python when storing objects into the object store

* clang-format

* Bug fixes.

* fix include paths

* Fixes.

* fix bug

* clang-format

* fix

* fix release after disconnect
2017-01-09 20:15:54 -08:00
Wapaul1
0ac2abee51 Added helper class for getting tf variables from loss function (#184)
* Added helper class for getting tf variables from loss function

* Updated usage and documentation

* Removed try-catches

* Added futures

* Added documentation

* fixes and tests

* more tests

* install tensorflow in travis
2017-01-07 01:54:11 -08:00
Stephanie Wang
6828d694ae Test object notifications from Plasma store (#141)
* Object notification test for Photon, and turn on valgrind for Photon C tests

* Test object notification handler in the plasma manager

* Fix hanging test case
2016-12-29 23:10:38 -08:00
Philipp Moritz
0ca0864856 Use flatcc for serialization of IPC messages. (#140)
* added Phllipp's updates

* Switch to using flatbuffers for IPC.

* Various changes.

* convert remaining messages and cleanups

* fix

* fix function signatures

* fix valgrind errors

* clang-format

* final commit

* Fix valgrind test.
2016-12-20 14:46:25 -08:00
Robert Nishihara
79dd1815a2 Python 3 compatibility. (#121)
* Make common module Python 3 compatible.

* Make plasma module Python 3 compatible.

* Make photon module Python 3 compatible.

* Make numbuf module Python 3 compatible.

* Remaining changes for Python 3 compatibility.

* Test Python 3 in Travis.

* Fixes.
2016-12-16 14:40:37 -08:00
Stephanie Wang
4bdb9f7224 Object reconstruction in Photon (#65)
* Object reconstruction in Photon and C test cases for Photon

* Fix hanging test case on mac

* Remove unnecessary event from photon tests

* make photon_disconnect not leak file descriptors

* fix some of the memory errors

* Fix valgrind

* lint

* Address Robert's comments and add test case for object reconstruction suppression

* Remove OWNER
2016-12-12 23:17:22 -08:00
Robert Nishihara
6441571d31 Introduce some stress tests. (#106)
* Retry first connection to redis in db_connect.

* Declare usleep.

* Formatting.

* Introduce some stress tests.
2016-12-09 17:49:31 -08:00
Robert Nishihara
9ed56c23db Fix pip install hanging by moving C tests out of build.sh. (#52)
* Remove C tests from build.sh.

* Run C tests in Travis.
2016-11-20 21:02:54 -08:00
Philipp Moritz
d64f69f215 switch from submodule to cloning arrow, travis fixes & Robert's comments 2016-11-19 17:38:36 -08:00
Robert Nishihara
d77b685a90 Global scheduler skeleton (#45)
* Initial scheduler commit

* global scheduler

* add global scheduler

* Implement global scheduler skeleton.

* Formatting.

* Allow local scheduler to be started without a connection to redis so that we can test it without a global scheduler.

* Fail if there are no local schedulers when the global scheduler receives a task.

* Initialize uninitialized value and formatting fix.

* Generalize local scheduler table to db client table.

* Remove code duplication in local scheduler and add flag for whether a task came from the global scheduler or not.

* Queue task specs in the local scheduler instead of tasks.

* Simple global scheduler tests, including valgrind.

* Factor out functions for starting processes.

* Fixes.
2016-11-18 19:57:51 -08:00
Robert Nishihara
072f442c1f Update worker.py and services.py to use plasma and the local scheduler. (#19)
* Update worker code and services code to use plasma and the local scheduler.

* Cleanups.

* Fix bug in which threads were started before the worker mode was set. This caused remote functions to be defined on workers before the worker knew it was in WORKER_MODE.

* Fix bug in install-dependencies.sh.

* Lengthen timeout in failure_test.py.

* Cleanups.

* Cleanup services.start_ray_local.

* Clean up random name generation.

* Cleanups.
2016-11-02 00:39:35 -07:00
Robert Nishihara
47851eeccc Build Ray with setup.py. (#14)
* Build Ray with setup.py.

* Building photon extensions with cmake.

* Fix formatting in photon_extension.c

* Pip install with sudo in Travis.

* Fix plasma __init__.py.

* Rename and remove some files.
2016-10-31 17:08:03 -07:00
Ion
ee3718c80c Ion and Philipp's table retries (#10)
* Ion and Philipp's table retries

* Refactor the retry struct:
- Rename it from retry_struct to retry_info
- Retry information contains the failure callback, not the retry callback
- All functions take in retry information as an arg instead of its expanded fields

* Rename cb -> callback

* Remove prints

* Fix compiler warnings

* Change some CHECKs to greatest ASSERTs

* Key outstanding callbacks hash table with timer ID instead of callback data pointer

* Use the new retry API for table commands

* Memory cleanup in plasma unit tests

* fix Robert's comments

* add valgrind for common
2016-10-29 15:22:33 -07:00
Stephanie Wang
f3a43179c8 Unit testing for the plasma manager. 2016-10-28 11:56:22 -07:00
Robert Nishihara
f83a98d71a Changes to make tests pass on Travis. (#3)
* Remove hiredis submodule.

* Squashed 'src/common/thirdparty/hiredis/' content from commit acd1966

git-subtree-dir: src/common/thirdparty/hiredis
git-subtree-split: acd1966bf7f5e1be74b426272635c672def78779

* Make Plasma tests pass.

* Make Photon tests pass.

* Compile and test with Travis.

* Deactive fetch test so that the tests pass.
2016-10-25 22:39:21 -07:00
Robert Nishihara
91f16a3df0 Migrate repositories to ray-project. (#438)
* Migrate repositories to ray-project.

* Update numbuf to the migrated version.
2016-09-17 00:52:05 -07:00
Robert Nishihara
87bb7a8f67 [WIP] Large changes to make the tests pass. (#376)
* Revert "Make tests more informative (#372)"

This reverts commit fd353250c8.

* fix bugs, in particular deactivate worker service on driver and remove condition variables

* changes to minimize the changes in this PR

* switch from faulty mutex synchronization to using atomics

* Increase the default size of the message queues, to accommodate exporting large numbers of remote functions. This is a temporary fix, but not a long term solution.

* Reorganize the scheduler export code to queue up exports. This does not solve the underlying problem yet, but sets up a solution.

* Start a separate thread on driver to print error messages by constantly querying the scheduler. This is a temporary solution because the solution based on starting a worker service for the driver which the scheduler can push error messages to is buggy.

* Fix segfault in taskcapsule destructor.

* Move tests for catching errors into a separate test file.

* Revert "roll back grpc (#368)"

This reverts commit c01ef95d04.
2016-08-15 11:02:54 -07:00
Johann Schleier-Smith
4d5f85e77c Fix Travis failure detection on Mac + testing enhancements (#358)
* properly handle failure in continuous integration

* add stack overflow reference
2016-08-08 18:02:52 -07:00
Johann Schleier-Smith
583df08957 Docker builds on Travis (#343)
* attempt to build on travis using docker

* run tests in foreground

* add examples to travis tests

* test from current checkout

* attempt to fix docker version issues

* try build with xenial

* attempt docker upgrade

* avoid hang on configuration files

* matrix osx and linux w/ docker

* restore non-test docker builds

* fix typo

* tuning and cleanup

* add missing file

* comment cleanup
2016-08-02 17:03:28 -07:00
Robert Nishihara
a155f315e2 remove assertions from microbenchmarks and add to travis 2016-07-29 00:08:00 -07:00
Wapaul1
292abaa41c Moved Imagenet loading library to example applications; Changed code to return filenames as well as arrays (#257) 2016-07-11 18:06:58 -07:00
Robert Nishihara
e1a74eadbe remove installation of dependencies from setup script (#239) 2016-07-08 20:03:21 -07:00
Robert Nishihara
902cac3089 arrays -> array (#172) 2016-06-27 11:35:31 -07:00
Robert Nishihara
b05ed0d3b1 don't allow failures on os x 2016-06-23 23:42:43 -07:00
Mehrdad
f2d2b585e3 Split up and polish build scripts 2016-06-22 16:20:56 -07:00
Robert Nishihara
d19c455b2b travis for os x (#125) 2016-06-22 11:28:01 -07:00
Philipp Moritz
acc51309e7 load imagenet 2016-06-10 17:25:55 -07:00
Robert Nishihara
137909d177 remove old code (#101) 2016-06-10 17:22:35 -07:00
Philipp Moritz
4cb4f288cc adding continuous integration 2016-06-05 20:46:54 -07:00