178 commits

Author SHA1 Message Date
Melih Elibol
bea97b425b Fix python linting (#2076) 2018-05-16 15:04:31 -07:00
Robert Nishihara
570c3153cd Some tests for _submit API. (#2062) 2018-05-16 00:26:25 -07:00
Robert Nishihara
8fbb88485b Create RemoteFunction class, remove FunctionProperties, simplify worker Python code. (#2052)
* Cleaning up worker and actor code. Create remote function class. Remove FunctionProperties object.

* Remove register_actor_signatures function.

* Small cleanups.

* Fix linting.

* Support @ray.method syntax for actor methods.

* Fix pickling bug.

* Fix linting.

* Shorten testBlockingTasks.

* Small fixes.

* Call get_global_worker().
2018-05-14 14:35:23 -07:00
Robert Nishihara
52b0f3734a [xray] Add Travis build for testing xray on Linux. (#2047)
* Run xray tests in travis.

* Comment out TaskTests.testSubmittingManyTasks.

* Comment out failing tests.

* Comment out hanging test.

* Linting

* Comment out failing test.

* Comment out failing test.

* Ignore test_dataframe.py for now.

* Comment out testDriverExitingQuickly.
2018-05-13 21:22:01 -07:00
Robert Nishihara
18071d95a7 Use more CPUs for testMultipleWaitsAndGets. (#2051) 2018-05-13 15:35:02 -07:00
eric-jj
71997a481b Improve shared_ptr usage (#2030)
[xray] Improve shared_ptr usage
2018-05-11 20:05:04 -07:00
Alok Singh
cdf94c18a4 Clean up syntax for supported Python versions. (#1963)
* Use set/dict literal syntax

Ran code through [pyupgrade](https://github.com/asottile/pyupgrade). This is
supported in every Python version 2.7+.

* Drop unnecessary string format specification

No need to specify 0,1.. if parameters are passed in order.

* Revert "Drop unnecessary string format specification"

This reverts commit efa5ec85d30ff69f34e5ed93e31343fea7647bcb.

* Undo changes to cloudpickle

Drop use of set literal until cloudpickle uses it.

* Reformat code with YAPF

We need to set up a git pre-push hook to automatically run this stuff.
2018-05-03 07:45:11 -07:00
Robert Nishihara
7792032ee3 Fix UI issue for non-json-serializable task arguments. (#1892)
* Fix UI issue for non-json-serializable task arguments.

* Simplify approach.
2018-04-15 13:54:42 -07:00
Philipp Moritz
74162d1492 Lint Python files with Yapf (#1872) 2018-04-11 10:11:35 -07:00
Robert Nishihara
5bde5e75e7 Implement unsafe method for flushing entire object table and task table. (#1824)
* Implement unsafe method for flushing entire object table and task table.

* Add test.

* Fix test.
2018-04-04 18:29:24 -07:00
Robert Nishihara
0fc989c6c1 Don't use 127.0.0.1 for local ip address. (#1596)
* Don't use 127.0.0.1 for ip address.

* Update test
2018-04-02 00:34:20 -07:00
Robert Nishihara
4bccabd910 Redirect output of all processes by default. (#1752)
* Redirect output of all processes by default.

* Add separate flag for redirecting worker output.

* Fix tests.
2018-03-20 18:14:54 -07:00
Robert Nishihara
2922e1c388 Add API for getting total cluster resources. (#1736)
* Add API for getting total cluster resources.

* Add test.
2018-03-20 15:57:00 -07:00
Robert Nishihara
d78de0d41f Provide experimental API for changing number of return values and res… (#1735)
* Provide experimental API for changing number of return values and resource requirements at task submission time.

* Remove code duplication and add tests.
2018-03-19 15:32:23 -07:00
Robert Nishihara
96913be939 Treat actor creation like a regular task. (#1668)
* Treat actor creation like a regular task.

* Small cleanups.

* Change semantics of actor resource handling.

* Bug fix.

* Minor linting

* Bug fix

* Fix jenkins test.

* Fix actor tests

* Some cleanups

* Bug fix

* Fix bug.

* Remove cached actor tasks when a driver is removed.

* Add more info to taskspec in global state API.

* Fix cyclic import bug in tune.

* Fix

* Fix linting.

* Fix linting.

* Don't schedule any tasks (especially actor creation tasks) on local schedulers with 0 CPUs.

* Bug fix.

* Add test for 0 CPU case

* Fix linting

* Address comments.

* Fix typos and add comment.

* Add assertion and fix test.
2018-03-16 11:18:07 -07:00
Alexey Tumanov
844a6afcdd Implement simple random spillback policy. (#1493)
* spillback policy implementation: global + local scheduler

* modernize global scheduler policy state; factor out random number engine and generator

* Minimal version.

* Fix test.

* Make load balancing test less strenuous.
2018-02-13 00:09:35 -08:00
Robert Nishihara
ed77a4c415 Make ray.get_gpu_ids() respect existing CUDA_VISIBLE_DEVICES. (#1499)
* Make ray.get_gpu_ids() respect existing CUDA_VISIBLE_DEVICES.

* Comment out failing GPUID check.

* Add import.

* Fix test.

* Remove test.

* Factor out environment variable setting/getting into utils.
2018-02-01 21:29:14 -08:00
Philipp Moritz
a3f8fa426b Start integrating new GCS APIs (#1379)
* Start integrating new GCS calls

* fixes

* tests

* cleanup

* cleanup and valgrind fix

* update tests

* fix valgrind

* fix more valgrind

* fixes

* add separate tests for GCS

* fix linting

* update tests

* cleanup

* fix python linting

* more fixes

* fix linting

* add plasma manager callback

* add some documentation

* fix linting

* fix linting

* fixes

* update

* fix linting

* fix

* add spillback count

* fixes

* linting

* fixes

* fix linting

* fix

* fix

* fix
2018-01-31 11:01:12 -08:00
Robert Nishihara
ab5d4a6010 Bring cloudpickle inside the repository. (#1445)
* Bring cloudpickle version 0.5.2 inside the repo.

* Use internal copy of cloudpickle everywhere.

* Fix linting.

* Import ordering.

* Change __init__.py.

* Set pickler in serialization context.

* Don't check ray location.
2018-01-25 11:36:37 -08:00
Robert Nishihara
f32c0c8ec1 Move calls to ray.worker.cleanup into tearDown part of tests for isolation. (#1433) 2018-01-22 22:54:56 -08:00
Robert Nishihara
f75b51d178 Register Common.error with local scheduler extension module. (#1316)
* Register Common.error with local scheduler extension module.

* Add test.
2017-12-13 11:55:54 -08:00
Robert Nishihara
c21e189371 Allow scheduling with arbitrary user-defined resource labels. (#1236)
* Enable scheduling with custom resource labels.

* Fix.

* Minor fixes and ref counting fix.

* Linting

* Use .data() instead of .c_str().

* Fix linting.

* Fix ResourcesTest.testGPUIDs test by waiting for workers to start up.

* Sleep in test so that all tasks are submitted before any completes.
2017-12-01 11:41:40 -08:00
Eric Liang
37831ae0c3 Add a nicer warning message when you pass the wrong thing to ray.wait() (#1239)
* add warnings

* fix python mode

* Small changes and add tests.

* Fix test failure.
2017-11-27 22:57:33 -08:00
Robert Nishihara
2865128df0 Remove counter from run_function_on_all_workers. Also remove utilitie… (#1260)
* Remove counter from run_function_on_all_workers. Also remove utilities for copying directories across machines.

* Fix linting.
2017-11-26 18:29:10 -08:00
Robert Nishihara
6852e8839e Expose custom serializers through the API. (#1147)
* Expose custom serializers through the API.

* minor renaming

* Add test.

* Remove comment.

* Clean up assertions.
2017-10-29 00:08:55 -07:00
Richard Liaw
797f4fcbf3 Fixing Lint after flake upgrade (#1162)
* Fixing Lint after flake upgrade

* more lint fixes
2017-10-26 21:02:07 -05:00
Robert Nishihara
a52a1e893f Automatically set CUDA_VISIBLE_DEVICES when worker gets task. (#1044)
* Automatically set CUDA_VISIBLE_DEVICES when worker gets task.

* Add test.
2017-10-06 18:38:08 -07:00
Philipp Moritz
57bd1d6ff5 Specialize Serialization for OrderedDict (#1035)
Specialize Serialization for OrderedDict and defaultdict
2017-10-02 17:33:10 -07:00
Robert Nishihara
ad61af7333 Workaround for passing empty list to ray.wait. (#1043)
* Workaround for passing empty list to ray.wait.

* Add test for passing empty list to wait.
2017-10-01 11:45:02 -07:00
Zongheng Yang
5a50e80b63 Make Monitor remove dead Redis entries from exiting drivers. (#994)
* WIP: removing OL, OI, TT on client exit; no saving yet.

* ray_redis_module.cc: update header comment.

* Cleanup: just the removal.

* Reformat via yapf: use pep8 style instead of google.

* Checkpoint addressing comments (partially)

* Add 'b' marker before strings (py3 compat)

* Add MonitorTest.

* Use `isort` to sort imports.

* Remove some loggings

* Fix flake8 noqa marker runtest.py

* Try to separate tests out to monitor_test.py

* Rework cleanup algorithm: correct logic

* Extend tests to cover multi-shard cases

* Add some small comments and formatting changes.
2017-09-26 00:11:38 -07:00
gycn
a432285e77 Disable parallelization for Actors and ray.wait for debugging (#961)
Support actors and ray.wait in PYTHON_MODE.
2017-09-17 00:12:50 -07:00
Eric Liang
d8aa826e63 [webui] Scalability fixes for the task timeline and visualizations (#935)
* fixes

* comments

* fix test

* Update ui.py

* upd

* Fix linting.
2017-09-10 15:47:44 -07:00
Alexey Tumanov
fc885bd918 Adding basic support for a user-interpretable resource label (#761)
* adding support for the user-interpretable label (UIR)

* more plumbing for num_uirs further upstream; set to infty when specified on cmd line

* pass default num_uirs for actors; update GlobalStateAPI

* support num_uirs in ray.init()

* local scheduler resource accounting: support num_uirs; prep for vectorized resource accounting

* global scheduler test updated

* Fix bug introduced by rebase.

* Rename UIR -> CustomResource and add test.

* Small changes and use constexpr instead of macros.

* Linting and some renaming.

* Reorder some code.

* Remove cpus_in_use and fix bug.

* Add another test and make a small change.

* Rephrase documentation about feature stability.
2017-08-08 02:53:59 -07:00
Robert Nishihara
d7b10a84b6 Fallback to custom serializer for very long python ints. (#821)
* Fallback to custom serializer for very long python ints.

* Fix linting.

* Fix naming convention and add RETURN_NOT_OK.
2017-08-07 17:21:06 -07:00
Robert Nishihara
1fe49d7676 Simplify testMultipleLocalSchedulers by having it start only one worker. (#789) 2017-07-31 17:44:45 -07:00
alanamarzoev
2b3190ad13 Chrome trace timeline with sliders. (#731)
* Trace timeline with sliders.

* Trace.

* Switched ujson to json.

* Fixed tests.

* linting fixes

* Fixed bug.

* Cleaned up code.

* Fixes according to comments.

* removed checkpoints.

* Undid accidental delete.

* Fixed linting error.

* Added documentation to notebook.

* Undid accidental deletes.

* Add comments and small formatting fixes.

* Small fix.
2017-07-17 19:59:49 -07:00
Robert Nishihara
e0867c8845 Switch Python indentation from 2 spaces to 4 spaces. (#726)
* 4 space indentation for actor.py.

* 4 space indentation for worker.py.

* 4 space indentation for more files.

* 4 space indentation for some test files.

* Check indentation in Travis.

* 4 space indentation for some rl files.

* Fix failure test.

* Fix multi_node_test.

* 4 space indentation for more files.

* 4 space indentation for remaining files.

* Fixes.
2017-07-13 21:53:57 +00:00
alanamarzoev
8464d77c76 Change event logs to store one Redis ZSET per worker. (#705)
* Changing to zset

* Fixed bug.

* Fixed another bug.

* Modified task_profiles.

* Removed extra file.

* Modified task_profiles test.

* WIP

* WIP

* Undid changes

* Updated

* WIP

* Made changes according to comments.

* Removed unneeded print.

* Removed ujson usage.

* failing test

* tests passing

* Fixed linting errors and modified style.

* Fixed bug.

* Fixed linting

* Fixed according to comments.

* Redis crashing?

* Fixed linting

* Fixed linting
2017-07-09 01:42:29 +02:00
alanamarzoev
716469160e Enable dumping profiling information to timeline format viewable by chrome tracing. (#703)
* Chrome tracing timeline.

* Modified decode statement.

* Some cleanups and add test.

* Remove example.

* Fix test.
2017-06-30 12:14:11 -04:00
alanamarzoev
e16df6da9a Updated task_profiles function to avoid future repetitive parsing. (#691)
* Updated task_profiles function to avoid future repetitive parsing.

* Fix indentation.

* Fixed according to comments.

* Included updated test for task_profiles function.

* Simplify test.

* Fix indentation.

* Fix.
2017-06-22 19:21:18 -07:00
alanamarzoev
cc4990b543 Task profiles function and test (#647)
Expose some task profiling information through global state API.
2017-06-13 17:53:34 -07:00
Philipp Moritz
54925996ca Allow remote functions to specify max executions and kill worker once limit is reached. (#660)
* implement restarting workers after certain number of task executions

* Clean up python code.

* Don't start new worker when an actor disconnects.

* Move wait_for_pid_to_exit to test_utils.py.

* Add test.

* Fix linting errors.

* Fix linting.

* Fix typo.
2017-06-13 00:34:58 -07:00
alanamarzoev
f0339f3386 Expose log files through global state API. (#641)
* added log_table function and a test

* fixed log_files and added task_profiles

* fixed formatting

* fixed linting errors

* fixes

* removed file

* more fixes

* hopefully fixed

* Small changes.

* Fix linting.

* Fix bug in log monitor.

* Small changes.

* Fix bug in travis.
2017-06-08 00:08:10 -07:00
Philipp Moritz
647e1d9fc3 Fix runtest.py on the ubuntu system python 3 (#599)
* fix runtest.py on the ubuntu system python 3

* less strict version of the test
2017-05-26 15:22:36 -07:00
Robert Nishihara
c5bc76193f Remove Ray environment variables from codebase. (#590) 2017-05-24 18:29:40 -07:00
Stephanie Wang
ee08c8274b Shard Redis. (#539)
* Implement sharding in the Ray core

* Single node Python modifications to do sharding

* Do the sharding in redis.cc

* Pipe num_redis_shards through start_ray.py and worker.py.

* Use multiple redis shards in multinode tests.

* first steps for sharding ray.global_state

* Fix problem in multinode docker test.

* fix runtest.py

* fix some tests

* fix redis shard startup

* fix redis sharding

* fix

* fix bug introduced by the map-iterator being consumed

* fix sharding bug

* shard event table

* update number of Redis clients to be 64K

* Fix object table tests by flushing shards in between unit tests

* Fix local scheduler tests

* Documentation

* Register shard locations in the primary shard

* Add plasma unit tests back to build

* lint

* lint and fix build

* Fix

* Address Robert's comments

* Refactor start_ray_processes to start Redis shard

* lint

* Fix global scheduler python tests

* Fix redis module test

* Fix plasma test

* Fix component failure test

* Fix local scheduler test

* Fix runtest.py

* Fix global scheduler test for python3

* Fix task_table_test_and_update bug, from actor task table submission race

* Fix jenkins tests.

* Retry Redis shard connections

* Fix test cases

* Convert database clients to DBClient struct

* Fix race condition when subscribing to db client table

* Remove unused lines, add APITest for sharded Ray

* Fix

* Fix memory leak

* Suppress ReconstructionTests output

* Suppress output for APITestSharded

* Reissue task table add/update commands if initial command does not publish to any subscribers.

* fix

* Fix linting.

* fix tests

* fix linting

* fix python test

* fix linting
2017-05-18 17:40:41 -07:00
Philipp Moritz
28f0882387 Expose function table to python global control state API (#542)
* expose function table to python global control state API

* fix

* fix linting

* add test for function table
2017-05-16 20:06:13 -07:00
Robert Nishihara
ec2534422b Remove register_class from API. (#550)
* Perform ray.register_class under the hood.

* Fix bug.

* Release worker lock when waiting for imports to arrive in get.

* Remove calls to register_class from examples and tests.

* Clear serialization state between tests.

* Fix bug and add test for multiple custom classes with same name.

* Fix failure test.

* Fix linting and cleanups to python code.

* Fixes to documentation.

* Implement recursion depth for recursively registering classes.

* Fix linting.

* Push warning to user if waiting for class for too long.

* Fix typos.

* Don't export FunctionToRun if pickling the function fails.

* Don't broadcast class definition when pickling class.
2017-05-16 18:38:52 -07:00
Eric Liang
e2e9e4ce6f Fix segmentation fault when calling ray.put on a dictionary with object keys (#548)
* fix segfault when serializing dict key

* fix style

* fix test

* Fix linting.
2017-05-15 01:09:13 -07:00
Robert Nishihara
c688a64235 Expose GPU IDs to remote functions. (#496)
* Change local scheduler bookkeeping to use GPU IDs.

* Update actor test.

* Add tests for actors and tasks simultaneously using GPUs.

* Add additional task GPU ID test.

* Fix linting.

* Make redis GPU assignment ignore GPU IDs.

* Small fix.
2017-05-07 13:03:49 -07:00