hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-10 13:26:39 -04:00

Author	SHA1	Message	Date
Robert Nishihara	5e76d52868	Improve cluster.wait_for_nodes() API. (#3712 ) * Separate out functionality for querying client table and improve cluster.wait_for_nodes() API. * Linting * Add back logging statements. * info -> debug	2019-01-07 21:26:58 -08:00
Robert Nishihara	c9d70f0dda	Remove num_local_schedulers argument from ray.worker._init. (#3704 ) * Remove num_local_schedulers argument from ray.worker._init. * Fix * Fix tests.	2019-01-07 12:44:49 -08:00
Yuhong Guo	c9b8ecca51	Add RayParams to refactor the parameters used by ray python. (#3558 )	2018-12-29 22:04:27 +08:00
Yuhong Guo	b9e1977fae	Fix failure of test_free_objects_multi_node (#3481) `test_free_objects_multi_node` can fail intermittently: if we run the test 20 times, we may see at least one failure. The cause is that the test is based on function tasks. One raylet may create more than one worker to execute the tasks, so the flush operations may be spread across several workers and not clean up all the worker objects held by the plasma client. In this PR, I change the function tasks to actor tasks, which guarantees that all the tasks are executed in one worker of a raylet.	2018-12-06 15:55:49 -05:00
Si-Yuan	2e6f9bedf2	Add the extra fallback for serialization (#3468 ) * Add the extra fallback for serialization. * Better comments & warnings. quotes. * Update test/runtest.py Co-Authored-By: suquark <suquark@gmail.com> * Update test/runtest.py Co-Authored-By: suquark <suquark@gmail.com> * linting * Don't hijack too much errors. * simplify the test * Update runtest.py * simplify	2018-12-05 13:09:08 -08:00
Robert Nishihara	3856533065	Fix incompatibility with most recent version of Redis. (#3379 ) * Fix incompatibility with most recent version of Redis. * Fix * Fixes.	2018-11-24 16:36:38 -08:00
Robert Nishihara	5cbc597494	Suppress duplicate pre-emptive object pushes. (#3276 ) * Suppress duplicate pre-emptive object pushes. * Add test. * Fix linting * Remove timer and inline recent_pushes_ into local_objects_. * Improve test. * Fix * Fix linting * Enable retrying pull from same object manager. Randomize object manager. * Speed up test * Linting * Add test. * Minor * Lengthen pull timeout and reissue pull every time a new object becomes available. * Increase pull timeout in test. * Wait for nodes to start in object manager test. * Wait longer for nodes to start up in test. * Small fixes. * _submit -> _remote * Change assert to warning.	2018-11-16 23:02:45 -08:00
Robert Nishihara	d10cb570ab	Rename _submit -> _remote. (#3321 )	2018-11-15 15:30:18 -08:00
Philipp Moritz	1be1455d86	Fix redis crash when duplicate messages are appended to log. (#3316 )	2018-11-15 15:09:39 -08:00
Stephanie Wang	d950e92f63	Allow multiple threads to call ray.get and ray.wait (#3244 ) * Handle multiple threads calling ray.get * Multithreaded ray.wait * Pass in current task ID in java backend * Add multithreaded actor to tests, add warning messages to worker for multithreaded ray.get * Fix test * Some cleanups * Improve error message * Add assertion * Cleanup, throw error in HandleTaskUnblocked if task not actually blocked * lint * Fix python worker reset * Fix references to reconstruct_objects * Linting * java lint * Fix java * Fix iterator	2018-11-07 22:39:28 -08:00
Robert Nishihara	1dd5d92789	Enable timeline visualizations of object transfers. (#3255 ) * Plot object transfers. * Linting	2018-11-07 12:45:59 -08:00
Eric Liang	725df3a485	Set the process title in workers and actors (#3219 )	2018-11-06 14:59:22 -08:00
Eric Liang	9a0f0db070	Add `ray stack` tool for debugging (#3213 )	2018-11-03 13:13:02 -07:00
Wang Qing	ca7d4c2cf5	Enable to specify driver id by user. (#3084 )	2018-11-02 19:01:50 -07:00
Robert Nishihara	5822aa2388	Rename get_task -> worker_idle in timeline. (#3179 ) * Rename get_task -> worker_idle in timeline. * Fix test.	2018-11-02 12:08:46 -07:00
Robert Nishihara	1f29a960f4	Update task_table and object_table API. (#3161 ) * Update task_table and object_table API. * Fix	2018-10-31 12:52:50 -07:00
Robert Nishihara	32f0d6b77e	Deprecate num_workers argument to ray.init and ray start. (#3114 ) * Remove num_workers argument. * Fix * Fix	2018-10-28 20:12:49 -07:00
Robert Nishihara	658c14282c	Remove legacy Ray code. (#3121 ) * Remove legacy Ray code. * Fix cmake and simplify monitor. * Fix linting * Updates * Fix * Implement some methods. * Remove more plasma manager references. * Fix * Linting * Fix * Fix * Make sure class IDs are strings. * Some path fixes * Fix * Path fixes and update arrow * Fixes. * linting * Fixes * Java fixes * Some java fixes * TaskLanguage -> Language * Minor * Fix python test and remove unused method signature. * Fix java tests * Fix jenkins tests * Remove commented out code.	2018-10-26 13:36:58 -07:00
Robert Nishihara	9c1826ed69	Use XRay backend by default. (#3020 ) * Use XRay backend by default. * Remove irrelevant valgrind tests. * Fix * Move tests around. * Fix * Fix test * Fix test. * String/unicode fix. * Fix test * Fix unicode issue. * Minor changes * Fix bug in test_global_state.py. * Fix test. * Linting * Try arrow change and other object manager changes. * Use newer plasma client API * Small updates * Revert plasma client api change. * Update * Update arrow and allow SendObjectHeaders to fail. * Update arrow * Update python/ray/experimental/state.py Co-Authored-By: robertnishihara <robertnishihara@gmail.com> * Address comments.	2018-10-23 12:46:39 -07:00
Robert Nishihara	22dd7e0428	Add test for wait reconstruction. (#3110 )	2018-10-22 23:16:54 -07:00
Robert Nishihara	ed6289771a	Convert runtest.py to use pytest. (#2966 ) * Convert runtest.py to use pytest. * Linting. * Fix * Fix * Fix * Fix	2018-09-30 07:59:44 -07:00
Hanwei Jin	dc76e51a60	bugfix: cmake copy plasma java lib from lib64 directory in centos (#2885 )	2018-09-16 22:32:09 -07:00
Robert Nishihara	f16d33593b	Mark worker as blocked and trigger reconstruction in ray.wait. (#2864 ) * Trigger reconstruction in ray.wait and mark worker as blocked. * Add test. * Linting. * Don't run new test with legacy Ray. * Only call HandleClientUnblocked if it actually blocked in ray.wait. * Reduce time to ray.wait in the test.	2018-09-13 15:28:17 -07:00
Robert Nishihara	3f6ed537a4	Add ray.is_initialized() function. (#2818 ) * Add ray.is_initialized() function. * Add assert.	2018-09-06 21:20:59 -07:00
Yucong He	5b45f0bdff	[xray] Implementing Gcs sharding (#2409) Basically a re-implementation of #2281, with modifications of #2298 (a fix of #2334, for rebasing issues). [+] Implement sharding for gcs tables. [+] Keep ClientTable and ErrorTable managed by the primary_shard. TaskTable is managed by the primary_shard for now, until a good hashing for tasks is implemented. [+] Move AsyncGcsClient's initialization into Connect function. [-] Move GetRedisShard and bool sharding from RedisContext's connect into AsyncGcsClient. This may make the interface cleaner.	2018-08-31 15:54:30 -07:00
Robert Nishihara	eda6ebb87d	Convert some unittests to pytest. (#2779 ) * Convert multi_node_test.py to pytest. * Convert array_test.py to pytest. * Convert failure_test.py to pytest. * Convert microbenchmarks to pytest. * Convert component_failures_test.py to pytest and some minor quotes changes. * Convert tensorflow_test.py to pytest. * Convert actor_test.py to pytest. * Fix. * Fix	2018-08-31 11:24:15 -07:00
Robert Nishihara	32f7d6fcf5	Add back some tests for xray. (#2772 )	2018-08-30 11:07:23 -07:00
Robert Nishihara	b7722897b4	Deprecate 'driver_mode' argument. (#2758 ) * Deprecate 'driver_mode' argument. * Fix * Fix	2018-08-28 16:45:49 -07:00
Alexey Tumanov	de047daea7	[xray] raylet scheduling mechanism with a simple spillback policy (#2749) * distribute load and resource information on a heartbeat * for each raylet, maintain total and available resource capacity as well as a measure of current load * this PR introduces a new notion of load, defined as the sum of all resource demand induced by queued ready tasks on the local raylet. This provides a heterogeneity-aware measure of load that supersedes legacy Ray's task count as a proxy for load. * modify the scheduling policy to perform capacity-based, load-aware, optimistically concurrent resource allocation * perform task spillover to the heartbeating node in response to a heartbeat, implementing heterogeneity-aware late-binding/work-stealing.	2018-08-28 00:03:34 -07:00
Yuhong Guo	eeb15771ba	Add `ray.internal.free` (#2542 )	2018-08-14 22:01:23 -07:00
Philipp Moritz	d8ba667175	Convert asserts in unittest to pytest (#2529 )	2018-08-01 22:32:10 -07:00
Robert Nishihara	909d7172b1	Introduce constant for ID_SIZE in python code. (#2517 )	2018-07-31 12:40:53 -07:00
Philipp Moritz	696a229ece	Fix text verbosity in python 2.7 by running tests with pytest (#2470 )	2018-07-30 11:04:06 -07:00
Hao Chen	05f485e274	Allow Ray API to be used from multiple threads (#2422 )	2018-07-20 15:39:01 -07:00
Robert Nishihara	515da7721a	Change ray.worker.cleanup -> ray.shutdown and improve API documentation. (#2374 ) * Change ray.worker.cleanup -> ray.shutdown and improve API documentation. * Deprecate ray.worker.cleanup() gracefully. * Fix linting	2018-07-12 12:00:00 -07:00
Robert Nishihara	b90e551b41	[xray] Implement timeline and profiling API. (#2306 ) * Add profile table and store profiling information there. * Code for dumping timeline. * Improve color scheme. * Push timeline events on driver only for raylet. * Improvements to profiling and timeline visualization * Some linting * Small fix. * Linting * Propagate node IP address through profiling events. * Fix test. * object_id.hex() should return byte string in python 2. * Include gcs.fbs in node_manager.fbs. * Remove flatbuffer definition duplication. * Decode to unicode in Python 3 and bytes in Python 2. * Minor * Submit profile events in a batch. Revert some CMake changes. * Fix * Workaround test failure. * Fix linting * Linting * Don't return anything from chrome_tracing_dump when filename is provided. * Remove some redundancy from profile table. * Linting * Move TODOs out of docstring. * Minor	2018-07-04 23:23:48 -07:00
Robert Nishihara	800f7cc77d	Make actor handles work in Python mode. (#2283 ) * Make actor handles work in local mode. * Add test for actor handles in local mode.	2018-06-20 23:02:41 -07:00
Robert Nishihara	ff2217251f	[xray] Add error table and push error messages to driver through node manager. (#2256 ) * Fix documentation indentation. * Add error table to GCS and push error messages through node manager. * Add type to error data. * Linting * Fix failure_test bug. * Linting. * Enable one more test. * Attempt to fix doc building. * Restructuring * Fixes * More fixes. * Move current_time_ms function into util.h.	2018-06-20 21:29:28 -07:00
Robert Nishihara	61139e1509	Enable fractional resources and resource IDs for xray. (#2187 ) * Implement GPU IDs and fractional resources. * Add documentation and python exceptions. * Fix signed/unsigned comparison. * Fix linting. * Fixes from rebase. * Re-enable tests that use ray.wait. * Don't kill the raylet if an infeasible task is submitted. * Ignore tests that require better load balancing. * Linting * Ignore array test. * Ignore stress test reconstructions tests. * Don't kill node manager if remote node manager disconnects. * Ignore more stress tests. * Naming changes * Remove outdated todo * Small fix * Re-enable test. * Linting * Fix resource bookkeeping for blocked tasks. * Fix linting * Fix Java client. * Ignore test * Ignore put error tests	2018-06-10 15:31:43 -07:00
Melih Elibol	7246ff80a4	[xray] Implements ray.wait (#2162 ) Implements ray.wait for xray. Fixes #1128.	2018-06-06 16:56:44 -07:00
Kunal Gosar	317d0da7d8	Add experimental API for ray.get and ray.wait with additional argument types (#2071 )	2018-06-01 16:42:27 -07:00
Robert Nishihara	6172f94c04	Implement Python global state API for xray. (#2125 ) * Implement global state API for xray. * Fix object table. * Fixes for log structure. * Implement cluster_resources. * Add driver task to task table. * Remove python flatbuffers code * Get some global state API tests running. * Python linting. * Fix linting. * Fix mock modules for doc * Copy over flatbuffer bindings. * Fix for tests. * Linting * Fix monitor crash.	2018-05-29 16:25:54 -07:00
Zongheng Yang	fa97acbc89	Integrate credis with Ray & route task table entries into credis. (#1841 )	2018-05-24 23:35:25 -07:00
Alok Singh	f795173b51	Use flake8-comprehensions (#1976 ) * Add flake8 to Travis * Add flake8-comprehensions [flake8 plugin](https://github.com/adamchainz/flake8-comprehensions) that checks for useless constructions. * Use generators instead of lists where appropriate A lot of the builtins can take in generators instead of lists. This commit applies `flake8-comprehensions` to find them. * Fix lint error * Fix some string formatting The rest can be fixed in another PR * Fix compound literals syntax This should probably be merged after #1963. * dict() -> {} * Use dict literal syntax dict(...) -> {...} * Rewrite nested dicts * Fix hanging indent * Add missing import * Add missing quote * fmt * Add missing whitespace * rm duplicate pip install This is already installed in another file. * Fix indent * move `merge_dicts` into utils * Bring up to date with `master` * Add automatic syntax upgrade * rm pyupgrade In case users want to still use it on their own, the upgrade-syn.sh script was left in the `.travis` dir.	2018-05-20 16:15:06 -07:00
Alok Singh	9a8f29e571	YAPF, take 3 (#2098 ) * Use pep8 style The original style file is actually just pep8 style, but with everything spelled out. It's easier to use the `based_on_style` feature. Any overrides are clearer that way. * Improve yapf script 1. Do formatting in parallel 2. Lint RLlib 3. Use .style.yapf file * Pull out expressions into variables * Don't format rllib * Don't allow splits in dicts * Apply yapf * Disallow single line if-statements * Use arithmetic comparison * Simplify checking for changed files * Pull out expr into var	2018-05-19 16:07:28 -07:00
Robert Nishihara	78e4b021ab	Functions for flushing done tasks and evicted objects. (#2033 )	2018-05-18 01:59:58 -07:00
Adam Gleave	470887c2ad	Support calling positional arguments by keyword (fix #998 ) (#2081 )	2018-05-17 16:10:26 -07:00
Melih Elibol	bea97b425b	Fix python linting (#2076 )	2018-05-16 15:04:31 -07:00
Robert Nishihara	570c3153cd	Some tests for _submit API. (#2062 )	2018-05-16 00:26:25 -07:00
Robert Nishihara	8fbb88485b	Create RemoteFunction class, remove FunctionProperties, simplify worker Python code. (#2052 ) * Cleaning up worker and actor code. Create remote function class. Remove FunctionProperties object. * Remove register_actor_signatures function. * Small cleanups. * Fix linting. * Support @ray.method syntax for actor methods. * Fix pickling bug. * Fix linting. * Shorten testBlockingTasks. * Small fixes. * Call get_global_worker().	2018-05-14 14:35:23 -07:00

225 commits