hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-08 19:41:38 -05:00

Author	SHA1	Message	Date
Robert Nishihara	db4a920bdb	Cleanup parquet installation. (#1549 ) * Cleanup parquet installation. * Fix * Small changes. * Add brew installs * Modify paths for compilation of parquet. * Remove LD_LIBRARY_PATH * Don't set unnecessary environment variables on Linux. * Set environment variables for make. * Brew installs for macos wheels. * Update * Pass PARQUET_HOME when building pyarrow. * Don't exit with error code.	2018-02-20 15:21:32 -08:00
Philipp Moritz	eabc4027c8	Hiredis asio integration (#1547 )	2018-02-20 13:37:09 -08:00
Simon Mo	a24cc28773	[DataFrame] Add Parquet Support in Build Process (#1531 ) * Add shell script for building parquet * Use parquet ci script; remove anaconda * Remove gcc flag, use default * add boost_root * Fix $TP_DIR reference issue * fix the PR * check out specific parquet-cpp commit	2018-02-16 07:18:42 -08:00
Alexey Tumanov	844a6afcdd	Implement simple random spillback policy. (#1493 ) * spillback policy implementation: global + local scheduler * modernize global scheduler policy state; factor out random number engine and generator * Minimal version. * Fix test. * Make load balancing test less strenuous.	2018-02-13 00:09:35 -08:00
Philipp Moritz	1ab2e63dbd	Tune transfer buffer size (#1363 ) Increase buffsize from `4096` to `80*1024`.	2018-02-09 14:56:36 -08:00
Robert Nishihara	89db7841d2	Update arrow version. (#1512 )	2018-02-07 23:05:16 -08:00
Stephanie Wang	ff8e7f8259	Actor checkpointing for distributed actor handles (#1498 ) * Expose calls to get and set the actor frontier * Remove fields used for old checkpointing prototype, change actor_checkpoint_failed -> succeeded * Prototype for actor checkpointing * Filter out duplicate tasks on the local scheduler * Clean up some of the Python checkpointing code * More cleanups * Documentation * cleanup and fix unit test * Allow remote checkpoint calls through actor handle * Check whether object is local before reconstructing * Enable checkpointing for distributed actor handles, refactor tests * Fix local scheduler tests * lint * Address comments * lint * Skip tests that fail on new GCS * style * Don't put same object twice when setting the actor frontier * Address Philipp's comments, cleaner fbs naming	2018-02-07 11:19:32 -08:00
Melih Elibol	d8850eac4b	Suppress object transfer requests when object is already being received. (#1430 ) * added deterministic check for objects received in fetch_timeout_handler. * use receive time, in case something goes wrong after object is received. * increase timeout for removal. * indentation fix. * make log info log debug. clean up debug log. * undo unecessary changes. * changed description var. * shorten line 949. * incorporate feedback. * linting; make is_object_received function consts. * change semantics of received_objects to objects being received. added checks to both points at which objects are re-requested. updated object receive initialization accordingly. * eliminate erase on receive init. check call to request_transfer_from instead of request_transfer. * updated comments. * added todo for multiple object transfers. * linting.	2018-02-01 22:45:31 -08:00
Philipp Moritz	a3f8fa426b	Start integrating new GCS APIs (#1379 ) * Start integrating new GCS calls * fixes * tests * cleanup * cleanup and valgrind fix * update tests * fix valgrind * fix more valgrind * fixes * add separate tests for GCS * fix linting * update tests * cleanup * fix python linting * more fixes * fix linting * add plasma manager callback * add some documentation * fix linting * fix linting * fixes * update * fix linting * fix * add spillback count * fixes * linting * fixes * fix linting * fix * fix * fix	2018-01-31 11:01:12 -08:00
Robert Nishihara	3195c6aa63	Fix local scheduler crash when driver creates actor and exits. (#1474 ) * Make check failures in redis.cc more informative. * Fix bug by calling task_table_add_task. * Add test.	2018-01-26 14:29:53 -08:00
Stephanie Wang	668737f383	Replace actor dummy objects with mock calls to the local scheduler (#1467 ) * Replace putting the dummy object with a call to the local scheduler * Mark dummy objects as locally available	2018-01-26 14:18:45 -08:00
Robert Nishihara	5acc98e629	Update arrow with better dataframe serialization and get rid of custo… (#1413 ) * Update arrow with better dataframe serialization and get rid of custom dataframe serializers. * Update plasma client API. * Fix potential bug. * Bug fix. * Update arrow to use deduplicated file descriptors and mutable buffers. * Fix tests. * Update commit. * Update commit. * Update commit. * Update commit. * Update commit * Update commit back to arrow codebase.'	2018-01-24 10:03:29 -08:00
Alexey Tumanov	f1303291b4	Ray scheduler spillback plumbing + mechanism (#1362 ) * spillback mechanism and plumbing : adding spillback counter + timestamp * linting fix * documentation * Fix argument name.	2018-01-23 20:18:12 -08:00
Melih Elibol	4b1c8be4fe	Fix setting log-level to debug. (#1432 )	2018-01-21 21:51:05 -08:00
Stephanie Wang	74718efa73	Nondeterministic reconstruction for actors (#1344 ) * Add failing unit test for nondeterministic reconstruction * Retry scheduling actor tasks if reassigned to local scheduler * Update execution edges asynchronously upon dispatch for nondeterministic reconstruction * Fix bug for updating checkpoint task execution dependencies * Update comments for deterministic reconstruction * cleanup * Add (and skip) failing test case for nondeterministic reconstruction * Suppress test output	2018-01-21 13:44:13 -08:00
Robert Nishihara	088f01496c	Remove unused object info table code. (#1388 )	2018-01-05 11:00:06 -08:00
Robert Nishihara	e970e24ea5	Update arrow, and pass memcopy_threads into put. (#1374 )	2017-12-31 13:32:06 -08:00
Philipp Moritz	3d224c4edf	Second Part of Internal API Refactor (#1326 )	2017-12-26 16:22:04 -08:00
Melih Elibol	4a2d62e7ef	fix thirdparty install bug. (#1354 )	2017-12-20 23:08:53 -08:00
Philipp Moritz	3c4408cf51	Rebase Ray on Arrow 0.8 (#1323 ) * rebase Ray on Arrow 0.8 * rebase on apache repo	2017-12-19 14:24:21 -08:00
Robert Nishihara	76b6b4a2d3	When killing worker, release resources before dispatching tasks. (#1327 )	2017-12-15 18:12:03 -08:00
Stephanie Wang	12fdb3f53a	Convert actor dummy objects to task execution edges. (#1281 ) * Define execution dependencies flatbuffer and add to Redis commands * Convert TaskSpec to TaskExecutionSpec * Add execution dependencies to Python bindings * Submitting actor tasks uses execution dependency API instead of dummy argument * Fix dependency getters and some cleanup for fetching missing dependencies * C++ convention * Make TaskExecutionSpec a C++ class * Convert local scheduler to use TaskExecutionSpec class * Convert some pointers to references * Finish conversion to TaskExecutionSpec class * fix * Fix * Fix memory errors? * Cast flatbuffers GetSize to size_t * Fixes * add more retries in global scheduler unit test * fix linting and cast fbb.GetSize to size_t * Style and doc * Fix linting and simplify from_flatbuf.	2017-12-14 20:47:54 -08:00
Philipp Moritz	cac5f47600	First Part of Internal Ray API Refactor (#1173 ) * add Ray status class * add C++ util files * add ID types * more APIs * build system integration * add test infrastructure and implement some APIs * add more tests * fix bugs * add task table tests * update * add toolchain file * fix * test * link with pthread * update * fix * more fixes * fixes * always vendor gtest and gflags * linting * fixes * add constants file * comments * more fixes * fix linting	2017-12-14 14:54:09 -08:00
Robert Nishihara	2f750e9ba7	Add parentheses around one-line if statement. (#1318 )	2017-12-13 23:48:53 -08:00
Robert Nishihara	f75b51d178	Register Common.error with local scheduler extension module. (#1316 ) * Register Common.error with local scheduler extension module. * Add test.	2017-12-13 11:55:54 -08:00
Stephanie Wang	bac39a134e	Define a wrapper class for callback_data.data (#1301 )	2017-12-08 11:48:21 -08:00
Stephanie Wang	044548bcff	Mark the killed as done outside of loop (#1284 )	2017-12-02 14:42:16 -08:00
Robert Nishihara	c21e189371	Allow scheduling with arbitrary user-defined resource labels. (#1236 ) * Enable scheduling with custom resource labels. * Fix. * Minor fixes and ref counting fix. * Linting * Use .data() instead of .c_str(). * Fix linting. * Fix ResourcesTest.testGPUIDs test by waiting for workers to start up. * Sleep in test so that all tasks are submitted before any completes.	2017-12-01 11:41:40 -08:00
Robert Nishihara	e0a340ee7e	Allow actors to pin at most 1000 dummy objects at a time. (#1241 ) * Allow actors to pin at most 1000 dummy objects at a time. * Fix linting.	2017-11-22 13:38:01 -08:00
Eric Liang	9233e496cc	Raise exception when getting the task results of workers that died (#1224 ) * wip * with test * add timeout * also add test for f * remove on cleanup * update * wip * fix tests * mark actor removed in redis * clang-format * fix bug when no-inprogress tasks * try to set task status done * Add comment.	2017-11-20 15:18:39 -08:00
Peter Schafhalter	e0360eb429	Remove UT libraries and clean up remaining UT datastructures (#1230 ) * Remove UT string include from redis * Remove UT string include from DB tests * Modify TaskSpec_print to remove UT string * Remove UT libraries	2017-11-19 15:01:33 -08:00
Peter Schafhalter	d986294c2b	Replace UT strings in local scheduler (#1213 ) * Convert to string using std::string * Fix linting issue * Fix linting * Construct db_connect_args using vector * Use vector size() instead of num_args * Hopefully fix linting now	2017-11-17 16:14:46 -08:00
Robert Nishihara	94423c0542	Upgrade Arrow with fixes to Plasma eviction policy. (#1228 ) * Upgrade Arrow with fixes to Plasma eviction policy. * Upgrade arrow to have -f flag for plasma store.	2017-11-17 14:41:22 -08:00
Peter Schafhalter	4cbc2b1978	Clean up UT datastructures in Python extension (#1227 )	2017-11-17 01:07:12 -08:00
Stephanie Wang	c70430f322	Fix bugs in plasma manager transfer (#1188 ) * Plasma client test for plasma abort * Use ray-project/arrow:abort-objects branch * Set plasma manager connection cursor to -1 when not in use * Handle transfer errors between plasma managers, abort unsealed objects * Add TODO for local scheduler exiting on plasma manager death * Revert "Plasma client test for plasma abort" This reverts commit e00fbd58dc4a632f58383549b19fb9057b305a14. * Upgrade arrow to version with PlasmaClient::Abort * Fix plasma manager test * Fix plasma test * Temporarily use arrow fork for testing * fix and set arrow commit * Fix plasma test * Fix plasma manager test and make write_object_chunk consistent with read_object_chunk * style * upgrade arrow	2017-11-15 22:32:38 -08:00
Peter Schafhalter	9a7b15447b	Replace UT string in redis tests (#1211 ) * Replace UT arg formatting with vsnprintf * Fix bug with va_list usage	2017-11-15 22:21:56 -08:00
Peter Schafhalter	428858c1ff	Convert UT string to std::string (#1210 )	2017-11-12 21:00:36 -08:00
Peter Schafhalter	9a6a056609	Convert UT datastructures in tests (#1203 ) * bind_ipc_sock_retry returns std::string * snprintf -> std::snprintf * Fix formatting * Use stringstream instead of snprintf * Fix typo	2017-11-11 16:55:05 -08:00
Philipp Moritz	e798a652bc	Change TaskSpec to allow multiple object IDs per argument. (#1204 ) * Implement object ID bags * linting * fix tests * fix linting * fix comments	2017-11-10 16:33:34 -08:00
Stephanie Wang	07f0532b9b	Local scheduler filters out dead clients during reconstruction (#1182 ) * Object table lookup returns vector of DBClientID instead of address strings * Add node IP address to DBClient notification * DB client cache stores entire DB client, convert addresses to std::string * get cached db client returns the client * Expose a call to initialize the redis cache * Local scheduler filters out dead clients during reconstruction * Remove node ip address from dbclient, use aux_address for plasma managers * Get entire db client entry when not found in cache * Fix common tests * Fix address in tests * Push error to driver if driver task did the put * Address Robert's comments and cleanup * Remove unused Redis command * Fix db test	2017-11-10 11:29:24 -08:00
Robert Nishihara	d3c082d325	More checking in redis.cc. (#1057 )	2017-11-08 23:25:19 -08:00
Robert Nishihara	1c6b30b5e2	Move all config constants into single file. (#1192 ) * Initial pass at factoring out C++ configuration into a single file. * Expose config through Python. * Forward declarations. * Fixes with Python extensions * Remove old code. * Consistent naming for constants. * Fixes * Fix linting. * More linting. * Whitespace * rename config -> _config. * Move config inside a class. * update naming convention * Fix linting. * More linting * More linting. * Add in some more constants. * Fix linting	2017-11-08 11:10:38 -08:00
Peter Schafhalter	a8032b9ca1	Convert connections from UT_array to std::vector (#1190 )	2017-11-07 20:59:41 -08:00
Peter Schafhalter	7215f7d228	Remove UT String from logging (#1184 ) * Removed unnecessary utarray include * Removed ut_string from logging * Fix formatting	2017-11-05 14:05:20 -08:00
Robert Nishihara	97c6369b49	Update arrow to include custom serializer for pytorch and register default serialization handlers. (#1152 ) * Update arrow to include custom serializer for pytorch. * Call pyarrow function for registering default custom serialization handlers. * Change class ID used in serialization context for object IDs.	2017-10-21 21:24:10 -07:00
Philipp Moritz	684e62e784	upgrade arrow to include numpy bool fix (#1148 )	2017-10-20 17:25:15 -07:00
Peter Schafhalter	ad4cbd4016	Updated outstanding_callbacks to unordered_map (#1108 ) * Updated outstanding_callbacks to unordered_map * Fix bug in destroy_outstanding_callbacks and comments	2017-10-20 10:06:22 -07:00
Stephanie Wang	af47737bd5	Prototype distributed actor handles (#1137 ) * Add actor handle ID to the task spec * Local scheduler dispatches actor tasks according to a task counter per handle * Fix python test * Allow passing actor handles into tasks. Not completely working yet. Also this is very messy. * Fixes, should be roughly working now. * Refactor actor handle wrapper * Fix __init__ tests * Terminate actor when the original handle goes out of scope * TODO and a couple test cases * Make tests for unsupported cases * Fix Python mode tests * Linting. * Cache actor definitions that occur before ray.init() is called. * Fix export actor class * Deterministically compute actor handle ID * Fix __getattribute__ * Fix string encoding for python3 * doc * Add comment and assertion.	2017-10-19 23:49:59 -07:00
Robert Nishihara	1cdc2fb011	Clean up event loop and callbacks when processes exit. (#1125 ) * Clean up event loop and callbacks when processes exit. * Fix bug.	2017-10-19 17:07:03 -07:00
Philipp Moritz	4157bcb80b	Improve deserialization performance by rebasing on latest arrow (#1129 ) * improve serialization performance by rebasing on latest arrow * update * revert worker.py	2017-10-17 14:56:11 -07:00

361 commits