hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-04 17:41:43 -05:00

Author	SHA1	Message	Date
Antoni Baum	3876fcdbe8	[CI] Add bazel py_test checking for Serve (#25509 )	2022-06-07 10:54:10 -07:00
Antoni Baum	045c47f172	[CI] Check test files for `if __name__...` snippet (#25322 ) Bazel operates by simply running the python scripts given to it in `py_test`. If the script doesn't invoke pytest on itself in the `if __name__ == "__main__"` snippet, no tests will be run, and the script will pass. This has led to several tests (indeed, some are fixed in this PR) that, despite having been written, have never run in CI. This PR adds a lint check to check all `py_test` sources for the presence of the `if __name__ == "__main__"` snippet, and will fail CI if any are detected without it. This system is only enabled for libraries right now (tune, train, air, rllib), but it could be trivially extended to other modules if approved.	2022-06-02 10:30:00 +01:00
Kai Fricke	65d9a410f7	[ci] Clean up ci/ directory (refactor ci/travis) (#23866 ) Clean up the ci/ directory. This means getting rid of the travis/ path completely and moving the files into sensible subdirectories. Details: - Moves everything under ci/travis into subdirectories, e.g. ci/build, ci/lint, etc. - Minor adjustments to some scripts (variable renames) - Removes the outdated (unused) asan tests	2022-04-13 18:11:30 +01:00
Siyuan (Ryans) Zhuang	1c992661a8	Add scripts symlink back (#9219 ) (#9475 ) (cherry picked from commit `77933c922d`) Co-authored-by: Simon Mo <xmo@berkeley.edu>	2020-07-14 12:31:49 -07:00
Sven Mika	fcdf410ae1	[RLlib] Tf2.x native. (#8752 )	2020-07-11 22:06:35 +02:00
Simon Mo	77933c922d	Add scripts symlink back (#9219 ) This partially reverts commit `43043ee4d5`.	2020-06-30 13:25:59 -07:00
Sven Mika	43043ee4d5	[RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136 ) * WIP. * Fixes. * LINT. * WIP. * WIP. * Fixes. * Fixes. * Fixes. * Fixes. * WIP. * Fixes. * Test * Fix. * Fixes and LINT. * Fixes and LINT. * LINT.	2020-06-30 10:13:20 +02:00
Eric Liang	f1239a7a63	Lint script link broken, also lint filter was broken for generated py files (#4133 )	2019-02-22 17:33:08 -08:00
Alok Singh	42a9233e1d	Improve yapf speed and document its usage (#2160 ) * Allow yapf to lint individual files * Add tip for using yapf * Update doc * Update script to autoformat changed py files The new default is for the script to only update changed files to encourage using it as a pre-push hook. Travis still checks all since it's not that big an increase to runtime. * Exclude formatting thirdparty/autogen py files * Symlink .travis -> scripts Hidden directories may get glossed over otherwise. * .travis -> scripts in docs They are symlinks to the same thing, but `scripts` is more dev-friendly, while `.travis` is really only for Travis CI. * Document different yapf format functions Most devs will only need `format_changed`, and this is run by default. `format_changed` should be fast enough in most cases to work as a pre-commit hook. * Speed up yapf by only formatting changed files * Update docs 1. Mention how yapf can be used as a pre-commit hook 2. rm `bash`, script is executable * Update yapf.sh * Update development.rst * Update yapf.sh * Use bash arrays for correct argument splitting Playing fast and loose with whitespace in bash is a terrible idea. * Only format non-excluded by default * Check changes against master Normally, the remote is called `origin`, but naming it explicit * Adding missing directory to `format_all` * Cleanup YAPF code Remove unused function and move around code to make clearer and adding lines give cleaner diffs. * Ensure correct files are autoformatted * Fix cmd line arg splitting Each arg has to be in its own set of quotes. * Diff against mergebase TIL there's a clean syntax for doing that, but it's too clever to belong in a shell script. We use `mapfile -t` to ensure no problems down the line with weird filenames.	2018-06-05 20:22:11 -07:00
Robert Nishihara	1a682e2807	Enable starting and stopping ray with "ray start" and "ray stop". (#628 ) * Install start_ray and stop_ray scripts in setup.py. * Update documentation. * Fix docker tests. * Implement stop_ray script in python. * Fix linting.	2017-06-02 20:17:48 +00:00
Stephanie Wang	ee08c8274b	Shard Redis. (#539 ) * Implement sharding in the Ray core * Single node Python modifications to do sharding * Do the sharding in redis.cc * Pipe num_redis_shards through start_ray.py and worker.py. * Use multiple redis shards in multinode tests. * first steps for sharding ray.global_state * Fix problem in multinode docker test. * fix runtest.py * fix some tests * fix redis shard startup * fix redis sharding * fix * fix bug introduced by the map-iterator being consumed * fix sharding bug * shard event table * update number of Redis clients to be 64K * Fix object table tests by flushing shards in between unit tests * Fix local scheduler tests * Documentation * Register shard locations in the primary shard * Add plasma unit tests back to build * lint * lint and fix build * Fix * Address Robert's comments * Refactor start_ray_processes to start Redis shard * lint * Fix global scheduler python tests * Fix redis module test * Fix plasma test * Fix component failure test * Fix local scheduler test * Fix runtest.py * Fix global scheduler test for python3 * Fix task_table_test_and_update bug, from actor task table submission race * Fix jenkins tests. * Retry Redis shard connections * Fix test cases * Convert database clients to DBClient struct * Fix race condition when subscribing to db client table * Remove unused lines, add APITest for sharded Ray * Fix * Fix memory leak * Suppress ReconstructionTests output * Suppress output for APITestSharded * Reissue task table add/update commands if initial command does not publish to any subscribers. * fix * Fix linting. * fix tests * fix linting * fix python test * fix linting	2017-05-18 17:40:41 -07:00
Robert Nishihara	8061b3b596	Revert "Suppress warning in start_ray.sh about leaving child processes running when parent exits. (#429 )" (#437 ) This reverts commit `85b373a4be`.	2017-04-07 17:32:28 -07:00
Robert Nishihara	320109a5bd	By default, start a number of workers equal to the number of CPUs. (#430 ) * By default, start a number of workers equal to the number of CPUs. * Fix stress tests.	2017-04-06 00:02:58 -07:00
Robert Nishihara	85b373a4be	Suppress warning in start_ray.sh about leaving child processes running when parent exits. (#429 )	2017-04-05 23:54:22 -07:00
Robert Nishihara	ba02fc0eb0	Run flake8 in Travis and make code PEP8 compliant. (#387 )	2017-03-21 12:57:54 -07:00
Stephanie Wang	12c9618c0c	Plasma and worker node failure. (#373 ) * Failing test case * Local scheduler exits cleanly after plasma store dies * Tolerate one plasma store failure * Tolerate plasma store failures on all nodes except head node * Plasma manager heartbeats * Component failure tests * Don't run the helper for Python testing * Fix C test * Fix hanging plasma transfer test * Fix python3 * Consolidate ClientConnection code * Fix valgrind test * fix c test * We can restart worker nodes! * Fix flatbuffers bug * Address comments * Only register actual workers with the local scheduler * Fix bug * Fix segfaults * Add test case that tests for driver liveness, fix local scheduler bug * Clean up after tests * Allocate retry info on the stack * Send SIGKILL before waiting * Relax unit test conditions * Driver liveness test case and documentation	2017-03-17 17:03:58 -07:00
Robert Nishihara	f1d4dda8cb	Put all log files in redis and visualize them in UI. (#350 ) * Start process for monitoring log files and push changes to redis. * Display log files in UI. * Bug fix for recent tasks. * Use flatbuffers to parse local scheduler heartbeats.	2017-03-16 15:27:00 -07:00
Robert Nishihara	53dffe0bf2	Use flatbuffers for some messages from Redis. (#341 ) * Compile the Ray redis module with C++. * Redo parsing of object table notifications with flatbuffers. * Update redis module python tests. * Redo parsing of task table notifications with flatbuffers. * Fix linting. * Redo parsing of db client notifications with flatbuffers. * Redo publishing of local scheduler heartbeats with flatbuffers. * Fix linting. * Remove usage of fixed-width formatting of scheduling state in channel name. * Reply with flatbuffer object to task table queries, also simplify redis string to flatbuffer string conversion. * Fix linting and tests. * fix * cleanup * simplify logic in ReplyWithTask	2017-03-10 18:35:25 -08:00
Stephanie Wang	41b8675d04	Availability after local scheduler failure (#329 ) * Clean up plasma subscribers on EPIPE First pass at a monitoring script - monitor can detect local scheduler death Clean up task table upon local scheduler death in monitoring script Don't schedule to dead local schedulers in global scheduler Have global scheduler update the db clients table, monitor script cleans up state Documentation Monitor script should scan tables before beginning to read from subscription channel Fix for python3 Redirect monitor output to redis logs, fix hanging in multinode tests * Publish auxiliary addresses as part of db_client deletion notifications * Fix test case? * Small changes. * Use SCAN instead of KEYS * Address comments * Address more comments * Free redis module strings	2017-03-02 19:51:20 -08:00
Robert Nishihara	1ae7e7d29e	Rename photon -> local scheduler. (#322 )	2017-02-27 12:24:07 -08:00
Robert Nishihara	072eadd57f	Pipe num_cpus and num_gpus through from start_ray.py. (#275 ) * Pipe num_cpus and num_gpus through from start_ray.py. * Improve load balancing tests. * Fix bug. * Factor out some testing code.	2017-02-13 17:43:23 -08:00
Robert Nishihara	3934d5f6eb	Remove old files and remove old documentation for copying files around cluster. (#274 )	2017-02-13 11:20:04 -08:00
Robert Nishihara	cb7f6ca9b5	Attempt to start web UI when starting Ray. (#269 ) * Attempt to start web UI when starting Ray. * Add instructions for using web UI to cluster documentation. * Don't check if port 8080 is open. * Remove print statement.	2017-02-12 15:17:58 -08:00
Robert Nishihara	f6ce9dfa6c	Allow start_ray.sh to take an object manager port. (#272 ) * Allow start_ray.sh to take an object manager port. * Fix typo and add test. * Small cleanups.	2017-02-12 12:39:32 -08:00
Johann Schleier-Smith	6ad2b5d87a	Add Redis port option to startup script (#232 ) * specify redis address when starting head * cleanup * update starting cluster documentation * Whitespace. * Address Philipp's comments. * Change redis_host -> redis_ip_address.	2017-01-31 00:28:00 -08:00
Richard Liaw	4575cd88b2	Improve error messages when nodes can't communicate with each other. (#223 ) * Good error messages when nodes can't communicate with each other * Print more information when starting the head node. * Change retries back to 5.	2017-01-22 14:53:15 -08:00
Robert Nishihara	9bb8162621	Improvements to documentation and error messages. (#221 )	2017-01-19 20:27:46 -08:00
Robert Nishihara	84296c8905	Documentation for using Ray on a cluster. (#165 )	2016-12-30 00:29:03 -08:00
Robert Nishihara	241c955707	Determine node IP address programmatically. (#151 ) * Determine node IP address programmatically. * Factor out methods for getting node IP addresses. * Address comments.	2016-12-23 15:31:40 -08:00
Robert Nishihara	92010ca5b5	Check that we can connect to Redis and that there aren't existing redis clients on the same node in start_ray.py (#148 )	2016-12-22 21:54:19 -08:00
Robert Nishihara	6cd02d71f8	Fixes and cleanups for the multinode setting. (#143 ) * Add function for driver to get address info from Redis. * Use Redis address instead of Redis port. * Configure Redis to run in unprotected mode. * Add method for starting Ray processes on non-head node. * Pass in correct node ip address to start_plasma_manager. * Script for starting Ray processes. * Handle the case where an object already exists in the store. Maybe this should also compare the object hashes. * Have driver get info from Redis when start_ray_local=False. * Fix. * Script for killing ray processes. * Catch some errors when the main_loop in a worker throws an exception. * Allow redirecting stdout and stderr to /dev/null. * Wrap start_ray.py in a shell script. * More helpful error messages. * Fixes. * Wait for redis server to start up before configuring it. * Allow seeding of deterministic object ID generation. * Small change.	2016-12-21 18:53:12 -08:00
Robert Nishihara	ddba1df802	Start working toward Python3 compatibility. (#117 )	2016-12-11 12:25:31 -08:00
Robert Nishihara	072f442c1f	Update worker.py and services.py to use plasma and the local scheduler. (#19 ) * Update worker code and services code to use plasma and the local scheduler. * Cleanups. * Fix bug in which threads were started before the worker mode was set. This caused remote functions to be defined on workers before the worker knew it was in WORKER_MODE. * Fix bug in install-dependencies.sh. * Lengthen timeout in failure_test.py. * Cleanups. * Cleanup services.start_ray_local. * Clean up random name generation. * Cleanups.	2016-11-02 00:39:35 -07:00
Robert Nishihara	6ed641177d	Remove unnecessary files. (#4 )	2016-10-26 23:24:40 -07:00
Robert Nishihara	91f16a3df0	Migrate repositories to ray-project. (#438 ) * Migrate repositories to ray-project. * Update numbuf to the migrated version.	2016-09-17 00:52:05 -07:00
Robert Nishihara	e06311d415	Automatically add relevant directories to Python paths of workers (#380 ) * Make ray.init set python paths of workers. * Decouple starting cluster from copying user source code * also add current directory to path * Add comments about deallocation. * Add test for new code path.	2016-08-16 14:53:55 -07:00
Robert Nishihara	13df8302e6	enable running example apps in cluster mode (#357 )	2016-08-08 16:01:13 -07:00
Robert Nishihara	a6452aca47	Command for installing example applications dependencies on cluster (#353 )	2016-08-05 14:54:32 -07:00
Robert Nishihara	1454c26693	fix bug with home directory on cluster (#352 )	2016-08-05 11:49:11 -07:00
Robert Nishihara	ac363bf451	Let worker get worker address and object store address from scheduler (#350 )	2016-08-04 17:47:08 -07:00
Johann Schleier-Smith	3ee0fd8f34	Update cluster guide (#347 ) * clarify cluster setup instructions * update multinode documentation, update cluster script, fix minor bug in worker.py * clarify cluster documentation and fix update_user_code	2016-08-04 09:14:20 -07:00
Robert Nishihara	2040372084	unify starting local cluster with attaching to existing cluster (#327 )	2016-07-31 19:26:35 -07:00
Robert Nishihara	bcd0e3781f	remove example functions and remove imports from shell (#314 )	2016-07-29 12:42:44 -07:00
Philipp Moritz	b5215f1e6a	make it possible to use directory as user source directory that doesn't contain worker.py (#297 )	2016-07-26 18:39:06 -07:00
Robert Nishihara	aa2f618ab7	add directory containing script to python path of workers (#296 )	2016-07-26 16:18:39 -07:00
Robert Nishihara	3bae6f136b	export remote functions and reusable variables that were defined before connect was called (#292 )	2016-07-26 11:40:09 -07:00
Robert Nishihara	8465df1146	script for launching nodes on ec2 (#270 ) * original spark-ec2 script * modifying spark-ec2 for ray	2016-07-16 15:14:14 -07:00
mehrdadn	0f1d7c5835	Run IPython shell without embedding (#269 )	2016-07-16 14:42:58 -07:00
Robert Nishihara	80526f7777	add documentation and refactor cluster.py (#238 )	2016-07-12 23:54:18 -07:00
Robert Nishihara	8952ff8cf9	allow cluster script to update worker code on nodes (#243 )	2016-07-11 17:58:16 -07:00

69 commits