hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Philipp Moritz	54925996ca	Allow remote functions to specify max executions and kill worker once limit is reached. (#660) * implement restarting workers after certain number of task executions * Clean up python code. * Don't start new worker when an actor disconnects. * Move wait_for_pid_to_exit to test_utils.py. * Add test. * Fix linting errors. * Fix linting. * Fix typo.	2017-06-13 00:34:58 -07:00
Eric Liang	4374ad1453	Policy gradient example: Support multi-GPU training (#584) * add tf metrics * comments * fix network scopes * add doc * initial work * try with 3 virtual cpus * clean up metrics * use format string * fix trace level * back to pong * always run summary on cpu * plot intermediate and final sgd stats * add back a global step * update * add timeline * use staging area and reuse weights properly * stage at cpu * whoops, stage only the batch * clean up a bit * fix py flake * wip * create an optimizer graph per device * print timeline on 5th batch instead * print examples per second * log placement for training ops * force placement on cpu:0 * try separating weights onto different gpus * try using nccl * add cpu fallback * remove space from date * check has gpu device * fix flag config * checkpoint * wip * update * add some timing * trace loading * try cpu * revert that * remove expensive test * lint * cleanups * clean up timers * clean it up a bit * fix code for non-scalar action spaces * address some nits * fix quotes * efficient shuffling between sgd epochs	2017-06-13 06:03:25 +00:00
Robert Nishihara	1916475e14	Increase socket listen backlog from 5 to 128. (#661)	2017-06-11 06:34:16 +00:00
Richard Liaw	8d350f628a	Fixing Redis Key Consistencies for Actor, FunctionTable, FunctionsToRun, and RemoteFunction (#659) * consistencies for Actor, FunctionTable, and FunctionsToRun * NOT WORKING: changing remote fn keys	2017-06-10 23:45:22 +00:00
Eric Liang	d4d2c03ac5	Remove timeout for Redis commands. (#649) * update * Remove interaction between callback data identifier and event loop. * Remove tests that no longer apply.	2017-06-09 15:55:36 -07:00
alanamarzoev	ee1d4e5ea2	Redirect worker stdout/stderr to log files. (#646) * local scheduler * redirect output files to be associated with workers rather than the local scheduler * fixed formatting * fixes * Moved output redirection logic to worker.py. * Changed write mode. * Fixed formatting. * Added comment. * Reuse log file creation in services.py. * Fix linting. * Fix problem in which multiple processes attempt to create /tmp/raylogs at the same time.	2017-06-08 18:30:48 -07:00
Crystal	fff50d824c	Doc using ray with gpu (#644) * Added to troubleshooting documentation about whether redefining remote functions runs the new code version * Minor correction to troubleshooting documentation * Writing new documentation page for using Ray with GPUs * Wrote new documentation page on using ray with gpus * Add some more details.	2017-06-08 00:12:44 -07:00
alanamarzoev	f0339f3386	Expose log files through global state API. (#641) * added log_table function and a test * fixed log_files and added task_profiles * fixed formatting * fixed linting errors * fixes * removed file * more fixes * hopefully fixed * Small changes. * Fix linting. * Fix bug in log monitor. * Small changes. * Fix bug in travis.	2017-06-08 00:08:10 -07:00
Robert Nishihara	fde843a636	Update installation documentation to recommend installing Ray with pip. (#637)	2017-06-07 05:51:06 +00:00
Crystal	60161f276b	Added to troubleshooting documentation about whether redefining remot… (#640) * Added to troubleshooting documentation about whether redefining remote functions runs the new code version * Minor correction to troubleshooting documentation * Small rewordings.	2017-06-06 22:49:53 -07:00
Philipp Moritz	690fe10bb6	Save policies for Evolution Strategies (#638) Save policies for evolution strategies.	2017-06-04 16:21:19 -07:00
Crystal	4c94d6c3b9	Rewrote and reordered the examples in the Actor documentation for cla… (#635) * Rewrote and reordered the examples in the Actor documentation for clarity. Also added an introduction to Gym * Minor tweaks to actor documentation * Small changes to wording.	2017-06-02 23:42:41 -07:00
Philipp Moritz	6adf39959c	put back large python object tests (commented out) (#636)	2017-06-02 20:36:10 -07:00
Robert Nishihara	301e0b0db8	Bump version to 0.1.1 in preparation for uploading wheels to PyPI. (#630)	2017-06-03 02:17:39 +00:00
Philipp Moritz	0254efa5e8	Use parallel memcopy from arrow (#633) * use parallel memcopy from arrow * fix linting * remove memory.h	2017-06-02 18:18:41 -07:00
Robert Nishihara	2694337c0f	Fix large memory tests. (#632) * Log the driver ID in hex instead of binary. * Fix large memory test and add more tests to it. * Remove tests that are too stressful.	2017-06-03 01:12:56 +00:00
Robert Nishihara	23b0c80967	Rename linux wheels so they can be uploaded to PyPI. (#629)	2017-06-02 20:20:34 +00:00
Robert Nishihara	1a682e2807	Enable starting and stopping ray with "ray start" and "ray stop". (#628) * Install start_ray and stop_ray scripts in setup.py. * Update documentation. * Fix docker tests. * Implement stop_ray script in python. * Fix linting.	2017-06-02 20:17:48 +00:00
Robert Nishihara	a4d8e13094	Suppress excess warning messages related to intentional actor deaths. (#627) * Don't submit the actor destructor tasks when the job is exiting. * Don't propagate error messages to the driver when an actor exits intentionally.	2017-06-01 20:10:40 +00:00
Robert Nishihara	d0bfc0a849	Clean up actor workers when actor handle goes out of scope. (#617)	2017-06-01 07:02:43 +00:00
Robert Nishihara	dd7f866a92	Fix compilation error on CentOS. (#622) * Fix compilation error on CentOS. * add TODO	2017-06-01 06:51:00 +00:00
Robert Nishihara	5f193afb87	Tell local scheduler to ignore SIGCHLD so that workers don't become zombies. (#620)	2017-06-01 06:37:28 +00:00
Robert Nishihara	4d51ed37b2	Fix bug in which plasma client file descriptors were not closed. (#618) * Fix bug in which plasma client file descriptors were not closed. * Add logging statement when disconnecting client from plasma store. * Fix after rebasing. * Add more checks to plasma disconnect client.	2017-06-01 05:37:29 +00:00
Robert Nishihara	bcaab78908	Add script for building MacOS wheels. (#601) * Add script for building MacOS wheels. * Small cleanups to script. * Fix setting of PATH before building wheel. * Create symbolic link to correct Python executable so Ray installation finds the right Python. * Address comments. * Rename readme.	2017-06-01 00:30:46 +00:00
Philipp Moritz	b94b4a35e0	Make the Plasma store ready for Arrow integration (#579) * port plasma to arrow * fixes * refactor plasma client * more modernization * fix plasma manager tests * everything compiles * fix plasma client tests * update plasma serialization tests * fix plasma manager tests * fix bug * updates * fix bug * fix tests * fix rebase * address comments * fix travis valgrind build * fix linting * fix include order again * fix linting * address comments	2017-05-31 16:24:23 -07:00
Richard Shin	609b5c1a4c	Add script to build manylinux1 .whl files (#600) * Add manylinux setup * Switch to cp27mu * python/MANIFEST.in * Fix MANIFEST.in * Add build-wheel-manylinux1.sh * Update readme * Install correct version of numpy * Fix typo in README-manylinux1.md * Don't install cmake * Remove commented line from setup.py * Delete unused manylinux1.sh * Run setup.py bdist_wheel twice * Don't use package_data and MANIFEST.in. * Small aesthetic change. * Trigger build_ext in setup.py. * Remove nonexistent file from MANIFEST.in. * Manually copy files in MANIFEST.in to where Python expects them in order to prevent setup.py from having to be run twice. * Only run setup.py once when building wheels. * Aesthetic change to readme. * Copy generated flatbuffer Python files in build_ext. * Fix permission denied error by making sure to preserve executableness when copying files. * Remove unnecessary argument to setup.py. * Remove MANIFEST.in and move files to include into list in setup.py. * Fix numpy version when building wheels and replace rm with git clean.	2017-05-27 21:35:48 -07:00
Robert Nishihara	97af3b34d8	Use string instead of list in tutorial example to make it clearer. (#586)	2017-05-26 15:32:51 -07:00
Philipp Moritz	647e1d9fc3	Fix runtest.py on the ubuntu system python 3 (#599) * fix runtest.py on the ubuntu system python 3 * less strict version of the test	2017-05-26 15:22:36 -07:00
Richard Shin	16050eca8d	Don't link Python extensions to libpython*.so (#598)	2017-05-25 19:01:12 -07:00
Chelsea Finn	f97d0393cc	Fix to json decoding bug (#597) * fix json decoding bug * Fix linting error.	2017-05-25 18:48:39 -07:00
Michael Whittaker	1985838a30	Fixed small typo in actors.rst. (#595)	2017-05-25 11:30:45 -07:00
Philipp Moritz	3885d1b286	make builds with CMake incremental (#592)	2017-05-24 21:52:33 -07:00
Robert Nishihara	997aa35721	Remove cloudpickle customization and just use plain cloudpickle. (#588) * Remove augmentations of cloudpickle. * Entirely remove cloudpickle modifications. Just use plain cloudpickle.	2017-05-24 20:22:28 -07:00
Philipp Moritz	679910496e	fix policy gradients for mujoco domains (#589)	2017-05-24 18:39:37 -07:00
Robert Nishihara	c5bc76193f	Remove Ray environment variables from codebase. (#590)	2017-05-24 18:29:40 -07:00
Robert Nishihara	c647dd5f6c	Make it possible to use actor definitions within remote functions and other actors. (#587) * Enable remote function and actor definitions to close over actor definitions. * Give better error message if actor objects are pickled. * Add tests for closing over actor definitions. * Fix linting.	2017-05-24 15:43:32 -07:00
Robert Nishihara	bc8b0db13e	Add section on troubleshooting to the documentation. (#583) * Add section on troubleshooting to the documentation. * Address comments. * Update file descriptor troubleshooting.	2017-05-22 15:20:20 -07:00
Eric Liang	06241daf61	Policy gradient example: record stats for tensorboard (#577) * add tf metrics * comments * fix network scopes * add doc * use format string * fix trace level * plot intermediate and final sgd stats * add back a global step	2017-05-21 14:51:24 -07:00
Robert Nishihara	c440010cbd	Bump version to 0.1.0. (#581)	2017-05-20 23:25:01 -07:00
Robert Nishihara	40b96b03b8	Add note about adaptively launching tasks in blog post. (#582)	2017-05-20 23:19:42 -07:00
Robert Nishihara	07b21e057c	Print the driver stdout/stderr if we fail to decode it in jenkins. (#567) * Print the driver stdout/stderr if we fail to decode it in jenkins. * Fix whitespace. * Add explanation.	2017-05-20 23:11:19 -07:00
Robert Nishihara	849d2aaf47	Fix image paths in blog post, add section on ray.wait. (#580) * Fix image paths in blog post. * Use future instead of object ID. * Add description of ray.wait. * Revert to keep some of the object ID terminology.	2017-05-20 23:10:18 -07:00
Robert Nishihara	3f30f29987	Fix typos in actor documentation. (#578) * Fix typos in actor documentation. * Reset the gym environment in actor documentation.	2017-05-20 23:06:44 -07:00
Philipp Moritz	d394a3fdf6	Website for the v0.1 release (#576) * commit jekyll template * Port blog post to markdown. * Small changes. * Improvements to layout and post. * More improvements. * Add computation graph figures to the blog post. * Small changes. * Update gitignore.	2017-05-20 18:33:36 -07:00
Robert Nishihara	3d2f1b1941	Small improvements to using ray on large cluster documentation. (#573)	2017-05-19 16:13:50 -07:00
Robert Nishihara	b62693ca67	Fix Python 2 bug in hyperopt example. (#575)	2017-05-19 16:12:13 -07:00
Robert Nishihara	179416e8a2	Improve the cluster usage documentation. (#568) * Update cluster documentation and switch md to rst. * Improve cluster documentation.	2017-05-19 11:36:48 -07:00
Stephanie Wang	ee08c8274b	Shard Redis. (#539) * Implement sharding in the Ray core * Single node Python modifications to do sharding * Do the sharding in redis.cc * Pipe num_redis_shards through start_ray.py and worker.py. * Use multiple redis shards in multinode tests. * first steps for sharding ray.global_state * Fix problem in multinode docker test. * fix runtest.py * fix some tests * fix redis shard startup * fix redis sharding * fix * fix bug introduced by the map-iterator being consumed * fix sharding bug * shard event table * update number of Redis clients to be 64K * Fix object table tests by flushing shards in between unit tests * Fix local scheduler tests * Documentation * Register shard locations in the primary shard * Add plasma unit tests back to build * lint * lint and fix build * Fix * Address Robert's comments * Refactor start_ray_processes to start Redis shard * lint * Fix global scheduler python tests * Fix redis module test * Fix plasma test * Fix component failure test * Fix local scheduler test * Fix runtest.py * Fix global scheduler test for python3 * Fix task_table_test_and_update bug, from actor task table submission race * Fix jenkins tests. * Retry Redis shard connections * Fix test cases * Convert database clients to DBClient struct * Fix race condition when subscribing to db client table * Remove unused lines, add APITest for sharded Ray * Fix * Fix memory leak * Suppress ReconstructionTests output * Suppress output for APITestSharded * Reissue task table add/update commands if initial command does not publish to any subscribers. * fix * Fix linting. * fix tests * fix linting * fix python test * fix linting	2017-05-18 17:40:41 -07:00
shane	0a4304725f	adding -x for clearer output in build console log (#565)	2017-05-18 17:04:56 -07:00
Wapaul1	f861124b9a	Added python2 support and check for outdated tf (#562) Improve the Evolutionary Strategies example.	2017-05-17 20:42:17 -07:00

955 commits