hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-05 18:11:42 -05:00

Author	SHA1	Message	Date
Robert Nishihara	01e18b47f4	Direct people to stackoverflow for questions about usage. (#3830) * Direct people to stackoverflow for questions about usage. * Improve wording	2019-01-23 13:30:02 -08:00
Wang Qing	dcb744518e	Implement actor dummy object gc in java (#3822) * Add dummy object gc in java * Fix * Address comments. * Refine * Address comments.	2019-01-23 11:56:25 -08:00
Wang Qing	816406ea3d	[Java] Fix `setCurrentTask()` in multi threading (#3821)	2019-01-23 20:45:30 +08:00
Robert Nishihara	0b1608a546	Factor out code for starting new processes and test plasma store in valgrind. (#3824) * Factor out starting Ray processes. * Detect flags through environment variables. * Return ProcessInfo from start_ray_process. * Print valgrind errors at exit. * Test valgrind in travis. * Some valgrind fixes. * Undo raylet monitor change. * Only test plasma store in valgrind.	2019-01-22 14:59:11 -08:00
Eric Liang	f0e6523323	[rllib] Don't call reset() unless necessary for multi-agent envs	2019-01-20 15:00:18 -08:00
Philipp Moritz	0dad4e6a25	Build Raylet with Bazel (#3806)	2019-01-20 12:16:47 -08:00
Eric Liang	aad48ee5a5	[tune] Fully deprecate raw function literals in Tune (#3788) Related: https://github.com/ray-project/ray/issues/3785	2019-01-19 17:09:36 -08:00
Michael Luo	16f7ca45e4	Appo (#3779) * Deleted old fork, updated new ray and moved PPO-impala to APPO in ppo folder * Deleted unneccesary vtrace.py file * Update pong-impala.yaml * Cleaned PPO Code * Update pong-impala.yaml * Update pong-impala.yaml * wip * new ifle * refactor * add vtrace off option * revert * support any space * docs * fix comment * remove kl * Update cartpole-appo-vtrace.yaml	2019-01-18 13:40:26 -08:00
Philipp Moritz	931e6a2fc3	Fix compilation error on ARM. (#3800)	2019-01-18 00:25:16 -08:00
Robert Nishihara	9af5a62e05	Give better error for old-style actor classes. (#3793)	2019-01-17 19:05:04 -08:00
Richard Liaw	0537508106	Bump strings for 0.6.2 (#3801)	2019-01-17 19:03:27 -08:00
Si-Yuan	16a3b99d8d	Get rid of Arrow test utils (#3734) * convert code to proper C++ * revert changes to "id.h" because #3765 has been merged. * revert changes to Python bindings because they will be removed in #3541 * remove dependencies of Arrow logging * revert changes to Arrow logging * lint	2019-01-17 18:35:41 -08:00
Jones Wong	319c1340cb	[rllib] Develop MARWIL (#3635) * add marvil policy graph * fix typo * add offline optimizer and enable running marwil * fix loss function * add maintaining the moving average of advantage norm * use sync replay optimizer for unifying * remove offline optimizer and use sync replay optimizer * format by yapf * add imitation learning objective * fix according to eric's review * format by yapf * revise * add test data * marwil	2019-01-16 19:00:43 -08:00
Hao Chen	d1840bc7a9	Simplify RayConfig (#3714)	2019-01-16 16:43:26 -08:00
Richard Liaw	75ac016e2b	Bump version (#3787)	2019-01-16 11:40:54 -08:00
Richard Liaw	fa99fda2b4	Application Stress Tests (#3612)	2019-01-16 02:05:16 -08:00
Richard Liaw	c28e6d41f5	[tune] Avoid overwriting checkpoint file (#3781)	2019-01-16 02:03:16 -08:00
ggdupont	a237b4a6a1	[Java] Fix package jaxb not exist when JDK11 (#3738)	2019-01-16 14:15:00 +08:00
Philipp Moritz	3b39066c15	Fix pandas 0.22 incompatibility by upgrading Arrow (#3786)	2019-01-15 21:17:32 -08:00
Eric Liang	401e656b95	[rllib] Sync filters at end of iteration not start; hierarchical docs (#3769)	2019-01-15 16:25:25 -08:00
Richard Liaw	3918934dfd	[tune] Cross-Node Recovery (#3725) Augments trial restore to also check if the runner is at the same location. If not, the checkpoint files are pushed onto the new location.	2019-01-15 10:37:28 -08:00
Si-Yuan	a5df8e3532	minor fix (#3770)	2019-01-14 13:52:51 -08:00
Tianming Xu	0b8008f41c	remove RAY_CHECK around wait_state.remaining.erase (#3745)	2019-01-14 10:32:31 -08:00
Philipp Moritz	02bdaf221d	Update arrow to include https://github.com/apache/arrow/pull/3392 (#3765) * update arrow to include https://github.com/apache/arrow/pull/3392 * add appropriate includes * update	2019-01-14 19:20:26 +08:00
Wang Qing	3cf59855af	[Java] Replace junit with testNG (#3768)	2019-01-14 17:49:17 +08:00
Robert Nishihara	19908c01b8	Use environment markers to only install faulthandler in Python < 3.3. (#3764)	2019-01-14 15:55:59 +08:00
Hao Chen	1bb20badec	[Java] Fix bug when actor creation task fails (#3740) * [Java] Fix bug when actor creation task fails * remove imports	2019-01-14 11:09:15 +08:00
Robert Nishihara	27c20a41a9	Update stress tests. (#3614) Starts clusters for testing and has a fallback to kill the cluster if the command fails. The results are then printed at the end of test.	2019-01-13 17:08:51 -08:00
Eugene Vinitsky	a5d1f03515	[rllib] fix for rollout of lstm policies (#3643) * fix for lstm policies * added call to local evaluator * Update python/ray/rllib/rollout.py Co-Authored-By: eugenevinitsky <eugenevinitsky@users.noreply.github.com> * Update rollout.py * Update rollout.py	2019-01-13 15:54:23 -08:00
Philipp Moritz	00e9f8d870	Fix pyarrow version (#3760)	2019-01-13 14:28:23 -08:00
jhpenger	3adffe6a4e	[docs] Add example showing how to use Ray on Kubernetes. (#3126) Closes #1353.	2019-01-13 13:56:47 -08:00
Wang Qing	8674606e26	Support to auto-generate Java files from flatbuffer (#3749) * auto gen flatbuffers for Java * Add auto_gen_tool.py * Refine * Add a comment * address comments. * Address comments. * Addressed * Refine * Address comments * Fix typo * Add exception * Address comments. * Refine * Fix lint * Fix * Fix lint and address comment. * Fix lint error	2019-01-13 11:39:23 -08:00
Yuhong Guo	d2cf8561f2	Refactor code about ray.ObjectID. (#3674) * Refactor code about ray.ObjectID. * remove from_random and use nil_id instead of constructor * remove id() in hash * Lint and fix * Change driver id to ObjectID * Replace binary_to_hex(ObjectID.id()) to ObjectID.hex()	2019-01-13 01:47:29 -08:00
Eric Liang	c4b058739b	Remove redundant error message (#3761)	2019-01-12 22:22:41 -08:00
Richard Liaw	bdeeacc70f	[autoscaler] RecoverUnhealthyWorker mitigation (#3699) Increases number of retries for RecoverUnhealthyWorkers Closes #3435.	2019-01-12 14:06:53 -08:00
Robert Nishihara	1480f309c3	[doc] Replace runtest.py with mini_test.py in documentation. (#3750) Rename `xray_test.py` to `mini_test.py` and use that in the documentation. Right now we suggest that people run `runtest.py`, but that often doesn't succeed and takes too long.	2019-01-12 14:05:28 -08:00
James Casbon	528bb3afd9	gcp allow manual network configuration (#3748)	2019-01-12 14:02:20 -08:00
Robert Nishihara	fbea1ece2e	Clear new actor handle list after submitting task. (#3755)	2019-01-12 23:25:40 +08:00
Wang Qing	0a556dc0b5	Refine redis client (#3758)	2019-01-12 23:01:48 +08:00
Wang Qing	a0cf8ee5a8	Refine Java worker code (#3735)	2019-01-12 22:45:33 +08:00
Robert Nishihara	8723d6b061	Define a Node class to manage Ray processes. (#3733) * Implement Node class and move most of services.py into it. * Wait for nodes as they are added to the cluster. * Fix Redis authentication bug. * Fix bug in client table ordering. * Address comments. * Kill raylet before plasma store in test. * Minor	2019-01-11 22:30:38 -08:00
Wang Qing	fa2bfa6d76	Fix some small code quality issues. (#3719)	2019-01-11 15:24:49 +08:00
Stephanie Wang	cc5ecd71c5	[autoscaler] Add kill and get IP commands to CLI for testing (#3731) ## What do these changes do? Adds 2 commands to the CLI that take in an autoscaler config: 1. Kill a random ray node in the cluster. 2. Get all the worker node IP addresses. These commands are both for testing and are not recommended for normal use. ## Related issue number Closes #3685.	2019-01-10 22:06:57 -08:00
Richard Liaw	574f0b73bc	[tune] Fix Trial Serialization (#3743)	2019-01-10 19:26:10 -08:00
Hao Chen	597abb24ea	Refine multi-threading support (#3672) * [Python] refine multi-threading support fix * [java] refine multithreading code fix java * format	2019-01-10 13:58:11 -08:00
Eric Liang	71243203a4	[rllib] Fix KeyError: 'kl' in multiagent ppo training	2019-01-09 19:33:07 -08:00
Hao Chen	6fc3fc4120	Cap task lease timeout (#3707)	2019-01-09 17:19:48 -08:00
Richard Liaw	edb7aaf7c7	[tune] Better Serialization for Server (#3708) * Add cloudpickle for serialization * Fix tests	2019-01-09 11:55:32 -08:00
Stephanie Wang	04f31db54d	Actor dummy object garbage collection (#3593) * Convert UniqueID::nil() to a constructor * Cleanup actor handle pickling code * Add new actor handles to the task spec * Pass in new actor handles * Add new handles to the actor registration * Regression test for actor handle forking and GC * lint and doc * Handle pickled actor handles in the backend and some refactoring * Add regression test for dummy object GC and pickled actor handles * Check for duplicate actor tasks on submission * Regression test for forking twice, fix failed named actor leak * Fix bug for forking twice * lint * Revert "Fix bug for forking twice" This reverts commit 3da85e59d401e53606c2e37ffbebcc8653ff27ac. * Add new actor handles when task is assigned, not finished * Remove comment * remove UniqueID() * Updates * update * fix * fix java * fixes * fix	2019-01-09 10:37:11 -08:00
Wenting Shen	3027dde303	Fix some storage problems of RayLog (#3595) 1. Fix the problem of duplicated stored logs. 2. Save log whose level is higher than severity_threshold, not only with severity_threshold. 3. Fix a `log_dir` bug: storing logs in a wrong path.	2019-01-09 13:54:21 +08:00

2419 commits