hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 04:46:38 -04:00

Author	SHA1	Message	Date
Philipp Moritz	62de86ff7a	fix redis module build dependencies (#2247 )	2018-06-13 10:18:09 -07:00
Hao Chen	8efd0f7b1b	[xray] support multi-workers per process (#2244 ) * support multi-workers per process Signed-off-by: Hao Chen <chenh1024@gmail.com> * use RayConfig Signed-off-by: Hao Chen <chenh1024@gmail.com> * fix Signed-off-by: Hao Chen <chenh1024@gmail.com> * fix * remove clear * address comments * fix lint * fix bug * make WorkerPool and WorkerPoolMock more consistent	2018-06-13 10:14:05 -07:00
songqing	78a48fa1e0	Fix build error when building Ray for Java later than Python (#2241 )	2018-06-12 21:11:30 -07:00
Eric Liang	be178ae031	[autoscaler] GCP docs (#2235 )	2018-06-12 12:40:12 -07:00
Eric Liang	7fcaad264a	[autoscaler] Translate to/from AWS 'Name' tag (#2219 ) * fix tag * fix	2018-06-11 12:10:10 -07:00
Alok Singh	d47d6a6b7a	[rllib] Use correct method name (#2226 )	2018-06-11 09:53:31 -07:00
Devin Petersohn	b886ceca47	[DataFrame] Implement __array_wrap__ (#2218 ) * Implement __array_wrap__ * Removing unnecessary test	2018-06-11 08:56:43 -07:00
Robert Nishihara	61139e1509	Enable fractional resources and resource IDs for xray. (#2187 ) * Implement GPU IDs and fractional resources. * Add documentation and python exceptions. * Fix signed/unsigned comparison. * Fix linting. * Fixes from rebase. * Re-enable tests that use ray.wait. * Don't kill the raylet if an infeasible task is submitted. * Ignore tests that require better load balancing. * Linting * Ignore array test. * Ignore stress test reconstructions tests. * Don't kill node manager if remote node manager disconnects. * Ignore more stress tests. * Naming changes * Remove outdated todo * Small fix * Re-enable test. * Linting * Fix resource bookkeeping for blocked tasks. * Fix linting * Fix Java client. * Ignore test * Ignore put error tests	2018-06-10 15:31:43 -07:00
Richard Liaw	f19decb848	[docs] Update RLlib install to not include Tensorflow (#2178 )	2018-06-10 10:29:12 -07:00
Philipp Moritz	4ec5bea03b	[xray] Implement fetch (#2195 )	2018-06-09 23:36:27 -07:00
Robert Nishihara	125fe1c09c	Print warning when defining very large remote function or actor. (#2179 ) * Print warning when defining very large remote function or actor. * Add weak test. * Check that warnings appear in test. * Make wait_for_errors actually fail in failure_test.py. * Use constants for error types. * Fix	2018-06-09 19:59:15 -07:00
andrewztan	1475600c81	[rllib] Merge DDPG and DDPG2 implementations (#2202 ) * removed ddpg2 * removed ddpg2 from codebase * added tests used in ddpg vs ddpg2 comparison * added notes about training timesteps to yaml files * removed ddpg2 yaml files * removed unnecessary configs from yaml files * removed unnecessary configs from yaml files * moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples * moved pendulum, mountaincarcontinuous, and halfcheetah tests to tuned_examples * added more configuration details to yaml files * removed random starts from halfcheetah	2018-06-09 16:46:23 -07:00
Yujie Liu	3b5e700fd7	[JavaWorker] Java code lint check and binding to CI (#2225 ) * add java code lint check and fix the java code lint error * add java doc lint check and fix the java doc lint error * add java code and doc lint to the CI	2018-06-09 16:26:54 -07:00
Robert Nishihara	5789a247f9	[xray] Do not redirect worker output to files by default. (#2220 )	2018-06-09 15:00:42 -07:00
Eric Liang	71eb558eb0	[rllib] Refactor rllib to have a common sample collection pathway (#2149 )	2018-06-09 00:21:35 -07:00
Stephanie Wang	cb5e6e6d68	Add dependency between copy_ray and python extensions (#2221 )	2018-06-08 20:41:54 -07:00
Eric Liang	32b9a4d3f1	Fix yapf excludes, print diff in --all mode (#2211 ) * fix * travis	2018-06-08 02:25:55 -07:00
Eric Liang	8da558f5b7	[autoscaler] Should use internal IP for ssh (#2209 )	2018-06-08 01:08:59 -07:00
Eric Liang	31046f7e06	Autoscaler Python 2 queue fix (#2205 )	2018-06-07 18:43:07 -07:00
Eric Liang	100d8c207f	[xray] [autoscaler] Fix autoscaler / raylet integration (#2143 )	2018-06-07 15:43:20 -07:00
Yuhong Guo	0a34bea0b0	Use scoped enums in C++ and flatbuffers. (#2194 ) * Enable --scoped-enums in flatbuffer compiler. * Change enum to c++11 style (enum class). * Resolve conflicts. * Solve building failure when RAY_USE_NEW_GCS=on and remove ERROR_INDEX suffix. * Merge with master and fix CI failure.	2018-06-07 01:01:21 -07:00
Hao Chen	f0907a6ee9	Optimize lineage eviction efficiency (#2196 ) * Java in vscode. * Optimize lineage eviction * minor fix * fix ut * fix comment and lint * format * format * remove unneeded code	2018-06-07 00:35:15 -07:00
Philipp Moritz	343f29801b	[xray] Fix compilation on mac (#2199 )	2018-06-06 22:33:46 -07:00
Melih Elibol	7246ff80a4	[xray] Implements ray.wait (#2162 ) Implements ray.wait for xray. Fixes #1128.	2018-06-06 16:56:44 -07:00
Devin Petersohn	c8c0349511	[DataFrame] Temporarily changing the requirement until our pandas compat is updated (#2197 ) * Temporarily changing the requirement until our pandas compat is updated for 0.23 * Fix lint	2018-06-06 12:01:43 -07:00
Yuhong Guo	5b0df0eca2	Change surefire version to 2.21.0 to fix test failure on Java10. (#2198 )	2018-06-06 10:39:20 -07:00
Alok Singh	42a9233e1d	Improve yapf speed and document its usage (#2160 ) * Allow yapf to lint individual files * Add tip for using yapf * Update doc * Update script to autoformat changed py files The new default is for the script to only updated changed files to encourage using it as a pre-push hook. Travis still checks all since it's not that big an increase to runtime. * Exclude formatting thirdparty/autogen py files * Symlink .travis -> scripts Hidden directories may get glossed over otherwise. * .travis -> scripts in docs They are symlinks to the same thing, but `scripts` is more dev-friendly, while `.travis` is really only for Travis CI. * Document different yapf format functions Most devs will only need `format_changed`, and this is run by default. `format_changed` should be fast enough in most cases to work as a pre-commit hook. * Speed up yapf by only formatting changed files * Update docs 1. Mention how yapf can be used a pre-commit hook 2. rm `bash`, script is executable * Update yapf.sh * Update development.rst * Update yapf.sh * Use bash arrays for correct argument splitting Playing fast and loose with whitespace in bash is a terrible idea. * Only format non-excluded by default * Check changes against master Normally, the remote is called `origin`, but naming it explicit * Adding missing directory to `format_all` * Cleanup YAPF code Remove unused function and move around code to make clearer and adding lines give cleaner diffs. * Ensure correct files are autoformatted * Fix cmd line arg splitting Each arg has to be in its own set of quotes. * Diff against mergebase TIL there's a clean syntax for doing that, but it's too clever to belong in a shell script. We use `mapfile -t` to ensure no problems down the line with weird filenames.	2018-06-05 20:22:11 -07:00
Adam Gleave	6ef3b255ea	Launch nodes in separate threads (#2183 ) Modifies the autoscaler to run launch_new_nodes in a separate thread, keeping track of the number of pending requests.	2018-06-05 20:19:31 -07:00
Richard Liaw	13d4e0db95	Add Docker Support for ASV (#2184 ) * added new instructions and script * initialize ray only once * use ray-project/asv master	2018-06-05 15:55:35 -07:00
Simon Mo	a139a5df8c	[DataFrame] Implement Memoizer (#2157 ) * Implement Memoizer * Add LRUCache * Add comments	2018-06-05 07:18:12 -07:00
songqing	451cdb43f6	Fix redefinition of flatbuffer types (#2189 )	2018-06-05 00:08:05 -07:00
Devin Petersohn	b56c8ed8dc	[DataFrame] Fix equals and make it more efficient (#2186 ) * Fixing equals * Adding test fix * Working on fix for equals and drop * Fix equals and fix tests to use ray.dataframe.equals * Addressing comments	2018-06-04 13:10:06 -07:00
Peter Schafhalter	a5d888e49b	[DataFrames] More dtypes optimizations (#2124 ) * Pass dtypes for some DataFrame constructors * More optimizations with dtypes_cache * Optimizations	2018-06-04 10:50:13 -07:00
Binglin Chang	19d6ca0670	Support constructing TensorFlowVariables from multiple tf operations (#2182 )	2018-06-02 18:13:52 -07:00
Philipp Moritz	d699bfbf10	Use hashing function that takes into account all UniqueID bytes (#2174 )	2018-06-01 23:07:29 -07:00
Philipp Moritz	e1024d84e9	[xray] Start actor workers in parallel (#2168 )	2018-06-01 23:04:16 -07:00
Kunal Gosar	317d0da7d8	Add experimental API for ray.get and ray.wait with additional argument types (#2071 )	2018-06-01 16:42:27 -07:00
songqing	4dd4698564	unify build dir for Python and Java (#2171 ) * unify build dir for Python and Java * enable executables auto installed when just running 'make' * fix plasma_store copy error * fix cmake error about copying executables * lint fix * recover python/setup.py * enable to copy optional file automatically * a small fix of path * lint fix * lint fix * lint fix * Add comment.	2018-06-01 16:28:27 -07:00
Yuhong Guo	c1de03acac	Add timeout mechanism to Push function instead of retries (#2148 ) Use timer instead of retries in Push when objects are not local.	2018-06-01 01:21:05 -07:00
Kristian Hartikainen	74dc14d1fc	[autoscaler] GCP node provider (#2061 ) * Google Cloud Platform scaffolding * Add minimal gcp config example * Add googleapiclient discoveries, update gcp.config constants * Rename and update gcp.config key pair name function * Implement gcp.config._configure_project * Fix the create project get project flow * Implement gcp.config._configure_iam_role * Implement service account iam binding * Implement gcp.config._configure_key_pair * Implement rsa key pair generation * Implement gcp.config._configure_subnet * Save work-in-progress gcp.config._configure_firewall_rules. These are likely to be not needed at all. Saving them if we happen to need them later. * Remove unnecessary firewall configuration * Update example-minimal.yaml configuration * Add new wait_for_compute_operation, rename old wait_for_operation * Temporarily rename autoscaler tags due to gcp incompatibility * Implement initial gcp.node_provider.nodes * Still missing filter support * Implement initial gcp.node_provider.create_node * Implement another compute wait operation (wait_For_compute_zone_operation). TODO: figure out if we can remove the function. * Implement initial gcp.node_provider._node and node status functions * Implement initial gcp.node_provider.terminate_node * Implement node tagging and ip getter methods for nodes * Temporarily rename tags due to gcp incompatibility * Tiny tweaks for autoscaler.updater * Remove unused config from gcp node_provider * Add new example-full example to gcp, update load_gcp_example_config * Implement label filtering for gcp.node_provider.nodes * Revert unnecessary change in ssh command * Revert "Temporarily rename tags due to gcp incompatibility" This reverts commit e2fe634c5d11d705c0f5d3e76c80c37394bb23fb. * Revert "Temporarily rename autoscaler tags due to gcp incompatibility" This reverts commit c938ee435f4b75854a14e78242ad7f1d1ed8ad4b. 
* Refactor autoscaler tagging to support multiple tag specs * Remove missing cryptography imports * Update quote function import * Fix threading issue in gcp.config with the compute discovery object * Add gcs support for log_sync * Fix the labels/tags naming discrepancy * Add expanduser to file_mounts hashing * Fix gcp.node_provider.internal_ip * Add uuid to node name * Remove 'set -i' from updater ssh command * Also add TODO with the context and reason for the change. * Update ssh key creation in autoscaler.gcp.config * Fix wait_for_compute_zone_operation's threading issue Google discovery api's compute object is not thread safe, and thus needs to be recreated for each thread. This moves the `wait_for_compute_zone_operation` under `autoscaler.gcp.config`, and adds compute as its argument. * Address pr feedback from @ericl * Expand local file mount paths in NodeUpdater * Add ssh_user name to key names * Update updater ssh to attempt 'set -i' and fall back if that fails * Update gcp/example-full.yaml * Fix wait crm operation in gcp.config * Update gcp/example-minimal.yaml to match aws/example-minimal.yaml * Fix gcp/example-full.yaml comment indentation * Add gcp/example-full.yaml to setup files * Update example-full.yaml command * Revert "Refactor autoscaler tagging to support multiple tag specs" This reverts commit 9cf48409ca2e5b66f800153853072c706fa502f6. * Update tag spec to only use characters [0-9a-z_-] * Change the tag values to conform gcp spec * Add project_id in the ssh key name * Replace '_' with '-' in autoscaler tag names * Revert "Update updater ssh to attempt 'set -i' and fall back if that fails" This reverts commit 23a0066c5254449e49746bd5e43b94b66f32bfb4. * Revert "Remove 'set -i' from updater ssh command" This reverts commit 5fa034cdf79fa7f8903691518c0d75699c630172. 
* Add fallback to `set -i` in force_interactive command * Update autoscaler tests to match current implementation * Update GCPNodeProvider.create_node to include hash in instance name * Add support for creating multiple instance on one create_node call * Clean TODOs * Update styles * Replace single quotes with double quotes * Some minor indentation fixes etc. * Remove unnecessary comment. Fix indentation. * Yapfify files that fail flake8 test * Yapfify more files * Update project_id handling in gcp node provider * temporary yapf mod * Revert "temporary yapf mod" This reverts commit b6744e4e15d4d936d1a14f4bf155ed1d3bb14126. * Fix autoscaler/updater.py lint error, remove unused variable	2018-05-31 09:00:03 -07:00
Stephanie Wang	117107cb15	[xray] Evict tasks from the lineage cache (#2152 )	2018-05-31 00:24:39 -07:00
Philipp Moritz	12de668ccb	[ASV] Add ray.init and simple Ray benchmarks (#2166 )	2018-05-31 00:06:17 -07:00
Robert Nishihara	c85bb8fb4e	Re-encrypt key for uploading to S3 from travis to use travis-ci.com. (#2169 )	2018-05-31 00:05:03 -07:00
Alok Singh	fd234e3171	[rllib] Fix A3C PyTorch implementation (#2036 ) * Use F.softmax instead of a pointless network layer Stateless functions should not be network layers. * Use correct pytorch functions * Rename argument name to out_size Matches in_size and makes more sense. * Fix shapes of tensors Advantages and rewards both should be scalars, and therefore a list of them should be 1D. * Fmt * replace deprecated function * rm unnecessary Variable wrapper * rm all use of torch Variables Torch does this for us now. * Ensure that values are flat list * Fix shape error in conv nets * fmt * Fix shape errors Reshaping the action before stepping in the env fixes a few errors. * Add TODO * Use correct filter size Works when `self.config['model']['channel_major'] = True`. * Add missing channel major * Revert reshape of action This should be handled by the agent or at least in a cleaner way that doesn't break existing envs. * Squeeze action * Squeeze actions along first dimension This should deal with some cases such as cartpole where actions are scalars while leaving alone cases where actions are arrays (some robotics tasks). * try adding pytorch tests * typo * fixup docker messages * Fix A3C for some envs Pendulum doesn't work since it's an edge case (expects singleton arrays, which `.squeeze()` collapses to scalars). * fmt * nit flake * small lint	2018-05-30 10:48:11 -07:00
Hao Chen	ac1e5a7d15	[JavaWorker] Do not kill local-scheduler-forked workers in RunManager.cleanup (#2151 ) Local-scheduler-forked workers will be killed by local scheduler itself, don't need to be killed here. See: `570c3153cd/src/local_scheduler/local_scheduler.cc (L184-L192)` Also, using `ps | grep | kill` might be dangerous, because it could also kill irrelevant processes, e.g., `vim DefaultWorker.java`.	2018-05-30 00:25:03 -07:00
Robert Nishihara	aa34509bc7	Update Travis CI badge from travis-ci.org to travis-ci.com. (#2155 )	2018-05-29 16:44:02 -07:00
Robert Nishihara	6172f94c04	Implement Python global state API for xray. (#2125 ) * Implement global state API for xray. * Fix object table. * Fixes for log structure. * Implement cluster_resources. * Add driver task to task table. * Remove python flatbuffers code * Get some global state API tests running. * Python linting. * Fix linting. * Fix mock modules for doc * Copy over flatbuffer bindings. * Fix for tests. * Linting * Fix monitor crash.	2018-05-29 16:25:54 -07:00
Stephanie Wang	166000b089	[xray] Improve flush algorithm for the lineage cache (#2130 ) * Private method to flush a single task from the lineage cache * Track parent->child relationships for faster flushing * doc * Only flush the newly ready task * Flush() returns void * x	2018-05-28 21:03:15 -07:00
Eric Liang	bc2a83e698	Fix support for actor classmethods (#2146 )	2018-05-28 17:43:23 -07:00
Peter Veerman	eb1d7ac4bc	Add empty df test (#1879 )	2018-05-27 09:25:50 -07:00

1910 commits