hiro/ray - Forgejo: Beyond coding. We Forge.

hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Robert Nishihara	6e1de19cc2	Bump version to 0.5.1. (#2755 )	2018-08-28 16:52:17 -07:00
Robert Nishihara	b7722897b4	Deprecate 'driver_mode' argument. (#2758 ) * Deprecate 'driver_mode' argument. * Fix * Fix	2018-08-28 16:45:49 -07:00
Alexey Tumanov	de047daea7	[xray] raylet scheduling mechanism with a simple spillback policy (#2749 ) ## What do these changes do? * distribute load and resource information on a heartbeat * for each raylet, maintain total and available resource capacity as well as measure of current load * this PR introduces a new notion of load, defined as a sum of all resource demand induced by queued ready tasks on the local raylet. This provides a heterogeneity-aware measure of load that supersedes legacy Ray's task count as a proxy for load. * modify the scheduling policy to perform capacity-based, load-aware, optimistically concurrent resource allocation * perform task spillover to the heartbeating node in response to a heartbeat, implementing heterogeneity-aware late-binding/work-stealing.	2018-08-28 00:03:34 -07:00
adoda	90ae8f11df	The function get_node_ip_address while catch an exception and return … (#2722 ) …'127.0.0.1', when we forbid the external network. Instead of we can get ip address from hostname. The function get_node_ip_address while catch an exception and return '127.0.0.1' when we forbid the external network. Instead of we can get ip address from hostname. https://github.com/ray-project/ray/issues/2721	2018-08-27 22:24:49 -07:00
Yuhong Guo	0b6e08ebee	Separate python logger module-wise (#2703 ) ## What do these changes do? 1. Separate the log related code to logger.py from services.py. 2. Allow users to modify logging formatter in `ray start`. ## Related issue number https://github.com/ray-project/ray/pull/2664	2018-08-26 13:46:14 -07:00
Richard Liaw	dbba7f2a53	[autoscaler] Cleanup Logging (#2709 ) Moves Autoscaler onto Python `logging` module.	2018-08-25 17:08:45 -07:00
Jones Wong	982cde664f	[rllib] Add noisy network and distributional Q-learning to implement Rainbow (#2737 ) * add noisy network * distributional q-learning in dev * add distributional q-learning * validated rainbow module * add some comments * supply some comments * remove redundant argument to pass CI test * async replay optimizer does NOT need annealing beta * ignore rainbow specific arguments for DDPG and Apex * formatted by yapf * Update dqn_policy_graph.py * Update dqn_policy_graph.py	2018-08-25 14:17:14 -07:00
eugenevinitsky	6201a6d1c7	[rllib] add augmented random search (#2714 ) * added ars * functioning ars with regression test * added regression tests for ARs * fixed default config for ARS * ARS code runs, now time to test * ARS working and tested, changed std deviation of meanstd filter to initialize to 1 * ARS working and tested, changed std deviation of meanstd filter to initialize to 1 * pep8 fixes * removed unused linear model * address comments * more fixing comments * post yapf * fixed support failure * Update LICENSE * Update policies.py * Update test_supported_spaces.py * Update policies.py * Update LICENSE * Update test_supported_spaces.py * Update policies.py * Update policies.py * Update filter.py	2018-08-24 22:20:02 -07:00
Michael Tu	d16b6f6a32	[tune] Rename 'repeat' to 'num_samples' (#2698 ) Deprecates the `repeat` argument and introduces `num_samples`. Also updates docs accordingly.	2018-08-24 15:05:24 -07:00
Philipp Moritz	b4c47a5861	Upgrade arrow to include more detailed flushing message (#2706 )	2018-08-24 11:44:04 -07:00
Eric Liang	aa014af85b	[rllib] Fix atari reward calculations, add LR annealing, explained var stat for A2C / impala (#2700 ) Changes needed to reproduce Atari plots in IMPALA / A2C: https://github.com/ray-project/rl-experiments	2018-08-23 17:49:10 -07:00
old-bear	4be324efc3	[tune] Support infinity value in report result (#2693 ) * + Compatibility fix under py2 on ray.tune * + Revert changes on master branch * + Use default JsonEncoder in ray.tune.logger * + Add UT for infinity support	2018-08-22 13:09:14 -07:00
joyyoj	38867eea4e	[tune] Cross-Framework Compatibility (#2646 ) This commit is a first pass at restructuring the Trial execution logic to support running on multiple frameworks.	2018-08-22 10:55:45 -07:00
Eric Liang	fbe6c59f72	[rllib] Misc fixes, A2C (#2679 ) A bunch of minor rllib fixes: pull in latest baselines atari wrapper changes (and use deepmind wrapper by default) move reward clipping to policy evaluator add a2c variant of a3c reduce vision network fc layer size to 256 units switch to 84x84 images doc tweaks print timesteps in tune status	2018-08-20 15:28:03 -07:00
Yucong He	880ef1bd21	doc fix (#2696 )	2018-08-20 14:11:32 -07:00
Robert Nishihara	89d4a6df93	Start Redis in protected mode when started via ray.init(). (#2697 ) This PR makes it so that when Ray is started via ray.init() (as opposed to via ray start) the Redis servers will be started in "protected mode" (which means that clients can only connect by connecting to localhost). In practice, we actually connect redis clients by passing in the node IP address (not localhost), so I need to create a redis config file on the fly to allow both localhost and the node's actual IP address (it would have been nice to find a way to do this from the Python redis client, but I couldn't find one).	2018-08-20 14:08:01 -07:00
old-bear	230ac7aa80	[tune] Compatibility fix under py2 on str condition (#2673 ) * * Compatibility fix under py2 on ray.tune * + Fix compatibility * + Use package six to achieve str compatibility	2018-08-19 20:43:03 -07:00
Eric Liang	9473da69bd	[autoscaler] Experimental support for local / on-prem clusters (#2678 ) This adds some experimental (undocumented) support for launching Ray on existing nodes. You have to provide the head ip, and the list of worker ips. There are also a couple additional utils added for rsyncing files and port-forward.	2018-08-19 12:43:04 -07:00
Richard Liaw	62d0698097	[tune] Tune Facelift (#2472 ) This PR introduces the following changes: * Ray Tune -> Tune * [breaking] Creation of `schedulers/`, moving PBT, HyperBand into a submodule * [breaking] Search Algorithms now must take in experiment configurations via `add_configurations` rather through initialization * Support `"run": (function \| class \| str)` with automatic registering of trainable * Documentation Changes	2018-08-19 11:00:55 -07:00
Eric Liang	e56eb354eb	[tune] Remove hack to serve pin requests off thread (#2680 ) * nopin * fix	2018-08-18 13:19:52 -07:00
Wang Qing	06a58016d8	[multi-language part 2] Change the command line arguments to start raylet (#2670 )	2018-08-16 21:59:44 -07:00
Eric Liang	6670880f03	[rllib] Workaround actor creation hang edge case for ape-X (#2661 ) * apex hang * fix * move pyt to end	2018-08-16 18:03:50 -07:00
Eric Liang	5f430da180	[rllib] Provide internal access to episode state in compute_actions() and allow returning extra batches (#2559 ) The goal of this PR is to allow custom policies to perform model-based rollouts. In the multi-agent setting, this requires access to not only policies of other agents, but also their current observations. Also, you might want to return the model-based trajectories as part of the rollout for efficiency. compute_actions() now takes a new keyword arg episodes pull out internal episode class into a top-level file add function to return extra trajectories from an episode that will be appended to the sample batch documentation	2018-08-16 14:37:21 -07:00
Eric Liang	127cf291a3	Delete __init__.py (#2668 )	2018-08-16 02:01:21 -07:00
Eric Liang	079c4e482a	ray exec and ray attach commands (#2560 ) ray exec CLUSTER CMD [--screen] [--start] [--stop] ray attach CLUSTER [--start] Example: ray exec sgd.yaml 'source activate tensorflow_p27 && cd ~/ray/python/ray/rllib && ./train.py --run=PPO --env=CartPole-v0' --screen --start --stop This will in one command create a cluster and run the command on it in a screen session. The screen can later be attached to via ray attach. After the command finishes, the cluster workers will be terminated and the head node stopped.	2018-08-15 14:31:50 -07:00
Eric Liang	53f9755594	[rllib] Fix support for mixed discrete and continuous action spaces, add to regression test (#2655 ) * fix * lint * fix	2018-08-15 10:19:41 -07:00
Yuhong Guo	eeb15771ba	Add `ray.internal.free` (#2542 )	2018-08-14 22:01:23 -07:00
Mitar	493585574a	Updating documentation. (#2643 )	2018-08-13 19:18:12 -07:00
efang96	baba624373	updated agent.compute_action to return rnn state (#2581 ) * updated agent.compute_action to return rnn state * updated compute_action method, added case for state=None * fixing lint	2018-08-13 18:04:42 -07:00
Mitar	8769b8ac32	Fixing docstring. (#2638 )	2018-08-13 16:19:32 -07:00
Eric Liang	9559873d13	[rllib] tuple space shouldn't assume elements are all the same size (#2637 ) * fix * lint	2018-08-11 10:57:40 -07:00
Peter Schafhalter	230b9ab33b	[asv] Add benchmark for ray.wait (#2625 ) * Add benchmarks for ray.wait * Fix bug	2018-08-10 17:52:36 -07:00
Jones Wong	007208d2bb	Support older version TF and Support RMSProp in Impala (#2590 ) to support TF version < 1.5 to support rmsprop optimizer in Impala Before TF1.5, tf.reduce_sum() and tf.reduce_max() has an argument keep_dims which has been renamed as keepdims in later versions. In the original paper of Impala, they use rmsprop algorithm to optimize the model. We'd better also support it so that users can reproduce their experiments. Without any tuning, say that using the same hyper-parameters as AdamOptimizer, it reaches "episode_reward_mean": 19.083333333333332 in Pong after consume 3,610,350 samples.	2018-08-09 19:51:32 -07:00
Melih Elibol	8ae82180b4	[xray] Adds a driver table. (#2289 ) This PR adds a driver table for the new GCS, which enables cleanup functionality associated with monitoring driver death. Some testing in `monitor_test.py` is restored, but redis sharding for xray is needed to enable remaining tests.	2018-08-08 23:41:40 -07:00
Eric Liang	64053278aa	[tune] Support lambda functions in hyperparameters / tune rllib multiagent support (#2568 ) * update * func * Update registry.py * revert	2018-08-07 16:29:21 -07:00
Richard Liaw	bb44456f6f	[rllib, tune] TrainingResult -> Dict, Removes C408 from flake8 (#2565 )	2018-08-07 12:17:44 -07:00
Philipp Moritz	a3202f581c	[xray] Add flag to start raylet in valgrind (#2582 )	2018-08-07 11:25:21 -07:00
Yuhong Guo	9825da7233	Change training tasks to xray for Jenkins tests (#2567 )	2018-08-06 13:35:26 -07:00
Eric Liang	981d9818c1	[rllib] Support the timesteps_per_batch in simple optimizer PPO mode (#2558 ) * support ts * doc * Update sync_samples_optimizer.py	2018-08-06 12:10:59 -07:00
Richard Liaw	914a433e3f	[tune] Split Search from Scheduling (#2452 ) Introduces SearchAlgorithm concept, separate from schedulers in Tune. Moves HyperOpt under this concept.	2018-08-04 21:27:39 -07:00
Eric Liang	9449d07eca	[rllib] Fix crash when setting horizon in multiagent If a horizon is set, an env terminates without done=True.	2018-08-03 16:37:56 -07:00
Philipp Moritz	d5dda1ebf2	copy all files when installing pyarrow (#2547 )	2018-08-02 17:06:37 -07:00
Peter Schafhalter	7a5f25248e	[rllib] Improve conv_filters documentation (#2540 ) * Improve conv_filters documentation * Update catalog.py * Update catalog.py	2018-08-02 14:29:40 -07:00
Eric Liang	f7ec292360	[rllib] Support agent.get_action in multiagent (#2543 ) * support get action on policy id * comment * grammar fixes * Update rllib-algorithms.rst	2018-08-02 13:35:53 -07:00
Yuhong Guo	d2ebe4d9a3	Fix frequent failure of Jenkins CI. (#2490 )	2018-08-02 10:28:28 -07:00
Eric Liang	9ea57c2a93	[rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504 ) Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer Add AsyncSamplesOptimizer that implements the IMPALA architecture integrate V-trace with a3c policy graph audit V-trace integration benchmark compare vs A3C and with V-trace on/off PongNoFrameskip-v4 on IMPALA scaling from 16 to 128 workers, solving Pong in <10 min. For reference, solving this env takes ~40 minutes for Ape-X and several hours for A3C.	2018-08-01 20:53:53 -07:00
Eric Liang	9a479b3a63	[rllib] Document creating an ensemble of envs; also add vector_index attribute to env config (#2513 ) This also removes the async resetting code in VectorEnv. While that improves benchmark performance slightly, it substantially complicates env configuration and probably isn't worth it for most envs. This makes it easy to efficiently support setups like Joint PPO: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/retro-contest/gotta_learn_fast_report.pdf For example, for 188 envs, you could do something like num_envs: 10, num_envs_per_worker: 19.	2018-08-01 16:29:27 -07:00
Eric Liang	a630e332f3	[rllib] Don't use get_gpu_ids() in ppo This lets the num_gpus config work properly even when not using tune, since the gpu ids won't be set by ray in that case.	2018-08-01 16:25:11 -07:00
Eric Liang	d9a36c4e39	[rllib] Document auto-concat in a3c (#2533 ) * docs * update hyperparm docs	2018-08-01 15:11:30 -07:00
Melih Elibol	89f60e39f3	Override user-specified name tag. (#2480 ) Override user-specified name tag.	2018-08-01 14:16:57 -04:00

1 2 3 4 5 ...

742 commits