Commit graph

302 commits

Author SHA1 Message Date
Peter Schafhalter
a41bbc10ef Add password authentication to Redis ports (#2952)
* Implement Redis authentication

* Throw exception for legacy Ray

* Add test

* Formatting

* Fix bugs in CLI

* Fix bugs in Raylet

* Move default password to constants.h

* Use pytest.fixture

* Fix bug

* Authenticate using formatted strings

* Add missing passwords

* Add test

* Improve authentication of async contexts

* Disable Redis authentication for credis

* Update test for credis

* Fix rebase artifacts

* Fix formatting

* Add workaround for issue #3045

* Increase timeout for test

* Improve C++ readability

* Fixes for CLI

* Add security docs

* Address comments

* Address comments

* Address comments

* Use ray.get

* Fix lint
2018-10-16 22:48:30 -07:00
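For #2952 above, a minimal usage sketch; assuming the PR exposes the password as a redis_password keyword on ray.init() (the exact parameter name is an assumption here):

    import ray

    # Assumption: ray.init() accepts a redis_password that is used to
    # authenticate all connections to the cluster's Redis ports.
    ray.init(redis_password="strong-password")
    assert ray.is_initialized()
    ray.shutdown()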
Eric Liang
a9e454f6fd
[rllib] Include config dicts in the sphinx docs (#3064) 2018-10-16 15:55:11 -07:00
Eric Liang
3c891c6ece
[rllib] Parallel-data loading and multi-gpu support for IMPALA (#2766) 2018-10-15 11:02:50 -07:00
Richard Liaw
f9b58d7b02
[tune] Tweaks to Trainable and Verbosity (#2889) 2018-10-11 23:42:13 -07:00
Robert Nishihara
d73ee36e60 Update links to use latest 0.5.3 wheels instead of 0.5.2. (#3018) 2018-10-03 13:43:40 -07:00
Si-Yuan
cc7e2ecdd5 Change logfile names and also allow plasma store socket to be passed in. (#2862) 2018-10-03 10:03:53 -07:00
Eric Liang
b45bed4bce
[rllib] Propagate model options correctly in ARS / ES, to action dist of PPO (#2974)
* fix

* fix

* fix it

* propagate conf to action dist

* move carla example too

* rr

* Update policies.py

* wip

* lint
2018-10-01 12:49:39 -07:00
Eric Liang
814c35b7d7
[rllib] Simplify sample batch size and num envs config, n_step adjustment (#2995)
* simplify vec batch requirements

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-training.rst

* Update rllib-models.rst
2018-09-30 18:36:22 -07:00
Eric Liang
b06c604a51
[rllib] Add some more tuned atari results to documentation (#2991)
* dqn results ++

* add scale

* hour

* fix

* small dqn table

* update

* steps

* upd

* apex

* up

* add apex results

* tip
2018-09-29 23:13:36 -07:00
Richard Liaw
1c9617bc1c
[autoscaler] Add tmux support for attach and exec (#2907)
Adds a tmux flag that can be used to support background execution of experiments. It cannot be used together with screen. This seems to be a useful feature that has come up with multiple users.
2018-09-26 23:22:45 -07:00
Eric Liang
3cde5957b3
[rllib] Better document APIs to access policy state (#2932)
* fix

* doc

* example

* up
2018-09-24 19:08:32 -07:00
Robert Nishihara
ea9d1cc887 Remove dependence on psutil. Add utility functions for getting system memory. (#2892) 2018-09-18 15:03:29 +08:00
Joerg Schad
a1b8e79c30 Fixed Typo. (#2865) 2018-09-13 13:32:56 +08:00
Robert Nishihara
3f6ed537a4 Add ray.is_initialized() function. (#2818)
* Add ray.is_initialized() function.

* Add assert.
2018-09-06 21:20:59 -07:00
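A short usage sketch of the new function:

    import ray

    assert not ray.is_initialized()
    ray.init()
    assert ray.is_initialized()
    ray.shutdown()
    assert not ray.is_initialized()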
Eric Liang
995ac24a2c
[rllib] clarify train batch size for PPO (#2793)
It's possible to configure PPO in a way that ends up discarding most of the samples (they are treated as "stragglers"). This adds a warning when that happens, and raises an exception if the waste is particularly egregious.
2018-09-05 12:06:13 -07:00
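To make the failure mode in #2793 concrete, a hedged back-of-the-envelope sketch of the waste check described above; the config names sample_batch_size and train_batch_size are assumptions, and this is not the PR's actual code:

    # Roughly num_workers * sample_batch_size timesteps are collected per
    # iteration; samples beyond train_batch_size can be discarded as
    # "stragglers".
    num_workers = 8
    sample_batch_size = 200
    train_batch_size = 500

    collected = num_workers * sample_batch_size    # 1600 timesteps
    wasted = max(0, collected - train_batch_size)  # 1100 timesteps
    waste_ratio = float(wasted) / collected        # ~0.69, worth a warning
    if waste_ratio > 0.5:
        print("warning: %d%% of collected samples discarded as stragglers"
              % int(100 * waste_ratio))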
Eric Liang
df4788e501
[rllib/tune] Add test for fractional gpu support in xray mode; add rllib support for fractional gpu (#2768)
* frac gpu

* doc

* Update rllib-training.rst

* yapf

* remove xray
2018-09-03 11:12:23 -07:00
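A minimal sketch of fractional GPU allocation with core Ray, as tested in #2768 (the rllib config plumbing from the PR is not shown):

    import ray

    ray.init(num_gpus=1)

    # Four of these actors can be scheduled onto a single physical GPU.
    @ray.remote(num_gpus=0.25)
    class InferenceWorker(object):
        def gpu_ids(self):
            return ray.get_gpu_ids()

    workers = [InferenceWorker.remote() for _ in range(4)]
    print(ray.get([w.gpu_ids.remote() for w in workers]))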
Philipp Moritz
4db196438b fix 'from ray.rllib import ppo' in doc (#2794) 2018-08-31 23:34:47 -07:00
Robert Nishihara
5021795190 Update documents to replace 0.5.0 with 0.5.2. (#2761)
* Update documents to replace 0.5.0 with 0.5.1.

* Update documentation from 0.5.1 -> 0.5.2.
2018-08-29 21:05:09 -07:00
Praveen Palanisamy
357c0d6156 [tune] Adds option to checkpoint at end of trials (#2754)
* Added checkpoint_at_end option to fix #2740

* Added ability to checkpoint at the end of trials if the option is set to True

* checkpoint_at_end option added; consistent with Experiment and Trial runner

* checkpoint_at_end option mentioned in the tune usage guide

* Moved the redundant checkpoint criteria check out of the if-elif

* Added note that checkpoint_at_end is enabled only when checkpoint_freq is not 0

* Added test case for checkpoint_at_end

* Made checkpoint_at_end have an effect regardless of checkpoint_freq

* Removed comment from the test case

* Fixed the indentation

* Fixed pep8 E231

* Handled cases when trainable does not have _save implemented

* Constrained test case to a particular exp using the MockAgent

* Revert "Constrained test case to a particular exp using the MockAgent"

This reverts commit e965a9358ec7859b99a3aabb681286d6ba3c3906.

* Revert "Handled cases when trainable does not have _save implemented"

This reverts commit 0f5382f996ff0cbf3d054742db866c33494d173a.

* Simpler test case for checkpoint_at_end

* Preserved bools from losing their actual value

* Revert "Moved the redundant checkpoint criteria check out of the if-elif"

This reverts commit 783005122902240b0ee177e9e206e397356af9c5.

* Fix linting error.
2018-08-29 13:14:17 -07:00
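A hedged sketch of how the option from #2754 might be used in an experiment spec; its placement alongside checkpoint_freq is inferred from the commit messages above:

    from ray import tune

    tune.run_experiments({
        "checkpoint_demo": {
            "run": "PPO",
            "env": "CartPole-v0",
            "stop": {"training_iteration": 10},
            "checkpoint_freq": 0,       # no periodic checkpoints ...
            "checkpoint_at_end": True,  # ... but save one when the trial ends
        },
    })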
Eric Liang
69d1354016
[rllib] Document ARS & rainbow (#2744)
* wip

* rainbow doc too

* e not used

* fix ppo doc

* clean list

* use same title
2018-08-28 18:13:36 -07:00
Robert Nishihara
5fd44afb8a Add note about huge pages using up memory. (#2733)
* Add note about huge pages using up memory.

* Update doc

* Update
2018-08-24 17:02:54 -07:00
Michael Tu
d16b6f6a32 [tune] Rename 'repeat' to 'num_samples' (#2698)
Deprecates the `repeat` argument and introduces `num_samples`. Also updates docs accordingly.
2018-08-24 15:05:24 -07:00
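A short sketch of the renamed argument from #2698 in an experiment spec:

    from ray import tune

    tune.run_experiments({
        "renamed_arg_demo": {
            "run": "PPO",
            "env": "CartPole-v0",
            # Formerly "repeat": 4 -- the spec below now evaluates each
            # sampled configuration four times.
            "num_samples": 4,
        },
    })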
Eric Liang
aa014af85b
[rllib] Fix atari reward calculations, add LR annealing, explained var stat for A2C / impala (#2700)
Changes needed to reproduce Atari plots in IMPALA / A2C: https://github.com/ray-project/rl-experiments
2018-08-23 17:49:10 -07:00
Eric Liang
fbe6c59f72
[rllib] Misc fixes, A2C (#2679)
A bunch of minor rllib fixes:

* pull in latest baselines atari wrapper changes (and use deepmind wrapper by default)
* move reward clipping to policy evaluator
* add a2c variant of a3c
* reduce vision network fc layer size to 256 units
* switch to 84x84 images
* doc tweaks
* print timesteps in tune status
2018-08-20 15:28:03 -07:00
Eric Liang
9473da69bd
[autoscaler] Experimental support for local / on-prem clusters (#2678)
This adds some experimental (undocumented) support for launching Ray on existing nodes. You have to provide the head IP and the list of worker IPs.

There are also a couple of additional utilities added for rsyncing files and port forwarding.
2018-08-19 12:43:04 -07:00
Richard Liaw
62d0698097
[tune] Tune Facelift (#2472)
This PR introduces the following changes:

 * Ray Tune -> Tune 
 * [breaking] Creation of `schedulers/`, moving PBT, HyperBand into a submodule
 * [breaking] Search Algorithms now must take in experiment configurations via `add_configurations` rather through initialization
 * Support `"run": (function | class | str)` with automatic registering of trainable
 * Documentation Changes
2018-08-19 11:00:55 -07:00
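A sketch of the new `"run": function` form from #2472 with automatic registration; the reporter-style trainable signature is assumed from Tune's function API of this era:

    from ray import tune

    def my_train(config, reporter):
        for step in range(100):
            reporter(timesteps_total=step, mean_accuracy=step / 100.0)

    tune.run_experiments({
        "function_trainable": {
            "run": my_train,  # a function is registered automatically
            "stop": {"mean_accuracy": 0.9},
        },
    })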
Eric Liang
5f430da180
[rllib] Provide internal access to episode state in compute_actions() and allow returning extra batches (#2559)
The goal of this PR is to allow custom policies to perform model-based rollouts. In the multi-agent setting, this requires access not only to the policies of other agents, but also to their current observations.
Also, you might want to return the model-based trajectories as part of the rollout for efficiency.

* compute_actions() now takes a new keyword arg episodes
* pull out internal episode class into a top-level file
* add function to return extra trajectories from an episode that will be appended to the sample batch
* documentation
2018-08-16 14:37:21 -07:00
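A hedged sketch of a policy making use of the new keyword arg from #2559; the surrounding PolicyGraph interface is simplified, and the method signature and return shapes here are assumptions:

    import random

    class ModelBasedPolicy(object):
        def compute_actions(self, obs_batch, state_batches=None, episodes=None):
            # `episodes` exposes per-episode state, e.g. other agents' most
            # recent observations, enabling model-based rollouts in the
            # multi-agent setting.
            actions = [random.choice([0, 1]) for _ in obs_batch]
            return actions, [], {}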
Eric Liang
079c4e482a
ray exec and ray attach commands (#2560)
ray exec CLUSTER CMD [--screen] [--start] [--stop]
ray attach CLUSTER [--start]

Example:
ray exec sgd.yaml 'source activate tensorflow_p27 && cd ~/ray/python/ray/rllib && ./train.py --run=PPO --env=CartPole-v0' --screen --start --stop

This will, in one command, create a cluster and run the given command on it in a screen session. The screen session can later be attached to via ray attach. After the command finishes, the cluster workers will be terminated and the head node stopped.
2018-08-15 14:31:50 -07:00
Eric Liang
53f9755594
[rllib] Fix support for mixed discrete and continuous action spaces, add to regression test (#2655)
* fix

* lint

* fix
2018-08-15 10:19:41 -07:00
Jones Wong
007208d2bb Support older version TF and Support RMSProp in Impala (#2590)
Adds support for TF versions < 1.5 and for the RMSProp optimizer in IMPALA.

Before TF 1.5, tf.reduce_sum() and tf.reduce_max() had an argument keep_dims, which was renamed to keepdims in later versions.

In the original IMPALA paper, the authors use the RMSProp algorithm to optimize the model. We should also support it so that users can reproduce their experiments. Without any tuning, using the same hyperparameters as AdamOptimizer, it reaches "episode_reward_mean": 19.083333333333332 in Pong after consuming 3,610,350 samples.
2018-08-09 19:51:32 -07:00
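A hedged sketch of the kind of version shim #2590 requires (not the PR's actual code):

    from distutils.version import LooseVersion
    import tensorflow as tf

    # Before TF 1.5 the argument was keep_dims; afterwards it is keepdims.
    if LooseVersion(tf.__version__) < LooseVersion("1.5.0"):
        reduce_kwargs = {"keep_dims": True}
    else:
        reduce_kwargs = {"keepdims": True}

    x = tf.placeholder(tf.float32, [None, 4])
    row_sums = tf.reduce_sum(x, axis=1, **reduce_kwargs)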
Melih Elibol
8ae82180b4 [xray] Adds a driver table. (#2289)
This PR adds a driver table for the new GCS, which enables cleanup functionality associated with monitoring driver death.

Some testing in `monitor_test.py` is restored, but redis sharding for xray is needed to enable remaining tests.
2018-08-08 23:41:40 -07:00
Eric Liang
64053278aa
[tune] Support lambda functions in hyperparameters / tune rllib multiagent support (#2568)
* update

* func

* Update registry.py

* revert
2018-08-07 16:29:21 -07:00
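A sketch of the lambda form from #2568 in a Tune config; the spec-taking lambda matches Tune's variant-generation syntax of this era:

    import random
    from ray import tune

    tune.run_experiments({
        "lambda_params": {
            "run": "PPO",
            "env": "CartPole-v0",
            "num_samples": 4,
            "config": {
                # Resolved once per sampled trial.
                "lr": lambda spec: random.uniform(1e-4, 1e-2),
            },
        },
    })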
Richard Liaw
bb44456f6f
[rllib, tune] TrainingResult -> Dict, Removes C408 from flake8 (#2565) 2018-08-07 12:17:44 -07:00
Eric Liang
981d9818c1
[rllib] Support the timesteps_per_batch in simple optimizer PPO mode (#2558)
* support ts

* doc

* Update sync_samples_optimizer.py
2018-08-06 12:10:59 -07:00
Mitar
9015e742c4 Update installation instructions with psmisc to enable 'ray stop' (#2550) 2018-08-05 23:58:58 -07:00
Richard Liaw
914a433e3f
[tune] Split Search from Scheduling (#2452)
Introduces SearchAlgorithm concept, separate from schedulers in Tune. Moves HyperOpt under this concept.
2018-08-04 21:27:39 -07:00
Eric Liang
9449d07eca
[rllib] Fix crash when setting horizon in multiagent
If a horizon is set, an env can terminate without done=True, which previously caused a crash in the multiagent case.
2018-08-03 16:37:56 -07:00
Eric Liang
f7ec292360
[rllib] Support agent.get_action in multiagent (#2543)
* support get action on policy id

* comment

* grammar fixes

* Update rllib-algorithms.rst
2018-08-02 13:35:53 -07:00
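A hedged one-liner of the new call shape in #2543; agent construction is omitted, and the method name and policy_id keyword are inferred from the PR title:

    def act_for(agent, observation, policy_id):
        # Assumption: multi-agent agents route single-observation action
        # queries to a named policy via a policy_id keyword.
        return agent.compute_action(observation, policy_id=policy_id)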
Eric Liang
9ea57c2a93
[rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504)
* Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer
* Add AsyncSamplesOptimizer that implements the IMPALA architecture
* integrate V-trace with a3c policy graph
* audit V-trace integration
* benchmark compare vs A3C and with V-trace on/off
PongNoFrameskip-v4 on IMPALA scaling from 16 to 128 workers, solving Pong in <10 min. For reference, solving this env takes ~40 minutes for Ape-X and several hours for A3C.
2018-08-01 20:53:53 -07:00
Eric Liang
9a479b3a63
[rllib] Document creating an ensemble of envs; also add vector_index attribute to env config (#2513)
This also removes the async resetting code in VectorEnv. While that improves benchmark performance slightly, it substantially complicates env configuration and probably isn't worth it for most envs.

This makes it easy to efficiently support setups like Joint PPO: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/retro-contest/gotta_learn_fast_report.pdf

For example, for 188 envs, you could do something like num_envs: 10, num_envs_per_worker: 19.
2018-08-01 16:29:27 -07:00
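A hedged sketch of an env ensemble keyed off the new attribute from #2513; the dict-plus-attributes shape of env_config is an assumption:

    import gym

    class EnsembleEnv(gym.Env):
        def __init__(self, env_config):
            games = env_config["games"]
            # vector_index identifies which copy of the env this is within
            # a vectorized set, so each copy can pick a different task.
            self.game = games[env_config.vector_index % len(games)]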
Eric Liang
d9a36c4e39
[rllib] Document auto-concat in a3c (#2533)
* docs

* update hyperparm docs
2018-08-01 15:11:30 -07:00
Sergey Kolesnikov
05490b8cb9 [rllib] dqn/ddpg policy customization (#2445)
* dqn policy update - more customization

* docs for custom DQN graph

* Update rllib-training.rst

* Update rllib-models.rst

* Update rllib.rst

* Update rllib-training.rst

* Update rllib-concepts.rst

* yapf codestyle
2018-07-22 14:47:14 -07:00
Eric Liang
68660453e4
[rllib] Better support and add two-trainer example for multiagent (#2443)
This adds a simple DQN+PPO example for multi-agent. We don't do anything fancy here, just syncing weights between two separate trainers. This potentially wastes some compute, but is very simple to set up.

It might be nice to share experience collection between the top-level trainers in the future.
2018-07-22 05:09:25 -07:00
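A hedged sketch of the alternating train-and-sync loop from #2443; trainer construction is omitted, and the policy names dqn_policy / ppo_policy are placeholders:

    def train_both(dqn_trainer, ppo_trainer, num_rounds):
        for _ in range(num_rounds):
            print(dqn_trainer.train())
            print(ppo_trainer.train())
            # Share progress by copying each trainer's latest weights for
            # the policies it owns into the other trainer.
            ppo_trainer.set_weights(dqn_trainer.get_weights(["dqn_policy"]))
            dqn_trainer.set_weights(ppo_trainer.get_weights(["ppo_policy"]))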
Robert Nishihara
4b6157ed09 Remove link to install Linux Python 3.3 wheel. (#2434) 2018-07-20 15:15:43 -07:00
Richard Liaw
8e8c733696
[tune] Fix Categorical Space + Add Keras Example (#2401)
Previously, categorical variables were not properly resolved for HyperOpt.
2018-07-17 23:52:52 +02:00
Crystal
ebf4070d88 Documentation- Basic Profiling for Ray Users (#2326)
* Ray documentation - created new section 'Profiling for Ray Users', as opposed to the current Profiling section for Ray developers. Completed three sections: 'A Basic Profiling Example', 'Timing Performance Using Python's Timestamps', and 'Profiling Using An External Profiler (Line_Profiler)'. Two sections remain to do, on cProfile and Ray Timeline Visualization.

* Ray documentation - Fixed rst codeblock linebreaks in 'User Profiling'

* Ray documentation - For User Profiling, added section on cProfile

* Ray documentation - For User Profiling, completed Ray Timeline Visualization section, including graphical images

* Ray documentation - made User Profiling timeline image larger, minor wording edits

* Ray documentation - minor wording edits to User Profiling

* Ray documentation - User Profiling- fixed broken link

* Minor wording changes requested by Philipp Moritz addressed. Still need to address (1) compressing the image files, (2) correcting ex 3 to not be remote, and (3) using cProfile on an actor

* Ray documentation - For user-profiling.rst, revised example 3 to show a semi-parallelized example. Compressed timeline example image to be under 50 KB, removed view timeline GUI image. Updated timeline example image to reflect revised example 3. cProfile actor example left

* Ray documentation - in user-profiling.rst, added a new example including actors in the cProfile section

* Ray documentation - For user-profiling.rst, added section header for the Ray actor cProfile example

* Update user-profiling.rst

* Update user-profiling.rst

* 4 space indentation

* Update user-profiling.rst

* Update user-profiling.rst

* Update user-profiling.rst

* corrections
2018-07-12 16:57:39 -07:00
Robert Nishihara
515da7721a Change ray.worker.cleanup -> ray.shutdown and improve API documentation. (#2374)
* Change ray.worker.cleanup -> ray.shutdown and improve API documentation.

* Deprecate ray.worker.cleanup() gracefully.

* Fix linting
2018-07-12 12:00:00 -07:00
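The replacement call from #2374, for reference:

    import ray

    ray.init()
    # ... do work ...
    ray.shutdown()  # replaces the now-deprecated ray.worker.cleanup()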
Eric Liang
b316afeb43 [rllib] Add debug info back to PPO and fix optimizer compatibility (#2366) 2018-07-12 19:22:46 +02:00
Eric Liang
4ef9d15315
[rllib] Add concepts section of docs (#2373)
This fills in the rllib concepts documentation.
2018-07-08 18:46:52 -07:00
Robert Nishihara
35f4a3070c Update 0.4.0 to 0.5.0 in autoscaler and installation examples. (#2352) 2018-07-07 14:34:20 -07:00