hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-11 13:46:40 -04:00

Author	SHA1	Message	Date
Melih Elibol	8ae82180b4	[xray] Adds a driver table. (#2289 ) This PR adds a driver table for the new GCS, which enables cleanup functionality associated with monitoring driver death. Some testing in `monitor_test.py` is restored, but redis sharding for xray is needed to enable remaining tests.	2018-08-08 23:41:40 -07:00
Alexey Tumanov	df7ee7ff1e	raylet memory corruption fixes (#2591 ) * raylet memory corruption fixes * add util function to translate boost error to ray status * tcp client connection now using ray status utility function * lint	2018-08-08 19:50:43 -07:00
Stephanie Wang	6ab01a2cad	[xray] Fix bug when counting a task's lineage size (#2600 )	2018-08-08 00:00:17 -07:00
Ujval Misra	a0691ee49b	[xray] Prevent sending excessive uncommitted lineage on task forwarding (#2534 ) * Add set to lineage cache entry to track nodes already forwarded to. * Uncommitted lineage function naming, documentation. * Simple test for uncommitted lineage with a marked task. * Rebased, changed tests to use ClientID::nil. * Bug fix, change MergeLineageHelper function type. * Formatting. * Checks and test changes based on PR comments. * GetUncommittedLineage now always returns at least the requested task ID. * Bug fix (return at least requested task ID) * Formatting	2018-08-07 21:10:23 -07:00
Eric Liang	64053278aa	[tune] Support lambda functions in hyperparameters / tune rllib multiagent support (#2568 ) * update * func * Update registry.py * revert	2018-08-07 16:29:21 -07:00
Philipp Moritz	e7f76d7914	[xray] Fix typo concerning heartbeat_timeout_milliseconds in monitor (#2586 )	2018-08-07 13:45:51 -07:00
Richard Liaw	bb44456f6f	[rllib, tune] TrainingResult -> Dict, Removes C408 from flake8 (#2565 )	2018-08-07 12:17:44 -07:00
Philipp Moritz	a3202f581c	[xray] Add flag to start raylet in valgrind (#2582 )	2018-08-07 11:25:21 -07:00
Philipp Moritz	25f0094ee4	Fix copying the plasma fbs directory from arrow (#2579 )	2018-08-07 00:04:37 -07:00
Yuhong Guo	d35ce7fa63	Use real callback index in subscribe_callback_index_ (#2473 )	2018-08-06 15:29:56 -07:00
Yuhong Guo	9825da7233	Change training tasks to xray for Jenkins tests (#2567 )	2018-08-06 13:35:26 -07:00
Alexey Tumanov	85b8b2a395	mark all remaining placeable tasks pending with task dependency manager (#2528 )	2018-08-06 13:08:11 -07:00
Eric Liang	981d9818c1	[rllib] Support the timesteps_per_batch in simple optimizer PPO mode (#2558 ) * support ts * doc * Update sync_samples_optimizer.py	2018-08-06 12:10:59 -07:00
Mitar	9015e742c4	Update installation instructions with psmisc to enable 'ray stop' (#2550 )	2018-08-05 23:58:58 -07:00
Wang Qing	3845c294c3	[java] Fix java raylet wait (#2553 )	2018-08-05 23:49:54 -07:00
Melih Elibol	34d3a46f48	[xray] Revert dynamic chunk size optimization for ObjectManager. (#2557 ) * Revert dynamic chunk size optimization. * fix mac build issues.	2018-08-05 02:09:37 -07:00
Richard Liaw	914a433e3f	[tune] Split Search from Scheduling (#2452 ) Introduces SearchAlgorithm concept, separate from schedulers in Tune. Moves HyperOpt under this concept.	2018-08-04 21:27:39 -07:00
Eric Liang	9449d07eca	[rllib] Fix crash when setting horizon in multiagent If a horizon is set, an env terminates without done=True.	2018-08-03 16:37:56 -07:00
Philipp Moritz	d5dda1ebf2	copy all files when installing pyarrow (#2547 )	2018-08-02 17:06:37 -07:00
Philipp Moritz	5e59cc6a20	Update arrow to include plasma memory footprint reduction (#2545 )	2018-08-02 14:37:37 -07:00
Peter Schafhalter	7a5f25248e	[rllib] Improve conv_filters documentation (#2540 ) * Improve conv_filters documentation * Update catalog.py * Update catalog.py	2018-08-02 14:29:40 -07:00
Eric Liang	f7ec292360	[rllib] Support agent.get_action in multiagent (#2543 ) * support get action on policy id * comment * grammar fixes * Update rllib-algorithms.rst	2018-08-02 13:35:53 -07:00
Yuhong Guo	d2ebe4d9a3	Fix frequent failure of Jenkins CI. (#2490 )	2018-08-02 10:28:28 -07:00
Philipp Moritz	d8ba667175	Convert asserts in unittest to pytest (#2529 )	2018-08-01 22:32:10 -07:00
Eric Liang	9ea57c2a93	[rllib] Basic IMPALA implementation (using deepmind's reference vtrace.py) (#2504 ) Rename AsyncSamplesOptimizer -> AsyncReplayOptimizer Add AsyncSamplesOptimizer that implements the IMPALA architecture integrate V-trace with a3c policy graph audit V-trace integration benchmark compare vs A3C and with V-trace on/off PongNoFrameskip-v4 on IMPALA scaling from 16 to 128 workers, solving Pong in <10 min. For reference, solving this env takes ~40 minutes for Ape-X and several hours for A3C.	2018-08-01 20:53:53 -07:00
Wang Qing	e4f68ff8cf	[Java Worker] Support raylet on Java (#2479 )	2018-08-01 17:52:49 -07:00
Eric Liang	9a479b3a63	[rllib] Document creating an ensemble of envs; also add vector_index attribute to env config (#2513 ) This also removes the async resetting code in VectorEnv. While that improves benchmark performance slightly, it substantially complicates env configuration and probably isn't worth it for most envs. This makes it easy to efficiently support setups like Joint PPO: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/retro-contest/gotta_learn_fast_report.pdf For example, for 188 envs, you could do something like num_envs: 10, num_envs_per_worker: 19.	2018-08-01 16:29:27 -07:00
Eric Liang	a630e332f3	[rllib] Don't use get_gpu_ids() in ppo This lets the num_gpus config work properly even when not using tune, since the gpu ids won't be set by ray in that case.	2018-08-01 16:25:11 -07:00
Eric Liang	d9a36c4e39	[rllib] Document auto-concat in a3c (#2533 ) * docs * update hyperparm docs	2018-08-01 15:11:30 -07:00
Zhijun Fu	ca36827f01	[Issues 2403][xray] Fix raylet performance issues on scheduling queue (#2438 ) * merge from ray * Revert "merge from ray" This reverts commit 32b181ebbb1fa184026631e1a7368112c4c3118d. * fix raylet performance regression * address comments * Update code after merging latest changes * fix lint * address comments	2018-08-01 14:41:20 -07:00
Melih Elibol	89f60e39f3	Override user-specified name tag. (#2480 ) Override user-specified name tag.	2018-08-01 14:16:57 -04:00
Stephanie Wang	e90ecef297	[xray] Try to flush children of a task that is evicted from the lineage cache (#2531 )	2018-08-01 00:23:02 -07:00
Robert Nishihara	909d7172b1	Introduce constant for ID_SIZE in python code. (#2517 )	2018-07-31 12:40:53 -07:00
mehrdadn	64d00ff39e	Remove Visual Studio projects (#2525 )	2018-07-31 10:22:24 -07:00
Philipp Moritz	d9a019b8e5	Upgrade arrow to include pytorch fix (#2522 ) This fixes https://github.com/ray-project/ray/issues/2520	2018-07-30 20:20:18 -07:00
Stephanie Wang	a45f9cfafc	[xray] Implement task lease table, logic for deciding when to reconstruct a task (#2497 )	2018-07-30 14:42:28 -07:00
Eric Liang	38d00986a5	[rllib] Cleanups: deep merge configs properly; enforce min iter time on APEX (#2500 ) The dict merge prevents crashes when tune is trying to get resource requests for agents and you override a config subkey. The min iter time prevents iterations from getting too small, incurring high overhead. This is easy to run into on Ape-X since throughput can get very high.	2018-07-30 13:25:35 -07:00
Eric Liang	62a52ee989	[rllib] Fix corner case in rnn episode handling We should use episode ids instead of the timestep to determine when sequences should be cut, since when batches are concatenated, increasing t does not guarantee we are part of the same episode.	2018-07-30 13:24:43 -07:00
Philipp Moritz	696a229ece	Fix text verbosity in python 2.7 by running tests with pytest (#2470 )	2018-07-30 11:04:06 -07:00
Hao Chen	fe65f9fbbc	improve java api doc (#2508 )	2018-07-29 20:41:11 -07:00
Robert Nishihara	3f3514c2b3	Deprecate PYTHON_MODE more gracefully. (#2487 )	2018-07-29 16:25:46 -07:00
Steve Severance	f1b4ea69a3	Prevent hasher from running out of memory on large files (#2451 ) * Prevent hasher from running out of memory on large files * dump out keys * only print if failed * remove debugging * Fix lint error. Reverse adding newline.	2018-07-28 23:29:09 -07:00
Ion	80db69d245	State transition diagram documentation. (#2502 ) * Added description of transition diagram and a few name changes for imporved clarity. * rename some methods and update task_states.rst	2018-07-28 22:28:45 -07:00
Hao Chen	0ea7a6abf0	add java tutorial (#2491 )	2018-07-28 17:09:30 -07:00
Eric Liang	90a3ea9443	[xray] Fix heartbeat subscription for autoscaler (#2498 )	2018-07-28 13:34:55 -07:00
Peter Schafhalter	e10377567c	Add benchmark for ray.get (#2499 )	2018-07-28 09:09:21 -07:00
Philipp Moritz	ecc100cb3b	Upgrade arrow to include pytorch fix (#2496 )	2018-07-28 01:28:44 -07:00
Peter Schafhalter	ccb9a27393	Add benchmarks for ray.put (#2489 )	2018-07-27 17:49:21 -07:00
Peter Schafhalter	302510ada0	[asv] Add actor benchmarks (#2469 ) * Add actor benchmarks * Fix bug * Address comments and refactor * Update benchmark_actor.py	2018-07-27 17:40:02 -07:00
Robert Nishihara	2be1ccbd8f	Raise application-level exceptions for some failure scenarios. (#2429 ) * Raise application level exception for actor methods that can't be executed and failed tasks. * Retry task forwarding for actor tasks. * Small cleanups * Move constant to ray_config. * Create ForwardTaskOrResubmit method. * Minor * Clean up queued tasks for dead actors. * Some cleanups. * Linting * Notify task_dependency_manager_ about failed tasks. * Manage timer lifetime better. * Use smart pointers to deallocate the timer. * Fix * add comment	2018-07-27 19:53:30 -04:00

2332 commits