hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-06 18:41:40 -05:00

Author	SHA1	Message	Date
Ujval Misra	ca651af1d7	[tune] Async restores and S3/GCP-capable trial FT (#6376) * Initial commit for asynchronous save/restore * Set stage for cloud checkpointable trainable. * Refactor log_sync and sync_client. * Add durable trainable impl. * Support delete in cmd based client * Fix some tests and such * Cleanup, comments. * Use upload_dir instead. * Revert files belonging to other PR in split. * Pass upload_dir into trainable init. * Pickle checkpoint at driver, more robust checkpoint_dir discovery. * Cleanup trainable helper functions, fix tests. * Addressed comments. * Fix bugs from cluster testing, add parameterized cluster tests. * Add trainable util test * package_ref * pbt_address * Fix bug after running pbt example (_save returning dir). * get cluster tests running, other bug fixes. * raise_errors * Fix deleter bug, add durable trainable example. * Fix cluster test bugs. * filelock * save/restore bug fixes * . * Working cluster tests. * Lint, revert to tracking memory checkpoints. * Documentation, cleanup * fixinitialsync * fix_one_test * Fix cluster test bug * nit * lint * Revert tune md change * Fix basename bug for directories. * lint * fix_tests * nit_fix * Add __init__ file. * Move to utils package Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-01-02 20:40:53 -08:00
Harrison Feng	57061a15cf	[docs] configure.rst with --num-cpus (#6678) --num-cpus -> --num-gpus Signed-off-by: Harrison Feng <feng.harrison@gmail.com>	2020-01-02 20:33:41 -08:00
Eric Liang	895f2727fb	Add experimental parallel iterators API (#6644)	2020-01-02 13:45:26 -08:00
Robert Nishihara	d2c6457832	Remove public facing references to --redis-address. (#6631)	2019-12-31 13:21:53 -08:00
Michael Luo	1cb335487e	SAC for Mujoco Environments (#6642)	2019-12-31 00:16:54 -08:00
Philipp Moritz	735f282494	Use 0.9.0.dev0 as the version tag (#6630)	2019-12-30 10:14:07 -08:00
Richard Liaw	646643a588	[doc] remove redundant PS example (#6634)	2019-12-29 20:54:42 -08:00
Edward Oakes	2a66529fb7	Add multiprocessing.Pool API (#6194)	2019-12-29 21:40:58 -06:00
Eric Liang	677004ee3d	Add 'ray stat' command for debugging (#6622) * wip * wip * wip * iterate * move * fix thread safety	2019-12-28 14:40:32 -08:00
Robert Nishihara	96f2f8ff10	Stop testing Python 2.7 and building Python 2.7 wheels. (#6601)	2019-12-27 20:47:49 -08:00
Robert Nishihara	8724e5ffd5	Start WebUI by default. (#6493)	2019-12-27 13:49:07 -08:00
zhu-eric	65297e65f0	Experimental Actor Pool (#6055) * mod_table * Example fix for gallery * lint * nit * nit * fix * gallery * remove table for now * training, object store, tune, actors, advanced * start tf code * first cut tf * yapf * pytorch * add torch example * torch * parallel * tune * tuning * reviewsready * finetune * fix * move_code * update conf * compile * init hyperparameter * Start images * overview * extra * fix * works * update-ps-example * param_actor * fix * examples * simple * simplify_pong * flake8 and run hyperopt * add comments * add comments * add suggestion * add suggestion * suggestions * add suggestion * add suggestions * fixed in wrong area * last edit * finish changes * add line * format * reset * tests and docs * fix tests * bazelify Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2019-12-26 14:35:10 -08:00
micafan	b98b288ffd	[GCS] Change GCS Test to cc_test (#6596)	2019-12-26 14:34:35 +08:00
Yuhao Yang	df4533c649	[tune] demo exporting trained models in pbt examples (#6533)	2019-12-26 02:14:49 +01:00
Edward Oakes	6b1a57542e	Add `actor.__ray_kill__()` to terminate actors immediately (#6523)	2019-12-23 23:12:57 -06:00
Edward Oakes	1b14fbe179	Hotfix ray download links in documentation (#6572)	2019-12-21 23:00:29 +01:00
Rong Rong	3af5fe60e7	modify developer tips page to add PR and CI tips (#6530)	2019-12-18 16:20:05 -08:00
Simon Mo	e530c37b0e	Use localhost and set redis password by default (#6481)	2019-12-17 19:41:19 -08:00
Eric Liang	6725a61bda	Release 0.8.0 test logs (#6512)	2019-12-17 15:56:50 -08:00
Eric Liang	2530eb90dc	Move tf.test.is_gpu_available() to after session init (#6515) * move to after session init * script fixes	2019-12-17 14:55:39 -08:00
Eric Liang	1a1324d2a2	Bump version from 0.8.0.dev6 -> 0.9.0.dev (#6508)	2019-12-16 23:57:42 -08:00
Edward Oakes	8636d67b72	Improve release docs and add results from 0.7.7 (#6506) * Improve docs, add logs * add logs * microbenchmark * lint	2019-12-16 15:51:39 -08:00
Richard Liaw	5719a05757	[sgd] Add support for multi-model multi-optimizer training (#6317)	2019-12-15 15:19:45 -08:00
Philipp Moritz	f5d10eea0b	[Projects] Refactor cluster specification (#6488)	2019-12-14 22:43:06 -08:00
Yuhao Yang	ad4da17899	[Tune] Add example and tutorial for DCGAN (#6400)	2019-12-13 14:15:44 -08:00
Richard Liaw	4ff6ca89f4	[docs] slight doc modifications (#6466)	2019-12-13 10:38:17 -08:00
alindkhare	76e678d775	[Serve] Added deadline awareness (#6442) * [Serve] Added deadline awareness Added deadline awareness while enqueuing a query Using Blist sorted-list implementation (ascending order) to get queries according to their specified deadlines. [buffer_queues] Exposed slo_ms via handle/http request Added slo example The queries in example will be executed in almost the opposite order of which they are fired Added slo pytest Added check for slo_ms to not be negative Included the changes suggested * Linting Corrections * Adding the code changes suggested by format.sh * Added the suggested changes Added justification for blist Added blist in travis/ci/install-dependencies.sh * Fixed linting issues * Added blist to ray/doc/requirements-doc.txt	2019-12-11 16:41:54 -08:00
Maltimore	0ec613c95a	[rllib] doc: fix typo: on_postprocess_batch -> on_postprocess_traj (#6438)	2019-12-11 15:00:53 -08:00
Dean Wampler	abb4fb3f8e	Added small section on installation when using Anaconda. (#6427)	2019-12-11 10:23:41 -08:00
Ameer Haj Ali	1a9948eef9	Update rllib-examples.rst (#6396)	2019-12-08 16:21:50 -08:00
Dean Wampler	65694cdc4c	[bug] Attempt to fix links not working. (#6390)	2019-12-07 14:31:50 -08:00
Victor Le	4e24c805ee	AlphaZero and Ranked reward implementation (#6385)	2019-12-07 12:08:40 -08:00
Yuhao Yang	c327ae152f	[doc] Update the test command in getting-involved. (#6347)	2019-12-07 11:03:52 -08:00
Dean Wampler	53d62d3eec	Expanded with new pages for getting started, etc. Blog links unchanged. (#6388)	2019-12-06 15:18:47 -08:00
Edward Oakes	f63b64310a	Bump version to 0.8.0.dev7 (#6303)	2019-12-05 18:33:54 -08:00
Philipp Moritz	dd27bfbb75	Rename .rayproject to ray-project (#6278)	2019-12-05 16:15:42 -08:00
Eric Liang	bc5e259264	[rllib] Add a doc section on computing actions (#6326) * options doc * add note * hint shr * doc update	2019-12-03 00:10:50 -08:00
Shital Shah	670cb6374e	Doc enhancement: use build.sh for ray, clarification on how rllib selects VisionNetwork, note on setup-dev.py for rllib. (#6092)	2019-12-02 22:19:01 -08:00
Richard Liaw	0b3d5d989b	[docs] Add public materials (#6331) * startup * update tune readme * usingrah	2019-12-02 19:59:23 -08:00
Eric Liang	0b0a16982a	[doc] Use .options() (#6323) * options doc * add note * hint shr	2019-12-01 17:24:00 -08:00
Philipp Moritz	a4437813eb	[Projects] Unify hyphen vs underscore handling for arguments (#6208)	2019-11-20 23:52:41 -08:00
Richard Liaw	d3c7a8fda5	[docs] yarn update (#6173)	2019-11-19 16:15:08 -08:00
Yuhao Yang	d3ff2252c4	[doc] Fix link to getting involved	2019-11-18 12:59:14 -08:00
Eric Liang	8fc2272f43	[rllib] Reorganize trainer config, add warnings about high VF loss magnitude for PPO (#6181)	2019-11-18 10:39:07 -08:00
Ujval Misra	2965dc1b72	[tune] Fault tolerance improvements (#5877) * Precede ray.get with ray.wait. * Trigger checkpoint deletes locally in Trainable * Clean-up code. * Minor changes. * Track best checkpoint so far again * Pulled checkpoint GC out of Trainable. * Added comments, error logging. * Immediate pull after checkpoint taken; rsync source delete on pull * Minor doc fixes * Fix checkpoint manager bug * Fix bugs, tests, formatting * Fix bugs, feature flag for force sync. * Fix test. * Fix minor bugs: clear proc and less verbose sync_on_checkpoint warnings. * Fix bug: update IP of last_result. * Fixed message. * Added a lot of logging. * Changes to ray trial executor. * More bug fixes (logging after failure), better logging. * Fix richards bug and logging * Add comments. * try-except * Fix heapq bug. * . * Move handling of no available trials to ray_trial_executor (#1) * Fix formatting bug, lint. * Addressed Richard's comments * Revert tests. * fix rebase * Fix trial location reporting. * Fix test * Fix lint * Rebase, use ray.get w/ timeout, lint. * lint * fix rebase * Address richard's comments	2019-11-18 01:14:41 -08:00
Richard Liaw	62cbc043b4	[tune] tbx logger (#6133) * tbx * add_hparams * fix_hparams * ok * ok * fix * ok * fix	2019-11-15 08:45:44 -08:00
Edward Oakes	385783fcec	Ray on YARN + Skein Documentation (#6119)	2019-11-14 15:06:05 -08:00
Eric Liang	243b1b7281	[rllib] Add microbatch optimizer with A2C example (#6161)	2019-11-14 12:14:00 -08:00
Ujval Misra	e3e3ad4b25	Add timeout param to ray.get (#6107)	2019-11-14 00:50:04 -08:00
Eric Liang	e4565c9cc6	Reduce RLlib log verbosity (#6154)	2019-11-13 18:50:45 -08:00

635 commits