hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 12:56:46 -04:00

Author	SHA1	Message	Date
Richard Liaw	f4b2dd94b2	[tune] Cache MNIST and restore MNIST tests (#15260 )	2021-04-13 14:20:26 -07:00
Kai Fricke	d33b0e4bc3	[tune] Reconcile placement groups every N seconds to avoid bottlenecks when running many short trials (#15011 ) Closes a release blocking issue	2021-04-01 17:04:44 +02:00
Kai Fricke	84b3c3376b	[tune] document scalability best practices (k8s, scalability thresholds) (#14566 ) Adds a new page and table to document current scalability thresholds in Ray Tune to the documentation. Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-03-25 09:54:14 +01:00
Kai Fricke	898243d538	[tune] Limit maximum number of pending trials. Add convergence test. (#14835 )	2021-03-23 18:19:41 -07:00
Amog Kamsetty	7ee2e4185b	[Tune] PTL Fractional GPUs (#14781 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-03-18 17:07:51 -07:00
Kai Fricke	43e098402a	[tune] make `tune.with_parameters()` work with the class API (#14532 ) * [tune] make `tune.with_parameters()` work with the class API * Update python/ray/tune/utils/trainable.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-03-09 09:36:17 +01:00
Kai Fricke	b0bf44b154	[tune/docs] Add high level trial runner flow to documentation (#14468 ) * [tune/docs] Add high level trial runner flow to documentation * Apply suggestions from code review	2021-03-08 10:35:54 +01:00
Kai Fricke	4014168928	[tune] Introduce `durable()` wrapper to convert trainables into durable trainables (#14306 ) * [tune] Introduce `durable()` wrapper to convert trainables into durable trainables * Fix wrong check * Improve docs, add FAQ for tackling overhead * Fix bugs in `tune.with_parameters` * Update doc/source/tune/api_docs/trainable.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/tune/_tutorials/_faq.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-26 13:59:28 +01:00
Kai Fricke	757866ec01	[tune] enable placement groups per default (#13906 ) * Refactor placement group factory object to accept placement_group arguments instead of callables * Convert resources to pgf * Enable placement groups per default * Fix tests WIP * Fix stop/resume with placement groups * Fix progress reporter test * Fix trial executor tests * Check resource for trial, not resource object * Move ENV vars into class * Fix tests * Sphinx * Wait for trial start in PBT * Revert merge errors * Support trial reuse with placement groups * Better check for just staged trials * Fix trial queuing * Wait for pg after trial termination * Clean up PGs before tune run * No PG settings in pbt scheduler * Fix buffering tests * Skip test if ray reports erroneous available resources * Disable PG for cluster resource counting test * Debug output for tests * Output in-use resources for placement groups * Don't start new trial on trial start failure * Add docs * Cleanup PGs once futures returned * Fix placement group shutdown * Use updated_queue flag * Apply suggestions from code review * Apply suggestions from code review * Update docs * Reuse placement groups independently from actors * Do not remove placement groups for paused trials * Only continue enqueueing trials if it didn't fail the first time * Rename parameter * Fix pause trial * Code review + try_recover * Update python/ray/tune/utils/placement_groups.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Move placement group lifecycle management * Move total used resources to pg manager * Update FAQ example * Requeue trial if start was unsuccessful * Do not cleanup pgs at start of run * Revert "Do not cleanup pgs at start of run" This reverts commit 933d9c4c * Delayed PG removal * Fix trial requeue test * Trigger pg cleanup on status update * Fix tests * Fix docs * fix-test Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-23 18:46:02 +01:00
Antoni Baum	58d7398246	[Tune] Add `HEBOSearch` Searcher (#13863 ) * HEBO first pass * Fix bad quotes * Fixes * Reproductibility * Update python/ray/tune/suggest/hebo.py Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> * Add hebo_example.py to BUILD * Nit * Update to pypi package * Alphabetical HEBO requirement * Fix syntax error * Fix wrong space in hebo example * Move validate_warmstart to utils * Space assertion in HEBO * Comment * Apply suggestions from code review Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> * Formatting Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2021-02-17 22:53:10 +01:00
javi-redondo	b8b2d6410d	[docs] new Ray Cluster documentation (#13839 ) Co-authored-by: Javier Redondo <javier@anyscale.com> Co-authored-by: AmeerHajAli <ameerh@berkeley.edu>	2021-02-15 00:47:14 -08:00
Richard Liaw	6c77aeb98a	[docs] ray slack remove banners (#13898 ) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-04 01:14:34 -08:00
Kai Fricke	d29fcfb45c	[tune] catch SIGINT signal and trigger experiment checkpoint (#13767 ) * [tune] catch SIGINT signal and trigger experiment checkpoint * Apply suggestions from code review * Fix user guide docs * Update doc/source/tune/user-guide.rst	2021-02-02 14:52:09 +01:00
architkulkarni	28cf5f91e3	[docs] change MLFlow to MLflow in docs (#13739 )	2021-01-27 16:53:15 -08:00
Amog Kamsetty	20016c983f	[Tune] MLflow Credentials (#13533 )	2021-01-19 11:55:13 -08:00
Kai Fricke	dc42abb2f5	[tune] placement group support (#13370 )	2021-01-18 11:58:57 -08:00
Richard Liaw	86387504ee	[tune] fix small docs typo (#13355 ) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2021-01-16 00:49:17 -08:00
Kai Fricke	518427627b	[tune] buffer trainable results (#13236 ) * Working prototype * Pass buffer length, fix tests * Don't buffer per default * Dispatch and process save in one go, added tests * Fix tests * Pass adaptive seconds to train_buffered, stop result processing after STOP decision * Fix tests, add release test * Update tests * Added detailed logs for slow operations * Update python/ray/tune/trial_runner.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Apply suggestions from code review * Revert tests and go back to old tuning loop * nit Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-01-12 18:52:47 +01:00
Edwin Goh	a5ddc27bab	Fix typo in Tune Docs (Checkpointing) (#13348 ) See issue #13299	2021-01-11 20:27:18 -08:00
Amog Kamsetty	0452a3a435	[Tune] Rename MLFlow to MLflow (#13301 )	2021-01-11 17:36:55 -08:00
Kai Fricke	97211a6170	[Tune] Fix tune serve integration example (#13233 )	2021-01-06 17:02:04 +01:00
Lavanya Shukla	350917958c	[docs] fix wandb url (#13094 )	2020-12-28 17:19:17 -08:00
Antoni Baum	a4f2dd2138	[Tune]Add integer loguniform support (#12994 ) * Add integer quantization and loguniform support * Fix hyperopt qloguniform not being np.log'd first * Add tests, __init__ * Try to fix tests, better exceptions * Tweak docstrings * Type checks in SearchSpaceTest * Update docs * Lint, tests * Update doc/source/tune/api_docs/search_space.rst Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2020-12-23 09:27:16 -08:00
Amog Kamsetty	5d3c9c8861	[Tune] Mlflow Integration (#12840 ) Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-12-19 00:40:02 -08:00
Kai Fricke	3d72000826	[tune] Add `points_to_evaluate` to BasicVariantGenerator (#12916 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-12-17 19:16:03 -08:00
Kai Fricke	5f04ade6ef	[tune] add more stoppers and stopper documentation (#12750 ) * Add new stoppers & docs * Add tests for maximum iteration stopper and trial plateau stopper * Update python/ray/tune/stopper.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/tune/api_docs/stoppers.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/tune/api_docs/stoppers.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Apply suggestions from code review * Apply suggestions from code review * Update python/ray/tune/stopper.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-12-12 01:47:19 -08:00
Richard Liaw	9ce7ad17fd	[tune] remove some bottlenecks in trialrunner (#12476 )	2020-11-30 14:54:25 -08:00
Richard Liaw	7c009d22cf	[docs] Add xgboost_ray to docs (#12184 ) Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com>	2020-11-27 11:36:56 -08:00
Richard Liaw	e59fe65d3d	[tune] Fix logging for dockersyncer (#12196 )	2020-11-23 14:29:41 -08:00
Kai Fricke	9f5986ee58	[tune] logger migration to ExperimentLogger classes (#11984 )	2020-11-16 15:08:37 -08:00
Richard Liaw	8b3f79f307	[tune] refactor and add examples (#11931 )	2020-11-14 20:43:28 -08:00
Keqiu Hu	0c1bdaef59	[tune] TensorFlow Distributed Trainable (#11876 ) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-11-10 14:59:08 -08:00
Kai Fricke	603accf1c2	[tune] logger refactor part 3: Add ExperimentLogger class (#11749 )	2020-11-05 08:55:38 -08:00
Richard Liaw	efa07d5403	Revert "Revert "[tune] PB2 (#11466 )" (#11795 )" (#11812 )	2020-11-04 20:47:12 -08:00
Amog Kamsetty	7248d5f4ae	Revert "[tune] PB2 (#11466 )" (#11795 ) This reverts commit `e7aafd7d24`.	2020-11-03 21:05:00 -08:00
Kai Fricke	f7b19c41e3	[tune] logger refactor part 1: move classes and utilities to own files (#11746 ) * [tune] logger refactor part 1: move classes and utilities to own files * Fix circular dependency * Remove uneeded pretty print copy * Apply suggestions from code review	2020-11-03 07:48:09 -08:00
Jack Parker-Holder	e7aafd7d24	[tune] PB2 (#11466 ) Co-authored-by: Sumanth Ratna <sumanthratna@gmail.com> Co-authored-by: Amog Kamsetty <amogkamsetty@yahoo.com> Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-10-27 01:03:21 -07:00
Richard Liaw	b02e61f672	[minor] fix up docs (#11596 ) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2020-10-26 12:19:03 -07:00
Richard Liaw	1b357533b1	[tune] Try to enable PTL, SKlearn tests (#11542 )	2020-10-24 01:08:46 -07:00
Richard Liaw	e7aa6441b7	[tune] a tiny ptl example (#11497 )	2020-10-22 18:50:34 -07:00
Frank Gu	73fa94731f	[tune] Add HDFS as Cloud Sync Client (#11524 )	2020-10-22 14:12:51 -07:00
Richard Liaw	a4b418d30c	[docs] update cloud docs (#11262 ) * update-cloud-docs Signed-off-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/cluster/config.rst Co-authored-by: Ian Rodney <ian.rodney@gmail.com> * fix Signed-off-by: Richard Liaw <rliaw@berkeley.edu> * fix Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Ian Rodney <ian.rodney@gmail.com>	2020-10-21 16:37:26 -07:00
Kai Fricke	2f74fe5b71	[tune/docs] Add PTL example to tune docs/examples (#11474 )	2020-10-19 14:47:58 -07:00
Sumanth Ratna	92a58aabce	[tune][docs] Fix learning rate bounds in FAQ (#11345 )	2020-10-12 09:44:53 -07:00
Richard Liaw	56f858ed1a	[tune][docs/util] gputil check, docs (#11260 ) Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>	2020-10-10 00:54:31 -07:00
Kai Fricke	b450cb030a	[tune] reuse actors for function API (#11230 ) Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>	2020-10-08 16:15:02 -07:00
Sumanth Ratna	14d8826e43	Fix overriden typo (#11227 )	2020-10-07 19:11:07 -07:00
Amog Kamsetty	3b76def2d2	[Docs] [Tune] Add NeuroCard to open source projects using Tune (#11213 )	2020-10-06 14:22:32 -07:00
Kai Fricke	681c24754a	[tune] Example for using huggingface hyperparamer_search API (#11158 )	2020-10-01 16:00:57 -07:00
Kai Fricke	bdf647c4ec	[tune] docker syncer (#11035 ) * Add DockerSyncer * Add docs * Update python/ray/tune/integration/docker.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Updated docs * fix dir * Added docker integration test * added docker integration test to bazel build * Use sdk.rsync API Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-10-01 11:59:23 -07:00

178 commits