hiro/ray

mirror of https://github.com/vale981/ray synced 2025-03-09 12:56:46 -04:00

Author	SHA1	Message	Date
Max Pumperla	372c620f58	[docs] Tune overhaul part II (#22656) Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>	2022-02-26 23:07:34 -08:00
Max Pumperla	5cc9355303	[Docs] Tune docs overhaul (first part) (#22112) Continuing docs overhaul, tune now has: - [x] better landing page - [x] a getting started guide - [x] user guide was cut down, partially merged with FAQ, and partially integrated with tutorials - [x] the new user guide contains guides to tune features and practical integrations - [x] we rewrote some of the feature guides for clarity - [x] we got rid of sphinx-gallery for this sub-project (only data and core left), as it looks bad and is unnecessarily complicated anyway (plus, makes the build slower) - [x] sphinx-gallery examples are now moved to markdown notebook, as started in #22030. - [x] Examples are tested in the new framework, of course. There's still a lot one can do, but this is already getting too large. Will follow up with more fine-tuning next week. Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>	2022-02-07 15:47:03 +00:00
xwjiang2010	9af8f11191	Revert "[docs] Clean up doc structure (first part) (#21667)" (#21763) This reverts commit `38e46c9fb3`.	2022-01-20 15:30:56 -08:00
Max Pumperla	38e46c9fb3	[docs] Clean up doc structure (first part) (#21667)	2022-01-20 16:19:04 +01:00
Will Drevo	fa878e2d4d	Added example to user guide for cloud checkpointing (#20045) Co-authored-by: will <will@anyscale.com> Co-authored-by: Antoni Baum <antoni.baum@protonmail.com> Co-authored-by: Kai Fricke <kai@anyscale.com>	2021-11-15 15:43:06 +00:00
matthewdeng	790e22f9ad	[tune] move force_on_current_node to ml_utils (#20211)	2021-11-10 10:21:24 -08:00
Kai Fricke	9c2b8c8501	[tune] Deprecate DurableTrainable (#19880)	2021-11-08 20:56:07 +00:00
Antoni Baum	f2773267c7	[docs] Tune doc fixes (#19791)	2021-10-29 11:45:29 +02:00
Amog Kamsetty	38b657cb65	[Tune] Place remote tune.run on node running the client server (#16034) * force placement on persistent node * address comments * doc	2021-05-28 18:32:57 -07:00
Kai Fricke	43e098402a	[tune] make `tune.with_parameters()` work with the class API (#14532) * [tune] make `tune.with_parameters()` work with the class API * Update python/ray/tune/utils/trainable.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-03-09 09:36:17 +01:00
Kai Fricke	4014168928	[tune] Introduce `durable()` wrapper to convert trainables into durable trainables (#14306) * [tune] Introduce `durable()` wrapper to convert trainables into durable trainables * Fix wrong check * Improve docs, add FAQ for tackling overhead * Fix bugs in `tune.with_parameters` * Update doc/source/tune/api_docs/trainable.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Update doc/source/tune/_tutorials/_faq.rst Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-26 13:59:28 +01:00
Kai Fricke	757866ec01	[tune] enable placement groups per default (#13906) * Refactor placement group factory object to accept placement_group arguments instead of callables * Convert resources to pgf * Enable placement groups per default * Fix tests WIP * Fix stop/resume with placement groups * Fix progress reporter test * Fix trial executor tests * Check resource for trial, not resource object * Move ENV vars into class * Fix tests * Sphinx * Wait for trial start in PBT * Revert merge errors * Support trial reuse with placement groups * Better check for just staged trials * Fix trial queuing * Wait for pg after trial termination * Clean up PGs before tune run * No PG settings in pbt scheduler * Fix buffering tests * Skip test if ray reports erroneous available resources * Disable PG for cluster resource counting test * Debug output for tests * Output in-use resources for placement groups * Don't start new trial on trial start failure * Add docs * Cleanup PGs once futures returned * Fix placement group shutdown * Use updated_queue flag * Apply suggestions from code review * Apply suggestions from code review * Update docs * Reuse placement groups independently from actors * Do not remove placement groups for paused trials * Only continue enqueueing trials if it didn't fail the first time * Rename parameter * Fix pause trial * Code review + try_recover * Update python/ray/tune/utils/placement_groups.py Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Move placement group lifecycle management * Move total used resources to pg manager * Update FAQ example * Requeue trial if start was unsuccessful * Do not cleanup pgs at start of run * Revert "Do not cleanup pgs at start of run" This reverts commit 933d9c4c * Delayed PG removal * Fix trial requeue test * Trigger pg cleanup on status update * Fix tests * Fix docs * fix-test Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2021-02-23 18:46:02 +01:00
Keqiu Hu	0c1bdaef59	[tune] TensorFlow Distributed Trainable (#11876) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-11-10 14:59:08 -08:00
Richard Liaw	b02e61f672	[minor] fix up docs (#11596) Signed-off-by: Richard Liaw <rliaw@berkeley.edu>	2020-10-26 12:19:03 -07:00
Richard Liaw	56f858ed1a	[tune][docs/util] gputil check, docs (#11260) Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>	2020-10-10 00:54:31 -07:00
Kai Fricke	508cfa3540	[tune] Support `yield` and `return` statements (#10857) * Support `yield` and `return` statements in Tune trainable functions * Support anonymous metric with ``tune.report(value)`` * Raise on invalid return/yield value * Fix end to end reporter test	2020-09-17 20:18:35 -07:00
krfricke	f3f698816d	[tune] Added PyTorch Lightning callbacks to integrations (#10220) Co-authored-by: Richard Liaw <rliaw@berkeley.edu>	2020-08-31 15:30:48 -07:00
Richard Liaw	0c3b9ebeef	[tune/sgd] Document func_trainable and add checkpoint context (#9739) Co-authored-by: krfricke <krfricke@users.noreply.github.com> Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>	2020-07-30 09:46:37 -07:00
Richard Liaw	b71c912da7	[tune] Fix up examples (#9201)	2020-07-05 01:16:20 -07:00
Richard Liaw	d35f0e40d0	[tune] Use public methods for trainable (#9184)	2020-07-01 11:00:00 -07:00
Richard Liaw	6c49c01837	[tune] Function API checkpointing (#8471) Co-authored-by: krfricke <krfricke@users.noreply.github.com>	2020-06-15 10:42:54 -07:00
Richard Liaw	67c01455fe	[tune] `tune.track` -> `tune.report` (#8388)	2020-05-16 12:55:08 -07:00
Richard Liaw	be5235d982	[tune] Clarify Intro Tune Documentation (#8201)	2020-04-27 18:01:00 -07:00
Richard Liaw	b506f87117	[tune] New Doc edits, add Concepts page (#8083) Co-Authored-By: Sven Mika <sven@anyscale.io>	2020-04-25 18:25:56 -07:00
Richard Liaw	a67edc4051	[tune] Improve user guides and API docs (#7716) * create guide gallery for Tune * mods * ok * fix * fix_up_gallery * ok * Apply suggestions from code review Co-Authored-By: Sven Mika <sven@anyscale.io> * Apply suggestions from code review Co-Authored-By: Sven Mika <sven@anyscale.io> Co-authored-by: Sven Mika <sven@anyscale.io>	2020-04-06 12:16:35 -07:00
Richard Liaw	e311013afd	[tune] Reformat Sections of API Reference (#7706) * moveit * moveit * docstrings to ref * Update tune-usage.rst Co-authored-by: Sven Mika <sven@anyscale.io>	2020-03-23 12:23:21 -07:00

26 commits