Antoni Baum
c7d6f838f6
[tune] Optional forcible trial cleanup, return default autofilled metrics even if Trainable doesn't report at least once ( #19144 )
2021-10-08 18:16:26 +01:00
xwjiang2010
7ffd9cbed1
[Tune] Fix column width in doc. ( #19159 )
2021-10-07 18:16:21 +01:00
Antoni Baum
27b8633198
[docs] Remove outdated note in Tune docs ( #19110 )
2021-10-07 15:42:11 +01:00
Antoni Baum
cc3199b814
[docs] Provide information about resource deadlocks, early stopping in Tune docs ( #18947 )
...
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-10-01 13:52:47 +01:00
Richard Liaw
227aa9e89b
[tune] change delimiter for results ( #16573 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2021-09-28 10:03:00 +01:00
Kai Fricke
9b0d804eed
[tune] Add documentation for reproducible runs (setting seeds) ( #18849 )
...
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2021-09-24 10:57:31 +01:00
xwjiang2010
5551cdac19
[Tune] Break from loop after warning msg is logged. ( #18720 )
2021-09-18 16:33:44 -07:00
Kai Fricke
395976c8a1
[tune] Never block for results ( #18391 )
...
* [tune] Never block for results
* Fix tests
* Block in tests
* Add comment to test
2021-09-09 12:08:00 -07:00
Richard Liaw
0594deafdf
[tune] allow users to configure bootstrap for docker syncer ( #17786 )
2021-09-05 22:04:31 -07:00
xwjiang2010
01adf030ec
[Tune] Raise Error when there are insufficient resources. ( #17957 )
2021-09-03 10:49:54 -07:00
xwjiang2010
63f00843f3
[Tune] Inform users of the setup needed for uploading results to cloud. ( #18220 )
2021-08-31 10:27:50 -07:00
Ryan L. Melvin
c081c68de7
[tune] Conditional search space example using hyperopt ( #18130 )
...
Co-authored-by: Ryan Melvin <rmelvin@uabmc.edu>
Co-authored-by: Antoni Baum <antoni.baum@protonmail.com>
2021-08-31 17:06:22 +02:00
Amog Kamsetty
3b77840c1b
PyTorch Lightning Updates ( #17876 )
2021-08-27 23:15:51 -07:00
xwjiang2010
0be9f06ab6
[tune] Output insufficient resources warning msg when trials are in pending for extended amount of time. ( #17533 )
2021-08-13 01:37:56 -07:00
Amog Kamsetty
be238e159d
[Tune] Update docs for `with_parameters` ( #17441 )
...
* with_parameters_doc
* update docstring
* address comments
2021-08-05 08:48:34 -07:00
Antoni Baum
c40555c82b
[tune] Add define-by-run support to `OptunaSearcher` ( #17464 )
2021-08-03 16:11:58 +01:00
Kai Fricke
81d3d8705e
[tune] fix docs example for tune qloguniform ( #17539 )
2021-08-03 14:48:22 +01:00
Qingyun Wu
7678503d84
[Tune][docs] Correct reference name to CFO example ( #17503 )
2021-08-02 14:46:10 +01:00
amavilla
f2d9b1f2b9
[docs] Link broken in Tune's page ( #17394 ) ( #17407 )
2021-07-28 09:27:54 -07:00
Antoni Baum
b500a651b7
[docs] Add LightGBM Tune integration to docs ( #17304 )
...
* Add LightGBM integration to docs
* Fix
2021-07-23 21:21:13 -07:00
Antoni Baum
2e37826458
[tune] Function API support for `ResourceChangingScheduler` ( #17150 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-07-21 14:14:12 -07:00
Antoni Baum
f20311f194
[tune] `ResourceChangingScheduler` improvements ( #17082 )
2021-07-15 15:03:27 +01:00
Antoni Baum
6e780ebf07
[tune] `ResourceChangingScheduler` dynamic resource allocation during tuning ( #16787 )
2021-07-14 10:45:13 +01:00
Kai Fricke
fce8fa2668
[tune] use bayesopt for quick start example (which actually converges) ( #16997 )
2021-07-12 14:50:32 +01:00
Antoni Baum
0935ec30d0
[tune] Add information about environment variables to `tune.run` docstring ( #16980 )
2021-07-11 17:20:17 -07:00
Amog Kamsetty
33d798f8fc
[Docs] Add e2e guide on using Pytorch Lightning with Ray ( #16484 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-19 10:04:58 -07:00
Kai Fricke
172d33be02
[tune] Use unbuffered training when checkpoint_at_end is used. ( #16504 )
2021-06-18 14:19:14 +01:00
Antoni Baum
d71ec6e874
[docs] Add examples of new features to contribute ( #16477 )
2021-06-18 00:07:03 -07:00
Qingyun Wu
dae3ac1def
[Tune] Add new searchers from FLAML ( #16329 )
2021-06-12 02:10:51 -07:00
Kai Fricke
e8f8e9f328
[tune] Adjust searcher sample bounds to match Tune API ( #15899 )
2021-06-11 14:31:08 +01:00
Amog Kamsetty
04863d158a
[Tune] MLflow with Ray Client ( #16029 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-06-01 09:50:44 -07:00
Amog Kamsetty
38b657cb65
[Tune] Place remote tune.run on node running the client server ( #16034 )
...
* force placement on persistent node
* address comments
* doc
2021-05-28 18:32:57 -07:00
Edward Oakes
82410f20b2
[serve] Add warning + docstring for anonymous namespaces ( #15921 )
2021-05-20 22:27:15 -05:00
Tom Dörr
3c99f1db4c
[Docs] Tune Contributors fix ( #15719 )
2021-05-10 12:22:47 -07:00
Tom Dörr
b5c03b6458
Fix Link ( #15722 )
2021-05-10 12:19:32 -07:00
Kai Fricke
16381625db
[tune] Reduce default number of maximum pending trials to max(16, cluster_cpus) ( #15628 )
2021-05-05 15:54:27 +01:00
Edward Oakes
c9550a86dc
[serve] Update docs for v2 Deployments API ( #15582 )
2021-05-03 13:19:34 -05:00
Richard Liaw
f4b2dd94b2
[tune] Cache MNIST and restore MNIST tests ( #15260 )
2021-04-13 14:20:26 -07:00
Kai Fricke
d33b0e4bc3
[tune] Reconcile placement groups every N seconds to avoid bottlenecks when running many short trials ( #15011 )
...
Closes a release blocking issue
2021-04-01 17:04:44 +02:00
Kai Fricke
84b3c3376b
[tune] document scalability best practices (k8s, scalability thresholds) ( #14566 )
...
Adds a new page and table to document current scalability thresholds in Ray Tune to the documentation.
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-25 09:54:14 +01:00
Kai Fricke
898243d538
[tune] Limit maximum number of pending trials. Add convergence test. ( #14835 )
2021-03-23 18:19:41 -07:00
Amog Kamsetty
7ee2e4185b
[Tune] PTL Fractional GPUs ( #14781 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-18 17:07:51 -07:00
Kai Fricke
43e098402a
[tune] make `tune.with_parameters()` work with the class API ( #14532 )
...
* [tune] make `tune.with_parameters()` work with the class API
* Update python/ray/tune/utils/trainable.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-09 09:36:17 +01:00
Kai Fricke
b0bf44b154
[tune/docs] Add high level trial runner flow to documentation ( #14468 )
...
* [tune/docs] Add high level trial runner flow to documentation
* Apply suggestions from code review
2021-03-08 10:35:54 +01:00
Kai Fricke
4014168928
[tune] Introduce `durable()` wrapper to convert trainables into durable trainables ( #14306 )
...
* [tune] Introduce `durable()` wrapper to convert trainables into durable trainables
* Fix wrong check
* Improve docs, add FAQ for tackling overhead
* Fix bugs in `tune.with_parameters`
* Update doc/source/tune/api_docs/trainable.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/_tutorials/_faq.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-26 13:59:28 +01:00
Kai Fricke
757866ec01
[tune] enable placement groups per default ( #13906 )
...
* Refactor placement group factory object to accept placement_group arguments instead of callables
* Convert resources to pgf
* Enable placement groups per default
* Fix tests WIP
* Fix stop/resume with placement groups
* Fix progress reporter test
* Fix trial executor tests
* Check resource for trial, not resource object
* Move ENV vars into class
* Fix tests
* Sphinx
* Wait for trial start in PBT
* Revert merge errors
* Support trial reuse with placement groups
* Better check for just staged trials
* Fix trial queuing
* Wait for pg after trial termination
* Clean up PGs before tune run
* No PG settings in pbt scheduler
* Fix buffering tests
* Skip test if ray reports erroneous available resources
* Disable PG for cluster resource counting test
* Debug output for tests
* Output in-use resources for placement groups
* Don't start new trial on trial start failure
* Add docs
* Cleanup PGs once futures returned
* Fix placement group shutdown
* Use updated_queue flag
* Apply suggestions from code review
* Apply suggestions from code review
* Update docs
* Reuse placement groups independently from actors
* Do not remove placement groups for paused trials
* Only continue enqueueing trials if it didn't fail the first time
* Rename parameter
* Fix pause trial
* Code review + try_recover
* Update python/ray/tune/utils/placement_groups.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Move placement group lifecycle management
* Move total used resources to pg manager
* Update FAQ example
* Requeue trial if start was unsuccessful
* Do not cleanup pgs at start of run
* Revert "Do not cleanup pgs at start of run"
This reverts commit 933d9c4c
* Delayed PG removal
* Fix trial requeue test
* Trigger pg cleanup on status update
* Fix tests
* Fix docs
* fix-test
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-23 18:46:02 +01:00
Antoni Baum
58d7398246
[Tune] Add `HEBOSearch` Searcher ( #13863 )
...
* HEBO first pass
* Fix bad quotes
* Fixes
* Reproducibility
* Update python/ray/tune/suggest/hebo.py
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
* Add hebo_example.py to BUILD
* Nit
* Update to pypi package
* Alphabetical HEBO requirement
* Fix syntax error
* Fix wrong space in hebo example
* Move validate_warmstart to utils
* Space assertion in HEBO
* Comment
* Apply suggestions from code review
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
* Formatting
Co-authored-by: Kai Fricke <krfricke@users.noreply.github.com>
2021-02-17 22:53:10 +01:00
javi-redondo
b8b2d6410d
[docs] new Ray Cluster documentation ( #13839 )
...
Co-authored-by: Javier Redondo <javier@anyscale.com>
Co-authored-by: AmeerHajAli <ameerh@berkeley.edu>
2021-02-15 00:47:14 -08:00
Richard Liaw
6c77aeb98a
[docs] ray slack remove banners ( #13898 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-04 01:14:34 -08:00
Kai Fricke
d29fcfb45c
[tune] catch SIGINT signal and trigger experiment checkpoint ( #13767 )
...
* [tune] catch SIGINT signal and trigger experiment checkpoint
* Apply suggestions from code review
* Fix user guide docs
* Update doc/source/tune/user-guide.rst
2021-02-02 14:52:09 +01:00