12 commits

Author SHA1 Message Date
Kai Fricke
43e098402a
[tune] make tune.with_parameters() work with the class API (#14532)
* [tune] make `tune.with_parameters()` work with the class API

* Update python/ray/tune/utils/trainable.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-03-09 09:36:17 +01:00
Kai Fricke
4014168928
[tune] Introduce durable() wrapper to convert trainables into durable trainables (#14306)
* [tune] Introduce `durable()` wrapper to convert trainables into durable trainables

* Fix wrong check

* Improve docs, add FAQ for tackling overhead

* Fix bugs in `tune.with_parameters`

* Update doc/source/tune/api_docs/trainable.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Update doc/source/tune/_tutorials/_faq.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-26 13:59:28 +01:00
Kai Fricke
757866ec01
[tune] enable placement groups per default (#13906)
* Refactor placement group factory object to accept placement_group arguments instead of callables

* Convert resources to pgf

* Enable placement groups per default

* Fix tests WIP

* Fix stop/resume with placement groups

* Fix progress reporter test

* Fix trial executor tests

* Check resource for trial, not resource object

* Move ENV vars into class

* Fix tests

* Sphinx

* Wait for trial start in PBT

* Revert merge errors

* Support trial reuse with placement groups

* Better check for just staged trials

* Fix trial queuing

* Wait for pg after trial termination

* Clean up PGs before tune run

* No PG settings in pbt scheduler

* Fix buffering tests

* Skip test if ray reports erroneous available resources

* Disable PG for cluster resource counting test

* Debug output for tests

* Output in-use resources for placement groups

* Don't start new trial on trial start failure

* Add docs

* Cleanup PGs once futures returned

* Fix placement group shutdown

* Use updated_queue flag

* Apply suggestions from code review

* Apply suggestions from code review

* Update docs

* Reuse placement groups independently from actors

* Do not remove placement groups for paused trials

* Only continue enqueueing trials if it didn't fail the first time

* Rename parameter

* Fix pause trial

* Code review + try_recover

* Update python/ray/tune/utils/placement_groups.py

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>

* Move placement group lifecycle management

* Move total used resources to pg manager

* Update FAQ example

* Requeue trial if start was unsuccessful

* Do not cleanup pgs at start of run

* Revert "Do not cleanup pgs at start of run"

This reverts commit 933d9c4c

* Delayed PG removal

* Fix trial requeue test

* Trigger pg cleanup on status update

* Fix tests

* Fix docs

* fix-test

Signed-off-by: Richard Liaw <rliaw@berkeley.edu>

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-02-23 18:46:02 +01:00
Sumanth Ratna
92a58aabce
[tune][docs] Fix learning rate bounds in FAQ (#11345)
2020-10-12 09:44:53 -07:00
Richard Liaw
56f858ed1a
[tune][docs/util] gputil check, docs (#11260)
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-10-10 00:54:31 -07:00
Sumanth Ratna
98ebf8e2d8
[tune][docs] fix typo in Tune FAQ (#11161)
* Fix typo in tune FAQ (used to use)

* Update doc/source/tune/_tutorials/_faq.rst

Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-01 11:20:41 -07:00
Kai Fricke
b8f344f695
[tune] add faq entry for reproducing experiments (setting seeds etc) (#11106)
2020-09-29 14:48:39 -07:00
Richard Liaw
a563344bc2
[docs] remove ref to google groups -> github discussions (#11019)
2020-09-24 18:09:51 -07:00
Richard Liaw
b0ca70f628
[tune+core] tune lifecycle and starting ray guide (#10813)
2020-09-21 11:27:50 -07:00
Kai Fricke
7eaf063f29
[tune] wrapper function to pass arbitrary objects through the object store to trainables (#10679)
2020-09-10 17:39:44 -07:00
Sumanth Ratna
89bf262130
[tune] Fix lr typo in FAQ (#10548)
2020-09-03 13:37:39 -07:00
krfricke
5a787a8253
[tune] added FAQ to docs (#10222)
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-08-24 21:51:02 -07:00