Kai Fricke
dc42abb2f5
[tune] placement group support ( #13370 )
2021-01-18 11:58:57 -08:00
Richard Liaw
86387504ee
[tune] fix small docs typo ( #13355 )
...
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-16 00:49:17 -08:00
Kai Fricke
518427627b
[tune] buffer trainable results ( #13236 )
...
* Working prototype
* Pass buffer length, fix tests
* Don't buffer per default
* Dispatch and process save in one go, added tests
* Fix tests
* Pass adaptive seconds to train_buffered, stop result processing after STOP decision
* Fix tests, add release test
* Update tests
* Added detailed logs for slow operations
* Update python/ray/tune/trial_runner.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Revert tests and go back to old tuning loop
* nit
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2021-01-12 18:52:47 +01:00
Edwin Goh
a5ddc27bab
Fix typo in Tune Docs (Checkpointing) ( #13348 )
...
See issue #13299
2021-01-11 20:27:18 -08:00
Kai Fricke
5f04ade6ef
[tune] add more stoppers and stopper documentation ( #12750 )
...
* Add new stoppers & docs
* Add tests for maximum iteration stopper and trial plateau stopper
* Update python/ray/tune/stopper.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/api_docs/stoppers.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/tune/api_docs/stoppers.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Apply suggestions from code review
* Apply suggestions from code review
* Update python/ray/tune/stopper.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-12-12 01:47:19 -08:00
Richard Liaw
9ce7ad17fd
[tune] remove some bottlenecks in trialrunner ( #12476 )
2020-11-30 14:54:25 -08:00
Richard Liaw
e59fe65d3d
[tune] Fix logging for dockersyncer ( #12196 )
2020-11-23 14:29:41 -08:00
Keqiu Hu
0c1bdaef59
[tune] TensorFlow Distributed Trainable ( #11876 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-11-10 14:59:08 -08:00
Kai Fricke
603accf1c2
[tune] logger refactor part 3: Add ExperimentLogger class ( #11749 )
2020-11-05 08:55:38 -08:00
Frank Gu
73fa94731f
[tune] Add HDFS as Cloud Sync Client ( #11524 )
2020-10-22 14:12:51 -07:00
Richard Liaw
a4b418d30c
[docs] update cloud docs ( #11262 )
...
* update-cloud-docs
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/cluster/config.rst
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
* fix
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Ian Rodney <ian.rodney@gmail.com>
2020-10-21 16:37:26 -07:00
Richard Liaw
56f858ed1a
[tune][docs/util] gputil check, docs ( #11260 )
...
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-10-10 00:54:31 -07:00
Kai Fricke
b450cb030a
[tune] reuse actors for function API ( #11230 )
...
Co-authored-by: Kristian Hartikainen <kristian.hartikainen@gmail.com>
2020-10-08 16:15:02 -07:00
Kai Fricke
bdf647c4ec
[tune] docker syncer ( #11035 )
...
* Add DockerSyncer
* Add docs
* Update python/ray/tune/integration/docker.py
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Updated docs
* fix dir
* Added docker integration test
* added docker integration test to bazel build
* Use sdk.rsync API
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-10-01 11:59:23 -07:00
Kai Fricke
c77cfaa5ad
[tune] use dated experiment dir per default ( #11104 )
2020-09-30 14:43:59 -07:00
Kai Fricke
e7315b0856
[tune] Callbacks for tune runs ( #11001 )
2020-09-27 16:50:07 -07:00
Richard Liaw
a563344bc2
[docs] remove ref to google groups -> github discussions ( #11019 )
2020-09-24 18:09:51 -07:00
Kai Fricke
d9c4dea7cf
[tune] strict metric checking ( #10972 )
2020-09-24 10:00:48 -07:00
Richard Liaw
b0ca70f628
[tune+core] tune lifecycle and starting ray guide ( #10813 )
2020-09-21 11:27:50 -07:00
Ameer Haj Ali
6edacb22b8
Fix abstraction violations in command_runner interface ( #10715 )
...
* Fix abstraction violations in command_runner interface
* user guide
* lint
* breaking abstraction in commands
* extra initialization commands
* more cleanup
* small fixes
* fix test_integration_kubernetes.py
* lint
Co-authored-by: root <root@ip-172-31-28-155.us-west-2.compute.internal>
Co-authored-by: Ameer Haj Ali <ameerhajali@Ameers-MacBook-Pro.local>
2020-09-14 20:28:38 -07:00
Max Fitton
017737b82b
[Documentation] local_mode doc updates and actor / worker explanation from Slack ( #10748 )
...
* wip
* Update local mode docs in all locations
* Update doc/source/actors.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Update doc/source/actors.rst
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
* Change duplicated text to links to a subtitle for local_mode
* change a reference to be explicit
* Apply suggestions from code review
Co-authored-by: Max Fitton <max@semprehealth.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-14 13:19:38 -07:00
Kai Fricke
7eaf063f29
[tune] wrapper function to pass arbitrary objects through the object store to trainables ( #10679 )
2020-09-10 17:39:44 -07:00
Richard Liaw
551c597312
[tune] API revamp fix ( #10518 )
2020-09-05 15:34:53 -07:00
Kai Fricke
2fac66650d
[tune] extend search space api docs ( #10576 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-09-04 18:39:51 -07:00
Eric Liang
519354a39a
[api] Initial API deprecations for Ray 1.0 ( #10325 )
2020-08-28 15:03:50 -07:00
Richard Liaw
6bd5458bef
[tune] cleanup error messaging/diagnose_serialization helper ( #10210 )
2020-08-22 11:50:49 -07:00
Richard Liaw
927a073226
[tune] Update node syncing documentation ( #10126 )
2020-08-17 18:08:27 -07:00
krfricke
8f0f7371a0
[tune] Added Kubernetes syncer and sync client ( #10097 )
...
Co-authored-by: Edward Oakes <ed.nmi.oakes@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-16 14:09:28 -07:00
krfricke
c741d1cf9c
[tune] stdout/stderr logging redirection ( #9817 )
...
* Add `log_to_file` parameter, pass to Trainable config, redirect stdout/stderr.
* Add logging handler to root ray logger
* Added test for `log_to_file` parameter
* Added logs, reuse test
* Revert debug change
* Update logdir on reset, flush streams after each train() step
* Remove magic keys from visible config
Co-authored-by: Kai Fricke <kai@anyscale.com>
2020-08-03 11:18:34 -07:00
Richard Liaw
0c3b9ebeef
[tune/sgd] Document func_trainable and add checkpoint context ( #9739 )
...
Co-authored-by: krfricke <krfricke@users.noreply.github.com>
Co-authored-by: Amog Kamsetty <amogkam@users.noreply.github.com>
2020-07-30 09:46:37 -07:00
Bill Chambers
067c2752f8
[TUNE] Tune Docs re-organization ( #9600 )
...
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
2020-07-29 11:22:44 -07:00