{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "6df76a1f",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Using MLflow with Tune\n",
|
|
"\n",
|
|
"(tune-mlflow-ref)=\n",
|
|
"\n",
|
|
":::{warning}\n",
|
|
"If you are using these MLflow integrations with {ref}`ray-client`, it is recommended that you setup a\n",
|
|
"remote Mlflow tracking server instead of one that is backed by the local filesystem.\n",
|
|
":::\n",
|
|
"\n",
|
|
"[MLflow](https://mlflow.org/) is an open source platform to manage the ML lifecycle, including experimentation,\n",
|
|
"reproducibility, deployment, and a central model registry. It currently offers four components, including\n",
|
|
"MLflow Tracking to record and query experiments, including code, data, config, and results.\n",
|
|
"\n",
|
|
"```{image} /images/mlflow.png\n",
|
|
":align: center\n",
|
|
":alt: MLflow\n",
|
|
":height: 80px\n",
|
|
":target: https://www.mlflow.org/\n",
|
|
"```\n",
|
|
"\n",
|
|
"Ray Tune currently offers two lightweight integrations for MLflow Tracking.\n",
|
|
"One is the {ref}`MLflowLoggerCallback <tune-mlflow-logger>`, which automatically logs\n",
|
|
"metrics reported to Tune to the MLflow Tracking API.\n",
|
|
"\n",
|
|
"The other one is the {ref}`@mlflow_mixin <tune-mlflow-mixin>` decorator, which can be\n",
|
|
"used with the function API. It automatically\n",
|
|
"initializes the MLflow API with Tune's training information and creates a run for each Tune trial.\n",
|
|
"Then within your training function, you can just use the\n",
|
|
"MLflow like you would normally do, e.g. using `mlflow.log_metrics()` or even `mlflow.autolog()`\n",
|
|
"to log to your training process.\n",
|
|
"\n",
|
|
"```{contents}\n",
|
|
":backlinks: none\n",
|
|
":local: true\n",
|
|
"```\n",
|
|
"\n",
|
|
"## Running an MLflow Example\n",
|
|
"\n",
|
|
"In the following example we're going to use both of the above methods, namely the `MLflowLoggerCallback` and\n",
|
|
"the `mlflow_mixin` decorator to log metrics.\n",
|
|
"Let's start with a few crucial imports:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "b0e47339",
|
|
"metadata": {
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"import os\n",
|
|
"import tempfile\n",
|
|
"import time\n",
|
|
"\n",
|
|
"import mlflow\n",
|
|
"\n",
|
|
"from ray import air, tune\n",
|
|
"from ray.air import session\n",
|
|
"from ray.air.callbacks.mlflow import MLflowLoggerCallback\n",
|
|
"from ray.tune.integration.mlflow import mlflow_mixin"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "618b6935",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%% md\n"
|
|
}
|
|
},
|
|
"source": [
|
|
"Next, let's define an easy objective function (a Tune `Trainable`) that iteratively computes steps and evaluates\n",
|
|
"intermediate scores that we report to Tune."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "f449538e",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def evaluation_fn(step, width, height):\n",
|
|
" return (0.1 + width * step / 100) ** (-1) + height * 0.1\n",
|
|
"\n",
|
|
"\n",
|
|
"def easy_objective(config):\n",
|
|
" width, height = config[\"width\"], config[\"height\"]\n",
|
|
"\n",
|
|
" for step in range(config.get(\"steps\", 100)):\n",
|
|
" # Iterative training function - can be any arbitrary training procedure\n",
|
|
" intermediate_score = evaluation_fn(step, width, height)\n",
|
|
" # Feed the score back to Tune.\n",
|
|
" session.report({\"iterations\": step, \"mean_loss\": intermediate_score})\n",
|
|
" time.sleep(0.1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "722e5d2f",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%% md\n"
|
|
}
|
|
},
|
|
"source": [
|
|
"Given an MLFlow tracking URI, you can now simply use the `MLflowLoggerCallback` as a `callback` argument to\n",
|
|
"your `RunConfig()`:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "8e0b9ab7",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def tune_function(mlflow_tracking_uri, finish_fast=False):\n",
|
|
" tuner = tune.Tuner(\n",
|
|
" easy_objective,\n",
|
|
" tune_config=tune.TuneConfig(\n",
|
|
" num_samples=5\n",
|
|
" ),\n",
|
|
" run_config=air.RunConfig(\n",
|
|
" name=\"mlflow\",\n",
|
|
" callbacks=[\n",
|
|
" MLflowLoggerCallback(\n",
|
|
" tracking_uri=mlflow_tracking_uri,\n",
|
|
" experiment_name=\"example\",\n",
|
|
" save_artifact=True,\n",
|
|
" )\n",
|
|
" ],\n",
|
|
" ),\n",
|
|
" param_space={\n",
|
|
" \"width\": tune.randint(10, 100),\n",
|
|
" \"height\": tune.randint(0, 100),\n",
|
|
" \"steps\": 5 if finish_fast else 100,\n",
|
|
" },\n",
|
|
" )\n",
|
|
" results = tuner.fit()"
|
|
]
|
|
},
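{
"cell_type": "markdown",
"id": "3f2a1b0c",
"metadata": {},
"source": [
"`tuner.fit()` returns a `ResultGrid`. If you return it from `tune_function`, you could inspect the best trial\n",
"roughly like this (a sketch; `mean_loss` is the metric our trainable reports):\n",
"\n",
"```python\n",
"best_result = results.get_best_result(metric=\"mean_loss\", mode=\"min\")\n",
"print(best_result.config)                 # hyperparameters of the best trial\n",
"print(best_result.metrics[\"mean_loss\"])   # its last reported loss\n",
"```"
]
},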
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e086f110",
|
|
"metadata": {},
|
|
"source": [
|
|
"To use the `mlflow_mixin` decorator, you can simply decorate the objective function from earlier.\n",
|
|
"Note that we also use `mlflow.log_metrics(...)` to log metrics to MLflow.\n",
|
|
"Otherwise, the decorated version of our objective is identical to its original."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "144b8f39",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"@mlflow_mixin\n",
|
|
"def decorated_easy_objective(config):\n",
|
|
" # Hyperparameters\n",
|
|
" width, height = config[\"width\"], config[\"height\"]\n",
|
|
"\n",
|
|
" for step in range(config.get(\"steps\", 100)):\n",
|
|
" # Iterative training function - can be any arbitrary training procedure\n",
|
|
" intermediate_score = evaluation_fn(step, width, height)\n",
|
|
" # Log the metrics to mlflow\n",
|
|
" mlflow.log_metrics(dict(mean_loss=intermediate_score), step=step)\n",
|
|
" # Feed the score back to Tune.\n",
|
|
" session.report({\"iterations\": step, \"mean_loss\": intermediate_score})\n",
|
|
" time.sleep(0.1)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "dc480366",
|
|
"metadata": {},
|
|
"source": [
|
|
"With this new objective function ready, you can now create a Tune run with it as follows:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "4b9fe6be",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"def tune_decorated(mlflow_tracking_uri, finish_fast=False):\n",
|
|
" # Set the experiment, or create a new one if does not exist yet.\n",
|
|
" mlflow.set_tracking_uri(mlflow_tracking_uri)\n",
|
|
" mlflow.set_experiment(experiment_name=\"mixin_example\")\n",
|
|
" \n",
|
|
" tuner = tune.Tuner(\n",
|
|
" decorated_easy_objective,\n",
|
|
" tune_config=tune.TuneConfig(\n",
|
|
" num_samples=5\n",
|
|
" ),\n",
|
|
" run_config=air.RunConfig(\n",
|
|
" name=\"mlflow\",\n",
|
|
" ),\n",
|
|
" param_space={\n",
|
|
" \"width\": tune.randint(10, 100),\n",
|
|
" \"height\": tune.randint(0, 100),\n",
|
|
" \"steps\": 5 if finish_fast else 100,\n",
|
|
" \"mlflow\": {\n",
|
|
" \"experiment_name\": \"mixin_example\",\n",
|
|
" \"tracking_uri\": mlflow.get_tracking_uri(),\n",
|
|
" },\n",
|
|
" },\n",
|
|
" )\n",
|
|
" results = tuner.fit()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "915dfd30",
|
|
"metadata": {},
|
|
"source": [
|
|
"If you hapen to have an MLFlow tracking URI, you can set it below in the `mlflow_tracking_uri` variable and set\n",
|
|
"`smoke_test=False`.\n",
|
|
"Otherwise, you can just run a quick test of the `tune_function` and `tune_decorated` functions without using MLflow."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"id": "05d11774",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 16:27:41,371\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8271\u001b[39m\u001b[22m\n",
|
|
"2022-07-22 16:27:43,768\tWARNING function_trainable.py:619 -- Function checkpointing is disabled. This may result in unexpected behavior when using checkpointing features or certain schedulers. To enable, set the train function arguments to be `func(config, checkpoint_dir=None)`.\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"== Status ==<br>Current time: 2022-07-22 16:27:50 (running for 00:00:06.29)<br>Memory usage on this node: 10.1/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/5.63 GiB heap, 0.0/2.0 GiB objects<br>Result logdir: /Users/kai/ray_results/mlflow<br>Number of trials: 5/5 (5 TERMINATED)<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> height</th><th style=\"text-align: right;\"> width</th><th style=\"text-align: right;\"> loss</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th><th style=\"text-align: right;\"> iterations</th><th style=\"text-align: right;\"> neg_mean_loss</th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>easy_objective_d4e29_00000</td><td>TERMINATED</td><td>127.0.0.1:52551</td><td style=\"text-align: right;\"> 38</td><td style=\"text-align: right;\"> 23</td><td style=\"text-align: right;\">4.78039</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.549093</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -4.78039</td></tr>\n",
|
|
"<tr><td>easy_objective_d4e29_00001</td><td>TERMINATED</td><td>127.0.0.1:52561</td><td style=\"text-align: right;\"> 86</td><td style=\"text-align: right;\"> 88</td><td style=\"text-align: right;\">8.87624</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.548692</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -8.87624</td></tr>\n",
|
|
"<tr><td>easy_objective_d4e29_00002</td><td>TERMINATED</td><td>127.0.0.1:52562</td><td style=\"text-align: right;\"> 22</td><td style=\"text-align: right;\"> 95</td><td style=\"text-align: right;\">2.45641</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.587558</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -2.45641</td></tr>\n",
|
|
"<tr><td>easy_objective_d4e29_00003</td><td>TERMINATED</td><td>127.0.0.1:52563</td><td style=\"text-align: right;\"> 11</td><td style=\"text-align: right;\"> 81</td><td style=\"text-align: right;\">1.3994 </td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.560393</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -1.3994 </td></tr>\n",
|
|
"<tr><td>easy_objective_d4e29_00004</td><td>TERMINATED</td><td>127.0.0.1:52564</td><td style=\"text-align: right;\"> 21</td><td style=\"text-align: right;\"> 27</td><td style=\"text-align: right;\">2.94746</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.534 </td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -2.94746</td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br><br>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 16:27:44,945\tINFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for easy_objective_d4e29_00000:\n",
|
|
" date: 2022-07-22_16-27-47\n",
|
|
" done: false\n",
|
|
" experiment_id: 421feb6ca1cb40969430bd0ab995fe37\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 13.8\n",
|
|
" neg_mean_loss: -13.8\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52551\n",
|
|
" time_since_restore: 0.00015282630920410156\n",
|
|
" time_this_iter_s: 0.00015282630920410156\n",
|
|
" time_total_s: 0.00015282630920410156\n",
|
|
" timestamp: 1658503667\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d4e29_00000\n",
|
|
" warmup_time: 0.0036253929138183594\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00000:\n",
|
|
" date: 2022-07-22_16-27-48\n",
|
|
" done: true\n",
|
|
" experiment_id: 421feb6ca1cb40969430bd0ab995fe37\n",
|
|
" experiment_tag: 0_height=38,width=23\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 4.780392156862745\n",
|
|
" neg_mean_loss: -4.780392156862745\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52551\n",
|
|
" time_since_restore: 0.5490927696228027\n",
|
|
" time_this_iter_s: 0.12111282348632812\n",
|
|
" time_total_s: 0.5490927696228027\n",
|
|
" timestamp: 1658503668\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d4e29_00000\n",
|
|
" warmup_time: 0.0036253929138183594\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00001:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: false\n",
|
|
" experiment_id: 40ac54d80e854437b4126dca98a7f995\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 18.6\n",
|
|
" neg_mean_loss: -18.6\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52561\n",
|
|
" time_since_restore: 0.00013113021850585938\n",
|
|
" time_this_iter_s: 0.00013113021850585938\n",
|
|
" time_total_s: 0.00013113021850585938\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d4e29_00001\n",
|
|
" warmup_time: 0.002991914749145508\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00002:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: false\n",
|
|
" experiment_id: 23f2d0c4631e4a2abb5449ba68f80e8b\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 12.2\n",
|
|
" neg_mean_loss: -12.2\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52562\n",
|
|
" time_since_restore: 0.0001289844512939453\n",
|
|
" time_this_iter_s: 0.0001289844512939453\n",
|
|
" time_total_s: 0.0001289844512939453\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d4e29_00002\n",
|
|
" warmup_time: 0.002949953079223633\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00003:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: false\n",
|
|
" experiment_id: 7cb23325d6044f0f995b338d2e15f31e\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 11.1\n",
|
|
" neg_mean_loss: -11.1\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52563\n",
|
|
" time_since_restore: 0.00010609626770019531\n",
|
|
" time_this_iter_s: 0.00010609626770019531\n",
|
|
" time_total_s: 0.00010609626770019531\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d4e29_00003\n",
|
|
" warmup_time: 0.0026869773864746094\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00004:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: false\n",
|
|
" experiment_id: fc3b1add717842f4ae0b4882a1292f93\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 12.1\n",
|
|
" neg_mean_loss: -12.1\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52564\n",
|
|
" time_since_restore: 0.00011801719665527344\n",
|
|
" time_this_iter_s: 0.00011801719665527344\n",
|
|
" time_total_s: 0.00011801719665527344\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d4e29_00004\n",
|
|
" warmup_time: 0.0028209686279296875\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00001:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: true\n",
|
|
" experiment_id: 40ac54d80e854437b4126dca98a7f995\n",
|
|
" experiment_tag: 1_height=86,width=88\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 8.876243093922652\n",
|
|
" neg_mean_loss: -8.876243093922652\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52561\n",
|
|
" time_since_restore: 0.548691987991333\n",
|
|
" time_this_iter_s: 0.12308692932128906\n",
|
|
" time_total_s: 0.548691987991333\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d4e29_00001\n",
|
|
" warmup_time: 0.002991914749145508\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00004:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: true\n",
|
|
" experiment_id: fc3b1add717842f4ae0b4882a1292f93\n",
|
|
" experiment_tag: 4_height=21,width=27\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 2.947457627118644\n",
|
|
" neg_mean_loss: -2.947457627118644\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52564\n",
|
|
" time_since_restore: 0.5339996814727783\n",
|
|
" time_this_iter_s: 0.12359499931335449\n",
|
|
" time_total_s: 0.5339996814727783\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d4e29_00004\n",
|
|
" warmup_time: 0.0028209686279296875\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00003:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: true\n",
|
|
" experiment_id: 7cb23325d6044f0f995b338d2e15f31e\n",
|
|
" experiment_tag: 3_height=11,width=81\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 1.3994011976047904\n",
|
|
" neg_mean_loss: -1.3994011976047904\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52563\n",
|
|
" time_since_restore: 0.5603930950164795\n",
|
|
" time_this_iter_s: 0.12318706512451172\n",
|
|
" time_total_s: 0.5603930950164795\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d4e29_00003\n",
|
|
" warmup_time: 0.0026869773864746094\n",
|
|
" \n",
|
|
"Result for easy_objective_d4e29_00002:\n",
|
|
" date: 2022-07-22_16-27-50\n",
|
|
" done: true\n",
|
|
" experiment_id: 23f2d0c4631e4a2abb5449ba68f80e8b\n",
|
|
" experiment_tag: 2_height=22,width=95\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 2.4564102564102566\n",
|
|
" neg_mean_loss: -2.4564102564102566\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52562\n",
|
|
" time_since_restore: 0.5875582695007324\n",
|
|
" time_this_iter_s: 0.12340712547302246\n",
|
|
" time_total_s: 0.5875582695007324\n",
|
|
" timestamp: 1658503670\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d4e29_00002\n",
|
|
" warmup_time: 0.002949953079223633\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 16:27:51,033\tINFO tune.py:738 -- Total run time: 7.27 seconds (6.28 seconds for the tuning loop).\n",
|
|
"2022/07/22 16:27:51 INFO mlflow.tracking.fluent: Experiment with name 'mixin_example' does not exist. Creating a new experiment.\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"== Status ==<br>Current time: 2022-07-22 16:27:58 (running for 00:00:07.03)<br>Memory usage on this node: 10.4/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/5.63 GiB heap, 0.0/2.0 GiB objects<br>Result logdir: /Users/kai/ray_results/mlflow<br>Number of trials: 5/5 (5 TERMINATED)<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> height</th><th style=\"text-align: right;\"> width</th><th style=\"text-align: right;\"> loss</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th><th style=\"text-align: right;\"> iterations</th><th style=\"text-align: right;\"> neg_mean_loss</th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>decorated_easy_objective_d93b6_00000</td><td>TERMINATED</td><td>127.0.0.1:52581</td><td style=\"text-align: right;\"> 45</td><td style=\"text-align: right;\"> 51</td><td style=\"text-align: right;\"> 4.96729</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.460993</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -4.96729</td></tr>\n",
|
|
"<tr><td>decorated_easy_objective_d93b6_00001</td><td>TERMINATED</td><td>127.0.0.1:52598</td><td style=\"text-align: right;\"> 44</td><td style=\"text-align: right;\"> 94</td><td style=\"text-align: right;\"> 4.65907</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.434945</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -4.65907</td></tr>\n",
|
|
"<tr><td>decorated_easy_objective_d93b6_00002</td><td>TERMINATED</td><td>127.0.0.1:52599</td><td style=\"text-align: right;\"> 93</td><td style=\"text-align: right;\"> 25</td><td style=\"text-align: right;\">10.2091 </td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.471808</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -10.2091 </td></tr>\n",
|
|
"<tr><td>decorated_easy_objective_d93b6_00003</td><td>TERMINATED</td><td>127.0.0.1:52600</td><td style=\"text-align: right;\"> 40</td><td style=\"text-align: right;\"> 26</td><td style=\"text-align: right;\"> 4.87719</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.437302</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -4.87719</td></tr>\n",
|
|
"<tr><td>decorated_easy_objective_d93b6_00004</td><td>TERMINATED</td><td>127.0.0.1:52601</td><td style=\"text-align: right;\"> 16</td><td style=\"text-align: right;\"> 65</td><td style=\"text-align: right;\"> 1.97037</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\"> 0.468027</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\"> -1.97037</td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br><br>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for decorated_easy_objective_d93b6_00000:\n",
|
|
" date: 2022-07-22_16-27-54\n",
|
|
" done: false\n",
|
|
" experiment_id: 2d0d9fbc13c64acfa27153a5fb9aeb68\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 14.5\n",
|
|
" neg_mean_loss: -14.5\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52581\n",
|
|
" time_since_restore: 0.001725912094116211\n",
|
|
" time_this_iter_s: 0.001725912094116211\n",
|
|
" time_total_s: 0.001725912094116211\n",
|
|
" timestamp: 1658503674\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d93b6_00000\n",
|
|
" warmup_time: 0.20471811294555664\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00000:\n",
|
|
" date: 2022-07-22_16-27-54\n",
|
|
" done: true\n",
|
|
" experiment_id: 2d0d9fbc13c64acfa27153a5fb9aeb68\n",
|
|
" experiment_tag: 0_height=45,width=51\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 4.9672897196261685\n",
|
|
" neg_mean_loss: -4.9672897196261685\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52581\n",
|
|
" time_since_restore: 0.46099305152893066\n",
|
|
" time_this_iter_s: 0.10984206199645996\n",
|
|
" time_total_s: 0.46099305152893066\n",
|
|
" timestamp: 1658503674\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d93b6_00000\n",
|
|
" warmup_time: 0.20471811294555664\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00001:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: false\n",
|
|
" experiment_id: 4bec5377a38a47d7bae57f7502ff0312\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 14.4\n",
|
|
" neg_mean_loss: -14.4\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52598\n",
|
|
" time_since_restore: 0.0016498565673828125\n",
|
|
" time_this_iter_s: 0.0016498565673828125\n",
|
|
" time_total_s: 0.0016498565673828125\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d93b6_00001\n",
|
|
" warmup_time: 0.18288898468017578\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00003:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: false\n",
|
|
" experiment_id: 6868d31636df4c4a8e9ed91927120269\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 14.0\n",
|
|
" neg_mean_loss: -14.0\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52600\n",
|
|
" time_since_restore: 0.0016481876373291016\n",
|
|
" time_this_iter_s: 0.0016481876373291016\n",
|
|
" time_total_s: 0.0016481876373291016\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d93b6_00003\n",
|
|
" warmup_time: 0.17208290100097656\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00004:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: false\n",
|
|
" experiment_id: f021ddc2dc164413931c17cb593dfa12\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 11.6\n",
|
|
" neg_mean_loss: -11.6\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52601\n",
|
|
" time_since_restore: 0.0015459060668945312\n",
|
|
" time_this_iter_s: 0.0015459060668945312\n",
|
|
" time_total_s: 0.0015459060668945312\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d93b6_00004\n",
|
|
" warmup_time: 0.1808018684387207\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00002:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: false\n",
|
|
" experiment_id: a341941781824ea9b1a072b587e42a84\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 0\n",
|
|
" iterations_since_restore: 1\n",
|
|
" mean_loss: 19.3\n",
|
|
" neg_mean_loss: -19.3\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52599\n",
|
|
" time_since_restore: 0.0015799999237060547\n",
|
|
" time_this_iter_s: 0.0015799999237060547\n",
|
|
" time_total_s: 0.0015799999237060547\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: d93b6_00002\n",
|
|
" warmup_time: 0.1837329864501953\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00001:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: true\n",
|
|
" experiment_id: 4bec5377a38a47d7bae57f7502ff0312\n",
|
|
" experiment_tag: 1_height=44,width=94\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 4.659067357512954\n",
|
|
" neg_mean_loss: -4.659067357512954\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52598\n",
|
|
" time_since_restore: 0.43494510650634766\n",
|
|
" time_this_iter_s: 0.10719513893127441\n",
|
|
" time_total_s: 0.43494510650634766\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d93b6_00001\n",
|
|
" warmup_time: 0.18288898468017578\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00003:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: true\n",
|
|
" experiment_id: 6868d31636df4c4a8e9ed91927120269\n",
|
|
" experiment_tag: 3_height=40,width=26\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 4.87719298245614\n",
|
|
" neg_mean_loss: -4.87719298245614\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52600\n",
|
|
" time_since_restore: 0.4373021125793457\n",
|
|
" time_this_iter_s: 0.10880899429321289\n",
|
|
" time_total_s: 0.4373021125793457\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d93b6_00003\n",
|
|
" warmup_time: 0.17208290100097656\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00004:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: true\n",
|
|
" experiment_id: f021ddc2dc164413931c17cb593dfa12\n",
|
|
" experiment_tag: 4_height=16,width=65\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 1.9703703703703703\n",
|
|
" neg_mean_loss: -1.9703703703703703\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52601\n",
|
|
" time_since_restore: 0.46802687644958496\n",
|
|
" time_this_iter_s: 0.1077277660369873\n",
|
|
" time_total_s: 0.46802687644958496\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d93b6_00004\n",
|
|
" warmup_time: 0.1808018684387207\n",
|
|
" \n",
|
|
"Result for decorated_easy_objective_d93b6_00002:\n",
|
|
" date: 2022-07-22_16-27-57\n",
|
|
" done: true\n",
|
|
" experiment_id: a341941781824ea9b1a072b587e42a84\n",
|
|
" experiment_tag: 2_height=93,width=25\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations: 4\n",
|
|
" iterations_since_restore: 5\n",
|
|
" mean_loss: 10.209090909090909\n",
|
|
" neg_mean_loss: -10.209090909090909\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 52599\n",
|
|
" time_since_restore: 0.47180795669555664\n",
|
|
" time_this_iter_s: 0.10791492462158203\n",
|
|
" time_total_s: 0.47180795669555664\n",
|
|
" timestamp: 1658503677\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 5\n",
|
|
" trial_id: d93b6_00002\n",
|
|
" warmup_time: 0.1837329864501953\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 16:27:58,211\tINFO tune.py:738 -- Total run time: 7.15 seconds (7.01 seconds for the tuning loop).\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"smoke_test = True\n",
|
|
"\n",
|
|
"if smoke_test:\n",
|
|
" mlflow_tracking_uri = os.path.join(tempfile.gettempdir(), \"mlruns\")\n",
|
|
"else:\n",
|
|
" mlflow_tracking_uri = \"<MLFLOW_TRACKING_URI>\"\n",
|
|
"\n",
|
|
"tune_function(mlflow_tracking_uri, finish_fast=smoke_test)\n",
|
|
"if not smoke_test:\n",
|
|
" df = mlflow.search_runs(\n",
|
|
" [mlflow.get_experiment_by_name(\"example\").experiment_id]\n",
|
|
" )\n",
|
|
" print(df)\n",
|
|
"\n",
|
|
"tune_decorated(mlflow_tracking_uri, finish_fast=smoke_test)\n",
|
|
"if not smoke_test:\n",
|
|
" df = mlflow.search_runs(\n",
|
|
" [mlflow.get_experiment_by_name(\"mixin_example\").experiment_id]\n",
|
|
" )\n",
|
|
" print(df)"
|
|
]
|
|
},
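{
"cell_type": "markdown",
"id": "9c8d7e6f",
"metadata": {},
"source": [
"When you run against a real tracking URI instead of the smoke test, the `mlflow.search_runs(...)` calls above\n",
"return a pandas DataFrame describing the logged runs. A small sketch of slicing it, assuming MLflow's usual\n",
"`metrics.`-prefixed column names:\n",
"\n",
"```python\n",
"df = mlflow.search_runs([mlflow.get_experiment_by_name(\"example\").experiment_id])\n",
"# Sort the logged Tune trials by their final reported loss.\n",
"print(df.sort_values(\"metrics.mean_loss\")[[\"run_id\", \"metrics.mean_loss\"]])\n",
"```"
]
},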
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f0df0817",
|
|
"metadata": {},
|
|
"source": [
|
|
"This completes our Tune and MLflow walk-through.\n",
|
|
"In the following sections you can find more details on the API of the Tune-MLflow integration.\n",
|
|
"\n",
|
|
"## MLflow AutoLogging\n",
|
|
"\n",
|
|
"You can also check out {doc}`here </tune/examples/includes/mlflow_ptl_example>` for an example on how you can\n",
|
|
"leverage MLflow auto-logging, in this case with Pytorch Lightning\n",
|
|
"\n",
|
|
"## MLflow Logger API\n",
|
|
"\n",
|
|
"(tune-mlflow-logger)=\n",
|
|
"\n",
|
|
"```{eval-rst}\n",
|
|
".. autoclass:: ray.air.callbacks.mlflow.MLflowLoggerCallback\n",
|
|
" :noindex:\n",
|
|
"```\n",
|
|
"\n",
|
|
"## MLflow Mixin API\n",
|
|
"\n",
|
|
"(tune-mlflow-mixin)=\n",
|
|
"\n",
|
|
"```{eval-rst}\n",
|
|
".. autofunction:: ray.tune.integration.mlflow.mlflow_mixin\n",
|
|
" :noindex:\n",
|
|
"```\n",
|
|
"\n",
|
|
"## More MLflow Examples\n",
|
|
"\n",
|
|
"- {doc}`/tune/examples/includes/mlflow_ptl_example`: Example for using [MLflow](https://github.com/mlflow/mlflow/)\n",
|
|
" and [Pytorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) with Ray Tune."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.7"
|
|
},
|
|
"orphan": true
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|