mirror of
https://github.com/vale981/ray
synced 2025-03-10 05:16:49 -04:00

This PR updates the Ray AIR/Tune ipynb examples to use the Tuner() API instead of tune.run(). Signed-off-by: Kai Fricke <kai@anyscale.com> Signed-off-by: Richard Liaw <rliaw@berkeley.edu> Signed-off-by: Xiaowei Jiang <xwjiang2010@gmail.com> Signed-off-by: Kai Fricke <coding@kaifricke.com> Co-authored-by: Richard Liaw <rliaw@berkeley.edu> Co-authored-by: Xiaowei Jiang <xwjiang2010@gmail.com>
1148 lines
48 KiB
Text
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ecad719c",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Using Weights & Biases with Tune\n",
|
|
"\n",
|
|
"(tune-wandb-ref)=\n",
|
|
"\n",
|
|
"[Weights & Biases](https://www.wandb.ai/) (Wandb) is a tool for experiment\n",
|
"tracking, model optimization, and dataset versioning. It is very popular\n",
|
|
"in the machine learning and data science community for its superb visualization\n",
|
|
"tools.\n",
|
|
"\n",
|
|
"```{image} /images/wandb_logo_full.png\n",
|
|
":align: center\n",
|
|
":alt: Weights & Biases\n",
|
|
":height: 80px\n",
|
|
":target: https://www.wandb.ai/\n",
|
|
"```\n",
|
|
"\n",
|
|
"Ray Tune currently offers two lightweight integrations for Weights & Biases.\n",
|
|
"One is the {ref}`WandbLoggerCallback <tune-wandb-logger>`, which automatically logs\n",
|
|
"metrics reported to Tune to the Wandb API.\n",
|
|
"\n",
|
|
"The other one is the {ref}`@wandb_mixin <tune-wandb-mixin>` decorator, which can be\n",
|
|
"used with the function API. It automatically\n",
|
|
"initializes the Wandb API with Tune's training information. You can just use the\n",
|
|
"Wandb API like you would normally do, e.g. using `wandb.log()` to log your training\n",
|
|
"process.\n",
|
|
"\n",
|
|
"```{contents}\n",
|
|
":backlinks: none\n",
|
|
":local: true\n",
|
|
"```\n",
|
|
"\n",
|
|
"## Running A Weights & Biases Example\n",
|
|
"\n",
|
|
"In the following example we're going to use both of the above methods, namely the `WandbLoggerCallback` and\n",
|
|
"the `wandb_mixin` decorator to log metrics.\n",
|
|
"Let's start with a few crucial imports:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "100bcf8a",
|
|
"metadata": {
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"import numpy as np\n",
|
|
"import wandb\n",
|
|
"\n",
|
|
"from ray import air, tune\n",
|
|
"from ray.air import session\n",
|
|
"from ray.tune import Trainable\n",
|
|
"from ray.air.callbacks.wandb import WandbLoggerCallback\n",
|
"from ray.tune.integration.wandb import (\n",
"    WandbTrainableMixin,\n",
"    wandb_mixin,\n",
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "9346c0f6",
|
|
"metadata": {},
|
|
"source": [
|
"Next, let's define a simple `objective` function (a Tune `Trainable`) that reports a random loss to Tune.\n",
|
|
"The objective function itself is not important for this example, since we want to focus on the Weights & Biases\n",
|
|
"integration primarily."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "e8b4fc4d",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"def objective(config, checkpoint_dir=None):\n",
"    for i in range(30):\n",
"        loss = config[\"mean\"] + config[\"sd\"] * np.random.randn()\n",
"        session.report({\"loss\": loss})"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "831eed42",
|
|
"metadata": {},
|
|
"source": [
|
"Given that you provide an `api_key_file` pointing to your Weights & Biases API key, you can define a\n",
|
|
"simple grid-search Tune run using the `WandbLoggerCallback` as follows:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "52988599",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"def tune_function(api_key_file):\n",
"    \"\"\"Example for using a WandbLoggerCallback with the function API\"\"\"\n",
"    tuner = tune.Tuner(\n",
"        objective,\n",
"        tune_config=tune.TuneConfig(\n",
"            metric=\"loss\",\n",
"            mode=\"min\",\n",
"        ),\n",
"        run_config=air.RunConfig(\n",
"            callbacks=[\n",
"                WandbLoggerCallback(api_key_file=api_key_file, project=\"Wandb_example\")\n",
"            ],\n",
"        ),\n",
"        param_space={\n",
"            \"mean\": tune.grid_search([1, 2, 3, 4, 5]),\n",
"            \"sd\": tune.uniform(0.2, 0.8),\n",
"        },\n",
"    )\n",
"    results = tuner.fit()\n",
"\n",
"    return results.get_best_result().config"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "e24c05fa",
|
|
"metadata": {},
|
|
"source": [
|
|
"To use the `wandb_mixin` decorator, you can simply decorate the objective function from earlier.\n",
|
|
"Note that we also use `wandb.log(...)` to log the `loss` to Weights & Biases as a dictionary.\n",
|
|
"Otherwise, the decorated version of our objective is identical to its original."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "5e30d5e7",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"@wandb_mixin\n",
"def decorated_objective(config, checkpoint_dir=None):\n",
"    for i in range(30):\n",
"        loss = config[\"mean\"] + config[\"sd\"] * np.random.randn()\n",
"        session.report({\"loss\": loss})\n",
"        wandb.log(dict(loss=loss))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "04040bcb",
|
|
"metadata": {},
|
|
"source": [
|
|
"With the `decorated_objective` defined, running a Tune experiment is as simple as providing this objective and\n",
|
"passing the `api_key_file` to the `wandb` key of your Tune `param_space` (the trial config):"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "d4fbd368",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"def tune_decorated(api_key_file):\n",
"    \"\"\"Example for using the @wandb_mixin decorator with the function API\"\"\"\n",
"    tuner = tune.Tuner(\n",
"        decorated_objective,\n",
"        tune_config=tune.TuneConfig(\n",
"            metric=\"loss\",\n",
"            mode=\"min\",\n",
"        ),\n",
"        param_space={\n",
"            \"mean\": tune.grid_search([1, 2, 3, 4, 5]),\n",
"            \"sd\": tune.uniform(0.2, 0.8),\n",
"            \"wandb\": {\"api_key_file\": api_key_file, \"project\": \"Wandb_example\"},\n",
"        },\n",
"    )\n",
"    results = tuner.fit()\n",
"\n",
"    return results.get_best_result().config"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f9521481",
|
|
"metadata": {},
|
|
"source": [
|
|
"Finally, you can also define a class-based Tune `Trainable` by using the `WandbTrainableMixin` to define your objective:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"id": "d27a7a35",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"class WandbTrainable(WandbTrainableMixin, Trainable):\n",
"    def step(self):\n",
"        for i in range(30):\n",
"            loss = self.config[\"mean\"] + self.config[\"sd\"] * np.random.randn()\n",
"            wandb.log({\"loss\": loss})\n",
"        return {\"loss\": loss, \"done\": True}"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "fa189bb2",
|
|
"metadata": {},
|
|
"source": [
|
|
"Running Tune with this `WandbTrainable` works exactly the same as with the function API.\n",
|
|
"The below `tune_trainable` function differs from `tune_decorated` above only in the first argument we pass to\n",
|
|
"`Tuner()`:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "6e546cc2",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
"def tune_trainable(api_key_file):\n",
"    \"\"\"Example for using a WandbTrainableMixin with the class API\"\"\"\n",
"    tuner = tune.Tuner(\n",
"        WandbTrainable,\n",
"        tune_config=tune.TuneConfig(\n",
"            metric=\"loss\",\n",
"            mode=\"min\",\n",
"        ),\n",
"        param_space={\n",
"            \"mean\": tune.grid_search([1, 2, 3, 4, 5]),\n",
"            \"sd\": tune.uniform(0.2, 0.8),\n",
"            \"wandb\": {\"api_key_file\": api_key_file, \"project\": \"Wandb_example\"},\n",
"        },\n",
"    )\n",
"    results = tuner.fit()\n",
"\n",
"    return results.get_best_result().config"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "0b736172",
|
|
"metadata": {},
|
|
"source": [
|
|
"Since you may not have an API key for Wandb, we can _mock_ the Wandb logger and test all three of our training\n",
|
|
"functions as follows.\n",
|
|
"If you do have an API key file, make sure to set `mock_api` to `False` and pass in the right `api_key_file` below."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "e0e7f481",
|
|
"metadata": {
|
|
"pycharm": {
|
|
"name": "#%%\n"
|
|
},
|
|
"vscode": {
|
|
"languageId": "python"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:38,323\tINFO services.py:1483 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8266\u001b[39m\u001b[22m\n",
|
|
"/Users/kai/coding/ray/python/ray/tune/trainable/function_trainable.py:643: DeprecationWarning: `checkpoint_dir` in `func(config, checkpoint_dir)` is being deprecated. To save and load checkpoint in trainable functions, please use the `ray.air.session` API:\n",
|
|
"\n",
|
|
"from ray.air import session\n",
|
|
"\n",
|
|
"def train(config):\n",
"    # ...\n",
"    session.report({\"metric\": metric}, checkpoint=checkpoint)\n",
|
|
"\n",
|
|
"For more information please see https://docs.ray.io/en/master/ray-air/key-concepts.html#session\n",
|
|
"\n",
|
|
" DeprecationWarning,\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"== Status ==<br>Current time: 2022-07-22 15:39:47 (running for 00:00:06.01)<br>Memory usage on this node: 9.9/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/5.52 GiB heap, 0.0/2.0 GiB objects<br>Current best trial: 1e575_00000 with loss=0.6535282890948189 and parameters={'mean': 1, 'sd': 0.6540704916919089}<br>Result logdir: /Users/kai/ray_results/objective_2022-07-22_15-39-35<br>Number of trials: 5/5 (5 TERMINATED)<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> mean</th><th style=\"text-align: right;\"> sd</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th><th style=\"text-align: right;\"> loss</th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>objective_1e575_00000</td><td>TERMINATED</td><td>127.0.0.1:47932</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\">0.65407 </td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.203522</td><td style=\"text-align: right;\">0.653528</td></tr>\n",
|
|
"<tr><td>objective_1e575_00001</td><td>TERMINATED</td><td>127.0.0.1:47941</td><td style=\"text-align: right;\"> 2</td><td style=\"text-align: right;\">0.72087 </td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.314281</td><td style=\"text-align: right;\">1.14091 </td></tr>\n",
|
|
"<tr><td>objective_1e575_00002</td><td>TERMINATED</td><td>127.0.0.1:47942</td><td style=\"text-align: right;\"> 3</td><td style=\"text-align: right;\">0.680016</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.43947 </td><td style=\"text-align: right;\">2.11278 </td></tr>\n",
|
|
"<tr><td>objective_1e575_00003</td><td>TERMINATED</td><td>127.0.0.1:47943</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\">0.296117</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.442453</td><td style=\"text-align: right;\">4.33397 </td></tr>\n",
|
|
"<tr><td>objective_1e575_00004</td><td>TERMINATED</td><td>127.0.0.1:47944</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\">0.358219</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.362729</td><td style=\"text-align: right;\">5.41971 </td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br><br>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:41,596\tINFO plugin_schema_manager.py:52 -- Loading the default runtime env schemas: ['/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/working_dir_schema.json', '/Users/kai/coding/ray/python/ray/_private/runtime_env/../../runtime_env/schemas/pip_schema.json'].\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for objective_1e575_00000:\n",
|
|
" date: 2022-07-22_15-39-44\n",
|
|
" done: false\n",
|
|
" experiment_id: 60ffbe63fc834195a37fabc078985531\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 0.4005309978356091\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47932\n",
|
|
" time_since_restore: 0.0001418590545654297\n",
|
|
" time_this_iter_s: 0.0001418590545654297\n",
|
|
" time_total_s: 0.0001418590545654297\n",
|
|
" timestamp: 1658500784\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 1e575_00000\n",
|
|
" warmup_time: 0.002913236618041992\n",
|
|
" \n",
|
|
"Result for objective_1e575_00000:\n",
|
|
" date: 2022-07-22_15-39-44\n",
|
|
" done: true\n",
|
|
" experiment_id: 60ffbe63fc834195a37fabc078985531\n",
|
|
" experiment_tag: 0_mean=1,sd=0.6541\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 0.6535282890948189\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47932\n",
|
|
" time_since_restore: 0.203521728515625\n",
|
|
" time_this_iter_s: 0.003339052200317383\n",
|
|
" time_total_s: 0.203521728515625\n",
|
|
" timestamp: 1658500784\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 1e575_00000\n",
|
|
" warmup_time: 0.002913236618041992\n",
|
|
" \n",
|
|
"Result for objective_1e575_00002:\n",
|
|
" date: 2022-07-22_15-39-46\n",
|
|
" done: false\n",
|
|
" experiment_id: c812a92f07134341a2908abc6e315061\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 2.7700164667438716\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47942\n",
|
|
" time_since_restore: 0.00013971328735351562\n",
|
|
" time_this_iter_s: 0.00013971328735351562\n",
|
|
" time_total_s: 0.00013971328735351562\n",
|
|
" timestamp: 1658500786\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 1e575_00002\n",
|
|
" warmup_time: 0.002918720245361328\n",
|
|
" \n",
|
|
"Result for objective_1e575_00003:\n",
|
|
" date: 2022-07-22_15-39-46\n",
|
|
" done: false\n",
|
|
" experiment_id: b97d28ec439342ae8dd7c7fa4ac4ccca\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 3.895346250529465\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47943\n",
|
|
" time_since_restore: 0.00013494491577148438\n",
|
|
" time_this_iter_s: 0.00013494491577148438\n",
|
|
" time_total_s: 0.00013494491577148438\n",
|
|
" timestamp: 1658500786\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 1e575_00003\n",
|
|
" warmup_time: 0.0031499862670898438\n",
|
|
" \n",
|
|
"Result for objective_1e575_00001:\n",
|
|
" date: 2022-07-22_15-39-46\n",
|
|
" done: false\n",
|
|
" experiment_id: 7034e40ba23f495eb6974ad5bda1406d\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 1.8250068029519693\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47941\n",
|
|
" time_since_restore: 0.00015974044799804688\n",
|
|
" time_this_iter_s: 0.00015974044799804688\n",
|
|
" time_total_s: 0.00015974044799804688\n",
|
|
" timestamp: 1658500786\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 1e575_00001\n",
|
|
" warmup_time: 0.0026862621307373047\n",
|
|
" \n",
|
|
"Result for objective_1e575_00004:\n",
|
|
" date: 2022-07-22_15-39-46\n",
|
|
" done: false\n",
|
|
" experiment_id: 6b7bf17ee7444b22b809897292864e19\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 5.098807619369106\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47944\n",
|
|
" time_since_restore: 0.00012803077697753906\n",
|
|
" time_this_iter_s: 0.00012803077697753906\n",
|
|
" time_total_s: 0.00012803077697753906\n",
|
|
" timestamp: 1658500786\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 1e575_00004\n",
|
|
" warmup_time: 0.002666950225830078\n",
|
|
" \n",
|
|
"Result for objective_1e575_00002:\n",
|
|
" date: 2022-07-22_15-39-47\n",
|
|
" done: true\n",
|
|
" experiment_id: c812a92f07134341a2908abc6e315061\n",
|
|
" experiment_tag: 2_mean=3,sd=0.6800\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 2.1127773612837975\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47942\n",
|
|
" time_since_restore: 0.4394698143005371\n",
|
|
" time_this_iter_s: 0.005173921585083008\n",
|
|
" time_total_s: 0.4394698143005371\n",
|
|
" timestamp: 1658500787\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 1e575_00002\n",
|
|
" warmup_time: 0.002918720245361328\n",
|
|
" \n",
|
|
"Result for objective_1e575_00001:\n",
|
|
" date: 2022-07-22_15-39-47\n",
|
|
" done: true\n",
|
|
" experiment_id: 7034e40ba23f495eb6974ad5bda1406d\n",
|
|
" experiment_tag: 1_mean=2,sd=0.7209\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 1.1409060371452806\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47941\n",
|
|
" time_since_restore: 0.31428098678588867\n",
|
|
" time_this_iter_s: 0.008217096328735352\n",
|
|
" time_total_s: 0.31428098678588867\n",
|
|
" timestamp: 1658500787\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 1e575_00001\n",
|
|
" warmup_time: 0.0026862621307373047\n",
|
|
" \n",
|
|
"Result for objective_1e575_00003:\n",
|
|
" date: 2022-07-22_15-39-47\n",
|
|
" done: true\n",
|
|
" experiment_id: b97d28ec439342ae8dd7c7fa4ac4ccca\n",
|
|
" experiment_tag: 3_mean=4,sd=0.2961\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 4.333967406156947\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47943\n",
|
|
" time_since_restore: 0.44245290756225586\n",
|
|
" time_this_iter_s: 0.005827903747558594\n",
|
|
" time_total_s: 0.44245290756225586\n",
|
|
" timestamp: 1658500787\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 1e575_00003\n",
|
|
" warmup_time: 0.0031499862670898438\n",
|
|
" \n",
|
|
"Result for objective_1e575_00004:\n",
|
|
" date: 2022-07-22_15-39-47\n",
|
|
" done: true\n",
|
|
" experiment_id: 6b7bf17ee7444b22b809897292864e19\n",
|
|
" experiment_tag: 4_mean=5,sd=0.3582\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 5.419707275520466\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47944\n",
|
|
" time_since_restore: 0.3627290725708008\n",
|
|
" time_this_iter_s: 0.006065845489501953\n",
|
|
" time_total_s: 0.3627290725708008\n",
|
|
" timestamp: 1658500787\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 1e575_00004\n",
|
|
" warmup_time: 0.002666950225830078\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:47,478\tINFO tune.py:738 -- Total run time: 6.95 seconds (6.00 seconds for the tuning loop).\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"== Status ==<br>Current time: 2022-07-22 15:39:53 (running for 00:00:05.64)<br>Memory usage on this node: 9.8/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/5.52 GiB heap, 0.0/2.0 GiB objects<br>Current best trial: 227e1_00000 with loss=1.4158135642199134 and parameters={'mean': 1, 'sd': 0.35625806806413973, 'wandb': {'api_key_file': '/var/folders/b2/0_91bd757rz02lrmr920v0gw0000gn/T/tmp9qec20eq', 'project': 'Wandb_example'}}<br>Result logdir: /Users/kai/ray_results/objective_2022-07-22_15-39-47<br>Number of trials: 5/5 (5 TERMINATED)<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> mean</th><th style=\"text-align: right;\"> sd</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th><th style=\"text-align: right;\"> loss</th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>objective_227e1_00000</td><td>TERMINATED</td><td>127.0.0.1:47968</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\">0.356258</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.0869601</td><td style=\"text-align: right;\">1.41581</td></tr>\n",
|
|
"<tr><td>objective_227e1_00001</td><td>TERMINATED</td><td>127.0.0.1:47973</td><td style=\"text-align: right;\"> 2</td><td style=\"text-align: right;\">0.411041</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.371924 </td><td style=\"text-align: right;\">2.9165 </td></tr>\n",
|
|
"<tr><td>objective_227e1_00002</td><td>TERMINATED</td><td>127.0.0.1:47974</td><td style=\"text-align: right;\"> 3</td><td style=\"text-align: right;\">0.359191</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.305055 </td><td style=\"text-align: right;\">2.57809</td></tr>\n",
|
|
"<tr><td>objective_227e1_00003</td><td>TERMINATED</td><td>127.0.0.1:47975</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\">0.543202</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.218044 </td><td style=\"text-align: right;\">5.06532</td></tr>\n",
|
|
"<tr><td>objective_227e1_00004</td><td>TERMINATED</td><td>127.0.0.1:47976</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\">0.777638</td><td style=\"text-align: right;\"> 30</td><td style=\"text-align: right;\"> 0.287682 </td><td style=\"text-align: right;\">6.36554</td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br><br>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for objective_227e1_00000:\n",
|
|
" date: 2022-07-22_15-39-50\n",
|
|
" done: false\n",
|
|
" experiment_id: e80ef3e4843c41068c733322d48e0817\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 0.27641082730463906\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47968\n",
|
|
" time_since_restore: 0.0001361370086669922\n",
|
|
" time_this_iter_s: 0.0001361370086669922\n",
|
|
" time_total_s: 0.0001361370086669922\n",
|
|
" timestamp: 1658500790\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 227e1_00000\n",
|
|
" warmup_time: 0.003004789352416992\n",
|
|
" \n",
|
|
"Result for objective_227e1_00000:\n",
|
|
" date: 2022-07-22_15-39-50\n",
|
|
" done: true\n",
|
|
" experiment_id: e80ef3e4843c41068c733322d48e0817\n",
|
|
" experiment_tag: 0_mean=1,sd=0.3563\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 1.4158135642199134\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47968\n",
|
|
" time_since_restore: 0.0869600772857666\n",
|
|
" time_this_iter_s: 0.0022199153900146484\n",
|
|
" time_total_s: 0.0869600772857666\n",
|
|
" timestamp: 1658500790\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 227e1_00000\n",
|
|
" warmup_time: 0.003004789352416992\n",
|
|
" \n",
|
|
"Result for objective_227e1_00001:\n",
|
|
" date: 2022-07-22_15-39-52\n",
|
|
" done: false\n",
|
|
" experiment_id: bf0685a616354a02af154ac3601a2109\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 2.058177604134134\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47973\n",
|
|
" time_since_restore: 0.00015783309936523438\n",
|
|
" time_this_iter_s: 0.00015783309936523438\n",
|
|
" time_total_s: 0.00015783309936523438\n",
|
|
" timestamp: 1658500792\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 227e1_00001\n",
|
|
" warmup_time: 0.0029697418212890625\n",
|
|
" \n",
|
|
"Result for objective_227e1_00004:\n",
|
|
" date: 2022-07-22_15-39-52\n",
|
|
" done: false\n",
|
|
" experiment_id: 1f45d26f052c443d8a4aef3279f4e29e\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 5.383672927239436\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47976\n",
|
|
" time_since_restore: 0.00013184547424316406\n",
|
|
" time_this_iter_s: 0.00013184547424316406\n",
|
|
" time_total_s: 0.00013184547424316406\n",
|
|
" timestamp: 1658500792\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 227e1_00004\n",
|
|
" warmup_time: 0.0028159618377685547\n",
|
|
" \n",
|
|
"Result for objective_227e1_00003:\n",
|
|
" date: 2022-07-22_15-39-52\n",
|
|
" done: false\n",
|
|
" experiment_id: c4b18bff67ec45939614ad8b66cecb8c\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 2.6242029842903367\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47975\n",
|
|
" time_since_restore: 0.00014901161193847656\n",
|
|
" time_this_iter_s: 0.00014901161193847656\n",
|
|
" time_total_s: 0.00014901161193847656\n",
|
|
" timestamp: 1658500792\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 227e1_00003\n",
|
|
" warmup_time: 0.0026941299438476562\n",
|
|
" \n",
|
|
"Result for objective_227e1_00002:\n",
|
|
" date: 2022-07-22_15-39-52\n",
|
|
" done: false\n",
|
|
" experiment_id: b84e7701625e49ef8056680eb616b611\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 3.2091889147367088\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47974\n",
|
|
" time_since_restore: 0.00016427040100097656\n",
|
|
" time_this_iter_s: 0.00016427040100097656\n",
|
|
" time_total_s: 0.00016427040100097656\n",
|
|
" timestamp: 1658500792\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 227e1_00002\n",
|
|
" warmup_time: 0.0029571056365966797\n",
|
|
" \n",
|
|
"Result for objective_227e1_00003:\n",
|
|
" date: 2022-07-22_15-39-53\n",
|
|
" done: true\n",
|
|
" experiment_id: c4b18bff67ec45939614ad8b66cecb8c\n",
|
|
" experiment_tag: 3_mean=4,sd=0.5432\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 5.065320265027247\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47975\n",
|
|
" time_since_restore: 0.21804404258728027\n",
|
|
" time_this_iter_s: 0.011553049087524414\n",
|
|
" time_total_s: 0.21804404258728027\n",
|
|
" timestamp: 1658500793\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 227e1_00003\n",
|
|
" warmup_time: 0.0026941299438476562\n",
|
|
" \n",
|
|
"Result for objective_227e1_00002:\n",
|
|
" date: 2022-07-22_15-39-53\n",
|
|
" done: true\n",
|
|
" experiment_id: b84e7701625e49ef8056680eb616b611\n",
|
|
" experiment_tag: 2_mean=3,sd=0.3592\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 2.578088712628635\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47974\n",
|
|
" time_since_restore: 0.3050551414489746\n",
|
|
" time_this_iter_s: 0.005466938018798828\n",
|
|
" time_total_s: 0.3050551414489746\n",
|
|
" timestamp: 1658500793\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 227e1_00002\n",
|
|
" warmup_time: 0.0029571056365966797\n",
|
|
" \n",
|
|
"Result for objective_227e1_00001:\n",
|
|
" date: 2022-07-22_15-39-53\n",
|
|
" done: true\n",
|
|
" experiment_id: bf0685a616354a02af154ac3601a2109\n",
|
|
" experiment_tag: 1_mean=2,sd=0.4110\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 2.9165001549045844\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47973\n",
|
|
" time_since_restore: 0.37192392349243164\n",
|
|
" time_this_iter_s: 0.007360935211181641\n",
|
|
" time_total_s: 0.37192392349243164\n",
|
|
" timestamp: 1658500793\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 227e1_00001\n",
|
|
" warmup_time: 0.0029697418212890625\n",
|
|
" \n",
|
|
"Result for objective_227e1_00004:\n",
|
|
" date: 2022-07-22_15-39-53\n",
|
|
" done: true\n",
|
|
" experiment_id: 1f45d26f052c443d8a4aef3279f4e29e\n",
|
|
" experiment_tag: 4_mean=5,sd=0.7776\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 30\n",
|
|
" loss: 6.365540480426036\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47976\n",
|
|
" time_since_restore: 0.28768181800842285\n",
|
|
" time_this_iter_s: 0.003290891647338867\n",
|
|
" time_total_s: 0.28768181800842285\n",
|
|
" timestamp: 1658500793\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 30\n",
|
|
" trial_id: 227e1_00004\n",
|
|
" warmup_time: 0.0028159618377685547\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:53,254\tINFO tune.py:738 -- Total run time: 5.76 seconds (5.63 seconds for the tuning loop).\n"
|
|
]
|
|
},
|
|
{
|
|
"data": {
|
|
"text/html": [
|
|
"== Status ==<br>Current time: 2022-07-22 15:39:59 (running for 00:00:06.06)<br>Memory usage on this node: 10.1/16.0 GiB<br>Using FIFO scheduling algorithm.<br>Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/5.52 GiB heap, 0.0/2.0 GiB objects<br>Current best trial: 25f04_00000 with loss=0.9941371354505734 and parameters={'mean': 1, 'sd': 0.5245309522439918}<br>Result logdir: /Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53<br>Number of trials: 5/5 (5 ERROR)<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th>status </th><th>loc </th><th style=\"text-align: right;\"> mean</th><th style=\"text-align: right;\"> sd</th><th style=\"text-align: right;\"> iter</th><th style=\"text-align: right;\"> total time (s)</th><th style=\"text-align: right;\"> loss</th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>WandbTrainable_25f04_00000</td><td>ERROR </td><td>127.0.0.1:47994</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\">0.524531</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\"> 0.000827789</td><td style=\"text-align: right;\">0.994137</td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00001</td><td>ERROR </td><td>127.0.0.1:48005</td><td style=\"text-align: right;\"> 2</td><td style=\"text-align: right;\">0.515265</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\"> 0.00108528 </td><td style=\"text-align: right;\">2.31254 </td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00002</td><td>ERROR </td><td>127.0.0.1:48006</td><td style=\"text-align: right;\"> 3</td><td style=\"text-align: right;\">0.56327 </td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\"> 0.00111198 </td><td style=\"text-align: right;\">3.43952 </td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00003</td><td>ERROR </td><td>127.0.0.1:48007</td><td style=\"text-align: right;\"> 4</td><td style=\"text-align: right;\">0.507054</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\"> 0.000993013</td><td style=\"text-align: right;\">4.53341 </td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00004</td><td>ERROR </td><td>127.0.0.1:48008</td><td style=\"text-align: right;\"> 5</td><td style=\"text-align: right;\">0.372142</td><td style=\"text-align: right;\"> 1</td><td style=\"text-align: right;\"> 0.000849962</td><td style=\"text-align: right;\">5.13408 </td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br>Number of errored trials: 5<br><table>\n",
|
|
"<thead>\n",
|
|
"<tr><th>Trial name </th><th style=\"text-align: right;\"> # failures</th><th>error file </th></tr>\n",
|
|
"</thead>\n",
|
|
"<tbody>\n",
|
|
"<tr><td>WandbTrainable_25f04_00000</td><td style=\"text-align: right;\"> 1</td><td>/Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53/WandbTrainable_25f04_00000_0_mean=1,sd=0.5245_2022-07-22_15-39-53/error.txt</td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00001</td><td style=\"text-align: right;\"> 1</td><td>/Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53/WandbTrainable_25f04_00001_1_mean=2,sd=0.5153_2022-07-22_15-39-56/error.txt</td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00002</td><td style=\"text-align: right;\"> 1</td><td>/Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53/WandbTrainable_25f04_00002_2_mean=3,sd=0.5633_2022-07-22_15-39-56/error.txt</td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00003</td><td style=\"text-align: right;\"> 1</td><td>/Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53/WandbTrainable_25f04_00003_3_mean=4,sd=0.5071_2022-07-22_15-39-56/error.txt</td></tr>\n",
|
|
"<tr><td>WandbTrainable_25f04_00004</td><td style=\"text-align: right;\"> 1</td><td>/Users/kai/ray_results/WandbTrainable_2022-07-22_15-39-53/WandbTrainable_25f04_00004_4_mean=5,sd=0.3721_2022-07-22_15-39-56/error.txt</td></tr>\n",
|
|
"</tbody>\n",
|
|
"</table><br>"
|
|
],
|
|
"text/plain": [
|
|
"<IPython.core.display.HTML object>"
|
|
]
|
|
},
|
|
"metadata": {},
|
|
"output_type": "display_data"
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:56,146\tERROR trial_runner.py:921 -- Trial WandbTrainable_25f04_00000: Error processing event.\n",
|
|
"ray.exceptions.RayTaskError(NotImplementedError): \u001b[36mray::WandbTrainable.save()\u001b[39m (pid=47994, ip=127.0.0.1, repr=<__main__.WandbTrainable object at 0x11052de10>)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 449, in save\n",
|
|
" checkpoint_dict_or_path = self.save_checkpoint(checkpoint_dir)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 1014, in save_checkpoint\n",
|
|
" raise NotImplementedError\n",
|
|
"NotImplementedError\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for WandbTrainable_25f04_00000:\n",
|
|
" date: 2022-07-22_15-39-56\n",
|
|
" done: true\n",
|
|
" experiment_id: c0ac6bf4f2af45368a3c5c3e14e47115\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 0.9941371354505734\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47994\n",
|
|
" time_since_restore: 0.000827789306640625\n",
|
|
" time_this_iter_s: 0.000827789306640625\n",
|
|
" time_total_s: 0.000827789306640625\n",
|
|
" timestamp: 1658500796\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00000\n",
|
|
" warmup_time: 0.0031821727752685547\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00000:\n",
|
|
" date: 2022-07-22_15-39-56\n",
|
|
" done: true\n",
|
|
" experiment_id: c0ac6bf4f2af45368a3c5c3e14e47115\n",
|
|
" experiment_tag: 0_mean=1,sd=0.5245\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 0.9941371354505734\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 47994\n",
|
|
" time_since_restore: 0.000827789306640625\n",
|
|
" time_this_iter_s: 0.000827789306640625\n",
|
|
" time_total_s: 0.000827789306640625\n",
|
|
" timestamp: 1658500796\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00000\n",
|
|
" warmup_time: 0.0031821727752685547\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00002:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: b4174fe95248493e8dedfcbc67549339\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 3.4395203958985836\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48006\n",
|
|
" time_since_restore: 0.0011119842529296875\n",
|
|
" time_this_iter_s: 0.0011119842529296875\n",
|
|
" time_total_s: 0.0011119842529296875\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00002\n",
|
|
" warmup_time: 0.004413127899169922\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:59,299\tERROR trial_runner.py:921 -- Trial WandbTrainable_25f04_00002: Error processing event.\n",
|
|
"ray.exceptions.RayTaskError(NotImplementedError): \u001b[36mray::WandbTrainable.save()\u001b[39m (pid=48006, ip=127.0.0.1, repr=<__main__.WandbTrainable object at 0x11a54c8d0>)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 449, in save\n",
|
|
" checkpoint_dict_or_path = self.save_checkpoint(checkpoint_dir)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 1014, in save_checkpoint\n",
|
|
" raise NotImplementedError\n",
|
|
"NotImplementedError\n",
|
|
"2022-07-22 15:39:59,305\tERROR trial_runner.py:921 -- Trial WandbTrainable_25f04_00004: Error processing event.\n",
|
|
"ray.exceptions.RayTaskError(NotImplementedError): \u001b[36mray::WandbTrainable.save()\u001b[39m (pid=48008, ip=127.0.0.1, repr=<__main__.WandbTrainable object at 0x11c314d90>)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 449, in save\n",
|
|
" checkpoint_dict_or_path = self.save_checkpoint(checkpoint_dir)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 1014, in save_checkpoint\n",
|
|
" raise NotImplementedError\n",
|
|
"NotImplementedError\n",
|
|
"2022-07-22 15:39:59,310\tERROR trial_runner.py:921 -- Trial WandbTrainable_25f04_00001: Error processing event.\n",
|
|
"ray.exceptions.RayTaskError(NotImplementedError): \u001b[36mray::WandbTrainable.save()\u001b[39m (pid=48005, ip=127.0.0.1, repr=<__main__.WandbTrainable object at 0x10e56fb90>)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 449, in save\n",
|
|
" checkpoint_dict_or_path = self.save_checkpoint(checkpoint_dir)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 1014, in save_checkpoint\n",
|
|
" raise NotImplementedError\n",
|
|
"NotImplementedError\n",
|
|
"2022-07-22 15:39:59,324\tERROR trial_runner.py:921 -- Trial WandbTrainable_25f04_00003: Error processing event.\n",
|
|
"ray.exceptions.RayTaskError(NotImplementedError): \u001b[36mray::WandbTrainable.save()\u001b[39m (pid=48007, ip=127.0.0.1, repr=<__main__.WandbTrainable object at 0x10b49ee50>)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 449, in save\n",
|
|
" checkpoint_dict_or_path = self.save_checkpoint(checkpoint_dir)\n",
|
|
" File \"/Users/kai/coding/ray/python/ray/tune/trainable/trainable.py\", line 1014, in save_checkpoint\n",
|
|
" raise NotImplementedError\n",
|
|
"NotImplementedError\n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Result for WandbTrainable_25f04_00001:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: b0920f67a88f4993b7ec85dee2f78022\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 2.3125440070079093\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48005\n",
|
|
" time_since_restore: 0.0010852813720703125\n",
|
|
" time_this_iter_s: 0.0010852813720703125\n",
|
|
" time_total_s: 0.0010852813720703125\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00001\n",
|
|
" warmup_time: 0.0049626827239990234\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00004:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: 4435b2105eb24fbaba4778e33ce2e1a9\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 5.134083536061109\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48008\n",
|
|
" time_since_restore: 0.0008499622344970703\n",
|
|
" time_this_iter_s: 0.0008499622344970703\n",
|
|
" time_total_s: 0.0008499622344970703\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00004\n",
|
|
" warmup_time: 0.0031480789184570312\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00002:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: b4174fe95248493e8dedfcbc67549339\n",
|
|
" experiment_tag: 2_mean=3,sd=0.5633\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 3.4395203958985836\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48006\n",
|
|
" time_since_restore: 0.0011119842529296875\n",
|
|
" time_this_iter_s: 0.0011119842529296875\n",
|
|
" time_total_s: 0.0011119842529296875\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00002\n",
|
|
" warmup_time: 0.004413127899169922\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00004:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: 4435b2105eb24fbaba4778e33ce2e1a9\n",
|
|
" experiment_tag: 4_mean=5,sd=0.3721\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 5.134083536061109\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48008\n",
|
|
" time_since_restore: 0.0008499622344970703\n",
|
|
" time_this_iter_s: 0.0008499622344970703\n",
|
|
" time_total_s: 0.0008499622344970703\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00004\n",
|
|
" warmup_time: 0.0031480789184570312\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00001:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: b0920f67a88f4993b7ec85dee2f78022\n",
|
|
" experiment_tag: 1_mean=2,sd=0.5153\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 2.3125440070079093\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48005\n",
|
|
" time_since_restore: 0.0010852813720703125\n",
|
|
" time_this_iter_s: 0.0010852813720703125\n",
|
|
" time_total_s: 0.0010852813720703125\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00001\n",
|
|
" warmup_time: 0.0049626827239990234\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00003:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: a667aef035a1475a883c166a014b756c\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 4.533407187147774\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48007\n",
|
|
" time_since_restore: 0.0009930133819580078\n",
|
|
" time_this_iter_s: 0.0009930133819580078\n",
|
|
" time_total_s: 0.0009930133819580078\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00003\n",
|
|
" warmup_time: 0.0036199092864990234\n",
|
|
" \n",
|
|
"Result for WandbTrainable_25f04_00003:\n",
|
|
" date: 2022-07-22_15-39-59\n",
|
|
" done: true\n",
|
|
" experiment_id: a667aef035a1475a883c166a014b756c\n",
|
|
" experiment_tag: 3_mean=4,sd=0.5071\n",
|
|
" hostname: Kais-MacBook-Pro.local\n",
|
|
" iterations_since_restore: 1\n",
|
|
" loss: 4.533407187147774\n",
|
|
" node_ip: 127.0.0.1\n",
|
|
" pid: 48007\n",
|
|
" time_since_restore: 0.0009930133819580078\n",
|
|
" time_this_iter_s: 0.0009930133819580078\n",
|
|
" time_total_s: 0.0009930133819580078\n",
|
|
" timestamp: 1658500799\n",
|
|
" timesteps_since_restore: 0\n",
|
|
" training_iteration: 1\n",
|
|
" trial_id: 25f04_00003\n",
|
|
" warmup_time: 0.0036199092864990234\n",
|
|
" \n"
|
|
]
|
|
},
|
|
{
|
|
"name": "stderr",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"2022-07-22 15:39:59,455\tERROR tune.py:733 -- Trials did not complete: [WandbTrainable_25f04_00000, WandbTrainable_25f04_00001, WandbTrainable_25f04_00002, WandbTrainable_25f04_00003, WandbTrainable_25f04_00004]\n",
|
|
"2022-07-22 15:39:59,456\tINFO tune.py:738 -- Total run time: 6.18 seconds (6.04 seconds for the tuning loop).\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"import tempfile\n",
|
|
"from unittest.mock import MagicMock\n",
|
|
"\n",
|
|
"mock_api = True  # Mock the Wandb API so the example runs without a real API key\n",
|
|
"\n",
|
|
"api_key_file = \"~/.wandb_api_key\"\n",
|
|
"\n",
|
|
"if mock_api:\n",
"    # Swap the Wandb entry points used by the three examples for mocks.\n",
"    WandbLoggerCallback._logger_process_cls = MagicMock\n",
"    decorated_objective.__mixins__ = tuple()\n",
"    WandbTrainable._wandb = MagicMock()\n",
"    wandb = MagicMock()  # noqa: F811\n",
"    # Write a placeholder key to a temporary file and use it as the API key file.\n",
"    temp_file = tempfile.NamedTemporaryFile()\n",
"    temp_file.write(b\"1234\")\n",
"    temp_file.flush()\n",
"    api_key_file = temp_file.name\n",
|
|
"\n",
|
|
"tune_function(api_key_file)\n",
|
|
"tune_decorated(api_key_file)\n",
|
|
"tune_trainable(api_key_file)\n",
|
|
"\n",
|
|
"if mock_api:\n",
|
|
"    temp_file.close()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "2f6e9138",
|
|
"metadata": {},
|
|
"source": [
|
|
"This completes our Tune and Wandb walk-through.\n",
|
|
"The following sections document the API of the Tune-Wandb integration.\n",
|
|
"\n",
|
|
"## Tune Wandb API Reference\n",
|
|
"\n",
|
|
"### WandbLoggerCallback\n",
|
|
"\n",
|
|
"(tune-wandb-logger)=\n",
|
|
"\n",
|
|
"```{eval-rst}\n",
|
|
".. autoclass:: ray.air.callbacks.wandb.WandbLoggerCallback\n",
|
|
" :noindex:\n",
|
|
"```\n",
|
|
"\n",
|
|
"### Wandb-Mixin\n",
|
|
"\n",
|
|
"(tune-wandb-mixin)=\n",
|
|
"\n",
|
|
"```{eval-rst}\n",
|
|
".. autofunction:: ray.tune.integration.wandb.wandb_mixin\n",
|
|
" :noindex:\n",
|
|
"```"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.7.7"
|
|
},
|
|
"orphan": true
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|