mirror of
https://github.com/vale981/ray
synced 2025-03-09 04:46:38 -04:00

Co-authored-by: brettskymind <brett@pathmind.com> Co-authored-by: Max Pumperla <max.pumperla@googlemail.com>
550 lines
15 KiB
Text
550 lines
15 KiB
Text
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fdacf514",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Running Tune experiments with SigOpt\n",
|
||
"\n",
|
||
"In this tutorial we introduce SigOpt, while running a simple Ray Tune experiment. Tune’s Search Algorithms integrate with SigOpt and, as a result, allow you to seamlessly scale up a SigOpt optimization process - without sacrificing performance.\n",
|
||
"\n",
|
||
"SigOpt is a model development platform with built in hyperparameter optimization algorithms. Their technology is closed source, but is designed for optimizing functions that are nondifferentiable, with many local minima, or even unknown but only testable. Therefore, SigOpt necessarily falls in the domain of \"derivative-free optimization\" and \"black-box optimization\". In this example we minimize a simple objective to briefly demonstrate the usage of SigOpt with Ray Tune via `SigOptSearch`. It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume `sigopt==7.5.0` library is installed and an API key exists. To learn more and to obtain the necessary API key, refer to [SigOpt website](https://sigopt.com/). \n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "6e1d6d8a",
|
||
"metadata": {
|
||
"tags": [
|
||
"remove-cell"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# !pip install ray[tune]\n",
|
||
"!pip install sigopt==7.5.0"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "28c03459",
|
||
"metadata": {},
|
||
"source": [
|
||
"Click below to see all the imports we need for this example.\n",
|
||
"You can also launch directly into a Binder instance to run this notebook yourself.\n",
|
||
"Just click on the rocket symbol at the top of the navigation."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "152fce99",
|
||
"metadata": {
|
||
"tags": [
|
||
"hide-input"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import time\n",
|
||
"import os\n",
|
||
"\n",
|
||
"import ray\n",
|
||
"import numpy as np\n",
|
||
"from ray import tune\n",
|
||
"from ray.tune.suggest.sigopt import SigOptSearch\n",
|
||
"\n",
|
||
"if \"SIGOPT_KEY\" not in os.environ:\n",
|
||
" raise ValueError(\n",
|
||
" \"SigOpt API Key not found. Please set the SIGOPT_KEY \"\n",
|
||
" \"environment variable.\"\n",
|
||
" )"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "58b82407",
|
||
"metadata": {},
|
||
"source": [
|
||
"Let's start by defining a simple evaluation function.\n",
|
||
"We artificially sleep for a bit (`0.1` seconds) to simulate a long-running ML experiment.\n",
|
||
"This setup assumes that we're running multiple `step`s of an experiment and try to tune two hyperparameters,\n",
|
||
"namely `width` and `height`, and `activation`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "2f9f7650",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def evaluate(step, width, height, activation):\n",
|
||
" time.sleep(0.1)\n",
|
||
" activation_boost = 10 if activation==\"relu\" else 1\n",
|
||
" return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "dec8f15e",
|
||
"metadata": {},
|
||
"source": [
|
||
"Next, our ``objective`` function takes a Tune ``config``, evaluates the `score` of your experiment in a training loop,\n",
|
||
"and uses `tune.report` to report the `score` back to Tune."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "7284f24a",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def objective(config):\n",
|
||
" for step in range(config[\"steps\"]):\n",
|
||
" score = evaluate(step, config[\"width\"], config[\"height\"], config[\"activation\"])\n",
|
||
" tune.report(iterations=step, mean_loss=score)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "394c722e",
|
||
"metadata": {
|
||
"lines_to_next_cell": 0,
|
||
"tags": [
|
||
"remove-cell"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"ray.init(configure_logging=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a1040d00",
|
||
"metadata": {},
|
||
"source": [
|
||
"Next we define a search space. The critical assumption is that the optimal hyperparamters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "57dbed8d",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"#search_config = {\n",
|
||
"# \"steps\": 100,\n",
|
||
"# \"width\": tune.uniform(0, 20),\n",
|
||
"# \"height\": tune.uniform(-100, 100),\n",
|
||
"# \"activation\": tune.choice([\"relu, tanh\"])\n",
|
||
"#}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "7f542011",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"space = [\n",
|
||
" {\n",
|
||
" \"name\": \"width\",\n",
|
||
" \"type\": \"int\",\n",
|
||
" \"bounds\": {\"min\": 0, \"max\": 20},\n",
|
||
" },\n",
|
||
" {\n",
|
||
" \"name\": \"height\",\n",
|
||
" \"type\": \"int\",\n",
|
||
" \"bounds\": {\"min\": -100, \"max\": 100},\n",
|
||
" },\n",
|
||
" {\n",
|
||
" \"name\": \"activation\",\n",
|
||
" \"type\": \"categorical\",\n",
|
||
" \"categorical_values\": [\"relu\",\"tanh\"]\n",
|
||
" }\n",
|
||
"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0493aaed",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now we define the search algorithm built from `SigOptSearch`, constrained to a maximum of `1` concurrent trials."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "a9737339",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"algo = SigOptSearch(\n",
|
||
" space,\n",
|
||
" name=\"SigOpt Example Experiment\",\n",
|
||
" max_concurrent=1,\n",
|
||
" metric=\"mean_loss\",\n",
|
||
" mode=\"min\",\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d844d3ae",
|
||
"metadata": {},
|
||
"source": [
|
||
"```\n",
|
||
"\n",
|
||
"The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to `1000` samples.\n",
|
||
"(you can decrease this if it takes too long on your machine).\n",
|
||
"\n",
|
||
"```python\n",
|
||
"num_samples = 1000\n",
|
||
"```"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "38991a3c",
|
||
"metadata": {
|
||
"tags": [
|
||
"remove-cell"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# If 1000 samples take too long, you can reduce this number.\n",
|
||
"# We override this number here for our smoke tests.\n",
|
||
"num_samples = 10"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "13743e20",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, we run the experiment to `\"min\"`imize the \"mean_loss\" of the `objective` by searching `space` provided above to `algo`, `num_samples` times. This previous sentence is fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute `tune.run()`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "a2b6d066",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"analysis = tune.run(\n",
|
||
" objective,\n",
|
||
" name=\"sigopt_exp\",\n",
|
||
" search_alg=algo,\n",
|
||
" num_samples=num_samples,\n",
|
||
" metric=\"mean_loss\",\n",
|
||
" mode=\"min\",\n",
|
||
" config={\"steps\": 100}\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "b7577b9e",
|
||
"metadata": {},
|
||
"source": [
|
||
"Here are the hyperparamters found to minimize the mean loss of the defined objective."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "3567e642",
|
||
"metadata": {
|
||
"lines_to_next_cell": 0
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"print(\"Best hyperparameters found were: \", analysis.best_config)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d6e96a2a",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Multi-objective optimization with Sigopt\n",
|
||
"\n",
|
||
"We define another simple objective."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "8e13db27",
|
||
"metadata": {
|
||
"lines_to_next_cell": 0
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"np.random.seed(0)\n",
|
||
"vector1 = np.random.normal(0, 0.1, 100)\n",
|
||
"vector2 = np.random.normal(0, 0.1, 100)\n",
|
||
"\n",
|
||
"def evaluate(w1, w2):\n",
|
||
" total = w1 * vector1 + w2 * vector2\n",
|
||
" return total.mean(), total.std()\n",
|
||
"\n",
|
||
"def multi_objective(config):\n",
|
||
" w1 = config[\"w1\"]\n",
|
||
" w2 = config[\"total_weight\"] - w1\n",
|
||
" \n",
|
||
" average, std = evaluate(w1, w2)\n",
|
||
" tune.report(average=average, std=std, sharpe=average / std)\n",
|
||
" time.sleep(0.1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a9487562",
|
||
"metadata": {},
|
||
"source": [
|
||
"We define the space manually for `SigOptSearch`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "47ed922c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"space = [\n",
|
||
" {\n",
|
||
" \"name\": \"w1\",\n",
|
||
" \"type\": \"double\",\n",
|
||
" \"bounds\": {\"min\": 0, \"max\": 1},\n",
|
||
" },\n",
|
||
"]\n",
|
||
"\n",
|
||
"algo = SigOptSearch(\n",
|
||
" space,\n",
|
||
" name=\"sigopt_multiobj_exp\",\n",
|
||
" observation_budget=num_samples,\n",
|
||
" max_concurrent=1,\n",
|
||
" metric=[\"average\", \"std\", \"sharpe\"],\n",
|
||
" mode=[\"max\", \"min\", \"obs\"],\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "07b0f93d",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, we run the experiment using Ray Tune, which in this case requires very little input since most of the construction has gone inside `search_algo`."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "de17bd46",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"analysis = tune.run(\n",
|
||
" multi_objective,\n",
|
||
" name=\"sigopt_multiobj_exp\",\n",
|
||
" search_alg=algo,\n",
|
||
" num_samples=num_samples,\n",
|
||
" config={\"total_weight\": 1},\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3d8441a6",
|
||
"metadata": {},
|
||
"source": [
|
||
"And here are they hyperparameters found to minimize the the objective on average."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "fcd606dd",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"print(\"Best hyperparameters found were: \", analysis.get_best_config(\"average\", \"min\"))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "66c34c60",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Incorporating prior beliefs with Sigopt\n",
|
||
"\n",
|
||
"If we have information about beneficial hyperparameters within the search space, then we can incorporate this bias via a prior distribution. Without explicitly incorporating a prior, the default is a uniform distribution of preference over the search space. Below we highlight the hyperparamters we expect to be better with a Gaussian prior distribution.\n",
|
||
"\n",
|
||
"We start with defining another objective."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "870dfb0d",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"np.random.seed(0)\n",
|
||
"vector1 = np.random.normal(0.0, 0.1, 100)\n",
|
||
"vector2 = np.random.normal(0.0, 0.1, 100)\n",
|
||
"vector3 = np.random.normal(0.0, 0.1, 100)\n",
|
||
"\n",
|
||
"def evaluate(w1, w2, w3):\n",
|
||
" total = w1 * vector1 + w2 * vector2 + w3 * vector3\n",
|
||
" return total.mean(), total.std()\n",
|
||
"\n",
|
||
"def multi_objective_two(config):\n",
|
||
" w1 = config[\"w1\"]\n",
|
||
" w2 = config[\"w2\"]\n",
|
||
" total = w1 + w2\n",
|
||
" if total > 1:\n",
|
||
" w3 = 0\n",
|
||
" w1 /= total\n",
|
||
" w2 /= total\n",
|
||
" else:\n",
|
||
" w3 = 1 - total\n",
|
||
" \n",
|
||
" average, std = evaluate(w1, w2, w3)\n",
|
||
" tune.report(average=average, std=std)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7ae02ab4",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now we begin setting up the SigOpt experiment and algorithm. Incorporating a prior distribution over hyperparameters requires establishing a connection with SigOpt via `\"SIGOPT_KEY\"` environment variable. Here we create a Gaussian prior over w1 and w2, each independently. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "fa992019",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"samples = num_samples\n",
|
||
"\n",
|
||
"conn = Connection(client_token=os.environ[\"SIGOPT_KEY\"])\n",
|
||
"\n",
|
||
"experiment = conn.experiments().create(\n",
|
||
" name=\"prior experiment example\",\n",
|
||
" parameters=[\n",
|
||
" {\n",
|
||
" \"name\": \"w1\",\n",
|
||
" \"bounds\": {\"max\": 1, \"min\": 0},\n",
|
||
" \"prior\": {\"mean\": 1 / 3, \"name\": \"normal\", \"scale\": 0.2},\n",
|
||
" \"type\": \"double\",\n",
|
||
" },\n",
|
||
" {\n",
|
||
" \"name\": \"w2\",\n",
|
||
" \"bounds\": {\"max\": 1, \"min\": 0},\n",
|
||
" \"prior\": {\"mean\": 1 / 3, \"name\": \"normal\", \"scale\": 0.2},\n",
|
||
" \"type\": \"double\",\n",
|
||
" }, \n",
|
||
" ],\n",
|
||
" metrics=[\n",
|
||
" dict(name=\"std\", objective=\"minimize\", strategy=\"optimize\"),\n",
|
||
" dict(name=\"average\", strategy=\"store\"),\n",
|
||
" ],\n",
|
||
" observation_budget=samples,\n",
|
||
" parallel_bandwidth=1,\n",
|
||
")\n",
|
||
"\n",
|
||
"algo = SigOptSearch(\n",
|
||
" connection=conn,\n",
|
||
" experiment_id=experiment.id,\n",
|
||
" name=\"sigopt_prior_multi_exp\",\n",
|
||
" max_concurrent=1,\n",
|
||
" metric=[\"average\", \"std\"],\n",
|
||
" mode=[\"obs\", \"min\"],\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "dbc4099f",
|
||
"metadata": {},
|
||
"source": [
|
||
"Finally, we run the experiment using Ray Tune, where `search_algo` establishes the search space."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "da778e0f",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"analysis = tune.run(\n",
|
||
" objective,\n",
|
||
" name=\"sigopt_prior_multi_exp\",\n",
|
||
" search_alg=algo,\n",
|
||
" num_samples=samples,\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6f9c2959",
|
||
"metadata": {},
|
||
"source": [
|
||
"And here are they hyperparameters found to minimize the the objective on average."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "6b9142bc",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"print(\"Best hyperparameters found were: \", analysis.get_best_config(\"average\", \"min\"))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"id": "3e0151e1",
|
||
"metadata": {
|
||
"tags": [
|
||
"remove-cell"
|
||
]
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"ray.shutdown()"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"orphan": true
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|