2022-04-25 11:10:58 -07:00
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0332db08",
   "metadata": {},
   "source": [
    "# Running Tune experiments with Dragonfly\n",
    "\n",
    "In this tutorial we introduce Dragonfly while running a simple Ray Tune experiment.\n",
    "Tune’s Search Algorithms integrate with Dragonfly and, as a result,\n",
    "allow you to seamlessly scale up a Dragonfly optimization process without\n",
    "sacrificing performance.\n",
    "\n",
    "Dragonfly is an open source Python library for scalable Bayesian optimization.\n",
    "Bayesian optimization is used for optimizing black-box functions whose evaluations\n",
    "are usually expensive. Beyond vanilla optimization techniques,\n",
    "Dragonfly provides an array of tools to scale up Bayesian optimization to expensive\n",
    "large scale problems. These include features that are especially suited for high\n",
    "dimensional spaces (optimizing with a large number of variables), parallel evaluations\n",
    "in synchronous or asynchronous settings (conducting multiple evaluations in parallel),\n",
    "multi-fidelity optimization (using cheap approximations to speed up the optimization\n",
    "process), and multi-objective optimization (optimizing multiple functions\n",
    "simultaneously).\n",
    "\n",
    "Bayesian optimization does not rely on the gradient of the objective function,\n",
    "but instead learns from samples of the search space. It is suitable for optimizing\n",
    "functions that are non-differentiable, have many local minima, or are even unknown and only\n",
    "testable. Therefore, it belongs to the domain of \"derivative-free optimization\" and\n",
    "\"black-box optimization\". In this example we maximize a simple objective to briefly\n",
    "demonstrate the usage of Dragonfly with Ray Tune via `DragonflySearch`. It's useful\n",
    "to keep in mind that despite the emphasis on machine learning experiments,\n",
    "Ray Tune optimizes any implicit or explicit objective. Here we assume the\n",
    "`dragonfly-opt==0.1.6` library is installed. To learn more, please refer to\n",
    "the [Dragonfly website](https://dragonfly-opt.readthedocs.io/)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9878e2bd",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# !pip install ray[tune]\n",
    "!pip install dragonfly-opt==0.1.6"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3acad068",
   "metadata": {},
   "source": [
    "Click below to see all the imports we need for this example.\n",
    "You can also launch directly into a Binder instance to run this notebook yourself.\n",
    "Just click on the rocket symbol at the top of the navigation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4ea7aefa",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import time\n",
    "\n",
    "import ray\n",
    "from ray import tune\n",
    "from ray.air import session\n",
    "from ray.tune.search import ConcurrencyLimiter\n",
    "from ray.tune.search.dragonfly import DragonflySearch"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dc67fc3f",
   "metadata": {},
   "source": [
    "Let's start by defining an optimization problem.\n",
    "Suppose we want to figure out the proportions of water and several salts to add to an\n",
    "ionic solution with the goal of maximizing its ability to conduct electricity.\n",
    "The objective here is explicit for demonstration, yet in practice objectives often come\n",
    "out of a black box (e.g. a physical device measuring conductivity, or the\n",
    "results of a long-running ML experiment). We artificially sleep for a bit\n",
    "(`0.02` seconds) to simulate a more typical experiment. This setup assumes that we're\n",
    "running multiple `step`s of an experiment and trying to tune the relative proportions of\n",
    "four ingredients; these proportions should be considered as hyperparameters.\n",
    "Our `objective` function takes a Tune `config`, evaluates the `conductivity` of\n",
    "our experiment in a training loop,\n",
    "and uses `session.report` to report the `conductivity` back to Tune."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f0d72404",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective(config):\n",
    "    \"\"\"\n",
    "    Simplistic model of electrical conductivity with added Gaussian\n",
    "    noise to simulate experimental noise.\n",
    "    \"\"\"\n",
    "    for i in range(config[\"iterations\"]):\n",
    "        vol1 = config[\"LiNO3_vol\"]  # LiNO3\n",
    "        vol2 = config[\"Li2SO4_vol\"]  # Li2SO4\n",
    "        vol3 = config[\"NaClO4_vol\"]  # NaClO4\n",
    "        vol4 = 10 - (vol1 + vol2 + vol3)  # Water\n",
    "        conductivity = vol1 + 0.1 * (vol2 + vol3) ** 2 + 2.3 * vol4 * (vol1 ** 1.5)\n",
    "        conductivity += np.random.normal() * 0.01\n",
    "        session.report({\"timesteps_total\": i, \"objective\": conductivity})\n",
    "        time.sleep(0.02)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1808e8e0",
   "metadata": {},
   "source": [
    "Next we define a search space. The critical assumption is that the optimal\n",
    "hyperparameters live within this space. Yet, if the space is very large, then those\n",
    "hyperparameters may be difficult to find in a short amount of time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6b867a19",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_space = {\n",
    "    \"iterations\": 100,\n",
    "    \"LiNO3_vol\": tune.uniform(0, 7),\n",
    "    \"Li2SO4_vol\": tune.uniform(0, 7),\n",
    "    \"NaClO4_vol\": tune.uniform(0, 7)\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6fe16177",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.init(configure_logging=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed5fc098",
   "metadata": {},
   "source": [
    "Now we define the search algorithm using `DragonflySearch`, setting the `optimizer`\n",
    "argument to `\"bandit\"` and the `domain` argument to `\"euclidean\"`. We also use\n",
    "`ConcurrencyLimiter` to limit the run to 4 concurrent trials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8075e34",
   "metadata": {},
   "outputs": [],
   "source": [
    "algo = DragonflySearch(\n",
    "    optimizer=\"bandit\",\n",
    "    domain=\"euclidean\",\n",
    ")\n",
    "algo = ConcurrencyLimiter(algo, max_concurrent=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "82240170",
   "metadata": {},
   "source": [
    "The number of samples is the number of hyperparameter combinations that will be\n",
    "tried out. This Tune run is set to `100` samples\n",
    "(you can decrease this if it takes too long on your machine)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c437c1e1",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_samples = 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "70192183",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# Reducing samples for smoke tests\n",
    "num_samples = 10"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3a823fb1",
   "metadata": {},
   "source": [
    "Finally, we run the experiment to `max`imize the `objective` metric reported by the\n",
    "`objective` function, searching `search_space` via `algo`, `num_samples` times. The\n",
    "previous sentence fully characterizes the search problem we aim to solve. With this in mind,\n",
    "notice how efficient it is to execute `tune.run()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "97c69cbe",
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis = tune.run(\n",
    "    objective,\n",
    "    metric=\"objective\",\n",
    "    mode=\"max\",\n",
    "    name=\"dragonfly_search\",\n",
    "    search_alg=algo,\n",
    "    num_samples=num_samples,\n",
    "    config=search_space\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb1a7563",
   "metadata": {},
   "source": [
    "Below are the recommended relative proportions of water and each salt found to\n",
    "maximize conductivity in the ionic solution (according to the simple model):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ccb3b44b",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Best hyperparameters found: \", analysis.best_config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "42eca06e",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.shutdown()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "orphan": true
 },
 "nbformat": 4,
 "nbformat_minor": 5
}