ray/doc/source/tune/examples/dragonfly_example.ipynb
Max Pumperla cd5218f831
[docs] Tune examples better navigation, minor fixes (#24733)
Replaces #24225 and adds example navigation

Signed-off-by: Max Pumperla <max.pumperla@googlemail.com>
2022-05-13 14:39:18 +01:00

296 lines
No EOL
8.9 KiB
Text
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

{
"cells": [
{
"cell_type": "markdown",
"id": "0332db08",
"metadata": {},
"source": [
"# Running Tune experiments with Dragonfly\n",
"\n",
"In this tutorial we introduce Dragonfly, while running a simple Ray Tune experiment.\n",
"Tunes Search Algorithms integrate with Dragonfly and, as a result,\n",
"allow you to seamlessly scale up a Dragonfly optimization process - without\n",
"sacrificing performance.\n",
"\n",
"Dragonfly is an open source python library for scalable Bayesian optimization.\n",
"Bayesian optimization is used optimizing black-box functions whose evaluations\n",
"are usually expensive. Beyond vanilla optimization techniques,\n",
"Dragonfly provides an array of tools to scale up Bayesian optimization to expensive\n",
"large scale problems. These include features that are especially suited for high\n",
"dimensional spaces (optimizing with a large number of variables), parallel evaluations\n",
"in synchronous or asynchronous settings (conducting multiple evaluations in parallel),\n",
"multi-fidelity optimization (using cheap approximations to speed up the optimization\n",
"process), and multi-objective optimization (optimizing multiple functions\n",
"simultaneously).\n",
"\n",
"Bayesian optimization does not rely on the gradient of the objective function,\n",
"but instead, learns from samples of the search space. It is suitable for optimizing\n",
"functions that are non-differentiable, with many local minima, or even unknown but only\n",
"testable. Therefore, it belongs to the domain of \"derivative-free optimization\" and\n",
"\"black-box optimization\". In this example we minimize a simple objective to briefly\n",
"demonstrate the usage of Dragonfly with Ray Tune via `DragonflySearch`. It's useful\n",
"to keep in mind that despite the emphasis on machine learning experiments,\n",
"Ray Tune optimizes any implicit or explicit objective. Here we assume\n",
"`dragonfly-opt==0.1.6` library is installed. To learn more, please refer to\n",
"the [Dragonfly website](https://dragonfly-opt.readthedocs.io/)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9878e2bd",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# !pip install ray[tune]\n",
"!pip install dragonfly-opt==0.1.6"
]
},
{
"cell_type": "markdown",
"id": "3acad068",
"metadata": {},
"source": [
"Click below to see all the imports we need for this example.\n",
"You can also launch directly into a Binder instance to run this notebook yourself.\n",
"Just click on the rocket symbol at the top of the navigation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4ea7aefa",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"import numpy as np\n",
"import time\n",
"\n",
"import ray\n",
"from ray import tune\n",
"from ray.tune.suggest import ConcurrencyLimiter\n",
"from ray.tune.suggest.dragonfly import DragonflySearch"
]
},
{
"cell_type": "markdown",
"id": "dc67fc3f",
"metadata": {},
"source": [
"Let's start by defining a optimization problem.\n",
"Suppose we want to figure out the proportions of water and several salts to add to an\n",
"ionic solution with the goal of maximizing it's ability to conduct electricity.\n",
"The objective here is explicit for demonstration, yet in practice they often come\n",
"out of a black-box (e.g. a physical device measuring conductivity, or reporting the\n",
"results of a long-running ML experiment). We artificially sleep for a bit\n",
"(`0.02` seconds) to simulate a more typical experiment. This setup assumes that we're\n",
"running multiple `step`s of an experiment and try to tune relative proportions of\n",
"4 ingredients-- these proportions should be considered as hyperparameters.\n",
"Our `objective` function will take a Tune `config`, evaluates the `conductivity` of\n",
"our experiment in a training loop,\n",
"and uses `tune.report` to report the `conductivity` back to Tune."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0d72404",
"metadata": {},
"outputs": [],
"source": [
"def objective(config):\n",
" \"\"\"\n",
" Simplistic model of electrical conductivity with added Gaussian\n",
" noise to simulate experimental noise.\n",
" \"\"\"\n",
" for i in range(config[\"iterations\"]):\n",
" vol1 = config[\"LiNO3_vol\"] # LiNO3\n",
" vol2 = config[\"Li2SO4_vol\"] # Li2SO4\n",
" vol3 = config[\"NaClO4_vol\"] # NaClO4\n",
" vol4 = 10 - (vol1 + vol2 + vol3) # Water\n",
" conductivity = vol1 + 0.1 * (vol2 + vol3) ** 2 + 2.3 * vol4 * (vol1 ** 1.5)\n",
" conductivity += np.random.normal() * 0.01\n",
" tune.report(timesteps_total=i, objective=conductivity)\n",
" time.sleep(0.02)"
]
},
{
"cell_type": "markdown",
"id": "1808e8e0",
"metadata": {},
"source": [
"Next we define a search space. The critical assumption is that the optimal\n",
"hyperparameters live within this space. Yet, if the space is very large, then those\n",
"hyperparameters may be difficult to find in a short amount of time."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6b867a19",
"metadata": {},
"outputs": [],
"source": [
"search_space = {\n",
" \"iterations\": 100,\n",
" \"LiNO3_vol\": tune.uniform(0, 7),\n",
" \"Li2SO4_vol\": tune.uniform(0, 7),\n",
" \"NaClO4_vol\": tune.uniform(0, 7)\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "6fe16177",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"ray.init(configure_logging=False)"
]
},
{
"cell_type": "markdown",
"id": "ed5fc098",
"metadata": {},
"source": [
"Now we define the search algorithm from `DragonflySearch` with `optimizer` and\n",
"`domain` arguments specified in a common way. We also use `ConcurrencyLimiter`\n",
"to constrain to 4 concurrent trials."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a8075e34",
"metadata": {},
"outputs": [],
"source": [
"algo = DragonflySearch(\n",
" optimizer=\"bandit\",\n",
" domain=\"euclidean\",\n",
")\n",
"algo = ConcurrencyLimiter(algo, max_concurrent=4)"
]
},
{
"cell_type": "markdown",
"id": "82240170",
"metadata": {},
"source": [
"The number of samples is the number of hyperparameter combinations that will be\n",
"tried out. This Tune run is set to `1000` samples.\n",
"(you can decrease this if it takes too long on your machine)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c437c1e1",
"metadata": {},
"outputs": [],
"source": [
"num_samples = 100"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "70192183",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# Reducing samples for smoke tests\n",
"num_samples = 10"
]
},
{
"cell_type": "markdown",
"id": "3a823fb1",
"metadata": {},
"source": [
"Finally, we run the experiment to `min`imize the `mean_loss` of the `objective` by\n",
"searching `search_config` via `algo`, `num_samples` times. This previous sentence is\n",
"fully characterizes the search problem we aim to solve. With this in mind,\n",
"notice how efficient it is to execute `tune.run()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "97c69cbe",
"metadata": {},
"outputs": [],
"source": [
"analysis = tune.run(\n",
" objective,\n",
" metric=\"objective\",\n",
" mode=\"max\",\n",
" name=\"dragonfly_search\",\n",
" search_alg=algo,\n",
" num_samples=num_samples,\n",
" config=search_space\n",
")"
]
},
{
"cell_type": "markdown",
"id": "fb1a7563",
"metadata": {},
"source": [
"Below are the recommended relative proportions of water and each salt found to\n",
"maximize conductivity in the ionic solution (according to the simple model):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ccb3b44b",
"metadata": {},
"outputs": [],
"source": [
"print(\"Best hyperparameters found: \", analysis.best_config)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "42eca06e",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"ray.shutdown()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"orphan": true
},
"nbformat": 4,
"nbformat_minor": 5
}