ray/doc/source/tune/examples/hyperopt_example.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "58fc50bc",
   "metadata": {},
   "source": [
    "# Running Tune experiments with HyperOpt\n",
    "\n",
    "In this tutorial we introduce HyperOpt, while running a simple Ray Tune experiment. Tune’s Search Algorithms integrate with HyperOpt and, as a result, allow you to seamlessly scale up a Hyperopt optimization process - without sacrificing performance.\n",
    "\n",
    "HyperOpt provides gradient/derivative-free optimization able to handle noise over the objective landscape, including evolutionary, bandit, and Bayesian optimization algorithms. Nevergrad internally supports search spaces which are continuous, discrete or a mixture of thereof. It also provides a library of functions on which to test the optimization algorithms and compare with other benchmarks.\n",
    "\n",
    "In this example we minimize a simple objective to briefly demonstrate the usage of HyperOpt with Ray Tune via `HyperOptSearch`. It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume `hyperopt==0.2.5` library is installed. To learn more, please refer to [HyperOpt website](http://hyperopt.github.io/hyperopt).\n",
    "\n",
    "We include a important example on conditional search spaces (stringing together relationships among hyperparameters)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4586d28",
   "metadata": {},
   "source": [
    "Background information:\n",
    "- [HyperOpt website](http://hyperopt.github.io/hyperopt)\n",
    "\n",
    "Necessary requirements:\n",
    "- `pip install ray[tune]`\n",
    "- `pip install hyperopt==0.2.5`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6567f2dc",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# !pip install ray[tune]\n",
    "!pip install hyperopt==0.2.5"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b8e9e0cd",
   "metadata": {},
   "source": [
    "Click below to see all the imports we need for this example.\n",
    "You can also launch directly into a Binder instance to run this notebook yourself.\n",
    "Just click on the rocket symbol at the top of the navigation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6592315e",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "import ray\n",
    "from ray import tune\n",
    "from ray.tune.suggest import ConcurrencyLimiter\n",
    "from ray.tune.suggest.hyperopt import HyperOptSearch\n",
    "from hyperopt import hp"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4b6d1d5",
   "metadata": {},
   "source": [
    "Let's start by defining a simple evaluation function.\n",
    "We artificially sleep for a bit (`0.1` seconds) to simulate a long-running ML experiment.\n",
    "This setup assumes that we're running multiple `step`s of an experiment and try to tune two hyperparameters,\n",
    "namely `width` and `height`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "12d4efc8",
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate(step, width, height):\n",
    "    time.sleep(0.1)\n",
    "    return (0.1 + width * step / 100) ** (-1) + height * 0.1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f4f5aa2",
   "metadata": {},
   "source": [
    "Next, our ``objective`` function takes a Tune ``config``, evaluates the `score` of your experiment in a training loop,\n",
    "and uses `tune.report` to report the `score` back to Tune."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c9818009",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective(config):\n",
    "    for step in range(config[\"steps\"]):\n",
    "        score = evaluate(step, config[\"width\"], config[\"height\"])\n",
    "        tune.report(iterations=step, mean_loss=score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "33eddcb9",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.init(configure_logging=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5be35d5e",
   "metadata": {},
   "source": [
    "While defining the search algorithm, we may choose to provide an initial set of hyperparameters that we believe are especially promising or informative, and\n",
    "pass this information as a helpful starting point for the `HyperOptSearch` object.\n",
    "\n",
    "We also set the maximum concurrent trials to `4` with a `ConcurrencyLimiter`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d4615bed",
   "metadata": {
    "lines_to_next_cell": 0
   },
   "outputs": [],
   "source": [
    "initial_params = [\n",
    "    {\"width\": 1, \"height\": 2},\n",
    "    {\"width\": 4, \"height\": 2},\n",
    "]\n",
    "algo = HyperOptSearch(points_to_evaluate=initial_params)\n",
    "algo = ConcurrencyLimiter(algo, max_concurrent=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2a51e7c1",
   "metadata": {},
   "source": [
    "The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to `1000` samples.\n",
    "(you can decrease this if it takes too long on your machine)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2dbb2be0",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_samples = 1000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "950558ed",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# If 1000 samples take too long, you can reduce this number.\n",
    "# We override this number here for our smoke tests.\n",
    "num_samples = 10"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e3629cb",
   "metadata": {},
   "source": [
    "Next we define a search space. The critical assumption is that the optimal hyperparamters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "65189946",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_config = {\n",
    "    \"steps\": 100,\n",
    "    \"width\": tune.uniform(0, 20),\n",
    "    \"height\": tune.uniform(-100, 100),\n",
    "    \"activation\": tune.choice([\"relu, tanh\"])\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1b94c93b",
   "metadata": {},
   "source": [
    "Finally, we run the experiment to `\"min\"`imize the \"mean_loss\" of the `objective` by searching `search_config` via `algo`, `num_samples` times. This previous sentence is fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute `tune.run()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9a99a3a7",
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis = tune.run(\n",
    "    objective,\n",
    "    search_alg=algo,\n",
    "    metric=\"mean_loss\",\n",
    "    mode=\"min\",\n",
    "    name=\"hyperopt_exp\",\n",
    "    num_samples=num_samples,\n",
    "    config=search_space,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "49be6f01",
   "metadata": {},
   "source": [
    "Here are the hyperparamters found to minimize the mean loss of the defined objective."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7036798c",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Best hyperparameters found were: \", analysis.best_config)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "504e9d2a",
   "metadata": {},
   "source": [
    "## Conditional search spaces\n",
    "\n",
    "Sometimes we may want to build a more complicated search space that has conditional dependencies on other hyperparameters. In this case, we pass a nested dictionary to `objective_two`, which has been slightly adjusted from `objective` to deal with the conditional search space."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2f7b5449",
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluation_fn(step, width, height, mult=1):\n",
    "    return (0.1 + width * step / 100) ** (-1) + height * 0.1 * mult"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4b83b81c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective_two(config):\n",
    "    width, height = config[\"width\"], config[\"height\"]\n",
    "    sub_dict = config[\"activation\"]\n",
    "    mult = sub_dict.get(\"mult\", 1)\n",
    "    \n",
    "    for step in range(config[\"steps\"]):\n",
    "        intermediate_score = evaluation_fn(step, width, height, mult)\n",
    "        tune.report(iterations=step, mean_loss=intermediate_score)\n",
    "        time.sleep(0.1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "75cea99e",
   "metadata": {},
   "outputs": [],
   "source": [
    "conditional_space = {\n",
    "    \"activation\": hp.choice(\n",
    "        \"activation\",\n",
    "        [\n",
    "            {\"activation\": \"relu\", \"mult\": hp.uniform(\"mult\", 1, 2)},\n",
    "            {\"activation\": \"tanh\"},\n",
    "        ],\n",
    "    ),\n",
    "    \"width\": hp.uniform(\"width\", 0, 20),\n",
    "    \"height\": hp.uniform(\"height\", -100, 100),\n",
    "    \"steps\": 100,\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7df282c1",
   "metadata": {},
   "source": [
    "Now we the define the search algorithm built from `HyperOptSearch` constrained by `ConcurrencyLimiter`. When the hyperparameter search space is conditional, we pass it (`conditional_space`) into `HyperOptSearch`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea2c71a6",
   "metadata": {},
   "outputs": [],
   "source": [
    "algo = HyperOptSearch(space=conditional_space, metric=\"mean_loss\", mode=\"min\")\n",
    "algo = ConcurrencyLimiter(algo, max_concurrent=4)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "630f84ab",
   "metadata": {},
   "source": [
    "Now we run the experiment, this time with an empty `config` because we instead provided `space` to the `HyperOptSearch` `search_alg`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "14111e9e",
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis = tune.run(\n",
    "    objective_two,\n",
    "    metric=\"mean_loss\",\n",
    "    mode=\"min\",\n",
    "    search_alg=algo,\n",
    "    num_samples=num_samples\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6172afa",
   "metadata": {},
   "source": [
    "Finally, we again show the hyperparameters that minimize the mean loss defined by the score of the objective function above. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "03c3fc49",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Best hyperparameters found were: \", analysis.best_config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "2f7b72d3",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.shutdown()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "orphan": true
 },
 "nbformat": 4,
 "nbformat_minor": 5
}