ray/doc/source/tune/examples/hebo_example.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "83323340",
   "metadata": {},
   "source": [
    "# Running Tune experiments with HEBOSearch\n",
    "\n",
    "In this tutorial we introduce HEBO, while running a simple Ray Tune experiment. Tune’s Search Algorithms integrate with ZOOpt and, as a result, allow you to seamlessly scale up a HEBO optimization process - without sacrificing performance.\n",
    "\n",
    "Heteroscadastic Evolutionary Bayesian Optimization (HEBO) does not rely on the gradient of the objective function, but instead, learns from samples of the search space. It is suitable for optimizing functions that are nondifferentiable, with many local minima, or even unknown but only testable. This necessarily makes the algorithm belong to the domain of \"derivative-free optimization\" and \"black-box optimization\".\n",
    "\n",
    "In this example we minimize a simple objective to briefly demonstrate the usage of HEBO with Ray Tune via `HEBOSearch`. It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume `zoopt==0.4.1` library is installed. To learn more, please refer to the [HEBO website](https://github.com/huawei-noah/HEBO/tree/master/HEBO)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b8de7864",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# !pip install ray[tune]\n",
    "!pip install HEBO==0.3.2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37141309",
   "metadata": {},
   "source": [
    "Click below to see all the imports we need for this example.\n",
    "You can also launch directly into a Binder instance to run this notebook yourself.\n",
    "Just click on the rocket symbol at the top of the navigation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9dae789a",
   "metadata": {
    "tags": [
     "hide-input"
    ]
   },
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "import ray\n",
    "from ray import tune\n",
    "from ray.tune.suggest.hebo import HEBOSearch"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5747767",
   "metadata": {},
   "source": [
    "Let's start by defining a simple evaluation function.\n",
    "We artificially sleep for a bit (`0.1` seconds) to simulate a long-running ML experiment.\n",
    "This setup assumes that we're running multiple `step`s of an experiment and try to tune two hyperparameters,\n",
    "namely `width` and `height`, and `activation`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d654948a",
   "metadata": {},
   "outputs": [],
   "source": [
    "def evaluate(step, width, height, activation):\n",
    "    time.sleep(0.1)\n",
    "    activation_boost = 10 if activation==\"relu\" else 1\n",
    "    return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e61acd5f",
   "metadata": {},
   "source": [
    "Next, our ``objective`` function takes a Tune ``config``, evaluates the `score` of your experiment in a training loop,\n",
    "and uses `tune.report` to report the `score` back to Tune."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b7e5f38b",
   "metadata": {},
   "outputs": [],
   "source": [
    "def objective(config):\n",
    "    for step in range(config[\"steps\"]):\n",
    "        score = evaluate(step, config[\"width\"], config[\"height\"], config[\"activation\"])\n",
    "        tune.report(iterations=step, mean_loss=score)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "49966a2e",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.init(configure_logging=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "79567a1b",
   "metadata": {},
   "source": [
    "While defining the search algorithm, we may choose to provide an initial set of hyperparameters that we believe are especially promising or informative, and\n",
    "pass this information as a helpful starting point for the `HyperOptSearch` object.\n",
    "\n",
    "We also set the maximum concurrent trials to `8`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9aaf222e",
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "previously_run_params = [\n",
    "    {\"width\": 10, \"height\": 0, \"activation\": \"relu\"},\n",
    "    {\"width\": 15, \"height\": -20, \"activation\": \"tanh\"},\n",
    "]\n",
    "\n",
    "known_rewards = [-189, -1144]\n",
    "\n",
    "max_concurrent = 8\n",
    "\n",
    "algo = HEBOSearch(\n",
    "    metric=\"mean_loss\",\n",
    "    mode=\"min\",\n",
    "    points_to_evaluate=previously_run_params,\n",
    "    evaluated_rewards=known_rewards,\n",
    "    random_state_seed=123,\n",
    "    max_concurrent=max_concurrent,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "19942c67",
   "metadata": {},
   "source": [
    "The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to `1000` samples.\n",
    "(you can decrease this if it takes too long on your machine)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea2a405c",
   "metadata": {},
   "outputs": [],
   "source": [
    "num_samples = 1000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cdc1b707",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "# If 1000 samples take too long, you can reduce this number.\n",
    "# We override this number here for our smoke tests.\n",
    "num_samples = 10"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40fe2a91",
   "metadata": {},
   "source": [
    "Next we define a search space. The critical assumption is that the optimal hyperparamters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0a530e21",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_config = {\n",
    "    \"steps\": 100,\n",
    "    \"width\": tune.uniform(0, 20),\n",
    "    \"height\": tune.uniform(-100, 100),\n",
    "    \"activation\": tune.choice([\"relu, tanh\"])\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3e38301f",
   "metadata": {},
   "source": [
    "Finally, we run the experiment to `\"min\"`imize the \"mean_loss\" of the `objective` by searching `search_config` via `algo`, `num_samples` times. This previous sentence is fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute `tune.run()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3fb63ef3",
   "metadata": {},
   "outputs": [],
   "source": [
    "analysis = tune.run(\n",
    "    objective,\n",
    "    metric=\"mean_loss\",\n",
    "    mode=\"min\",\n",
    "    name=\"hebo_exp_with_warmstart\",\n",
    "    search_alg=algo,\n",
    "    num_samples=num_samples,\n",
    "    config=search_config\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "963e547b",
   "metadata": {},
   "source": [
    "Here are the hyperparamters found to minimize the mean loss of the defined objective."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8b8a23e4",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"Best hyperparameters found were: \", analysis.best_config)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d0882e22",
   "metadata": {
    "tags": [
     "remove-cell"
    ]
   },
   "outputs": [],
   "source": [
    "ray.shutdown()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "orphan": true
 },
 "nbformat": 4,
 "nbformat_minor": 5
}