mirror of
https://github.com/vale981/ray
synced 2025-03-07 02:51:39 -05:00

Co-authored-by: brettskymind <brett@pathmind.com> Co-authored-by: Max Pumperla <max.pumperla@googlemail.com>
557 lines
16 KiB
Text
{
"cells": [
{
"cell_type": "markdown",
"id": "a587ce4e",
"metadata": {},
"source": [
"# Running Tune experiments with Optuna\n",
"\n",
"In this tutorial we introduce Optuna, while running a simple Ray Tune experiment. Tune’s Search Algorithms integrate with Optuna and, as a result, allow you to seamlessly scale up an Optuna optimization process - without sacrificing performance.\n",
"\n",
"Similar to Ray Tune, Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning. It features an imperative (\"how\" over \"what\" emphasis), define-by-run style user API. With Optuna, a user has the ability to dynamically construct the search spaces for the hyperparameters. Optuna falls in the domain of \"derivative-free optimization\" and \"black-box optimization\".\n",
"\n",
"In this example we minimize a simple objective to briefly demonstrate the usage of Optuna with Ray Tune via `OptunaSearch`, including examples of conditional search spaces (expressing relationships between hyperparameters) and multi-objective problems (measuring trade-offs among all important metrics). It's useful to keep in mind that despite the emphasis on machine learning experiments, Ray Tune optimizes any implicit or explicit objective. Here we assume the `optuna==2.9.1` library is installed. To learn more, please refer to the [Optuna website](https://optuna.org/).\n",
"\n",
"Please note that sophisticated schedulers, such as `AsyncHyperBandScheduler`, may not work correctly with multi-objective optimization, since they typically expect a scalar score to compare fitness among trials.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aeaf9ff0",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# !pip install ray[tune]\n",
"!pip install optuna==2.9.1"
]
},
{
"cell_type": "markdown",
"id": "467466a3",
"metadata": {},
"source": [
"Click below to see all the imports we need for this example.\n",
"You can also launch directly into a Binder instance to run this notebook yourself.\n",
"Just click on the rocket symbol at the top of the navigation."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fb1ad624",
"metadata": {
"tags": [
"hide-input"
]
},
"outputs": [],
"source": [
"import time\n",
"from typing import Dict, Optional, Any\n",
"\n",
"import ray\n",
"from ray import tune\n",
"from ray.tune.suggest import ConcurrencyLimiter\n",
"from ray.tune.suggest.optuna import OptunaSearch"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e64f0b44",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"ray.init(configure_logging=False)"
]
},
{
"cell_type": "markdown",
"id": "56b0c685",
"metadata": {},
"source": [
"Let's start by defining a simple evaluation function.\n",
"An explicit math formula is queried here for demonstration, yet in practice this is typically a black-box function -- e.g. the performance results after training an ML model.\n",
"We artificially sleep for a bit (`0.1` seconds) to simulate a long-running ML experiment.\n",
"This setup assumes that we're running multiple `step`s of an experiment while tuning three hyperparameters,\n",
"namely `width`, `height`, and `activation`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90a11f98",
"metadata": {},
"outputs": [],
"source": [
"def evaluate(step, width, height, activation):\n",
"    time.sleep(0.1)\n",
"    activation_boost = 10 if activation==\"relu\" else 0\n",
"    return (0.1 + width * step / 100) ** (-1) + height * 0.1 + activation_boost"
]
},
{
"cell_type": "markdown",
"id": "bc579b83",
"metadata": {},
"source": [
"Next, our `objective` function to be optimized takes a Tune `config`, evaluates the `score` of your experiment in a training loop,\n",
"and uses `tune.report` to report the `score` back to Tune."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a11d0e0",
"metadata": {
"lines_to_next_cell": 0
},
"outputs": [],
"source": [
"def objective(config):\n",
"    for step in range(config[\"steps\"]):\n",
"        score = evaluate(step, config[\"width\"], config[\"height\"], config[\"activation\"])\n",
"        tune.report(iterations=step, mean_loss=score)\n"
]
},
{
"cell_type": "markdown",
"id": "c58bd20b",
"metadata": {},
"source": [
"Next we define a search space. The critical assumption is that the optimal hyperparameters live within this space. Yet, if the space is very large, then those hyperparameters may be difficult to find in a short amount of time.\n",
"\n",
"The simplest case is a search space with independent dimensions. In this case, a config dictionary will suffice."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3e4eecb",
"metadata": {},
"outputs": [],
"source": [
"search_space = {\n",
"    \"steps\": 100,\n",
"    \"width\": tune.uniform(0, 20),\n",
"    \"height\": tune.uniform(-100, 100),\n",
"    \"activation\": tune.choice([\"relu\", \"tanh\"]),\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "ef0c666d",
"metadata": {},
"source": [
"Here we define the Optuna search algorithm:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f23cadc8",
"metadata": {},
"outputs": [],
"source": [
"algo = OptunaSearch()"
]
},
{
"cell_type": "markdown",
"id": "4287fa79",
"metadata": {},
"source": [
"We also constrain the number of concurrent trials to `4` with a `ConcurrencyLimiter`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "68022ea4",
"metadata": {},
"outputs": [],
"source": [
"algo = ConcurrencyLimiter(algo, max_concurrent=4)\n"
]
},
{
"cell_type": "markdown",
"id": "4c250f74",
"metadata": {},
"source": [
"The number of samples is the number of hyperparameter combinations that will be tried out. This Tune run is set to `1000` samples.\n",
"(You can decrease this if it takes too long on your machine.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6c21314",
"metadata": {},
"outputs": [],
"source": [
"num_samples = 1000"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9533aabf",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"# We override here for our smoke tests.\n",
"num_samples = 10"
]
},
{
"cell_type": "markdown",
"id": "92942b88",
"metadata": {},
"source": [
"Finally, we run the experiment to `\"min\"`imize the \"mean_loss\" of the `objective` by searching `search_space` via `algo`, `num_samples` times. The previous sentence fully characterizes the search problem we aim to solve. With this in mind, notice how efficient it is to execute `tune.run()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4e224bb2",
"metadata": {},
"outputs": [],
"source": [
"analysis = tune.run(\n",
"    objective,\n",
"    search_alg=algo,\n",
"    metric=\"mean_loss\",\n",
"    mode=\"min\",\n",
"    num_samples=num_samples,\n",
"    config=search_space\n",
")"
]
},
{
"cell_type": "markdown",
"id": "b66aab6a",
"metadata": {},
"source": [
"And now we have the hyperparameters found to minimize the mean loss."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e69db02e",
"metadata": {},
"outputs": [],
"source": [
"print(\"Best hyperparameters found were: \", analysis.best_config)"
]
},
{
"cell_type": "markdown",
"id": "d545d30b",
"metadata": {},
"source": [
"## Providing an initial set of hyperparameters\n",
"\n",
"While defining the search algorithm, we may choose to provide an initial set of hyperparameters that we believe are especially promising or informative, and\n",
"pass this information as a helpful starting point for the `OptunaSearch` object."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7596b7f4",
"metadata": {},
"outputs": [],
"source": [
"initial_params = [\n",
"    {\"width\": 1, \"height\": 2, \"activation\": \"relu\"},\n",
"    {\"width\": 4, \"height\": 2, \"activation\": \"relu\"},\n",
"]"
]
},
{
"cell_type": "markdown",
"id": "f84bbff0",
"metadata": {},
"source": [
"Now the `search_alg` built using `OptunaSearch` takes `points_to_evaluate`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "320d1935",
"metadata": {
"lines_to_next_cell": 0
},
"outputs": [],
"source": [
"searcher = OptunaSearch(points_to_evaluate=initial_params)\n",
"algo = ConcurrencyLimiter(searcher, max_concurrent=4)"
]
},
{
"cell_type": "markdown",
"id": "9147d9a2",
"metadata": {},
"source": [
"And run the experiment with initial hyperparameter evaluations:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ee442efd",
"metadata": {},
"outputs": [],
"source": [
"analysis = tune.run(\n",
"    objective,\n",
"    search_alg=algo,\n",
"    metric=\"mean_loss\",\n",
"    mode=\"min\",\n",
"    num_samples=num_samples,\n",
"    config=search_space\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ccfe15e2",
"metadata": {},
"source": [
"We take another look at the optimal hyperparameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fcfa0c2e",
"metadata": {},
"outputs": [],
"source": [
"print(\"Best hyperparameters found were: \", analysis.best_config)"
]
},
{
"cell_type": "markdown",
"id": "88080576",
"metadata": {},
"source": [
"## Conditional search spaces\n",
"\n",
"Sometimes we may want to build a more complicated search space that has conditional dependencies on other hyperparameters. In this case, we pass a define-by-run function to `OptunaSearch`, which is then passed to the `search_alg` argument of `tune.run()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f0acc2fc",
"metadata": {
"lines_to_next_cell": 0
},
"outputs": [],
"source": [
"def define_by_run_func(trial) -> Optional[Dict[str, Any]]:\n",
"    \"\"\"Define-by-run function to create the search space.\n",
"\n",
"    Ensure no actual computation takes place here. That should go into\n",
"    the trainable passed to ``tune.run`` (in this example, that's\n",
"    ``objective``).\n",
"\n",
"    For more information, see https://optuna.readthedocs.io/en/stable\\\n",
"    /tutorial/10_key_features/002_configurations.html\n",
"\n",
"    This function should either return None or a dict with constant values.\n",
"    \"\"\"\n",
"\n",
"    activation = trial.suggest_categorical(\"activation\", [\"relu\", \"tanh\"])\n",
"\n",
"    # Define-by-run allows for conditional search spaces.\n",
"    if activation == \"relu\":\n",
"        trial.suggest_float(\"width\", 0, 20)\n",
"        trial.suggest_float(\"height\", -100, 100)\n",
"    else:\n",
"        trial.suggest_float(\"width\", -1, 21)\n",
"        trial.suggest_float(\"height\", -101, 101)\n",
"\n",
"    # Return all constants in a dictionary.\n",
"    return {\"steps\": 100}"
]
},
{
"cell_type": "markdown",
"id": "4c9d0945",
"metadata": {},
"source": [
"As before, we create the `search_alg` from `OptunaSearch` and `ConcurrencyLimiter`, but this time we define the scope of the search via the `space` argument and provide no initialization. We also must specify the metric and mode when using `space`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "906f9ffc",
"metadata": {},
"outputs": [],
"source": [
"searcher = OptunaSearch(space=define_by_run_func, metric=\"mean_loss\", mode=\"min\")\n",
"algo = ConcurrencyLimiter(searcher, max_concurrent=4)"
]
},
{
"cell_type": "markdown",
"id": "fea9399c",
"metadata": {},
"source": [
"Running the experiment with a define-by-run search space:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bf0ee932",
"metadata": {},
"outputs": [],
"source": [
"analysis = tune.run(\n",
"    objective,\n",
"    search_alg=algo,\n",
"    num_samples=num_samples\n",
")"
]
},
{
"cell_type": "markdown",
"id": "11e1ee04",
"metadata": {},
"source": [
"We take a look again at the optimal hyperparameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "13e4ce18",
"metadata": {
"lines_to_next_cell": 0
},
"outputs": [],
"source": [
"print(\"Best hyperparameters for loss found were: \", analysis.get_best_config(\"mean_loss\", \"min\"))"
]
},
{
"cell_type": "markdown",
"id": "34bbd066",
"metadata": {},
"source": [
"## Multi-objective optimization\n",
"\n",
"Finally, let's take a look at the multi-objective case."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b233cbea",
"metadata": {},
"outputs": [],
"source": [
"def multi_objective(config):\n",
"    # Hyperparameters\n",
"    width, height = config[\"width\"], config[\"height\"]\n",
"\n",
"    for step in range(config[\"steps\"]):\n",
"        # Iterative training function - can be any arbitrary training procedure\n",
"        intermediate_score = evaluate(step, config[\"width\"], config[\"height\"], config[\"activation\"])\n",
"        # Feed the score back to Tune.\n",
"        tune.report(\n",
"            iterations=step, loss=intermediate_score, gain=intermediate_score * width\n",
"        )"
]
},
{
"cell_type": "markdown",
"id": "338e5108",
"metadata": {},
"source": [
"We define the `OptunaSearch` object, this time with the metric and mode as list arguments."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "624d0bc8",
"metadata": {},
"outputs": [],
"source": [
"searcher = OptunaSearch(metric=[\"loss\", \"gain\"], mode=[\"min\", \"max\"])\n",
"algo = ConcurrencyLimiter(searcher, max_concurrent=4)\n",
"\n",
"analysis = tune.run(\n",
"    multi_objective,\n",
"    search_alg=algo,\n",
"    num_samples=num_samples,\n",
"    config=search_space\n",
")"
]
},
{
"cell_type": "markdown",
"id": "df42b8b3",
"metadata": {},
"source": [
"Now there are two hyperparameter sets for the two objectives.\n",
"\n",
"```\n",
"print(\"Best hyperparameters for loss found were: \", analysis.get_best_config(\"loss\", \"min\"))\n",
"print(\"Best hyperparameters for gain found were: \", analysis.get_best_config(\"gain\", \"max\"))\n",
"```\n",
"\n",
"We can mix-and-match the use of initial hyperparameter evaluations, conditional search spaces via define-by-run functions, and multi-objective tasks. This is also true of scheduler usage, with the exception of multi-objective optimization -- schedulers typically rely on a single scalar score, rather than the two scores we use here: loss and gain."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a058fdb3",
"metadata": {
"tags": [
"remove-cell"
]
},
"outputs": [],
"source": [
"ray.shutdown()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"orphan": true
},
"nbformat": 4,
"nbformat_minor": 5
}