How to Perform Scikit-learn Hyperparameter Optimization with Optuna
Introduction
Optuna is a framework designed for automating hyperparameter optimization, that is, finding the configuration of a machine learning model's hyperparameters (settings fixed externally, before training) that optimizes the model's performance. It integrates seamlessly with machine learning modeling frameworks such as Scikit-learn.
In this article, we show how to combine both for a hyperparameter optimization task.
Performing Scikit-learn Hyperparameter Optimization with Optuna
If you are using Optuna for the first time in your Python development environment, you will need to install it first.
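The standard way is via pip:

pip install optuna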
In this example, we will train a random forest classifier using Scikit-learn's digits
dataset, which is a “simplified version” of the MNIST dataset for image classification, containing 8×8 pixel images of handwritten digits. Time to import the necessary components.
import optuna
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.ensemble import RandomForestClassifier
Next, we define the main function to be called later for conducting the hyperparameter optimization process:
def objective(trial):
    # Define the hyperparameter search space
    n_estimators = trial.suggest_int("n_estimators", 10, 200)
    max_depth = trial.suggest_int("max_depth", 2, 32, log=True)
    min_samples_split = trial.suggest_int("min_samples_split", 2, 10)

    # Load and split the dataset
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.2, random_state=42
    )

    # Initialize the model with the suggested hyperparameters
    clf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        random_state=42
    )

    # Evaluate with 3-fold cross-validation on the training set
    score = cross_val_score(clf, X_train, y_train, cv=3, scoring="accuracy").mean()
    return score
As you can observe, pretty much the entire modeling process is encapsulated in this objective(trial)
function, a core function managed by Optuna, which calls it once per trial to automate the hyperparameter search. Let's look at the body of the function, i.e. the process defined by us:
- Defining the hyperparameter search space or grid (see this article for a better understanding of this concept). Hyperparameters in the search space are added to Optuna's radar via the suggest_int function (other suggest_* variants are sketched right after this list).
- Loading and splitting the dataset into training and test subsets.
- Initializing the model.
- Using cross-validation to evaluate it.
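Our search space only needs integers, but Optuna provides analogous methods for other hyperparameter types. As a brief illustration (the hyperparameter names below are hypothetical, not part of our example):

# Float sampled on a log scale, e.g. a learning rate
lr = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
# Categorical choice, e.g. a split criterion
criterion = trial.suggest_categorical("criterion", ["gini", "entropy"])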
Now we execute the whole process in two steps. First, we define the “study”, or hyperparameter optimization experiment. Notice we set the direction argument to “maximize” because earlier we chose accuracy, a higher-is-better metric, as the guiding score for cross-validation.
study = optuna.create_study(direction="maximize")
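By default, create_study uses Optuna's TPE sampler, which drives the Bayesian-style search discussed later. If you want a reproducible search, one option (a sketch, not required by the example) is to pass a seeded sampler explicitly:

study = optuna.create_study(
    direction="maximize",
    sampler=optuna.samplers.TPESampler(seed=42)  # fixed seed for reproducibility
)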
Second, the optimize method repeatedly invokes the objective function we defined earlier, running as many trials as specified in its n_trials argument.
study.optimize(objective, n_trials=50)
This will output a total of 50 trial reports, each specifying the hyperparameter setting tried and the resulting model's accuracy. Since we are interested in the best of all configurations tried, let's retrieve it:
print("Best hyperparameters:", study.best_params)
print("Best accuracy:", study.best_value)
And a sample output:
Best hyperparameters: {'n_estimators': 188, 'max_depth': 17, 'min_samples_split': 4}
Best accuracy: 0.9700765483646485
Fantastic! Thanks to Optuna’s automated hyperparameter optimization capabilities, we found a random forest ensemble configuration capable of classifying digit images with over 97% prediction accuracy.
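Note that this score was computed by cross-validation on the training portion only. As a final sanity check, you could retrain a model with the best hyperparameters and score it on the held-out test set. A minimal sketch, assuming we recreate the same split used inside objective (identical random_state):

from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Recreate the exact train/test split used inside objective()
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

# Retrain on the full training set with the best configuration found
best_clf = RandomForestClassifier(**study.best_params, random_state=42)
best_clf.fit(X_train, y_train)
print("Test accuracy:", best_clf.score(X_test, y_test))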
If you have used Scikit-learn's built-in hyperparameter optimization classes like GridSearchCV
or RandomizedSearchCV
, you might be wondering: why is Optuna better? One reason is Optuna's use of Bayesian optimization behind the scenes, where each trial is informed by the results of previous ones, making the tuning process more efficient. Besides, Optuna applies internal strategies like pruning, i.e. terminating unpromising trials early, and supports more complex search spaces than conventional methods, such as conditional hyperparameters.
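To make the pruning idea concrete, here is a minimal sketch, not part of the original example, of an objective that reports intermediate scores so a median pruner can terminate unpromising trials early; the staged n_estimators schedule is an illustrative assumption:

import optuna
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

digits = load_digits()
X_train, _, y_train, _ = train_test_split(
    digits.data, digits.target, test_size=0.2, random_state=42
)

def pruned_objective(trial):
    max_depth = trial.suggest_int("max_depth", 2, 32, log=True)
    score = 0.0
    # Grow the forest in stages, reporting the intermediate score at each step
    for step, n_estimators in enumerate([25, 50, 100, 200]):
        clf = RandomForestClassifier(
            n_estimators=n_estimators, max_depth=max_depth, random_state=42
        )
        score = cross_val_score(clf, X_train, y_train, cv=3, scoring="accuracy").mean()
        trial.report(score, step)
        # Let the pruner stop this trial if it compares poorly to earlier ones
        if trial.should_prune():
            raise optuna.TrialPruned()
    return score

study = optuna.create_study(
    direction="maximize", pruner=optuna.pruners.MedianPruner()
)
study.optimize(pruned_objective, n_trials=20)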
Wrapping Up
If you have followed along with the above, you should now be able to implement Scikit-learn hyperparameter optimization using Optuna.
For more information, check out the following Machine Learning Mastery resources: