
GridSearchCV, a class in Python's scikit-learn library, performs an exhaustive search over a specified hyperparameter grid. Hyperparameter tuning, which seeks the set of hyperparameters that gives the best performance for a given model, is an important stage in the machine learning process.
Here’s how to use GridSearchCV to adjust hyperparameters:
First: Import the required libraries
from sklearn.model_selection import train_test_split, GridSearchCV
# to load the iris dataset
from sklearn.datasets import load_iris
from sklearn.svm import SVC
# to evaluate the predictions on the test set
from sklearn.metrics import accuracy_score
Second: Prepare or load your dataset
In this example, the Iris dataset will be utilized.
# Load the Iris dataset
data = load_iris()
X, y = data.data, data.target
Third: From the data, create training and test sets
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
Then: Build the model
# Create a support vector machine (SVM) classifier
svm_classifier = SVC()
Moving On: Describe the grid of hyperparameters
# Define the hyperparameter grid to search
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': ['scale', 'auto']
}
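This example considers several values for three hyperparameters of the SVM classifier: the regularization parameter (C), the type of kernel function (kernel), and the kernel coefficient (gamma). Note that gamma has no effect when the kernel is linear, so some combinations in this grid are redundant.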
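To get a feel for how much work the search involves, the number of candidate combinations can be counted before fitting anything. The sketch below uses scikit-learn's ParameterGrid helper for this. The names refined_grid, n_candidates, n_fits, and n_refined are illustrative and not part of the original example, and the 5 folds assume the cv=5 setting used in the next step.

```python
from sklearn.model_selection import ParameterGrid

# The same grid as in the example above
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': ['scale', 'auto'],
}

# Every combination GridSearchCV would try: 3 * 3 * 2 = 18
n_candidates = len(ParameterGrid(param_grid))

# With 5-fold cross-validation, each candidate is fitted 5 times
n_fits = n_candidates * 5
print(n_candidates, n_fits)  # 18 90

# Illustrative alternative: a list of dicts avoids pairing the
# linear kernel with gamma values it does not use
refined_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['rbf', 'poly'], 'C': [0.1, 1, 10], 'gamma': ['scale', 'auto']},
]
n_refined = len(ParameterGrid(refined_grid))
print(n_refined)  # 15
```

GridSearchCV accepts a list of dictionaries such as refined_grid in place of a single dictionary, which keeps the search from repeating equivalent linear-kernel fits.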
Subsequently: Run GridSearchCV
# Create a GridSearchCV object with the SVM classifier and the hyperparameter grid
grid_search = GridSearchCV(svm_classifier, param_grid, cv=5, n_jobs=-1)
# Perform the grid search on the training data
grid_search.fit(X_train, y_train)
Next: View the results
After fitting, you can inspect several attributes of the GridSearchCV object to learn more about the best model and its hyperparameters:
# Get the best hyperparameters found by GridSearchCV
best_params = grid_search.best_params_
print("Best Hyperparameters:", best_params)
# Get the best model with the best hyperparameters
best_model = grid_search.best_estimator_
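Beyond the single best combination, it can be useful to see how the other candidates ranked. The following is a minimal, self-contained sketch that repeats the setup above and then reads the best_score_ and cv_results_ attributes. The names results and top_five are illustrative, and n_jobs is left at its default here for simplicity.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Same data split and grid as in the walkthrough
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': ['scale', 'auto'],
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

# Mean cross-validated score of the best candidate
print("Best CV score:", round(grid_search.best_score_, 3))

# cv_results_ is a dict of arrays with one entry per candidate
results = grid_search.cv_results_
top_five = results['rank_test_score'].argsort()[:5]
for i in top_five:
    print(
        results['rank_test_score'][i],
        round(results['mean_test_score'][i], 3),
        results['params'][i],
    )
```

Several candidates often share nearly identical scores on a small dataset like Iris, so looking at the top few rows gives a better sense of how sensitive the model is to each hyperparameter than best_params_ alone.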
Lastly: Evaluate the best model
You can use the best model that GridSearchCV finds to generate predictions and then evaluate its performance on the held-out test set:
# Use the best model to make predictions on the test data
y_pred = best_model.predict(X_test)
# Evaluate the model's performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")
In summary, GridSearchCV conducts an exhaustive search across the hyperparameter grid and uses cross-validation to evaluate each combination (the number of folds is controlled by the cv parameter). When no scoring argument is given, it uses the estimator's default score method, which for classifiers is mean accuracy, and it returns the set of hyperparameters with the best mean cross-validated score.
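If accuracy is not the metric you care about, the scoring parameter of GridSearchCV lets you select candidates by a different measure. The sketch below is one possible variation, assuming macro-averaged F1 ('f1_macro') as the metric. That choice is illustrative and not something the example above prescribes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf', 'poly'],
    'gamma': ['scale', 'auto'],
}

# scoring='f1_macro' ranks candidates by macro-averaged F1
# instead of the classifier's default mean accuracy
grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='f1_macro')
grid_search.fit(X_train, y_train)

best_params = grid_search.best_params_
print("Best Hyperparameters:", best_params)
print("Best CV macro F1:", round(grid_search.best_score_, 3))
```

On a balanced dataset such as Iris the two metrics tend to agree, so the difference matters more when classes are imbalanced.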