Parameters:

These refer to internal variables that are learned directly from the training data.

Example: In linear regression (y=mx+b), m (slope) and b (intercept) are parameters learned from data.
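
As a minimal sketch of what "learned from data" means, the snippet below fits a line to a small, made-up dataset and reads back the learned slope and intercept. The data values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy data generated from y = 2x + 1 (no noise)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Least-squares fit of a degree-1 polynomial; returns [slope, intercept]
m, b = np.polyfit(x, y, deg=1)
print(f"m = {m:.2f}, b = {b:.2f}")
```

Nobody sets m and b by hand here; the fitting routine computes them from x and y.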

Hyperparameters:

These are external settings that are chosen before training to control the learning process. They govern how the model learns, and well-chosen values can significantly improve accuracy, efficiency, and generalization. Tuning hyperparameters is therefore an important step in optimizing a machine learning model’s performance.

Example: The learning rate (which controls how fast the model updates m and b) is a hyperparameter that you must set before training.
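
To make the role of the learning rate concrete, here is a small sketch of gradient descent on y = mx + b. The function name fit_line, the toy data, and the two learning rates are assumptions made for this illustration, not part of any library.

```python
import numpy as np

def fit_line(x, y, learning_rate, n_steps=200):
    """Fit y = m*x + b by gradient descent on the mean squared error."""
    m, b = 0.0, 0.0
    for _ in range(n_steps):
        error = (m * x + b) - y
        grad_m = 2.0 * np.mean(error * x)  # d(MSE)/dm
        grad_b = 2.0 * np.mean(error)      # d(MSE)/db
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b

# Hypothetical toy data generated from y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

results = {}
for lr in (0.001, 0.05):
    m, b = fit_line(x, y, learning_rate=lr)
    results[lr] = np.mean((m * x + b - y) ** 2)
    print(f"learning_rate={lr}: m={m:.3f}, b={b:.3f}, mse={results[lr]:.6f}")
```

With the same number of steps, the smaller learning rate is still far from the true line while the larger one has nearly converged. The parameters m and b are learned in both runs; only the hyperparameter differs.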

Hyperparameter Tuning:

It is the process of selecting the best values for a machine learning model’s hyperparameters to improve performance. 

Common hyperparameter tuning techniques are:

  1. GridSearchCV
  2. RandomizedSearchCV
  3. Bayesian Optimization
  4. Hyperband (Successive Halving)

GridSearchCV:

  • GridSearchCV tries all the combinations of the values passed in the dictionary and evaluates the model for each combination using the Cross-Validation method.
  • This approach is called GridSearchCV, because it searches for the best set of hyperparameters from a grid of hyperparameter values.

Example:

Suppose we want to tune two hyperparameters, C and Alpha, of a Logistic Regression classifier, each with its own set of candidate values.

  • The grid search technique builds one version of the model for every possible combination of hyperparameter values and returns the best one.

  • As in the table below,

    for C = [0.1, 0.2, 0.3, 0.4, 0.5] and

    Alpha = [0.1, 0.2, 0.3, 0.4],

    there are 5 × 4 = 20 combinations to evaluate.

  • The combination C=0.3 and Alpha=0.2 gives the highest performance score, 0.73, so it is selected.

Performance score for each combination (rows: C, columns: Alpha)

  C \ Alpha    0.1     0.2     0.3     0.4
  0.5          0.70    0.70    0.70    0.70
  0.4          0.70    0.70    0.70    0.70
  0.3          0.72    0.73    0.71    0.70
  0.2          0.71    0.71    0.70    0.70
  0.1          0.70    0.69    0.69    0.68
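
The selection step can be sketched in a few lines. The scores are copied from the score table above; the variable names are chosen for this illustration only.

```python
import numpy as np

C_values = [0.1, 0.2, 0.3, 0.4, 0.5]
alpha_values = [0.1, 0.2, 0.3, 0.4]

# Rows follow C_values, columns follow alpha_values
scores = np.array([
    [0.70, 0.69, 0.69, 0.68],  # C = 0.1
    [0.71, 0.71, 0.70, 0.70],  # C = 0.2
    [0.72, 0.73, 0.71, 0.70],  # C = 0.3
    [0.70, 0.70, 0.70, 0.70],  # C = 0.4
    [0.70, 0.70, 0.70, 0.70],  # C = 0.5
])

# Locate the cell with the highest score and map it back to (C, Alpha)
row, col = np.unravel_index(np.argmax(scores), scores.shape)
best_C, best_alpha = C_values[row], alpha_values[col]
print(f"{scores.size} combinations, best: C={best_C}, Alpha={best_alpha}")
```

Grid search does this bookkeeping for you; the expensive part is producing each score, since every cell requires training and cross-validating a model.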

Parameters of GridSearchCV:

Parameter             Description
estimator             The machine learning model to optimize (e.g., RandomForestClassifier()).
param_grid            Dictionary mapping hyperparameter names to lists of values; every combination is tried.
scoring               The evaluation metric (e.g., ‘accuracy’, ‘f1’, ‘roc_auc’).
cv                    Number of cross-validation folds (e.g., cv=5 for 5-fold cross-validation).
n_jobs                Number of CPU cores to use (-1 means use all available cores).
verbose               Controls log output (0 = silent, 1 = minimal, 2 = detailed).
refit                 If True, retrains the best model on the full dataset after tuning.
return_train_score    If True, also returns training scores.

Limitation:

GridSearchCV exhaustively evaluates every combination in the grid, so the number of model fits grows multiplicatively with each added hyperparameter or candidate value. This makes grid search computationally very expensive.
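
A quick way to see the cost is to count the fits before running anything. The sketch below uses the same grid as the implementation that follows, with 5-fold cross-validation.

```python
from math import prod

param_grid = {
    'C': [0.1, 5, 10, 50, 60, 70],
    'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
    'random_state': list(range(1, 20)),
}
cv = 5

# One candidate per combination of values, and one fit per candidate per fold
n_candidates = prod(len(values) for values in param_grid.values())
n_fits = n_candidates * cv
print(f"{n_candidates} candidates x {cv} folds = {n_fits} fits")
# 570 candidates x 5 folds = 2850 fits
```

With refit=True there is one additional fit of the best candidate on the full dataset.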

Python Implementation for GridSearchCV:

# Import libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV

# Load data
data = pd.read_csv('/content/drive/MyDrive/Data Science/CDS-07-Machine Learning & Deep Learning/06. Machine Learning Model /05_Support Vector Machines/SVM Class /Preprocessed_data.csv')

# Creating data
data2 = data.drop(['Loan_Status'], axis=1)
X = data2.iloc[:,2:]
y = data['Loan_Status'].map({'Y':1,'N':0})

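# Defining parameter range
# Note: random_state is a seed, not a true hyperparameter. With SVC's default
# probability=False it is ignored, so it only multiplies the number of fits.
# It is kept here so the grid matches the output shown below.
param_grid = {'C': [0.1, 5, 10, 50, 60, 70],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'random_state': list(range(1, 20))}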

# Model creation
from sklearn.svm import SVC
model1  = SVC()

# Tuning model
tuned_model = GridSearchCV(model1, param_grid, refit=True, verbose=2, scoring='f1', cv=5)
tuned_model.fit(X, y)

# Showing best parameters & corresponding score
print("Tuned SVM Parameters: {}".format(tuned_model.best_params_))
print("Best score is {}".format(tuned_model.best_score_))

Output:
Tuned SVM Parameters: {'C': 5, 'gamma': 0.1, 'random_state': 1}
Best score is 0.8436087845413335

RandomizedSearchCV:

  • RandomizedSearchCV addresses the limitation of GridSearchCV by evaluating only a fixed number of combinations (n_iter), sampled from the specified lists or distributions.
  • Instead of visiting every point in the grid, it picks combinations at random to find a good set of hyperparameters.
  • This approach reduces the computational cost, at the price of possibly missing the single best combination.
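
The sampling step on its own can be sketched with scikit-learn's ParameterSampler, which draws a fixed number of combinations from lists or distributions. The ranges and the kernel choices below are illustrative assumptions, not values from the implementation that follows.

```python
from scipy.stats import loguniform
from sklearn.model_selection import ParameterSampler

param_distributions = {
    'C': loguniform(0.1, 100),      # continuous, sampled on a log scale
    'gamma': loguniform(1e-4, 1),   # continuous, sampled on a log scale
    'kernel': ['rbf', 'sigmoid'],   # a list is sampled uniformly
}

# Draw exactly n_iter combinations, however large the search space is
samples = list(ParameterSampler(param_distributions, n_iter=10, random_state=42))
for params in samples[:3]:
    print(params)
```

The cost is fixed by n_iter rather than by the size of the space, which is why continuous distributions can be used here but not in a grid.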

Parameters of RandomizedSearchCV:

Parameter              Description
estimator              The machine learning model to optimize (e.g., RandomForestClassifier()).
param_distributions    Dictionary mapping hyperparameter names to lists or distributions to sample values from.
n_iter                 Defines the number of random hyperparameter combinations to try.
scoring                The evaluation metric (e.g., ‘accuracy’, ‘f1’, ‘roc_auc’).
cv                     Number of cross-validation folds (e.g., cv=5 for 5-fold cross-validation).
n_jobs                 Number of CPU cores to use (-1 means use all available cores).
verbose                Controls log output (0 = silent, 1 = minimal, 2 = detailed).
refit                  If True, retrains the best model on the full dataset after tuning.
return_train_score     If True, also returns training scores.
random_state           Controls randomness for reproducibility.

Python Implementation for RandomizedSearchCV:

# Import libraries
import numpy as np
import pandas as pd
from sklearn.model_selection import RandomizedSearchCV

# Load data
data = pd.read_csv('/content/drive/MyDrive/Data Science/CDS-07-Machine Learning & Deep Learning/06. Machine Learning Model /05_Support Vector Machines/SVM Class /Preprocessed_data.csv')

# Creating data
data2 = data.drop(['Loan_Status'], axis=1)
X = data2.iloc[:,2:]
y = data['Loan_Status'].map({'Y':1,'N':0})

# Defining Parameter distribution
param_dist = {'C': [0.1, 5, 10, 50, 60, 70],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'random_state': list(range(1, 20))}

# Model creation
from sklearn.svm import SVC
model1 = SVC()

# Tuning Model using RandomizedSearchCV
random_search = RandomizedSearchCV(model1, param_dist, refit=True, verbose=2, scoring='f1', cv=5, n_iter=10, random_state=42)
random_search.fit(X, y)

# Showing best parameters & corresponding score 
print("Tuned SVM Parameters: {}".format(random_search.best_params_))
print("Best score is {}".format(random_search.best_score_))

Output:
Tuned SVM Parameters: {'random_state': 18, 'gamma': 0.1, 'C': 5}
Best score is 0.8436087845413335

Bayesian Optimization:

  • It uses a probabilistic model to find the best set of hyperparameters for a machine learning model.
  • Instead of exhaustively testing every possibility (like Grid Search), it learns from past trials and predicts which hyperparameters are likely to work best next.

How it works:

  • Define the Objective Function
    • The function that we want to optimize (e.g., maximize accuracy or minimize RMSE).
    • This function takes hyperparameters as input and returns a score.
  • Use a Probabilistic Model
    • Builds a surrogate model (e.g., a Gaussian Process) to estimate the performance of different hyperparameters.
    • It predicts which hyperparameter values are promising based on previous trials (the hyperparameter sets that have already been tested before the next set is selected).
  • Select the Next Hyperparameters to Test
    • It chooses hyperparameters that balance exploration and exploitation:
      • Exploration: Tries new areas (new hyperparameter values).
      • Exploitation: Focuses on areas that worked well before.
  • Evaluate & Update the Model
    • The chosen hyperparameters are used to train the ML model, and its performance is recorded.
    • The surrogate model updates itself with the new results.
    • This process repeats until we find the best hyperparameters.

Parameters of Bayesian Optimization:

Parameter               Description
n_trials                Number of trials (iterations).
n_startup_trials        The number of initial random trials before BO starts using the probabilistic model.
acquisition_function    Determines how to choose the next set of hyperparameters (e.g., Expected Improvement, Upper Confidence Bound).
xi                      Controls the trade-off between exploration and exploitation.
random_state            Sets a seed for reproducibility.

Exact parameter names vary by library (e.g., Optuna, scikit-optimize).
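
The loop described above can be sketched on a toy one-dimensional problem. This is only an illustration under stated assumptions: the objective is a made-up function, the surrogate is scikit-learn's GaussianProcessRegressor, the acquisition function is a hand-written Expected Improvement, and candidates come from a fixed grid. It is not how Optuna works internally.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    """Hypothetical 1-D score to maximize; its peak is at x = 2."""
    return -(x - 2.0) ** 2

def expected_improvement(candidates, gp, best_score, xi=0.01):
    """Expected Improvement for maximization; a larger xi favors exploration."""
    mean, std = gp.predict(candidates, return_std=True)
    std = np.maximum(std, 1e-9)  # avoid division by zero
    improvement = mean - best_score - xi
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 5.0, 501).reshape(-1, 1)

# Startup phase: a few random trials before the surrogate is used
X_tried = rng.uniform(0.0, 5.0, size=(3, 1))
y_tried = objective(X_tried).ravel()

for _ in range(10):
    # Update the surrogate with every trial seen so far
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True, alpha=1e-6)
    gp.fit(X_tried, y_tried)
    # Pick the candidate with the highest expected improvement
    ei = expected_improvement(candidates, gp, y_tried.max())
    x_next = candidates[np.argmax(ei)].reshape(1, 1)
    # Evaluate it and record the result
    X_tried = np.vstack([X_tried, x_next])
    y_tried = np.append(y_tried, objective(x_next).ravel())

best_x = X_tried[np.argmax(y_tried), 0]
print(f"best x = {best_x:.2f}, best score = {y_tried.max():.4f}")
```

Each iteration costs one real evaluation plus one surrogate fit. That trade is worthwhile when the real evaluation (training a model) is much slower than fitting the surrogate.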

Limitation:

  • Although it usually needs fewer trials than Grid/Random Search, it is not ideal for high-dimensional search spaces.
  • It is best suited to cases where each training run is slow (e.g., deep learning or large datasets), because fitting the probabilistic model (e.g., a Gaussian Process) adds extra computation to every trial.

Python Implementation for Bayesian Optimization:

# Import libraries
import numpy as np
import pandas as pd
import optuna
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Load data
data = pd.read_csv('/content/drive/MyDrive/Data Science/CDS-07-Machine Learning & Deep Learning/06. Machine Learning Model /05_Support Vector Machines/SVM Class /Preprocessed_data.csv')

# Creating data
data2 = data.drop(['Loan_Status'], axis=1)
X = data2.iloc[:, 2:]
y = data['Loan_Status'].map({'Y': 1, 'N': 0})

# Objective function for Optuna to optimize the hyperparameters
def objective(trial):
    # Suggest hyperparameters on a log scale
    # (suggest_float(..., log=True) replaces the deprecated suggest_loguniform)
    C = trial.suggest_float('C', 0.1, 100, log=True)
    gamma = trial.suggest_float('gamma', 0.0001, 1, log=True)
    # random_state is a seed rather than a true hyperparameter; kept to match the output below
    random_state = trial.suggest_int('random_state', 1, 20)

    # Create model with suggested hyperparameters
    model = SVC(C=C, gamma=gamma, random_state=random_state)

    # Perform cross-validation and return the mean F1 score
    score = cross_val_score(model, X, y, cv=5, scoring='f1').mean()
    return score

# Create and optimize the Optuna study
# (the default sampler is TPE, the Tree-structured Parzen Estimator)
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=20)

# Best parameters found by Optuna
print("Tuned SVM Parameters: {}".format(study.best_params))
print("Best score is {}".format(study.best_value))

Output:
Tuned SVM Parameters: {'C': 39.723066053068294, 'gamma': 0.024448094259763114, 'random_state': 14}
Best score is 0.8435800613044847

Hyperband (Successive Halving):

  • It is a resource-aware hyperparameter tuning method that is often faster than Grid Search and Bayesian Optimization, because most candidates are discarded after only a cheap, partial evaluation.
  • It works by allocating resources (like training time or dataset size) to different hyperparameter sets and quickly eliminating the worst-performing ones.
  • Instead of training all models fully, it trains many models with few resources first, then increases resources for the best ones.
  • Successive halving is the core routine; Hyperband runs it several times with different starting budgets.

How Hyperband Works:

  1. Randomly generate hyperparameter sets (like Random Search).
  2. Assign limited resources (e.g., train with a small dataset or fewer epochs).
  3. Evaluate and eliminate the worst half (only keep the best models).
  4. Increase resources for the remaining models (repeat until one best model remains).
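
The four steps can be sketched in plain Python. The evaluate function is a hypothetical stand-in for "train with this budget and return a score", and the lr values are made up. This shows a single round of successive halving, not the full Hyperband schedule.

```python
import random

def evaluate(config, budget):
    """Hypothetical scoring function: improves with budget,
    and is best when lr is close to 0.1."""
    quality = 1.0 - abs(config['lr'] - 0.1)
    return quality * (1.0 - 1.0 / (budget + 1))

def successive_halving(configs, min_budget=1, factor=2):
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Step 3: score every surviving configuration with the current budget
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        # Keep only the best 1/factor of them (at least one)
        survivors = ranked[:max(1, len(ranked) // factor)]
        # Step 4: give the survivors more resources in the next round
        budget *= factor
    return survivors[0]

# Step 1: randomly generate hyperparameter sets
random.seed(0)
configs = [{'lr': random.uniform(0.001, 1.0)} for _ in range(16)]

# Step 2 onwards: start with a small budget and eliminate as we go
best = successive_halving(configs)
print(best)
```

With 16 candidates and factor=2, the rounds evaluate 16, 8, 4, and 2 candidates with budgets 1, 2, 4, and 8, so only the strongest candidates ever receive the largest budget.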

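Parameters of Hyperband (Successive Halving):

Parameter    Description
max_iter     Maximum resources (e.g., training rounds) that a single trial may receive. Called max_resources in scikit-learn's HalvingGridSearchCV.
min_iter     Minimum resources assigned per trial in the first round. Called min_resources in scikit-learn's HalvingGridSearchCV.
factor       The rate at which bad trials are eliminated: only the best 1/factor survive each round (factor=2 halves them).
resource     Defines the resource used for halving (e.g., epochs, dataset size).
cv           Number of cross-validation folds.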

Python Implementation for Hyperband (Successive Halving):

# Import libraries
import pandas as pd
from sklearn.svm import SVC
from sklearn.experimental import enable_halving_search_cv  # Enables HalvingGridSearchCV
from sklearn.model_selection import HalvingGridSearchCV

# Load data
data = pd.read_csv('/content/drive/MyDrive/Data Science/CDS-07-Machine Learning & Deep Learning/06. Machine Learning Model /05_Support Vector Machines/SVM Class /Preprocessed_data.csv')

# Creating data
data2 = data.drop(['Loan_Status'], axis=1)
X = data2.iloc[:, 2:]
y = data['Loan_Status'].map({'Y': 1, 'N': 0})

# Define the hyperparameter grid
param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [0.0001, 0.001, 0.01, 0.1, 1],
    'random_state': list(range(1, 21))
}

# Create the SVM model
model = SVC()

# Define the successive halving search (scikit-learn's HalvingGridSearchCV)
tuned_model = HalvingGridSearchCV(
    model,
    param_grid,
    factor=2,  # Keep the best half of the candidates in each iteration
    min_resources='exhaust',  # Pick the starting budget so the last iteration uses as many samples as possible
    cv=5,  # 5-fold cross-validation
    scoring='f1',
    verbose=2
)

# Fit the model
tuned_model.fit(X, y)

# Print the best parameters and best score
print("Tuned SVM Parameters:", tuned_model.best_params_)
print("Best score is:", tuned_model.best_score_)

Output:
Tuned SVM Parameters: {'C': 0.1, 'gamma': 0.001, 'random_state': 15}
Best score is: 0.8044520699053379

Hyperparameter space for different algorithms:

# Note: the 'normalize' option was removed in scikit-learn 1.2;
# scale features with StandardScaler in a Pipeline instead.

# Define the hyperparameter space for linear regression
param_dist = {
    'fit_intercept': [True, False],
    'copy_X': [True, False]
}

# Define the hyperparameter space for lasso regression
param_dist = {
    'alpha': [0.1, 1.0, 10.0],
    'fit_intercept': [True, False],
    'copy_X': [True, False]
}

# Define the hyperparameter space for ridge regression
param_dist = {
    'alpha': [0.1, 1.0, 10.0],
    'fit_intercept': [True, False],
    'solver': ['auto', 'svd', 'cholesky', 'lsqr', 'sparse_cg', 'sag', 'saga']
}

# Define the hyperparameter space for logistic regression
param_dist = {
    'penalty': ['l1', 'l2'],
    'C': [0.1, 1.0, 10.0],
    'fit_intercept': [True, False],
    'solver': ['liblinear', 'saga'],
    'max_iter': [100, 200, 500]
}

# Define the hyperparameter space for KNN
param_dist = {
    'n_neighbors': [3, 5, 7, 9],
    'weights': ['uniform', 'distance'],
    'algorithm': ['auto', 'ball_tree', 'kd_tree', 'brute'],
    'p': [1, 2]
}

# Define the hyperparameter space for SVM
param_dist = {
    'C': [0.1, 1.0, 10.0],
    'kernel': ['linear', 'poly', 'rbf', 'sigmoid'],
    'gamma': ['scale', 'auto'],
    'degree': [2, 3, 4]
}

# Define the hyperparameter space for Naive Bayes
param_dist = {
    'var_smoothing': [1e-9, 1e-8, 1e-7, 1e-6]
}

# Define the hyperparameter space for Decision Tree
# ('auto' is no longer accepted for max_features as of scikit-learn 1.3)
param_dist = {
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 5, 10, 15],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['sqrt', 'log2', None]
}

# Define the hyperparameter space for Random forest
# ('auto' is no longer accepted for max_features as of scikit-learn 1.3)
param_dist = {
    'n_estimators': [100, 200, 300],
    'criterion': ['gini', 'entropy'],
    'max_depth': [None, 5, 10],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4],
    'max_features': ['sqrt', 'log2']
}

# Define the hyperparameter space for XGBOOST
param_dist = {
    'learning_rate': [0.1, 0.01, 0.001],
    'max_depth': [3, 5, 7],
    'n_estimators': [100, 200, 300],
    'subsample': [0.6, 0.8, 1.0],
    'colsample_bytree': [0.6, 0.8, 1.0],
    'gamma': [0, 1, 5]
}

# Define the hyperparameter space for K Means Clustering
param_dist = {
    'n_clusters': [2, 3, 4, 5],
    'init': ['k-means++', 'random'],
    'n_init': [10, 20, 30],
    'max_iter': [100, 200, 300]
}

# Define the hyperparameter space for DBScan Clustering
param_dist = {
    'eps': [0.1, 0.3, 0.5],
    'min_samples': [2, 5, 10],
    'metric': ['euclidean', 'manhattan', 'chebyshev']
}

# Define the hyperparameter space for Neural Networks (MLP)
mlp_params = {
    'hidden_layer_sizes': [(50,), (100,), (50, 50), (100, 100)],
    'activation': ['relu', 'tanh', 'logistic'],
    'solver': ['adam', 'sgd'],
    'alpha': [0.0001, 0.001, 0.01, 0.1]
}

# Define the hyperparameter space for ANN
param_dist = {
    'hidden_layers': [1, 2, 3],
    'units': [16, 32, 64],
    'activation': ['relu', 'sigmoid'],
    'optimizer': ['adam', 'sgd'],
    'epochs': [10, 20, 30],
    'batch_size': [8, 16, 32]
}

# Define the hyperparameter space for CNN
param_dist = {
    'filters': [16, 32, 64],
    'kernel_size': [(3, 3), (5, 5)],
    'pool_size': [(2, 2), (3, 3)],
    'hidden_units': [64, 128, 256],
    'optimizer': ['adam', 'sgd'],
    'epochs': [10, 20, 30],
    'batch_size': [8, 16, 32]
}


I am an enthusiastic advocate for the transformative power of data in the fashion realm. Armed with a strong background in data science, I am committed to revolutionizing the industry by unlocking valuable insights, optimizing processes, and fostering a data-centric culture that propels fashion businesses into a successful and forward-thinking future. - Masud Rana, Certified Data Scientist, IABAC

© Data4Fashion 2023-2025
