Commonly used metrics and techniques for evaluating regression models are listed below.

Mean Absolute Error (MAE):
- MAE is the average absolute difference between the predicted values and the actual values. It tells you how far off, on average, your predictions are from the true values, in the same units as the target.
- Formula: MAE = (1/n) ∑|y_true - y_pred|
- Interpretation: Lower MAE indicates better model performance.
- Python Code:

      from sklearn.metrics import mean_absolute_error

      mae = mean_absolute_error(y_true, y_pred)
      print("Mean Absolute Error (MAE):", mae)

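
To make the MAE formula concrete, here is a minimal from-scratch sketch that uses only plain Python. The toy values and the helper name `mean_absolute_error_manual` are made up for illustration and do not come from any dataset or library.

```python
# Toy values chosen for illustration only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

def mean_absolute_error_manual(y_true, y_pred):
    """Average of |y_true - y_pred| over all observations."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Absolute errors are 0.5, 0.5, 0.0 and 1.0, so the mean is 2.0 / 4.
mae = mean_absolute_error_manual(y_true, y_pred)
print("MAE:", mae)  # MAE: 0.5
```

On the same inputs, `mean_absolute_error` from scikit-learn should give the same number; the manual version only spells out the arithmetic.
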
Mean Squared Error (MSE):
- MSE is the average of the squared differences between predicted values and actual values. Because the errors are squared, larger errors are penalized more heavily, and the result is in squared units of the target.
- Formula: MSE = (1/n) ∑(y_true - y_pred)^2
- Interpretation: Lower MSE indicates better model performance.
- Python Code:

      from sklearn.metrics import mean_squared_error

      mse = mean_squared_error(y_true, y_pred)
      print("Mean Squared Error (MSE):", mse)

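
The claim that MSE penalizes larger errors more heavily can be checked with a small sketch. The two prediction sets below are invented so that both have the same MAE; the helpers `mae` and `mse` are plain-Python stand-ins written for this example, not library functions.

```python
def mae(y_true, y_pred):
    """Mean absolute error computed from scratch."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error computed from scratch."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [10.0, 10.0, 10.0, 10.0]
pred_even = [12.0, 12.0, 12.0, 12.0]     # every prediction is off by 2
pred_outlier = [10.0, 10.0, 10.0, 18.0]  # one prediction is off by 8

# Same total absolute error, but the single large miss dominates the MSE.
print(mae(y_true, pred_even), mse(y_true, pred_even))        # 2.0 4.0
print(mae(y_true, pred_outlier), mse(y_true, pred_outlier))  # 2.0 16.0
```

If occasional large misses are especially costly in your application, this sensitivity is a reason to prefer MSE; if they are mostly noise, MAE may be the more robust choice.
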
Root Mean Squared Error (RMSE):
- RMSE is the square root of MSE. It is easier to interpret than MSE because it is expressed in the same units as the target variable.
- Formula: RMSE = √MSE
- Interpretation: Lower RMSE indicates better model performance.
- Python Code:

      import math
      from sklearn.metrics import mean_squared_error

      rmse = math.sqrt(mean_squared_error(y_true, y_pred))
      print("Root Mean Squared Error (RMSE):", rmse)

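
As a quick illustration of the units point, the sketch below computes MSE and RMSE on hypothetical house prices expressed in thousands of dollars. The numbers are invented for the example.

```python
import math

# Hypothetical house prices in thousands of dollars.
y_true = [200.0, 350.0, 500.0]
y_pred = [210.0, 330.0, 530.0]

squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
mse = sum(squared_errors) / len(squared_errors)  # units: (thousand dollars)^2
rmse = math.sqrt(mse)                            # units: thousand dollars

print(f"MSE:  {mse:.2f}")   # MSE:  466.67
print(f"RMSE: {rmse:.2f}")  # RMSE: 21.60
```

An RMSE of about 21.6 can be read directly as "a typical error of roughly 21.6 thousand dollars", whereas the MSE of 466.67 has no such direct reading.
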
R-squared (R2) or Coefficient of Determination:
- R-squared measures the proportion of the variance in the dependent variable (target) that is predictable from the independent variables (features). For a least-squares linear model with an intercept, evaluated on its training data, it lies between 0 and 1. On other data, or for other models, it can be negative, which means the model does worse than always predicting the mean of the target.
- Formula: R2 = 1 - (SSE / SST), where SSE = ∑(y_true - y_pred)^2 is the sum of squared errors and SST = ∑(y_true - mean(y_true))^2 is the total sum of squares.
- Interpretation: Higher R2 indicates better model performance, with 1 indicating a perfect fit and 0 indicating a model no better than predicting the mean.
- Python Code:

      from sklearn.metrics import r2_score

      r2 = r2_score(y_true, y_pred)
      print("R-squared (R2):", r2)

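
The formula R2 = 1 - (SSE / SST) can be traced by hand. The sketch below is a from-scratch version on invented toy data; `r2_manual` is a hypothetical helper name. It also shows that R2 is not bounded below by 0: predictions that are worse than simply predicting the mean give a negative value.

```python
def r2_manual(y_true, y_pred):
    """R2 = 1 - SSE / SST, computed from scratch."""
    mean_true = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # squared errors
    sst = sum((t - mean_true) ** 2 for t in y_true)          # total variation
    if sst == 0:
        # Choice made for this sketch: refuse a constant target.
        raise ValueError("R2 is undefined when y_true is constant")
    return 1 - sse / sst

y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
close_fit = [1.1, 1.9, 3.2, 3.8, 5.0]     # small errors
mean_only = [3.0] * 5                     # always predicts the mean of y_true
reversed_fit = [5.0, 4.0, 3.0, 2.0, 1.0]  # systematically wrong

print(round(r2_manual(y_true, close_fit), 2))  # 0.99
print(r2_manual(y_true, mean_only))            # 0.0
print(r2_manual(y_true, reversed_fit))         # -3.0
```
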
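Adjusted R-squared:
- Adjusted R-squared corrects R-squared for the number of predictors in the model. It penalizes the inclusion of predictors that do not improve the fit.
- Formula: Adjusted R2 = 1 - (1 - R2) * ((n - 1) / (n - p - 1)), where n is the number of observations (rows) and p is the number of predictors (independent variables, i.e. feature columns). The formula requires n > p + 1.
- Interpretation: Higher adjusted R-squared indicates a better model fit. Unlike R-squared, it can decrease when a predictor is added.
- Python Code:

      def adjusted_r_squared(r2, n, p):
          # n: number of observations, p: number of predictors (features)
          return 1 - ((1 - r2) * (n - 1) / (n - p - 1))

      # p must be the number of feature columns, not the number of rows.
      # X is assumed here to be the 2-D feature matrix used to fit the model.
      n_predictors = X.shape[1]
      adjusted_r2 = adjusted_r_squared(r2, len(y_true), n_predictors)
      print("Adjusted R-squared:", adjusted_r2)
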
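
To see the penalty at work, the sketch below holds R2 fixed and varies only the number of predictors p. The values of `r2`, `n` and `p` are invented for illustration, and the guard on `n - p - 1` is an addition made for this example.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R2 for n observations and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("adjusted R2 requires n > p + 1")
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

r2 = 0.90  # hypothetical R2, held fixed on purpose
n = 50     # hypothetical number of observations

# More predictors with no gain in R2 means a lower adjusted R2.
for p in (1, 5, 20):
    print(p, round(adjusted_r_squared(r2, n, p), 4))
# 1 0.8979
# 5 0.8886
# 20 0.831
```

In practice R2 itself usually creeps upward as predictors are added, so the adjusted value is what tells you whether the increase was worth the extra predictors.
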
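Mean Absolute Percentage Error (MAPE):
- MAPE expresses the prediction errors as a percentage of the actual values. It is commonly used in business contexts because it is scale-independent. It is undefined when any actual value is 0 and becomes very large when actual values are close to 0.
- Formula: MAPE = (1/n) ∑(|(y_true - y_pred) / y_true|) * 100
- Interpretation: Lower MAPE indicates better model performance.
- Python Code:

      import numpy as np

      # Convert to float arrays so the arithmetic is element-wise
      # even when plain Python lists are passed in.
      y_true = np.asarray(y_true, dtype=float)
      y_pred = np.asarray(y_pred, dtype=float)

      # Assumes y_true contains no zeros; otherwise the division is invalid.
      mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100
      print("Mean Absolute Percentage Error (MAPE):", mape)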
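
Because MAPE divides by the actual values, a zero in `y_true` breaks the calculation. The sketch below is one possible plain-Python version that fails loudly in that case. The helper name `mape`, the toy numbers, and the choice to raise an error (rather than skip or clip the zero entries) are all assumptions made for this example.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    if any(t == 0 for t in y_true):
        # Design choice for this sketch: refuse rather than silently divide by 0.
        raise ValueError("MAPE is undefined when y_true contains zeros")
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return sum(ratios) / len(ratios) * 100

y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]

# Relative errors are 10%, 10% and 0%, so the mean is 20% / 3.
print(round(mape(y_true, y_pred), 2))  # 6.67
```

Recent scikit-learn versions also ship `sklearn.metrics.mean_absolute_percentage_error`; note that it returns a fraction (for example 0.0667) rather than a value already multiplied by 100.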