
Evaluation metrics for regression problems.

Evaluation metrics are very important, as they tell us how accurate our model is.

Before we proceed to the evaluation techniques, it is important to gain some intuition.

[Image: Linear Regression]

In the above image, we can see that we have fitted a line, but the fit is not perfect, as some points lie above the line & some lie below it.

So, how accurate is our model?

The evaluation metrics aim to answer this question. Now, without wasting time, let's jump in & see the evaluation techniques.

There are 6 evaluation techniques:


1. M.A.E (Mean Absolute Error)

2. M.S.E (Mean Squared Error)

3. R.M.S.E (Root Mean Squared Error)

4. R.M.S.L.E (Root Mean Squared Log Error)

5. R-Squared

6. Adjusted R-Squared

Now, let’s discuss these techniques one by one.

M.A.E (Mean Absolute Error)

It is the simplest & most widely used evaluation technique. It is simply the mean of the absolute differences between the actual & predicted values.

Below is the mathematical formula of the Mean Absolute Error.

$\text{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|$

Scikit-Learn is a great library, as it has almost all the built-in functions we need on our Data Science journey. Below is the code to implement Mean Absolute Error:

from sklearn.metrics import mean_absolute_error

mean_absolute_error(y_true, y_pred)

Here, 'y_true' contains the true target values & 'y_pred' the predicted target values.
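For a quick sanity check, here is a tiny usage example; the numbers are made up purely for illustration:

from sklearn.metrics import mean_absolute_error

# Illustrative values only, not from a real dataset
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# Mean of |0.5|, |0|, |-1.5|, |-1| = (0.5 + 0 + 1.5 + 1) / 4
print(mean_absolute_error(y_true, y_pred))  # 0.75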

M.S.E (Mean Squared Error)


Another evaluation technique is the Mean Squared Error. It takes the average of the square of the error, where the error is the difference between the actual & predicted values.

Below is the mathematical formula of the Mean Squared Error.

$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

I'm sure that after looking at the above mathematical function, you will be thinking about the implementation. But don't worry, there is an inbuilt function called 'mean_squared_error'.

Here is the code below:


from sklearn.metrics import mean_squared_error

mean_squared_error(y_true, y_pred)

Here, 'y_true' is the true value & 'y_pred' is the predicted value.

Well, a problem with the above function is that it changes the units: the MSE is expressed in the squared units of the target variable. To avoid that problem, we will use another technique, called R.M.S.E (Root Mean Squared Error).

R.M.S.E (Root Mean Squared Error)


Root Mean Squared Error is another technique that is widely used these days. First of all, it solves the problem of the above technique.

It takes the square root of the average of the squared errors, which brings the metric back to the same units as the target. Below is the mathematical function, which will make things clearer.

$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$

Below is the code to calculate the Root Mean Squared Error:

from sklearn.metrics import mean_squared_error
import numpy as np

mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)

Here, you can see that we don't have a dedicated function to calculate RMSE; instead, we compute the MSE & take its square root.
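As a side note, recent scikit-learn versions (1.4 and later) do provide a dedicated function, so depending on your installed version you may not need the manual square root:

from sklearn.metrics import root_mean_squared_error

rmse = root_mean_squared_error(y_true, y_pred)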

But wait, there is a limitation of this method.

Let’s take the below example.

In Example 1, we can see that the error is very large: the actual value is 1 & the predicted value is 401.

In Example 2, the actual value is much larger, so even though the absolute error is also 400, the prediction is close to the actual value in relative terms.

For both examples the error is 400, but the ML model in Example 2 is actually giving us better results. According to RMSE, however, the error is the same.

Therefore, to solve this problem, we use another similar, but modified, method, which is discussed below.

R.M.S.L.E (Root Mean Squared Log Error)

The mathematical function of this technique is displayed below.

$\text{RMSLE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left( \log(y_i + 1) - \log(\hat{y}_i + 1) \right)^2}$

Now, if we take the above case with RMSLE, the RMSLE of Example 1 is much greater than that of Example 2, & therefore RMSLE solves the problem which occurred with RMSE (Root Mean Squared Error).

This works because the logarithm scales down large values, so the metric effectively measures the relative error rather than the absolute error.

Below is the code to implement it:

import numpy as np
from sklearn.metrics import mean_squared_log_error

np.sqrt(mean_squared_log_error(y_true, predictions))

Here, 'y_true' is the actual target variable & 'predictions' is the predicted target variable.
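To make the comparison with RMSE concrete, here is a small sketch. Example 1 uses the values from above (actual 1, predicted 401); since the exact values of Example 2 are not given here, the ones below (actual 100,000, predicted 100,400) are assumed purely for illustration:

import numpy as np
from sklearn.metrics import mean_squared_error, mean_squared_log_error

y_true_1, y_pred_1 = [1], [401]            # Example 1 (from the article)
y_true_2, y_pred_2 = [100_000], [100_400]  # Example 2 (assumed values)

# RMSE is identical for both examples: the absolute error is 400 either way
print(np.sqrt(mean_squared_error(y_true_1, y_pred_1)))      # 400.0
print(np.sqrt(mean_squared_error(y_true_2, y_pred_2)))      # 400.0

# RMSLE is far larger for Example 1, where the relative error is huge
print(np.sqrt(mean_squared_log_error(y_true_1, y_pred_1)))  # ~5.30
print(np.sqrt(mean_squared_log_error(y_true_2, y_pred_2)))  # ~0.004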

R-Squared

Now, we come to another technique called R-Squared, formally known as the coefficient of determination. It is based on the Relative Squared Error with respect to a baseline model that always predicts the mean.

This method helps us to calculate the error relative to that baseline, which lets us judge which algorithm is better based on their mean squared errors.

The mathematical formula of the R-Squared method is given below.

$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$

If the ratio in the above formula is greater than 1, R-Squared becomes negative. This means that the squared error of our model is greater than that of the baseline model (which always predicts the mean), which in turn means that the new model is worse than the baseline model.

The higher the R-Squared, the better the model.

Below is the code to implement the R-Squared evaluation technique:

from sklearn.metrics import r2_score

r2_score(y_true, y_pred)

Here, 'y_true' is the true target variable & 'y_pred' is the predicted target variable.
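As a quick illustration of the "worse than baseline" case, here is a made-up example where the predictions point in exactly the wrong direction:

from sklearn.metrics import r2_score

# Illustrative values only
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [4.0, 3.0, 2.0, 1.0]

# Negative R-Squared: this model is worse than always predicting the mean (2.5)
print(r2_score(y_true, y_pred))  # -3.0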

After reading the above paragraphs, you may be really impressed by these evaluation metrics. But wait, there is a limitation of this technique.

The limitation is that the R-Squared value either increases or stays the same when more features are added, regardless of how those features impact the model.

To overcome this limitation, there is another evaluation technique called Adjusted R-Squared, which is discussed below.

Adjusted R-Squared

The mathematical formula is displayed below.

$R^2_{\text{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$

Here, n is the number of samples & k is the number of features.


There is no inbuilt function in scikit-learn to calculate Adjusted R-Squared, but we can find the R-Squared & compute the Adjusted R-Squared from it ourselves, as shown below.
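Here is a minimal sketch of how that could look; the helper name 'adjusted_r2' is my own, not a scikit-learn function:

from sklearn.metrics import r2_score

def adjusted_r2(y_true, y_pred, n_features):
    # n: number of samples, k: number of features, as in the formula above
    n = len(y_true)
    k = n_features
    r2 = r2_score(y_true, y_pred)
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)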

Well, the above are the 6 most commonly used evaluation metrics
for Regression Problems.

But there are a lot of steps that need to be done before model training, like data cleaning, data visualization, data analysis, missing value treatment, outlier treatment, etc.
