Bias, variance and how they are related to Underfitting, Overfitting

This post was originally published by Rahul Banerjee at Towards Data Science

In the first image, we try to fit the data using a linear equation. The model is rigid and not at all flexible. Because of this low flexibility, the linear equation cannot fit the samples (training data), so the training error rate is high: the model has a High Bias, which in turn means it's underfitting. This model won't perform well on unseen data.

In the second image, we use an equation with degree 4. The model is flexible enough to predict most of the samples correctly, yet constrained enough to avoid overfitting. In this case, our model will also do well on the testing data, therefore this is an ideal model.

In the third image, we use an equation with degree 15 to predict the samples. Although it's able to predict almost all the samples, it has too much flexibility and will not be able to perform well on unseen data. As a result, it will have a high error rate on testing data. Since it has a low error rate on training data (Low Bias) and a high error rate on testing data (High Variance), it's overfitting.
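The three images above can be sketched numerically. The snippet below is a minimal illustration using NumPy polynomial fitting; the synthetic sine data, noise level, and sample count are illustrative assumptions, not the article's actual figures. It fits polynomials of degree 1, 4, and 15 and prints each model's training error: the degree-1 fit has high training error (High Bias), while the degree-15 fit drives training error toward zero by memorizing the noise.

```python
import numpy as np

# Illustrative training data: a noisy sine curve (assumed, not from the article)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

for degree in (1, 4, 15):
    coeffs = np.polyfit(x, y, degree)              # fit polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree:2d}: training MSE = {train_mse:.4f}")
```

The training MSE drops monotonically as the degree rises, but as the article explains, a low training error alone says nothing about performance on unseen data.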

Assume we have three models ( Model A , Model B , Model C) with the following error rates on training and testing data.

```+---------------+---------+---------+---------+
|   Error Rate  | Model A | Model B | Model C |
+---------------+---------+---------+---------+
| Training Data |   30%   |    6%   |    1%   |
+---------------+---------+---------+---------+
|  Testing Data |   45%   |    8%   |   25%   |
+---------------+---------+---------+---------+
```

For Model A, the error rate on training data is too high, and as a result the error rate on testing data is too high as well. It has a High Bias and a High Variance, therefore it's underfit. This model won't perform well on unseen data.

For Model B, the error rate on training data is low, and the error rate on testing data is low as well. It has a Low Bias and a Low Variance, therefore it's an ideal model. This model will perform well on unseen data.

For Model C, the error rate on training data is very low. However, the error rate on testing data is too high. It has a Low Bias and a High Variance, therefore it's overfit. This model won't perform well on unseen data.
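The reasoning applied to the three models can be written as a small helper. This is a sketch of the article's rule of thumb, not a standard library function: the 10% thresholds for "high" training error and for a "large" train-test gap are arbitrary assumptions chosen so the labels match the table above.

```python
def diagnose(train_err, test_err, bias_cutoff=0.10, gap_cutoff=0.10):
    """Label a model from its error rates (cutoffs are illustrative, not canonical)."""
    high_bias = train_err > bias_cutoff            # poor fit on training data
    high_variance = (test_err - train_err) > gap_cutoff  # large train-test gap
    if high_bias:
        return "underfit (High Bias)" + (", High Variance" if high_variance else "")
    if high_variance:
        return "overfit (Low Bias, High Variance)"
    return "ideal (Low Bias, Low Variance)"

for name, tr, te in [("Model A", 0.30, 0.45),
                     ("Model B", 0.06, 0.08),
                     ("Model C", 0.01, 0.25)]:
    print(f"{name}: {diagnose(tr, te)}")
```

With the table's numbers, Model A is flagged as underfit, Model B as ideal, and Model C as overfit, matching the discussion above.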


When the model's complexity is too low, i.e., a simple model, the model won't perform well on either the training data or the testing data, therefore it's underfit.

At the sweet spot, the model has a low error rate on the training data as well as the testing data, therefore, that’s the ideal model.

As the complexity of the model increases beyond that point, the model performs well on the training data but not on the testing data, and therefore it's overfit.
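This complexity sweep can also be demonstrated directly. The sketch below, again on assumed synthetic sine data held out into train and test splits, fits polynomials of increasing degree and prints both error rates: training error keeps falling with complexity, while test error falls and then climbs back up past the sweet spot.

```python
import numpy as np

# Illustrative data: noisy sine samples, split into train and test (assumed setup)
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 60)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)
x_train, y_train = x[:40], y[:40]
x_test, y_test = x[40:], y[40:]

for degree in range(1, 13):
    coeffs = np.polyfit(x_train, y_train, degree)  # model complexity = degree
    tr = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    te = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE {tr:.3f}  test MSE {te:.3f}")
```

The sweet spot is the degree where the test MSE bottoms out; everything to its left is underfitting, everything to its right is overfitting.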

Thank You for reading the article.
