
Regularization
So, what can you do to prevent a model from learning misleading or irrelevant patterns from the training data? With neural networks, the best solution is almost always to get more training data: a model trained on more data will naturally generalize better to data it has never seen. Of course, getting more data is not always simple, or even possible. When this is the case, you have several other techniques at your disposal to achieve similar effects. One of them is to constrain your model in terms of the amount of information it may store. As we saw in the behind enemy lines example in Chapter 1, Overview of Neural Networks, it is useful to find the most efficient representations of information, or representations with the lowest entropy. Similarly, if we only allow our model the capacity to memorize a small number of patterns, we force it to find the most efficient representations, which generalize better to data the model may encounter later on. This process of improving model generalizability by reducing overfitting is known as regularization, and we will go over it in more detail before we use it in practice, as in the sketch that follows.
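To make the idea of constraining a model concrete, here is a minimal Keras sketch of two common ways of limiting what a network can memorize: keeping the layers small and adding an L2 penalty on the weights. The input dimensionality, layer sizes, and binary classification head are hypothetical placeholders, not a prescription for any particular dataset.

```python
from tensorflow.keras import models, layers, regularizers

# A deliberately small network: few units per layer limit how much the
# model can memorize, and the L2 penalty discourages large weights,
# both of which push the model toward simpler, more general patterns.
# The input size of 10,000 features is a hypothetical placeholder.
constrained_model = models.Sequential([
    layers.Input(shape=(10000,)),
    layers.Dense(4, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(4, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1, activation='sigmoid')  # binary classification output
])

constrained_model.compile(optimizer='rmsprop',
                          loss='binary_crossentropy',
                          metrics=['accuracy'])
```

Each L2 term adds 0.001 * (weight value)^2 to the training loss for every weight in that layer, so the network pays a price for memorizing patterns with large, highly specific weights; we will revisit these options in detail shortly.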