Keras 2.x Projects
上QQ阅读APP看书,第一时间看更新

Basic regression concepts

Regression is an inductive learning task that has been widely studied, and is widely used in practical applications. Unlike classification processes where you are trying to predict discrete class labels, regression models predict numeric values.

From a set of data, we can find a model that describes observations by the use of regression algorithms. For example, we can identify a correspondence between the input variables and output variables of a given system. One way to do this is to postulate the existence of some kind of mechanism for the parametric generation of data. This, however, does not contain the exact values of the parameters. This process typically makes reference to statistical techniques.

The extraction of general laws from a set of observed data is called induction, as opposed to deduction, in which we start from general laws and try to predict the value of a set of variables. Induction is the fundamental mechanism underlying the scientific method in which we want to derive general laws (typically described in mathematical terms), starting from the observation of phenomena.

The observation of the phenomena requires the measurement of a set of variables, and the subsequent acquisition of measured data. Then, the resulting model can be used to make predictions on additional data. The overall process is so that we start from a set of observations, and we aim to make predictions on new situations, which is called inference.

The generalization ability of the regression model is crucial for all other machine learning algorithms as well. Regression algorithms must not only detect the relationships between the target function and attribute values in the training set, but they also generalize them so that they may be used to predict new data.

It should be emphasized that the learning process must be able to capture the underlying regimes from the training set and not the specific details. Once the learning process is completed through training, the effectiveness of the model is tested further on a dataset named testset.