Linear Regression
A linear model used in Machine Learning that belongs to Supervised Learning.
- Sometimes also called Linear Score Functions
- Given a labeled/classified data set $D = \{(x_1, y_1), \dots, (x_n, y_n)\}$:
    - Each data point $x_i$ is associated with some label/classification $y_i$
    - $y_i$ is the desired quantity that we want to predict for new data
    - Each data point is a vector $x_i \in \mathbb{R}^d$
- Training a linear regression model creates a linear function $f(x) = w^T x$, with the weights $w$ as part of the Model Parameter
- Testing (validating) takes a new data point $x$ and the trained model to produce an estimate $\hat{y} = f(x)$
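As a rough sketch of this setup (the data and weights below are made up for illustration), the data set is a matrix with one point per row and one label per entry of $y$, and a trained model is just a weight vector applied to new points:

```python
import numpy as np

# Toy labeled data set: n = 4 points with d = 2 features each (made-up numbers)
X = np.array([[1.0, 2.0],
              [2.0, 0.5],
              [3.0, 1.0],
              [4.0, 3.0]])          # each row is a data point x_i
y = np.array([2.1, 1.9, 3.2, 5.0])  # label y_i for each data point

# The trained model is a weight vector w; f(x) = w^T x
w = np.array([0.8, 0.4])            # placeholder weights, as if training had run

# Testing: a new data point produces an estimate y_hat = f(x_new)
x_new = np.array([2.5, 1.5])
y_hat = w @ x_new
print(y_hat)                        # 2.6
```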
Training the model #
- The Loss Function for the linear regression:
  $$L(w) = \sum_{i=1}^{n} (w^T x_i - y_i)^2$$
- It is divided by the number of samples $n$ to be independent of the number of samples:
  $$L(w) = \frac{1}{n} \sum_{i=1}^{n} (w^T x_i - y_i)^2$$
- So the optimization problem becomes
  $$\hat{w} = \arg\min_w \frac{1}{n} \lVert Xw - y \rVert^2$$
  where the rows of $X$ are the data points $x_i$ and the entries of $y$ are the labels $y_i$
- This is linear least squares
- To find the minimum, set the gradient with respect to $w$ to zero:
  $$\nabla_w L(w) = \frac{2}{n} X^T (Xw - y) \overset{!}{=} 0$$
- Therefore:
  $$\hat{w} = (X^T X)^{-1} X^T y$$
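A minimal NumPy sketch of this training step (variable names are my own): the closed form is computed by solving the normal equations as a linear system, and `np.linalg.lstsq` solves the same problem in a numerically more stable way:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = X w_true + noise
X = rng.normal(size=(100, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)

# Closed-form least squares: w_hat = (X^T X)^{-1} X^T y
w_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent, but numerically more stable than forming the inverse:
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(w_hat)    # close to w_true
print(w_lstsq)  # same result
```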
Training in 2D #
Is the Least Squares Estimate the best estimate? #
- Yes, because the resulting model is the same as the one derived with the Maximum Likelihood Estimate (assuming independence of the samples and normally distributed noise).
Proof #
- We take the MLE model:
  $$\hat{w} = \arg\max_w \prod_{i=1}^{n} p(y_i \mid x_i, w) \tag{1}$$
- Since $\log$ is monotonically increasing, the below expression optimizes for the same $w$ as the previous one, but is easier to work with:
  $$\hat{w} = \arg\max_w \sum_{i=1}^{n} \log p(y_i \mid x_i, w) \tag{2}$$
- Next, assume $p(y_i \mid x_i, w)$ is a Normal Distribution with mean $w^T x_i$ and variance $\sigma^2$:
  $$p(y_i \mid x_i, w) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - w^T x_i)^2}{2\sigma^2} \right) \tag{3}$$
- Therefore
  $$\log p(y_i \mid x_i, w) = \log\frac{1}{\sqrt{2\pi\sigma^2}} - \frac{(y_i - w^T x_i)^2}{2\sigma^2} \tag{4}$$
- Substituting this back into (2):
  $$\hat{w} = \arg\max_w \sum_{i=1}^{n} \left( \log\frac{1}{\sqrt{2\pi\sigma^2}} - \frac{(y_i - w^T x_i)^2}{2\sigma^2} \right) \tag{5}$$
- Simplifying (5): the first term and the factor $\frac{1}{2\sigma^2}$ do not depend on $w$, so they can be dropped, and maximizing the remaining negative sum is the same as minimizing the sum itself:
  $$\hat{w} = \arg\min_w \sum_{i=1}^{n} (y_i - w^T x_i)^2 \tag{6}$$
- Finally, we are interested in the $w$ that optimizes the expression, so we partially derive with respect to $w$ to find the extremum:
  $$\frac{\partial}{\partial w} \sum_{i=1}^{n} (y_i - w^T x_i)^2 = -2 \sum_{i=1}^{n} x_i \,(y_i - w^T x_i) \overset{!}{=} 0$$
- Therefore:
  $$\hat{w} = (X^T X)^{-1} X^T y$$
  which is exactly the Least Squares Estimate from above.
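As a numerical sanity check of this equivalence (my own construction, not part of the proof): gradient descent on the Gaussian negative log-likelihood lands on the same weights as the closed-form Least Squares Estimate, since the two objectives have proportional gradients:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=200)

# Closed-form Least Squares Estimate
w_lse = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the negative log-likelihood under Gaussian noise.
# Up to constants, NLL(w) = sum_i (y_i - w^T x_i)^2 / (2 sigma^2),
# so grad NLL(w) = -X^T (y - X w) / sigma^2.
sigma2 = 0.01
w = np.zeros(2)
lr = 1e-5
for _ in range(2000):
    grad = -X.T @ (y - X @ w) / sigma2
    w -= lr * grad

print(w_lse)  # the two estimates agree (up to optimization tolerance)
print(w)
```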
From Linear Regression to a Neural Network #
Linear regression will always result in a linear Decision Boundary, which is often not flexible enough.
Until now, the linear score function looks like
$$s = W x$$
- with $W$ being the weight matrix.
Even with additional weight matrices, the result would still be linear: since matrix multiplication is associative, the matrices could be collapsed into one, $W_2 (W_1 x) = (W_2 W_1)\, x$, as the sketch below shows.
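A quick numeric check of the collapse argument, with arbitrary example shapes:

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(5, 3))   # first layer: 3 -> 5
W2 = rng.normal(size=(2, 5))   # second layer: 5 -> 2
x = rng.normal(size=3)

# Two stacked linear layers...
s_stacked = W2 @ (W1 @ x)
# ...act exactly like a single linear layer, by associativity
s_single = (W2 @ W1) @ x

print(np.allclose(s_stacked, s_single))  # True
```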
The solution then is to make use of a non-linear part, similar to what the Logistic Regression does:
- 2-layer NN: $s = W_2\, \sigma(W_1 x)$
- 3-layer NN: $s = W_3\, \sigma(W_2\, \sigma(W_1 x))$
- and so on
where $\sigma$ is a non-linear function, like the Sigmoid Functions $\sigma(z) = \frac{1}{1 + e^{-z}}$, $\tanh(z)$, …
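A minimal sketch of the 2-layer score function $s = W_2\, \sigma(W_1 x)$ with the logistic sigmoid as $\sigma$ (shapes and initialization are arbitrary); the elementwise non-linearity between the layers is what prevents $W_2$ and $W_1$ from collapsing into one matrix:

```python
import numpy as np

def sigmoid(z):
    # Logistic sigmoid, applied elementwise
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(3)
W1 = rng.normal(size=(5, 3))   # hidden layer: 3 -> 5
W2 = rng.normal(size=(2, 5))   # output layer: 5 -> 2
x = rng.normal(size=3)

# 2-layer NN score: the sigmoid between the layers makes the
# overall function non-linear in x
s = W2 @ sigmoid(W1 @ x)
print(s)
```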