Maximum Likelihood Estimate

or “MLE” for short

Data distributions:
- $p_{\text{data}}$ is just the training samples
  - Real world observations
  - If we could observe every possible event, this would be a continuous distribution
- $p_{\text{model}}$ is parametrized by $\theta$

Goal

Find the parameters $\theta$ that maximize the likelihood of making the same observations with $p_{\text{model}}$ as with $p_{\text{data}}$.

In other words: The model should predict the data that we have.

Independence Assumption

$$p_{\text{model}}(x^{(1)}, \dots, x^{(m)}; \theta) = \prod_{i=1}^{m} p_{\text{model}}(x^{(i)}; \theta)$$
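
To make the independence assumption concrete, here is a minimal sketch in Python. It assumes a coin-flip (Bernoulli) model, which is not part of these notes; the names `bernoulli_pmf`, `joint_likelihood`, and the sample values are made up for illustration. Under independence, the probability of the whole data set is simply the product of the per-sample probabilities.

```python
import math

def bernoulli_pmf(x, theta):
    """Probability of outcome x (1 or 0) for a coin with P(x = 1) = theta.

    The Bernoulli model is an assumption chosen for illustration.
    """
    return theta if x == 1 else 1.0 - theta

def joint_likelihood(samples, theta):
    """Probability of all samples together.

    Multiplying the per-sample probabilities is only valid
    if the samples are assumed to be independent.
    """
    return math.prod(bernoulli_pmf(x, theta) for x in samples)

# Hypothetical observations: three heads (1) and one tail (0).
samples = [1, 0, 1, 1]
print(joint_likelihood(samples, 0.5))   # fair coin
print(joint_likelihood(samples, 0.75))  # coin biased towards heads
```

For these samples the biased coin yields the larger product, which is the sense in which one parameter value explains the data better than another.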

Obtaining the model for an MLE

$$\theta_{ML} = \arg\max_{\theta} \prod_{i=1}^{m} p_{\text{model}}(x^{(i)}; \theta)$$

$\theta_{ML}$ is the value of $\theta$ for which the product of the probabilities that the model assigns to the observed samples is maximized. Since each probability lies between 0 and 1, their product also lies between 0 and 1, and the closer it is to 1, the better the model predicts the data that we have.
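
The maximization can be sketched with a simple grid search. This is only an illustration under stated assumptions: the coin-flip (Bernoulli) model, the helper names `likelihood` and `mle_grid_search`, the grid resolution, and the sample data are all hypothetical and not taken from these notes. Real models usually need a proper optimizer rather than a grid.

```python
import math

def likelihood(samples, theta):
    """Product of per-sample probabilities under an assumed Bernoulli model."""
    return math.prod(theta if x == 1 else 1.0 - theta for x in samples)

def mle_grid_search(samples, steps=1000):
    """Return the candidate theta in [0, 1] with the largest likelihood.

    Candidates are evenly spaced; `steps` controls the resolution.
    """
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda theta: likelihood(samples, theta))

# Hypothetical observations: 6 heads (1) and 2 tails (0).
samples = [1, 0, 1, 1, 0, 1, 1, 1]
theta_ml = mle_grid_search(samples)
print(theta_ml)
```

For this assumed model the result should match the fraction of heads in the samples (6 out of 8), which is a useful sanity check on the search.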