Machine learning in 10 pictures
1. Test and training error:
Why lower training error is not always a good thing: ESL Figure 2.11. Test and training error as a function of model complexity.
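To make the picture concrete, here is a minimal numpy sketch (the noisy-sine data and the degree range are my own assumptions, not the figure's): training error keeps falling as the polynomial degree grows, while test error eventually turns back up.

import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical data: noisy samples of a smooth underlying curve.
    x = rng.uniform(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, n)
    return x, y

x_train, y_train = make_data(20)
x_test, y_test = make_data(200)

for degree in range(1, 10):
    coeffs = np.polyfit(x_train, y_train, degree)   # higher degree = more model complexity
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")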
2. Under and overfitting:
PRML Figure 1.4. Plots of polynomials having various orders M, shown as red curves, fitted to the data set generated by the green curve.
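A small sketch in the spirit of that figure, assuming the usual PRML setup of ten noisy samples from sin(2*pi*x): low orders underfit, M=3 tracks the green curve well, and M=9 drives the error on the data points to essentially zero while the error against the true curve and the coefficient magnitudes blow up.

import numpy as np

rng = np.random.default_rng(1)

# Ten noisy observations of the "green curve" sin(2*pi*x).
N = 10
x = np.linspace(0, 1, N)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, N)

x_dense = np.linspace(0, 1, 200)
true_curve = np.sin(2 * np.pi * x_dense)

for M in (0, 1, 3, 9):
    w = np.polyfit(x, t, M)                                   # polynomial of order M
    rms_data = np.sqrt(np.mean((np.polyval(w, x) - t) ** 2))  # fit to the 10 points
    rms_true = np.sqrt(np.mean((np.polyval(w, x_dense) - true_curve) ** 2))  # fit to the green curve
    print(f"M={M}: RMS on data {rms_data:.3f}, RMS vs true curve {rms_true:.3f}, "
          f"max |coefficient| {np.abs(w).max():.1f}")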
3. Occam’s razor:
ITILA Figure 28.3. Why Bayesian inference embodies Occam’s razor. This figure gives the basic intuition for why complex models can turn out to be less probable. The horizontal axis represents the space of possible data sets D. Bayes’ theorem rewards models in proportion to how much they predicted the data that occurred. These predictions are quantified by a normalized probability distribution on D.
This probability of the data given model Hi, P(D|Hi), is called the evidence for Hi. A simple model H1 makes only a limited range of predictions, shown by P(D|H1); a more powerful model H2 with more free parameters can predict a greater variety of data sets, but must spread its predictive probability more thinly over them. If the data set falls in region C1, the less powerful model H1 is therefore the more probable model.
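The same intuition can be reproduced numerically. The toy setup below is my own (a single observation with known noise; H1 fixes the mean at 0 while H2 gives it a broad prior), but it shows the mechanism: the flexible model spreads its predictive probability over many more data sets, so when the observation lands where the simple model predicted, the simple model has the higher evidence.

import numpy as np

def normal_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

# Toy setup (my own): the data D is a single observation x with unit noise variance.
# H1 (simple): mean fixed at 0, so P(D|H1) = N(x; 0, 1).
# H2 (flexible): mean is a free parameter with broad prior N(0, 100); marginalizing it out
# gives P(D|H2) = N(x; 0, 1 + 100) -- a much flatter distribution over possible data sets.
for x in (0.5, 3.0, 8.0):
    ev_h1 = normal_pdf(x, 0.0, 1.0)
    ev_h2 = normal_pdf(x, 0.0, 101.0)
    winner = "H1" if ev_h1 > ev_h2 else "H2"
    print(f"x = {x}: P(D|H1) = {ev_h1:.4f}, P(D|H2) = {ev_h2:.4f} -> {winner} more probable")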
4. Feature combinations:
5. Irrelevant features:
Why irrelevant features hurt kNN, clustering, and other similarity based methods. The figure on the left shows two classes well separated on the vertical axis. The figure on the right adds an irrelevant horizontal axis which destroys the grouping and makes many points nearest neighbors of the opposite class.
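A quick leave-one-out 1-nearest-neighbour experiment (synthetic data of my own construction) shows the effect: one informative feature classifies almost perfectly, and adding a single large-scale irrelevant feature drags accuracy toward chance because it dominates the distances.

import numpy as np

rng = np.random.default_rng(2)

# Two classes well separated on the informative (vertical) axis, as in the figure.
n = 100
labels = rng.integers(0, 2, n)
informative = labels * 2.0 + rng.normal(0, 0.3, n)   # separates the classes
irrelevant = rng.normal(0, 5.0, n)                   # pure noise on a large scale

def loo_1nn_accuracy(X, y):
    # Leave-one-out accuracy of a 1-nearest-neighbour classifier.
    correct = 0
    for i in range(len(y)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                                # exclude the point itself
        correct += y[d.argmin()] == y[i]
    return correct / len(y)

X_good = informative.reshape(-1, 1)
X_noisy = np.column_stack([informative, irrelevant])
print("1-NN accuracy, informative feature only:", loo_1nn_accuracy(X_good, labels))
print("1-NN accuracy, with irrelevant feature: ", loo_1nn_accuracy(X_noisy, labels))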
6. Basis functions:
How non-linear basis functions turn a low dimensional classification problem without a linear boundary into a high dimensional problem with a linear boundary. From SVM tutorial slides by Andrew Moore: a one dimensional non-linear classification problem with input x is turned into a 2-D problem z=(x, x^2) that is linearly separable.
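A tiny sketch of that lifting step, with a made-up nine-point data set: no threshold on x alone separates the inner class from the outer one, but in the space z = (x, x^2) the line x^2 = 1 does.

import numpy as np

# Toy 1-D problem in the spirit of the slide: the positive class sits between
# the two clumps of the negative class, so no single threshold on x works.
x = np.array([-3.0, -2.5, -2.0, -0.5, 0.0, 0.5, 2.0, 2.5, 3.0])
y = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])            # 1 = inner class, 0 = outer class

# No threshold on x alone separates the classes:
separable_1d = any(
    ((x <= t) == y).all() or ((x > t) == y).all() for t in np.sort(x)
)
print("linearly separable in x alone:", bool(separable_1d))

# Lift to z = (x, x^2): the inner class has small x^2, the outer class large x^2,
# so the horizontal line x^2 = 1 is a perfect linear boundary in the 2-D space.
z2 = x ** 2
print("linearly separable in (x, x^2):", bool(((z2 < 1.0) == y).all()))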
7. Discriminative vs. Generative:
Why discriminative learning may be easier than generative: PRML Figure 1.27. Note that the left-hand mode of the class-conditional density p(x|C1), shown in blue on the left plot, has no effect on the posterior probabilities. The vertical green line in the right plot shows the decision boundary in x that gives the minimum misclassification rate.
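The point can be checked numerically. In the hypothetical setup below (a bimodal p(x|C1) against a single-Gaussian p(x|C2) with equal priors; the densities are my own choices, not Bishop's), removing the left-hand mode leaves the posterior's 0.5 crossing, i.e. the decision boundary, essentially where it was, which is why a discriminative model need not spend capacity on that structure.

import numpy as np

def normal_pdf(x, mean, sd):
    return np.exp(-(x - mean) ** 2 / (2 * sd ** 2)) / (sd * np.sqrt(2 * np.pi))

x = np.linspace(-4, 6, 1001)
prior = 0.5                                  # equal class priors

# Assumed class-conditionals: p(x|C2) is a single Gaussian on the right.
p_x_c2 = normal_pdf(x, 3.0, 0.8)

def posterior_c1(p_x_c1):
    return p_x_c1 * prior / (p_x_c1 * prior + p_x_c2 * prior)

bimodal = 0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 1.0, 0.7)
unimodal = normal_pdf(x, 1.0, 0.7)           # same right-hand mode, left-hand mode removed

for name, p in [("bimodal p(x|C1)", bimodal), ("left mode removed", unimodal)]:
    post = posterior_c1(p)
    boundary = x[np.argmin(np.abs(post - 0.5))]   # where p(C1|x) crosses 0.5
    print(f"{name}: decision boundary near x = {boundary:.2f}")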
8. Loss functions:
Learning algorithms can be viewed as optimizing different loss functions: PRML Figure 7.5. Plot of the ‘hinge’ error function used in support vector machines, shown in blue, along with the error function for logistic regression, rescaled by a factor of 1/ln(2) so that it passes through the point (0, 1), shown in red.
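A few lines of numpy make the comparison explicit; the margin grid is arbitrary, and the 0-1 misclassification error is included only for reference.

import numpy as np

z = np.linspace(-2, 2, 9)                       # z = y * f(x), the classifier margin
hinge = np.maximum(0.0, 1.0 - z)                # SVM hinge loss
logistic = np.log1p(np.exp(-z)) / np.log(2.0)   # logistic loss rescaled by 1/ln(2)
misclass = (z <= 0).astype(float)               # 0-1 misclassification error, for reference

print("     z   hinge  logistic  0-1")
for zi, h, l, m in zip(z, hinge, logistic, misclass):
    print(f"{zi:6.2f}  {h:5.2f}   {l:6.2f}   {m:3.0f}")
# At z = 0 the rescaled logistic loss equals 1, matching the point (0, 1) in the figure.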
9. Geometry of least squares:
ESL Figure 3.2. The N-dimensional geometry of least squares regression with two predictors. The outcome vector y is orthogonally projected onto the hyperplane spanned by the input vectors x1 and x2. The projection ŷ represents the vector of the least squares predictions.
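A short numerical check of that geometry (random data of my own, two predictor columns): the least squares fit is the orthogonal projection of y onto the span of x1 and x2, so the residual is orthogonal to both input vectors.

import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: N = 50 observations, two predictors x1 and x2.
N = 50
X = rng.normal(size=(N, 2))                      # columns are the input vectors x1, x2
y = X @ np.array([2.0, -1.0]) + rng.normal(0, 0.5, N)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # least squares coefficients
y_hat = X @ beta                                 # orthogonal projection of y onto span{x1, x2}
residual = y - y_hat

# The residual is orthogonal to both input vectors (up to numerical precision),
# which is exactly the geometry in the figure.
print("residual . x1 =", residual @ X[:, 0])
print("residual . x2 =", residual @ X[:, 1])
print("projection beats a perturbed fit:",
      np.linalg.norm(residual) <= np.linalg.norm(y - X @ (beta + 0.1)))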