Distinguishing Features of kNN
00:09 kNN is a supervised learning algorithm. In general, supervised learning algorithms learn from training data in order to make predictions. You’ll start with a model and feed it data points that will have been labeled with the kinds of target values you’re interested in predicting.
00:36 Trained models can then go on to make predictions for new data points. Later, you may have some new data that do not have target labels. You can pass these values into your trained supervised learning model, and the output will be predicted targets. This phase is called predicting.
00:54 kNN works just like this. You’ll pass some training data with known target values into it, and then you’ll be able to use that trained model to make predictions for new data observations that perhaps do not have target labels.
01:11 There are two types of supervised learning problems, classification and regression, and kNN can be used for either type. In classification, you’re trying to build a model that makes categorical predictions, such as whether an observation is a square, circle, or triangle.
01:29 Or, for a real-world example, you might train a classification model to predict whether or not a customer will click on your ad. Regression problems, on the other hand, make numerical predictions, or continuous values from a number line.
01:44 Now your question might be, how much money will a customer spend next month? Since the predictions for this question are dollar amounts rather than discrete categories, this is a regression problem.
02:08 Machine learning algorithms can be linear or nonlinear. Linear models make predictions that follow lines or hyperplanes, whereas nonlinear models do not. For example, say you have classification data like bees.
02:23 Height and width are your input variables. You’re trying to predict if the target is a triangle, cross, or star. If you build a linear model to make this prediction, you could end up with a line like this one, which divides the input space into triangles versus non-triangles.
02:49 If you train a nonlinear model, however, you’ll likely end up with a more complicated classification boundaries. For example, you might create this decision tree model. It says if a point’s height is low, this point is predicted to be a triangle.
03:15 kNN is a nonlinear algorithm. For classification problems, that means it can make more complicated decision boundaries that don’t necessarily follow lines or hyperplanes. For regression problems, its predictions don’t need to follow linear or even monotonic relationships with respect to its inputs.
03:38 And finally, some machine learning algorithms work by learning model parameters. For example, linear regression models are built by finding the best parameters to summarize the training data that they’re exposed to.
04:12 kNN, on the other hand, is a nonparametric algorithm. That means it does not have model parameters that need to be fitted during training. Predictions are made directly from the training data itself.
04:24 This nonparametric nature makes kNN a highly flexible algorithm since it doesn’t attempt to force any model form onto the data. However, to be able to make predictions, a trained kNN model must retain the entire training set.
04:52 In summary, kNN is a machine learning algorithm. It is considered a supervised algorithm because it learns from data with labeled target values. kNN can be used for either classification or regression problems.
Become a Member to join the conversation.