Boosting is based on the following question: “Can a set of weak learners create a single strong learner?” A weak learner (or weak classifier) is defined as a classifier that is only slightly better than random guessing.
In face detection, this means that a weak learner can classify a subregion of an image as a face or not-face only slightly better than random guessing. A strong learner is substantially better at picking faces from non-faces.
The power of boosting comes from combining many (thousands of) weak classifiers into a single strong classifier. In the Viola-Jones algorithm, each Haar-like feature represents a weak learner. To decide the type and size of each feature that goes into the final classifier, AdaBoost checks the performance of all the classifiers that you supply to it.
To calculate the performance of a classifier, you evaluate it on all subregions of all the images used for training. Some subregions will produce a strong response in the classifier. Those will be classified as positives, meaning the classifier thinks they contain a human face.
Subregions that don’t produce a strong response don’t contain a human face, in the classifier’s opinion. They will be classified as negatives.
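To make the idea of a "strong response" concrete, here is a minimal sketch of one weak classifier built from a two-rectangle Haar-like feature. The function names, the toy patches, and the threshold value are all hypothetical choices for illustration. In real training the threshold is learned from data, and the rectangle sums are normally computed far more efficiently than with the plain array slices used here.

```python
import numpy as np

def two_rect_response(window, top, left, height, width):
    """Response of a horizontal-edge feature: lower half minus upper half."""
    half = height // 2
    upper = window[top:top + half, left:left + width].sum()
    lower = window[top + half:top + 2 * half, left:left + width].sum()
    return int(lower) - int(upper)

def weak_classify(response, threshold):
    """Positive (1) if the response is strong enough, otherwise negative (0)."""
    return 1 if response > threshold else 0

# Toy 4x4 patches: one with a dark-over-bright edge, one completely flat.
edge_patch = np.vstack([np.full((2, 4), 10), np.full((2, 4), 200)])
flat_patch = np.full((4, 4), 100)

THRESHOLD = 500  # hypothetical value, not taken from a trained detector

edge_response = two_rect_response(edge_patch, 0, 0, 4, 4)
flat_response = two_rect_response(flat_patch, 0, 0, 4, 4)
print(edge_response, weak_classify(edge_response, THRESHOLD))
print(flat_response, weak_classify(flat_response, THRESHOLD))
```

The patch with the edge produces a large response and is labeled positive, while the flat patch produces a response of zero and is labeled negative.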
The classifiers that performed well are given higher importance or weight. The final result is a strong classifier, also called a boosted classifier, that contains the best performing weak classifiers.
The algorithm is called adaptive because, as training progresses, it puts more emphasis on the images that were incorrectly classified. The weak classifiers that perform better on these hard examples are weighted more strongly than the others.
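The adaptive step can be sketched for a single round as follows. This uses the textbook discrete AdaBoost update with labels of +1 and -1, which is an assumption here: the exact bookkeeping in a Viola-Jones implementation may be written differently. The function name `adaboost_round` and the five-sample data set are made up for the example.

```python
import numpy as np

def adaboost_round(weights, predictions, labels):
    """One reweighting step of discrete AdaBoost (assumes 0 < error < 1)."""
    missed = predictions != labels
    error = weights[missed].sum()              # weighted error of this classifier
    alpha = 0.5 * np.log((1 - error) / error)  # importance of this classifier
    # Increase the weight of missed samples, decrease the weight of correct ones.
    new_weights = weights * np.exp(np.where(missed, alpha, -alpha))
    return alpha, new_weights / new_weights.sum()

labels = np.array([1, 1, 1, -1, -1])
predictions = np.array([1, 1, -1, -1, -1])  # the third sample is misclassified
weights = np.full(5, 0.2)                   # every sample starts out equal

alpha, weights = adaboost_round(weights, predictions, labels)
print(alpha)
print(weights)
```

After this round the single missed sample carries half of the total weight, so the next weak classifier is strongly rewarded for getting it right.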
00:00 At this point, we know that we can use specific Haar-like features to identify parts of faces like the eyes and the nose, but what size should we make these features? Well, as it turns out, in a single 24 pixel by 24 pixel window, there are over 160,000 combinations of these features that we can use.
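As a rough check on that number, the sketch below counts every position and size of an upright Haar-like feature in a 24 by 24 window. It assumes the five commonly used base shapes (two-rectangle features in both orientations, three-rectangle features in both orientations, and the four-rectangle feature) and that each shape is scaled in whole multiples of its base size. Other feature sets or scaling rules would give a different total.

```python
def count_haar_features(window=24,
                        base_shapes=((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))):
    """Count all scaled and shifted placements of each base shape."""
    total = 0
    for base_w, base_h in base_shapes:
        # Widths and heights grow in whole multiples of the base shape.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                # Number of top-left corners where a w-by-h feature still fits.
                total += (window - w + 1) * (window - h + 1)
    return total

print(count_haar_features())  # 162336
```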
00:23 Trying out all of these features would be very slow, so we need a way to figure out which of these features will be actually useful in detecting faces—or, for that matter, any other object we want.
00:58 The AdaBoost algorithm takes in what are known as weak classifiers, or things that are only slightly better at detecting a face than just randomly guessing. In our case, these weak classifiers are all the possible Haar-like features, all 160,000 of them.
01:17 When we supply the algorithm with lots of training data—which, in this case, is lots of pictures with faces distinguished from those with just the background—the algorithm will return to us just several thousand features that are ideal for face detection. That’s a lot better than 160,000.
01:37 And here’s how that works. Imagine that we need to classify blue and orange circles in this image using a set of weak classifiers. The first classifier we use captures some of the blue circles, but it misses some of the others. We assign this weak classifier a medium weight. You can think of this like a mediocre score: it did all right, but it still missed some.
02:03 Now, we give more importance to the missed blue circles as indicated by the fact that they are now bigger, and we run the next classifier. This one manages to capture all of the blue circles in the image.
02:16 It performs better, so it earns a higher weight. But it incorrectly captured some orange dots, so we give those more importance in the next run. Finally, our last classifier manages to capture those orange circles correctly.
02:31 It will get a high score too. Once we add all three of these classifiers together, we get a way to separate the blue circles from the orange ones. Any low-scoring classifiers, the ones that weren’t that useful, will be discarded. Going back to our Viola-Jones framework, each of the 160,000 Haar-like features is considered to be a weak classifier. By running these through AdaBoost one by one, we get what is called a strong classifier. This strong classifier, which is very likely to classify a face correctly, is composed of only the best weak classifiers.
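The three-classifier walkthrough above can be mimicked with a small sketch. It is not the Viola-Jones training code: it uses textbook discrete AdaBoost with simple threshold rules ("stumps") standing in for Haar-like features, and a made-up one-dimensional data set in which +1 plays the role of the blue circles and -1 the orange ones. All function names are hypothetical.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Weak classifier: +1 on one side of the threshold, -1 on the other."""
    return np.where(polarity * X[:, feature] < polarity * threshold, 1, -1)

def train_adaboost(X, y, rounds):
    weights = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        # Pick the weak classifier with the lowest weighted error.
        best = None
        for feature in range(X.shape[1]):
            for threshold in np.unique(X[:, feature]):
                for polarity in (1, -1):
                    pred = stump_predict(X, feature, threshold, polarity)
                    error = weights[pred != y].sum()
                    if best is None or error < best[0]:
                        best = (error, feature, threshold, polarity, pred)
        error, feature, threshold, polarity, pred = best
        error = np.clip(error, 1e-10, 1 - 1e-10)   # avoid division by zero
        alpha = 0.5 * np.log((1 - error) / error)  # better classifier, higher weight
        # Missed samples gain importance for the next round.
        weights = weights * np.exp(-alpha * y * pred)
        weights /= weights.sum()
        model.append((alpha, feature, threshold, polarity))
    return model

def strong_predict(model, X):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in model)
    return np.where(score >= 0, 1, -1)

# Toy data that no single threshold can separate.
X = np.arange(10).reshape(-1, 1)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])

one_round = train_adaboost(X, y, rounds=1)
three_rounds = train_adaboost(X, y, rounds=3)
print((strong_predict(one_round, X) == y).mean())
print((strong_predict(three_rounds, X) == y).mean())
```

On this data the best single stump gets 7 of the 10 samples right, while the weighted vote of three stumps classifies all 10 correctly. Weak classifiers that are never selected simply do not appear in the final model.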
03:12 These classifiers are the Haar-like features that scored the highest. Great! Now we have a strong classifier composed of many Haar-like features that together will likely classify a face. From here, we can move on to the last step in the Viola-Jones algorithm, the classifier cascade.