Join us and get access to thousands of tutorials and a community of expert Pythonistas.

This lesson is for members only. Join us and get access to thousands of tutorials and a community of expert Pythonistas.

Unlock This Lesson

Integral Images

Traditional Face Detection Using Python Austin Cepalia 03:54

An integral image (also known as a summed-area table) is the name of both a data structure and an algorithm used to obtain this data structure. It is used as a quick and efficient way to calculate the sum of pixel values in an image or rectangular part of an image.

In an integral image, the value of each point is the sum of all pixels above and to the left, including the target pixel. The integral image can be calculated in a single pass over the original image. This reduces summing the pixel intensities within a rectangle into only three operations with four numbers, regardless of rectangle size.

00:00 Calculating potentially hundreds and thousands of Haar-like features at every position in a high-resolution image would take a very long time, even for a fast computer.

00:11 It doesn’t sound like summing up pixels would be that hard because it’s just addition, but it becomes a lot slower than you might think. In fact, it’s an O(N^2) operation. To speed this up, Viola and Jones used what are called integral images. An integral image, also known as a summed area table, is the name of both a data structure and an algorithm used to obtain this data structure.

00:38 It is used as a quick and efficient way to calculate the sum of pixel values within an image. In an integral image, the value of each pixel is the sum of all pixels above and to the left,

00:53 including the target pixel. So in this example, we have our original grayscale image on the left and our new integral image on the right. The top-left corner maps to just 1, because 1 plus nothing else is just 1.

01:11 Then, if we move down and one pixel to the right of the integral image, we get 20 because in the original image, 1 + 3 + 12 + 4 = 20. Then, as you see in the picture, we can do the same with the next pixel down diagonally. 1 + 3 + 7 + 12 + 4 + 8 + 0 + 14 + 16 = 65.

01:40 After our original image has been converted to grayscale, we can run this algorithm over the entire image just once. This will give us a large integral image that represents the original image where every pixel’s value represents the sum of all the pixels above it and to the left.

02:00 Now if we’re trying to calculate the value of a Haar-like feature, we can do that with much fewer calculations. Take a look at this region of pixels right here on the left.

02:11 How do we quickly calculate this sum? The sum of the bottom-right four pixels is equal to the sum of the whole region, 113, minus 50, which will cut out the entire left half of the region, and minus 42, which will cut out the entire top half.

02:30 But now we’ve subtracted the top-left four pixels twice, so we add those 20 pixels back in. 113 - 50 - 42 + 20 = 41, which is the sum of the bottom-right four pixels that we’re looking for.

02:49 Now, at this point, you might be asking, “Well, how was this any faster?” Well, in the last example, it really wasn’t. Realistically, it takes a computer as long to sum 65 + 81 + 87 + 113 as it does to do 113 - 50 - 42 + 20, and the first computation doesn’t even require the initial integral image calculation at all.

03:16 But what if we needed to calculate a rectangular region of 6, 10, or even 100 pixels? Our original method would require us to sum up the values of all of those pixels. But with our integral image method, we only need to add or subtract four values, no matter how many pixels we’re working with. The original method grows with the number of pixels we’re summing, but this integral image method is always just dealing with four numbers. Because of that, it’s a constant time operation, or O(1).