Segmentation and Superimposition
00:00 Image Segmentation and Superimposition. In this section of the course, you’ll use the JPEG files, cat and monastery, which you can find in the course materials. On-screen, you can see the images, and as their names suggest, one is of a cat.
00:16 The other is of a cloister in a monastery. You can use Pillow to extract the cat from the first image and place it on the floor of the cloister. You’ll use a number of image processing techniques to achieve this.
00:29 You’ll start by working on the image of the cat. You’ll need to remove the cat from the background using image segmentation techniques. In this example, you’ll segment the image using thresholding. First, you can crop the image to a smaller one to remove some of the background.
00:47 Start a new REPL session and enter the code seen on-screen.
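The on-screen code isn't part of this transcript, so here is a minimal sketch of the cropping step. It assumes a small synthetic image as a stand-in for the cat photo, and the crop box is a placeholder, not the coordinates used in the video.

```python
from PIL import Image

# Stand-in for the cat photo so the sketch runs without the course files.
# With the course materials, open the cat JPEG with Image.open() instead.
img_cat = Image.new("RGB", (400, 300), color=(120, 90, 40))

# crop() takes a (left, upper, right, lower) box in pixel coordinates.
# These numbers are placeholders, not the values shown on-screen.
box = (100, 0, 300, 300)
img_cropped = img_cat.crop(box)

print(img_cropped.size)  # (200, 300)
```

The cropped image is as wide as `right - left` and as tall as `lower - upper`, so you can adjust the box until most of the background is gone.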
01:15 The cropped image contains the cat and some of the background that’s too close to the cat for you to remove by cropping with a rectangle. As you’ve already seen, each pixel in a color image is represented digitally by three numbers corresponding to the red, green, and blue values of that pixel.
01:32 Thresholding is the process of converting all the pixels to either the maximum or minimum value depending on whether they’re higher or lower than a certain number.
01:42 It’s easier to do this on a grayscale image.
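To make the idea concrete before bringing Pillow into it, here is a small sketch that thresholds a plain list of grayscale values. The function name `threshold_pixels` and the sample values are made up for illustration.

```python
def threshold_pixels(values, threshold, low=0, high=255):
    """Map each value to `high` if it is greater than `threshold`, else to `low`."""
    return [high if value > threshold else low for value in values]


# One row of hypothetical grayscale pixel values (0 is black, 255 is white).
row = [12, 99, 100, 101, 240]

print(threshold_pixels(row, 100))  # [0, 0, 0, 255, 255]
```

Note that a value equal to the threshold ends up black here, because the comparison is strictly greater than.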
01:55 You set a threshold value of 100,
02:03 and you achieve thresholding by calling .point() to convert each pixel in the grayscale image to either 255 or 0.
02:11 The conversion depends on whether the value in the grayscale image is greater or smaller than the threshold.
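The steps just described can be sketched as follows. This assumes a synthetic gray ramp as a stand-in for the cropped cat image, so the variable names and the test image are illustrative only.

```python
from PIL import Image

# Stand-in image: a 256-pixel-wide ramp from black (0) to white (255),
# built as an RGB image so there is something to convert to grayscale.
ramp = Image.frombytes("L", (256, 1), bytes(range(256)))
img = Image.merge("RGB", (ramp, ramp, ramp))

# Convert to grayscale, then threshold every pixel against 100.
img_gray = img.convert("L")
threshold = 100
img_threshold = img_gray.point(lambda x: 255 if x > threshold else 0)

# The pixel with value 100 turns black, the pixel with value 101 turns white.
print(img_threshold.getpixel((100, 0)), img_threshold.getpixel((101, 0)))  # 0 255
```

With a real photo, calling `img_threshold.show()` lets you compare the result against the grayscale image.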
02:25 Here you can see the grayscale image and the result from the thresholding process. All the points in the grayscale image that had a pixel value greater than 100 are converted to white, and all the other pixels are changed to black.
02:39 You can change the sensitivity of the thresholding by varying the threshold value. Thresholding can be used to segment images when the object to segment is distinct from the background, and you can achieve better results with versions of the original image that have higher contrast. In this particular case, you can achieve higher contrast by thresholding the blue channel of the original image rather than the grayscale one.
03:03 This is because the dominant colors in the background are brown and green, which have a weak blue component. You can extract the red, green, and blue channels from the color image as you did earlier on.
03:20 Here you show the red, green,
03:29 and blue channels. They’re shown on-screen side by side as grayscale images, with red, green, and blue from left to right. The blue channel has a higher contrast between the pixels representing the cat and those of the background, so you use the blue channel image for thresholding.
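One way to pull the channels apart with Pillow is `.split()`. The sketch below assumes a tiny synthetic image, with a brownish left half standing in for the background and a light gray right half standing in for the cat. The colors are invented to illustrate the contrast argument.

```python
from PIL import Image

# Stand-in image: brownish "background" with a light gray "cat" on the right.
img = Image.new("RGB", (4, 4), color=(150, 100, 30))
img.paste((200, 200, 200), (2, 0, 4, 4))

# split() returns one grayscale ("L") image per channel.
red, green, blue = img.split()

# Background vs. cat value in each channel: the gap is largest in blue.
print(red.getpixel((0, 0)), red.getpixel((3, 0)))    # 150 200
print(blue.getpixel((0, 0)), blue.getpixel((3, 0)))  # 30 200
```

On a real image, calling `.show()` on each channel displays it as a grayscale picture, which makes it easy to judge which one separates the subject best.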
03:50 You choose a threshold value of 57 and perform the thresholding on the blue channel of the original cat image.
04:07 You also convert the image into binary mode using "1" as an argument to .convert(). The pixels in a binary image can only have the values 0 or 1.
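Here is a rough sketch of thresholding the blue channel at 57 and converting the result to binary mode. It assumes a synthetic two-tone image in place of the cat photo, so the pixel colors are made up.

```python
from PIL import Image

# Stand-in image: brownish "background" with a light gray "cat" on the right.
img = Image.new("RGB", (4, 4), color=(150, 100, 30))
img.paste((200, 200, 200), (2, 0, 4, 4))

# Keep only the blue channel and threshold it at 57.
_, _, blue = img.split()
threshold = 57
img_threshold = blue.point(lambda x: 255 if x > threshold else 0)

# Convert the black-and-white result into 1-bit binary mode.
img_threshold = img_threshold.convert("1")

print(img_threshold.mode)  # 1
# Background pixel is off, "cat" pixel is on.
print(bool(img_threshold.getpixel((0, 0))), bool(img_threshold.getpixel((3, 0))))  # False True
```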
04:20 Note that when dealing with image formats, such as JPEG, that rely on lossy compression, the images may vary slightly depending on which JPEG decoders you are using.
04:29 Different operating systems often come with different default JPEG decoders. Therefore, the results that you get when processing images may vary slightly depending on the operating system and JPEG decoder that you are using. As a result, you may need to slightly alter the threshold value if your results don’t match the ones you see on-screen.
04:50 Talking of which, on-screen you can see the result of the thresholding. Finally, save the image to the working directory, as you’ll be needing it later on.
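Saving could look something like the sketch below. The filename `cat_threshold.png` is a hypothetical choice, and the sketch writes to a temporary folder rather than the working directory so it doesn't leave files behind. In your own session you'd pass a plain filename to save into the working directory.

```python
import tempfile
from pathlib import Path

from PIL import Image

# Stand-in for the thresholded binary image.
img_threshold = Image.new("1", (4, 4), color=1)

# Hypothetical filename, saved in a temporary folder for this sketch.
out_path = Path(tempfile.mkdtemp()) / "cat_threshold.png"
img_threshold.save(out_path)

# Reopen the file to confirm it was written.
reloaded = Image.open(out_path)
print(reloaded.size)  # (4, 4)
```

A lossless format such as PNG keeps the pixels exactly black or white, which matters if you plan to reuse the image as a mask later.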
05:06 You can certainly identify the cat in this black-and-white image, but you’d like to have an image where all the pixels that correspond to the cat are white and all other pixels are black. In this image, you still have some black areas on the cat, such as the eyes, nose, and mouth, and you also have some white areas in the background.
05:24 You can use two image processing techniques called erosion and dilation to create a better mask that represents the cat, and you’ll learn about these two techniques in the next section of the course.