Find Elements by HTML Class Name

Web Scraping With Beautiful Soup and Python Martin Breuss 04:53

00:00 You’ve identified the column that contains all the information, and now you want to drill down a bit more by actually finding the elements for the boxes that contain all of the job descriptions. We’re going to do that by HTML class name.

00:14 Heading back over to the site, I’m going to do a bit more inspecting. So, let’s see. When I click on this,

00:23 I see that each of those cards looks like that. We have a class="jobsearch-" […] and so on. Okay. So, this has a couple of classes here, right?

00:33 And seems like there’s also—let me see that.

01:17 Now, in this case, I know they all have 'result' in here as one of the class names. And that’s a pretty understandable one for me because I’m looking for results, so this is the one that I pick here. I’m going to say .find_all(). So, this is a bit different than the .find() one up here, which is going to return only one element.

01:37 .find_all() is going to return a list of elements, and I’m giving it class_='result'. And what I’m doing additionally here is restricting the kind of element, because classes in HTML can be applied to all sorts of elements, while an ID is unique and can only be applied to one element.

01:53 So, for the ID up here, .find() makes a lot of sense. A class could give me different results. And I don’t really have a complete understanding of this whole HTML in here, so I don’t know whether there is maybe some other HTML element somewhere down there that also has the class 'result'. That’s always possible.
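The difference between the two calls can be sketched on a tiny, made-up page. The HTML below and the ID results-column are assumptions for illustration only, not the markup of the site in the video; only .find(), .find_all(), and class_ come from the lesson.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: the id and the markup are invented.
html = """
<div id="results-column">
  <div class="result">Job A</div>
  <div class="result">Job B</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# .find() returns only the first match, or None when nothing matches.
# That fits an ID, which should appear once per page.
results = soup.find(id="results-column")

# .find_all() returns a list-like collection of every match.
# That fits a class, which can appear on many elements.
jobs = results.find_all(class_="result")

print(type(results).__name__)  # Tag
print(len(jobs))               # 2
```

Because .find() can return None, it is worth checking the result before calling .find_all() on it when working against a live page.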

02:11 So, I want to restrict it some more and I say, I want to find only <div> HTML elements that have the class 'result'. So if there’s a paragraph (<p>) somewhere, or an emphasis tag (<em>), or something like that with the same class name, it’s not going to be included in this list. Only <div> elements with that specific class are.
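Here is a minimal sketch of that restriction. The sample HTML is invented so that a <p> and an <em> share the class name with the cards; the real page may contain no such elements.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: non-<div> elements deliberately reuse the class 'result'.
html = """
<div id="results-column">
  <div class="result">Job A</div>
  <p class="result">Not a job card</p>
  <div class="result">Job B</div>
  <em class="result">Also not a job card</em>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
results = soup.find(id="results-column")

# Filtering by class alone matches every tag type that carries the class.
any_tag = results.find_all(class_="result")

# Passing the tag name first keeps only <div> elements with that class.
jobs = results.find_all("div", class_="result")

print([tag.name for tag in any_tag])  # ['div', 'p', 'div', 'em']
print([tag.name for tag in jobs])     # ['div', 'div']
```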

02:59 So, this gives you an easy way to keep digging deeper into the HTML structure, step by step. Okay. And in this case, I’m saving all of the results, all of the card elements, into jobs.

03:11 Let’s see how many there are. So, there seem to be 15 on this page. That could be right. 1, 2, 3, 4…

04:09 So, we have a list of Beautiful Soup elements that each contain one of those cards, and each card contains the information that we’re looking for. So, we’re moving ahead, and you’ve learned how to use class_ to filter by class name.
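As a rough sketch of what ends up in jobs, the example below counts the matches and loops over them. The markup and the extra class names (card, featured, ad) are made up. It also shows that class_='result' matches an element that carries several classes, as long as 'result' is one of them.

```python
from bs4 import BeautifulSoup

# Hypothetical cards: each <div> has more than one class, as on the real site.
html = """
<div id="results-column">
  <div class="card result">Python Developer</div>
  <div class="card result featured">Data Engineer</div>
  <div class="card ad">Sponsored</div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")
results = soup.find(id="results-column")
jobs = results.find_all("div", class_="result")

print(len(jobs))  # 2

for job in jobs:
    # Each item is a Tag; its "class" attribute is a list of class names.
    print(job["class"], job.get_text(strip=True))
```

If len(jobs) comes back as 0 on a live page, the class names have probably changed since the video was recorded, and inspecting the page again is the way to find the current ones.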

jramirez857 on Dec. 31, 2021

Great video! I had to use

jobs = results.find_all('div', class_='slider_container')

rather than

jobs = results.find_all('div', class_='result')

to get a single job posting.

Jose3XL on Jan. 23, 2022

Thank you jramirez857! I couldn’t figure this out, kept getting an error.
