Inspect the Site Using Developer Tools
00:00 Here we are in the final lesson about inspecting the page, and we’re going to do a deep dive now by using developer tools. I’m going to head back over to the search results. Here we are.
00:12 I’m currently using Chrome, so your menu items might be a bit different to access the developer tools, but each modern browser contains developer tools, so some equivalent exists for anything that you’re using, be it Safari or Firefox.
00:28 In Chrome, you head over here, you say View, and then there’s the Developer option, and here you can open up Developer Tools.
00:35 There are also keyboard shortcuts that you can use, or you can right-click and say Inspect. So, each of those things opens up the developer tools.
00:46 You can see there’s a lot of stuff going on here. We’re not too concerned with all of it, so I’m going to click some of it away. What we’re mostly interested in are the elements here. And you can see, I’m going over these with my mouse here, and then you can see the corresponding things light up on the left on the page.
01:05 So what I can see here is the page structure. That’s the Document Object Model of the page that I’m inspecting here, and you can just think of it essentially as the HTML that builds the page.
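To make the idea of a nested page structure concrete, here is a minimal sketch that prints an indented outline of the tags in a small HTML snippet, roughly the way the Elements panel shows them. The snippet is a made-up, heavily simplified stand-in for a results page (the class name "card" is an assumption, not taken from the real site), and it uses only Python's built-in html.parser module.

```python
from html.parser import HTMLParser

# Hypothetical, simplified HTML. Only id="resultsCol" and class="title"
# come from the lesson; the rest is invented for illustration.
SAMPLE_HTML = """
<html>
  <body>
    <table>
      <tr>
        <td id="resultsCol">
          <div class="card"><h2 class="title">Job A</h2></div>
          <div class="card"><h2 class="title">Job B</h2></div>
        </td>
      </tr>
    </table>
  </body>
</html>
"""


class OutlinePrinter(HTMLParser):
    """Collect one indented line per opening tag.

    Assumes every tag in the input has a matching closing tag
    (no void elements such as <br> or <img>).
    """

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(f' {name}="{value}"' for name, value in attrs)
        self.lines.append("  " * self.depth + f"<{tag}{attr_text}>")
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


printer = OutlinePrinter()
printer.feed(SAMPLE_HTML)
outline = printer.lines
print("\n".join(outline))
```

Each level of indentation in the output corresponds to one level of nesting, which is the same relationship you see when you expand and collapse elements in the developer tools.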
01:18 Now, there’s a couple of ways that you can inspect it more. For example, you can click this little thing up here that gives you an option to hover over our page here, and then if you click it, you can see it corresponds over in the HTML structure.
01:34 You can click it and it takes you to where this specific element is inside of the code. And it works the other way around as well, I believe: with highlighting, you can hover over an element in the code and see where it highlights on the page. But this is a nice way if you’re, let’s say, somewhere deep down on the page, you don’t know where you are, and you want to figure out where this is in the code. I can go ahead and click this and then it takes me there and highlights this. Okay!
02:00 So, with this power of inspecting the page, we want to remember what was said earlier: what you actually want to get from this page is the information inside of these cards, here, on the left side.
02:13 It looks like it’s a column that essentially has a list of these cards in there with the information about the jobs. So, let’s see if you can find this HTML element using the developer tools.
02:25 If I hover around here, this looks like a single card. Can I see all of the cards? So, how would I highlight all of this? There’s something… look at this. So if I click… oh! I might have missed it, but it’s okay. Up here as well, you can see we have resultsTop if I hover over it in the code, and that’s not really what we’re looking for, but if I go a bit deeper… what do we have here? Still not. "jobPostingsAnchor", okay. So here, I have one card, but I want to be able to access all of the cards, so I have to go a bit higher up, and here it is.
03:03 So, this looks like it’s a <td> element with the id="resultsCol" (results column). I clicked it, so it’s highlighted, and I can see that if I’m on this, this seems to contain all the information that we’re interested in.
03:17 So this element <td> with the id="resultsCol" for this specific page contains the information that I’m interested in scraping from here for now.
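As a rough illustration of what "targeting" this element with code can look like, the sketch below collects the text inside the element whose id is "resultsCol" and ignores everything outside of it. The HTML is a made-up miniature (the "sidebar" id, the "card" class, and the job names are assumptions), and the parser is Python's built-in html.parser, which is not necessarily the tool used later in the course.

```python
from html.parser import HTMLParser

# Hypothetical HTML. Only id="resultsCol" comes from the lesson.
SAMPLE_HTML = """
<table><tr>
  <td id="sidebar">Other content</td>
  <td id="resultsCol">
    <div class="card">Python Developer</div>
    <div class="card">Data Engineer</div>
  </td>
</tr></table>
"""


class ElementTextById(HTMLParser):
    """Collect the text found inside the element with a given id.

    Assumes well-formed input where every tag is explicitly closed.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # 0 means outside the target element
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Already inside the target: go one level deeper.
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            # Found the opening tag of the target element.
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.texts.append(text)


finder = ElementTextById("resultsCol")
finder.feed(SAMPLE_HTML)
results_text = finder.texts
print(results_text)
```

The id is what makes this possible: because you identified it in the developer tools, the code can skip the rest of the page and look only inside the results column.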
03:29 Okay. So you can see there’s a lot of HTML code in here and it’s easy to get a bit confused about it, but just keep in mind that the main thing that you want to identify here is: How are you going to be able to address the information that you’re interested in? And one of the main pieces here is like, okay, so this is the whole column that contains all of the information of the search results. That’s interesting.
03:52 And then another thing that you can go ahead and inspect by yourself some more is going to be these specific cards. So, one of these cards…
04:02 you can go in here and expand it and then you can see that there’s a bunch of different elements in here that are all interesting for us. We have the "title", there’s a link that’s going to be interesting, and then also the name of the job, then we have something else down here in the class "sjcl" (I’m not sure what that means), but we can see that it contains the name of the company,
04:27 which might be interesting, and it contains also the location…
04:34 down here, I believe. Let’s see. And you see, I’m just switching between scrolling directly inside of the HTML code here,
04:44 and then also inspecting an element by clicking on it in the page directly.
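Here is a minimal sketch of how the pieces inside one card could be picked out once you know where they are nested. The class names "title" and "sjcl" are the ones seen in the developer tools above; the "card", "company", and "location" class names, the links, and all the job data are assumptions made up for this example. It again uses only Python's built-in html.parser.

```python
from html.parser import HTMLParser

# Hypothetical HTML modeled loosely on the card structure described above.
SAMPLE_HTML = """
<td id="resultsCol">
  <div class="card">
    <h2 class="title"><a href="/job/1">Python Developer</a></h2>
    <div class="sjcl">
      <span class="company">Acme Corp</span>
      <span class="location">Berlin</span>
    </div>
  </div>
  <div class="card">
    <h2 class="title"><a href="/job/2">Data Engineer</a></h2>
    <div class="sjcl">
      <span class="company">Example Ltd</span>
      <span class="location">Remote</span>
    </div>
  </div>
</td>
"""


class CardParser(HTMLParser):
    """Build one dict per card with its title, link, company, and location.

    Assumes well-formed input where every tag is explicitly closed.
    """

    FIELDS = ("title", "company", "location")

    def __init__(self):
        super().__init__()
        self.open_classes = []  # class attribute of each currently open tag
        self.cards = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class", "")
        self.open_classes.append(css_class)
        if css_class == "card":
            # A new card starts: collect its fields in a fresh dict.
            self.cards.append({})
        elif tag == "a" and "title" in self.open_classes and self.cards:
            # The link nested inside the title element.
            self.cards[-1]["link"] = attrs.get("href")

    def handle_endtag(self, tag):
        if self.open_classes:
            self.open_classes.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.cards:
            return
        for field in self.FIELDS:
            if field in self.open_classes:
                self.cards[-1][field] = text


parser = CardParser()
parser.feed(SAMPLE_HTML)
cards = parser.cards
for card in cards:
    print(card)
```

Notice that the code relies entirely on the nesting you explored by hand: the link lives inside the title element, and the company and location live inside the "sjcl" element.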
04:50 Your task is to explore one of these elements some more—now by yourself. Make sure that you have somewhat of an understanding of what’s going on. Where is something nested inside?
04:59 What are the pieces of information that you’re interested in? And don’t get too overwhelmed with all the HTML code, because we’re going to use Python to parse through this mess here and just pick out the information much easier than it is by just going through it with your eyes. But having an understanding of where things are located and where does the information live that you’re interested in is really important, because you’re going to need to target it specifically with code. And that’s all for this lesson. I hope this helps. Keep playing around with this and just keep in mind that these developer tools are pretty powerful.
05:35 They give you a way of understanding the structure of your website, and that is very helpful for your web scraping.