Explore the Website
00:10 Inside this Jupyter Notebook, which you also have access to in the course materials, there’s a link to indeed.com down here, under the heading Explore the Website, so that’s what I’m going to follow. I just Command + click it to open it up in a new tab, and then take a look at it!
You can see that this one links to the /worldwide Indeed page, which is just a distinction because there are country-specific ones that you could click on, so feel free to use your own country if you’re interested in searching in your own country, in your own language. I believe that all of the structures of the websites are the same, so the scraping should work anywhere. They just have somewhat different URL endpoints. If you use this /worldwide one, it’s going to search in the US. So, I’m going to start off by typing my search here.
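As a small aside, here is a minimal sketch of how a page path such as /worldwide attaches to a site’s base address. The `https://www.` prefix and the `page_url` helper are assumptions made for illustration; only indeed.com and /worldwide come from the lesson, and the country-specific addresses are deliberately not guessed at here.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: the scheme and "www" host are illustrative; the lesson only
# names indeed.com and the /worldwide page.
BASE_URL = "https://www.indeed.com"


def page_url(path: str) -> str:
    """Join a site-relative path onto the base URL (hypothetical helper)."""
    return urljoin(BASE_URL, path)


worldwide = page_url("/worldwide")
print(worldwide)

# urlsplit breaks the address back into parts, which helps when comparing
# how different pages of the same site are addressed.
parts = urlsplit(worldwide)
print(parts.netloc, parts.path)
```

Keeping the base address in one constant means that switching to a different entry point later only requires a change in one place.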
01:17 So, what I can see is that there are these blocks here on the side that contain some information, and it looks like the first one is highlighted. My assumption is that the information here on the right relates to that highlighted one, and it fits with the heading here as well. So if I click on this other one, I expect that the information on the right is going to change to the new job, and it does indeed. I get some information about the company here.
01:48 I’m also going to dismiss this down there so we have a bit more screen space. And I can keep scrolling and clicking on these and just get information about the jobs, and then I also have some other options here. It looks like I can apply, which is relevant for me if I’m looking for a job. If I click this, I expect it’s going to take me to a different page related to that specific company. And yeah, it does.
02:24 So most of those cards are going to have some possibility to apply. That’s something interesting. And then there’s also this option—it looks like I can star them or heart them. For this, I probably need some kind of account on Indeed—I need to sign up or create an account, and then I can probably save these. Yeah. Okay.
02:56 And for now, it looks like we’re going to focus on this part here: this column, essentially, that has all the search results. Each result has a job title, a location, and a bit of information, plus a link to even more information that opens up here on the side. If I can collect all of this information, that’s pretty cool, and that’s what I’m going to focus on right here.
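To make that target concrete, here is a rough sketch of the record that each search result could eventually become. The `JobCard` class, its field names, and the placeholder values are all hypothetical. They simply mirror the pieces observed on the page (title, location, short description, link) and are not taken from the site’s actual markup.

```python
from dataclasses import dataclass, asdict


@dataclass
class JobCard:
    """One search result as seen in the results column (hypothetical shape)."""

    title: str        # the job title shown on the card
    location: str     # where the job is located
    summary: str      # the short bit of information on the card
    detail_link: str  # link to the fuller description that opens on the side


# Placeholder values only; nothing here was scraped from the site.
example = JobCard(
    title="Example job title",
    location="Example City",
    summary="A short description shown on the card.",
    detail_link="/example-detail-path",
)

# A plain dict is convenient later for writing rows to CSV or JSON.
record = asdict(example)
print(record)
```

Writing down the fields before scraping anything gives you a checklist: while exploring the page, you can confirm that each field really appears on every card.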
03:22 So, feel free to explore the site some more, just in the way that I did now: scroll around, click around, think about what you expect, and see what clicking does. The better you understand your site, the easier it’s going to be to scrape it eventually. And also remember, this is just step one. In the next lesson, we’re going to look at the information that you can get out of the URLs up here.