Confirming the Script Works as Expected
00:00 We got to the solution, and it was a lot faster than before, when I had to parse through all the text. So let me clean that up and confirm that it works as expected.
00:13 Save this and run it once more. I get three URLs, and if I click on those, I get Aphrodite’s profile,
00:26 Poseidon’s profile, and Dionysus’ profile. Alright, so these are the URLs that actually point to those resources. And it’s important that you’re able to construct such URLs, because if you’re building a scraper, you might want to fetch links from an all-profiles page like the one you’ve seen in this case, and then construct full URLs from them so that you can also scrape those pages.
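As a rough illustration of constructing such URLs, here is a minimal sketch that joins relative links onto a base URL with the standard library's urljoin(). The base URL and the relative paths are placeholders assumed for this example, not values taken from the lesson, and build_absolute_urls() is a hypothetical helper name.

```python
from urllib.parse import urljoin

# Assumed for illustration: the base URL and the relative hrefs below
# are placeholders, not values taken from the lesson.
BASE_URL = "http://example.com"
relative_links = [
    "/profiles/aphrodite",
    "/profiles/poseidon",
    "/profiles/dionysus",
]


def build_absolute_urls(base_url, hrefs):
    """Join each relative href onto the base URL (hypothetical helper)."""
    return [urljoin(base_url, href) for href in hrefs]


profile_urls = build_absolute_urls(BASE_URL, relative_links)
for url in profile_urls:
    print(url)
```

Plain string concatenation would also work for simple cases like this one. urljoin() is used here because it handles leading and trailing slashes consistently.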
00:52 This is what web crawlers do: they follow the links that are on websites to keep getting information, going deeper and deeper into the Internet, basically.
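To make the idea of following links a bit more concrete, here is a small sketch of a crawler loop. It is not the lesson's code. To stay runnable without network access, it crawls a made-up in-memory "site" (the PAGES dictionary) and uses the standard library's html.parser instead of Beautiful Soup. All URLs, page contents, and the names LinkCollector and crawl() are assumptions for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "site" so the sketch runs without network access.
PAGES = {
    "http://example.com/profiles": (
        '<a href="/profiles/aphrodite">Aphrodite</a>'
        '<a href="/profiles/poseidon">Poseidon</a>'
    ),
    "http://example.com/profiles/aphrodite": '<a href="/profiles">Back</a>',
    "http://example.com/profiles/poseidon": '<a href="/profiles">Back</a>',
}


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, pages):
    """Visit pages breadth-first, following links and skipping repeats."""
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # unknown page: nothing to parse
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


crawl_order = crawl("http://example.com/profiles", PAGES)
print(crawl_order)
```

The seen set is what keeps the crawler from looping forever when pages link back to each other. A real crawler would replace the dictionary lookup with an HTTP request.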
01:02 Alright, and it’s a lot more fun to do the parsing with a nice library like BS4, because these BeautifulSoup objects are designed for parsing HTML text, right?
01:15 So you can do intuitive things, such as calling soup.find_all() and just passing in the tag name, and you get a list of all the link elements on a page as Beautiful Soup Tag objects. Then you can still do fun stuff like getting an attribute of an element by just using square bracket notation.
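Here is a minimal sketch of that pattern, assuming the beautifulsoup4 package is installed. The HTML string is invented sample markup, so the page used in the lesson may look different.

```python
from bs4 import BeautifulSoup

# Assumed sample markup; the real page in the lesson may differ.
html = """
<html><body>
<a href="/profiles/aphrodite">Aphrodite</a>
<a href="/profiles/poseidon">Poseidon</a>
<a href="/profiles/dionysus">Dionysus</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() takes a tag name and returns every matching element.
links = soup.find_all("a")

# Square bracket notation reads an attribute from each element.
hrefs = [link["href"] for link in links]
print(hrefs)
```

Square bracket access raises a KeyError when the attribute is missing, so link.get("href") may be a safer choice for pages where some anchor tags have no href.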
01:32 Cool. Alright, that solves this challenge, so let’s move on to the final exercise.