Understanding Relevant Terms
00:00 Let’s get some important terms out of the way before starting with the practical part. If you haven’t heard about it before, web scraping means extracting data from websites, usually in an automated manner, for example, with a Python script.
00:14 And this gives you access to a lot of information on the internet that you can use for data analysis, research, and other things. You’ll hear me talk about parsers and specifically HTML parsers. A parser is a tool that allows you to read and manipulate structured data formats, such as HTML.
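To make the idea of a parser a bit more concrete, here is a minimal sketch using `html.parser` from Python's standard library. It isn't part of the course material. The `TagCollector` class and the sample HTML string are made up for illustration; they only show how a parser turns raw markup into structured events that your code can react to.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records the start tags and text it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # Called once for every opening tag, such as <p> or <b>.
        self.tags.append(tag)

    def handle_data(self, data):
        # Called for the text between tags; skip whitespace-only chunks.
        if data.strip():
            self.text.append(data.strip())


# Made-up sample document, not taken from a real website.
html = "<html><body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>"

collector = TagCollector()
collector.feed(html)
collector.close()

print(collector.tags)
print(collector.text)
```

The parser walks through the string and reports each tag and piece of text in document order, which is the low-level work that a library like Beautiful Soup builds on.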
00:32 And finally, Beautiful Soup is the library you’ll be working with. As I’ve already mentioned, it’s a high-level Python library that helps you with parsing, searching, and navigating HTML and also XML.
00:43 You’ll be working with HTML in this course. One thing to note here that’s maybe interesting is that Beautiful Soup does not actually scrape the information from the internet.
00:52 You’ll need to use something else for that. Beautiful Soup doesn’t even really do the parsing itself. Instead, it uses a parser to process the HTML, and then it gives you a high-level interface to conveniently interact with the HTML once it’s parsed. This is what Beautiful Soup does, and it’s pretty good at it.
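As a rough sketch of that division of labor, the snippet below hands an HTML string to Beautiful Soup and names the parser that should do the actual parsing. It assumes the `beautifulsoup4` package is already installed. The HTML is a made-up inline string rather than a downloaded page, because fetching content from the internet is a separate step that Beautiful Soup doesn't handle.

```python
from bs4 import BeautifulSoup

# Made-up sample HTML; in a real project this string would come from
# whatever tool you use to download the page.
html = """
<html>
  <body>
    <h1 id="title">Example page</h1>
    <a href="/first">First link</a>
    <a href="/second">Second link</a>
  </body>
</html>
"""

# The second argument names the parser that processes the HTML.
# "html.parser" is the one that ships with Python's standard library.
soup = BeautifulSoup(html, "html.parser")

# Once the HTML is parsed, you search and navigate through the soup object.
title = soup.find("h1", id="title").get_text()
links = [a["href"] for a in soup.find_all("a")]

print(title)
print(links)
```

Swapping `"html.parser"` for another installed parser would change how the HTML gets processed, while calls such as `.find()` and `.find_all()` stay the same.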
01:12 Okay, so let’s get Beautiful Soup installed in your environment.