Python Web Scraping

Learning Path ⋅ Skills: Web Scraping, HTTP Requests, Data Parsing

Web scraping is about downloading structured data from the Web, selecting some of that data, and passing along what you selected to another process. With this learning path, you’ll learn the core Python technologies and skills that you need to build your own web scraper.
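The three stages named above (download, select, pass along) can be sketched with the standard library alone. This is a minimal illustration, not the approach any particular resource in this path takes: the `LinkCollector` class, the `select_links()` and `to_report()` helpers, and the hard-coded `SAMPLE_HTML` page are all hypothetical names made up for this example, and the page is a stand-in for a real download.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Step 1 (download): a hard-coded page stands in for a real HTTP response.
SAMPLE_HTML = """
<html><body>
  <a href="/jobs/1">Python Developer</a>
  <a href="/jobs/2">Data Engineer</a>
  <p>No link here.</p>
</body></html>
"""


def select_links(html):
    """Step 2 (select): keep only the link targets from the page."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def to_report(links):
    """Step 3 (pass along): format the selection for another process."""
    return "\n".join(links)


links = select_links(SAMPLE_HTML)
print(to_report(links))
```

In practice, a dedicated parser such as Beautiful Soup usually replaces the hand-written `HTMLParser` subclass, but the shape of the pipeline stays the same.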

Learning Path ⋅ 9 Resources

Laying the Foundation for Web Scraping

Before you jump into web scraping, it’s important to brush up on some foundational skills, like making HTTP requests and understanding HTML and CSS.

Course

HTTP Requests With Python's urllib.request

In this video course, you'll explore how to make HTTP requests using Python's handy built-in module, urllib.request. You'll try out examples and go over common errors, all while learning more about HTTP requests and Python in general.
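As a rough idea of what a request with `urllib.request` looks like, here is a small sketch. The `fetch_text()` helper, the user agent string, and the `DEMO_URL` value are assumptions made for this example. The demo uses a `data:` URL so that it runs without network access; a real scraper would pass an `http://` or `https://` URL instead.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_text(url, timeout=10):
    """Return the decoded body of `url`, or None if the request fails."""
    # Some servers reject the default Python user agent, so send a custom one.
    # The user agent string below is a made-up placeholder.
    request = Request(url, headers={"User-Agent": "learning-path-example/0.1"})
    try:
        with urlopen(request, timeout=timeout) as response:
            # Fall back to UTF-8 when the response declares no charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    except URLError as error:
        # HTTPError is a subclass of URLError, so this covers both.
        print(f"Request failed: {error}")
        return None


# A data: URL keeps the demo offline; swap in an http(s):// URL for real use.
DEMO_URL = "data:text/html;charset=utf-8,%3Ch1%3EHello%3C%2Fh1%3E"
body = fetch_text(DEMO_URL)
print(body)
```

Catching `URLError` in one place is a simplification. Code that needs to react to specific status codes would typically catch `urllib.error.HTTPError` first and inspect its `code` attribute.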

Course

Making HTTP Requests With Python

The requests library is the de facto standard for making HTTP requests in Python. It abstracts the complexities of making requests behind a beautiful, simple API so that you can focus on interacting with services and consuming data in your application. This course shows you how to work effectively with requests, from start to finish.

Course

HTML and CSS Foundations for Python Developers

There's no way around HTML and CSS when you want to build web apps. Even if you're not aiming to become a web developer, knowing the basics of HTML and CSS will help you understand the Web better. In this video course, you'll get an introduction to HTML and CSS for Python programmers.

Getting Started With Web Scraping

Now that you’ve learned some foundational skills, you’re ready to start web scraping!

Podcast

Web Scraping in Python: Tools, Techniques, and Legality

Do you want to get started with web scraping using Python? Are you concerned about the potential legal implications? What are the tools required and what are some of the best practices? This week on the show we have Kimberly Fessel to discuss her excellent tutorial created for PyCon 2020 online titled "It's Officially Legal so Let's Scrape the Web."

Course

Web Scraping With Beautiful Soup and Python

In this course, you'll walk through the main steps of the web scraping process. You'll learn how to write a script that uses Python's requests library to scrape data from a website. You'll also use Beautiful Soup to extract the specific pieces of information that you're interested in.

Tutorial

A Practical Introduction to Web Scraping in Python

Learn all about web scraping in Python. You'll see how to parse data from websites and interact with HTML forms using tools such as Beautiful Soup and MechanicalSoup.

Handling Response Data

In web scraping, you end up with lots of response data. Next up, you’ll learn what to do with it.

Course

Working With JSON Data in Python

Learn how to work with Python's built-in json module to serialize the data in your programs into JSON format. Then, you'll deserialize some JSON from an online API and convert it into Python objects.
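To give a sense of the two directions involved, here is a short sketch that uses only the `json` module. The `job` record and the `api_response` string are invented sample data, not taken from the course. The string stands in for the body an online API might return.

```python
import json

# Hypothetical scraped record; the field names are made up for this example.
job = {
    "title": "Python Developer",
    "remote": True,
    "tags": ["web", "scraping"],
    "salary": None,
}

# Serialize: Python dict -> JSON text (True becomes true, None becomes null).
encoded = json.dumps(job, indent=2, sort_keys=True)
print(encoded)

# Deserialize: JSON text -> Python objects (false becomes False).
api_response = '{"id": 1, "title": "Write a scraper", "completed": false}'
todo = json.loads(api_response)
print(todo["title"], todo["completed"])
```

`json.dump()` and `json.load()` do the same work against file objects instead of strings.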

Course

Reading and Writing CSV Files

This short course covers how to read and write CSV files using Python's built-in csv module and the pandas library. You'll learn how to handle both standard and non-standard data, such as CSV files without headers or files that contain delimiters in the data.
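The two awkward cases mentioned above, a file without a header row and a delimiter inside a field, can be sketched with the `csv` module alone (pandas is left out to keep the example dependency-free). The sample rows and column names are invented, and `io.StringIO` stands in for files on disk.

```python
import csv
import io

# In-memory text stands in for a file on disk so the example is self-contained.
# There is no header row, and the first row has a comma inside a quoted field.
raw = 'Ada Lovelace,"London, England",1815\nGuido van Rossum,Haarlem,1956\n'

# Reading: supply the column names yourself because the data has no header.
fieldnames = ["name", "birthplace", "born"]
rows = list(csv.DictReader(io.StringIO(raw), fieldnames=fieldnames))
print(rows[0]["birthplace"])

# Writing: the writer quotes any field that contains the delimiter.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

When working with real files, the `csv` documentation recommends opening them with `newline=""` so that the module controls line endings itself.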

Automating Your Web Scraping Process

Finally, you’ll learn how to use a headless browser to automate the web scraping process.

Tutorial

Modern Web Automation With Python and Selenium

Your guide to learning advanced Python web automation techniques: Selenium, headless browsing, exporting scraped data to CSV, and wrapping your scraping code in a Python class.

Congratulations on completing this learning path! If you’d like to continue to develop your skills for interacting with web data, then check out the web scraping topic on Real Python.

Or maybe you’d like to explore different ways to organize and work with a variety of data. In that case, these learning paths have got you covered:
