urllib

The Python urllib package is a collection of modules for working with URLs. It allows you to fetch data across the web, parse URLs, and handle various internet protocols. The urllib package is a staple for networking tasks in Python.

Here’s a quick example:

Python
>>> import urllib.request

>>> response = urllib.request.urlopen("http://www.example.com")
>>> html = response.read()
>>> html[:60]
b'<!doctype html>\n<html>\n<head>\n    <title>Example Domain</tit'

Key Features

  • Opens and reads URLs
  • Parses and constructs URLs
  • Handles URL encoding and decoding
  • Manages cookies and HTTP headers
  • Supports multiple protocols (HTTP, HTTPS, FTP)
  • Provides robust error handling for network operations
  • Allows customization of HTTP requests (headers, methods)
  • Handles redirects and authentication
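Several of these features come together in urllib.request.Request, which lets you attach headers and choose the HTTP method before the URL is actually opened. Here's a minimal sketch; the User-Agent string is just a placeholder:

```python
import urllib.request

# Build a request with a custom header and method. No network traffic
# happens until the request is passed to urlopen().
req = urllib.request.Request(
    "http://www.example.com",
    headers={"User-Agent": "my-app/1.0"},
    method="HEAD",
)

print(req.get_method())              # HEAD
# Note: Request stores header names in capitalized form ("User-agent").
print(req.get_header("User-agent"))  # my-app/1.0
```

Passing the prepared request to urllib.request.urlopen(req) then sends it with the headers and method you configured.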

Frequently Used Classes and Functions

Object                          Type      Description
urllib.request.urlopen()        Function  Opens a URL and retrieves data
urllib.parse.urlparse()         Function  Parses a URL into components
urllib.parse.urlencode()        Function  Converts a dictionary into a URL-encoded query string
urllib.error.HTTPError          Class     Exception raised for HTTP-related errors
urllib.request.Request          Class     Represents an HTTP request object
urllib.request.urlretrieve()    Function  Downloads a file from a URL and saves it locally
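Of these, urllib.parse.urlencode() is a common first stop: it turns a dictionary into a query string ready to append to a URL. A quick sketch, with made-up parameter names:

```python
from urllib.parse import urlencode

# urlencode() escapes each key and value and joins them with "&".
# Spaces become "+" because it uses quote_plus() by default.
params = {"q": "python urllib", "page": 2}
query = urlencode(params)

print(query)  # q=python+urllib&page=2
```

You would typically append the result to a base URL, as in `f"https://example.com/search?{query}"`.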

Examples

Open and read a URL:

Python
>>> import urllib.request

>>> with urllib.request.urlopen("http://www.example.com") as response:
...     html = response.read()
...

Parse a URL into its components:

Python
>>> from urllib.parse import urlparse

>>> url = urlparse("http://www.example.com/index.html;params?query=arg#frag")
>>> url.scheme, url.netloc, url.path
('http', 'www.example.com', '/index.html')
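Alongside urlparse(), the related helpers quote() and unquote() handle percent-encoding of individual URL components. A small sketch:

```python
from urllib.parse import quote, unquote

# quote() percent-encodes characters that aren't safe in a URL.
# "/" is treated as safe by default; non-ASCII text is encoded as UTF-8.
encoded = quote("hello world/café")
print(encoded)  # hello%20world/caf%C3%A9

# unquote() reverses the transformation.
print(unquote(encoded))  # hello world/café
```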

Common Use Cases

  • Fetching data from web pages
  • Parsing and manipulating URLs
  • Encoding data for query strings
  • Handling HTTP requests and responses
  • Downloading files or images programmatically
  • Authenticating with web APIs using custom headers
  • Automating web data extraction for basic web scraping
  • Handling HTTP errors and timeouts gracefully
  • Interacting with REST APIs

Real-World Example

Suppose you want to download an image from the web and verify its format and size. You can combine urllib with the popular Pillow library for image processing:

Python
>>> import urllib.request
>>> from PIL import Image
>>> import io

>>> url = "https://www.python.org/static/community_logos/python-logo.png"
>>> filename = "python-logo.png"

>>> # Download the image data
>>> with urllib.request.urlopen(url) as response:
...     img_data = response.read()
...

>>> # Save to file
>>> with open(filename, "wb") as img:
...     img.write(img_data)
...

>>> # Load the image with Pillow and check details
>>> with Image.open(io.BytesIO(img_data)) as img:
...     print(img.format, img.size)
...
PNG (601, 203)

In this example, you download an image from the web, save it locally, and use Pillow to check its format and dimensions. These are common practical steps for anyone working with images in modern Python.

Tutorial

Python's urllib.request for HTTP Requests

In this tutorial, you'll be making HTTP requests with Python's built-in urllib.request. You'll try out examples and review common errors encountered, all while learning more about HTTP requests and Python in general.


By Leodanis Pozo Ramos • Updated July 29, 2025