Modern Web Automation With Python and Selenium

Selenium is a web automation tool that allows you to use Python to programmatically interact with dynamic, JavaScript-generated web pages. Your Python Selenium code drives a real browser that you can instruct to fill out forms, click buttons, scrape dynamically generated data, or write automated tests for web applications.

By implementing the Page Object Model (POM) design pattern, you can create clean and scalable automation scripts that are straightforward to read and maintain.

By the end of this tutorial, you’ll understand that:

Selenium allows you to launch browsers, visit URLs, and interact with web elements.
Headless browsers let you run scripts without displaying a browser window, which is useful for automation and testing.
You can target web elements using different locators, such as CSS selectors, XPath, or IDs.
Explicit waits provide a flexible way to handle dynamic content by waiting for specific conditions.
The Page Object Model design pattern separates page structure from business logic.

In this tutorial, you’ll learn how to use Selenium with Python to build a fully functional music player that interacts with Bandcamp’s Discover page. You’ll control the player from the command line while a headless Firefox browser runs in the background. With it, you’ll be able to play tracks, pause music, list available tracks, and load more tracks, replicating some of the website’s core functionality.

Along the way, you’ll learn modern best practices, like implementing the Page Object Model (POM), which helps keep your automation scripts clean, testable, and maintainable. Ready to get started? Head over to bandcamp.com/discover/ and play some of the available music to get a feel for the website and pump up your mood for this project!

Get Your Code: Click here to download the free sample code that shows you how to use Selenium in Python for modern web automation.

Understand the Project and Approach

Web automation involves using a script to drive a browser and perform actions such as clicking links, filling out forms, and gathering data. Instead of manually navigating a website, you can delegate these tasks to Python. A typical scenario is automating repetitive tasks, such as logging in daily to a tool or scraping regularly updated data.

Because many web apps are built for human interaction, they can present challenges when you try to interact with them automatically. In the early days of the internet, you could send HTTP requests and parse the resulting HTML. But modern sites often rely on JavaScript to handle events or generate content dynamically, meaning that an HTTP request alone probably won’t reveal the full page content. That’s where Selenium comes in.
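To make that limitation concrete, here's a small sketch that uses only Python's standard library. The markup and the TrackCounter class are made up for illustration: the page ships an empty list plus a script that would fill it in once a browser runs it. A parser that only reads the delivered HTML never executes that script, so it finds no list items.

```python
from html.parser import HTMLParser

# Hypothetical page: the <ul> stays empty until a browser runs the script.
RAW_HTML = """
<ul id="tracks"></ul>
<script>
  const list = document.getElementById("tracks");
  for (const name of ["Track A", "Track B"]) {
    const item = document.createElement("li");
    item.textContent = name;
    list.appendChild(item);
  }
</script>
"""


class TrackCounter(HTMLParser):
    """Count <li> start tags found in the raw markup."""

    def __init__(self):
        super().__init__()
        self.items = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.items += 1


counter = TrackCounter()
counter.feed(RAW_HTML)
print(counter.items)  # 0, because the script was never executed
```

A real browser would end up with two list items in the DOM, which is the gap that a browser automation tool closes.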

The Selenium Project

Selenium is a mature open-source project that provides a convenient API to control browsers. With Selenium, you can:

Launch a headless or visible browser such as Firefox or Chrome using a web driver.
Visit URLs and navigate pages just like a real user would.
Locate elements with CSS selectors, XPath, or similar locators.
Interact with elements by clicking, typing, dragging, or waiting for them to change.

Once you install the appropriate driver for your browser, you can control your browser through a script using Selenium.

The Selenium project maintains official bindings for several programming languages, including Java, Python, C#, Ruby, and JavaScript. In Python, it’s distributed on PyPI as a single package called selenium, which you can install using pip.

Selenium is often used for automated testing, but it’s equally useful for generic web automation, which is what this tutorial will focus on.

Note: You might be wondering how Selenium differs from other tools for scripted web interactions, such as Beautiful Soup, Scrapy, or Requests.

One central difference is that those tools are great at handling static data, while Selenium allows you to replicate user behavior at the JavaScript level. This means that you can interact with dynamically generated web content using Selenium.

Before diving into the nuts and bolts of Selenium, it’s helpful to get a clear picture of what you’ll build by the end of this tutorial. As mentioned, you’ll create a fully functional, console-based music player that interacts with the Bandcamp Discover page using a headless Firefox browser.

Your Bandcamp Discover Music Player

Bandcamp is a popular online record store and music community where you can stream songs, explore artists, and discover new albums.

Selenium allows you to automate direct interactions with Bandcamp’s web interface—as though you were clicking and scrolling yourself!

Your finished project will open the Bandcamp Discover page in the background, which means you won’t get to see any of the wonderful album artwork.

If a browser automation tool creates a browser instance without a visible browser window, it’s said to run in headless mode. But don’t lose your head over that word—your code will stay calm and in control!

In headless mode, the browser instance will gather a list of tracks. Each track will have information about its associated album, artist, and—if present—genre.

Finally, your app will provide a text-based interface to control the music playback with a handful of commands:

play: Plays or resumes playing a track, optionally through a selected track number
pause: Pauses the currently playing track
tracks: Lists the currently loaded tracks
more: Loads more tracks from the Bandcamp Discover page
exit: Shuts down the program and closes the headless browser instance

Using these commands, you’ll be able to listen to the music that’s currently available on Bandcamp’s Discover page:

Text

Type: play [<track number>] | pause | tracks | more | exit
> play
Track(album='Carrie & Lowell (10th Anniversary Edition)',
      artist='by Sufjan Stevens',
      genre='folk',
      url='https://music.sufjan.com/album/carrie-lowell-10th-anniversary-edition')

Type: play [<track number>] | pause | tracks | more | exit
> tracks
#     Album                          Artist                         Genre
--------------------------------------------------------------------------------
1     Carrie & Lowell (10th Anniv... by Sufjan Stevens              folk
2     moisturizer                    by Wet Leg                     alternative
3     Outpost (DiN11)                by Robert Rich & Ian Boddy     electronic
4     Obscure Power                  by Quest Master                ambient
5     Hex; Or Printing In The Inf... by Earth                       experimental
...
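As a rough sketch of how output like this could be produced, the following uses a data class whose fields mirror the Track(...) representation shown above. The shorten() and format_row() helpers are hypothetical, and the column widths are guesses based on the sample listing rather than the project's actual values.

```python
from dataclasses import dataclass


@dataclass
class Track:
    album: str
    artist: str
    genre: str
    url: str


def shorten(text: str, width: int = 30) -> str:
    """Trim text to width characters, marking any cut with an ellipsis."""
    return text if len(text) <= width else text[: width - 3] + "..."


def format_row(number: int, track: Track) -> str:
    """Lay out one table row using fixed-width, left-aligned columns."""
    return (
        f"{number:<6}{shorten(track.album):<31}"
        f"{shorten(track.artist):<31}{track.genre}"
    )


track = Track(
    album="Carrie & Lowell (10th Anniversary Edition)",
    artist="by Sufjan Stevens",
    genre="folk",
    url="https://music.sufjan.com/album/carrie-lowell-10th-anniversary-edition",
)
print(format_row(1, track))
```

Keeping the track data in a small data class and the formatting in separate functions means the listing layout can change without touching the code that gathers the tracks.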

Hopefully this sounds like a fun project to tackle! Reducing a colorful user interface with album images to a text-based interaction surely must be a programmer’s dream!

Even if you (rightfully) disagree with this sentiment, there’s a serious educational point to working on this project. Building a music player with Selenium hits many real-world tasks that you can also utilize for other web automation tasks, such as finding and clicking elements, handling dynamic content, and structuring your code well.

This project also merges the idea of functional testing—ensuring the interface works as intended—with the potential for gathering data or performing tasks automatically.

The POM Design Pattern

You could theoretically write your entire automation script in a single Python file, but as soon as your project grows, you’ll feel the pain of monolithic spaghetti code. That’s where the Page Object Model (POM) enters the picture.

The POM is a design pattern commonly used in test automation to improve the structure, readability, and maintainability of test code. It encourages a clear separation between the logic that interacts with the user interface and the actual test scripts. By following the POM, you represent each web page or component in an application by a dedicated class known as a page object.

This class serves as an interface to that page and usually encapsulates both the locators and the actions a user can perform on the page, such as clicking a button or entering text into a field.

Note: In this tutorial, you’ll further separate the locators into a dedicated file. You’ll use references to these locators in your page objects.

By organizing UI interactions this way, POM helps reduce duplication and makes test or automation code less effort to manage, especially as a project grows. If the structure of a web page changes, then you’ll typically need to apply updates only in the corresponding page object rather than across multiple scripts. This leads to more robust and reusable code.

Ultimately, the Page Object Model allows automation code to be more scalable, more straightforward to understand, and less prone to breaking when the web application evolves.
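Here's a minimal sketch of the idea, not the implementation this tutorial builds later. The DiscoverPage class and the FakeDriver and FakeButton stand-ins are hypothetical, and the locator values are borrowed from the Bandcamp examples later in this tutorial. The locators are written as plain tuples so that the sketch runs without a browser. With Selenium, you'd typically use constants such as By.ID, which are plain strings like "id" under the hood.

```python
from typing import Protocol

# Locators are kept apart from the page logic, so a markup change
# only requires an update in this one place.
MORE_BUTTON = ("id", "view-more")
TRACK_ITEMS = ("class name", "results-grid-item")


class Element(Protocol):
    def click(self) -> None: ...


class Driver(Protocol):
    def find_element(self, by: str, value: str) -> Element: ...
    def find_elements(self, by: str, value: str) -> list[Element]: ...


class DiscoverPage:
    """Hypothetical page object that exposes user-level actions."""

    def __init__(self, driver: Driver) -> None:
        self._driver = driver

    def track_count(self) -> int:
        return len(self._driver.find_elements(*TRACK_ITEMS))

    def load_more(self) -> None:
        self._driver.find_element(*MORE_BUTTON).click()


# Stand-ins that mimic a driver, so the sketch runs without a browser.
class FakeButton:
    def __init__(self, driver: "FakeDriver") -> None:
        self._driver = driver

    def click(self) -> None:
        self._driver.loaded += 2


class FakeDriver:
    def __init__(self) -> None:
        self.loaded = 3

    def find_element(self, by: str, value: str) -> FakeButton:
        return FakeButton(self)

    def find_elements(self, by: str, value: str) -> list[FakeButton]:
        return [FakeButton(self) for _ in range(self.loaded)]


page = DiscoverPage(FakeDriver())
page.load_more()
print(page.track_count())  # 5
```

Code that uses DiscoverPage only talks about tracks and loading more of them. It never mentions IDs or class names, which is the separation the pattern is after.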

This Tutorial

In this tutorial, you’ll start by covering essential Selenium skills in short, targeted sections. Then you’ll apply these skills using the POM, and finally, you’ll bring it all together to build your music player.

When you’re done, you can use your app to discover new music on Bandcamp. You’ll also have learned how to orchestrate Selenium to reliably find and interact with page elements in a well-organized codebase.

Note: Any time you automate interactions with a website, make sure that you abide by the site’s terms of use and act responsibly.

This tutorial doesn’t scrape any personal data and only performs basic actions that a typical user would. If you plan to adapt such scripts for more extensive scraping, then confirm that you’re not violating any policies or overwhelming servers with too many requests.

Now that you know where you’re headed, you can move on to prepare your environment so you can develop, run, and test your Selenium automation.

Set Up Your Environment

To run Selenium in Python, you’ll need a modern version of Python, the selenium package, a browser, and a browser driver that can talk to your chosen browser. In this tutorial, you’ll use Firefox along with GeckoDriver. But you can pick a different driver, like ChromeDriver, if you prefer using Chrome.

Install Selenium

To follow along, install Python 3.10 or newer. You can verify your installation by opening a terminal and checking for Python’s version:

Shell
      
$ python --version
Python 3.13.2

Your exact version number will differ, but it should read 3.10 or above so that you can leverage Python’s structural pattern matching when you build the command-line interface (CLI) for your music player.

With a compatible version of Python installed, create a virtual environment so that your Selenium installation and any dependencies don’t clutter your global Python setup.

Navigate to your project folder and create the environment using venv:

Windows PowerShell
      
PS> python -m venv venv\
PS> .\venv\Scripts\activate
(venv) PS> python --version
Python 3.13.2

Shell
      
$ python -m venv venv/
$ source venv/bin/activate
(venv) $ python --version
Python 3.13.2

You’ll notice (venv) in your shell prompt, which means that your virtual environment is active.

Next, install Selenium into this virtual environment using pip:

Shell
      
(venv) $ python -m pip install selenium

The Python bindings for Selenium that you get using this command are the only direct dependencies for this project that you’ll install with pip.

If you want to make sure that you’re running the same version of all external packages, then you can install them from the requirements.txt file provided in the downloadable materials.

Once you’re in the same folder as the requirements.txt file, you can install all dependencies listed there:

Shell
      
(venv) $ python -m pip install -r requirements.txt

If you install selenium directly, then pip will fetch the latest stable version of Selenium from PyPI. If you choose to install the dependencies using the requirements.txt file, then you’ll work with the exact versions used in this tutorial.

Set Up GeckoDriver for Firefox

Selenium interacts with a real browser under the hood. Before proceeding, make sure that you have an up-to-date installation of Firefox on your computer.

To communicate with Firefox, you need the geckodriver binary. Download the correct version from the Mozilla geckodriver releases page, and choose the appropriate build for your operating system.

After unpacking, place the geckodriver binary in a location accessible by your system. For instance, on macOS or Linux, you could move it into /usr/local/bin. On Windows, you can add the folder containing geckodriver.exe to your system’s PATH.

Confirm the driver is available by typing:

Shell
      
(venv) $ geckodriver --version
geckodriver 0.36.0 (a3d508507022 2025-02-24 15:57 +0000)
...

Your version number may be different, but you should see some version string printed to your terminal. If you get a command not found error, then you still need to adjust your PATH or specify the driver’s exact location in your Selenium code.

Verify Your Setup

To make sure everything works, open a Python REPL inside your virtual environment and take your browser for a drive:

Python
      
        
      
    
>>> from selenium import webdriver
>>> from selenium.webdriver.firefox.options import Options

>>> options = Options()
>>> options.add_argument("--headless")
>>> driver = webdriver.Firefox(options=options)
>>> driver.get("https://www.python.org")
>>> driver.title
'Welcome to Python.org'
>>> driver.quit()

If you see 'Welcome to Python.org', then congratulations—Selenium just launched Firefox in headless mode, navigated to Python’s home page, and fetched the page title. You’ve confirmed that your environment is set up correctly.

In case Firefox isn’t working correctly for you, you can try using Chrome instead.

You’ll need to have the Chrome browser and the ChromeDriver properly installed if you want to work with Chrome instead of Firefox.

If that’s all set up correctly, then you just need to swap out the relevant import and setup steps:

Python
      
        
      
    
>>> from selenium import webdriver
>>> from selenium.webdriver.chrome.options import Options

>>> options = Options()
>>> options.add_argument("--headless")
>>> driver = webdriver.Chrome(options=options)
>>> driver.get("https://www.python.org")
>>> driver.title
'Welcome to Python.org'
>>> driver.quit()

The tutorial will keep referring to a setup using Firefox and GeckoDriver, but if you’re working with Chrome and ChromeDriver, you’ll just need to swap out these lines of code in the upcoming code examples.

If you’re curious to see Selenium drive your browser while keeping its head on, then try running the same code without adding any options. You should see a browser window pop up, navigate to python.org, and finally close again when you call .quit().

You may run into snags during this setup. If you do, then check out the following troubleshooting tips:

Version Mismatch: If instantiating your WebDriver fails, then check that your browser version aligns with the one required by the matching WebDriver. Typically, GeckoDriver updates are backward-compatible with many versions of Firefox, but very old or very new versions might cause an issue.
Permission Issues: On UNIX systems, you need to make sure that the geckodriver file has execute permissions.
PATH Issues: Ensure that the directory containing the geckodriver is in your PATH, so that your operating system knows where to find it.
Firewall or Security Software Blocks: Occasionally, overly strict antivirus programs can block WebDriver traffic. Temporarily disabling or adding exceptions can help.

These are some quick tips that may send you in the right direction if your setup isn’t working yet.

With everything installed and tested, you’re ready to code more complex interactions. You’ll soon see how to open the Bandcamp page, parse track elements, and eventually control an entire music player.

But first, you’ll learn how to navigate a page with Selenium, select elements, and read or manipulate their content. These building blocks will help you understand how your final player selects the correct track, clicks play, and gathers information about each album and artist.

Navigate a Web Page With Python and Selenium

With Selenium installed and your driver set up, you can now open up a website using Python code and inspect or manipulate its content. In this section, you’ll learn some essential operations:

Loading a page and waiting for it to finish loading
Locating elements by CSS selector or HTML ID
Reading attribute values or text from the elements

Because you’ll be building a music player eventually, it’s a good idea to get more familiar with the structure of the Bandcamp discover page.

Underlying every web page is the DOM (Document Object Model), which represents HTML elements in a tree-like structure. Selenium’s job is to let you query and manipulate parts of that DOM.

Understand the DOM Structure of Bandcamp

The first step of programmatically interacting with any website is always to interact with it manually. Before you write code, you need to understand the page that you want to work with. So, open up your browser at https://bandcamp.com/discover/.

Note: In some geographical locations, the site may greet you with a cookie consent form. Take a mental note of that. If you need to click away a cookie consent form, then your Selenium driver will have to do that as well.

You should see a grid of tracks, each with an image thumbnail, album name, artist, and so on. On the right side, there’s a larger player that highlights the first song, or the song that’s currently playing. When you scroll to the bottom of the page, you can load more tracks dynamically by clicking the View more results button.

Listen to a song or two, press play and pause a couple of times, then load more songs into view.

Next, open your developer tools by right-clicking the page and selecting Inspect. Bandcamp’s discover page has a large container that holds the track elements, and each track has an associated button to start or pause audio.

By inspecting these elements, you can identify the relevant classes or attributes that you’ll later use in your script:

HTML
      
    
<div class="results-grid">
  <ul class="items">
    <li class="results-grid-item">
      <section  class="image-carousel">
        ...
        <button class="play-pause-button" aria-label="Play"></button>
        ...
      </section>
      <div class="meta">
        <p>
          <a href="https://artist-name.bandcamp.com/album/album-name?from=discover_page" >
            <strong>Album Name</strong>
            <span>by Artist Name</span>
          </a>
        </p>
        <p class="genre">genre</p>
      </div>
    </li>
    ...
  </ul>
</div>

The actual HTML is a lot longer and more complex, and may also change when Bandcamp updates their site structure.

Note: If you need a refresher for understanding this code, then you can learn more in HTML and CSS for Python Developers.

The main idea is that each track is a list element with a play-pause button, an album URL, the name of the album and artist, and possibly some information about the genre. You’ll use some of the classes you see above to locate elements later on.

Launch a Headless Browser and Navigate to a URL

After manually opening the site, you’ll now open Bandcamp’s Discover page using Python code. Start by creating a webdriver.Firefox instance, then navigate to the Discover page URL:

Python (navigation.py)
      
    
from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.implicitly_wait(5)

driver.get("https://bandcamp.com/discover/")
print(driver.title)

driver.quit()

This short script opens a headless Firefox browser, sets an implicit wait of up to five seconds for element lookups, navigates to your target site, and prints the page title. Then, it closes the browser instance.

Note: Calling .quit() on your WebDriver instances is important to avoid invisible headless browser instances looming in the back of your system, eating up processing power and RAM. Sounds scary? It sure does—so remember to close the instances that you summon!
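One way to make sure that .quit() runs even when your script raises an exception halfway through is to wrap the driver in a context manager. The managed_driver() helper below is a hypothetical sketch. It accepts any callable that creates an object with a .quit() method, so it isn't tied to a particular browser.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call .quit() afterward."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the with block raises.
        driver.quit()


# Possible usage with Selenium, assuming options is set up as shown above:
#
# with managed_driver(lambda: webdriver.Firefox(options=options)) as driver:
#     driver.get("https://bandcamp.com/discover/")
#     print(driver.title)
```

Because the cleanup sits in a finally clause, a failed lookup or a typo in the with block won't leave a headless browser process behind.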

If you omit the --headless option, then you’ll see the browser window pop up and navigate to the page. This can be useful for debugging, but it’s often redundant for automated tasks. Showing your browser’s interface also exposes it to accidental input, such as when you accidentally click on the visible window. Further, some websites may behave differently depending on your screen size.

Locate Elements in the DOM

After loading a page, you’ll want to locate certain elements. You may be looking for a search bar or a login button—or a music track! Selenium supports several locator strategies, including:

By ID
By CSS selector
By XPath
By link text, tag name, or class name

In modern versions of Selenium, the recommended approach is to use the By class together with .find_element() or .find_elements():

Python (navigation.py)
      
    
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.implicitly_wait(5)

driver.get("https://bandcamp.com/discover/")
print(driver.title)

pagination_button = driver.find_element(By.ID, "view-more")
print(pagination_button.accessible_name)

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))
print(tracks[0].text)

driver.quit()

In this example, you use two different locator strategies:

You look for a single element using .find_element() with the selector By.ID to find the View more results button.
You look for all track elements that are currently visible on the page using .find_elements() with the selector By.CLASS_NAME. This returns a list of WebElement objects.

Then, you print some of the gathered results to your terminal.

Note: The number of tracks you see will also depend on the viewport size of your headless browser instance. You’ll set a fixed value for it later to ensure reproducible results.

Because Selenium returns WebElement objects, you can perform sub-searches on those elements using the same approach. For example, you can pick the first track and find the album title:

Python (navigation.py)
      
    
# ...

track_1 = tracks[0]
album = track_1.find_element(By.CSS_SELECTOR, "div.meta a strong")
print(album.text)

In this case, the CSS selector "div.meta a strong" points to the HTML element that contains the album name.

Before you write this code, you already need to know where the information you’re looking for is located on the website. That’s why it’s important to inspect the site using your developer tools first, and in doing so, identify locators that’ll help you target the elements you’re interested in.

Also, keep in mind that not all situations benefit from the same locator strategy. IDs or unique CSS classes are usually good choices and you’ll work with them throughout this tutorial. XPaths are another powerful locator strategy, but they can become unwieldy if you rely on complicated XPath expressions and are often less performant than other options.

Any locators that you settle on may stop working when the website structure changes. It sure can be frustrating when you write locators that match your current HTML, but break when the page design changes—and it will eventually!

For example, consider the following XPath expression that targets an item unambiguously:

Python

"/html/body/div[2]/span[1]/a[3]"

This XPath expression is an absolute path that starts at the root <html> element and drills down through the DOM tree to select a specific element. It navigates to the <body>, then to the second <div> inside the body, followed by the first <span> within that <div>. Finally, it selects the third link element within that <span>.

While this expression is great for now, it may be overly fragile. If the page layout changes slightly, for example, if a designer adds a new <div> or reorders elements, then this locator fails.

Wherever possible, look for stable attributes like id or semantic classes. Unfortunately, some websites only use auto-generated classes, which may change frequently. In these cases, you can rely on partial text matches or more abstract patterns.
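To see this fragility in action without launching a browser, you can try a small experiment with the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The markup below is made up, and ElementTree is only a stand-in for Selenium here. Because ElementTree paths are relative to the root element, the leading /html is dropped from the position-based path.

```python
import xml.etree.ElementTree as ET

ORIGINAL = """
<html><body>
  <div>header</div>
  <div><span><a>one</a><a>two</a><a id="buy">three</a></span></div>
</body></html>
"""

# The same page after a designer adds a banner at the top.
REDESIGNED = ORIGINAL.replace("<body>", "<body><div>banner</div>")

ABSOLUTE = "body/div[2]/span[1]/a[3]"  # Position-based path
BY_ID = ".//a[@id='buy']"  # Anchored on a stable attribute

for label, markup in [("original", ORIGINAL), ("redesigned", REDESIGNED)]:
    root = ET.fromstring(markup.strip())
    by_position = root.find(ABSOLUTE)
    by_id = root.find(BY_ID)
    print(label, by_position is not None, by_id is not None)
```

On the original markup, both lookups find the link. After the banner is added, the second <div> is now the header, so the position-based path comes back empty while the attribute-based lookup still matches.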

Either way, locators are brittle and will require some maintenance across the life of your web automation. That’s just a fact, based on the reality that web pages keep changing and evolving.

Note: The Page Object Model design pattern, which you’ll learn more about later, can help you manage your locators in a maintainable way.

At this point, you know how to start Selenium, navigate to a URL, and locate key elements in the DOM.

In the next section, you’ll refine your skills by clicking elements and performing more advanced interactions, like sending keystrokes or dealing with forms. These building blocks will give you a good general overview and will also lead into the final application, where you’ll systematically gather track information and press Play on Bandcamp.

Interact With Web Elements

Navigating to a page and targeting elements is only step one. You probably also need to interact with some elements on the page. This could mean filling out forms, clicking buttons, selecting checkboxes, or scrolling. You’ll do these tasks by calling methods on your WebElement instances. In this section, you’ll explore common interactions and see how to adapt them for your Bandcamp music player.

Click Buttons and Links

To click a button with Selenium, you need to locate the button element and call .click() on it:

Python
      
    
button = driver.find_element(By.ID, "submit-button")
button.click()

Calling .click() instructs Selenium to simulate a mouse click. Under the hood, Selenium ensures the element is in view and clickable. If it’s obscured, then you may get an exception about an element not being interactable. Sometimes you need to scroll or wait for animations to end.

In your music player, you’ll use click events to play and pause tracks, and to load additional tracks. Start with a script similar to the one you built in the previous section:

Python (interaction.py)
      
    
from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.implicitly_wait(5)

driver.get("https://bandcamp.com/discover/")

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))

driver.quit()

This script prints the number of currently visible tracks to your terminal. Run the script and take note of how many tracks your headless browser sees displayed.

Now, add code to find the View more results button like you did before—and this time also click it:

Python (interaction.py)
      
import time

# ...

pagination_button = driver.find_element(By.ID, "view-more")
pagination_button.click()

time.sleep(0.5)

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))

driver.quit()

You’ve identified the pagination button and clicked it. To give the site a chance to load the new results, you’ve added a crude call to time.sleep() that pauses execution for half a second. There are better ways to do this within Selenium, and you’ll learn about them in just a bit.
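The general idea behind those better approaches is to poll for a condition instead of sleeping for a fixed amount of time. The wait_until() helper below is a hypothetical, library-independent sketch of that idea. Selenium ships its own tool for this, WebDriverWait, so treat this only as an illustration of the principle.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.1):
    """Call condition() repeatedly until it's truthy or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll)


# Possible usage in interaction.py, assuming before holds the old track count:
#
# wait_until(
#     lambda: len(driver.find_elements(By.CLASS_NAME, "results-grid-item"))
#     > before
# )
```

Compared to a fixed sleep, polling returns as soon as the new tracks show up and fails loudly if they never do.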

Then, you did another search for all track elements in their container, and printed the number of tracks a second time. You’ll see that you have more accessible tracks after clicking the button. Indeed, you just loaded additional results into your headless browser!

Selenium will raise an ElementNotInteractableException if the button isn’t visible or is outside of the current viewport. Usually, you can fix this by ensuring the page is scrolled properly or the button is truly ready to be clicked.

Note: If your headless browser encounters a cookie consent form, then the above interaction will also fail with an ElementNotInteractableException.

In this case, you can practice writing code to locate the cookie consent form first and instruct Selenium to click one of the buttons to dismiss it. That’s good practice! If you need to dismiss the cookie form now, then you’ll also need to do so later on for the music player.

You’ll see a possible solution in the next section and learn more about how to tackle these situations in the section on dealing with overlays.

In many standard web pages, if the element is in the DOM, then .click() works perfectly—but keep an eye out for special JavaScript behaviors and overlays.

Send Keystrokes and Text Entry

If you’re dealing with input fields, then you can type text into them using .send_keys():

Python
      
    
search_box = driver.find_element(By.TAG_NAME, "input")
search_box.send_keys("Search for this")
search_box.submit()

Here, you find the input element by its HTML tag name, type a query, and then submit the form. Calling .submit() has much the same effect as pressing the Enter key while a form field is focused, but you could also explicitly instruct the browser to press the Enter key:

Python
      
from selenium.webdriver.common.keys import Keys

# ...

search_box.send_keys(Keys.ENTER)

How could this work on the Bandcamp page? At the top, you have a search field that allows you to search for albums, artists, and more. You can identify the locator of that field using your developer tools, target the HTML input field, and send your search query.

Because it’s fun to see your code type something on a website, run the following script without using headless mode:

Python (communication.py)
      
    
import time
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # Run in normal mode
driver.implicitly_wait(5)

driver.get("https://bandcamp.com/discover/")

# Accept cookies, if required
try:
    cookie_accept_button = driver.find_element(
        By.CSS_SELECTOR,
        "#cookie-control-dialog button.g-button.outline",
    )
    cookie_accept_button.click()
except NoSuchElementException:
    pass

time.sleep(0.5)

search = driver.find_element(By.CLASS_NAME, "site-search-form")
search_field = search.find_element(By.TAG_NAME, "input")
search_field.send_keys("selenium")
search_field.submit()

time.sleep(5)

driver.quit()

When you run this script, you’ll see how Python opens up a browser window, locates the search box up top, enters the text you passed to .send_keys(), and submits the form.

The code includes a call to time.sleep() to give you a moment to look at the newly loaded page before closing the browser instance with .quit().

Keep in mind that Selenium does the same when you run it in headless mode. You just don’t get to see the interactions. As you might expect, switching off headless mode can often be helpful if you’re wondering why something doesn’t quite work the way you want it to, for example, when a cookie consent overlay blocks any other interaction with the page.

Deal With Hidden or Overlaid Elements

Modern sites often use overlays or modals that might conceal underlying elements. Selenium clicks like a real user would, so when an overlay covers your target element, the click can’t reach it and Selenium typically raises an ElementClickInterceptedException. If you suspect an overlay is interfering, then you first need to close the overlay, or wait until it disappears.

In the previous examples, you may have encountered a cookie overlay that needs to be dealt with before you can click anything else on the page.

Considering that such an overlay is a normal part of the page like anything else, you can target the overlay and the relevant buttons, then perform a click event to dismiss it:

Python
      
    
# ...

cookie_accept_button = driver.find_element(
    By.CSS_SELECTOR,
    "#cookie-control-dialog button.g-button.outline",
)
cookie_accept_button.click()

This code snippet finds the Accept necessary only button on Bandcamp’s cookie consent overlay and clicks it. As a privacy-aware internet user, you’d take the same action manually when visiting the page for the first time.

Because some HTML pop-ups, such as cookie consent forms, may be targeted only at certain geographies, it can be a good idea to account for that by wrapping the logic into a try…except block:

Python
      
    
from selenium.common.exceptions import NoSuchElementException

# ...

try:
    cookie_accept_button = driver.find_element(
        By.CSS_SELECTOR,
        "#cookie-control-dialog button.g-button.outline",
    )
    cookie_accept_button.click()
except NoSuchElementException:
    pass

By wrapping the cookie consent logic into a try...except block, you make your code more robust and versatile. Following EAFP, you first ask Selenium to find the cookie consent button. If the button exists, you instruct Selenium to click it. If it doesn’t exist, the framework raises a NoSuchElementException, which you catch and follow up with a pass statement to continue normal execution.
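As an alternative to catching the exception, you could rely on .find_elements(), which returns an empty list instead of raising an exception when nothing matches. The following is only a sketch: the helper name click_if_present() and the stub classes are hypothetical and exist so that the example runs without a browser. With a real driver, you’d pass your Firefox instance instead:

```python
def click_if_present(driver, by: str, value: str) -> bool:
    """Click the first matching element and report whether one was found."""
    matches = driver.find_elements(by, value)  # Empty list if nothing matches
    if not matches:
        return False
    matches[0].click()
    return True


# Hypothetical stand-ins so the sketch runs without launching a browser
class _StubButton:
    def __init__(self) -> None:
        self.clicked = False

    def click(self) -> None:
        self.clicked = True


class _StubDriver:
    def __init__(self, elements: list) -> None:
        self._elements = elements

    def find_elements(self, by: str, value: str) -> list:
        return list(self._elements)


button = _StubButton()
selector = "#cookie-control-dialog button.g-button.outline"

# "css selector" is the string value behind By.CSS_SELECTOR
found = click_if_present(_StubDriver([button]), "css selector", selector)
missing = click_if_present(_StubDriver([]), "css selector", selector)
print(found, button.clicked, missing)
```

Keep in mind that with an implicit wait configured, a call to .find_elements() that matches nothing waits for the full implicit timeout before it returns the empty list.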

Another approach you can follow is to use JavaScript directly:

Python
      
driver.execute_script("arguments[0].click();", overlay_element)

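This code snippet executes a short piece of JavaScript that triggers a click event directly on the element you pass in, which is available inside the script as arguments[0]. Here, overlay_element stands for a WebElement that you’ve located beforehand. Because the click happens in JavaScript, it can reach elements that an overlay would otherwise block. But be careful, because bypassing standard user interactions can break real-world test conditions.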

For the final Bandcamp-based project, you won’t need such workarounds. You’ve already identified the cookie consent form, and the site’s track elements are straightforward to interact with.

Use Hover, Drag-and-Drop, and More Complex Gestures

Selenium can also replicate user actions like dragging elements, hovering, or right-clicking. These are performed using the ActionChains class. Here’s a hypothetical example of performing a hover and clicking a submenu item:

Python
      
    
from selenium.webdriver import ActionChains

# ...

menu = driver.find_element(By.CSS_SELECTOR, ".menu")
submenu = driver.find_element(By.CSS_SELECTOR, ".menu #submenu")

actions = ActionChains(driver)
actions.move_to_element(menu)
actions.click(submenu)

actions.perform()

You start by identifying both relevant elements, then set up an instance of ActionChains. You then use .move_to_element() to perform a hover action on the menu element. This hover action triggers a drop-down that allows you to select a sub-menu item. Because the sub-menu is now open for interaction, you can call .click() on this child element.

In this example, you stack multiple actions, which you then execute in the defined order using .perform(). Being able to collect actions before executing them together is a helpful feature of ActionChains.

Though the final Bandcamp app won’t require such advanced gestures, it’s good to know that Selenium can handle them. If you want a comprehensive reference, then check out Action Chains in the Selenium documentation.

Submit Forms

If you’re testing or automating a form, you’ll often fill in multiple fields before clicking a submit button. For example:

Python
      
signup_form = driver.find_element(By.ID, "signup-form")

email_input = signup_form.find_element(By.NAME, "email")
password_input = signup_form.find_element(By.NAME, "password")

email_input.send_keys("user@example.com")
password_input.send_keys("MySecurePassword123")

signup_form.submit()

Just like when interacting with a form manually, you can fill multiple input fields and then submit the whole form in one go.

Submitting the form typically triggers a page load or AJAX call, so it’s wise to pair this with a wait condition, which you’ll cover in the next section.

While some of the code snippets in this section are fairly general, most of them still feed into your final project. You’ll soon implement button clicks to play and pause tracks, and load additional tracks by pressing the View more results button. Knowing how to locate elements and issue interactions using .click() is essential for that.

Next, you’ll see how to handle one of the biggest challenges on many modern sites: dynamic content that arrives asynchronously. You’ll learn about explicit waits and how you can make sure that your code doesn’t break just because a track or button isn’t immediately visible.

Handle Dynamic Content

One of the biggest obstacles in web automation is dynamic content. Many sites are single-page applications or rely heavily on JavaScript to fetch data after the initial page load. If you naively try to locate elements before they exist, then you’ll get errors like NoSuchElementException. You need a strategy for telling Selenium to wait until the content is actually there.

You’ve used time.sleep() in previous examples, but Selenium has much more flexible and robust built-in solutions to handle this challenge.

Understand Implicit, Explicit, and Fluent Waits

Selenium offers different built-in waiting mechanisms that give you a lot of flexibility in how to wait for a site to represent the state you need for interaction.

The Java implementation of Selenium distinguishes between three types of waits:

Implicit Wait: Selenium polls the DOM for a specified time whenever you try to find an element.
Explicit Wait: Selenium applies conditions to waits, which makes it more flexible.
Fluent Wait: Selenium allows you to specify the polling interval, ignore certain exceptions that occur during polling, and set custom timeout messages.

In Java, an explicit wait is just a fluent wait with certain default restrictions applied.

The Python bindings for Selenium skip setting up a third type of wait and expose implicit and explicit waits. In Python, the explicit waits have all the flexibility that a Java fluent wait provides.

Set Up an Implicit Wait

As mentioned, an implicit wait tells Selenium to poll the DOM for a specified time whenever you try to find an element. You’ve seen it in all the previous code examples and you only need to set it once for a driver session:

Python
      
driver.implicitly_wait(5)

This line of code sets up an implicit wait of five seconds for your driver. If your element is found quickly, then execution continues. Otherwise, Selenium keeps checking until either the element appears or five seconds pass, whichever comes first.

Adding an implicit wait can be a quick fix if your script breaks due to load time issues, and generally establishes a good safety net. It’s certainly a step up compared to calling time.sleep() multiple times in your code! However, it’s a general wait that’s not targeted to any conditions, or specific to an element or interaction.

Use Explicit Waits for Targeted Waiting

The true stars of waiting for dynamic content when working with Selenium are explicit waits. Explicit waits use a WebDriverWait object in combination with predefined conditions. This is immensely more flexible!

Copy the code that you wrote in interaction.py to load additional tracks into a new file that you can call observation.py:

Python: observation.py
      
    
import time
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By

options = Options()
options.add_argument("--headless")
driver = webdriver.Firefox(options=options)
driver.implicitly_wait(5)

driver.get("https://bandcamp.com/discover/")

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))

try:
    cookie_accept_button = driver.find_element(
        By.CSS_SELECTOR,
        "#cookie-control-dialog button.g-button.outline",
    )
    cookie_accept_button.click()
except NoSuchElementException:
    pass

pagination_button = driver.find_element(By.ID, "view-more")
pagination_button.click()

time.sleep(0.5)

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))

driver.quit()

Here, you’ve used time.sleep(0.5) to give Bandcamp time to load the requested tracks. This probably works for you, but how long your code should sleep depends on factors such as internet speed, which you can’t reliably predict.

But, while inspecting the page using your developer tools, you identified that the View more results button isn’t clickable while the page loads more results. It only becomes clickable again once Bandcamp has finished loading the new tracks.

Therefore, you can use an explicit wait to wait exactly as long as is necessary:

Python: observation.py
      
    
# ...

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# ...

wait = WebDriverWait(driver, 10)
wait.until(
    EC.element_to_be_clickable((By.ID, "view-more"))
)

tracks = driver.find_elements(By.CLASS_NAME, "results-grid-item")
print(len(tracks))

driver.quit()

You’ve removed time.sleep(0.5) and replaced it with an explicit wait condition. Specifically, you told Selenium to wait until the element with the id view-more, which is the pagination button, is clickable.

In the final codebase, you’ll structure this logic within a dedicated class method. This method will handle the click and then update your list of available tracks.

Once the new results have finished loading, you can continue with the code as before—searching for all track elements and printing how many there are before quitting the browser instance.

Calling .until() on a WebDriverWait object returns the element that Selenium was waiting for, so it’s a common approach to wait for a button to become clickable, and then click it like this:

Python
      
    
wait = WebDriverWait(driver, 10)

pagination_button = wait.until(
    EC.element_to_be_clickable((By.ID, "view-more"))
)

pagination_button.click()

However, as you’ve seen in the previous example, you can also use explicit waits without interacting with the element they return.

Choose From Common Expected Conditions

The expected_conditions module contains many conditions that your WebDriverWait object can wait on, such as:

Function	Description
`presence_of_element_located()`	Waits for an element to be present in the DOM.
`visibility_of_element_located()`	Waits for an element to be present and visible.
`element_to_be_clickable()`	Waits for an element to be visible and enabled for clicking.
`alert_is_present()`	Waits for a JavaScript alert to appear.
`title_is()` / `title_contains()`	Checks if the page title exactly matches or contains a substring.
`url_to_be()` / `url_contains()`	Verifies that the current URL exactly matches or contains a substring.

Check out the documentation on expected conditions for a full list. If none of the pre-built conditions fit your scenario, then you can also write a custom function.

For example, you may want to wait until at least one Bandcamp track has a non-empty text property. You can achieve this by defining a function using Python’s any() and a generator expression:

Python
      
    
wait = WebDriverWait(driver, 10)

def tracks_loaded(driver):
    track_cards = driver.find_elements(By.CLASS_NAME, "results-grid-item")
    return any(card.text.strip() for card in track_cards)

wait.until(tracks_loaded)

After defining tracks_loaded(), you can pass it to .until() as an argument. In this scenario, Selenium proceeds when tracks_loaded() returns a truthy value.

WebDriverWait also supports parameters for setting even more advanced wait conditions that match Java’s fluent wait:

timeout: Specifies the number of seconds to wait before timing out
poll_frequency: Specifies how long to wait in between calls and defaults to half a second
ignored_exceptions: Specifies which exceptions to ignore during waits and defaults to NoSuchElementException only

In the example above, you instantiated a WebDriverWait object with a timeout of 10. This means that Selenium will wait until the expected condition is met or ten seconds have passed, whichever happens earlier.

Similarly, you could also pass values for poll_frequency and ignored_exceptions to customize your explicit wait even more.
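To build an intuition for how these three parameters interact, here’s a simplified model of a polling wait in plain Python. It isn’t Selenium’s actual implementation. The poll_until() function and the simulated condition are hypothetical and only illustrate the idea:

```python
import time


def poll_until(condition, timeout, poll_frequency=0.5, ignored_exceptions=()):
    """Call `condition` repeatedly until it's truthy or `timeout` passes."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored_exceptions:
            pass  # Treat ignored exceptions like a falsy result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


# Simulated condition: fails twice, then succeeds on the third call
attempts = []


def element_appears():
    attempts.append(time.monotonic())
    if len(attempts) < 3:
        raise LookupError("element not in the DOM yet")
    return "element"


result = poll_until(
    element_appears,
    timeout=2,
    poll_frequency=0.05,
    ignored_exceptions=(LookupError,),
)
print(result, len(attempts))
```

In this model, a shorter poll_frequency notices the change sooner at the cost of more calls, while any exception that isn’t listed in ignored_exceptions ends the wait immediately.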

In most cases, it’s a good idea to work with explicit waits because they let you precisely define what condition you’re waiting for.

Handle Synchronization Issues

Even with explicit waits, you can still run into flaky tests or scripts if the site triggers multiple asynchronous events. It’s best to identify the most stable sign that the page is done updating. If you rely on less stable signals, then your script might fail sporadically under a heavier server load.

Some websites behave differently when you run a headless browser. For example, certain animations or transitions might run faster—or not at all—in headless mode. If you suspect a bug is related to headless mode, try removing the "--headless" flag to watch what’s actually happening on screen.

Sites often display cookie consent pop-ups or promotional overlays, like the one you learned to tackle in an earlier section. As mentioned there, you may need to find the cookie consent button and click it before you can interact with the site underneath.

Of course, you can also rely on an explicit wait if the cookie overlay loads asynchronously. This ensures your script doesn’t attempt to click before the overlay is visible.

Sites may also trigger JavaScript alerts. You can use built-in Selenium functionality to switch to such alerts and dismiss them. If you can’t predict whether the alert will appear, then you can handle that using a try...except block that catches NoAlertPresentException:

Python
      
    
from selenium.common.exceptions import NoAlertPresentException

try:
    alert = driver.switch_to.alert
    alert.dismiss()  # Or alert.accept()
except NoAlertPresentException:
    pass

This construct utilizes Selenium’s .switch_to property. It’ll dismiss a JavaScript alert if the site triggers one. If there’s no alert but you attempt to switch to one, then Selenium will raise a NoAlertPresentException and your code will pass and continue normally.

You won’t need to handle these types of alerts for automating your music player.

Now that you’ve covered navigating, clicking, entering text, and waiting for content to load, you’re almost ready to put everything together for your Bandcamp-based music player. However, as your code grows, you should avoid stuffing all these selectors and waits into a single file. That’s where the Page Object Model (POM) comes in.

In the next section, you’ll learn how the POM helps isolate your interactions with the Discover page from your business logic that decides which track to play. You’ll define Python classes to represent pages and components. This approach keeps your code modular, testable, and easier to maintain, even if Bandcamp changes its layout.

Implement the Page Object Model (POM)

As you’ve seen, Selenium can handle just about any web interaction. But if you keep piling all your code into one file, then you’ll end up with a mess of locators, wait statements, and repeated logic. This is where the Page Object Model (POM) shines by separating page structure from business logic.

Understand the POM Design Pattern

POM is a design pattern where you represent each significant page component with a dedicated class in your code. This class knows how to locate and interact with elements in that specific area. The rest of your application uses these classes without worrying about the underlying selectors or waiting logic that are specific to that element.

Implementing the POM for your Selenium applications—whether it’s for automated testing or for other web automation tasks—makes your code more maintainable and stable:

[Using Page Objects] reduces the amount of duplicated code and means that if the UI changes, the fix needs only to be applied in one place. (Source)

To go about building a Selenium project following the POM, you need to understand the web page, or web pages, that you’ll interact with. A straightforward first step is to create a new class for each separate web page that you’re dealing with. You could add these classes to a module named pages.py.

However, the POM suggests that you primarily focus on page elements, or panels, more than on full pages:

Despite the term “page” object, these objects shouldn’t usually be built for each page, but rather for the significant elements on a page.

—Martin Fowler (Source)

For any more complex web automation, you’ll need to interact with parts of a web page rather than a full page. When you model these elements as separate classes, you can test them in isolation.

In this tutorial, you’ll add classes that model page elements in a module named elements.py. If you’re dealing with more complex pages and elements, then it may make sense to create separate modules for each page object that you create. Decide what works best for you—your aim should be to keep your code well-organized and maintainable.

Because locators are the most brittle aspect of the application that you’re building, you’ll add locators into a separate module called locators.py. Alternatively, you could also include the relevant locators as attributes directly in each page object.

Finally, both pages and elements often have a base class that they inherit from. In this tutorial, you’ll split off these base classes into a separate module called base.py that they’ll share with other basic settings for the web automation part of your music player.

When you follow these suggestions, you’ll end up with a basic file structure consisting of four modules and an __init__.py file:

bandcamp/
├── __init__.py
├── base.py
├── elements.py
├── locators.py
└── pages.py

Adding an __init__.py file marks bandcamp as a regular Python package that you can reference in imports. You don’t need to add any content to the file for this to work.

With this fundamental structure set up, you can start to practice thinking in terms of the POM. Fetch a pen and some paper, open up the Bandcamp Discover page, turn on some new music if you want to, and sketch the significant page components that you’ll want to interact with to build the music player.

When you’re done, compare your sketch with the implementation that you’ll build out in the rest of this tutorial.

Here are some classes that you could build to split out significant components of the Bandcamp Discover page following the POM:

DiscoverPage (in bandcamp/pages.py) represents the full Discover page of Bandcamp. It’ll need methods to accept cookies and expose the list of track items.
TrackListElement (in bandcamp/elements.py) represents the grid of track items. It’ll contain methods to load new tracks and to find track items in the DOM.
TrackElement (in bandcamp/elements.py) represents a single track item. This object will know how to play and pause a track, and fetch information about the track, such as album, artist, and genre.

Note that this isn’t necessarily the only way you could model the Bandcamp Discover page following the POM, but it’s what you’ll keep working with in this tutorial.

At this point, you’re not yet thinking about the front-end functionality of your music player app. The POM focuses on presenting a middle layer between the UI of your web page and whatever code you’ll stick to the other end. Often, that code will be automated tests for the site. In your case, it’ll be a music player.

Establish a Base for Pages and Elements

A good first step is to create base classes that all your page and element classes will inherit from. In this project, you’ll place this code into a separate file, base.py. However, you could also keep a base class for pages in pages.py, and a base class for elements in elements.py.

Note: Remember to create the files in this section inside a package you name bandcamp, as shown in the folder structure further up.

The base classes should handle setup logic that applies to all child elements, such as setting a viewport size and initializing a WebDriverWait object:

Python: bandcamp/base.py
      
    
from selenium.webdriver.remote.webdriver import WebDriver
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support.wait import WebDriverWait

MAX_WAIT_SECONDS = 10.0
DEFAULT_WINDOW_SIZE = (1920, 3000)

class WebPage:
    def __init__(self, driver: WebDriver) -> None:
        self._driver = driver
        self._driver.set_window_size(*DEFAULT_WINDOW_SIZE)
        self._driver.implicitly_wait(5)
        self._wait = WebDriverWait(driver, MAX_WAIT_SECONDS)

class WebComponent(WebPage):
    def __init__(self, parent: WebElement, driver: WebDriver) -> None:
        super().__init__(driver)
        self._parent = parent

In this initial version of base.py, you define two classes: WebPage and WebComponent.

WebPage functions as the base class. Using two constants, it fixes the browser window size to a larger area to fit more items, and initializes a WebDriverWait object with a maximum wait time of ten seconds.

Note: While using one default setup for WebDriverWait works in this example, you may need differently initialized wait objects for more complex scenarios. For example, if you’re dealing with a brittle element that requires you to ignore certain exceptions, or adapt the polling frequency.

The second class, WebComponent, retains a reference to a parent WebElement in addition to inheriting the same driver and wait object from WebPage. You’re importing the WebElement class from webelement so you can properly type hint the parent parameter. To avoid confusion with this Selenium-provided class, you name your base class for your web elements WebComponent instead.

Using a base.py module can help you keep shared logic like waiting or standard properties in one place, if that’s a good fit for your project. Keep in mind that some more complex web automation or testing scenarios may require you to set up custom wait logic for different page objects.

Describe Your Web Page as a Page Object

Now it’s time to think about the structure of the specific page that you want to test or automate. In your case, that’s the Bandcamp Discover page. But you won’t need everything that’s on the website. Take another look at the live page and consider what you really need to model in this high-level page object that represents the overall Discover page.

You want to play music, so you need access to the tracks. Also, you may need to deal with a cookie consent form. In the code below, you identify the track list container and implement logic to dismiss the cookie consent form:

Python: bandcamp/pages.py
      
    
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.remote.webdriver import WebDriver

from bandcamp.base import WebPage
from bandcamp.elements import TrackListElement
from bandcamp.locators import DiscoverPageLocator

class DiscoverPage(WebPage):
    """Model the relevant parts of the Bandcamp Discover page."""

    def __init__(self, driver: WebDriver) -> None:
        super().__init__(driver)
        self._accept_cookie_consent()
        self.discover_tracklist = TrackListElement(
            self._driver.find_element(*DiscoverPageLocator.DISCOVER_RESULTS),
            self._driver,
        )

    def _accept_cookie_consent(self) -> None:
        """Accept the necessary cookie consent."""
        try:
            self._driver.find_element(
                *DiscoverPageLocator.COOKIE_ACCEPT_NECESSARY
            ).click()
        except NoSuchElementException:
            pass

You reuse the WebPage base class that you set up in base.py and allow your new DiscoverPage to inherit from it. That gives you access to a web driver as self._driver and an explicit wait as self._wait.

When you instantiate DiscoverPage, it automatically clicks the Accept necessary cookies button if it’s present, then sets up a TrackListElement. This is a natural, real-world representation of the page with a track list.

Because the track list is a significant element on the Discover page, it deserves its own page object.

Note: If using page in this context bothers you, you’re in good company and can think of it as a panel object instead.

You’ll model page elements in a dedicated file, elements.py, so both the import on line 5 as well as the instantiation of TrackListElement on lines 14 to 17 are just promises of code-to-be for now.

The same goes for the references to DiscoverPageLocator from your locators module and its use in the helper method ._accept_cookie_consent(). But don’t worry, you’ll address all of this in just a moment!

For now, you can revel in the accomplishment of describing the parts of Bandcamp’s Discover page that are relevant for your music player as a true page object!

Define Reusable Web Elements

Onward to elements.py, which is where most of the POM-related abstraction of this Selenium project takes place. Again, you can take a step back and consider which elements on the page are significant for interaction. Two elements stand out:

TrackListElement models the container element that harbors all the music tracks, and the button to load more results.
TrackElement models a single music track and should allow you to play and pause tracks, as well as fetch track information.

Starting with the TrackListElement, it needs to remember which tracks are available, and it needs functionality to load additional tracks:

Python: bandcamp/elements.py
      
    
from selenium.webdriver.remote.webdriver import WebDriver
from selenium.webdriver.remote.webelement import WebElement
from selenium.webdriver.support import expected_conditions as EC

from bandcamp.base import WebComponent
from bandcamp.locators import TrackListLocator

class TrackListElement(WebComponent):
    """Model the track list on Bandcamp's Discover page."""

    def __init__(self, parent: WebElement, driver: WebDriver) -> None:
        super().__init__(parent, driver)
        self.available_tracks = self._get_available_tracks()

    def load_more(self) -> None:
        """Load additional tracks."""
        view_more_button = self._driver.find_element(
            *TrackListLocator.PAGINATION_BUTTON
        )
        view_more_button.click()
        # The button is disabled until all new tracks are loaded.
        self._wait.until(
            EC.element_to_be_clickable(TrackListLocator.PAGINATION_BUTTON)
        )
        self.available_tracks = self._get_available_tracks()

    def _get_available_tracks(self) -> list:
        """Find all currently available tracks."""
        self._wait.until(
            self._track_text_loaded,
            message="Timeout waiting for track text to load",
        )

        all_tracks = self._driver.find_elements(*TrackListLocator.ITEM)

        # Filter tracks that are displayed and have text.
        return [
            TrackElement(track, self._driver)
            for track in all_tracks
            if track.is_displayed() and track.text.strip()
        ]

    def _track_text_loaded(self, driver):
        """Check if the track text has loaded."""
        return any(
            e.is_displayed() and e.text.strip()
            for e in driver.find_elements(*TrackListLocator.ITEM)
        )

You’ve implemented the track list as a page object. In this code, you set up a list of tracks, each represented by a TrackElement. You’ll write code for this final page object next. You also set up some sanity checks, to confirm that tracks are displayed and contain at least some text.

While this may look like a lot of code, you’ve already encountered and reasoned about much of it before! In previous sections, you’ve implemented the same functionality to load additional tracks into view that you’ve now packaged into .load_more(). After first locating the right button, you click it, and then use an explicit wait to pause until all new tracks are loaded.

Earlier, you also searched for tracks. Now, you’ve added some checks to confirm that these tracks are displayed and contain at least some text. You’ve also seen similar code in the custom wait condition of ._track_text_loaded() when learning about expected conditions. It’s great to see it all coming together!

With TrackListElement set up, you’re more than halfway there. Next, you’ll set up the missing TrackElement to model individual track panels. In your POM classes, you define methods that represent user actions. For example, the TrackElement might have .play() and .pause() methods:

Python: bandcamp/elements.py
      
    
# ...

from bandcamp.locators import TrackListLocator, TrackLocator

# ...

class TrackElement(WebComponent):
    """Model a playable track on Bandcamp's Discover page."""

    def play(self) -> None:
        """Play the track."""
        if not self.is_playing:
            self._get_play_button().click()

    def pause(self) -> None:
        """Pause the track."""
        if self.is_playing:
            self._get_play_button().click()

    @property
    def is_playing(self) -> bool:
        return "Pause" in self._get_play_button().get_attribute("aria-label")

    def _get_play_button(self):
        return self._parent.find_element(*TrackLocator.PLAY_BUTTON)

By adding this code, you model a track element and expose the most important interactions with it: pressing play to start the song, and pause to stop it. To round off the logic, you add .is_playing as a property, and ._get_play_button() as a helper method that locates and returns the play button in a track element.

Notice that ._get_play_button() uses self._parent.find_element(). Because TrackElement inherits from WebComponent, it has a ._parent attribute and its parent is the track’s container, not the entire page. This method encapsulates how you play a track, leaving your higher-level code to simply call .play() on a TrackElement.
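Because a track component only talks to its parent element, you can exercise the play and pause logic without a browser by handing it a stand-in parent. The sketch below rests on a few assumptions: _FakeButton, _FakeTrackContainer, and the trimmed-down TrackSketch class are hypothetical, and the "Play" label is assumed rather than taken from Bandcamp’s markup:

```python
class _FakeButton:
    """Toggle an aria-label on click, like a play/pause button might."""

    def __init__(self) -> None:
        self._label = "Play"  # Assumed label for the paused state
        self.clicks = 0

    def click(self) -> None:
        self.clicks += 1
        self._label = "Pause" if self._label == "Play" else "Play"

    def get_attribute(self, name: str):
        return self._label if name == "aria-label" else None


class _FakeTrackContainer:
    """Stand in for the parent WebElement of a track."""

    def __init__(self) -> None:
        self.button = _FakeButton()

    def find_element(self, by: str, value: str) -> _FakeButton:
        return self.button


class TrackSketch:
    """Trimmed-down copy of the play/pause logic, without Selenium."""

    PLAY_BUTTON = ("css selector", "button.play-pause-button")

    def __init__(self, parent) -> None:
        self._parent = parent

    def play(self) -> None:
        if not self.is_playing:
            self._get_play_button().click()

    def pause(self) -> None:
        if self.is_playing:
            self._get_play_button().click()

    @property
    def is_playing(self) -> bool:
        # Fall back to an empty string if the attribute is missing
        label = self._get_play_button().get_attribute("aria-label") or ""
        return "Pause" in label

    def _get_play_button(self):
        return self._parent.find_element(*self.PLAY_BUTTON)


container = _FakeTrackContainer()
track = TrackSketch(container)

track.play()
track.play()  # Already playing, so no second click
playing_after_play = track.is_playing
track.pause()
print(playing_after_play, track.is_playing, container.button.clicks)
```

The guard clauses in .play() and .pause() are what keep a repeated call from toggling the track back into the wrong state.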

Finally, you also want to be able to access information about each track. The TrackElement page object is the right place for setting up that logic as well:

Python: bandcamp/elements.py
      
    
from selenium.common.exceptions import NoSuchElementException

# ...

class TrackElement(WebComponent):
    # ...

    def _get_track_info(self) -> dict:
        """Create a representation of the track's relevant information."""
        full_url = self._parent.find_element(*TrackLocator.URL).get_attribute(
            "href"
        )
        # Cut off the referrer query parameter
        clean_url = full_url.split("?")[0] if full_url else ""
        # Some tracks don't have a genre
        try:
            genre = self._parent.find_element(*TrackLocator.GENRE).text
        except NoSuchElementException:
            genre = ""
        return {
            "album": self._parent.find_element(*TrackLocator.ALBUM).text,
            "artist": self._parent.find_element(*TrackLocator.ARTIST).text,
            "genre": genre,
            "url": clean_url,
        }

You set up another helper method that identifies the album, artist, genre, and album URL. Because not all artists like to see their music shoved into genre boxes, Bandcamp apparently made adding genres optional. Your code must account for that, so you set up another try...except block that adds an empty string if there’s no genre information provided.

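You also use a conditional expression when you assign clean_url to cut off a possible referrer query parameter and display a clean URL to the album page. If the link has no href attribute, then the expression falls back to an empty string.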
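To see what that conditional expression does in isolation, here’s a minimal sketch that applies the same logic to plain strings. The strip_query() name and the example.com URLs are made up for illustration and aren’t part of the project code:

```python
def strip_query(full_url):
    """Return the URL without its query string, or "" if there's no URL."""
    # .get_attribute("href") may return None, so guard before splitting
    return full_url.split("?")[0] if full_url else ""


print(strip_query("https://example.com/album/demo?from=discover_page"))
print(strip_query("https://example.com/album/demo"))
print(repr(strip_query(None)))
```

The first two calls print the same clean URL, and the last one prints an empty string instead of raising an AttributeError.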

Great! With this addition, you’ve finished setting up both pages.py and elements.py, actively implementing the POM in your web automation design. But there’s still one promise left to fulfill. You need to fill locators.py to allow all your page objects to utilize the locators they so sorely need to find anything on the page.

Keep Locators Separate

Locators are one of the most brittle aspects in web automation. They can quickly change, so it’s good practice to store locators in a dedicated file. That way, if a CSS selector or ID changes on the page—and they will—then you only need to look in one place to fix it.

You’ve already used the unpacking operator syntax together with some descriptive class names in the code you previously wrote. Now, it’s time to give *TrackLocator.ARTIST some meaning by setting up the classes and class attributes that you targeted with that code:

Python: bandcamp/locators.py

from selenium.webdriver.common.by import By

class DiscoverPageLocator:
    DISCOVER_RESULTS = (By.CLASS_NAME, "results-grid")
    COOKIE_ACCEPT_NECESSARY = (
        By.CSS_SELECTOR,
        "#cookie-control-dialog button.g-button.outline",
    )

class TrackListLocator:
    ITEM = (By.CLASS_NAME, "results-grid-item")
    PAGINATION_BUTTON = (By.ID, "view-more")

class TrackLocator:
    PLAY_BUTTON = (By.CSS_SELECTOR, "button.play-pause-button")
    URL = (By.CSS_SELECTOR, "div.meta p a")
    ALBUM = (By.CSS_SELECTOR, "div.meta p a strong")
    GENRE = (By.CSS_SELECTOR, "div.meta p.genre")
    ARTIST = (By.CSS_SELECTOR, "div.meta p a span")

In this example implementation, you store the relevant locators in classes that you name after the corresponding page objects you defined. You store them as tuples of two elements each. The first element records which locator strategy you use and the second element is the locator string.

This allows you to use the unpacking operator to provide both as arguments to .find_element() and related methods.
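The following sketch shows the unpacking mechanics without a browser. Here, fake_find_element() is a hypothetical stand-in for Selenium’s .find_element(by, value), and the string "css selector" is the value that By.CSS_SELECTOR holds in Selenium:

```python
class TrackLocator:
    # Same shape as in locators.py: (strategy, locator string)
    PLAY_BUTTON = ("css selector", "button.play-pause-button")


def fake_find_element(by, value):
    """Stand-in for .find_element() that only reports what it received."""
    return f"looking up {value!r} by {by}"


# The unpacking operator spreads the tuple over both parameters...
unpacked = fake_find_element(*TrackLocator.PLAY_BUTTON)
# ...so it's equivalent to passing the strategy and string separately
explicit = fake_find_element("css selector", "button.play-pause-button")

print(unpacked)
print(unpacked == explicit)
```

Because both calls receive the same two arguments, the comparison in the last line is True.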

Note: It’s not necessary to wrap your locators into classes, and you may see them also defined as global constants in other Selenium projects.

However, using descriptively named classes that reference your page objects adds another layer of structure and improves maintainability.

Now, all your locators live in a single place! What bliss! You now only need to go update locators.py when Bandcamp decides to move track elements into a different container.

Your code can continue to reference DiscoverPageLocator.DISCOVER_RESULTS wherever it needs to find that container, even if it’s used in multiple places. You won’t need to hunt down locators scattered throughout your increasingly complex Selenium codebase. This approach improves maintainability and is self-documenting, so your colleagues will send you appreciative letters for years to come.

Enjoy the Benefits of the POM in Practice

Now that you’ve modeled all the page elements that you need for this project following the POM, you can give it a spin to enjoy the abstraction that this setup offers you. Navigate to the parent folder that also contains your bandcamp module, then start a new REPL session and play a song:

Python
      
>>> from selenium.webdriver import Firefox
>>> from bandcamp.pages import DiscoverPage

>>> BANDCAMP_DISCOVER_URL = "https://bandcamp.com/discover/"
>>> driver = Firefox()
>>> driver.get(BANDCAMP_DISCOVER_URL)

>>> page = DiscoverPage(driver)

>>> track_1 = page.discover_tracklist.available_tracks[0]
>>> track_1.play()
>>> track_1.pause()

>>> driver.quit()

This code keeps the head on your browser so that you can see your automation in practice. Selenium will open a new window when you instantiate Firefox, then navigate to the Discover page when you call .get() on it.

Next, you start using your page objects by instantiating DiscoverPage. You can now access all available tracks by stepping through .discover_tracklist, which is a TrackListElement, and indexing into its .available_tracks list. You pick the first TrackElement and call .play() on it. You can watch as your browser instance clicks on the play button, and the music starts playing!

When you follow the Page Object Model design pattern for your Selenium projects, you can avoid duplicating logic for actions that a user may take on the site you’re automating.

In your Bandcamp music player, it means that you’ve defined a central place for playing, pausing, or loading more tracks. If Bandcamp changes the class name of the container that holds the tracks, for example, then you just update DiscoverPageLocator.DISCOVER_RESULTS in locators.py. If the track item’s HTML changes, then you adjust TrackLocator, which also lives in locators.py.

Meanwhile, you could still use the same code you just played with in the REPL session above to play a song. This means that any code you write in your high-level music player can remain the same even if the page changes.

The POM approach is especially powerful if you want to add tests to your codebase. For example, you could write tests that verify you can pause a playing track, or that loading more tracks adds new items. Each test can reuse the same page objects, which ensures consistency.
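As a rough sketch of that idea, the tests below run against two hypothetical stand-ins, FakeTrack and FakeTrackList, which mimic the .play(), .pause(), .load_more(), and .available_tracks interface of your page objects. In a real test suite, you’d more likely create a DiscoverPage with a live driver in a fixture. The batch size of three tracks is an arbitrary assumption:

```python
class FakeTrack:
    """Stand-in for TrackElement that only records its playing state."""

    def __init__(self):
        self.is_playing = False

    def play(self):
        self.is_playing = True

    def pause(self):
        self.is_playing = False


class FakeTrackList:
    """Stand-in for TrackListElement with a made-up batch size."""

    def __init__(self, batch_size=3):
        self._batch_size = batch_size
        self.available_tracks = [FakeTrack() for _ in range(batch_size)]

    def load_more(self):
        new_tracks = [FakeTrack() for _ in range(self._batch_size)]
        self.available_tracks += new_tracks


def test_pause_stops_playing_track():
    track = FakeTrackList().available_tracks[0]
    track.play()
    track.pause()
    assert not track.is_playing


def test_load_more_adds_tracks():
    tracklist = FakeTrackList()
    before = len(tracklist.available_tracks)
    tracklist.load_more()
    assert len(tracklist.available_tracks) > before
```

Both test functions follow pytest’s naming convention, so pytest would collect them if you saved this code in a file whose name starts with test_.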

Now that your code is structured in a maintainable way, it’s time to build out the final feature, your text-based music player interface that uses these page objects under the hood.

You’ll set up one module to orchestrate the Selenium interactions, and another to provide a command-line loop for the user. By the end, you’ll have a project that, when run, launches Firefox headlessly, goes to Bandcamp, and lets you explore and play tracks in your terminal.

Build the Music Player App

You’ve got a robust Page Object Model for interacting with Bandcamp’s Discover page. Now it’s time to assemble the final pieces into a runnable, text-based music player. You want to continue to follow maintainable design patterns, so you’ll split this part of the project into a separate namespace, and divide it into two modules:

player.py will contain the logic for your music player app and utilize the page objects that you defined previously.
tui.py will provide the command-line loop and other functionality that focuses on the display of information.

To bundle the logic you wrote for the web automation into a single place, you’ll also introduce two top-level packages: web/ for the code you’ve written so far, and app/ for the code you’ll write in this section.

Finally, you’ll also create a small __main__.py file that’ll serve as the entry point for your project. Go ahead and restructure your project accordingly:

bandcamp/
│
├── app/
│   │
│   ├── __init__.py
│   ├── player.py
│   └── tui.py
│
├── web/
│   │
│   ├── __init__.py
│   ├── base.py
│   ├── elements.py
│   ├── locators.py
│   └── pages.py
│
├── __init__.py
└── __main__.py

After you’ve restructured the project and added the new, initially empty files, you may need to update some imports to make your code work together smoothly. Specifically, you’ll need to rename all imports that previously used bandcamp.module to bandcamp.web.module in pages.py and elements.py.

Note: Some integrated development environments (IDEs), such as PyCharm, will automatically change these imports for you when you move the files.

For convenience, you’ll also set up a data class that you name Track in base.py to replace the dictionary that TrackElement previously returned:

Python: bandcamp/web/base.py

from dataclasses import dataclass
from pprint import pformat

# ...

@dataclass
class Track:
    album: str
    artist: str
    genre: str
    url: str

    def __str__(self):
        return pformat(self)

# ...

With these additions to base.py, you’ve created a minimal data class that contains all the information you want to collect about a track. Using a data class instead of a dictionary in this case just makes it more straightforward to access and display the track information.

Note: To improve the readability of the output when you print a Track, you’ve also added a custom .__str__() method. It passes the instance to pformat(), which breaks the attributes onto separate lines whenever the representation is too long to fit on a single line. Sure, your music player will be text-based, but you can still make it a bit prettier!
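To make the difference concrete, the following sketch repeats the Track definition from above and compares it with the dictionary that it replaces. The album and artist values are placeholders:

```python
from dataclasses import dataclass
from pprint import pformat


@dataclass
class Track:
    album: str
    artist: str
    genre: str
    url: str

    def __str__(self):
        return pformat(self)


# Placeholder values for illustration only
as_dict = {
    "album": "Demo Album",
    "artist": "Demo Artist",
    "genre": "",
    "url": "",
}
as_track = Track(album="Demo Album", artist="Demo Artist", genre="", url="")

print(as_dict["album"])  # Key lookup with a string
print(as_track.album)  # Attribute access that editors can autocomplete
print(as_track)  # Goes through .__str__() and pformat()
```

Both lookups return the same value, but a typo in an attribute name is something that editors and type checkers can flag before you run the code.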

Take a side-step into elements.py and update the code to use this data class instead:

Python: bandcamp/web/elements.py

# ...

from bandcamp.web.base import WebComponent, Track

# ...

class TrackElement(WebComponent):
    # ...

    def _get_track_info(self) -> Track:
        # ...

        return Track(
            album=self._parent.find_element(*TrackLocator.ALBUM).text,
            artist=self._parent.find_element(*TrackLocator.ARTIST).text,
            genre=genre,
            url=clean_url,
        )

Here, you’ve updated the import to account for the new structure, added Track to it, and replaced the return value of ._get_track_info() so that it builds a Track for each element instead of a dictionary.

You can download the finished project code to preview the necessary updates:

Get Your Code: Click here to download the free sample code that shows you how to use Selenium in Python for modern web automation.

Now, you’re ready to build the core logic for your music player that utilizes the web automation code that you wrote and structured so beautifully in the previous section.

Create Your Music Player Class

First comes Player, a class that encapsulates the high-level logic of opening the browser, creating a DiscoverPage object, and providing simple methods like .play() and .pause(). Note that it’s similar to code you wrote in the earlier sections, as well as when you tested your POM structure in the REPL:

Python: bandcamp/app/player.py

from selenium.webdriver import Firefox
from selenium.webdriver.firefox.options import Options

from bandcamp.web.pages import DiscoverPage

BANDCAMP_DISCOVER_URL = "https://bandcamp.com/discover/"

class Player:
    """Play tracks from Bandcamp's Discover page."""

    def __init__(self) -> None:
        self._driver = self._set_up_driver()
        self.page = DiscoverPage(self._driver)
        self.tracklist = self.page.discover_tracklist
        self._current_track = self.tracklist.available_tracks[0]

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        """Close the headless browser."""
        self._driver.quit()

    def play(self, track_number=None):
        """Play the first track, or one of the available numbered tracks."""
        if track_number:
            self._current_track = self.tracklist.available_tracks[track_number - 1]
        self._current_track.play()

    def pause(self):
        """Pause the current track."""
        self._current_track.pause()

    def _set_up_driver(self):
        """Create a headless browser pointing to Bandcamp."""
        options = Options()
        options.add_argument("--headless")
        browser = Firefox(options=options)
        browser.get(BANDCAMP_DISCOVER_URL)
        return browser

The music player exposes its two core operations, playing and pausing, and carries them out on the track elements that it finds through .tracklist. Sure, this won’t be as engaging or good-looking as a first-generation iPod, but it’ll do the job.

The Player class also handles important setup and teardown logic:

Set up: When you instantiate a Player, it spins up a headless Firefox browser and navigates to Bandcamp’s Discover page using ._set_up_driver(). Then, Python constructs a DiscoverPage page object and fetches all available tracks through DiscoverPage.discover_tracklist, and finally sets the current track to the first available item.
Tear down: You define the .__enter__() and .__exit__() special methods, which allow you to use Player in a context manager and ensure that the browser closes automatically. No zombie foxes munching on your computer’s RAM!
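The teardown guarantee is easiest to see without a real browser. The sketch below is a reduced version of the same idea, where FakeDriver and MiniPlayer are hypothetical stand-ins for the Selenium driver and your Player class:

```python
class FakeDriver:
    """Hypothetical stand-in for a Selenium driver."""

    def __init__(self):
        self.is_open = True

    def quit(self):
        self.is_open = False


class MiniPlayer:
    """Reduced version of Player with only the context manager logic."""

    def __init__(self):
        self._driver = FakeDriver()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        # Runs on a normal exit and when the block raises an exception
        self._driver.quit()


try:
    with MiniPlayer() as player:
        driver = player._driver
        raise RuntimeError("Something went wrong mid-session")
except RuntimeError:
    pass

print(driver.is_open)  # False
```

Even though the with block raises an exception, .__exit__() still runs and closes the stand-in driver before the exception propagates.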

Other than that, you only set up the functionality to play and pause. Still, .play() does a bit more than start the current track. When you pass it a track number, it first switches the current track by indexing into the .available_tracks list, converting your one-based number into a zero-based index. Then, it calls the .play() method on that specific TrackElement. Similarly, .pause() calls .pause() on the current track element.

Because you’ve already done the heavy lifting in your POM classes, Player can remain clean. Aside from opening and closing the headless browser, you delegate all other interactions to the DiscoverPage object.

This is a high-level abstraction that makes sense. As a user, you’d also start by navigating to the main page and then use the interaction possibilities that the site offers. Because of the POM, you don’t need any raw selectors or waiting logic in player.py—those are hidden in the page object layers.

Assemble a Text-Based User Interface

With the logic in place, you only need to define the interface that you’ll use to interact with the page. This code will essentially be the front end of your music player. There are many third-party libraries available that allow you to build beautiful text-based user interfaces (TUIs) using Python.

To keep the scope smaller, you’ll stick with plain Python in this tutorial. But feel free to enhance your music player with a more advanced interface—like using Textual—if you’d like to take it further.

Start by defining the high-level interactions in interact(). The user interactions that you’ll define are a good use case for Python’s structural pattern matching:

Python: bandcamp/app/tui.py

from bandcamp.app.player import Player

MAX_TRACKS = 100  # Allows loading more tracks once.

def interact():
    """Control the player through user interactions."""
    with Player() as player:
        while True:
            print(
                "\nType: play [<track number>] | pause | tracks | more | exit"
            )
            match input("> ").strip().lower().split():
                case ["play"]:
                    play(player)
                case ["play", track]:
                    try:
                        track_number = int(track)
                        play(player, track_number)
                    except ValueError:
                        print("Please provide a valid track number.")
                case ["pause"]:
                    pause(player)
                case ["tracks"]:
                    display_tracks(player)
                case ["more"] if len(
                    player.tracklist.available_tracks
                ) >= MAX_TRACKS:
                    print(
                        "Can't load more tracks. Pick one from the track list."
                    )
                case ["more"]:
                    player.tracklist.load_more()
                    display_tracks(player)
                case ["exit"]:
                    print("Exiting the player...")
                    break
                case _:
                    print("Unknown command. Try again.")

Within interact() you use Player as a context manager. Remember that this is possible because you defined .__enter__() and .__exit__() in that class. Using Player as a context manager ensures that Python will close the headless browser when it exits the context manager.

Then, you set up an indefinite iteration using while True, which starts by printing the options that exist for interacting with the music player.

You then use match to capture the user input that you sanitize with .strip() and .lower(), and that you split into separate items to account for the possibility that users enter play followed by a track number.

Then, you add a number of case blocks that use structural pattern matching to correctly route the possible commands to the intended functionality. Note that you haven’t written some of these functions yet.

Your code also does a couple of sanity checks and uses case _ as the final catch-all for any unwanted inputs. The interactive loop continues until the user types exit.

You’ll next need to set up the missing functions play() and pause(), which mainly hand functionality over to the player instance:

Python: bandcamp/app/tui.py

# ...

def play(player, track_number=None):
    """Play a track and show info about the track."""
    try:
        player.play(track_number)
        print(player._current_track._get_track_info())
    except IndexError:
        print(
            "Please provide a valid track number. "
            "You can list available tracks with `tracks`."
        )

def pause(player):
    """Pause the current track."""
    player.pause()

In addition to calling player.pause() and player.play(), the play() function of your TUI prints a representation of the playing track to the console. It also catches the IndexError that Python raises when the number entered doesn’t correspond to a track in your player’s tracklist, and prints a hint instead.

Note: Currently, the interaction allows you to use negative indices to play tracks. That can be helpful—or unnecessary—so feel free to add more input validation to your liking.
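If you’d rather reject zero and negative numbers, then one possible approach is to validate the input before it reaches the player. The parse_track_number() helper below is a hypothetical addition, not part of the project code, and it assumes that you pass it the current number of available tracks:

```python
def parse_track_number(raw, track_count):
    """Return a one-based track number, or None if the input is invalid."""
    try:
        track_number = int(raw)
    except ValueError:
        return None
    # Reject zero, negative numbers, and numbers past the end of the list
    if 1 <= track_number <= track_count:
        return track_number
    return None


print(parse_track_number("3", track_count=10))
print(parse_track_number("-1", track_count=10))
print(parse_track_number("abc", track_count=10))
```

In interact(), you could call a helper like this in the play branch and print a hint whenever it returns None.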

Finally, you need a way to represent the list of available songs in your TUI. For this, you’ll define display_tracks() to neatly print all discovered tracks in a tabular style:

Python: bandcamp/app/tui.py

# ...

COLUMN_WIDTH = CW = 30

# ...

def display_tracks(player):
    """Display information about the currently playable tracks."""
    header = f"{'#':<5} {'Album':<{CW}} {'Artist':<{CW}} {'Genre':<{CW}}"
    print(header)
    print("-" * len(header))  # Match the separator to the header width
    for track_number, track_element in enumerate(
        player.tracklist.available_tracks, start=1
    ):
        track = track_element._get_track_info()
        album = _truncate(track.album, CW)
        artist = _truncate(track.artist, CW)
        genre = _truncate(track.genre, CW)
        print(
            f"{track_number:<5} {album:<{CW}} {artist:<{CW}} {genre:<{CW}}"
        )


def _truncate(text, width):
    """Truncate track information."""
    return text[: width - 3] + "..." if len(text) > width else text

In this code, you use f-strings to set up a column header, which you assign to header, and to interpolate the track information pieces for each track as a line item.

Note how you’re using CW as a shortcut for the constant COLUMN_WIDTH in order to set up consistently sized cells using Python’s format mini-language. You also use a helper function, _truncate(), to abbreviate longer pieces of information.
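Here’s a small sketch of how padding and truncation work together. It reuses _truncate() from above but sets CW to 12 instead of 30 to keep the output short, and the album titles are placeholders:

```python
CW = 12  # Narrower than the tutorial's 30 to keep the output short


def _truncate(text, width):
    """Truncate track information."""
    return text[: width - 3] + "..." if len(text) > width else text


short = _truncate("Demo", CW)
long = _truncate("A Very Long Album Title", CW)

# "<" left-aligns, and the nested {CW} supplies the field width
row = f"{1:<5} {short:<{CW}} {long:<{CW}}|"
print(row)
print(len(row))
```

The short title gets padded with spaces and the long one gets cut to exactly CW characters, including the three dots, so every cell ends up the same width.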

To display each available track, you iterate over .available_tracks, which is an attribute on your TrackListElement and contains all currently playable TrackElement objects. But of course, your code in tui.py doesn’t need to know much about these page objects, other than their attributes and methods.

Make Your App Executable

Finally, you package your script with a __main__.py file that calls the interact() function:

Python: bandcamp/__main__.py

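from bandcamp.app.tui import interact

def main():
    """Provide the main entry point for the app."""
    interact()

# Also start the player when you run the package with python -m bandcamp
if __name__ == "__main__":
    main()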

As the docstring explains, this file is the entry point for your entire application. To follow best practices, you should also set up a pyproject.toml file and declare the entry point:

TOML: pyproject.toml

[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "bandcamp_player"
version = "0.1.0"
requires-python = ">=3.10"
description = "A web player for Bandcamp using Selenium"
dependencies = [
    "selenium",
]

[project.scripts]
discover = "bandcamp.__main__:main"

Move this file so that it lives in the same folder as your bandcamp package:

./
│
├── bandcamp/
│
└── pyproject.toml

This way, the build backend can find your bandcamp package, and the discover command can resolve the bandcamp.__main__:main entry point.
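The value that you assign to discover follows a module:function format. Roughly speaking, the installed discover command imports the module before the colon and calls the function after it. The load_entry_point() helper below is a hypothetical simplification of that lookup, not the code that your installer actually generates:

```python
from importlib import import_module


def load_entry_point(spec):
    """Resolve a "module:function" string to the function it names."""
    module_name, _, function_name = spec.partition(":")
    module = import_module(module_name)
    return getattr(module, function_name)


# With your package installed, "bandcamp.__main__:main" would resolve the
# same way. A standard-library target keeps this sketch runnable anywhere.
dumps = load_entry_point("json:dumps")
print(dumps({"command": "discover"}))
```

This also shows why main() needs to exist as a named function in __main__.py: the entry point refers to it by name.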

With pyproject.toml set up, you can now install the package locally and run the music player using the command you define in [project.scripts]:

Shell
      
(venv) $ python -m pip install .
...
(venv) $ discover

If you set up everything correctly, then you’ll get to see and interact with your very own Bandcamp Discover music player:

Text

Type: play [<track number>] | pause | tracks | more | exit
>

Type tracks to see a list of available tracks. Then type play 3 to start playing the third track. Type pause to pause, or more to load additional tracks if you want to see a bigger list. Finally, to quit, type exit.

But you don’t need to exit the player yet. You can stick around for a while and discover some new music. That was the whole point of building this music player after all!

Next Steps

You’ve built a fun, fully operational command-line music player that quietly runs a real Firefox session in the background. Pretty cool! You’ve also nearly reached the end of this tutorial—phew! However, that doesn’t need to be the end of your music player. Now that you have a working first version, you could expand it in different ways:

Search by genre: Automate clicking a specific genre filter.
Test: Add tests using pytest or unittest that validate the track list loads, or that the play command works.
Add a rich TUI: Use a more advanced text-based UI library, such as Textual, to add color or interactive menus.

Your Bandcamp player is just one example of how Selenium can automate a website for fun or practical workflows. By following best practices in Selenium development, such as implementing the Page Object Model design pattern, you’ve developed a codebase that’s both approachable and powerful. Without too much effort, you can now add more features or integrate a test suite.

Conclusion

In this tutorial, you’ve learned how to use Python and Selenium to automate web interactions and create a functional, text-based music player that interacts with Bandcamp’s Discover page. You explored essential concepts such as navigating web pages, locating and interacting with elements, handling dynamic content, and structuring your code using the Page Object Model (POM) for maintainability and scalability.

Understanding web automation opens up a wide range of possibilities, from testing web applications to scraping data and automating repetitive tasks. Learning to effectively automate web interactions can significantly enhance your efficiency and productivity.

In this tutorial, you’ve learned how to:

Set up and configure Selenium for web automation with Python.
Locate and interact with web elements using different strategies.
Handle dynamic content with explicit waits and expected conditions.
Implement the Page Object Model to organize and maintain your automation code.
Build a functional command-line music player on top of your web automation project.

You may often need to replicate user behavior in a browser, and with these skills, you can now automate complex web interactions and build robust automation scripts. Selenium is widely used for automated testing in quality assurance and web scraping, but it’s also helpful for tasks like verifying SEO metadata, generating screenshots, or even filling out complicated forms. Once you master the fundamentals, the only limit is your imagination.

Keep exploring the Selenium documentation, and you’ll find even more advanced techniques—like orchestrating multiple browsers, capturing network logs, or integrating with headless testing frameworks. But for now, take a moment to tune into Bandcamp using the interactive music player you built, powered by Selenium and Python.

Frequently Asked Questions

Now that you have some experience using Selenium for web automation in Python, you can use the questions and answers below to check your understanding and recap what you’ve learned.

These FAQs are related to the most important concepts you’ve covered in this tutorial.

You use Selenium in Python to automate interactions with web browsers, allowing you to perform tasks like filling forms, clicking buttons, scraping data, and writing automated tests for web applications.

You set up Selenium with Python by installing the selenium package using pip, downloading the appropriate WebDriver for your browser, and configuring your environment to recognize the WebDriver.

You create a headless browser in Selenium by configuring your WebDriver options to run the browser without a visible interface using the "--headless" argument.

You use Selenium for web scraping by navigating to web pages, locating elements using various selectors, and extracting the desired data while handling dynamic content and JavaScript-generated elements.

You can interact with dynamic web content in Selenium by using explicit waits to ensure that elements are present and ready before interacting with them. This helps you avoid errors caused by trying to access elements that haven’t loaded yet.

The Page Object Model (POM) is a design pattern often used in web automation projects. It helps to structure your code by representing each significant page component as a class, separating the page structure from the business logic for better maintainability.

In Selenium, you can use implicit waits as a browser-wide setting to poll the DOM. You set up explicit waits through WebDriverWait objects, which allow for more flexible and targeted waiting conditions. Fluent waits are available in the Java implementation of Selenium. In Python, you can reproduce their functionality by passing arguments when instantiating a WebDriverWait object.
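In Python, the arguments in question are timeout, poll_frequency, and ignored_exceptions, which WebDriverWait accepts alongside the driver. To illustrate what these settings control without requiring a browser, the sketch below implements a simplified polling loop. The wait_until() function is a hypothetical stand-in and not Selenium’s implementation:

```python
import time


def wait_until(condition, timeout, poll_frequency=0.5, ignored_exceptions=()):
    """Poll condition() until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored_exceptions:
            pass  # Treat ignored exceptions like "not ready yet"
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


attempts = []


def ready_on_third_try():
    """Simulate an element that only shows up on the third lookup."""
    attempts.append(1)
    if len(attempts) < 3:
        raise LookupError("Element not there yet")
    return "ready"


result = wait_until(
    ready_on_third_try,
    timeout=2,
    poll_frequency=0.01,
    ignored_exceptions=(LookupError,),
)
print(result, len(attempts))
```

A shorter polling interval reacts faster to changes on the page at the cost of more lookups, while the ignored exceptions decide which errors count as not ready yet instead of ending the wait.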
