Checking a Website's Connectivity

Building a Site Connectivity Checker Darren Jones 06:56

00:00 Check a Website’s Connectivity in Python.

00:05 At this point, you should have a suitable Python virtual environment with your project’s dependencies installed in a project directory containing all the files that you’ll use throughout the course. So it’s time to start coding.

00:17 Before jumping into the fun stuff, go ahead and add the application’s version number to the __init__.py module in your rpchecker package, as seen on-screen.

00:30 The __version__ module-level constant holds your project’s current version number. Because you’re creating a brand-new app, the initial version is set to 0.1.0. With this minimal setup, you can start implementing the connectivity-checking functionality.

00:48 There are several Python tools and libraries that you can use to check if a website is online at a given time. For example, a popular option is the requests third-party library, which allows you to perform HTTP requests using a human-readable API.

01:02 However, using requests has the drawback of requiring you to install an external library just to use a small part of its functionality. It would be more efficient to find an appropriate tool in the Python standard library. With a quick look at the standard library, you’ll find the urllib package, which provides several modules for handling HTTP requests.

01:23 For example, to check if a website is online, you can use the urlopen() function from the urllib.request module, as seen on-screen.

01:41 The urlopen() function takes a URL, given as a string or a Request object, and opens it, returning a response object from which you can read the page’s content. But you just need to check if the website is online, so downloading the entire page would be wasteful.

01:53 You need something more efficient. What about a tool that gives you lower-level control over your HTTP request? That’s where the http.client module comes in.

02:05 This module provides the HTTPConnection class, representing a connection to a given HTTP server. HTTPConnection has a .request() method that allows you to perform HTTP requests using the different HTTP methods. For this project, you can use the HEAD HTTP method to ask for a response containing only the headers of the target website.

02:29 This option will reduce the amount of data to download, making your connectivity checker app more efficient. At this point, you have a clear idea of the tool to use.
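
To make the difference between downloading a page and asking only for its headers concrete, here is a minimal sketch that runs entirely on your own machine. It is not part of the course code: DemoHandler, BODY, and body_size() are hypothetical names invented for this illustration, and the throwaway local server simply stands in for a real website.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"<html>" + b"x" * 10_000 + b"</html>"  # stand-in for a large page


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers GET and HEAD for the same page."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)  # GET sends the headers and the full body

    def do_HEAD(self):
        self._send_headers()  # HEAD sends the headers only

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def body_size(method, port):
    """Return how many body bytes a request with the given method downloads."""
    connection = HTTPConnection("127.0.0.1", port=port, timeout=5)
    try:
        connection.request(method, "/")
        response = connection.getresponse()
        return len(response.read())
    finally:
        connection.close()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

get_bytes = body_size("GET", port)
head_bytes = body_size("HEAD", port)
print(f"GET downloaded {get_bytes} body bytes, HEAD downloaded {head_bytes}")

server.shutdown()
server.server_close()
```

The HEAD request still tells you that the server answered, but it transfers no page body, which is all a connectivity check needs.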

02:39 Now you can go and do some quick tests. Go ahead and run the following code in a Python interactive session. First, HTTPConnection is imported, and then a connection instance is made targeting the pypi.org website using port 80, which is the default HTTP port.

03:00 The timeout argument provides the number of seconds to wait before timing out the connection. Next, you perform a HEAD request on the site’s root path, "/", using the .request() method. To get the actual response from the server, you call .getresponse() on the connection object.

03:20 You can inspect the response headers by calling .getheaders().
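
The interactive session shown on-screen isn't included in this transcript. The steps it describes can be sketched roughly as follows. Wrapping them in a helper named head_headers() is an assumption made here so the steps can be reused; the default of 10 seconds for timeout is likewise assumed, since the narration doesn't state the value used.

```python
from http.client import HTTPConnection


def head_headers(host, port=80, timeout=10):
    """Send a HEAD request for "/" and return (status, headers).

    Hypothetical helper: it bundles the steps from the narration into one call.
    """
    # Create the connection object; timeout is the number of seconds to wait.
    connection = HTTPConnection(host, port=port, timeout=timeout)
    try:
        # Ask only for the headers of the site's root path.
        connection.request("HEAD", "/")
        # Retrieve the server's actual response.
        response = connection.getresponse()
        # getheaders() returns a list of (name, value) tuples.
        return response.status, response.getheaders()
    finally:
        connection.close()
```

With network access, calling head_headers("pypi.org") in an interactive session should return a status code and a list of header tuples. The exact values depend on the server, so none are shown here.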

03:27 The website connectivity checker just needs to create a connection and make a HEAD request. If the request is successful, then the target website is online.

03:36 Otherwise, the site is offline. In the latter case, it would be appropriate to display an error message to the user. Next, open checker.py in your editor and add the code seen on-screen.

03:51 This line imports HTTPConnection from http.client. This is the class you’ll use to establish a connection with the target website, as seen previously. Next, urlparse() is imported.

04:04 This function will help you parse the target URLs. Here, you define site_is_online(), which takes a url and a timeout argument.

04:15 The url argument holds a string representing the website’s URL, and timeout holds the number of seconds to wait before timing out connection attempts.

04:25 This line defines error, a generic Exception instance that serves as a placeholder.

04:30 Here, you define a parser variable containing the result of parsing the target URL using urlparse(). This line uses the or operator to extract the hostname from the target URL.
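
The exact expression isn't spelled out in the narration. One plausible form of the hostname extraction is sketched below; the function name extract_host() and the fallback to the first path segment are assumptions made for illustration.

```python
from urllib.parse import urlparse


def extract_host(url):
    """Return the host part of a URL given with or without a scheme."""
    parser = urlparse(url)
    # With a scheme ("https://python.org/about"), the host lands in .netloc.
    # Without one ("python.org/about"), .netloc is empty and the host is the
    # first segment of .path, so `or` falls back to that.
    return parser.netloc or parser.path.split("/")[0]


print(extract_host("https://python.org/about"))
print(extract_host("python.org"))
```

Both calls print python.org, which is why the or fallback lets users pass a URL with or without the scheme.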

05:24 The finally block closes the connection to free the acquired resources. This happens regardless of whether an exception occurs, ensuring that the connection is closed.

05:36 This last line raises the exception stored in error if the loop finishes without a successful request.

05:44 The site_is_online() function returns True if the target website is available online. Otherwise, it raises an exception pointing out the problem it encountered. This latter behavior is convenient because you need to show an informative error message when the site isn’t online.
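
The on-screen code for checker.py isn't part of this transcript, and the narration for the loop itself is missing. The sketch below pieces the function together from the steps that are described. Several details are assumptions rather than the course's confirmed code: the default timeout of 2 seconds, the idea that the loop tries ports 80 and 443, and the ports parameter, which is added here only so the sketch can be tried against other ports.

```python
from http.client import HTTPConnection
from urllib.parse import urlparse


def site_is_online(url, timeout=2, ports=(80, 443)):
    """Return True if the target URL is online; raise an exception otherwise.

    The default timeout and the `ports` parameter are assumptions of this sketch.
    """
    error = Exception("unknown error")  # placeholder, replaced on failure
    parser = urlparse(url)
    host = parser.netloc or parser.path.split("/")[0]
    for port in ports:
        connection = HTTPConnection(host=host, port=port, timeout=timeout)
        try:
            connection.request("HEAD", "/")
            return True
        except Exception as e:
            error = e  # remember the most recent failure
        finally:
            connection.close()  # runs on success and on failure
    raise error
```

Note that this version only confirms that a connection could be opened and the HEAD request sent. It doesn't read or inspect the response.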

06:03 To try out site_is_online(), run the following code in a Python interactive session started in the project directory. First, site_is_online() is imported from the checker module.

06:17 Then you call the function with "python.org" as an argument. Because the function returns True, you know that the target site is online.

06:26 Here, you call site_is_online() with a non-existing website as a target URL. In this case, the function raises an exception that you can catch later and process to display an error message to the user.
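
As a rough illustration of catching that exception and turning it into a message, consider the sketch below. It is not the course's code: describe_status() and the two stand-in checkers are hypothetical, and the stand-ins exist only so the example runs without network access. In practice you would pass site_is_online() as the check argument.

```python
def describe_status(url, check):
    """Return a one-line status message for url.

    `check` is any callable that returns True when the site is online and
    raises an exception otherwise, such as site_is_online().
    """
    try:
        check(url)
    except Exception as error:
        return f'"{url}" is offline: {error}'
    return f'"{url}" is online'


# Hypothetical stand-ins so the sketch runs without network access.
def always_online(url):
    return True


def always_offline(url):
    raise OSError("Name or service not known")


print(describe_status("python.org", always_online))
print(describe_status("non-existing-site.org", always_offline))
```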

06:42 You’ve implemented the application’s main functionality of checking a website’s connectivity. Now you can continue with the project by setting up its command-line interface, and that’s what you’ll do in the next part of the course.
