Exploring HTTPS and Cryptography in Python (Overview)

Christopher Trudeau


Have you ever wondered why it’s okay for you to send your credit card information over the Internet? You may have noticed the https:// on URLs in your browser, but what is it, and how does it keep your information safe? Or perhaps you want to create a Python HTTPS application, but you’re not exactly sure what that means.

In this course, you’ll get a working knowledge of the various factors that combine to keep communications over the Internet safe. You’ll see concrete examples of how a Python HTTPS application keeps information secure.

In this course, you’ll learn how to:

Monitor and analyze network traffic
Apply cryptography to keep data safe
Describe the core concepts of Public Key Infrastructure (PKI)
Create your own Certificate Authority
Build a Python HTTPS application
Identify common Python HTTPS warnings and errors

00:00 Welcome to Exploring HTTPS and Cryptography in Python. My name is Chris and I will be your guide. The ultimate goal of this course is to produce some code that allows you to issue certificates so that you can host your own internal HTTPS content. In order to do that, there are a lot of things you need to learn along the way. First off, what is HTTPS and how does it work? Secondly, in order to test things over HTTPS, you’re going to need a web application, so I’ll show you how to use a little bit of Flask to do that. Third, moving into cryptography and how certificates work. Fourth, using Fernet ciphers as a symmetric cryptography mechanism to secure your content. Fifth, moving on to asymmetric cryptography. Sixth, how to actually use Python to write a Certificate Authority. Seventh, how to combine all of this into an HTTPS Flask application.

00:54 And finally, looking at other ways of approaching the same problem.

00:58 The code samples inside of this course were tested using Python 3.8, Flask 1.1, requests 2.23, and cryptography 2.9. Other versions should work equally well, but if you’re running into weird inconsistencies, always remember to check your versions.

01:16 As this course uses outside libraries, such as Flask and Python cryptography, it’s expected that you have a basic understanding of pip and virtual environments and are able to install these packages. Great! Let’s get started.

01:30 So, what’s HTTP? HTTP is the protocol that’s used between your web browser and a server to get content. First off, you type a URL into your browser.

01:41 Your computer opens up a TCP/IP socket to the host. That host comes from the first part of the URL. If you type in example.com, it looks up the host example.com. HTTP traffic, by default, is on port 80, so your web browser automatically connects to port 80 on the server.
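The step from URL to connection target can be sketched in Python. This is a minimal illustration using the standard library's urllib.parse; the helper name connection_target and the DEFAULT_PORTS table are made up for this example, and a real browser does considerably more than this.

```python
from urllib.parse import urlsplit

# Well-known default ports for the two schemes discussed in this lesson.
DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url):
    """Return (scheme, host, port, path) for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    # Use an explicit port if the URL has one, otherwise the scheme's default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # An empty path means the root document.
    return parts.scheme, parts.hostname, port, parts.path or "/"


print(connection_target("http://example.com/kittens.html"))
print(connection_target("https://example.com/kittens.html"))
```

The host and port are what the socket connects to, while the path is what gets sent inside the request afterwards.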

02:00 Once a connection is established, the computer uses this connection to send an HTTP request. This request tells the server what content it’s looking for.

02:11 Your browser uses the second part of the URL—in this case /kittens.html—to tell the server that it’s looking for the /kittens content.

02:20 It also tells the server that it was asking for example.com. It does this so that if the web server is hosting multiple domains, it knows which one you connected to. Finally, the Accept header tells the server what kind of content can come back.

02:34 The simplest form of this is “Give me any content you have.”
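To make the request concrete, here is a rough sketch of the text a client could send for the kittens example. The function names build_get_request and fetch are invented for this illustration, and the Connection: close header is an extra added here so that the server closes the socket when it is done. fetch is defined but not called, since it needs network access.

```python
import socket


def build_get_request(host, path, accept="*/*"):
    """Return the bytes of a minimal HTTP/1.1 GET request (illustrative)."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path, and protocol version
        f"Host: {host}",         # which domain the client asked for
        f"Accept: {accept}",     # "*/*" means any content type is fine
        "Connection: close",     # ask the server to close after replying
        "",                      # a blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path, port=80):
    """Send the request over a plain TCP socket and return the raw reply."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


request = build_get_request("example.com", "/kittens.html")
print(request.decode("ascii"))
```

Everything in the request is plain text, which is why anyone watching an unencrypted connection can read it.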

02:39 An HTTP request is broken down into a series of parts. The first part is the method. The example I just gave you was GET. Another common one is POST. GET is used for retrieving content.

02:51 POST is for sending content to the server, such as submitting a form. There are other methods as well, but we’ll skip over those for now. Secondly, the path tells the server what content to fetch. In the example before, this was /kittens.html.

03:08 The request should also include the HTTP version. There are multiple versions of HTTP. This is the browser telling the server what version of the protocol it wishes to speak. Most commonly, this is still version 1.1. Version 1.0 is still around, 2.0 is becoming increasingly common, and 3.0 is actively supported by most of the popular browsers. In addition to these pieces, the HTTP request can also include additional headers.

03:36 These headers give more information to the server. Some common examples are Accept, which you saw in the previous example, where the browser tells the server what kind of content it can take.

03:48 Cookies are a way for the browser and the server to keep common content through multiple connections. The Cookie header is the browser telling the server that the last time they spoke, the server asked the browser to send this information back up on the next visit.

04:04 This is how cookies and state work. As I mentioned in the example, the Host is the domain name of the server for multi-domain hosting. The User-Agent header identifies the browser that is visiting the server.

04:17 This is just text, and some browsers even allow you to fake this out. And finally, if you’re doing a POST, you also have to include the body in the HTTP request.

04:29 This is the content being sent to the server. If you’re submitting a form, this would be the key-value pairs of the fields in the form.
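As a sketch of what a form submission might look like on the wire, the example below encodes some fields as key-value pairs and attaches them as the body of a POST. The /adopt path, the field names, and the build_post_request helper are all hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def build_post_request(host, path, fields):
    """Return the bytes of a minimal form-style POST request (illustrative)."""
    # Form fields become key=value pairs joined with "&".
    body = urlencode(fields).encode("ascii")
    head = "\r\n".join([
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",  # size of the body in bytes
        "",                              # a blank line ends the headers
        "",
    ]).encode("ascii")
    return head + body


request = build_post_request("example.com", "/adopt", {"name": "Whiskers", "age": "2"})
print(request.decode("ascii"))
```

The body comes after the blank line that ends the headers, and Content-Length tells the server how many bytes of body to expect.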

04:37 Once the HTTP request has gone to the server, it’s now time for the server to reply back.

04:44 This is an HTTP response. The two main parts of this are a status code—in this case, 200, telling the browser the content was served fine—and then some information about what the content type is. For kittens.html, it’s going to be some HTML. Attached to this, then, is the body—the body being the HTML that’s being served. Now, the kitten’s adorable, but I’ve skipped over a step. Images actually don’t come down in the first hit.

05:16 So if you hit kittens.html, you would get back an HTML document. That HTML document would have references in it, one of which might be an image of a kitten.

05:26 The browser would then send another HTTP request for the image of the kitten, and the server would then respond with that image. This is a multistep process.
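The multistep process can be sketched with the standard library's HTML parser. The HTML snippet and the image path below are invented for this example; the point is only that the first response contains references, and each reference becomes a URL for a follow-up request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageFinder(HTMLParser):
    """Collect the src attribute of every <img> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


# Hypothetical page content returned by the first request.
page_url = "http://example.com/kittens.html"
html = '<html><body><h1>Kittens</h1><img src="/images/kitten.jpg"></body></html>'

finder = ImageFinder()
finder.feed(html)

# Each reference is resolved against the page URL to get the next request.
follow_up_urls = [urljoin(page_url, src) for src in finder.sources]
print(follow_up_urls)  # ['http://example.com/images/kitten.jpg']
```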

05:37 The HTTP response itself always contains the version number. This is to confirm the version of the protocol that the server is speaking. Normally this is the same version as was included by the browser in the HTTP request.

05:52 The status code is a numeric code indicating the success or failure of the call. Some common ones are 200, indicating it was successful, 404, saying the path you asked for is not found, or 500, indicating something went wrong on the server-side. Just like the request, the response can include additional headers giving metadata about the response.

06:14 Some common examples are Content-Type, indicating what kind of content the body has, Content-Encoding, indicating how it is encoded (for example, compressed), Content-Length, the amount of information in the body, in bytes, Server, identifying the web server that is serving the content, and Set-Cookie.

06:33 This corresponds to the Cookie header in the request. This is the server requesting that the browser—the next time it connects to the server—sends up information that’s encoded inside of this header.

06:45 This allows the server to keep state between connections. Finally, the response is no good without the body. The body is the actual content to be displayed.

06:55 This is usually HTML, images, JavaScript, or CSS.
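Here is a rough sketch of pulling a response apart into the pieces just described, and of turning a Set-Cookie header into the Cookie header for the next request. The raw response, the ExampleServer name, and the session value are made up, and parse_response is a simplified helper written for this example: it assumes an HTTP/1.1 style status line and keeps only the last value if a header repeats.

```python
from http.cookies import SimpleCookie


def parse_response(raw):
    """Split a raw HTTP/1.1 response into its parts (simplified sketch)."""
    # A blank line separates the status line and headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers, body


# Hypothetical response for the kittens example.
raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 22\r\n"
    b"Server: ExampleServer\r\n"
    b"Set-Cookie: session=abc123; Path=/\r\n"
    b"\r\n"
    b"<h1>Hello kittens</h1>"
)

version, status, reason, headers, body = parse_response(raw_response)
print(version, status, reason)
print(headers["content-type"], len(body))

# The browser remembers what Set-Cookie asked for and sends it back
# in the Cookie header on its next request to the same server.
jar = SimpleCookie()
jar.load(headers["set-cookie"])
cookie_header = "; ".join(f"{name}={morsel.value}" for name, morsel in jar.items())
print("Cookie:", cookie_header)
```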

07:00 So, that’s HTTP. What’s HTTPS? Well, the S is for secure. The web is used for all sorts of transactions nowadays that should be considered private: e-commerce, banking, authentication.

07:13 Instead of creating a new protocol, HTTPS is just HTTP over an encrypted channel. TLS is a way of encrypting that channel. Older versions also used SSL, but for the sake of the course, I’m going to concentrate on TLS. Because TLS is used to create a generic encrypted channel, this means it can be used for all sorts of protocols, including email, IM, and VoIP.

07:39 This separates the concerns between the encryption and the protocol being used over the encryption. Let’s look at an HTTPS connection. Like before, you type in the URL, this time with the s on the end of https.

07:55 A socket is connected to the server. The port, notice, is different. The default port for HTTP is 80. The default port for HTTPS is 443, so when you type https into your browser, the browser uses port 443 to connect to the server. The default nowadays, if you don’t type in the scheme, is to use HTTPS in order to try and make sure that your connection is secure. Once the socket is established, the TLS protocol is negotiated over that socket.

08:28 This is the encryption channel. Within that encryption channel, HTTP is then used to communicate to the server. This is the same as the previous example, just inside of the TLS channel. Throughout this course, I’m going to be using a command-line browser called curl. curl comes for free in Mac, Linux, and Windows distributions now.

08:50 If you’re using a slightly older version of Windows 10—before the 1803 build—you may have to go off to the web to download it, but otherwise, it’s there. In its simplest form, you type curl and the URL that you’re going to.

09:05 What comes back is the content. This is my own personal website with the world’s smallest HTML document, essentially just saying Hello world!

09:15 By using the -I parameter, you can look at the headers coming back from the server. In this case, HTTP […] 200, content-type: text/html, and the fact that I’m using Cloudflare to front the server.

09:28 This is actually how I get my HTTPS certificate—Cloudflare’s doing it for me. The -I parameter only shows the headers that come back. You can use --include, instead, to see both the headers and the content.
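For comparison, a headers-only fetch like curl -I can be approximated in Python by sending a HEAD request. This is a sketch using the standard library's urllib.request rather than anything shown in the course; build_head_request and show_headers are names made up for this example, and show_headers is not called here because it needs network access.

```python
import urllib.request


def build_head_request(url):
    """Create a request that asks only for headers, like curl -I."""
    return urllib.request.Request(url, method="HEAD")


def show_headers(url):
    """Send the HEAD request and print the status and response headers."""
    with urllib.request.urlopen(build_head_request(url), timeout=10) as response:
        print(response.status)
        for name, value in response.getheaders():
            print(f"{name}: {value}")


head_request = build_head_request("https://example.com/")
print(head_request.get_method(), head_request.full_url)
```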

09:42 I’m going to combine this with -v, for verbose, to see all of the details that happen with a TLS connection.

09:51 Here it comes…

09:57 and here’s a whole bunch of debug information. It connects to the server. It tells us that it wishes to speak TLS. The handshake goes back and forth with some certificate exchange.

10:08 Next up, the secure connection is confirmed and a certificate comes back from the server, telling the browser who it is. The browser confirms the certificate through one of its known Certificate Authorities and then—because it passes—establishes the connection.

10:25 In this case, it’s using HTTP2 over the TLS connection.

10:31 The next part is the actual GET, from curl to the server, using HTTP2, asking for the /helloworld.html document.

10:41 The connection state changes, and finally, the response comes back from the server using status code 200 and the actual body connected to it.

10:52 And that’s everything involved in asking for helloworld.html over HTTPS.
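The same sequence (open a socket on port 443, set up the TLS channel, check the certificate, then speak HTTP inside the channel) can be sketched with Python's ssl module. This is a minimal illustration and not the code from the course: fetch_over_tls is a name invented here, it speaks HTTP/1.1 rather than HTTP2, and it is defined but not called because it needs network access.

```python
import socket
import ssl

# The default context verifies the server's certificate against the
# system's trusted Certificate Authorities and checks the hostname.
context = ssl.create_default_context()


def fetch_over_tls(host, path="/", port=443):
    """Fetch a document with HTTP/1.1 inside a TLS channel (sketch)."""
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept: */*\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")
    # Step 1: a plain TCP socket to port 443.
    with socket.create_connection((host, port), timeout=10) as sock:
        # Step 2: the TLS handshake and certificate check happen here.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            print(tls.version())  # the TLS version that was negotiated
            # Step 3: ordinary HTTP, now travelling inside the channel.
            tls.sendall(request)
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:  # the server closed the connection
                    break
                chunks.append(data)
    return b"".join(chunks)


print(context.verify_mode, context.check_hostname)
```

If the certificate cannot be verified, wrap_socket raises an error instead of returning a connection, which is the kind of failure later lessons on HTTPS warnings and errors deal with.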

10:58 Next up, I’m going to show you how to use Flask to build a simple web server application.
