History of PyPDF2

How to Work With a PDF in Python Andrew Stephen 03:44

You’ll dive into the history of PyPDF2 and consider another PDF module for Python.

00:00 Welcome back to working with PDFs in Python. Let’s explore PyPDF2 and its history. Back in 2005, pyPDF was initially released.

00:11 The last official release for pyPDF was in 2010. Then, after approximately one year had passed, a company called Phaseit sponsored a fork of pyPDF called, as you can guess, PyPDF2. The code for PyPDF2 was written to be backwards compatible with pyPDF, and it worked quite well.

00:33 Its final release was in 2016. PyPDF3 was then created, but after only a short time and a few releases, it was renamed to PyPDF4. All of these packages do much the same thing. The biggest difference between pyPDF and the later packages, from PyPDF2 onward, is that the later ones include support for Python 3.

00:56 There’s a different Python 3 fork of the original pyPDF called pyPDF for Python 3, but this has not been maintained for a number of years now.

01:38 pdfrw was created by Patrick Maupin and it is capable of many of the manipulations that PyPDF2 can achieve, including most of the examples that this course covers. The notable exception to this, though, is PDF encryption.

02:21 Just like that. Just a side note:

02:29 if you do happen to be using Anaconda rather than regular Python, you can use the conda install command instead of the pip install command. However, if you are like me (and as you can see, I like using the Thonny IDE), there is a better way to do that.

02:50 You’ll want to go to Tools > Manage packages…. In here, you can search for PyPDF2, click Find package, and then click Install.

02:57 It will go through its setup phase, and there you go.
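
If you would rather confirm the install from code than from the package manager, here is a minimal sketch. It assumes Python 3.8 or later, where the standard library's importlib.metadata module is available, and the helper name installed_version is made up for this illustration. It asks the current interpreter for the installed version of a package and returns None if the package is missing.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; it only inspects the environment
    of the interpreter that runs it.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("PyPDF2")
    if version is None:
        print("PyPDF2 is not installed in this interpreter.")
    else:
        print(f"PyPDF2 {version} is installed.")
```

Run it with the same interpreter you installed into (in Thonny, that is the one the editor is configured to use), since each Python environment keeps its own set of packages.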

03:08 We now have PyPDF2 installed. So, in case you missed that, that’s Tools > Manage packages…,

03:15 and as you can see now that I’ve gone back into it, PyPDF2 is listed and the button says Uninstall, which confirms that the package is installed. Using the package manager is also covered in the Real Python tutorial for Thonny.

03:28 A link to this tutorial is available below the video. Now that you have managed to install the PyPDF2 package, it is time to extract some information from a PDF. In order to do so, however, you’ll have to join me in the next part of this course. See you there.
