Merging and Splitting PDFs
00:19 Python can help you do this. For this example, open up a PDF and print a single page out as a separate PDF. Then do it again but for a different page. This will give you a couple of files to work with.
Now, you can use the merge_pdfs() function when you have a list of PDFs to merge together. You will also need to know where to save the result, so along with the list of input paths, it also takes an output path, just there.
You then loop over the inputs and create a pdf_reader object per input, just there. Next, you iterate over each of the pages in the PDF file and use .addPage(), just there, to add each page to the PDF writer object.
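The merge steps described above can be sketched roughly as follows. This is a minimal sketch, not necessarily the exact code shown on screen: it assumes a PyPDF2 release older than 3.0, where the camelCase names PdfFileReader, PdfFileWriter, .getNumPages(), .getPage(), and .addPage() are available (newer releases renamed them, for example .addPage() became .add_page()). The parameter names paths and output are assumptions made for illustration.

```python
def merge_pdfs(paths, output):
    """Append the pages of every PDF in `paths`, in order, and save to `output`."""
    # Imported inside the function so the file still loads without PyPDF2.
    # Assumption: PyPDF2 < 3.0, which still provides these camelCase names.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    # One writer object collects the pages from all of the inputs
    pdf_writer = PdfFileWriter()

    for path in paths:
        # One reader object per input file
        pdf_reader = PdfFileReader(path)
        for page in range(pdf_reader.getNumPages()):
            # Add each page of this input to the shared writer
            pdf_writer.addPage(pdf_reader.getPage(page))

    # Write the collected pages out to the requested location
    with open(output, "wb") as out:
        pdf_writer.write(out)
```

Because the pages are added in the order of the input list, passing the file that holds page 2 before the file that holds page 1 would put page 2 first in the merged result.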
Then I open merged.pdf, and you can see page 2 of Jupyter_Notebook.pdf, followed by page 1. I did that intentionally so that you could see it had worked. The file is called merged up here, which you can see is what I’ve named it just there. Now to take a look at the opposite of merging: splitting.
This is particularly useful for documents that have a lot of scanned-in content, but there are a lot of reasons for wanting to split a PDF. The example we’re going to look at is how you could use the PyPDF2 module to split a PDF into multiple files. You start by once again creating a reader object and looping over the PDF pages, just there. For each page in the PDF, you create a new pdf_writer instance and add a single page to it, right there.
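The splitting steps can be sketched in the same way. Again, this is only a sketch under assumptions: it relies on the pre-3.0 PyPDF2 names, and the function name split, the parameter name_of_split, the 1-based numbering of the output files, and the returned list of paths are all illustrative choices rather than details confirmed by the video.

```python
def split(path, name_of_split):
    """Write each page of the PDF at `path` to its own single-page file."""
    # Imported inside the function so the file still loads without PyPDF2.
    # Assumption: PyPDF2 < 3.0, which still provides these camelCase names.
    from PyPDF2 import PdfFileReader, PdfFileWriter

    pdf = PdfFileReader(path)
    outputs = []

    for page in range(pdf.getNumPages()):
        # A fresh writer per page, so every output holds exactly one page
        pdf_writer = PdfFileWriter()
        pdf_writer.addPage(pdf.getPage(page))

        # Hypothetical naming scheme: prefix plus 1-based page number
        output = f"{name_of_split}{page + 1}.pdf"
        with open(output, "wb") as output_pdf:
            pdf_writer.write(output_pdf)
        outputs.append(output)

    return outputs
```

Giving each output a name derived from the page number keeps the files unique, which is what lets you tell which file corresponds to which page.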
03:53 page 3, et cetera. And as you can see in the tabs here, each file has its own unique name, as mentioned, so you can discern which document is which page. Now, hopefully you’ll join me in the next part so that you can find out how to add a watermark to a PDF as well as how to encrypt a PDF.