Episode 84: Creating and Manipulating PDFs in Python With borb
The Real Python Podcast
Oct 29, 2021 1h 1m
Have you wanted to generate PDFs from your Python project? Many of the current libraries require designing the document down at the pixel level. Would you be interested in a tool that lets you specify the page layout while it handles the specific details of laying out the text? This week on the show, we talk with Joris Schellekens about his library for creating and manipulating PDFs named borb.
Episode Sponsor: Cloudsmith
borb is a pure Python library that can read, write, and manipulate PDFs. You can use it to build fillable forms, invoices with attached data files, and multi-column document layouts. We discuss the extensive example repository Joris has created for the library.
Joris shares his background in working with PDFs. He talks about starting the project and the challenges he had to overcome. We also talk about licensing and maintaining an open-source library.
Course Spotlight: Writing Idiomatic Python
What are the programming idioms unique to Python? This course is a short overview for people coming from other languages and an introduction for beginners to the idiomatic practices within Python. You’ll cover truth values, looping, DRY principles, and the Zen of Python.
Topics:
- 00:00:00 – Introduction
- 00:01:58 – Articles about borb
- 00:03:25 – History of the project
- 00:07:26 – Background in PDFs and PostScript
- 00:09:18 – Signatures and other challenges of working in PDFs
- 00:11:33 – Reading from PDFs and versions of the standard
- 00:14:54 – Features of the library and creating documents
- 00:18:25 – Creating layout features
- 00:20:42 – How are fonts handled in borb?
- 00:21:19 – Sponsor: Cloudsmith
- 00:22:04 – Why use JSON across the library?
- 00:22:55 – Embedding data and files within a PDF
- 00:25:12 – What features were crucial for you to include in borb?
- 00:28:48 – Why create a separate examples repository?
- 00:31:04 – Article series about borb
- 00:32:25 – Writing a book about borb
- 00:33:44 – Python 3.10 and borb
- 00:34:19 – Video Course Spotlight
- 00:35:39 – Licensing borb and AGPL
- 00:45:14 – Other open-source projects and Stack Overflow answers
- 00:46:37 – Working with forms in borb
- 00:47:55 – Additional tools for working with PDFs
- 00:50:15 – Different users of the library
- 00:53:36 – Thoughts on the future of PDFs
- 00:58:10 – What are you excited about in the world of Python?
- 00:58:40 – What do you want to learn next?
- 01:00:25 – Social connection info
- 01:00:46 – Thanks and goodbye
Show Links:
- borb: A Python PDF library
- borb Examples Repository
- Creating a PDF Document in Python with borb
- Creating PDF Invoices in Python with borb
- Creating a Form in a PDF Document in Python With borb
- iText PDF
- ISO 32000 (PDF): the family of ISO standards that defines the core PDF specification
- XRechnung update: What you should know about electronic invoices to the German public sector!
- XRechnung: Standard format for German authorities from 2020
- AGPL: Affero General Public License - Wikipedia
- Ghostscript: Interpreter for the PostScript language and for PDF
- veraPDF: Industry Supported PDF/A Validation
- Okular: The Universal Document Viewer
- Keras and TensorFlow: Getting Started