
Understanding Tom's Obvious Minimal Language

00:00 TOML: Tom’s Obvious Minimal Language. TOML is a fairly new format. The first format specification, version 0.1.0, was released in 2013. From the beginning, it focused on being a minimal configuration file format that’s human-readable.

00:18 You can see the TOML project’s goals on screen, taken from the TOML webpage. As you work through this course, you’ll see how well TOML hits these targets. It’s clear, though, that TOML has become quite popular over its short lifespan.

00:34 More and more Python tools, including Black, pytest, mypy, and isort, use TOML for their configuration, and TOML parsers are available for most popular programming languages.

00:46 On screen, you can see one way to express the configuration you saw previously in TOML. You’ll learn more about the details of the TOML format later on in the course, but for now, just try to read and parse the information yourself.

00:58 Note that it’s not much different from the first example you saw. The biggest change is the addition of quotation marks in some of the values.

01:07 TOML syntax is inspired by traditional configuration files. Its one major advantage over Windows ini files and Unix configuration files is that TOML has a specification that spells out precisely what’s allowed in a document and how different values should be interpreted.

01:22 The specification is stable and mature after reaching version 1.0 in early 2021. In contrast, the ini format doesn’t have a formal specification.

01:33 Instead, there are many variants and dialects, most of them defined by an implementation. Python’s standard library comes bundled with support for reading .ini files through the configparser module.

01:43 While its ConfigParser class is quite lenient, it doesn’t support all kinds of .ini files.
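
As a rough illustration of reading an .ini file with the standard library, the sketch below parses a small document from a string. The section and option names are made up for this example. It also shows that ConfigParser hands every value back as a string, so any conversion is up to the calling code.

```python
from configparser import ConfigParser

# Hypothetical .ini content; section and option names are assumptions.
INI_TEXT = """
[constant]
board_size = 3

[user]
color = blue
"""

parser = ConfigParser()
parser.read_string(INI_TEXT)

# Values come back as plain strings, even when they look like numbers.
board_size = parser["constant"]["board_size"]
print(type(board_size).__name__, repr(board_size))

# The programmer decides how to interpret the value.
converted = parser.getint("constant", "board_size")
print(type(converted).__name__, converted)
```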

01:48 Another difference between TOML and many traditional formats is that TOML values have types. "blue" is interpreted as a string while 3 is a number.

01:58 One potential criticism of TOML is that humans writing TOML need to be aware of types. In simpler formats, that responsibility lies with the programmer parsing the configuration.

02:10 TOML is not meant to be a data serialization format like JSON or YAML. In other words, you shouldn’t try to store general data in TOML to recover it later.

02:19 TOML is restrictive in a few aspects. All keys are interpreted as strings. You can’t easily use a number as a key. TOML has no null type. Some whitespace is significant, which makes it harder to minimize the size of TOML documents.

02:37 Even though TOML is a good hammer, not all data files are nails. You should primarily use TOML for configurations.

02:46 You’ll dive deeper into TOML syntax in the next section of the course, where you’ll learn about some of the syntax requirements of TOML files. But in practice, a given TOML file may also come with some non-syntactical requirements. These are schema requirements.

02:59 For example, your tic-tac-toe application may require that the configuration file contain the server URL. On the other hand, player colors may be optional if the application defines a default color.

03:11 Currently, TOML doesn’t include a schema language that can specify required and optional fields in a TOML document. Several proposals exist, although it’s not clear if any of them will be accepted anytime soon.

03:24 In simple applications, you can validate your TOML configuration manually. For example, you can use structural pattern matching, which was introduced in Python 3.10.

03:35 Assuming that you’ve parsed the configuration into Python and named it config, you can check its structure with the code seen on screen. The first case statement spells out the structure that you expect.

03:46 If config matches, then you use pass to continue your code. Otherwise, you raise an error.

03:54 This approach may not scale well if your TOML document is more complicated. You also need to do more work if you want to provide good error messages. A better alternative is to use Pydantic, which utilizes type annotations to do data validation at runtime.

04:10 One advantage of Pydantic is that it has precise and helpful error messages built in.
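
As a rough sketch of how this could look, the models below describe a required server URL and an optional player color with a default. Pydantic is a third-party package that needs to be installed separately, and the class and field names here are assumptions for illustration, not part of the course’s code.

```python
from pydantic import BaseModel, ValidationError


class Server(BaseModel):
    url: str  # No default, so the field is required


class Player(BaseModel):
    color: str = "blue"  # A default makes the field optional


class Config(BaseModel):
    server: Server
    player: Player = Player()


# A configuration without a player table falls back to the default color.
good = Config(**{"server": {"url": "https://tictactoe.example.com"}})
print(good.player.color)

# A configuration without the required server table is rejected.
try:
    Config(**{"player": {"color": "green"}})
    missing_server_rejected = False
except ValidationError as err:
    missing_server_rejected = True
    print(err)  # The message names the missing field
```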

04:16 If you want to know more about Pydantic, then Real Python has you covered with this video course.

04:22 There are also tools that take advantage of the schema validation that already exists for formats such as JSON. For example, Taplo is a TOML toolkit that can validate TOML documents against JSON schemas.

04:35 Taplo is also available for Visual Studio Code, bundled into the “Even Better TOML” extension.

04:41 You won’t worry about schema validation in the rest of this course. Instead, in the next section of the course, you’ll get more familiar with the TOML syntax and see all the different data types that are available to you.
