Tools and Challenges
00:00 In the previous lesson, I showed you PyYAML’s serialization and deserialization features in greater detail. This lesson is a hodgepodge of practical advice around tools and problems you might encounter.
00:13 This course has mainly shown you how to use PyYAML to deal with YAML docs in Python. The default installation is actually a pure-Python implementation. PyYAML is also available as a wrapper to LibYAML, a C-based library.
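If you want to try the LibYAML-backed version, here is a minimal sketch of the usual fallback pattern. It assumes PyYAML is installed; the C-based loader is only importable when PyYAML was built against LibYAML, so the code falls back to the pure-Python loader otherwise. The sample document is made up for illustration.

```python
import yaml

# Prefer the LibYAML-backed loader when the C extension is available,
# otherwise fall back to the pure-Python implementation.
try:
    from yaml import CSafeLoader as Loader
except ImportError:
    from yaml import SafeLoader as Loader

# Hypothetical sample document, not taken from the course.
document = "name: example\nitems: [1, 2, 3]\n"

data = yaml.load(document, Loader=Loader)
print(data)
```

Both loaders should produce the same Python objects. The difference is speed, not behavior.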
Although PyYAML is still the most popular library out there, it might be argued that it’s a bad idea, seeing as it’s 1.1-based. If you want to move along to something that’s 1.2-based, ruamel.yaml is just such a beast. There is also StrictYAML, which implements a subset of the YAML spec, skipping many of the problems I’m going to be mentioning shortly.
You can find yamllint, a linter for YAML, and the yq command for interacting with YAML from the command line. There’s also a tool with the same name that is a compiled version, which is faster, but of course you can’t pip install that.
01:35 I touched on this briefly before when showing you the trouble with timestamps, but YAML’s casting can be tricky. YAML 1.2 has fixed some of these challenges, but YAML 1.1 is still very much out there.
02:05 I’ve split Ruud’s doc into two parts, the part that parses problematically and the part that, well, doesn’t. The first part is in the top here. Let me load this, and then I’ll talk about the surprises.
And there’s the results. Let’s start with the port_mapping spec. Because YAML doesn’t require quotes for strings, you might think, “Hey, I can just create port maps.” Well, that works for ports 80 and 443, but note what happens with the SSH port.
02:39 This is that timestamp problem I spoke about earlier. YAML 1.1’s base-60 support is turning 22:22 into an integer instead of a string. Let me scroll the document down a little bit to get the next surprise.
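To see the base-60 behavior in isolation, here is a small sketch that assumes PyYAML is installed. The document string is a hypothetical stand-in for the port mapping part of the file, not the exact file from the lesson.

```python
import yaml

# Hypothetical stand-in for the port mapping part of the document.
document = """
port_mapping:
  - 22:22
  - 80:80
  - 443:443
"""

ports = yaml.safe_load(document)["port_mapping"]
print(ports)  # [1342, '80:80', '443:443']

# YAML 1.1 reads 22:22 as a base-60 integer: 22 * 60 + 22.
# 80:80 and 443:443 stay strings because 80 and 443 are not valid
# base-60 digits (each part after the colon must be below 60).
print(ports[0] == 22 * 60 + 22)  # True

# Quoting the value keeps it a string.
quoted_ports = yaml.safe_load('port_mapping: ["22:22"]')["port_mapping"]
print(quoted_ports)  # ['22:22']
```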
on as a key becomes True in the Python dictionary. Scrolling down a little more … and the last one for version numbers might seem obvious, but it’s an easy enough mistake to make. <number>.<number> is a float, so if you’re plugging away with major-minor-patch format for your version numbers, then accidentally drop a .0 patch part, you’re not going to get a string like the other versions. You’re going to get a float.
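Both of those surprises can be reproduced with a short sketch, again assuming PyYAML is installed. The key names and version numbers below are illustrative assumptions rather than the exact contents of the file on screen. The second document shows the usual workaround, which is quoting anything you want kept as a string.

```python
import yaml

# Hypothetical document: an unquoted "on" key and unquoted version numbers.
document = """
flush_cache:
  on: [push, memory_pressure]
allow_postgres_versions:
  - 9.5.25
  - 10.23
"""

data = yaml.safe_load(document)
print(data["flush_cache"])  # {True: ['push', 'memory_pressure']}
print([type(v).__name__ for v in data["allow_postgres_versions"]])  # ['str', 'float']

# Quoting the key and the versions keeps all of them as strings.
quoted_document = """
flush_cache:
  "on": [push, memory_pressure]
allow_postgres_versions:
  - "9.5.25"
  - "10.23"
"""

quoted = yaml.safe_load(quoted_document)
print(quoted["flush_cache"])  # {'on': ['push', 'memory_pressure']}
print(quoted["allow_postgres_versions"])  # ['9.5.25', '10.23']
```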
04:53 So, when should you use YAML? Well, the snarky answer is don’t. That answer’s not quite fair, but to be more specific, don’t seek it out for yourself. If you’re working with data that is in YAML, great, don’t reinvent the wheel. If you’re in the DevOps space, you’re going to come across it, and that’s fine.
05:43 If you’re going down this road, personally, I would just use TOML instead. It’s not quite as popular as YAML as it’s newer, but it supports the same data types and hierarchical structure and doesn’t have all the weirdnesses that come with quoteless strings.
05:58 It is supported by many different programming languages and is part of the standard library as of Python 3.11. I joked earlier about the irony of a Python programmer kvetching about white space being important, but I do see these as two different things.
06:12 I don’t find that I need to copy and paste code very often, and if you do, most IDEs will deal with it for you. Data, on the other hand, gets ferried around all the time, and indentation being significant for what is and isn’t a new line is asking for trouble.
07:01 Somebody remind me to find out if the comments section on this can be turned off for this lesson, huh? Okay, well, I’ll try not to trip as I get down off my soap box. Next up, I’ll summarize the course and point you at other content you might find interesting.