Understanding That Dates and Times Are Messy
Dates and times aren’t simple things, especially now that most computing is done remotely. There’s no guarantee that the computer and the user are in the same place. This is further complicated by the fact that the rules governing daylight savings and time zones aren’t static. In this lesson, you’ll explore all the weird edge cases and learn what programmers normally do about them.
00:00 In the previous lesson, I gave an overview of the course and did a bad imitation of a Canadian accent. Insert your own joke about not being able to hear the difference here. In this lesson, I’ll talk about all the complexities of dates and times while sticking to my regular speaking voice. In the overview, I gave a quick peek at how messy dates and times can be. Some things to consider when coding with dates and times are time zones, which by the way can change.
00:28 I don’t mean it changes for you because you’re traveling. I mean where they are and what the difference is from GMT. Time zones don’t just naturally divide the world into twenty-four segments. For practical reasons, it doesn’t make sense for a boundary to split a city in two just because it is on a particular meridian. Time zones thus tend to follow political boundaries: countries and states and provinces within countries. Look at a time zone map, and you’ll find few of the boundaries are actually straight lines.
00:58 Another challenge is daylight savings time. In and of itself, this is messy. The time difference for a particular location can change based on the day, and if that’s not enough, these things can change. You know what the candy industry, golf courses, and barbecue companies all have in common?
01:16 They have lobbyists who in 2005 got a bill signed in the United States to change when daylight savings would take effect. Before 2007, Canada and the US switched back from daylight savings just before Halloween. As of 2007, that now happens afterwards.
01:33 If you’re writing software that needs to take into account dates and times in the past, complications can arise just a little over a decade ago. It’s not like you have to be dealing with ancient times to have a problem. Cue the jokes about those of us born before the millennium being from ancient times.
01:50 If those two things aren’t enough, how about the fact that every four years, the length of a year is different? Or the fact that the speed of the rotation of the Earth is changing, and to keep noon on a clock in sync with when the sun is at its apex, we introduced leap seconds?
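As a rough illustration of the leap year point, here is a minimal sketch of the Gregorian rule. The helper name `is_leap` is made up for this example; `calendar.isleap` is the standard library function it is checked against. Note that "every four years" has exceptions at century boundaries.

```python
import calendar


def is_leap(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


for year in (1900, 2000, 2023, 2024):
    # The hand-written rule should agree with the standard library.
    assert is_leap(year) == calendar.isleap(year)
    days = 366 if is_leap(year) else 365
    print(year, days)
```

Leap seconds are a separate matter: Python's `datetime` does not represent a 61st second, so they are left out of this sketch.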
02:05 What makes all this worse is that a whole bunch of things about dates and times aren’t just regional, but also cultural. I mentioned the American short form for dates versus the typical European one versus the international standard and the only sane one: most significant digits first for the win. And all of these things have something in common: they’re part of Western culture. The Hebrew, Islamic, and many Asian date systems are lunar based.
02:34 And although the Hebrew and Islamic calendars are both lunar, the Hebrew calendar has leap months, whereas the Islamic one does not. Even the concept of a year is arbitrary. Older history books use the Christian-based BC/AD to indicate when year one occurred. Newer books use BCE/CE to indicate common era and before common era to remove the religious overtone.
02:58 That’s just a renaming, though. Hebrew year 1 is 3761 BCE, whereas Islamic year 1 is 622 CE. And of course, neither of them flips over on the Gregorian calendar’s January 1. Okay, so I’ve established that it’s messy.
03:20 Let’s talk about what computers do about it. Most computer operating systems track the amount of time since January 1, 1970. This is known as the Unix epoch. It, as you might guess from the name, has to do with Unix operating systems being written around this time.
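To make the epoch concrete, here is a small sketch using only the standard library. The variable names are chosen for this example. It shows that timestamp zero corresponds to midnight on January 1, 1970 in UTC, and that the current time is just a count of seconds from that point.

```python
import time
from datetime import datetime, timezone

# Timestamp 0 is the Unix epoch itself.
EPOCH = datetime.fromtimestamp(0, tz=timezone.utc)
print(EPOCH.isoformat())  # 1970-01-01T00:00:00+00:00

# "Now" is the number of seconds elapsed since that moment.
now_seconds = time.time()
now_utc = datetime.fromtimestamp(now_seconds, tz=timezone.utc)
print(now_seconds, now_utc.isoformat())
```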
03:37 The epoch itself wasn’t consistent when it first came out, but eventually got standardized. The POSIX.1 standard decided to ignore the complexity of leap seconds, which at the time seemed the simpler thing to do for library writers.
03:51 When this standard came out, most Unix computers used 32 bits to store time, which gave them about 136 years, half before and half after the epoch. At the time, computers were used mostly for localized record keeping, and this seemed like more than enough. As 2038 barrels down on us, 32 bits is starting to seem problematic.
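The 32-bit arithmetic can be checked with a short sketch. This assumes a signed 32-bit count of seconds, which is what gives the half-before, half-after split. The constant names are made up for this example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_INT32 = 2**31 - 1   # largest signed 32-bit value
MIN_INT32 = -(2**31)    # smallest signed 32-bit value

# The last and first moments a signed 32-bit second counter can represent.
latest = EPOCH + timedelta(seconds=MAX_INT32)
earliest = EPOCH + timedelta(seconds=MIN_INT32)
print(latest.isoformat())    # 2038-01-19T03:14:07+00:00
print(earliest.isoformat())  # 1901-12-13T20:45:52+00:00

# Total range covered, in years of 365.25 days.
span_years = 2**32 / (365.25 * 24 * 60 * 60)
print(round(span_years, 1))  # 136.1
```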
04:13 A lot of older coders I knew made out like bandits during the Y2K fiasco. 2038 seems like a good retirement plan, assuming my whiskey habit allows me to live that long. Well, that got dark fast.
04:28 So you’ve got some dates and times that you want to store. Now what? First off, don’t assume you know where your user is. The problem became particularly obvious when the World Wide Web came about. The location of the computer and the location of the user now typically have nothing to do with each other.
04:44 The most common tactic is to use UTC time, which is the zero-offset time zone. Due to historical things, including maps, wars, and colonization, time zones were relative to the Greenwich meridian, a line that passed through Greenwich, England.
04:59 When this became a proper international standard, there was a need to distinguish the time zone from the zero offset, as Greenwich might want to do something crazy, like have daylight savings time.
05:10 What does UTC stand for, you might ask? Coordinated universal time. Funny story: English speakers wanted to call it CUT, coordinated universal time. French speakers wanted to call it TUC, temps universel coordonné.
05:25 And the compromise was to jumble the letters to UTC. What’s the old joke about horses, camels, and committees?
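The UTC tactic described above can be sketched like this: store the moment in UTC, then convert it for whoever is looking at it. The viewer's offset here is a hypothetical fixed value picked for illustration; a real application would look up the user's actual time zone.

```python
from datetime import datetime, timedelta, timezone

# Capture "now" as an aware datetime in UTC, the form you would store.
stored = datetime.now(timezone.utc)

# Hypothetical viewer offset: five hours behind UTC, for illustration only.
viewer_zone = timezone(timedelta(hours=-5), name="UTC-05:00")
shown = stored.astimezone(viewer_zone)

print(stored.isoformat())
print(shown.isoformat())

# Same instant, different wall-clock reading.
assert stored == shown
```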
05:33 Storing time as UTC works in certain simple cases, like debug log entries for example, but can get complicated with changes to time zone rules. Consider the Halloween change I mentioned before. That change happened due to a bill passed in 2005.
05:50 Let’s say it’s 2004, and you want to make an entry in your calendar for a big Halloween bash at 7:00 p.m. five years later. That date gets stored as UTC.
06:01 Unless you also store the date and time of when the appointment was created, you have no way of knowing how to adjust the appointment. Of course, the party will likely still take place at 7:00 p.m., but because of the rule change to daylight savings time, your UTC stored data will now be off by an hour. Confused?
06:21 Let me try it one more time. The appointment was created before a rule change, but for a time that happens after the rule change, and as you have no way of knowing about a change that might happen in the future, you can end up with a problem.
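Here is a sketch of that Halloween problem. It assumes the user is in US Eastern time and uses two fixed offsets as stand-ins for the old and new daylight savings rules, rather than a real time zone database. The names `EST`, `EDT`, and `party_local` are made up for this example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US Eastern time. Fixed offsets stand in for a time zone
# database so the two rule sets can be compared directly.
EST = timezone(timedelta(hours=-5), "EST")  # standard time
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight time

# Under the rules known in 2004, daylight savings would already be over
# by October 31, 2009, so 7:00 p.m. is converted using standard time.
party_local = datetime(2009, 10, 31, 19, 0, tzinfo=EST)
stored_utc = party_local.astimezone(timezone.utc)

# Under the rules in effect from 2007, daylight savings in 2009 ran
# until November 1, so the stored value is displayed using daylight time.
shown = stored_utc.astimezone(EDT)

print(stored_utc.isoformat())  # 2009-11-01T00:00:00+00:00
print(shown.isoformat())       # 2009-10-31T20:00:00-04:00
```

The stored UTC value never changed, yet the party now shows up at 8:00 p.m. instead of 7:00 p.m.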
06:37 If that’s not messy enough, it gets problematic when you start trying to deal with dates that happened before the Unix epoch. Before 1918, Russia used the Julian calendar. As of February 14, 1918, they changed to a Gregorian calendar to be in sync with the rest of Europe.
06:54 They lost thirteen days in the process, the difference between the two calendars. If you’re storing January 5, 1917, in UTC without more context, you could be off by almost two weeks.
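To see the size of that gap, here is a minimal sketch that shifts a Julian calendar date to its Gregorian equivalent. It assumes the fixed thirteen-day difference, which only holds for Julian dates from March 1, 1900 until 2100, and the function name `julian_to_gregorian` is made up for this example. It is not a general calendar converter.

```python
from datetime import date, timedelta

# Assumption: a constant 13-day gap, valid for the 1917-1918 period
# discussed here but not for earlier centuries.
JULIAN_GAP = timedelta(days=13)


def julian_to_gregorian(julian: date) -> date:
    """Shift a Julian calendar date to the Gregorian calendar."""
    return julian + JULIAN_GAP


print(julian_to_gregorian(date(1917, 1, 5)))   # 1917-01-18
# The last Julian day in Russia lands one day before February 14, 1918.
print(julian_to_gregorian(date(1918, 1, 31)))  # 1918-02-13
```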
07:08 Of course, all of this messiness depends on your application. For a calendar application, you could assume only dates moving forward are important, which would simplify a lot of things. For a genealogy application, you might want all sorts of extra info to go along with any date in your system.
07:26 I think I’ve spent enough time scaring you. Next up, how to code some of this nuttiness in Python.