Syncopated Systems
Seriously Sound Science

On the Formatting of Dates

The Problem With Dates

As the ability of people to communicate across national and cultural borders increases, so does the potential for miscommunication, especially for simple and often-abbreviated elements such as dates. This potential for error is well summarized by Lloyd Honomichl of Lionbridge on the World Wide Web Consortium (W3C) Web site at < http://www.w3.org/International/questions/qa-date-format > (note the use of MLA format here to encapsulate the URL); there, Honomichl writes:

"Visitors to a web site from varying locales may be confused by date formats. The format MM/DD/YY is unique to the United States. Most of Europe uses DD/MM/YY. Japan uses YY/MM/DD. The separators may be slashes, dashes or periods. Some locales print leading zeroes, others suppress them. If a native Japanese speaker is reading a US English web page from a web site in Germany that contains the date 03/04/02 how do they interpret it?"

Because of the diversity of the world's people, there is unfortunately no single solution that fits the bill perfectly. However, the International Organization for Standardization's ISO 8601:2000 International Date Format has gained acceptance with the W3C, though it has now been superseded by ISO 8601:2004.

ISO 8601 International Date Format

The ISO 8601 date format (described in more detail on Wikipedia) provides the year, month and day in that order and in all-numerical form using common Arabic numerals. Each field is of fixed length, using four characters to represent the year, two characters for the month, and two for the day. The fields are separated (or "delimited", as we often say in the computer biz) by hyphens. The year, if before year zero (the "epoch" year), is preceded by a minus sign (commonly represented using another hyphen); otherwise, the year may or may not be preceded by a plus sign.

Using these formatting rules, today's date (for example) would be represented as "2006-07-18" or "+2006-07-18". Including a Coordinated Universal Time (U.T.C.) time stamp, this representation would become "2006-07-18T23:18:05Z". (The "T" separates the date from the time and "Z" indicates that the time zone is U.T.C., or "Zulu time".)
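As a minimal sketch of the rules above, the following Python fragment produces both representations using only the standard library. The function names iso_date and iso_utc_timestamp are my own inventions for illustration, and the sketch assumes years in the range 0001 through 9999 (the range Python's datetime module supports), so it never emits a sign.

```python
from datetime import date, datetime, timezone


def iso_date(d: date) -> str:
    """Return the date as YYYY-MM-DD (hypothetical helper name)."""
    return d.isoformat()


def iso_utc_timestamp(dt: datetime) -> str:
    """Return a time-zone-aware datetime as YYYY-MM-DDThh:mm:ssZ.

    The value is converted to U.T.C. first, so the trailing "Z" is accurate.
    Assumes dt carries time zone information (is not "naive").
    """
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


today = date(2006, 7, 18)
stamp = datetime(2006, 7, 18, 23, 18, 5, tzinfo=timezone.utc)

print(iso_date(today))           # 2006-07-18
print(iso_utc_timestamp(stamp))  # 2006-07-18T23:18:05Z
```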

Note that the use of minus and (optionally) plus signs is analogous to the use of the terms "before common era" ("B.C.E.") and "common era" ("C.E."), respectively. (The analogy is not exact: ISO 8601 includes a year zero, which corresponds to 1 B.C.E., so year -0001 corresponds to 2 B.C.E.) These terms were introduced into common usage roughly around the year 2000, when they replaced "Before Christ" ("B.C.") and "Anno Domini" (Latin meaning "year of our lord", abbreviated "A.D."), respectively.

Practical Modifications

Overall, I like the ISO 8601 date format because most of the world's people read left-to-right and the ISO format's left-to-right lexicographical ordering enables easy data traversal from "most significant" to "least significant" data (a notation often referred to in the computer biz as "big-endian", a term borrowed from the book Gulliver's Travels), allowing computers to do easy string-based sorting into chronological order.
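The sorting claim is easy to check. The sketch below sorts the same four dates as plain strings, once in ISO order and once in the U.S. month/day/year order; the sample dates are arbitrary, and the sketch assumes unsigned four-digit years (a leading sign would upset a plain string sort).

```python
# The same four dates, written two ways.
iso_dates = ["2006-07-18", "1911-02-06", "2004-06-05", "2006-01-02"]
us_dates = ["07/18/2006", "02/06/1911", "06/05/2004", "01/02/2006"]

# A plain string sort compares character by character, left to right.
sorted_iso = sorted(iso_dates)
sorted_us = sorted(us_dates)

print(sorted_iso)
# ['1911-02-06', '2004-06-05', '2006-01-02', '2006-07-18']  (chronological)

print(sorted_us)
# ['01/02/2006', '02/06/1911', '06/05/2004', '07/18/2006']  (not chronological)
```

Because the year is the leftmost field, the string comparison settles on the most significant difference first, which is exactly what chronological ordering needs.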

Reserve Hyphens

However, I prefer not to use the hyphen character ("-") as field delimiter, instead reserving its use to prefix negative years (as a "minus sign") and to indicate spans of dates and times.

Slash-Delimiting

In place of the hyphen, I substitute another character such as period ("."), slash ("/") or space, depending on the application. Using slash delimiters, I would modify the ISO representation of today's date to "2006/07/18" or "+2006/07/18".

Indicating Spans

Using this notation, I could then indicate spans of time such as Ronald Reagan's lifetime (according to Wikipedia) as "1911/02/06-2004/06/05". For shorter spans of time, I allow myself to use a shorthand notation omitting fields that would be duplicated in the ending date, so a span of "2006/07/11-2006/07/25" could be represented simply and clearly as "2006/07/11-25"; using the same principle, spans across months (that do not cross years) could be represented as "2006/07/14-08/25", while still conveying meaning with the fewest characters.
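The shorthand can be sketched in a few lines of Python. The function name format_span is hypothetical, and two behaviors are my own assumptions rather than part of the notation described above: a span that starts and ends on the same day collapses to a single date, and an ending date earlier than the starting date is rejected.

```python
from datetime import date


def format_span(start: date, end: date) -> str:
    """Format a span as slash-delimited dates joined by a hyphen,
    omitting leading fields of the end date that repeat the start date."""
    if end < start:
        raise ValueError("span ends before it starts")  # assumption, see lead-in

    first = f"{start.year:04d}/{start.month:02d}/{start.day:02d}"

    if start.year != end.year:
        # Years differ: write the ending date in full.
        return f"{first}-{end.year:04d}/{end.month:02d}/{end.day:02d}"
    if start.month != end.month:
        # Same year: omit the year from the ending date.
        return f"{first}-{end.month:02d}/{end.day:02d}"
    if start.day != end.day:
        # Same year and month: only the day is needed.
        return f"{first}-{end.day:02d}"
    return first  # assumption: a one-day span is written as a single date


print(format_span(date(1911, 2, 6), date(2004, 6, 5)))    # 1911/02/06-2004/06/05
print(format_span(date(2006, 7, 11), date(2006, 7, 25)))  # 2006/07/11-25
print(format_span(date(2006, 7, 14), date(2006, 8, 25)))  # 2006/07/14-08/25
```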

Space-Delimiting

When using modern computers having file systems that allow long file names, I often prefix file or directory names with their dates of creation so that I may more easily find them later. File systems, however, historically and generally now use the period character (".") to delimit the file name extension (the suffix that some operating systems use to identify which program should process the file's data). To delimit directory names, Unix-based computers use the slash character ("/") and DOS-based computers use the backward slash character ("back slash", "\"). So, for this application, I delimit date fields using a space character. (Computer programs sometimes prefix space with a backward slash, as "\ ", or represent space as its ASCII hexadecimal value prefixed by a percent sign, as "%20".)
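As a rough sketch of this naming scheme, the Python fragment below builds a date-prefixed name and shows the percent-encoded form mentioned above. The function name dated_name and the sample file name "notes.txt" are made up for illustration; quote is the standard library's URL-quoting function.

```python
from datetime import date
from urllib.parse import quote


def dated_name(created: date, name: str) -> str:
    """Prefix a file or directory name with a space-delimited date
    (hypothetical helper name)."""
    return f"{created.year:04d} {created.month:02d} {created.day:02d} {name}"


file_name = dated_name(date(2006, 7, 18), "notes.txt")
print(file_name)         # 2006 07 18 notes.txt

# In a URL, each space is written as its ASCII value in hexadecimal, "%20".
print(quote(file_name))  # 2006%2007%2018%20notes.txt
```

Since the date comes first and runs from most to least significant, an ordinary alphabetical directory listing also shows these names in chronological order.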

Self-Delimiting

The data in the basic ISO date format may be made self-delimiting by replacing the representation of each month with a single corresponding Roman letter, preferably from the lower case. (To avoid confusion with similar Arabic numerals, the letters "I", "L", "O" and "S" should not be used; see the table below.) Using this format, today's date would be represented as "2006g18". (Syncopated uses this format when posting articles on its Web site, appending an extra letter to facilitate posting multiple articles on a given day.)

Alternate Representations of Months

Name       Number  Letter
January    01      a
February   02      b
March      03      c
April      04      d
May        05      e
June       06      f
July       07      g
August     08      h
September  09      j
October    10      k
November   11      m
December   12      n
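The table translates directly into a lookup string, as in the following Python sketch. The names MONTH_LETTERS, compact_date and parse_compact are hypothetical, and the parser assumes a four-digit unsigned year; it ignores anything after the two-digit day, such as the extra letter appended for multiple articles posted on the same day.

```python
from datetime import date

# Month letters from the table above, January through December.
# "i" and "l" are skipped to avoid confusion with numerals.
MONTH_LETTERS = "abcdefghjkmn"


def compact_date(d: date) -> str:
    """Encode a date as YYYY + month letter + DD."""
    return f"{d.year:04d}{MONTH_LETTERS[d.month - 1]}{d.day:02d}"


def parse_compact(text: str) -> date:
    """Decode a string produced by compact_date.

    Raises ValueError if the month letter is not in the table.
    """
    year = int(text[:4])
    month = MONTH_LETTERS.index(text[4]) + 1
    day = int(text[5:7])
    return date(year, month, day)


print(compact_date(date(2006, 7, 18)))  # 2006g18
print(parse_compact("2006g18"))         # 2006-07-18
```

Because the letters ascend in the same order as the months, strings in this format still sort chronologically as plain text.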

Summary

Because improvements in global communication technology are likely to continue well into the foreseeable future, the practice of recording date and time data from "most significant" to "least significant" (with respect to left and right) should be used diligently wherever practical, whether the data is originally intended for international publication or not. The use of hyphen characters ('-') as field delimiters should be avoided, though their use to indicate spans should be encouraged.