1<!DOCTYPE html>
2<html lang="en">
3<head>
4  <title>Theory and pragmatics of the tz code and data</title>
5  <meta charset="UTF-8">
6</head>
7
<!-- The somewhat-unusual indenting style in this file is intended to
9     shrink the output of the shell command 'diff Theory Theory.html',
10     where 'Theory' was the plain text file that this file is derived
11     from.  The 'Theory' file used leading white space to indent, and
12     when possible that indentation is preserved here.  Eventually we
13     may stop doing this and remove this comment.  -->
14
15<body>
16  <h1>Theory and pragmatics of the tz code and data</h1>
17  <h3>Outline</h3>
18  <nav>
19    <ul>
20      <li><a href="#scope">Scope of the tz database</a></li>
21      <li><a href="#naming">Names of time zone rules</a></li>
22      <li><a href="#abbreviations">Time zone abbreviations</a></li>
23      <li><a href="#accuracy">Accuracy of the tz database</a></li>
24      <li><a href="#functions">Time and date functions</a></li>
25      <li><a href="#stability">Interface stability</a></li>
26      <li><a href="#calendar">Calendrical issues</a></li>
27      <li><a href="#planets">Time and time zones on other planets</a></li>
28    </ul>
29  </nav>
30
31
32  <section>
33    <h2 id="scope">Scope of the tz database</h2>
34<p>
35The tz database attempts to record the history and predicted future of
36all computer-based clocks that track civil time.  To represent this
37data, the world is partitioned into regions whose clocks all agree
38about timestamps that occur after the somewhat-arbitrary cutoff point
39of the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
40the database records all known clock transitions, and labels the region
41with a notable location.  Although 1970 is a somewhat-arbitrary
42cutoff, there are significant challenges to moving the cutoff earlier
43even by a decade or two, due to the wide variety of local practices
44before computer timekeeping became prevalent.
45</p>
46
47<p>
48Clock transitions before 1970 are recorded for each such location,
49because most systems support timestamps before 1970 and could
50misbehave if data entries were omitted for pre-1970 transitions.
51However, the database is not designed for and does not suffice for
52applications requiring accurate handling of all past times everywhere,
53as it would take far too much effort and guesswork to record all
54details of pre-1970 civil timekeeping.
Although some information outside the scope of the database is
56collected in a file <code>backzone</code> that is distributed along
57with the database proper, this file is less reliable and does not
58necessarily follow database guidelines.
59</p>
60
61<p>
62As described below, reference source code for using the tz database is
63also available.  The tz code is upwards compatible with POSIX, an
64international standard for UNIX-like systems.  As of this writing, the
65current edition of POSIX is:
66  <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">
67  The Open Group Base Specifications Issue 7</a>,
68  IEEE Std 1003.1-2008, 2016 Edition.
69</p>
70  </section>
71
72
73
74  <section>
75    <h2 id="naming">Names of time zone rules</h2>
76<p>
77Each of the database's time zone rules has a unique name.
78Inexperienced users are not expected to select these names unaided.
79Distributors should provide documentation and/or a simple selection
80interface that explains the names; for one example, see the 'tzselect'
81program in the tz code.  The
82<a href="http://cldr.unicode.org/">Unicode Common Locale Data
83Repository</a> contains data that may be useful for other
84selection interfaces.
85</p>
86
87<p>
88The time zone rule naming conventions attempt to strike a balance
89among the following goals:
90</p>
91<ul>
92  <li>
93   Uniquely identify every region where clocks have agreed since 1970.
94   This is essential for the intended use: static clocks keeping local
95   civil time.
96  </li>
97  <li>
98   Indicate to experts where that region is.
99  </li>
100  <li>
101   Be robust in the presence of political changes.  For example, names
102   of countries are ordinarily not used, to avoid incompatibilities
103   when countries change their name (e.g. Zaire&rarr;Congo) or when
104   locations change countries (e.g. Hong Kong from UK colony to
105   China).
106  </li>
107  <li>
108   Be portable to a wide variety of implementations.
109  </li>
110  <li>
   Use consistent naming conventions over the entire world.
112  </li>
113</ul>
114<p>
115Names normally have the
116form <var>AREA</var><code>/</code><var>LOCATION</var>,
117where <var>AREA</var> is the name of a continent or ocean,
118and <var>LOCATION</var> is the name of a specific
119location within that region.  North and South America share the same
120area, '<code>America</code>'.  Typical names are
121'<code>Africa/Cairo</code>', '<code>America/New_York</code>', and
122'<code>Pacific/Honolulu</code>'.
123</p>
124
125<p>
126Here are the general rules used for choosing location names,
127in decreasing order of importance:
128</p>
129<ul>
130  <li>
131	Use only valid POSIX file name components (i.e., the parts of
132		names other than '<code>/</code>').  Do not use the file name
133		components '<code>.</code>' and '<code>..</code>'.
134		Within a file name component,
135		use only ASCII letters, '<code>.</code>',
136		'<code>-</code>' and '<code>_</code>'.  Do not use
137		digits, as that might create an ambiguity with POSIX
138		TZ strings.  A file name component must not exceed 14
139		characters or start with '<code>-</code>'.  E.g.,
140		prefer '<code>Brunei</code>' to
141		'<code>Bandar_Seri_Begawan</code>'.  Exceptions: see
142		the discussion
143		of legacy names below.
144  </li>
145  <li>
146	A name must not be empty, or contain '<code>//</code>', or
147	start or end with '<code>/</code>'.
148  </li>
149  <li>
150	Do not use names that differ only in case.  Although the reference
151		implementation is case-sensitive, some other implementations
152		are not, and they would mishandle names differing only in case.
153  </li>
154  <li>
155	If one name <var>A</var> is an initial prefix of another
156		name <var>AB</var> (ignoring case), then <var>B</var>
157		must not start with '<code>/</code>', as a
158		regular file cannot have
159		the same name as a directory in POSIX.  For example,
160		'<code>America/New_York</code>' precludes
161		'<code>America/New_York/Bronx</code>'.
162  </li>
163  <li>
164	Uninhabited regions like the North Pole and Bouvet Island
165		do not need locations, since local time is not defined there.
166  </li>
167  <li>
168	There should typically be at least one name for each ISO 3166-1
169		officially assigned two-letter code for an inhabited country
170		or territory.
171  </li>
172  <li>
173	If all the clocks in a region have agreed since 1970,
174		don't bother to include more than one location
175		even if subregions' clocks disagreed before 1970.
176		Otherwise these tables would become annoyingly large.
177  </li>
178  <li>
179	If a name is ambiguous, use a less ambiguous alternative;
		e.g. many cities are named San José and Georgetown, so
181		prefer '<code>Costa_Rica</code>' to '<code>San_Jose</code>' and '<code>Guyana</code>' to '<code>Georgetown</code>'.
182  </li>
183  <li>
184	Keep locations compact.  Use cities or small islands, not countries
185		or regions, so that any future time zone changes do not split
186		locations into different time zones.  E.g. prefer
187		'<code>Paris</code>' to '<code>France</code>', since
188		France has had multiple time zones.
189  </li>
190  <li>
191	Use mainstream English spelling, e.g. prefer
192		'<code>Rome</code>' to '<code>Roma</code>', and prefer
193		'<code>Athens</code>' to the Greek
		'<code>Αθήνα</code>' or the Romanized
		'<code>Athína</code>'.
196		The POSIX file name restrictions encourage this rule.
197  </li>
198  <li>
199	Use the most populous among locations in a zone,
200		e.g. prefer '<code>Shanghai</code>' to
201		'<code>Beijing</code>'.  Among locations with
202		similar populations, pick the best-known location,
203		e.g. prefer '<code>Rome</code>' to '<code>Milan</code>'.
204  </li>
205  <li>
206	Use the singular form, e.g. prefer '<code>Canary</code>' to '<code>Canaries</code>'.
207  </li>
208  <li>
209	Omit common suffixes like '<code>_Islands</code>' and
210		'<code>_City</code>', unless that would lead to
211		ambiguity.  E.g. prefer '<code>Cayman</code>' to
212		'<code>Cayman_Islands</code>' and
213		'<code>Guatemala</code>' to
214		'<code>Guatemala_City</code>', but prefer
215		'<code>Mexico_City</code>' to '<code>Mexico</code>'
216		because the country
217		of Mexico has several time zones.
218  </li>
219  <li>
220	Use '<code>_</code>' to represent a space.
221  </li>
222  <li>
223	Omit '<code>.</code>' from abbreviations in names, e.g. prefer
224		'<code>St_Helena</code>' to '<code>St._Helena</code>'.
225  </li>
226  <li>
227	Do not change established names if they only marginally
228		violate the above rules.  For example, don't change
229		the existing name '<code>Rome</code>' to
230		'<code>Milan</code>' merely because
231		Milan's population has grown to be somewhat greater
232		than Rome's.
233  </li>
234  <li>
235	If a name is changed, put its old spelling in the
236		'<code>backward</code>' file.
237		This means old spellings will continue to work.
238  </li>
239</ul>
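<p>
The character-level rules above lend themselves to a mechanical check.
The following Python sketch is only an illustration and is not part of
the tz code; the function names are hypothetical.  It covers the first
two rules (valid file name components and no empty components) and
makes no attempt at the remaining rules, such as names differing only
in case or one name being a prefix of another.  Legacy names such as
'<code>GMT0</code>' are documented exceptions, so this check would
report them.
</p>
<pre><code>import re

# One file name component: ASCII letters, '.', '-' and '_' only.
COMPONENT = re.compile(r"[A-Za-z._-]+")
MAX_COMPONENT_LENGTH = 14


def component_problems(component):
    """Return a list of rule violations for one file name component."""
    if component == "":
        return ["empty component"]
    problems = []
    if component in (".", ".."):
        problems.append("component is '.' or '..'")
    if COMPONENT.fullmatch(component) is None:
        problems.append("character outside ASCII letters, '.', '-', '_'")
    if len(component) not in range(1, MAX_COMPONENT_LENGTH + 1):
        problems.append("longer than 14 characters")
    if component.startswith("-"):
        problems.append("starts with '-'")
    return problems


def name_problems(name):
    """Check a whole name such as 'America/New_York'."""
    if name == "":
        return ["empty name"]
    problems = []
    # A leading, trailing or doubled '/' shows up as an empty component.
    for component in name.split("/"):
        problems.extend(component_problems(component))
    return problems
</code></pre>
<p>
With these definitions, <code>name_problems("Asia/Brunei")</code>
returns an empty list, while
<code>name_problems("Asia/Bandar_Seri_Begawan")</code> reports that the
second component is longer than 14 characters.
</p>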
240
241<p>
242The file '<code>zone1970.tab</code>' lists geographical locations used
243to name time
244zone rules.  It is intended to be an exhaustive list of names for
245geographic regions as described above; this is a subset of the names
246in the data.  Although a '<code>zone1970.tab</code>' location's longitude
247corresponds to its LMT offset with one hour for every 15&deg; east
248longitude, this relationship is not exact.
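<p>
As a rough illustration of the longitude relationship just described,
the Python sketch below converts a longitude to an approximate LMT
offset at one hour per 15&deg;.  The function name and the sample
longitudes are hypothetical, and the result is only an approximation:
as noted above, the relationship is not exact.
</p>
<pre><code>from datetime import timedelta


def approximate_lmt_offset(longitude_degrees_east):
    """Approximate LMT offset from UT: one hour per 15 degrees east."""
    # 3600 seconds per hour divided by 15 degrees is 240 seconds per degree.
    seconds = round(longitude_degrees_east * 240)
    return timedelta(seconds=seconds)
</code></pre>
<p>
For example, <code>approximate_lmt_offset(31.25)</code> is two hours
and five minutes, and a longitude of 74&deg; west (passed as
<code>-74</code>) gives an offset of four hours and 56 minutes behind UT.
</p>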
249</p>
250
251<p>
252Older versions of this package used a different naming scheme,
253and these older names are still supported.
254See the file '<code>backward</code>' for most of these older names
255(e.g., '<code>US/Eastern</code>' instead of '<code>America/New_York</code>').
256The other old-fashioned names still supported are
257'<code>WET</code>', '<code>CET</code>', '<code>MET</code>', and '<code>EET</code>' (see the file '<code>europe</code>').
258</p>
259
260<p>
261Older versions of this package defined legacy names that are
262incompatible with the first rule of location names, but which are
263still supported.  These legacy names are mostly defined in the file
264'<code>etcetera</code>'.  Also, the file '<code>backward</code>' defines the legacy names
265'<code>GMT0</code>', '<code>GMT-0</code>' and '<code>GMT+0</code>', and the file '<code>northamerica</code>' defines the
266legacy names '<code>EST5EDT</code>', '<code>CST6CDT</code>', '<code>MST7MDT</code>', and '<code>PST8PDT</code>'.
267</p>
268
269<p>
270Excluding '<code>backward</code>' should not affect the other data.  If
271'<code>backward</code>' is excluded, excluding '<code>etcetera</code>' should not affect the
272remaining data.
273</p>
274
275
276  </section>
277  <section>
278    <h2 id="abbreviations">Time zone abbreviations</h2>
279<p>
280When this package is installed, it generates time zone abbreviations
281like '<code>EST</code>' to be compatible with human tradition and POSIX.
282Here are the general rules used for choosing time zone abbreviations,
in decreasing order of importance:
</p>
284<ul>
285  <li>
286	Use three to six characters that are ASCII alphanumerics or
287		'<code>+</code>' or '<code>-</code>'.
288		Previous editions of this database also used characters like
289		'<code> </code>' and '<code>?</code>', but these
290		characters have a special meaning to
291		the shell and cause commands like
292			'<code>set `date`</code>'
293		to have unexpected effects.
294		Previous editions of this rule required upper-case letters,
295		but the Congressman who introduced Chamorro Standard Time
296		preferred "ChST", so lower-case letters are now allowed.
297		Also, POSIX from 2001 on relaxed the rule to allow
298		'<code>-</code>', '<code>+</code>',
299		and alphanumeric characters from the portable character set
300		in the current locale.  In practice ASCII alphanumerics and
301		'<code>+</code>' and '<code>-</code>' are safe in all locales.
302
303		In other words, in the C locale the POSIX extended regular
304		expression <code>[-+[:alnum:]]{3,6}</code> should match
305		the abbreviation.
306		This guarantees that all abbreviations could have been
307		specified by a POSIX TZ string.
308  </li>
309  <li>
310	Use abbreviations that are in common use among English-speakers,
311		e.g. 'EST' for Eastern Standard Time in North America.
312		We assume that applications translate them to other languages
313		as part of the normal localization process; for example,
314		a French application might translate 'EST' to 'HNE'.
315
316<p><small>These abbreviations (for standard/daylight/etc. time) are:
317ACST/ACDT Australian Central,
318AST/ADT/APT/AWT/ADDT Atlantic,
319AEST/AEDT Australian Eastern,
320AHST/AHDT Alaska-Hawaii,
321AKST/AKDT Alaska,
322AWST/AWDT Australian Western,
323BST/BDT Bering,
324CAT/CAST Central Africa,
325CET/CEST/CEMT Central European,
326ChST Chamorro,
327CST/CDT/CWT/CPT/CDDT Central [North America],
328CST/CDT China,
329GMT/BST/IST/BDST Greenwich,
330EAT East Africa,
331EST/EDT/EWT/EPT/EDDT Eastern [North America],
332EET/EEST Eastern European,
333GST Guam,
334HST/HDT Hawaii,
335HKT/HKST Hong Kong,
336IST India,
337IST/GMT Irish,
338IST/IDT/IDDT Israel,
339JST/JDT Japan,
340KST/KDT Korea,
341MET/MEST Middle European (a backward-compatibility alias for Central European),
342MSK/MSD Moscow,
343MST/MDT/MWT/MPT/MDDT Mountain,
344NST/NDT/NWT/NPT/NDDT Newfoundland,
345NST/NDT/NWT/NPT Nome,
346NZMT/NZST New Zealand through 1945,
347NZST/NZDT New Zealand 1946&ndash;present,
348PKT/PKST Pakistan,
349PST/PDT/PWT/PPT/PDDT Pacific,
350SAST South Africa,
351SST Samoa,
352WAT/WAST West Africa,
353WET/WEST/WEMT Western European,
354WIB Waktu Indonesia Barat,
355WIT Waktu Indonesia Timur,
356WITA Waktu Indonesia Tengah,
357YST/YDT/YWT/YPT/YDDT Yukon</small>.</p>
358  </li>
359  <li>
360	For zones whose times are taken from a city's longitude, use the
361traditional <var>x</var>MT notation. The only abbreviation like this
362in current use is 'GMT'. The others are for timestamps before 1960,
363except that Monrovia Mean Time persisted until 1972. Typically,
364numeric abbreviations (e.g., '<code>-</code>004430' for MMT) would
365cause trouble here, as the numeric strings would exceed the POSIX length limit.
366
<p><small>These abbreviations are:
AMT Amsterdam, Asunción, Athens;
BMT Baghdad, Bangkok, Batavia, Bern, Bogotá, Bridgetown, Brussels, Bucharest;
CMT Calamarca, Caracas, Chisinau, Colón, Copenhagen, Córdoba;
DMT Dublin/Dunsink;
EMT Easter;
FFMT Fort-de-France;
FMT Funchal;
GMT Greenwich;
HMT Havana, Helsinki, Horta, Howrah;
IMT Irkutsk, Istanbul;
JMT Jerusalem;
KMT Kaunas, Kiev, Kingston;
LMT Lima, Lisbon, local, Luanda;
MMT Macassar, Madras, Malé, Managua, Minsk, Monrovia, Montevideo, Moratuwa,
 Moscow;
PLMT Phù Liễn;
PMT Paramaribo, Paris, Perm, Pontianak, Prague;
PMMT Port Moresby;
QMT Quito;
RMT Rangoon, Riga, Rome;
SDMT Santo Domingo;
SJMT San José;
SMT Santiago, Simferopol, Singapore, Stanley;
TBMT Tbilisi;
TMT Tallinn, Tehran;
WMT Warsaw</small>.</p>
394
395<p><small>A few abbreviations also follow the pattern that
396GMT/BST established for time in the UK. They are:
397
CMT/BST for Calamarca Mean Time and Bolivian Summer Time
1890&ndash;1932, DMT/IST for Dublin/Dunsink Mean Time and Irish Summer Time
1880&ndash;1916, MMT/MST/MDST for Moscow 1880&ndash;1919, and RMT/LST
for Riga Mean Time and Latvian Summer Time 1880&ndash;1926.
402An extra-special case is SET for Swedish Time (<em>svensk
403normaltid</em>) 1879&ndash;1899, 3&deg; west of the Stockholm
404Observatory.</small></p>
405  </li>
406  <li>
407	Use 'LMT' for local mean time of locations before the introduction
408		of standard time; see "<a href="#scope">Scope of the
409		tz database</a>".
410  </li>
411  <li>
412	If there is no common English abbreviation, use numeric offsets like
413		<code>-</code>05 and <code>+</code>0830 that are
414		generated by zic's <code>%z</code> notation.
415  </li>
416  <li>
417	Use current abbreviations for older timestamps to avoid confusion.
418		For example, in 1910 a common English abbreviation for UT +01
419		in central Europe was 'MEZ' (short for both "Middle European
		Zone" and for "Mitteleuropäische Zeit" in German).  Nowadays
421		'CET' ("Central European Time") is more common in English, and
422		the database uses 'CET' even for circa-1910 timestamps as this
423		is less confusing for modern users and avoids the need for
424		determining when 'CET' supplanted 'MEZ' in common usage.
425  </li>
426  <li>
427	Use a consistent style in a zone's history.  For example, if a zone's
428		history tends to use numeric abbreviations and a particular
429		entry could go either way, use a numeric abbreviation.
430  </li>
431  <li>
432	Use UT (with time zone abbreviation '<code>-</code>00') for
433		locations while uninhabited.  The leading
434		'<code>-</code>' is a flag that the time
435		zone is in some sense undefined; this notation is
436		derived from Internet RFC 3339.
437  </li>
438</ul>
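<p>
The first rule above can be checked mechanically.  The Python sketch
below is an illustration only; the function name is hypothetical, and
the character class is an ASCII-only spelling of the POSIX extended
regular expression <code>[-+[:alnum:]]{3,6}</code> as it behaves in the
C locale.
</p>
<pre><code>import re

# ASCII-only equivalent of the POSIX ERE [-+[:alnum:]]{3,6} in the C locale.
ABBREVIATION = re.compile(r"[-+A-Za-z0-9]{3,6}")


def is_valid_abbreviation(abbreviation):
    """True if the whole string is 3 to 6 ASCII alphanumerics, '+' or '-'."""
    return ABBREVIATION.fullmatch(abbreviation) is not None
</code></pre>
<p>
This accepts 'EST', 'ChST', '<code>-</code>05' and
'<code>+</code>0830', and rejects the seven-character string
'<code>-</code>004430' as well as strings containing
'<code>?</code>' or a space.
</p>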
439<p>
440Application writers should note that these abbreviations are ambiguous
441in practice: e.g., 'CST' means one thing in China and something else
442in North America, and 'IST' can refer to time in India, Ireland or
443Israel. To avoid ambiguity, use numeric UT offsets like
444'<code>-</code>0600' instead of time zone abbreviations like 'CST'.
445</p>
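<p>
As a small sketch of this advice, the Python fragment below labels a
timestamp with a numeric UT offset rather than an abbreviation.  It
uses only the standard <code>datetime</code> module; the fixed offset
of six hours west of UT and the sample date are stand-ins chosen for
illustration, not values taken from the tz data.
</p>
<pre><code>from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset, six hours west of UT, for illustration only.
six_hours_west = timezone(timedelta(hours=-6))
stamp = datetime(2017, 1, 15, 12, 0, tzinfo=six_hours_west)


def label(moment):
    """Format a timezone-aware timestamp with its numeric UT offset."""
    return moment.strftime("%Y-%m-%d %H:%M %z")
</code></pre>
<p>
Here <code>label(stamp)</code> yields
'<code>2017-01-15 12:00 -0600</code>', which is unambiguous where 'CST'
would not be.
</p>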
446  </section>
447
448
449  <section>
450    <h2 id="accuracy">Accuracy of the tz database</h2>
451<p>
452The tz database is not authoritative, and it surely has errors.
453Corrections are welcome and encouraged; see the file <code>CONTRIBUTING</code>.
454Users requiring authoritative data should consult national standards
455bodies and the references cited in the database's comments.
456</p>
457
458<p>
459Errors in the tz database arise from many sources:
460</p>
461<ul>
462  <li>
463   The tz database predicts future timestamps, and current predictions
464   will be incorrect after future governments change the rules.
465   For example, if today someone schedules a meeting for 13:00 next
466   October 1, Casablanca time, and tomorrow Morocco changes its
467   daylight saving rules, software can mess up after the rule change
468   if it blithely relies on conversions made before the change.
469  </li>
470  <li>
471   The pre-1970 entries in this database cover only a tiny sliver of how
472   clocks actually behaved; the vast majority of the necessary
473   information was lost or never recorded.  Thousands more zones would
474   be needed if the tz database's scope were extended to cover even
475   just the known or guessed history of standard time; for example,
476   the current single entry for France would need to split into dozens
477   of entries, perhaps hundreds.  And in most of the world even this
478   approach would be misleading due to widespread disagreement or
479   indifference about what times should be observed.  In her 2015 book
480   <cite>The Global Transformation of Time, 1870-1950</cite>, Vanessa Ogle writes
481   "Outside of Europe and North America there was no system of time
482   zones at all, often not even a stable landscape of mean times,
483   prior to the middle decades of the twentieth century".  See:
484   Timothy Shenk, <a
485   href="https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle">Booked:
486   A Global History of Time</a>. <cite>Dissent</cite> 2015-12-17.
487  </li>
488  <li>
489   Most of the pre-1970 data entries come from unreliable sources, often
490   astrology books that lack citations and whose compilers evidently
491   invented entries when the true facts were unknown, without
492   reporting which entries were known and which were invented.
493   These books often contradict each other or give implausible entries,
494   and on the rare occasions when they are checked they are
495   typically found to be incorrect.
496  </li>
497  <li>
498   For the UK the tz database relies on years of first-class work done by
499   Joseph Myers and others; see
500   "<a href="https://www.polyomino.org.uk/british-time/">History of
501   legal time in Britain</a>".
502   Other countries are not done nearly as well.
503  </li>
504  <li>
505   Sometimes, different people in the same city would maintain clocks
506   that differed significantly.  Railway time was used by railroad
507   companies (which did not always agree with each other),
508   church-clock time was used for birth certificates, etc.
509   Often this was merely common practice, but sometimes it was set by law.
510   For example, from 1891 to 1911 the UT offset in France was legally
511   0:09:21 outside train stations and 0:04:21 inside.
512  </li>
513  <li>
514   Although a named location in the tz database stands for the
515   containing region, its pre-1970 data entries are often accurate for
516   only a small subset of that region.  For example, <code>Europe/London</code>
517   stands for the United Kingdom, but its pre-1847 times are valid
518   only for locations that have London's exact meridian, and its 1847
519   transition to GMT is known to be valid only for the L&amp;NW and the
520   Caledonian railways.
521  </li>
522  <li>
523   The tz database does not record the earliest time for which a zone's
524   data entries are thereafter valid for every location in the region.
525   For example, <code>Europe/London</code> is valid for all locations in its
526   region after GMT was made the standard time, but the date of
527   standardization (1880-08-02) is not in the tz database, other than
528   in commentary.  For many zones the earliest time of validity is
529   unknown.
530  </li>
531  <li>
532   The tz database does not record a region's boundaries, and in many
533   cases the boundaries are not known.  For example, the zone
534   <code>America/Kentucky/Louisville</code> represents a region around
535   the city of
536   Louisville, the boundaries of which are unclear.
537  </li>
538  <li>
539   Changes that are modeled as instantaneous transitions in the tz
540   database were often spread out over hours, days, or even decades.
541  </li>
542  <li>
543   Even if the time is specified by law, locations sometimes
544   deliberately flout the law.
545  </li>
546  <li>
547   Early timekeeping practices, even assuming perfect clocks, were
548   often not specified to the accuracy that the tz database requires.
549  </li>
550  <li>
551   Sometimes historical timekeeping was specified more precisely
552   than what the tz database can handle.  For example, from 1909 to
553   1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
554   database cannot represent the fractional second.
555  </li>
556  <li>
557   Even when all the timestamp transitions recorded by the tz database
558   are correct, the tz rules that generate them may not faithfully
559   reflect the historical rules.  For example, from 1922 until World
560   War II the UK moved clocks forward the day following the third
561   Saturday in April unless that was Easter, in which case it moved
562   clocks forward the previous Sunday.  Because the tz database has no
563   way to specify Easter, these exceptional years are entered as
564   separate tz Rule lines, even though the legal rules did not change.
565  </li>
566  <li>
567   The tz database models pre-standard time using the proleptic Gregorian
568   calendar and local mean time (LMT), but many people used other
569   calendars and other timescales.  For example, the Roman Empire used
570   the Julian calendar, and had 12 varying-length daytime hours with a
571   non-hour-based system at night.
572  </li>
573  <li>
574   Early clocks were less reliable, and data entries do not represent
575   clock error.
576  </li>
577  <li>
578   The tz database assumes Universal Time (UT) as an origin, even
579   though UT is not standardized for older timestamps.  In the tz
580   database commentary, UT denotes a family of time standards that
581   includes Coordinated Universal Time (UTC) along with other variants
582   such as UT1 and GMT, with days starting at midnight.  Although UT
583   equals UTC for modern timestamps, UTC was not defined until 1960,
584   so commentary uses the more-general abbreviation UT for timestamps
585   that might predate 1960.  Since UT, UT1, etc. disagree slightly,
586   and since pre-1972 UTC seconds varied in length, interpretation of
587   older timestamps can be problematic when subsecond accuracy is
588   needed.
589  </li>
590  <li>
591   Civil time was not based on atomic time before 1972, and we don't
592   know the history of earth's rotation accurately enough to map SI
593   seconds to historical solar time to more than about one-hour
594   accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
595   <a href="http://dx.doi.org/10.1098/rspa.2016.0404">Measurement
596   of the Earth's rotation: 720 BC to AD 2015</a>.
597   <cite>Proc Royal Soc A</cite>. 2016 Dec 7;472:20160404.
598   Also see: Espenak F. <a
599   href="https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html">Uncertainty
   in Delta T (ΔT)</a>.
601  </li>
602  <li>
603   The relationship between POSIX time (that is, UTC but ignoring leap
604   seconds) and UTC is not agreed upon after 1972.  Although the POSIX
605   clock officially stops during an inserted leap second, at least one
606   proposed standard has it jumping back a second instead; and in
607   practice POSIX clocks more typically either progress glacially during
608   a leap second, or are slightly slowed while near a leap second.
609  </li>
610  <li>
611   The tz database does not represent how uncertain its information is.
612   Ideally it would contain information about when data entries are
613   incomplete or dicey.  Partial temporal knowledge is a field of
614   active research, though, and it's not clear how to apply it here.
615  </li>
616</ul>
617<p>
618In short, many, perhaps most, of the tz database's pre-1970 and future
619timestamps are either wrong or misleading.  Any attempt to pass the
620tz database off as the definition of time should be unacceptable to
621anybody who cares about the facts.  In particular, the tz database's
622LMT offsets should not be considered meaningful, and should not prompt
623creation of zones merely because two locations differ in LMT or
624transitioned to standard time at different dates.
625</p>
626  </section>
627
628
629  <section>
630    <h2 id="functions">Time and date functions</h2>
631<p>
632The tz code contains time and date functions that are upwards
633compatible with those of POSIX.
634</p>
635
636<p>
637POSIX has the following properties and limitations.
638</p>
639<ul>
640  <li>
641    <p>
642	In POSIX, time display in a process is controlled by the
643	environment variable TZ.  Unfortunately, the POSIX TZ string takes
644	a form that is hard to describe and is error-prone in practice.
645	Also, POSIX TZ strings can't deal with other (for example, Israeli)
646	daylight saving time rules, or situations where more than two
647	time zone abbreviations are used in an area.
648    </p>
649    <p>
650      The POSIX TZ string takes the following form:
651    </p>
652    <p>
      <var>std</var><var>offset</var>[<var>dst</var>[<var>offset</var>][<code>,</code><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]]]
654    </p>
655    <p>
656	where:
657    <dl>
658      <dt><var>std</var> and <var>dst</var></dt><dd>
659		are 3 or more characters specifying the standard
660		and daylight saving time (DST) zone names.
661		Starting with POSIX.1-2001, <var>std</var>
662		and <var>dst</var> may also be
663		in a quoted form like '<code>&lt;+09&gt;</code>'; this allows
664		"<code>+</code>" and "<code>-</code>" in the names.
665      </dd>
666      <dt><var>offset</var></dt><dd>
667		is of the form
		'<code>[&plusmn;]<var>hh</var>[:<var>mm</var>[:<var>ss</var>]]</code>'
669		and specifies the offset west of UT.  '<var>hh</var>'
670		may be a single digit; 0&le;<var>hh</var>&le;24.
671		The default DST offset is one hour ahead of standard time.
672      </dd>
673      <dt><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]</dt><dd>
674		specifies the beginning and end of DST.  If this is absent,
675		the system supplies its own rules for DST, and these can
676		differ from year to year; typically US DST rules are used.
677      </dd>
678      <dt><var>time</var></dt><dd>
679		takes the form
		'<var>hh</var>[<code>:</code><var>mm</var>[<code>:</code><var>ss</var>]]'
681		and defaults to 02:00.
682		This is the same format as the offset, except that a
683		leading '<code>+</code>' or '<code>-</code>' is not allowed.
684      </dd>
685      <dt><var>date</var></dt><dd>
686		takes one of the following forms:
687	<dl>
688	  <dt>J<var>n</var> (1&le;<var>n</var>&le;365)</dt><dd>
689			origin-1 day number not counting February 29
690          </dd>
691	  <dt><var>n</var> (0&le;<var>n</var>&le;365)</dt><dd>
692			origin-0 day number counting February 29 if present
693          </dd>
694	  <dt><code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var> (0[Sunday]&le;<var>d</var>&le;6[Saturday], 1&le;<var>n</var>&le;5, 1&le;<var>m</var>&le;12)</dt><dd>
695			for the <var>d</var>th day of
696			week <var>n</var> of month <var>m</var> of the
697			year, where week 1 is the first week in which
698			day <var>d</var> appears, and '<code>5</code>'
699			stands for the last week in which
700			day <var>d</var> appears
701			(which may be either the 4th or 5th week).
702			Typically, this is the only useful form;
703			the <var>n</var>
704			and <code>J</code><var>n</var> forms are
705			rarely used.
706	  </dd>
707</dl>
708</dd>
709</dl>
710	Here is an example POSIX TZ string for New Zealand after 2007.
711	It says that standard time (NZST) is 12 hours ahead of UT,
712	and that daylight saving time (NZDT) is observed from September's
713	last Sunday at 02:00 until April's first Sunday at 03:00:
714
715        <pre><code>TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'</code></pre>
716
717	This POSIX TZ string is hard to remember, and mishandles some
718	timestamps before 2008.  With this package you can use this
719	instead:
720
721	<pre><code>TZ='Pacific/Auckland'</code></pre>
722  </li>
723  <li>
724	POSIX does not define the exact meaning of TZ values like
725	"<code>EST5EDT</code>".
726	Typically the current US DST rules are used to interpret such values,
	but this means that the US DST rules are compiled into each program
	that does time conversion.  As a result, when US time conversion
	rules change (as they did in 1987), all programs that
	do time conversion must be recompiled to ensure proper results.
731  </li>
732  <li>
733	The TZ environment variable is process-global, which makes it hard
734	to write efficient, thread-safe applications that need access
735	to multiple time zones.
736  </li>
737  <li>
738	In POSIX, there's no tamper-proof way for a process to learn the
739	system's best idea of local wall clock.  (This is important for
740	applications that an administrator wants used only at certain
741	times &ndash;
742	without regard to whether the user has fiddled the TZ environment
743	variable.  While an administrator can "do everything in UT" to get
744	around the problem, doing so is inconvenient and precludes handling
	daylight saving time shifts &ndash; as might be required to limit phone
746	calls to off-peak hours.)
747  </li>
748  <li>
749	POSIX provides no convenient and efficient way to determine the UT
750	offset and time zone abbreviation of arbitrary timestamps,
751	particularly for time zone settings that do not fit into the
752	POSIX model.
753  </li>
754  <li>
755	POSIX requires that systems ignore leap seconds.
756  </li>
757  <li>
758	The tz code attempts to support all the <code>time_t</code>
759	implementations allowed by POSIX.  The <code>time_t</code>
760	type represents a nonnegative count of
761	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
762	In practice, <code>time_t</code> is usually a signed 64- or
763	32-bit integer; 32-bit signed <code>time_t</code> values stop
764	working after 2038-01-19 03:14:07 UTC, so
765	new implementations these days typically use a signed 64-bit integer.
766	Unsigned 32-bit integers are used on one or two platforms,
767	and 36-bit and 40-bit integers are also used occasionally.
768	Although earlier POSIX versions allowed <code>time_t</code> to be a
769	floating-point type, this was not supported by any practical
770	systems, and POSIX.1-2013 and the tz code both
771	require <code>time_t</code>
772	to be an integer type.
773  </li>
774</ul>
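<p>
As a rough illustration of two of the points above, the following C sketch
interprets a timestamp under a POSIX TZ string such as the New Zealand
example, and converts the largest signed 32-bit <code>time_t</code> value
to UTC.  The helper names <code>local_time_in</code> and
<code>last_int32_instant</code> are hypothetical and are not part of the
tz code; the sketch assumes a C library that parses POSIX TZ strings and
provides <code>setenv</code>, <code>localtime_r</code>,
and <code>gmtime_r</code>.
</p>

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for setenv, tzset, localtime_r, gmtime_r */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: interpret T under the TZ setting TZ_STRING and
   store the broken-down local time in *OUT.
   Return 0 on success, -1 on failure.
   Not thread-safe, because it changes the process-global TZ setting.  */
int local_time_in(const char *tz_string, time_t t, struct tm *out)
{
    assert(tz_string != NULL && out != NULL);
    if (setenv("TZ", tz_string, 1) != 0)
        return -1;
    tzset();  /* make the library re-read TZ */
    return localtime_r(&t, out) ? 0 : -1;
}

/* Hypothetical helper: store in *OUT the UTC broken-down time of the
   last instant that a signed 32-bit time_t can represent.
   Return 0 on success, -1 on failure.  */
int last_int32_instant(struct tm *out)
{
    time_t t = (time_t)INT32_MAX;  /* 2^31 - 1 seconds after the Epoch */
    assert(out != NULL);
    return gmtime_r(&t, out) ? 0 : -1;
}
```

<p>
For instance, calling <code>local_time_in</code> with
'<code>NZST-12NZDT,M9.5.0,M4.1.0/3</code>' and a timestamp in January
should yield a positive <code>tm_isdst</code>, whereas a timestamp in July
should yield zero.  Because the helper changes TZ for the whole process,
it also shows why this approach is awkward in multithreaded programs.
</p>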
775<p>
776These are the extensions that have been made to the POSIX functions:
777</p>
778<ul>
779  <li>
780    <p>
781	The TZ environment variable is used in generating the name of a file
782	from which time zone information is read (or is interpreted a la
783	POSIX); TZ is no longer constrained to be a three-letter time zone
784	name followed by a number of hours and an optional three-letter
785	daylight time zone name.  The daylight saving time rules to be used
786	for a particular time zone are encoded in the time zone file;
787	the format of the file allows U.S., Australian, and other rules to be
788	encoded, and allows for situations where more than two time zone
789	abbreviations are used.
790    </p>
791    <p>
792	It was recognized that allowing the TZ environment variable to
793	take on values such as '<code>America/New_York</code>' might
794	cause "old" programs
795	(that expect TZ to have a certain form) to operate incorrectly;
796	consideration was given to using some other environment variable
797	(for example, TIMEZONE) to hold the string used to generate the
798	time zone information file name.  In the end, however, it was decided
799	to continue using TZ: it is widely used for time zone purposes;
800	separately maintaining both TZ and TIMEZONE seemed a nuisance;
801	and systems where "new" forms of TZ might cause problems can simply
802	use TZ values such as "<code>EST5EDT</code>" which can be used both by
803	"new" programs (a la POSIX) and "old" programs (as zone names and
804	offsets).
805    </p>
806</li>
807<li>
808	The code supports platforms with a UT offset member
809	in <code>struct tm</code>,
810	e.g., <code>tm_gmtoff</code>.
811</li>
812<li>
813	The code supports platforms with a time zone abbreviation member in
814	<code>struct tm</code>, e.g., <code>tm_zone</code>.
815</li>
816<li>
817	Since the TZ environment variable can now be used to control time
818	conversion, the <code>daylight</code>
819	and <code>timezone</code> variables are no longer needed.
820	(These variables are defined and set by <code>tzset</code>;
821	however, their values will not be used
822	by <code>localtime</code>.)
823</li>
824<li>
825	Functions <code>tzalloc</code>, <code>tzfree</code>,
	<code>localtime_rz</code>, and <code>mktime_z</code> have been added for
827	more-efficient thread-safe applications that need to use
828	multiple time zones.  The <code>tzalloc</code>
829	and <code>tzfree</code> functions allocate and free objects of
830	type <code>timezone_t</code>, and <code>localtime_rz</code>
831	and <code>mktime_z</code> are like <code>localtime_r</code>
832	and <code>mktime</code> with an extra
833	<code>timezone_t</code> argument.  The functions were inspired
834	by NetBSD.
835</li>
836<li>
837	A function <code>tzsetwall</code> has been added to arrange
838	for the system's
839	best approximation to local wall clock time to be delivered by
840	subsequent calls to <code>localtime</code>.  Source code for portable
841	applications that "must" run on local wall clock time should call
	<code>tzsetwall</code>; if such code is moved to "old" systems that don't
	provide <code>tzsetwall</code>, you won't be able to generate an executable program.
	(These time zone functions also arrange for local wall clock time to be
	used if <code>tzset</code> is called &ndash; directly or indirectly &ndash;
846	and there's no TZ
847	environment variable; portable applications should not, however, rely
848	on this behavior since it's not the way SVR2 systems behave.)
849</li>
850<li>
851	Negative <code>time_t</code> values are supported, on systems
852	where <code>time_t</code> is signed.
853</li>
854<li>
855	These functions can account for leap seconds, thanks to Bradley White.
856</li>
857</ul>
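<p>
On platforms that provide the <code>tm_gmtoff</code>
and <code>tm_zone</code> members mentioned above, the UT offset and
abbreviation for an arbitrary timestamp can be read directly from the
result of <code>localtime_r</code>.  The following is a minimal sketch
under that assumption; <code>struct zone_info</code>
and <code>zone_info_at</code> are hypothetical names used only for
illustration, and the <code>_DEFAULT_SOURCE</code> macro is an assumption
about how glibc exposes these members.
</p>

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* assumption: glibc-style exposure of tm_gmtoff, tm_zone */
#endif
#include <assert.h>
#include <stdlib.h>  /* setenv, for callers that select a TZ setting */
#include <string.h>
#include <time.h>

/* Hypothetical result type, not part of the tz code.  */
struct zone_info {
    long utoff;     /* seconds east of UT */
    char abbr[16];  /* time zone abbreviation, e.g. "NZDT" */
};

/* Hypothetical helper: fill *INFO with the UT offset and abbreviation
   in effect at T under the current TZ setting.
   Return 0 on success, -1 on failure.  */
int zone_info_at(time_t t, struct zone_info *info)
{
    struct tm tm;
    assert(info != NULL);
    tzset();  /* pick up the current TZ setting */
    if (!localtime_r(&t, &tm))
        return -1;
    info->utoff = tm.tm_gmtoff;
    /* Copy the abbreviation, since tm_zone may point to storage that
       later calls can change.  */
    strncpy(info->abbr, tm.tm_zone ? tm.tm_zone : "", sizeof info->abbr - 1);
    info->abbr[sizeof info->abbr - 1] = '\0';
    return 0;
}
```

<p>
With TZ set to '<code>NZST-12NZDT,M9.5.0,M4.1.0/3</code>', a timestamp in
January should give an offset of 13 hours and the abbreviation NZDT.
</p>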
858<p>
859Points of interest to folks with other systems:
860</p>
861<ul>
862  <li>
863	Code compatible with this package is already part of many platforms,
864	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
866	On such hosts, the primary use of this package
867	is to update obsolete time zone rule tables.
868	To do this, you may need to compile the time zone compiler
869	'<code>zic</code>' supplied with this package instead of using
870	the system '<code>zic</code>', since the format
871	of <code>zic</code>'s input is occasionally extended, and a
872	platform may still be shipping an older <code>zic</code>.
873  </li>
874  <li>
875	The UNIX Version 7 <code>timezone</code> function is not
876	present in this package;
	it's impossible to reliably map <code>timezone</code>'s arguments (a "minutes west
	of GMT" value and a "daylight saving time in effect" flag) to a
	time zone abbreviation, and we refuse to guess.
	Programs that in the past used the <code>timezone</code> function may now examine
881	<code>localtime(&amp;clock)-&gt;tm_zone</code>
882	(if <code>TM_ZONE</code> is defined) or
883	<code>tzname[localtime(&amp;clock)-&gt;tm_isdst]</code>
884	(if <code>HAVE_TZNAME</code> is defined)
885	to learn the correct time zone abbreviation to use.
886  </li>
887  <li>
888	The 4.2BSD <code>gettimeofday</code> function is not used in
889	this package.
	This function formerly let users obtain the current UTC offset and DST flag,
891	but this functionality was removed in later versions of BSD.
892  </li>
893  <li>
894	In SVR2, time conversion fails for near-minimum or near-maximum
895	<code>time_t</code> values when doing conversions for places
896	that don't use UT.
897	This package takes care to do these conversions correctly.
898	A comment in the source code tells how to get compatibly wrong
899	results.
900  </li>
901</ul>
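<p>
The <code>tzname</code> lookup suggested above as a replacement for the
Version 7 <code>timezone</code> function might be wrapped as follows.
This is only a sketch: <code>abbreviation_from_tzname</code> is a
hypothetical name, and the approach assumes that the two entries
of <code>tzname</code> are adequate for the timestamp in question, which
may not hold for zones that have used more than two abbreviations.
</p>

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for tzset, tzname, localtime_r */
#endif
#include <assert.h>
#include <stdlib.h>  /* setenv, for callers that select a TZ setting */
#include <string.h>
#include <time.h>

/* Hypothetical helper: copy into BUF (of SIZE bytes) the abbreviation
   that tzname gives for CLOCK under the current TZ setting.
   Return 0 on success, -1 if the conversion fails or if it is unknown
   whether daylight saving time is in effect.  */
int abbreviation_from_tzname(time_t clock, char *buf, size_t size)
{
    struct tm tm;
    assert(buf != NULL && size > 0);
    tzset();  /* sets tzname from the current TZ setting */
    if (!localtime_r(&clock, &tm) || tm.tm_isdst < 0)
        return -1;
    /* tzname[0] is the standard abbreviation, tzname[1] the daylight one.  */
    strncpy(buf, tzname[tm.tm_isdst > 0], size - 1);
    buf[size - 1] = '\0';
    return 0;
}
```

<p>
Where <code>tm_zone</code> is available it is generally the better choice,
since it is tied to the converted timestamp rather than to the
process-wide <code>tzname</code> array.
</p>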
902<p>
903The functions that are conditionally compiled
904if <code>STD_INSPIRED</code> is defined
905should, at this point, be looked on primarily as food for thought.  They are
906not in any sense "standard compatible" &ndash; some are not, in fact,
907specified in <em>any</em> standard.  They do, however, represent responses of
908various authors to
909standardization proposals.
910</p>
911
912<p>
913Other time conversion proposals, in particular the one developed by folks at
914Hewlett Packard, offer a wider selection of functions that provide capabilities
915beyond those provided here.  The absence of such functions from this package
916is not meant to discourage the development, standardization, or use of such
917functions.  Rather, their absence reflects the decision to make this package
918contain valid extensions to POSIX, to ensure its broad acceptability.  If
919more powerful time conversion functions can be standardized, so much the
920better.
921</p>
922  </section>
923
924
925  <section>
926    <h2 id="stability">Interface stability</h2>
927<p>
928The tz code and data supply the following interfaces:
929</p>
930<ul>
931  <li>
932   A set of zone names as per "<a href="#naming">Names of time zone
933   rules</a>" above.
934  </li>
935  <li>
936   Library functions described in "<a href="#functions">Time and date
937   functions</a>" above.
938  </li>
939  <li>
940   The programs <code>tzselect</code>, <code>zdump</code>,
941   and <code>zic</code>, documented in their man pages.
942  </li>
943  <li>
944   The format of <code>zic</code> input files, documented in
945   the <code>zic</code> man page.
946  </li>
947  <li>
948   The format of <code>zic</code> output files, documented in
949   the <code>tzfile</code> man page.
950  </li>
951  <li>
952   The format of zone table files, documented in <code>zone1970.tab</code>.
953  </li>
954  <li>
955   The format of the country code file, documented in <code>iso3166.tab</code>.
956  </li>
957  <li>
958   The version number of the code and data, as the first line of
959   the text file '<code>version</code>' in each release.
960  </li>
961</ul>
962<p>
963Interface changes in a release attempt to preserve compatibility with
964recent releases.  For example, tz data files typically do not rely on
965recently-added <code>zic</code> features, so that users can run
966older <code>zic</code> versions to process newer data
967files.  <a href="tz-link.html">Sources for time zone and daylight
968saving time data</a> describes how
969releases are tagged and distributed.
970</p>
971
972<p>
973Interfaces not listed above are less stable.  For example, users
974should not rely on particular UT offsets or abbreviations for
975timestamps, as data entries are often based on guesswork and these
976guesses may be corrected or improved.
977</p>
978  </section>
979
980
981  <section>
982    <h2 id="calendar">Calendrical issues</h2>
983<p>
984Calendrical issues are a bit out of scope for a time zone database,
985but they indicate the sort of problems that we would run into if we
986extended the time zone database further into the past.  An excellent
987resource in this area is Nachum Dershowitz and Edward M. Reingold,
988<cite><a href="https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/">Calendrical
989Calculations: Third Edition</a></cite>, Cambridge University Press (2008).
990Other information and sources are given in the file '<samp>calendars</samp>'
991in the tz distribution.  They sometimes disagree.
992</p>
993  </section>
994
995
996  <section>
997    <h2 id="planets">Time and time zones on other planets</h2>
998<p>
999Some people's work schedules use Mars time.  Jet Propulsion Laboratory
1000(JPL) coordinators have kept Mars time on and off at least since 1997
1001for the Mars Pathfinder mission.  Some of their family members have
1002also adapted to Mars time.  Dozens of special Mars watches were built
1003for JPL workers who kept Mars time during the Mars Exploration
1004Rovers mission (2004).  These timepieces look like normal Seikos and
1005Citizens but use Mars seconds rather than terrestrial seconds.
1006</p>
1007
1008<p>
1009A Mars solar day is called a "sol" and has a mean period equal to
1010about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
1011divided into a conventional 24-hour clock, so each Mars second equals
1012about 1.02749125 terrestrial seconds.
1013</p>
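<p>
The ratio quoted above follows from dividing the mean sol length by the
86400 seconds of a 24-hour clock.  The short C sketch below reproduces
that arithmetic; the macro and function names are hypothetical and are
used only for illustration.
</p>

```c
#include <assert.h>

/* Mean sol length in terrestrial seconds: 24 h 39 min 35.244 s.  */
#define SOL_TERRESTRIAL_SECONDS (24 * 3600 + 39 * 60 + 35.244)

/* A sol is divided into a 24-hour clock, i.e. 24 * 60 * 60 Mars seconds.  */
#define MARS_SECONDS_PER_SOL 86400.0

/* Length of one Mars second, measured in terrestrial seconds.  */
double terrestrial_seconds_per_mars_second(void)
{
    return SOL_TERRESTRIAL_SECONDS / MARS_SECONDS_PER_SOL;
}

/* Convert a duration in Mars seconds to terrestrial seconds.  */
double mars_to_terrestrial_seconds(double mars_seconds)
{
    assert(mars_seconds >= 0);
    return mars_seconds * terrestrial_seconds_per_mars_second();
}
```

<p>
Here 88775.244 / 86400 is about 1.02749125, matching the figure in the
text, and converting one full sol of 86400 Mars seconds gives back
about 88775.244 terrestrial seconds.
</p>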
1014
1015<p>
1016The prime meridian of Mars goes through the center of the crater
1017Airy-0, named in honor of the British astronomer who built the
1018Greenwich telescope that defines Earth's prime meridian.  Mean solar
1019time on the Mars prime meridian is called Mars Coordinated Time (MTC).
1020</p>
1021
1022<p>
1023Each landed mission on Mars has adopted a different reference for
1024solar time keeping, so there is no real standard for Mars time zones.
1025For example, the Mars Exploration Rover project (2004) defined two
1026time zones "Local Solar Time A" and "Local Solar Time B" for its two
1027missions, each zone designed so that its time equals local true solar
1028time at approximately the middle of the nominal mission.  Such a "time
1029zone" is not particularly suited for any application other than the
1030mission itself.
1031</p>
1032
1033<p>
1034Many calendars have been proposed for Mars, but none have achieved
wide acceptance.  Astronomers often use Mars Sol Date (MSD), which is a
1036sequential count of Mars solar days elapsed since about 1873-12-29
103712:00 GMT.
1038</p>
1039
1040<p>
1041In our solar system, Mars is the planet with time and calendar most
1042like Earth's.  On other planets, Sun-based time and calendars would
1043work quite differently.  For example, although Mercury's sidereal
1044rotation period is 58.646 Earth days, Mercury revolves around the Sun
1045so rapidly that an observer on Mercury's equator would see a sunrise
1046only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
1047day.  Venus is more complicated, partly because its rotation is
slightly retrograde: its year is 1.92 times the length of its day.  Gas giants like
1049Jupiter are trickier still, as their polar and equatorial regions
1050rotate at different rates, so that the length of a day depends on
1051latitude.  This effect is most pronounced on Neptune, where the day is
1052about 12 hours at the poles and 18 hours at the equator.
1053</p>
1054
1055<p>
1056Although the tz database does not support time on other planets, it is
1057documented here in the hopes that support will be added eventually.
1058</p>
1059
1060<p>
1061Sources:
1062</p>
1063<ul>
1064  <li>
1065Michael Allison and Robert Schmunk,
1066"<a href="https://www.giss.nasa.gov/tools/mars24/help/notes.html">Technical
1067Notes on Mars Solar Time as Adopted by the Mars24 Sunclock</a>"
1068(2015-06-30).
1069  </li>
1070  <li>
1071Jia-Rui Chong,
1072"<a href="http://articles.latimes.com/2004/jan/14/science/sci-marstime14">Workdays
1073Fit for a Martian</a>", Los Angeles Times
1074(2004-01-14), pp A1, A20-A21.
1075  </li>
1076  <li>
1077Tom Chmielewski,
1078"<a href="https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/">Jet
Lag Is Worse on Mars</a>", The Atlantic (2015-02-26).
1080  </li>
1081  <li>
1082Matt Williams,
1083"<a href="https://www.universetoday.com/37481/days-of-the-planets/">How
1084long is a day on the other planets of the solar system?</a>"
1085(2017-04-27).
1086  </li>
1087</ul>
1088  </section>
1089
1090  <footer>
1091    <hr>
1092This file is in the public domain, so clarified as of 2009-05-17 by
1093Arthur David Olson.
1094  </footer>
1095</body>
1096</html>
1097