theory.html revision 325322
<!DOCTYPE html>
<html lang="en">
<head>
  <title>Theory and pragmatics of the tz code and data</title>
  <meta charset="UTF-8">
</head>

<!-- The somewhat-unusual indenting style in this file is intended to
     shrink the output of the shell command 'diff Theory Theory.html',
     where 'Theory' was the plain text file that this file is derived
     from.  The 'Theory' file used leading white space to indent, and
     when possible that indentation is preserved here.  Eventually we
     may stop doing this and remove this comment.  -->
1494917Simp
1594917Simp<body>
1694917Simp  <h1>Theory and pragmatics of the tz code and data</h1>
17117751Smarkm  <h3>Outline</h3>
18117751Smarkm  <nav>
19116149Smarkm    <ul>
20116149Smarkm      <li><a href="#scope">Scope of the tz database</a></li>
21125244Snectar      <li><a href="#naming">Names of time zone rules</a></li>
22125244Snectar      <li><a href="#abbreviations">Time zone abbreviations</a></li>
2394847Sjhb      <li><a href="#accuracy">Accuracy of the tz database</a></li>
2494847Sjhb      <li><a href="#functions">Time and date functions</a></li>
2594847Sjhb      <li><a href="#stability">Interface stability</a></li>
26126337Svkashyap      <li><a href="#calendar">Calendrical issues</a></li>
27128023Svkashyap      <li><a href="#planets">Time and time zones on other planets</a></li>
2894855Sscottl    </ul>
29126054Sscottl  </nav>
30126054Sscottl
31126054Sscottl
32126054Sscottl  <section>
33126054Sscottl    <h2 id="scope">Scope of the tz database</h2>
34126054Sscottl<p>
3594915SkenThe tz database attempts to record the history and predicted future of
3699607Smjacoball computer-based clocks that track civil time.  To represent this
3794915Skendata, the world is partitioned into regions whose clocks all agree
3894915Skenabout timestamps that occur after the somewhat-arbitrary cutoff point
3994915Skenof the POSIX Epoch (1970-01-01 00:00:00 UTC).  For each such region,
4094915Skenthe database records all known clock transitions, and labels the region
4194915Skenwith a notable location.  Although 1970 is a somewhat-arbitrary
4294915Skencutoff, there are significant challenges to moving the cutoff earlier
4394915Skeneven by a decade or two, due to the wide variety of local practices
4494915Skenbefore computer timekeeping became prevalent.
4599607Smjacob</p>
46106734Smjacob
47128435Stackerman<p>
4897611SbillfClock transitions before 1970 are recorded for each such location,
4994918Sgshapirobecause most systems support timestamps before 1970 and could
5094918Sgshapiromisbehave if data entries were omitted for pre-1970 transitions.
5194918SgshapiroHowever, the database is not designed for and does not suffice for
5294918Sgshapiroapplications requiring accurate handling of all past times everywhere,
5394918Sgshapiroas it would take far too much effort and guesswork to record all
54118316Smbrdetails of pre-1970 civil timekeeping.
5594955Smurray</p>
5695054Snectar
57125080Scperciva<p>
58106187SdesAs described below, reference source code for using the tz database is
59106187Sdesalso available.  The tz code is upwards compatible with POSIX, an
6095455Sdesinternational standard for UNIX-like systems.  As of this writing, the
6198750Sdescurrent edition of POSIX is:
6299606Sdes  <a href="http://pubs.opengroup.org/onlinepubs/9699919799/">
6399606Sdes  The Open Group Base Specifications Issue 7</a>,
6499606Sdes  IEEE Std 1003.1-2008, 2016 Edition.
6596268Sgad</p>
6696268Sgad  </section>
67116233Sgad
6896268Sgad
6996301Sgrog
7096332Speter  <section>
7196332Speter    <h2 id="naming">Names of time zone rules</h2>
7296332Speter<p>
7396332SpeterEach of the database's time zone rules has a unique name.
7496332SpeterInexperienced users are not expected to select these names unaided.
75100314SruDistributors should provide documentation and/or a simple selection
7696451Sruinterface that explains the names; for one example, see the 'tzselect'
7797611Sbillfprogram in the tz code.  The
7898333Sanholt<a href="http://cldr.unicode.org/">Unicode Common Locale Data
7998986SjmallettRepository</a> contains data that may be useful for other
80111061Sjmallettselection interfaces.
8199732Sjoerg</p>
8299732Sjoerg
83113692Snectar<p>
84113692SnectarThe time zone rule naming conventions attempt to strike a balance
85115825Sfanfamong the following goals:
86126445Sobrien</p>
87117645Sdwmalone<ul>
88118204Sbp  <li>
89118204Sbp   Uniquely identify every region where clocks have agreed since 1970.
90118204Sbp   This is essential for the intended use: static clocks keeping local
91118204Sbp   civil time.
92127337Smlaier  </li>
93126445Sobrien  <li>
94129082Spjd   Indicate to experts where that region is.
95129082Spjd  </li>
96131476Spjd  <li>
97129485Spjd   Be robust in the presence of political changes.  For example, names
98129485Spjd   of countries are ordinarily not used, to avoid incompatibilities
99129485Spjd   when countries change their name (e.g. Zaire&rarr;Congo) or when
100132311Salfred   locations change countries (e.g. Hong Kong from UK colony to
101132311Salfred   China).
102132311Salfred  </li>
103132268Salfred  <li>
104115822Sdougb   Be portable to a wide variety of implementations.
105115822Sdougb  </li>
  <li>
   Use a consistent naming convention over the entire world.
  </li>
109115822Sdougb</ul>
110115822Sdougb<p>
111115822SdougbNames normally have the
112115822Sdougbform <var>AREA</var><code>/</code><var>LOCATION</var>,
113115822Sdougbwhere <var>AREA</var> is the name of a continent or ocean,
114115822Sdougband <var>LOCATION</var> is the name of a specific
115115822Sdougblocation within that region.  North and South America share the same
116115822Sdougbarea, '<code>America</code>'.  Typical names are
117115822Sdougb'<code>Africa/Cairo</code>', '<code>America/New_York</code>', and
118115822Sdougb'<code>Pacific/Honolulu</code>'.
119115822Sdougb</p>
120115822Sdougb
121115822Sdougb<p>
122115822SdougbHere are the general rules used for choosing location names,
123115822Sdougbin decreasing order of importance:
124115822Sdougb</p>
125115822Sdougb<ul>
126115895Sguido  <li>
127115822Sdougb	Use only valid POSIX file name components (i.e., the parts of
128115895Sguido		names other than '<code>/</code>').  Do not use the file name
129115895Sguido		components '<code>.</code>' and '<code>..</code>'.
130115895Sguido		Within a file name component,
131115822Sdougb		use only ASCII letters, '<code>.</code>',
132115822Sdougb		'<code>-</code>' and '<code>_</code>'.  Do not use
133115822Sdougb		digits, as that might create an ambiguity with POSIX
134115822Sdougb		TZ strings.  A file name component must not exceed 14
135115822Sdougb		characters or start with '<code>-</code>'.  E.g.,
136115822Sdougb		prefer '<code>Brunei</code>' to
137115822Sdougb		'<code>Bandar_Seri_Begawan</code>'.  Exceptions: see
138115822Sdougb		the discussion
139115822Sdougb		of legacy names below.
140115822Sdougb  </li>
141115822Sdougb  <li>
142115822Sdougb	A name must not be empty, or contain '<code>//</code>', or
143115822Sdougb	start or end with '<code>/</code>'.
144115822Sdougb  </li>
145115822Sdougb  <li>
146115822Sdougb	Do not use names that differ only in case.  Although the reference
147115822Sdougb		implementation is case-sensitive, some other implementations
148115822Sdougb		are not, and they would mishandle names differing only in case.
149115822Sdougb  </li>
150115822Sdougb  <li>
151115822Sdougb	If one name <var>A</var> is an initial prefix of another
152115822Sdougb		name <var>AB</var> (ignoring case), then <var>B</var>
153115822Sdougb		must not start with '<code>/</code>', as a
154115822Sdougb		regular file cannot have
155115822Sdougb		the same name as a directory in POSIX.  For example,
156115822Sdougb		'<code>America/New_York</code>' precludes
157115822Sdougb		'<code>America/New_York/Bronx</code>'.
158115822Sdougb  </li>
159115822Sdougb  <li>
160115822Sdougb	Uninhabited regions like the North Pole and Bouvet Island
161115895Sguido		do not need locations, since local time is not defined there.
162115895Sguido  </li>
163115895Sguido  <li>
164115895Sguido	There should typically be at least one name for each ISO 3166-1
165115822Sdougb		officially assigned two-letter code for an inhabited country
166115822Sdougb		or territory.
167115822Sdougb  </li>
168115822Sdougb  <li>
169	If all the clocks in a region have agreed since 1970,
170		don't bother to include more than one location
171		even if subregions' clocks disagreed before 1970.
172		Otherwise these tables would become annoyingly large.
173  </li>
  <li>
	If a name is ambiguous, use a less ambiguous alternative;
		e.g. many cities are named San José and Georgetown, so
		prefer '<code>Costa_Rica</code>' to '<code>San_Jose</code>' and '<code>Guyana</code>' to '<code>Georgetown</code>'.
  </li>
179  <li>
180	Keep locations compact.  Use cities or small islands, not countries
181		or regions, so that any future time zone changes do not split
182		locations into different time zones.  E.g. prefer
183		'<code>Paris</code>' to '<code>France</code>', since
184		France has had multiple time zones.
185  </li>
  <li>
	Use mainstream English spelling, e.g. prefer
		'<code>Rome</code>' to '<code>Roma</code>', and prefer
		'<code>Athens</code>' to the Greek
		'<code>Αθήνα</code>' or the Romanized
		'<code>Athína</code>'.
		The POSIX file name restrictions encourage this rule.
  </li>
194  <li>
195	Use the most populous among locations in a zone,
196		e.g. prefer '<code>Shanghai</code>' to
197		'<code>Beijing</code>'.  Among locations with
198		similar populations, pick the best-known location,
199		e.g. prefer '<code>Rome</code>' to '<code>Milan</code>'.
200  </li>
201  <li>
202	Use the singular form, e.g. prefer '<code>Canary</code>' to '<code>Canaries</code>'.
203  </li>
204  <li>
205	Omit common suffixes like '<code>_Islands</code>' and
206		'<code>_City</code>', unless that would lead to
207		ambiguity.  E.g. prefer '<code>Cayman</code>' to
208		'<code>Cayman_Islands</code>' and
209		'<code>Guatemala</code>' to
210		'<code>Guatemala_City</code>', but prefer
211		'<code>Mexico_City</code>' to '<code>Mexico</code>'
212		because the country
213		of Mexico has several time zones.
214  </li>
215  <li>
216	Use '<code>_</code>' to represent a space.
217  </li>
218  <li>
219	Omit '<code>.</code>' from abbreviations in names, e.g. prefer
220		'<code>St_Helena</code>' to '<code>St._Helena</code>'.
221  </li>
222  <li>
223	Do not change established names if they only marginally
224		violate the above rules.  For example, don't change
225		the existing name '<code>Rome</code>' to
226		'<code>Milan</code>' merely because
227		Milan's population has grown to be somewhat greater
228		than Rome's.
229  </li>
230  <li>
231	If a name is changed, put its old spelling in the
232		'<code>backward</code>' file.
233		This means old spellings will continue to work.
234  </li>
235</ul>
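<p>
The per-name rules above can be sketched as a small checker.  This is
only an illustration, not part of the tz code: the function name
<code>is_valid_tz_name</code> is hypothetical, and the sketch covers
just the rules that can be tested on a single name (character set,
component length, and slash placement).  The rules about names that
differ only in case, or that are prefixes of one another, require the
whole set of names and are not checked here.
</p>

```python
import re

# One file name component: ASCII letters, '.', '-' and '_' only,
# no digits, and at most 14 characters.
_COMPONENT = re.compile(r"[A-Za-z._-]{1,14}")


def is_valid_tz_name(name):
    """Return True if name satisfies the single-name rules listed above."""
    # A name must not be empty, contain '//', or start or end with '/'.
    if not name or name.startswith("/") or name.endswith("/") or "//" in name:
        return False
    for component in name.split("/"):
        # '.' and '..' are reserved file name components.
        if component in (".", ".."):
            return False
        # A component must not start with '-'.
        if component.startswith("-"):
            return False
        if _COMPONENT.fullmatch(component) is None:
            return False
    return True


if __name__ == "__main__":
    for candidate in ("America/New_York", "Asia/Bandar_Seri_Begawan", "GMT0"):
        print(candidate, is_valid_tz_name(candidate))
```

<p>
Legacy names such as '<code>GMT0</code>' fail this check because they
contain digits; they are the exceptions that the first rule mentions.
</p>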
236
237<p>
238The file '<code>zone1970.tab</code>' lists geographical locations used
239to name time
240zone rules.  It is intended to be an exhaustive list of names for
241geographic regions as described above; this is a subset of the names
242in the data.  Although a '<code>zone1970.tab</code>' location's longitude
243corresponds to its LMT offset with one hour for every 15 degrees east
244longitude, this relationship is not exact.
245</p>
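<p>
The longitude relationship amounts to 240 seconds of LMT offset per
degree, or 4 seconds per arc-minute.  The following sketch shows the
arithmetic; the function names are hypothetical, and the Paris
coordinate used (2&deg;20' east) is an assumption about how that
location is listed rather than a value taken from this document.
</p>

```python
def lmt_offset_seconds(degrees, minutes=0, east=True):
    """Approximate LMT offset from UT for a longitude, in seconds.

    One hour per 15 degrees is 240 seconds per degree and
    4 seconds per arc-minute of longitude.
    """
    seconds = degrees * 240 + minutes * 4
    return seconds if east else -seconds


def format_hms(seconds):
    """Format a signed count of seconds as [-]h:mm:ss."""
    sign = "-" if seconds < 0 else ""
    hours, remainder = divmod(abs(seconds), 3600)
    mins, secs = divmod(remainder, 60)
    return f"{sign}{hours}:{mins:02d}:{secs:02d}"


if __name__ == "__main__":
    # Assumed coordinate for Paris: 2 degrees 20 minutes east.
    print(format_hms(lmt_offset_seconds(2, 20)))
```

<p>
Under that assumption the result is 0:09:20, one second away from the
0:09:21 offset cited elsewhere in this document for France, which
illustrates why the relationship is only approximate.
</p>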
246
247<p>
248Older versions of this package used a different naming scheme,
249and these older names are still supported.
250See the file '<code>backward</code>' for most of these older names
251(e.g., '<code>US/Eastern</code>' instead of '<code>America/New_York</code>').
252The other old-fashioned names still supported are
253'<code>WET</code>', '<code>CET</code>', '<code>MET</code>', and '<code>EET</code>' (see the file '<code>europe</code>').
254</p>
255
256<p>
257Older versions of this package defined legacy names that are
258incompatible with the first rule of location names, but which are
259still supported.  These legacy names are mostly defined in the file
260'<code>etcetera</code>'.  Also, the file '<code>backward</code>' defines the legacy names
261'<code>GMT0</code>', '<code>GMT-0</code>' and '<code>GMT+0</code>', and the file '<code>northamerica</code>' defines the
262legacy names '<code>EST5EDT</code>', '<code>CST6CDT</code>', '<code>MST7MDT</code>', and '<code>PST8PDT</code>'.
263</p>
264
265<p>
266Excluding '<code>backward</code>' should not affect the other data.  If
267'<code>backward</code>' is excluded, excluding '<code>etcetera</code>' should not affect the
268remaining data.
269</p>
270
271
272  </section>
273  <section>
274    <h2 id="abbreviations">Time zone abbreviations</h2>
275<p>
276When this package is installed, it generates time zone abbreviations
277like '<code>EST</code>' to be compatible with human tradition and POSIX.
278Here are the general rules used for choosing time zone abbreviations,
279in decreasing order of importance:
280<ul>
281  <li>
282	Use three or more characters that are ASCII alphanumerics or
283		'<code>+</code>' or '<code>-</code>'.
284		Previous editions of this database also used characters like
285		'<code> </code>' and '<code>?</code>', but these
286		characters have a special meaning to
287		the shell and cause commands like
288			'<code>set `date`</code>'
289		to have unexpected effects.
290		Previous editions of this rule required upper-case letters,
291		but the Congressman who introduced Chamorro Standard Time
292		preferred "ChST", so lower-case letters are now allowed.
293		Also, POSIX from 2001 on relaxed the rule to allow
294		'<code>-</code>', '<code>+</code>',
295		and alphanumeric characters from the portable character set
296		in the current locale.  In practice ASCII alphanumerics and
297		'<code>+</code>' and '<code>-</code>' are safe in all locales.
298
299		In other words, in the C locale the POSIX extended regular
300		expression <code>[-+[:alnum:]]{3,}</code> should match
301		the abbreviation.
302		This guarantees that all abbreviations could have been
303		specified by a POSIX TZ string.
304  </li>
305  <li>
306	Use abbreviations that are in common use among English-speakers,
307		e.g. 'EST' for Eastern Standard Time in North America.
308		We assume that applications translate them to other languages
309		as part of the normal localization process; for example,
310		a French application might translate 'EST' to 'HNE'.
311  </li>
312  <li>
313	For zones whose times are taken from a city's longitude, use the
314		traditional <var>x</var>MT notation, e.g. 'PMT' for
315		Paris Mean Time.
316		The only name like this in current use is 'GMT'.
317  </li>
318  <li>
319	Use 'LMT' for local mean time of locations before the introduction
320		of standard time; see "<a href="#scope">Scope of the
321		tz database</a>".
322  </li>
323  <li>
324	If there is no common English abbreviation, use numeric offsets like
325		<code>-</code>05 and <code>+</code>0830 that are
326		generated by zic's <code>%z</code> notation.
327  </li>
  <li>
	Use current abbreviations for older timestamps to avoid confusion.
		For example, in 1910 a common English abbreviation for UT +01
		in central Europe was 'MEZ' (short for both "Middle European
		Zone" and for "Mitteleuropäische Zeit" in German).  Nowadays
		'CET' ("Central European Time") is more common in English, and
		the database uses 'CET' even for circa-1910 timestamps as this
		is less confusing for modern users and avoids the need for
		determining when 'CET' supplanted 'MEZ' in common usage.
  </li>
338  <li>
339	Use a consistent style in a zone's history.  For example, if a zone's
340		history tends to use numeric abbreviations and a particular
341		entry could go either way, use a numeric abbreviation.
342  </li>
343</ul>
344    [The remaining guidelines predate the introduction of <code>%z</code>.
345    They are problematic as they mean tz data entries invent
346    notation rather than record it.  These guidelines are now
347    deprecated and the plan is to gradually move to <code>%z</code> for
348    inhabited locations and to "<code>-</code>00" for uninhabited locations.]
349<ul>
350  <li>
351	If there is no common English abbreviation, abbreviate the English
352		translation of the usual phrase used by native speakers.
353		If this is not available or is a phrase mentioning the country
354		(e.g. "Cape Verde Time"), then:
355	<ul>
356	  <li>
357		When a country is identified with a single or principal zone,
358			append 'T' to the country's ISO	code, e.g. 'CVT' for
359			Cape Verde Time.  For summer time append 'ST';
360			for double summer time append 'DST'; etc.
361	  </li>
362	  <li>
363		Otherwise, take the first three letters of an English place
364			name identifying each zone and append 'T', 'ST', etc.
365			as before; e.g. 'CHAST' for CHAtham Summer Time.
366	  </li>
367	</ul>
368  </li>
369  <li>
370	Use UT (with time zone abbreviation '<code>-</code>00') for
371		locations while uninhabited.  The leading
372		'<code>-</code>' is a flag that the time
373		zone is in some sense undefined; this notation is
374		derived from Internet RFC 3339.
375  </li>
376</ul>
377<p>
378Application writers should note that these abbreviations are ambiguous
379in practice: e.g. 'CST' has a different meaning in China than
380it does in the United States.  In new applications, it's often better
381to use numeric UT offsets like '<code>-</code>0600' instead of time zone
382abbreviations like 'CST'; this avoids the ambiguity.
383</p>
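<p>
As a rough illustration of the points above, the following sketch
checks an abbreviation against the regular expression given in the
first rule, written for ASCII so that it matches what
<code>[-+[:alnum:]]{3,}</code> accepts in the C locale, and formats a
numeric UT offset in the four-digit style of '<code>-</code>0600'.
The function names are hypothetical and are not part of the tz code.
</p>

```python
import re

# ASCII equivalent of the POSIX ERE [-+[:alnum:]]{3,} in the C locale.
_ABBREVIATION = re.compile(r"[-+A-Za-z0-9]{3,}")


def is_valid_abbreviation(abbr):
    """Return True if abbr could have been specified by a POSIX TZ string."""
    return _ABBREVIATION.fullmatch(abbr) is not None


def numeric_offset(seconds_east):
    """Format a UT offset (seconds east of UT) as +hhmm or -hhmm.

    Any leftover seconds are dropped; this sketch handles only
    whole-minute offsets.
    """
    sign = "-" if seconds_east < 0 else "+"
    total_minutes = abs(seconds_east) // 60
    return f"{sign}{total_minutes // 60:02d}{total_minutes % 60:02d}"


if __name__ == "__main__":
    print(is_valid_abbreviation("ChST"), is_valid_abbreviation("E?T"))
    print(numeric_offset(-6 * 3600))
```

<p>
An application that stores or displays <code>numeric_offset</code>
output instead of 'CST' sidesteps the ambiguity between the Chinese
and United States meanings of that abbreviation.
</p>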
384  </section>
385
386
387  <section>
388    <h2 id="accuracy">Accuracy of the tz database</h2>
389<p>
390The tz database is not authoritative, and it surely has errors.
391Corrections are welcome and encouraged; see the file CONTRIBUTING.
392Users requiring authoritative data should consult national standards
393bodies and the references cited in the database's comments.
394</p>
395
396<p>
397Errors in the tz database arise from many sources:
398</p>
399<ul>
400  <li>
401   The tz database predicts future timestamps, and current predictions
402   will be incorrect after future governments change the rules.
403   For example, if today someone schedules a meeting for 13:00 next
404   October 1, Casablanca time, and tomorrow Morocco changes its
405   daylight saving rules, software can mess up after the rule change
406   if it blithely relies on conversions made before the change.
407  </li>
408  <li>
409   The pre-1970 entries in this database cover only a tiny sliver of how
410   clocks actually behaved; the vast majority of the necessary
411   information was lost or never recorded.  Thousands more zones would
412   be needed if the tz database's scope were extended to cover even
413   just the known or guessed history of standard time; for example,
414   the current single entry for France would need to split into dozens
415   of entries, perhaps hundreds.  And in most of the world even this
416   approach would be misleading due to widespread disagreement or
417   indifference about what times should be observed.  In her 2015 book
418   <cite>The Global Transformation of Time, 1870-1950</cite>, Vanessa Ogle writes
419   "Outside of Europe and North America there was no system of time
420   zones at all, often not even a stable landscape of mean times,
421   prior to the middle decades of the twentieth century".  See:
422   Timothy Shenk, <a
423   href="https://www.dissentmagazine.org/blog/booked-a-global-history-of-time-vanessa-ogle">Booked:
424   A Global History of Time</a>. <cite>Dissent</cite> 2015-12-17.
425  </li>
426  <li>
427   Most of the pre-1970 data entries come from unreliable sources, often
428   astrology books that lack citations and whose compilers evidently
429   invented entries when the true facts were unknown, without
430   reporting which entries were known and which were invented.
431   These books often contradict each other or give implausible entries,
432   and on the rare occasions when they are checked they are
433   typically found to be incorrect.
434  </li>
  <li>
   For the UK the tz database relies on years of first-class work done by
   Joseph Myers and others; see
   "<a href="https://www.polyomino.org.uk/british-time/">History of
   legal time in Britain</a>".
   The histories of other countries are not documented nearly as well.
  </li>
442  <li>
443   Sometimes, different people in the same city would maintain clocks
444   that differed significantly.  Railway time was used by railroad
445   companies (which did not always agree with each other),
446   church-clock time was used for birth certificates, etc.
447   Often this was merely common practice, but sometimes it was set by law.
448   For example, from 1891 to 1911 the UT offset in France was legally
449   0:09:21 outside train stations and 0:04:21 inside.
450  </li>
451  <li>
452   Although a named location in the tz database stands for the
453   containing region, its pre-1970 data entries are often accurate for
454   only a small subset of that region.  For example, <code>Europe/London</code>
455   stands for the United Kingdom, but its pre-1847 times are valid
456   only for locations that have London's exact meridian, and its 1847
457   transition to GMT is known to be valid only for the L&amp;NW and the
458   Caledonian railways.
459  </li>
460  <li>
461   The tz database does not record the earliest time for which a zone's
462   data entries are thereafter valid for every location in the region.
463   For example, <code>Europe/London</code> is valid for all locations in its
464   region after GMT was made the standard time, but the date of
465   standardization (1880-08-02) is not in the tz database, other than
466   in commentary.  For many zones the earliest time of validity is
467   unknown.
468  </li>
469  <li>
470   The tz database does not record a region's boundaries, and in many
471   cases the boundaries are not known.  For example, the zone
472   <code>America/Kentucky/Louisville</code> represents a region around
473   the city of
474   Louisville, the boundaries of which are unclear.
475  </li>
476  <li>
477   Changes that are modeled as instantaneous transitions in the tz
478   database were often spread out over hours, days, or even decades.
479  </li>
480  <li>
481   Even if the time is specified by law, locations sometimes
482   deliberately flout the law.
483  </li>
484  <li>
485   Early timekeeping practices, even assuming perfect clocks, were
486   often not specified to the accuracy that the tz database requires.
487  </li>
488  <li>
489   Sometimes historical timekeeping was specified more precisely
490   than what the tz database can handle.  For example, from 1909 to
491   1937 Netherlands clocks were legally UT +00:19:32.13, but the tz
492   database cannot represent the fractional second.
493  </li>
494  <li>
495   Even when all the timestamp transitions recorded by the tz database
496   are correct, the tz rules that generate them may not faithfully
497   reflect the historical rules.  For example, from 1922 until World
498   War II the UK moved clocks forward the day following the third
499   Saturday in April unless that was Easter, in which case it moved
500   clocks forward the previous Sunday.  Because the tz database has no
501   way to specify Easter, these exceptional years are entered as
502   separate tz Rule lines, even though the legal rules did not change.
503  </li>
504  <li>
505   The tz database models pre-standard time using the proleptic Gregorian
506   calendar and local mean time (LMT), but many people used other
507   calendars and other timescales.  For example, the Roman Empire used
508   the Julian calendar, and had 12 varying-length daytime hours with a
509   non-hour-based system at night.
510  </li>
511  <li>
512   Early clocks were less reliable, and data entries do not represent
513   clock error.
514  </li>
515  <li>
516   The tz database assumes Universal Time (UT) as an origin, even
517   though UT is not standardized for older timestamps.  In the tz
518   database commentary, UT denotes a family of time standards that
519   includes Coordinated Universal Time (UTC) along with other variants
520   such as UT1 and GMT, with days starting at midnight.  Although UT
521   equals UTC for modern timestamps, UTC was not defined until 1960,
522   so commentary uses the more-general abbreviation UT for timestamps
523   that might predate 1960.  Since UT, UT1, etc. disagree slightly,
524   and since pre-1972 UTC seconds varied in length, interpretation of
525   older timestamps can be problematic when subsecond accuracy is
526   needed.
527  </li>
  <li>
   Civil time was not based on atomic time before 1972, and we don't
   know the history of earth's rotation accurately enough to map SI
   seconds to historical solar time to more than about one-hour
   accuracy.  See: Stephenson FR, Morrison LV, Hohenkerk CY.
   <a href="http://dx.doi.org/10.1098/rspa.2016.0404">Measurement
   of the Earth's rotation: 720 BC to AD 2015</a>.
   <cite>Proc Royal Soc A</cite>. 2016 Dec 7;472:20160404.
   Also see: Espenak F. <a
   href="https://eclipse.gsfc.nasa.gov/SEhelp/uncertainty2004.html">Uncertainty
   in Delta T (ΔT)</a>.
  </li>
540  <li>
541   The relationship between POSIX time (that is, UTC but ignoring leap
542   seconds) and UTC is not agreed upon after 1972.  Although the POSIX
543   clock officially stops during an inserted leap second, at least one
544   proposed standard has it jumping back a second instead; and in
545   practice POSIX clocks more typically either progress glacially during
546   a leap second, or are slightly slowed while near a leap second.
547  </li>
548  <li>
549   The tz database does not represent how uncertain its information is.
550   Ideally it would contain information about when data entries are
551   incomplete or dicey.  Partial temporal knowledge is a field of
552   active research, though, and it's not clear how to apply it here.
553  </li>
554</ul>
555<p>
556In short, many, perhaps most, of the tz database's pre-1970 and future
557timestamps are either wrong or misleading.  Any attempt to pass the
558tz database off as the definition of time should be unacceptable to
559anybody who cares about the facts.  In particular, the tz database's
560LMT offsets should not be considered meaningful, and should not prompt
561creation of zones merely because two locations differ in LMT or
562transitioned to standard time at different dates.
563</p>
564  </section>
565
566
567  <section>
568    <h2 id="functions">Time and date functions</h2>
569<p>
570The tz code contains time and date functions that are upwards
571compatible with those of POSIX.
572</p>
573
574<p>
575POSIX has the following properties and limitations.
576</p>
577<ul>
578  <li>
    <p>
	In POSIX, time display in a process is controlled by the
	environment variable TZ.  Unfortunately, the POSIX TZ string takes
	a form that is hard to describe and is error-prone in practice.
	Also, POSIX TZ strings can't express daylight saving time rules
	that do not follow the fixed patterns described below (for example,
	Israeli rules), or situations where more than two
	time zone abbreviations are used in an area.
    </p>
587    <p>
588      The POSIX TZ string takes the following form:
589    </p>
590    <p>
      <var>std</var><var>offset</var>[<var>dst</var>[<var>offset</var>][<code>,</code><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]]]
592    </p>
593    <p>
594	where:
595    <dl>
596      <dt><var>std</var> and <var>dst</var></dt><dd>
597		are 3 or more characters specifying the standard
598		and daylight saving time (DST) zone names.
599		Starting with POSIX.1-2001, <var>std</var>
600		and <var>dst</var> may also be
601		in a quoted form like '<code>&lt;UTC+10&gt;</code>'; this allows
602		"<code>+</code>" and "<code>-</code>" in the names.
603      </dd>
      <dt><var>offset</var></dt><dd>
		is of the form
		'<code>[&plusmn;]<var>hh</var>[:<var>mm</var>[:<var>ss</var>]]</code>'
		and specifies the offset west of UT.  '<var>hh</var>'
		may be a single digit; 0&le;<var>hh</var>&le;24.
		The default DST offset is one hour ahead of standard time.
      </dd>
611      <dt><var>date</var>[<code>/</code><var>time</var>]<code>,</code><var>date</var>[<code>/</code><var>time</var>]</dt><dd>
612		specifies the beginning and end of DST.  If this is absent,
613		the system supplies its own rules for DST, and these can
614		differ from year to year; typically US DST rules are used.
615      </dd>
      <dt><var>time</var></dt><dd>
		takes the form
		'<var>hh</var>[<code>:</code><var>mm</var>[<code>:</code><var>ss</var>]]'
		and defaults to 02:00.
		This is the same format as the offset, except that a
		leading '<code>+</code>' or '<code>-</code>' is not allowed.
      </dd>
623      <dt><var>date</var></dt><dd>
624		takes one of the following forms:
625	<dl>
626	  <dt>J<var>n</var> (1&le;<var>n</var>&le;365)</dt><dd>
627			origin-1 day number not counting February 29
628          </dd>
629	  <dt><var>n</var> (0&le;<var>n</var>&le;365)</dt><dd>
630			origin-0 day number counting February 29 if present
631          </dd>
632	  <dt><code>M</code><var>m</var><code>.</code><var>n</var><code>.</code><var>d</var> (0[Sunday]&le;<var>d</var>&le;6[Saturday], 1&le;<var>n</var>&le;5, 1&le;<var>m</var>&le;12)</dt><dd>
633			for the <var>d</var>th day of
634			week <var>n</var> of month <var>m</var> of the
635			year, where week 1 is the first week in which
636			day <var>d</var> appears, and '<code>5</code>'
637			stands for the last week in which
638			day <var>d</var> appears
639			(which may be either the 4th or 5th week).
640			Typically, this is the only useful form;
641			the <var>n</var>
642			and <code>J</code><var>n</var> forms are
643			rarely used.
644	  </dd>
645</dl>
646</dd>
647</dl>
648	Here is an example POSIX TZ string for New Zealand after 2007.
649	It says that standard time (NZST) is 12 hours ahead of UTC,
650	and that daylight saving time (NZDT) is observed from September's
651	last Sunday at 02:00 until April's first Sunday at 03:00:
652
653        <pre><code>TZ='NZST-12NZDT,M9.5.0,M4.1.0/3'</code></pre>
654
655	This POSIX TZ string is hard to remember, and mishandles some
656	timestamps before 2008.  With this package you can use this
657	instead:
658
659	<pre><code>TZ='Pacific/Auckland'</code></pre>
660  </li>
  <li>
	POSIX does not define the exact meaning of TZ values like
	"<code>EST5EDT</code>".
	Typically the current US DST rules are used to interpret such values,
	which means that the US DST rules are compiled into each program
	that does time conversion.  As a result, when the US rules change
	(as they did in 1987), all programs that
	do time conversion must be recompiled to ensure proper results.
  </li>
670  <li>
671	The TZ environment variable is process-global, which makes it hard
672	to write efficient, thread-safe applications that need access
673	to multiple time zones.
674  </li>
675  <li>
676	In POSIX, there's no tamper-proof way for a process to learn the
677	system's best idea of local wall clock.  (This is important for
678	applications that an administrator wants used only at certain
679	times &ndash;
680	without regard to whether the user has fiddled the TZ environment
681	variable.  While an administrator can "do everything in UTC" to get
682	around the problem, doing so is inconvenient and precludes handling
683	daylight saving time shifts - as might be required to limit phone
684	calls to off-peak hours.)
685  </li>
686  <li>
687	POSIX provides no convenient and efficient way to determine the UT
688	offset and time zone abbreviation of arbitrary timestamps,
689	particularly for time zone settings that do not fit into the
690	POSIX model.
691  </li>
692  <li>
693	POSIX requires that systems ignore leap seconds.
694  </li>
695  <li>
696	The tz code attempts to support all the <code>time_t</code>
697	implementations allowed by POSIX.  The <code>time_t</code>
698	type represents a nonnegative count of
699	seconds since 1970-01-01 00:00:00 UTC, ignoring leap seconds.
700	In practice, <code>time_t</code> is usually a signed 64- or
701	32-bit integer; 32-bit signed <code>time_t</code> values stop
702	working after 2038-01-19 03:14:07 UTC, so
703	new implementations these days typically use a signed 64-bit integer.
704	Unsigned 32-bit integers are used on one or two platforms,
705	and 36-bit and 40-bit integers are also used occasionally.
706	Although earlier POSIX versions allowed <code>time_t</code> to be a
707	floating-point type, this was not supported by any practical
708	systems, and POSIX.1-2013 and the tz code both
709	require <code>time_t</code>
710	to be an integer type.
711  </li>
712</ul>
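<p>
As a rough illustration of the POSIX interface discussed above, the
following C sketch formats a timestamp under a POSIX TZ string.  The
function name <code>format_in_zone</code> is hypothetical and is not part
of POSIX or of this package; the sketch assumes a POSIX-like C library
that provides <code>setenv</code>, <code>tzset</code>,
and <code>localtime_r</code>.  Because it overwrites the process-global
TZ variable, it also shows why handling several time zones this way is
awkward in threaded programs.
</p>

```c
#define _DEFAULT_SOURCE   /* exposes setenv() and localtime_r() on glibc */
#include <stdlib.h>
#include <time.h>

/* Format T as local time under the TZ setting TZ_VALUE, writing the
   result to BUF.  Return the number of bytes written, or 0 on failure.
   This overwrites the process-global TZ variable, so it is not
   thread-safe.  (Hypothetical helper, for illustration only.)  */
size_t format_in_zone(const char *tz_value, time_t t, char *buf, size_t size)
{
    struct tm tm;

    if (setenv("TZ", tz_value, 1) != 0)
        return 0;
    tzset();   /* localtime_r() is not required to call tzset() itself */
    if (localtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d %H:%M:%S %Z", &tm);
}
```

<p>
With the New Zealand string shown earlier, the timestamp 1577836800
(2020-01-01 00:00:00 UTC) should format as
<code>2020-01-01 13:00:00 NZDT</code>, since January falls within
daylight saving time under those rules.
</p>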
713<p>
714These are the extensions that have been made to the POSIX functions:
715</p>
716<ul>
717  <li>
718    <p>
719	The TZ environment variable is used in generating the name of a file
720	from which time zone information is read (or is interpreted a la
721	POSIX); TZ is no longer constrained to be a three-letter time zone
722	name followed by a number of hours and an optional three-letter
723	daylight time zone name.  The daylight saving time rules to be used
724	for a particular time zone are encoded in the time zone file;
725	the format of the file allows U.S., Australian, and other rules to be
726	encoded, and allows for situations where more than two time zone
727	abbreviations are used.
728    </p>
729    <p>
730	It was recognized that allowing the TZ environment variable to
731	take on values such as '<code>America/New_York</code>' might
732	cause "old" programs
733	(that expect TZ to have a certain form) to operate incorrectly;
734	consideration was given to using some other environment variable
735	(for example, TIMEZONE) to hold the string used to generate the
736	time zone information file name.  In the end, however, it was decided
737	to continue using TZ: it is widely used for time zone purposes;
738	separately maintaining both TZ and TIMEZONE seemed a nuisance;
739	and systems where "new" forms of TZ might cause problems can simply
740	use TZ values such as "<code>EST5EDT</code>" which can be used both by
741	"new" programs (a la POSIX) and "old" programs (as zone names and
742	offsets).
743    </p>
744</li>
745<li>
746	The code supports platforms with a UT offset member
747	in <code>struct tm</code>,
748	e.g., <code>tm_gmtoff</code>.
749</li>
750<li>
751	The code supports platforms with a time zone abbreviation member in
752	<code>struct tm</code>, e.g., <code>tm_zone</code>.
753</li>
754<li>
755	Since the TZ environment variable can now be used to control time
756	conversion, the <code>daylight</code>
757	and <code>timezone</code> variables are no longer needed.
758	(These variables are defined and set by <code>tzset</code>;
759	however, their values will not be used
760	by <code>localtime</code>.)
761</li>
762<li>
	The functions <code>tzalloc</code>, <code>tzfree</code>,
	<code>localtime_rz</code>, and <code>mktime_z</code> have been added for
765	more-efficient thread-safe applications that need to use
766	multiple time zones.  The <code>tzalloc</code>
767	and <code>tzfree</code> functions allocate and free objects of
768	type <code>timezone_t</code>, and <code>localtime_rz</code>
769	and <code>mktime_z</code> are like <code>localtime_r</code>
770	and <code>mktime</code> with an extra
771	<code>timezone_t</code> argument.  The functions were inspired
772	by NetBSD.
773</li>
774<li>
775	A function <code>tzsetwall</code> has been added to arrange
776	for the system's
777	best approximation to local wall clock time to be delivered by
778	subsequent calls to <code>localtime</code>.  Source code for portable
779	applications that "must" run on local wall clock time should call
780	<code>tzsetwall</code>; if such code is moved to "old" systems that don't
	provide <code>tzsetwall</code>, you won't be able to generate an executable program.
	(These time zone functions also arrange for local wall clock time to be
	used if <code>tzset</code> is called &ndash; directly or indirectly &ndash;
784	and there's no TZ
785	environment variable; portable applications should not, however, rely
786	on this behavior since it's not the way SVR2 systems behave.)
787</li>
788<li>
789	Negative <code>time_t</code> values are supported, on systems
790	where <code>time_t</code> is signed.
791</li>
792<li>
793	These functions can account for leap seconds, thanks to Bradley White.
794</li>
795</ul>
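<p>
The UT offset member mentioned in the list above can be read directly
once a timestamp has been converted.  The following is a minimal sketch,
assuming a platform whose <code>struct tm</code> has
a <code>tm_gmtoff</code> member of type <code>long</code> (as on
GNU/Linux and the BSDs); the function
name <code>ut_offset_seconds</code> and its
<code>LONG_MIN</code> failure value are hypothetical choices made for
this example.
</p>

```c
#define _DEFAULT_SOURCE   /* exposes setenv(), localtime_r() and tm_gmtoff on glibc */
#include <limits.h>
#include <stdlib.h>
#include <time.h>

/* Return the UT offset, in seconds east of UT, that applies to T under
   the TZ setting TZ_VALUE, or LONG_MIN on failure.  Like any code that
   sets TZ, this changes process-global state.  (Hypothetical helper.)  */
long ut_offset_seconds(const char *tz_value, time_t t)
{
    struct tm tm;

    if (setenv("TZ", tz_value, 1) != 0)
        return LONG_MIN;
    tzset();   /* make the new TZ value take effect */
    if (localtime_r(&t, &tm) == NULL)
        return LONG_MIN;
    return tm.tm_gmtoff;   /* extension member; assumed present here */
}
```

<p>
For the New Zealand POSIX string shown earlier, this should yield 46800
(13 hours) for a January timestamp and 43200 (12 hours) for a July one.
</p>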
796<p>
797Points of interest to folks with other systems:
798</p>
799<ul>
800  <li>
801	Code compatible with this package is already part of many platforms,
802	including GNU/Linux, Android, the BSDs, Chromium OS, Cygwin, AIX, iOS,
	BlackBerry 10, macOS, Microsoft Windows, OpenVMS, and Solaris.
804	On such hosts, the primary use of this package
805	is to update obsolete time zone rule tables.
806	To do this, you may need to compile the time zone compiler
807	'<code>zic</code>' supplied with this package instead of using
808	the system '<code>zic</code>', since the format
809	of <code>zic</code>'s input is occasionally extended, and a
810	platform may still be shipping an older <code>zic</code>.
811  </li>
812  <li>
813	The UNIX Version 7 <code>timezone</code> function is not
814	present in this package;
	it's impossible to reliably map <code>timezone</code>'s arguments (a "minutes west
	of GMT" value and a "daylight saving time in effect" flag) to a
	time zone abbreviation, and we refuse to guess.
	Programs that in the past used the <code>timezone</code> function may now examine
819	<code>localtime(&amp;clock)-&gt;tm_zone</code>
820	(if <code>TM_ZONE</code> is defined) or
821	<code>tzname[localtime(&amp;clock)-&gt;tm_isdst]</code>
822	(if <code>HAVE_TZNAME</code> is defined)
823	to learn the correct time zone abbreviation to use.
824  </li>
825  <li>
826	The 4.2BSD <code>gettimeofday</code> function is not used in
827	this package.
828	This formerly let users obtain the current UTC offset and DST flag,
829	but this functionality was removed in later versions of BSD.
830  </li>
831  <li>
832	In SVR2, time conversion fails for near-minimum or near-maximum
833	<code>time_t</code> values when doing conversions for places
834	that don't use UT.
835	This package takes care to do these conversions correctly.
836	A comment in the source code tells how to get compatibly wrong
837	results.
838  </li>
839</ul>
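<p>
The <code>tzname</code> approach suggested above as a replacement for
the Version 7 <code>timezone</code> function might look like the
following sketch.  The function names are hypothetical, and the code
assumes a POSIX-like library that provides <code>tzname</code>
and <code>setenv</code>; on platforms with a <code>tm_zone</code>
member, that member could be consulted instead.
</p>

```c
#define _DEFAULT_SOURCE   /* exposes setenv() and tzname on glibc */
#include <stdlib.h>
#include <time.h>

/* Return the time zone abbreviation that applies to T under the
   process's current TZ setting, or NULL on failure.  The result points
   into tzname[], so it may change when TZ is next re-read.
   (Hypothetical helper.)  */
const char *zone_abbreviation(time_t t)
{
    struct tm *tm = localtime(&t);   /* behaves as though tzset() were called */

    if (tm == NULL)
        return NULL;
    return tzname[tm->tm_isdst > 0];   /* guard against a negative tm_isdst */
}

/* Convenience wrapper that sets the process-global TZ first.
   (Hypothetical helper.)  */
const char *abbreviation_in_zone(const char *tz_value, time_t t)
{
    if (setenv("TZ", tz_value, 1) != 0)
        return NULL;
    return zone_abbreviation(t);
}
```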
840<p>
841The functions that are conditionally compiled
842if <code>STD_INSPIRED</code> is defined
843should, at this point, be looked on primarily as food for thought.  They are
844not in any sense "standard compatible" &ndash; some are not, in fact,
845specified in <em>any</em> standard.  They do, however, represent responses of
846various authors to
847standardization proposals.
848</p>
849
850<p>
851Other time conversion proposals, in particular the one developed by folks at
852Hewlett Packard, offer a wider selection of functions that provide capabilities
853beyond those provided here.  The absence of such functions from this package
854is not meant to discourage the development, standardization, or use of such
855functions.  Rather, their absence reflects the decision to make this package
856contain valid extensions to POSIX, to ensure its broad acceptability.  If
857more powerful time conversion functions can be standardized, so much the
858better.
859</p>
860  </section>
861
862
863  <section>
864    <h2 id="stability">Interface stability</h2>
865<p>
866The tz code and data supply the following interfaces:
867</p>
868<ul>
869  <li>
870   A set of zone names as per "<a href="#naming">Names of time zone
871   rules</a>" above.
872  </li>
873  <li>
874   Library functions described in "<a href="#functions">Time and date
875   functions</a>" above.
876  </li>
877  <li>
878   The programs <code>tzselect</code>, <code>zdump</code>,
879   and <code>zic</code>, documented in their man pages.
880  </li>
881  <li>
882   The format of <code>zic</code> input files, documented in
883   the <code>zic</code> man page.
884  </li>
885  <li>
886   The format of <code>zic</code> output files, documented in
887   the <code>tzfile</code> man page.
888  </li>
889  <li>
890   The format of zone table files, documented in <code>zone1970.tab</code>.
891  </li>
892  <li>
893   The format of the country code file, documented in <code>iso3166.tab</code>.
894  </li>
895  <li>
896   The version number of the code and data, as the first line of
897   the text file '<code>version</code>' in each release.
898  </li>
899</ul>
900<p>
901Interface changes in a release attempt to preserve compatibility with
902recent releases.  For example, tz data files typically do not rely on
903recently-added <code>zic</code> features, so that users can run
904older <code>zic</code> versions to process newer data
905files.  <a href="tz-link.htm">Sources for time zone and daylight
906saving time data</a> describes how
907releases are tagged and distributed.
908</p>
909
910<p>
911Interfaces not listed above are less stable.  For example, users
912should not rely on particular UT offsets or abbreviations for
913timestamps, as data entries are often based on guesswork and these
914guesses may be corrected or improved.
915</p>
916  </section>
917
918
919  <section>
920    <h2 id="calendar">Calendrical issues</h2>
921<p>
922Calendrical issues are a bit out of scope for a time zone database,
923but they indicate the sort of problems that we would run into if we
924extended the time zone database further into the past.  An excellent
925resource in this area is Nachum Dershowitz and Edward M. Reingold,
926<cite><a href="https://www.cs.tau.ac.il/~nachum/calendar-book/third-edition/">Calendrical
927Calculations: Third Edition</a></cite>, Cambridge University Press (2008).
928Other information and sources are given in the file '<samp>calendars</samp>'
929in the tz distribution.  They sometimes disagree.
930</p>
931  </section>
932
933
934  <section>
935    <h2 id="planets">Time and time zones on other planets</h2>
936<p>
937Some people's work schedules use Mars time.  Jet Propulsion Laboratory
938(JPL) coordinators have kept Mars time on and off at least since 1997
939for the Mars Pathfinder mission.  Some of their family members have
940also adapted to Mars time.  Dozens of special Mars watches were built
941for JPL workers who kept Mars time during the Mars Exploration
942Rovers mission (2004).  These timepieces look like normal Seikos and
943Citizens but use Mars seconds rather than terrestrial seconds.
944</p>
945
946<p>
947A Mars solar day is called a "sol" and has a mean period equal to
948about 24 hours 39 minutes 35.244 seconds in terrestrial time.  It is
949divided into a conventional 24-hour clock, so each Mars second equals
950about 1.02749125 terrestrial seconds.
951</p>
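<p>
The length of a Mars second follows from dividing the sol into 86400
equal parts.  A small C sketch of that arithmetic, using only the
figures given above (the function names are hypothetical):
</p>

```c
/* Mean length of a sol in terrestrial seconds:
   24 hours 39 minutes 35.244 seconds.  */
double sol_length_seconds(void)
{
    return 24 * 3600.0 + 39 * 60.0 + 35.244;   /* 88775.244 */
}

/* Length of one Mars second in terrestrial seconds, given that a sol
   is divided into 24 * 60 * 60 = 86400 Mars seconds.  */
double mars_second_length(void)
{
    return sol_length_seconds() / 86400.0;   /* about 1.02749125 */
}
```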
952
953<p>
954The prime meridian of Mars goes through the center of the crater
955Airy-0, named in honor of the British astronomer who built the
956Greenwich telescope that defines Earth's prime meridian.  Mean solar
957time on the Mars prime meridian is called Mars Coordinated Time (MTC).
958</p>
959
960<p>
961Each landed mission on Mars has adopted a different reference for
962solar time keeping, so there is no real standard for Mars time zones.
963For example, the Mars Exploration Rover project (2004) defined two
964time zones "Local Solar Time A" and "Local Solar Time B" for its two
965missions, each zone designed so that its time equals local true solar
966time at approximately the middle of the nominal mission.  Such a "time
967zone" is not particularly suited for any application other than the
968mission itself.
969</p>
970
971<p>
972Many calendars have been proposed for Mars, but none have achieved
wide acceptance.  Astronomers often use Mars Sol Date (MSD), which is a
974sequential count of Mars solar days elapsed since about 1873-12-29
97512:00 GMT.
976</p>
977
978<p>
979In our solar system, Mars is the planet with time and calendar most
980like Earth's.  On other planets, Sun-based time and calendars would
981work quite differently.  For example, although Mercury's sidereal
982rotation period is 58.646 Earth days, Mercury revolves around the Sun
983so rapidly that an observer on Mercury's equator would see a sunrise
984only every 175.97 Earth days, i.e., a Mercury year is 0.5 of a Mercury
985day.  Venus is more complicated, partly because its rotation is
986slightly retrograde: its year is 1.92 of its days.  Gas giants like
987Jupiter are trickier still, as their polar and equatorial regions
988rotate at different rates, so that the length of a day depends on
989latitude.  This effect is most pronounced on Neptune, where the day is
990about 12 hours at the poles and 18 hours at the equator.
991</p>
992
993<p>
994Although the tz database does not support time on other planets, it is
995documented here in the hopes that support will be added eventually.
996</p>
997
998<p>
999Sources:
1000</p>
1001<ul>
1002  <li>
1003Michael Allison and Robert Schmunk,
1004"<a href="https://www.giss.nasa.gov/tools/mars24/help/notes.html">Technical
1005Notes on Mars Solar Time as Adopted by the Mars24 Sunclock</a>"
1006(2012-08-08).
1007  </li>
1008  <li>
1009Jia-Rui Chong,
1010"<a href="http://articles.latimes.com/2004/jan/14/science/sci-marstime14">Workdays
1011Fit for a Martian</a>", Los Angeles Times
1012(2004-01-14), pp A1, A20-A21.
1013  </li>
1014  <li>
1015Tom Chmielewski,
1016"<a href="https://www.theatlantic.com/technology/archive/2015/02/jet-lag-is-worse-on-mars/386033/">Jet
Lag Is Worse on Mars</a>", The Atlantic (2015-02-26).
1018  </li>
1019  <li>
1020Matt Williams,
1021"<a href="https://www.universetoday.com/37481/days-of-the-planets/">How
1022long is a day on the other planets of the solar system?</a>"
1023(2017-04-27).
1024  </li>
1025</ul>
1026  </section>
1027
1028  <footer>
1029    <hr>
1030This file is in the public domain, so clarified as of 2009-05-17 by
1031Arthur David Olson.
1032  </footer>
1033</body>
1034</html>
1035