Diffstat (limited to 'manual/time.texi')
-rw-r--r-- | manual/time.texi | 553 |
1 files changed, 542 insertions, 11 deletions
diff --git a/manual/time.texi b/manual/time.texi
index 46a2832326..7b58ad4400 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -82,7 +82,7 @@ elapsed = ((double) (end - start)) / CLOCKS_PER_SEC;
 
 Different computers and operating systems vary wildly in how they keep
 track of processor time. It's common for the internal processor clock
-to have a resolution somewhere between hundredths and millionths of a
+to have a resolution somewhere between hundredth and millionth of a
 second.
 
 In the GNU system, @code{clock_t} is equivalent to @code{long int} and
@@ -224,6 +224,8 @@ date and time values.
 * High-Resolution Calendar:: A time representation with greater precision.
 * Broken-down Time:: Facilities for manipulating local time.
 * Formatting Date and Time:: Converting times to strings.
+* Parsing Date and Time:: Convert textual time and date information back
+ into broken-down time values.
 * TZ Variable:: How users specify the time zone.
 * Time Zone Functions:: Functions to examine or specify the time zone.
 * Time Functions Example:: An example program showing use of some of
@@ -689,7 +691,6 @@ return @code{NULL}.
 
 @comment time.h
 @comment ISO
-@comment POSIX.2
 @deftypefun size_t strftime (char *@var{s}, size_t @var{size}, const char *@var{template}, const struct tm *@var{brokentime})
 This function is similar to the @code{sprintf} function (@pxref{Formatted
 Input}), but the conversion specifications that can appear in the format
@@ -789,12 +790,6 @@ The day of the month like with @code{%d}, but padded with blank (range
 
 This format is a POSIX.2 extension.
 
-@item %f
-The day of the week as a decimal number (range @code{1} through
-@code{7}), Monday being @code{1}.
-
-This format is a @w{ISO C 9X} extension.
-
 @item %F
 The date using the format @code{%Y-%m-%d}. This is the form specified
 in the @w{ISO 8601} standard and is the preferred form for all uses.
@@ -890,7 +885,7 @@ Leap seconds are not counted unless leap second support is available.
 This format is a GNU extension.
 
 @item %S
-The second as a decimal number (range @code{00} through @code{60}).
+The seconds as a decimal number (range @code{00} through @code{60}).
 
 @item %t
 A single @samp{\t} (tabulator) character.
@@ -959,8 +954,8 @@ determinable.
 
 This format is a GNU extension.
 
-A full @w{RFC 822} timestamp is generated by the format
-@w{@samp{"%a, %d %b %Y %H:%M:%S %z"}} (or the equivalent
+A full @w{RFC 822} timestamp is generated by the format
+@w{@samp{"%a, %d %b %Y %H:%M:%S %z"}} (or the equivalent
 @w{@samp{"%a, %d %b %Y %T %z"}}).
 
 @item %Z
@@ -1008,6 +1003,542 @@ is examined before any output is produced.
 For an example of @code{strftime}, see @ref{Time Functions Example}.
 @end deftypefun
 
+@node Parsing Date and Time
+@subsection Convert textual time and date information back
+
+The @w{ISO C} standard does not specify any functions which can convert
+the output of the @code{strftime} function back into a binary format.
+This led to a variety of more or less successful implementations with
+different interfaces over the years. Then the Unix standard got
+extended by two functions: @code{strptime} and @code{getdate}. Both
+have somewhat strange interfaces but at least they are widely available.
+
+@menu
+* Low-Level Time String Parsing:: Interpret string according to given format.
+* General Time String Parsing:: User-friendly function to parse date and
+ time strings.
+@end menu
+
+@node Low-Level Time String Parsing
+@subsubsection Interpret string according to given format
+
+The first function is a rather low-level interface. It is nevertheless
+frequently used in user programs since it is better known. Its
+implementation and interface, though, are heavily influenced by the
+@code{getdate} function, which is defined and implemented in terms of
+calls to @code{strptime}.
+
+@comment time.h
+@comment XPG4
+@deftypefun {char *} strptime (const char *@var{s}, const char *@var{fmt}, struct tm *@var{tp})
+The @code{strptime} function parses the input string @var{s} according
+to the format string @var{fmt} and stores the found values in the
+structure @var{tp}.
+
+The input string can be retrieved in any way. It does not matter
+whether it was generated by a @code{strftime} call or made up directly
+by a program. It is also not necessary that the content is in any
+human-recognizable format. I.e., it is OK if a date is written like
+@code{"02:1999:9"} which is not understandable without context. As long
+as the format string @var{fmt} matches the format of the input string
+everything goes.
+
+The format string consists of the same components as the format string
+for the @code{strftime} function. The only difference is that the flags
+@code{_}, @code{-}, @code{0}, and @code{^} are not allowed.
+@comment Is this really the intention? --drepper
+Several of the formats which @code{strftime} handles differently do the
+same work in @code{strptime} since differences like the case of the output
+do not matter. For symmetry reasons all formats are supported, though.
+
+The modifiers @code{E} and @code{O} are also allowed everywhere the
+@code{strftime} function allows them.
+
+The formats are:
+
+@table @code
+@item %a
+@itemx %A
+The weekday name according to the current locale, in abbreviated form or
+the full name.
+
+@item %b
+@itemx %B
+@itemx %h
+The month name according to the current locale, in abbreviated form or
+the full name.
+
+@item %c
+The date and time representation for the current locale.
+
+@item %Ec
+Like @code{%c} but the locale's alternative date and time format is used.
+
+@item %C
+The century of the year.
+
+It makes sense to use this format only if the format string also
+contains the @code{%y} format.
+
+@item %EC
+The locale's representation of the period.
+
+Unlike @code{%C} it sometimes makes sense to use this format since in
+some cultures it is required to specify years relative to periods
+instead of using the Gregorian years.
+
+@item %d
+@itemx %e
+The day of the month as a decimal number (range @code{1} through @code{31}).
+Leading zeroes are permitted but not required.
+
+@item %Od
+@itemx %Oe
+Same as @code{%d} but the locale's alternative numeric symbols are used.
+
+Leading zeroes are permitted but not required.
+
+@item %D
+Equivalent to the use of @code{%m/%d/%y} in this place.
+
+@item %F
+Equivalent to the use of @code{%Y-%m-%d} which is the @w{ISO 8601} date
+format.
+
+This is a GNU extension following an @w{ISO C 9X} extension to
+@code{strftime}.
+
+@item %g
+The year corresponding to the ISO week number, but without the century
+(range @code{00} through @code{99}).
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+This format is a GNU extension following a GNU extension of @code{strftime}.
+
+@item %G
+The year corresponding to the ISO week number.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+This format is a GNU extension following a GNU extension of @code{strftime}.
+
+@item %H
+@itemx %k
+The hour as a decimal number, using a 24-hour clock (range @code{00} through
+@code{23}).
+
+@code{%k} is a GNU extension following a GNU extension of @code{strftime}.
+
+@item %OH
+Same as @code{%H} but using the locale's alternative numeric symbols.
+
+@item %I
+@itemx %l
+The hour as a decimal number, using a 12-hour clock (range @code{01} through
+@code{12}).
+
+@code{%l} is a GNU extension following a GNU extension of @code{strftime}.
+
+@item %OI
+Same as @code{%I} but using the locale's alternative numeric symbols.
+
+@item %j
+The day of the year as a decimal number (range @code{1} through @code{366}).
+
+Leading zeroes are permitted but not required.
+
+@item %m
+The month as a decimal number (range @code{1} through @code{12}).
+
+Leading zeroes are permitted but not required.
+
+@item %Om
+Same as @code{%m} but using the locale's alternative numeric symbols.
+
+@item %M
+The minute as a decimal number (range @code{0} through @code{59}).
+
+Leading zeroes are permitted but not required.
+
+@item %OM
+Same as @code{%M} but using the locale's alternative numeric symbols.
+
+@item %n
+@itemx %t
+Matches any white space.
+
+@item %p
+@itemx %P
+The locale-dependent equivalent to @samp{AM} or @samp{PM}.
+
+This format is not useful unless @code{%I} or @code{%l} is also used.
+Another complication is that the locale might not define these values at
+all, in which case the conversion fails.
+
+@code{%P} is a GNU extension following a GNU extension to @code{strftime}.
+
+@item %r
+The complete time using the AM/PM format of the current locale.
+
+A complication is that the locale might not define this format at all,
+in which case the conversion fails.
+
+@item %R
+The hour and minute in decimal numbers using the format @code{%H:%M}.
+
+@code{%R} is a GNU extension following a GNU extension to @code{strftime}.
+
+@item %s
+The number of seconds since the epoch, i.e., since 1970-01-01 00:00:00 UTC.
+Leap seconds are not counted unless leap second support is available.
+
+@code{%s} is a GNU extension following a GNU extension to @code{strftime}.
+
+@item %S
+The seconds as a decimal number (range @code{0} through @code{61}).
+
+Leading zeroes are permitted but not required.
+
+Please note the nonsense with @code{61} being allowed. This is what the
+Unix specification says. They followed the stupid decision once made to
+allow double leap seconds. These do not exist but the myth persists.
+
+@item %OS
+Same as @code{%S} but using the locale's alternative numeric symbols.
+
+@item %T
+Equivalent to the use of @code{%H:%M:%S} in this place.
+
+@item %u
+The day of the week as a decimal number (range @code{1} through
+@code{7}), Monday being @code{1}.
+
+Leading zeroes are permitted but not required.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+@item %U
+The week number of the current year as a decimal number (range @code{0}
+through @code{53}).
+
+Leading zeroes are permitted but not required.
+
+@item %OU
+Same as @code{%U} but using the locale's alternative numeric symbols.
+
+@item %V
+The @w{ISO 8601:1988} week number as a decimal number (range @code{1}
+through @code{53}).
+
+Leading zeroes are permitted but not required.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+@item %w
+The day of the week as a decimal number (range @code{0} through
+@code{6}), Sunday being @code{0}.
+
+Leading zeroes are permitted but not required.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+@item %Ow
+Same as @code{%w} but using the locale's alternative numeric symbols.
+
+@item %W
+The week number of the current year as a decimal number (range @code{0}
+through @code{53}).
+
+Leading zeroes are permitted but not required.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+@item %OW
+Same as @code{%W} but using the locale's alternative numeric symbols.
+
+@item %x
+The date using the locale's date format.
+
+@item %Ex
+Like @code{%x} but the locale's alternative date representation is used.
+
+@item %X
+The time using the locale's time format.
+
+@item %EX
+Like @code{%X} but the locale's alternative time representation is used.
+
+@item %y
+The year without a century as a decimal number (range @code{0} through
+@code{99}).
+
+Leading zeroes are permitted but not required.
+
+Please note that it is at least questionable to use this format without
+the @code{%C} format. The @code{strptime} function does regard input
+values in the range @math{69} to @math{99} as the years @math{1969} to
+@math{1999} and the values @math{0} to @math{68} as the years
+@math{2000} to @math{2068}. But maybe this heuristic fails for some
+input data.
+
+Therefore it is best to avoid @code{%y} completely and use @code{%Y}
+instead.
+
+@item %Ey
+The offset from @code{%EC} in the locale's alternative representation.
+
+@item %Oy
+The offset of the year (from @code{%C}) using the locale's alternative
+numeric symbols.
+
+@item %Y
+The year as a decimal number, using the Gregorian calendar.
+
+@item %EY
+The full alternative year representation.
+
+@item %z
+Equivalent to the use of @code{%a, %d %b %Y %H:%M:%S %z} in this place.
+This is the full @w{RFC 822} date and time format.
+
+@item %Z
+The timezone name.
+
+@emph{Note:} This is not really implemented currently. The format is
+recognized, input is consumed but no field in @var{tp} is set.
+
+@item %%
+A literal @samp{%} character.
+@end table
+
+All other characters in the format string must have a matching character
+in the input string. Exceptions are white spaces in the format string
+which can match zero or more white space characters in the input string.
+
+The @code{strptime} function processes the input string from left to
+right. Each of the three possible input elements (white space, literal,
+or format) is handled one after the other. If the input cannot be
+matched to the format string the function stops. The remainder of the
+format and input strings are not processed.
+
+The return value of the function is a pointer to the first character not
+processed in this function call. If the input cannot be matched, the
+function returns a null pointer instead. In case the input
+string contains more than required by the format string the return value
+points right after the last consumed input character. In case the whole
+input string is consumed the return value points to the NUL byte at the
+end of the string.
+@end deftypefun
+
+The specification of the function in the XPG standard is rather vague.
+It leaves out a few important pieces of information. Most importantly, it
+does not specify what happens to those elements of @var{tp} which are
+not directly initialized by the different formats. Implementations
+on different Unix systems vary here.
+
+The GNU libc implementation does not touch those fields which are not
+directly initialized. Exceptions are the @code{tm_wday} and
+@code{tm_yday} elements which are recomputed if any of the year, month,
+or date elements changed. This has two implications:
+
+@itemize @bullet
+@item
+Before calling the @code{strptime} function for a new input string one
+has to prepare the structure passed in as @var{tp}. Normally this
+will mean that all values are initialized to zero. Alternatively one
+can set all fields to values like @code{INT_MAX} which allows one to
+determine which elements were set by the function call. Zero does not
+work here since it is a valid value for many of the fields.
+
+Careful initialization is necessary if one wants to find out whether a
+certain field in @var{tp} was initialized by the function call.
+
+@item
+One can construct a @code{struct tm} value in several @code{strptime}
+calls in a row. A useful application of this is for example the parsing
+of two separate strings, one containing the date information, the other
+the time information. By parsing both one after the other without
+clearing the structure in between one can construct a complete
+broken-down time.
+@end itemize
+
+The following example shows a function which parses a string which is
+supposed to contain the date information in either US style or @w{ISO
+8601} form.
+
+@smallexample
+const char *
+parse_date (const char *input, struct tm *tm)
+@{
+  const char *cp;
+
+  /* @r{First clear the result structure.} */
+  memset (tm, '\0', sizeof (*tm));
+
+  /* @r{Try the ISO format first.} */
+  cp = strptime (input, "%F", tm);
+  if (cp == NULL)
+    @{
+      /* @r{Does not match. Try the US form.} */
+      cp = strptime (input, "%D", tm);
+    @}
+
+  return cp;
+@}
+@end smallexample
+
+@node General Time String Parsing
+@subsubsection A user-friendlier way to parse times and dates
+
+The Unix standard defines another function to parse date strings. The
+interface is, to put it mildly, weird. But if this function fits into the
+application to be written it is just fine. It is a problem when using
+this function in multi-threaded programs or in libraries since it
+returns a pointer to a static variable, uses a global variable, and
+depends on global state (an environment variable).
+
+@comment time.h
+@comment Unix98
+@defvar getdate_err
+This variable of type @code{int} will contain the error code of the last
+unsuccessful call of the @code{getdate} function. Defined values are:
+
+@table @math
+@item 1
+The environment variable @code{DATEMSK} is not defined or null.
+@item 2
+The template file denoted by the @code{DATEMSK} environment variable
+cannot be opened.
+@item 3
+Information about the template file cannot be retrieved.
+@item 4
+The template file is not a regular file.
+@item 5
+An I/O error occurred while reading the template file.
+@item 6
+Not enough memory available to execute the function.
+@item 7
+The template file contains no matching template.
+@item 8
+The input string is invalid for a template which would match otherwise.
+This includes errors like February 31st, or return values which cannot be
+represented using @code{time_t}.
+@end table
+@end defvar
+
+@comment time.h
+@comment Unix98
+@deftypefun {struct tm *} getdate (const char *@var{string})
+The interface of the @code{getdate} function is the simplest possible
+for a function to parse a string and return the value. @var{string} is
+the input string and the result is passed to the user in a statically
+allocated variable.
+
+The details about how the string is processed are hidden from the user.
+In fact, they can be outside the control of the program. Which formats
+are recognized is controlled by the file named by the environment
+variable @code{DATEMSK}. The named file should contain
+lines of valid format strings which could be passed to @code{strptime}.
+
+The @code{getdate} function reads these format strings one after the
+other and tries to match the input string. The first line which
+completely matches the input string is used.
+
+Elements which were not initialized through the format string get
+assigned the values of the time the @code{getdate} function is called.
+
+The format elements recognized by @code{getdate} are the same as for
+@code{strptime}. See above for an explanation. There are only a few
+extensions to the @code{strptime} behavior:
+
+@itemize @bullet
+@item
+If the @code{%Z} format is given the broken-down time is based on the
+current time in the timezone matched, not in the current timezone of the
+runtime environment.
+
+@emph{Note}: This is not implemented (currently). The problem is that
+timezone names are not unique. If a fixed timezone is assumed for a
+given string (say @code{EST} meaning US East Coast time) use in
+countries other than the USA will fail. So far we have found no good
+solution for this.
+
+@item
+If only the weekday is specified the selected day depends on the current
+date. If the current weekday is less than or equal to the @code{tm_wday}
+value this week's day is selected. Otherwise next week's day.
+
+@item
+A similar heuristic is used if only the month is given, not the year.
+For a value corresponding to the current or a later month the current year
+is used. Otherwise the next year. The first day of the month is assumed
+if it is not explicitly specified.
+
+@item
+The current hour, minute, and second are used if the appropriate value is
+not set through the format.
+
+@item
+If no date is given the date for the next day is used if the time is
+smaller than the current time. Otherwise it is the same day.
+@end itemize
+
+It should be noted that the format in the template file need not
+contain only format elements. The following is a list of possible format
+strings (taken from the Unix standard):
+
+@smallexample
+%m
+%A %B %d, %Y %H:%M:%S
+%A
+%B
+%m/%d/%y %I %p
+%d,%m,%Y %H:%M
+at %A the %dst of %B in %Y
+run job at %I %p,%B %dnd
+%A den %d. %B %Y %H.%M Uhr
+@end smallexample
+
+As one can see the template list can contain very specific strings like
+@code{run job at %I %p,%B %dnd}. Using the above list of templates and
+assuming the current time is Mon Sep 22 12:19:47 EDT 1986 we get the
+following results for the given input.
+
+@multitable {xxxxxxxxxxxx} {xxxxxxxx} {xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}
+@item Mon @tab %a @tab Mon Sep 22 12:19:47 EDT 1986
+@item Sun @tab %a @tab Sun Sep 28 12:19:47 EDT 1986
+@item Fri @tab %a @tab Fri Sep 26 12:19:47 EDT 1986
+@item September @tab %B @tab Mon Sep 1 12:19:47 EDT 1986
+@item January @tab %B @tab Thu Jan 1 12:19:47 EST 1987
+@item December @tab %B @tab Mon Dec 1 12:19:47 EST 1986
+@item Sep Mon @tab %b %a @tab Mon Sep 1 12:19:47 EDT 1986
+@item Jan Fri @tab %b %a @tab Fri Jan 2 12:19:47 EST 1987
+@item Dec Mon @tab %b %a @tab Mon Dec 1 12:19:47 EST 1986
+@item Jan Wed 1989 @tab %b %a %Y @tab Wed Jan 4 12:19:47 EST 1989
+@item Fri 9 @tab %a %H @tab Fri Sep 26 09:00:00 EDT 1986
+@item Feb 10:30 @tab %b %H:%S @tab Sun Feb 1 10:00:30 EST 1987
+@item 10:30 @tab %H:%M @tab Tue Sep 23 10:30:00 EDT 1986
+@item 13:30 @tab %H:%M @tab Mon Sep 22 13:30:00 EDT 1986
+@end multitable
+
+The return value of the function is a pointer to a static variable of
+type @w{@code{struct tm}} or a null pointer if an error occurred. The
+result in the variable pointed to by the return value is only valid
+until the next @code{getdate} call which makes this function unusable in
+multi-threaded applications.
+
+The @code{errno} variable is @emph{not} changed. Error conditions are
+signalled using the global variable @code{getdate_err}. See the
+description above for a list of the possible error values.
+@end deftypefun
+
 @node TZ Variable
 @subsection Specifying the Time Zone with @code{TZ}
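Illustration only, not part of the patch: the second implication listed in the strptime section above (building one struct tm from several strptime calls in a row) might be sketched as follows. The helper name parse_date_and_time, its 0/-1 return convention, and the two format strings are hypothetical choices made for this sketch. It also assumes the behaviour the patch describes for GNU libc, namely that fields not named by the format are left untouched, and that _XOPEN_SOURCE is defined before any header is included so that time.h declares strptime.

```c
#define _XOPEN_SOURCE 700       /* must precede all includes; exposes strptime */
#include <assert.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: fill *TP from a date string ("YYYY-MM-DD") and a
   separate time string ("HH:MM:SS").  Returns 0 on success and -1 if
   either string does not match its format completely.  */
int
parse_date_and_time (const char *date, const char *clock_str, struct tm *tp)
{
  const char *end;

  assert (tp != NULL);

  /* Clear the structure once, before the first call only.  */
  memset (tp, 0, sizeof *tp);

  /* First call: sets the year, month and day fields.  */
  end = strptime (date, "%Y-%m-%d", tp);
  if (end == NULL || *end != '\0')
    return -1;

  /* Second call: the structure is deliberately not cleared again, so the
     date fields set above are expected to survive (GNU libc behaviour as
     described in the patch).  */
  end = strptime (clock_str, "%H:%M:%S", tp);
  if (end == NULL || *end != '\0')
    return -1;

  return 0;
}
```

Called with "1999-02-09" and "17:45:30", the helper is expected to leave tm_year at 99, tm_mon at 1 and tm_mday at 9 alongside the parsed time fields. Checking that the returned pointer reaches the terminating NUL byte rejects trailing garbage, which a plain NULL test would let through.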