-rw-r--r-- | ChangeLog | 2
-rw-r--r-- | manual/ctype.texi | 89
-rw-r--r-- | manual/locale.texi | 482
3 files changed, 284 insertions, 289 deletions
diff --git a/ChangeLog b/ChangeLog index 42e667435c..9c9137fc4b 100644 --- a/ChangeLog +++ b/ChangeLog @@ -3,11 +3,13 @@ * manual/argp.texi: Fixing language and typos. * manual/conf.texi: Likewise. * manual/contrib.texi: Likewise. + * manual/ctype.texi: Likewise. * manual/filesys.texi: Likewise. * manual/install.texi: Likewise. * manual/job.texi: Likewise. * manual/lang.texi: Likewise. * manual/llio.texi: Likewise. + * manual/locale.texi: Likewise. * manual/math.texi: Likewise. * manual/nss.texi: Likewise. * manual/pipe.texi: Likewise. diff --git a/manual/ctype.texi b/manual/ctype.texi index b5ab6bae3d..0d3ab60aa2 100644 --- a/manual/ctype.texi +++ b/manual/ctype.texi @@ -266,34 +266,34 @@ with the SVID. @section Character class determination for wide characters The second amendment to @w{ISO C89} defines functions to classify wide -characters. The original @w{ISO C89} standard defined the type -@code{wchar_t} but failed to define any functions to operate on wide -characters. +characters. Although the original @w{ISO C89} standard already defined +the type @code{wchar_t}, no functions operating on them were defined. The general design of the classification functions for wide characters -is more general. It allows extending the set of available -classifications beyond the set which is always available. The POSIX -standard specifies how the extension can be done and this is already +is more general. It allows extensions to the set of available +classifications, beyond those which are always available. The POSIX +standard specifies how extensions can be made, and this is already implemented in the GNU C library implementation of the @code{localedef} program. -The character class functions are normally implemented using bitsets. -I.e., for the character in question the appropriate bitset is read from -a table and a test is performed to determine whether a certain bit is -set in this bitset. Which bit is tested for is determined by the class. 
+The character class functions are normally implemented with bitsets, +with a bitset per character. For a given character, the appropriate +bitset is read from a table and a test is performed as to whether a +certain bit is set. Which bit is tested for is determined by the +class. For the wide character classification functions this is made visible. -There is a type representing the classification, a function to retrieve -this value for a specific class, and a function to test using the -classification value whether a given character is in this class. On top -of this the normal character classification functions as used for +There is a type classification type defined, a function to retrieve this +value for a given class, and a function to test whether a given +character is in this class, using the classification value. On top of +this the normal character classification functions as used for @code{char} objects can be defined. @comment wctype.h @comment ISO @deftp {Data type} wctype_t The @code{wctype_t} can hold a value which represents a character class. -The ony defined way to generate such a value is by using the +The only defined way to generate such a value is by using the @code{wctype} function. @pindex wctype.h @@ -306,8 +306,8 @@ This type is defined in @file{wctype.h}. The @code{wctype} returns a value representing a class of wide characters which is identified by the string @var{property}. Beside some standard properties each locale can define its own ones. In case -no property with the given name is known for the current locale for the -@code{LC_CTYPE} category the function returns zero. +no property with the given name is known for the current locale +selected for the @code{LC_CTYPE} category, the function returns zero. @noindent The properties known in every locale are: @@ -339,11 +339,11 @@ by a successful call to @code{wctype}. This function is declared in @file{wctype.h}. 
@end deftypefun -This makes it easier to use the commonly-used classification functions -that are defined in the C library. There is no need to use +To make it easier to use the commonly-used classification functions, +they are defined in the C library. There is no need to use @code{wctype} if the property string is one of the known character classes. In some situations it is desirable to construct the property -string and then it becomes important that @code{wctype} can also handle the +strings, and then it is important that @code{wctype} can also handle the standard classes. @cindex alphanumeric character @@ -420,7 +420,7 @@ wide characters: @smallexample n = 0; -while (iswctype (*wc)) +while (iswdigit (*wc)) @{ n *= 10; n += *wc++ - L'0'; @@ -604,11 +604,11 @@ This function is a GNU extension. It is declared in @file{wchar.h}. @node Using Wide Char Classes, Wide Character Case Conversion, Classification of Wide Characters, Character Handling @section Notes on using the wide character classes -The first note is probably nothing astonishing but still occasionally a +The first note is probably not astonishing but still occasionally a cause of problems. The @code{isw@var{XXX}} functions can be implemented using macros and in fact, the GNU C library does this. They are still available as real functions but when the @file{wctype.h} header is -included the macros will be used. This is nothing new compared to the +included the macros will be used. This is the same as the @code{char} type versions of these functions. The second note covers something new. It can be best illustrated by a @@ -630,8 +630,8 @@ is_in_class (int c, const char *class) @} @end smallexample -Now with the @code{wctype} and @code{iswctype} one could avoid the -@code{if} cascades. 
But rewriting the code as follows is wrong: +Now, with the @code{wctype} and @code{iswctype} you can avoid the +@code{if} cascades, but rewriting the code as follows is wrong: @smallexample int @@ -644,7 +644,7 @@ is_in_class (int c, const char *class) The problem is that it is not guaranteed that the wide character representation of a single-byte character can be found using casting. -In fact, usually this fails miserably. The correct solution for this +In fact, usually this fails miserably. The correct solution to this problem is to write the code as follows: @smallexample @@ -657,10 +657,10 @@ is_in_class (int c, const char *class) @end smallexample @xref{Converting a Character}, for more information on @code{btowc}. -Please note that this change probably does not improve the performance +Note that this change probably does not improve the performance of the program a lot since the @code{wctype} function still has to make -the string comparisons. But it gets really interesting if the -@code{is_in_class} function would be called more than once using the +the string comparisons. It gets really interesting if the +@code{is_in_class} function is called more than once for the same class name. In this case the variable @var{desc} could be computed once and reused for all the calls. Therefore the above form of the function is probably not the final one. @@ -669,18 +669,17 @@ function is probably not the final one. @node Wide Character Case Conversion, , Using Wide Char Classes, Character Handling @section Mapping of wide characters. -As for the classification functions, the @w{ISO C} standard also -generalizes the mapping functions. Instead of only allowing the two -standard mappings, the locale can contain others. Again, the -@code{localedef} program already supports generating such locale data -files. +The classification functions are also generalized by the @w{ISO C} +standard. Instead of just allowing the two standard mappings, a +locale can contain others. 
Again, the @code{localedef} program +already supports generating such locale data files. @comment wctype.h @comment ISO @deftp {Data Type} wctrans_t This data type is defined as a scalar type which can hold a value representing the locale-dependent character mapping. There is no way to -construct such a value except using the return value of the +construct such a value apar from using the return value of the @code{wctrans} function. @pindex wctype.h @@ -693,8 +692,8 @@ This type is defined in @file{wctype.h}. @deftypefun wctrans_t wctrans (const char *@var{property}) The @code{wctrans} function has to be used to find out whether a named mapping is defined in the current locale selected for the -@code{LC_CTYPE} category. If the returned value is non-zero it can -afterwards be used in calls to @code{towctrans}. If the return value is +@code{LC_CTYPE} category. If the returned value is non-zero, you can use +it afterwards in calls to @code{towctrans}. If the return value is zero no such mapping is known in the current locale. Beside locale-specific mappings there are two mappings which are @@ -707,15 +706,15 @@ guaranteed to be available in every locale: @pindex wctype.h @noindent -This function is declared in @file{wctype.h}. +These functions are declared in @file{wctype.h}. @end deftypefun @comment wctype.h @comment ISO @deftypefun wint_t towctrans (wint_t @var{wc}, wctrans_t @var{desc}) -The @code{towctrans} function maps the input character @var{wc} -according to the rules of the mapping for which @var{desc} is an -descriptor and returns the value so found. The @var{desc} value must be +@code{towctrans} maps the input character @var{wc} +according to the rules of the mapping for which @var{desc} is a +descriptor, and returns the value it finds. @var{desc} must be obtained by a successful call to @code{wctrans}. @pindex wctype.h @@ -723,8 +722,8 @@ obtained by a successful call to @code{wctrans}. This function is declared in @file{wctype.h}. 
@end deftypefun -The @w{ISO C} standard also defines for the generally available mappings -convenient shortcuts so that it is not necesary to call @code{wctrans} +For the generally available mappings, the @w{ISO C} standard defines +convenient shortcuts so that it is not necessary to call @code{wctrans} for them. @comment wctype.h @@ -765,6 +764,6 @@ This function is declared in @file{wctype.h}. @end deftypefun The same warnings given in the last section for the use of the wide -character classification function applies here. It is not possible to +character classification functions apply here. It is not possible to simply cast a @code{char} type value to a @code{wint_t} and use it as an -argument for @code{towctrans} calls. +argument to @code{towctrans} calls. diff --git a/manual/locale.texi b/manual/locale.texi index 6cfacbdb8c..096ac48105 100644 --- a/manual/locale.texi +++ b/manual/locale.texi @@ -99,7 +99,7 @@ most of Spain. The set of locales supported depends on the operating system you are using, and so do their names. We can't make any promises about what locales will exist, except for one standard locale called @samp{C} or -@samp{POSIX}. Later we will describe how to construct locales XXX. +@samp{POSIX}. Later we will describe how to construct locales. @comment (@pxref{Building Locale Files}). @cindex combining locales @@ -183,12 +183,12 @@ to use for all purposes except as overridden by the variables above. @vindex LANGUAGE When developing the message translation functions it was felt that the -functionality provided by the variables above is not sufficient. E.g., it -should be possible to specify more than one locale name. For an example -take a Swedish user who better speaks German than English, the programs -messages by default are written in English. Then it should be possible -to specify that the first choice for the language is Swedish, the second -choice is German, and if this also fails English is used. 
This is +functionality provided by the variables above is not sufficient. For +example, it should be possible to specify more than one locale name. +Take a Swedish user who better speaks German than English, and a program +whose messages are output in English by default. It should be possible +to specify that the first choice of language is Swedish, the second +German, and if this also fails to use English. This is possible with the variable @code{LANGUAGE}. For further description of this GNU extension see @ref{Using gettextized software}. @@ -226,7 +226,7 @@ category @var{category} to @var{locale}. If @var{category} is @code{LC_ALL}, this specifies the locale for all purposes. The other possible values of @var{category} specify an -individual purpose (@pxref{Locale Categories}). +single purpose (@pxref{Locale Categories}). You can also use this function to find out the current locale by passing a null pointer as the @var{locale} argument. In this case, @@ -250,19 +250,19 @@ don't make any promises about what it looks like. But if you specify the same ``locale name'' with @code{LC_ALL} in a subsequent call to @code{setlocale}, it restores the same combination of locale selections. -To ensure to be able to use the string encoding the currently selected -locale at a later time one has to make a copy of the string. It is not -guaranteed that the return value stays valid all the time. +To be sure you can use the returned string encoding the currently selected +locale at a later time, you must make a copy of the string. It is not +guaranteed that the returned pointer remains valid over time. When the @var{locale} argument is not a null pointer, the string returned -by @code{setlocale} reflects the newly modified locale. +by @code{setlocale} reflects the newly-modified locale. If you specify an empty string for @var{locale}, this means to read the appropriate environment variable and use its value to select the locale for @var{category}. 
-If a nonempty string is given for @var{locale} the locale with this name -is used, if this is possible. +If a nonempty string is given for @var{locale}, then the locale of that +name is used if possible. If you specify an invalid locale name, @code{setlocale} returns a null pointer and leaves the current locale unchanged. @@ -303,7 +303,7 @@ with_other_locale (char *new_locale, @end smallexample @strong{Portability Note:} Some @w{ISO C} systems may define additional -locale categories and future versions of the library will do so. For +locale categories, and future versions of the library will do so. For portability, assume that any symbol beginning with @samp{LC_} might be defined in @file{locale.h}. @@ -332,7 +332,7 @@ Defining and installing named locales is normally a responsibility of the system administrator at your site (or the person who installed the GNU C library). It is also possible for the user to create private locales. All this will be discussed later when describing the tool to -do so XXX. +do so. @comment (@pxref{Building Locale Files}). If your program needs to use something other than the @samp{C} locale, @@ -342,27 +342,27 @@ locale explicitly by name. Remember, different machines might have different sets of locales installed. @node Locale Information, Formatting Numbers, Standard Locales, Locales -@section Accessing the Locale Information +@section Accessing Locale Information -There are several ways to access the locale information. The simplest +There are several ways to access locale information. The simplest way is to let the C library itself do the work. Several of the -functions in this library access implicitly the locale data and use -what information is available in the currently selected locale. This is +functions in this library implicitly access the locale data, and use +what information is provided by the currently selected locale. This is how the locale model is meant to work normally. 
-As an example take the @code{strftime} function which is meant to nicely +As an example take the @code{strftime} function, which is meant to nicely format date and time information (@pxref{Formatting Date and Time}). Part of the standard information contained in the @code{LC_TIME} -category are, e.g., the names of the months. Instead of requiring the +category is the names of the months. Instead of requiring the programmer to take care of providing the translations the -@code{strftime} function does this all by itself. When using @code{%A} -in the format string this will be replaced by the appropriate weekday -name of the locale currently selected for @code{LC_TIME}. This is the -easy part and wherever possible functions do things automatically as in -this case. - -But there are quite often situations when there is simply no functions -to perform the task or it is simply not possible to do the work +@code{strftime} function does this all by itself. @code{%A} +in the format string is replaced by the appropriate weekday +name of the locale currently selected by @code{LC_TIME}. This is an +easy example, and wherever possible functions do things automatically +in this way. + +But there are quite often situations when there is simply no function +to perform the task, or it is simply not possible to do the work automatically. For these cases it is necessary to access the information in the locale directly. To do this the C library provides two functions: @code{localeconv} and @code{nl_langinfo}. The former is @@ -379,14 +379,13 @@ as far as the system follows the Unix standards. @subsection @code{localeconv}: It is portable but @dots{} Together with the @code{setlocale} function the @w{ISO C} people -invented @code{localeconv} function. It is a masterpiece of misdesign. -It is expensive to use, it is not extendable, and is not generally -usable as it provides access only to the @code{LC_MONETARY} and -@code{LC_NUMERIC} related information. 
If it is applicable for a -certain situation it should nevertheless be used since it is very -portable. In general it is better to use the function @code{strfmon} -which can be used to format monetary amounts correctly according to the -selected locale by implicitly using this information. +invented the @code{localeconv} function. It is a masterpiece of poor +design. It is expensive to use, not extendable, and not generally +usable as it provides access to only @code{LC_MONETARY} and +@code{LC_NUMERIC} related information. Nevertheless, if it is +applicable to a given situation it should be used since it is very +portable. The function @code{strfmon} formats monetary amounts +according to the selected locale using this information. @pindex locale.h @cindex monetary value formatting @cindex numeric value formatting @@ -407,8 +406,8 @@ value. @comment locale.h @comment ISO @deftp {Data Type} {struct lconv} -This is the data type of the value returned by @code{localeconv}. Its -elements are described in the following subsections. +@code{localeconv}'s return value is of this data type. Its elements are +described in the following subsections. @end deftp If a member of the structure @code{struct lconv} has type @code{char}, @@ -487,7 +486,7 @@ members have the same value.) In the standard @samp{C} locale, both of these members have the value @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say -what to do when you find this the value; we recommend printing no +what to do when you find this value; we recommend printing no fractional digits. (This locale also specifies the empty string for @code{mon_decimal_point}, so printing any fractional digits would be confusing!) @@ -521,8 +520,8 @@ The local currency symbol for the selected locale. In the standard @samp{C} locale, this member has a value of @code{""} (the empty string), meaning ``unspecified''. 
The ISO standard doesn't say what to do when you find this value; we recommend you simply print -the empty string as you would print any other string found in the -appropriate member. +the empty string as you would print any other string pointed to by this +variable. @item char *int_curr_symbol The international currency symbol for the selected locale. @@ -533,9 +532,9 @@ three-letter abbreviation determined by the international standard followed by a one-character separator (often a space). In the standard @samp{C} locale, this member has a value of @code{""} -(the empty string), meaning ``unspecified''. We recommend you simply -print the empty string as you would print any other string found in the -appropriate member. +(the empty string), meaning ``unspecified''. We recommend you simply print +the empty string as you would print any other string pointed to by this +variable. @item char p_cs_precedes @itemx char n_cs_precedes @@ -547,8 +546,8 @@ negative amounts. In the standard @samp{C} locale, both of these members have a value of @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say -what to do when you find this value, but we recommend printing the -currency symbol before the amount. That's right for most countries. +what to do when you find this value. We recommend printing the +currency symbol before the amount, which is right for most countries. In other words, treat all nonzero values alike in these members. The POSIX standard says that these two members apply to the @@ -573,7 +572,7 @@ negative amounts. In the standard @samp{C} locale, both of these members have a value of @code{CHAR_MAX}, meaning ``unspecified''. The ISO standard doesn't say what you should do when you find this value; we suggest you treat it as -one (print a space). In other words, treat all nonzero values alike in +1 (print a space). In other words, treat all nonzero values alike in these members. These members apply only to @code{currency_symbol}. 
When you use @@ -581,7 +580,7 @@ These members apply only to @code{currency_symbol}. When you use @code{int_curr_symbol} itself contains the appropriate separator. The POSIX standard says that these two members apply to the -@code{int_curr_symbol} as well as the @code{currency_symbol}. But an +@code{int_curr_symbol} as well as the @code{currency_symbol}. However, an example in the @w{ISO C} standard clearly implies that they should apply only to the @code{currency_symbol}---that the @code{int_curr_symbol} contains any appropriate separator, so you should never print an @@ -592,16 +591,16 @@ printing international currency symbols, and print no extra space. @end table @node Sign of Money Amount, , Currency Symbol, The Lame Way to Locale Data -@subsubsection Printing the Sign of an Amount of Money +@subsubsection Printing the Sign of a Monetary Amount These members of the @code{struct lconv} structure specify how to print -the sign (if any) in a monetary value. +the sign (if any) of a monetary value. @table @code @item char *positive_sign @itemx char *negative_sign These are strings used to indicate positive (or zero) and negative -(respectively) monetary quantities. +monetary quantities, respectively. In the standard @samp{C} locale, both of these members have a value of @code{""} (the empty string), meaning ``unspecified''. @@ -615,7 +614,7 @@ unreasonable.) @item char p_sign_posn @itemx char n_sign_posn -These members have values that are small integers indicating how to +These members are small integers that indicate how to position the sign for nonnegative and negative monetary quantities, respectively. (The string used by the sign is what was specified with @code{positive_sign} or @code{negative_sign}.) The possible values are @@ -650,36 +649,35 @@ symbol. It is not clear whether you should let these members apply to the international currency format or not. 
POSIX says you should, but intuition plus the examples in the @w{ISO C} standard suggest you should -not. We hope that someone who knows well the conventions for formatting -monetary quantities will tell us what we should recommend. +not. We hope that someone who knows the conventions for formatting +monetary quantities well will tell us what we should recommend. @node The Elegant and Fast Way, , The Lame Way to Locale Data, Locale Information @subsection Pinpoint Access to Locale Data When writing the X/Open Portability Guide the authors realized that the @code{localeconv} function is not enough to provide reasonable access to -the locale information. The information which was meant to be available +locale information. The information which was meant to be available in the locale (as later specified in the POSIX.1 standard) requires more -possibilities to access it. Therefore the @code{nl_langinfo} function +ways to access it. Therefore the @code{nl_langinfo} function was introduced. @comment langinfo.h @comment XOPEN @deftypefun {char *} nl_langinfo (nl_item @var{item}) The @code{nl_langinfo} function can be used to access individual -elements of the locale categories. I.e., unlike the @code{localeconv} -function which always returns all the information @code{nl_langinfo} -lets the caller select what information is necessary. This is very -fast and it is no problem to call this function multiple times. +elements of the locale categories. Unlike the @code{localeconv} +function, which returns all the information, @code{nl_langinfo} +lets the caller select what information it requires. This is very +fast and it is not a problem to call this function multiple times. -The second advantage is that not only the numeric and monetary -formatting information is available. Also the information of the +A second advantage is that in addition to the numeric and monetary +formatting information, information from the @code{LC_TIME} and @code{LC_MESSAGES} categories is available. 
-The type @code{nl_type} is defined in @file{nl_types.h}. -The argument @var{item} is a numeric values which must be one of the -values defined in the header @file{langinfo.h}. The X/Open standard -defines the following values: +The type @code{nl_type} is defined in @file{nl_types.h}. The argument +@var{item} is a numeric value defined in the header @file{langinfo.h}. +The X/Open standard defines the following values: @vtable @code @item ABDAY_1 @@ -698,7 +696,7 @@ corresponds to Sunday. @itemx DAY_5 @itemx DAY_6 @itemx DAY_7 -Similar to @code{ABDAY_1} etc, but here the return value is the +Similar to @code{ABDAY_1} etc., but here the return value is the unabbreviated weekday name. @item ABMON_1 @itemx ABMON_2 @@ -712,7 +710,7 @@ unabbreviated weekday name. @itemx ABMON_10 @itemx ABMON_11 @itemx ABMON_12 -The return value is abbreviated name for the month names. @code{ABMON_1} +The return value is abbreviated name of the month. @code{ABMON_1} corresponds to January. @item MON_1 @itemx MON_2 @@ -726,129 +724,127 @@ corresponds to January. @itemx MON_10 @itemx MON_11 @itemx MON_12 -Similar to @code{ABMON_1} etc but here the month names are not abbreviated. +Similar to @code{ABMON_1} etc., but here the month names are not abbreviated. Here the first value @code{MON_1} also corresponds to January. @item AM_STR @itemx PM_STR -The return values are strings which can be used in the time representation -which uses to American 1 to 12 hours plus am/pm representation. +The return values are strings which can be used in the representation of time +as an hour from 1 to 12 plus an am/pm specifier. -Please note that in locales which do not know this time representation -these strings actually might be empty and therefore the am/pm format +Note that in locales which do not use this time representation +these strings might be empty, in which case the am/pm format cannot be used at all. 
@item D_T_FMT The return value can be used as a format string for @code{strftime} to -represent time and date in a locale specific way. +represent time and date in a locale-specific way. @item D_FMT The return value can be used as a format string for @code{strftime} to -represent a date in a locale specific way. +represent a date in a locale-specific way. @item T_FMT The return value can be used as a format string for @code{strftime} to -represent time in a locale specific way. +represent time in a locale-specific way. @item T_FMT_AMPM The return value can be used as a format string for @code{strftime} to -represent time using the American-style am/pm format. +represent time in the am/pm format. -Please note that if the am/pm format does not make any sense for the -selected locale the returned value might be the same as the one for +Note that if the am/pm format does not make any sense for the +selected locale, the return value might be the same as the one for @code{T_FMT}. @item ERA -The return value is value representing the eras of time used in the -current locale. - -Most locales do not define this value. An example for a locale which -does define this value is the Japanese. Here the traditional data -representation is based on the eras measured by the reigns of the -emperors. - -Normally it should not be necessary to use this value directly. Using -the @code{E} modifier for its formats the @code{strftime} functions can -be made to use this information. The format of the returned string -is not specified and therefore one should not generalize the knowledge -about the representation on one system. +The return value represents the era used in the current locale. + +Most locales do not define this value. An example of a locale which +does define this value is the Japanese one. In Japan, the traditional +representation of dates includes the name of the era corresponding to +the then-emperor's reign. 
+ +Normally it should not be necessary to use this value directly. +Specifying the @code{E} modifier in their format strings causes the +@code{strftime} functions to use this information. The format of the +returned string is not specified, and therefore you should not assume +knowledge of it on different systems. @item ERA_YEAR -The return value describes the name years for the eras of this locale. +The return value gives the year in the relevant era of the locale. As for @code{ERA} it should not be necessary to use this value directly. @item ERA_D_T_FMT This return value can be used as a format string for @code{strftime} to -represent time and date using the era representation in a locale -specific way. +represent dates and times in a locale-specific era-based way. @item ERA_D_FMT This return value can be used as a format string for @code{strftime} to -represent a date using the era representation in a locale specific way. +represent a date in a locale-specific era-based way. @item ERA_T_FMT This return value can be used as a format string for @code{strftime} to -represent time using the era representation in a locale specific way. +represent time in a locale-specific era-based way. @item ALT_DIGITS The return value is a representation of up to @math{100} values used to represent the values @math{0} to @math{99}. As for @code{ERA} this value is not intended to be used directly, but instead indirectly through the @code{strftime} function. When the modifier @code{O} is -used for format which would use numerals to represent hours, minutes, -seconds, weekdays, months, or weeks the appropriate value for this -locale values is used instead of the number. +used in a format which would otherwise use numerals to represent hours, +minutes, seconds, weekdays, months, or weeks, the appropriate value for +the locale is used instead. 
@item INT_CURR_SYMBOL -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{int_curr_symbol} element of the @code{struct lconv}. @item CURRENCY_SYMBOL @itemx CRNCYSTR -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{currency_symbol} element of the @code{struct lconv}. -@code{CRNCYSTR} is a deprecated alias, still required by Unix98. +@code{CRNCYSTR} is a deprecated alias still required by Unix98. @item MON_DECIMAL_POINT -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{mon_decimal_point} element of the @code{struct lconv}. @item MON_THOUSANDS_SEP -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{mon_thousands_sep} element of the @code{struct lconv}. @item MON_GROUPING -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{mon_grouping} element of the @code{struct lconv}. @item POSITIVE_SIGN -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{positive_sign} element of the @code{struct lconv}. @item NEGATIVE_SIGN -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{negative_sign} element of the @code{struct lconv}. @item INT_FRAC_DIGITS -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{int_frac_digits} element of the @code{struct lconv}. @item FRAC_DIGITS -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{frac_digits} element of the @code{struct lconv}. 
@item P_CS_PRECEDES -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{p_cs_precedes} element of the @code{struct lconv}. @item P_SEP_BY_SPACE -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{p_sep_by_space} element of the @code{struct lconv}. @item N_CS_PRECEDES -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{n_cs_precedes} element of the @code{struct lconv}. @item N_SEP_BY_SPACE -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{n_sep_by_space} element of the @code{struct lconv}. @item P_SIGN_POSN -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{p_sign_posn} element of the @code{struct lconv}. @item N_SIGN_POSN -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{n_sign_posn} element of the @code{struct lconv}. @item DECIMAL_POINT @itemx RADIXCHAR -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{decimal_point} element of the @code{struct lconv}. The name @code{RADIXCHAR} is a deprecated alias still used in Unix98. @item THOUSANDS_SEP @itemx THOUSEP -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{thousands_sep} element of the @code{struct lconv}. The name @code{THOUSEP} is a deprecated alias still used in Unix98. @item GROUPING -This value is the same as returned by @code{localeconv} in the +The same as the value returned by @code{localeconv} in the @code{grouping} element of the @code{struct lconv}. 
@item YESEXPR The return value is a regular expression which can be used with the @@ -859,37 +855,37 @@ The return value is a regular expression which can be used with the @code{regex} function to recognize a negative response to a yes/no question. @item YESSTR -The return value is a locale specific translation of the positive response +The return value is a locale-specific translation of the positive response to a yes/no question. Using this value is deprecated since it is a very special case of -message translation and this better can be handled using the message +message translation, and is better handled by the message translation functions (@pxref{Message Translation}). @item NOSTR -The return value is a locale specific translation of the negative response +The return value is a locale-specific translation of the negative response to a yes/no question. What is said for @code{YESSTR} is also true here. @end vtable The file @file{langinfo.h} defines a lot more symbols but none of them -is official. Using them is completely unportable and the format of the -return values might change. Therefore it is highly requested to not use -them in any situation. - -Please note that the return value for any valid argument can be used for -in all situations (with the possible exception of the am/pm time format -related values). If the user has not selected any locale for the -appropriate category @code{nl_langinfo} returns the information from the +is official. Using them is not portable, and the format of the +return values might change. Therefore we recommended you not use +them. + +Note that the return value for any valid argument can be used for +in all situations (with the possible exception of the am/pm time formatting +codes). If the user has not selected any locale for the +appropriate category, @code{nl_langinfo} returns the information from the @code{"C"} locale. It is therefore possible to use this function as shown in the example below. 
-If the argument @var{item} is not valid the global variable @var{errno} +If the argument @var{item} is not valid, the global variable @var{errno} is set to @code{EINVAL} and a @code{NULL} pointer is returned. @end deftypefun -An example for the use of @code{nl_langinfo} is a function which has to -print a given date and time in the locale specific way. At first one -might think the since @code{strftime} internally uses the locale -information writing something like the following is enough: +An example of @code{nl_langinfo} usage is a function which has to +print a given date and time in a locale-specific way. At first one +might think that, since @code{strftime} internally uses the locale +information, writing something like the following is enough: @smallexample size_t @@ -913,37 +909,37 @@ i18n_time_n_data (char *s, size_t len, const struct tm *tp) @} @end smallexample -Now the date and time format which is explicitly selected for the locale -in place when the program runs is used. If the user selects the locale +Now it uses the date and time format of the locale +selected when the program runs. If the user selects the locale correctly there should never be a misunderstanding over the time and date format. -@node Formatting Numbers, , Locale Information, Locales +@node Formatting Numbers, Locale Information, Locales @section A dedicated function to format numbers We have seen that the structure returned by @code{localeconv} as well as -the values given to @code{nl_langinfo} allow to retrieve the various -pieces of locale specific information to format numbers and monetary -amounts. But we have also seen that the rules underlying this -information are quite complex. +the values given to @code{nl_langinfo} allow you to retrieve the various +pieces of locale-specific information to format numbers and monetary +amounts. We have also seen that the underlying rules are quite complex. 
-Therefore the X/Open standards introduce a function which uses this -information from the locale and so makes it is for the user to format +Therefore the X/Open standards introduce a function which uses such +locale information, making it easier for the user to format numbers according to these rules. @deftypefun ssize_t strfmon (char *@var{s}, size_t @var{maxsize}, const char *@var{format}, @dots{}) The @code{strfmon} function is similar to the @code{strftime} function -in that it takes a description of a buffer (with size), a format string -and values to write into a buffer a textual representation of the values -according to the format string. As for @code{strftime} the function +in that it takes a buffer, its size, a format string, +and values to write into the buffer as text in a form specified +by the format string. Like @code{strftime}, the function also returns the number of bytes written into the buffer. -There are two difference: @code{strfmon} can take more than one argument -and of course the format specification is different. The format string -consists as for @code{strftime} of normal text which is simply printed -and format specifiers, which here are also introduced using @samp{%}. -Following the @samp{%} the function allows similar to @code{printf} a -sequence of flags and other specifications before the format character: +There are two differences: @code{strfmon} can take more than one +argument, and, of course, the format specification is different. Like +@code{strftime}, the format string consists of normal text, which is +output as is, and format specifiers, which are indicated by a @samp{%}. +Immediately after the @samp{%}, you can optionally specify various flags +and formatting information before the main formatting character, in a +similar way to @code{printf}: @itemize @bullet @item @@ -956,77 +952,74 @@ fill character. By default this character is a space character. 
Filling with this character is only performed if a left precision is specified. It is not just to fill to the given field width. @item @samp{^} -The number is printed without grouping the digits using the rules of the -current locale. By default grouping is enabled. +The number is printed without grouping the digits according to the rules +of the current locale. By default grouping is enabled. @item @samp{+}, @samp{(} -At most one of these flags must be used. They select which format to -represent the sign of currency amount is used. By default and if -@samp{+} is used the locale equivalent to @math{+}/@math{-} is used. If -@samp{(} is used negative amounts are enclosed in parentheses. The +At most one of these flags can be used. They select which format to +represent the sign of a currency amount. By default, and if +@samp{+} is given, the locale equivalent of @math{+}/@math{-} is used. If +@samp{(} is given, negative amounts are enclosed in parentheses. The exact format is determined by the values of the @code{LC_MONETARY} category of the locale selected at program runtime. @item @samp{!} The output will not contain the currency symbol. @item @samp{-} -The output will be formatted right-justified instead left-justified if -the output does not fill the entire field width. +The output will be formatted left-justified instead of right-justified if +it does not fill the entire field width. @end table @end itemize -The next part of a specification is an, again optional, specification of -the field width. The width is given by digits following the flags. If -no width is specified it is assumed to be @math{0}. The width value is -used after it is determined how much space the printed result needs. If -it does not require fewer characters than specified by the width value -nothing happens. Otherwise the output is extended to use as many -characters as the width says by filling with spaces. At which side -depends on whether the @samp{-} flag was given or not. 
If it was given, -the spaces are added at the right, making the output right-justified and -vice versa. - -So far the format looks familiar as it is similar to @code{printf} or -@code{strftime} formats. But the next two fields introduce something -new. The first one, if available, is introduced by a @samp{#} character -which is followed by a decimal digit string. The value of the digit -string specifies the width the formatted digits left to the radix -character. This does @emph{not} include the grouping character needed -if the @samp{^} flag is not given. If the space needed to print the -number does not fill the whole width the field is padded at the left -side with the fill character which can be selected using the @samp{=} -flag and which by default is a space. For example, if the field width -is selected as 6 and the number is @math{123}, the fill character is -@samp{*} the result will be @samp{***123}. - -The next field is introduced by a @samp{.} (period) and consists of -another decimal digit string. Its value describes the number of -characters printed after the radix character. The default is -selected from the current locale (@code{frac_digits}, -@code{int_frac_digits}, see @pxref{General Numeric}). If the exact -representation needs more digits than those specified by the field width -the displayed value is rounded. In case the number of fractional digits -is selected to be zero, no radix character is printed. - -As a GNU extension the @code{strfmon} implementation in the GNU libc -allows as the next field an optional @samp{L} as a format modifier. If -this modifier is given the argument is expected to be a @code{long -double} instead of a @code{double} value. - -Finally as the last component of the format there must come a format -specifying. There are three specifiers defined: +The next part of a specification is an optional field width. If no +width is specified @math{0} is taken. 
During output, the function first +determines how much space is required. If it requires at least as many +characters as given by the field width, it is output using as much space +as necessary. Otherwise, it is extended to use the full width by +filling with the space character. The presence or absence of the +@samp{-} flag determines the side at which such padding occurs. If +present, the spaces are added at the right making the output +left-justified, and vice versa. + +So far the format looks familiar, being similar to the @code{printf} and +@code{strftime} formats. However, the next two optional fields +introduce something new. The first one is a @samp{#} character followed +by a decimal digit string. The value of the digit string specifies the +number of @emph{digit} positions to the left of the decimal point (or +equivalent). This does @emph{not} include the grouping character when +the @samp{^} flag is not given. If the space needed to print the number +does not fill the whole width, the field is padded at the left side with +the fill character, which can be selected using the @samp{=} flag and by +default is a space. For example, if the field width is selected as 6 +and the number is @math{123}, the fill character is @samp{*} the result +will be @samp{***123}. + +The second optional field starts with a @samp{.} (period) and consists +of another decimal digit string. Its value describes the number of +characters printed after the decimal point. The default is selected +from the current locale (@code{frac_digits}, @code{int_frac_digits}, see +@pxref{General Numeric}). If the exact representation needs more digits +than given by the field width, the displayed value is rounded. If the +number of fractional digits is selected to be zero, no decimal point is +printed. + +As a GNU extension, the @code{strfmon} implementation in the GNU libc +allows an optional @samp{L} next as a format modifier. 
If this modifier +is given, the argument is expected to be a @code{long double} instead of +a @code{double} value. + +Finally, the last component is a format specifier. There are three +specifiers defined: @table @asis @item @samp{i} -The argument is formatted according to the locale's rules to format an -international currency value. +Use the locale's rules for formatting an international currency value. @item @samp{n} -The argument is formatted according to the locale's rules to format an -national currency value. +Use the locale's rules for formatting a national currency value. @item @samp{%} -Creates a @samp{%} in the output. There must be no flag, width +Place a @samp{%} in the output. There must be no flag, width specifier or modifier given, only @samp{%%} is allowed. @end table -As it is done for @code{printf}, the function reads the format string +As for @code{printf}, the function reads the format string from left to right and uses the values passed to the function following the format string. The values are expected to be either of type @code{double} or @code{long double}, depending on the presence of the @@ -1034,15 +1027,15 @@ modifier @samp{L}. The result is stored in the buffer pointed to by @var{s}. At most @var{maxsize} characters are stored. The return value of the function is the number of characters stored in -@var{s}, including the terminating NUL byte. If the number of -characters stored would exceed @var{maxsize} the function returns +@var{s}, including the terminating @code{NULL} byte. If the number of +characters stored would exceed @var{maxsize}, the function returns @math{-1} and the content of the buffer @var{s} is unspecified. In this case @code{errno} is set to @code{E2BIG}. @end deftypefun -A few examples should make it clear how to use this function. It is +A few examples should make clear how the function works. 
It is assumed that all the following pieces of code are executed in a program -which uses the locale valid for the USA (@code{en_US}). The simplest +which uses the USA locale (@code{en_US}). The simplest form of the format is this: @smallexample @@ -1055,15 +1048,15 @@ The output produced is "@@$123.45@@-$567.89@@$12,345.68@@" @end smallexample -We can notice several things here. First, the width for all formats is -different. We have not specified a width in the format string and so -this is no wonder. Second, the third number is printed using thousands -separators. The thousands separator for the @code{en_US} locale is a -comma. Beside this the number is rounded. The @math{.678} are rounded -to @math{.68} since the format does not specify a precision and the -default value in the locale is @math{2}. A last thing is that the -national currency symbol is printed since @samp{%n} was used, not -@samp{i}. The next example shows how we can align the output. +We can notice several things here. First, the widths of the output +numbers are different. We have not specified a width in the format +string, and so this is no wonder. Second, the third number is printed +using thousands separators. The thousands separator for the +@code{en_US} locale is a comma. The number is also rounded. +@math{.678} is rounded to @math{.68} since the format does not specify a +precision and the default value in the locale is @math{2}. Finally, +note that the national currency symbol is printed since @samp{%n} was +used, not @samp{i}. The next example shows how we can align the output. @smallexample strfmon (buf, 100, "@@%=*11n@@%=*11n@@%=*11n@@", 123.45, -567.89, 12345.678); @@ -1076,13 +1069,13 @@ The output this time is: "@@ $123.45@@ -$567.89@@ $12,345.68@@" @end smallexample -Two things stand out. First, all fields have the same width (eleven +Two things stand out. 
Firstly, all fields have the same width (eleven characters) since this is the width given in the format and since no number required more characters to be printed. The second important point is that the fill character is not used. This is correct since the -white space was not used to fill the space specified by the right -precision, but instead it is used to fill to the given width. The -difference becomes obvious if we now add a right width specification. +white space was not used to achieve a precision given by a @samp{#} +modifier, but instead to fill to the given width. The difference +becomes obvious if we now add a width specification. @smallexample strfmon (buf, 100, "@@%=*11#5n@@%=*11#5n@@%=*11#5n@@", @@ -1096,14 +1089,14 @@ The output is "@@ $***123.45@@-$***567.89@@ $12,456.68@@" @end smallexample -Here we can see that all the currency symbols are now aligned and the -space between the currency sign and the number is filled with the -selected fill character. Please note that although the right precision -is selected to be @math{5} and @math{123.45} has three characters right -of the radix character, the space is filled with three asterisks. This -is correct since as explained above, the right precision does not count -the characters used for the thousands separators in. One last example -should explain the remaining functionality. +Here we can see that all the currency symbols are now aligned, and that +the space between the currency sign and the number is filled with the +selected fill character. Note that although the width is selected to be +@math{5} and @math{123.45} has three digits left of the decimal point, +the space is filled with three asterisks. This is correct since, as +explained above, the width does not include the positions used to store +thousands separators. One last example should explain the remaining +functionality. 
@smallexample strfmon (buf, 100, "@@%=0(16#5.3i@@%=0(16#5.3i@@%=0(16#5.3i@@", @@ -1117,14 +1110,15 @@ This rather complex format string produces the following output: "@@ USD 000123,450 @@(USD 000567.890)@@ USD 12,345.678 @@" @end smallexample -The most noticeable change is the use of the alternative style to -represent negative numbers. In financial circles it is often done using -parentheses and this is what the @samp{(} flag selected. The fill character -is now @samp{0}. Please note that this @samp{0} character is not -regarded as a numeric zero and therefore the first and second number are -not printed using a thousands separator. Since we use in the format the -specifier @samp{i} instead of @samp{n} now the international form of the +The most noticeable change is the alternative way of representing +negative numbers. In financial circles this is often done using +parentheses, and this is what the @samp{(} flag selected. The fill +character is now @samp{0}. Note that this @samp{0} character is not +regarded as a numeric zero, and therefore the first and second numbers +are not printed using a thousands separator. Since we used the format +specifier @samp{i} instead of @samp{n}, the international form of the currency symbol is used. This is a four letter string, in this case -@code{"USD "}. The last point is that since the left precision is -selected to be three the first and second number are printed with an -extra zero at the end and the third number is printed unrounded. +@code{"USD "}. The last point is that since the precision right of the +decimal point is selected to be three, the first and second numbers are +printed with an extra zero at the end and the third number is printed +without rounding.