Diffstat (limited to 'manual')
72 files changed, 40458 insertions, 0 deletions
diff --git a/manual/.cvsignore b/manual/.cvsignore new file mode 100644 index 0000000000..999d5615a7 --- /dev/null +++ b/manual/.cvsignore @@ -0,0 +1,10 @@ +*.gz *.Z *.tar *.tgz +=* +TODO COPYING* AUTHORS copyr-* copying.* +glibc-* + +*.dvi* *.info* *.c.texi +*.toc *.aux *.log +*.cp *.cps *.fn *.fns *.vr *.vrs *.tp *.tps *.ky *.kys *.pg *.pgs + +chapters chapters-incl summary.texi stamp-* diff --git a/manual/=copying.texinfo b/manual/=copying.texinfo new file mode 100644 index 0000000000..6a61d64bfb --- /dev/null +++ b/manual/=copying.texinfo @@ -0,0 +1,540 @@ +@comment This material was copied from /gd/gnu/doc/lgpl.texinfo. + +@node Copying, Concept Index, Maintenance, Top +@appendix GNU GENERAL PUBLIC LICENSE +@center Version 2, June 1991 + +@display +Copyright @copyright{} 1991 Free Software Foundation, Inc. +675 Mass Ave, Cambridge, MA 02139, USA +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. + +[This is the first released version of the library GPL. It is + numbered 2 because it goes with version 2 of the ordinary GPL.] +@end display + +@unnumberedsec Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software---to make sure the software is free for all its users. + + This license, the Library General Public License, applies to some +specially designated Free Software Foundation software, and to any +other libraries whose authors decide to use it. You can use it for +your libraries, too. + + When we speak of free software, we are referring to freedom, not +price. 
Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if +you distribute copies of the library, or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link a program with the library, you must provide +complete object files to the recipients so that they can relink them +with the library, after making changes to the library and recompiling +it. And you must show them these terms so they know their rights. + + Our method of protecting your rights has two steps: (1) copyright +the library, and (2) offer you this license which gives you legal +permission to copy, distribute and/or modify the library. + + Also, for each distributor's protection, we want to make certain +that everyone understands that there is no warranty for this free +library. If the library is modified by someone else and passed on, we +want its recipients to know that what they have is not the original +version, so that any problems introduced by others will not reflect on +the original authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that companies distributing free +software will individually obtain patent licenses, thus in effect +transforming the program into proprietary software. 
To prevent this, +we have made it clear that any patent must be licensed for everyone's +free use or not licensed at all. + + Most GNU software, including some libraries, is covered by the ordinary +GNU General Public License, which was designed for utility programs. This +license, the GNU Library General Public License, applies to certain +designated libraries. This license is quite different from the ordinary +one; be sure to read it in full, and don't assume that anything in it is +the same as in the ordinary license. + + The reason we have a separate public license for some libraries is that +they blur the distinction we usually make between modifying or adding to a +program and simply using it. Linking a program with a library, without +changing the library, is in some sense simply using the library, and is +analogous to running a utility program or application program. However, in +a textual and legal sense, the linked executable is a combined work, a +derivative of the original library, and the ordinary General Public License +treats it as such. + + Because of this blurred distinction, using the ordinary General +Public License for libraries did not effectively promote software +sharing, because most developers did not use the libraries. We +concluded that weaker conditions might promote sharing better. + + However, unrestricted linking of non-free programs would deprive the +users of those programs of all benefit from the free status of the +libraries themselves. This Library General Public License is intended to +permit developers of non-free programs to use free libraries, while +preserving your freedom as a user of such programs to change the free +libraries that are incorporated in them. (We have not seen how to achieve +this as regards changes in header files, but we have achieved it as regards +changes in the actual functions of the Library.) The hope is that this +will lead to faster development of free libraries. 
+ + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +``work based on the library'' and a ``work that uses the library''. The +former contains code derived from the library, while the latter only +works together with the library. + + Note that it is possible for a library to be covered by the ordinary +General Public License rather than by this special one. + +@iftex +@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end iftex +@ifinfo +@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end ifinfo + +@enumerate +@item +This License Agreement applies to any software library which +contains a notice placed by the copyright holder or other authorized +party saying it may be distributed under the terms of this Library +General Public License (also called ``this License''). Each licensee is +addressed as ``you''. + + A ``library'' means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The ``Library'', below, refers to any such software library or work +which has been distributed under these terms. A ``work based on the +Library'' means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term ``modification''.) + + ``Source code'' for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. 
+ + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + +@item +You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. + + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + +@item +You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + +@enumerate a +@item +The modified work must itself be a software library. + +@item +You must cause the files modified to carry prominent notices +stating that you changed the files and the date of any change. + +@item +You must cause the whole of the work to be licensed at no +charge to all third parties under the terms of this License. 
+ +@item +If a facility in the modified Library refers to a function or a +table of data to be supplied by an application program that uses +the facility, other than as an argument passed when the facility +is invoked, then you must make a good faith effort to ensure that, +in the event an application does not supply such function or +table, the facility still operates, and performs whatever part of +its purpose remains meaningful. + +(For example, a function in a library to compute square roots has +a purpose that is entirely well-defined independent of the +application. Therefore, Subsection 2d requires that any +application-supplied function or table used by this function must +be optional: if the application does not supply it, the square +root function must still compute square roots.) +@end enumerate + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. 
+ +@item +You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. + +@item +You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + +@item +A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a ``work that uses the Library''. Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. 
+ + However, linking a ``work that uses the Library'' with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a ``work that uses the +library''. The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a ``work that uses the Library'' uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. + + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + +@item +As an exception to the Sections above, you may also compile or +link a ``work that uses the Library'' with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. 
If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + +@enumerate a +@item +Accompany the work with the complete corresponding +machine-readable source code for the Library including whatever +changes were used in the work (which must be distributed under +Sections 1 and 2 above); and, if the work is an executable linked +with the Library, with the complete machine-readable ``work that +uses the Library'', as object code and/or source code, so that the +user can modify the Library and then relink to produce a modified +executable containing the modified Library. (It is understood +that the user who changes the contents of definitions files in the +Library will not necessarily be able to recompile the application +to use the modified definitions.) + +@item +Accompany the work with a written offer, valid for at +least three years, to give the same user the materials +specified in Subsection 6a, above, for a charge no more +than the cost of performing this distribution. + +@item +If distribution of the work is made by offering access to copy +from a designated place, offer equivalent access to copy the above +specified materials from the same place. + +@item +Verify that the user has already received a copy of these +materials or that you have already sent this user a copy. +@end enumerate + + For an executable, the required form of the ``work that uses the +Library'' must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the source code distributed need not include anything that is normally +distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. 
+ + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. + +@item +You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + +@enumerate a +@item +Accompany the combined library with a copy of the same work +based on the Library, uncombined with any other library +facilities. This must be distributed under the terms of the +Sections above. + +@item +Give prominent notice with the combined library of the fact +that part of it is a work based on the Library, and explaining +where to find the accompanying uncombined form of the same work. +@end enumerate + +@item +You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + +@item +You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. 
Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. + +@item +Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + +@item +If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. 
+ +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + +@item +If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + +@item +The Free Software Foundation may publish revised and/or new +versions of the Library General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +``any later version'', you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. 
+ +@item +If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + +@iftex +@heading NO WARRANTY +@end iftex +@ifinfo +@center NO WARRANTY +@end ifinfo + +@item +BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY ``AS IS'' WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + +@item +IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. 
+@end enumerate + +@iftex +@heading END OF TERMS AND CONDITIONS +@end iftex +@ifinfo +@center END OF TERMS AND CONDITIONS +@end ifinfo + +@page +@unnumberedsec How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +``copyright'' line and a pointer to where the full notice is found. + +@smallexample +@var{one line to give the library's name and a brief idea of what it does.} +Copyright (C) @var{year} @var{name of author} + +This library is free software; you can redistribute it and/or +modify it under the terms of the GNU Library General Public +License as published by the Free Software Foundation; either +version 2 of the License, or (at your option) any later version. + +This library is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +Library General Public License for more details. + +You should have received a copy of the GNU Library General Public +License along with this library; if not, write to the Free +Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +@end smallexample + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a ``copyright disclaimer'' for the library, if +necessary. 
Here is a sample; alter the names: + +@example +Yoyodyne, Inc., hereby disclaims all copyright interest in the +library `Frob' (a library for tweaking knobs) written by James Random Hacker. + +@var{signature of Ty Coon}, 1 April 1990 +Ty Coon, President of Vice +@end example + +That's all there is to it! diff --git a/manual/=float.texinfo b/manual/=float.texinfo new file mode 100644 index 0000000000..a8c901542e --- /dev/null +++ b/manual/=float.texinfo @@ -0,0 +1,416 @@ +@node Floating-Point Limits +@chapter Floating-Point Limits +@pindex <float.h> +@cindex floating-point number representation +@cindex representation of floating-point numbers + +Because floating-point numbers are represented internally as approximate +quantities, algorithms for manipulating floating-point data often need +to be parameterized in terms of the accuracy of the representation. +Some of the functions in the C library itself need this information; for +example, the algorithms for printing and reading floating-point numbers +(@pxref{I/O on Streams}) and for calculating trigonometric and +irrational functions (@pxref{Mathematics}) use information about the +underlying floating-point representation to avoid round-off error and +loss of accuracy. User programs that implement numerical analysis +techniques also often need to be parameterized in this way in order to +minimize or compute error bounds. + +The specific representation of floating-point numbers varies from +machine to machine. The GNU C Library defines a set of parameters which +characterize each of the supported floating-point representations on a +particular system. + +@menu +* Floating-Point Representation:: Definitions of terminology. +* Floating-Point Parameters:: Descriptions of the library facilities. +* IEEE Floating-Point:: An example of a common representation. 
+@end menu + +@node Floating-Point Representation +@section Floating-Point Representation + +This section introduces the terminology used to characterize the +representation of floating-point numbers. + +You are probably already familiar with most of these concepts in terms +of scientific or exponential notation for floating-point numbers. For +example, the number @code{123456.0} could be expressed in exponential +notation as @code{1.23456e+05}, a shorthand notation indicating that the +mantissa @code{1.23456} is multiplied by the base @code{10} raised to +power @code{5}. + +More formally, the internal representation of a floating-point number +can be characterized in terms of the following parameters: + +@itemize @bullet +@item +The @dfn{sign} is either @code{-1} or @code{1}. +@cindex sign (of floating-point number) + +@item +The @dfn{base} or @dfn{radix} for exponentiation; an integer greater +than @code{1}. This is a constant for the particular representation. +@cindex base (of floating-point number) +@cindex radix (of floating-point number) + +@item +The @dfn{exponent} to which the base is raised. The upper and lower +bounds of the exponent value are constants for the particular +representation. +@cindex exponent (of floating-point number) + +Sometimes, in the actual bits representing the floating-point number, +the exponent is @dfn{biased} by adding a constant to it, to make it +always be represented as an unsigned quantity. This is only important +if you have some reason to pick apart the bit fields making up the +floating-point number by hand, which is something for which the GNU +library provides no support. So this is ignored in the discussion that +follows. +@cindex bias, in exponent (of floating-point number) + +@item +The value of the @dfn{mantissa} or @dfn{significand}, which is an +unsigned quantity. +@cindex mantissa (of floating-point number) +@cindex significand (of floating-point number) + +@item +The @dfn{precision} of the mantissa. 
If the base of the representation +is @var{b}, then the precision is the number of base-@var{b} digits in +the mantissa. This is a constant for the particular representation. + +Many floating-point representations have an implicit @dfn{hidden bit} in +the mantissa. Any such hidden bits are counted in the precision. +Again, the GNU library provides no facilities for dealing with such low-level +aspects of the representation. +@cindex precision (of floating-point number) +@cindex hidden bit, in mantissa (of floating-point number) +@end itemize + +The mantissa of a floating-point number actually represents an implicit +fraction whose denominator is the base raised to the power of the +precision. Since the largest representable mantissa is one less than +this denominator, the value of the fraction is always strictly less than +@code{1}. The mathematical value of a floating-point number is then the +product of this fraction; the sign; and the base raised to the exponent. + +If the floating-point number is @dfn{normalized}, the mantissa is also +greater than or equal to the base raised to the power of one less +than the precision (unless the number represents a floating-point zero, +in which case the mantissa is zero). The fractional quantity is +therefore greater than or equal to @code{1/@var{b}}, where @var{b} is +the base. +@cindex normalized floating-point number + +@node Floating-Point Parameters +@section Floating-Point Parameters + +@strong{Incomplete:} This section needs some more concrete examples +of what these parameters mean and how to use them in a program. + +These macro definitions can be accessed by including the header file +@file{<float.h>} in your program. + +Macro names starting with @samp{FLT_} refer to the @code{float} type, +while names beginning with @samp{DBL_} refer to the @code{double} type +and names beginning with @samp{LDBL_} refer to the @code{long double} +type. 
(In implementations that do not support @code{long double} as +a distinct data type, the values for those constants are the same +as the corresponding constants for the @code{double} type.)@refill + +Note that only @code{FLT_RADIX} is guaranteed to be a constant +expression, so the other macros listed here cannot be reliably used in +places that require constant expressions, such as @samp{#if} +preprocessing directives and array size specifications. + +Although the ANSI C standard specifies minimum and maximum values for +most of these parameters, the GNU C implementation uses whatever +floating-point representations are supported by the underlying hardware. +So whether GNU C actually satisfies the ANSI C requirements depends on +what machine it is running on. + +@comment float.h +@comment ANSI +@defvr Macro FLT_ROUNDS +This value characterizes the rounding mode for floating-point addition. +The following values indicate standard rounding modes: + +@table @code +@item -1 +The mode is indeterminable. +@item 0 +Rounding is towards zero. +@item 1 +Rounding is to the nearest number. +@item 2 +Rounding is towards positive infinity. +@item 3 +Rounding is towards negative infinity. +@end table + +@noindent +Any other value represents a machine-dependent nonstandard rounding +mode. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_RADIX +This is the value of the base, or radix, of exponent representation. +This is guaranteed to be a constant expression, unlike the other macros +described in this section. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{float} data type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{double} data type. 
+@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{long double} data type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_DIG +This is the number of decimal digits of precision for the @code{float} +data type. Technically, if @var{p} and @var{b} are the precision and +base (respectively) for the representation, then the decimal precision +@var{q} is the maximum number of decimal digits such that any floating +point number with @var{q} base 10 digits can be rounded to a floating +point number with @var{p} base @var{b} digits and back again, without +change to the @var{q} decimal digits. + +The value of this macro is guaranteed to be at least @code{6}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_DIG +This is similar to @code{FLT_DIG}, but is for the @code{double} data +type. The value of this macro is guaranteed to be at least @code{10}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_DIG +This is similar to @code{FLT_DIG}, but is for the @code{long double} +data type. The value of this macro is guaranteed to be at least +@code{10}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_MIN_EXP +This is the minimum negative integer such that the mathematical value +@code{FLT_RADIX} raised to this power minus 1 can be represented as a +normalized floating-point number of type @code{float}. In terms of the +actual implementation, this is just the smallest value that can be +represented in the exponent field of the number. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MIN_EXP +This is similar to @code{FLT_MIN_EXP}, but is for the @code{double} data +type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MIN_EXP +This is similar to @code{FLT_MIN_EXP}, but is for the @code{long double} +data type. 
+@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_MIN_10_EXP +This is the minimum negative integer such that the mathematical value +@code{10} raised to this power is within the range of +normalized floating-point numbers of type @code{float}. This is +guaranteed to be no greater than @code{-37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MIN_10_EXP +This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{double} +data type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MIN_10_EXP +This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{long +double} data type. +@end defvr + + + +@comment float.h +@comment ANSI +@defvr Macro FLT_MAX_EXP +This is the maximum integer such that the mathematical value +@code{FLT_RADIX} raised to one less than this power can be represented as a +floating-point number of type @code{float}. In terms of the actual +implementation, this is just the largest value that can be represented +in the exponent field of the number. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MAX_EXP +This is similar to @code{FLT_MAX_EXP}, but is for the @code{double} data +type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MAX_EXP +This is similar to @code{FLT_MAX_EXP}, but is for the @code{long double} +data type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro FLT_MAX_10_EXP +This is the maximum integer such that the mathematical value +@code{10} raised to this power is within the range of +representable floating-point numbers of type @code{float}. This is +guaranteed to be at least @code{37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MAX_10_EXP +This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{double} +data type. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MAX_10_EXP +This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{long +double} data type. 
+@end defvr + + +@comment float.h +@comment ANSI +@defvr Macro FLT_MAX +The value of this macro is the maximum representable floating-point +number of type @code{float}, and is guaranteed to be at least +@code{1E+37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MAX +The value of this macro is the maximum representable floating-point +number of type @code{double}, and is guaranteed to be at least +@code{1E+37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MAX +The value of this macro is the maximum representable floating-point +number of type @code{long double}, and is guaranteed to be at least +@code{1E+37}. +@end defvr + + +@comment float.h +@comment ANSI +@defvr Macro FLT_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{float}, and is +guaranteed to be no more than @code{1E-37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{double}, and +is guaranteed to be no more than @code{1E-37}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{long double}, +and is guaranteed to be no more than @code{1E-37}. +@end defvr + + +@comment float.h +@comment ANSI +@defvr Macro FLT_EPSILON +This is the minimum positive floating-point number of type @code{float} +such that @code{1.0 + FLT_EPSILON != 1.0} is true. It's guaranteed to +be no greater than @code{1E-5}. +@end defvr + +@comment float.h +@comment ANSI +@defvr Macro DBL_EPSILON +This is similar to @code{FLT_EPSILON}, but is for the @code{double} +type. The maximum value is @code{1E-9}. 
+@end defvr + +@comment float.h +@comment ANSI +@defvr Macro LDBL_EPSILON +This is similar to @code{FLT_EPSILON}, but is for the @code{long double} +type. The maximum value is @code{1E-9}. +@end defvr + + + +@node IEEE Floating Point +@section IEEE Floating Point + +Here is an example showing how these parameters work for a common +floating point representation, specified by the @cite{IEEE Standard for +Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985)}. + +The IEEE single-precision float representation uses a base of 2. There +is a sign bit, a mantissa with 23 bits plus one hidden bit (so the total +precision is 24 base-2 digits), and an 8-bit exponent that can represent +values in the range -125 to 128, inclusive. + +So, for an implementation that uses this representation for the +@code{float} data type, appropriate values for the corresponding +parameters are: + +@example +FLT_RADIX 2 +FLT_MANT_DIG 24 +FLT_DIG 6 +FLT_MIN_EXP -125 +FLT_MIN_10_EXP -37 +FLT_MAX_EXP 128 +FLT_MAX_10_EXP +38 +FLT_MIN 1.17549435E-38F +FLT_MAX 3.40282347E+38F +FLT_EPSILON 1.19209290E-07F +@end example + + + diff --git a/manual/=limits.texinfo b/manual/=limits.texinfo new file mode 100644 index 0000000000..3e384dd6b6 --- /dev/null +++ b/manual/=limits.texinfo @@ -0,0 +1,593 @@ +@node Representation Limits, System Configuration Limits, System Information, Top +@chapter Representation Limits + +This chapter contains information about constants and parameters that +characterize the representation of the various integer and +floating-point types supported by the GNU C library. + +@menu +* Integer Representation Limits:: Determining maximum and minimum + representation values of + various integer subtypes. +* Floating-Point Limits :: Parameters which characterize + supported floating-point + representations on a particular + system. 
+@end menu + +@node Integer Representation Limits, Floating-Point Limits , , Representation Limits +@section Integer Representation Limits +@cindex integer representation limits +@cindex representation limits, integer +@cindex limits, integer representation + +Sometimes it is necessary for programs to know about the internal +representation of various integer subtypes. For example, if you want +your program to be careful not to overflow an @code{int} counter +variable, you need to know the largest value that an @code{int} can +represent. These kinds of parameters can vary from +compiler to compiler and machine to machine. Another typical use of +this kind of parameter is in conditionalizing data structure definitions +with @samp{#ifdef} to select the most appropriate integer subtype that +can represent the required range of values. + +Macros representing the minimum and maximum limits of the integer types +are defined in the header file @file{limits.h}. The values of these +macros are all integer constant expressions. +@pindex limits.h + +@comment limits.h +@comment ANSI +@deftypevr Macro int CHAR_BIT +This is the number of bits in a @code{char}, usually eight. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int SCHAR_MIN +This is the minimum value that can be represented by a @code{signed char}. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int SCHAR_MAX +This is the maximum value that can be represented by a @code{signed char}. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int UCHAR_MAX +This is the maximum value that can be represented by an @code{unsigned char}. +(The minimum value of an @code{unsigned char} is zero.) +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int CHAR_MIN +This is the minimum value that can be represented by a @code{char}. +It's equal to @code{SCHAR_MIN} if @code{char} is signed, or zero +otherwise. 
+@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int CHAR_MAX +This is the maximum value that can be represented by a @code{char}. +It's equal to @code{SCHAR_MAX} if @code{char} is signed, or +@code{UCHAR_MAX} otherwise. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int SHRT_MIN +This is the minimum value that can be represented by a @code{signed +short int}. On most machines that the GNU C library runs on, +@code{short} integers are 16-bit quantities. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int SHRT_MAX +This is the maximum value that can be represented by a @code{signed +short int}. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int USHRT_MAX +This is the maximum value that can be represented by an @code{unsigned +short int}. (The minimum value of an @code{unsigned short int} is zero.) +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int INT_MIN +This is the minimum value that can be represented by a @code{signed +int}. On most machines that the GNU C system runs on, an @code{int} is +a 32-bit quantity. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro int INT_MAX +This is the maximum value that can be represented by a @code{signed +int}. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro {unsigned int} UINT_MAX +This is the maximum value that can be represented by an @code{unsigned +int}. (The minimum value of an @code{unsigned int} is zero.) +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro {long int} LONG_MIN +This is the minimum value that can be represented by a @code{signed long +int}. On most machines that the GNU C system runs on, @code{long} +integers are 32-bit quantities, the same size as @code{int}. +@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro {long int} LONG_MAX +This is the maximum value that can be represented by a @code{signed long +int}. 
+@end deftypevr + +@comment limits.h +@comment ANSI +@deftypevr Macro {unsigned long int} ULONG_MAX +This is the maximum value that can be represented by an @code{unsigned +long int}. (The minimum value of an @code{unsigned long int} is zero.) +@end deftypevr + +@strong{Incomplete:} There should be corresponding limits for the GNU +C Compiler's @code{long long} type, too. (But they are not now present +in the header file.) + +The header file @file{limits.h} also defines some additional constants +that parameterize various operating system and file system limits. These +constants are described in @ref{System Parameters} and @ref{File System +Parameters}. +@pindex limits.h + + +@node Floating-Point Limits , , Integer Representation Limits, Representation Limits +@section Floating-Point Limits +@cindex floating-point number representation +@cindex representation, floating-point number +@cindex limits, floating-point representation + +Because floating-point numbers are represented internally as approximate +quantities, algorithms for manipulating floating-point data often need +to be parameterized in terms of the accuracy of the representation. +Some of the functions in the C library itself need this information; for +example, the algorithms for printing and reading floating-point numbers +(@pxref{I/O on Streams}) and for calculating trigonometric and +irrational functions (@pxref{Mathematics}) use information about the +underlying floating-point representation to avoid round-off error and +loss of accuracy. User programs that implement numerical analysis +techniques also often need to be parameterized in this way in order to +minimize or compute error bounds. + +The specific representation of floating-point numbers varies from +machine to machine. The GNU C library defines a set of parameters which +characterize each of the supported floating-point representations on a +particular system. + +@menu +* Floating-Point Representation:: Definitions of terminology. 
+* Floating-Point Parameters:: Descriptions of the library + facilities. +* IEEE Floating Point:: An example of a common + representation. +@end menu + +@node Floating-Point Representation, Floating-Point Parameters, , Floating-Point Limits +@subsection Floating-Point Representation + +This section introduces the terminology used to characterize the +representation of floating-point numbers. + +You are probably already familiar with most of these concepts in terms +of scientific or exponential notation for floating-point numbers. For +example, the number @code{123456.0} could be expressed in exponential +notation as @code{1.23456e+05}, a shorthand notation indicating that the +mantissa @code{1.23456} is multiplied by the base @code{10} raised to +power @code{5}. + +More formally, the internal representation of a floating-point number +can be characterized in terms of the following parameters: + +@itemize @bullet +@item +The @dfn{sign} is either @code{-1} or @code{1}. +@cindex sign (of floating-point number) + +@item +The @dfn{base} or @dfn{radix} for exponentiation; an integer greater +than @code{1}. This is a constant for the particular representation. +@cindex base (of floating-point number) +@cindex radix (of floating-point number) + +@item +The @dfn{exponent} to which the base is raised. The upper and lower +bounds of the exponent value are constants for the particular +representation. +@cindex exponent (of floating-point number) + +Sometimes, in the actual bits representing the floating-point number, +the exponent is @dfn{biased} by adding a constant to it, to make it +always be represented as an unsigned quantity. This is only important +if you have some reason to pick apart the bit fields making up the +floating-point number by hand, which is something for which the GNU +library provides no support. So this is ignored in the discussion that +follows. 
+@cindex bias (of floating-point number exponent) + +@item +The value of the @dfn{mantissa} or @dfn{significand}, which is an +unsigned integer. +@cindex mantissa (of floating-point number) +@cindex significand (of floating-point number) + +@item +The @dfn{precision} of the mantissa. If the base of the representation +is @var{b}, then the precision is the number of base-@var{b} digits in +the mantissa. This is a constant for the particular representation. + +Many floating-point representations have an implicit @dfn{hidden bit} in +the mantissa. Any such hidden bits are counted in the precision. +Again, the GNU library provides no facilities for dealing with such low-level +aspects of the representation. +@cindex precision (of floating-point number) +@cindex hidden bit (of floating-point number mantissa) +@end itemize + +The mantissa of a floating-point number actually represents an implicit +fraction whose denominator is the base raised to the power of the +precision. Since the largest representable mantissa is one less than +this denominator, the value of the fraction is always strictly less than +@code{1}. The mathematical value of a floating-point number is then the +product of this fraction; the sign; and the base raised to the exponent. + +If the floating-point number is @dfn{normalized}, the mantissa is also +greater than or equal to the base raised to the power of one less +than the precision (unless the number represents a floating-point zero, +in which case the mantissa is zero). The fractional quantity is +therefore greater than or equal to @code{1/@var{b}}, where @var{b} is +the base. +@cindex normalized floating-point number + +@node Floating-Point Parameters, IEEE Floating Point, Floating-Point Representation, Floating-Point Limits +@subsection Floating-Point Parameters + +@strong{Incomplete:} This section needs some more concrete examples +of what these parameters mean and how to use them in a program. 
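As one illustration of how these parameters might be used, the following sketch compares two @code{double} values with a tolerance scaled by @code{DBL_EPSILON}. The function name @code{nearly_equal} and the factor of 4 are assumptions made for this example; neither is provided by the library, and a real program should choose a tolerance suited to its own computation.

```c
#include <float.h>

/* Hypothetical helper, not a library function: return nonzero if A
   and B differ by no more than a few units of DBL_EPSILON, scaled
   by the larger of their magnitudes.  The factor 4.0 is an
   arbitrary illustrative tolerance.  */
int
nearly_equal (double a, double b)
{
  double abs_a = a < 0.0 ? -a : a;
  double abs_b = b < 0.0 ? -b : b;
  double scale = abs_a > abs_b ? abs_a : abs_b;
  double diff = a > b ? a - b : b - a;

  return diff <= 4.0 * DBL_EPSILON * scale;
}
```

A caller would typically write @code{nearly_equal (sum, expected)} in place of @code{sum == expected} when both values are the result of rounded arithmetic.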
+ +These macro definitions can be accessed by including the header file +@file{float.h} in your program. +@pindex float.h + +Macro names starting with @samp{FLT_} refer to the @code{float} type, +while names beginning with @samp{DBL_} refer to the @code{double} type +and names beginning with @samp{LDBL_} refer to the @code{long double} +type. (In implementations that do not support @code{long double} as +a distinct data type, the values for those constants are the same +as the corresponding constants for the @code{double} type.)@refill +@cindex @code{float} representation limits +@cindex @code{double} representation limits +@cindex @code{long double} representation limits + +Of these macros, only @code{FLT_RADIX} is guaranteed to be a constant +expression. The other macros listed here cannot be reliably used in +places that require constant expressions, such as @samp{#if} +preprocessing directives or array size specifications. + +Although the ANSI C standard specifies minimum and maximum values for +most of these parameters, the GNU C implementation uses whatever +floating-point representations are supported by the underlying hardware. +So whether GNU C actually satisfies the ANSI C requirements depends on +what machine it is running on. + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_ROUNDS +This value characterizes the rounding mode for floating-point addition. +The following values indicate standard rounding modes: + +@table @code +@item -1 +The mode is indeterminable. +@item 0 +Rounding is towards zero. +@item 1 +Rounding is to the nearest number. +@item 2 +Rounding is towards positive infinity. +@item 3 +Rounding is towards negative infinity. +@end table + +@noindent +Any other value represents a machine-dependent nonstandard rounding +mode. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_RADIX +This is the value of the base, or radix, of exponent representation. 
+This is guaranteed to be a constant expression, unlike the other macros +described in this section. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{float} data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{double} data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating-point +mantissa for the @code{long double} data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_DIG +This is the number of decimal digits of precision for the @code{float} +data type. Technically, if @var{p} and @var{b} are the precision and +base (respectively) for the representation, then the decimal precision +@var{q} is the maximum number of decimal digits such that any floating +point number with @var{q} base 10 digits can be rounded to a floating +point number with @var{p} base @var{b} digits and back again, without +change to the @var{q} decimal digits. + +The value of this macro is guaranteed to be at least @code{6}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_DIG +This is similar to @code{FLT_DIG}, but is for the @code{double} data +type. The value of this macro is guaranteed to be at least @code{10}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_DIG +This is similar to @code{FLT_DIG}, but is for the @code{long double} +data type. The value of this macro is guaranteed to be at least +@code{10}. 
+@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_MIN_EXP +This is the minimum negative integer such that the mathematical value +@code{FLT_RADIX} raised to one less than this power can be represented as a +normalized floating-point number of type @code{float}. In terms of the +actual implementation, this is just the smallest value that can be +represented in the exponent field of the number. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_MIN_EXP +This is similar to @code{FLT_MIN_EXP}, but is for the @code{double} data +type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_MIN_EXP +This is similar to @code{FLT_MIN_EXP}, but is for the @code{long double} +data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_MIN_10_EXP +This is the minimum negative integer such that the mathematical value +@code{10} raised to this power is within the range of +normalized floating-point numbers of type @code{float}. This is +guaranteed to be no greater than @code{-37}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_MIN_10_EXP +This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{double} +data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_MIN_10_EXP +This is similar to @code{FLT_MIN_10_EXP}, but is for the @code{long +double} data type. +@end deftypevr + + + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_MAX_EXP +This is the maximum integer such that the mathematical value +@code{FLT_RADIX} raised to one less than this power can be represented as a +floating-point number of type @code{float}. In terms of the actual +implementation, this is just the largest value that can be represented +in the exponent field of the number. 
+@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_MAX_EXP +This is similar to @code{FLT_MAX_EXP}, but is for the @code{double} data +type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_MAX_EXP +This is similar to @code{FLT_MAX_EXP}, but is for the @code{long double} +data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int FLT_MAX_10_EXP +This is the maximum integer such that the mathematical value +@code{10} raised to this power is within the range of +representable floating-point numbers of type @code{float}. This is +guaranteed to be at least @code{37}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int DBL_MAX_10_EXP +This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{double} +data type. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro int LDBL_MAX_10_EXP +This is similar to @code{FLT_MAX_10_EXP}, but is for the @code{long +double} data type. +@end deftypevr + + +@comment float.h +@comment ANSI +@deftypevr Macro double FLT_MAX +The value of this macro is the maximum representable floating-point +number of type @code{float}, and is guaranteed to be at least +@code{1E+37}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro double DBL_MAX +The value of this macro is the maximum representable floating-point +number of type @code{double}, and is guaranteed to be at least +@code{1E+37}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro {long double} LDBL_MAX +The value of this macro is the maximum representable floating-point +number of type @code{long double}, and is guaranteed to be at least +@code{1E+37}. +@end deftypevr + + +@comment float.h +@comment ANSI +@deftypevr Macro double FLT_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{float}, and is +guaranteed to be no more than @code{1E-37}. 
+@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro double DBL_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{double}, and +is guaranteed to be no more than @code{1E-37}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro {long double} LDBL_MIN +The value of this macro is the minimum normalized positive +floating-point number that is representable by type @code{long double}, +and is guaranteed to be no more than @code{1E-37}. +@end deftypevr + + +@comment float.h +@comment ANSI +@deftypevr Macro double FLT_EPSILON +This is the minimum positive floating-point number of type @code{float} +such that @code{1.0 + FLT_EPSILON != 1.0} is true. It's guaranteed to +be no greater than @code{1E-5}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro double DBL_EPSILON +This is similar to @code{FLT_EPSILON}, but is for the @code{double} +type. The maximum value is @code{1E-9}. +@end deftypevr + +@comment float.h +@comment ANSI +@deftypevr Macro {long double} LDBL_EPSILON +This is similar to @code{FLT_EPSILON}, but is for the @code{long double} +type. The maximum value is @code{1E-9}. +@end deftypevr + + +@node IEEE Floating Point, , Floating-Point Parameters, Floating-Point Limits +@subsection IEEE Floating Point +@cindex IEEE floating-point representation +@cindex floating-point, IEEE +@cindex IEEE Std 754 + + +Here is an example showing how these parameters work for a common +floating point representation, specified by the @cite{IEEE Standard for +Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985)}. Nearly +all computers today use this format. + +The IEEE single-precision float representation uses a base of 2. There +is a sign bit, a mantissa with 23 bits plus one hidden bit (so the total +precision is 24 base-2 digits), and an 8-bit exponent that can represent +values in the range -125 to 128, inclusive. 
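The connection between this layout and the limit values can be worked out from the fraction model described earlier. The sketch below is only an illustration: it assumes the usual relations that the epsilon value is the base raised to the power 1 - @var{p}, that the smallest normalized value is the base raised to one less than the minimum exponent, and that the largest value is the largest fraction times the base raised to the maximum exponent. All function names are hypothetical and are not part of the library.

```c
#include <float.h>

/* Raise BASE to the integer power EXPONENT by repeated
   multiplication or division.  Illustrative helper only.  */
static double
int_power (double base, int exponent)
{
  double result = 1.0;
  int i;

  for (i = 0; i < exponent; i++)
    result *= base;
  for (i = 0; i > exponent; i--)
    result /= base;
  return result;
}

/* Spacing between 1.0 and the next larger float: b^(1-p).  */
double
derived_float_epsilon (void)
{
  return int_power (FLT_RADIX, 1 - FLT_MANT_DIG);
}

/* Smallest normalized positive float: b^(emin-1).  */
double
derived_float_min (void)
{
  return int_power (FLT_RADIX, FLT_MIN_EXP - 1);
}

/* Largest finite float: (1 - b^(-p)) * b^emax.  */
double
derived_float_max (void)
{
  double largest_fraction = 1.0 - int_power (FLT_RADIX, -FLT_MANT_DIG);
  return largest_fraction * int_power (FLT_RADIX, FLT_MAX_EXP);
}

/* Nonzero if float has the parameters of the IEEE
   single-precision format described in this section.  */
int
float_looks_ieee_single (void)
{
  return FLT_RADIX == 2 && FLT_MANT_DIG == 24
         && FLT_MIN_EXP == -125 && FLT_MAX_EXP == 128;
}
```

The checks are made at run time rather than with @samp{#if}, since only @code{FLT_RADIX} is described here as a guaranteed constant expression. On a system where @code{float} uses the IEEE single-precision format, the three derived values should compare equal to @code{FLT_EPSILON}, @code{FLT_MIN} and @code{FLT_MAX}.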
+ +So, for an implementation that uses this representation for the +@code{float} data type, appropriate values for the corresponding +parameters are: + +@example +FLT_RADIX 2 +FLT_MANT_DIG 24 +FLT_DIG 6 +FLT_MIN_EXP -125 +FLT_MIN_10_EXP -37 +FLT_MAX_EXP 128 +FLT_MAX_10_EXP +38 +FLT_MIN 1.17549435E-38F +FLT_MAX 3.40282347E+38F +FLT_EPSILON 1.19209290E-07F +@end example + +Here are the values for the @code{double} data type: + +@example +DBL_MANT_DIG 53 +DBL_DIG 15 +DBL_MIN_EXP -1021 +DBL_MIN_10_EXP -307 +DBL_MAX_EXP 1024 +DBL_MAX_10_EXP 308 +DBL_MAX 1.7976931348623157E+308 +DBL_MIN 2.2250738585072014E-308 +DBL_EPSILON 2.2204460492503131E-016 +@end example diff --git a/manual/=process.texinfo b/manual/=process.texinfo new file mode 100644 index 0000000000..63c723ed37 --- /dev/null +++ b/manual/=process.texinfo @@ -0,0 +1,1452 @@ +@node Processes, Job Control, Signal Handling, Top +@chapter Processes + +@cindex process +@dfn{Processes} are the primitive units for allocation of system +resources. Each process has its own address space and (usually) one +thread of control. A process executes a program; you can have multiple +processes executing the same program, but each process has its own copy +of the program within its own address space and executes it +independently of the other copies. + +Processes are organized hierarchically. Child processes are created by +a parent process, and inherit many of their attributes from the parent +process. + +This chapter describes how a program can create, terminate, and control +child processes. + +@menu +* Program Arguments:: Parsing the command-line arguments to + a program. +* Environment Variables:: How to access parameters inherited from + a parent process. +* Program Termination:: How to cause a process to terminate and + return status information to its parent. +* Creating New Processes:: Running other programs. 
+@end menu + + +@node Program Arguments, Environment Variables, , Processes +@section Program Arguments +@cindex program arguments +@cindex command line arguments + +@cindex @code{main} function +When your C program starts, it begins by executing the function called +@code{main}. You can define @code{main} either to take no arguments, +or to take two arguments that represent the command line arguments +to the program, like this: + +@example +int main (int @var{argc}, char *@var{argv}[]) +@end example + +@cindex argc (program argument count) +@cindex argv (program argument vector) +The command line arguments are the whitespace-separated tokens typed by +the user to the shell in invoking the program. The value of the +@var{argc} argument is the number of command line arguments. The +@var{argv} argument is a vector of pointers to @code{char}; sometimes it +is also declared as @samp{char **@var{argv}}. The elements of +@var{argv} are the individual command line argument strings. By +convention, @code{@var{argv}[0]} is the file name of the program being +run, and @code{@var{argv}[@var{argc}]} is a null pointer. + +If the syntax for the command line arguments to your program is simple +enough, you can simply pick the arguments off from @var{argv} by hand. +But unless your program takes a fixed number of arguments, or all of the +arguments are interpreted in the same way (as file names, for example), +you are usually better off using @code{getopt} to do the parsing. + +@menu +* Argument Syntax Conventions:: By convention, program + options are specified by a + leading hyphen. +* Parsing Program Arguments:: The @code{getopt} function. +* Example Using getopt:: An example of @code{getopt}. 
+@end menu + +@node Argument Syntax Conventions, Parsing Program Arguments, , Program Arguments +@subsection Program Argument Syntax Conventions +@cindex program argument syntax +@cindex syntax, for program arguments +@cindex command argument syntax + +The @code{getopt} function decodes options following the usual +conventions for POSIX utilities: + +@itemize @bullet +@item +Arguments are options if they begin with a hyphen delimiter (@samp{-}). + +@item +Multiple options may follow a hyphen delimiter in a single token if +the options do not take arguments. Thus, @samp{-abc} is equivalent to +@samp{-a -b -c}. + +@item +Option names are single alphanumeric characters (as for @code{isalnum}; +see @ref{Classification of Characters}). + +@item +Certain options require an argument. For example, the @samp{-o} +option of the @code{ld} command requires an argument---an output file name. + +@item +An option and its argument may or may not appear as separate tokens. (In +other words, the whitespace separating them is optional.) Thus, +@samp{-o foo} and @samp{-ofoo} are equivalent. + +@item +Options typically precede other non-option arguments. + +The implementation of @code{getopt} in the GNU C library normally makes +it appear as if all the option arguments were specified before all the +non-option arguments for the purposes of parsing, even if the user of +your program intermixed option and non-option arguments. It does this +by reordering the elements of the @var{argv} array. This behavior is +nonstandard; if you want to suppress it, define the +@code{_POSIX_OPTION_ORDER} environment variable. @xref{Standard +Environment Variables}. + +@item +The argument @samp{--} terminates all options; any following arguments +are treated as non-option arguments, even if they begin with a hyphen. + +@item +A token consisting of a single hyphen character is interpreted as an +ordinary non-option argument. By convention, it is used to specify +input from or output to the standard input and output streams.
+ +@item +Options may be supplied in any order, or appear multiple times. The +interpretation is left up to the particular application program. +@end itemize + +@node Parsing Program Arguments, Example Using getopt, Argument Syntax Conventions, Program Arguments +@subsection Parsing Program Arguments +@cindex program arguments, parsing +@cindex command arguments, parsing +@cindex parsing program arguments + +Here are the details about how to call the @code{getopt} function. To +use this facility, your program must include the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.2 +@deftypevar int opterr +If the value of this variable is nonzero, then @code{getopt} prints an +error message to the standard error stream if it encounters an unknown +option character or an option with a missing required argument. This is +the default behavior. If you set this variable to zero, @code{getopt} +does not print any messages, but it still returns @code{?} to indicate +an error. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar int optopt +When @code{getopt} encounters an unknown option character or an option +with a missing required argument, it stores that option character in +this variable. You can use this for providing your own diagnostic +messages. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar int optind +This variable is set by @code{getopt} to the index of the next element +of the @var{argv} array to be processed. Once @code{getopt} has found +all of the option arguments, you can use this variable to determine +where the remaining non-option arguments begin. The initial value of +this variable is @code{1}. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar {char *} optarg +This variable is set by @code{getopt} to point at the value of the +option argument, for those options that accept arguments. 
+@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypefun int getopt (int @var{argc}, char **@var{argv}, const char *@var{options}) +The @code{getopt} function gets the next option argument from the +argument list specified by the @var{argv} and @var{argc} arguments. +Normally these arguments' values come directly from the arguments of +@code{main}. + +The @var{options} argument is a string that specifies the option +characters that are valid for this program. An option character in this +string can be followed by a colon (@samp{:}) to indicate that it takes a +required argument. + +If the @var{options} argument string begins with a hyphen (@samp{-}), this +is treated specially. It permits arguments without an option to be +returned as if they were associated with option character @samp{\0}. + +The @code{getopt} function returns the option character for the next +command line option. When no more option arguments are available, it +returns @code{-1}. There may still be more non-option arguments; you +must compare the external variable @code{optind} against the @var{argc} +parameter to check this. + +If the option has an argument, @code{getopt} returns the argument by +storing it in the variable @code{optarg}. You don't ordinarily need to +copy the @code{optarg} string, since it is a pointer into the original +@var{argv} array, not into a static area that might be overwritten. + +If @code{getopt} finds an option character in @var{argv} that was not +included in @var{options}, or a missing option argument, it returns +@samp{?} and sets the external variable @code{optopt} to the actual +option character. In addition, if the external variable @code{opterr} +is nonzero, @code{getopt} prints an error message. +@end deftypefun + +@node Example Using getopt, , Parsing Program Arguments, Program Arguments +@subsection Example of Parsing Program Arguments + +Here is an example showing how @code{getopt} is typically used.
The +key points to notice are: + +@itemize @bullet +@item +Normally, @code{getopt} is called in a loop. When @code{getopt} returns +@code{-1}, indicating no more options are present, the loop terminates. + +@item +A @code{switch} statement is used to dispatch on the return value from +@code{getopt}. In typical use, each case just sets a variable that +is used later in the program. + +@item +A second loop is used to process the remaining non-option arguments. +@end itemize + +@example +@include testopt.c.texi +@end example + +Here are some examples showing what this program prints with different +combinations of arguments: + +@example +% testopt +aflag = 0, bflag = 0, cvalue = (null) + +% testopt -a -b +aflag = 1, bflag = 1, cvalue = (null) + +% testopt -ab +aflag = 1, bflag = 1, cvalue = (null) + +% testopt -c foo +aflag = 0, bflag = 0, cvalue = foo + +% testopt -cfoo +aflag = 0, bflag = 0, cvalue = foo + +% testopt arg1 +aflag = 0, bflag = 0, cvalue = (null) +Non-option argument arg1 + +% testopt -a arg1 +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument arg1 + +% testopt -c foo arg1 +aflag = 0, bflag = 0, cvalue = foo +Non-option argument arg1 + +% testopt -a -- -b +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument -b + +% testopt -a - +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument - +@end example + +@node Environment Variables, Program Termination, Program Arguments, Processes +@section Environment Variables + +@cindex environment variable +When a program is executed, it receives information about the context in +which it was invoked in two ways. The first mechanism uses the +@var{argv} and @var{argc} arguments to its @code{main} function, and is +discussed in @ref{Program Arguments}. The second mechanism +uses @dfn{environment variables} and is discussed in this section. + +The @var{argv} mechanism is typically used to pass command-line +arguments specific to the particular program being invoked.
The +environment, on the other hand, keeps track of information that is +shared by many programs, changes infrequently, and that is less +frequently accessed. + +The environment variables discussed in this section are the same +environment variables that you set using the assignments and the +@code{export} command in the shell. Programs executed from the shell +inherit all of the environment variables from the shell. + +@cindex environment +Standard environment variables are used for information about the user's +home directory, terminal type, current locale, and so on; you can define +additional variables for other purposes. The set of all environment +variables that have values is collectively known as the +@dfn{environment}. + +Names of environment variables are case-sensitive and must not contain +the character @samp{=}. System-defined environment variables are +invariably uppercase. + +The values of environment variables can be anything that can be +represented as a string. A value must not contain an embedded null +character, since this is assumed to terminate the string. + + +@menu +* Environment Access:: How to get and set the values of + environment variables. +* Standard Environment Variables:: These environment variables have + standard interpretations. +@end menu + +@node Environment Access, Standard Environment Variables, , Environment Variables +@subsection Environment Access +@cindex environment access +@cindex environment representation + +The value of an environment variable can be accessed with the +@code{getenv} function. This is declared in the header file +@file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun {char *} getenv (const char *@var{name}) +This function returns a string that is the value of the environment +variable @var{name}. You must not modify this string. In some systems +not using the GNU library, it might be overwritten by subsequent calls +to @code{getenv} (but not by any other library function). 
If the +environment variable @var{name} is not defined, the value is a null +pointer. +@end deftypefun + + +@comment stdlib.h +@comment SVID +@deftypefun int putenv (const char *@var{string}) +The @code{putenv} function adds or removes definitions from the environment. +If the @var{string} is of the form @samp{@var{name}=@var{value}}, the +definition is added to the environment. Otherwise, the @var{string} is +interpreted as the name of an environment variable, and any definition +for this variable in the environment is removed. + +The GNU library provides this function for compatibility with SVID; it +may not be available in other systems. +@end deftypefun + +You can deal directly with the underlying representation of environment +objects to add more variables to the environment (for example, to +communicate with another program you are about to execute; see +@ref{Executing a File}). + +@comment unistd.h +@comment POSIX.1 +@deftypevar {char **} environ +The environment is represented as an array of strings. Each string is +of the format @samp{@var{name}=@var{value}}. The order in which +strings appear in the environment is not significant, but the same +@var{name} must not appear more than once. The last element of the +array is a null pointer. + +This variable is not declared in any header file, but if you declare it +in your own program as @code{extern}, the right thing will happen. + +If you just want to get the value of an environment variable, use +@code{getenv}. +@end deftypevar + +@node Standard Environment Variables, , Environment Access, Environment Variables +@subsection Standard Environment Variables +@cindex standard environment variables + +These environment variables have standard meanings. +This doesn't mean that they are always present in the +environment, though; it just means that if these variables @emph{are} +present, they have these meanings, and that you shouldn't try to use +these environment variable names for some other purpose. 
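
As a small illustration of the access facilities described above, here is a minimal sketch that looks up one variable with @code{getenv} and walks the @code{environ} array. The helper names @code{value_or_default} and @code{count_environment} are hypothetical and exist only for this example; the sketch assumes a system where @code{environ} is available as described.

```c
#include <stddef.h>
#include <stdlib.h>

/* `environ' is not declared in any header file, so the program
   declares it itself as `extern'.  */
extern char **environ;

/* Hypothetical helper: return the value of NAME, or FALLBACK if the
   variable is not defined.  The returned string must not be modified.  */
const char *
value_or_default (const char *name, const char *fallback)
{
  const char *value = getenv (name);
  return value != NULL ? value : fallback;
}

/* Hypothetical helper: count the `name=value' strings in the
   environment by walking the array up to its terminating null pointer.  */
size_t
count_environment (void)
{
  size_t count = 0;
  char **entry;

  for (entry = environ; entry != NULL && *entry != NULL; ++entry)
    ++count;
  return count;
}
```

A caller might write @code{value_or_default ("HOME", "/")} to obtain a usable directory name even when @code{HOME} is absent from the environment.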
+ +@table @code +@item HOME +@cindex HOME environment variable +@cindex home directory +This is a string representing the user's @dfn{home directory}, or +initial default working directory. @xref{User Database}, for a +more secure way of determining this information. + +@comment RMS says to explain why HOME is better, but I don't know why. + +@item LOGNAME +@cindex LOGNAME environment variable +This is the name that the user used to log in. Since the value in the +environment can be tweaked arbitrarily, this is not a reliable way to +identify the user who is running a process; a function like +@code{getlogin} (@pxref{User Identification Functions}) is better for +that purpose. + +@comment RMS says to explain why LOGNAME is better, but I don't know why. + +@item PATH +@cindex PATH environment variable +A @dfn{path} is a sequence of directory names which is used for +searching for a file. The variable @code{PATH} holds a path. The +@code{execlp} and @code{execvp} functions (@pxref{Executing a File}) +use this environment variable, as do many shells and other utilities +which are implemented in terms of those functions. + +The syntax of a path is a sequence of directory names separated by +colons. An empty string instead of a directory name stands for the +current directory. (@xref{Working Directory}.) + +A typical value for this environment variable might be a string like: + +@example +.:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local:/usr/local/bin +@end example + +This means that if the user tries to execute a program named @code{foo}, +the system will look for files named @file{./foo}, @file{/bin/foo}, +@file{/etc/foo}, and so on. The first of these files that exists is +the one that is executed. + +@item TERM +@cindex TERM environment variable +This specifies the kind of terminal that is receiving program output.
+Some programs can make use of this information to take advantage of +special escape sequences or terminal modes supported by particular kinds +of terminals. Many programs which use the termcap library +(@pxref{Finding a Terminal Description,Find,,termcap,The Termcap Library +Manual}) use the @code{TERM} environment variable, for example. + +@item TZ +@cindex TZ environment variable +This specifies the time zone. @xref{Time Zone}, for information about +the format of this string and how it is used. + +@item LANG +@cindex LANG environment variable +This specifies the default locale to use for attribute categories where +neither @code{LC_ALL} nor the specific environment variable for that +category is set. @xref{Locales}, for more information about +locales. + +@item LC_ALL +@cindex LC_ALL environment variable +This is similar to the @code{LANG} environment variable. However, its +value takes precedence over any values provided for the individual +attribute category environment variables, or for the @code{LANG} +environment variable. + +@item LC_COLLATE +@cindex LC_COLLATE environment variable +This specifies what locale to use for string sorting. + +@item LC_CTYPE +@cindex LC_CTYPE environment variable +This specifies what locale to use for character sets and character +classification. + +@item LC_MONETARY +@cindex LC_MONETARY environment variable +This specifies what locale to use for formatting monetary values. + +@item LC_NUMERIC +@cindex LC_NUMERIC environment variable +This specifies what locale to use for formatting numbers. + +@item LC_TIME +@cindex LC_TIME environment variable +This specifies what locale to use for formatting date/time values. + +@item _POSIX_OPTION_ORDER +@cindex _POSIX_OPTION_ORDER environment variable. +If this environment variable is defined, it suppresses the usual +reordering of command line arguments by @code{getopt}. @xref{Program +Argument Syntax Conventions}. 
+@end table + +@node Program Termination, Creating New Processes, Environment Variables, Processes +@section Program Termination +@cindex program termination +@cindex process termination + +@cindex exit status value +The usual way for a program to terminate is simply for its @code{main} +function to return. The @dfn{exit status value} returned from the +@code{main} function is used to report information back to the process's +parent process or shell. + +A program can also terminate normally by calling the @code{exit} +function. + +In addition, programs can be terminated by signals; this is discussed in +more detail in @ref{Signal Handling}. The @code{abort} function causes +a signal that kills the program. + +@menu +* Normal Program Termination:: +* Exit Status:: Exit Status +* Cleanups on Exit:: Cleanups on Exit +* Aborting a Program:: +* Termination Internals:: Termination Internals +@end menu + +@node Normal Program Termination, Exit Status, , Program Termination +@subsection Normal Program Termination + +@comment stdlib.h +@comment ANSI +@deftypefun void exit (int @var{status}) +The @code{exit} function causes normal program termination with status +@var{status}. This function does not return. +@end deftypefun + +When a program terminates normally by returning from its @code{main} +function or by calling @code{exit}, the following actions occur in +sequence: + +@enumerate +@item +Functions that were registered with the @code{atexit} or @code{on_exit} +functions are called in the reverse order of their registration. This +mechanism allows your application to specify its own ``cleanup'' actions +to be performed at program termination. Typically, this is used to do +things like saving program state information in a file, or unlocking locks +in shared data bases. + +@item +All open streams are closed, writing out any buffered output data. See +@ref{Opening and Closing Streams}.
In addition, temporary files opened +with the @code{tmpfile} function are removed; see @ref{Temporary Files}. + +@item +@code{_exit} is called. @xref{Termination Internals}. +@end enumerate + +@node Exit Status, Cleanups on Exit, Normal Program Termination, Program Termination +@subsection Exit Status +@cindex exit status + +When a program exits, it can return to the parent process a small +amount of information about the cause of termination, using the +@dfn{exit status}. This is a value between 0 and 255 that the exiting +process passes as an argument to @code{exit}. + +Normally you should use the exit status to report very broad information +about success or failure. You can't provide a lot of detail about the +reasons for the failure, and most parent processes would not want much +detail anyway. + +There are conventions for what sorts of status values certain programs +should return. The most common convention is simply 0 for success and 1 +for failure. Programs that perform comparison use a different +convention: they use status 1 to indicate a mismatch, and status 2 to +indicate an inability to compare. Your program should follow an +existing convention if an existing convention makes sense for it. + +A general convention reserves status values 128 and up for special +purposes. In particular, the value 128 is used to indicate failure to +execute another program in a subprocess. This convention is not +universally obeyed, but it is a good idea to follow it in your programs. + +@strong{Warning:} Don't try to use the number of errors as the exit +status. This is actually not very useful; a parent process would +generally not care how many errors occurred. Worse than that, it does +not work, because the status value is truncated to eight bits. +Thus, if the program tried to report 256 errors, the parent would +receive a report of 0 errors---that is, success.
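
The truncation described in the warning can be modeled with a simple mask. The following is only a sketch of what the parent would observe, assuming the low-order eight bits are what get reported; the helper names @code{reported_status} and @code{status_for_error_count} are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical helper: model the status a parent would see, assuming
   only the low-order eight bits of STATUS are reported.  */
int
reported_status (int status)
{
  return status & 0377;
}

/* Hypothetical helper: map an error count to a broad success or
   failure status instead of passing the count itself to `exit'.  */
int
status_for_error_count (int errors)
{
  return errors == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Under this model, @code{reported_status (256)} yields 0, which a parent would read as success, while @code{status_for_error_count (256)} still reports a failure.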
+ +For the same reason, it does not work to use the value of @code{errno} +as the exit status---these values can exceed 255. + +@strong{Portability note:} Some non-POSIX systems use different +conventions for exit status values. For greater portability, you can +use the macros @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} for the +conventional status value for success and failure, respectively. They +are declared in the file @file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int EXIT_SUCCESS +This macro can be used with the @code{exit} function to indicate +successful program completion. + +On POSIX systems, the value of this macro is @code{0}. On other +systems, the value might be some other (possibly non-constant) integer +expression. +@end deftypevr + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int EXIT_FAILURE +This macro can be used with the @code{exit} function to indicate +unsuccessful program completion in a general sense. + +On POSIX systems, the value of this macro is @code{1}. On other +systems, the value might be some other (possibly non-constant) integer +expression. Other nonzero status values also indicate failure. Certain +programs use different nonzero status values to indicate particular +kinds of ``non-success''. For example, @code{diff} uses status value +@code{1} to mean that the files are different, and @code{2} or more to +mean that there was difficulty in opening the files. +@end deftypevr + +@node Cleanups on Exit, Aborting a Program, Exit Status, Program Termination +@subsection Cleanups on Exit + +@comment stdlib.h +@comment ANSI +@deftypefun int atexit (void (*@var{function}) (void)) +The @code{atexit} function registers the function @var{function} to be +called at normal program termination. The @var{function} is called with +no arguments. + +The return value from @code{atexit} is zero on success and nonzero if +the function cannot be registered.
+@end deftypefun + +@comment stdlib.h +@comment GNU +@deftypefun int on_exit (void (*@var{function})(int @var{status}, void *@var{arg}), void *@var{arg}) +This function is a somewhat more powerful variant of @code{atexit}. It +accepts two arguments, a function @var{function} and an arbitrary +pointer @var{arg}. At normal program termination, the @var{function} is +called with two arguments: the @var{status} value passed to @code{exit}, +and the @var{arg}. + +This function is a GNU extension, and may not be supported by other +implementations. +@end deftypefun + +Here's a trivial program that illustrates the use of @code{exit} and +@code{atexit}: + +@example +#include <stdio.h> +#include <stdlib.h> + +void bye (void) +@{ + printf ("Goodbye, cruel world....\n"); +@} + +int main (void) +@{ + atexit (bye); + exit (EXIT_SUCCESS); +@} +@end example + +@noindent +When this program is executed, it just prints the message and exits. + + +@node Aborting a Program, Termination Internals, Cleanups on Exit, Program Termination +@subsection Aborting a Program +@cindex aborting a program + +You can abort your program using the @code{abort} function. The prototype +for this function is in @file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun void abort (void) +The @code{abort} function causes abnormal program termination, without +executing functions registered with @code{atexit} or @code{on_exit}. + +This function actually terminates the process by raising a +@code{SIGABRT} signal, and your program can include a handler to +intercept this signal; see @ref{Signal Handling}. + +@strong{Incomplete:} Why would you want to define such a handler? +@end deftypefun + +@node Termination Internals, , Aborting a Program, Program Termination +@subsection Termination Internals + +The @code{_exit} function is the primitive used for process termination +by @code{exit}. It is declared in the header file @file{unistd.h}.
+@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun void _exit (int @var{status}) +The @code{_exit} function is the primitive for causing a process to +terminate with status @var{status}. Calling this function does not +execute cleanup functions registered with @code{atexit} or +@code{on_exit}. +@end deftypefun + +When a process terminates for any reason---either by an explicit +termination call, or termination as a result of a signal---the +following things happen: + +@itemize @bullet +@item +All open file descriptors in the process are closed. @xref{Low-Level +Input/Output}. + +@item +The low-order 8 bits of the return status code are saved to be reported +back to the parent process via @code{wait} or @code{waitpid}; see +@ref{Process Completion}. + +@item +Any child processes of the process being terminated are assigned a new +parent process. (This is the @code{init} process, with process ID 1.) + +@item +A @code{SIGCHLD} signal is sent to the parent process. + +@item +If the process is a session leader that has a controlling terminal, then +a @code{SIGHUP} signal is sent to each process in the foreground job, +and the controlling terminal is disassociated from that session. +@xref{Job Control}. + +@item +If termination of a process causes a process group to become orphaned, +and any member of that process group is stopped, then a @code{SIGHUP} +signal and a @code{SIGCONT} signal are sent to each process in the +group. @xref{Job Control}. +@end itemize + +@node Creating New Processes, , Program Termination, Processes +@section Creating New Processes + +This section describes how your program can cause other programs to be +executed. Actually, there are three distinct operations involved: +creating a new child process, causing the new process to execute a +program, and coordinating the completion of the child process with the +original program. 
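
The three operations named above can be sketched in a few lines. This is only an illustration, assuming a POSIX system; the helper name @code{run_program} is hypothetical, @code{execvp} is one of several @code{exec} functions that could be used here, and the status macros @code{WIFEXITED} and @code{WEXITSTATUS} come from @file{sys/wait.h}.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run FILE with arguments ARGV in a child process
   and return its exit status, or -1 if something went wrong.  */
int
run_program (const char *file, char *const argv[])
{
  int status;
  pid_t pid = fork ();          /* Step 1: create the child process.  */

  if (pid < 0)
    return -1;                  /* The child could not be created.  */

  if (pid == 0)
    {
      execvp (file, argv);      /* Step 2: execute the new program.  */
      _exit (EXIT_FAILURE);     /* Reached only if the exec failed.  */
    }

  /* Step 3: the parent waits for the child to complete.  */
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The child calls @code{_exit} rather than @code{exit} after a failed exec so that cleanup functions registered by the parent are not run a second time in the child.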
+ +The @code{system} function provides a simple, portable mechanism for +running another program; it does all three steps automatically. If you +need more control over the details of how this is done, you can use the +primitive functions to do each step individually instead. + +@menu +* Running a Command:: The easy way to run another program. +* Process Creation Concepts:: An overview of the hard way to do it. +* Process Identification:: How to get the process ID of a process. +* Creating a Process:: How to fork a child process. +* Executing a File:: How to get a process to execute another + program. +* Process Completion:: How to tell when a child process has + completed. +* Process Completion Status:: How to interpret the status value + returned from a child process. +* BSD wait Functions:: More functions, for backward + compatibility. +* Process Creation Example:: A complete example program. +@end menu + + +@node Running a Command, Process Creation Concepts, , Creating New Processes +@subsection Running a Command +@cindex running a command + +The easy way to run another program is to use the @code{system} +function. This function does all the work of running a subprogram, but +it doesn't give you much control over the details: you have to wait +until the subprogram terminates before you can do anything else. + +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun int system (const char *@var{command}) +This function executes @var{command} as a shell command. In the GNU C +library, it always uses the default shell @code{sh} to run the command. +In particular, it searches the directories in @code{PATH} to find +programs to execute. The return value is @code{-1} if it wasn't +possible to create the shell process, and otherwise is the status of the +shell process. @xref{Process Completion}, for details on how this +status code can be interpreted. +@pindex sh +@end deftypefun + +The @code{system} function is declared in the header file +@file{stdlib.h}.
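
Here is a minimal sketch of calling @code{system} and interpreting its result. The helper name @code{shell_exit_status} is hypothetical, and the sketch assumes a POSIX system where the status can be decoded with the @code{WIFEXITED} and @code{WEXITSTATUS} macros from @file{sys/wait.h}.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: run COMMAND through the shell and return the
   command's exit status, or -1 if the shell could not be started or
   the command did not terminate normally.  */
int
shell_exit_status (const char *command)
{
  int status = system (command);

  if (status == -1)
    return -1;                  /* The shell process was not created.  */
  if (!WIFEXITED (status))
    return -1;                  /* Terminated abnormally, e.g. by a signal.  */
  return WEXITSTATUS (status);
}
```

For example, @code{shell_exit_status ("exit 7")} would be expected to return 7 on such a system, since the shell exits with that status.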
+ +@strong{Portability Note:} Some C implementations may not have any +notion of a command processor that can execute other programs. You can +determine whether a command processor exists by executing @code{system +(NULL)}; in this case the return value is nonzero if and only if such a +processor is available. + +The @code{popen} and @code{pclose} functions (@pxref{Pipe to a +Subprocess}) are closely related to the @code{system} function. They +allow the parent process to communicate with the standard input and +output channels of the command being executed. + +@node Process Creation Concepts, Process Identification, Running a Command, Creating New Processes +@subsection Process Creation Concepts + +This section gives an overview of processes and of the steps involved in +creating a process and making it run another program. + +@cindex process ID +@cindex process lifetime +Each process is named by a @dfn{process ID} number. A unique process ID +is allocated to each process when it is created. The @dfn{lifetime} of +a process ends when its termination is reported to its parent process; +at that time, all of the process resources, including its process ID, +are freed. + +@cindex creating a process +@cindex forking a process +@cindex child process +@cindex parent process +Processes are created with the @code{fork} system call (so the operation +of creating a new process is sometimes called @dfn{forking} a process). +The @dfn{child process} created by @code{fork} is an exact clone of the +original @dfn{parent process}, except that it has its own process ID. + +After forking a child process, both the parent and child processes +continue to execute normally. If you want your program to wait for a +child process to finish executing before continuing, you must do this +explicitly after the fork operation. This is done with the @code{wait} +or @code{waitpid} functions (@pxref{Process Completion}).
These +functions give the parent information about why the child +terminated---for example, its exit status code. + +A newly forked child process continues to execute the same program as +its parent process, at the point where the @code{fork} call returns. +You can use the return value from @code{fork} to tell whether the program +is running in the parent process or the child. + +@cindex process image +Having all processes run the same program is usually not very useful. +But the child can execute another program using one of the @code{exec} +functions; see @ref{Executing a File}. The program that the process is +executing is called its @dfn{process image}. Starting execution of a +new program causes the process to forget all about its current process +image; when the new program exits, the process exits too, instead of +returning to the previous process image. + + +@node Process Identification, Creating a Process, Process Creation Concepts, Creating New Processes +@subsection Process Identification + +The @code{pid_t} data type represents process IDs. You can get the +process ID of a process by calling @code{getpid}. The function +@code{getppid} returns the process ID of the parent of the +current process (this is also known as the @dfn{parent process ID}). +Your program should include the header files @file{unistd.h} and +@file{sys/types.h} to use these functions. +@pindex sys/types.h +@pindex unistd.h + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} pid_t +The @code{pid_t} data type is a signed integer type which is capable +of representing a process ID. In the GNU library, this is an @code{int}. +@end deftp + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t getpid () +The @code{getpid} function returns the process ID of the current process. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t getppid () +The @code{getppid} function returns the process ID of the parent of the +current process.
+@end deftypefun + +@node Creating a Process, Executing a File, Process Identification, Creating New Processes +@subsection Creating a Process + +The @code{fork} function is the primitive for creating a process. +It is declared in the header file @file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t fork () +The @code{fork} function creates a new process. + +If the operation is successful, there are then both parent and child +processes and both see @code{fork} return, but with different values: it +returns a value of @code{0} in the child process and returns the child's +process ID in the parent process. If the child process could not be +created, a value of @code{-1} is returned in the parent process. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EAGAIN +There aren't enough system resources to create another process, or the +user already has too many processes running. + +@item ENOMEM +The process requires more space than the system can supply. +@end table +@end deftypefun + +The specific attributes of the child process that differ from the +parent process are: + +@itemize @bullet +@item +The child process has its own unique process ID. + +@item +The parent process ID of the child process is the process ID of its +parent process. + +@item +The child process gets its own copies of the parent process's open file +descriptors. Subsequently changing attributes of the file descriptors +in the parent process won't affect the file descriptors in the child, +and vice versa. @xref{Control Operations}. + +@item +The elapsed processor times for the child process are set to zero; +see @ref{Processor Time}. + +@item +The child doesn't inherit file locks set by the parent process. +@xref{Control Operations}. + +@item +The child doesn't inherit alarms set by the parent process. +@xref{Setting an Alarm}. 
+ +@item +The set of pending signals (@pxref{Delivery of Signal}) for the child +process is cleared. (The child process inherits its mask of blocked +signals and signal actions from the parent process.) +@end itemize + + +@comment unistd.h +@comment BSD +@deftypefun pid_t vfork (void) +The @code{vfork} function is similar to @code{fork} but more efficient; +however, there are restrictions you must follow to use it safely. + +While @code{fork} makes a complete copy of the calling process's address +space and allows both the parent and child to execute independently, +@code{vfork} does not make this copy. Instead, the child process +created with @code{vfork} shares its parent's address space until it calls +one of the @code{exec} functions. In the meantime, the parent process +suspends execution. + +You must be very careful not to allow the child process created with +@code{vfork} to modify any global data or even local variables shared +with the parent. Furthermore, the child process cannot return from (or +do a long jump out of) the function that called @code{vfork}! This +would leave the parent process's control information very confused. If +in doubt, use @code{fork} instead. + +Some operating systems don't really implement @code{vfork}. The GNU C +library permits you to use @code{vfork} on all systems, but actually +executes @code{fork} if @code{vfork} isn't available. +@end deftypefun + +@node Executing a File, Process Completion, Creating a Process, Creating New Processes +@subsection Executing a File +@cindex executing a file +@cindex @code{exec} functions + +This section describes the @code{exec} family of functions, for executing +a file as a process image. You can use these functions to make a child +process execute a new program after it has been forked. + +The functions in this family differ in how you specify the arguments, +but otherwise they all do the same thing. They are declared in the +header file @file{unistd.h}. 
+@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execv (const char *@var{filename}, char *const @var{argv}@t{[]}) +The @code{execv} function executes the file named by @var{filename} as a +new process image. + +The @var{argv} argument is an array of null-terminated strings that is +used to provide a value for the @code{argv} argument to the @code{main} +function of the program to be executed. The last element of this array +must be a null pointer. @xref{Program Arguments}, for information on +how programs can access these arguments. + +The environment for the new process image is taken from the +@code{environ} variable of the current process image; see @ref{Environment +Variables}, for information about environments. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execl (const char *@var{filename}, const char *@var{arg0}, @dots{}) +This is similar to @code{execv}, but the @var{argv} strings are +specified individually instead of as an array. A null pointer must be +passed as the last such argument. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execve (const char *@var{filename}, char *const @var{argv}@t{[]}, char *const @var{env}@t{[]}) +This is similar to @code{execv}, but permits you to specify the environment +for the new program explicitly as the @var{env} argument. This should +be an array of strings in the same format as for the @code{environ} +variable; see @ref{Environment Access}. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execle (const char *@var{filename}, const char *@var{arg0}, char *const @var{env}@t{[]}, @dots{}) +This is similar to @code{execl}, but permits you to specify the +environment for the new program explicitly. The environment argument is +passed following the null pointer that marks the last @var{argv} +argument, and should be an array of strings in the same format as for +the @code{environ} variable. 
+@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execvp (const char *@var{filename}, char *const @var{argv}@t{[]}) +The @code{execvp} function is similar to @code{execv}, except that it +searches the directories listed in the @code{PATH} environment variable +(@pxref{Standard Environment Variables}) to find the full file name of a +file from @var{filename} if @var{filename} does not contain a slash. + +This function is useful for executing installed system utility programs, +so that the user can control where to look for them. It is also useful +in shells, for executing commands typed by the user. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execlp (const char *@var{filename}, const char *@var{arg0}, @dots{}) +This function is like @code{execl}, except that it performs the same +file name searching as the @code{execvp} function. +@end deftypefun + + +The size of the argument list and environment list taken together must not +be greater than @code{ARG_MAX} bytes. @xref{System Parameters}. + +@strong{Incomplete:} The POSIX.1 standard requires some statement here +about how null terminators, null pointers, and alignment requirements +affect the total size of the argument and environment lists. + +These functions normally don't return, since execution of a new program +causes the currently executing program to go away completely. A value +of @code{-1} is returned in the event of a failure. In addition to the +usual file name syntax errors (@pxref{File Name Errors}), the following +@code{errno} error conditions are defined for these functions: + +@table @code +@item E2BIG +The combined size of the new program's argument list and environment list +is larger than @code{ARG_MAX} bytes. + +@item ENOEXEC +The specified file can't be executed because it isn't in the right format. + +@item ENOMEM +Executing the specified file requires more storage than is available. 
+@end table + +If execution of the new file is successful, the access time field of the +file is updated as if the file had been opened. @xref{File Times}, for +more details about access times of files. + +The point at which the file is closed again is not specified, but +is at some point before the process exits or before another process +image is executed. + +Executing a new process image completely changes the contents of memory, +except for the arguments and the environment, but many other attributes +of the process are unchanged: + +@itemize @bullet +@item +The process ID and the parent process ID. @xref{Process Creation Concepts}. + +@item +Session and process group membership. @xref{Job Control Concepts}. + +@item +Real user ID and group ID, and supplementary group IDs. @xref{User/Group +IDs of a Process}. + +@item +Pending alarms. @xref{Setting an Alarm}. + +@item +Current working directory and root directory. @xref{Working Directory}. + +@item +File mode creation mask. @xref{Setting Permissions}. + +@item +Process signal mask; see @ref{Process Signal Mask}. + +@item +Pending signals; see @ref{Blocking Signals}. + +@item +Elapsed processor time associated with the process; see @ref{Processor Time}. +@end itemize + +If the set-user-ID and set-group-ID mode bits of the process image file +are set, this affects the effective user ID and effective group ID +(respectively) of the process. These concepts are discussed in detail +in @ref{User/Group IDs of a Process}. + +Signals that are set to be ignored in the existing process image are +also set to be ignored in the new process image. All other signals are +set to the default action in the new process image. For more +information about signals, see @ref{Signal Handling}. + +File descriptors open in the existing process image remain open in the +new process image, unless they have the @code{FD_CLOEXEC} +(close-on-exec) flag set. 
The files that remain open inherit all +attributes of the open file description from the existing process image, +including file locks. File descriptors are discussed in @ref{Low-Level +Input/Output}. + +Streams, by contrast, cannot survive through @code{exec} functions, +because they are located in the memory of the process itself. The new +process image has no streams except those it creates afresh. Each of +the streams in the pre-@code{exec} process image has a descriptor inside +it, and these descriptors do survive through @code{exec} (provided that +they do not have @code{FD_CLOEXEC} set). The new process image can +reconnect these to new streams using @code{fdopen}. + +@node Process Completion, Process Completion Status, Executing a File, Creating New Processes +@subsection Process Completion +@cindex process completion +@cindex waiting for completion of child process +@cindex testing exit status of child process + +The functions described in this section are used to wait for a child +process to terminate or stop, and determine its status. These functions +are declared in the header file @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment POSIX.1 +@deftypefun pid_t waitpid (pid_t @var{pid}, int *@var{status_ptr}, int @var{options}) +The @code{waitpid} function is used to request status information from a +child process whose process ID is @var{pid}. Normally, the calling +process is suspended until the child process makes status information +available by terminating. + +Other values for the @var{pid} argument have special interpretations. A +value of @code{-1} or @code{WAIT_ANY} requests status information for +any child process; a value of @code{0} or @code{WAIT_MYPGRP} requests +information for any child process in the same process group as the +calling process; and any other negative value @minus{} @var{pgid} +requests information for any child process whose process group ID is +@var{pgid}.
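The simplest case above, where @var{pid} names one specific child, might be sketched as follows. The helper @code{run_child_and_wait} is hypothetical and exists only for illustration; it assumes a POSIX system and relies on the @code{WIFEXITED} and @code{WEXITSTATUS} macros from @file{sys/wait.h} to decode the status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: fork a child that
   exits immediately with CODE, wait for exactly that child, and
   return its exit status, or -1 if anything goes wrong.  */
int
run_child_and_wait (int code)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;                  /* The fork failed.  */

  if (pid == 0)
    _exit (code);               /* Child: terminate at once.  */

  /* A positive PID argument selects that one child; an OPTIONS
     value of 0 means the call blocks until the child terminates.  */
  if (waitpid (pid, &status, 0) != pid)
    return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Passing @code{WAIT_ANY} in place of @code{pid} would instead accept status from whichever child terminates first, which matters once a program has more than one child.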
+ +If status information for a child process is available immediately, this +function returns immediately without waiting. If more than one eligible +child process has status information available, one of them is chosen +randomly, and its status is returned immediately. To get the status +from the other programs, you need to call @code{waitpid} again. + +The @var{options} argument is a bit mask. Its value should be the +bitwise OR (that is, the @samp{|} operator) of zero or more of the +@code{WNOHANG} and @code{WUNTRACED} flags. You can use the +@code{WNOHANG} flag to indicate that the parent process shouldn't wait; +and the @code{WUNTRACED} flag to request status information from stopped +processes as well as processes that have terminated. + +The status information from the child process is stored in the object +that @var{status_ptr} points to, unless @var{status_ptr} is a null pointer. + +The return value is normally the process ID of the child process whose +status is reported. If the @code{WNOHANG} option was specified and no +child process is waiting to be noticed, a value of zero is returned. A +value of @code{-1} is returned in case of error. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EINTR +The function was interrupted by delivery of a signal to the calling +process. + +@item ECHILD +There are no child processes to wait for, or the specified @var{pid} +is not a child of the calling process. + +@item EINVAL +An invalid value was provided for the @var{options} argument. +@end table +@end deftypefun + +These symbolic constants are defined as values for the @var{pid} argument +to the @code{waitpid} function. + +@table @code +@item WAIT_ANY +This constant macro (whose value is @code{-1}) specifies that +@code{waitpid} should return status information about any child process. 
+ +@item WAIT_MYPGRP +This constant (with value @code{0}) specifies that @code{waitpid} should +return status information about any child process in the same process +group as the calling process. +@end table + +These symbolic constants are defined as flags for the @var{options} +argument to the @code{waitpid} function. You can bitwise-OR the flags +together to obtain a value to use as the argument. + +@table @code +@item WNOHANG +This flag specifies that @code{waitpid} should return immediately +instead of waiting if there is no child process ready to be noticed. + +@item WUNTRACED +This flag specifies that @code{waitpid} should also report the +status of any child processes that have been stopped as well as those +that have terminated. +@end table + +@deftypefun pid_t wait (int *@var{status_ptr}) +This is a simplified version of @code{waitpid}, and is used to wait +until any one child process terminates. + +@example +wait (&status) +@end example + +@noindent +is equivalent to: + +@example +waitpid (-1, &status, 0) +@end example + +Here's an example of how to use @code{waitpid} to get the status from +all child processes that have terminated, without ever waiting. This +function is designed to be used as a handler for @code{SIGCHLD}, the +signal that indicates that at least one child process has terminated. + +@example +void +sigchld_handler (int signum) +@{ + pid_t pid; + int status; + while (1) @{ + pid = waitpid (WAIT_ANY, &status, WNOHANG); + if (pid < 0) @{ + perror ("waitpid"); + break; + @} + if (pid == 0) + break; + notice_termination (pid, status); + @} +@} +@end example +@end deftypefun + +@node Process Completion Status, BSD wait Functions, Process Completion, Creating New Processes +@subsection Process Completion Status + +If the exit status value (@pxref{Program Termination}) of the child +process is zero, then the status value reported by @code{waitpid} or +@code{wait} is also zero.
You can test for other kinds of information +encoded in the returned status value using the following macros. +These macros are defined in the header file @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFEXITED (int @var{status}) +This macro returns a non-zero value if the child process terminated +normally with @code{exit} or @code{_exit}. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WEXITSTATUS (int @var{status}) +If @code{WIFEXITED} is true of @var{status}, this macro returns the +low-order 8 bits of the exit status value from the child process. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFSIGNALED (int @var{status}) +This macro returns a non-zero value if the child process terminated +by receiving a signal that was not handled. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WTERMSIG (int @var{status}) +If @code{WIFSIGNALED} is true of @var{status}, this macro returns the +number of the signal that terminated the child process. +@end deftypefn + +@comment sys/wait.h +@comment BSD +@deftypefn Macro int WCOREDUMP (int @var{status}) +This macro returns a non-zero value if the child process terminated +and produced a core dump. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFSTOPPED (int @var{status}) +This macro returns a non-zero value if the child process is stopped. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WSTOPSIG (int @var{status}) +If @code{WIFSTOPPED} is true of @var{status}, this macro returns the +number of the signal that caused the child process to stop. +@end deftypefn + + +@node BSD wait Functions, Process Creation Example, Process Completion Status, Creating New Processes +@subsection BSD Process Completion Functions + +The GNU library also provides these related facilities for compatibility +with BSD Unix. 
BSD uses the @code{union wait} data type to represent +status values rather than an @code{int}. The two representations are +actually interchangeable; they describe the same bit patterns. The macros +such as @code{WEXITSTATUS} are defined so that they will work on either +kind of object, and the @code{wait} function is defined to accept either +type of pointer as its @var{status_ptr} argument. + +These functions are declared in @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment BSD +@deftp {union Type} wait +This data type represents program termination status values. It has +the following members: + +@table @code +@item int w_termsig +This member is equivalent to the @code{WTERMSIG} macro. + +@item int w_coredump +This member is equivalent to the @code{WCOREDUMP} macro. + +@item int w_retcode +This member is equivalent to the @code{WEXITSTATUS} macro. + +@item int w_stopsig +This member is equivalent to the @code{WSTOPSIG} macro. +@end table + +Instead of accessing these members directly, you should use the +equivalent macros. +@end deftp + +@comment sys/wait.h +@comment BSD +@deftypefun pid_t wait3 (union wait *@var{status_ptr}, int @var{options}, void * @var{usage}) +If @var{usage} is a null pointer, this function is equivalent to +@code{waitpid (-1, @var{status_ptr}, @var{options})}. + +The @var{usage} argument may also be a pointer to a +@code{struct rusage} object. Information about system resources used by +terminated processes (but not stopped processes) is returned in this +structure. + +@strong{Incomplete:} The description of the @code{struct rusage} structure +hasn't been written yet. Put in a cross-reference here. +@end deftypefun + +@comment sys/wait.h +@comment BSD +@deftypefun pid_t wait4 (pid_t @var{pid}, union wait *@var{status_ptr}, int @var{options}, void *@var{usage}) +If @var{usage} is a null pointer, this function is equivalent to +@code{waitpid (@var{pid}, @var{status_ptr}, @var{options})}.
+ +The @var{usage} argument may also be a pointer to a +@code{struct rusage} object. Information about system resources used by +terminated processes (but not stopped processes) is returned in this +structure. + +@strong{Incomplete:} The description of the @code{struct rusage} structure +hasn't been written yet. Put in a cross-reference here. +@end deftypefun + +@node Process Creation Example, , BSD wait Functions, Creating New Processes +@subsection Process Creation Example + +Here is an example program showing how you might write a function +similar to the built-in @code{system}. It executes its @var{command} +argument using the equivalent of @samp{sh -c @var{command}}. + +@example +#include <stddef.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/wait.h> + +/* @r{Execute the command using this shell program.} */ +#define SHELL "/bin/sh" + +int +my_system (char *command) +@{ + int status; + pid_t pid; + + pid = fork (); + if (pid == 0) @{ + /* @r{This is the child process. Execute the shell command.} */ + execl (SHELL, SHELL, "-c", command, NULL); + exit (EXIT_FAILURE); + @} + else if (pid < 0) + /* @r{The fork failed. Report failure.} */ + status = -1; + else @{ + /* @r{This is the parent process. Wait for the child to complete.} */ + if (waitpid (pid, &status, 0) != pid) + status = -1; + @} + return status; +@} +@end example + +@comment Yes, this example has been tested. + +There are a couple of things you should pay attention to in this +example. + +Remember that the first @code{argv} argument supplied to the program +represents the name of the program being executed. That is why, in the +call to @code{execl}, @code{SHELL} is supplied once to name the program +to execute and a second time to supply a value for @code{argv[0]}. + +The @code{execl} call in the child process doesn't return if it is +successful. If it fails, you must do something to make the child +process terminate. 
Just returning a bad status code with @code{return} +would leave two processes running the original program. Instead, the +right behavior is for the child process to report failure to its parent +process. To do this, @code{exit} is called with a failure status. diff --git a/manual/=stdarg.texi b/manual/=stdarg.texi new file mode 100644 index 0000000000..384c992f13 --- /dev/null +++ b/manual/=stdarg.texi @@ -0,0 +1,290 @@ +@node Variable Argument Facilities, Memory Allocation, Common Definitions, Top +@chapter Variable Argument Facilities +@cindex variadic argument functions +@cindex variadic functions +@cindex variable number of arguments +@cindex optional arguments + +ANSI C defines a syntax as part of the kernel language for specifying +functions that take a variable number or type of arguments. (Such +functions are also referred to as @dfn{variadic functions}.) However, +the kernel language provides no mechanism for actually accessing +non-required arguments; instead, you use the variable arguments macros +defined in @file{stdarg.h}. +@pindex stdarg.h + +@menu +* Why Variable Arguments are Used:: Using variable arguments can + save you time and effort. +* How Variable Arguments are Used:: An overview of the facilities for + receiving variable arguments. +* Variable Arguments Interface:: Detailed specification of the + library facilities. +* Example of Variable Arguments:: A complete example. +@end menu + +@node Why Variable Arguments are Used, How Variable Arguments are Used, , Variable Argument Facilities +@section Why Variable Arguments are Used + +Most C functions take a fixed number of arguments. When you define a +function, you also supply a specific data type for each argument. +Every call to the function should supply the same number and type of +arguments as specified in the function definition. + +On the other hand, sometimes a function performs an operation that can +meaningfully accept an unlimited number of arguments. 
+ +For example, consider a function that joins its arguments into a linked +list. It makes sense to connect any number of arguments together into a +list of arbitrary length. Without facilities for variable arguments, +you would have to define a separate function for each possible number of +arguments you might want to link together. This is an example of a +situation where some kind of mapping or iteration is performed over an +arbitrary number of arguments of the same type. + +Another kind of application where variable arguments can be useful is +for functions where values for some arguments can simply be omitted in +some calls, either because they are not used at all or because the +function can determine appropriate defaults for them if they're missing. + +The library function @code{printf} (@pxref{Formatted Output}) is an +example of still another class of function where variable arguments are +useful. This function prints its arguments (which can vary in type as +well as number) under the control of a format template string. + +@node How Variable Arguments are Used, Variable Arguments Interface, Why Variable Arguments are Used, Variable Argument Facilities +@section How Variable Arguments are Used + +This section describes how you can define and call functions that take +variable arguments, and how to access the values of the non-required +arguments. + +@menu +* Syntax for Variable Arguments:: How to make a prototype for a + function with variable arguments. +* Receiving the Argument Values:: Steps you must follow to access the + optional argument values. +* How Many Arguments:: How to decide whether there are more + arguments. +* Calling Variadic Functions:: Things you need to know about calling + variable arguments functions. 
+@end menu + +@node Syntax for Variable Arguments, Receiving the Argument Values, , How Variable Arguments are Used +@subsection Syntax for Variable Arguments + +A function that accepts a variable number of arguments must have at +least one required argument with a specified type. In the function +definition or prototype declaration, you indicate the fact that a +function can accept additional arguments of unspecified type by putting +@samp{@dots{}} at the end of the arguments. For example, + +@example +int +func (const char *a, int b, @dots{}) +@{ + @dots{} +@} +@end example + +@noindent +outlines a definition of a function @code{func} which returns an +@code{int} and takes at least two arguments, the first two being a +@code{const char *} and an @code{int}.@refill + +An obscure restriction placed by the ANSI C standard is that the last +required argument must not be declared @code{register} in the function +definition. Furthermore, this argument must not be of a function or +array type, and may not be, for example, a @code{char} or @code{short +int} (whether signed or not) or a @code{float}. + +@strong{Compatibility Note:} Many older C dialects provide a similar, +but incompatible, mechanism for defining functions with variable numbers +of arguments. In particular, the @samp{@dots{}} syntax is a new feature +of ANSI C. + + +@node Receiving the Argument Values, How Many Arguments, Syntax for Variable Arguments, How Variable Arguments are Used +@subsection Receiving the Argument Values + +Inside the definition of a variadic function, you access the optional +arguments with the following three-step process: + +@enumerate +@item +You initialize an argument pointer variable of type @code{va_list} using +@code{va_start}. + +@item +You access the optional arguments by successive calls to @code{va_arg}. + +@item +You call @code{va_end} to indicate that you are finished accessing the +arguments.
+@end enumerate + +Steps 1 and 3 must be performed in the function that is defined to +accept variable arguments. However, you can pass the @code{va_list} +variable as an argument to another function and perform all or part of +step 2 there. After doing this, the value of the @code{va_list} +variable in the calling function becomes undefined for further calls to +@code{va_arg}; you should just pass it to @code{va_end}. + +You can perform the entire sequence of the three steps multiple times +within a single function invocation. And, if the function doesn't want +to look at its optional arguments at all, it doesn't have to do any of +these steps. It is also perfectly all right for a function to access +fewer arguments than were supplied in the call, but you will get garbage +values if you try to access too many arguments. + + +@node How Many Arguments, Calling Variadic Functions, Receiving the Argument Values, How Variable Arguments are Used +@subsection How Many Arguments Were Supplied + +There is no general way for a function to determine the number and type +of the actual values that were passed as optional arguments. Typically, +the value of one of the required arguments is used to tell the function +this information. It is up to you to define an appropriate calling +convention for each function, and write all calls accordingly. + +One calling convention is to make one of the required arguments be an +explicit argument count. This convention is usable if all of the +optional arguments are of the same type. + +A required argument can be used as a pattern to specify both the number +and types of the optional arguments. The format template string +argument to @code{printf} is one example of this. + +A similar technique that is sometimes used is to have one of the +required arguments be a bit mask, with a bit for each possible optional +argument that might be supplied. 
The bits are tested in a predefined +sequence; if the bit is set, the value of the next argument is +retrieved, and otherwise a default value is used. + +Another technique that is sometimes used is to pass an ``end marker'' +value as the last optional argument. For example, for a function that +manipulates an arbitrary number of pointer arguments, a null pointer +might indicate the end of the argument list, provided that a null +pointer isn't otherwise meaningful to the function. + + +@node Calling Variadic Functions, , How Many Arguments, How Variable Arguments are Used +@subsection Calling Variadic Functions + +Functions that are @emph{defined} to be variadic must also be +@emph{declared} to be variadic using a function prototype in the scope +of all calls to them. This is because C compilers might use a different +internal function call protocol for variadic functions than for +functions that take a fixed number and type of arguments. If the +compiler can't determine in advance that the function being called is +variadic, it may end up trying to call it incorrectly and your program +won't work. +@cindex function prototypes +@cindex prototypes for variadic functions +@cindex variadic functions need prototypes + +Since the prototype doesn't specify types for optional arguments, in a +call to a variadic function the @dfn{default argument promotions} are +performed on the optional argument values. This means that objects of +type @code{char} or @code{short int} (whether signed or not) are +promoted to either @code{int} or @code{unsigned int}, as appropriate; +and that objects of type @code{float} are promoted to type +@code{double}. So, if the caller passes a @code{char} as an optional +argument, it is promoted to an @code{int}, and the function should get it +with @code{va_arg (@var{ap}, int)}. + +Promotions of the required arguments are determined by the function +prototype in the usual way (as if by assignment to the types of the +corresponding formal parameters).
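The effect of the default argument promotions on the types named in @code{va_arg} can be sketched as follows. The function @code{sum_promoted} and its one-letter @var{kinds} convention are hypothetical, invented here for illustration: @samp{c} means the caller passed a @code{char}, @samp{i} an @code{int}, and @samp{f} a @code{float}.

```c
#include <stdarg.h>

/* Hypothetical helper, for illustration only: add up the optional
   arguments described by KINDS, one letter per argument.  */
double
sum_promoted (const char *kinds, ...)
{
  va_list ap;
  double sum = 0.0;

  va_start (ap, kinds);
  for (; *kinds != '\0'; kinds++)
    switch (*kinds)
      {
      case 'c':                 /* A char arrives promoted to int.  */
      case 'i':
        sum += va_arg (ap, int);
        break;
      case 'f':                 /* A float arrives promoted to double.  */
        sum += va_arg (ap, double);
        break;
      default:                  /* Unknown letter: fetch nothing.  */
        break;
      }
  va_end (ap);
  return sum;
}
```

A call such as @code{sum_promoted ("cif", (char) 1, 2, 1.5f)} passes a @code{char}, an @code{int} and a @code{float}, yet the function only ever names @code{int} and @code{double} in @code{va_arg}.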
+@cindex default argument promotions +@cindex argument promotion + +@node Variable Arguments Interface, Example of Variable Arguments, How Variable Arguments are Used, Variable Argument Facilities +@section Variable Arguments Interface + +Here are descriptions of the macros used to retrieve variable arguments. +These macros are defined in the header file @file{stdarg.h}. +@pindex stdarg.h + +@comment stdarg.h +@comment ANSI +@deftp {Data Type} va_list +The type @code{va_list} is used for argument pointer variables. +@end deftp + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} void va_start (va_list @var{ap}, @var{last_required}) +This macro initializes the argument pointer variable @var{ap} to point +to the first of the optional arguments of the current function; +@var{last_required} must be the last required argument to the function. +@end deftypefn + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} @var{type} va_arg (va_list @var{ap}, @var{type}) +The @code{va_arg} macro returns the value of the next optional argument, +and changes the internal state of @var{ap} to move past this argument. +Thus, successive uses of @code{va_arg} return successive optional +arguments. +The type of the value returned by @code{va_arg} is the @var{type} +specified in the call. + +The @var{type} must match the type of the actual argument, and must not +be @code{char} or @code{short int} or @code{float}. (Remember that the +default argument promotions apply to optional arguments.) +@end deftypefn + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} void va_end (va_list @var{ap}) +This ends the use of @var{ap}. After a @code{va_end} call, further +@code{va_arg} calls with the same @var{ap} may not work. You should invoke +@code{va_end} before returning from the function in which @code{va_start} +was invoked with the same @var{ap} argument. + +In the GNU C library, @code{va_end} does nothing, and you need not ever +use it except for reasons of portability.
+@refill +@end deftypefn + + +@node Example of Variable Arguments, , Variable Arguments Interface, Variable Argument Facilities +@section Example of Variable Arguments + +Here is a complete sample function that accepts variable numbers of +arguments. The first argument to the function is the count of remaining +arguments, which are added up and the result returned. (This is +obviously a rather pointless function, but it serves to illustrate the +way the variable arguments facility is commonly used.) + +@comment Yes, this example has been tested. + +@example +#include <stdarg.h> +#include <stdio.h> + +int +add_em_up (int count, @dots{}) +@{ + va_list ap; + int i, sum; + + va_start (ap, count); /* @r{Initialize the argument list.} */ + + sum = 0; + for (i = 0; i < count; i++) + sum = sum + va_arg (ap, int); /* @r{Get the next argument value.} */ + + va_end (ap); /* @r{Clean up.} */ + return sum; +@} + +int +main (void) +@{ + /* @r{This call prints 16.} */ + printf ("%d\n", add_em_up (3, 5, 5, 6)); + + /* @r{This call prints 55.} */ + printf ("%d\n", add_em_up (10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); + + return 0; +@} +@end example diff --git a/manual/=stddef.texi b/manual/=stddef.texi new file mode 100644 index 0000000000..28d4b26f33 --- /dev/null +++ b/manual/=stddef.texi @@ -0,0 +1,81 @@ +@node Common Definitions, Memory Allocation, Error Reporting, Top +@chapter Common Definitions + +There are some miscellaneous data types and macros that are not part of +the C language kernel but are nonetheless almost universally used, such +as the macro @code{NULL}. In order to use these type and macro +definitions, your program should include the header file +@file{stddef.h}. +@pindex stddef.h + +@comment stddef.h +@comment ANSI +@deftp {Data Type} ptrdiff_t +This is the signed integer type of the result of subtracting two +pointers. For example, with the declaration @code{char *p1, *p2;}, the +expression @code{p2 - p1} is of type @code{ptrdiff_t}.
This will +probably be one of the standard signed integer types (@code{short int}, +@code{int} or @code{long int}), but might be a nonstandard type that +exists only for this purpose. +@end deftp + +@comment stddef.h +@comment ANSI +@deftp {Data Type} size_t +This is an unsigned integer type used to represent the sizes of objects. +The result of the @code{sizeof} operator is of this type, and functions +such as @code{malloc} (@pxref{Unconstrained Allocation}) and +@code{memcpy} (@pxref{Copying and Concatenation}) that manipulate +objects of arbitrary sizes accept arguments of this type to specify +object sizes. +@end deftp + +In the GNU system @code{size_t} is equivalent to one of the types +@code{unsigned int} and @code{unsigned long int}. These types have +identical properties on the GNU system, and for most purposes, you +can use them interchangeably. However, they are distinct types, +and in certain contexts, you may not treat them as identical. For +example, when you specify the type of a function argument in a +function prototype, it makes a difference which one you use. If +the system header files declare @code{malloc} with an argument +of type @code{size_t} and you declare @code{malloc} with an argument +of type @code{unsigned int}, you will get a compilation error if +@code{size_t} happens to be @code{unsigned long int} on your system. +To avoid any possibility of error, when a function argument is +supposed to have type @code{size_t}, always write the type as +@code{size_t}, and make no assumptions about what that type might +actually be. + +@strong{Compatibility Note:} Types such as @code{size_t} are new +features of ANSI C. Older, pre-ANSI C implementations have +traditionally used @code{unsigned int} for representing object sizes +and @code{int} for pointer subtraction results. + +@comment stddef.h +@comment ANSI +@deftypevr Macro {void *} NULL +@cindex null pointer +This is a null pointer constant. 
It can be assigned to any pointer +variable since it has type @code{void *}, and is guaranteed not to +point to any real object. This macro is the best way to get a null +pointer value. You can also use @code{0} or @code{(void *)0} as a null +pointer constant, but using @code{NULL} makes the purpose of the +constant more evident. + +When passing a null pointer as an argument to a function for which there +is no prototype declaration in scope, you should explicitly cast +@code{NULL} or @code{0} into a pointer of the appropriate type. Again, +this is because the default argument promotions may not do the right +thing. +@end deftypevr + +@comment stddef.h +@comment ANSI +@deftypefn {Macro} size_t offsetof (@var{type}, @var{member}) +This expands to an integer constant expression that is the offset of the +structure member named @var{member} in a @code{struct} of type +@var{type}. For example, @code{offsetof (struct s, elem)} is the +offset, in bytes, of the member @code{elem} in a @code{struct s}. This +macro won't work if @var{member} is a bit field; you get an error from +the C compiler in that case. +@end deftypefn diff --git a/manual/Makefile b/manual/Makefile new file mode 100644 index 0000000000..57e6ae2306 --- /dev/null +++ b/manual/Makefile @@ -0,0 +1,186 @@ +# Makefile for the GNU C Library manual. + +# Copyright (C) 1992, 1993, 1994 Free Software Foundation, Inc. +# This file is part of the GNU C Library. + +# The GNU C Library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Library General Public License +# as published by the Free Software Foundation; either version 2 of +# the License, or (at your option) any later version. + +# The GNU C Library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Library General Public License for more details.
+ +# You should have received a copy of the GNU Library General Public +# License along with the GNU C Library; see the file COPYING.LIB. If +# not, write to the Free Software Foundation, Inc., 675 Mass Ave, +# Cambridge, MA 02139, USA. + +subdir := manual +export subdir := $(subdir) + +.PHONY: all dvi info +all: dvi info +dvi: libc.dvi +info: libc.info + +# Get glibc's configuration info. +ifneq (,$(wildcard ../Makeconfig)) +include ../Makeconfig +endif + +# Set chapters and chapters-incl. +include chapters +chapters: libc.texinfo + $(find-includes) +chapters := $(filter-out summary.texi,$(chapters)) +ifdef chapters +include chapters-incl +chapters-incl: $(chapters) + $(find-includes) +endif + +define find-includes +(echo '$(@F) :=' \\ ;\ + awk '$$1 == "@include" { print $$2 " \\" }' $^) > $@.new +mv -f $@.new $@ +endef + +libc.dvi libc.info: $(chapters) summary.texi $(chapters-incl) +libc.dvi: texinfo.tex + +# Generate the summary from the Texinfo source files for each chapter. +summary.texi: stamp-summary ; +stamp-summary: summary.awk $(chapters) $(chapters-incl) + awk -f $^ \ + | sort -df +1 -2 | tr '\014' '\012' > summary-tmp + ./move-if-change summary-tmp summary.texi +# touch is broken on our machines. Sigh. + date > $@ + +# Generate Texinfo files from the C source for the example programs. +%.c.texi: examples/%.c + sed -e 's,[{}],@&,g' \ + -e 's,/\*\(@.*\)\*/,\1,g' \ + -e 's,/\* *,/* @r{,g' -e 's, *\*/,} */,' \ + -e 's/\(@[a-z][a-z]*\)@{\([^}]*\)@}/\1{\2}/'\ + $< | expand > $@.new + mv -f $@.new $@ + + +minimal-dist = summary.awk move-if-change libc.texinfo $(chapters) \ + $(patsubst %.c.texi,examples/%.c, \ + $(filter-out summary.texi,$(chapters-incl))) +doc-only-dist = Makefile COPYING.LIB mkinstalldirs +distribute = $(minimal-dist) \ + $(patsubst examples/%.c,%.c.texi,$(filter examples/%.c, \ + $(minimal-dist))) \ + libc.?? 
libc.??s texinfo.tex summary.texi \ + stamp-summary chapters chapters-incl +export distribute := $(distribute) + +tar-it = tar chovf $@ $^ + +manual.tar: $(doc-only-dist) $(minimal-dist) ; $(tar-it) +mandist.tar: $(doc-only-dist) $(distribute) ; $(tar-it) + +edition := $(shell sed -n 's/^@set EDITION \([0-9][0-9.]*\)[^0-9.]*.*$$/\1/p' \ + libc.texinfo) + +glibc-doc-$(edition).tar: $(doc-only-dist) $(distribute) + @rm -f glibc-doc-$(edition) + ln -s . glibc-doc-$(edition) + tar chovf $@ $(addprefix glibc-doc-$(edition)/,$^) + rm -f glibc-doc-$(edition) + +%.Z: % + compress -c $< > $@.new + mv -f $@.new $@ +%.gz: % + gzip -9 -c $< > $@.new + mv -f $@.new $@ +%.uu: % + uuencode $< < $< > $@.new + mv -f $@.new $@ + +# The parent makefile sometimes invokes us with targets `subdir_REAL-TARGET'. +subdir_%: % ; + +.PHONY: mostlyclean distclean realclean clean +mostlyclean: + -rm -f libc.dvi libc.info* +clean: mostlyclean +distclean: clean +indices = cp fn pg tp vr ky +realclean: distclean + -rm -f chapters chapters-incl summary.texi stamp-summary *.c.texi + -rm -f $(foreach index,$(indices),libc.$(index) libc.$(index)s) + -rm -f libc.log libc.aux libc.toc + +.PHONY: install subdir_install installdirs install-data +install-data subdir_install: install +install: $(infodir)/libc.info +# Catchall implicit rule for other installation targets from the parent. 
+install-%: ; + +ifndef infodir +infodir = $(prefix)/info +endif +ifndef prefix +prefix = /usr/local +endif + +ifndef INSTALL_DATA +INSTALL_DATA = $(INSTALL) -m 644 +endif +ifndef INSTALL +INSTALL = install +endif + +$(infodir)/libc.info: libc.info installdirs + for file in $<*; do \ + name=`basename $$file`; \ + $(INSTALL_DATA) $$file \ + `echo $@ | sed "s,$<\$$,$$name,"`; \ + done + +installdirs: $(firstword $(wildcard mkinstalldirs ../mkinstalldirs)) + $(dir $<)$(notdir $<) $(infodir) + +.PHONY: dist +dist: # glibc-doc-$(edition).tar.gz + +ifneq (,$(wildcard ../Make-dist)) +dist: ../Make-dist + $(MAKE) -f $< $(Make-dist-args) +endif + +ifndef ETAGS +ETAGS = etags -T +endif +TAGS: $(minimal-dist) + $(ETAGS) -o $@ $^ + +# These are targets that each glibc subdirectory is expected to understand. +# ../Rules defines them for code subdirectories; for us, they are no-ops. +glibc-targets := subdir_lib objects objs others tests subdir_lint.out \ + subdir_echo-headers subdir_echo-distinfo stubs +.PHONY: $(glibc-targets) +$(glibc-targets): + +stubs: $(common-objpfx)stub-manual +$(common-objpfx)stub-manual: + cp /dev/null $@ + +# The top-level glibc Makefile expects subdir_install to update the stubs file. +subdir_install: stubs + + +# Get rid of these variables if they came from the parent. +routines = +aux = +sources = +objects = +headers = diff --git a/manual/arith.texi b/manual/arith.texi new file mode 100644 index 0000000000..a5d2814b1d --- /dev/null +++ b/manual/arith.texi @@ -0,0 +1,623 @@ +@node Arithmetic, Date and Time, Mathematics, Top +@chapter Low-Level Arithmetic Functions + +This chapter contains information about functions for doing basic +arithmetic operations, such as splitting a float into its integer and +fractional parts. These functions are declared in the header file +@file{math.h}. + +@menu +* Not a Number:: Making NaNs and testing for NaNs. +* Predicates on Floats:: Testing for infinity and for NaNs. +* Absolute Value:: Absolute value functions. 
+* Normalization Functions:: Hacks for radix-2 representations. +* Rounding and Remainders:: Determining the integer and + fractional parts of a float. +* Integer Division:: Functions for performing integer + division. +* Parsing of Numbers:: Functions for ``reading'' numbers + from strings. +@end menu + +@node Not a Number +@section ``Not a Number'' Values +@cindex NaN +@cindex not a number +@cindex IEEE floating point + +The IEEE floating point format used by most modern computers supports +values that are ``not a number''. These values are called @dfn{NaNs}. +``Not a number'' values result from certain operations which have no +meaningful numeric result, such as zero divided by zero or infinity +divided by infinity. + +One noteworthy property of NaNs is that they are not equal to +themselves. Thus, @code{x == x} can be 0 if the value of @code{x} is a +NaN. You can use this to test whether a value is a NaN or not: if it is +not equal to itself, then it is a NaN. But the recommended way to test +for a NaN is with the @code{isnan} function (@pxref{Predicates on Floats}). + +Almost any arithmetic operation in which one argument is a NaN returns +a NaN. + +@comment math.h +@comment GNU +@deftypevr Macro double NAN +An expression representing a value which is ``not a number''. This +macro is a GNU extension, available only on machines that support ``not +a number'' values---that is to say, on all machines that support IEEE +floating point. + +You can use @samp{#ifdef NAN} to test whether the machine supports +NaNs. (Of course, you must arrange for GNU extensions to be visible, +such as by defining @code{_GNU_SOURCE}, and then you must include +@file{math.h}.) +@end deftypevr + +@node Predicates on Floats +@section Predicates on Floats + +@pindex math.h +This section describes some miscellaneous test functions on doubles. +Prototypes for these functions appear in @file{math.h}.
These are BSD +functions, and thus are available if you define @code{_BSD_SOURCE} or +@code{_GNU_SOURCE}. + +@comment math.h +@comment BSD +@deftypefun int isinf (double @var{x}) +This function returns @code{-1} if @var{x} represents negative infinity, +@code{1} if @var{x} represents positive infinity, and @code{0} otherwise. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun int isnan (double @var{x}) +This function returns a nonzero value if @var{x} is a ``not a number'' +value, and zero otherwise. (You can just as well use @code{@var{x} != +@var{x}} to get the same result.) +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun int finite (double @var{x}) +This function returns a nonzero value if @var{x} is finite or a ``not a +number'' value, and zero otherwise. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double infnan (int @var{error}) +This function is provided for compatibility with BSD. The other +mathematical functions use @code{infnan} to decide what to return on +occasion of an error. Its argument is an error code, @code{EDOM} or +@code{ERANGE}; @code{infnan} returns a suitable value to indicate this +with. @code{-ERANGE} is also acceptable as an argument, and corresponds +to @code{-HUGE_VAL} as a value. + +In the BSD library, on certain machines, @code{infnan} raises a fatal +signal in all cases. The GNU library does not do likewise, because that +does not fit the ANSI C specification. +@end deftypefun + +@strong{Portability Note:} The functions listed in this section are BSD +extensions. + +@node Absolute Value +@section Absolute Value +@cindex absolute value functions + +These functions are provided for obtaining the @dfn{absolute value} (or +@dfn{magnitude}) of a number. The absolute value of a real number +@var{x} is @var{x} if @var{x} is positive, @minus{}@var{x} if @var{x} is +negative.
For a complex number @var{z}, whose real part is @var{x} and +whose imaginary part is @var{y}, the absolute value is @w{@code{sqrt +(@var{x}*@var{x} + @var{y}*@var{y})}}. + +@pindex math.h +@pindex stdlib.h +Prototypes for @code{abs} and @code{labs} are in @file{stdlib.h}; +@code{fabs} and @code{cabs} are declared in @file{math.h}. + +@comment stdlib.h +@comment ANSI +@deftypefun int abs (int @var{number}) +This function returns the absolute value of @var{number}. + +Most computers use a two's complement integer representation, in which +the absolute value of @code{INT_MIN} (the smallest possible @code{int}) +cannot be represented; thus, @w{@code{abs (INT_MIN)}} is not defined. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun {long int} labs (long int @var{number}) +This is similar to @code{abs}, except that both the argument and result +are of type @code{long int} rather than @code{int}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double fabs (double @var{number}) +This function returns the absolute value of the floating-point number +@var{number}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double cabs (struct @{ double real, imag; @} @var{z}) +The @code{cabs} function returns the absolute value of the complex +number @var{z}, whose real part is @code{@var{z}.real} and whose +imaginary part is @code{@var{z}.imag}. (See also the function +@code{hypot} in @ref{Exponents and Logarithms}.) The value is: + +@smallexample +sqrt (@var{z}.real*@var{z}.real + @var{z}.imag*@var{z}.imag) +@end smallexample +@end deftypefun + +@node Normalization Functions +@section Normalization Functions +@cindex normalization functions (floating-point) + +The functions described in this section are primarily provided as a way +to efficiently perform certain low-level manipulations on floating point +numbers that are represented internally using a binary radix; +see @ref{Floating Point Concepts}. 
These functions are required to +have equivalent behavior even if the representation does not use a radix +of 2, but of course they are unlikely to be particularly efficient in +those cases. + +@pindex math.h +All these functions are declared in @file{math.h}. + +@comment math.h +@comment ANSI +@deftypefun double frexp (double @var{value}, int *@var{exponent}) +The @code{frexp} function is used to split the number @var{value} +into a normalized fraction and an exponent. + +If the argument @var{value} is not zero, the return value is @var{value} +times a power of two, and is always in the range 1/2 (inclusive) to 1 +(exclusive). The corresponding exponent is stored in +@code{*@var{exponent}}; the return value multiplied by 2 raised to this +exponent equals the original number @var{value}. + +For example, @code{frexp (12.8, &exponent)} returns @code{0.8} and +stores @code{4} in @code{exponent}. + +If @var{value} is zero, then the return value is zero and +zero is stored in @code{*@var{exponent}}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double ldexp (double @var{value}, int @var{exponent}) +This function returns the result of multiplying the floating-point +number @var{value} by 2 raised to the power @var{exponent}. (It can +be used to reassemble floating-point numbers that were taken apart +by @code{frexp}.) + +For example, @code{ldexp (0.8, 4)} returns @code{12.8}. +@end deftypefun + +The following functions which come from BSD provide facilities +equivalent to those of @code{ldexp} and @code{frexp}: + +@comment math.h +@comment BSD +@deftypefun double scalb (double @var{value}, int @var{exponent}) +The @code{scalb} function is the BSD name for @code{ldexp}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double logb (double @var{x}) +This BSD function returns the integer part of the base-2 logarithm of +@var{x}, an integer value represented in type @code{double}. 
This is +the highest integer power of @code{2} contained in @var{x}. The sign of +@var{x} is ignored. For example, @code{logb (3.5)} is @code{1.0} and +@code{logb (4.0)} is @code{2.0}. + +When @code{2} raised to this power is divided into @var{x}, it gives a +quotient between @code{1} (inclusive) and @code{2} (exclusive). + +If @var{x} is zero, the value is minus infinity (if the machine supports +such a value), or else a very small number. If @var{x} is infinity, the +value is infinity. + +The value returned by @code{logb} is one less than the value that +@code{frexp} would store into @code{*@var{exponent}}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double copysign (double @var{value}, double @var{sign}) +The @code{copysign} function returns a value whose absolute value is the +same as that of @var{value}, and whose sign matches that of @var{sign}. +This is a BSD function. +@end deftypefun + +@node Rounding and Remainders +@section Rounding and Remainder Functions +@cindex rounding functions +@cindex remainder functions +@cindex converting floats to integers + +@pindex math.h +The functions listed here perform operations such as rounding, +truncation, and remainder in division of floating point numbers. Some +of these functions convert floating point numbers to integer values. +They are all declared in @file{math.h}. + +You can also convert floating-point numbers to integers simply by +casting them to @code{int}. This discards the fractional part, +effectively rounding towards zero. However, this only works if the +result can actually be represented as an @code{int}---for very large +numbers, this is impossible. The functions listed here return the +result as a @code{double} instead to get around this problem. + +@comment math.h +@comment ANSI +@deftypefun double ceil (double @var{x}) +The @code{ceil} function rounds @var{x} upwards to the nearest integer, +returning that value as a @code{double}. Thus, @code{ceil (1.5)} +is @code{2.0}. 
+@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double floor (double @var{x}) +The @code{floor} function rounds @var{x} downwards to the nearest +integer, returning that value as a @code{double}. Thus, @code{floor +(1.5)} is @code{1.0} and @code{floor (-1.5)} is @code{-2.0}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double rint (double @var{x}) +This function rounds @var{x} to an integer value according to the +current rounding mode. @xref{Floating Point Parameters}, for +information about the various rounding modes. The default +rounding mode is to round to the nearest integer; some machines +support other modes, but round-to-nearest is always used unless +you explicitly select another. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double modf (double @var{value}, double *@var{integer-part}) +This function breaks the argument @var{value} into an integer part and a +fractional part (between @code{-1} and @code{1}, exclusive). Their sum +equals @var{value}. Each of the parts has the same sign as @var{value}, +so the rounding of the integer part is towards zero. + +@code{modf} stores the integer part in @code{*@var{integer-part}}, and +returns the fractional part. For example, @code{modf (2.5, &intpart)} +returns @code{0.5} and stores @code{2.0} into @code{intpart}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double fmod (double @var{numerator}, double @var{denominator}) +This function computes the remainder from the division of +@var{numerator} by @var{denominator}. Specifically, the return value is +@code{@var{numerator} - @w{@var{n} * @var{denominator}}}, where @var{n} +is the quotient of @var{numerator} divided by @var{denominator}, rounded +towards zero to an integer. Thus, @w{@code{fmod (6.5, 2.3)}} returns +@code{1.9}, which is @code{6.5} minus @code{4.6}. + +The result has the same sign as the @var{numerator} and has magnitude +less than the magnitude of the @var{denominator}.
+ +If @var{denominator} is zero, @code{fmod} fails and sets @code{errno} to +@code{EDOM}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double drem (double @var{numerator}, double @var{denominator}) +The function @code{drem} is like @code{fmod} except that it rounds the +internal quotient @var{n} to the nearest integer instead of towards zero +to an integer. For example, @code{drem (6.5, 2.3)} returns @code{-0.4}, +which is @code{6.5} minus @code{6.9}. + +The absolute value of the result is less than or equal to half the +absolute value of the @var{denominator}. The difference between +@code{fmod (@var{numerator}, @var{denominator})} and @code{drem +(@var{numerator}, @var{denominator})} is always either +@var{denominator}, minus @var{denominator}, or zero. + +If @var{denominator} is zero, @code{drem} fails and sets @code{errno} to +@code{EDOM}. +@end deftypefun + + +@node Integer Division +@section Integer Division +@cindex integer division functions + +This section describes functions for performing integer division. These +functions are redundant in the GNU C library, since in GNU C the @samp{/} +operator always rounds towards zero. But in other C implementations, +@samp{/} may round differently with negative arguments. @code{div} and +@code{ldiv} are useful because they specify how to round the quotient: +towards zero. The remainder has the same sign as the numerator. + +These functions are specified to return a result @var{r} such that the value +@code{@var{r}.quot*@var{denominator} + @var{r}.rem} equals +@var{numerator}. + +@pindex stdlib.h +To use these facilities, you should include the header file +@file{stdlib.h} in your program. + +@comment stdlib.h +@comment ANSI +@deftp {Data Type} div_t +This is a structure type used to hold the result returned by the @code{div} +function. It has the following members: + +@table @code +@item int quot +The quotient from the division. + +@item int rem +The remainder from the division. 
+@end table +@end deftp + +@comment stdlib.h +@comment ANSI +@deftypefun div_t div (int @var{numerator}, int @var{denominator}) +This function @code{div} computes the quotient and remainder from +the division of @var{numerator} by @var{denominator}, returning the +result in a structure of type @code{div_t}. + +If the result cannot be represented (as in a division by zero), the +behavior is undefined. + +Here is an example, albeit not a very useful one. + +@smallexample +div_t result; +result = div (20, -6); +@end smallexample + +@noindent +Now @code{result.quot} is @code{-3} and @code{result.rem} is @code{2}. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftp {Data Type} ldiv_t +This is a structure type used to hold the result returned by the @code{ldiv} +function. It has the following members: + +@table @code +@item long int quot +The quotient from the division. + +@item long int rem +The remainder from the division. +@end table + +(This is identical to @code{div_t} except that the components are of +type @code{long int} rather than @code{int}.) +@end deftp + +@comment stdlib.h +@comment ANSI +@deftypefun ldiv_t ldiv (long int @var{numerator}, long int @var{denominator}) +The @code{ldiv} function is similar to @code{div}, except that the +arguments are of type @code{long int} and the result is returned as a +structure of type @code{ldiv_t}. +@end deftypefun + + +@node Parsing of Numbers +@section Parsing of Numbers +@cindex parsing numbers (in formatted input) +@cindex converting strings to numbers +@cindex number syntax, parsing +@cindex syntax, for reading numbers + +This section describes functions for ``reading'' integer and +floating-point numbers from a string. It may be more convenient in some +cases to use @code{sscanf} or one of the related functions; see +@ref{Formatted Input}. But often you can make a program more robust by +finding the tokens in the string by hand, then converting the numbers +one by one.
+ +@menu +* Parsing of Integers:: Functions for conversion of integer values. +* Parsing of Floats:: Functions for conversion of floating-point + values. +@end menu + +@node Parsing of Integers +@subsection Parsing of Integers + +@pindex stdlib.h +These functions are declared in @file{stdlib.h}. + +@comment stdlib.h +@comment ANSI +@deftypefun {long int} strtol (const char *@var{string}, char **@var{tailptr}, int @var{base}) +The @code{strtol} (``string-to-long'') function converts the initial +part of @var{string} to a signed integer, which is returned as a value +of type @code{long int}. + +This function attempts to decompose @var{string} as follows: + +@itemize @bullet +@item +A (possibly empty) sequence of whitespace characters. Which characters +are whitespace is determined by the @code{isspace} function +(@pxref{Classification of Characters}). These are discarded. + +@item +An optional plus or minus sign (@samp{+} or @samp{-}). + +@item +A nonempty sequence of digits in the radix specified by @var{base}. + +If @var{base} is zero, decimal radix is assumed unless the series of +digits begins with @samp{0} (specifying octal radix), or @samp{0x} or +@samp{0X} (specifying hexadecimal radix); in other words, the same +syntax used for integer constants in C. + +Otherwise @var{base} must have a value between @code{2} and @code{36}. +If @var{base} is @code{16}, the digits may optionally be preceded by +@samp{0x} or @samp{0X}. + +@item +Any remaining characters in the string. If @var{tailptr} is not a null +pointer, @code{strtol} stores a pointer to this tail in +@code{*@var{tailptr}}. +@end itemize + +If the string is empty, contains only whitespace, or does not contain an +initial substring that has the expected syntax for an integer in the +specified @var{base}, no conversion is performed. In this case, +@code{strtol} returns a value of zero and the value stored in +@code{*@var{tailptr}} is the value of @var{string}.
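The no-conversion case just described can be told apart from a genuine result of zero by comparing the stored tail pointer with the original string. The following is only a minimal sketch of that check; the helper name @code{parse_long} is hypothetical (it is not a library function), and base 10 is an assumption made for the example.

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the library: store the parsed
   value in *result and return 1, or return 0 if strtol performed
   no conversion at all.  Base 10 is assumed for illustration.  */
int
parse_long (const char *string, long int *result)
{
  char *tail;
  long int value = strtol (string, &tail, 10);

  /* When no conversion is performed, strtol stores the value of
     string itself in tail.  */
  if (tail == string)
    return 0;

  *result = value;
  return 1;
}
```

With this sketch, a string such as @code{"0"} is reported as a successful conversion, while an empty or all-whitespace string is reported as a failure, even though @code{strtol} returns zero in both situations.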
+ +In a locale other than the standard @code{"C"} locale, this function +may recognize additional implementation-dependent syntax. + +If the string has valid syntax for an integer but the value is not +representable because of overflow, @code{strtol} returns either +@code{LONG_MAX} or @code{LONG_MIN} (@pxref{Range of Type}), as +appropriate for the sign of the value. It also sets @code{errno} +to @code{ERANGE} to indicate there was overflow. + +There is an example at the end of this section. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun {unsigned long int} strtoul (const char *@var{string}, char **@var{tailptr}, int @var{base}) +The @code{strtoul} (``string-to-unsigned-long'') function is like +@code{strtol} except that it returns its value with type @code{unsigned +long int}. The value returned in case of overflow is @code{ULONG_MAX} +(@pxref{Range of Type}). +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun {long int} atol (const char *@var{string}) +This function is similar to the @code{strtol} function with a @var{base} +argument of @code{10}, except that it need not detect overflow errors. +The @code{atol} function is provided mostly for compatibility with +existing code; using @code{strtol} is more robust. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun int atoi (const char *@var{string}) +This function is like @code{atol}, except that it returns an @code{int} +value rather than @code{long int}. The @code{atoi} function is also +considered obsolete; use @code{strtol} instead. 
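As a rough illustration of replacing @code{atoi} with @code{strtol}, the sketch below adds the error checks that @code{atoi} need not perform. The name @code{checked_atoi} is a hypothetical helper introduced only for this example; it assumes base 10 and, like @code{atoi}, ignores any characters after the number.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the library: like atoi, but it
   reports failure.  Returns 0 on success, or -1 if the string holds
   no number or the value does not fit in an int.  */
int
checked_atoi (const char *string, int *result)
{
  char *tail;
  long int value;

  errno = 0;
  value = strtol (string, &tail, 10);

  if (tail == string)
    return -1;                  /* No digits were found.  */
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    return -1;                  /* Out of range for int.  */

  *result = (int) value;
  return 0;
}
```

Clearing @code{errno} before the call matters, because @code{strtol} only sets it on overflow and leaves any earlier value in place otherwise.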
+@end deftypefun + +Here is a function which parses a string as a sequence of integers and +returns the sum of them: + +@smallexample +int +sum_ints_from_string (char *string) +@{ + int sum = 0; + + while (1) @{ + char *tail; + int next; + + /* @r{Skip whitespace by hand, to detect the end.} */ + while (isspace (*string)) string++; + if (*string == 0) + break; + + /* @r{There is more nonwhitespace,} */ + /* @r{so it ought to be another number.} */ + errno = 0; + /* @r{Parse it.} */ + next = strtol (string, &tail, 0); + /* @r{Add it in, if not overflow.} */ + if (errno) + printf ("Overflow\n"); + else + sum += next; + /* @r{Advance past it.} */ + string = tail; + @} + + return sum; +@} +@end smallexample + +@node Parsing of Floats +@subsection Parsing of Floats + +@pindex stdlib.h +These functions are declared in @file{stdlib.h}. + +@comment stdlib.h +@comment ANSI +@deftypefun double strtod (const char *@var{string}, char **@var{tailptr}) +The @code{strtod} (``string-to-double'') function converts the initial +part of @var{string} to a floating-point number, which is returned as a +value of type @code{double}. + +This function attempts to decompose @var{string} as follows: + +@itemize @bullet +@item +A (possibly empty) sequence of whitespace characters. Which characters +are whitespace is determined by the @code{isspace} function +(@pxref{Classification of Characters}). These are discarded. + +@item +An optional plus or minus sign (@samp{+} or @samp{-}). + +@item +A nonempty sequence of digits optionally containing a decimal-point +character---normally @samp{.}, but it depends on the locale +(@pxref{Numeric Formatting}). + +@item +An optional exponent part, consisting of a character @samp{e} or +@samp{E}, an optional sign, and a sequence of digits. + +@item +Any remaining characters in the string. If @var{tailptr} is not a null +pointer, a pointer to this tail of the string is stored in +@code{*@var{tailptr}}. 
+@end itemize + +If the string is empty, contains only whitespace, or does not contain an +initial substring that has the expected syntax for a floating-point +number, no conversion is performed. In this case, @code{strtod} returns +a value of zero and the value returned in @code{*@var{tailptr}} is the +value of @var{string}. + +In a locale other than the standard @code{"C"} locale, this function may +recognize additional locale-dependent syntax. + +If the string has valid syntax for a floating-point number but the value +is not representable because of overflow, @code{strtod} returns either +positive or negative @code{HUGE_VAL} (@pxref{Mathematics}), depending on +the sign of the value. Similarly, if the value is not representable +because of underflow, @code{strtod} returns zero. It also sets @code{errno} +to @code{ERANGE} if there was overflow or underflow. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun double atof (const char *@var{string}) +This function is similar to the @code{strtod} function, except that it +need not detect overflow and underflow errors. The @code{atof} function +is provided mostly for compatibility with existing code; using +@code{strtod} is more robust. +@end deftypefun diff --git a/manual/assert.texi b/manual/assert.texi new file mode 100644 index 0000000000..1095dc4754 --- /dev/null +++ b/manual/assert.texi @@ -0,0 +1,113 @@ +@node Consistency Checking, Mathematics, Low-Level Terminal Interface, Top +@chapter Explicitly Checking Internal Consistency +@cindex consistency checking +@cindex impossible events +@cindex assertions + +When you're writing a program, it's often a good idea to put in checks +at strategic places for ``impossible'' errors or violations of basic +assumptions. These kinds of checks are helpful in debugging problems +with the interfaces between different parts of the program, for example. 
+ +@pindex assert.h +The @code{assert} macro, defined in the header file @file{assert.h}, +provides a convenient way to abort the program while printing some +debugging information about where in the program the error was detected. + +@vindex NDEBUG +Once you think your program is debugged, you can disable the error +checks performed by the @code{assert} macro by recompiling with the +macro @code{NDEBUG} defined. This means you don't actually have to +change the program source code to disable these checks. + +But disabling these consistency checks is undesirable unless they make +the program significantly slower. All else being equal, more error +checking is good no matter who is running the program. A wise user +would rather have a program crash, visibly, than have it return nonsense +without indicating anything might be wrong. + +@comment assert.h +@comment ANSI +@deftypefn Macro void assert (int @var{expression}) +Verify the programmer's belief that @var{expression} should be nonzero +at a certain point in the program. + +If @code{NDEBUG} is not defined, @code{assert} tests the value of +@var{expression}. If it is false (zero), @code{assert} aborts the +program (@pxref{Aborting a Program}) after printing a message of the +form: + +@smallexample +@file{@var{file}}:@var{linenum}: @var{function}: Assertion `@var{expression}' failed. +@end smallexample + +@noindent +on the standard error stream @code{stderr} (@pxref{Standard Streams}). +The filename and line number are taken from the C preprocessor macros +@code{__FILE__} and @code{__LINE__} and specify where the call to +@code{assert} was written. When using the GNU C compiler, the name of +the function which calls @code{assert} is taken from the built-in +variable @code{__PRETTY_FUNCTION__}; with older compilers, the function +name and following colon are omitted. + +If the preprocessor macro @code{NDEBUG} is defined before +@file{assert.h} is included, the @code{assert} macro is defined to do +absolutely nothing. 
Even the argument expression @var{expression} is +not evaluated, so you should avoid calling @code{assert} with arguments +that involve side effects. + +For example, @code{assert (++i > 0);} is a bad idea, because @code{i} +will not be incremented if @code{NDEBUG} is defined. +@end deftypefn + +Sometimes the ``impossible'' condition you want to check for is an error +return from an operating system function. Then it is useful to display +not only where the program crashes, but also what error was returned. +The @code{assert_perror} macro makes this easy. + +@comment assert.h +@comment GNU +@deftypefn Macro void assert_perror (int @var{errnum}) +Similar to @code{assert}, but verifies that @var{errnum} is zero. + +If @code{NDEBUG} is not defined, @code{assert_perror} tests the value of +@var{errnum}. If it is nonzero, @code{assert_perror} aborts the program +after printing a message of the form: + +@smallexample +@file{@var{file}}:@var{linenum}: @var{function}: @var{error text} +@end smallexample + +@noindent +on the standard error stream. The file name, line number, and function +name are as for @code{assert}. The error text is the result of +@w{@code{strerror (@var{errnum})}}. @xref{Error Messages}. + +Like @code{assert}, if @code{NDEBUG} is defined before @file{assert.h} +is included, the @code{assert_perror} macro does absolutely nothing. It +does not evaluate the argument, so @var{errnum} should not have any side +effects. It is best for @var{errnum} to be just a simple variable +reference; often it will be @code{errno}. + +This macro is a GNU extension. +@end deftypefn + +@strong{Usage note:} The @code{assert} facility is designed for +detecting @emph{internal inconsistency}; it is not suitable for +reporting invalid input or improper usage. 
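The following sketch illustrates the distinction under some assumed names. @code{parse_percent} handles text that came from the user, so it reports bad input with an ordinary error message and a failure return. @code{scale_by_percent} is called only by the program itself, so it uses @code{assert} to check its argument. Both function names and the 0 to 100 range are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Internal helper: every caller inside the program is supposed to
   pass a PERCENT between 0 and 100.  A violation is a bug in the
   program, so an assertion is appropriate.  */
int
scale_by_percent (int amount, int percent)
{
  assert (percent >= 0 && percent <= 100);
  return amount * percent / 100;
}

/* Parse TEXT, which was supplied by the user, as a percentage.
   On success store the value in *RESULT and return 0.  On invalid
   input print a message and return -1; the program does not abort,
   because bad input is not a bug in the program.  */
int
parse_percent (const char *text, int *result)
{
  char *tail;
  long int value;

  errno = 0;
  value = strtol (text, &tail, 10);
  if (tail == text || *tail != '\0' || errno != 0
      || value < 0 || value > 100)
    {
      fprintf (stderr, "invalid percentage: %s\n", text);
      return -1;
    }

  *result = (int) value;
  return 0;
}
```

A caller would typically pass the result of a successful @code{parse_percent} call on to @code{scale_by_percent}, so the assertion there should never fail unless the program itself is wrong.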
+ +The information in the diagnostic messages provided by the @code{assert} +macro is intended to help you, the programmer, track down the cause +of a bug, but is not really useful in telling a user of your program why +his or her input was invalid or why a command could not be carried out. +So you can't use @code{assert} to print the error messages for these +eventualities. + +What's more, your program should not abort when given invalid input, as +@code{assert} would do---it should exit with nonzero status after +printing its error messages, or perhaps read another command or move +on to the next input file. + +@xref{Error Messages}, for information on printing error messages for +problems that @emph{do not} represent bugs in the program. + diff --git a/manual/conf.texi b/manual/conf.texi new file mode 100644 index 0000000000..86afeca597 --- /dev/null +++ b/manual/conf.texi @@ -0,0 +1,1091 @@ +@node System Configuration, Language Features, System Information, Top +@chapter System Configuration Parameters + +The functions and macros listed in this chapter give information about +configuration parameters of the operating system---for example, capacity +limits, presence of optional POSIX features, and the default path for +executable files (@pxref{String Parameters}). + +@menu +* General Limits:: Constants and functions that describe + various process-related limits that have + one uniform value for any given machine. +* System Options:: Optional POSIX features. +* Version Supported:: Version numbers of POSIX.1 and POSIX.2. +* Sysconf:: Getting specific configuration values + of general limits and system options. +* Minimums:: Minimum values for general limits. + +* Limits for Files:: Size limitations that pertain to individual files. + These can vary between file systems + or even from file to file. +* Options for Files:: Optional features that some files may support. +* File Minimums:: Minimum values for file limits. 
+* Pathconf:: Getting the limit values for a particular file. + +* Utility Limits:: Capacity limits of some POSIX.2 utility programs. +* Utility Minimums:: Minimum allowable values of those limits. + +* String Parameters:: Getting the default search path. +@end menu + +@node General Limits +@section General Capacity Limits +@cindex POSIX capacity limits +@cindex limits, POSIX +@cindex capacity limits, POSIX + +The POSIX.1 and POSIX.2 standards specify a number of parameters that +describe capacity limitations of the system. These limits can be fixed +constants for a given operating system, or they can vary from machine to +machine. For example, some limit values may be configurable by the +system administrator, either at run time or by rebuilding the kernel, +and this should not require recompiling application programs. + +@pindex limits.h +Each of the following limit parameters has a macro that is defined in +@file{limits.h} only if the system has a fixed, uniform limit for the +parameter in question. If the system allows different file systems or +files to have different limits, then the macro is undefined; use +@code{sysconf} to find out the limit that applies at a particular time +on a particular machine. @xref{Sysconf}. + +Each of these parameters also has another macro, with a name starting +with @samp{_POSIX}, which gives the lowest value that the limit is +allowed to have on @emph{any} POSIX system. @xref{Minimums}. + +@cindex limits, program argument size +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int ARG_MAX +If defined, the unvarying maximum combined length of the @var{argv} and +@var{environ} arguments that can be passed to the @code{exec} functions. +@end deftypevr + +@cindex limits, number of processes +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int CHILD_MAX +If defined, the unvarying maximum number of processes that can exist +with the same real user ID at any one time. 
In BSD and GNU, this is +controlled by the @code{RLIMIT_NPROC} resource limit; @pxref{Limits on +Resources}. +@end deftypevr + +@cindex limits, number of open files +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int OPEN_MAX +If defined, the unvarying maximum number of files that a single process +can have open simultaneously. In BSD and GNU, this is controlled +by the @code{RLIMIT_NOFILE} resource limit; @pxref{Limits on Resources}. +@end deftypevr + +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int STREAM_MAX +If defined, the unvarying maximum number of streams that a single +process can have open simultaneously. @xref{Opening Streams}. +@end deftypevr + +@cindex limits, time zone name length +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int TZNAME_MAX +If defined, the unvarying maximum length of a time zone name. +@xref{Time Zone Functions}. +@end deftypevr + +These limit macros are always defined in @file{limits.h}. + +@cindex limits, number of supplementary group IDs +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int NGROUPS_MAX +The maximum number of supplementary group IDs that one process can have. + +The value of this macro is actually a lower bound for the maximum. That +is, you can count on being able to have that many supplementary group +IDs, but a particular machine might let you have even more. You can use +@code{sysconf} to see whether a particular machine will let you have +more (@pxref{Sysconf}). +@end deftypevr + +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int SSIZE_MAX +The largest value that can fit in an object of type @code{ssize_t}. +Effectively, this is the limit on the number of bytes that can be read +or written in a single operation. + +This macro is defined in all POSIX systems because this limit is never +configurable. 
+@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int RE_DUP_MAX +The largest number of repetitions you are guaranteed to be allowed in the +construct @samp{\@{@var{min},@var{max}\@}} in a regular expression. + +The value of this macro is actually a lower bound for the maximum. That +is, you can count on being able to have that many repetitions, but a +particular machine might let you have even more. You can use +@code{sysconf} to see whether a particular machine will let you have +more (@pxref{Sysconf}). And even the value that @code{sysconf} tells +you is just a lower bound---larger values might work. + +This macro is defined in all POSIX.2 systems, because POSIX.2 says it +should always be defined even if there is no specific imposed limit. +@end deftypevr + +@node System Options +@section Overall System Options +@cindex POSIX optional features +@cindex optional POSIX features + +POSIX defines certain system-specific options that not all POSIX systems +support. Since these options are provided in the kernel, not in the +library, simply using the GNU C library does not guarantee that any of these +features is supported; it depends on the system you are using. + +@pindex unistd.h +You can test for the availability of a given option using the macros in +this section, together with the function @code{sysconf}. The macros are +defined only if you include @file{unistd.h}. + +For the following macros, if the macro is defined in @file{unistd.h}, +then the option is supported. Otherwise, the option may or may not be +supported; use @code{sysconf} to find out. @xref{Sysconf}. + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int _POSIX_JOB_CONTROL +If this symbol is defined, it indicates that the system supports job +control. Otherwise, the implementation behaves as if all processes +within a session belong to a single process group. @xref{Job Control}. 
+@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int _POSIX_SAVED_IDS +If this symbol is defined, it indicates that the system remembers the +effective user and group IDs of a process before it executes an +executable file with the set-user-ID or set-group-ID bits set, and that +explicitly changing the effective user or group IDs back to these values +is permitted. If this option is not defined, then if a nonprivileged +process changes its effective user or group ID to the real user or group +ID of the process, it can't change it back again. @xref{Enable/Disable +Setuid}. +@end deftypevr + +For the following macros, if the macro is defined in @file{unistd.h}, +then its value indicates whether the option is supported. A value of +@code{-1} means no, and any other value means yes. If the macro is not +defined, then the option may or may not be supported; use @code{sysconf} +to find out. @xref{Sysconf}. + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro int _POSIX2_C_DEV +If this symbol is defined, it indicates that the system has the POSIX.2 +C compiler command, @code{c89}. The GNU C library always defines this +as @code{1}, on the assumption that you would not have installed it if +you didn't have a C compiler. +@end deftypevr + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro int _POSIX2_FORT_DEV +If this symbol is defined, it indicates that the system has the POSIX.2 +Fortran compiler command, @code{fort77}. The GNU C library never +defines this, because we don't know what the system has. +@end deftypevr + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro int _POSIX2_FORT_RUN +If this symbol is defined, it indicates that the system has the POSIX.2 +@code{asa} command to interpret Fortran carriage control. The GNU C +library never defines this, because we don't know what the system has. 
+@end deftypevr + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro int _POSIX2_LOCALEDEF +If this symbol is defined, it indicates that the system has the POSIX.2 +@code{localedef} command. The GNU C library never defines this, because +we don't know what the system has. +@end deftypevr + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro int _POSIX2_SW_DEV +If this symbol is defined, it indicates that the system has the POSIX.2 +commands @code{ar}, @code{make}, and @code{strip}. The GNU C library +always defines this as @code{1}, on the assumption that you had to have +@code{ar} and @code{make} to install the library, and it's unlikely that +@code{strip} would be absent when those are present. +@end deftypevr + +@node Version Supported +@section Which Version of POSIX is Supported + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro {long int} _POSIX_VERSION +This constant represents the version of the POSIX.1 standard to which +the implementation conforms. For an implementation conforming to the +1990 POSIX.1 standard, the value is the integer @code{199009L}. + +@code{_POSIX_VERSION} is always defined (in @file{unistd.h}) in any +POSIX system. + +@strong{Usage Note:} Don't try to test whether the system supports POSIX +by including @file{unistd.h} and then checking whether +@code{_POSIX_VERSION} is defined. On a non-POSIX system, this will +probably fail because there is no @file{unistd.h}. We do not know of +@emph{any} way you can reliably test at compilation time whether your +target system supports POSIX or whether @file{unistd.h} exists. + +The GNU C compiler predefines the symbol @code{__POSIX__} if the target +system is a POSIX system. Provided you do not use any other compilers +on POSIX systems, testing @code{defined (__POSIX__)} will reliably +detect such systems. 
+@end deftypevr + +@comment unistd.h +@comment POSIX.2 +@deftypevr Macro {long int} _POSIX2_C_VERSION +This constant represents the version of the POSIX.2 standard which the +library and system kernel support. We don't know what value this will +be for the first version of the POSIX.2 standard, because the value is +based on the year and month in which the standard is officially adopted. + +The value of this symbol says nothing about the utilities installed on +the system. + +@strong{Usage Note:} You can use this macro to tell whether a POSIX.1 +system library supports POSIX.2 as well. Any POSIX.1 system contains +@file{unistd.h}, so include that file and then test @code{defined +(_POSIX2_C_VERSION)}. +@end deftypevr + +@node Sysconf +@section Using @code{sysconf} + +When your system has configurable system limits, you can use the +@code{sysconf} function to find out the value that applies to any +particular machine. The function and the associated @var{parameter} +constants are declared in the header file @file{unistd.h}. + +@menu +* Sysconf Definition:: Detailed specifications of @code{sysconf}. +* Constants for Sysconf:: The list of parameters @code{sysconf} can read. +* Examples of Sysconf:: How to use @code{sysconf} and the parameter + macros properly together. +@end menu + +@node Sysconf Definition +@subsection Definition of @code{sysconf} + +@comment unistd.h +@comment POSIX.1 +@deftypefun {long int} sysconf (int @var{parameter}) +This function is used to inquire about runtime system parameters. The +@var{parameter} argument should be one of the @samp{_SC_} symbols listed +below. + +The normal return value from @code{sysconf} is the value you requested. +A value of @code{-1} is returned both if the implementation does not +impose a limit, and in case of an error. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The value of the @var{parameter} is invalid. 
+@end table +@end deftypefun + +@node Constants for Sysconf +@subsection Constants for @code{sysconf} Parameters + +Here are the symbolic constants for use as the @var{parameter} argument +to @code{sysconf}. The values are all integer constants (more +specifically, enumeration type values). + +@table @code +@comment unistd.h +@comment POSIX.1 +@item _SC_ARG_MAX +Inquire about the parameter corresponding to @code{ARG_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_CHILD_MAX +Inquire about the parameter corresponding to @code{CHILD_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_OPEN_MAX +Inquire about the parameter corresponding to @code{OPEN_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_STREAM_MAX +Inquire about the parameter corresponding to @code{STREAM_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_TZNAME_MAX +Inquire about the parameter corresponding to @code{TZNAME_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_NGROUPS_MAX +Inquire about the parameter corresponding to @code{NGROUPS_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_JOB_CONTROL +Inquire about the parameter corresponding to @code{_POSIX_JOB_CONTROL}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_SAVED_IDS +Inquire about the parameter corresponding to @code{_POSIX_SAVED_IDS}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_VERSION +Inquire about the parameter corresponding to @code{_POSIX_VERSION}. + +@comment unistd.h +@comment POSIX.1 +@item _SC_CLK_TCK +Inquire about the parameter corresponding to @code{CLOCKS_PER_SEC}; +@pxref{Basic CPU Time}. + +@comment unistd.h +@comment POSIX.2 +@item _SC_2_C_DEV +Inquire about whether the system has the POSIX.2 C compiler command, +@code{c89}. + +@comment unistd.h +@comment POSIX.2 +@item _SC_2_FORT_DEV +Inquire about whether the system has the POSIX.2 Fortran compiler +command, @code{fort77}. 
+ +@comment unistd.h +@comment POSIX.2 +@item _SC_2_FORT_RUN +Inquire about whether the system has the POSIX.2 @code{asa} command to +interpret Fortran carriage control. + +@comment unistd.h +@comment POSIX.2 +@item _SC_2_LOCALEDEF +Inquire about whether the system has the POSIX.2 @code{localedef} +command. + +@comment unistd.h +@comment POSIX.2 +@item _SC_2_SW_DEV +Inquire about whether the system has the POSIX.2 commands @code{ar}, +@code{make}, and @code{strip}. + +@comment unistd.h +@comment POSIX.2 +@item _SC_BC_BASE_MAX +Inquire about the maximum value of @code{obase} in the @code{bc} +utility. + +@comment unistd.h +@comment POSIX.2 +@item _SC_BC_DIM_MAX +Inquire about the maximum size of an array in the @code{bc} +utility. + +@comment unistd.h +@comment POSIX.2 +@item _SC_BC_SCALE_MAX +Inquire about the maximum value of @code{scale} in the @code{bc} +utility. + +@comment unistd.h +@comment POSIX.2 +@item _SC_BC_STRING_MAX +Inquire about the maximum size of a string constant in the +@code{bc} utility. + +@comment unistd.h +@comment POSIX.2 +@item _SC_COLL_WEIGHTS_MAX +Inquire about the maximum number of weights that can necessarily +be used in defining the collating sequence for a locale. + +@comment unistd.h +@comment POSIX.2 +@item _SC_EXPR_NEST_MAX +Inquire about the maximum number of expressions nested within +parentheses when using the @code{expr} utility. + +@comment unistd.h +@comment POSIX.2 +@item _SC_LINE_MAX +Inquire about the maximum size of a text line that the POSIX.2 text +utilities can handle. + +@comment unistd.h +@comment POSIX.2 +@item _SC_EQUIV_CLASS_MAX +Inquire about the maximum number of weights that can be assigned to an +entry of the @code{LC_COLLATE} category @samp{order} keyword in a locale +definition. The GNU C library does not presently support locale +definitions. + +@comment unistd.h +@comment POSIX.2 +@item _SC_VERSION +Inquire about the version number of POSIX.1 that the library and kernel +support. 
+ +@comment unistd.h +@comment POSIX.2 +@item _SC_2_VERSION +Inquire about the version number of POSIX.2 that the system utilities +support. + +@comment unistd.h +@comment GNU +@item _SC_PAGESIZE +Inquire about the virtual memory page size of the machine. +@code{getpagesize} returns the same value. +@c @xref{XXX getpagesize}. !!! ??? +@end table + +@node Examples of Sysconf +@subsection Examples of @code{sysconf} + +We recommend that you first test for a macro definition for the +parameter you are interested in, and call @code{sysconf} only if the +macro is not defined. For example, here is how to test whether job +control is supported: + +@smallexample +@group +int +have_job_control (void) +@{ +#ifdef _POSIX_JOB_CONTROL + return 1; +#else + long int value; + + /* @r{A result of -1 is an error only if @code{errno} was set,} + @r{so clear @code{errno} before the call.} */ + errno = 0; + value = sysconf (_SC_JOB_CONTROL); + if (value < 0 && errno != 0) + /* @r{If the system is that badly wedged,} + @r{there's no use trying to go on.} */ + fatal (strerror (errno)); + return value != -1; +#endif +@} +@end group +@end smallexample + +Here is how to get the value of a numeric limit: + +@smallexample +long int +get_child_max (void) +@{ +#ifdef CHILD_MAX + return CHILD_MAX; +#else + long int value; + + errno = 0; + value = sysconf (_SC_CHILD_MAX); + if (value < 0 && errno != 0) + fatal (strerror (errno)); + /* @r{A result of -1 here means there is no definite limit.} */ + return value; +#endif +@} +@end smallexample + +@node Minimums +@section Minimum Values for General Capacity Limits + +Here are the names for the POSIX minimum upper bounds for the system +limit parameters. The significance of these values is that you can +safely push to these limits without checking whether the particular +system you are using can go that far. + +@table @code +@comment limits.h +@comment POSIX.1 +@item _POSIX_ARG_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum combined length of the @var{argv} and @var{environ} +arguments that can be passed to the @code{exec} functions. +Its value is @code{4096}. 
+ +@comment limits.h +@comment POSIX.1 +@item _POSIX_CHILD_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum number of simultaneous processes per real user ID. Its +value is @code{6}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_NGROUPS_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum number of supplementary group IDs per process. Its +value is @code{0}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_OPEN_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum number of files that a single process can have open +simultaneously. Its value is @code{16}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_SSIZE_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum value that can be stored in an object of type +@code{ssize_t}. Its value is @code{32767}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_STREAM_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum number of streams that a single process can have open +simultaneously. Its value is @code{8}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_TZNAME_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the maximum length of a time zone name. Its value is @code{3}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_RE_DUP_MAX +The value of this macro is the most restrictive limit permitted by POSIX +for the numbers used in the @samp{\@{@var{min},@var{max}\@}} construct +in a regular expression. Its value is @code{255}. +@end table + +@node Limits for Files +@section Limits on File System Capacity + +The POSIX.1 standard specifies a number of parameters that describe the +limitations of the file system. It's possible for the system to have a +fixed, uniform limit for a parameter, but this isn't the usual case. 
On +most systems, it's possible for different file systems (and, for some +parameters, even different files) to have different maximum limits. For +example, this is very likely if you use NFS to mount some of the file +systems from other machines. + +@pindex limits.h +Each of the following macros is defined in @file{limits.h} only if the +system has a fixed, uniform limit for the parameter in question. If the +system allows different file systems or files to have different limits, +then the macro is undefined; use @code{pathconf} or @code{fpathconf} to +find out the limit that applies to a particular file. @xref{Pathconf}. + +Each parameter also has another macro, with a name starting with +@samp{_POSIX}, which gives the lowest value that the limit is allowed to +have on @emph{any} POSIX system. @xref{File Minimums}. + +@cindex limits, link count of files +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int LINK_MAX +The uniform system limit (if any) for the number of names for a given +file. @xref{Hard Links}. +@end deftypevr + +@cindex limits, terminal input queue +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int MAX_CANON +The uniform system limit (if any) for the amount of text in a line of +input when input editing is enabled. @xref{Canonical or Not}. +@end deftypevr + +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int MAX_INPUT +The uniform system limit (if any) for the total number of characters +typed ahead as input. @xref{I/O Queues}. +@end deftypevr + +@cindex limits, file name length +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int NAME_MAX +The uniform system limit (if any) for the length of a file name component. +@end deftypevr + +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int PATH_MAX +The uniform system limit (if any) for the length of an entire file name (that +is, the argument given to system calls such as @code{open}). 
+@end deftypevr + +@cindex limits, pipe buffer size +@comment limits.h +@comment POSIX.1 +@deftypevr Macro int PIPE_BUF +The uniform system limit (if any) for the number of bytes that can be +written atomically to a pipe. If multiple processes are writing to the +same pipe simultaneously, output from different processes might be +interleaved in chunks of this size. @xref{Pipes and FIFOs}. +@end deftypevr + +These are alternative macro names for some of the same information. + +@comment dirent.h +@comment BSD +@deftypevr Macro int MAXNAMLEN +This is the BSD name for @code{NAME_MAX}. It is defined in +@file{dirent.h}. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int FILENAME_MAX +The value of this macro is an integer constant expression that +represents the maximum length of a file name string. It is defined in +@file{stdio.h}. + +Unlike @code{PATH_MAX}, this macro is defined even if there is no actual +limit imposed. In such a case, its value is typically a very large +number. @strong{This is always the case on the GNU system.} + +@strong{Usage Note:} Don't use @code{FILENAME_MAX} as the size of an +array in which to store a file name! You can't possibly make an array +that big! Use dynamic allocation (@pxref{Memory Allocation}) instead. +@end deftypevr + +@node Options for Files +@section Optional Features in File Support + +POSIX defines certain system-specific options in the system calls for +operating on files. Some systems support these options and others do +not. Since these options are provided in the kernel, not in the +library, simply using the GNU C library does not guarantee any of these +features is supported; it depends on the system you are using. They can +also vary between file systems on a single machine. + +@pindex unistd.h +This section describes the macros you can test to determine whether a +particular option is supported on your machine. 
If a given macro is +defined in @file{unistd.h}, then its value says whether the +corresponding feature is supported. (A value of @code{-1} indicates no; +any other value indicates yes.) If the macro is undefined, it means +particular files may or may not support the feature. + +Since all the machines that support the GNU C library also support NFS, +one can never make a general statement about whether all file systems +support the @code{_POSIX_CHOWN_RESTRICTED} and @code{_POSIX_NO_TRUNC} +features. So these names are never defined as macros in the GNU C +library. + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int _POSIX_CHOWN_RESTRICTED +If this option is in effect, the @code{chown} function is restricted so +that the only change permitted to nonprivileged processes is to change +the group owner of a file to be either the effective group ID of the +process or one of its supplementary group IDs. @xref{File Owner}. +@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int _POSIX_NO_TRUNC +If this option is in effect, file name components longer than +@code{NAME_MAX} generate an @code{ENAMETOOLONG} error. Otherwise, file +name components that are too long are silently truncated. +@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro {unsigned char} _POSIX_VDISABLE +This option is only meaningful for files that are terminal devices. +If it is enabled, then handling for special control characters can +be disabled individually. @xref{Special Characters}. +@end deftypevr + +@pindex unistd.h +If one of these macros is undefined, that means that the option might be +in effect for some files and not for others. To inquire about a +particular file, call @code{pathconf} or @code{fpathconf}. +@xref{Pathconf}. + +@node File Minimums +@section Minimum Values for File System Limits + +Here are the names for the POSIX minimum upper bounds for some of the +above parameters. 
The significance of these values is that you can +safely push to these limits without checking whether the particular +system you are using can go that far. + +@table @code +@comment limits.h +@comment POSIX.1 +@item _POSIX_LINK_MAX +The most restrictive limit permitted by POSIX for the maximum value of a +file's link count. The value of this constant is @code{8}; thus, you +can always make up to eight names for a file without running into a +system limit. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_MAX_CANON +The most restrictive limit permitted by POSIX for the maximum number of +bytes in a canonical input line from a terminal device. The value of +this constant is @code{255}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_MAX_INPUT +The most restrictive limit permitted by POSIX for the maximum number of +bytes in a terminal device input queue (or typeahead buffer). +@xref{Input Modes}. The value of this constant is @code{255}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_NAME_MAX +The most restrictive limit permitted by POSIX for the maximum number of +bytes in a file name component. The value of this constant is +@code{14}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_PATH_MAX +The most restrictive limit permitted by POSIX for the maximum number of +bytes in a file name. The value of this constant is @code{255}. + +@comment limits.h +@comment POSIX.1 +@item _POSIX_PIPE_BUF +The most restrictive limit permitted by POSIX for the maximum number of +bytes that can be written atomically to a pipe. The value of this +constant is @code{512}. +@end table + +@node Pathconf +@section Using @code{pathconf} + +When your machine allows different files to have different values for a +file system parameter, you can use the functions in this section to find +out the value that applies to any particular file. + +These functions and the associated constants for the @var{parameter} +argument are declared in the header file @file{unistd.h}. 
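As a rough sketch of how such a query might look in practice, the function below asks @code{pathconf} for the file name component limit (@code{_PC_NAME_MAX}) of a directory. Because @code{pathconf} returns @code{-1} both when there is no limit and when an error occurs, the sketch clears @code{errno} before the call and inspects it afterward. The function name @code{name_max_for} and its return convention (@code{-1} for no limit, @code{-2} for an error) are assumptions made for this example only.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Return the maximum length of a file name component in the
   directory DIRNAME.  Return -1 if the system imposes no definite
   limit there, or -2 if the query itself failed.  (Hypothetical
   helper; the -1/-2 convention is specific to this sketch.)  */
long int
name_max_for (const char *dirname)
{
  long int value;

  /* Clear errno so that a -1 result can be told apart:
     errno still zero means "no limit", nonzero means "error".  */
  errno = 0;
  value = pathconf (dirname, _PC_NAME_MAX);
  if (value == -1)
    {
      if (errno != 0)
        {
          perror (dirname);
          return -2;
        }
      return -1;
    }

  return value;
}
```

The same pattern applies to @code{fpathconf}, with an open file descriptor in place of the file name.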
+ +@comment unistd.h +@comment POSIX.1 +@deftypefun {long int} pathconf (const char *@var{filename}, int @var{parameter}) +This function is used to inquire about the limits that apply to +the file named @var{filename}. + +The @var{parameter} argument should be one of the @samp{_PC_} constants +listed below. + +The normal return value from @code{pathconf} is the value you requested. +A value of @code{-1} is returned both if the implementation does not +impose a limit, and in case of an error. In the former case, +@code{errno} is not set, while in the latter case, @code{errno} is set +to indicate the cause of the problem. So the only way to use this +function robustly is to store @code{0} into @code{errno} just before +calling it. + +Besides the usual file name errors (@pxref{File Name Errors}), +the following error condition is defined for this function: + +@table @code +@item EINVAL +The value of @var{parameter} is invalid, or the implementation doesn't +support the @var{parameter} for the specific file. +@end table +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun {long int} fpathconf (int @var{filedes}, int @var{parameter}) +This is just like @code{pathconf} except that an open file descriptor +is used to specify the file for which information is requested, instead +of a file name. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINVAL +The value of @var{parameter} is invalid, or the implementation doesn't +support the @var{parameter} for the specific file. +@end table +@end deftypefun + +Here are the symbolic constants that you can use as the @var{parameter} +argument to @code{pathconf} and @code{fpathconf}. The values are all +integer constants. + +@table @code +@comment unistd.h +@comment POSIX.1 +@item _PC_LINK_MAX +Inquire about the value of @code{LINK_MAX}. 
+ +@comment unistd.h +@comment POSIX.1 +@item _PC_MAX_CANON +Inquire about the value of @code{MAX_CANON}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_MAX_INPUT +Inquire about the value of @code{MAX_INPUT}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_NAME_MAX +Inquire about the value of @code{NAME_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_PATH_MAX +Inquire about the value of @code{PATH_MAX}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_PIPE_BUF +Inquire about the value of @code{PIPE_BUF}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_CHOWN_RESTRICTED +Inquire about the value of @code{_POSIX_CHOWN_RESTRICTED}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_NO_TRUNC +Inquire about the value of @code{_POSIX_NO_TRUNC}. + +@comment unistd.h +@comment POSIX.1 +@item _PC_VDISABLE +Inquire about the value of @code{_POSIX_VDISABLE}. +@end table + +@node Utility Limits +@section Utility Program Capacity Limits + +The POSIX.2 standard specifies certain system limits that you can access +through @code{sysconf} that apply to utility behavior rather than the +behavior of the library or the operating system. + +The GNU C library defines macros for these limits, and @code{sysconf} +returns values for them if you ask; but these values convey no +meaningful information. They are simply the smallest values that +POSIX.2 permits. + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int BC_BASE_MAX +The largest value of @code{obase} that the @code{bc} utility is +guaranteed to support. +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int BC_SCALE_MAX +The largest value of @code{scale} that the @code{bc} utility is +guaranteed to support. +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int BC_DIM_MAX +The largest number of elements in one array that the @code{bc} utility +is guaranteed to support. 
+@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int BC_STRING_MAX +The largest number of characters in one string constant that the +@code{bc} utility is guaranteed to support. +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int COLL_WEIGHTS_MAX +The largest number of weights that can necessarily be used in defining +the collating sequence for a locale. +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int EXPR_NEST_MAX +The maximum number of expressions that can be nested within parentheses +by the @code{expr} utility. +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int LINE_MAX +The largest text line that the text-oriented POSIX.2 utilities can +support. (If you are using the GNU versions of these utilities, then +there is no actual limit except that imposed by the available virtual +memory, but there is no way that the library can tell you this.) +@end deftypevr + +@comment limits.h +@comment POSIX.2 +@deftypevr Macro int EQUIV_CLASS_MAX +The maximum number of weights that can be assigned to an entry of the +@code{LC_COLLATE} category @samp{order} keyword in a locale definition. +The GNU C library does not presently support locale definitions. +@end deftypevr + +@node Utility Minimums +@section Minimum Values for Utility Limits + +@table @code +@comment limits.h +@comment POSIX.2 +@item _POSIX2_BC_BASE_MAX +The most restrictive limit permitted by POSIX.2 for the maximum value of +@code{obase} in the @code{bc} utility. Its value is @code{99}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_BC_DIM_MAX +The most restrictive limit permitted by POSIX.2 for the maximum size of +an array in the @code{bc} utility. Its value is @code{2048}. 
+ +@comment limits.h +@comment POSIX.2 +@item _POSIX2_BC_SCALE_MAX +The most restrictive limit permitted by POSIX.2 for the maximum value of +@code{scale} in the @code{bc} utility. Its value is @code{99}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_BC_STRING_MAX +The most restrictive limit permitted by POSIX.2 for the maximum size of +a string constant in the @code{bc} utility. Its value is @code{1000}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_COLL_WEIGHTS_MAX +The most restrictive limit permitted by POSIX.2 for the maximum number +of weights that can necessarily be used in defining the collating +sequence for a locale. Its value is @code{2}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_EXPR_NEST_MAX +The most restrictive limit permitted by POSIX.2 for the maximum number +of expressions nested within parentheses when using the @code{expr} utility. +Its value is @code{32}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_LINE_MAX +The most restrictive limit permitted by POSIX.2 for the maximum size of +a text line that the text utilities can handle. Its value is +@code{2048}. + +@comment limits.h +@comment POSIX.2 +@item _POSIX2_EQUIV_CLASS_MAX +The most restrictive limit permitted by POSIX.2 for the maximum number +of weights that can be assigned to an entry of the @code{LC_COLLATE} +category @samp{order} keyword in a locale definition. Its value is +@code{2}. The GNU C library does not presently support locale +definitions. +@end table + +@node String Parameters +@section String-Valued Parameters + +POSIX.2 defines a way to get string-valued parameters from the operating +system with the function @code{confstr}: + +@comment unistd.h +@comment POSIX.2 +@deftypefun size_t confstr (int @var{parameter}, char *@var{buf}, size_t @var{len}) +This function reads the value of a string-valued system parameter, +storing the string into @var{len} bytes of memory space starting at +@var{buf}. 
The @var{parameter} argument should be one of the +@samp{_CS_} symbols listed below. + +The normal return value from @code{confstr} is the length of the string +value that you asked for. If you supply a null pointer for @var{buf}, +then @code{confstr} does not try to store the string; it just returns +its length. A value of @code{0} indicates an error. + +If the string you asked for is too long for the buffer (that is, longer +than @code{@var{len} - 1}), then @code{confstr} stores just that much +(leaving room for the terminating null character). You can tell that +this has happened because @code{confstr} returns a value greater than or +equal to @var{len}. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The value of the @var{parameter} is invalid. +@end table +@end deftypefun + +Currently there is just one parameter you can read with @code{confstr}: + +@table @code +@comment unistd.h +@comment POSIX.2 +@item _CS_PATH +This parameter's value is the recommended default path for searching for +executable files. This is the path that a user has by default just +after logging in. 
+@end table + +The way to use @code{confstr} without any arbitrary limit on string size +is to call it twice: first call it to get the length, allocate the +buffer accordingly, and then call @code{confstr} again to fill the +buffer, like this: + +@smallexample +@group +char * +get_default_path (void) +@{ + size_t len = confstr (_CS_PATH, NULL, 0); + char *buffer = (char *) xmalloc (len); + + if (confstr (_CS_PATH, buffer, len) == 0) + @{ + free (buffer); + return NULL; + @} + + return buffer; +@} +@end group +@end smallexample diff --git a/manual/creature.texi b/manual/creature.texi new file mode 100644 index 0000000000..51bf53a0c2 --- /dev/null +++ b/manual/creature.texi @@ -0,0 +1,113 @@ +@node Feature Test Macros +@subsection Feature Test Macros + +@cindex feature test macros +The exact set of features available when you compile a source file +is controlled by which @dfn{feature test macros} you define. + +If you compile your programs using @samp{gcc -ansi}, you get only the +ANSI C library features, unless you explicitly request additional +features by defining one or more of the feature macros. +@xref{Invoking GCC,, GNU CC Command Options, gcc.info, The GNU CC Manual}, +for more information about GCC options.@refill + +You should define these macros by using @samp{#define} preprocessor +directives at the top of your source code files. These directives +@emph{must} come before any @code{#include} of a system header file. It +is best to make them the very first thing in the file, preceded only by +comments. You could also use the @samp{-D} option to GCC, but it's +better if you make the source files indicate their own meaning in a +self-contained way. + +@comment (none) +@comment POSIX.1 +@defvr Macro _POSIX_SOURCE +If you define this macro, then the functionality from the POSIX.1 +standard (IEEE Standard 1003.1) is available, as well as all of the +ANSI C facilities. 
+@end defvr + +@comment (none) +@comment POSIX.2 +@defvr Macro _POSIX_C_SOURCE +If you define this macro with a value of @code{1}, then the +functionality from the POSIX.1 standard (IEEE Standard 1003.1) is made +available. If you define this macro with a value of @code{2}, then both +the functionality from the POSIX.1 standard and the functionality from +the POSIX.2 standard (IEEE Standard 1003.2) are made available. This is +in addition to the ANSI C facilities. +@end defvr + +@comment (none) +@comment GNU +@defvr Macro _BSD_SOURCE +If you define this macro, functionality derived from 4.3 BSD Unix is +included as well as the ANSI C, POSIX.1, and POSIX.2 material. + +Some of the features derived from 4.3 BSD Unix conflict with the +corresponding features specified by the POSIX.1 standard. If this +macro is defined, the 4.3 BSD definitions take precedence over the +POSIX definitions. + +Due to the nature of some of the conflicts between 4.3 BSD and POSIX.1, +you need to use a special @dfn{BSD compatibility library} when linking +programs compiled for BSD compatibility. This is because some functions +must be defined in two different ways, one of them in the normal C +library, and one of them in the compatibility library. If your program +defines @code{_BSD_SOURCE}, you must give the option @samp{-lbsd-compat} +to the compiler or linker when linking the program, to tell it to find +functions in this special compatibility library before looking for them in +the normal C library. +@pindex -lbsd-compat +@pindex bsd-compat +@cindex BSD compatibility library. +@end defvr + +@comment (none) +@comment GNU +@defvr Macro _SVID_SOURCE +If you define this macro, functionality derived from SVID is +included as well as the ANSI C, POSIX.1, and POSIX.2 material. +@end defvr + +@comment (none) +@comment GNU +@defvr Macro _GNU_SOURCE +If you define this macro, everything is included: ANSI C, POSIX.1, +POSIX.2, BSD, SVID, and GNU extensions. 
In the cases where POSIX.1 +conflicts with BSD, the POSIX definitions take precedence. + +If you want to get the full effect of @code{_GNU_SOURCE} but make the +BSD definitions take precedence over the POSIX definitions, use this +sequence of definitions: + +@smallexample +#define _GNU_SOURCE +#define _BSD_SOURCE +#define _SVID_SOURCE +@end smallexample + +Note that if you do this, you must link your program with the BSD +compatibility library by passing the @samp{-lbsd-compat} option to the +compiler or linker. @strong{Note:} If you forget to do this, you may +get very strange errors at run time. +@end defvr + +We recommend you use @code{_GNU_SOURCE} in new programs. If you don't +specify the @samp{-ansi} option to GCC and don't define any of these macros +explicitly, the effect is the same as defining @code{_GNU_SOURCE}. + +When you define a feature test macro to request a larger class of features, +it is harmless to define in addition a feature test macro for a subset of +those features. For example, if you define @code{_POSIX_C_SOURCE}, then +defining @code{_POSIX_SOURCE} as well has no effect. Likewise, if you +define @code{_GNU_SOURCE}, then defining either @code{_POSIX_SOURCE} or +@code{_POSIX_C_SOURCE} or @code{_SVID_SOURCE} as well has no effect. + +Note, however, that the features of @code{_BSD_SOURCE} are not a subset of +any of the other feature test macros supported. This is because it defines +BSD features that take precedence over the POSIX features that are +requested by the other macros. For this reason, defining +@code{_BSD_SOURCE} in addition to the other feature test macros does have +an effect: it causes the BSD features to take priority over the conflicting +POSIX features. 
diff --git a/manual/ctype.texi b/manual/ctype.texi new file mode 100644 index 0000000000..e7a7946466 --- /dev/null +++ b/manual/ctype.texi @@ -0,0 +1,250 @@ +@node Character Handling, String and Array Utilities, Memory Allocation, Top +@chapter Character Handling + +Programs that work with characters and strings often need to classify a +character---is it alphabetic, is it a digit, is it whitespace, and so +on---and perform case conversion operations on characters. The +functions in the header file @file{ctype.h} are provided for this +purpose. +@pindex ctype.h + +Since the choice of locale and character set can alter the +classifications of particular character codes, all of these functions +are affected by the current locale. (More precisely, they are affected +by the locale currently selected for character classification---the +@code{LC_CTYPE} category; see @ref{Locale Categories}.) + +@menu +* Classification of Characters:: Testing whether characters are + letters, digits, punctuation, etc. + +* Case Conversion:: Case mapping, and the like. +@end menu + +@node Classification of Characters, Case Conversion, , Character Handling +@section Classification of Characters +@cindex character testing +@cindex classification of characters +@cindex predicates on characters +@cindex character predicates + +This section explains the library functions for classifying characters. +For example, @code{isalpha} is the function to test for an alphabetic +character. It takes one argument, the character to test, and returns a +nonzero integer if the character is alphabetic, and zero otherwise. You +would use it like this: + +@smallexample +if (isalpha (c)) + printf ("The character `%c' is alphabetic.\n", c); +@end smallexample + +Each of the functions in this section tests for membership in a +particular class of characters; each has a name starting with @samp{is}. 
+Each of them takes one argument, which is a character to test, and +returns an @code{int} which is treated as a boolean value. The +character argument is passed as an @code{int}, and it may be the +constant value @code{EOF} instead of a real character. + +The attributes of any given character can vary between locales. +@xref{Locales}, for more information on locales.@refill + +These functions are declared in the header file @file{ctype.h}. +@pindex ctype.h + +@cindex lower-case character +@comment ctype.h +@comment ANSI +@deftypefun int islower (int @var{c}) +Returns true if @var{c} is a lower-case letter. +@end deftypefun + +@cindex upper-case character +@comment ctype.h +@comment ANSI +@deftypefun int isupper (int @var{c}) +Returns true if @var{c} is an upper-case letter. +@end deftypefun + +@cindex alphabetic character +@comment ctype.h +@comment ANSI +@deftypefun int isalpha (int @var{c}) +Returns true if @var{c} is an alphabetic character (a letter). If +@code{islower} or @code{isupper} is true of a character, then +@code{isalpha} is also true. + +In some locales, there may be additional characters for which +@code{isalpha} is true--letters which are neither upper case nor lower +case. But in the standard @code{"C"} locale, there are no such +additional characters. +@end deftypefun + +@cindex digit character +@cindex decimal digit character +@comment ctype.h +@comment ANSI +@deftypefun int isdigit (int @var{c}) +Returns true if @var{c} is a decimal digit (@samp{0} through @samp{9}). +@end deftypefun + +@cindex alphanumeric character +@comment ctype.h +@comment ANSI +@deftypefun int isalnum (int @var{c}) +Returns true if @var{c} is an alphanumeric character (a letter or +number); in other words, if either @code{isalpha} or @code{isdigit} is +true of a character, then @code{isalnum} is also true. 
+@end deftypefun + +@cindex hexadecimal digit character +@comment ctype.h +@comment ANSI +@deftypefun int isxdigit (int @var{c}) +Returns true if @var{c} is a hexadecimal digit. +Hexadecimal digits include the normal decimal digits @samp{0} through +@samp{9} and the letters @samp{A} through @samp{F} and +@samp{a} through @samp{f}. +@end deftypefun + +@cindex punctuation character +@comment ctype.h +@comment ANSI +@deftypefun int ispunct (int @var{c}) +Returns true if @var{c} is a punctuation character. +This means any printing character that is not alphanumeric or a space +character. +@end deftypefun + +@cindex whitespace character +@comment ctype.h +@comment ANSI +@deftypefun int isspace (int @var{c}) +Returns true if @var{c} is a @dfn{whitespace} character. In the standard +@code{"C"} locale, @code{isspace} returns true for only the standard +whitespace characters: + +@table @code +@item ' ' +space + +@item '\f' +formfeed + +@item '\n' +newline + +@item '\r' +carriage return + +@item '\t' +horizontal tab + +@item '\v' +vertical tab +@end table +@end deftypefun + +@cindex blank character +@comment ctype.h +@comment GNU +@deftypefun int isblank (int @var{c}) +Returns true if @var{c} is a blank character; that is, a space or a tab. +This function is a GNU extension. +@end deftypefun + +@cindex graphic character +@comment ctype.h +@comment ANSI +@deftypefun int isgraph (int @var{c}) +Returns true if @var{c} is a graphic character; that is, a character +that has a glyph associated with it. The whitespace characters are not +considered graphic. +@end deftypefun + +@cindex printing character +@comment ctype.h +@comment ANSI +@deftypefun int isprint (int @var{c}) +Returns true if @var{c} is a printing character. Printing characters +include all the graphic characters, plus the space (@samp{ }) character. 
+@end deftypefun + +@cindex control character +@comment ctype.h +@comment ANSI +@deftypefun int iscntrl (int @var{c}) +Returns true if @var{c} is a control character (that is, a character that +is not a printing character). +@end deftypefun + +@cindex ASCII character +@comment ctype.h +@comment SVID, BSD +@deftypefun int isascii (int @var{c}) +Returns true if @var{c} is a 7-bit @code{unsigned char} value that fits +into the US/UK ASCII character set. This function is a BSD extension +and is also an SVID extension. +@end deftypefun + +@node Case Conversion, , Classification of Characters, Character Handling +@section Case Conversion +@cindex character case conversion +@cindex case conversion of characters +@cindex converting case of characters + +This section explains the library functions for performing conversions +such as case mappings on characters. For example, @code{toupper} +converts any character to upper case if possible. If the character +can't be converted, @code{toupper} returns it unchanged. + +These functions take one argument of type @code{int}, which is the +character to convert, and return the converted character as an +@code{int}. If the conversion is not applicable to the argument given, +the argument is returned unchanged. + +@strong{Compatibility Note:} In pre-ANSI C dialects, instead of +returning the argument unchanged, these functions may fail when the +argument is not suitable for the conversion. Thus for portability, you +may need to write @code{islower(c) ? toupper(c) : c} rather than just +@code{toupper(c)}. + +These functions are declared in the header file @file{ctype.h}. +@pindex ctype.h + +@comment ctype.h +@comment ANSI +@deftypefun int tolower (int @var{c}) +If @var{c} is an upper-case letter, @code{tolower} returns the corresponding +lower-case letter. If @var{c} is not an upper-case letter, +@var{c} is returned unchanged. 
+@end deftypefun + +@comment ctype.h +@comment ANSI +@deftypefun int toupper (int @var{c}) +If @var{c} is a lower-case letter, @code{toupper} returns the corresponding +upper-case letter. Otherwise @var{c} is returned unchanged. +@end deftypefun + +@comment ctype.h +@comment SVID, BSD +@deftypefun int toascii (int @var{c}) +This function converts @var{c} to a 7-bit @code{unsigned char} value +that fits into the US/UK ASCII character set, by clearing the high-order +bits. This function is a BSD extension and is also an SVID extension. +@end deftypefun + +@comment ctype.h +@comment SVID +@deftypefun int _tolower (int @var{c}) +This is identical to @code{tolower}, and is provided for compatibility +with the SVID. @xref{SVID}.@refill +@end deftypefun + +@comment ctype.h +@comment SVID +@deftypefun int _toupper (int @var{c}) +This is identical to @code{toupper}, and is provided for compatibility +with the SVID. +@end deftypefun diff --git a/manual/errno.texi b/manual/errno.texi new file mode 100644 index 0000000000..836fff3bf2 --- /dev/null +++ b/manual/errno.texi @@ -0,0 +1,1015 @@ +@node Error Reporting, Memory Allocation, Introduction, Top +@chapter Error Reporting +@cindex error reporting +@cindex reporting errors +@cindex error codes +@cindex status codes + +Many functions in the GNU C library detect and report error conditions, +and sometimes your programs need to check for these error conditions. +For example, when you open an input file, you should verify that the +file was actually opened correctly, and print an error message or take +other appropriate action if the call to the library function failed. + +This chapter describes how the error reporting facility works. Your +program should include the header file @file{errno.h} to use this +facility. +@pindex errno.h + +@menu +* Checking for Errors:: How errors are reported by library functions. +* Error Codes:: Error code macros; all of these expand + into integer constant values. 
+* Error Messages:: Mapping error codes onto error messages. +@end menu + +@node Checking for Errors, Error Codes, , Error Reporting +@section Checking for Errors + +Most library functions return a special value to indicate that they have +failed. The special value is typically @code{-1}, a null pointer, or a +constant such as @code{EOF} that is defined for that purpose. But this +return value tells you only that an error has occurred. To find out +what kind of error it was, you need to look at the error code stored in the +variable @code{errno}. This variable is declared in the header file +@file{errno.h}. +@pindex errno.h + +@comment errno.h +@comment ANSI +@deftypevr {Variable} {volatile int} errno +The variable @code{errno} contains the system error number. You can +change the value of @code{errno}. + +Since @code{errno} is declared @code{volatile}, it might be changed +asynchronously by a signal handler; see @ref{Defining Handlers}. +However, a properly written signal handler saves and restores the value +of @code{errno}, so you generally do not need to worry about this +possibility except when writing signal handlers. + +The initial value of @code{errno} at program startup is zero. Many +library functions are guaranteed to set it to certain nonzero values +when they encounter certain kinds of errors. These error conditions are +listed for each function. These functions do not change @code{errno} +when they succeed; thus, the value of @code{errno} after a successful +call is not necessarily zero, and you should not use @code{errno} to +determine @emph{whether} a call failed. The proper way to do that is +documented for each function. @emph{If} the call failed, you can +examine @code{errno}. + +Many library functions can set @code{errno} to a nonzero value as a +result of calling other library functions which might fail. You should +assume that any library function might alter @code{errno} when the +function returns an error. 
+ +@strong{Portability Note:} ANSI C specifies @code{errno} as a +``modifiable lvalue'' rather than as a variable, permitting it to be +implemented as a macro. For example, its expansion might involve a +function call, like @w{@code{*_errno ()}}. In fact, that is what it is +on the GNU system itself. The GNU library, on non-GNU systems, does +whatever is right for the particular system. + +There are a few library functions, like @code{sqrt} and @code{atan}, +that return a perfectly legitimate value in case of an error, but also +set @code{errno}. For these functions, if you want to check to see +whether an error occurred, the recommended method is to set @code{errno} +to zero before calling the function, and then check its value afterward. +@end deftypevr + +@pindex errno.h +All the error codes have symbolic names; they are macros defined in +@file{errno.h}. The names start with @samp{E} and an upper-case +letter or digit; you should consider names of this form to be +reserved names. @xref{Reserved Names}. + +The error code values are all positive integers and are all distinct, +with one exception: @code{EWOULDBLOCK} and @code{EAGAIN} are the same. +Since the values are distinct, you can use them as labels in a +@code{switch} statement; just don't use both @code{EWOULDBLOCK} and +@code{EAGAIN}. Your program should not make any other assumptions about +the specific values of these symbolic constants. + +The value of @code{errno} doesn't necessarily have to correspond to any +of these macros, since some library functions might return other error +codes of their own for other situations. The only values that are +guaranteed to be meaningful for a particular library function are the +ones that this manual lists for that function. + +On non-GNU systems, almost any system call can return @code{EFAULT} if +it is given an invalid pointer as an argument. 
Since this could only +happen as a result of a bug in your program, and since it will not +happen on the GNU system, we have saved space by not mentioning +@code{EFAULT} in the descriptions of individual functions. + +In some Unix systems, many system calls can also return @code{EFAULT} if +given as an argument a pointer into the stack, and the kernel for some +obscure reason fails in its attempt to extend the stack. If this ever +happens, you should probably try using statically or dynamically +allocated memory instead of stack memory on that system. + +@node Error Codes, Error Messages, Checking for Errors, Error Reporting +@section Error Codes + +@pindex errno.h +The error code macros are defined in the header file @file{errno.h}. +All of them expand into integer constant values. Some of these error +codes can't occur on the GNU system, but they can occur using the GNU +library on other systems. + +@comment errno.h +@comment POSIX.1: Operation not permitted +@deftypevr Macro int EPERM +@comment errno 1 @c DO NOT REMOVE +Operation not permitted; only the owner of the file (or other resource) +or processes with special privileges can perform the operation. +@end deftypevr + +@comment errno.h +@comment POSIX.1: No such file or directory +@deftypevr Macro int ENOENT +@comment errno 2 @c DO NOT REMOVE +No such file or directory. This is a ``file doesn't exist'' error +for ordinary files that are referenced in contexts where they are +expected to already exist. +@end deftypevr + +@comment errno.h +@comment POSIX.1: No such process +@deftypevr Macro int ESRCH +@comment errno 3 @c DO NOT REMOVE +No process matches the specified process ID. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Interrupted system call +@deftypevr Macro int EINTR +@comment errno 4 @c DO NOT REMOVE +Interrupted function call; an asynchronous signal occurred and prevented +completion of the call. When this happens, you should try the call +again. 
+ +You can choose to have functions resume after a signal that is handled, +rather than failing with @code{EINTR}; see @ref{Interrupted +Primitives}. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Input/output error +@deftypevr Macro int EIO +@comment errno 5 @c DO NOT REMOVE +Input/output error; usually used for physical read or write errors. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Device not configured +@deftypevr Macro int ENXIO +@comment errno 6 @c DO NOT REMOVE +No such device or address. The system tried to use the device +represented by a file you specified, and it couldn't find the device. +This can mean that the device file was installed incorrectly, or that +the physical device is missing or not correctly attached to the +computer. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Argument list too long +@deftypevr Macro int E2BIG +@comment errno 7 @c DO NOT REMOVE +Argument list too long; used when the arguments passed to a new program +being executed with one of the @code{exec} functions (@pxref{Executing a +File}) occupy too much memory space. This condition never arises in the +GNU system. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Exec format error +@deftypevr Macro int ENOEXEC +@comment errno 8 @c DO NOT REMOVE +Invalid executable file format. This condition is detected by the +@code{exec} functions; see @ref{Executing a File}. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Bad file descriptor +@deftypevr Macro int EBADF +@comment errno 9 @c DO NOT REMOVE +Bad file descriptor; for example, I/O on a descriptor that has been +closed or reading from a descriptor open only for writing (or vice +versa). +@end deftypevr + +@comment errno.h +@comment POSIX.1: No child processes +@deftypevr Macro int ECHILD +@comment errno 10 @c DO NOT REMOVE +There are no child processes. This error happens on operations that are +supposed to manipulate child processes, when there aren't any processes +to manipulate. 
+@end deftypevr + +@comment errno.h +@comment POSIX.1: Resource deadlock avoided +@deftypevr Macro int EDEADLK +@comment errno 11 @c DO NOT REMOVE +Deadlock avoided; allocating a system resource would have resulted in a +deadlock situation. The system does not guarantee that it will notice +all such situations. This error means you got lucky and the system +noticed; it might just hang. @xref{File Locks}, for an example. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Cannot allocate memory +@deftypevr Macro int ENOMEM +@comment errno 12 @c DO NOT REMOVE +No memory available. The system cannot allocate more virtual memory +because its capacity is full. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Permission denied +@deftypevr Macro int EACCES +@comment errno 13 @c DO NOT REMOVE +Permission denied; the file permissions do not allow the attempted operation. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Bad address +@deftypevr Macro int EFAULT +@comment errno 14 @c DO NOT REMOVE +Bad address; an invalid pointer was detected. +In the GNU system, this error never happens; you get a signal instead. +@end deftypevr + +@comment errno.h +@comment BSD: Block device required +@deftypevr Macro int ENOTBLK +@comment errno 15 @c DO NOT REMOVE +A file that isn't a block special file was given in a situation that +requires one. For example, trying to mount an ordinary file as a file +system in Unix gives this error. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Device busy +@deftypevr Macro int EBUSY +@comment errno 16 @c DO NOT REMOVE +Resource busy; a system resource that can't be shared is already in use. +For example, if you try to delete a file that is the root of a currently +mounted filesystem, you get this error. 
+@end deftypevr + +@comment errno.h +@comment POSIX.1: File exists +@deftypevr Macro int EEXIST +@comment errno 17 @c DO NOT REMOVE +File exists; an existing file was specified in a context where it only +makes sense to specify a new file. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Invalid cross-device link +@deftypevr Macro int EXDEV +@comment errno 18 @c DO NOT REMOVE +An attempt to make an improper link across file systems was detected. +This happens not only when you use @code{link} (@pxref{Hard Links}) but +also when you rename a file with @code{rename} (@pxref{Renaming Files}). +@end deftypevr + +@comment errno.h +@comment POSIX.1: Operation not supported by device +@deftypevr Macro int ENODEV +@comment errno 19 @c DO NOT REMOVE +The wrong type of device was given to a function that expects a +particular sort of device. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Not a directory +@deftypevr Macro int ENOTDIR +@comment errno 20 @c DO NOT REMOVE +A file that isn't a directory was specified when a directory is required. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Is a directory +@deftypevr Macro int EISDIR +@comment errno 21 @c DO NOT REMOVE +File is a directory; you cannot open a directory for writing, +or create or remove hard links to it. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Invalid argument +@deftypevr Macro int EINVAL +@comment errno 22 @c DO NOT REMOVE +Invalid argument. This is used to indicate various kinds of problems +with passing the wrong argument to a library function. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Too many open files +@deftypevr Macro int EMFILE +@comment errno 24 @c DO NOT REMOVE +The current process has too many files open and can't open any more. +Duplicate descriptors do count toward this limit. + +In BSD and GNU, the number of open files is controlled by a resource +limit that can usually be increased. 
If you get this error, you might +want to increase the @code{RLIMIT_NOFILE} limit or make it unlimited; +@pxref{Limits on Resources}. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Too many open files in system +@deftypevr Macro int ENFILE +@comment errno 23 @c DO NOT REMOVE +There are too many distinct file openings in the entire system. Note +that any number of linked channels count as just one file opening; see +@ref{Linked Channels}. This error never occurs in the GNU system. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Inappropriate ioctl for device +@deftypevr Macro int ENOTTY +@comment errno 25 @c DO NOT REMOVE +Inappropriate I/O control operation, such as trying to set terminal +modes on an ordinary file. +@end deftypevr + +@comment errno.h +@comment BSD: Text file busy +@deftypevr Macro int ETXTBSY +@comment errno 26 @c DO NOT REMOVE +An attempt to execute a file that is currently open for writing, or +write to a file that is currently being executed. Often using a +debugger to run a program is considered having it open for writing and +will cause this error. (The name stands for ``text file busy''.) This +is not an error in the GNU system; the text is copied as necessary. +@end deftypevr + +@comment errno.h +@comment POSIX.1: File too large +@deftypevr Macro int EFBIG +@comment errno 27 @c DO NOT REMOVE +File too big; the size of a file would be larger than allowed by the system. +@end deftypevr + +@comment errno.h +@comment POSIX.1: No space left on device +@deftypevr Macro int ENOSPC +@comment errno 28 @c DO NOT REMOVE +No space left on device; write operation on a file failed because the +disk is full. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Illegal seek +@deftypevr Macro int ESPIPE +@comment errno 29 @c DO NOT REMOVE +Invalid seek operation (such as on a pipe). 
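As a minimal sketch of the pipe case mentioned above: the helper @code{seek_on_pipe_errno} is a hypothetical name introduced only for this illustration, not part of the library. It creates a pipe with @code{pipe}, attempts @code{lseek} on the read end, and returns the @code{errno} value that results, which should be @code{ESPIPE} on a POSIX system.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only: try to seek on a pipe
   and return the errno value that results (0 if the seek somehow
   succeeded, -1 if the pipe could not be created).  */
int
seek_on_pipe_errno (void)
{
  int fds[2];
  int result;

  if (pipe (fds) < 0)
    return -1;

  errno = 0;
  if (lseek (fds[0], 0, SEEK_CUR) == (off_t) -1)
    result = errno;   /* Save it before close can change errno.  */
  else
    result = 0;

  close (fds[0]);
  close (fds[1]);
  return result;
}
```

The error code is copied into a local variable before the descriptors are closed, because a later library call may overwrite @code{errno}.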
+@end deftypevr + +@comment errno.h +@comment POSIX.1: Read-only file system +@deftypevr Macro int EROFS +@comment errno 30 @c DO NOT REMOVE +An attempt was made to modify something on a read-only file system. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Too many links +@deftypevr Macro int EMLINK +@comment errno 31 @c DO NOT REMOVE +Too many links; the link count of a single file would become too large. +@code{rename} can cause this error if the file being renamed already has +as many links as it can take (@pxref{Renaming Files}). +@end deftypevr + +@comment errno.h +@comment POSIX.1: Broken pipe +@deftypevr Macro int EPIPE +@comment errno 32 @c DO NOT REMOVE +Broken pipe; there is no process reading from the other end of a pipe. +Every library function that returns this error code also generates a +@code{SIGPIPE} signal; this signal terminates the program if not handled +or blocked. Thus, your program will never actually see @code{EPIPE} +unless it has handled or blocked @code{SIGPIPE}. +@end deftypevr + +@comment errno.h +@comment ANSI: Numerical argument out of domain +@deftypevr Macro int EDOM +@comment errno 33 @c DO NOT REMOVE +Domain error; used by mathematical functions when an argument value does +not fall into the domain over which the function is defined. +@end deftypevr + +@comment errno.h +@comment ANSI: Numerical result out of range +@deftypevr Macro int ERANGE +@comment errno 34 @c DO NOT REMOVE +Range error; used by mathematical functions when the result value is +not representable because of overflow or underflow. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Resource temporarily unavailable +@deftypevr Macro int EAGAIN +@comment errno 35 @c DO NOT REMOVE +Resource temporarily unavailable; the call might work if you try again +later. The macro @code{EWOULDBLOCK} is another name for @code{EAGAIN}; +they are always the same in the GNU C library. 
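A portable test for this condition might be sketched as follows. The helper name @code{is_would_block} is hypothetical, introduced only for this illustration; it accepts either spelling of the code, so it should also behave correctly on a system where @code{EWOULDBLOCK} is a distinct value.

```c
#include <errno.h>

/* Hypothetical helper, for illustration only: return nonzero if
   ERRNUM means "try again later".  In the GNU C library both macros
   have the same value, so the second comparison is redundant there;
   on a system where they differ, it catches the other spelling.  */
int
is_would_block (int errnum)
{
  return errnum == EAGAIN || errnum == EWOULDBLOCK;
}
```

A caller would typically copy @code{errno} right after a failed call on a non-blocking object and pass that copy to the helper.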
+ +This error can happen in a few different situations: + +@itemize @bullet +@item +An operation that would block was attempted on an object that has +non-blocking mode selected. Trying the same operation again will block +until some external condition makes it possible to read, write, or +connect (whatever the operation). You can use @code{select} to find out +when the operation will be possible; @pxref{Waiting for I/O}. + +@strong{Portability Note:} In many older Unix systems, this condition +was indicated by @code{EWOULDBLOCK}, which was a distinct error code +different from @code{EAGAIN}. To make your program portable, you should +check for both codes and treat them the same. + +@item +A temporary resource shortage made an operation impossible. @code{fork} +can return this error. It indicates that the shortage is expected to +pass, so your program can try the call again later and it may succeed. +It is probably a good idea to delay for a few seconds before trying it +again, to allow time for other processes to release scarce resources. +Such shortages are usually fairly serious and affect the whole system, +so usually an interactive program should report the error to the user +and return to its command loop. +@end itemize +@end deftypevr + +@comment errno.h +@comment BSD: Operation would block +@deftypevr Macro int EWOULDBLOCK +@comment errno EAGAIN @c DO NOT REMOVE +In the GNU C library, this is another name for @code{EAGAIN} (above). +The values are always the same, on every operating system. + +C libraries in many older Unix systems have @code{EWOULDBLOCK} as a +separate error code. +@end deftypevr + +@comment errno.h +@comment BSD: Operation now in progress +@deftypevr Macro int EINPROGRESS +@comment errno 36 @c DO NOT REMOVE +An operation that cannot complete immediately was initiated on an object +that has non-blocking mode selected. Some functions that must always +block (such as @code{connect}; @pxref{Connecting}) never return +@code{EAGAIN}.
Instead, they return @code{EINPROGRESS} to indicate that +the operation has begun and will take some time. Attempts to manipulate +the object before the call completes return @code{EALREADY}. You can +use the @code{select} function to find out when the pending operation +has completed; @pxref{Waiting for I/O}. +@end deftypevr + +@comment errno.h +@comment BSD: Operation already in progress +@deftypevr Macro int EALREADY +@comment errno 37 @c DO NOT REMOVE +An operation is already in progress on an object that has non-blocking +mode selected. +@end deftypevr + +@comment errno.h +@comment BSD: Socket operation on non-socket +@deftypevr Macro int ENOTSOCK +@comment errno 38 @c DO NOT REMOVE +A file that isn't a socket was specified when a socket is required. +@end deftypevr + +@comment errno.h +@comment BSD: Message too long +@deftypevr Macro int EMSGSIZE +@comment errno 40 @c DO NOT REMOVE +The size of a message sent on a socket was larger than the supported +maximum size. +@end deftypevr + +@comment errno.h +@comment BSD: Protocol wrong type for socket +@deftypevr Macro int EPROTOTYPE +@comment errno 41 @c DO NOT REMOVE +The socket type does not support the requested communications protocol. +@end deftypevr + +@comment errno.h +@comment BSD: Protocol not available +@deftypevr Macro int ENOPROTOOPT +@comment errno 42 @c DO NOT REMOVE +You specified a socket option that doesn't make sense for the +particular protocol being used by the socket. @xref{Socket Options}. +@end deftypevr + +@comment errno.h +@comment BSD: Protocol not supported +@deftypevr Macro int EPROTONOSUPPORT +@comment errno 43 @c DO NOT REMOVE +The socket domain does not support the requested communications protocol +(perhaps because the requested protocol is completely invalid.) +@xref{Creating a Socket}. +@end deftypevr + +@comment errno.h +@comment BSD: Socket type not supported +@deftypevr Macro int ESOCKTNOSUPPORT +@comment errno 44 @c DO NOT REMOVE +The socket type is not supported. 
+@end deftypevr + +@comment errno.h +@comment BSD: Operation not supported +@deftypevr Macro int EOPNOTSUPP +@comment errno 45 @c DO NOT REMOVE +The operation you requested is not supported. Some socket functions +don't make sense for all types of sockets, and others may not be +implemented for all communications protocols. In the GNU system, this +error can happen for many calls when the object does not support the +particular operation; it is a generic indication that the server knows +nothing to do for that call. +@end deftypevr + +@comment errno.h +@comment BSD: Protocol family not supported +@deftypevr Macro int EPFNOSUPPORT +@comment errno 46 @c DO NOT REMOVE +The socket communications protocol family you requested is not supported. +@end deftypevr + +@comment errno.h +@comment BSD: Address family not supported by protocol family +@deftypevr Macro int EAFNOSUPPORT +@comment errno 47 @c DO NOT REMOVE +The address family specified for a socket is not supported; it is +inconsistent with the protocol being used on the socket. @xref{Sockets}. +@end deftypevr + +@comment errno.h +@comment BSD: Address already in use +@deftypevr Macro int EADDRINUSE +@comment errno 48 @c DO NOT REMOVE +The requested socket address is already in use. @xref{Socket Addresses}. +@end deftypevr + +@comment errno.h +@comment BSD: Can't assign requested address +@deftypevr Macro int EADDRNOTAVAIL +@comment errno 49 @c DO NOT REMOVE +The requested socket address is not available; for example, you tried +to give a socket a name that doesn't match the local host name. +@xref{Socket Addresses}. +@end deftypevr + +@comment errno.h +@comment BSD: Network is down +@deftypevr Macro int ENETDOWN +@comment errno 50 @c DO NOT REMOVE +A socket operation failed because the network was down. 
+@end deftypevr + +@comment errno.h +@comment BSD: Network is unreachable +@deftypevr Macro int ENETUNREACH +@comment errno 51 @c DO NOT REMOVE +A socket operation failed because the subnet containing the remote host +was unreachable. +@end deftypevr + +@comment errno.h +@comment BSD: Network dropped connection on reset +@deftypevr Macro int ENETRESET +@comment errno 52 @c DO NOT REMOVE +A network connection was reset because the remote host crashed. +@end deftypevr + +@comment errno.h +@comment BSD: Software caused connection abort +@deftypevr Macro int ECONNABORTED +@comment errno 53 @c DO NOT REMOVE +A network connection was aborted locally. +@end deftypevr + +@comment errno.h +@comment BSD: Connection reset by peer +@deftypevr Macro int ECONNRESET +@comment errno 54 @c DO NOT REMOVE +A network connection was closed for reasons outside the control of the +local host, such as by the remote machine rebooting or an unrecoverable +protocol violation. +@end deftypevr + +@comment errno.h +@comment BSD: No buffer space available +@deftypevr Macro int ENOBUFS +@comment errno 55 @c DO NOT REMOVE +The kernel's buffers for I/O operations are all in use. In GNU, this +error is always synonymous with @code{ENOMEM}; you may get one or the +other from network operations. +@end deftypevr + +@comment errno.h +@comment BSD: Socket is already connected +@deftypevr Macro int EISCONN +@comment errno 56 @c DO NOT REMOVE +You tried to connect a socket that is already connected. +@xref{Connecting}. +@end deftypevr + +@comment errno.h +@comment BSD: Socket is not connected +@deftypevr Macro int ENOTCONN +@comment errno 57 @c DO NOT REMOVE +The socket is not connected to anything. You get this error when you +try to transmit data over a socket, without first specifying a +destination for the data. For a connectionless socket (for datagram +protocols, such as UDP), you get @code{EDESTADDRREQ} instead. 
+@end deftypevr + +@comment errno.h +@comment BSD: Destination address required +@deftypevr Macro int EDESTADDRREQ +@comment errno 39 @c DO NOT REMOVE +No default destination address was set for the socket. You get this +error when you try to transmit data over a connectionless socket, +without first specifying a destination for the data with @code{connect}. +@end deftypevr + +@comment errno.h +@comment BSD: Can't send after socket shutdown +@deftypevr Macro int ESHUTDOWN +@comment errno 58 @c DO NOT REMOVE +The socket has already been shut down. +@end deftypevr + +@comment errno.h +@comment BSD: Too many references: can't splice +@deftypevr Macro int ETOOMANYREFS +@comment errno 59 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: Connection timed out +@deftypevr Macro int ETIMEDOUT +@comment errno 60 @c DO NOT REMOVE +A socket operation with a specified timeout received no response during +the timeout period. +@end deftypevr + +@comment errno.h +@comment BSD: Connection refused +@deftypevr Macro int ECONNREFUSED +@comment errno 61 @c DO NOT REMOVE +A remote host refused to allow the network connection (typically because +it is not running the requested service). +@end deftypevr + +@comment errno.h +@comment BSD: Too many levels of symbolic links +@deftypevr Macro int ELOOP +@comment errno 62 @c DO NOT REMOVE +Too many levels of symbolic links were encountered in looking up a file name. +This often indicates a cycle of symbolic links. +@end deftypevr + +@comment errno.h +@comment POSIX.1: File name too long +@deftypevr Macro int ENAMETOOLONG +@comment errno 63 @c DO NOT REMOVE +Filename too long (longer than @code{PATH_MAX}; @pxref{Limits for +Files}) or host name too long (in @code{gethostname} or +@code{sethostname}; @pxref{Host Identification}). +@end deftypevr + +@comment errno.h +@comment BSD: Host is down +@deftypevr Macro int EHOSTDOWN +@comment errno 64 @c DO NOT REMOVE +The remote host for a requested network connection is down. 
+@end deftypevr + +@comment errno.h +@comment BSD: No route to host +@deftypevr Macro int EHOSTUNREACH +@comment errno 65 @c DO NOT REMOVE +The remote host for a requested network connection is not reachable. +@end deftypevr + +@comment errno.h +@comment POSIX.1: Directory not empty +@deftypevr Macro int ENOTEMPTY +@comment errno 66 @c DO NOT REMOVE +Directory not empty, where an empty directory was expected. Typically, +this error occurs when you are trying to delete a directory. +@end deftypevr + +@comment errno.h +@comment BSD: Too many processes +@deftypevr Macro int EPROCLIM +@comment errno 67 @c DO NOT REMOVE +This means that the per-user limit on new processes would be exceeded by +an attempted @code{fork}. @xref{Limits on Resources}, for details on +the @code{RLIMIT_NPROC} limit. +@end deftypevr + +@comment errno.h +@comment BSD: Too many users +@deftypevr Macro int EUSERS +@comment errno 68 @c DO NOT REMOVE +The file quota system is confused because there are too many users. +@c This can probably happen in a GNU system when using NFS. +@end deftypevr + +@comment errno.h +@comment BSD: Disc quota exceeded +@deftypevr Macro int EDQUOT +@comment errno 69 @c DO NOT REMOVE +The user's disk quota was exceeded. +@end deftypevr + +@comment errno.h +@comment BSD: Stale NFS file handle +@deftypevr Macro int ESTALE +@comment errno 70 @c DO NOT REMOVE +Stale NFS file handle. This indicates an internal confusion in the NFS +system which is due to file system rearrangements on the server host. +Repairing this condition usually requires unmounting and remounting +the NFS file system on the local host. +@end deftypevr + +@comment errno.h +@comment BSD: Too many levels of remote in path +@deftypevr Macro int EREMOTE +@comment errno 71 @c DO NOT REMOVE +An attempt was made to NFS-mount a remote file system with a file name that +already specifies an NFS-mounted file.
+(This is an error on some operating systems, but we expect it to work +properly on the GNU system, making this error code impossible.) +@end deftypevr + +@comment errno.h +@comment BSD: RPC struct is bad +@deftypevr Macro int EBADRPC +@comment errno 72 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: RPC version wrong +@deftypevr Macro int ERPCMISMATCH +@comment errno 73 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: RPC program not available +@deftypevr Macro int EPROGUNAVAIL +@comment errno 74 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: RPC program version wrong +@deftypevr Macro int EPROGMISMATCH +@comment errno 75 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: RPC bad procedure for program +@deftypevr Macro int EPROCUNAVAIL +@comment errno 76 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment POSIX.1: No locks available +@deftypevr Macro int ENOLCK +@comment errno 77 @c DO NOT REMOVE +No locks available. This is used by the file locking facilities; see +@ref{File Locks}. This error is never generated by the GNU system, but +it can result from an operation to an NFS server running another +operating system. +@end deftypevr + +@comment errno.h +@comment BSD: Inappropriate file type or format +@deftypevr Macro int EFTYPE +@comment errno 79 @c DO NOT REMOVE +Inappropriate file type or format. The file was the wrong type for the +operation, or a data file had the wrong format. + +On some systems @code{chmod} returns this error if you try to set the +sticky bit on a non-directory file; @pxref{Setting Permissions}. +@end deftypevr + +@comment errno.h +@comment BSD: Authentication error +@deftypevr Macro int EAUTH +@comment errno 80 @c DO NOT REMOVE +??? +@end deftypevr + +@comment errno.h +@comment BSD: Need authenticator +@deftypevr Macro int ENEEDAUTH +@comment errno 81 @c DO NOT REMOVE +??? 
+@end deftypevr + +@comment errno.h +@comment POSIX.1: Function not implemented +@deftypevr Macro int ENOSYS +@comment errno 78 @c DO NOT REMOVE +Function not implemented. Some functions have commands or options defined +that might not be supported in all implementations, and this is the kind +of error you get if you request them and they are not supported. +@end deftypevr + +@comment errno.h +@comment GNU: Inappropriate operation for background process +@deftypevr Macro int EBACKGROUND +@comment errno 100 @c DO NOT REMOVE +In the GNU system, servers supporting the @code{term} protocol return +this error for certain operations when the caller is not in the +foreground process group of the terminal. Users do not usually see this +error because functions such as @code{read} and @code{write} translate +it into a @code{SIGTTIN} or @code{SIGTTOU} signal. @xref{Job Control}, +for information on process groups and these signals. +@end deftypevr + +@comment errno.h +@comment GNU: Translator died +@deftypevr Macro int EDIED +@comment errno 101 @c DO NOT REMOVE +In the GNU system, opening a file returns this error when the file is +translated by a program and the translator program dies while starting +up, before it has connected to the file. +@end deftypevr + +@comment errno.h +@comment GNU: ? +@deftypevr Macro int ED +@comment errno 102 @c DO NOT REMOVE +The experienced user will know what is wrong. +@end deftypevr + +@comment errno.h +@comment GNU: You really blew it this time +@deftypevr Macro int EGREGIOUS +@comment errno 103 @c DO NOT REMOVE +You did @strong{what}? +@end deftypevr + +@comment errno.h +@comment GNU: Computer bought the farm +@deftypevr Macro int EIEIO +@comment errno 104 @c DO NOT REMOVE +Go home and have a glass of warm, dairy-fresh milk. +@end deftypevr + +@comment errno.h +@comment GNU: Gratuitous error +@deftypevr Macro int EGRATUITOUS +@comment errno 105 @c DO NOT REMOVE +This error code has no purpose. 
+@end deftypevr + + +@node Error Messages, , Error Codes, Error Reporting +@section Error Messages + +The library has functions and variables designed to make it easy for +your program to report informative error messages in the customary +format about the failure of a library call. The functions +@code{strerror} and @code{perror} give you the standard error message +for a given error code; the variable +@w{@code{program_invocation_short_name}} gives you convenient access to the +name of the program that encountered the error. + +@comment string.h +@comment ANSI +@deftypefun {char *} strerror (int @var{errnum}) +The @code{strerror} function maps the error code (@pxref{Checking for +Errors}) specified by the @var{errnum} argument to a descriptive error +message string. The return value is a pointer to this string. + +The value @var{errnum} normally comes from the variable @code{errno}. + +You should not modify the string returned by @code{strerror}. Also, if +you make subsequent calls to @code{strerror}, the string might be +overwritten. (But it's guaranteed that no library function ever calls +@code{strerror} behind your back.) + +The function @code{strerror} is declared in @file{string.h}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun void perror (const char *@var{message}) +This function prints an error message to the stream @code{stderr}; +see @ref{Standard Streams}. + +If you call @code{perror} with a @var{message} that is either a null +pointer or an empty string, @code{perror} just prints the error message +corresponding to @code{errno}, adding a trailing newline. + +If you supply a non-null @var{message} argument, then @code{perror} +prefixes its output with this string. It adds a colon and a space +character to separate the @var{message} from the error string corresponding +to @code{errno}. + +The function @code{perror} is declared in @file{stdio.h}. 
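The output format described above can be sketched with @code{strerror}. The helper @code{format_error} is a hypothetical name introduced only for this illustration, not a library function; it builds in a caller-supplied buffer the same text that @code{perror} is described as printing, minus the trailing newline, on the assumption that @code{snprintf} is available.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only: build in BUF the text
   that perror would print for the current errno, without the trailing
   newline.  A null or empty PREFIX yields just the message; otherwise
   the prefix, a colon, and a space come first.  Returns 0 on success,
   or -1 if BUF is missing or too small.  */
int
format_error (const char *prefix, char *buf, size_t size)
{
  int errnum = errno;   /* Save it; later calls may overwrite errno.  */
  int n;

  if (buf == NULL || size == 0)
    return -1;

  if (prefix == NULL || *prefix == '\0')
    n = snprintf (buf, size, "%s", strerror (errnum));
  else
    n = snprintf (buf, size, "%s: %s", prefix, strerror (errnum));

  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Writing into a buffer rather than directly to @code{stderr} is a design choice made here so that the result can be inspected or logged elsewhere; a program that only needs the message on @code{stderr} can simply call @code{perror}.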
+@end deftypefun + +@code{strerror} and @code{perror} produce the exact same message for any +given error code; the precise text varies from system to system. On the +GNU system, the messages are fairly short; there are no multi-line +messages or embedded newlines. Each error message begins with a capital +letter and does not include any terminating punctuation. + +@strong{Compatibility Note:} The @code{strerror} function is a new +feature of ANSI C. Many older C systems do not support this function +yet. + +@cindex program name +@cindex name of running program +Many programs that don't read input from the terminal are designed to +exit if any system call fails. By convention, the error message from +such a program should start with the program's name, sans directories. +You can find that name in the variable +@code{program_invocation_short_name}; the full file name is stored in the +variable @code{program_invocation_name}: + +@comment errno.h +@comment GNU +@deftypevar {char *} program_invocation_name +This variable's value is the name that was used to invoke the program +running in the current process. It is the same as @code{argv[0]}. Note +that this is not necessarily a useful file name; often it contains no +directory names. @xref{Program Arguments}. +@end deftypevar + +@comment errno.h +@comment GNU +@deftypevar {char *} program_invocation_short_name +This variable's value is the name that was used to invoke the program +running in the current process, with directory names removed. (That is +to say, it is the same as @code{program_invocation_name} minus +everything up to the last slash, if any.) +@end deftypevar + +The library initialization code sets up both of these variables before +calling @code{main}. + +@strong{Portability Note:} These two variables are GNU extensions. If +you want your program to work with non-GNU libraries, you must save the +value of @code{argv[0]} in @code{main}, and then strip off the directory +names yourself.
We added these extensions to make it possible to write +self-contained error-reporting subroutines that require no explicit +cooperation from @code{main}. + +Here is an example showing how to handle failure to open a file +correctly. The function @code{open_sesame} tries to open the named file +for reading and returns a stream if successful. The @code{fopen} +library function returns a null pointer if it couldn't open the file for +some reason. In that situation, @code{open_sesame} constructs an +appropriate error message using the @code{strerror} function, and +terminates the program. If we were going to make some other library +calls before passing the error code to @code{strerror}, we'd have to +save it in a local variable instead, because those other library +functions might overwrite @code{errno} in the meantime. + +@smallexample +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> + +FILE * +open_sesame (char *name) +@{ + FILE *stream; + + errno = 0; + stream = fopen (name, "r"); + if (stream == NULL) + @{ + fprintf (stderr, "%s: Couldn't open file %s; %s\n", + program_invocation_short_name, name, strerror (errno)); + exit (EXIT_FAILURE); + @} + else + return stream; +@} +@end smallexample + diff --git a/manual/examples/add.c b/manual/examples/add.c new file mode 100644 index 0000000000..e4b1bba365 --- /dev/null +++ b/manual/examples/add.c @@ -0,0 +1,30 @@ +#include <stdarg.h> +#include <stdio.h> + +int +add_em_up (int count,...) +{ + va_list ap; + int i, sum; + + va_start (ap, count); /* Initialize the argument list. */ + + sum = 0; + for (i = 0; i < count; i++) + sum += va_arg (ap, int); /* Get the next argument value. */ + + va_end (ap); /* Clean up. */ + return sum; +} + +int +main (void) +{ + /* This call prints 16. */ + printf ("%d\n", add_em_up (3, 5, 5, 6)); + + /* This call prints 55. 
*/ + printf ("%d\n", add_em_up (10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)); + + return 0; +} diff --git a/manual/examples/atexit.c b/manual/examples/atexit.c new file mode 100644 index 0000000000..42bba71126 --- /dev/null +++ b/manual/examples/atexit.c @@ -0,0 +1,15 @@ +#include <stdio.h> +#include <stdlib.h> + +void +bye (void) +{ + puts ("Goodbye, cruel world...."); +} + +int +main (void) +{ + atexit (bye); + exit (EXIT_SUCCESS); +} diff --git a/manual/examples/db.c b/manual/examples/db.c new file mode 100644 index 0000000000..1a1cb0c0d7 --- /dev/null +++ b/manual/examples/db.c @@ -0,0 +1,52 @@ +#include <grp.h> +#include <pwd.h> +#include <sys/types.h> +#include <unistd.h> +#include <stdlib.h> + +int +main (void) +{ + uid_t me; + struct passwd *my_passwd; + struct group *my_group; + char **members; + + /* Get information about the user ID. */ + me = getuid (); + my_passwd = getpwuid (me); + if (!my_passwd) + { + printf ("Couldn't find out about user %d.\n", (int) me); + exit (EXIT_FAILURE); + } + + /* Print the information. */ + printf ("I am %s.\n", my_passwd->pw_gecos); + printf ("My login name is %s.\n", my_passwd->pw_name); + printf ("My uid is %d.\n", (int) (my_passwd->pw_uid)); + printf ("My home directory is %s.\n", my_passwd->pw_dir); + printf ("My default shell is %s.\n", my_passwd->pw_shell); + + /* Get information about the default group ID. */ + my_group = getgrgid (my_passwd->pw_gid); + if (!my_group) + { + printf ("Couldn't find out about group %d.\n", + (int) my_passwd->pw_gid); + exit (EXIT_FAILURE); + } + + /* Print the information. 
*/ + printf ("My default group is %s (%d).\n", + my_group->gr_name, (int) (my_passwd->pw_gid)); + printf ("The members of this group are:\n"); + members = my_group->gr_mem; + while (*members) + { + printf (" %s\n", *(members)); + members++; + } + + return EXIT_SUCCESS; +} diff --git a/manual/examples/dir.c b/manual/examples/dir.c new file mode 100644 index 0000000000..b90f72da03 --- /dev/null +++ b/manual/examples/dir.c @@ -0,0 +1,25 @@ +/*@group*/ +#include <stddef.h> +#include <stdio.h> +#include <sys/types.h> +#include <dirent.h> +/*@end group*/ + +int +main (void) +{ + DIR *dp; + struct dirent *ep; + + dp = opendir ("./"); + if (dp != NULL) + { + while (ep = readdir (dp)) + puts (ep->d_name); + (void) closedir (dp); + } + else + puts ("Couldn't open the directory."); + + return 0; +} diff --git a/manual/examples/filecli.c b/manual/examples/filecli.c new file mode 100644 index 0000000000..b77ae6763e --- /dev/null +++ b/manual/examples/filecli.c @@ -0,0 +1,54 @@ +#include <stdio.h> +#include <errno.h> +#include <unistd.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <sys/un.h> + +#define SERVER "/tmp/serversocket" +#define CLIENT "/tmp/mysocket" +#define MAXMSG 512 +#define MESSAGE "Yow!!! Are we having fun yet?!?" + +int +main (void) +{ + extern int make_named_socket (const char *name); + int sock; + char message[MAXMSG]; + struct sockaddr_un name; + size_t size; + int nbytes; + + /* Make the socket. */ + sock = make_named_socket (CLIENT); + + /* Initialize the server socket address. */ + name.sun_family = AF_UNIX; + strcpy (name.sun_path, SERVER); + size = strlen (name.sun_path) + sizeof (name.sun_family); + + /* Send the datagram. */ + nbytes = sendto (sock, MESSAGE, strlen (MESSAGE) + 1, 0, + (struct sockaddr *) & name, size); + if (nbytes < 0) + { + perror ("sendto (client)"); + exit (EXIT_FAILURE); + } + + /* Wait for a reply. 
*/ + nbytes = recvfrom (sock, message, MAXMSG, 0, NULL, 0); + if (nbytes < 0) + { + perror ("recvfrom (client)"); + exit (EXIT_FAILURE); + } + + /* Print a diagnostic message. */ + fprintf (stderr, "Client: got message: %s\n", message); + + /* Clean up. */ + remove (CLIENT); + close (sock); +} diff --git a/manual/examples/filesrv.c b/manual/examples/filesrv.c new file mode 100644 index 0000000000..3596b99982 --- /dev/null +++ b/manual/examples/filesrv.c @@ -0,0 +1,46 @@ +#include <stdio.h> +#include <errno.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <sys/un.h> + +#define SERVER "/tmp/serversocket" +#define MAXMSG 512 + +int +main (void) +{ + int sock; + char message[MAXMSG]; + struct sockaddr_un name; + size_t size; + int nbytes; + + /* Make the socket, then loop endlessly. */ + + sock = make_named_socket (SERVER); + while (1) + { + /* Wait for a datagram. */ + size = sizeof (name); + nbytes = recvfrom (sock, message, MAXMSG, 0, + (struct sockaddr *) & name, &size); + if (nbytes < 0) + { + perror ("recvfrom (server)"); + exit (EXIT_FAILURE); + } + + /* Give a diagnostic message. */ + fprintf (stderr, "Server: got message: %s\n", message); + + /* Bounce the message back to the sender. */ + nbytes = sendto (sock, message, nbytes, 0, + (struct sockaddr *) & name, size); + if (nbytes < 0) + { + perror ("sendto (server)"); + exit (EXIT_FAILURE); + } + } +} diff --git a/manual/examples/inetcli.c b/manual/examples/inetcli.c new file mode 100644 index 0000000000..258c6892aa --- /dev/null +++ b/manual/examples/inetcli.c @@ -0,0 +1,59 @@ +#include <stdio.h> +#include <errno.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netdb.h> + +#define PORT 5555 +#define MESSAGE "Yow!!! Are we having fun yet?!?"
+#define SERVERHOST "churchy.gnu.ai.mit.edu" + +void +write_to_server (int filedes) +{ + int nbytes; + + nbytes = write (filedes, MESSAGE, strlen (MESSAGE) + 1); + if (nbytes < 0) + { + perror ("write"); + exit (EXIT_FAILURE); + } +} + + +int +main (void) +{ + extern void init_sockaddr (struct sockaddr_in *name, + const char *hostname, + unsigned short int port); + int sock; + struct sockaddr_in servername; + + /* Create the socket. */ + sock = socket (PF_INET, SOCK_STREAM, 0); + if (sock < 0) + { + perror ("socket (client)"); + exit (EXIT_FAILURE); + } + + /* Connect to the server. */ + init_sockaddr (&servername, SERVERHOST, PORT); + if (0 > connect (sock, + (struct sockaddr *) &servername, + sizeof (servername))) + { + perror ("connect (client)"); + exit (EXIT_FAILURE); + } + + /* Send data to the server. */ + write_to_server (sock); + close (sock); + exit (EXIT_SUCCESS); +} diff --git a/manual/examples/inetsrv.c b/manual/examples/inetsrv.c new file mode 100644 index 0000000000..bd86e80f36 --- /dev/null +++ b/manual/examples/inetsrv.c @@ -0,0 +1,103 @@ +#include <stdio.h> +#include <errno.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netdb.h> + +#define PORT 5555 +#define MAXMSG 512 + +int +read_from_client (int filedes) +{ + char buffer[MAXMSG]; + int nbytes; + + nbytes = read (filedes, buffer, MAXMSG); + if (nbytes < 0) + { + /* Read error. */ + perror ("read"); + exit (EXIT_FAILURE); + } + else if (nbytes == 0) + /* End-of-file. */ + return -1; + else + { + /* Data read. */ + fprintf (stderr, "Server: got message: `%s'\n", buffer); + return 0; + } +} + +int +main (void) +{ + extern int make_socket (unsigned short int port); + int sock; + fd_set active_fd_set, read_fd_set; + int i; + struct sockaddr_in clientname; + size_t size; + + /* Create the socket and set it up to accept connections. 
*/ + sock = make_socket (PORT); + if (listen (sock, 1) < 0) + { + perror ("listen"); + exit (EXIT_FAILURE); + } + + /* Initialize the set of active sockets. */ + FD_ZERO (&active_fd_set); + FD_SET (sock, &active_fd_set); + + while (1) + { + /* Block until input arrives on one or more active sockets. */ + read_fd_set = active_fd_set; + if (select (FD_SETSIZE, &read_fd_set, NULL, NULL, NULL) < 0) + { + perror ("select"); + exit (EXIT_FAILURE); + } + + /* Service all the sockets with input pending. */ + for (i = 0; i < FD_SETSIZE; ++i) + if (FD_ISSET (i, &read_fd_set)) + { + if (i == sock) + { + /* Connection request on original socket. */ + int new; + size = sizeof (clientname); + new = accept (sock, + (struct sockaddr *) &clientname, + &size); + if (new < 0) + { + perror ("accept"); + exit (EXIT_FAILURE); + } + fprintf (stderr, + "Server: connect from host %s, port %hd.\n", + inet_ntoa (clientname.sin_addr), + ntohs (clientname.sin_port)); + FD_SET (new, &active_fd_set); + } + else + { + /* Data arriving on an already-connected socket. 
*/ + if (read_from_client (i) < 0) + { + close (i); + FD_CLR (i, &active_fd_set); + } + } + } + } +} diff --git a/manual/examples/isockad.c b/manual/examples/isockad.c new file mode 100644 index 0000000000..54ec1cca4c --- /dev/null +++ b/manual/examples/isockad.c @@ -0,0 +1,23 @@ +#include <stdio.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <netinet/in.h> +#include <netdb.h> + +void +init_sockaddr (struct sockaddr_in *name, + const char *hostname, + unsigned short int port) +{ + struct hostent *hostinfo; + + name->sin_family = AF_INET; + name->sin_port = htons (port); + hostinfo = gethostbyname (hostname); + if (hostinfo == NULL) + { + fprintf (stderr, "Unknown host %s.\n", hostname); + exit (EXIT_FAILURE); + } + name->sin_addr = *(struct in_addr *) hostinfo->h_addr; +} diff --git a/manual/examples/longopt.c b/manual/examples/longopt.c new file mode 100644 index 0000000000..d5c841f24a --- /dev/null +++ b/manual/examples/longopt.c @@ -0,0 +1,92 @@ +#include <stdio.h> + +/* Flag set by @samp{--verbose}. */ +static int verbose_flag; + +int +main (argc, argv) + int argc; + char **argv; +{ + int c; + + while (1) + { + static struct option long_options[] = + { + /* These options set a flag. */ + {"verbose", 0, &verbose_flag, 1}, + {"brief", 0, &verbose_flag, 0}, + /* These options don't set a flag. + We distinguish them by their indices. */ + {"add", 1, 0, 0}, + {"append", 0, 0, 0}, + {"delete", 1, 0, 0}, + {"create", 0, 0, 0}, + {"file", 1, 0, 0}, + {0, 0, 0, 0} + }; + /* @code{getopt_long} stores the option index here. */ + int option_index = 0; + + c = getopt_long (argc, argv, "abc:d:", + long_options, &option_index); + + /* Detect the end of the options. */ + if (c == -1) + break; + + switch (c) + { + case 0: + /* If this option set a flag, do nothing else now. 
*/ + if (long_options[option_index].flag != 0) + break; + printf ("option %s", long_options[option_index].name); + if (optarg) + printf (" with arg %s", optarg); + printf ("\n"); + break; + + case 'a': + puts ("option -a\n"); + break; + + case 'b': + puts ("option -b\n"); + break; + + case 'c': + printf ("option -c with value `%s'\n", optarg); + break; + + case 'd': + printf ("option -d with value `%s'\n", optarg); + break; + + case '?': + /* @code{getopt_long} already printed an error message. */ + break; + + default: + abort (); + } + } + + /* Instead of reporting @samp{--verbose} + and @samp{--brief} as they are encountered, + we report the final status resulting from them. */ + if (verbose_flag) + puts ("verbose flag is set"); + + /* Print any remaining command line arguments (not options). */ + if (optind < argc) + { + printf ("non-option ARGV-elements: "); + while (optind < argc) + printf ("%s ", argv[optind++]); + putchar ('\n'); + } + + exit (0); +} diff --git a/manual/examples/memopen.c b/manual/examples/memopen.c new file mode 100644 index 0000000000..682830fe5f --- /dev/null +++ b/manual/examples/memopen.c @@ -0,0 +1,17 @@ +#include <stdio.h> + +static char buffer[] = "foobar"; + +int +main (void) +{ + int ch; + FILE *stream; + + stream = fmemopen (buffer, strlen (buffer), "r"); + while ((ch = fgetc (stream)) != EOF) + printf ("Got %c\n", ch); + fclose (stream); + + return 0; +} diff --git a/manual/examples/memstrm.c b/manual/examples/memstrm.c new file mode 100644 index 0000000000..1674c36e0b --- /dev/null +++ b/manual/examples/memstrm.c @@ -0,0 +1,19 @@ +#include <stdio.h> + +int +main (void) +{ + char *bp; + size_t size; + FILE *stream; + + stream = open_memstream (&bp, &size); + fprintf (stream, "hello"); + fflush (stream); + printf ("buf = `%s', size = %d\n", bp, size); + fprintf (stream, ", world"); + fclose (stream); + printf ("buf = `%s', size = %d\n", bp, size); + + return 0; +} diff --git a/manual/examples/mkfsock.c b/manual/examples/mkfsock.c 
new file mode 100644 index 0000000000..d3750ec150 --- /dev/null +++ b/manual/examples/mkfsock.c @@ -0,0 +1,43 @@ +#include <stddef.h> +#include <stdio.h> +#include <errno.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <sys/un.h> + +int +make_named_socket (const char *filename) +{ + struct sockaddr_un name; + int sock; + size_t size; + + /* Create the socket. */ + + sock = socket (PF_UNIX, SOCK_DGRAM, 0); + if (sock < 0) + { + perror ("socket"); + exit (EXIT_FAILURE); + } + + /* Bind a name to the socket. */ + + name.sun_family = AF_FILE; + strcpy (name.sun_path, filename); + + /* The size of the address is + the offset of the start of the filename, + plus its length, + plus one for the terminating null byte. */ + size = (offsetof (struct sockaddr_un, sun_path) + + strlen (name.sun_path) + 1); + + if (bind (sock, (struct sockaddr *) &name, size) < 0) + { + perror ("bind"); + exit (EXIT_FAILURE); + } + + return sock; +} diff --git a/manual/examples/mkisock.c b/manual/examples/mkisock.c new file mode 100644 index 0000000000..07411bb263 --- /dev/null +++ b/manual/examples/mkisock.c @@ -0,0 +1,31 @@ +#include <stdio.h> +#include <stdlib.h> +#include <sys/socket.h> +#include <netinet/in.h> + +int +make_socket (unsigned short int port) +{ + int sock; + struct sockaddr_in name; + + /* Create the socket. */ + sock = socket (PF_INET, SOCK_STREAM, 0); + if (sock < 0) + { + perror ("socket"); + exit (EXIT_FAILURE); + } + + /* Give the socket a name. 
*/ + name.sin_family = AF_INET; + name.sin_port = htons (port); + name.sin_addr.s_addr = htonl (INADDR_ANY); + if (bind (sock, (struct sockaddr *) &name, sizeof (name)) < 0) + { + perror ("bind"); + exit (EXIT_FAILURE); + } + + return sock; +} diff --git a/manual/examples/pipe.c b/manual/examples/pipe.c new file mode 100644 index 0000000000..054550fec6 --- /dev/null +++ b/manual/examples/pipe.c @@ -0,0 +1,66 @@ +#include <sys/types.h> +#include <unistd.h> +#include <stdio.h> +#include <stdlib.h> + +/* Read characters from the pipe and echo them to @code{stdout}. */ + +void +read_from_pipe (int file) +{ + FILE *stream; + int c; + stream = fdopen (file, "r"); + while ((c = fgetc (stream)) != EOF) + putchar (c); + fclose (stream); +} + +/* Write some random text to the pipe. */ + +void +write_to_pipe (int file) +{ + FILE *stream; + stream = fdopen (file, "w"); + fprintf (stream, "hello, world!\n"); + fprintf (stream, "goodbye, world!\n"); + fclose (stream); +} + +int +main (void) +{ + pid_t pid; + int mypipe[2]; + +/*@group*/ + /* Create the pipe. */ + if (pipe (mypipe)) + { + fprintf (stderr, "Pipe failed.\n"); + return EXIT_FAILURE; + } +/*@end group*/ + + /* Create the child process. */ + pid = fork (); + if (pid == (pid_t) 0) + { + /* This is the child process. */ + read_from_pipe (mypipe[0]); + return EXIT_SUCCESS; + } + else if (pid < (pid_t) 0) + { + /* The fork failed. */ + fprintf (stderr, "Fork failed.\n"); + return EXIT_FAILURE; + } + else + { + /* This is the parent process. 
*/ + write_to_pipe (mypipe[1]); + return EXIT_SUCCESS; + } +} diff --git a/manual/examples/popen.c b/manual/examples/popen.c new file mode 100644 index 0000000000..16ae32fa16 --- /dev/null +++ b/manual/examples/popen.c @@ -0,0 +1,33 @@ +#include <stdio.h> +#include <stdlib.h> + +void +write_data (FILE * stream) +{ + int i; + for (i = 0; i < 100; i++) + fprintf (stream, "%d\n", i); + if (ferror (stream)) + { + fprintf (stderr, "Output to stream failed.\n"); + exit (EXIT_FAILURE); + } +} + +/*@group*/ +int +main (void) +{ + FILE *output; + + output = popen ("more", "w"); + if (!output) + { + fprintf (stderr, "Could not run more.\n"); + return EXIT_FAILURE; + } + write_data (output); + pclose (output); + return EXIT_SUCCESS; +} +/*@end group*/ diff --git a/manual/examples/rprintf.c b/manual/examples/rprintf.c new file mode 100644 index 0000000000..eff1d8e7cf --- /dev/null +++ b/manual/examples/rprintf.c @@ -0,0 +1,52 @@ +#include <stdio.h> +#include <printf.h> +#include <stdarg.h> + +/*@group*/ +typedef struct + { + char *name; + } Widget; +/*@end group*/ + +int +print_widget (FILE *stream, const struct printf_info *info, va_list *app) +{ + Widget *w; + char *buffer; + int len; + + /* Format the output into a string. */ + w = va_arg (*app, Widget *); + len = asprintf (&buffer, "<Widget %p: %s>", w, w->name); + if (len == -1) + return -1; + + /* Pad to the minimum field width and print to the stream. */ + len = fprintf (stream, "%*s", + (info->left ? - info->width : info->width), + buffer); + + /* Clean up and return. */ + free (buffer); + return len; +} + + +int +main (void) +{ + /* Make a widget to print. */ + Widget mywidget; + mywidget.name = "mywidget"; + + /* Register the print function for widgets. */ + register_printf_function ('W', print_widget, NULL); /* No arginfo. */ + + /* Now print the widget. 
*/ + printf ("|%W|\n", &mywidget); + printf ("|%35W|\n", &mywidget); + printf ("|%-35W|\n", &mywidget); + + return 0; +} diff --git a/manual/examples/search b/manual/examples/search new file mode 100755 index 0000000000..4916a2c52f --- /dev/null +++ b/manual/examples/search Binary files differdiff --git a/manual/examples/search.c b/manual/examples/search.c new file mode 100644 index 0000000000..182e6e4a3f --- /dev/null +++ b/manual/examples/search.c @@ -0,0 +1,93 @@ +#include <stdlib.h> +#include <stdio.h> +#include <string.h> + +/* Define an array of critters to sort. */ + +struct critter + { + const char *name; + const char *species; + }; + +struct critter muppets[] = + { + {"Kermit", "frog"}, + {"Piggy", "pig"}, + {"Gonzo", "whatever"}, + {"Fozzie", "bear"}, + {"Sam", "eagle"}, + {"Robin", "frog"}, + {"Animal", "animal"}, + {"Camilla", "chicken"}, + {"Sweetums", "monster"}, + {"Dr. Strangepork", "pig"}, + {"Link Hogthrob", "pig"}, + {"Zoot", "human"}, + {"Dr. Bunsen Honeydew", "human"}, + {"Beaker", "human"}, + {"Swedish Chef", "human"} + }; + +int count = sizeof (muppets) / sizeof (struct critter); + + + +/* This is the comparison function used for sorting and searching. */ + +int +critter_cmp (const struct critter *c1, const struct critter *c2) +{ + return strcmp (c1->name, c2->name); +} + + +/* Print information about a critter. */ + +void +print_critter (const struct critter *c) +{ + printf ("%s, the %s\n", c->name, c->species); +} + + +/*@group*/ +/* Do the lookup into the sorted array. */ + +void +find_critter (const char *name) +{ + struct critter target, *result; + target.name = name; + result = bsearch (&target, muppets, count, sizeof (struct critter), + critter_cmp); + if (result) + print_critter (result); + else + printf ("Couldn't find %s.\n", name); +} +/*@end group*/ + +/* Main program. 
*/ + +int +main (void) +{ + int i; + + for (i = 0; i < count; i++) + print_critter (&muppets[i]); + printf ("\n"); + + qsort (muppets, count, sizeof (struct critter), critter_cmp); + + for (i = 0; i < count; i++) + print_critter (&muppets[i]); + printf ("\n"); + + find_critter ("Kermit"); + find_critter ("Gonzo"); + find_critter ("Janice"); + + return 0; +} diff --git a/manual/examples/select.c b/manual/examples/select.c new file mode 100644 index 0000000000..def2cd6f9f --- /dev/null +++ b/manual/examples/select.c @@ -0,0 +1,40 @@ +/*@group*/ +#include <stdio.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/time.h> +/*@end group*/ + +/*@group*/ +int +input_timeout (int filedes, unsigned int seconds) +{ + fd_set set; + struct timeval timeout; +/*@end group*/ + + /* Initialize the file descriptor set. */ + FD_ZERO (&set); + FD_SET (filedes, &set); + + /* Initialize the timeout data structure. */ + timeout.tv_sec = seconds; + timeout.tv_usec = 0; + +/*@group*/ + /* @code{select} returns 0 if timeout, 1 if input available, -1 if error. 
*/ + return TEMP_FAILURE_RETRY (select (FD_SETSIZE, + &set, NULL, NULL, + &timeout)); +} +/*@end group*/ + +/*@group*/ +int +main (void) +{ + fprintf (stderr, "select returned %d.\n", + input_timeout (STDIN_FILENO, 5)); + return 0; +} +/*@end group*/ diff --git a/manual/examples/setjmp.c b/manual/examples/setjmp.c new file mode 100644 index 0000000000..023339c602 --- /dev/null +++ b/manual/examples/setjmp.c @@ -0,0 +1,32 @@ +#include <setjmp.h> +#include <stdlib.h> +#include <stdio.h> + +jmp_buf main_loop; + +void +abort_to_main_loop (int status) +{ + longjmp (main_loop, status); +} + +int +main (void) +{ + while (1) + if (setjmp (main_loop)) + puts ("Back at main loop...."); + else + do_command (); +} + + +void +do_command (void) +{ + char buffer[128]; + if (fgets (buffer, 128, stdin) == NULL) + abort_to_main_loop (-1); + else + exit (EXIT_SUCCESS); +} diff --git a/manual/examples/sigh1.c b/manual/examples/sigh1.c new file mode 100644 index 0000000000..2c6e95b9c9 --- /dev/null +++ b/manual/examples/sigh1.c @@ -0,0 +1,36 @@ +#include <signal.h> +#include <stdio.h> +#include <stdlib.h> + +/* This flag controls termination of the main loop. */ +volatile sig_atomic_t keep_going = 1; + +/* The signal handler just clears the flag and re-enables itself. */ +void +catch_alarm (int sig) +{ + keep_going = 0; + signal (sig, catch_alarm); +} + +void +do_stuff (void) +{ + puts ("Doing stuff while waiting for alarm...."); +} + +int +main (void) +{ + /* Establish a handler for SIGALRM signals. */ + signal (SIGALRM, catch_alarm); + + /* Set an alarm to go off in a little while. */ + alarm (2); + + /* Check the flag once in a while to see when to quit. 
*/ + while (keep_going) + do_stuff (); + + return EXIT_SUCCESS; +} diff --git a/manual/examples/sigusr.c b/manual/examples/sigusr.c new file mode 100644 index 0000000000..11e3ceee8f --- /dev/null +++ b/manual/examples/sigusr.c @@ -0,0 +1,61 @@ +/*@group*/ +#include <signal.h> +#include <stdio.h> +#include <sys/types.h> +#include <unistd.h> +/*@end group*/ + +/* When a @code{SIGUSR1} signal arrives, set this variable. */ +volatile sig_atomic_t usr_interrupt = 0; + +void +synch_signal (int sig) +{ + usr_interrupt = 1; +} + +/* The child process executes this function. */ +void +child_function (void) +{ + /* Perform initialization. */ + printf ("I'm here!!! My pid is %d.\n", (int) getpid ()); + + /* Let parent know you're done. */ + kill (getppid (), SIGUSR1); + + /* Continue with execution. */ + puts ("Bye, now...."); + exit (0); +} + +int +main (void) +{ + struct sigaction usr_action; + sigset_t block_mask; + pid_t child_id; + + /* Establish the signal handler. */ + sigfillset (&block_mask); + usr_action.sa_handler = synch_signal; + usr_action.sa_mask = block_mask; + usr_action.sa_flags = 0; + sigaction (SIGUSR1, &usr_action, NULL); + + /* Create the child process. */ + child_id = fork (); + if (child_id == 0) + child_function (); /* Does not return. */ + +/*@group*/ + /* Busy wait for the child to send a signal. */ + while (!usr_interrupt) + ; +/*@end group*/ + + /* Now continue execution. 
*/ + puts ("That's all, folks!"); + + return 0; +} diff --git a/manual/examples/stpcpy.c b/manual/examples/stpcpy.c new file mode 100644 index 0000000000..b83226354b --- /dev/null +++ b/manual/examples/stpcpy.c @@ -0,0 +1,13 @@ +#include <string.h> +#include <stdio.h> + +int +main (void) +{ + char buffer[10]; + char *to = buffer; + to = stpcpy (to, "foo"); + to = stpcpy (to, "bar"); + puts (buffer); + return 0; +} diff --git a/manual/examples/strftim.c b/manual/examples/strftim.c new file mode 100644 index 0000000000..7f95ef02ad --- /dev/null +++ b/manual/examples/strftim.c @@ -0,0 +1,31 @@ +#include <time.h> +#include <stdio.h> + +#define SIZE 256 + +int +main (void) +{ + char buffer[SIZE]; + time_t curtime; + struct tm *loctime; + + /* Get the current time. */ + curtime = time (NULL); + + /* Convert it to local time representation. */ + loctime = localtime (&curtime); + + /* Print out the date and time in the standard format. */ + fputs (asctime (loctime), stdout); + +/*@group*/ + /* Print it out in a nice format. */ + strftime (buffer, SIZE, "Today is %A, %B %d.\n", loctime); + fputs (buffer, stdout); + strftime (buffer, SIZE, "The time is %I:%M %p.\n", loctime); + fputs (buffer, stdout); + + return 0; +} +/*@end group*/ diff --git a/manual/examples/strncat.c b/manual/examples/strncat.c new file mode 100644 index 0000000000..f865167f4a --- /dev/null +++ b/manual/examples/strncat.c @@ -0,0 +1,14 @@ +#include <string.h> +#include <stdio.h> + +#define SIZE 10 + +static char buffer[SIZE]; + +main () +{ + strncpy (buffer, "hello", SIZE); + puts (buffer); + strncat (buffer, ", world", SIZE - strlen (buffer) - 1); + puts (buffer); +} diff --git a/manual/examples/termios.c b/manual/examples/termios.c new file mode 100644 index 0000000000..6db5990a0c --- /dev/null +++ b/manual/examples/termios.c @@ -0,0 +1,60 @@ +#include <unistd.h> +#include <stdio.h> +#include <stdlib.h> +#include <termios.h> + +/* Use this variable to remember original terminal attributes. 
*/ + +struct termios saved_attributes; + +void +reset_input_mode (void) +{ + tcsetattr (STDIN_FILENO, TCSANOW, &saved_attributes); +} + +void +set_input_mode (void) +{ + struct termios tattr; + char *name; + + /* Make sure stdin is a terminal. */ + if (!isatty (STDIN_FILENO)) + { + fprintf (stderr, "Not a terminal.\n"); + exit (EXIT_FAILURE); + } + + /* Save the terminal attributes so we can restore them later. */ + tcgetattr (STDIN_FILENO, &saved_attributes); + atexit (reset_input_mode); + +/*@group*/ + /* Set the funny terminal modes. */ + tcgetattr (STDIN_FILENO, &tattr); + tattr.c_lflag &= ~(ICANON|ECHO); /* Clear ICANON and ECHO. */ + tattr.c_cc[VMIN] = 1; + tattr.c_cc[VTIME] = 0; + tcsetattr (STDIN_FILENO, TCSAFLUSH, &tattr); +} +/*@end group*/ + +int +main (void) +{ + char c; + + set_input_mode (); + + while (1) + { + read (STDIN_FILENO, &c, 1); + if (c == '\004') /* @kbd{C-d} */ + break; + else + putchar (c); + } + + return EXIT_SUCCESS; +} diff --git a/manual/examples/testopt.c b/manual/examples/testopt.c new file mode 100644 index 0000000000..8ebc9b6f7a --- /dev/null +++ b/manual/examples/testopt.c @@ -0,0 +1,50 @@ +/*@group*/ +#include <unistd.h> +#include <stdio.h> + +int +main (int argc, char **argv) +{ + int aflag = 0; + int bflag = 0; + char *cvalue = NULL; + int index; + int c; + + opterr = 0; +/*@end group*/ + +/*@group*/ + while ((c = getopt (argc, argv, "abc:")) != -1) + switch (c) + { + case 'a': + aflag = 1; + break; + case 'b': + bflag = 1; + break; + case 'c': + cvalue = optarg; + break; + case '?': + if (isprint (optopt)) + fprintf (stderr, "Unknown option `-%c'.\n", optopt); + else + fprintf (stderr, + "Unknown option character `\\x%x'.\n", + optopt); + return 1; + default: + abort (); + } +/*@end group*/ + +/*@group*/ + printf ("aflag = %d, bflag = %d, cvalue = %s\n", aflag, bflag, cvalue); + + for (index = optind; index < argc; index++) + printf ("Non-option argument %s\n", argv[index]); + return 0; +} +/*@end group*/ diff --git 
a/manual/filesys.texi b/manual/filesys.texi new file mode 100644 index 0000000000..d2afe8623f --- /dev/null +++ b/manual/filesys.texi @@ -0,0 +1,2080 @@ +@node File System Interface, Pipes and FIFOs, Low-Level I/O, Top +@chapter File System Interface + +This chapter describes the GNU C library's functions for manipulating +files. Unlike the input and output functions described in +@ref{I/O on Streams} and @ref{Low-Level I/O}, these +functions are concerned with operating on the files themselves, rather +than on their contents. + +Among the facilities described in this chapter are functions for +examining or modifying directories, functions for renaming and deleting +files, and functions for examining and setting file attributes such as +access permissions and modification times. + +@menu +* Working Directory:: This is used to resolve relative + file names. +* Accessing Directories:: Finding out what files a directory + contains. +* Hard Links:: Adding alternate names to a file. +* Symbolic Links:: A file that ``points to'' a file name. +* Deleting Files:: How to delete a file, and what that means. +* Renaming Files:: Changing a file's name. +* Creating Directories:: A system call just for creating a directory. +* File Attributes:: Attributes of individual files. +* Making Special Files:: How to create special files. +* Temporary Files:: Naming and creating temporary files. +@end menu + +@node Working Directory +@section Working Directory + +@cindex current working directory +@cindex working directory +@cindex change working directory +Each process has associated with it a directory, called its @dfn{current +working directory} or simply @dfn{working directory}, that is used in +the resolution of relative file names (@pxref{File Name Resolution}). + +When you log in and begin a new session, your working directory is +initially set to the home directory associated with your login account +in the system user database. 
You can find any user's home directory +using the @code{getpwuid} or @code{getpwnam} functions; see @ref{User +Database}. + +Users can change the working directory using shell commands like +@code{cd}. The functions described in this section are the primitives +used by those commands and by other programs for examining and changing +the working directory. +@pindex cd + +Prototypes for these functions are declared in the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun {char *} getcwd (char *@var{buffer}, size_t @var{size}) +The @code{getcwd} function returns an absolute file name representing +the current working directory, storing it in the character array +@var{buffer} that you provide. The @var{size} argument is how you tell +the system the allocation size of @var{buffer}. + +The GNU library version of this function also permits you to specify a +null pointer for the @var{buffer} argument. Then @code{getcwd} +allocates a buffer automatically, as with @code{malloc} +(@pxref{Unconstrained Allocation}). If the @var{size} is greater than +zero, then the buffer is that large; otherwise, the buffer is as large +as necessary to hold the result. + +The return value is @var{buffer} on success and a null pointer on failure. +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The @var{size} argument is zero and @var{buffer} is not a null pointer. + +@item ERANGE +The @var{size} argument is less than the length of the working directory +name. You need to allocate a bigger array and try again. + +@item EACCES +Permission to read or search a component of the file name was denied. 
+@end table +@end deftypefun + +Here is an example showing how you could implement the behavior of GNU's +@w{@code{getcwd (NULL, 0)}} using only the standard behavior of +@code{getcwd}: + +@smallexample +char * +gnu_getcwd () +@{ + int size = 100; + char *buffer = (char *) xmalloc (size); + + while (1) + @{ + char *value = getcwd (buffer, size); + if (value != 0) + return buffer; + size *= 2; + free (buffer); + buffer = (char *) xmalloc (size); + @} +@} +@end smallexample + +@noindent +@xref{Malloc Examples}, for information about @code{xmalloc}, which is +not a library function but is a customary name used in most GNU +software. + +@comment unistd.h +@comment BSD +@deftypefun {char *} getwd (char *@var{buffer}) +This is similar to @code{getcwd}, but has no way to specify the size of +the buffer. The GNU library provides @code{getwd} only +for backwards compatibility with BSD. + +The @var{buffer} argument should be a pointer to an array at least +@code{PATH_MAX} bytes long (@pxref{Limits for Files}). In the GNU +system there is no limit to the size of a file name, so this is not +necessarily enough space to contain the directory name. That is why +this function is deprecated. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int chdir (const char *@var{filename}) +This function is used to set the process's working directory to +@var{filename}. + +The normal, successful return value from @code{chdir} is @code{0}. A +value of @code{-1} is returned to indicate an error. The @code{errno} +error conditions defined for this function are the usual file name +syntax errors (@pxref{File Name Errors}), plus @code{ENOTDIR} if the +file @var{filename} is not a directory. +@end deftypefun + + +@node Accessing Directories +@section Accessing Directories +@cindex accessing directories +@cindex reading from a directory +@cindex directories, accessing + +The facilities described in this section let you read the contents of a +directory file. 
This is useful if you want your program to list all the +files in a directory, perhaps as part of a menu. + +@cindex directory stream +The @code{opendir} function opens a @dfn{directory stream} whose +elements are directory entries. You use the @code{readdir} function on +the directory stream to retrieve these entries, represented as +@w{@code{struct dirent}} objects. The name of the file for each entry is +stored in the @code{d_name} member of this structure. There are obvious +parallels here to the stream facilities for ordinary files, described in +@ref{I/O on Streams}. + +@menu +* Directory Entries:: Format of one directory entry. +* Opening a Directory:: How to open a directory stream. +* Reading/Closing Directory:: How to read directory entries from the stream. +* Simple Directory Lister:: A very simple directory listing program. +* Random Access Directory:: Rereading part of the directory + already read with the same stream. +@end menu + +@node Directory Entries +@subsection Format of a Directory Entry + +@pindex dirent.h +This section describes what you find in a single directory entry, as you +might obtain it from a directory stream. All the symbols are declared +in the header file @file{dirent.h}. + +@comment dirent.h +@comment POSIX.1 +@deftp {Data Type} {struct dirent} +This is a structure type used to return information about directory +entries. It contains the following fields: + +@table @code +@item char d_name[] +This is the null-terminated file name component. This is the only +field you can count on in all POSIX systems. + +@item ino_t d_fileno +This is the file serial number. For BSD compatibility, you can also +refer to this member as @code{d_ino}. In the GNU system and most POSIX +systems, for most files this is the same as the @code{st_ino} member that +@code{stat} will return for the file. @xref{File Attributes}. + +@item unsigned char d_namlen +This is the length of the file name, not including the terminating null +character.
Its type is @code{unsigned char} because that is the integer +type of the appropriate size. + +@item unsigned char d_type +This is the type of the file, possibly unknown. The following constants +are defined for its value: + +@table @code +@item DT_UNKNOWN +The type is unknown. On some systems this is the only value returned. + +@item DT_REG +A regular file. + +@item DT_DIR +A directory. + +@item DT_FIFO +A named pipe, or FIFO. @xref{FIFO Special Files}. + +@item DT_SOCK +A local-domain socket. @c !!! @xref{Local Domain}. + +@item DT_CHR +A character device. + +@item DT_BLK +A block device. +@end table + +This member is a BSD extension. Each value except @code{DT_UNKNOWN} +corresponds to the file type bits in the @code{st_mode} member of +@code{struct stat}. These two macros convert between @code{d_type} +values and @code{st_mode} values: + +@deftypefun int IFTODT (mode_t @var{mode}) +This returns the @code{d_type} value corresponding to @var{mode}. +@end deftypefun + +@deftypefun mode_t DTTOIF (int @var{dirtype}) +This returns the @code{st_mode} value corresponding to @var{dirtype}. +@end deftypefun +@end table + +This structure may contain additional members in the future. + +When a file has multiple names, each name has its own directory entry. +The only way you can tell that the directory entries belong to a +single file is that they have the same value for the @code{d_fileno} +field. + +File attributes such as size, modification times, and the like are part +of the file itself, not any particular directory entry. @xref{File +Attributes}. +@end deftp + +@node Opening a Directory +@subsection Opening a Directory Stream + +@pindex dirent.h +This section describes how to open a directory stream. All the symbols +are declared in the header file @file{dirent.h}. + +@comment dirent.h +@comment POSIX.1 +@deftp {Data Type} DIR +The @code{DIR} data type represents a directory stream.
+@end deftp + +You shouldn't ever allocate objects of the @code{struct dirent} or +@code{DIR} data types, since the directory access functions do that for +you. Instead, you refer to these objects using the pointers returned by +the following functions. + +@comment dirent.h +@comment POSIX.1 +@deftypefun {DIR *} opendir (const char *@var{dirname}) +The @code{opendir} function opens and returns a directory stream for +reading the directory whose file name is @var{dirname}. The stream has +type @code{DIR *}. + +If unsuccessful, @code{opendir} returns a null pointer. In addition to +the usual file name errors (@pxref{File Name Errors}), the +following @code{errno} error conditions are defined for this function: + +@table @code +@item EACCES +Read permission is denied for the directory named by @var{dirname}. + +@item EMFILE +The process has too many files open. + +@item ENFILE +The entire system, or perhaps the file system which contains the +directory, cannot support any additional open files at the moment. +(This problem cannot happen on the GNU system.) +@end table + +The @code{DIR} type is typically implemented using a file descriptor, +and the @code{opendir} function in terms of the @code{open} function. +@xref{Low-Level I/O}. Directory streams and the underlying +file descriptors are closed on @code{exec} (@pxref{Executing a File}). +@end deftypefun + +@node Reading/Closing Directory +@subsection Reading and Closing a Directory Stream + +@pindex dirent.h +This section describes how to read directory entries from a directory +stream, and how to close the stream when you are done with it. All the +symbols are declared in the header file @file{dirent.h}. + +@comment dirent.h +@comment POSIX.1 +@deftypefun {struct dirent *} readdir (DIR *@var{dirstream}) +This function reads the next entry from the directory. It normally +returns a pointer to a structure containing information about the file.
+This structure is statically allocated and can be rewritten by a +subsequent call. + +@strong{Portability Note:} On some systems, @code{readdir} may not +return entries for @file{.} and @file{..}, even though these are always +valid file names in any directory. @xref{File Name Resolution}. + +If there are no more entries in the directory or an error is detected, +@code{readdir} returns a null pointer. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +The @var{dirstream} argument is not valid. +@end table +@end deftypefun + +@comment dirent.h +@comment POSIX.1 +@deftypefun int closedir (DIR *@var{dirstream}) +This function closes the directory stream @var{dirstream}. It returns +@code{0} on success and @code{-1} on failure. + +The following @code{errno} error conditions are defined for this +function: + +@table @code +@item EBADF +The @var{dirstream} argument is not valid. +@end table +@end deftypefun + +@node Simple Directory Lister +@subsection Simple Program to List a Directory + +Here's a simple program that prints the names of the files in +the current working directory: + +@smallexample +@include dir.c.texi +@end smallexample + +The order in which files appear in a directory tends to be fairly +random. A more useful program would sort the entries (perhaps by +alphabetizing them) before printing them; see @ref{Array Sort Function}. + +@c ??? not documented: scandir, alphasort + +@node Random Access Directory +@subsection Random Access in a Directory Stream + +@pindex dirent.h +This section describes how to reread parts of a directory that you have +already read from an open directory stream. All the symbols are +declared in the header file @file{dirent.h}. 
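As a rough illustration of rereading, here is a minimal sketch that uses @code{telldir}, @code{seekdir} and @code{rewinddir} on one stream.  The helper names @code{count_remaining} and @code{reread_directory} are hypothetical, not library functions, and the sketch assumes the directory does not change while it runs and that @code{telldir} returns a @code{long}, as on current POSIX systems.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Count the entries readdir returns between the current position
   and the end of DIRSTREAM.  */
static long
count_remaining (DIR *dirstream)
{
  long n = 0;

  assert (dirstream != NULL);
  while (readdir (dirstream) != NULL)
    n++;
  return n;
}

/* Hypothetical helper: skip one entry of DIRNAME, mark the position,
   count the rest, seek back to the mark and count again, then rewind
   and count every entry.  Returns 0 on success and -1 on failure.  */
int
reread_directory (const char *dirname, long *tail, long *tail_again,
                  long *all)
{
  DIR *dirstream = opendir (dirname);
  long mark;

  if (dirstream == NULL)
    return -1;

  (void) readdir (dirstream);          /* Skip the first entry.  */
  mark = telldir (dirstream);          /* Remember where we are.  */
  *tail = count_remaining (dirstream);

  seekdir (dirstream, mark);           /* Go back to the mark.  */
  *tail_again = count_remaining (dirstream);

  rewinddir (dirstream);               /* Start over from the top.  */
  *all = count_remaining (dirstream);

  return closedir (dirstream);
}
```

On an unchanged directory the two tail counts should agree, and the full count should be at least as large as either.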
+ +@comment dirent.h +@comment POSIX.1 +@deftypefun void rewinddir (DIR *@var{dirstream}) +The @code{rewinddir} function is used to reinitialize the directory +stream @var{dirstream}, so that if you call @code{readdir} it +returns information about the first entry in the directory again.  This +function also notices if files have been added to or removed from the +directory since it was opened with @code{opendir}.  (Entries for these +files might or might not be returned by @code{readdir} if they were +added or removed since you last called @code{opendir} or +@code{rewinddir}.) +@end deftypefun + +@comment dirent.h +@comment BSD +@deftypefun off_t telldir (DIR *@var{dirstream}) +The @code{telldir} function returns the file position of the directory +stream @var{dirstream}.  You can use this value with @code{seekdir} to +restore the directory stream to that position. +@end deftypefun + +@comment dirent.h +@comment BSD +@deftypefun void seekdir (DIR *@var{dirstream}, off_t @var{pos}) +The @code{seekdir} function sets the file position of the directory +stream @var{dirstream} to @var{pos}.  The value @var{pos} must be the +result of a previous call to @code{telldir} on this particular stream; +closing and reopening the directory can invalidate values returned by +@code{telldir}. +@end deftypefun + +@node Hard Links +@section Hard Links +@cindex hard link +@cindex link, hard +@cindex multiple names for one file +@cindex file names, multiple + +In POSIX systems, one file can have many names at the same time.  All of +the names are equally real, and no one of them is preferred to the +others. + +To add a name to a file, use the @code{link} function.  (The new name is +also called a @dfn{hard link} to the file.)  Creating a new link to a +file does not copy the contents of the file; it simply makes a new name +by which the file can be known, in addition to the file's existing name +or names.
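The point that a hard link adds a name rather than copying data can be checked with a small sketch.  The helper @code{link_shares_file} below is hypothetical; it assumes both names lie on one file system that supports hard links, and it compares the device and file serial numbers reported by @code{stat} for the two names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: create an empty file OLDNAME, add the name
   NEWNAME with link, and check that both names refer to one file.
   Returns 1 if they do, 0 if they do not, and -1 on error.  Both
   names are removed again before returning.  */
int
link_shares_file (const char *oldname, const char *newname)
{
  struct stat a, b;
  int fd, result;

  assert (oldname != NULL && newname != NULL);

  fd = open (oldname, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    return -1;
  close (fd);

  if (link (oldname, newname) != 0)
    {
      unlink (oldname);
      return -1;
    }

  if (stat (oldname, &a) != 0 || stat (newname, &b) != 0)
    result = -1;
  else
    /* Same device and same serial number means the same file.  */
    result = (a.st_dev == b.st_dev && a.st_ino == b.st_ino);

  unlink (newname);
  unlink (oldname);
  return result;
}
```

A call such as @code{link_shares_file ("a.tmp", "b.tmp")} is expected to return 1 on an ordinary local file system.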
+ +One file can have names in several directories, so the organization +of the file system is not a strict hierarchy or tree. + +In most implementations, it is not possible to have hard links to the +same file in multiple file systems.  @code{link} reports an error if you +try to make a hard link to the file from another file system when this +cannot be done. + +The prototype for the @code{link} function is declared in the header +file @file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun int link (const char *@var{oldname}, const char *@var{newname}) +The @code{link} function makes a new link to the existing file named by +@var{oldname}, under the new name @var{newname}. + +This function returns a value of @code{0} if it is successful and +@code{-1} on failure.  In addition to the usual file name errors +(@pxref{File Name Errors}) for both @var{oldname} and @var{newname}, the +following @code{errno} error conditions are defined for this function: + +@table @code +@item EACCES +You are not allowed to write to the directory in which the new link is to +be written. +@ignore +Some implementations also require that the existing file be accessible +by the caller, and use this error to report failure for that reason. +@end ignore + +@item EEXIST +There is already a file named @var{newname}.  If you want to replace +this link with a new link, you must remove the old link explicitly first. + +@item EMLINK +There are already too many links to the file named by @var{oldname}. +(The maximum number of links to a file is @w{@code{LINK_MAX}}; see +@ref{Limits for Files}.) + +@item ENOENT +The file named by @var{oldname} doesn't exist.  You can't make a link to +a file that doesn't exist. + +@item ENOSPC +The directory or file system that would contain the new link is full +and cannot be extended. + +@item EPERM +In the GNU system and some others, you cannot make links to directories. +Many systems allow only privileged users to do so.
This error +is used to report the problem. + +@item EROFS +The directory containing the new link can't be modified because it's on +a read-only file system. + +@item EXDEV +The directory specified in @var{newname} is on a different file system +than the existing file. + +@item EIO +A hardware error occurred while trying to read or write to the file system. +@end table +@end deftypefun + +@node Symbolic Links +@section Symbolic Links +@cindex soft link +@cindex link, soft +@cindex symbolic link +@cindex link, symbolic + +The GNU system supports @dfn{soft links} or @dfn{symbolic links}.  This +is a kind of ``file'' that is essentially a pointer to another file +name.  Unlike hard links, symbolic links can be made to directories or +across file systems with no restrictions.  You can also make a symbolic +link to a name which is not the name of any file.  (Opening this link +will fail until a file by that name is created.)  Likewise, if the +symbolic link points to an existing file which is later deleted, the +symbolic link continues to point to the same file name even though the +name no longer names any file. + +The reason symbolic links work the way they do is that special things +happen when you try to open the link.  The @code{open} function realizes +you have specified the name of a link, reads the file name contained in +the link, and opens that file name instead.  The @code{stat} function +likewise operates on the file that the symbolic link points to, instead +of on the link itself. + +By contrast, other operations such as deleting or renaming the file +operate on the link itself.  The functions @code{readlink} and +@code{lstat} also refrain from following symbolic links, because their +purpose is to obtain information about the link.  So does @code{link}, +the function that makes a hard link---it makes a hard link to the +symbolic link, which one rarely wants. + +Prototypes for the functions listed in this section are in +@file{unistd.h}.
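The difference between functions that follow a link and functions that operate on the link itself can be seen with a link whose target does not exist.  This is a minimal sketch; the helper name @code{dangling_link_check} is hypothetical, and it assumes that @var{target} names no existing file and that the file system supports symbolic links.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: make LINKNAME a symbolic link to TARGET,
   which is assumed not to exist.  Returns 1 if lstat sees a
   symbolic link while stat fails with ENOENT, 0 otherwise, and -1
   if the link cannot be created.  The link is removed afterward.  */
int
dangling_link_check (const char *target, const char *linkname)
{
  struct stat st;
  int is_link, target_missing;

  assert (target != NULL && linkname != NULL);

  if (symlink (target, linkname) != 0)
    return -1;

  /* lstat examines the link itself.  */
  is_link = lstat (linkname, &st) == 0 && S_ISLNK (st.st_mode);

  /* stat follows the link and finds nothing there.  */
  target_missing = stat (linkname, &st) == -1 && errno == ENOENT;

  /* unlink removes the link, not the (missing) target.  */
  unlink (linkname);
  return is_link && target_missing;
}
```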
+@pindex unistd.h + +@comment unistd.h +@comment BSD +@deftypefun int symlink (const char *@var{oldname}, const char *@var{newname}) +The @code{symlink} function makes a symbolic link to @var{oldname} named +@var{newname}. + +The normal return value from @code{symlink} is @code{0}. A return value +of @code{-1} indicates an error. In addition to the usual file name +syntax errors (@pxref{File Name Errors}), the following @code{errno} +error conditions are defined for this function: + +@table @code +@item EEXIST +There is already an existing file named @var{newname}. + +@item EROFS +The file @var{newname} would exist on a read-only file system. + +@item ENOSPC +The directory or file system cannot be extended to make the new link. + +@item EIO +A hardware error occurred while reading or writing data on the disk. + +@ignore +@comment not sure about these +@item ELOOP +There are too many levels of indirection. This can be the result of +circular symbolic links to directories. + +@item EDQUOT +The new link can't be created because the user's disk quota has been +exceeded. +@end ignore +@end table +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int readlink (const char *@var{filename}, char *@var{buffer}, size_t @var{size}) +The @code{readlink} function gets the value of the symbolic link +@var{filename}. The file name that the link points to is copied into +@var{buffer}. This file name string is @emph{not} null-terminated; +@code{readlink} normally returns the number of characters copied. The +@var{size} argument specifies the maximum number of characters to copy, +usually the allocation size of @var{buffer}. + +If the return value equals @var{size}, you cannot tell whether or not +there was room to return the entire name. So make a bigger buffer and +call @code{readlink} again. 
Here is an example: + +@smallexample +char * +readlink_malloc (const char *filename) +@{ + int size = 100; + + while (1) + @{ + char *buffer = (char *) xmalloc (size); + int nchars = readlink (filename, buffer, size); + if (nchars < 0) + @{ + /* readlink failed. */ + free (buffer); + return NULL; + @} + if (nchars < size) + @{ + /* The whole name fit; terminate it for the caller. */ + buffer[nchars] = '\0'; + return buffer; + @} + /* The name may have been truncated; retry with more room. */ + free (buffer); + size *= 2; + @} +@} +@end smallexample + +@c @group Invalid outside example. +A value of @code{-1} is returned in case of error.  In addition to the +usual file name errors (@pxref{File Name Errors}), the following +@code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The named file is not a symbolic link. + +@item EIO +A hardware error occurred while reading or writing data on the disk. +@end table +@c @end group +@end deftypefun + +@node Deleting Files +@section Deleting Files +@cindex deleting a file +@cindex removing a file +@cindex unlinking a file + +You can delete a file with the functions @code{unlink} or @code{remove}. + +Deletion actually deletes a file name.  If this is the file's only name, +then the file is deleted as well.  If the file has other names as well +(@pxref{Hard Links}), it remains accessible under its other names. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int unlink (const char *@var{filename}) +The @code{unlink} function deletes the file name @var{filename}.  If +this is a file's sole name, the file itself is also deleted.  (Actually, +if any process has the file open when this happens, deletion is +postponed until all processes have closed the file.) + +@pindex unistd.h +The function @code{unlink} is declared in the header file @file{unistd.h}. + +This function returns @code{0} on successful completion, and @code{-1} +on error.
In addition to the usual file name errors +(@pxref{File Name Errors}), the following @code{errno} error conditions are +defined for this function: + +@table @code +@item EACCES +Write permission is denied for the directory from which the file is to be +removed, or the directory has the sticky bit set and you do not own the file. + +@item EBUSY +This error indicates that the file is being used by the system in such a +way that it can't be unlinked. For example, you might see this error if +the file name specifies the root directory or a mount point for a file +system. + +@item ENOENT +The file name to be deleted doesn't exist. + +@item EPERM +On some systems, @code{unlink} cannot be used to delete the name of a +directory, or can only be used this way by a privileged user. +To avoid such problems, use @code{rmdir} to delete directories. +(In the GNU system @code{unlink} can never delete the name of a directory.) + +@item EROFS +The directory in which the file name is to be deleted is on a read-only +file system, and can't be modified. +@end table +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int rmdir (const char *@var{filename}) +@cindex directories, deleting +@cindex deleting a directory +The @code{rmdir} function deletes a directory. The directory must be +empty before it can be removed; in other words, it can only contain +entries for @file{.} and @file{..}. + +In most other respects, @code{rmdir} behaves like @code{unlink}. There +are two additional @code{errno} error conditions defined for +@code{rmdir}: + +@table @code +@item ENOTEMPTY +@itemx EEXIST +The directory to be deleted is not empty. +@end table + +These two error codes are synonymous; some systems use one, and some use +the other. The GNU system always uses @code{ENOTEMPTY}. + +The prototype for this function is declared in the header file +@file{unistd.h}. 
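Because the two error codes are synonymous, portable code usually tests for both.  The following is a minimal sketch under that assumption; the helper names @code{is_not_empty_error} and @code{rmdir_nonempty_errno} are hypothetical and exist only for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return nonzero if ERR is either code meaning "directory not empty".  */
int
is_not_empty_error (int err)
{
  return err == ENOTEMPTY || err == EEXIST;
}

/* Hypothetical helper: create directory DIRNAME holding one file,
   try to remove it with rmdir, then clean up.  Returns the errno
   value from the refused rmdir, 0 if rmdir unexpectedly succeeded,
   or -1 if the setup failed.  */
int
rmdir_nonempty_errno (const char *dirname)
{
  char path[4096];
  int fd, err;

  assert (dirname != NULL);

  if (snprintf (path, sizeof path, "%s/file", dirname) >= (int) sizeof path)
    return -1;
  if (mkdir (dirname, 0700) != 0)
    return -1;

  fd = open (path, O_WRONLY | O_CREAT | O_EXCL, 0600);
  if (fd < 0)
    {
      rmdir (dirname);
      return -1;
    }
  close (fd);

  /* The directory is not empty, so this call should fail.  */
  err = rmdir (dirname) == 0 ? 0 : errno;

  /* Empty the directory; now rmdir can succeed.  */
  unlink (path);
  rmdir (dirname);
  return err;
}
```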
+@pindex unistd.h +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int remove (const char *@var{filename}) +This is the ANSI C function to remove a file.  It works like +@code{unlink} for files and like @code{rmdir} for directories. +@code{remove} is declared in @file{stdio.h}. +@pindex stdio.h +@end deftypefun + +@node Renaming Files +@section Renaming Files + +The @code{rename} function is used to change a file's name. + +@cindex renaming a file +@comment stdio.h +@comment ANSI +@deftypefun int rename (const char *@var{oldname}, const char *@var{newname}) +The @code{rename} function renames the file name @var{oldname} to +@var{newname}.  The file formerly accessible under the name +@var{oldname} is afterward accessible as @var{newname} instead.  (If the +file had any other names aside from @var{oldname}, it continues to have +those names.) + +The directory containing the name @var{newname} must be on the same +file system as the file (as indicated by the name @var{oldname}). + +One special case for @code{rename} is when @var{oldname} and +@var{newname} are two names for the same file.  The consistent way to +handle this case is to delete @var{oldname}.  However, POSIX requires +that in this case @code{rename} do nothing and report success---which is +inconsistent.  We don't know what your operating system will do. + +If the @var{oldname} is not a directory, then any existing file named +@var{newname} is removed during the renaming operation.  However, if +@var{newname} is the name of a directory, @code{rename} fails in this +case. + +If the @var{oldname} is a directory, then either @var{newname} must not +exist or it must name a directory that is empty.  In the latter case, +the existing directory named @var{newname} is deleted first.  The name +@var{newname} must not specify a subdirectory of the directory +@var{oldname} which is being renamed.
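The non-directory case can be sketched as follows.  The helper names @code{create_empty} and @code{rename_replaces_file} are hypothetical; the sketch assumes both names are in directories on the same file system, and it identifies files by the device and serial numbers that @code{stat} reports.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create an empty file FILENAME, failing if it already exists.  */
static int
create_empty (const char *filename)
{
  int fd = open (filename, O_WRONLY | O_CREAT | O_EXCL, 0600);

  if (fd < 0)
    return -1;
  return close (fd);
}

/* Hypothetical helper: create two files, rename OLDNAME to NEWNAME,
   and check that NEWNAME now names the file formerly called OLDNAME
   and that OLDNAME is gone.  Returns 1 if so, 0 if not, and -1 on
   error.  */
int
rename_replaces_file (const char *oldname, const char *newname)
{
  struct stat before, after, gone;
  int result;

  assert (oldname != NULL && newname != NULL);

  if (create_empty (oldname) != 0)
    return -1;
  if (create_empty (newname) != 0)
    {
      unlink (oldname);
      return -1;
    }

  if (stat (oldname, &before) != 0 || rename (oldname, newname) != 0)
    {
      unlink (oldname);
      unlink (newname);
      return -1;
    }

  /* The existing NEWNAME was removed; the name now means the old file.  */
  result = (stat (newname, &after) == 0
            && after.st_dev == before.st_dev
            && after.st_ino == before.st_ino
            && stat (oldname, &gone) == -1
            && errno == ENOENT);

  unlink (newname);
  return result;
}
```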
+ +One useful feature of @code{rename} is that the meaning of the name +@var{newname} changes ``atomically'' from any previously existing file +by that name to its new meaning (the file that was called +@var{oldname}). There is no instant at which @var{newname} is +nonexistent ``in between'' the old meaning and the new meaning. If +there is a system crash during the operation, it is possible for both +names to still exist; but @var{newname} will always be intact if it +exists at all. + +If @code{rename} fails, it returns @code{-1}. In addition to the usual +file name errors (@pxref{File Name Errors}), the following +@code{errno} error conditions are defined for this function: + +@table @code +@item EACCES +One of the directories containing @var{newname} or @var{oldname} +refuses write permission; or @var{newname} and @var{oldname} are +directories and write permission is refused for one of them. + +@item EBUSY +A directory named by @var{oldname} or @var{newname} is being used by +the system in a way that prevents the renaming from working. This includes +directories that are mount points for filesystems, and directories +that are the current working directories of processes. + +@item ENOTEMPTY +@itemx EEXIST +The directory @var{newname} isn't empty. The GNU system always returns +@code{ENOTEMPTY} for this, but some other systems return @code{EEXIST}. + +@item EINVAL +The @var{oldname} is a directory that contains @var{newname}. + +@item EISDIR +The @var{newname} names a directory, but the @var{oldname} doesn't. + +@item EMLINK +The parent directory of @var{newname} would have too many links. + +@item ENOENT +The file named by @var{oldname} doesn't exist. + +@item ENOSPC +The directory that would contain @var{newname} has no room for another +entry, and there is no space left in the file system to expand it. + +@item EROFS +The operation would involve writing to a directory on a read-only file +system. 
+ +@item EXDEV +The two file names @var{newname} and @var{oldname} are on different +file systems. +@end table +@end deftypefun + +@node Creating Directories +@section Creating Directories +@cindex creating a directory +@cindex directories, creating + +@pindex mkdir +Directories are created with the @code{mkdir} function.  (There is also +a shell command @code{mkdir} which does the same thing.) +@c !!! umask + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun int mkdir (const char *@var{filename}, mode_t @var{mode}) +The @code{mkdir} function creates a new, empty directory whose name is +@var{filename}. + +The argument @var{mode} specifies the file permissions for the new +directory file.  @xref{Permission Bits}, for more information about +this. + +A return value of @code{0} indicates successful completion, and +@code{-1} indicates failure.  In addition to the usual file name syntax +errors (@pxref{File Name Errors}), the following @code{errno} error +conditions are defined for this function: + +@table @code +@item EACCES +Write permission is denied for the parent directory in which the new +directory is to be added. + +@item EEXIST +A file named @var{filename} already exists. + +@item EMLINK +The parent directory has too many links. + +Well-designed file systems never report this error, because they permit +more links than your disk could possibly hold.  However, you must still +take account of the possibility of this error, as it could result from +network access to a file system on another machine. + +@item ENOSPC +The file system doesn't have enough room to create the new directory. + +@item EROFS +The parent directory of the directory being created is on a read-only +file system, and cannot be modified. +@end table + +To use this function, your program should include the header file +@file{sys/stat.h}.
+@pindex sys/stat.h +@end deftypefun + +@node File Attributes +@section File Attributes + +@pindex ls +When you issue an @samp{ls -l} shell command on a file, it gives you +information about the size of the file, who owns it, when it was last +modified, and the like. This kind of information is called the +@dfn{file attributes}; it is associated with the file itself and not a +particular one of its names. + +This section contains information about how you can inquire about and +modify these attributes of files. + +@menu +* Attribute Meanings:: The names of the file attributes, + and what their values mean. +* Reading Attributes:: How to read the attributes of a file. +* Testing File Type:: Distinguishing ordinary files, + directories, links... +* File Owner:: How ownership for new files is determined, + and how to change it. +* Permission Bits:: How information about a file's access + mode is stored. +* Access Permission:: How the system decides who can access a file. +* Setting Permissions:: How permissions for new files are assigned, + and how to change them. +* Testing File Access:: How to find out if your process can + access a file. +* File Times:: About the time attributes of a file. +@end menu + +@node Attribute Meanings +@subsection What the File Attribute Values Mean +@cindex status of a file +@cindex attributes of a file +@cindex file attributes + +When you read the attributes of a file, they come back in a structure +called @code{struct stat}. This section describes the names of the +attributes, their data types, and what they mean. For the functions +to read the attributes of a file, see @ref{Reading Attributes}. + +The header file @file{sys/stat.h} declares all the symbols defined +in this section. +@pindex sys/stat.h + +@comment sys/stat.h +@comment POSIX.1 +@deftp {Data Type} {struct stat} +The @code{stat} structure type is used to return information about the +attributes of a file. 
It contains at least the following members: + +@table @code +@item mode_t st_mode +Specifies the mode of the file. This includes file type information +(@pxref{Testing File Type}) and the file permission bits +(@pxref{Permission Bits}). + +@item ino_t st_ino +The file serial number, which distinguishes this file from all other +files on the same device. + +@item dev_t st_dev +Identifies the device containing the file. The @code{st_ino} and +@code{st_dev}, taken together, uniquely identify the file. The +@code{st_dev} value is not necessarily consistent across reboots or +system crashes, however. + +@item nlink_t st_nlink +The number of hard links to the file. This count keeps track of how +many directories have entries for this file. If the count is ever +decremented to zero, then the file itself is discarded as soon as no +process still holds it open. Symbolic links are not counted in the +total. + +@item uid_t st_uid +The user ID of the file's owner. @xref{File Owner}. + +@item gid_t st_gid +The group ID of the file. @xref{File Owner}. + +@item off_t st_size +This specifies the size of a regular file in bytes. For files that +are really devices and the like, this field isn't usually meaningful. +For symbolic links, this specifies the length of the file name the link +refers to. + +@item time_t st_atime +This is the last access time for the file. @xref{File Times}. + +@item unsigned long int st_atime_usec +This is the fractional part of the last access time for the file. +@xref{File Times}. + +@item time_t st_mtime +This is the time of the last modification to the contents of the file. +@xref{File Times}. + +@item unsigned long int st_mtime_usec +This is the fractional part of the time of last modification to the +contents of the file. @xref{File Times}. + +@item time_t st_ctime +This is the time of the last modification to the attributes of the file. +@xref{File Times}. 
+ +@item unsigned long int st_ctime_usec +This is the fractional part of the time of last modification to the +attributes of the file.  @xref{File Times}. + +@c !!! st_rdev +@item unsigned int st_blocks +This is the amount of disk space that the file occupies, measured in +units of 512-byte blocks. + +The number of disk blocks is not strictly proportional to the size of +the file, for two reasons: the file system may use some blocks for +internal record keeping; and the file may be sparse---it may have +``holes'' which contain zeros but do not actually take up space on the +disk. + +You can tell (approximately) whether a file is sparse by comparing this +value with @code{st_size}, like this: + +@smallexample +(st.st_blocks * 512 < st.st_size) +@end smallexample + +This test is not perfect because a file that is just slightly sparse +might not be detected as sparse at all.  For practical applications, +this is not a problem. + +@item unsigned int st_blksize +The optimal block size for reading or writing this file, in bytes.  You +might use this size for allocating the buffer space for reading or +writing the file.  (This is unrelated to @code{st_blocks}.) +@end table +@end deftp + + Some of the file attributes have special data type names which exist +specifically for those attributes.  (They are all aliases for well-known +integer types that you know and love.)  These typedef names are defined +in the header file @file{sys/types.h} as well as in @file{sys/stat.h}. +Here is a list of them. + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} mode_t +This is an integer data type used to represent file modes.  In the +GNU system, this is equivalent to @code{unsigned int}. +@end deftp + +@cindex inode number +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} ino_t +This is an arithmetic data type used to represent file serial numbers. +(In Unix jargon, these are sometimes called @dfn{inode numbers}.)
+In the GNU system, this type is equivalent to @code{unsigned long int}. +@end deftp + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} dev_t +This is an arithmetic data type used to represent file device numbers. +In the GNU system, this is equivalent to @code{int}. +@end deftp + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} nlink_t +This is an arithmetic data type used to represent file link counts. +In the GNU system, this is equivalent to @code{unsigned short int}. +@end deftp + +@node Reading Attributes +@subsection Reading the Attributes of a File + +To examine the attributes of files, use the functions @code{stat}, +@code{fstat} and @code{lstat}.  They return the attribute information in +a @code{struct stat} object.  All three functions are declared in the +header file @file{sys/stat.h}. + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun int stat (const char *@var{filename}, struct stat *@var{buf}) +The @code{stat} function returns information about the attributes of the +file named by @w{@var{filename}} in the structure pointed at by @var{buf}. + +If @var{filename} is the name of a symbolic link, the attributes you get +describe the file that the link points to.  If the link points to a +nonexistent file name, then @code{stat} fails, reporting a nonexistent +file. + +The return value is @code{0} if the operation is successful, and @code{-1} +on failure.  In addition to the usual file name errors +(@pxref{File Name Errors}), the following @code{errno} error conditions +are defined for this function: + +@table @code +@item ENOENT +The file named by @var{filename} doesn't exist. +@end table +@end deftypefun + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun int fstat (int @var{filedes}, struct stat *@var{buf}) +The @code{fstat} function is like @code{stat}, except that it takes an +open file descriptor as an argument instead of a file name. +@xref{Low-Level I/O}.
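As a minimal sketch of the relationship between the two functions, the hypothetical helper @code{stat_matches_fstat} below opens a file and checks that @code{stat} on the name and @code{fstat} on the descriptor describe the same file.  It assumes the name is not replaced between the two calls.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if stat on FILENAME and fstat on a
   descriptor opened from FILENAME report the same device and file
   serial number, 0 if they differ, and -1 on any error.  */
int
stat_matches_fstat (const char *filename)
{
  struct stat by_name, by_fd;
  int fd, result;

  assert (filename != NULL);

  fd = open (filename, O_RDONLY);
  if (fd < 0)
    return -1;

  if (stat (filename, &by_name) != 0 || fstat (fd, &by_fd) != 0)
    result = -1;
  else
    result = (by_name.st_dev == by_fd.st_dev
              && by_name.st_ino == by_fd.st_ino);

  close (fd);
  return result;
}
```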
+ +Like @code{stat}, @code{fstat} returns @code{0} on success and @code{-1} +on failure.  The following @code{errno} error conditions are defined for +@code{fstat}: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. +@end table +@end deftypefun + +@comment sys/stat.h +@comment BSD +@deftypefun int lstat (const char *@var{filename}, struct stat *@var{buf}) +The @code{lstat} function is like @code{stat}, except that it does not +follow symbolic links.  If @var{filename} is the name of a symbolic +link, @code{lstat} returns information about the link itself; otherwise, +@code{lstat} works like @code{stat}.  @xref{Symbolic Links}. +@end deftypefun + +@node Testing File Type +@subsection Testing the Type of a File + +The @dfn{file mode}, stored in the @code{st_mode} field of the file +attributes, contains two kinds of information: the file type code, and +the access permission bits.  This section discusses only the type code, +which you can use to tell whether the file is a directory, whether it is +a socket, and so on.  For information about the access permission bits, +see @ref{Permission Bits}. + +There are two predefined ways you can access the file type portion of +the file mode.  First of all, for each type of file, there is a +@dfn{predicate macro} which examines a file mode value and returns +true or false---is the file of that type, or not.  Secondly, you can +mask out the rest of the file mode to get just a file type code. +You can compare this against various constants for the supported file +types. + +All of the symbols listed in this section are defined in the header file +@file{sys/stat.h}. +@pindex sys/stat.h + +The following predicate macros test the type of a file, given the value +@var{m} which is the @code{st_mode} field returned by @code{stat} on +that file: + +@comment sys/stat.h +@comment POSIX +@deftypefn Macro int S_ISDIR (mode_t @var{m}) +This macro returns nonzero if the file is a directory.
+@end deftypefn + +@comment sys/stat.h +@comment POSIX +@deftypefn Macro int S_ISCHR (mode_t @var{m}) +This macro returns nonzero if the file is a character special file (a +device like a terminal). +@end deftypefn + +@comment sys/stat.h +@comment POSIX +@deftypefn Macro int S_ISBLK (mode_t @var{m}) +This macro returns nonzero if the file is a block special file (a device +like a disk). +@end deftypefn + +@comment sys/stat.h +@comment POSIX +@deftypefn Macro int S_ISREG (mode_t @var{m}) +This macro returns nonzero if the file is a regular file. +@end deftypefn + +@comment sys/stat.h +@comment POSIX +@deftypefn Macro int S_ISFIFO (mode_t @var{m}) +This macro returns nonzero if the file is a FIFO special file, or a +pipe.  @xref{Pipes and FIFOs}. +@end deftypefn + +@comment sys/stat.h +@comment GNU +@deftypefn Macro int S_ISLNK (mode_t @var{m}) +This macro returns nonzero if the file is a symbolic link. +@xref{Symbolic Links}. +@end deftypefn + +@comment sys/stat.h +@comment GNU +@deftypefn Macro int S_ISSOCK (mode_t @var{m}) +This macro returns nonzero if the file is a socket.  @xref{Sockets}. +@end deftypefn + +An alternate non-POSIX method of testing the file type is supported for +compatibility with BSD.  The mode can be bitwise ANDed with +@code{S_IFMT} to extract the file type code, and compared to the +appropriate type code constant.  For example, + +@smallexample +S_ISCHR (@var{mode}) +@end smallexample + +@noindent +is equivalent to: + +@smallexample +((@var{mode} & S_IFMT) == S_IFCHR) +@end smallexample + +@comment sys/stat.h +@comment BSD +@deftypevr Macro int S_IFMT +This is a bit mask used to extract the file type code portion of a mode +value. +@end deftypevr + +These are the symbolic names for the different file type codes: + +@table @code +@comment sys/stat.h +@comment BSD +@item S_IFDIR +@vindex S_IFDIR +This macro represents the value of the file type code for a directory file.
+ +@comment sys/stat.h +@comment BSD +@item S_IFCHR +@vindex S_IFCHR +This macro represents the value of the file type code for a +character-oriented device file. + +@comment sys/stat.h +@comment BSD +@item S_IFBLK +@vindex S_IFBLK +This macro represents the value of the file type code for a block-oriented +device file. + +@comment sys/stat.h +@comment BSD +@item S_IFREG +@vindex S_IFREG +This macro represents the value of the file type code for a regular file. + +@comment sys/stat.h +@comment BSD +@item S_IFLNK +@vindex S_IFLNK +This macro represents the value of the file type code for a symbolic link. + +@comment sys/stat.h +@comment BSD +@item S_IFSOCK +@vindex S_IFSOCK +This macro represents the value of the file type code for a socket. + +@comment sys/stat.h +@comment BSD +@item S_IFIFO +@vindex S_IFIFO +This macro represents the value of the file type code for a FIFO or pipe. +@end table + +@node File Owner +@subsection File Owner +@cindex file owner +@cindex owner of a file +@cindex group owner of a file + +Every file has an @dfn{owner} which is one of the registered user names +defined on the system. Each file also has a @dfn{group}, which is one +of the defined groups. The file owner can often be useful for showing +you who edited the file (especially when you edit with GNU Emacs), but +its main purpose is for access control. + +The file owner and group play a role in determining access because the +file has one set of access permission bits for the user that is the +owner, another set that apply to users who belong to the file's group, +and a third set of bits that apply to everyone else. @xref{Access +Permission}, for the details of how access is decided based on this +data. + +When a file is created, its owner is set from the effective user ID of +the process that creates it (@pxref{Process Persona}). 
The file's group +ID may be set from either the effective group ID of the process, or the +group ID of the directory that contains the file, depending on the +system where the file is stored. When you access a remote file system, +it behaves according to its own rules, not according to the system your +program is running on. Thus, your program must be prepared to encounter +either kind of behavior, no matter what kind of system you run it on. + +@pindex chown +@pindex chgrp +You can change the owner and/or group owner of an existing file using +the @code{chown} function. This is the primitive for the @code{chown} +and @code{chgrp} shell commands. + +@pindex unistd.h +The prototype for this function is declared in @file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int chown (const char *@var{filename}, uid_t @var{owner}, gid_t @var{group}) +The @code{chown} function changes the owner of the file @var{filename} to +@var{owner}, and its group owner to @var{group}. + +Changing the owner of the file on certain systems clears the set-user-ID +and set-group-ID bits of the file's permissions. (This is because those +bits may not be appropriate for the new owner.) The other file +permission bits are not changed. + +The return value is @code{0} on success and @code{-1} on failure. +In addition to the usual file name errors (@pxref{File Name Errors}), +the following @code{errno} error conditions are defined for this function: + +@table @code +@item EPERM +This process lacks permission to make the requested change. + +Only privileged users or the file's owner can change the file's group. +On most file systems, only privileged users can change the file owner; +some file systems allow you to change the owner if you are currently the +owner. When you access a remote file system, the behavior you encounter +is determined by the system that actually holds the file, not by the +system your program is running on.
+ +@xref{Options for Files}, for information about the +@code{_POSIX_CHOWN_RESTRICTED} macro. + +@item EROFS +The file is on a read-only file system. +@end table +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int fchown (int @var{filedes}, int @var{owner}, int @var{group}) +This is like @code{chown}, except that it changes the owner of the file +with open file descriptor @var{filedes}. + +The return value from @code{fchown} is @code{0} on success and @code{-1} +on failure. The following @code{errno} error codes are defined for this +function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINVAL +The @var{filedes} argument corresponds to a pipe or socket, not an ordinary +file. + +@item EPERM +This process lacks permission to make the requested change. For +details, see @code{chown}, above. + +@item EROFS +The file resides on a read-only file system. +@end table +@end deftypefun + +@node Permission Bits +@subsection The Mode Bits for Access Permission + +The @dfn{file mode}, stored in the @code{st_mode} field of the file +attributes, contains two kinds of information: the file type code, and +the access permission bits. This section discusses only the access +permission bits, which control who can read or write the file. +@xref{Testing File Type}, for information about the file type code. + +All of the symbols listed in this section are defined in the header file +@file{sys/stat.h}. +@pindex sys/stat.h + +@cindex file permission bits +These symbolic constants are defined for the file mode bits that control +access permission for the file: + +@table @code +@comment sys/stat.h +@comment POSIX.1 +@item S_IRUSR +@vindex S_IRUSR +@comment sys/stat.h +@comment BSD +@itemx S_IREAD +@vindex S_IREAD +Read permission bit for the owner of the file. On many systems, this +bit is 0400. @code{S_IREAD} is an obsolete synonym provided for BSD +compatibility.
+ +@comment sys/stat.h +@comment POSIX.1 +@item S_IWUSR +@vindex S_IWUSR +@comment sys/stat.h +@comment BSD +@itemx S_IWRITE +@vindex S_IWRITE +Write permission bit for the owner of the file. Usually 0200. +@w{@code{S_IWRITE}} is an obsolete synonym provided for BSD compatibility. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IXUSR +@vindex S_IXUSR +@comment sys/stat.h +@comment BSD +@itemx S_IEXEC +@vindex S_IEXEC +Execute (for ordinary files) or search (for directories) permission bit +for the owner of the file. Usually 0100. @code{S_IEXEC} is an obsolete +synonym provided for BSD compatibility. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IRWXU +@vindex S_IRWXU +This is equivalent to @samp{(S_IRUSR | S_IWUSR | S_IXUSR)}. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IRGRP +@vindex S_IRGRP +Read permission bit for the group owner of the file. Usually 040. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IWGRP +@vindex S_IWGRP +Write permission bit for the group owner of the file. Usually 020. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IXGRP +@vindex S_IXGRP +Execute or search permission bit for the group owner of the file. +Usually 010. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IRWXG +@vindex S_IRWXG +This is equivalent to @samp{(S_IRGRP | S_IWGRP | S_IXGRP)}. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IROTH +@vindex S_IROTH +Read permission bit for other users. Usually 04. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IWOTH +@vindex S_IWOTH +Write permission bit for other users. Usually 02. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IXOTH +@vindex S_IXOTH +Execute or search permission bit for other users. Usually 01. + +@comment sys/stat.h +@comment POSIX.1 +@item S_IRWXO +@vindex S_IRWXO +This is equivalent to @samp{(S_IROTH | S_IWOTH | S_IXOTH)}. + +@comment sys/stat.h +@comment POSIX +@item S_ISUID +@vindex S_ISUID +This is the set-user-ID on execute bit, usually 04000. +@xref{How Change Persona}. 
+ +@comment sys/stat.h +@comment POSIX +@item S_ISGID +@vindex S_ISGID +This is the set-group-ID on execute bit, usually 02000. +@xref{How Change Persona}. + +@cindex sticky bit +@comment sys/stat.h +@comment BSD +@item S_ISVTX +@vindex S_ISVTX +This is the @dfn{sticky} bit, usually 01000. + +On a directory, it gives permission to delete a file in the directory +only if you own that file. Ordinarily, a user either can delete all the +files in the directory or cannot delete any of them (based on whether +the user has write permission for the directory). The same restriction +applies---you must both have write permission for the directory and own +the file you want to delete. The one exception is that the owner of the +directory can delete any file in the directory, no matter who owns it +(provided the owner has given himself write permission for the +directory). This is commonly used for the @file{/tmp} directory, where +anyone may create files, but not delete files created by other users. + +Originally the sticky bit on an executable file modified the swapping +policies of the system. Normally, when a program terminated, its pages +in core were immediately freed and reused. If the sticky bit was set on +the executable file, the system kept the pages in core for a while as if +the program were still running. This was advantageous for a program +likely to be run many times in succession. This usage is obsolete in +modern systems. When a program terminates, its pages always remain in +core as long as there is no shortage of memory in the system. When the +program is next run, its pages will still be in core if no shortage +arose since the last run. + +On some modern systems where the sticky bit has no useful meaning for an +executable file, you cannot set the bit at all for a non-directory. +If you try, @code{chmod} fails with @code{EFTYPE}; +@pxref{Setting Permissions}. + +Some systems (particularly SunOS) have yet another use for the sticky +bit. 
If the sticky bit is set on a file that is @emph{not} executable, +it means the opposite: never cache the pages of this file at all. The +main use of this is for the files on an NFS server machine which are +used as the swap area of diskless client machines. The idea is that the +pages of the file will be cached in the client's memory, so it is a +waste of the server's memory to cache them a second time. In this use +the sticky bit also says that the filesystem may fail to record the +file's modification time onto disk reliably (the idea being that no one +cares for a swap file). +@end table + +The actual bit values of the symbols are listed in the table above +so you can decode file mode values when debugging your programs. +These bit values are correct for most systems, but they are not +guaranteed. + +@strong{Warning:} Writing explicit numbers for file permissions is bad +practice. It is not only nonportable, it also requires everyone who +reads your program to remember what the bits mean. To make your +program clean, use the symbolic names. + +@node Access Permission +@subsection How Your Access to a File is Decided +@cindex permission to access a file +@cindex access permission for a file +@cindex file access permission + +Recall that the operating system normally decides access permission for +a file based on the effective user and group IDs of the process, and its +supplementary group IDs, together with the file's owner, group and +permission bits. These concepts are discussed in detail in +@ref{Process Persona}. + +If the effective user ID of the process matches the owner user ID of the +file, then permissions for read, write, and execute/search are +controlled by the corresponding ``user'' (or ``owner'') bits. Likewise, +if any of the effective group ID or supplementary group IDs of the +process matches the group owner ID of the file, then permissions are +controlled by the ``group'' bits. Otherwise, permissions are controlled +by the ``other'' bits.
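The selection rule above can be sketched in C as a small pure function. This is only an illustration of which set of bits applies, under the assumption that no privileged user and no system-specific access control feature is involved. The names @code{triple} and @code{applicable_bits}, and their parameters, are hypothetical and not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Translate one class of permission bits into a small value:
   4 = read, 2 = write, 1 = execute/search.  */
static unsigned int
triple (mode_t mode, mode_t r, mode_t w, mode_t x)
{
  return ((mode & r) ? 4u : 0u)
         | ((mode & w) ? 2u : 0u)
         | ((mode & x) ? 1u : 0u);
}

/* Hypothetical helper: pick the bits that govern a process with
   effective user ID EUID, effective group ID EGID and NGROUPS
   supplementary group IDs in GROUPS.  Privileged users and any
   system-specific access control features are deliberately ignored.  */
unsigned int
applicable_bits (mode_t mode, uid_t file_uid, gid_t file_gid,
                 uid_t euid, gid_t egid,
                 const gid_t *groups, size_t ngroups)
{
  size_t i;

  assert (groups != NULL || ngroups == 0);

  /* The owner is governed by the "user" bits only.  */
  if (euid == file_uid)
    return triple (mode, S_IRUSR, S_IWUSR, S_IXUSR);

  /* Otherwise a group match selects the "group" bits.  */
  if (egid == file_gid)
    return triple (mode, S_IRGRP, S_IWGRP, S_IXGRP);
  for (i = 0; i < ngroups; i++)
    if (groups[i] == file_gid)
      return triple (mode, S_IRGRP, S_IWGRP, S_IXGRP);

  /* Everyone else gets the "other" bits.  */
  return triple (mode, S_IROTH, S_IWOTH, S_IXOTH);
}
```

Note that the first matching class decides: an owner whose ``user'' bits deny access is not rescued by more generous ``group'' or ``other'' bits.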
+ +Privileged users, like @samp{root}, can access any file, regardless of +its file permission bits. As a special case, for a file to be +executable even for a privileged user, at least one of its execute bits +must be set. + +@node Setting Permissions +@subsection Assigning File Permissions + +@cindex file creation mask +@cindex umask +The primitive functions for creating files (for example, @code{open} or +@code{mkdir}) take a @var{mode} argument, which specifies the file +permissions for the newly created file. But the specified mode is +modified by the process's @dfn{file creation mask}, or @dfn{umask}, +before it is used. + +The bits that are set in the file creation mask identify permissions +that are always to be disabled for newly created files. For example, if +you set all the ``other'' access bits in the mask, then newly created +files are not accessible at all to processes in the ``other'' +category, even if the @var{mode} argument specified to the creation +function would permit such access. In other words, the file creation +mask is the complement of the ordinary access permissions you want to +grant. + +Programs that create files typically specify a @var{mode} argument that +includes all the permissions that make sense for the particular file. +For an ordinary file, this is typically read and write permission for +all classes of users. These permissions are then restricted as +specified by the individual user's own file creation mask. + +@findex chmod +To change the permission of an existing file given its name, call +@code{chmod}. This function ignores the file creation mask; it uses +exactly the specified permission bits. + +@pindex umask +In normal use, the file creation mask is initialized in the user's login +shell (using the @code{umask} shell command), and inherited by all +subprocesses. Application programs normally don't need to worry about +the file creation mask. It will do automatically what it is supposed to +do. 
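As a rough illustration of the arithmetic described above, the permissions a new file receives are the requested mode with the mask bits cleared. The helper name @code{masked_mode} is hypothetical; the creation functions apply the mask themselves, so a program never needs to do this computation.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

/* All permission bits covered by the symbolic names.  */
#define ALL_PERMS (S_IRWXU | S_IRWXG | S_IRWXO)

/* Hypothetical helper: the permission bits a newly created file ends up
   with when REQUESTED is passed to the creation function and MASK is the
   file creation mask.  */
mode_t
masked_mode (mode_t requested, mode_t mask)
{
  /* This sketch only models the read/write/execute permission bits.  */
  assert ((requested & ~(mode_t) ALL_PERMS) == 0);

  return requested & ~mask;
}
```

For example, requesting read and write permission for all classes with a mask of @code{S_IWGRP | S_IWOTH} leaves write permission for the owner only.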
+ +When your program should create a file and bypass the umask for its +access permissions, the easiest way to do this is to use @code{fchmod} +after opening the file, rather than changing the umask. + +In fact, changing the umask is usually done only by shells. They use +the @code{umask} function. + +The functions in this section are declared in @file{sys/stat.h}. +@pindex sys/stat.h + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun mode_t umask (mode_t @var{mask}) +The @code{umask} function sets the file creation mask of the current +process to @var{mask}, and returns the previous value of the file +creation mask. + +Here is an example showing how to read the mask with @code{umask} +without changing it permanently: + +@smallexample +mode_t +read_umask (void) +@{ + mode_t mask = umask (0); + umask (mask); + return mask; +@} +@end smallexample + +@noindent +However, it is better to use @code{getumask} if you just want to read +the mask value, because that is reentrant (at least if you use the GNU +operating system). +@end deftypefun + +@comment sys/stat.h +@comment GNU +@deftypefun mode_t getumask (void) +Return the current value of the file creation mask for the current +process. This function is a GNU extension. +@end deftypefun + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun int chmod (const char *@var{filename}, mode_t @var{mode}) +The @code{chmod} function sets the access permission bits for the file +named by @var{filename} to @var{mode}. + +If the @var{filename} names a symbolic link, @code{chmod} changes the +permission of the file pointed to by the link, not those of the link +itself. + +This function returns @code{0} if successful and @code{-1} if not. In +addition to the usual file name errors (@pxref{File Name +Errors}), the following @code{errno} error conditions are defined for +this function: + +@table @code +@item ENOENT +The named file doesn't exist. + +@item EPERM +This process does not have permission to change the access permission of +this file.
Only the file's owner (as judged by the effective user ID of +the process) or a privileged user can change them. + +@item EROFS +The file resides on a read-only file system. + +@item EFTYPE +@var{mode} has the @code{S_ISVTX} bit (the ``sticky bit'') set, +and the named file is not a directory. Some systems do not allow setting the +sticky bit on non-directory files, and some do (and only some of those +assign a useful meaning to the bit for non-directory files). + +You only get @code{EFTYPE} on systems where the sticky bit has no useful +meaning for non-directory files, so it is always safe to just clear the +bit in @var{mode} and call @code{chmod} again. @xref{Permission Bits}, +for full details on the sticky bit. +@end table +@end deftypefun + +@comment sys/stat.h +@comment BSD +@deftypefun int fchmod (int @var{filedes}, int @var{mode}) +This is like @code{chmod}, except that it changes the permissions of +the file currently open via descriptor @var{filedes}. + +The return value from @code{fchmod} is @code{0} on success and @code{-1} +on failure. The following @code{errno} error codes are defined for this +function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINVAL +The @var{filedes} argument corresponds to a pipe or socket, or something +else that doesn't really have access permissions. + +@item EPERM +This process does not have permission to change the access permission of +this file. Only the file's owner (as judged by the effective user ID of +the process) or a privileged user can change them. + +@item EROFS +The file resides on a read-only file system. 
+@end table +@end deftypefun + +@node Testing File Access +@subsection Testing Permission to Access a File +@cindex testing access permission +@cindex access, testing for +@cindex setuid programs and file access + +When a program runs as a privileged user, this permits it to access +files off-limits to ordinary users---for example, to modify +@file{/etc/passwd}. Programs designed to be run by ordinary users but +access such files use the setuid bit feature so that they always run +with @code{root} as the effective user ID. + +Such a program may also access files specified by the user, files which +conceptually are being accessed explicitly by the user. Since the +program runs as @code{root}, it has permission to access whatever file +the user specifies---but usually the desired behavior is to permit only +those files which the user could ordinarily access. + +The program therefore must explicitly check whether @emph{the user} +would have the necessary access to a file, before it reads or writes the +file. + +To do this, use the function @code{access}, which checks for access +permission based on the process's @emph{real} user ID rather than the +effective user ID. (The setuid feature does not alter the real user ID, +so it reflects the user who actually ran the program.) + +There is another way you could check this access, which is easy to +describe, but very hard to use. This is to examine the file mode bits +and mimic the system's own access computation. This method is +undesirable because many systems have additional access control +features; your program cannot portably mimic them, and you would not +want to try to keep track of the diverse features that different systems +have. Using @code{access} is simple and automatically does whatever is +appropriate for the system you are using. + +@code{access} is @emph{only} appropriate to use in setuid programs. +A non-setuid program will always use the effective ID rather than the +real ID.
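A minimal sketch of such a check, assuming a setuid program that wants to read a file named by the user. The function name @code{open_if_user_may_read} is hypothetical. The file could change between the check and the open, so treat this as an illustration of the call rather than a complete defense.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open FILENAME for reading only if the user who
   actually ran the program (the real user ID) has read permission.
   Otherwise return a null pointer, with errno set by access or fopen.  */
FILE *
open_if_user_may_read (const char *filename)
{
  assert (filename != NULL);

  /* access returns 0 when the requested access is permitted.  */
  if (access (filename, R_OK) != 0)
    return NULL;

  return fopen (filename, "r");
}
```

A caller would test the result against a null pointer and report the failure using @code{errno}, just as it would after a failed @code{fopen}.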
+ +@pindex unistd.h +The symbols in this section are declared in @file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int access (const char *@var{filename}, int @var{how}) +The @code{access} function checks to see whether the file named by +@var{filename} can be accessed in the way specified by the @var{how} +argument. The @var{how} argument either can be the bitwise OR of the +flags @code{R_OK}, @code{W_OK}, @code{X_OK}, or the existence test +@code{F_OK}. + +This function uses the @emph{real} user and group ID's of the calling +process, rather than the @emph{effective} ID's, to check for access +permission. As a result, if you use the function from a @code{setuid} +or @code{setgid} program (@pxref{How Change Persona}), it gives +information relative to the user who actually ran the program. + +The return value is @code{0} if the access is permitted, and @code{-1} +otherwise. (In other words, treated as a predicate function, +@code{access} returns true if the requested access is @emph{denied}.) + +In addition to the usual file name errors (@pxref{File Name +Errors}), the following @code{errno} error conditions are defined for +this function: + +@table @code +@item EACCES +The access specified by @var{how} is denied. + +@item ENOENT +The file doesn't exist. + +@item EROFS +Write permission was requested for a file on a read-only file system. +@end table +@end deftypefun + +These macros are defined in the header file @file{unistd.h} for use +as the @var{how} argument to the @code{access} function. The values +are integer constants. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int R_OK +Argument that means, test for read permission. +@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int W_OK +Argument that means, test for write permission. +@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int X_OK +Argument that means, test for execute/search permission. 
+@end deftypevr + +@comment unistd.h +@comment POSIX.1 +@deftypevr Macro int F_OK +Argument that means, test for existence of the file. +@end deftypevr + +@node File Times +@subsection File Times + +@cindex file access time +@cindex file modification time +@cindex file attribute modification time +Each file has three timestamps associated with it: its access time, +its modification time, and its attribute modification time. These +correspond to the @code{st_atime}, @code{st_mtime}, and @code{st_ctime} +members of the @code{stat} structure; see @ref{File Attributes}. + +All of these times are represented in calendar time format, as +@code{time_t} objects. This data type is defined in @file{time.h}. +For more information about representation and manipulation of time +values, see @ref{Calendar Time}. +@pindex time.h + +Reading from a file updates its access time attribute, and writing +updates its modification time. When a file is created, all three +timestamps for that file are set to the current time. In addition, the +attribute change time and modification time fields of the directory that +contains the new entry are updated. + +Adding a new name for a file with the @code{link} function updates the +attribute change time field of the file being linked, and both the +attribute change time and modification time fields of the directory +containing the new name. These same fields are affected if a file name +is deleted with @code{unlink}, @code{remove}, or @code{rmdir}. Renaming +a file with @code{rename} affects only the attribute change time and +modification time fields of the two parent directories involved, and not +the times for the file being renamed. + +Changing attributes of a file (for example, with @code{chmod}) updates +its attribute change time field. + +You can also change some of the timestamps of a file explicitly using +the @code{utime} function---all except the attribute change time. 
You +need to include the header file @file{utime.h} to use this facility. +@pindex utime.h + +@comment utime.h +@comment POSIX.1 +@deftp {Data Type} {struct utimbuf} +The @code{utimbuf} structure is used with the @code{utime} function to +specify new access and modification times for a file. It contains the +following members: + +@table @code +@item time_t actime +This is the access time for the file. + +@item time_t modtime +This is the modification time for the file. +@end table +@end deftp + +@comment utime.h +@comment POSIX.1 +@deftypefun int utime (const char *@var{filename}, const struct utimbuf *@var{times}) +This function is used to modify the file times associated with the file +named @var{filename}. + +If @var{times} is a null pointer, then the access and modification times +of the file are set to the current time. Otherwise, they are set to the +values from the @code{actime} and @code{modtime} members (respectively) +of the @code{utimbuf} structure pointed at by @var{times}. + +The attribute modification time for the file is set to the current time +in either case (since changing the timestamps is itself a modification +of the file attributes). + +The @code{utime} function returns @code{0} if successful and @code{-1} +on failure. In addition to the usual file name errors +(@pxref{File Name Errors}), the following @code{errno} error conditions +are defined for this function: + +@table @code +@item EACCES +There is a permission problem in the case where a null pointer was +passed as the @var{times} argument. In order to update the timestamp on +the file, you must either be the owner of the file, have write +permission on the file, or be a privileged user. + +@item ENOENT +The file doesn't exist. + +@item EPERM +If the @var{times} argument is not a null pointer, you must either be +the owner of the file or be a privileged user. This error is used to +report the problem. + +@item EROFS +The file lives on a read-only file system.
+@end table +@end deftypefun + +Each of the three time stamps has a corresponding microsecond part, +which extends its resolution. These fields are called +@code{st_atime_usec}, @code{st_mtime_usec}, and @code{st_ctime_usec}; +each has a value between 0 and 999,999, which indicates the time in +microseconds. They correspond to the @code{tv_usec} field of a +@code{timeval} structure; see @ref{High-Resolution Calendar}. + +The @code{utimes} function is like @code{utime}, but also lets you specify +the fractional part of the file times. The prototype for this function is +in the header file @file{sys/time.h}. +@pindex sys/time.h + +@comment sys/time.h +@comment BSD +@deftypefun int utimes (const char *@var{filename}, struct timeval @var{tvp}@t{[2]}) +This function sets the file access and modification times for the file +named by @var{filename}. The new file access time is specified by +@code{@var{tvp}[0]}, and the new modification time by +@code{@var{tvp}[1]}. This function comes from BSD. + +The return values and error conditions are the same as for the @code{utime} +function. +@end deftypefun + +@node Making Special Files +@section Making Special Files +@cindex creating special files +@cindex special files + +The @code{mknod} function is the primitive for making special files, +such as files that correspond to devices. The GNU library includes +this function for compatibility with BSD. + +The prototype for @code{mknod} is declared in @file{sys/stat.h}. +@pindex sys/stat.h + +@comment sys/stat.h +@comment BSD +@deftypefun int mknod (const char *@var{filename}, int @var{mode}, int @var{dev}) +The @code{mknod} function makes a special file with name @var{filename}. +The @var{mode} specifies the mode of the file, and may include the various +special file bits, such as @code{S_IFCHR} (for a character special file) +or @code{S_IFBLK} (for a block special file). @xref{Testing File Type}. + +The @var{dev} argument specifies which device the special file refers to. 
+Its exact interpretation depends on the kind of special file being created. + +The return value is @code{0} on success and @code{-1} on error. In addition +to the usual file name errors (@pxref{File Name Errors}), the +following @code{errno} error conditions are defined for this function: + +@table @code +@item EPERM +The calling process is not privileged. Only the superuser can create +special files. + +@item ENOSPC +The directory or file system that would contain the new file is full +and cannot be extended. + +@item EROFS +The directory containing the new file can't be modified because it's on +a read-only file system. + +@item EEXIST +There is already a file named @var{filename}. If you want to replace +this file, you must remove the old file explicitly first. +@end table +@end deftypefun + +@node Temporary Files +@section Temporary Files + +If you need to use a temporary file in your program, you can use the +@code{tmpfile} function to open it. Or you can use the @code{tmpnam} +function to make a name for a temporary file and then open it in the usual +way with @code{fopen}. + +The @code{tempnam} function is like @code{tmpnam} but lets you choose +what directory temporary files will go in, and something about what +their file names will look like. + +These facilities are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun {FILE *} tmpfile (void) +This function creates a temporary binary file for update mode, as if by +calling @code{fopen} with mode @code{"wb+"}. The file is deleted +automatically when it is closed or when the program terminates. (On +some other ANSI C systems the file may fail to be deleted if the program +terminates abnormally.) +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun {char *} tmpnam (char *@var{result}) +This function constructs and returns a file name that is a valid file +name and that does not name any existing file.
If the @var{result} +argument is a null pointer, the return value is a pointer to an internal +static string, which might be modified by subsequent calls. Otherwise, +the @var{result} argument should be a pointer to an array of at least +@code{L_tmpnam} characters, and the result is written into that array. + +It is possible for @code{tmpnam} to fail if you call it too many times. +This is because the fixed length of a temporary file name gives room for +only a finite number of different names. If @code{tmpnam} fails, it +returns a null pointer. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypevr Macro int L_tmpnam +The value of this macro is an integer constant expression that represents +the minimum allocation size of a string large enough to hold the +file name generated by the @code{tmpnam} function. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int TMP_MAX +The macro @code{TMP_MAX} is a lower bound for how many temporary names +you can create with @code{tmpnam}. You can rely on being able to call +@code{tmpnam} at least this many times before it might fail saying you +have made too many temporary file names. + +With the GNU library, you can create a very large number of temporary +file names---if you actually create the files, you will probably run out +of disk space before you run out of names. Some other systems have a +fixed, small limit on the number of temporary files. The limit is never +less than @code{25}. +@end deftypevr + +@comment stdio.h +@comment SVID +@deftypefun {char *} tempnam (const char *@var{dir}, const char *@var{prefix}) +This function generates a unique temporary filename. If @var{prefix} is +not a null pointer, up to five characters of this string are used as a +prefix for the file name. The return value is a string newly allocated +with @code{malloc}; you should release its storage with @code{free} when +it is no longer needed. 
+ +The directory prefix for the temporary file name is determined by testing +each of the following, in sequence. The directory must exist and be +writable. + +@itemize @bullet +@item +The environment variable @code{TMPDIR}, if it is defined. + +@item +The @var{dir} argument, if it is not a null pointer. + +@item +The value of the @code{P_tmpdir} macro. + +@item +The directory @file{/tmp}. +@end itemize + +This function is defined for SVID compatibility. +@end deftypefun +@cindex TMPDIR environment variable + +@comment stdio.h +@comment SVID +@c !!! are we putting SVID/GNU/POSIX.1/BSD in here or not?? +@deftypevr {SVID Macro} {char *} P_tmpdir +This macro is the name of the default directory for temporary files. +@end deftypevr + +Older Unix systems did not have the functions just described. Instead +they used @code{mktemp} and @code{mkstemp}. Both of these functions +work by modifying a file name template string you pass. The last six +characters of this string must be @samp{XXXXXX}. These six @samp{X}s +are replaced with six characters which make the whole string a unique +file name. Usually the template string is something like +@samp{/tmp/@var{prefix}XXXXXX}, and each program uses a unique @var{prefix}. + +@strong{Note:} Because @code{mktemp} and @code{mkstemp} modify the +template string, you @emph{must not} pass string constants to them. +String constants are normally in read-only storage, so your program +would crash when @code{mktemp} or @code{mkstemp} tried to modify the +string. + +@comment unistd.h +@comment Unix +@deftypefun {char *} mktemp (char *@var{template}) +The @code{mktemp} function generates a unique file name by modifying +@var{template} as described above. If successful, it returns +@var{template} as modified. If @code{mktemp} cannot find a unique file +name, it makes @var{template} an empty string and returns that. If +@var{template} does not end with @samp{XXXXXX}, @code{mktemp} returns a +null pointer. 
+@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int mkstemp (char *@var{template}) +The @code{mkstemp} function generates a unique file name just as +@code{mktemp} does, but it also opens the file for you with @code{open} +(@pxref{Opening and Closing Files}). If successful, it modifies +@var{template} in place and returns a file descriptor open on that file +for reading and writing. If @code{mkstemp} cannot create a +uniquely-named file, it makes @var{template} an empty string and returns +@code{-1}. If @var{template} does not end with @samp{XXXXXX}, +@code{mkstemp} returns @code{-1} and does not modify @var{template}. +@end deftypefun + +Unlike @code{mktemp}, @code{mkstemp} is actually guaranteed to create a +unique file that cannot possibly clash with any other program trying to +create a temporary file. This is because it works by calling +@code{open} with the @code{O_EXCL} flag bit, which says you want to +always create a new file, and get an error if the file already exists. diff --git a/manual/header.texi b/manual/header.texi new file mode 100644 index 0000000000..588d77eabf --- /dev/null +++ b/manual/header.texi @@ -0,0 +1,14 @@ +@node Library Summary, Maintenance, Language Features, Top +@appendix Summary of Library Facilities + +This appendix is a complete list of the facilities declared within the +header files supplied with the GNU C library. Each entry also lists the +standard or other source from which each facility is derived, and tells +you where in the manual you can find more information about how to use +it. + +@table @code +@comment summary.texi is generated from the other Texinfo files. +@comment See the Makefile and summary.awk for the details. 
+@include summary.texi +@end table diff --git a/manual/intro.texi b/manual/intro.texi new file mode 100644 index 0000000000..19f04a1474 --- /dev/null +++ b/manual/intro.texi @@ -0,0 +1,689 @@ +@node Introduction, Error Reporting, Top, Top +@chapter Introduction + +The C language provides no built-in facilities for performing such +common operations as input/output, memory management, string +manipulation, and the like. Instead, these facilities are defined +in a standard @dfn{library}, which you compile and link with your +programs. +@cindex library + +The GNU C library, described in this document, defines all of the +library functions that are specified by the ANSI C standard, as well as +additional features specific to POSIX and other derivatives of the Unix +operating system, and extensions specific to the GNU system. + +The purpose of this manual is to tell you how to use the facilities +of the GNU library. We have mentioned which features belong to which +standards to help you identify things that are potentially nonportable +to other systems. But the emphasis in this manual is not on strict +portability. + +@menu +* Getting Started:: What this manual is for and how to use it. +* Standards and Portability:: Standards and sources upon which the GNU + C library is based. +* Using the Library:: Some practical uses for the library. +* Roadmap to the Manual:: Overview of the remaining chapters in + this manual. +@end menu + +@node Getting Started, Standards and Portability, , Introduction +@section Getting Started + +This manual is written with the assumption that you are at least +somewhat familiar with the C programming language and basic programming +concepts. Specifically, familiarity with ANSI standard C +(@pxref{ANSI C}), rather than ``traditional'' pre-ANSI C dialects, is +assumed. 
+ +The GNU C library includes several @dfn{header files}, each of which +provides definitions and declarations for a group of related facilities; +this information is used by the C compiler when processing your program. +For example, the header file @file{stdio.h} declares facilities for +performing input and output, and the header file @file{string.h} +declares string processing utilities. The organization of this manual +generally follows the same division as the header files. + +If you are reading this manual for the first time, you should read all +of the introductory material and skim the remaining chapters. There are +a @emph{lot} of functions in the GNU C library and it's not realistic to +expect that you will be able to remember exactly @emph{how} to use each +and every one of them. It's more important to become generally familiar +with the kinds of facilities that the library provides, so that when you +are writing your programs you can recognize @emph{when} to make use of +library functions, and @emph{where} in this manual you can find more +specific information about them. + + +@node Standards and Portability, Using the Library, Getting Started, Introduction +@section Standards and Portability +@cindex standards + +This section discusses the various standards and other sources that the +GNU C library is based upon. These sources include the ANSI C and +POSIX standards, and the System V and Berkeley Unix implementations. + +The primary focus of this manual is to tell you how to make effective +use of the GNU library facilities. But if you are concerned about +making your programs compatible with these standards, or portable to +operating systems other than GNU, this can affect how you use the +library. This section gives you an overview of these standards, so that +you will know what they are when they are mentioned in other parts of +the manual. 
+ +@xref{Library Summary}, for an alphabetical list of the functions and +other symbols provided by the library. This list also states which +standards each function or symbol comes from. + +@menu +* ANSI C:: The American National Standard for the + C programming language. +* POSIX:: The IEEE 1003 standards for operating + systems. +* Berkeley Unix:: BSD and SunOS. +* SVID:: The System V Interface Description. +@end menu + +@node ANSI C, POSIX, , Standards and Portability +@subsection ANSI C +@cindex ANSI C + +The GNU C library is compatible with the C standard adopted by the +American National Standards Institute (ANSI): +@cite{American National Standard X3.159-1989---``ANSI C''}. +The header files and library facilities that make up the GNU library are +a superset of those specified by the ANSI C standard.@refill + +@pindex gcc +If you are concerned about strict adherence to the ANSI C standard, you +should use the @samp{-ansi} option when you compile your programs with +the GNU C compiler. This tells the compiler to define @emph{only} ANSI +standard features from the library header files, unless you explicitly +ask for additional features. @xref{Feature Test Macros}, for +information on how to do this. + +Being able to restrict the library to include only ANSI C features is +important because ANSI C puts limitations on what names can be defined +by the library implementation, and the GNU extensions don't fit these +limitations. @xref{Reserved Names}, for more information about these +restrictions. + +This manual does not attempt to give you complete details on the +differences between ANSI C and older dialects. It gives advice on how +to write programs to work portably under multiple C dialects, but does +not aim for completeness. 
+ +@node POSIX, Berkeley Unix, ANSI C, Standards and Portability +@subsection POSIX (The Portable Operating System Interface) +@cindex POSIX +@cindex POSIX.1 +@cindex IEEE Std 1003.1 +@cindex POSIX.2 +@cindex IEEE Std 1003.2 + +The GNU library is also compatible with the IEEE @dfn{POSIX} family of +standards, known more formally as the @dfn{Portable Operating System +Interface for Computer Environments}. POSIX is derived mostly from +various versions of the Unix operating system. + +The library facilities specified by the POSIX standards are a superset +of those required by ANSI C; POSIX specifies additional features for +ANSI C functions, as well as specifying new additional functions. In +general, the additional requirements and functionality defined by the +POSIX standards are aimed at providing lower-level support for a +particular kind of operating system environment, rather than general +programming language support which can run in many diverse operating +system environments.@refill + +The GNU C library implements all of the functions specified in +@cite{IEEE Std 1003.1-1990, the POSIX System Application Program +Interface}, commonly referred to as POSIX.1. The primary extensions to +the ANSI C facilities specified by this standard include file system +interface primitives (@pxref{File System Interface}), device-specific +terminal control functions (@pxref{Low-Level Terminal Interface}), and +process control functions (@pxref{Processes}). + +Some facilities from @cite{IEEE Std 1003.2-1992, the POSIX Shell and +Utilities standard} (POSIX.2) are also implemented in the GNU library. +These include utilities for dealing with regular expressions and other +pattern matching facilities (@pxref{Pattern Matching}). + +@comment Roland sez: +@comment The GNU C library as it stands conforms to 1003.2 draft 11, which +@comment specifies: +@comment +@comment Several new macros in <limits.h>. 
+@comment popen, pclose +@comment <regex.h> (which is not yet fully implemented--wait on this) +@comment fnmatch +@comment getopt +@comment <glob.h> +@comment <wordexp.h> (not yet implemented) +@comment confstr + + +@node Berkeley Unix, SVID, POSIX, Standards and Portability +@subsection Berkeley Unix +@cindex BSD Unix +@cindex 4.@var{n} BSD Unix +@cindex Berkeley Unix +@cindex SunOS +@cindex Unix, Berkeley + +The GNU C library defines facilities from some versions of Unix which +are not formally standardized, specifically from the 4.2 BSD, 4.3 BSD, +and 4.4 BSD Unix systems (also known as @dfn{Berkeley Unix}) and from +@dfn{SunOS} (a popular 4.2 BSD derivative that includes some Unix System +V functionality). These systems support most of the ANSI and POSIX +facilities, and 4.4 BSD and newer releases of SunOS in fact support them all. + +The BSD facilities include symbolic links (@pxref{Symbolic Links}), the +@code{select} function (@pxref{Waiting for I/O}), the BSD signal +functions (@pxref{BSD Signal Handling}), and sockets (@pxref{Sockets}). + +@node SVID, , Berkeley Unix, Standards and Portability +@subsection SVID (The System V Interface Description) +@cindex SVID +@cindex System V Unix +@cindex Unix, System V + +The @dfn{System V Interface Description} (SVID) is a document describing +the AT&T Unix System V operating system. It is to some extent a +superset of the POSIX standard (@pxref{POSIX}). + +The GNU C library defines some of the facilities required by the SVID +that are not also required by the ANSI or POSIX standards, for +compatibility with System V Unix and other Unix systems (such as +SunOS) which include these facilities. However, many of the more +obscure and less generally useful facilities required by the SVID are +not included. (In fact, Unix System V itself does not provide them all.) + +@c !!! mention sysv ipc/shmem when it is there. 
+ + +@node Using the Library, Roadmap to the Manual, Standards and Portability, Introduction +@section Using the Library + +This section describes some of the practical issues involved in using +the GNU C library. + +@menu +* Header Files:: How to include the header files in your + programs. +* Macro Definitions:: Some functions in the library may really + be implemented as macros. +* Reserved Names:: The C standard reserves some names for + the library, and some for users. +* Feature Test Macros:: How to control what names are defined. +@end menu + +@node Header Files, Macro Definitions, , Using the Library +@subsection Header Files +@cindex header files + +Libraries for use by C programs really consist of two parts: @dfn{header +files} that define types and macros and declare variables and +functions; and the actual library or @dfn{archive} that contains the +definitions of the variables and functions. + +(Recall that in C, a @dfn{declaration} merely provides information that +a function or variable exists and gives its type. For a function +declaration, information about the types of its arguments might be +provided as well. The purpose of declarations is to allow the compiler +to correctly process references to the declared variables and functions. +A @dfn{definition}, on the other hand, actually allocates storage for a +variable or says what a function does.) +@cindex definition (compared to declaration) +@cindex declaration (compared to definition) + +In order to use the facilities in the GNU C library, you should be sure +that your program source files include the appropriate header files. +This is so that the compiler has declarations of these facilities +available and can correctly process references to them. Once your +program has been compiled, the linker resolves these references to +the actual definitions provided in the archive file. + +Header files are included into a program source file by the +@samp{#include} preprocessor directive. 
The C language supports two +forms of this directive; the first, + +@smallexample +#include "@var{header}" +@end smallexample + +@noindent +is typically used to include a header file @var{header} that you write +yourself; this would contain definitions and declarations describing the +interfaces between the different parts of your particular application. +By contrast, + +@smallexample +#include <file.h> +@end smallexample + +@noindent +is typically used to include a header file @file{file.h} that contains +definitions and declarations for a standard library. This file would +normally be installed in a standard place by your system administrator. +You should use this second form for the C library header files. + +Typically, @samp{#include} directives are placed at the top of the C +source file, before any other code. If you begin your source files with +some comments explaining what the code in the file does (a good idea), +put the @samp{#include} directives immediately afterwards, following the +feature test macro definition (@pxref{Feature Test Macros}). + +For more information about the use of header files and @samp{#include} +directives, @pxref{Header Files,,, cpp.info, The GNU C Preprocessor +Manual}.@refill + +The GNU C library provides several header files, each of which contains +the type and macro definitions and variable and function declarations +for a group of related facilities. This means that your programs may +need to include several header files, depending on exactly which +facilities you are using. + +Some library header files include other library header files +automatically. However, as a matter of programming style, you should +not rely on this; it is better to explicitly include all the header +files required for the library facilities you are using. 
The GNU C +library header files have been written in such a way that it doesn't +matter if a header file is accidentally included more than once; +including a header file a second time has no effect. Likewise, if your +program needs to include multiple header files, the order in which they +are included doesn't matter. + +@strong{Compatibility Note:} Inclusion of standard header files in any +order and any number of times works in any ANSI C implementation. +However, this has traditionally not been the case in many older C +implementations. + +Strictly speaking, you don't @emph{have to} include a header file to use +a function it declares; you could declare the function explicitly +yourself, according to the specifications in this manual. But it is +usually better to include the header file because it may define types +and macros that are not otherwise available and because it may define +more efficient macro replacements for some functions. It is also a sure +way to have the correct declaration. + +@node Macro Definitions, Reserved Names, Header Files, Using the Library +@subsection Macro Definitions of Functions +@cindex shadowing functions with macros +@cindex removing macros that shadow functions +@cindex undefining macros that shadow functions + +If we describe something as a function in this manual, it may have a +macro definition as well. This normally has no effect on how your +program runs---the macro definition does the same thing as the function +would. In particular, macro equivalents for library functions evaluate +arguments exactly once, in the same way that a function call would. The +main reason for these macro definitions is that sometimes they can +produce an inline expansion that is considerably faster than an actual +function call. + +Taking the address of a library function works even if it is also +defined as a macro. 
This is because, in this context, the name of the +function isn't followed by the left parenthesis that is syntactically +necessary to recognize a macro call. + +You might occasionally want to avoid using the macro definition of a +function---perhaps to make your program easier to debug. There are +two ways you can do this: + +@itemize @bullet +@item +You can avoid a macro definition in a specific use by enclosing the name +of the function in parentheses. This works because the name of the +function doesn't appear in a syntactic context where it is recognizable +as a macro call. + +@item +You can suppress any macro definition for a whole source file by using +the @samp{#undef} preprocessor directive, unless otherwise stated +explicitly in the description of that facility. +@end itemize + +For example, suppose the header file @file{stdlib.h} declares a function +named @code{abs} with + +@smallexample +extern int abs (int); +@end smallexample + +@noindent +and also provides a macro definition for @code{abs}. Then, in: + +@smallexample +#include <stdlib.h> +int f (int *i) @{ return (abs (++*i)); @} +@end smallexample + +@noindent +the reference to @code{abs} might refer to either a macro or a function. +On the other hand, in each of the following examples the reference is +to a function and not a macro. + +@smallexample +#include <stdlib.h> +int g (int *i) @{ return ((abs)(++*i)); @} + +#undef abs +int h (int *i) @{ return (abs (++*i)); @} +@end smallexample + +Since macro definitions that double for a function behave in +exactly the same way as the actual function version, there is usually no +need for any of these methods. In fact, removing macro definitions usually +just makes your program slower. 
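The three ways of reaching the real function described above can be collected into one small file. This is only a sketch: the helper @code{apply} and the three wrapper names are hypothetical and chosen purely for illustration, and the code behaves the same whether or not a given @file{stdlib.h} actually provides a macro for @code{abs}.

```c
#include <stdlib.h>

/* Hypothetical helper for illustration: calls FN on X through a
   function pointer.  */
static int
apply (int (*fn) (int), int x)
{
  return fn (x);
}

/* The parenthesized name is not immediately followed by a left
   parenthesis, so a function-like macro named abs cannot expand here.  */
int
abs_via_parens (int x)
{
  return (abs) (x);
}

/* Passing the bare name takes the address of the function; again no
   macro call is recognized.  */
int
abs_via_pointer (int x)
{
  return apply (abs, x);
}

/* From this point to the end of the file, any macro definition of abs
   is removed.  #undef is harmless if no such macro exists.  */
#undef abs

int
abs_after_undef (int x)
{
  return abs (x);
}
```

All three wrappers call the same library function, so they should agree on every argument; the only difference is how each one avoids a possible macro expansion.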
+ + +@node Reserved Names, Feature Test Macros, Macro Definitions, Using the Library +@subsection Reserved Names +@cindex reserved names +@cindex name space + +The names of all library types, macros, variables and functions that +come from the ANSI C standard are reserved unconditionally; your program +@strong{may not} redefine these names. All other library names are +reserved if your program explicitly includes the header file that +defines or declares them. There are several reasons for these +restrictions: + +@itemize @bullet +@item +Other people reading your code could get very confused if you were using +a function named @code{exit} to do something completely different from +what the standard @code{exit} function does, for example. Preventing +this situation helps to make your programs easier to understand and +contributes to modularity and maintainability. + +@item +It avoids the possibility of a user accidentally redefining a library +function that is called by other library functions. If redefinition +were allowed, those other functions would not work properly. + +@item +It allows the compiler to do whatever special optimizations it pleases +on calls to these functions, without the possibility that they may have +been redefined by the user. Some library facilities, such as those for +dealing with variadic arguments (@pxref{Variadic Functions}) +and non-local exits (@pxref{Non-Local Exits}), actually require a +considerable amount of cooperation on the part of the C compiler, and +from an implementation standpoint it might be easier for the compiler to treat these as +built-in parts of the language. +@end itemize + +In addition to the names documented in this manual, reserved names +include all external identifiers (global functions and variables) that +begin with an underscore (@samp{_}) and all identifiers regardless of +use that begin with either two underscores or an underscore followed by +a capital letter.
This is so that the library and +header files can define functions, variables, and macros for internal +purposes without risk of conflict with names in user programs. + +Some additional classes of identifier names are reserved for future +extensions to the C language or the POSIX.1 environment. While using these +names for your own purposes right now might not cause a problem, they do +raise the possibility of conflict with future versions of the C +or POSIX standards, so you should avoid these names. + +@itemize @bullet +@item +Names beginning with a capital @samp{E} followed by a digit or uppercase +letter may be used for additional error code names. @xref{Error +Reporting}. + +@item +Names that begin with either @samp{is} or @samp{to} followed by a +lowercase letter may be used for additional character testing and +conversion functions. @xref{Character Handling}. + +@item +Names that begin with @samp{LC_} followed by an uppercase letter may be +used for additional macros specifying locale attributes. +@xref{Locales}. + +@item +Names of all existing mathematics functions (@pxref{Mathematics}) +suffixed with @samp{f} or @samp{l} are reserved for corresponding +functions that operate on @code{float} and @code{long double} arguments, +respectively. + +@item +Names that begin with @samp{SIG} followed by an uppercase letter are +reserved for additional signal names. @xref{Standard Signals}. + +@item +Names that begin with @samp{SIG_} followed by an uppercase letter are +reserved for additional signal actions. @xref{Basic Signal Handling}. + +@item +Names beginning with @samp{str}, @samp{mem}, or @samp{wcs} followed by a +lowercase letter are reserved for additional string and array functions. +@xref{String and Array Utilities}. + +@item +Names that end with @samp{_t} are reserved for additional type names. +@end itemize + +In addition, some individual header files reserve names beyond +those that they actually define.
You only need to worry about these +restrictions if your program includes that particular header file. + +@itemize @bullet +@item +The header file @file{dirent.h} reserves names prefixed with +@samp{d_}. +@pindex dirent.h + +@item +The header file @file{fcntl.h} reserves names prefixed with +@samp{l_}, @samp{F_}, @samp{O_}, and @samp{S_}. +@pindex fcntl.h + +@item +The header file @file{grp.h} reserves names prefixed with @samp{gr_}. +@pindex grp.h + +@item +The header file @file{limits.h} reserves names suffixed with @samp{_MAX}. +@pindex limits.h + +@item +The header file @file{pwd.h} reserves names prefixed with @samp{pw_}. +@pindex pwd.h + +@item +The header file @file{signal.h} reserves names prefixed with @samp{sa_} +and @samp{SA_}. +@pindex signal.h + +@item +The header file @file{sys/stat.h} reserves names prefixed with @samp{st_} +and @samp{S_}. +@pindex sys/stat.h + +@item +The header file @file{sys/times.h} reserves names prefixed with @samp{tms_}. +@pindex sys/times.h + +@item +The header file @file{termios.h} reserves names prefixed with @samp{c_}, +@samp{V}, @samp{I}, @samp{O}, and @samp{TC}; and names prefixed with +@samp{B} followed by a digit. +@pindex termios.h +@end itemize + +@comment Include the section on Creature Nest Macros. +@comment It is in a separate file so it can be formatted into ../NOTES. +@include creature.texi + +@node Roadmap to the Manual, , Using the Library, Introduction +@section Roadmap to the Manual + +Here is an overview of the contents of the remaining chapters of +this manual. + +@itemize @bullet +@item +@ref{Error Reporting}, describes how errors detected by the library +are reported. 
+ +@item +@ref{Language Features}, contains information about library support for +standard parts of the C language, including things like the @code{sizeof} +operator and the symbolic constant @code{NULL}, how to write functions +accepting variable numbers of arguments, and constants describing the +ranges and other properties of the numerical types. There is also a simple +debugging mechanism which allows you to put assertions in your code, and +have diagnostic messages printed if the tests fail. + +@item +@ref{Memory Allocation}, describes the GNU library's facilities for +dynamic allocation of storage. If you do not know in advance how much +storage your program needs, you can allocate it dynamically instead, +and manipulate it via pointers. + +@item +@ref{Character Handling}, contains information about character +classification functions (such as @code{isspace}) and functions for +performing case conversion. + +@item +@ref{String and Array Utilities}, has descriptions of functions for +manipulating strings (null-terminated character arrays) and general +byte arrays, including operations such as copying and comparison. + +@item +@ref{I/O Overview}, gives an overall look at the input and output +facilities in the library, and contains information about basic concepts +such as file names. + +@item +@ref{I/O on Streams}, describes I/O operations involving streams (or +@w{@code{FILE *}} objects). These are the normal C library functions +from @file{stdio.h}. + +@item +@ref{Low-Level I/O}, contains information about I/O operations +on file descriptors. File descriptors are a lower-level mechanism +specific to the Unix family of operating systems. + +@item +@ref{File System Interface}, has descriptions of operations on entire +files, such as functions for deleting and renaming them and for creating +new directories. This chapter also contains information about how you +can access the attributes of a file, such as its owner and file protection +modes. 
+ +@item +@ref{Pipes and FIFOs}, contains information about simple interprocess +communication mechanisms. Pipes allow communication between two related +processes (such as between a parent and child), while FIFOs allow +communication between processes sharing a common file system on the same +machine. + +@item +@ref{Sockets}, describes a more complicated interprocess communication +mechanism that allows processes running on different machines to +communicate over a network. This chapter also contains information about +Internet host addressing and how to use the system network databases. + +@item +@ref{Low-Level Terminal Interface}, describes how you can change the +attributes of a terminal device. If you want to disable echo of +characters typed by the user, for example, read this chapter. + +@item +@ref{Mathematics}, contains information about the math library +functions. These include things like random-number generators and +remainder functions on integers as well as the usual trigonometric and +exponential functions on floating-point numbers. + +@item +@ref{Arithmetic,, Low-Level Arithmetic Functions}, describes functions +for simple arithmetic, analysis of floating-point values, and reading +numbers from strings. + +@item +@ref{Searching and Sorting}, contains information about functions +for searching and sorting arrays. You can use these functions on any +kind of array by providing an appropriate comparison function. + +@item +@ref{Pattern Matching}, presents functions for matching regular expressions +and shell file name patterns, and for expanding words as the shell does. + +@item +@ref{Date and Time}, describes functions for measuring both calendar time +and CPU time, as well as functions for setting alarms and timers. + +@item +@ref{Extended Characters}, contains information about manipulating +characters and strings using character sets larger than will fit in +the usual @code{char} data type. 
+ +@item +@ref{Locales}, describes how selecting a particular country +or language affects the behavior of the library. For example, the locale +affects collation sequences for strings and how monetary values are +formatted. + +@item +@ref{Non-Local Exits}, contains descriptions of the @code{setjmp} and +@code{longjmp} functions. These functions provide a facility for +@code{goto}-like jumps which can jump from one function to another. + +@item +@ref{Signal Handling}, tells you all about signals---what they are, +how to establish a handler that is called when a particular kind of +signal is delivered, and how to prevent signals from arriving during +critical sections of your program. + +@item +@ref{Process Startup}, tells how your programs can access their +command-line arguments and environment variables. + +@item +@ref{Processes}, contains information about how to start new processes +and run programs. + +@item +@ref{Job Control}, describes functions for manipulating process groups +and the controlling terminal. This material is probably only of +interest if you are writing a shell or other program which handles job +control specially. + +@item +@ref{User Database}, and @ref{Group Database}, tell you how to access +the system user and group databases. + +@item +@ref{System Information}, describes functions for getting information +about the hardware and software configuration your program is executing +under. + +@item +@ref{System Configuration}, tells you how you can get information about +various operating system limits. Most of these parameters are provided for +compatibility with POSIX. + +@item +@ref{Library Summary}, gives a summary of all the functions, variables, and +macros in the library, with complete data types and function prototypes, +and says what standard or system each is derived from. 
+ +@item +@ref{Maintenance}, explains how to build and install the GNU C library on +your system, how to report any bugs you might find, and how to add new +functions or port the library to a new system. +@end itemize + +If you already know the name of the facility you are interested in, you +can look it up in @ref{Library Summary}. This gives you a summary of +its syntax and a pointer to where you can find a more detailed +description. This appendix is particularly useful if you just want to +verify the order and type of arguments to a function, for example. It +also tells you what standard or system each function, variable, or macro +is derived from. diff --git a/manual/io.texi b/manual/io.texi new file mode 100644 index 0000000000..84fd0a9e44 --- /dev/null +++ b/manual/io.texi @@ -0,0 +1,396 @@ +@node I/O Overview, I/O on Streams, Pattern Matching, Top +@chapter Input/Output Overview + +Most programs need to do either input (reading data) or output (writing +data), or most frequently both, in order to do anything useful. The GNU +C library provides such a large selection of input and output functions +that the hardest part is often deciding which function is most +appropriate! + +This chapter introduces concepts and terminology relating to input +and output. Other chapters relating to the GNU I/O facilities are: + +@itemize @bullet +@item +@ref{I/O on Streams}, which covers the high-level functions +that operate on streams, including formatted input and output. + +@item +@ref{Low-Level I/O}, which covers the basic I/O and control +functions on file descriptors. + +@item +@ref{File System Interface}, which covers functions for operating on +directories and for manipulating file attributes such as access modes +and ownership. + +@item +@ref{Pipes and FIFOs}, which includes information on the basic interprocess +communication facilities. + +@item +@ref{Sockets}, which covers a more complicated interprocess communication +facility with support for networking. 
+ +@item +@ref{Low-Level Terminal Interface}, which covers functions for changing +how input and output to terminal or other serial devices are processed. +@end itemize + + +@menu +* I/O Concepts:: Some basic information and terminology. +* File Names:: How to refer to a file. +@end menu + +@node I/O Concepts, File Names, , I/O Overview +@section Input/Output Concepts + +Before you can read or write the contents of a file, you must establish +a connection or communications channel to the file. This process is +called @dfn{opening} the file. You can open a file for reading, writing, +or both. +@cindex opening a file + +The connection to an open file is represented either as a stream or as a +file descriptor. You pass this as an argument to the functions that do +the actual read or write operations, to tell them which file to operate +on. Certain functions expect streams, and others are designed to +operate on file descriptors. + +When you have finished reading from or writing to the file, you can +terminate the connection by @dfn{closing} the file. Once you have +closed a stream or file descriptor, you cannot do any more input or +output operations on it. + +@menu +* Streams and File Descriptors:: The GNU Library provides two ways + to access the contents of files. +* File Position:: The number of bytes from the + beginning of the file. +@end menu + +@node Streams and File Descriptors, File Position, , I/O Concepts +@subsection Streams and File Descriptors + +When you want to do input or output to a file, you have a choice of two +basic mechanisms for representing the connection between your program +and the file: file descriptors and streams. File descriptors are +represented as objects of type @code{int}, while streams are represented +as @code{FILE *} objects. + +File descriptors provide a primitive, low-level interface to input and +output operations.
Both file descriptors and streams can represent a +connection to a device (such as a terminal), or a pipe or socket for +communicating with another process, as well as a normal file. But, if +you want to do control operations that are specific to a particular kind +of device, you must use a file descriptor; there are no facilities to +use streams in this way. You must also use file descriptors if your +program needs to do input or output in special modes, such as +nonblocking (or polled) input (@pxref{File Status Flags}). + +Streams provide a higher-level interface, layered on top of the +primitive file descriptor facilities. The stream interface treats all +kinds of files pretty much alike---the sole exception being the three +styles of buffering that you can choose (@pxref{Stream Buffering}). + +The main advantage of using the stream interface is that the set of +functions for performing actual input and output operations (as opposed +to control operations) on streams is much richer and more powerful than +the corresponding facilities for file descriptors. The file descriptor +interface provides only simple functions for transferring blocks of +characters, but the stream interface also provides powerful formatted +input and output functions (@code{printf} and @code{scanf}) as well as +functions for character- and line-oriented input and output. +@c !!! glibc has dprintf, which lets you do printf on an fd. + +Since streams are implemented in terms of file descriptors, you can +extract the file descriptor from a stream and perform low-level +operations directly on the file descriptor. You can also initially open +a connection as a file descriptor and then make a stream associated with +that file descriptor. + +In general, you should stick with using streams rather than file +descriptors, unless there is some specific operation you want to do that +can only be done on a file descriptor. 
If you are a beginning +programmer and aren't sure what functions to use, we suggest that you +concentrate on the formatted input functions (@pxref{Formatted Input}) +and formatted output functions (@pxref{Formatted Output}). + +If you are concerned about portability of your programs to systems other +than GNU, you should also be aware that file descriptors are not as +portable as streams. You can expect any system running ANSI C to +support streams, but non-GNU systems may not support file descriptors at +all, or may only implement a subset of the GNU functions that operate on +file descriptors. Most of the file descriptor functions in the GNU +library are included in the POSIX.1 standard, however. + +@node File Position, , Streams and File Descriptors, I/O Concepts +@subsection File Position + +One of the attributes of an open file is its @dfn{file position} that +keeps track of where in the file the next character is to be read or +written. In the GNU system, and all POSIX.1 systems, the file position +is simply an integer representing the number of bytes from the beginning +of the file. + +The file position is normally set to the beginning of the file when it +is opened, and each time a character is read or written, the file +position is incremented. In other words, access to the file is normally +@dfn{sequential}. +@cindex file position +@cindex sequential-access files + +Ordinary files permit read or write operations at any position within +the file. Some other kinds of files may also permit this. Files which +do permit this are sometimes referred to as @dfn{random-access} files. +You can change the file position using the @code{fseek} function on a +stream (@pxref{File Positioning}) or the @code{lseek} function on a file +descriptor (@pxref{I/O Primitives}). If you try to change the file +position on a file that doesn't support random access, you get the +@code{ESPIPE} error. 
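The behavior described above can be probed with a small sketch. It assumes a POSIX system; the helper names @code{fd_is_seekable} and @code{stream_is_seekable} are invented for this illustration and are not part of the library. Each helper asks to move zero bytes from the current file position, which changes nothing on a random-access file and fails with @code{ESPIPE} on a pipe or similar file.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if the descriptor FD permits random
   access, 0 if lseek reports ESPIPE, and -1 for any other error.  */
int
fd_is_seekable (int fd)
{
  /* Moving zero bytes from the current position leaves it unchanged.  */
  if (lseek (fd, 0, SEEK_CUR) != (off_t) -1)
    return 1;
  return errno == ESPIPE ? 0 : -1;
}

/* Hypothetical helper: the same probe for a stream, using fseek.  */
int
stream_is_seekable (FILE *stream)
{
  assert (stream != NULL);
  if (fseek (stream, 0L, SEEK_CUR) == 0)
    return 1;
  return errno == ESPIPE ? 0 : -1;
}
```

Called on the read end of a pipe, @code{fd_is_seekable} should report 0; called on a descriptor for an ordinary file, it should report 1.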
+@cindex random-access files + +Streams and descriptors that are opened for @dfn{append access} are +treated specially for output: output to such files is @emph{always} +appended sequentially to the @emph{end} of the file, regardless of the +file position. However, the file position is still used to control where in +the file reading is done. +@cindex append-access files + +If you think about it, you'll realize that several programs can read a +given file at the same time. In order for each program to be able to +read the file at its own pace, each program must have its own file +position, which is not affected by anything the other programs do. + +In fact, each opening of a file creates a separate file position. +Thus, if you open a file twice even in the same program, you get two +streams or descriptors with independent file positions. + +By contrast, if you open a descriptor and then duplicate it to get +another descriptor, these two descriptors share the same file position: +changing the file position of one descriptor will affect the other. + +@node File Names, , I/O Concepts, I/O Overview +@section File Names + +In order to open a connection to a file, or to perform other operations +such as deleting a file, you need some way to refer to the file. Nearly +all files have names that are strings---even files which are actually +devices such as tape drives or terminals. These strings are called +@dfn{file names}. You specify the file name to say which file you want +to open or operate on. + +This section describes the conventions for file names and how the +operating system works with them. +@cindex file name + +@menu +* Directories:: Directories contain entries for files. +* File Name Resolution:: A file name specifies how to look up a file. +* File Name Errors:: Error conditions relating to file names. +* File Name Portability:: File name portability and syntax issues.
+@end menu + + +@node Directories, File Name Resolution, , File Names +@subsection Directories + +In order to understand the syntax of file names, you need to understand +how the file system is organized into a hierarchy of directories. + +@cindex directory +@cindex link +@cindex directory entry +A @dfn{directory} is a file that contains information to associate other +files with names; these associations are called @dfn{links} or +@dfn{directory entries}. Sometimes, people speak of ``files in a +directory'', but in reality, a directory only contains pointers to +files, not the files themselves. + +@cindex file name component +The name of a file contained in a directory entry is called a @dfn{file +name component}. In general, a file name consists of a sequence of one +or more such components, separated by the slash character (@samp{/}). A +file name which is just one component names a file with respect to its +directory. A file name with multiple components names a directory, and +then a file in that directory, and so on. + +Some other documents, such as the POSIX standard, use the term +@dfn{pathname} for what we call a file name, and either @dfn{filename} +or @dfn{pathname component} for what this manual calls a file name +component. We don't use this terminology because a ``path'' is +something completely different (a list of directories to search), and we +think that ``pathname'' used for something else will confuse users. We +always use ``file name'' and ``file name component'' (or sometimes just +``component'', where the context is obvious) in GNU documentation. Some +macros use the POSIX terminology in their names, such as +@code{PATH_MAX}. These macros are defined by the POSIX standard, so we +cannot change their names. + +You can find more detailed information about operations on directories +in @ref{File System Interface}. 
+ +@node File Name Resolution, File Name Errors, Directories, File Names +@subsection File Name Resolution + +A file name consists of file name components separated by slash +(@samp{/}) characters. On the systems that the GNU C library supports, +multiple successive @samp{/} characters are equivalent to a single +@samp{/} character. + +@cindex file name resolution +The process of determining what file a file name refers to is called +@dfn{file name resolution}. This is performed by examining the +components that make up a file name in left-to-right order, and locating +each successive component in the directory named by the previous +component. Of course, each of the files that are referenced as +directories must actually exist, be directories instead of regular +files, and have the appropriate permissions to be accessible by the +process; otherwise the file name resolution fails. + +@cindex root directory +@cindex absolute file name +If a file name begins with a @samp{/}, the first component in the file +name is located in the @dfn{root directory} of the process (usually all +processes on the system have the same root directory). Such a file name +is called an @dfn{absolute file name}. +@c !!! xref here to chroot, if we ever document chroot. -rm + +@cindex relative file name +Otherwise, the first component in the file name is located in the +current working directory (@pxref{Working Directory}). This kind of +file name is called a @dfn{relative file name}. + +@cindex parent directory +The file name components @file{.} (``dot'') and @file{..} (``dot-dot'') +have special meanings. Every directory has entries for these file name +components. The file name component @file{.} refers to the directory +itself, while the file name component @file{..} refers to its +@dfn{parent directory} (the directory that contains the link for the +directory in question). 
As a special case, @file{..} in the root +directory refers to the root directory itself, since it has no parent; +thus @file{/..} is the same as @file{/}. + +Here are some examples of file names: + +@table @file +@item /a +The file named @file{a}, in the root directory. + +@item /a/b +The file named @file{b}, in the directory named @file{a} in the root directory. + +@item a +The file named @file{a}, in the current working directory. + +@item /a/./b +This is the same as @file{/a/b}. + +@item ./a +The file named @file{a}, in the current working directory. + +@item ../a +The file named @file{a}, in the parent directory of the current working +directory. +@end table + +@c An empty string may ``work'', but I think it's confusing to +@c try to describe it. It's not a useful thing for users to use--rms. +A file name that names a directory may optionally end in a @samp{/}. +You can specify a file name of @file{/} to refer to the root directory, +but the empty string is not a meaningful file name. If you want to +refer to the current working directory, use a file name of @file{.} or +@file{./}. + +Unlike some other operating systems, the GNU system doesn't have any +built-in support for file types (or extensions) or file versions as part +of its file name syntax. Many programs and utilities use conventions +for file names---for example, files containing C source code usually +have names suffixed with @samp{.c}---but there is nothing in the file +system itself that enforces this kind of convention. + +@node File Name Errors, File Name Portability, File Name Resolution, File Names +@subsection File Name Errors + +@cindex file name errors +@cindex usual file name errors + +Functions that accept file name arguments usually detect these +@code{errno} error conditions relating to the file name syntax or +trouble finding the named file. These errors are referred to throughout +this manual as the @dfn{usual file name errors}. 
+ +@table @code +@item EACCES +The process does not have search permission for a directory component +of the file name. + +@item ENAMETOOLONG +This error is used when either the total length of a file name is +greater than @code{PATH_MAX}, or when an individual file name component +has a length greater than @code{NAME_MAX}. @xref{Limits for Files}. + +In the GNU system, there is no imposed limit on overall file name +length, but some file systems may place limits on the length of a +component. + +@item ENOENT +This error is reported when a file referenced as a directory component +in the file name doesn't exist, or when a component is a symbolic link +whose target file does not exist. @xref{Symbolic Links}. + +@item ENOTDIR +A file that is referenced as a directory component in the file name +exists, but it isn't a directory. + +@item ELOOP +Too many symbolic links were resolved while trying to look up the file +name. The system has an arbitrary limit on the number of symbolic links +that may be resolved in looking up a single file name, as a primitive +way to detect loops. @xref{Symbolic Links}. +@end table + + +@node File Name Portability, , File Name Errors, File Names +@subsection Portability of File Names + +The rules for the syntax of file names discussed in @ref{File Names}, +are the rules normally used by the GNU system and by other POSIX +systems. However, other operating systems may use other conventions. + +There are two reasons why it can be important for you to be aware of +file name portability issues: + +@itemize @bullet +@item +If your program makes assumptions about file name syntax, or contains +embedded literal file name strings, it is more difficult to get it to +run under other operating systems that use different syntax conventions. + +@item +Even if you are not concerned about running your program on machines +that run other operating systems, it may still be possible to access +files that use different naming conventions.
For example, you may be +able to access file systems on another computer running a different +operating system over a network, or read and write disks in formats used +by other operating systems. +@end itemize + +The ANSI C standard says very little about file name syntax, only that +file names are strings. In addition to varying restrictions on the +length of file names and what characters can validly appear in a file +name, different operating systems use different conventions and syntax +for concepts such as structured directories and file types or +extensions. Some concepts such as file versions might be supported in +some operating systems and not by others. + +The POSIX.1 standard allows implementations to put additional +restrictions on file name syntax, concerning what characters are +permitted in file names and on the length of file name and file name +component strings. However, in the GNU system, you do not need to worry +about these restrictions; any character except the null character is +permitted in a file name string, and there are no limits on the length +of file name strings. + + diff --git a/manual/job.texi b/manual/job.texi new file mode 100644 index 0000000000..1ac15fffc4 --- /dev/null +++ b/manual/job.texi @@ -0,0 +1,1249 @@ +@node Job Control +@chapter Job Control + +@cindex process groups +@cindex job control +@cindex job +@cindex session +@dfn{Job control} refers to the protocol for allowing a user to move +between multiple @dfn{process groups} (or @dfn{jobs}) within a single +@dfn{login session}. The job control facilities are set up so that +appropriate behavior for most programs happens automatically and they +need not do anything special about job control. So you can probably +ignore the material in this chapter unless you are writing a shell or +login program. 
+ +You need to be familiar with concepts relating to process creation +(@pxref{Process Creation Concepts}) and signal handling (@pxref{Signal +Handling}) in order to understand the material presented in this +chapter. + +@menu +* Concepts of Job Control:: Jobs can be controlled by a shell. +* Job Control is Optional:: Not all POSIX systems support job control. +* Controlling Terminal:: How a process gets its controlling terminal. +* Access to the Terminal:: How processes share the controlling terminal. +* Orphaned Process Groups:: Jobs left after the user logs out. +* Implementing a Shell:: What a shell must do to implement job control. +* Functions for Job Control:: Functions to control process groups. +@end menu + +@node Concepts of Job Control, Job Control is Optional, , Job Control +@section Concepts of Job Control + +@cindex shell +The fundamental purpose of an interactive shell is to read +commands from the user's terminal and create processes to execute the +programs specified by those commands. It can do this using the +@code{fork} (@pxref{Creating a Process}) and @code{exec} +(@pxref{Executing a File}) functions. + +A single command may run just one process---but often one command uses +several processes. If you use the @samp{|} operator in a shell command, +you explicitly request several programs in their own processes. But +even if you run just one program, it can use multiple processes +internally. For example, a single compilation command such as @samp{cc +-c foo.c} typically uses four processes (though normally only two at any +given time). If you run @code{make}, its job is to run other programs +in separate processes. + +The processes belonging to a single command are called a @dfn{process +group} or @dfn{job}. This is so that you can operate on all of them at +once. For example, typing @kbd{C-c} sends the signal @code{SIGINT} to +terminate all the processes in the foreground process group.
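As a rough illustration of how processes end up grouped into jobs, the sketch below assumes a POSIX system and uses an invented helper name, @code{child_shares_pgrp}. A child created with @code{fork} starts out in its parent's process group; it leaves that group only if it (or its parent) calls @code{setpgid}, which is what a job control shell does for each new job.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child and report whether it is in the
   same process group as the caller.  If NEW_GROUP is nonzero, the
   child first moves itself into a new process group of its own.
   Return 1 if the groups match, 0 if they differ, -1 on error.  */
int
child_shares_pgrp (int new_group)
{
  pid_t parent_pgrp = getpgrp ();
  pid_t pid = fork ();
  int status;

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: optionally become the leader of a new process group.  */
      if (new_group && setpgid (0, 0) < 0)
        _exit (2);
      /* Exit status 0 means "same group", 1 means "different group".  */
      _exit (getpgrp () == parent_pgrp ? 0 : 1);
    }

  /* Parent: pid is the child's process ID here.  */
  assert (pid > 0);
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;
  if (WEXITSTATUS (status) > 1)
    return -1;
  return WEXITSTATUS (status) == 0;
}
```

Calling @code{child_shares_pgrp (0)} should report 1, since nothing moved the child, while @code{child_shares_pgrp (1)} should report 0.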
+ +@cindex session +A @dfn{session} is a larger group of processes. Normally all the +processes that stem from a single login belong to the same session. + +Every process belongs to a process group. When a process is created, it +becomes a member of the same process group and session as its parent +process. You can put it in another process group using the +@code{setpgid} function, provided the process group belongs to the same +session. + +@cindex session leader +The only way to put a process in a different session is to make it the +initial process of a new session, or a @dfn{session leader}, using the +@code{setsid} function. This also puts the session leader into a new +process group, and you can't move it out of that process group again. + +Usually, new sessions are created by the system login program, and the +session leader is the process running the user's login shell. + +@cindex controlling terminal +A shell that supports job control must arrange to control which job can +use the terminal at any time. Otherwise there might be multiple jobs +trying to read from the terminal at once, and confusion about which +process should receive the input typed by the user. To prevent this, +the shell must cooperate with the terminal driver using the protocol +described in this chapter. + +@cindex foreground job +@cindex background job +The shell can give unlimited access to the controlling terminal to only +one process group at a time. This is called the @dfn{foreground job} on +that controlling terminal. Other process groups managed by the shell +that are executing without such access to the terminal are called +@dfn{background jobs}. + +@cindex stopped job +If a background job needs to read from its controlling +terminal, it is @dfn{stopped} by the terminal driver; if the +@code{TOSTOP} mode is set, likewise for writing.
The user can stop +a foreground job by typing the SUSP character (@pxref{Special +Characters}) and a program can stop any job by sending it a +@code{SIGSTOP} signal. It's the responsibility of the shell to notice +when jobs stop, to notify the user about them, and to provide mechanisms +for allowing the user to interactively continue stopped jobs and switch +jobs between foreground and background. + +@xref{Access to the Terminal}, for more information about I/O to the +controlling terminal. + +@node Job Control is Optional, Controlling Terminal, Concepts of Job Control , Job Control +@section Job Control is Optional +@cindex job control is optional + +Not all operating systems support job control. The GNU system does +support job control, but if you are using the GNU library on some other +system, that system may not support job control itself. + +You can use the @code{_POSIX_JOB_CONTROL} macro to test at compile-time +whether the system supports job control. @xref{System Options}. + +If job control is not supported, then there can be only one process +group per session, which behaves as if it were always in the foreground. +The functions for creating additional process groups simply fail with +the error code @code{ENOSYS}. + +The macros naming the various job control signals (@pxref{Job Control +Signals}) are defined even if job control is not supported. However, +the system never generates these signals, and attempts to send a job +control signal or examine or specify their actions report errors or do +nothing. + + +@node Controlling Terminal, Access to the Terminal, Job Control is Optional, Job Control +@section Controlling Terminal of a Process + +One of the attributes of a process is its controlling terminal. Child +processes created with @code{fork} inherit the controlling terminal from +their parent process. In this way, all the processes in a session +inherit the controlling terminal from the session leader.
A session +leader that has control of a terminal is called the @dfn{controlling +process} of that terminal. + +@cindex controlling process +You generally do not need to worry about the exact mechanism used to +allocate a controlling terminal to a session, since it is done for you +by the system when you log in. +@c ??? How does GNU system let a process get a ctl terminal. + +An individual process disconnects from its controlling terminal when it +calls @code{setsid} to become the leader of a new session. +@xref{Process Group Functions}. + +@c !!! explain how it gets a new one (by opening any terminal) +@c ??? How you get a controlling terminal is system-dependent. +@c We should document how this will work in the GNU system when it is decided. +@c What Unix does is not clean and I don't think GNU should use that. + +@node Access to the Terminal, Orphaned Process Groups, Controlling Terminal, Job Control +@section Access to the Controlling Terminal +@cindex controlling terminal, access to + +Processes in the foreground job of a controlling terminal have +unrestricted access to that terminal; background processes do not. This +section describes in more detail what happens when a process in a +background job tries to access its controlling terminal. + +@cindex @code{SIGTTIN}, from background job +When a process in a background job tries to read from its controlling +terminal, the process group is usually sent a @code{SIGTTIN} signal. +This normally causes all of the processes in that group to stop (unless +they handle the signal and don't stop themselves). However, if the +reading process is ignoring or blocking this signal, then @code{read} +fails with an @code{EIO} error instead. + +@cindex @code{SIGTTOU}, from background job +Similarly, when a process in a background job tries to write to its +controlling terminal, the default behavior is to send a @code{SIGTTOU} +signal to the process group.
However, the behavior is modified by the +@code{TOSTOP} bit of the local modes flags (@pxref{Local Modes}). If +this bit is not set (which is the default), then writing to the +controlling terminal is always permitted without sending a signal. +Writing is also permitted if the @code{SIGTTOU} signal is being ignored +or blocked by the writing process. + +Most other terminal operations that a program can do are treated as +reading or as writing. (The description of each operation should say +which.) + +For more information about the primitive @code{read} and @code{write} +functions, see @ref{I/O Primitives}. + + +@node Orphaned Process Groups, Implementing a Shell, Access to the Terminal, Job Control +@section Orphaned Process Groups +@cindex orphaned process group + +When a controlling process terminates, its terminal becomes free and a +new session can be established on it. (In fact, another user could log +in on the terminal.) This could cause a problem if any processes from +the old session are still trying to use that terminal. + +To prevent problems, process groups that continue running even after the +session leader has terminated are marked as @dfn{orphaned process +groups}. + +When a process group becomes an orphan, its processes are sent a +@code{SIGHUP} signal. Ordinarily, this causes the processes to +terminate. However, if a program ignores this signal or establishes a +handler for it (@pxref{Signal Handling}), it can continue running in +the orphaned process group even after its controlling process terminates; +but it still cannot access the terminal any more. + +@node Implementing a Shell, Functions for Job Control, Orphaned Process Groups, Job Control +@section Implementing a Job Control Shell + +This section describes what a shell must do to implement job control, by +presenting an extensive sample program to illustrate the concepts +involved.
+ +@iftex +@itemize @bullet +@item +@ref{Data Structures}, introduces the example and presents +its primary data structures. + +@item +@ref{Initializing the Shell}, discusses actions which the shell must +perform to prepare for job control. + +@item +@ref{Launching Jobs}, includes information about how to create jobs +to execute commands. + +@item +@ref{Foreground and Background}, discusses what the shell should +do differently when launching a job in the foreground as opposed to +a background job. + +@item +@ref{Stopped and Terminated Jobs}, discusses reporting of job status +back to the shell. + +@item +@ref{Continuing Stopped Jobs}, tells you how to continue jobs that +have been stopped. + +@item +@ref{Missing Pieces}, discusses other parts of the shell. +@end itemize +@end iftex + +@menu +* Data Structures:: Introduction to the sample shell. +* Initializing the Shell:: What the shell must do to take + responsibility for job control. +* Launching Jobs:: Creating jobs to execute commands. +* Foreground and Background:: Putting a job in foreground or background. +* Stopped and Terminated Jobs:: Reporting job status. +* Continuing Stopped Jobs:: How to continue a stopped job in + the foreground or background. +* Missing Pieces:: Other parts of the shell. +@end menu + +@node Data Structures, Initializing the Shell, , Implementing a Shell +@subsection Data Structures for the Shell + +All of the program examples included in this chapter are part of +a simple shell program. This section presents data structures +and utility functions which are used throughout the example. + +The sample shell deals mainly with two data structures. The +@code{job} type contains information about a job, which is a +set of subprocesses linked together with pipes. The @code{process} type +holds information about a single subprocess.
Here are the relevant +data structure declarations: + +@smallexample +@group +/* @r{A process is a single process.} */ +typedef struct process +@{ + struct process *next; /* @r{next process in pipeline} */ + char **argv; /* @r{for exec} */ + pid_t pid; /* @r{process ID} */ + char completed; /* @r{true if process has completed} */ + char stopped; /* @r{true if process has stopped} */ + int status; /* @r{reported status value} */ +@} process; +@end group + +@group +/* @r{A job is a pipeline of processes.} */ +typedef struct job +@{ + struct job *next; /* @r{next active job} */ + char *command; /* @r{command line, used for messages} */ + process *first_process; /* @r{list of processes in this job} */ + pid_t pgid; /* @r{process group ID} */ + char notified; /* @r{true if user told about stopped job} */ + struct termios tmodes; /* @r{saved terminal modes} */ + int stdin, stdout, stderr; /* @r{standard i/o channels} */ +@} job; + +/* @r{The active jobs are linked into a list. This is its head.} */ +job *first_job = NULL; +@end group +@end smallexample + +Here are some utility functions that are used for operating on @code{job} +objects. 
+ +@smallexample +@group +/* @r{Find the active job with the indicated @var{pgid}.} */ +job * +find_job (pid_t pgid) +@{ + job *j; + + for (j = first_job; j; j = j->next) + if (j->pgid == pgid) + return j; + return NULL; +@} +@end group + +@group +/* @r{Return true if all processes in the job have stopped or completed.} */ +int +job_is_stopped (job *j) +@{ + process *p; + + for (p = j->first_process; p; p = p->next) + if (!p->completed && !p->stopped) + return 0; + return 1; +@} +@end group + +@group +/* @r{Return true if all processes in the job have completed.} */ +int +job_is_completed (job *j) +@{ + process *p; + + for (p = j->first_process; p; p = p->next) + if (!p->completed) + return 0; + return 1; +@} +@end group +@end smallexample + + +@node Initializing the Shell, Launching Jobs, Data Structures, Implementing a Shell +@subsection Initializing the Shell +@cindex job control, enabling +@cindex subshell + +When a shell program that normally performs job control is started, it +has to be careful in case it has been invoked from another shell that is +already doing its own job control. + +A subshell that runs interactively has to ensure that it has been placed +in the foreground by its parent shell before it can enable job control +itself. It does this by getting its initial process group ID with the +@code{getpgrp} function, and comparing it to the process group ID of the +current foreground job associated with its controlling terminal (which +can be retrieved using the @code{tcgetpgrp} function). + +If the subshell is not running as a foreground job, it must stop itself +by sending a @code{SIGTTIN} signal to its own process group. It may not +arbitrarily put itself into the foreground; it must wait for the user to +tell the parent shell to do this. If the subshell is continued again, +it should repeat the check and stop itself again if it is still not in +the foreground. 
+ +@cindex job control, enabling +Once the subshell has been placed into the foreground by its parent +shell, it can enable its own job control. It does this by calling +@code{setpgid} to put itself into its own process group, and then +calling @code{tcsetpgrp} to place this process group into the +foreground. + +When a shell enables job control, it should set itself to ignore all the +job control stop signals so that it doesn't accidentally stop itself. +You can do this by setting the action for all the stop signals to +@code{SIG_IGN}. + +A subshell that runs non-interactively cannot and should not support job +control. It must leave all processes it creates in the same process +group as the shell itself; this allows the non-interactive shell and its +child processes to be treated as a single job by the parent shell. This +is easy to do---just don't use any of the job control primitives---but +you must remember to make the shell do it. + + +Here is the initialization code for the sample shell that shows how to +do all of this. 
+ +@smallexample +/* @r{Keep track of attributes of the shell.} */ + +#include <signal.h> +#include <stdio.h> +#include <stdlib.h> +#include <sys/types.h> +#include <termios.h> +#include <unistd.h> + +pid_t shell_pgid; +struct termios shell_tmodes; +int shell_terminal; +int shell_is_interactive; + + +/* @r{Make sure the shell is running interactively as the foreground job} + @r{before proceeding.} */ + +void +init_shell () +@{ + + /* @r{See if we are running interactively.} */ + shell_terminal = STDIN_FILENO; + shell_is_interactive = isatty (shell_terminal); + + if (shell_is_interactive) + @{ + /* @r{Loop until we are in the foreground.} */ + while (tcgetpgrp (shell_terminal) != (shell_pgid = getpgrp ())) + kill (- shell_pgid, SIGTTIN); + + /* @r{Ignore interactive and job-control signals.} */ + signal (SIGINT, SIG_IGN); + signal (SIGQUIT, SIG_IGN); + signal (SIGTSTP, SIG_IGN); + signal (SIGTTIN, SIG_IGN); + signal (SIGTTOU, SIG_IGN); + signal (SIGCHLD, SIG_IGN); + + /* @r{Put ourselves in our own process group.} */ + shell_pgid = getpid (); + if (setpgid (shell_pgid, shell_pgid) < 0) + @{ + perror ("Couldn't put the shell in its own process group"); + exit (1); + @} + + /* @r{Grab control of the terminal.} */ + tcsetpgrp (shell_terminal, shell_pgid); + + /* @r{Save default terminal attributes for shell.} */ + tcgetattr (shell_terminal, &shell_tmodes); + @} +@} +@end smallexample + + +@node Launching Jobs, Foreground and Background, Initializing the Shell, Implementing a Shell +@subsection Launching Jobs +@cindex launching jobs + +Once the shell has taken responsibility for performing job control on +its controlling terminal, it can launch jobs in response to commands +typed by the user. + +To create the processes in a process group, you use the same @code{fork} +and @code{exec} functions described in @ref{Process Creation Concepts}. +Since there are multiple child processes involved, though, things are a +little more complicated and you must be careful to do things in the +right order.
Otherwise, nasty race conditions can result. + +You have two choices for how to structure the tree of parent-child +relationships among the processes. You can either make all the +processes in the process group be children of the shell process, or you +can make one process in group be the ancestor of all the other processes +in that group. The sample shell program presented in this chapter uses +the first approach because it makes bookkeeping somewhat simpler. + +@cindex process group leader +@cindex process group ID +As each process is forked, it should put itself in the new process group +by calling @code{setpgid}; see @ref{Process Group Functions}. The first +process in the new group becomes its @dfn{process group leader}, and its +process ID becomes the @dfn{process group ID} for the group. + +@cindex race conditions, relating to job control +The shell should also call @code{setpgid} to put each of its child +processes into the new process group. This is because there is a +potential timing problem: each child process must be put in the process +group before it begins executing a new program, and the shell depends on +having all the child processes in the group before it continues +executing. If both the child processes and the shell call +@code{setpgid}, this ensures that the right things happen no matter which +process gets to it first. + +If the job is being launched as a foreground job, the new process group +also needs to be put into the foreground on the controlling terminal +using @code{tcsetpgrp}. Again, this should be done by the shell as well +as by each of its child processes, to avoid race conditions. + +The next thing each child process should do is to reset its signal +actions. + +During initialization, the shell process set itself to ignore job +control signals; see @ref{Initializing the Shell}. As a result, any child +processes it creates also ignore these signals by inheritance. 
This is +definitely undesirable, so each child process should explicitly set the +actions for these signals back to @code{SIG_DFL} just after it is forked. + +Since shells follow this convention, applications can assume that they +inherit the correct handling of these signals from the parent process. +But every application has a responsibility not to mess up the handling +of stop signals. Applications that disable the normal interpretation of +the SUSP character should provide some other mechanism for the user to +stop the job. When the user invokes this mechanism, the program should +send a @code{SIGTSTP} signal to the process group of the process, not +just to the process itself. @xref{Signaling Another Process}. + +Finally, each child process should call @code{exec} in the normal way. +This is also the point at which redirection of the standard input and +output channels should be handled. @xref{Duplicating Descriptors}, +for an explanation of how to do this. + +Here is the function from the sample shell program that is responsible +for launching a program. The function is executed by each child process +immediately after it has been forked by the shell, and never returns. 
+ +@smallexample +void +launch_process (process *p, pid_t pgid, + int infile, int outfile, int errfile, + int foreground) +@{ + pid_t pid; + + if (shell_is_interactive) + @{ + /* @r{Put the process into the process group and give the process group} + @r{the terminal, if appropriate.} + @r{This has to be done both by the shell and in the individual} + @r{child processes because of potential race conditions.} */ + pid = getpid (); + if (pgid == 0) pgid = pid; + setpgid (pid, pgid); + if (foreground) + tcsetpgrp (shell_terminal, pgid); + + /* @r{Set the handling for job control signals back to the default.} */ + signal (SIGINT, SIG_DFL); + signal (SIGQUIT, SIG_DFL); + signal (SIGTSTP, SIG_DFL); + signal (SIGTTIN, SIG_DFL); + signal (SIGTTOU, SIG_DFL); + signal (SIGCHLD, SIG_DFL); + @} + + /* @r{Set the standard input/output channels of the new process.} */ + if (infile != STDIN_FILENO) + @{ + dup2 (infile, STDIN_FILENO); + close (infile); + @} + if (outfile != STDOUT_FILENO) + @{ + dup2 (outfile, STDOUT_FILENO); + close (outfile); + @} + if (errfile != STDERR_FILENO) + @{ + dup2 (errfile, STDERR_FILENO); + close (errfile); + @} + + /* @r{Exec the new process. Make sure we exit.} */ + execvp (p->argv[0], p->argv); + perror ("execvp"); + exit (1); +@} +@end smallexample + +If the shell is not running interactively, this function does not do +anything with process groups or signals. Remember that a shell not +performing job control must keep all of its subprocesses in the same +process group as the shell itself. + +Next, here is the function that actually launches a complete job. +After creating the child processes, this function calls some other +functions to put the newly created job into the foreground or background; +these are discussed in @ref{Foreground and Background}. 
+ +@smallexample +void +launch_job (job *j, int foreground) +@{ + process *p; + pid_t pid; + int mypipe[2], infile, outfile; + + infile = j->stdin; + for (p = j->first_process; p; p = p->next) + @{ + /* @r{Set up pipes, if necessary.} */ + if (p->next) + @{ + if (pipe (mypipe) < 0) + @{ + perror ("pipe"); + exit (1); + @} + outfile = mypipe[1]; + @} + else + outfile = j->stdout; + + /* @r{Fork the child processes.} */ + pid = fork (); + if (pid == 0) + /* @r{This is the child process.} */ + launch_process (p, j->pgid, infile, + outfile, j->stderr, foreground); + else if (pid < 0) + @{ + /* @r{The fork failed.} */ + perror ("fork"); + exit (1); + @} + else + @{ + /* @r{This is the parent process.} */ + p->pid = pid; + if (shell_is_interactive) + @{ + if (!j->pgid) + j->pgid = pid; + setpgid (pid, j->pgid); + @} + @} + + /* @r{Clean up after pipes.} */ + if (infile != j->stdin) + close (infile); + if (outfile != j->stdout) + close (outfile); + infile = mypipe[0]; + @} + + format_job_info (j, "launched"); + + if (!shell_is_interactive) + wait_for_job (j); + else if (foreground) + put_job_in_foreground (j, 0); + else + put_job_in_background (j, 0); +@} +@end smallexample + + +@node Foreground and Background, Stopped and Terminated Jobs, Launching Jobs, Implementing a Shell +@subsection Foreground and Background + +Now let's consider what actions must be taken by the shell when it +launches a job into the foreground, and how this differs from what +must be done when a background job is launched. + +@cindex foreground job, launching +When a foreground job is launched, the shell must first give it access +to the controlling terminal by calling @code{tcsetpgrp}. Then, the +shell should wait for processes in that process group to terminate or +stop. This is discussed in more detail in @ref{Stopped and Terminated +Jobs}. 
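
The sequence just described (give the job the terminal, then wait for it to terminate or stop) can be sketched in isolation as follows. This is a simplified illustration and not part of the sample shell: the function name @code{run_in_foreground} is hypothetical, it handles a single process rather than a pipeline, and it does not save or restore terminal modes. It assumes that ignoring @code{SIGTTOU} around the @code{tcsetpgrp} calls is acceptable for the caller.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of the sample shell: run ARGV as a
   foreground job in its own process group.  Return the status from
   waitpid, or -1 if fork or waitpid fails.  */
int
run_in_foreground (char *const argv[])
{
  int interactive = isatty (STDIN_FILENO);
  void (*old_ttou) (int);
  int status;
  pid_t pid;

  assert (argv != NULL && argv[0] != NULL);

  /* Ignore SIGTTOU so that tcsetpgrp works even while this process is
     not in the foreground process group.  The child inherits this.  */
  old_ttou = signal (SIGTTOU, SIG_IGN);

  pid = fork ();
  if (pid == 0)
    {
      if (interactive)
        {
          /* Done in the child as well as the parent to avoid a race.  */
          setpgid (0, 0);
          tcsetpgrp (STDIN_FILENO, getpid ());
        }
      /* Put the SIGTTOU action back before running the program.  */
      signal (SIGTTOU, old_ttou);
      execvp (argv[0], argv);
      _exit (127);
    }

  if (pid < 0)
    status = -1;
  else
    {
      if (interactive)
        {
          setpgid (pid, pid);
          tcsetpgrp (STDIN_FILENO, pid);
        }

      /* WUNTRACED makes waitpid return if the job stops as well.  */
      if (waitpid (pid, &status, WUNTRACED) < 0)
        status = -1;

      /* Take the terminal back for our own process group.  */
      if (interactive)
        tcsetpgrp (STDIN_FILENO, getpgrp ());
    }

  signal (SIGTTOU, old_ttou);
  return status;
}
```

A caller would inspect the result with @code{WIFEXITED}, @code{WIFSIGNALED} and @code{WIFSTOPPED}. When standard input is not a terminal, the sketch skips all process group and terminal handling, in line with the rule that a shell not performing job control keeps its subprocesses in its own process group.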
+ +When all of the processes in the group have either completed or stopped, +the shell should regain control of the terminal for its own process +group by calling @code{tcsetpgrp} again. Since stop signals caused by +I/O from a background process or a SUSP character typed by the user +are sent to the process group, normally all the processes in the job +stop together. + +The foreground job may have left the terminal in a strange state, so the +shell should restore its own saved terminal modes before continuing. In +case the job has merely been stopped, the shell should first save the +current terminal modes so that it can restore them later if the job is +continued. The functions for dealing with terminal modes are +@code{tcgetattr} and @code{tcsetattr}; these are described in +@ref{Terminal Modes}. + +Here is the sample shell's function for doing all of this. + +@smallexample +@group +/* @r{Put job @var{j} in the foreground. If @var{cont} is nonzero,} + @r{restore the saved terminal modes and send the process group a} + @r{@code{SIGCONT} signal to wake it up before we block.} */ + +void +put_job_in_foreground (job *j, int cont) +@{ + /* @r{Put the job into the foreground.} */ + tcsetpgrp (shell_terminal, j->pgid); +@end group + +@group + /* @r{Send the job a continue signal, if necessary.} */ + if (cont) + @{ + tcsetattr (shell_terminal, TCSADRAIN, &j->tmodes); + if (kill (- j->pgid, SIGCONT) < 0) + perror ("kill (SIGCONT)"); + @} +@end group + + /* @r{Wait for it to report.} */ + wait_for_job (j); + + /* @r{Put the shell back in the foreground.} */ + tcsetpgrp (shell_terminal, shell_pgid); + +@group + /* @r{Save the job's terminal modes, then restore the shell's.} */ + tcgetattr (shell_terminal, &j->tmodes); + tcsetattr (shell_terminal, TCSADRAIN, &shell_tmodes); +@} +@end group +@end smallexample + +@cindex background job, launching +If the process group is launched as a background job, the shell should +remain in the foreground itself and continue to read commands from +the
terminal. + +In the sample shell, there is not much that needs to be done to put +a job into the background. Here is the function it uses: + +@smallexample +/* @r{Put a job in the background. If the cont argument is true, send} + @r{the process group a @code{SIGCONT} signal to wake it up.} */ + +void +put_job_in_background (job *j, int cont) +@{ + /* @r{Send the job a continue signal, if necessary.} */ + if (cont) + if (kill (-j->pgid, SIGCONT) < 0) + perror ("kill (SIGCONT)"); +@} +@end smallexample + + +@node Stopped and Terminated Jobs, Continuing Stopped Jobs, Foreground and Background, Implementing a Shell +@subsection Stopped and Terminated Jobs + +@cindex stopped jobs, detecting +@cindex terminated jobs, detecting +When a foreground process is launched, the shell must block until all of +the processes in that job have either terminated or stopped. It can do +this by calling the @code{waitpid} function; see @ref{Process +Completion}. Use the @code{WUNTRACED} option so that status is reported +for processes that stop as well as processes that terminate. + +The shell must also check on the status of background jobs so that it +can report terminated and stopped jobs to the user; this can be done by +calling @code{waitpid} with the @code{WNOHANG} option. A good place to +put such a check for terminated and stopped jobs is just before +prompting for a new command. + +@cindex @code{SIGCHLD}, handling of +The shell can also receive asynchronous notification that there is +status information available for a child process by establishing a +handler for @code{SIGCHLD} signals. @xref{Signal Handling}. + +In the sample shell program, the @code{SIGCHLD} signal is normally +ignored. This is to avoid reentrancy problems involving the global data +structures the shell manipulates. But at specific times when the shell +is not using these data structures---such as when it is waiting for +input on the terminal---it makes sense to enable a handler for +@code{SIGCHLD}.
The same function that is used to do the synchronous +status checks (@code{do_job_notification}, in this case) can also be +called from within this handler. + +Here are the parts of the sample shell program that deal with checking +the status of jobs and reporting the information to the user. + +@smallexample +@group +/* @r{Store the status of the process @var{pid} that was returned by waitpid.} + @r{Return 0 if all went well, nonzero otherwise.} */ + +int +mark_process_status (pid_t pid, int status) +@{ + job *j; + process *p; +@end group + +@group + if (pid > 0) + @{ + /* @r{Update the record for the process.} */ + for (j = first_job; j; j = j->next) + for (p = j->first_process; p; p = p->next) + if (p->pid == pid) + @{ + p->status = status; + if (WIFSTOPPED (status)) + p->stopped = 1; + else + @{ + p->completed = 1; + if (WIFSIGNALED (status)) + fprintf (stderr, "%d: Terminated by signal %d.\n", + (int) pid, WTERMSIG (p->status)); + @} + return 0; + @} + fprintf (stderr, "No child process %d.\n", pid); + return -1; + @} +@end group +@group + else if (pid == 0 || errno == ECHILD) + /* @r{No processes ready to report.} */ + return -1; + else @{ + /* @r{Other weird errors.} */ + perror ("waitpid"); + return -1; + @} +@} +@end group + +@group +/* @r{Check for processes that have status information available,} + @r{without blocking.} */ + +void +update_status (void) +@{ + int status; + pid_t pid; + + do + pid = waitpid (WAIT_ANY, &status, WUNTRACED|WNOHANG); + while (!mark_process_status (pid, status)); +@} +@end group + +@group +/* @r{Check for processes that have status information available,} + @r{blocking until all processes in the given job have reported.} */ + +void +wait_for_job (job *j) +@{ + int status; + pid_t pid; + + do + pid = waitpid (WAIT_ANY, &status, WUNTRACED); + while (!mark_process_status (pid, status) + && !job_is_stopped (j) + && !job_is_completed (j)); +@} +@end group + +@group +/* @r{Format information about job status for the user to look 
at.} */ + +void +format_job_info (job *j, const char *status) +@{ + fprintf (stderr, "%ld (%s): %s\n", (long)j->pgid, status, j->command); +@} +@end group + +@group +/* @r{Notify the user about stopped or terminated jobs.} + @r{Delete terminated jobs from the active job list.} */ + +void +do_job_notification (void) +@{ + job *j, *jlast, *jnext; + process *p; + + /* @r{Update status information for child processes.} */ + update_status (); + + jlast = NULL; + for (j = first_job; j; j = jnext) + @{ + jnext = j->next; + + /* @r{If all processes have completed, tell the user the job has} + @r{completed and delete it from the list of active jobs.} */ + if (job_is_completed (j)) @{ + format_job_info (j, "completed"); + if (jlast) + jlast->next = jnext; + else + first_job = jnext; + free_job (j); + @} + + /* @r{Notify the user about stopped jobs,} + @r{marking them so that we won't do this more than once.} */ + else if (job_is_stopped (j) && !j->notified) @{ + format_job_info (j, "stopped"); + j->notified = 1; + jlast = j; + @} + + /* @r{Don't say anything about jobs that are still running.} */ + else + jlast = j; + @} +@} +@end group +@end smallexample + +@node Continuing Stopped Jobs, Missing Pieces, Stopped and Terminated Jobs, Implementing a Shell +@subsection Continuing Stopped Jobs + +@cindex stopped jobs, continuing +The shell can continue a stopped job by sending a @code{SIGCONT} signal +to its process group. If the job is being continued in the foreground, +the shell should first invoke @code{tcsetpgrp} to give the job access to +the terminal, and restore the saved terminal settings. After continuing +a job in the foreground, the shell should wait for the job to stop or +complete, as if the job had just been launched in the foreground. + +The sample shell program handles both newly created and continued jobs +with the same pair of functions, @w{@code{put_job_in_foreground}} and +@w{@code{put_job_in_background}}. 
The definitions of these functions +were given in @ref{Foreground and Background}. When continuing a +stopped job, a nonzero value is passed as the @var{cont} argument to +ensure that the @code{SIGCONT} signal is sent and the terminal modes +reset, as appropriate. + +This leaves only a function for updating the shell's internal bookkeeping +about the job being continued: + +@smallexample +@group +/* @r{Mark a stopped job J as running again.} */ + +void +mark_job_as_running (job *j) +@{ + process *p; + + for (p = j->first_process; p; p = p->next) + p->stopped = 0; + j->notified = 0; +@} +@end group + +@group +/* @r{Continue the job J.} */ + +void +continue_job (job *j, int foreground) +@{ + mark_job_as_running (j); + if (foreground) + put_job_in_foreground (j, 1); + else + put_job_in_background (j, 1); +@} +@end group +@end smallexample + +@node Missing Pieces, , Continuing Stopped Jobs, Implementing a Shell +@subsection The Missing Pieces + +The code extracts for the sample shell included in this chapter are only +a part of the entire shell program. In particular, nothing at all has +been said about how @code{job} and @code{process} data structures are +allocated and initialized. + +Most real shells provide a complex user interface that has support for +a command language; variables; abbreviations, substitutions, and pattern +matching on file names; and the like. All of this is far too complicated +to explain here! Instead, we have concentrated on showing how to +implement the core process creation and job control functions that can +be called from such a shell. + +Here is a table summarizing the major entry points we have presented: + +@table @code +@item void init_shell (void) +Initialize the shell's internal state. @xref{Initializing the +Shell}. + +@item void launch_job (job *@var{j}, int @var{foreground}) +Launch the job @var{j} as either a foreground or background job. +@xref{Launching Jobs}.
+ +@item void do_job_notification (void) +Check for and report any jobs that have terminated or stopped. Can be +called synchronously or within a handler for @code{SIGCHLD} signals. +@xref{Stopped and Terminated Jobs}. + +@item void continue_job (job *@var{j}, int @var{foreground}) +Continue the job @var{j}. @xref{Continuing Stopped Jobs}. +@end table + +Of course, a real shell would also want to provide other functions for +managing jobs. For example, it would be useful to have commands to list +all active jobs or to send a signal (such as @code{SIGKILL}) to a job. + + +@node Functions for Job Control, , Implementing a Shell, Job Control +@section Functions for Job Control +@cindex process group functions +@cindex job control functions + +This section contains detailed descriptions of the functions relating +to job control. + +@menu +* Identifying the Terminal:: Determining the controlling terminal's name. +* Process Group Functions:: Functions for manipulating process groups. +* Terminal Access Functions:: Functions for controlling terminal access. +@end menu + + +@node Identifying the Terminal, Process Group Functions, , Functions for Job Control +@subsection Identifying the Controlling Terminal +@cindex controlling terminal, determining + +You can use the @code{ctermid} function to get a file name that you can +use to open the controlling terminal. In the GNU library, it returns +the same string all the time: @code{"/dev/tty"}. That is a special +``magic'' file name that refers to the controlling terminal of the +current process (if it has one). To find the name of the specific +terminal device, use @code{ttyname}; @pxref{Is It a Terminal}. + +The function @code{ctermid} is declared in the header file +@file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment POSIX.1 +@deftypefun {char *} ctermid (char *@var{string}) +The @code{ctermid} function returns a string containing the file name of +the controlling terminal for the current process. 
If @var{string} is +not a null pointer, it should be an array that can hold at least +@code{L_ctermid} characters; the string is returned in this array. +Otherwise, a pointer to a string in a static area is returned, which +might get overwritten on subsequent calls to this function. + +An empty string is returned if the file name cannot be determined for +any reason. Even if a file name is returned, access to the file it +represents is not guaranteed. +@end deftypefun + +@comment stdio.h +@comment POSIX.1 +@deftypevr Macro int L_ctermid +The value of this macro is an integer constant expression that +represents the size of a string large enough to hold the file name +returned by @code{ctermid}. +@end deftypevr + +See also the @code{isatty} and @code{ttyname} functions, in +@ref{Is It a Terminal}. + + +@node Process Group Functions, Terminal Access Functions, Identifying the Terminal, Functions for Job Control +@subsection Process Group Functions + +Here are descriptions of the functions for manipulating process groups. +Your program should include the header files @file{sys/types.h} and +@file{unistd.h} to use these functions. +@pindex unistd.h +@pindex sys/types.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t setsid (void) +The @code{setsid} function creates a new session. The calling process +becomes the session leader, and is put in a new process group whose +process group ID is the same as the process ID of that process. There +are initially no other processes in the new process group, and no other +process groups in the new session. + +This function also makes the calling process have no controlling terminal. + +The @code{setsid} function returns the new process group ID of the +calling process if successful. A return value of @code{-1} indicates an +error. 
The following @code{errno} error conditions are defined for this +function: + +@table @code +@item EPERM +The calling process is already a process group leader, or there is +already another process group around that has the same process group ID. +@end table +@end deftypefun + +The @code{getpgrp} function has two definitions: one derived from BSD +Unix, and one from the POSIX.1 standard. The feature test macros you +have selected (@pxref{Feature Test Macros}) determine which definition +you get. Specifically, you get the BSD version if you define +@code{_BSD_SOURCE}; otherwise, you get the POSIX version if you define +@code{_POSIX_SOURCE} or @code{_GNU_SOURCE}. Programs written for old +BSD systems will not include @file{unistd.h}, which defines +@code{getpgrp} specially under @code{_BSD_SOURCE}. You must link such +programs with the @code{-lbsd-compat} option to get the BSD definition.@refill +@pindex -lbsd-compat +@pindex bsd-compat +@cindex BSD compatibility library + +@comment unistd.h +@comment POSIX.1 +@deftypefn {POSIX.1 Function} pid_t getpgrp (void) +The POSIX.1 definition of @code{getpgrp} returns the process group ID of +the calling process. +@end deftypefn + +@comment unistd.h +@comment BSD +@deftypefn {BSD Function} pid_t getpgrp (pid_t @var{pid}) +The BSD definition of @code{getpgrp} returns the process group ID of the +process @var{pid}. You can supply a value of @code{0} for the @var{pid} +argument to get information about the calling process. +@end deftypefn + +@comment unistd.h +@comment POSIX.1 +@deftypefun int setpgid (pid_t @var{pid}, pid_t @var{pgid}) +The @code{setpgid} function puts the process @var{pid} into the process +group @var{pgid}. As a special case, either @var{pid} or @var{pgid} can +be zero to indicate the process ID of the calling process. + +This function fails on a system that does not support job control. +@xref{Job Control is Optional}, for more information. + +If the operation is successful, @code{setpgid} returns zero. 
Otherwise +it returns @code{-1}. The following @code{errno} error conditions are +defined for this function: + +@table @code +@item EACCES +The child process named by @var{pid} has executed an @code{exec} +function since it was forked. + +@item EINVAL +The value of the @var{pgid} is not valid. + +@item ENOSYS +The system doesn't support job control. + +@item EPERM +The process indicated by the @var{pid} argument is a session leader, +or is not in the same session as the calling process, or the value of +the @var{pgid} argument doesn't match a process group ID in the same +session as the calling process. + +@item ESRCH +The process indicated by the @var{pid} argument is not the calling +process or a child of the calling process. +@end table +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int setpgrp (pid_t @var{pid}, pid_t @var{pgid}) +This is the BSD Unix name for @code{setpgid}. Both functions do exactly +the same thing. +@end deftypefun + + +@node Terminal Access Functions, , Process Group Functions, Functions for Job Control +@subsection Functions for Controlling Terminal Access + +These are the functions for reading or setting the foreground +process group of a terminal. You should include the header files +@file{sys/types.h} and @file{unistd.h} in your application to use +these functions. +@pindex unistd.h +@pindex sys/types.h + +Although these functions take a file descriptor argument to specify +the terminal device, the foreground job is associated with the terminal +file itself and not a particular open file descriptor. + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t tcgetpgrp (int @var{filedes}) +This function returns the process group ID of the foreground process +group associated with the terminal open on descriptor @var{filedes}. + +If there is no foreground process group, the return value is a number +greater than @code{1} that does not match the process group ID of any +existing process group. 
This can happen if all of the processes in the +job that was formerly the foreground job have terminated, and no other +job has yet been moved into the foreground. + +In case of an error, a value of @code{-1} is returned. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item ENOSYS +The system doesn't support job control. + +@item ENOTTY +The terminal file associated with the @var{filedes} argument isn't the +controlling terminal of the calling process. +@end table +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int tcsetpgrp (int @var{filedes}, pid_t @var{pgid}) +This function is used to set a terminal's foreground process group ID. +The argument @var{filedes} is a descriptor which specifies the terminal; +@var{pgid} specifies the process group. The calling process must be a +member of the same session as @var{pgid} and must have the same +controlling terminal. + +For terminal access purposes, this function is treated as output. If it +is called from a background process on its controlling terminal, +normally all processes in the process group are sent a @code{SIGTTOU} +signal. The exception is if the calling process itself is ignoring or +blocking @code{SIGTTOU} signals, in which case the operation is +performed and no signal is sent. + +If successful, @code{tcsetpgrp} returns @code{0}. A return value of +@code{-1} indicates an error. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINVAL +The @var{pgid} argument is not valid. + +@item ENOSYS +The system doesn't support job control. + +@item ENOTTY +The @var{filedes} isn't the controlling terminal of the calling process. + +@item EPERM +The @var{pgid} isn't a process group in the same session as the calling +process. 
+@end table +@end deftypefun diff --git a/manual/lang.texi b/manual/lang.texi new file mode 100644 index 0000000000..66d41846d2 --- /dev/null +++ b/manual/lang.texi @@ -0,0 +1,1213 @@ +@node Language Features, Library Summary, System Configuration, Top +@appendix C Language Facilities in the Library + +Some of the facilities implemented by the C library really should be +thought of as parts of the C language itself. These facilities ought to +be documented in the C Language Manual, not in the library manual; but +since we don't have the language manual yet, and documentation for these +features has been written, we are publishing it here. + +@menu +* Consistency Checking:: Using @code{assert} to abort if + something ``impossible'' happens. +* Variadic Functions:: Defining functions with varying numbers + of args. +* Null Pointer Constant:: The macro @code{NULL}. +* Important Data Types:: Data types for object sizes. +* Data Type Measurements:: Parameters of data type representations. +@end menu + +@node Consistency Checking +@section Explicitly Checking Internal Consistency +@cindex consistency checking +@cindex impossible events +@cindex assertions + +When you're writing a program, it's often a good idea to put in checks +at strategic places for ``impossible'' errors or violations of basic +assumptions. These checks are helpful in debugging problems due to +misunderstandings between different parts of the program. + +@pindex assert.h +The @code{assert} macro, defined in the header file @file{assert.h}, +provides a convenient way to abort the program while printing a message +about where in the program the error was detected. + +@vindex NDEBUG +Once you think your program is debugged, you can disable the error +checks performed by the @code{assert} macro by recompiling with the +macro @code{NDEBUG} defined. This means you don't actually have to +change the program source code to disable these checks. 
+ +But disabling these consistency checks is undesirable unless they make +the program significantly slower. All else being equal, more error +checking is good no matter who is running the program. A wise user +would rather have a program crash, visibly, than have it return nonsense +without indicating anything might be wrong. + +@comment assert.h +@comment ANSI +@deftypefn Macro void assert (int @var{expression}) +Verify the programmer's belief that @var{expression} should be nonzero +at this point in the program. + +If @code{NDEBUG} is not defined, @code{assert} tests the value of +@var{expression}. If it is false (zero), @code{assert} aborts the +program (@pxref{Aborting a Program}) after printing a message of the +form: + +@smallexample +@file{@var{file}}:@var{linenum}: Assertion `@var{expression}' failed. +@end smallexample + +@noindent +on the standard error stream @code{stderr} (@pxref{Standard Streams}). +The filename and line number are taken from the C preprocessor macros +@code{__FILE__} and @code{__LINE__} and specify where the call to +@code{assert} was written. + +If the preprocessor macro @code{NDEBUG} is defined at the point where +@file{assert.h} is included, the @code{assert} macro is defined to do +absolutely nothing. + +@strong{Warning:} Even the argument expression @var{expression} is not +evaluated if @code{NDEBUG} is in effect. So never use @code{assert} +with arguments that involve side effects. For example, @code{assert +(++i > 0);} is a bad idea, because @code{i} will not be incremented if +@code{NDEBUG} is defined. +@end deftypefn + +@strong{Usage note:} The @code{assert} facility is designed for +detecting @emph{internal inconsistency}; it is not suitable for +reporting invalid input or improper usage by @emph{the user} of the +program. 
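
As a rough illustration of this distinction, the sketch below uses @code{assert} only for a condition that the program itself is supposed to guarantee, and reports bad user input with an ordinary error message and return value. Both function names (@code{average} and @code{parse_count}) and the upper limit of 1000 are hypothetical choices made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical internal helper: callers inside the program promise
   that VALUES is non-null and COUNT is positive, so a violation is a
   bug and assert is appropriate.  */
double
average (const double *values, int count)
{
  double sum = 0.0;
  int i;

  assert (values != NULL);
  assert (count > 0);

  for (i = 0; i < count; i++)
    sum += values[i];
  return sum / count;
}

/* Hypothetical input check: TEXT comes from the user, so bad input is
   reported with a message and an error return, not with assert.
   Return 0 on success, -1 on invalid input.  */
int
parse_count (const char *text, int *result)
{
  char *end;
  long value;

  errno = 0;
  value = strtol (text, &end, 10);
  if (errno != 0 || end == text || *end != '\0'
      || value <= 0 || value > 1000)
    {
      fprintf (stderr, "invalid count: %s\n", text);
      return -1;
    }
  *result = (int) value;
  return 0;
}
```

Compiling with @code{NDEBUG} defined removes the checks in @code{average} but leaves the validation in @code{parse_count} untouched, which is the behavior you want: input checking must never depend on whether assertions are enabled.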
+ +The information in the diagnostic messages printed by the @code{assert} +macro is intended to help you, the programmer, track down the cause of a +bug, but is not really useful for telling a user of your program why his +or her input was invalid or why a command could not be carried out. So +you can't use @code{assert} to print the error messages for these +eventualities. + +What's more, your program should not abort when given invalid input, as +@code{assert} would do---it should exit with nonzero status (@pxref{Exit +Status}) after printing its error messages, or perhaps read another +command or move on to the next input file. + +@xref{Error Messages}, for information on printing error messages for +problems that @emph{do not} represent bugs in the program. + + +@node Variadic Functions +@section Variadic Functions +@cindex variable number of arguments +@cindex variadic functions +@cindex optional arguments + +ANSI C defines a syntax for declaring a function to take a variable +number or type of arguments. (Such functions are referred to as +@dfn{varargs functions} or @dfn{variadic functions}.) However, the +language itself provides no mechanism for such functions to access their +non-required arguments; instead, you use the variable arguments macros +defined in @file{stdarg.h}. + +This section describes how to declare variadic functions, how to write +them, and how to call them properly. + +@strong{Compatibility Note:} Many older C dialects provide a similar, +but incompatible, mechanism for defining functions with variable numbers +of arguments, using @file{varargs.h}. + +@menu +* Why Variadic:: Reasons for making functions take + variable arguments. +* How Variadic:: How to define and call variadic functions. +* Variadic Example:: A complete example. +@end menu + +@node Why Variadic +@subsection Why Variadic Functions are Used + +Ordinary C functions take a fixed number of arguments. When you define +a function, you specify the data type for each argument. 
Every call to +the function should supply the expected number of arguments, with types +that can be converted to the specified ones. Thus, if the function +@samp{foo} is declared with @code{int foo (int, char *);} then you must +call it with two arguments, a number (any kind will do) and a string +pointer. + +But some functions perform operations that can meaningfully accept an +unlimited number of arguments. + +In some cases a function can handle any number of values by operating on +all of them as a block. For example, consider a function that allocates +a one-dimensional array with @code{malloc} to hold a specified set of +values. This operation makes sense for any number of values, as long as +the length of the array corresponds to that number. Without facilities +for variable arguments, you would have to define a separate function for +each possible array size. + +The library function @code{printf} (@pxref{Formatted Output}) is an +example of another class of function where variable arguments are +useful. This function prints its arguments (which can vary in type as +well as number) under the control of a format template string. + +These are good reasons to define a @dfn{variadic} function which can +handle as many arguments as the caller chooses to pass. + +Some functions such as @code{open} take a fixed set of arguments, but +occasionally ignore the last few. Strict adherence to ANSI C requires +these functions to be defined as variadic; in practice, however, the GNU +C compiler and most other C compilers let you define such a function to +take a fixed set of arguments---the most it can ever use---and then only +@emph{declare} the function as variadic (or not declare its arguments +at all!). 
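
To make the @code{printf} case concrete, here is a minimal sketch of a function in the same class: it accepts a template and any number of further arguments, and hands them on to the library's own formatting code. The name @code{format_warning} and the @samp{warning: } prefix are hypothetical. The sketch also assumes that @code{snprintf} and @code{vsnprintf} are available; these come from later revisions of the C standard than the ANSI C described here. The macros it uses are explained in @ref{How Variadic}.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format a message into BUF (of size SIZE),
   prefixed with "warning: ".  Return the number of characters the
   complete message needs, as snprintf does, or -1 on error.  */
int
format_warning (char *buf, size_t size, const char *format, ...)
{
  va_list ap;
  int prefix_len, body_len;

  assert (format != NULL);

  prefix_len = snprintf (buf, size, "warning: ");
  if (prefix_len < 0)
    return -1;

  /* The optional arguments are passed on as a va_list.  */
  va_start (ap, format);
  if ((size_t) prefix_len < size)
    body_len = vsnprintf (buf + prefix_len, size - prefix_len, format, ap);
  else
    body_len = vsnprintf (NULL, 0, format, ap);
  va_end (ap);

  if (body_len < 0)
    return -1;
  return prefix_len + body_len;
}
```

A call such as @code{format_warning (buf, sizeof buf, "%s has %d errors", name, count)} works with any mix of argument types that the template describes, which a function with a fixed argument list could not offer.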
+ +@node How Variadic +@subsection How Variadic Functions are Defined and Used + +Defining and using a variadic function involves three steps: + +@itemize @bullet +@item +@emph{Define} the function as variadic, using an ellipsis +(@samp{@dots{}}) in the argument list, and using special macros to +access the variable arguments. @xref{Receiving Arguments}. + +@item +@emph{Declare} the function as variadic, using a prototype with an +ellipsis (@samp{@dots{}}), in all the files which call it. +@xref{Variadic Prototypes}. + +@item +@emph{Call} the function by writing the fixed arguments followed by the +additional variable arguments. @xref{Calling Variadics}. +@end itemize + +@menu +* Variadic Prototypes:: How to make a prototype for a function + with variable arguments. +* Receiving Arguments:: Steps you must follow to access the + optional argument values. +* How Many Arguments:: How to decide whether there are more arguments. +* Calling Variadics:: Things you need to know about calling + variable arguments functions. +* Argument Macros:: Detailed specification of the macros + for accessing variable arguments. +* Old Varargs:: The pre-ANSI way of defining variadic functions. +@end menu + +@node Variadic Prototypes +@subsubsection Syntax for Variable Arguments +@cindex function prototypes (variadic) +@cindex prototypes for variadic functions +@cindex variadic function prototypes + +A function that accepts a variable number of arguments must be declared +with a prototype that says so. You write the fixed arguments as usual, +and then tack on @samp{@dots{}} to indicate the possibility of +additional arguments. The syntax of ANSI C requires at least one fixed +argument before the @samp{@dots{}}. For example, + +@smallexample +int +func (const char *a, int b, @dots{}) +@{ + @dots{} +@} +@end smallexample + +@noindent +outlines a definition of a function @code{func} which returns an +@code{int} and takes two required arguments, a @code{const char *} and +an @code{int}. 
These are followed by any number of anonymous +arguments. + +@strong{Portability note:} For some C compilers, the last required +argument must not be declared @code{register} in the function +definition. Furthermore, this argument's type must be +@dfn{self-promoting}: that is, the default promotions must not change +its type. This rules out array and function types, as well as +@code{float}, @code{char} (whether signed or not) and @w{@code{short int}} +(whether signed or not). This is actually an ANSI C requirement. + +@node Receiving Arguments +@subsubsection Receiving the Argument Values +@cindex variadic function argument access +@cindex arguments (variadic functions) + +Ordinary fixed arguments have individual names, and you can use these +names to access their values. But optional arguments have no +names---nothing but @samp{@dots{}}. How can you access them? + +@pindex stdarg.h +The only way to access them is sequentially, in the order they were +written, and you must use special macros from @file{stdarg.h} in the +following three-step process: + +@enumerate +@item +You initialize an argument pointer variable of type @code{va_list} using +@code{va_start}. Once initialized, the argument pointer points to the +first optional argument. + +@item +You access the optional arguments by successive calls to @code{va_arg}. +The first call to @code{va_arg} gives you the first optional argument, +the next call gives you the second, and so on. + +You can stop at any time if you wish to ignore any remaining optional +arguments. It is perfectly all right for a function to access fewer +arguments than were supplied in the call, but you will get garbage +values if you try to access too many arguments. + +@item +You indicate that you are finished with the argument pointer variable by +calling @code{va_end}. + +(In practice, with most C compilers, calling @code{va_end} does nothing +and you do not really need to call it. This is always true in the GNU C +compiler.
But you might as well call @code{va_end} just in case your +program is someday compiled with a peculiar compiler.) +@end enumerate + +@xref{Argument Macros}, for the full definitions of @code{va_start}, +@code{va_arg} and @code{va_end}. + +Steps 1 and 3 must be performed in the function that accepts the +optional arguments. However, you can pass the @code{va_list} variable +as an argument to another function and perform all or part of step 2 +there. + +You can perform the entire sequence of the three steps multiple times +within a single function invocation. If you want to ignore the optional +arguments, you can do these steps zero times. + +You can have more than one argument pointer variable if you like. You +can initialize each variable with @code{va_start} when you wish, and +then you can fetch arguments with each argument pointer as you wish. +Each argument pointer variable will sequence through the same set of +argument values, but at its own pace. + +@strong{Portability note:} With some compilers, once you pass an +argument pointer value to a subroutine, you must not keep using the same +argument pointer value after that subroutine returns. For full +portability, you should just pass it to @code{va_end}. This is actually +an ANSI C requirement, but most ANSI C compilers work happily +regardless. + +@node How Many Arguments +@subsubsection How Many Arguments Were Supplied +@cindex number of arguments passed +@cindex how many arguments +@cindex arguments, how many + +There is no general way for a function to determine the number and type +of the optional arguments it was called with. So whoever designs the +function typically designs a convention for the caller to tell it how +many arguments it has, and what kind. It is up to you to define an +appropriate calling convention for each variadic function, and write all +calls accordingly. + +One kind of calling convention is to pass the number of optional +arguments as one of the fixed arguments. 
This convention works provided +all of the optional arguments are of the same type. + +A similar alternative is to have one of the required arguments be a bit +mask, with a bit for each possible purpose for which an optional +argument might be supplied. You would test the bits in a predefined +sequence; if the bit is set, fetch the value of the next argument, +otherwise use a default value. + +A required argument can be used as a pattern to specify both the number +and types of the optional arguments. The format string argument to +@code{printf} is one example of this (@pxref{Formatted Output Functions}). + +Another possibility is to pass an ``end marker'' value as the last +optional argument. For example, for a function that manipulates an +arbitrary number of pointer arguments, a null pointer might indicate the +end of the argument list. (This assumes that a null pointer isn't +otherwise meaningful to the function.) The @code{execl} function works +in just this way; see @ref{Executing a File}. + + +@node Calling Variadics +@subsubsection Calling Variadic Functions +@cindex variadic functions, calling +@cindex calling variadic functions +@cindex declaring variadic functions + +You don't have to write anything special when you call a variadic function. +Just write the arguments (required arguments, followed by optional ones) +inside parentheses, separated by commas, as usual. But you should prepare +by declaring the function with a prototype, and you must know how the +argument values are converted. + +In principle, functions that are @emph{defined} to be variadic must also +be @emph{declared} to be variadic using a function prototype whenever +you call them. (@xref{Variadic Prototypes}, for how.) This is because +some C compilers use a different calling convention to pass the same set +of argument values to a function depending on whether that function +takes variable arguments or fixed arguments. 
+ +In practice, the GNU C compiler always passes a given set of argument +types in the same way regardless of whether they are optional or +required. So, as long as the argument types are self-promoting, you can +safely omit declaring them. Usually it is a good idea to declare the +argument types for variadic functions, and indeed for all functions. +But there are a few functions which it is extremely convenient not to +have to declare as variadic---for example, @code{open} and +@code{printf}. + +@cindex default argument promotions +@cindex argument promotion +Since the prototype doesn't specify types for optional arguments, in a +call to a variadic function the @dfn{default argument promotions} are +performed on the optional argument values. This means that objects of +type @code{char} or @w{@code{short int}} (whether signed or not) are +promoted to either @code{int} or @w{@code{unsigned int}}, as +appropriate; and that objects of type @code{float} are promoted to type +@code{double}. So, if the caller passes a @code{char} as an optional +argument, it is promoted to an @code{int}, and the function should get +it with @code{va_arg (@var{ap}, int)}. + +Conversion of the required arguments is controlled by the function +prototype in the usual way: the argument expression is converted to the +declared argument type as if it were being assigned to a variable of +that type. + +@node Argument Macros +@subsubsection Argument Access Macros + +Here are descriptions of the macros used to retrieve variable arguments. +These macros are defined in the header file @file{stdarg.h}. +@pindex stdarg.h + +@comment stdarg.h +@comment ANSI +@deftp {Data Type} va_list +The type @code{va_list} is used for argument pointer variables.
+@end deftp + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} void va_start (va_list @var{ap}, @var{last-required}) +This macro initializes the argument pointer variable @var{ap} to point +to the first of the optional arguments of the current function; +@var{last-required} must be the last required argument to the function. + +@xref{Old Varargs}, for an alternate definition of @code{va_start} +found in the header file @file{varargs.h}. +@end deftypefn + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} @var{type} va_arg (va_list @var{ap}, @var{type}) +The @code{va_arg} macro returns the value of the next optional argument, +and modifies the value of @var{ap} to point to the subsequent argument. +Thus, successive uses of @code{va_arg} return successive optional +arguments. + +The type of the value returned by @code{va_arg} is @var{type} as +specified in the call. @var{type} must be a self-promoting type (not +@code{char} or @code{short int} or @code{float}) that matches the type +of the actual argument. +@end deftypefn + +@comment stdarg.h +@comment ANSI +@deftypefn {Macro} void va_end (va_list @var{ap}) +This ends the use of @var{ap}. After a @code{va_end} call, further +@code{va_arg} calls with the same @var{ap} may not work. You should invoke +@code{va_end} before returning from the function in which @code{va_start} +was invoked with the same @var{ap} argument. + +In the GNU C library, @code{va_end} does nothing, and you need not ever +use it except for reasons of portability. +@refill +@end deftypefn + +@node Variadic Example +@subsection Example of a Variadic Function + +Here is a complete sample function that accepts a variable number of +arguments. The first argument to the function is the count of remaining +arguments, which are added up and the result returned. While trivial, +this function is sufficient to illustrate how to use the variable +arguments facility. + +@comment Yes, this example has been tested. 
+@smallexample +@include add.c.texi +@end smallexample + +@node Old Varargs +@subsubsection Old-Style Variadic Functions + +@pindex varargs.h +Before ANSI C, programmers used a slightly different facility for +writing variadic functions. The GNU C compiler still supports it; +currently, it is more portable than the ANSI C facility, since support +for ANSI C is still not universal. The header file which defines the +old-fashioned variadic facility is called @file{varargs.h}. + +Using @file{varargs.h} is almost the same as using @file{stdarg.h}. +There is no difference in how you call a variadic function; +see @ref{Calling Variadics}. The only difference is in how you define +such functions. First of all, you must use old-style non-prototype syntax, like +this: + +@smallexample +tree +build (va_alist) + va_dcl +@{ +@end smallexample + +Secondly, you must give @code{va_start} just one argument, like this: + +@smallexample + va_list p; + va_start (p); +@end smallexample + +These are the special macros used for defining old-style variadic +functions: + +@comment varargs.h +@comment Unix +@deffn Macro va_alist +This macro stands for the argument name list required in a variadic +function. +@end deffn + +@comment varargs.h +@comment Unix +@deffn Macro va_dcl +This macro declares the implicit argument or arguments for a variadic +function. +@end deffn + +@comment varargs.h +@comment Unix +@deftypefn {Macro} void va_start (va_list @var{ap}) +This macro, as defined in @file{varargs.h}, initializes the argument +pointer variable @var{ap} to point to the first argument of the current +function. +@end deftypefn + +The other argument macros, @code{va_arg} and @code{va_end}, are the same +in @file{varargs.h} as in @file{stdarg.h}; see @ref{Argument Macros} for +details. + +It does not work to include both @file{varargs.h} and @file{stdarg.h} in +the same compilation; they define @code{va_start} in conflicting ways.
+ +@node Null Pointer Constant +@section Null Pointer Constant +@cindex null pointer constant + +The null pointer constant is guaranteed not to point to any real object. +You can assign it to any pointer variable since it has type @code{void +*}. The preferred way to write a null pointer constant is with +@code{NULL}. + +@comment stddef.h +@comment ANSI +@deftypevr Macro {void *} NULL +This is a null pointer constant. +@end deftypevr + +You can also use @code{0} or @code{(void *)0} as a null pointer +constant, but using @code{NULL} is cleaner because it makes the purpose +of the constant more evident. + +If you use the null pointer constant as a function argument, then for +complete portability you should make sure that the function has a +prototype declaration. Otherwise, if the target machine has two +different pointer representations, the compiler won't know which +representation to use for that argument. You can avoid the problem by +explicitly casting the constant to the proper pointer type, but we +recommend instead adding a prototype for the function you are calling. + +@node Important Data Types +@section Important Data Types + +The result of subtracting two pointers in C is always an integer, but the +precise data type varies from C compiler to C compiler. Likewise, the +data type of the result of @code{sizeof} also varies between compilers. +ANSI defines standard aliases for these two types, so you can refer to +them in a portable fashion. They are defined in the header file +@file{stddef.h}. +@pindex stddef.h + +@comment stddef.h +@comment ANSI +@deftp {Data Type} ptrdiff_t +This is the signed integer type of the result of subtracting two +pointers. For example, with the declaration @code{char *p1, *p2;}, the +expression @code{p2 - p1} is of type @code{ptrdiff_t}. This will +probably be one of the standard signed integer types (@w{@code{short +int}}, @code{int} or @w{@code{long int}}), but might be a nonstandard +type that exists only for this purpose. 
+@end deftp + +@comment stddef.h +@comment ANSI +@deftp {Data Type} size_t +This is an unsigned integer type used to represent the sizes of objects. +The result of the @code{sizeof} operator is of this type, and functions +such as @code{malloc} (@pxref{Unconstrained Allocation}) and +@code{memcpy} (@pxref{Copying and Concatenation}) accept arguments of +this type to specify object sizes. + +@strong{Usage Note:} @code{size_t} is the preferred way to declare any +arguments or variables that hold the size of an object. +@end deftp + +In the GNU system @code{size_t} is equivalent to either +@w{@code{unsigned int}} or @w{@code{unsigned long int}}. These types +have identical properties on the GNU system, and for most purposes, you +can use them interchangeably. However, they are distinct as data types, +which makes a difference in certain contexts. + +For example, when you specify the type of a function argument in a +function prototype, it makes a difference which one you use. If the +system header files declare @code{malloc} with an argument of type +@code{size_t} and you declare @code{malloc} with an argument of type +@code{unsigned int}, you will get a compilation error if @code{size_t} +happens to be @code{unsigned long int} on your system. To avoid any +possibility of error, when a function argument or value is supposed to +have type @code{size_t}, never declare its type in any other way. + +@strong{Compatibility Note:} Implementations of C before the advent of +ANSI C generally used @code{unsigned int} for representing object sizes +and @code{int} for pointer subtraction results. They did not +necessarily define either @code{size_t} or @code{ptrdiff_t}. Unix +systems did define @code{size_t}, in @file{sys/types.h}, but the +definition was usually a signed type. 
+ +@node Data Type Measurements +@section Data Type Measurements + +Most of the time, if you choose the proper C data type for each object +in your program, you need not be concerned with just how it is +represented or how many bits it uses. When you do need such +information, the C language itself does not provide a way to get it. +The header files @file{limits.h} and @file{float.h} contain macros +which give you this information in full detail. + +@menu +* Width of Type:: How many bits does an integer type hold? +* Range of Type:: What are the largest and smallest values + that an integer type can hold? +* Floating Type Macros:: Parameters that measure the floating point types. +* Structure Measurement:: Getting measurements on structure types. +@end menu + +@node Width of Type +@subsection Computing the Width of an Integer Data Type +@cindex integer type width +@cindex width of integer type +@cindex type measurements, integer + +The most common reason that a program needs to know how many bits are in +an integer type is for using an array of @code{long int} as a bit vector. +You can access the bit at index @var{n} with + +@smallexample +vector[@var{n} / LONGBITS] & (1UL << (@var{n} % LONGBITS)) +@end smallexample + +@noindent +provided you define @code{LONGBITS} as the number of bits in a +@code{long int}. (The constant is written @code{1UL} so that the shift +is done in a type as wide as @code{long int}, not in @code{int}.) + +@pindex limits.h +There is no operator in the C language that can give you the number of +bits in an integer data type. But you can compute it from the macro +@code{CHAR_BIT}, defined in the header file @file{limits.h}. + +@table @code +@comment limits.h +@comment ANSI +@item CHAR_BIT +This is the number of bits in a @code{char}---eight, on most systems. +The value has type @code{int}.
+ +You can compute the number of bits in any data type @var{type} like +this: + +@smallexample +sizeof (@var{type}) * CHAR_BIT +@end smallexample +@end table + +@node Range of Type +@subsection Range of an Integer Type +@cindex integer type range +@cindex range of integer type +@cindex limits, integer types + +Suppose you need to store an integer value which can range from zero to +one million. Which is the smallest type you can use? There is no +general rule; it depends on the C compiler and target machine. You can +use the @samp{MIN} and @samp{MAX} macros in @file{limits.h} to determine +which type will work. + +Each signed integer type has a pair of macros which give the smallest +and largest values that it can hold. Each unsigned integer type has one +such macro, for the maximum value; the minimum value is, of course, +zero. + +The values of these macros are all integer constant expressions. The +@samp{MAX} and @samp{MIN} macros for @code{char} and @w{@code{short +int}} types have values of type @code{int}. The @samp{MAX} and +@samp{MIN} macros for the other types have values of the same type +described by the macro---thus, @code{ULONG_MAX} has type +@w{@code{unsigned long int}}. + +@comment Extra blank lines make it look better. +@table @code +@comment limits.h +@comment ANSI +@item SCHAR_MIN + +This is the minimum value that can be represented by a @w{@code{signed char}}. + +@comment limits.h +@comment ANSI +@item SCHAR_MAX +@comment limits.h +@comment ANSI +@itemx UCHAR_MAX + +These are the maximum values that can be represented by a +@w{@code{signed char}} and @w{@code{unsigned char}}, respectively. + +@comment limits.h +@comment ANSI +@item CHAR_MIN + +This is the minimum value that can be represented by a @code{char}. +It's equal to @code{SCHAR_MIN} if @code{char} is signed, or zero +otherwise. + +@comment limits.h +@comment ANSI +@item CHAR_MAX + +This is the maximum value that can be represented by a @code{char}. 
+It's equal to @code{SCHAR_MAX} if @code{char} is signed, or +@code{UCHAR_MAX} otherwise. + +@comment limits.h +@comment ANSI +@item SHRT_MIN + +This is the minimum value that can be represented by a @w{@code{signed +short int}}. On most machines that the GNU C library runs on, +@code{short} integers are 16-bit quantities. + +@comment limits.h +@comment ANSI +@item SHRT_MAX +@comment limits.h +@comment ANSI +@itemx USHRT_MAX + +These are the maximum values that can be represented by a +@w{@code{signed short int}} and @w{@code{unsigned short int}}, +respectively. + +@comment limits.h +@comment ANSI +@item INT_MIN + +This is the minimum value that can be represented by a @w{@code{signed +int}}. On most machines that the GNU C system runs on, an @code{int} is +a 32-bit quantity. + +@comment limits.h +@comment ANSI +@item INT_MAX +@comment limits.h +@comment ANSI +@itemx UINT_MAX + +These are the maximum values that can be represented by, respectively, +the type @w{@code{signed int}} and the type @w{@code{unsigned int}}. + +@comment limits.h +@comment ANSI +@item LONG_MIN + +This is the minimum value that can be represented by a @w{@code{signed +long int}}. On most machines that the GNU C system runs on, @code{long} +integers are 32-bit quantities, the same size as @code{int}. + +@comment limits.h +@comment ANSI +@item LONG_MAX +@comment limits.h +@comment ANSI +@itemx ULONG_MAX + +These are the maximum values that can be represented by a +@w{@code{signed long int}} and @code{unsigned long int}, respectively. + +@comment limits.h +@comment GNU +@item LONG_LONG_MIN + +This is the minimum value that can be represented by a @w{@code{signed +long long int}}. On most machines that the GNU C system runs on, +@w{@code{long long}} integers are 64-bit quantities. 
+ +@comment limits.h +@comment GNU +@item LONG_LONG_MAX +@comment limits.h +@comment ANSI +@itemx ULONG_LONG_MAX + +These are the maximum values that can be represented by a @code{signed +long long int} and @code{unsigned long long int}, respectively. + +@comment limits.h +@comment GNU +@item WCHAR_MAX + +This is the maximum value that can be represented by a @code{wchar_t}. +@xref{Wide Char Intro}. +@end table + +The header file @file{limits.h} also defines some additional constants +that parameterize various operating system and file system limits. These +constants are described in @ref{System Configuration}. + +@node Floating Type Macros +@subsection Floating Type Macros +@cindex floating type measurements +@cindex measurements of floating types +@cindex type measurements, floating +@cindex limits, floating types + +The specific representation of floating point numbers varies from +machine to machine. Because floating point numbers are represented +internally as approximate quantities, algorithms for manipulating +floating point data often need to take account of the precise details of +the machine's floating point representation. + +Some of the functions in the C library itself need this information; for +example, the algorithms for printing and reading floating point numbers +(@pxref{I/O on Streams}) and for calculating trigonometric and +irrational functions (@pxref{Mathematics}) use it to avoid round-off +error and loss of accuracy. User programs that implement numerical +analysis techniques also often need this information in order to +minimize or compute error bounds. + +The header file @file{float.h} describes the format used by your +machine. + +@menu +* Floating Point Concepts:: Definitions of terminology. +* Floating Point Parameters:: Details of specific macros. +* IEEE Floating Point:: The measurements for one common + representation. 
+@end menu + +@node Floating Point Concepts +@subsubsection Floating Point Representation Concepts + +This section introduces the terminology for describing floating point +representations. + +You are probably already familiar with most of these concepts in terms +of scientific or exponential notation for floating point numbers. For +example, the number @code{123456.0} could be expressed in exponential +notation as @code{1.23456e+05}, a shorthand notation indicating that the +mantissa @code{1.23456} is multiplied by the base @code{10} raised to +power @code{5}. + +More formally, the internal representation of a floating point number +can be characterized in terms of the following parameters: + +@itemize @bullet +@item +@cindex sign (of floating point number) +The @dfn{sign} is either @code{-1} or @code{1}. + +@item +@cindex base (of floating point number) +@cindex radix (of floating point number) +The @dfn{base} or @dfn{radix} for exponentiation, an integer greater +than @code{1}. This is a constant for a particular representation. + +@item +@cindex exponent (of floating point number) +The @dfn{exponent} to which the base is raised. The upper and lower +bounds of the exponent value are constants for a particular +representation. + +@cindex bias (of floating point number exponent) +Sometimes, in the actual bits representing the floating point number, +the exponent is @dfn{biased} by adding a constant to it, to make it +always be represented as an unsigned quantity. This is only important +if you have some reason to pick apart the bit fields making up the +floating point number by hand, which is something for which the GNU +library provides no support. So this is ignored in the discussion that +follows. + +@item +@cindex mantissa (of floating point number) +@cindex significand (of floating point number) +The @dfn{mantissa} or @dfn{significand}, an unsigned integer which is a +part of each floating point number. 
+ +@item +@cindex precision (of floating point number) +The @dfn{precision} of the mantissa. If the base of the representation +is @var{b}, then the precision is the number of base-@var{b} digits in +the mantissa. This is a constant for a particular representation. + +@cindex hidden bit (of floating point number mantissa) +Many floating point representations have an implicit @dfn{hidden bit} in +the mantissa. This is a bit which is present virtually in the mantissa, +but not stored in memory because its value is always 1 in a normalized +number. The precision figure (see above) includes any hidden bits. + +Again, the GNU library provides no facilities for dealing with such +low-level aspects of the representation. +@end itemize + +The mantissa of a floating point number actually represents an implicit +fraction whose denominator is the base raised to the power of the +precision. Since the largest representable mantissa is one less than +this denominator, the value of the fraction is always strictly less than +@code{1}. The mathematical value of a floating point number is then the +product of this fraction, the sign, and the base raised to the exponent. + +@cindex normalized floating point number +We say that the floating point number is @dfn{normalized} if the +fraction is at least @code{1/@var{b}}, where @var{b} is the base. In +other words, the mantissa would be too large to fit if it were +multiplied by the base. Non-normalized numbers are sometimes called +@dfn{denormal}; they contain less precision than the representation +normally can hold. + +If the number is not normalized, then you can subtract @code{1} from the +exponent while multiplying the mantissa by the base, and get another +floating point number with the same value. @dfn{Normalization} consists +of doing this repeatedly until the number is normalized. Two distinct +normalized floating point numbers cannot be equal in value. 
+ +(There is an exception to this rule: if the mantissa is zero, it is +considered normalized. Another exception happens on certain machines +where the exponent is as small as the representation can hold. Then +it is impossible to subtract @code{1} from the exponent, so a number +may be normalized even if its fraction is less than @code{1/@var{b}}.) + +@node Floating Point Parameters +@subsubsection Floating Point Parameters + +@pindex float.h +These macro definitions can be accessed by including the header file +@file{float.h} in your program. + +Macro names starting with @samp{FLT_} refer to the @code{float} type, +while names beginning with @samp{DBL_} refer to the @code{double} type +and names beginning with @samp{LDBL_} refer to the @code{long double} +type. (Currently GCC does not support @code{long double} as a distinct +data type, so the values for the @samp{LDBL_} constants are equal to the +corresponding constants for the @code{double} type.)@refill + +Of these macros, only @code{FLT_RADIX} is guaranteed to be a constant +expression. The other macros listed here cannot be reliably used in +places that require constant expressions, such as @samp{#if} +preprocessing directives or in the dimensions of static arrays. + +Although the ANSI C standard specifies minimum and maximum values for +most of these parameters, the GNU C implementation uses whatever values +describe the floating point representation of the target machine. So in +principle GNU C actually satisfies the ANSI C requirements only if the +target machine is suitable. In practice, all the machines currently +supported are suitable. + +@table @code +@comment float.h +@comment ANSI +@item FLT_ROUNDS +This value characterizes the rounding mode for floating point addition. +The following values indicate standard rounding modes: + +@need 750 + +@table @code +@item -1 +The mode is indeterminable. +@item 0 +Rounding is towards zero. +@item 1 +Rounding is to the nearest number. 
+@item 2 +Rounding is towards positive infinity. +@item 3 +Rounding is towards negative infinity. +@end table + +@noindent +Any other value represents a machine-dependent nonstandard rounding +mode. + +On most machines, the value is @code{1}, in accordance with the IEEE +standard for floating point. + +Here is a table showing how certain values round for each possible value +of @code{FLT_ROUNDS}, if the other aspects of the representation match +the IEEE single-precision standard. + +@smallexample + 0 1 2 3 + 1.00000003 1.0 1.0 1.00000012 1.0 + 1.00000007 1.0 1.00000012 1.00000012 1.0 +-1.00000003 -1.0 -1.0 -1.0 -1.00000012 +-1.00000007 -1.0 -1.00000012 -1.0 -1.00000012 +@end smallexample + +@comment float.h +@comment ANSI +@item FLT_RADIX +This is the value of the base, or radix, of exponent representation. +This is guaranteed to be a constant expression, unlike the other macros +described in this section. The value is 2 on all machines we know of +except the IBM 360 and derivatives. + +@comment float.h +@comment ANSI +@item FLT_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating point +mantissa for the @code{float} data type. The following expression +yields @code{1.0} (even though mathematically it should not) due to the +limited number of mantissa digits: + +@smallexample +float radix = FLT_RADIX; + +1.0f + 1.0f / radix / radix / @dots{} / radix +@end smallexample + +@noindent +where @code{radix} appears @code{FLT_MANT_DIG} times. + +@comment float.h +@comment ANSI +@item DBL_MANT_DIG +@itemx LDBL_MANT_DIG +This is the number of base-@code{FLT_RADIX} digits in the floating point +mantissa for the data types @code{double} and @code{long double}, +respectively. + +@comment Extra blank lines make it look better. +@comment float.h +@comment ANSI +@item FLT_DIG + +This is the number of decimal digits of precision for the @code{float} +data type. 
Technically, if @var{p} and @var{b} are the precision and +base (respectively) for the representation, then the decimal precision +@var{q} is the maximum number of decimal digits such that any floating +point number with @var{q} base 10 digits can be rounded to a floating +point number with @var{p} base @var{b} digits and back again, without +change to the @var{q} decimal digits. + +The value of this macro is supposed to be at least @code{6}, to satisfy +ANSI C. + +@comment float.h +@comment ANSI +@item DBL_DIG +@itemx LDBL_DIG + +These are similar to @code{FLT_DIG}, but for the data types +@code{double} and @code{long double}, respectively. The values of these +macros are supposed to be at least @code{10}. + +@comment float.h +@comment ANSI +@item FLT_MIN_EXP +This is the smallest possible exponent value for type @code{float}. +More precisely, this is the minimum negative integer such that the value +@code{FLT_RADIX} raised to this power minus 1 can be represented as a +normalized floating point number of type @code{float}. + +@comment float.h +@comment ANSI +@item DBL_MIN_EXP +@itemx LDBL_MIN_EXP + +These are similar to @code{FLT_MIN_EXP}, but for the data types +@code{double} and @code{long double}, respectively. + +@comment float.h +@comment ANSI +@item FLT_MIN_10_EXP +This is the minimum negative integer such that @code{10} raised to this +power minus 1 can be represented as a normalized floating point number +of type @code{float}. This is supposed to be @code{-37} or even less. + +@comment float.h +@comment ANSI +@item DBL_MIN_10_EXP +@itemx LDBL_MIN_10_EXP +These are similar to @code{FLT_MIN_10_EXP}, but for the data types +@code{double} and @code{long double}, respectively. + +@comment float.h +@comment ANSI +@item FLT_MAX_EXP +This is the largest possible exponent value for type @code{float}.
More +precisely, this is the maximum positive integer such that the value +@code{FLT_RADIX} raised to one less than this power can be represented as a +floating point number of type @code{float}. + +@comment float.h +@comment ANSI +@item DBL_MAX_EXP +@itemx LDBL_MAX_EXP +These are similar to @code{FLT_MAX_EXP}, but for the data types +@code{double} and @code{long double}, respectively. + +@comment float.h +@comment ANSI +@item FLT_MAX_10_EXP +This is the maximum positive integer such that @code{10} raised to this +power can be represented as a normalized floating point number +of type @code{float}. This is supposed to be at least @code{37}. + +@comment float.h +@comment ANSI +@item DBL_MAX_10_EXP +@itemx LDBL_MAX_10_EXP +These are similar to @code{FLT_MAX_10_EXP}, but for the data types +@code{double} and @code{long double}, respectively. + +@comment float.h +@comment ANSI +@item FLT_MAX + +The value of this macro is the maximum number representable in type +@code{float}. It is supposed to be at least @code{1E+37}. The value +has type @code{float}. + +The smallest representable number is @code{- FLT_MAX}. + +@comment float.h +@comment ANSI +@item DBL_MAX +@itemx LDBL_MAX + +These are similar to @code{FLT_MAX}, but for the data types +@code{double} and @code{long double}, respectively. The type of the +macro's value is the same as the type it describes. + +@comment float.h +@comment ANSI +@item FLT_MIN + +The value of this macro is the minimum normalized positive floating +point number that is representable in type @code{float}. It is supposed +to be no more than @code{1E-37}. + +@comment float.h +@comment ANSI +@item DBL_MIN +@itemx LDBL_MIN + +These are similar to @code{FLT_MIN}, but for the data types +@code{double} and @code{long double}, respectively. The type of the +macro's value is the same as the type it describes.
+ +@comment float.h +@comment ANSI +@item FLT_EPSILON + +This is the minimum positive floating point number of type @code{float} +such that @code{1.0 + FLT_EPSILON != 1.0} is true. It's supposed to +be no greater than @code{1E-5}. + +@comment float.h +@comment ANSI +@item DBL_EPSILON +@itemx LDBL_EPSILON + +These are similar to @code{FLT_EPSILON}, but for the data types +@code{double} and @code{long double}, respectively. The type of the +macro's value is the same as the type it describes. The values are not +supposed to be greater than @code{1E-9}. +@end table + +@node IEEE Floating Point +@subsubsection IEEE Floating Point +@cindex IEEE floating point representation +@cindex floating point, IEEE + +Here is an example showing how the floating type measurements come out +for the most common floating point representation, specified by the +@cite{IEEE Standard for Binary Floating Point Arithmetic (ANSI/IEEE Std +754-1985)}. Nearly all computers designed since the 1980s use this +format. + +The IEEE single-precision float representation uses a base of 2. There +is a sign bit, a mantissa with 23 bits plus one hidden bit (so the total +precision is 24 base-2 digits), and an 8-bit exponent that can represent +values in the range -125 to 128, inclusive. 
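The layout described above can be compared with the parameters that @file{float.h} reports on a given machine. The following is only a minimal sketch, not part of the library: the function names @code{float_matches_ieee_single} and @code{ieee_single_storage_bits} are hypothetical, and the 8-bit exponent width is taken from the description above rather than from any macro.

```c
#include <float.h>

/* Hypothetical helper: return 1 if the float parameters from
   float.h match the IEEE single-precision layout described above
   (base 2, 24 digits of precision, exponents -125 to 128),
   and 0 otherwise.  */
int
float_matches_ieee_single (void)
{
  return (FLT_RADIX == 2
          && FLT_MANT_DIG == 24
          && FLT_MIN_EXP == -125
          && FLT_MAX_EXP == 128);
}

/* Hypothetical helper: count the storage bits implied by the
   description: one sign bit, the stored mantissa bits (the hidden
   bit is not stored), and an exponent field assumed to be 8 bits
   wide.  */
int
ieee_single_storage_bits (void)
{
  return 1 + (FLT_MANT_DIG - 1) + 8;
}
```

On a machine whose @code{float} type uses this representation, the first function should return 1 and the second should return 32.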
+ +So, for an implementation that uses this representation for the +@code{float} data type, appropriate values for the corresponding +parameters are: + +@smallexample +FLT_RADIX 2 +FLT_MANT_DIG 24 +FLT_DIG 6 +FLT_MIN_EXP -125 +FLT_MIN_10_EXP -37 +FLT_MAX_EXP 128 +FLT_MAX_10_EXP +38 +FLT_MIN 1.17549435E-38F +FLT_MAX 3.40282347E+38F +FLT_EPSILON 1.19209290E-07F +@end smallexample + +Here are the values for the @code{double} data type: + +@smallexample +DBL_MANT_DIG 53 +DBL_DIG 15 +DBL_MIN_EXP -1021 +DBL_MIN_10_EXP -307 +DBL_MAX_EXP 1024 +DBL_MAX_10_EXP 308 +DBL_MAX 1.7976931348623157E+308 +DBL_MIN 2.2250738585072014E-308 +DBL_EPSILON 2.2204460492503131E-016 +@end smallexample + +@node Structure Measurement +@subsection Structure Field Offset Measurement + +You can use @code{offsetof} to measure the location within a structure +type of a particular structure member. + +@comment stddef.h +@comment ANSI +@deftypefn {Macro} size_t offsetof (@var{type}, @var{member}) +This expands to an integer constant expression that is the offset of the +structure member named @var{member} in the structure type @var{type}. +For example, @code{offsetof (struct s, elem)} is the offset, in bytes, +of the member @code{elem} in a @code{struct s}. + +This macro won't work if @var{member} is a bit field; you get an error +from the C compiler in that case. +@end deftypefn diff --git a/manual/lgpl.texinfo b/manual/lgpl.texinfo new file mode 100644 index 0000000000..8ba7317fe0 --- /dev/null +++ b/manual/lgpl.texinfo @@ -0,0 +1,546 @@ +@setfilename lgpl.info + +@set lgpl-appendix + +@ifset lgpl-appendix +@appendix GNU LIBRARY GENERAL PUBLIC LICENSE +@end ifset +@ifclear lgpl-appendix +@unnumbered GNU LIBRARY GENERAL PUBLIC LICENSE +@end ifclear +@center Version 2, June 1991 + +@display +Copyright @copyright{} 1991 Free Software Foundation, Inc.
+675 Mass Ave, Cambridge, MA 02139, USA +Everyone is permitted to copy and distribute verbatim copies +of this license document, but changing it is not allowed. + +[This is the first released version of the library GPL. It is + numbered 2 because it goes with version 2 of the ordinary GPL.] +@end display + +@unnumberedsec Preamble + + The licenses for most software are designed to take away your +freedom to share and change it. By contrast, the GNU General Public +Licenses are intended to guarantee your freedom to share and change +free software---to make sure the software is free for all its users. + + This license, the Library General Public License, applies to some +specially designated Free Software Foundation software, and to any +other libraries whose authors decide to use it. You can use it for +your libraries, too. + + When we speak of free software, we are referring to freedom, not +price. Our General Public Licenses are designed to make sure that you +have the freedom to distribute copies of free software (and charge for +this service if you wish), that you receive source code or can get it +if you want it, that you can change the software or use pieces of it +in new free programs; and that you know you can do these things. + + To protect your rights, we need to make restrictions that forbid +anyone to deny you these rights or to ask you to surrender the rights. +These restrictions translate to certain responsibilities for you if +you distribute copies of the library, or if you modify it. + + For example, if you distribute copies of the library, whether gratis +or for a fee, you must give the recipients all the rights that we gave +you. You must make sure that they, too, receive or can get the source +code. If you link a program with the library, you must provide +complete object files to the recipients so that they can relink them +with the library, after making changes to the library and recompiling +it. 
And you must show them these terms so they know their rights. + + Our method of protecting your rights has two steps: (1) copyright +the library, and (2) offer you this license which gives you legal +permission to copy, distribute and/or modify the library. + + Also, for each distributor's protection, we want to make certain +that everyone understands that there is no warranty for this free +library. If the library is modified by someone else and passed on, we +want its recipients to know that what they have is not the original +version, so that any problems introduced by others will not reflect on +the original authors' reputations. + + Finally, any free program is threatened constantly by software +patents. We wish to avoid the danger that companies distributing free +software will individually obtain patent licenses, thus in effect +transforming the program into proprietary software. To prevent this, +we have made it clear that any patent must be licensed for everyone's +free use or not licensed at all. + + Most GNU software, including some libraries, is covered by the ordinary +GNU General Public License, which was designed for utility programs. This +license, the GNU Library General Public License, applies to certain +designated libraries. This license is quite different from the ordinary +one; be sure to read it in full, and don't assume that anything in it is +the same as in the ordinary license. + + The reason we have a separate public license for some libraries is that +they blur the distinction we usually make between modifying or adding to a +program and simply using it. Linking a program with a library, without +changing the library, is in some sense simply using the library, and is +analogous to running a utility program or application program. However, in +a textual and legal sense, the linked executable is a combined work, a +derivative of the original library, and the ordinary General Public License +treats it as such. 
+ + Because of this blurred distinction, using the ordinary General +Public License for libraries did not effectively promote software +sharing, because most developers did not use the libraries. We +concluded that weaker conditions might promote sharing better. + + However, unrestricted linking of non-free programs would deprive the +users of those programs of all benefit from the free status of the +libraries themselves. This Library General Public License is intended to +permit developers of non-free programs to use free libraries, while +preserving your freedom as a user of such programs to change the free +libraries that are incorporated in them. (We have not seen how to achieve +this as regards changes in header files, but we have achieved it as regards +changes in the actual functions of the Library.) The hope is that this +will lead to faster development of free libraries. + + The precise terms and conditions for copying, distribution and +modification follow. Pay close attention to the difference between a +``work based on the library'' and a ``work that uses the library''. The +former contains code derived from the library, while the latter only +works together with the library. + + Note that it is possible for a library to be covered by the ordinary +General Public License rather than by this special one. + +@iftex +@unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end iftex +@ifinfo +@center TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION +@end ifinfo + +@enumerate 0 +@item +This License Agreement applies to any software library which +contains a notice placed by the copyright holder or other authorized +party saying it may be distributed under the terms of this Library +General Public License (also called ``this License''). Each licensee is +addressed as ``you''. 
+ + A ``library'' means a collection of software functions and/or data +prepared so as to be conveniently linked with application programs +(which use some of those functions and data) to form executables. + + The ``Library'', below, refers to any such software library or work +which has been distributed under these terms. A ``work based on the +Library'' means either the Library or any derivative work under +copyright law: that is to say, a work containing the Library or a +portion of it, either verbatim or with modifications and/or translated +straightforwardly into another language. (Hereinafter, translation is +included without limitation in the term ``modification''.) + + ``Source code'' for a work means the preferred form of the work for +making modifications to it. For a library, complete source code means +all the source code for all modules it contains, plus any associated +interface definition files, plus the scripts used to control compilation +and installation of the library. + + Activities other than copying, distribution and modification are not +covered by this License; they are outside its scope. The act of +running a program using the Library is not restricted, and output from +such a program is covered only if its contents constitute a work based +on the Library (independent of the use of the Library in a tool for +writing it). Whether that is true depends on what the Library does +and what the program that uses the Library does. + +@item +You may copy and distribute verbatim copies of the Library's +complete source code as you receive it, in any medium, provided that +you conspicuously and appropriately publish on each copy an +appropriate copyright notice and disclaimer of warranty; keep intact +all the notices that refer to this License and to the absence of any +warranty; and distribute a copy of this License along with the +Library. 
+ + You may charge a fee for the physical act of transferring a copy, +and you may at your option offer warranty protection in exchange for a +fee. + +@item +You may modify your copy or copies of the Library or any portion +of it, thus forming a work based on the Library, and copy and +distribute such modifications or work under the terms of Section 1 +above, provided that you also meet all of these conditions: + +@enumerate a +@item +The modified work must itself be a software library. + +@item +You must cause the files modified to carry prominent notices +stating that you changed the files and the date of any change. + +@item +You must cause the whole of the work to be licensed at no +charge to all third parties under the terms of this License. + +@item +If a facility in the modified Library refers to a function or a +table of data to be supplied by an application program that uses +the facility, other than as an argument passed when the facility +is invoked, then you must make a good faith effort to ensure that, +in the event an application does not supply such function or +table, the facility still operates, and performs whatever part of +its purpose remains meaningful. + +(For example, a function in a library to compute square roots has +a purpose that is entirely well-defined independent of the +application. Therefore, Subsection 2d requires that any +application-supplied function or table used by this function must +be optional: if the application does not supply it, the square +root function must still compute square roots.) +@end enumerate + +These requirements apply to the modified work as a whole. If +identifiable sections of that work are not derived from the Library, +and can be reasonably considered independent and separate works in +themselves, then this License, and its terms, do not apply to those +sections when you distribute them as separate works. 
But when you +distribute the same sections as part of a whole which is a work based +on the Library, the distribution of the whole must be on the terms of +this License, whose permissions for other licensees extend to the +entire whole, and thus to each and every part regardless of who wrote +it. + +Thus, it is not the intent of this section to claim rights or contest +your rights to work written entirely by you; rather, the intent is to +exercise the right to control the distribution of derivative or +collective works based on the Library. + +In addition, mere aggregation of another work not based on the Library +with the Library (or with a work based on the Library) on a volume of +a storage or distribution medium does not bring the other work under +the scope of this License. + +@item +You may opt to apply the terms of the ordinary GNU General Public +License instead of this License to a given copy of the Library. To do +this, you must alter all the notices that refer to this License, so +that they refer to the ordinary GNU General Public License, version 2, +instead of to this License. (If a newer version than version 2 of the +ordinary GNU General Public License has appeared, then you can specify +that version instead if you wish.) Do not make any other change in +these notices. + + Once this change is made in a given copy, it is irreversible for +that copy, so the ordinary GNU General Public License applies to all +subsequent copies and derivative works made from that copy. + + This option is useful when you wish to copy part of the code of +the Library into a program that is not a library. 
+ +@item +You may copy and distribute the Library (or a portion or +derivative of it, under Section 2) in object code or executable form +under the terms of Sections 1 and 2 above provided that you accompany +it with the complete corresponding machine-readable source code, which +must be distributed under the terms of Sections 1 and 2 above on a +medium customarily used for software interchange. + + If distribution of object code is made by offering access to copy +from a designated place, then offering equivalent access to copy the +source code from the same place satisfies the requirement to +distribute the source code, even though third parties are not +compelled to copy the source along with the object code. + +@item +A program that contains no derivative of any portion of the +Library, but is designed to work with the Library by being compiled or +linked with it, is called a ``work that uses the Library''. Such a +work, in isolation, is not a derivative work of the Library, and +therefore falls outside the scope of this License. + + However, linking a ``work that uses the Library'' with the Library +creates an executable that is a derivative of the Library (because it +contains portions of the Library), rather than a ``work that uses the +library''. The executable is therefore covered by this License. +Section 6 states terms for distribution of such executables. + + When a ``work that uses the Library'' uses material from a header file +that is part of the Library, the object code for the work may be a +derivative work of the Library even though the source code is not. +Whether this is true is especially significant if the work can be +linked without the Library, or if the work is itself a library. The +threshold for this to be true is not precisely defined by law. 
+ + If such an object file uses only numerical parameters, data +structure layouts and accessors, and small macros and small inline +functions (ten lines or less in length), then the use of the object +file is unrestricted, regardless of whether it is legally a derivative +work. (Executables containing this object code plus portions of the +Library will still fall under Section 6.) + + Otherwise, if the work is a derivative of the Library, you may +distribute the object code for the work under the terms of Section 6. +Any executables containing that work also fall under Section 6, +whether or not they are linked directly with the Library itself. + +@item +As an exception to the Sections above, you may also compile or +link a ``work that uses the Library'' with the Library to produce a +work containing portions of the Library, and distribute that work +under terms of your choice, provided that the terms permit +modification of the work for the customer's own use and reverse +engineering for debugging such modifications. + + You must give prominent notice with each copy of the work that the +Library is used in it and that the Library and its use are covered by +this License. You must supply a copy of this License. If the work +during execution displays copyright notices, you must include the +copyright notice for the Library among them, as well as a reference +directing the user to the copy of this License. Also, you must do one +of these things: + +@enumerate a +@item +Accompany the work with the complete corresponding +machine-readable source code for the Library including whatever +changes were used in the work (which must be distributed under +Sections 1 and 2 above); and, if the work is an executable linked +with the Library, with the complete machine-readable ``work that +uses the Library'', as object code and/or source code, so that the +user can modify the Library and then relink to produce a modified +executable containing the modified Library. 
(It is understood +that the user who changes the contents of definitions files in the +Library will not necessarily be able to recompile the application +to use the modified definitions.) + +@item +Accompany the work with a written offer, valid for at +least three years, to give the same user the materials +specified in Subsection 6a, above, for a charge no more +than the cost of performing this distribution. + +@item +If distribution of the work is made by offering access to copy +from a designated place, offer equivalent access to copy the above +specified materials from the same place. + +@item +Verify that the user has already received a copy of these +materials or that you have already sent this user a copy. +@end enumerate + + For an executable, the required form of the ``work that uses the +Library'' must include any data and utility programs needed for +reproducing the executable from it. However, as a special exception, +the source code distributed need not include anything that is normally +distributed (in either source or binary form) with the major +components (compiler, kernel, and so on) of the operating system on +which the executable runs, unless that component itself accompanies +the executable. + + It may happen that this requirement contradicts the license +restrictions of other proprietary libraries that do not normally +accompany the operating system. Such a contradiction means you cannot +use both them and the Library together in an executable that you +distribute. 
+ +@item +You may place library facilities that are a work based on the +Library side-by-side in a single library together with other library +facilities not covered by this License, and distribute such a combined +library, provided that the separate distribution of the work based on +the Library and of the other library facilities is otherwise +permitted, and provided that you do these two things: + +@enumerate a +@item +Accompany the combined library with a copy of the same work +based on the Library, uncombined with any other library +facilities. This must be distributed under the terms of the +Sections above. + +@item +Give prominent notice with the combined library of the fact +that part of it is a work based on the Library, and explaining +where to find the accompanying uncombined form of the same work. +@end enumerate + +@item +You may not copy, modify, sublicense, link with, or distribute +the Library except as expressly provided under this License. Any +attempt otherwise to copy, modify, sublicense, link with, or +distribute the Library is void, and will automatically terminate your +rights under this License. However, parties who have received copies, +or rights, from you under this License will not have their licenses +terminated so long as such parties remain in full compliance. + +@item +You are not required to accept this License, since you have not +signed it. However, nothing else grants you permission to modify or +distribute the Library or its derivative works. These actions are +prohibited by law if you do not accept this License. Therefore, by +modifying or distributing the Library (or any work based on the +Library), you indicate your acceptance of this License to do so, and +all its terms and conditions for copying, distributing or modifying +the Library or works based on it. 
+ +@item +Each time you redistribute the Library (or any work based on the +Library), the recipient automatically receives a license from the +original licensor to copy, distribute, link with or modify the Library +subject to these terms and conditions. You may not impose any further +restrictions on the recipients' exercise of the rights granted herein. +You are not responsible for enforcing compliance by third parties to +this License. + +@item +If, as a consequence of a court judgment or allegation of patent +infringement or for any other reason (not limited to patent issues), +conditions are imposed on you (whether by court order, agreement or +otherwise) that contradict the conditions of this License, they do not +excuse you from the conditions of this License. If you cannot +distribute so as to satisfy simultaneously your obligations under this +License and any other pertinent obligations, then as a consequence you +may not distribute the Library at all. For example, if a patent +license would not permit royalty-free redistribution of the Library by +all those who receive copies directly or indirectly through you, then +the only way you could satisfy both it and this License would be to +refrain entirely from distribution of the Library. + +If any portion of this section is held invalid or unenforceable under any +particular circumstance, the balance of the section is intended to apply, +and the section as a whole is intended to apply in other circumstances. + +It is not the purpose of this section to induce you to infringe any +patents or other property right claims or to contest validity of any +such claims; this section has the sole purpose of protecting the +integrity of the free software distribution system which is +implemented by public license practices. 
Many people have made +generous contributions to the wide range of software distributed +through that system in reliance on consistent application of that +system; it is up to the author/donor to decide if he or she is willing +to distribute software through any other system and a licensee cannot +impose that choice. + +This section is intended to make thoroughly clear what is believed to +be a consequence of the rest of this License. + +@item +If the distribution and/or use of the Library is restricted in +certain countries either by patents or by copyrighted interfaces, the +original copyright holder who places the Library under this License may add +an explicit geographical distribution limitation excluding those countries, +so that distribution is permitted only in or among countries not thus +excluded. In such case, this License incorporates the limitation as if +written in the body of this License. + +@item +The Free Software Foundation may publish revised and/or new +versions of the Library General Public License from time to time. +Such new versions will be similar in spirit to the present version, +but may differ in detail to address new problems or concerns. + +Each version is given a distinguishing version number. If the Library +specifies a version number of this License which applies to it and +``any later version'', you have the option of following the terms and +conditions either of that version or of any later version published by +the Free Software Foundation. If the Library does not specify a +license version number, you may choose any version ever published by +the Free Software Foundation. + +@item +If you wish to incorporate parts of the Library into other free +programs whose distribution conditions are incompatible with these, +write to the author to ask for permission. For software which is +copyrighted by the Free Software Foundation, write to the Free +Software Foundation; we sometimes make exceptions for this. 
Our +decision will be guided by the two goals of preserving the free status +of all derivatives of our free software and of promoting the sharing +and reuse of software generally. + +@iftex +@heading NO WARRANTY +@end iftex +@ifinfo +@center NO WARRANTY +@end ifinfo + +@item +BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO +WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. +EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR +OTHER PARTIES PROVIDE THE LIBRARY ``AS IS'' WITHOUT WARRANTY OF ANY +KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE +LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME +THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. + +@item +IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN +WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY +AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU +FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE +LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING +RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A +FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF +SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH +DAMAGES. +@end enumerate + +@iftex +@heading END OF TERMS AND CONDITIONS +@end iftex +@ifinfo +@center END OF TERMS AND CONDITIONS +@end ifinfo + +@page +@unnumberedsec How to Apply These Terms to Your New Libraries + + If you develop a new library, and you want it to be of the greatest +possible use to the public, we recommend making it free software that +everyone can redistribute and change. 
You can do so by permitting +redistribution under these terms (or, alternatively, under the terms of the +ordinary General Public License). + + To apply these terms, attach the following notices to the library. It is +safest to attach them to the start of each source file to most effectively +convey the exclusion of warranty; and each file should have at least the +``copyright'' line and a pointer to where the full notice is found. + +@smallexample +@var{one line to give the library's name and an idea of what it does.} +Copyright (C) @var{year} @var{name of author} + +This library is free software; you can redistribute it and/or modify it +under the terms of the GNU Library General Public License as published +by the Free Software Foundation; either version 2 of the License, or (at +your option) any later version. + +This library is distributed in the hope that it will be useful, but +WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +Library General Public License for more details. + +You should have received a copy of the GNU Library General Public +License along with this library; if not, write to the Free Software +Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +@end smallexample + +Also add information on how to contact you by electronic and paper mail. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a ``copyright disclaimer'' for the library, if +necessary. Here is a sample; alter the names: + +@smallexample +Yoyodyne, Inc., hereby disclaims all copyright interest in the library +`Frob' (a library for tweaking knobs) written by James Random Hacker. + +@var{signature of Ty Coon}, 1 April 1990 +Ty Coon, President of Vice +@end smallexample + +That's all there is to it! 
diff --git a/manual/libc.texinfo b/manual/libc.texinfo new file mode 100644 index 0000000000..0b455b32d2 --- /dev/null +++ b/manual/libc.texinfo @@ -0,0 +1,1007 @@ +\input texinfo @c -*- Texinfo -*- +@comment %**start of header (This is for running Texinfo on a region.) +@setfilename libc.info +@settitle The GNU C Library +@setchapternewpage odd +@comment %**end of header (This is for running Texinfo on a region.) + +@c This tells texinfo.tex to use the real section titles in xrefs in +@c place of the node name, when no section title is explicitly given. +@set xref-automatic-section-title +@smallbook + +@c I've already told people the printed edition will be 0.06 +@set EDITION 0.06 +@set VERSION 1.09 Beta +@set UPDATED 23 December 1994 +@set ISBN 1-882114-53-1 + +@ifinfo +This file documents the GNU C library. + +This is Edition @value{EDITION}, last updated @value{UPDATED}, +of @cite{The GNU C Library Reference Manual}, for Version @value{VERSION}. + +Copyright (C) 1993, 1994 Free Software Foundation, Inc. + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +@ignore +Permission is granted to process this file through TeX and print the +results, provided the printed document carries copying permission +notice identical to this one except for the removal of this paragraph +(this paragraph not being relevant to the printed manual). + +@end ignore +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +section entitled ``GNU Library General Public License'' is included +exactly as in the original, and provided that the entire resulting +derived work is distributed under the terms of a permission notice +identical to this one. 
+ +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the text of the translation of the section entitled ``GNU +Library General Public License'' must be approved for accuracy by the +Foundation. +@end ifinfo + +@iftex +@shorttitlepage The GNU C Library Reference Manual +@end iftex +@titlepage +@center @titlefont{The GNU C Library} +@sp 1 +@center @titlefont{Reference Manual} +@sp 2 +@center Sandra Loosemore +@center with +@center Richard M. Stallman, Roland McGrath, and Andrew Oram +@sp 3 +@center Edition @value{EDITION} +@sp 1 +@center last updated @value{UPDATED} +@sp 1 +@center for version @value{VERSION} +@page +@vskip 0pt plus 1filll +Copyright @copyright{} 1993, 1994 Free Software Foundation, Inc. +@sp 2 +Published by the Free Software Foundation @* +675 Massachusetts Avenue, @* +Cambridge, MA 02139 USA @* +Printed copies are available for $50 each. @* +ISBN @value{ISBN} @* + +Permission is granted to make and distribute verbatim copies of +this manual provided the copyright notice and this permission notice +are preserved on all copies. + +Permission is granted to copy and distribute modified versions of this +manual under the conditions for verbatim copying, provided also that the +section entitled ``GNU Library General Public License'' is included +exactly as in the original, and provided that the entire resulting +derived work is distributed under the terms of a permission notice +identical to this one. + +Permission is granted to copy and distribute translations of this manual +into another language, under the above conditions for modified versions, +except that the text of the translation of the section entitled ``GNU +Library General Public License'' must be approved for accuracy by the +Foundation. +@end titlepage +@page + +@ifinfo +@node Top, Introduction, (dir), (dir) +@top Main Menu +This is Edition @value{EDITION}, last updated @value{UPDATED}, of +@cite{The GNU C Library Reference Manual}, for Version @value{VERSION} +of the GNU C Library.
+@end ifinfo + + +@menu +* Introduction:: Purpose of the GNU C Library. +* Error Reporting:: How the GNU Library functions report + error conditions. +* Memory Allocation:: Your program can allocate memory dynamically + and manipulate it via pointers. +* Character Handling:: Character testing and conversion functions. +* String and Array Utilities:: Utilities for copying and comparing + strings and arrays. +* Extended Characters:: Support for extended character sets. +* Locales:: The country and language can affect + the behavior of library functions. +* Searching and Sorting:: General searching and sorting functions. +* Pattern Matching:: Matching wildcards and regular expressions, + and shell-style ``word expansion''. +* I/O Overview:: Introduction to the I/O facilities. +* Streams: I/O on Streams. High-level, portable I/O facilities. +* Low-Level I/O:: Low-level, less portable I/O. +* File System Interface:: Functions for manipulating files. +* Pipes and FIFOs:: A simple interprocess communication mechanism. +* Sockets:: A more complicated interprocess communication + mechanism, with support for networking. +* Low-Level Terminal Interface::How to change the characteristics + of a terminal device. +* Mathematics:: Math functions (transcendental functions, + random numbers, absolute value, etc.). +* Arithmetic:: Low-level arithmetic functions. +* Date and Time:: Functions for getting the date and time, + and for conversion between formats. +* Non-Local Exits:: The @code{setjmp} and @code{longjmp} facilities. +* Signal Handling:: All about signals; how to send them, + block them, and handle them. +* Process Startup:: Writing the beginning and end of your program. +* Processes:: How to create processes and run other programs. +* Job Control:: All about process groups and sessions. +* Users and Groups:: How users are identified and classified. 
+* System Information:: Getting information about the + hardware and software configuration + of the machine a program runs on. +* System Configuration:: Parameters describing operating system limits. + +Appendices + +* Language Features:: C language features provided by the library. + +* Library Summary:: A summary showing the syntax, header file, + and derivation of each library feature. +* Maintenance:: How to install and maintain the GNU C Library. +* Copying:: The GNU Library General Public License says + how you can copy and share the GNU C Library. + +Indices + +* Concept Index:: Index of concepts and names. +* Type Index:: Index of types and type qualifiers. +* Function Index:: Index of functions and function-like macros. +* Variable Index:: Index of variables and variable-like macros. +* File Index:: Index of programs and files. + + --- The Detailed Node Listing --- + +Introduction + +* Getting Started:: Getting Started +* Standards and Portability:: Standards and Portability +* Using the Library:: Using the Library +* Roadmap to the Manual:: Roadmap to the Manual + +Standards and Portability + +* ANSI C:: The American National Standard for the + C programming language. +* POSIX:: The IEEE 1003 standards for operating systems. +* Berkeley Unix:: BSD and SunOS. +* SVID:: The System V Interface Description. + +Using the Library + +* Header Files:: How to use the header files in your programs. +* Macro Definitions:: Some functions in the library may really + be implemented as macros. +* Reserved Names:: The C standard reserves some names for + the library, and some for users. +* Feature Test Macros:: How to control what names are defined. + +Error Reporting + +* Checking for Errors:: How errors are reported by library functions. +* Error Codes:: What all the error codes are. +* Error Messages:: Mapping error codes onto error messages. + +Memory Allocation + +* Memory Concepts:: An introduction to concepts and terminology. 
+* Dynamic Allocation and C:: How to get different kinds of allocation in C. +* Unconstrained Allocation:: The @code{malloc} facility allows fully general + dynamic allocation. +* Obstacks:: Obstacks are less general than malloc + but more efficient and convenient. +* Variable Size Automatic:: Allocation of variable-sized blocks + of automatic storage that are freed when the + calling function returns. +* Relocating Allocator:: Waste less memory, if you can tolerate + automatic relocation of the blocks you get. +* Memory Warnings:: Getting warnings when memory is nearly full. + +Unconstrained Allocation + +* Basic Allocation:: Simple use of @code{malloc}. +* Malloc Examples:: Examples of @code{malloc}. @code{xmalloc}. +* Freeing after Malloc:: Use @code{free} to free a block you + got with @code{malloc}. +* Changing Block Size:: Use @code{realloc} to make a block + bigger or smaller. +* Allocating Cleared Space:: Use @code{calloc} to allocate a + block and clear it. +* Efficiency and Malloc:: Efficiency considerations in use of + these functions. +* Aligned Memory Blocks:: Allocating specially aligned memory: + @code{memalign} and @code{valloc}. +* Heap Consistency Checking:: Automatic checking for errors. +* Hooks for Malloc:: You can use these hooks for debugging + programs that use @code{malloc}. +* Statistics of Malloc:: Getting information about how much + memory your program is using. +* Summary of Malloc:: Summary of @code{malloc} and related functions. + +Obstacks + +* Creating Obstacks:: How to declare an obstack in your program. +* Preparing for Obstacks:: Preparations needed before you can + use obstacks. +* Allocation in an Obstack:: Allocating objects in an obstack. +* Freeing Obstack Objects:: Freeing objects in an obstack. +* Obstack Functions:: The obstack functions are both + functions and macros. +* Growing Objects:: Making an object bigger by stages. +* Extra Fast Growing:: Extra-high-efficiency (though more + complicated) growing objects. 
+* Status of an Obstack:: Inquiries about the status of an obstack. +* Obstacks Data Alignment:: Controlling alignment of objects in obstacks. +* Obstack Chunks:: How obstacks obtain and release chunks. + Efficiency considerations. +* Summary of Obstacks:: + +Automatic Storage with Variable Size + +* Alloca Example:: Example of using @code{alloca}. +* Advantages of Alloca:: Reasons to use @code{alloca}. +* Disadvantages of Alloca:: Reasons to avoid @code{alloca}. +* GNU C Variable-Size Arrays:: Only in GNU C, here is an alternative + method of allocating dynamically and + freeing automatically. +Relocating Allocator + +* Relocator Concepts:: How to understand relocating allocation. +* Using Relocator:: Functions for relocating allocation. + +Character Handling + +* Classification of Characters::Testing whether characters are + letters, digits, punctuation, etc. +* Case Conversion:: Case mapping, and the like. + +String and Array Utilities + +* Representation of Strings:: Introduction to basic concepts. +* String/Array Conventions:: Whether to use a string function or an + arbitrary array function. +* String Length:: Determining the length of a string. +* Copying and Concatenation:: Functions to copy the contents of strings + and arrays. +* String/Array Comparison:: Functions for byte-wise and character-wise + comparison. +* Collation Functions:: Functions for collating strings. +* Search Functions:: Searching for a specific element or substring. +* Finding Tokens in a String:: Splitting a string into tokens by looking + for delimiters. + +Extended Characters + +* Extended Char Intro:: Multibyte codes versus wide characters. +* Locales and Extended Chars:: The locale selects the character codes. +* Multibyte Char Intro:: How multibyte codes are represented. +* Wide Char Intro:: How wide characters are represented. +* Wide String Conversion:: Converting wide strings to multibyte code + and vice versa. +* Length of Char:: how many bytes make up one multibyte char. 
+* Converting One Char:: Converting a string character by character. +* Example of Conversion:: Example showing why converting + one character at a time may be useful. +* Shift State:: Multibyte codes with "shift characters". + +Locales and Internationalization + +* Effects of Locale:: Actions affected by the choice of locale. +* Choosing Locale:: How the user specifies a locale. +* Locale Categories:: Different purposes for which + you can select a locale. +* Setting the Locale:: How a program specifies the locale. +* Standard Locales:: Locale names available on all systems. +* Numeric Formatting:: How to format numbers for the chosen locale. + +Searching and Sorting + +* Comparison Functions:: Defining how to compare two objects. + Since the sort and search facilities are + general, you have to specify the ordering. +* Array Search Function:: The @code{bsearch} function. +* Array Sort Function:: The @code{qsort} function. +* Search/Sort Example:: An example program. + +Pattern Matching + +* Wildcard Matching:: Matching a wildcard pattern against a single string. +* Globbing:: Finding the files that match a wildcard pattern. +* Regular Expressions:: Matching regular expressions against strings. +* Word Expansion:: Expanding shell variables, nested commands, + arithmetic, and wildcards. + This is what the shell does with shell commands. + +I/O Overview + +* I/O Concepts:: Some basic information and terminology. +* File Names:: How to refer to a file. + +I/O Concepts + +* Streams and File Descriptors:: The GNU Library provides two ways + to access the contents of files. +* File Position:: The number of bytes from the + beginning of the file. + +File Names + +* Directories:: Directories contain entries for files. +* File Name Resolution:: A file name specifies how to look up a file. +* File Name Errors:: Error conditions relating to file names. +* File Name Portability:: File name portability and syntax issues. 
+ +I/O on Streams + +* Streams:: About the data type representing a stream. +* Standard Streams:: Streams to the standard input and output + devices are created for you. +* Opening Streams:: How to create a stream to talk to a file. +* Closing Streams:: Close a stream when you are finished with it. +* Simple Output:: Unformatted output by characters and lines. +* Character Input:: Unformatted input by characters and words. +* Line Input:: Reading a line or a record from a stream. +* Unreading:: Peeking ahead/pushing back input just read. +* Formatted Output:: @code{printf} and related functions. +* Customizing Printf:: You can define new conversion specifiers for + @code{printf} and friends. +* Formatted Input:: @code{scanf} and related functions. +* Block Input/Output:: Input and output operations on blocks of data. +* EOF and Errors:: How you can tell if an I/O error happens. +* Binary Streams:: Some systems distinguish between text files + and binary files. +* File Positioning:: About random-access streams. +* Portable Positioning:: Random access on peculiar ANSI C systems. +* Stream Buffering:: How to control buffering of streams. +* Temporary Files:: How to open a temporary file. +* Other Kinds of Streams:: Other Kinds of Streams + +Unreading + +* Unreading Idea:: An explanation of unreading with pictures. +* How Unread:: How to call @code{ungetc} to do unreading. + +Formatted Output + +* Formatted Output Basics:: Some examples to get you started. +* Output Conversion Syntax:: General syntax of conversion specifications. +* Table of Output Conversions:: Summary of output conversions, what they do. +* Integer Conversions:: Details of formatting integers. +* Floating-Point Conversions:: Details of formatting floating-point numbers. +* Other Output Conversions:: Details about formatting of strings, + characters, pointers, and the like. +* Formatted Output Functions:: Descriptions of the actual functions. +* Variable Arguments Output:: @code{vprintf} and friends. 
+* Parsing a Template String:: What kinds of arguments does + a given template call for? + +Customizing Printf + +* Registering New Conversions:: +* Conversion Specifier Options:: +* Defining the Output Handler:: +* Printf Extension Example:: + +Formatted Input + +* Formatted Input Basics:: Some basics to get you started. +* Input Conversion Syntax:: Syntax of conversion specifications. +* Table of Input Conversions:: Summary of input conversions and what they do. +* Numeric Input Conversions:: Details of conversions for reading numbers. +* String Input Conversions:: Details of conversions for reading strings. +* Other Input Conversions:: Details of miscellaneous other conversions. +* Formatted Input Functions:: Descriptions of the actual functions. +* Variable Arguments Input:: @code{vscanf} and friends. + +Stream Buffering + +* Buffering Concepts:: Terminology is defined here. +* Flushing Buffers:: How to ensure that output buffers are flushed. +* Controlling Buffering:: How to specify what kind of buffering to use. + +Other Kinds of Streams + +* String Streams:: +* Custom Streams:: + +Programming Your Own Custom Streams + +* Streams and Cookies:: +* Hook Functions:: + +Low-Level I/O + +* Opening and Closing Files:: How to open and close file descriptors. +* I/O Primitives:: Reading and writing data. +* File Position Primitive:: Setting a descriptor's file position. +* Descriptors and Streams:: Converting descriptor to stream or vice-versa. +* Stream/Descriptor Precautions:: Precautions needed if you use both + descriptors and streams. +* Waiting for I/O:: How to check for input or output + on multiple file descriptors. +* Control Operations:: Various other operations on file descriptors. +* Duplicating Descriptors:: Fcntl commands for duplicating descriptors. +* Descriptor Flags:: Fcntl commands for manipulating flags + associated with file descriptors. +* File Status Flags:: Fcntl commands for manipulating flags + associated with open files. 
+* File Locks:: Fcntl commands for implementing file locking. +* Interrupt Input:: Getting a signal when input arrives. + +File System Interface + +* Working Directory:: This is used to resolve relative file names. +* Accessing Directories:: Finding out what files a directory contains. +* Hard Links:: Adding alternate names to a file. +* Symbolic Links:: A file that ``points to'' a file name. +* Deleting Files:: How to delete a file, and what that means. +* Renaming Files:: Changing a file's name. +* Creating Directories:: A system call just for creating a directory. +* File Attributes:: Attributes of individual files. +* Making Special Files:: How to create special files. + +Accessing Directories + +* Directory Entries:: Format of one directory entry. +* Opening a Directory:: How to open a directory stream. +* Reading/Closing Directory:: How to read directory entries from the stream. +* Simple Directory Lister:: A very simple directory listing program. +* Random Access Directory:: Rereading part of the directory + already read with the same stream. + +File Attributes + +* Attribute Meanings:: The names of the file attributes, + and what their values mean. +* Reading Attributes:: How to read the attributes of a file. +* Testing File Type:: Distinguishing ordinary files, + directories, links... +* File Owner:: How ownership for new files is determined, + and how to change it. +* Permission Bits:: How information about a file's access mode + is stored. +* Access Permission:: How the system decides who can access a file. +* Setting Permissions:: How permissions for new files are assigned, + and how to change them. +* Testing File Access:: How to find out if your process can + access a file. +* File Times:: About the time attributes of a file. + +Pipes and FIFOs + +* Creating a Pipe:: Making a pipe with the @code{pipe} function. +* Pipe to a Subprocess:: Using a pipe to communicate with a child. +* FIFO Special Files:: Making a FIFO special file. 
+ +Sockets + +* Socket Concepts:: Basic concepts you need to know about. +* Communication Styles:: Stream communication, datagrams, and others. +* Socket Addresses:: How socket names (``addresses'') work. +* File Namespace:: Details about the file namespace. +* Internet Namespace:: Details about the Internet namespace. +* Open/Close Sockets:: Creating sockets and destroying them. +* Connections:: Operations on sockets with connection state. +* Datagrams:: Operations on datagram sockets. +* Socket Options:: Miscellaneous low-level socket options. +* Networks Database:: Accessing the database of network names. + +Socket Addresses + +* Address Formats:: About @code{struct sockaddr}. +* Setting Address:: Binding an address to a socket. +* Reading Address:: Reading the address of a socket. + +Internet Domain + +* Internet Address Format:: How socket addresses are specified in the + Internet namespace. +* Host Addresses:: All about host addresses of Internet hosts. +* Protocols Database:: Referring to protocols by name. +* Services Database:: Ports may have symbolic names. +* Byte Order:: Different hosts may use different byte + ordering conventions; you need to + canonicalize host address and port number. +* Inet Example:: Putting it all together. + +Host Addresses + +* Abstract Host Addresses:: What a host number consists of. +* Data type: Host Address Data Type. Data type for a host number. +* Functions: Host Address Functions. Functions to operate on them. +* Names: Host Names. Translating host names to host numbers. + +Open/Close Sockets + +* Creating a Socket:: How to open a socket. +* Closing a Socket:: How to close a socket. +* Socket Pairs:: These are created like pipes. + +Connections + +* Connecting:: What the client program must do. +* Listening:: How a server program waits for requests. +* Accepting Connections:: What the server does when it gets a request. +* Who is Connected:: Getting the address of the + other side of a connection. 
+* Transferring Data:: How to send and receive data. +* Byte Stream Example:: An example client for communicating over a + byte stream socket in the Internet namespace. +* Server Example:: A corresponding server program. +* Out-of-Band Data:: This is an advanced feature. + +Transferring Data + +* Sending Data:: Sending data with @code{write}. +* Receiving Data:: Reading data with @code{read}. +* Socket Data Options:: Using @code{send} and @code{recv}. + +Datagrams + +* Sending Datagrams:: Sending packets on a datagram socket. +* Receiving Datagrams:: Receiving packets on a datagram socket. +* Datagram Example:: An example program: packets sent over a + datagram stream in the file namespace. +* Example Receiver:: Another program, that receives those packets. + +Socket Options + +* Socket Option Functions:: The basic functions for setting and getting + socket options. +* Socket-Level Options:: Details of the options at the socket level. + +Low-Level Terminal Interface + +* Is It a Terminal:: How to determine if a file is a terminal + device, and what its name is. +* I/O Queues:: About flow control and typeahead. +* Canonical or Not:: Two basic styles of input processing. +* Terminal Modes:: How to examine and modify flags controlling + terminal I/O: echoing, signals, editing. +* Line Control:: Sending break sequences, clearing buffers... +* Noncanon Example:: How to read single characters without echo. + +Terminal Modes + +* Mode Data Types:: The data type @code{struct termios} and related types. +* Mode Functions:: Functions to read and set terminal attributes. +* Setting Modes:: The right way to set attributes reliably. +* Input Modes:: Flags controlling low-level input handling. +* Output Modes:: Flags controlling low-level output handling. +* Control Modes:: Flags controlling serial port behavior. +* Local Modes:: Flags controlling high-level input handling. +* Line Speed:: How to read and set the terminal line speed. 
+* Special Characters:: Characters that have special effects, + and how to change them. +* Noncanonical Input:: Controlling how long to wait for input. + +Special Characters + +* Editing Characters:: +* Signal Characters:: +* Start/Stop Characters:: + +Mathematics + +* Domain and Range Errors:: How overflow conditions and the + like are reported. +* Not a Number:: Making NANs and testing for NANs. +* Trig Functions:: Sine, cosine, and tangent. +* Inverse Trig Functions:: Arc sine, arc cosine, and arc tangent. +* Exponents and Logarithms:: Also includes square root. +* Hyperbolic Functions:: Hyperbolic sine and friends. +* Pseudo-Random Numbers:: Functions for generating pseudo-random numbers. +* Absolute Value:: Absolute value functions. + +Pseudo-Random Numbers + +* ANSI Random:: @code{rand} and friends. +* BSD Random:: @code{random} and friends. + +Low-Level Arithmetic Functions + +* Normalization Functions:: Hacks for radix-2 representations. +* Rounding and Remainders:: Determinining the integer and + fractional parts of a float. +* Integer Division:: Functions for performing integer division. +* Parsing of Numbers:: Functions for ``reading'' numbers from strings. +* Predicates on Floats:: Some miscellaneous test functions. + +Parsing of Numbers + +* Parsing of Integers:: Functions for conversion of integer values. +* Parsing of Floats:: Functions for conversion of floating-point. + +Date and Time + +* Processor Time:: Measures processor time used by a program. +* Calendar Time:: Manipulation of ``real'' dates and times. +* Setting an Alarm:: Sending a signal after a specified time. +* Sleeping:: Waiting for a period of time. + +Processor Time + +* Basic CPU Time:: The @code{clock} function. +* Detailed CPU Time:: The @code{times} function. + +Calendar Time + +* Simple Calendar Time:: Facilities for manipulating calendar time. +* High-Resolution Calendar:: A time representation with greater precision. 
+* Broken-down Time:: Facilities for manipulating local time. +* Formatting Date and Time:: Converting times to strings. +* TZ Variable:: How users specify the time zone. +* Time Zone Functions:: Functions to examine or specify the time zone. +* Time Functions Example:: An example program showing use of some of + the time functions. + +Signal Handling + +* Concepts of Signals:: Introduction to the signal facilities. +* Standard Signals:: Particular kinds of signals with standard + names and meanings. +* Signal Actions:: Specifying what happens when a particular + signal is delivered. +* Defining Handlers:: How to write a signal handler function. +* Generating Signals:: How to send a signal to a process. +* Blocking Signals:: Making the system hold signals temporarily. +* Waiting for a Signal:: Suspending your program until a signal arrives. +* Signal Stack:: Using a Separate Signal Stack +* BSD Signal Handling:: Additional functions for backward + compatibility with BSD. + +Basic Concepts of Signals + +* Kinds of Signals:: Some examples of what can cause a signal. +* Signal Generation:: Concepts of why and how signals occur. +* Delivery of Signal:: Concepts of what a signal does to the process. + +Standard Signals + +* Program Error Signals:: Used to report serious program errors. +* Termination Signals:: Used to interrupt and/or terminate the program. +* Alarm Signals:: Used to indicate expiration of timers. +* Asynchronous I/O Signals:: Used to indicate input is available. +* Job Control Signals:: Signals used to support job control. +* Operation Error Signals:: Used to report operational system errors. +* Miscellaneous Signals:: Miscellaneous Signals. +* Signal Messages:: Printing a message describing a signal. + +Specifying Signal Actions + +* Basic Signal Handling:: The simple @code{signal} function. +* Advanced Signal Handling:: The more powerful @code{sigaction} function. +* Signal and Sigaction:: How those two functions interact. 
+* Sigaction Function Example:: An example of using the sigaction function. +* Flags for Sigaction:: Specifying options for signal handling. +* Initial Signal Actions:: How programs inherit signal actions. + +Defining Signal Handlers + +* Handler Returns:: +* Termination in Handler:: +* Longjmp in Handler:: +* Signals in Handler:: +* Nonreentrancy:: +* Atomic Data Access:: + +Generating Signals + +* Signaling Yourself:: Signaling Yourself +* Signaling Another Process:: Send a signal to another process. +* Permission for kill:: Permission for using @code{kill} +* Kill Example:: Using @code{kill} for Communication + +Blocking Signals + +* Why Block:: The purpose of blocking signals. +* Signal Sets:: How to specify which signals to block. +* Process Signal Mask:: Blocking delivery of signals to your + process during normal execution. +* Testing for Delivery:: Blocking to Test for Delivery of a Signal +* Blocking for Handler:: Blocking additional signals while a + handler is being run. +* Checking for Pending Signals::Checking for Pending Signals +* Remembering a Signal:: How you can get almost the same effect + as blocking a signal, by handling it + and setting a flag to be tested later. + +Waiting for a Signal + +* Using Pause:: The simple way, using @code{pause}. +* Pause Problems:: Why the simple way is often not very good. +* Sigsuspend:: Reliably waiting for a specific signal. + +BSD Signal Handling + +* BSD Handler:: BSD Function to Establish a Handler. +* Blocking in BSD:: BSD Functions for Blocking Signals + +Process Startup and Termination + +* Program Arguments:: Parsing your program's command-line arguments. +* Environment Variables:: How to access parameters inherited from + a parent process. +* Program Termination:: How to cause a process to terminate and + return status information to its parent. + +Program Arguments + +* Argument Syntax:: By convention, options start with a hyphen. +* Parsing Options:: The @code{getopt} function. 
+* Example of Getopt:: An example of parsing options with @code{getopt}. +* Long Options:: GNU utilities should accept long-named options. + Here is how to do that. +* Long Option Example:: An example of using @code{getopt_long}. + +Environment Variables + +* Environment Access:: How to get and set the values of + environment variables. +* Standard Environment:: These environment variables have + standard interpretations. + +Program Termination + +* Normal Termination:: If a program calls @code{exit}, a + process terminates normally. +* Exit Status:: The @code{exit status} provides information + about why the process terminated. +* Cleanups on Exit:: A process can run its own cleanup + functions upon normal termination. +* Aborting a Program:: The @code{abort} function causes + abnormal program termination. +* Termination Internals:: What happens when a process terminates. + + +Child Processes + +* Running a Command:: The easy way to run another program. +* Process Creation Concepts:: An overview of the hard way to do it. +* Process Identification:: How to get the process ID of a process. +* Creating a Process:: How to fork a child process. +* Executing a File:: How to make a child execute another program. +* Process Completion:: How to tell when a child process has completed. +* Process Completion Status:: How to interpret the status value + returned from a child process. +* BSD Wait Functions:: More functions, for backward compatibility. +* Process Creation Example:: A complete example program. + +Job Control + +* Concepts of Job Control :: Concepts of Job Control +* Job Control is Optional:: Not all POSIX systems support job control. +* Controlling Terminal:: How a process gets its controlling terminal. +* Access to the Terminal:: How processes share the controlling terminal. +* Orphaned Process Groups:: Jobs left after the user logs out. +* Implementing a Shell:: What a shell must do to implement job control. 
+* Functions for Job Control:: Functions to control process groups. + +Implementing a Job Control Shell + +* Data Structures:: Introduction to the sample shell. +* Initializing the Shell:: What the shell must do to take + responsibility for job control. +* Launching Jobs:: Creating jobs to execute commands. +* Foreground and Background:: Putting a job in foreground of background. +* Stopped and Terminated Jobs:: Reporting job status. +* Continuing Stopped Jobs:: How to continue a stopped job in + the foreground or background. +* Missing Pieces:: Other parts of the shell. + +Functions for Job Control + +* Identifying the Terminal:: Determining the controlling terminal's name. +* Process Group Functions:: Functions for manipulating process groups. +* Terminal Access Functions:: Functions for controlling terminal access. + +Users and Groups + +* User and Group IDs:: Each user and group has a unique numeric ID. +* Process Persona:: The user IDs and group IDs of a process. +* Why Change Persona:: Why a program might need to change + its user and/or group IDs. +* How Change Persona:: Restrictions on changing user and group IDs. +* Reading Persona:: Examining the process's user and group IDs. +* Setting User ID:: +* Setting Groups:: +* Enable/Disable Setuid:: +* Setuid Program Example:: Setuid Program Example +* Tips for Setuid:: +* Who Logged In:: Getting the name of the user who logged in, + or of the real user ID of the current process. + +* User Database:: Functions and data structures for + accessing the user database. +* Group Database:: Functions and data structures for + accessing the group database. +* Database Example:: Example program showing use of database + inquiry functions. 
+ +User Database + +* User Data Structure:: +* Lookup User:: +* Scanning All Users:: Scanning the List of All Users +* Writing a User Entry:: + +Group Database + +* Group Data Structure:: +* Lookup Group:: +* Scanning All Groups:: Scanning the List of All Groups + +System Information + +* Host Identification:: Determining the name of the machine. +* Hardware/Software Type ID:: Determining the hardware type and + operating system type. + +System Configuration Limits + +* General Limits:: Constants and functions that describe + various process-related limits that have + one uniform value for any given machine. +* System Options:: Optional POSIX features. +* Version Supported:: Version numbers of POSIX.1 and POSIX.2. +* Sysconf:: Getting specific configuration values + of general limits and system options. +* Minimums:: Minimum values for general limits. + +* Limits for Files:: Size limitations on individual files. + These can vary between file systems + or even from file to file. +* Options for Files:: Optional features that some files may support. +* File Minimums:: Minimum values for file limits. +* Pathconf:: Getting the limit values for a particular file. + +* Utility Limits:: Capacity limits of POSIX.2 utility programs. +* Utility Minimums:: Minimum allowable values of those limits. + +* String Parameters:: Getting the default search path. + +Library Facilities that are Part of the C Language + +* Consistency Checking:: Using @code{assert} to abort + if something ``impossible'' happens. +* Variadic Functions:: Defining functions with varying + numbers of arguments. +* Null Pointer Constant:: The macro @code{NULL}. +* Important Data Types:: Data types for object sizes. +* Data Type Measurements:: Parameters of data type representations. + +Variadic Functions + +* Why Variadic:: Reasons for making functions take + variable arguments. +* How Variadic:: How to define and call variadic functions. 
+* Argument Macros:: Detailed specification of the macros + for accessing variable arguments. +* Variadic Example:: A complete example. + +How Variadic Functions are Defined and Used + +* Variadic Prototypes:: How to make a prototype for a function + with variable arguments. +* Receiving Arguments:: Steps you must follow to access the + optional argument values. +* How Many Arguments:: How to decide whether there are more arguments. +* Calling Variadics:: Things you need to know about calling + functions with variable arguments. + +Data Type Measurements + +* Width of Type:: How many bits does an integer type hold? +* Range of Type:: What are the largest and smallest values + that an integer type can hold? +* Floating Type Macros:: Parameters that measure floating-point types. +* Structure Measurement:: Getting measurements on structure types. + +Floating Type Macros + +* Floating Point Concepts:: Definitions of terminology. +* Floating Point Parameters:: Dimensions, limits of floating point types. +* IEEE Floating Point:: How one common representation is described. + +Library Maintenance + +* Installation:: How to configure, compile and install + the GNU C library. +* Reporting Bugs:: How to report bugs (if you want to + get them fixed) and other troubles + you may have with the GNU C library. +* Porting:: How to port the GNU C library to + a new machine or operating system. +@c * Traditional C Compatibility:: Using the GNU C library with non-ANSI +@c C compilers. +* Contributors:: Who wrote what parts of the GNU C Library. + +Porting the GNU C Library + +* Hierarchy Conventions:: How the @file{sysdeps} hierarchy is + laid out. +* Porting to Unix:: Porting the library to an average + Unix-like system. +@end menu + + +@comment Includes of all the individual chapters.
+@include intro.texi +@include errno.texi +@include memory.texi +@include ctype.texi +@include string.texi +@include io.texi +@include stdio.texi +@include llio.texi +@include filesys.texi +@include pipe.texi +@include socket.texi +@include terminal.texi +@include math.texi +@include arith.texi +@include search.texi +@include pattern.texi +@include time.texi +@include mbyte.texi +@include locale.texi +@include setjmp.texi +@include signal.texi +@include startup.texi +@include process.texi +@include job.texi +@include users.texi +@include sysinfo.texi +@include conf.texi + +@comment Includes of the appendices. +@include lang.texi +@include header.texi +@include maint.texi + + +@set lgpl-appendix +@node Copying, Concept Index, Maintenance, Top +@include lgpl.texinfo + + +@node Concept Index, Type Index, Copying, Top +@unnumbered Concept Index + +@printindex cp + +@node Type Index, Function Index, Concept Index, Top +@unnumbered Type Index + +@printindex tp + +@node Function Index, Variable Index, Type Index, Top +@unnumbered Function and Macro Index + +@printindex fn + +@node Variable Index, File Index, Function Index, Top +@unnumbered Variable and Constant Macro Index + +@printindex vr + +@node File Index, , Variable Index, Top +@unnumbered Program and File Index + +@printindex pg + + +@shortcontents +@contents +@bye diff --git a/manual/libcbook.texi b/manual/libcbook.texi new file mode 100644 index 0000000000..b248304ede --- /dev/null +++ b/manual/libcbook.texi @@ -0,0 +1,3 @@ +\input texinfo +@finalout +@include libc.texinfo diff --git a/manual/llio.texi b/manual/llio.texi new file mode 100644 index 0000000000..6a5a5d27e0 --- /dev/null +++ b/manual/llio.texi @@ -0,0 +1,1979 @@ +@node Low-Level I/O, File System Interface, I/O on Streams, Top +@chapter Low-Level Input/Output + +This chapter describes functions for performing low-level input/output +operations on file descriptors. 
These functions include the primitives +for the higher-level I/O functions described in @ref{I/O on Streams}, as +well as functions for performing low-level control operations for which +there are no equivalents on streams. + +Stream-level I/O is more flexible and usually more convenient; +therefore, programmers generally use the descriptor-level functions only +when necessary. These are some of the usual reasons: + +@itemize @bullet +@item +For reading binary files in large chunks. + +@item +For reading an entire file into core before parsing it. + +@item +To perform operations other than data transfer, which can only be done +with a descriptor. (You can use @code{fileno} to get the descriptor +corresponding to a stream.) + +@item +To pass descriptors to a child process. (The child can create its own +stream to use a descriptor that it inherits, but cannot inherit a stream +directly.) +@end itemize + +@menu +* Opening and Closing Files:: How to open and close file + descriptors. +* I/O Primitives:: Reading and writing data. +* File Position Primitive:: Setting a descriptor's file + position. +* Descriptors and Streams:: Converting descriptor to stream + or vice-versa. +* Stream/Descriptor Precautions:: Precautions needed if you use both + descriptors and streams. +* Waiting for I/O:: How to check for input or output + on multiple file descriptors. +* Control Operations:: Various other operations on file + descriptors. +* Duplicating Descriptors:: Fcntl commands for duplicating + file descriptors. +* Descriptor Flags:: Fcntl commands for manipulating + flags associated with file + descriptors. +* File Status Flags:: Fcntl commands for manipulating + flags associated with open files. +* File Locks:: Fcntl commands for implementing + file locking. +* Interrupt Input:: Getting an asynchronous signal when + input arrives. 
+@end menu + + +@node Opening and Closing Files +@section Opening and Closing Files + +@cindex opening a file descriptor +@cindex closing a file descriptor +This section describes the primitives for opening and closing files +using file descriptors. The @code{open} and @code{creat} functions are +declared in the header file @file{fcntl.h}, while @code{close} is +declared in @file{unistd.h}. +@pindex unistd.h +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypefun int open (const char *@var{filename}, int @var{flags}[, mode_t @var{mode}]) +The @code{open} function creates and returns a new file descriptor +for the file named by @var{filename}. Initially, the file position +indicator for the file is at the beginning of the file. The argument +@var{mode} is used only when a file is created, but it doesn't hurt +to supply the argument in any case. + +The @var{flags} argument controls how the file is to be opened. This is +a bit mask; you create the value by the bitwise OR of the appropriate +parameters (using the @samp{|} operator in C). +@xref{File Status Flags}, for the parameters available. + +The normal return value from @code{open} is a non-negative integer file +descriptor. In the case of an error, a value of @code{-1} is returned +instead. In addition to the usual file name errors (@pxref{File +Name Errors}), the following @code{errno} error conditions are defined +for this function: + +@table @code +@item EACCES +The file exists but is not readable/writable as requested by the @var{flags} +argument, or the file does not exist and the directory is unwritable so +it cannot be created. + +@item EEXIST +Both @code{O_CREAT} and @code{O_EXCL} are set, and the named file already +exists. + +@item EINTR +The @code{open} operation was interrupted by a signal. +@xref{Interrupted Primitives}. + +@item EISDIR +The @var{flags} argument specified write access, and the file is a directory. + +@item EMFILE +The process has too many files open.
+The maximum number of file descriptors is controlled by the +@code{RLIMIT_NOFILE} resource limit; @pxref{Limits on Resources}. + +@item ENFILE +The entire system, or perhaps the file system which contains the +directory, cannot support any additional open files at the moment. +(This problem cannot happen on the GNU system.) + +@item ENOENT +The named file does not exist, and @code{O_CREAT} is not specified. + +@item ENOSPC +The directory or file system that would contain the new file cannot be +extended, because there is no disk space left. + +@item ENXIO +@code{O_NONBLOCK} and @code{O_WRONLY} are both set in the @var{flags} +argument, the file named by @var{filename} is a FIFO (@pxref{Pipes and +FIFOs}), and no process has the file open for reading. + +@item EROFS +The file resides on a read-only file system and any of @w{@code{O_WRONLY}}, +@code{O_RDWR}, and @code{O_TRUNC} are set in the @var{flags} argument, +or @code{O_CREAT} is set and the file does not already exist. +@end table + +@c !!! umask + +The @code{open} function is the underlying primitive for the @code{fopen} +and @code{freopen} functions, that create streams. +@end deftypefun + +@comment fcntl.h +@comment POSIX.1 +@deftypefn {Obsolete function} int creat (const char *@var{filename}, mode_t @var{mode}) +This function is obsolete. The call: + +@smallexample +creat (@var{filename}, @var{mode}) +@end smallexample + +@noindent +is equivalent to: + +@smallexample +open (@var{filename}, O_WRONLY | O_CREAT | O_TRUNC, @var{mode}) +@end smallexample +@end deftypefn + +@comment unistd.h +@comment POSIX.1 +@deftypefun int close (int @var{filedes}) +The function @code{close} closes the file descriptor @var{filedes}. +Closing a file has the following consequences: + +@itemize @bullet +@item +The file descriptor is deallocated. + +@item +Any record locks owned by the process on the file are unlocked. 
+ +@item +When all file descriptors associated with a pipe or FIFO have been closed, +any unread data is discarded. +@end itemize + +The normal return value from @code{close} is @code{0}; a value of @code{-1} +is returned in case of failure. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item EINTR +The @code{close} call was interrupted by a signal. +@xref{Interrupted Primitives}. +Here is an example of how to handle @code{EINTR} properly: + +@smallexample +TEMP_FAILURE_RETRY (close (desc)); +@end smallexample + +@item ENOSPC +@itemx EIO +@itemx EDQUOT +When the file is accessed by NFS, these errors from @code{write} can sometimes +not be detected until @code{close}. @xref{I/O Primitives}, for details +on their meaning. +@end table +@end deftypefun + +To close a stream, call @code{fclose} (@pxref{Closing Streams}) instead +of trying to close its underlying file descriptor with @code{close}. +This flushes any buffered output and updates the stream object to +indicate that it is closed. + +@node I/O Primitives +@section Input and Output Primitives + +This section describes the functions for performing primitive input and +output operations on file descriptors: @code{read}, @code{write}, and +@code{lseek}. These functions are declared in the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftp {Data Type} ssize_t +This data type is used to represent the sizes of blocks that can be +read or written in a single operation. It is similar to @code{size_t}, +but must be a signed type. +@end deftp + +@cindex reading from a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun ssize_t read (int @var{filedes}, void *@var{buffer}, size_t @var{size}) +The @code{read} function reads up to @var{size} bytes from the file +with descriptor @var{filedes}, storing the results in the @var{buffer}. 
+(This is not necessarily a character string and there is no terminating +null character added.) + +@cindex end-of-file, on a file descriptor +The return value is the number of bytes actually read. This might be +less than @var{size}; for example, if there aren't that many bytes left +in the file or if there aren't that many bytes immediately available. +The exact behavior depends on what kind of file it is. Note that +reading less than @var{size} bytes is not an error. + +A value of zero indicates end-of-file (except if the value of the +@var{size} argument is also zero). This is not considered an error. +If you keep calling @code{read} while at end-of-file, it will keep +returning zero and doing nothing else. + +If @code{read} returns at least one character, there is no way you can +tell whether end-of-file was reached. But if you did reach the end, the +next read will return zero. + +In case of an error, @code{read} returns @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EAGAIN +Normally, when no input is immediately available, @code{read} waits for +some input. But if the @code{O_NONBLOCK} flag is set for the file +(@pxref{File Status Flags}), @code{read} returns immediately without +reading any data, and reports this error. + +@strong{Compatibility Note:} Most versions of BSD Unix use a different +error code for this: @code{EWOULDBLOCK}. In the GNU library, +@code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter +which name you use. + +On some systems, reading a large amount of data from a character special +file can also fail with @code{EAGAIN} if the kernel cannot find enough +physical memory to lock down the user's pages. This is limited to +devices that transfer with direct memory access into the user's memory, +which means it does not include terminals, since they always use +separate buffers inside the kernel. This problem never happens in the +GNU system. 
+ +Any condition that could result in @code{EAGAIN} can instead result in a +successful @code{read} which returns fewer bytes than requested. +Calling @code{read} again immediately would result in @code{EAGAIN}. + +@item EBADF +The @var{filedes} argument is not a valid file descriptor, +or is not open for reading. + +@item EINTR +@code{read} was interrupted by a signal while it was waiting for input. +@xref{Interrupted Primitives}. A signal will not necessarily cause +@code{read} to return @code{EINTR}; it may instead result in a +successful @code{read} which returns fewer bytes than requested. + +@item EIO +For many devices, and for disk files, this error code indicates +a hardware error. + +@code{EIO} also occurs when a background process tries to read from the +controlling terminal, and the normal action of stopping the process by +sending it a @code{SIGTTIN} signal isn't working. This might happen if +the signal is being blocked or ignored, or because the process group is +orphaned. @xref{Job Control}, for more information about job control, +and @ref{Signal Handling}, for information about signals. +@end table + +The @code{read} function is the underlying primitive for all of the +functions that read from streams, such as @code{fgetc}. +@end deftypefun + +@cindex writing to a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun ssize_t write (int @var{filedes}, const void *@var{buffer}, size_t @var{size}) +The @code{write} function writes up to @var{size} bytes from +@var{buffer} to the file with descriptor @var{filedes}. The data in +@var{buffer} is not necessarily a character string and a null character is +output like any other character. + +The return value is the number of bytes actually written. This may be +@var{size}, but can always be smaller. Your program should always call +@code{write} in a loop, iterating until all the data is written.
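
A loop of this kind can be sketched as follows. This is only an illustration, not part of the library: the name @code{write_all} is hypothetical, and the sketch assumes a blocking descriptor and chooses to retry whenever @code{write} fails with @code{EINTR} (@pxref{Interrupted Primitives}).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a library function: write all SIZE bytes
   of BUFFER to FILEDES, continuing after short writes and retrying
   after EINTR.  Returns 0 on success, or -1 with errno set by write.  */
int
write_all (int filedes, const void *buffer, size_t size)
{
  const char *p = buffer;

  assert (buffer != NULL || size == 0);
  while (size > 0)
    {
      ssize_t n = write (filedes, p, size);

      if (n < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted before writing; retry.  */
          return -1;            /* Any other error is reported.  */
        }
      p += n;                   /* Skip the bytes already written.  */
      size -= (size_t) n;
    }
  return 0;
}
```

A caller might write @code{write_all (desc, buffer, count)} and treat a return value of @code{-1} as a failure, inspecting @code{errno} for the cause.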
+ +Once @code{write} returns, the data is enqueued to be written and can be +read back right away, but it is not necessarily written out to permanent +storage immediately. You can use @code{fsync} when you need to be sure +your data has been permanently stored before continuing. (It is more +efficient for the system to batch up consecutive writes and do them all +at once when convenient. Normally they will always be written to disk +within a minute or less.) +@c !!! xref fsync +You can use the @code{O_FSYNC} open mode to make @code{write} always +store the data to disk before returning; @pxref{Operating Modes}. + +In the case of an error, @code{write} returns @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EAGAIN +Normally, @code{write} blocks until the write operation is complete. +But if the @code{O_NONBLOCK} flag is set for the file (@pxref{Control +Operations}), it returns immediately without writing any data, and +reports this error. An example of a situation that might cause the +process to block on output is writing to a terminal device that supports +flow control, where output has been suspended by receipt of a STOP +character. + +@strong{Compatibility Note:} Most versions of BSD Unix use a different +error code for this: @code{EWOULDBLOCK}. In the GNU library, +@code{EWOULDBLOCK} is an alias for @code{EAGAIN}, so it doesn't matter +which name you use. + +On some systems, writing a large amount of data to a character special +file can also fail with @code{EAGAIN} if the kernel cannot find enough +physical memory to lock down the user's pages. This is limited to +devices that transfer with direct memory access into the user's memory, +which means it does not include terminals, since they always use +separate buffers inside the kernel. This problem does not arise in the +GNU system. + +@item EBADF +The @var{filedes} argument is not a valid file descriptor, +or is not open for writing.
+ +@item EFBIG +The size of the file would become larger than the implementation can support. + +@item EINTR +The @code{write} operation was interrupted by a signal while it was +blocked waiting for completion. A signal will not necessarily cause +@code{write} to return @code{EINTR}; it may instead result in a +successful @code{write} which writes fewer bytes than requested. +@xref{Interrupted Primitives}. + +@item EIO +For many devices, and for disk files, this error code indicates +a hardware error. + +@item ENOSPC +The device containing the file is full. + +@item EPIPE +This error is returned when you try to write to a pipe or FIFO that +isn't open for reading by any process. When this happens, a @code{SIGPIPE} +signal is also sent to the process; see @ref{Signal Handling}. +@end table + +Unless you have arranged to prevent @code{EINTR} failures, you should +check @code{errno} after each failing call to @code{write}, and if the +error was @code{EINTR}, you should simply repeat the call. +@xref{Interrupted Primitives}. The easy way to do this is with the +macro @code{TEMP_FAILURE_RETRY}, as follows: + +@smallexample +nbytes = TEMP_FAILURE_RETRY (write (desc, buffer, count)); +@end smallexample + +The @code{write} function is the underlying primitive for all of the +functions that write to streams, such as @code{fputc}. +@end deftypefun + +@node File Position Primitive +@section Setting the File Position of a Descriptor + +Just as you can set the file position of a stream with @code{fseek}, you +can set the file position of a descriptor with @code{lseek}. This +specifies the position in the file for the next @code{read} or +@code{write} operation. @xref{File Positioning}, for more information +on the file position and what it means. + +To read the current file position value from a descriptor, use +@code{lseek (@var{desc}, 0, SEEK_CUR)}.
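
As a rough sketch of both uses of @code{lseek}, the following fragment wraps the position query in a function and opens a file already positioned at a given offset. The names @code{current_position} and @code{open_at_offset} are hypothetical helpers invented for this illustration; they are not part of the library, and the sketch assumes the file is seekable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: report where the next read or write on DESC
   will take place, or -1 on error (for example, if DESC is a pipe).  */
off_t
current_position (int desc)
{
  return lseek (desc, 0, SEEK_CUR);
}

/* Hypothetical helper: open FILENAME for reading with its file
   position set OFFSET bytes from the beginning.  Returns the new
   descriptor, or -1 with errno set if open or lseek fails.  */
int
open_at_offset (const char *filename, off_t offset)
{
  int desc = open (filename, O_RDONLY);

  if (desc < 0)
    return -1;
  if (lseek (desc, offset, SEEK_SET) == (off_t) -1)
    {
      int saved_errno = errno;  /* close may change errno.  */
      close (desc);
      errno = saved_errno;
      return -1;
    }
  /* The position just set is what a query now reports.  */
  assert (current_position (desc) == offset);
  return desc;
}
```

The first @code{read} on a descriptor returned by @code{open_at_offset} starts at the requested offset, and each transfer advances the value that @code{current_position} reports.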
+ +@cindex file positioning on a file descriptor +@cindex positioning a file descriptor +@cindex seeking on a file descriptor +@comment unistd.h +@comment POSIX.1 +@deftypefun off_t lseek (int @var{filedes}, off_t @var{offset}, int @var{whence}) +The @code{lseek} function is used to change the file position of the +file with descriptor @var{filedes}. + +The @var{whence} argument specifies how the @var{offset} should be +interpreted in the same way as for the @code{fseek} function, and must be +one of the symbolic constants @code{SEEK_SET}, @code{SEEK_CUR}, or +@code{SEEK_END}. + +@table @code +@item SEEK_SET +Specifies that @var{offset} is a count of characters from the beginning +of the file. + +@item SEEK_CUR +Specifies that @var{offset} is a count of characters from the current +file position. This count may be positive or negative. + +@item SEEK_END +Specifies that @var{offset} is a count of characters from the end of +the file. A negative count specifies a position within the current +extent of the file; a positive count specifies a position past the +current end. If you set the position past the current end, and +actually write data, you will extend the file with zeros up to that +position. +@end table + +The return value from @code{lseek} is normally the resulting file +position, measured in bytes from the beginning of the file. +You can use this feature together with @code{SEEK_CUR} to read the +current file position. + +If you want to append to the file, setting the file position to the +current end of file with @code{SEEK_END} is not sufficient. Another +process may write more data after you seek but before you write, +extending the file so the position you write onto clobbers their data. +Instead, use the @code{O_APPEND} operating mode; @pxref{Operating Modes}. + +You can set the file position past the current end of the file. This +does not by itself make the file longer; @code{lseek} never changes the +file.
But subsequent output at that position will extend the file. +Characters between the previous end of file and the new position are +filled with zeros. Extending the file in this way can create a +``hole'': the blocks of zeros are not actually allocated on disk, so the +file takes up less space than it appears to; it is then called a +``sparse file''. +@cindex sparse files +@cindex holes in files + +If the file position cannot be changed, or the operation is in some way +invalid, @code{lseek} returns a value of @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@item EINVAL +The @var{whence} argument value is not valid, or the resulting +file offset is not valid (for example, it would be negative). + +@item ESPIPE +The @var{filedes} corresponds to an object that cannot be positioned, +such as a pipe, FIFO or terminal device. (POSIX.1 specifies this error +only for pipes and FIFOs, but in the GNU system, you always get +@code{ESPIPE} if the object is not seekable.) +@end table + +The @code{lseek} function is the underlying primitive for the +@code{fseek}, @code{ftell} and @code{rewind} functions, which operate on +streams instead of file descriptors. +@end deftypefun + +You can have multiple descriptors for the same file if you open the file +more than once, or if you duplicate a descriptor with @code{dup}. +Descriptors that come from separate calls to @code{open} have independent +file positions; using @code{lseek} on one descriptor has no effect on the +other. For example, + +@smallexample +@group +@{ + int d1, d2; + char buf[4]; + d1 = open ("foo", O_RDONLY); + d2 = open ("foo", O_RDONLY); + lseek (d1, 1024, SEEK_SET); + read (d2, buf, 4); +@} +@end group +@end smallexample + +@noindent +will read the first four characters of the file @file{foo}. (The +error-checking code necessary for a real program has been omitted here +for brevity.)
+ +By contrast, descriptors made by duplication share a common file +position with the original descriptor that was duplicated. Anything +which alters the file position of one of the duplicates, including +reading or writing data, affects all of them alike. Thus, for example, + +@smallexample +@{ + int d1, d2, d3; + char buf1[4], buf2[4]; + d1 = open ("foo", O_RDONLY); + d2 = dup (d1); + d3 = dup (d2); + lseek (d3, 1024, SEEK_SET); + read (d1, buf1, 4); + read (d2, buf2, 4); +@} +@end smallexample + +@noindent +will read four characters starting with the 1024'th character of +@file{foo}, and then four more characters starting with the 1028'th +character. + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} off_t +This is an arithmetic data type used to represent file sizes. +In the GNU system, this is equivalent to @code{fpos_t} or @code{long int}. +@end deftp + +These aliases for the @samp{SEEK_@dots{}} constants exist for the sake +of compatibility with older BSD systems. They are defined in two +different header files: @file{fcntl.h} and @file{sys/file.h}. + +@table @code +@item L_SET +An alias for @code{SEEK_SET}. + +@item L_INCR +An alias for @code{SEEK_CUR}. + +@item L_XTND +An alias for @code{SEEK_END}. +@end table + +@node Descriptors and Streams +@section Descriptors and Streams +@cindex streams, and file descriptors +@cindex converting file descriptor to stream +@cindex extracting file descriptor from stream + +Given an open file descriptor, you can create a stream for it with the +@code{fdopen} function. You can get the underlying file descriptor for +an existing stream with the @code{fileno} function. These functions are +declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment POSIX.1 +@deftypefun {FILE *} fdopen (int @var{filedes}, const char *@var{opentype}) +The @code{fdopen} function returns a new stream for the file descriptor +@var{filedes}. 
+ +The @var{opentype} argument is interpreted in the same way as for the +@code{fopen} function (@pxref{Opening Streams}), except that +the @samp{b} option is not permitted; this is because GNU makes no +distinction between text and binary files. Also, @code{"w"} and +@code{"w+"} do not cause truncation of the file; these have an effect only +when opening a file, and in this case the file has already been opened. +You must make sure that the @var{opentype} argument matches the actual +mode of the open file descriptor. + +The return value is the new stream. If the stream cannot be created +(for example, if the modes for the file indicated by the file descriptor +do not permit the access specified by the @var{opentype} argument), a +null pointer is returned instead. + +In some other systems, @code{fdopen} may fail to detect that the modes +for the file descriptor do not permit the access specified by +@var{opentype}. The GNU C library always checks for this. +@end deftypefun + +For an example showing the use of the @code{fdopen} function, +see @ref{Creating a Pipe}. + +@comment stdio.h +@comment POSIX.1 +@deftypefun int fileno (FILE *@var{stream}) +This function returns the file descriptor associated with the stream +@var{stream}. If an error is detected (for example, if the @var{stream} +is not valid) or if @var{stream} does not do I/O to a file, +@code{fileno} returns @code{-1}. +@end deftypefun + +@cindex standard file descriptors +@cindex file descriptors, standard +There are also symbolic constants defined in @file{unistd.h} for the +file descriptors belonging to the standard streams @code{stdin}, +@code{stdout}, and @code{stderr}; see @ref{Standard Streams}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@table @code +@item STDIN_FILENO +@vindex STDIN_FILENO +This macro has value @code{0}, which is the file descriptor for +standard input.
+@cindex standard input file descriptor + +@comment unistd.h +@comment POSIX.1 +@item STDOUT_FILENO +@vindex STDOUT_FILENO +This macro has value @code{1}, which is the file descriptor for +standard output. +@cindex standard output file descriptor + +@comment unistd.h +@comment POSIX.1 +@item STDERR_FILENO +@vindex STDERR_FILENO +This macro has value @code{2}, which is the file descriptor for +standard error output. +@end table +@cindex standard error file descriptor + +@node Stream/Descriptor Precautions +@section Dangers of Mixing Streams and Descriptors +@cindex channels +@cindex streams and descriptors +@cindex descriptors and streams +@cindex mixing descriptors and streams + +You can have multiple file descriptors and streams (let's call both +streams and descriptors ``channels'' for short) connected to the same +file, but you must take care to avoid confusion between channels. There +are two cases to consider: @dfn{linked} channels that share a single +file position value, and @dfn{independent} channels that have their own +file positions. + +It's best to use just one channel in your program for actual data +transfer to any given file, except when all the access is for input. +For example, if you open a pipe (something you can only do at the file +descriptor level), either do all I/O with the descriptor, or construct a +stream from the descriptor with @code{fdopen} and then do all I/O with +the stream. + +@menu +* Linked Channels:: Dealing with channels sharing a file position. +* Independent Channels:: Dealing with separately opened, unlinked channels. +* Cleaning Streams:: Cleaning a stream makes it safe to use + another channel. +@end menu + +@node Linked Channels +@subsection Linked Channels +@cindex linked channels + +Channels that come from a single opening share the same file position; +we call them @dfn{linked} channels. 
Linked channels result when you +make a stream from a descriptor using @code{fdopen}, when you get a +descriptor from a stream with @code{fileno}, when you copy a descriptor +with @code{dup} or @code{dup2}, and when descriptors are inherited +during @code{fork}. For files that don't support random access, such as +terminals and pipes, @emph{all} channels are effectively linked. On +random-access files, all append-type output streams are effectively +linked to each other. + +@cindex cleaning up a stream +If you have been using a stream for I/O, and you want to do I/O using +another channel (either a stream or a descriptor) that is linked to it, +you must first @dfn{clean up} the stream that you have been using. +@xref{Cleaning Streams}. + +Terminating a process, or executing a new program in the process, +destroys all the streams in the process. If descriptors linked to these +streams persist in other processes, their file positions become +undefined as a result. To prevent this, you must clean up the streams +before destroying them. + +@node Independent Channels +@subsection Independent Channels +@cindex independent channels + +When you open channels (streams or descriptors) separately on a seekable +file, each channel has its own file position. These are called +@dfn{independent channels}. + +The system handles each channel independently. Most of the time, this +is quite predictable and natural (especially for input): each channel +can read or write sequentially at its own place in the file. However, +if some of the channels are streams, you must take these precautions: + +@itemize @bullet +@item +You should clean an output stream after use, before doing anything else +that might read or write from the same part of the file. + +@item +You should clean an input stream before reading data that may have been +modified using an independent channel. Otherwise, you might read +obsolete data that had been in the stream's buffer. 
+@end itemize + +If you do output to one channel at the end of the file, this will +certainly leave the other independent channels positioned somewhere +before the new end. You cannot reliably set their file positions to the +new end of file before writing, because the file can always be extended +by another process between when you set the file position and when you +write the data. Instead, use an append-type descriptor or stream; they +always output at the current end of the file. In order to make the +end-of-file position accurate, you must clean the output channel you +were using, if it is a stream. + +It's impossible for two channels to have separate file pointers for a +file that doesn't support random access. Thus, channels for reading or +writing such files are always linked, never independent. Append-type +channels are also always linked. For these channels, follow the rules +for linked channels; see @ref{Linked Channels}. + +@node Cleaning Streams +@subsection Cleaning Streams + +On the GNU system, you can clean up any stream with @code{fclean}: + +@comment stdio.h +@comment GNU +@deftypefun int fclean (FILE *@var{stream}) +Clean up the stream @var{stream} so that its buffer is empty. If +@var{stream} is doing output, force it out. If @var{stream} is doing +input, give the data in the buffer back to the system, arranging to +reread it. +@end deftypefun + +On other systems, you can use @code{fflush} to clean a stream in most +cases. + +You can skip the @code{fclean} or @code{fflush} if you know the stream +is already clean. A stream is clean whenever its buffer is empty. For +example, an unbuffered stream is always clean. An input stream that is +at end-of-file is clean. A line-buffered stream is clean when the last +character output was a newline. + +There is one case in which cleaning a stream is impossible on most +systems. This is when the stream is doing input from a file that is not +random-access. 
Such streams typically read ahead, and when the file is +not random access, there is no way to give back the excess data already +read. When an input stream reads from a random-access file, +@code{fflush} does clean the stream, but leaves the file pointer at an +unpredictable place; you must set the file pointer before doing any +further I/O. On the GNU system, using @code{fclean} avoids both of +these problems. + +Closing an output-only stream also does @code{fflush}, so this is a +valid way of cleaning an output stream. On the GNU system, closing an +input stream does @code{fclean}. + +You need not clean a stream before using its descriptor for control +operations such as setting terminal modes; these operations don't affect +the file position and are not affected by it. You can use any +descriptor for these operations, and all channels are affected +simultaneously. However, text already ``output'' to a stream but still +buffered by the stream will be subject to the new terminal modes when +subsequently flushed. To make sure ``past'' output is covered by the +terminal settings that were in effect at the time, flush the output +streams for that terminal before setting the modes. @xref{Terminal +Modes}. + +@node Waiting for I/O +@section Waiting for Input or Output +@cindex waiting for input or output +@cindex multiplexing input +@cindex input from multiple files + +Sometimes a program needs to accept input on multiple input channels +whenever input arrives. For example, some workstations may have devices +such as a digitizing tablet, function button box, or dial box that are +connected via normal asynchronous serial interfaces; good user interface +style requires responding immediately to input on any device. Another +example is a program that acts as a server to several other processes +via pipes or sockets. 
+ +You cannot normally use @code{read} for this purpose, because this +blocks the program until input is available on one particular file +descriptor; input on other channels won't wake it up. You could set +nonblocking mode and poll each file descriptor in turn, but this is very +inefficient. + +A better solution is to use the @code{select} function. This blocks the +program until input or output is ready on a specified set of file +descriptors, or until a timer expires, whichever comes first. This +facility is declared in the header file @file{sys/types.h}. +@pindex sys/types.h + +In the case of a server socket (@pxref{Listening}), we say that +``input'' is available when there are pending connections that could be +accepted (@pxref{Accepting Connections}). @code{accept} for server +sockets blocks and interacts with @code{select} just as @code{read} does +for normal input. + +@cindex file descriptor sets, for @code{select} +The file descriptor sets for the @code{select} function are specified +as @code{fd_set} objects. Here is the description of the data type +and some macros for manipulating these objects. + +@comment sys/types.h +@comment BSD +@deftp {Data Type} fd_set +The @code{fd_set} data type represents file descriptor sets for the +@code{select} function. It is actually a bit array. +@end deftp + +@comment sys/types.h +@comment BSD +@deftypevr Macro int FD_SETSIZE +The value of this macro is the maximum number of file descriptors that a +@code{fd_set} object can hold information about. On systems with a +fixed maximum number, @code{FD_SETSIZE} is at least that number. On +some systems, including GNU, there is no absolute limit on the number of +descriptors open, but this macro still has a constant value which +controls the number of bits in an @code{fd_set}; if you get a file +descriptor with a value as high as @code{FD_SETSIZE}, you cannot put +that descriptor into an @code{fd_set}. 
+@end deftypevr + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_ZERO (fd_set *@var{set}) +This macro initializes the file descriptor set @var{set} to be the +empty set. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_SET (int @var{filedes}, fd_set *@var{set}) +This macro adds @var{filedes} to the file descriptor set @var{set}. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro void FD_CLR (int @var{filedes}, fd_set *@var{set}) +This macro removes @var{filedes} from the file descriptor set @var{set}. +@end deftypefn + +@comment sys/types.h +@comment BSD +@deftypefn Macro int FD_ISSET (int @var{filedes}, fd_set *@var{set}) +This macro returns a nonzero value (true) if @var{filedes} is a member +of the file descriptor set @var{set}, and zero (false) otherwise. +@end deftypefn + +Next, here is the description of the @code{select} function itself. + +@comment sys/types.h +@comment BSD +@deftypefun int select (int @var{nfds}, fd_set *@var{read-fds}, fd_set *@var{write-fds}, fd_set *@var{except-fds}, struct timeval *@var{timeout}) +The @code{select} function blocks the calling process until there is +activity on any of the specified sets of file descriptors, or until the +timeout period has expired. + +The file descriptors specified by the @var{read-fds} argument are +checked to see if they are ready for reading; the @var{write-fds} file +descriptors are checked to see if they are ready for writing; and the +@var{except-fds} file descriptors are checked for exceptional +conditions. You can pass a null pointer for any of these arguments if +you are not interested in checking for that kind of condition. + +A file descriptor is considered ready for reading if a @code{read} call +on it will not block; this includes the case where it is at end of +file. A server socket is considered ready for reading if there is a +pending connection which can be accepted with @code{accept}; +@pxref{Accepting Connections}.
A client socket is ready for writing when +its connection is fully established; @pxref{Connecting}. + +``Exceptional conditions'' does not mean errors---errors are reported +immediately when an erroneous system call is executed, and do not +constitute a state of the descriptor. Rather, they include conditions +such as the presence of an urgent message on a socket. (@xref{Sockets}, +for information on urgent messages.) + +The @code{select} function checks only the first @var{nfds} file +descriptors. The usual thing is to pass @code{FD_SETSIZE} as the value +of this argument. + +The @var{timeout} specifies the maximum time to wait. If you pass a +null pointer for this argument, it means to block indefinitely until one +of the file descriptors is ready. Otherwise, you should provide the +time in @code{struct timeval} format; see @ref{High-Resolution +Calendar}. Specify zero as the time (a @code{struct timeval} containing +all zeros) if you want to find out which descriptors are ready without +waiting if none are ready. + +The normal return value from @code{select} is the total number of ready file +descriptors in all of the sets. Each of the argument sets is overwritten +with information about the descriptors that are ready for the corresponding +operation. Thus, to see if a particular descriptor @var{desc} has input, +use @code{FD_ISSET (@var{desc}, @var{read-fds})} after @code{select} returns. + +If @code{select} returns because the timeout period expires, it returns +a value of zero. + +Any signal will cause @code{select} to return immediately. So if your +program uses signals, you can't rely on @code{select} to keep waiting +for the full time specified. If you want to be sure of waiting for a +particular amount of time, you must check for @code{EINTR} and repeat +the @code{select} with a newly calculated timeout based on the current +time. See the example below. See also @ref{Interrupted Primitives}. 
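The retry rule described above can be sketched as follows. This is only an illustration, not the manual's own @file{select.c} example: the function name @code{wait_readable} is hypothetical, it assumes a system where @file{sys/select.h} declares @code{select} (rather than @file{sys/types.h}), and it uses @code{gettimeofday} as the source of the current time.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait until FD is readable or until SECONDS have
   elapsed in total, restarting select whenever a signal interrupts it.
   Returns 1 if FD is readable, 0 on timeout, -1 on error.  */
int
wait_readable (int fd, unsigned int seconds)
{
  struct timeval now, deadline;

  if (fd < 0 || fd >= FD_SETSIZE)
    {
      /* Such a descriptor cannot be stored in an fd_set.  */
      errno = EBADF;
      return -1;
    }
  if (gettimeofday (&now, NULL) < 0)
    return -1;
  deadline = now;
  deadline.tv_sec += seconds;

  for (;;)
    {
      fd_set set;
      struct timeval left;
      int n;

      /* Compute the time remaining until the deadline, clamped at zero.  */
      left.tv_sec = deadline.tv_sec - now.tv_sec;
      left.tv_usec = deadline.tv_usec - now.tv_usec;
      if (left.tv_usec < 0)
        {
          left.tv_usec += 1000000;
          left.tv_sec -= 1;
        }
      if (left.tv_sec < 0)
        {
          left.tv_sec = 0;
          left.tv_usec = 0;
        }

      /* Rebuild the set on every pass, since select overwrites it.  */
      FD_ZERO (&set);
      FD_SET (fd, &set);

      n = select (FD_SETSIZE, &set, NULL, NULL, &left);
      if (n >= 0)
        return n > 0;
      if (errno != EINTR)
        return -1;

      /* Interrupted by a signal: take the current time and try again.  */
      if (gettimeofday (&now, NULL) < 0)
        return -1;
    }
}
```

Because the deadline is fixed once and the remaining time is recomputed after each interruption, a stream of signals cannot stretch the total wait beyond the requested period.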
+ +If an error occurs, @code{select} returns @code{-1} and does not modify +the argument file descriptor sets. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +One of the file descriptor sets specified an invalid file descriptor. + +@item EINTR +The operation was interrupted by a signal. @xref{Interrupted Primitives}. + +@item EINVAL +The @var{timeout} argument is invalid; one of the components is negative +or too large. +@end table +@end deftypefun + +@strong{Portability Note:} The @code{select} function is a BSD Unix +feature. + +Here is an example showing how you can use @code{select} to establish a +timeout period for reading from a file descriptor. The @code{input_timeout} +function blocks the calling process until input is available on the +file descriptor, or until the timeout period expires. + +@smallexample +@include select.c.texi +@end smallexample + +There is another example showing the use of @code{select} to multiplex +input from multiple sockets in @ref{Server Example}. + + +@node Control Operations +@section Control Operations on Files + +@cindex control operations on files +@cindex @code{fcntl} function +This section describes how you can perform various other operations on +file descriptors, such as inquiring about or setting flags describing +the status of the file descriptor, manipulating record locks, and the +like. All of these operations are performed by the function @code{fcntl}. + +The second argument to the @code{fcntl} function is a command that +specifies which operation to perform. The function and macros that name +various flags that are used with it are declared in the header file +@file{fcntl.h}. Many of these flags are also used by the @code{open} +function; see @ref{Opening and Closing Files}. 
+@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypefun int fcntl (int @var{filedes}, int @var{command}, @dots{}) +The @code{fcntl} function performs the operation specified by +@var{command} on the file descriptor @var{filedes}. Some commands +require additional arguments to be supplied. These additional arguments +and the return value and error conditions are given in the detailed +descriptions of the individual commands. + +Briefly, here is a list of what the various commands are. + +@table @code +@item F_DUPFD +Duplicate the file descriptor (return another file descriptor pointing +to the same open file). @xref{Duplicating Descriptors}. + +@item F_GETFD +Get flags associated with the file descriptor. @xref{Descriptor Flags}. + +@item F_SETFD +Set flags associated with the file descriptor. @xref{Descriptor Flags}. + +@item F_GETFL +Get flags associated with the open file. @xref{File Status Flags}. + +@item F_SETFL +Set flags associated with the open file. @xref{File Status Flags}. + +@item F_GETLK +Get a file lock. @xref{File Locks}. + +@item F_SETLK +Set or clear a file lock. @xref{File Locks}. + +@item F_SETLKW +Like @code{F_SETLK}, but wait for completion. @xref{File Locks}. + +@item F_GETOWN +Get process or process group ID to receive @code{SIGIO} signals. +@xref{Interrupt Input}. + +@item F_SETOWN +Set process or process group ID to receive @code{SIGIO} signals. +@xref{Interrupt Input}. +@end table +@end deftypefun + + +@node Duplicating Descriptors +@section Duplicating Descriptors + +@cindex duplicating file descriptors +@cindex redirecting input and output + +You can @dfn{duplicate} a file descriptor, or allocate another file +descriptor that refers to the same open file as the original. Duplicate +descriptors share one file position and one set of file status flags +(@pxref{File Status Flags}), but each has its own set of file descriptor +flags (@pxref{Descriptor Flags}). 
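As a minimal sketch of what duplicates share and what they keep separate, the hypothetical function @code{dup_demo} below writes through one descriptor and reads the file position through its duplicate, then sets @code{FD_CLOEXEC} on the duplicate only. The function name, its parameters, and the scratch file path supplied by the caller are assumptions made for illustration.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demonstration: open the scratch file PATH, duplicate the
   descriptor, and report what the two descriptors share.
   Returns 0 on success, -1 on error.  */
int
dup_demo (const char *path, off_t *copy_pos,
          int *orig_cloexec, int *copy_cloexec)
{
  int fd, copy, orig_flags, copy_flags, result = -1;

  fd = open (path, O_RDWR | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  copy = dup (fd);
  if (copy < 0)
    goto out_fd;

  /* Writing through the original advances the shared file position,
     so the duplicate reports the new offset as well.  */
  if (write (fd, "hello", 5) != 5)
    goto out;
  *copy_pos = lseek (copy, 0, SEEK_CUR);
  if (*copy_pos == (off_t) -1)
    goto out;

  /* Descriptor flags are per descriptor: set FD_CLOEXEC on the copy
     only, preserving any other flags it may have.  */
  copy_flags = fcntl (copy, F_GETFD, 0);
  if (copy_flags == -1
      || fcntl (copy, F_SETFD, copy_flags | FD_CLOEXEC) == -1)
    goto out;

  orig_flags = fcntl (fd, F_GETFD, 0);
  copy_flags = fcntl (copy, F_GETFD, 0);
  if (orig_flags == -1 || copy_flags == -1)
    goto out;
  *orig_cloexec = (orig_flags & FD_CLOEXEC) != 0;
  *copy_cloexec = (copy_flags & FD_CLOEXEC) != 0;
  result = 0;

 out:
  close (copy);
 out_fd:
  close (fd);
  unlink (path);
  return result;
}
```

On success the duplicate's position equals the number of bytes written through the original, while only the duplicate has the close-on-exec flag set.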
+ +The major use of duplicating a file descriptor is to implement +@dfn{redirection} of input or output: that is, to change the +file or pipe that a particular file descriptor corresponds to. + +You can perform this operation using the @code{fcntl} function with the +@code{F_DUPFD} command, but there are also convenient functions +@code{dup} and @code{dup2} for duplicating descriptors. + +@pindex unistd.h +@pindex fcntl.h +The @code{fcntl} function and flags are declared in @file{fcntl.h}, +while prototypes for @code{dup} and @code{dup2} are in the header file +@file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int dup (int @var{old}) +This function copies descriptor @var{old} to the first available +descriptor number (the first number not currently open). It is +equivalent to @code{fcntl (@var{old}, F_DUPFD, 0)}. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int dup2 (int @var{old}, int @var{new}) +This function copies the descriptor @var{old} to descriptor number +@var{new}. + +If @var{old} is an invalid descriptor, then @code{dup2} does nothing; it +does not close @var{new}. Otherwise, the new duplicate of @var{old} +replaces any previous meaning of descriptor @var{new}, as if @var{new} +were closed first. + +If @var{old} and @var{new} are different numbers, and @var{old} is a +valid descriptor number, then @code{dup2} is equivalent to: + +@smallexample +close (@var{new}); +fcntl (@var{old}, F_DUPFD, @var{new}) +@end smallexample + +However, @code{dup2} does this atomically; there is no instant in the +middle of calling @code{dup2} at which @var{new} is closed and not yet a +duplicate of @var{old}. +@end deftypefun + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_DUPFD +This macro is used as the @var{command} argument to @code{fcntl}, to +copy the file descriptor given as the first argument. 
+ +The form of the call in this case is: + +@smallexample +fcntl (@var{old}, F_DUPFD, @var{next-filedes}) +@end smallexample + +The @var{next-filedes} argument is of type @code{int} and specifies that +the file descriptor returned should be the next available one greater +than or equal to this value. + +The return value from @code{fcntl} with this command is normally the value +of the new file descriptor. A return value of @code{-1} indicates an +error. The following @code{errno} error conditions are defined for +this command: + +@table @code +@item EBADF +The @var{old} argument is invalid. + +@item EINVAL +The @var{next-filedes} argument is invalid. + +@item EMFILE +There are no more file descriptors available---your program is already +using the maximum. In BSD and GNU, the maximum is controlled by a +resource limit that can be changed; @pxref{Limits on Resources}, for +more information about the @code{RLIMIT_NOFILE} limit. +@end table + +@code{ENFILE} is not a possible error code for @code{dup2} because +@code{dup2} does not create a new opening of a file; duplicate +descriptors do not count toward the limit which @code{ENFILE} +indicates. @code{EMFILE} is possible because it refers to the limit on +distinct descriptor numbers in use in one process. +@end deftypevr + +Here is an example showing how to use @code{dup2} to do redirection. +Typically, redirection of the standard streams (like @code{stdin}) is +done by a shell or shell-like program before calling one of the +@code{exec} functions (@pxref{Executing a File}) to execute a new +program in a child process. When the new program is executed, it +creates and initializes the standard streams to point to the +corresponding file descriptors, before its @code{main} function is +invoked. 
+ +So, to redirect standard input to a file, the shell could do something +like: + +@smallexample +pid = fork (); +if (pid == 0) + @{ + char *filename; + char *program; + char **argv; + int file; + @dots{} + file = TEMP_FAILURE_RETRY (open (filename, O_RDONLY)); + if (file < 0) + _exit (EXIT_FAILURE); + dup2 (file, STDIN_FILENO); + TEMP_FAILURE_RETRY (close (file)); + execv (program, argv); + /* @r{@code{execv} returns only if it failed.} */ + _exit (EXIT_FAILURE); + @} +@end smallexample + +There is also a more detailed example showing how to implement redirection +in the context of a pipeline of processes in @ref{Launching Jobs}. + + +@node Descriptor Flags +@section File Descriptor Flags +@cindex file descriptor flags + +@dfn{File descriptor flags} are miscellaneous attributes of a file +descriptor. These flags are associated with particular file +descriptors, so that if you have created duplicate file descriptors +from a single opening of a file, each descriptor has its own set of flags. + +Currently there is just one file descriptor flag: @code{FD_CLOEXEC}, +which causes the descriptor to be closed if you use any of the +@code{exec@dots{}} functions (@pxref{Executing a File}). + +The symbols in this section are defined in the header file +@file{fcntl.h}. +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETFD +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should return the file descriptor flags associated +with the @var{filedes} argument. + +The normal return value from @code{fcntl} with this command is a +nonnegative number which can be interpreted as the bitwise OR of the +individual flags (except that currently there is only one flag to use). + +In case of an error, @code{fcntl} returns @code{-1}. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid.
+@end table +@end deftypevr + + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETFD +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set the file descriptor flags associated with the +@var{filedes} argument. This requires a third @code{int} argument to +specify the new flags, so the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_SETFD, @var{new-flags}) +@end smallexample + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which indicates an error. +The flags and error conditions are the same as for the @code{F_GETFD} +command. +@end deftypevr + +The following macro is defined for use as a file descriptor flag with +the @code{fcntl} function. The value is an integer constant usable +as a bit mask value. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int FD_CLOEXEC +@cindex close-on-exec (file descriptor flag) +This flag specifies that the file descriptor should be closed when +an @code{exec} function is invoked; see @ref{Executing a File}. When +a file descriptor is allocated (as with @code{open} or @code{dup}), +this bit is initially cleared on the new file descriptor, meaning that +descriptor will survive into the new program after @code{exec}. +@end deftypevr + +If you want to modify the file descriptor flags, you should get the +current flags with @code{F_GETFD} and modify the value. Don't assume +that the flags listed here are the only ones that are implemented; your +program may be run years from now and more flags may exist then. 
For +example, here is a function to set or clear the flag @code{FD_CLOEXEC} +without altering any other flags: + +@smallexample +/* @r{Set the @code{FD_CLOEXEC} flag of @var{desc} if @var{value} is nonzero,} + @r{or clear the flag if @var{value} is 0.} + @r{Return 0 on success, or -1 on error with @code{errno} set.} */ + +int +set_cloexec_flag (int desc, int value) +@{ + int oldflags = fcntl (desc, F_GETFD, 0); + /* @r{If reading the flags failed, return error indication now.} */ + if (oldflags < 0) + return oldflags; + /* @r{Set just the flag we want to set.} */ + if (value != 0) + oldflags |= FD_CLOEXEC; + else + oldflags &= ~FD_CLOEXEC; + /* @r{Store modified flag word in the descriptor.} */ + return fcntl (desc, F_SETFD, oldflags); +@} +@end smallexample + +@node File Status Flags +@section File Status Flags +@cindex file status flags + +@dfn{File status flags} are used to specify attributes of the opening of a +file. Unlike the file descriptor flags discussed in @ref{Descriptor +Flags}, the file status flags are shared by duplicated file descriptors +resulting from a single opening of the file. The file status flags are +specified with the @var{flags} argument to @code{open}; +@pxref{Opening and Closing Files}. + +File status flags fall into three categories, which are described in the +following sections. + +@itemize @bullet +@item +@ref{Access Modes}, specify what type of access is allowed to the +file: reading, writing, or both. They are set by @code{open} and are +returned by @code{fcntl}, but cannot be changed. + +@item +@ref{Open-time Flags}, control details of what @code{open} will do. +These flags are not preserved after the @code{open} call. + +@item +@ref{Operating Modes}, affect how operations such as @code{read} and +@code{write} are done. They are set by @code{open}, and can be fetched or +changed with @code{fcntl}. +@end itemize + +The symbols in this section are defined in the header file +@file{fcntl.h}.
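The sharing of file status flags between duplicates can be sketched as follows. The hypothetical function @code{status_flags_demo} turns on the operating mode @code{O_APPEND} through a duplicate descriptor and then reads the flags back through the original, using the @code{O_ACCMODE} mask (described under the access modes) to extract the access mode. The function name, its parameters, and the scratch file path supplied by the caller are assumptions made for illustration.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demonstration: open the scratch file PATH write-only,
   duplicate the descriptor, enable O_APPEND through the duplicate, and
   report the flags seen through the original descriptor.
   Returns 0 on success, -1 on error.  */
int
status_flags_demo (const char *path, int *access_mode,
                   int *append_on_original)
{
  int fd, copy, flags, result = -1;

  /* O_CREAT and O_TRUNC are open-time flags; O_WRONLY is the access mode.  */
  fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
  if (fd < 0)
    return -1;
  copy = dup (fd);
  if (copy < 0)
    goto out_fd;

  /* Change an operating mode through the duplicate only, keeping
     whatever other flags are already set.  */
  flags = fcntl (copy, F_GETFL, 0);
  if (flags == -1 || fcntl (copy, F_SETFL, flags | O_APPEND) == -1)
    goto out;

  /* Read the flags back through the original descriptor.  */
  flags = fcntl (fd, F_GETFL, 0);
  if (flags == -1)
    goto out;
  *access_mode = flags & O_ACCMODE;
  *append_on_original = (flags & O_APPEND) != 0;
  result = 0;

 out:
  close (copy);
 out_fd:
  close (fd);
  unlink (path);
  return result;
}
```

On success the access mode reported is @code{O_WRONLY}, and the original descriptor sees append mode even though it was enabled through the duplicate.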
+@pindex fcntl.h + +@menu +* Access Modes:: Whether the descriptor can read or write. +* Open-time Flags:: Details of @code{open}. +* Operating Modes:: Special modes to control I/O operations. +* Getting File Status Flags:: Fetching and changing these flags. +@end menu + +@node Access Modes +@subsection File Access Modes + +The file access modes allow a file descriptor to be used for reading, +writing, or both. (In the GNU system, they can also allow none of these, +and allow execution of the file as a program.) The access modes are chosen +when the file is opened, and never change. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_RDONLY +Open the file for read access. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_WRONLY +Open the file for write access. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_RDWR +Open the file for both reading and writing. +@end deftypevr + +In the GNU system (and not in other systems), @code{O_RDONLY} and +@code{O_WRONLY} are independent bits that can be bitwise-ORed together, +and it is valid for either bit to be set or clear. This means that +@code{O_RDWR} is the same as @code{O_RDONLY|O_WRONLY}. A file access +mode of zero is permissible; it allows no operations that do input or +output to the file, but does allow other operations such as +@code{fchmod}. On the GNU system, since ``read-only'' or ``write-only'' +is a misnomer, @file{fcntl.h} defines additional names for the file +access modes. These names are preferred when writing GNU-specific code. +But most programs will want to be portable to other POSIX.1 systems and +should use the POSIX.1 names above instead. + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_READ +Open the file for reading. Same as @code{O_RDONLY}; only defined on GNU. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_WRITE +Open the file for writing. Same as @code{O_WRONLY}; only defined on GNU.
+@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_EXEC +Open the file for executing. Only defined on GNU. +@end deftypevr + +To determine the file access mode with @code{fcntl}, you must extract +the access mode bits from the retrieved file status flags. In the GNU +system, you can just test the @code{O_READ} and @code{O_WRITE} bits in +the flags word. But in other POSIX.1 systems, reading and writing +access modes are not stored as distinct bit flags. The portable way to +extract the file access mode bits is with @code{O_ACCMODE}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_ACCMODE +This macro stands for a mask that can be bitwise-ANDed with the file +status flag value to produce a value representing the file access mode. +The mode will be @code{O_RDONLY}, @code{O_WRONLY}, or @code{O_RDWR}. +(In the GNU system it could also be zero, and it never includes the +@code{O_EXEC} bit.) +@end deftypevr + +@node Open-time Flags +@subsection Open-time Flags + +The open-time flags specify options affecting how @code{open} will behave. +These options are not preserved once the file is open. The exception to +this is @code{O_NONBLOCK}, which is also an I/O operating mode and so it +@emph{is} saved. @xref{Opening and Closing Files}, for how to call +@code{open}. + +There are two sorts of options specified by open-time flags. + +@itemize @bullet +@item +@dfn{File name translation flags} affect how @code{open} looks up the +file name to locate the file, and whether the file can be created. +@cindex file name translation flags +@cindex flags, file name translation + +@item +@dfn{Open-time action flags} specify extra operations that @code{open} will +perform on the file once it is open. +@cindex open-time action flags +@cindex flags, open-time action +@end itemize + +Here are the file name translation flags. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_CREAT +If set, the file will be created if it doesn't already exist. 
+@c !!! mode arg, umask +@cindex create on open (file status flag) +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_EXCL +If both @code{O_CREAT} and @code{O_EXCL} are set, then @code{open} fails +if the specified file already exists. This is guaranteed to never +clobber an existing file. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NONBLOCK +@cindex non-blocking open +This prevents @code{open} from blocking for a ``long time'' to open the +file. This is only meaningful for some kinds of files, usually devices +such as serial ports; when it is not meaningful, it is harmless and +ignored. Often opening a port to a modem blocks until the modem reports +carrier detection; if @code{O_NONBLOCK} is specified, @code{open} will +return immediately without a carrier. + +Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O operating +mode and a file name translation flag. This means that specifying +@code{O_NONBLOCK} in @code{open} also sets nonblocking I/O mode; +@pxref{Operating Modes}. To open the file without blocking but do normal +I/O that blocks, you must call @code{open} with @code{O_NONBLOCK} set and +then call @code{fcntl} to turn the bit off. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NOCTTY +If the named file is a terminal device, don't make it the controlling +terminal for the process. @xref{Job Control}, for information about +what it means to be the controlling terminal. + +In the GNU system and 4.4 BSD, opening a file never makes it the +controlling terminal and @code{O_NOCTTY} is zero. However, other +systems may use a nonzero value for @code{O_NOCTTY} and set the +controlling terminal when you open a file that is a terminal device; so +to be portable, use @code{O_NOCTTY} when it is important to avoid this. +@cindex controlling terminal, setting +@end deftypevr + +The following three file name translation flags exist only in the GNU system. 
+ +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_IGNORE_CTTY +Do not recognize the named file as the controlling terminal, even if it +refers to the process's existing controlling terminal device. Operations +on the new file descriptor will never induce job control signals. +@xref{Job Control}. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOLINK +If the named file is a symbolic link, open the link itself instead of +the file it refers to. (@code{fstat} on the new file descriptor will +return the information returned by @code{lstat} on the link's name.) +@cindex symbolic link, opening +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOTRANS +If the named file is specially translated, do not invoke the translator. +Open the bare file the translator itself sees. +@end deftypevr + + +The open-time action flags tell @code{open} to do additional operations +which are not really related to opening the file. The reason to do them +as part of @code{open} instead of in separate calls is that @code{open} +can do them @i{atomically}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_TRUNC +Truncate the file to zero length. This option is only useful for +regular files, not special files such as directories or FIFOs. POSIX.1 +requires that you open the file for writing to use @code{O_TRUNC}. In +BSD and GNU you must have permission to write the file to truncate it, +but you need not open for write access. + +This is the only open-time action flag specified by POSIX.1. There is +no good reason for truncation to be done by @code{open}, instead of by +calling @code{ftruncate} afterwards. The @code{O_TRUNC} flag existed in +Unix before @code{ftruncate} was invented, and is retained for backward +compatibility. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_SHLOCK +Acquire a shared lock on the file, as with @code{flock}. +@xref{File Locks}. 
+ +If @code{O_CREAT} is specified, the locking is done atomically when +creating the file. You are guaranteed that no other process will get +the lock on the new file first. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_EXLOCK +Acquire an exclusive lock on the file, as with @code{flock}. +@xref{File Locks}. This is atomic like @code{O_SHLOCK}. +@end deftypevr + +@node Operating Modes +@subsection I/O Operating Modes + +The operating modes affect how input and output operations using a file +descriptor work. These flags are set by @code{open} and can be fetched +and changed with @code{fcntl}. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_APPEND +The bit that enables append mode for the file. If set, then all +@code{write} operations write the data at the end of the file, extending +it, regardless of the current file position. This is the only reliable +way to append to a file. In append mode, you are guaranteed that the +data you write will always go to the current end of the file, regardless +of other processes writing to the file. Conversely, if you simply set +the file position to the end of file and write, then another process can +extend the file after you set the file position but before you write, +resulting in your data appearing someplace before the real end of file. +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int O_NONBLOCK +The bit that enables nonblocking mode for the file. If this bit is set, +@code{read} requests on the file can return immediately with a failure +status if there is no input immediately available, instead of blocking. +Likewise, @code{write} requests can also return immediately with a +failure status if the output can't be written immediately. + +Note that the @code{O_NONBLOCK} flag is overloaded as both an I/O +operating mode and a file name translation flag; @pxref{Open-time Flags}.
+@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_NDELAY +This is an obsolete name for @code{O_NONBLOCK}, provided for +compatibility with BSD. It is not defined by the POSIX.1 standard. +@end deftypevr + +The remaining operating modes are BSD and GNU extensions. They exist only +on some systems. On other systems, these macros are not defined. + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_ASYNC +The bit that enables asynchronous input mode. If set, then @code{SIGIO} +signals will be generated when input is available. @xref{Interrupt Input}. + +Asynchronous input mode is a BSD feature. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_FSYNC +The bit that enables synchronous writing for the file. If set, each +@code{write} call will make sure the data is reliably stored on disk before +returning. @c !!! xref fsync + +Synchronous writing is a BSD feature. +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int O_SYNC +This is another name for @code{O_FSYNC}. They have the same value. +@end deftypevr + +@comment fcntl.h +@comment GNU +@deftypevr Macro int O_NOATIME +If this bit is set, @code{read} will not update the access time of the +file. @xref{File Times}. This is used by programs that do backups, so +that backing a file up does not count as reading it. +Only the owner of the file or the superuser may use this bit. + +This is a GNU extension. +@end deftypevr + +@node Getting File Status Flags +@subsection Getting and Setting File Status Flags + +The @code{fcntl} function can fetch or change file status flags. + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETFL +This macro is used as the @var{command} argument to @code{fcntl}, to +read the file status flags for the open file with descriptor +@var{filedes}. + +The normal return value from @code{fcntl} with this command is a +nonnegative number which can be interpreted as the bitwise OR of the +individual flags. 
Since the file access modes are not single-bit values, +you can mask off other bits in the returned flags with @code{O_ACCMODE} +to compare them. + +In case of an error, @code{fcntl} returns @code{-1}. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. +@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETFL +This macro is used as the @var{command} argument to @code{fcntl}, to set +the file status flags for the open file corresponding to the +@var{filedes} argument. This command requires a third @code{int} +argument to specify the new flags, so the call looks like this: + +@smallexample +fcntl (@var{filedes}, F_SETFL, @var{new-flags}) +@end smallexample + +You can't change the access mode for the file in this way; that is, +whether the file descriptor was opened for reading or writing. + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which indicates an error. The +error conditions are the same as for the @code{F_GETFL} command. +@end deftypevr + +If you want to modify the file status flags, you should get the current +flags with @code{F_GETFL} and modify the value. Don't assume that the +flags listed here are the only ones that are implemented; your program +may be run years from now and more flags may exist then. 
For example, +here is a function to set or clear the flag @code{O_NONBLOCK} without +altering any other flags: + +@smallexample +@group +/* @r{Set the @code{O_NONBLOCK} flag of @var{desc} if @var{value} is nonzero,} + @r{or clear the flag if @var{value} is 0.} + @r{Return 0 on success, or -1 on error with @code{errno} set.} */ + +int +set_nonblock_flag (int desc, int value) +@{ + int oldflags = fcntl (desc, F_GETFL, 0); + /* @r{If reading the flags failed, return error indication now.} */ + if (oldflags == -1) + return -1; + /* @r{Set just the flag we want to set.} */ + if (value != 0) + oldflags |= O_NONBLOCK; + else + oldflags &= ~O_NONBLOCK; + /* @r{Store modified flag word in the descriptor.} */ + return fcntl (desc, F_SETFL, oldflags); +@} +@end group +@end smallexample + +@node File Locks +@section File Locks + +@cindex file locks +@cindex record locking +The remaining @code{fcntl} commands are used to support @dfn{record +locking}, which permits multiple cooperating programs to prevent each +other from simultaneously accessing parts of a file in error-prone +ways. + +@cindex exclusive lock +@cindex write lock +An @dfn{exclusive} or @dfn{write} lock gives a process exclusive access +for writing to the specified part of the file. While a write lock is in +place, no other process can lock that part of the file. + +@cindex shared lock +@cindex read lock +A @dfn{shared} or @dfn{read} lock prohibits any other process from +requesting a write lock on the specified part of the file. However, +other processes can request read locks. + +The @code{read} and @code{write} functions do not actually check to see +whether there are any locks in place. If you want to implement a +locking protocol for a file shared by multiple processes, your application +must do explicit @code{fcntl} calls to request and clear locks at the +appropriate points. + +Locks are associated with processes. A process can only have one kind +of lock set for each byte of a given file. 
When any file descriptor for +that file is closed by the process, all of the locks that process holds +on that file are released, even if the locks were made using other +descriptors that remain open. Likewise, locks are released when a +process exits, and are not inherited by child processes created using +@code{fork} (@pxref{Creating a Process}). + +When making a lock, use a @code{struct flock} to specify what kind of +lock to make and where. This data type and the associated macros for the +@code{fcntl} function are declared in the header file @file{fcntl.h}. +@pindex fcntl.h + +@comment fcntl.h +@comment POSIX.1 +@deftp {Data Type} {struct flock} +This structure is used with the @code{fcntl} function to describe a file +lock. It has these members: + +@table @code +@item short int l_type +Specifies the type of the lock; one of @code{F_RDLCK}, @code{F_WRLCK}, or +@code{F_UNLCK}. + +@item short int l_whence +This corresponds to the @var{whence} argument to @code{fseek} or +@code{lseek}, and specifies what the offset is relative to. Its value +can be one of @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}. + +@item off_t l_start +This specifies the offset of the start of the region to which the lock +applies, and is given in bytes relative to the point specified by the +@code{l_whence} member. + +@item off_t l_len +This specifies the length of the region to be locked. A value of +@code{0} is treated specially; it means the region extends to the end of +the file. + +@item pid_t l_pid +This field is the process ID (@pxref{Process Creation Concepts}) of the +process holding the lock. It is filled in by calling @code{fcntl} with +the @code{F_GETLK} command, but is ignored when making a lock. +@end table +@end deftp + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_GETLK +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should get information about a lock.
This command +requires a third argument of type @w{@code{struct flock *}} to be passed +to @code{fcntl}, so that the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_GETLK, @var{lockp}) +@end smallexample + +If there is a lock already in place that would block the lock described +by the @var{lockp} argument, information about that lock overwrites +@code{*@var{lockp}}. Existing locks are not reported if they are +compatible with making a new lock as specified. Thus, you should +specify a lock type of @code{F_WRLCK} if you want to find out about both +read and write locks, or @code{F_RDLCK} if you want to find out about +write locks only. + +There might be more than one lock affecting the region specified by the +@var{lockp} argument, but @code{fcntl} only returns information about +one of them. The @code{l_whence} member of the @var{lockp} structure is +set to @code{SEEK_SET} and the @code{l_start} and @code{l_len} fields +set to identify the locked region. + +If no lock applies, the only change to the @var{lockp} structure is to +update the @code{l_type} to a value of @code{F_UNLCK}. + +The normal return value from @code{fcntl} with this command is an +unspecified value other than @code{-1}, which is reserved to indicate an +error. The following @code{errno} error conditions are defined for +this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. + +@item EINVAL +Either the @var{lockp} argument doesn't specify valid lock information, +or the file associated with @var{filedes} doesn't support locks. +@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETLK +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set or clear a lock. 
This command requires a +third argument of type @w{@code{struct flock *}} to be passed to +@code{fcntl}, so that the form of the call is: + +@smallexample +fcntl (@var{filedes}, F_SETLK, @var{lockp}) +@end smallexample + +If the process already has a lock on any part of the region, the old lock +on that part is replaced with the new lock. You can remove a lock +by specifying a lock type of @code{F_UNLCK}. + +If the lock cannot be set, @code{fcntl} returns immediately with a value +of @code{-1}. This command does not block waiting for other processes +to release locks. If @code{fcntl} succeeds, it returns a value other +than @code{-1}. + +The following @code{errno} error conditions are defined for this +command: + +@table @code +@item EAGAIN +@itemx EACCES +The lock cannot be set because it is blocked by an existing lock on the +file. Some systems use @code{EAGAIN} in this case, and other systems +use @code{EACCES}; your program should treat them alike, after +@code{F_SETLK}. (The GNU system always uses @code{EAGAIN}.) + +@item EBADF +Either: the @var{filedes} argument is invalid; you requested a read lock +but the @var{filedes} is not open for read access; or, you requested a +write lock but the @var{filedes} is not open for write access. + +@item EINVAL +Either the @var{lockp} argument doesn't specify valid lock information, +or the file associated with @var{filedes} doesn't support locks. + +@item ENOLCK +The system has run out of file lock resources; there are already too +many file locks in place. + +Well-designed file systems never report this error, because they have no +limitation on the number of locks. However, you must still take account +of the possibility of this error, as it could result from network access +to a file system on another machine.
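To show how the error conditions listed above might be handled together, here is a minimal sketch of a helper that tries to set a write lock on a whole file with @code{F_SETLK}. The name @code{try_write_lock} and its three-way return value are our own invention, not part of the library. It treats @code{EAGAIN} and @code{EACCES} alike, and passes every other failure, @code{ENOLCK} included, back to the caller.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not a library function.
   Try to put a write lock on the whole file open on DESC without
   waiting.  Return 1 if the lock was set, 0 if an existing lock is
   in the way, or -1 on any other error with errno set.  */
int
try_write_lock (int desc)
{
  struct flock lock;

  memset (&lock, 0, sizeof lock);
  lock.l_type = F_WRLCK;      /* Ask for a write (exclusive) lock.  */
  lock.l_whence = SEEK_SET;   /* Offsets count from the start of the file.  */
  lock.l_start = 0;           /* The region starts at the first byte...  */
  lock.l_len = 0;             /* ...and extends to the end of the file.  */

  if (fcntl (desc, F_SETLK, &lock) != -1)
    return 1;
  /* Treat EAGAIN and EACCES alike: some other lock blocks this one.  */
  if (errno == EAGAIN || errno == EACCES)
    return 0;
  return -1;                  /* EBADF, EINVAL, ENOLCK, and so on.  */
}
```

The same structure with @code{l_type} set to @code{F_UNLCK} would release the lock again. Since the lock covers the whole file, this sketch suits small shared files such as a high-score table better than large files where processes work on separate regions.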
+@end table +@end deftypevr + +@comment fcntl.h +@comment POSIX.1 +@deftypevr Macro int F_SETLKW +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set or clear a lock. It is just like the +@code{F_SETLK} command, but causes the process to block (or wait) +until the request can be satisfied. + +This command requires a third argument of type @code{struct flock *}, as +for the @code{F_SETLK} command. + +The @code{fcntl} return values and errors are the same as for the +@code{F_SETLK} command, but these additional @code{errno} error conditions +are defined for this command: + +@table @code +@item EINTR +The function was interrupted by a signal while it was waiting. +@xref{Interrupted Primitives}. + +@item EDEADLK +The specified region is being locked by another process. But that +process is waiting to lock a region which the current process has +locked, so waiting for the lock would result in deadlock. The system +does not guarantee that it will detect all such conditions, but it lets +you know if it notices one. +@end table +@end deftypevr + + +The following macros are defined for use as values for the @code{l_type} +member of the @code{flock} structure. The values are integer constants. + +@table @code +@comment fcntl.h +@comment POSIX.1 +@vindex F_RDLCK +@item F_RDLCK +This macro is used to specify a read (or shared) lock. + +@comment fcntl.h +@comment POSIX.1 +@vindex F_WRLCK +@item F_WRLCK +This macro is used to specify a write (or exclusive) lock. + +@comment fcntl.h +@comment POSIX.1 +@vindex F_UNLCK +@item F_UNLCK +This macro is used to specify that the region is unlocked. +@end table + +As an example of a situation where file locking is useful, consider a +program that can be run simultaneously by several different users, that +logs status information to a common file. One example of such a program +might be a game that uses a file to keep track of high scores.
Another +example might be a program that records usage or accounting information +for billing purposes. + +Having multiple copies of the program simultaneously writing to the +file could cause the contents of the file to become mixed up. But +you can prevent this kind of problem by setting a write lock on the +file before actually writing to the file. + +If the program also needs to read the file and wants to make sure that +the contents of the file are in a consistent state, then it can also use +a read lock. While the read lock is set, no other process can lock +that part of the file for writing. + +@c ??? This section could use an example program. + +Remember that file locks are only a @emph{voluntary} protocol for +controlling access to a file. There is still potential for access to +the file by programs that don't use the lock protocol. + +@node Interrupt Input +@section Interrupt-Driven Input + +@cindex interrupt-driven input +If you set the @code{O_ASYNC} status flag on a file descriptor +(@pxref{File Status Flags}), a @code{SIGIO} signal is sent whenever +input or output becomes possible on that file descriptor. The process +or process group to receive the signal can be selected by using the +@code{F_SETOWN} command to the @code{fcntl} function. If the file +descriptor is a socket, this also selects the recipient of @code{SIGURG} +signals that are delivered when out-of-band data arrives on that socket; +see @ref{Out-of-Band Data}. (@code{SIGURG} is sent in any situation +where @code{select} would report the socket as having an ``exceptional +condition''. @xref{Waiting for I/O}.) + +If the file descriptor corresponds to a terminal device, then @code{SIGIO} +signals are sent to the foreground process group of the terminal. +@xref{Job Control}. + +@pindex fcntl.h +The symbols in this section are defined in the header file +@file{fcntl.h}. 
+ +@comment fcntl.h +@comment BSD +@deftypevr Macro int F_GETOWN +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should get information about the process or process +group to which @code{SIGIO} signals are sent. (For a terminal, this is +actually the foreground process group ID, which you can get using +@code{tcgetpgrp}; see @ref{Terminal Access Functions}.) + +The return value is interpreted as a process ID; if negative, its +absolute value is the process group ID. + +The following @code{errno} error condition is defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. +@end table +@end deftypevr + +@comment fcntl.h +@comment BSD +@deftypevr Macro int F_SETOWN +This macro is used as the @var{command} argument to @code{fcntl}, to +specify that it should set the process or process group to which +@code{SIGIO} signals are sent. This command requires a third argument +of type @code{pid_t} to be passed to @code{fcntl}, so that the form of +the call is: + +@smallexample +fcntl (@var{filedes}, F_SETOWN, @var{pid}) +@end smallexample + +The @var{pid} argument should be a process ID. You can also pass a +negative number whose absolute value is a process group ID. + +The return value from @code{fcntl} with this command is @code{-1} +in case of error and some other value if successful. The following +@code{errno} error conditions are defined for this command: + +@table @code +@item EBADF +The @var{filedes} argument is invalid. + +@item ESRCH +There is no process or process group corresponding to @var{pid}. +@end table +@end deftypevr + +@c ??? This section could use an example program. 
diff --git a/manual/locale.texi b/manual/locale.texi new file mode 100644 index 0000000000..d2d7557ea9 --- /dev/null +++ b/manual/locale.texi @@ -0,0 +1,605 @@ +@node Locales, Searching and Sorting, Extended Characters, Top +@chapter Locales and Internationalization + +Different countries and cultures have varying conventions for how to +communicate. These conventions range from very simple ones, such as the +format for representing dates and times, to very complex ones, such as +the language spoken. + +@cindex internationalization +@cindex locales +@dfn{Internationalization} of software means programming it to be able +to adapt to the user's favorite conventions. In ANSI C, +internationalization works by means of @dfn{locales}. Each locale +specifies a collection of conventions, one convention for each purpose. +The user chooses a set of conventions by specifying a locale (via +environment variables). + +All programs inherit the chosen locale as part of their environment. +Provided the programs are written to obey the choice of locale, they +will follow the conventions preferred by the user. + +@menu +* Effects of Locale:: Actions affected by the choice of + locale. +* Choosing Locale:: How the user specifies a locale. +* Locale Categories:: Different purposes for which you can + select a locale. +* Setting the Locale:: How a program specifies the locale + with library functions. +* Standard Locales:: Locale names available on all systems. +* Numeric Formatting:: How to format numbers according to the + chosen locale. +@end menu + +@node Effects of Locale, Choosing Locale, , Locales +@section What Effects a Locale Has + +Each locale specifies conventions for several purposes, including the +following: + +@itemize @bullet +@item +What multibyte character sequences are valid, and how they are +interpreted (@pxref{Extended Characters}). 
+ +@item +Classification of which characters in the local character set are +considered alphabetic, and upper- and lower-case conversion conventions +(@pxref{Character Handling}). + +@item +The collating sequence for the local language and character set +(@pxref{Collation Functions}). + +@item +Formatting of numbers and currency amounts (@pxref{Numeric Formatting}). + +@item +Formatting of dates and times (@pxref{Formatting Date and Time}). + +@item +What language to use for output, including error messages. +(The C library doesn't yet help you implement this.) + +@item +What language to use for user answers to yes-or-no questions. + +@item +What language to use for more complex user input. +(The C library doesn't yet help you implement this.) +@end itemize + +Some aspects of adapting to the specified locale are handled +automatically by the library subroutines. For example, all your program +needs to do in order to use the collating sequence of the chosen locale +is to use @code{strcoll} or @code{strxfrm} to compare strings. + +Other aspects of locales are beyond the comprehension of the library. +For example, the library can't automatically translate your program's +output messages into other languages. The only way you can support +output in the user's favorite language is to program this more or less +by hand. (Eventually, we hope to provide facilities to make this +easier.) + +This chapter discusses the mechanism by which you can modify the current +locale. The effects of the current locale on specific library functions +are discussed in more detail in the descriptions of those functions. + +@node Choosing Locale, Locale Categories, Effects of Locale, Locales +@section Choosing a Locale + +The simplest way for the user to choose a locale is to set the +environment variable @code{LANG}. This specifies a single locale to use +for all purposes. 
For example, a user could specify a hypothetical +locale named @samp{espana-castellano} to use the standard conventions of +most of Spain. + +The set of locales supported depends on the operating system you are +using, and so do their names. We can't make any promises about what +locales will exist, except for one standard locale called @samp{C} or +@samp{POSIX}. + +@cindex combining locales +A user also has the option of specifying different locales for different +purposes---in effect, choosing a mixture of multiple locales. + +For example, the user might specify the locale @samp{espana-castellano} +for most purposes, but specify the locale @samp{usa-english} for +currency formatting. This might make sense if the user is a +Spanish-speaking American, working in Spanish, but representing monetary +amounts in US dollars. + +Note that both locales @samp{espana-castellano} and @samp{usa-english}, +like all locales, would include conventions for all of the purposes to +which locales apply. However, the user can choose to use each locale +for a particular subset of those purposes. + +@node Locale Categories, Setting the Locale, Choosing Locale, Locales +@section Categories of Activities that Locales Affect +@cindex categories for locales +@cindex locale categories + +The purposes that locales serve are grouped into @dfn{categories}, so +that a user or a program can choose the locale for each category +independently. Here is a table of categories; each name is both an +environment variable that a user can set, and a macro name that you can +use as an argument to @code{setlocale}. + +@table @code +@comment locale.h +@comment ANSI +@item LC_COLLATE +@vindex LC_COLLATE +This category applies to collation of strings (functions @code{strcoll} +and @code{strxfrm}); see @ref{Collation Functions}. 
+ +@comment locale.h +@comment ANSI +@item LC_CTYPE +@vindex LC_CTYPE +This category applies to classification and conversion of characters, +and to multibyte and wide characters; +see @ref{Character Handling} and @ref{Extended Characters}. + +@comment locale.h +@comment ANSI +@item LC_MONETARY +@vindex LC_MONETARY +This category applies to formatting monetary values; see @ref{Numeric +Formatting}. + +@comment locale.h +@comment ANSI +@item LC_NUMERIC +@vindex LC_NUMERIC +This category applies to formatting numeric values that are not +monetary; see @ref{Numeric Formatting}. + +@comment locale.h +@comment ANSI +@item LC_TIME +@vindex LC_TIME +This category applies to formatting date and time values; see +@ref{Formatting Date and Time}. + +@ignore This is apparently a feature that was in some early +draft of the POSIX.2 standard, but it's not listed in draft 11. Do we +still support this anyway? Is there a corresponding environment +variable? + +@comment locale.h +@comment GNU +@item LC_RESPONSE +@vindex LC_RESPONSE +This category applies to recognizing ``yes'' or ``no'' responses to +questions. +@end ignore + +@comment locale.h +@comment ANSI +@item LC_ALL +@vindex LC_ALL +This is not an environment variable; it is only a macro that you can use +with @code{setlocale} to set a single locale for all purposes. + +@comment locale.h +@comment ANSI +@item LANG +@vindex LANG +If this environment variable is defined, its value specifies the locale +to use for all purposes except as overridden by the variables above. +@end table + +@node Setting the Locale, Standard Locales, Locale Categories, Locales +@section How Programs Set the Locale + +A C program inherits its locale environment variables when it starts up. +This happens automatically. However, these variables do not +automatically control the locale used by the library functions, because +ANSI C says that all programs start by default in the standard @samp{C} +locale. 
To use the locales specified by the environment, you must call +@code{setlocale}. Call it as follows: + +@smallexample +setlocale (LC_ALL, ""); +@end smallexample + +@noindent +to select a locale based on the appropriate environment variables. + +@cindex changing the locale +@cindex locale, changing +You can also use @code{setlocale} to specify a particular locale, for +general use or for a specific category. + +@pindex locale.h +The symbols in this section are defined in the header file @file{locale.h}. + +@comment locale.h +@comment ANSI +@deftypefun {char *} setlocale (int @var{category}, const char *@var{locale}) +The function @code{setlocale} sets the current locale for +category @var{category} to @var{locale}. + +If @var{category} is @code{LC_ALL}, this specifies the locale for all +purposes. The other possible values of @var{category} specify an +individual purpose (@pxref{Locale Categories}). + +You can also use this function to find out the current locale by passing +a null pointer as the @var{locale} argument. In this case, +@code{setlocale} returns a string that is the name of the locale +currently selected for category @var{category}. + +The string returned by @code{setlocale} can be overwritten by subsequent +calls, so you should make a copy of the string (@pxref{Copying and +Concatenation}) if you want to save it past any further calls to +@code{setlocale}. (The standard library is guaranteed never to call +@code{setlocale} itself.) + +You should not modify the string returned by @code{setlocale}. +It might be the same string that was passed as an argument in a +previous call to @code{setlocale}. + +When you read the current locale for category @code{LC_ALL}, the value +encodes the entire combination of selected locales for all categories. +In this case, the value is not just a single locale name. In fact, we +don't make any promises about what it looks like. 
But if you specify +the same ``locale name'' with @code{LC_ALL} in a subsequent call to +@code{setlocale}, it restores the same combination of locale selections. + +When the @var{locale} argument is not a null pointer, the string returned +by @code{setlocale} reflects the newly modified locale. + +If you specify an empty string for @var{locale}, this means to read the +appropriate environment variable and use its value to select the locale +for @var{category}. + +If you specify an invalid locale name, @code{setlocale} returns a null +pointer and leaves the current locale unchanged. +@end deftypefun + +Here is an example showing how you might use @code{setlocale} to +temporarily switch to a new locale. + +@smallexample +#include <stddef.h> +#include <locale.h> +#include <stdlib.h> +#include <string.h> + +void +with_other_locale (char *new_locale, + void (*subroutine) (int), + int argument) +@{ + char *old_locale, *saved_locale; + + /* @r{Get the name of the current locale.} */ + old_locale = setlocale (LC_ALL, NULL); + + /* @r{Copy the name so it won't be clobbered by @code{setlocale}.} */ + saved_locale = strdup (old_locale); + if (saved_locale == NULL) + abort (); /* @r{Out of memory.} */ + + /* @r{Now change the locale and do some stuff with it.} */ + setlocale (LC_ALL, new_locale); + (*subroutine) (argument); + + /* @r{Restore the original locale.} */ + setlocale (LC_ALL, saved_locale); + free (saved_locale); +@} +@end smallexample + +@strong{Portability Note:} Some ANSI C systems may define additional +locale categories. For portability, assume that any symbol beginning +with @samp{LC_} might be defined in @file{locale.h}. + +@node Standard Locales, Numeric Formatting, Setting the Locale, Locales +@section Standard Locales + +The only locale names you can count on finding on all operating systems +are these three standard ones: + +@table @code +@item "C" +This is the standard C locale. The attributes and behavior it provides +are specified in the ANSI C standard.
When your program starts up, it +initially uses this locale by default. + +@item "POSIX" +This is the standard POSIX locale. Currently, it is an alias for the +standard C locale. + +@item "" +The empty name says to select a locale based on environment variables. +@xref{Locale Categories}. +@end table + +Defining and installing named locales is normally a responsibility of +the system administrator at your site (or the person who installed the +GNU C library). Some systems may allow users to create locales, but +we don't discuss that here. +@c ??? If we give the GNU system that capability, this place will have +@c ??? to be changed. + +If your program needs to use something other than the @samp{C} locale, +it will be more portable if you use whatever locale the user specifies +with the environment, rather than trying to specify some non-standard +locale explicitly by name. Remember, different machines might have +different sets of locales installed. + +@node Numeric Formatting, , Standard Locales, Locales +@section Numeric Formatting + +When you want to format a number or a currency amount using the +conventions of the current locale, you can use the function +@code{localeconv} to get the data on how to do it. The function +@code{localeconv} is declared in the header file @file{locale.h}. +@pindex locale.h +@cindex monetary value formatting +@cindex numeric value formatting + +@comment locale.h +@comment ANSI +@deftypefun {struct lconv *} localeconv (void) +The @code{localeconv} function returns a pointer to a structure whose +components contain information about how numeric and monetary values +should be formatted in the current locale. + +You shouldn't modify the structure or its contents. The structure might +be overwritten by subsequent calls to @code{localeconv}, or by calls to +@code{setlocale}, but no other function in the library overwrites this +value. 
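Because the structure can be overwritten later, a program that wants to keep a value from it should copy that value. Here is a minimal sketch of the idea; the helper name @code{copy_decimal_point} is hypothetical, and it copies only the @code{decimal_point} member of @code{struct lconv} as an illustration.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a library function.
   Return a freshly allocated copy of the current locale's
   decimal-point string, or a null pointer if memory runs out.
   The caller is responsible for freeing the result.  */
char *
copy_decimal_point (void)
{
  const struct lconv *conv = localeconv ();
  size_t len = strlen (conv->decimal_point) + 1;
  char *copy = malloc (len);

  /* Copy the string, including its terminating null character.  */
  if (copy != NULL)
    memcpy (copy, conv->decimal_point, len);
  return copy;
}
```

The copy stays valid across later calls to @code{localeconv} and @code{setlocale}, which the pointer returned by @code{localeconv} is not guaranteed to do.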
+@end deftypefun + +@comment locale.h +@comment ANSI +@deftp {Data Type} {struct lconv} +This is the data type of the value returned by @code{localeconv}. +@end deftp + +If a member of the structure @code{struct lconv} has type @code{char}, +and the value is @code{CHAR_MAX}, it means that the current locale has +no value for that parameter. + +@menu +* General Numeric:: Parameters for formatting numbers and + currency amounts. +* Currency Symbol:: How to print the symbol that identifies an + amount of money (e.g. @samp{$}). +* Sign of Money Amount:: How to print the (positive or negative) sign + for a monetary amount, if one exists. +@end menu + +@node General Numeric, Currency Symbol, , Numeric Formatting +@subsection Generic Numeric Formatting Parameters + +These are the standard members of @code{struct lconv}; there may be +others. + +@table @code +@item char *decimal_point +@itemx char *mon_decimal_point +These are the decimal-point separators used in formatting non-monetary +and monetary quantities, respectively. In the @samp{C} locale, the +value of @code{decimal_point} is @code{"."}, and the value of +@code{mon_decimal_point} is @code{""}. +@cindex decimal-point separator + +@item char *thousands_sep +@itemx char *mon_thousands_sep +These are the separators used to delimit groups of digits to the left of +the decimal point in formatting non-monetary and monetary quantities, +respectively. In the @samp{C} locale, both members have a value of +@code{""} (the empty string). + +@item char *grouping +@itemx char *mon_grouping +These are strings that specify how to group the digits to the left of +the decimal point. @code{grouping} applies to non-monetary quantities +and @code{mon_grouping} applies to monetary quantities. Use either +@code{thousands_sep} or @code{mon_thousands_sep} to separate the digit +groups. +@cindex grouping of digits + +Each element of these strings is to be interpreted as a small integer +value of type @code{char}, not as a printable character.
+Successive elements (from left to right) give the sizes of successive +groups (from right to left, starting at the decimal point). The last +element is either @code{0}, in which case the previous element is used +over and over for all the remaining groups, or @code{CHAR_MAX}, in which +case there is no more grouping; put another way, any remaining digits +form one large group without separators. + +For example, if @code{grouping} is @code{"\04\03\02"}, the correct grouping +for the number @code{123456787654321} is @samp{12}, @samp{34}, +@samp{56}, @samp{78}, @samp{765}, @samp{4321}. This uses a group of 4 +digits at the end, preceded by a group of 3 digits, preceded by groups +of 2 digits (as many as needed). With a separator of @samp{,}, the +number would be printed as @samp{12,34,56,78,765,4321}. + +A value of @code{"\03"} indicates repeated groups of three digits, as +normally used in the U.S. + +In the standard @samp{C} locale, both @code{grouping} and +@code{mon_grouping} have a value of @code{""}. This value specifies no +grouping at all. + +@item char int_frac_digits +@itemx char frac_digits +These are small integers indicating how many fractional digits (to the +right of the decimal point) should be displayed in a monetary value in +international and local formats, respectively. (Most often, both +members have the same value.) + +In the standard @samp{C} locale, both of these members have the value +@code{CHAR_MAX}, meaning ``unspecified''. The ANSI standard doesn't say +what to do when you find this value; we recommend printing no +fractional digits. (This locale also specifies the empty string for +@code{mon_decimal_point}, so printing any fractional digits would be +confusing!)
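The recommendation above for @code{int_frac_digits} and @code{frac_digits} can be sketched as a small helper that maps the ``unspecified'' value @code{CHAR_MAX} to zero fractional digits. The name @code{monetary_frac_digits} and its @var{international} parameter are hypothetical, introduced only for this illustration.

```c
#include <limits.h>
#include <locale.h>

/* Hypothetical helper, not a library function.
   Return how many fractional digits to print for a monetary amount,
   using the international format if INTERNATIONAL is nonzero and the
   local format otherwise.  When the locale leaves the value
   unspecified (CHAR_MAX), return 0 so that no fractional digits are
   printed.  */
int
monetary_frac_digits (const struct lconv *conv, int international)
{
  char digits = international ? conv->int_frac_digits : conv->frac_digits;

  if (digits == CHAR_MAX)
    return 0;
  return digits;
}
```

A caller would typically pass the result of @code{localeconv} as @var{conv} and use the returned count as the precision when formatting the amount.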
+@end table + +@node Currency Symbol, Sign of Money Amount, General Numeric, Numeric Formatting +@subsection Printing the Currency Symbol +@cindex currency symbols + +These members of the @code{struct lconv} structure specify how to print +the symbol to identify a monetary value---the international analog of +@samp{$} for US dollars. + +Each country has two standard currency symbols. The @dfn{local currency +symbol} is used commonly within the country, while the +@dfn{international currency symbol} is used internationally to refer to +that country's currency when it is necessary to indicate the country +unambiguously. + +For example, many countries use the dollar as their monetary unit, and +when dealing with international currencies it's important to specify +that one is dealing with (say) Canadian dollars instead of U.S. dollars +or Australian dollars. But when the context is known to be Canada, +there is no need to make this explicit---dollar amounts are implicitly +assumed to be in Canadian dollars. + +@table @code +@item char *currency_symbol +The local currency symbol for the selected locale. + +In the standard @samp{C} locale, this member has a value of @code{""} +(the empty string), meaning ``unspecified''. The ANSI standard doesn't +say what to do when you find this value; we recommend you simply print +the empty string as you would print any other string found in the +appropriate member. + +@item char *int_curr_symbol +The international currency symbol for the selected locale. + +The value of @code{int_curr_symbol} should normally consist of a +three-letter abbreviation determined by the international standard +@cite{ISO 4217 Codes for the Representation of Currency and Funds}, +followed by a one-character separator (often a space). + +In the standard @samp{C} locale, this member has a value of @code{""} +(the empty string), meaning ``unspecified''. 
We recommend you simply +print the empty string as you would print any other string found in the +appropriate member. + +@item char p_cs_precedes +@itemx char n_cs_precedes +These members are @code{1} if the @code{currency_symbol} string should +precede the value of a monetary amount, or @code{0} if the string should +follow the value. The @code{p_cs_precedes} member applies to positive +amounts (or zero), and the @code{n_cs_precedes} member applies to +negative amounts. + +In the standard @samp{C} locale, both of these members have a value of +@code{CHAR_MAX}, meaning ``unspecified''. The ANSI standard doesn't say +what to do when you find this value, but we recommend printing the +currency symbol before the amount. That's right for most countries. +In other words, treat all nonzero values alike in these members. + +The POSIX standard says that these two members apply to the +@code{int_curr_symbol} as well as the @code{currency_symbol}. The ANSI +C standard seems to imply that they should apply only to the +@code{currency_symbol}---so the @code{int_curr_symbol} should always +precede the amount. + +We can only guess which of these (if either) matches the usual +conventions for printing international currency symbols. Our guess is +that they should always precede the amount. If we find out a reliable +answer, we will put it here. + +@item char p_sep_by_space +@itemx char n_sep_by_space +These members are @code{1} if a space should appear between the +@code{currency_symbol} string and the amount, or @code{0} if no space +should appear. The @code{p_sep_by_space} member applies to positive +amounts (or zero), and the @code{n_sep_by_space} member applies to +negative amounts. + +In the standard @samp{C} locale, both of these members have a value of +@code{CHAR_MAX}, meaning ``unspecified''. The ANSI standard doesn't say +what you should do when you find this value; we suggest you treat it as +one (print a space). 
In other words, treat all nonzero values alike in +these members. + +These members apply only to @code{currency_symbol}. When you use +@code{int_curr_symbol}, you never print an additional space, because +@code{int_curr_symbol} itself contains the appropriate separator. + +The POSIX standard says that these two members apply to the +@code{int_curr_symbol} as well as the @code{currency_symbol}. But an +example in the ANSI C standard clearly implies that they should apply +only to the @code{currency_symbol}---that the @code{int_curr_symbol} +contains any appropriate separator, so you should never print an +additional space. + +Based on what we know now, we recommend you ignore these members when +printing international currency symbols, and print no extra space. +@end table + +@node Sign of Money Amount, , Currency Symbol, Numeric Formatting +@subsection Printing the Sign of an Amount of Money + +These members of the @code{struct lconv} structure specify how to print +the sign (if any) in a monetary value. + +@table @code +@item char *positive_sign +@itemx char *negative_sign +These are strings used to indicate positive (or zero) and negative +(respectively) monetary quantities. + +In the standard @samp{C} locale, both of these members have a value of +@code{""} (the empty string), meaning ``unspecified''. + +The ANSI standard doesn't say what to do when you find this value; we +recommend printing @code{positive_sign} as you find it, even if it is +empty. For a negative value, print @code{negative_sign} as you find it +unless both it and @code{positive_sign} are empty, in which case print +@samp{-} instead. (Failing to indicate the sign at all seems rather +unreasonable.) + +@item char p_sign_posn +@itemx char n_sign_posn +These members have values that are small integers indicating how to +position the sign for nonnegative and negative monetary quantities, +respectively. 
(The string used by the sign is what was specified with +@code{positive_sign} or @code{negative_sign}.) The possible values are +as follows: + +@table @code +@item 0 +The currency symbol and quantity should be surrounded by parentheses. + +@item 1 +Print the sign string before the quantity and currency symbol. + +@item 2 +Print the sign string after the quantity and currency symbol. + +@item 3 +Print the sign string right before the currency symbol. + +@item 4 +Print the sign string right after the currency symbol. + +@item CHAR_MAX +``Unspecified''. Both members have this value in the standard +@samp{C} locale. +@end table + +The ANSI standard doesn't say what you should do when the value is +@code{CHAR_MAX}. We recommend you print the sign after the currency +symbol. +@end table + +It is not clear whether you should let these members apply to the +international currency format or not. POSIX says you should, but +intuition plus the examples in the ANSI C standard suggest you should +not. We hope that someone who knows well the conventions for formatting +monetary quantities will tell us what we should recommend. + diff --git a/manual/maint.texi b/manual/maint.texi new file mode 100644 index 0000000000..0d29d80ec9 --- /dev/null +++ b/manual/maint.texi @@ -0,0 +1,966 @@ +@c \input /gd/gnu/doc/texinfo +@c This is for making the `INSTALL' file for the distribution. +@c Makeinfo ignores it when processing the file from the include. +@setfilename INSTALL + +@node Maintenance, Copying, Library Summary, Top +@appendix Library Maintenance + +@menu +* Installation:: How to configure, compile and + install the GNU C library. +* Reporting Bugs:: How to report bugs (if you want to + get them fixed) and other troubles + you may have with the GNU C library. +* Source Layout:: How to add new functions or header files + to the GNU C library. +* Porting:: How to port the GNU C library to + a new machine or operating system. +* Contributors:: Contributors to the GNU C Library. 
+@end menu + +@node Installation +@appendixsec How to Install the GNU C Library +@cindex installing the library + +Installation of the GNU C library is relatively simple. + +You need the latest version of GNU @code{make}. Modifying the GNU C +Library to work with other @code{make} programs would be so hard that we +recommend you port GNU @code{make} instead. @strong{Really.}@refill + +To configure the GNU C library for your system, run the shell script +@file{configure} with @code{sh}. Use an argument which is the +conventional GNU name for your system configuration---for example, +@samp{sparc-sun-sunos4.1}, for a Sun 4 running SunOS 4.1. +@xref{Installation, Installation, Installing GNU CC, gcc.info, Using and +Porting GNU CC}, for a full description of standard GNU configuration +names. If you omit the configuration name, @file{configure} will try to +guess one for you by inspecting the system it is running on. It may or +may not be able to come up with a guess, and its guess might be +wrong. @file{configure} will tell you the canonical name of the chosen +configuration before proceeding. + +The GNU C Library currently supports configurations that match the +following patterns: + +@smallexample +alpha-dec-osf1 +i386-@var{anything}-bsd4.3 +i386-@var{anything}-gnu +i386-@var{anything}-isc2.2 +i386-@var{anything}-isc3.@var{n} +i386-@var{anything}-sco3.2 +i386-@var{anything}-sco3.2v4 +i386-@var{anything}-sysv +i386-@var{anything}-sysv4 +i386-force_cpu386-none +i386-sequent-bsd +i960-nindy960-none +m68k-hp-bsd4.3 +m68k-mvme135-none +m68k-mvme136-none +m68k-sony-newsos3 +m68k-sony-newsos4 +m68k-sun-sunos4.@var{n} +mips-dec-ultrix4.@var{n} +mips-sgi-irix4.@var{n} +sparc-sun-solaris2.@var{n} +sparc-sun-sunos4.@var{n} +@end smallexample + +While no other configurations are supported, there are handy aliases for +these few. (These aliases work in other GNU software as well.) 
+ +@smallexample +decstation +hp320-bsd4.3 hp300bsd +i386-sco +i386-sco3.2v4 +i386-sequent-dynix +i386-svr4 +news +sun3-sunos4.@var{n} sun3 +sun4-solaris2.@var{n} sun4-sunos5.@var{n} +sun4-sunos4.@var{n} sun4 +@end smallexample + +Here are some options that you should specify (if appropriate) when +you run @code{configure}: + +@table @samp +@item --with-gnu-ld +Use this option if you plan to use GNU @code{ld} to link programs with +the GNU C Library. (We strongly recommend that you do.) This option +enables use of features that exist only in GNU @code{ld}; so if you +configure for GNU @code{ld} you must use GNU @code{ld} @emph{every time} +you link with the GNU C Library, and when building it. + +@item --with-gnu-as +Use this option if you plan to use the GNU assembler, @code{gas}, when +building the GNU C Library. On some systems, the library may not build +properly if you do @emph{not} use @code{gas}. + +@c extra blank line makes it look better +@item --nfp + +Use this option if your computer lacks hardware floating point support. + +@item --prefix=@var{directory} +Install machine-independent data files in subdirectories of +@file{@var{directory}}. (You can also set this in @file{configparms}; +see below.) + +@item --exec-prefix=@var{directory} +Install the library and other machine-dependent files in subdirectories +of @file{@var{directory}}. (You can also set this in +@file{configparms}; see below.) +@end table + +The simplest way to run @code{configure} is to do it in the directory +that contains the library sources. This prepares to build the library +in that very directory. + +You can prepare to build the library in some other directory by going +to that other directory to run @code{configure}. 
In order to run +configure, you will have to specify a directory for it, like this: + +@smallexample +mkdir sun4 +cd sun4 +../configure sparc-sun-sunos4.1 +@end smallexample + +@noindent +@code{configure} looks for the sources in whatever directory you +specified for finding @code{configure} itself. It does not matter where +in the file system the source and build directories are---as long as you +specify the source directory when you run @code{configure}, you will get +the proper results. + +This feature lets you keep sources and binaries in different +directories, and that makes it easy to build the library for several +different machines from the same set of sources. Simply create a +build directory for each target machine, and run @code{configure} in +that directory specifying the target machine's configuration name. + +The library has a number of special-purpose configuration parameters. +These are defined in the file @file{Makeconfig}; see the comments in +that file for the details. + +But don't edit the file @file{Makeconfig} yourself---instead, create a +file @file{configparms} in the directory where you are building the +library, and define in that file the parameters you want to specify. +@file{configparms} should @strong{not} be an edited copy of +@file{Makeconfig}; specify only the parameters that you want to +override. To see how to set these parameters, find the section of +@file{Makeconfig} that says ``These are the configuration variables.'' +Then for each parameter that you want to change, copy the definition +from @file{Makeconfig} to your new @file{configparms} file, and change +the value as appropriate for your system. + +It is easy to configure the GNU C library for cross-compilation by +setting a few variables in @file{configparms}. 
Set @code{CC} to the +cross-compiler for the target you configured the library for; it is +important to use this same @code{CC} value when running +@code{configure}, like this: @samp{CC=@var{target}-gcc configure +@var{target}}. Set @code{BUILD_CC} to the compiler to use for +programs run on the build system as part of compiling the library. You +may need to set @code{AR} and @code{RANLIB} to cross-compiling versions +of @code{ar} and @code{ranlib} if the native tools are not configured to +work with object files for the target you configured for. + +Some of the machine-dependent code for some machines uses extensions in +the GNU C compiler, so you may need to compile the library with GCC. +(In fact, all of the existing complete ports require GCC.) + +The current release of the C library contains some header files that the +compiler normally provides: @file{stddef.h}, @file{stdarg.h}, and +several files with names of the form @file{va-@var{machine}.h}. The +versions of these files that came with older releases of GCC do not work +properly with the GNU C library. The @file{stddef.h} file in release +2.2 and later of GCC is correct. If you have release 2.2 or later of +GCC, use its version of @file{stddef.h} instead of the C library's. To +do this, put the line @w{@samp{override stddef.h =}} in +@file{configparms}. The other files are corrected in release 2.3 and +later of GCC. @file{configure} will automatically detect whether the +installed @file{stdarg.h} and @file{va-@var{machine}.h} files are +compatible with the C library, and use its own if not. + +There is a potential problem with the @code{size_t} type and versions of +GCC prior to release 2.4. ANSI C requires that @code{size_t} always be +an unsigned type. For compatibility with existing systems' header +files, GCC defines @code{size_t} in @file{stddef.h} to be whatever type +the system's @file{sys/types.h} defines it to be. 
Most Unix systems +that define @code{size_t} in @file{sys/types.h} define it to be a +signed type. Some code in the library depends on @code{size_t} being an +unsigned type, and will not work correctly if it is signed. + +The GNU C library code which expects @code{size_t} to be unsigned is +correct. The definition of @code{size_t} as a signed type is incorrect. +Versions 2.4 and later of GCC always define @code{size_t} as an unsigned +type, and GCC's @file{fixincludes} script massages the system's +@file{sys/types.h} so as not to conflict with this. + +In the meantime, we work around this problem by telling GCC explicitly +to use an unsigned type for @code{size_t} when compiling the GNU C +library. @file{configure} will automatically detect what type GCC uses +for @code{size_t} and arrange to override it if necessary. + +To build the library, type @code{make lib}. This will produce a lot of +output, some of which looks like errors from @code{make} (but isn't). +Look for error messages from @code{make} containing @samp{***}. Those +indicate that something is really wrong. + +To build and run some test programs which exercise some of the library +facilities, type @code{make tests}. This will produce several files +with names like @file{@var{program}.out}. + +To format the @cite{GNU C Library Reference Manual} for printing, type +@w{@code{make dvi}}. To format the Info version of the manual for on +line reading with @kbd{C-h i} in Emacs or with the @code{info} program, +type @w{@code{make info}}. + +To install the library and its header files, and the Info files of the +manual, type @code{make install}, after setting the installation +directories in @file{configparms}. This will build things if necessary, +before installing them.@refill + +@node Reporting Bugs +@appendixsec Reporting Bugs +@cindex reporting bugs +@cindex bugs, reporting + +There are probably bugs in the GNU C library. There are certainly +errors and omissions in this manual. 
If you report them, they will get +fixed. If you don't, no one will ever know about them and they will +remain unfixed for all eternity, if not longer. + +To report a bug, first you must find it. Hopefully, this will be the +hard part. Once you've found a bug, make sure it's really a bug. A +good way to do this is to see if the GNU C library behaves the same way +some other C library does. If so, probably you are wrong and the +libraries are right (but not necessarily). If not, one of the libraries +is probably wrong. + +Once you're sure you've found a bug, try to narrow it down to the +smallest test case that reproduces the problem. In the case of a C +library, you really only need to narrow it down to one library +function call, if possible. This should not be too difficult. + +The final step when you have a simple test case is to report the bug. +When reporting a bug, send your test case, the results you got, the +results you expected, what you think the problem might be (if you've +thought of anything), your system type, and the version of the GNU C +library which you are using. Also include the files +@file{config.status} and @file{config.make} which are created by running +@file{configure}; they will be in whatever directory was current when +you ran @file{configure}. + +If you think you have found some way in which the GNU C library does not +conform to the ANSI and POSIX standards (@pxref{Standards and +Portability}), that is definitely a bug. Report it!@refill + +Send bug reports to the Internet address +@samp{bug-glibc@@prep.ai.mit.edu} or the UUCP path +@samp{mit-eddie!prep.ai.mit.edu!bug-glibc}. If you have other problems +with installation or use, please report those as well.@refill + +If you are not sure how a function should behave, and this manual +doesn't tell you, that's a bug in the manual. Report that too! If the +function's behavior disagrees with the manual, then either the library +or the manual has a bug, so report the disagreement. 
If you find any +errors or omissions in this manual, please report them to the Internet +address @samp{bug-glibc-manual@@prep.ai.mit.edu} or the UUCP path +@samp{mit-eddie!prep.ai.mit.edu!bug-glibc-manual}. + +@node Source Layout +@appendixsec Adding New Functions + +The process of building the library is driven by the makefiles, which +make heavy use of special features of GNU @code{make}. The makefiles +are very complex, and you probably don't want to try to understand them. +But what they do is fairly straightforward, and only requires that you +define a few variables in the right places. + +The library sources are divided into subdirectories, grouped by topic. +The @file{string} subdirectory has all the string-manipulation +functions, @file{stdio} has all the standard I/O functions, etc. + +Each subdirectory contains a simple makefile, called @file{Makefile}, +which defines a few @code{make} variables and then includes the global +makefile @file{Rules} with a line like: + +@smallexample +include ../Rules +@end smallexample + +@noindent +The basic variables that a subdirectory makefile defines are: + +@table @code +@item subdir +The name of the subdirectory, for example @file{stdio}. +This variable @strong{must} be defined. + +@item headers +The names of the header files in this section of the library, +such as @file{stdio.h}. + +@item routines +@itemx aux +The names of the modules (source files) in this section of the library. +These should be simple names, such as @samp{strlen} (rather than +complete file names, such as @file{strlen.c}). Use @code{routines} for +modules that define functions in the library, and @code{aux} for +auxiliary modules containing things like data definitions. But the +values of @code{routines} and @code{aux} are just concatenated, so there +really is no practical difference.@refill + +@item tests +The names of test programs for this section of the library. 
These +should be simple names, such as @samp{tester} (rather than complete file +names, such as @file{tester.c}). @w{@samp{make tests}} will build and +run all the test programs. If a test program needs input, put the test +data in a file called @file{@var{test-program}.input}; it will be given to +the test program on its standard input. If a test program wants to be +run with arguments, put the arguments (all on a single line) in a file +called @file{@var{test-program}.args}.@refill + +@item others +The names of ``other'' programs associated with this section of the +library. These are programs which are not tests per se, but are other +small programs included with the library. They are built by +@w{@samp{make others}}.@refill + +@item install-lib +@itemx install-data +@itemx install +Files to be installed by @w{@samp{make install}}. Files listed in +@samp{install-lib} are installed in the directory specified by +@samp{libdir} in @file{configparms} or @file{Makeconfig} +(@pxref{Installation}). Files listed in @code{install-data} are +installed in the directory specified by @samp{datadir} in +@file{configparms} or @file{Makeconfig}. Files listed in @code{install} +are installed in the directory specified by @samp{bindir} in +@file{configparms} or @file{Makeconfig}.@refill + +@item distribute +Other files from this subdirectory which should be put into a +distribution tar file. You need not list here the makefile itself or +the source and header files listed in the other standard variables. +Only define @code{distribute} if there are files used in an unusual way +that should go into the distribution. + +@item generated +Files which are generated by @file{Makefile} in this subdirectory. +These files will be removed by @w{@samp{make clean}}, and they will +never go into a distribution. + +@item extra-objs +Extra object files which are built by @file{Makefile} in this +subdirectory. 
This should be a list of file names like @file{foo.o}; +the files will actually be found in whatever directory object files are +being built in. These files will be removed by @w{@samp{make clean}}. +This variable is used for secondary object files needed to build +@code{others} or @code{tests}. +@end table + +@node Porting +@appendixsec Porting the GNU C Library + +The GNU C library is written to be easily portable to a variety of +machines and operating systems. Machine- and operating system-dependent +functions are well separated to make it easy to add implementations for +new machines or operating systems. This section describes the layout of +the library source tree and explains the mechanisms used to select +machine-dependent code to use. + +All the machine-dependent and operating system-dependent files in the +library are in the subdirectory @file{sysdeps} under the top-level +library source directory. This directory contains a hierarchy of +subdirectories (@pxref{Hierarchy Conventions}). + +Each subdirectory of @file{sysdeps} contains source files for a +particular machine or operating system, or for a class of machine or +operating system (for example, systems by a particular vendor, or all +machines that use IEEE 754 floating-point format). A configuration +specifies an ordered list of these subdirectories. Each subdirectory +implicitly appends its parent directory to the list. For example, +specifying the list @file{unix/bsd/vax} is equivalent to specifying the +list @file{unix/bsd/vax unix/bsd unix}. A subdirectory can also specify +that it implies other subdirectories which are not directly above it in +the directory hierarchy. If the file @file{Implies} exists in a +subdirectory, it lists other subdirectories of @file{sysdeps} which are +appended to the list, appearing after the subdirectory containing the +@file{Implies} file. Lines in an @file{Implies} file that begin with a +@samp{#} character are ignored as comments. 
For example, +@file{unix/bsd/Implies} contains:@refill +@smallexample +# BSD has Internet-related things. +unix/inet +@end smallexample +@noindent +and @file{unix/Implies} contains: +@need 300 +@smallexample +posix +@end smallexample + +@noindent +So the final list is @file{unix/bsd/vax unix/bsd unix/inet unix posix}. + +@file{sysdeps} has two ``special'' subdirectories, called @file{generic} +and @file{stub}. These two are always implicitly appended to the list +of subdirectories (in that order), so you needn't put them in an +@file{Implies} file, and you should not create any subdirectories under +them. @file{generic} is for things that can be implemented in +machine-independent C, using only other machine-independent functions in +the C library. @file{stub} is for @dfn{stub} versions of functions +which cannot be implemented on a particular machine or operating system. +The stub functions always return an error, and set @code{errno} to +@code{ENOSYS} (Function not implemented). @xref{Error Reporting}. + +A source file is known to be system-dependent by its having a version in +@file{generic} or @file{stub}; every system-dependent function should +have either a generic or stub implementation (there is no point in +having both). + +If you come across a file that is in one of the main source directories +(@file{string}, @file{stdio}, etc.), and you want to write a machine- or +operating system-dependent version of it, move the file into +@file{sysdeps/generic} and write your new implementation in the +appropriate system-specific subdirectory. Note that if a file is to be +system-dependent, it @strong{must not} appear in one of the main source +directories.@refill + +There are a few special files that may exist in each subdirectory of +@file{sysdeps}: + +@comment Blank lines after items make the table look better. +@table @file +@item Makefile + +A makefile for this machine or operating system, or class of machine or +operating system. 
This file is included by the library makefile +@file{Makerules}, which is used by the top-level makefile and the +subdirectory makefiles. It can change the variables set in the +including makefile or add new rules. It can use GNU @code{make} +conditional directives based on the variable @samp{subdir} (see above) to +select different sets of variables and rules for different sections of +the library. It can also set the @code{make} variable +@samp{sysdep-routines}, to specify extra modules to be included in the +library. You should use @samp{sysdep-routines} rather than adding +modules to @samp{routines} because the latter is used in determining +what to distribute for each subdirectory of the main source tree.@refill + +Each makefile in a subdirectory in the ordered list of subdirectories to +be searched is included in order. Since several system-dependent +makefiles may be included, each should append to @samp{sysdep-routines} +rather than simply setting it: + +@smallexample +sysdep-routines := $(sysdep-routines) foo bar +@end smallexample + +@need 1000 +@item Subdirs + +This file contains the names of new whole subdirectories under the +top-level library source tree that should be included for this system. +These subdirectories are treated just like the system-independent +subdirectories in the library source tree, such as @file{stdio} and +@file{math}. + +Use this when there are completely new sets of functions and header +files that should go into the library for the system this subdirectory +of @file{sysdeps} implements. For example, +@file{sysdeps/unix/inet/Subdirs} contains @file{inet}; the @file{inet} +directory contains various network-oriented operations which only make +sense to put in the library on systems that support the Internet.@refill + +@item Dist + +This file contains the names of files (relative to the subdirectory of +@file{sysdeps} in which it appears) which should be included in the +distribution. 
List any new files used by rules in the @file{Makefile} +in the same directory, or header files used by the source files in that +directory. You don't need to list files that are implementations +(either C or assembly source) of routines whose names are given in the +machine-independent makefiles in the main source tree. + +@item configure + +This file is a shell script fragment to be run at configuration time. +The top-level @file{configure} script uses the shell @code{.} command to +read the @file{configure} file in each system-dependent directory +chosen, in order. The @file{configure} files are often generated from +@file{configure.in} files using Autoconf. + +A system-dependent @file{configure} script will usually add things to +the shell variables @samp{DEFS} and @samp{config_vars}; see the +top-level @file{configure} script for details. The script can check for +@w{@samp{--with-@var{package}}} options that were passed to the +top-level @file{configure}. For an option +@w{@samp{--with-@var{package}=@var{value}}} @file{configure} sets the +shell variable @w{@samp{with_@var{package}}} (with any dashes in +@var{package} converted to underscores) to @var{value}; if the option is +just @w{@samp{--with-@var{package}}} (no argument), then it sets +@w{@samp{with_@var{package}}} to @samp{yes}. + +@item configure.in + +This file is an Autoconf input fragment to be processed into the file +@file{configure} in this subdirectory. @xref{Introduction,,, +autoconf.info, Autoconf: Generating Automatic Configuration Scripts}, +for a description of Autoconf. You should write either @file{configure} +or @file{configure.in}, but not both. The first line of +@file{configure.in} should invoke the @code{m4} macro +@samp{GLIBC_PROVIDES}. This macro does several @code{AC_PROVIDE} calls +for Autoconf macros which are used by the top-level @file{configure} +script; without this, those macros might be invoked again unnecessarily +by Autoconf. 
+@end table + +That is the general system for how system-dependencies are isolated. +@iftex +The next section explains how to decide what directories in +@file{sysdeps} to use. @ref{Porting to Unix}, has some tips on porting +the library to Unix variants. +@end iftex + +@menu +* Hierarchy Conventions:: The layout of the @file{sysdeps} hierarchy. +* Porting to Unix:: Porting the library to an average + Unix-like system. +@end menu + +@node Hierarchy Conventions +@appendixsubsec Layout of the @file{sysdeps} Directory Hierarchy + +A GNU configuration name has three parts: the CPU type, the +manufacturer's name, and the operating system. @file{configure} uses +these to pick the list of system-dependent directories to look for. If +the @samp{--nfp} option is @emph{not} passed to @file{configure}, the +directory @file{@var{machine}/fpu} is also used. The operating system +often has a @dfn{base operating system}; for example, if the operating +system is @samp{sunos4.1}, the base operating system is @samp{unix/bsd}. +The algorithm used to pick the list of directories is simple: +@file{configure} makes a list of the base operating system, +manufacturer, CPU type, and operating system, in that order. It then +concatenates all these together with slashes in between, to produce a +directory name; for example, the configuration @w{@samp{sparc-sun-sunos4.1}} +results in @file{unix/bsd/sun/sparc/sunos4.1}. @file{configure} then +tries removing each element of the list in turn, so +@file{unix/bsd/sparc} and @file{sun/sparc} are also tried, among others. +Since the precise version number of the operating system is often not +important, and it would be very inconvenient, for example, to have +identical @file{sunos4.1.1} and @file{sunos4.1.2} directories, +@file{configure} tries successively less specific operating system names +by removing trailing suffixes starting with a period. 
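The selection procedure described above can be sketched in a few lines of Python. This is only an illustration, not the code in @file{configure}. The function names (@code{os_variants}, @code{base_variants}, @code{sysdeps_candidates}) are hypothetical, and the base operating system is passed in rather than derived from the operating system name. The sketch also assumes that less specific operating system names are formed by dropping a trailing period suffix and then any trailing digits. It places the operating system before the CPU type, which is the order used in the example list for @w{@samp{sparc-sun-sunos4.1}}.

```python
from itertools import product


def os_variants(os_name):
    """Return os_name followed by successively less specific names.

    Assumption: a trailing ".suffix" is dropped first, then trailing
    digits (sunos4.1 -> sunos4 -> sunos).
    """
    names = [os_name]
    name = os_name
    while "." in name:
        name = name.rsplit(".", 1)[0]
        names.append(name)
    stripped = name.rstrip("0123456789")
    if stripped and stripped != name:
        names.append(stripped)
    return names


def base_variants(base):
    """Return the base OS directory and each of its parents (unix/bsd, unix)."""
    parts = base.split("/") if base else []
    return ["/".join(parts[:n]) for n in range(len(parts), 0, -1)]


def sysdeps_candidates(cpu, vendor, os_name, base, fpu=True):
    """List candidate sysdeps directories, most specific first."""
    found = [f"{cpu}/fpu"] if fpu else []
    # An empty string in a position means "leave this element out".
    # product() varies the last position fastest, like nested loops
    # over base, vendor, operating system, and CPU type.
    choices = product(
        base_variants(base) + [""],
        [vendor, ""],
        os_variants(os_name) + [""],
        [cpu, ""],
    )
    for combo in choices:
        path = "/".join(part for part in combo if part)
        if path:  # skip the combination with every element left out
            found.append(path)
    return found


if __name__ == "__main__":
    for directory in sysdeps_candidates("sparc", "sun", "sunos4.1", "unix/bsd"):
        print(directory)
```

Under these assumptions, @code{sysdeps_candidates("sparc", "sun", "sunos4.1", "unix/bsd")} yields 48 names, beginning with @file{sparc/fpu} and ending with @file{sparc}. Passing @code{fpu=False} corresponds to giving the @samp{--nfp} option.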
+ +As an example, here is the complete list of directories that would be +tried for the configuration @w{@samp{sparc-sun-sunos4.1}} (without the +@w{@samp{--nfp}} option): + +@smallexample +sparc/fpu +unix/bsd/sun/sunos4.1/sparc +unix/bsd/sun/sunos4.1 +unix/bsd/sun/sunos4/sparc +unix/bsd/sun/sunos4 +unix/bsd/sun/sunos/sparc +unix/bsd/sun/sunos +unix/bsd/sun/sparc +unix/bsd/sun +unix/bsd/sunos4.1/sparc +unix/bsd/sunos4.1 +unix/bsd/sunos4/sparc +unix/bsd/sunos4 +unix/bsd/sunos/sparc +unix/bsd/sunos +unix/bsd/sparc +unix/bsd +unix/sun/sunos4.1/sparc +unix/sun/sunos4.1 +unix/sun/sunos4/sparc +unix/sun/sunos4 +unix/sun/sunos/sparc +unix/sun/sunos +unix/sun/sparc +unix/sun +unix/sunos4.1/sparc +unix/sunos4.1 +unix/sunos4/sparc +unix/sunos4 +unix/sunos/sparc +unix/sunos +unix/sparc +unix +sun/sunos4.1/sparc +sun/sunos4.1 +sun/sunos4/sparc +sun/sunos4 +sun/sunos/sparc +sun/sunos +sun/sparc +sun +sunos4.1/sparc +sunos4.1 +sunos4/sparc +sunos4 +sunos/sparc +sunos +sparc +@end smallexample + +Different machine architectures are conventionally subdirectories at the +top level of the @file{sysdeps} directory tree. For example, +@w{@file{sysdeps/sparc}} and @w{@file{sysdeps/m68k}}. These contain +files specific to those machine architectures, but not specific to any +particular operating system. There might be subdirectories for +specializations of those architectures, such as +@w{@file{sysdeps/m68k/68020}}. Code which is specific to the +floating-point coprocessor used with a particular machine should go in +@w{@file{sysdeps/@var{machine}/fpu}}. + +There are a few directories at the top level of the @file{sysdeps} +hierarchy that are not for particular machine architectures. + +@table @file +@item generic +@itemx stub +As described above (@pxref{Porting}), these are the two subdirectories +that every configuration implicitly uses after all others. 
+ +@item ieee754 +This directory is for code using the IEEE 754 floating-point format, +where the C type @code{float} is IEEE 754 single-precision format, and +@code{double} is IEEE 754 double-precision format. Usually this +directory is referred to in the @file{Implies} file in a machine +architecture-specific directory, such as @file{m68k/Implies}. + +@item posix +This directory contains implementations of things in the library in +terms of @sc{POSIX.1} functions. This includes some of the @sc{POSIX.1} +functions themselves. Of course, @sc{POSIX.1} cannot be completely +implemented in terms of itself, so a configuration using just +@file{posix} cannot be complete. + +@item unix +This is the directory for Unix-like things. @xref{Porting to Unix}. +@file{unix} implies @file{posix}. There are some special-purpose +subdirectories of @file{unix}: + +@table @file +@item unix/common +This directory is for things common to both BSD and System V release 4. +Both @file{unix/bsd} and @file{unix/sysv/sysv4} imply @file{unix/common}. + +@item unix/inet +This directory is for @code{socket} and related functions on Unix systems. +The @file{inet} top-level subdirectory is enabled by @file{unix/inet/Subdirs}. +@file{unix/common} implies @file{unix/inet}. +@end table + +@item mach +This is the directory for things based on the Mach microkernel from CMU +(including the GNU operating system). Other basic operating systems +(VMS, for example) would have their own directories at the top level of +the @file{sysdeps} hierarchy, parallel to @file{unix} and @file{mach}. +@end table + +@node Porting to Unix +@appendixsubsec Porting the GNU C Library to Unix Systems + +Most Unix systems are fundamentally very similar. There are variations +between different machines, and variations in what facilities are +provided by the kernel. But the interface to the operating system +facilities is, for the most part, pretty uniform and simple. 
+ +The code for Unix systems is in the directory @file{unix}, at the top +level of the @file{sysdeps} hierarchy. This directory contains +subdirectories (and subdirectory trees) for various Unix variants. + +The functions which are system calls in most Unix systems are +implemented in assembly code in files in @file{sysdeps/unix}. These +files are named with a suffix of @samp{.S}; for example, +@file{__open.S}. Files ending in @samp{.S} are run through the C +preprocessor before being fed to the assembler. + +These files all use a set of macros that should be defined in +@file{sysdep.h}. The @file{sysdep.h} file in @file{sysdeps/unix} +partially defines them; a @file{sysdep.h} file in another directory must +finish defining them for the particular machine and operating system +variant. See @file{sysdeps/unix/sysdep.h} and the machine-specific +@file{sysdep.h} implementations to see what these macros are and what +they should do.@refill + +The system-specific makefile for the @file{unix} directory (that is, the +file @file{sysdeps/unix/Makefile}) gives rules to generate several files +from the Unix system you are building the library on (which is assumed +to be the target system you are building the library @emph{for}). All +the generated files are put in the directory where the object files are +kept; they should not affect the source tree itself. The files +generated are @file{ioctls.h}, @file{errnos.h}, @file{sys/param.h}, and +@file{errlist.c} (for the @file{stdio} section of the library). + +@ignore +@c This section might be a good idea if it is finished, +@c but there's no point including it as it stands. --rms +@c @appendixsec Compatibility with Traditional C + +@c ??? This section is really short now. Want to keep it? --roland + +Although the GNU C library implements the ANSI C library facilities, you +@emph{can} use the GNU C library with traditional, ``pre-ANSI'' C +compilers. 
However, you need to be careful because the content and +organization of the GNU C library header files differs from that of +traditional C implementations. This means you may need to make changes +to your program in order to get it to compile. +@end ignore + +@node Contributors +@appendixsec Contributors to the GNU C Library + +The GNU C library was written almost entirely by Roland McGrath, who now +maintains it. Some parts of the library were contributed or worked on +by other people. + +@itemize @bullet +@item +The @code{getopt} function and related code were written by +Richard Stallman, @w{David J. MacKenzie}, and @w{Roland McGrath}. + +@item +Most of the math functions are taken from 4.4 BSD; they have been +modified only slightly to work with the GNU C library. The +Internet-related code (most of the @file{inet} subdirectory) and several +other miscellaneous functions and header files have been included with +little or no modification. + +All code incorporated from 4.4 BSD is under the following copyright: + +@quotation +@display +Copyright @copyright{} 1991 Regents of the University of California. +All rights reserved. +@end display + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions +are met: + +@enumerate +@item +Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. +@item +Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in the +documentation and/or other materials provided with the distribution. +@item +All advertising materials mentioning features or use of this software +must display the following acknowledgement: +@quotation +This product includes software developed by the University of +California, Berkeley and its contributors. 
+@end quotation +@item +Neither the name of the University nor the names of its contributors +may be used to endorse or promote products derived from this software +without specific prior written permission. +@end enumerate + +@sc{this software is provided by the regents and contributors ``as is'' and +any express or implied warranties, including, but not limited to, the +implied warranties of merchantability and fitness for a particular purpose +are disclaimed. in no event shall the regents or contributors be liable +for any direct, indirect, incidental, special, exemplary, or consequential +damages (including, but not limited to, procurement of substitute goods +or services; loss of use, data, or profits; or business interruption) +however caused and on any theory of liability, whether in contract, strict +liability, or tort (including negligence or otherwise) arising in any way +out of the use of this software, even if advised of the possibility of +such damage.} +@end quotation + +@item +The random number generation functions @code{random}, @code{srandom}, +@code{setstate} and @code{initstate}, which are also the basis for the +@code{rand} and @code{srand} functions, were written by Earl T. Cohen +for the University of California at Berkeley and are copyrighted by the +Regents of the University of California. They have undergone minor +changes to fit into the GNU C library and to fit the ANSI C standard, +but the functional code is Berkeley's.@refill + +@item +The merge sort function @code{qsort} was written by Michael J. Haertel. + +@item +The quick sort function used as a fallback by @code{qsort} was written +by Douglas C. Schmidt. + +@item +The memory allocation functions @code{malloc}, @code{realloc} and +@code{free} and related code were written by Michael J. Haertel. + +@comment tege's name has an umlaut. 
+@tex +\xdef\SETtege{Torbj\"orn Granlund} +@end tex +@ifinfo +@set tege Torbjorn Granlund +@end ifinfo +@item +Fast implementations of many of the string functions (@code{memcpy}, +@code{strlen}, etc.) were written by @value{tege}. + +@item +Some of the support code for Mach is taken from Mach 3.0 by CMU, +and is under the following copyright terms: + +@quotation +@display +Mach Operating System +Copyright @copyright{} 1991,1990,1989 Carnegie Mellon University +All Rights Reserved. +@end display + +Permission to use, copy, modify and distribute this software and its +documentation is hereby granted, provided that both the copyright +notice and this permission notice appear in all copies of the +software, derivative works or modified versions, and any portions +thereof, and that both notices appear in supporting documentation. + +@sc{carnegie mellon allows free use of this software in its ``as is'' +condition. carnegie mellon disclaims any liability of any kind for +any damages whatsoever resulting from the use of this software.} + +Carnegie Mellon requests users of this software to return to + +@display + Software Distribution Coordinator + School of Computer Science + Carnegie Mellon University + Pittsburgh PA 15213-3890 +@end display + +@noindent +or @samp{Software.Distribution@@CS.CMU.EDU} any improvements or +extensions that they make and grant Carnegie Mellon the rights to +redistribute these changes. +@end quotation + +@item +The @file{tar.h} header file was written by David J. MacKenzie. + +@item +The port to the MIPS DECStation running Ultrix 4 +(@code{mips-dec-ultrix4}) +was contributed by Brendan Kehoe and Ian Lance Taylor. + +@item +The DES encryption function @code{crypt} and related functions were +contributed by Michael Glad. + +@item +The @code{ftw} function was contributed by Ian Lance Taylor. + +@item +The code to support SunOS shared libraries was contributed by Tom Quinn. + +@item +The @code{mktime} function was contributed by Noel Cragg. 
+ +@item +The port to the Sequent Symmetry running Dynix version 3 +(@code{i386-sequent-bsd}) was contributed by Jason Merrill. + +@item +The timezone support code is derived from the public-domain timezone +package by Arthur David Olson. + +@item +The Internet resolver code is taken directly from BIND 4.9.1, which is +under both the Berkeley copyright above and also: + +@quotation +Portions Copyright @copyright{} 1993 by Digital Equipment Corporation. + +Permission to use, copy, modify, and distribute this software for any +purpose with or without fee is hereby granted, provided that the above +copyright notice and this permission notice appear in all copies, and +that the name of Digital Equipment Corporation not be used in +advertising or publicity pertaining to distribution of the document or +software without specific, written prior permission. + +@sc{the software is provided ``as is'' and digital equipment corp. +disclaims all warranties with regard to this software, including all +implied warranties of merchantability and fitness. in no event shall +digital equipment corporation be liable for any special, direct, +indirect, or consequential damages or any damages whatsoever resulting +from loss of use, data or profits, whether in an action of contract, +negligence or other tortious action, arising out of or in connection +with the use or performance of this software.} +@end quotation + +@item +The port to the DEC Alpha running OSF/1 (@code{alpha-dec-osf1}) was +contributed by Brendan Kehoe, using some code written by Roland McGrath. + +@item +The floating-point printing function used by @code{printf} and friends +was written by Roland McGrath and @value{tege}. The multi-precision +integer functions used in that function are taken from GNU MP, which was +contributed by @value{tege}. 
+ +@item +The code to support Sun RPC is taken verbatim from Sun's +@w{@sc{rpcsrc-4.0}} distribution, and is covered by this copyright: + +@quotation +@display +Copyright @copyright{} 1984, Sun Microsystems, Inc. +@end display + +Sun RPC is a product of Sun Microsystems, Inc. and is provided for +unrestricted use provided that this legend is included on all tape media +and as a part of the software program in whole or part. Users may copy +or modify Sun RPC without charge, but are not authorized to license or +distribute it to anyone else except as part of a product or program +developed by the user. + +@sc{sun rpc is provided as is with no warranties of any kind including the +warranties of design, merchantibility and fitness for a particular +purpose, or arising from a course of dealing, usage or trade practice.} + +Sun RPC is provided with no support and without any obligation on the +part of Sun Microsystems, Inc. to assist in its use, correction, +modification or enhancement. + +@sc{sun microsystems, inc. shall have no liability with respect to the +infringement of copyrights, trade secrets or any patents by sun rpc +or any part thereof.} + +In no event will Sun Microsystems, Inc. be liable for any lost revenue +or profits or other special, indirect and consequential damages, even if +Sun has been advised of the possibility of such damages. + +@display +Sun Microsystems, Inc. +2550 Garcia Avenue +Mountain View, California 94043 +@end display +@end quotation + +@item +The port to SGI machines running Irix 4 (@code{mips-sgi-irix4}) was +contributed by Tom Quinn. + +@item +The port of the Mach and Hurd code to the MIPS architecture +(@code{mips-@var{anything}-gnu}) was contributed by Kazumoto Kojima.
+@end itemize + +@c @bye diff --git a/manual/math.texi b/manual/math.texi new file mode 100644 index 0000000000..a97d76c2a1 --- /dev/null +++ b/manual/math.texi @@ -0,0 +1,505 @@ +@node Mathematics, Arithmetic, Low-Level Terminal Interface, Top +@chapter Mathematics + +This chapter contains information about functions for performing +mathematical computations, such as trigonometric functions. Most of +these functions have prototypes declared in the header file +@file{math.h}. +@pindex math.h + +All of the functions that operate on floating-point numbers accept +arguments and return results of type @code{double}. In the future, +there may be additional functions that operate on @code{float} and +@code{long double} values. For example, @code{cosf} and @code{cosl} +would be versions of the @code{cos} function that operate on +@code{float} and @code{long double} arguments, respectively. In the +meantime, you should avoid using these names yourself. @xref{Reserved +Names}. + +@menu +* Domain and Range Errors:: Detecting overflow conditions and the like. +* Trig Functions:: Sine, cosine, and tangent. +* Inverse Trig Functions:: Arc sine, arc cosine, and arc tangent. +* Exponents and Logarithms:: Also includes square root. +* Hyperbolic Functions:: Hyperbolic sine and friends. +* Pseudo-Random Numbers:: Functions for generating pseudo-random + numbers. +@end menu + +@node Domain and Range Errors +@section Domain and Range Errors + +@cindex domain error +Many of the functions listed in this chapter are defined mathematically +over a domain that is only a subset of real numbers. For example, the +@code{acos} function is defined over the domain between @code{-1} and +@code{1}. If you pass an argument to one of these functions that is +outside the domain over which it is defined, the function sets +@code{errno} to @code{EDOM} to indicate a @dfn{domain error}. On +machines that support IEEE floating point, functions reporting error +@code{EDOM} also return a NaN. 
+ +Some of these functions are defined mathematically to result in a +complex value over parts of their domains. The most familiar example of +this is taking the square root of a negative number. The functions in +this chapter take only real arguments and return only real values; +therefore, if the value ought to be nonreal, this is treated as a domain +error. + +@cindex range error +A related problem is that the mathematical result of a function may not +be representable as a floating point number. If the magnitude of the +correct result is too large to be represented, the function sets +@code{errno} to @code{ERANGE} to indicate a @dfn{range error}, and +returns a particular very large value (named by the macro +@code{HUGE_VAL}) or its negation (@w{@code{- HUGE_VAL}}). + +If the magnitude of the result is too small, a value of zero is returned +instead. In this case, @code{errno} might or might not be +set to @code{ERANGE}. + +The only completely reliable way to check for domain and range errors is +to set @code{errno} to @code{0} before you call the mathematical function +and test @code{errno} afterward. As a consequence of this use of +@code{errno}, use of the mathematical functions is not reentrant if you +check for errors. + +@c !!! this isn't always true at the moment.... +None of the mathematical functions ever generates signals as a result of +domain or range errors. In particular, this means that you won't see +@code{SIGFPE} signals generated within these functions. (@xref{Signal +Handling}, for more information about signals.) + +@comment math.h +@comment ANSI +@deftypevr Macro double HUGE_VAL +An expression representing a particular very large number. On machines +that use IEEE floating point format, the value is ``infinity''. On +other machines, it's typically the largest positive number that can be +represented. + +The value of this macro is used as the return value from various +mathematical functions in overflow situations.
+@end deftypevr + +For more information about floating-point representations and limits, +see @ref{Floating Point Parameters}. In particular, the macro +@code{DBL_MAX} might be more appropriate than @code{HUGE_VAL} for many +uses other than testing for an error in a mathematical function. + +@node Trig Functions +@section Trigonometric Functions +@cindex trigonometric functions + +These are the familiar @code{sin}, @code{cos}, and @code{tan} functions. +The arguments to all of these functions are in units of radians; recall +that pi radians equals 180 degrees. + +@cindex pi (trigonometric constant) +The math library doesn't define a symbolic constant for pi, but you can +define your own if you need one: + +@smallexample +#define PI 3.14159265358979323846264338327 +@end smallexample + +@noindent +You can also compute the value of pi with the expression @code{acos +(-1.0)}. + + +@comment math.h +@comment ANSI +@deftypefun double sin (double @var{x}) +This function returns the sine of @var{x}, where @var{x} is given in +radians. The return value is in the range @code{-1} to @code{1}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double cos (double @var{x}) +This function returns the cosine of @var{x}, where @var{x} is given in +radians. The return value is in the range @code{-1} to @code{1}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double tan (double @var{x}) +This function returns the tangent of @var{x}, where @var{x} is given in +radians. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item ERANGE +Mathematically, the tangent function has singularities at odd multiples +of pi/2. If the argument @var{x} is too close to one of these +singularities, @code{tan} sets @code{errno} to @code{ERANGE} and returns +either positive or negative @code{HUGE_VAL}. 
+@end table +@end deftypefun + + +@node Inverse Trig Functions +@section Inverse Trigonometric Functions +@cindex inverse trigonometric functions + +These are the usual arc sine, arc cosine and arc tangent functions, +which are the inverses of the sine, cosine and tangent functions, +respectively. + +@comment math.h +@comment ANSI +@deftypefun double asin (double @var{x}) +This function computes the arc sine of @var{x}---that is, the value whose +sine is @var{x}. The value is in units of radians. Mathematically, +there are infinitely many such values; the one actually returned is the +one between @code{-pi/2} and @code{pi/2} (inclusive). + +@code{asin} fails, and sets @code{errno} to @code{EDOM}, if @var{x} is +out of range. The arc sine function is defined mathematically only +over the domain @code{-1} to @code{1}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double acos (double @var{x}) +This function computes the arc cosine of @var{x}---that is, the value +whose cosine is @var{x}. The value is in units of radians. +Mathematically, there are infinitely many such values; the one actually +returned is the one between @code{0} and @code{pi} (inclusive). + +@code{acos} fails, and sets @code{errno} to @code{EDOM}, if @var{x} is +out of range. The arc cosine function is defined mathematically only +over the domain @code{-1} to @code{1}. +@end deftypefun + + +@comment math.h +@comment ANSI +@deftypefun double atan (double @var{x}) +This function computes the arc tangent of @var{x}---that is, the value +whose tangent is @var{x}. The value is in units of radians. +Mathematically, there are infinitely many such values; the one actually +returned is the one between @code{-pi/2} and @code{pi/2} +(inclusive). +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double atan2 (double @var{y}, double @var{x}) +This is the two-argument arc tangent function.
It is similar to computing +the arc tangent of @var{y}/@var{x}, except that the signs of both arguments +are used to determine the quadrant of the result, and @var{x} is +permitted to be zero. The return value is given in radians and is in +the range @code{-pi} to @code{pi}, inclusive. + +If @var{x} and @var{y} are coordinates of a point in the plane, +@code{atan2} returns the signed angle between the line from the origin +to that point and the x-axis. Thus, @code{atan2} is useful for +converting Cartesian coordinates to polar coordinates. (To compute the +radial coordinate, use @code{hypot}; see @ref{Exponents and +Logarithms}.) + +The function @code{atan2} sets @code{errno} to @code{EDOM} if both +@var{x} and @var{y} are zero; the return value is not defined in this +case. +@end deftypefun + + +@node Exponents and Logarithms +@section Exponentiation and Logarithms +@cindex exponentiation functions +@cindex power functions +@cindex logarithm functions + +@comment math.h +@comment ANSI +@deftypefun double exp (double @var{x}) +The @code{exp} function returns the value of e (the base of natural +logarithms) raised to the power @var{x}. + +The function fails, and sets @code{errno} to @code{ERANGE}, if the +magnitude of the result is too large to be representable. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double log (double @var{x}) +This function returns the natural logarithm of @var{x}. @code{exp (log +(@var{x}))} equals @var{x}, exactly in mathematics and approximately in +C. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EDOM +The argument @var{x} is negative. The log function is defined +mathematically to return a real result only on positive arguments. + +@item ERANGE +The argument is zero. The log of zero is not defined. +@end table +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double log10 (double @var{x}) +This function returns the base-10 logarithm of @var{x}.
Except for the +different base, it is similar to the @code{log} function. In fact, +@code{log10 (@var{x})} equals @code{log (@var{x}) / log (10)}. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double pow (double @var{base}, double @var{power}) +This is a general exponentiation function, returning @var{base} raised +to @var{power}. + +@need 250 +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EDOM +The argument @var{base} is negative and @var{power} is not an integral +value. Mathematically, the result would be a complex number in this case. + +@item ERANGE +An underflow or overflow condition was detected in the result. +@end table +@end deftypefun + +@cindex square root function +@comment math.h +@comment ANSI +@deftypefun double sqrt (double @var{x}) +This function returns the nonnegative square root of @var{x}. + +The @code{sqrt} function fails, and sets @code{errno} to @code{EDOM}, if +@var{x} is negative. Mathematically, the square root would be a complex +number. +@end deftypefun + +@cindex cube root function +@comment math.h +@comment BSD +@deftypefun double cbrt (double @var{x}) +This function returns the cube root of @var{x}. This function cannot +fail; every representable real value has a representable real cube root. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double hypot (double @var{x}, double @var{y}) +The @code{hypot} function returns @code{sqrt (@var{x}*@var{x} + +@var{y}*@var{y})}. (This is the length of the hypotenuse of a right +triangle with sides of length @var{x} and @var{y}, or the distance +of the point (@var{x}, @var{y}) from the origin.) See also the function +@code{cabs} in @ref{Absolute Value}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double expm1 (double @var{x}) +This function returns a value equivalent to @code{exp (@var{x}) - 1}. 
+It is computed in a way that is accurate even if the value of @var{x} is +near zero---a case where @code{exp (@var{x}) - 1} would be inaccurate due +to subtraction of two numbers that are nearly equal. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double log1p (double @var{x}) +This function returns a value equivalent to @w{@code{log (1 + @var{x})}}. +It is computed in a way that is accurate even if the value of @var{x} is +near zero. +@end deftypefun + +@node Hyperbolic Functions +@section Hyperbolic Functions +@cindex hyperbolic functions + +The functions in this section are related to the exponential functions; +see @ref{Exponents and Logarithms}. + +@comment math.h +@comment ANSI +@deftypefun double sinh (double @var{x}) +The @code{sinh} function returns the hyperbolic sine of @var{x}, defined +mathematically as @w{@code{(exp (@var{x}) - exp (-@var{x})) / 2}}. The +function fails, and sets @code{errno} to @code{ERANGE}, if the value of +@var{x} is too large; that is, if overflow occurs. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double cosh (double @var{x}) +The @code{cosh} function returns the hyperbolic cosine of @var{x}, +defined mathematically as @w{@code{(exp (@var{x}) + exp (-@var{x})) / 2}}. +The function fails, and sets @code{errno} to @code{ERANGE}, if the value +of @var{x} is too large; that is, if overflow occurs. +@end deftypefun + +@comment math.h +@comment ANSI +@deftypefun double tanh (double @var{x}) +This function returns the hyperbolic tangent of @var{x}, whose +mathematical definition is @w{@code{sinh (@var{x}) / cosh (@var{x})}}. +@end deftypefun + +@cindex inverse hyperbolic functions + +@comment math.h +@comment BSD +@deftypefun double asinh (double @var{x}) +This function returns the inverse hyperbolic sine of @var{x}---the +value whose hyperbolic sine is @var{x}.
+@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double acosh (double @var{x}) +This function returns the inverse hyperbolic cosine of @var{x}---the +value whose hyperbolic cosine is @var{x}. If @var{x} is less than +@code{1}, @code{acosh} returns @code{HUGE_VAL}. +@end deftypefun + +@comment math.h +@comment BSD +@deftypefun double atanh (double @var{x}) +This function returns the inverse hyperbolic tangent of @var{x}---the +value whose hyperbolic tangent is @var{x}. If the absolute value of +@var{x} is greater than or equal to @code{1}, @code{atanh} returns +@code{HUGE_VAL}. +@end deftypefun + +@node Pseudo-Random Numbers +@section Pseudo-Random Numbers +@cindex random numbers +@cindex pseudo-random numbers +@cindex seed (for random numbers) + +This section describes the GNU facilities for generating a series of +pseudo-random numbers. The numbers generated are not truly random; +typically, they form a sequence that repeats periodically, with a +period so large that you can ignore it for ordinary purposes. The +random number generator works by remembering at all times a @dfn{seed} +value which it uses to compute the next random number and also to +compute a new seed. + +Although the generated numbers look unpredictable within one run of a +program, the sequence of numbers is @emph{exactly the same} from one run +to the next. This is because the initial seed is always the same. This +is convenient when you are debugging a program, but it is unhelpful if +you want the program to behave unpredictably. If you want a different +pseudo-random series each time your program runs, specify a seed based on the +current time. + +You can get repeatable sequences of numbers on a particular machine type +by specifying the same initial seed value for the random number +generator. There is no standard meaning for a particular seed value; +the same seed, used in different C libraries or on different CPU types, +will give you different random numbers.
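The repeatability described above can be demonstrated with the standard @code{srand} and @code{rand} functions. In this sketch the helper name @code{fill_series} is hypothetical; calling it twice with the same seed should fill both arrays with identical values on a given system.

```c
#include <stdlib.h>

/* Hypothetical helper: seed the generator with SEED and store the
   first COUNT values of the resulting series in OUT.  */
void
fill_series (unsigned int seed, int *out, int count)
{
  int i;

  srand (seed);
  for (i = 0; i < count; i++)
    out[i] = rand ();
}
```

Passing a seed that changes from run to run, such as one derived from the current time, would give a different series each time, though the numbers are still pseudo-random.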
+ +The GNU library supports the standard ANSI C random number functions +plus another set derived from BSD. We recommend you use the standard +ones, @code{rand} and @code{srand}. + +@menu +* ANSI Random:: @code{rand} and friends. +* BSD Random:: @code{random} and friends. +@end menu + +@node ANSI Random +@subsection ANSI C Random Number Functions + +This section describes the random number functions that are part of +the ANSI C standard. + +To use these facilities, you should include the header file +@file{stdlib.h} in your program. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int RAND_MAX +The value of this macro is an integer constant expression that +represents the maximum possible value returned by the @code{rand} +function. In the GNU library, it is @code{2147483647}, which is the +largest signed integer representable in 32 bits. In other libraries, it +may be as low as @code{32767}. +@end deftypevr + +@comment stdlib.h +@comment ANSI +@deftypefun int rand () +The @code{rand} function returns the next pseudo-random number in the +series. The value is in the range from @code{0} to @code{RAND_MAX}. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun void srand (unsigned int @var{seed}) +This function establishes @var{seed} as the seed for a new series of +pseudo-random numbers. If you call @code{rand} before a seed has been +established with @code{srand}, it uses the value @code{1} as a default +seed. + +To produce a different pseudo-random series each time your program runs, do @code{srand +(time (0))}. +@end deftypefun + +@node BSD Random +@subsection BSD Random Number Functions + +This section describes a set of random number generation functions that +are derived from BSD. There is no advantage to using these functions +with the GNU C library; we support them for BSD compatibility only. + +The prototypes for these functions are in @file{stdlib.h}.
+@pindex stdlib.h + +@comment stdlib.h +@comment BSD +@deftypefun {long int} random () +This function returns the next pseudo-random number in the sequence. +The range of values returned is from @code{0} to @code{RAND_MAX}. +@end deftypefun + +@comment stdlib.h +@comment BSD +@deftypefun void srandom (unsigned int @var{seed}) +The @code{srandom} function sets the seed for the current random number +state based on the integer @var{seed}. If you supply a @var{seed} value +of @code{1}, this will cause @code{random} to reproduce the default set +of random numbers. + +To produce a different pseudo-random series each time your program runs, do +@code{srandom (time (0))}. +@end deftypefun + +@comment stdlib.h +@comment BSD +@deftypefun {void *} initstate (unsigned int @var{seed}, void *@var{state}, size_t @var{size}) +The @code{initstate} function is used to initialize the random number +generator state. The argument @var{state} is an array of @var{size} +bytes, used to hold the state information. The size must be at least 8 +bytes, and optimal sizes are 8, 16, 32, 64, 128, and 256. The bigger +the @var{state} array, the better. + +The return value is the previous value of the state information array. +You can use this value later as an argument to @code{setstate} to +restore that state. +@end deftypefun + +@comment stdlib.h +@comment BSD +@deftypefun {void *} setstate (void *@var{state}) +The @code{setstate} function restores the random number state +information @var{state}. The argument must have been the result of +a previous call to @code{initstate} or @code{setstate}. + +The return value is the previous value of the state information array. +You can use this value later as an argument to @code{setstate} to +restore that state.
+@end deftypefun diff --git a/manual/mbyte.texi b/manual/mbyte.texi new file mode 100644 index 0000000000..c058cbfb69 --- /dev/null +++ b/manual/mbyte.texi @@ -0,0 +1,695 @@ +@node Extended Characters, Locales, String and Array Utilities, Top +@chapter Extended Characters + +A number of languages use character sets that are larger than the range +of values of type @code{char}. Japanese and Chinese are probably the +most familiar examples. + +The GNU C library includes support for two mechanisms for dealing with +extended character sets: multibyte characters and wide characters. This +chapter describes how to use these mechanisms, and the functions for +converting between them. +@cindex extended character sets + +The behavior of the functions in this chapter is affected by the current +locale for character classification---the @code{LC_CTYPE} category; see +@ref{Locale Categories}. This choice of locale selects which multibyte +code is used, and also controls the meanings and characteristics of wide +character codes. + +@menu +* Extended Char Intro:: Multibyte codes versus wide characters. +* Locales and Extended Chars:: The locale selects the character codes. +* Multibyte Char Intro:: How multibyte codes are represented. +* Wide Char Intro:: How wide characters are represented. +* Wide String Conversion:: Converting wide strings to multibyte code + and vice versa. +* Length of Char:: how many bytes make up one multibyte char. +* Converting One Char:: Converting a string character by character. +* Example of Conversion:: Example showing why converting + one character at a time may be useful. +* Shift State:: Multibyte codes with "shift characters". 
+@end menu + +@node Extended Char Intro, Locales and Extended Chars, , Extended Characters +@section Introduction to Extended Characters + +You can represent extended characters in either of two ways: + +@itemize @bullet +@item +As @dfn{multibyte characters} which can be embedded in an ordinary +string, an array of @code{char} objects. Their advantage is that many +programs and operating systems can handle occasional multibyte +characters scattered among ordinary ASCII characters, without any +change. + +@item +@cindex wide characters +As @dfn{wide characters}, which are like ordinary characters except that +they occupy more bits. The wide character data type, @code{wchar_t}, +has a range large enough to hold extended character codes as well as +old-fashioned ASCII codes. + +An advantage of wide characters is that each character is a single data +object, just like ordinary ASCII characters. There are a few +disadvantages: + +@itemize @bullet +@item +Each existing program must be modified and recompiled to make it use +wide characters. + +@item +Files of wide characters cannot be read by programs that expect ordinary +characters. +@end itemize +@end itemize + +Typically, you use the multibyte character representation as part of the +external program interface, such as reading or writing text to files. +However, it's usually easier to perform internal manipulations on +strings containing extended characters on arrays of @code{wchar_t} +objects, since the uniform representation makes most editing operations +easier. If you do use multibyte characters for files and wide +characters for internal operations, you need to convert between them +when you read and write data. + +If your system supports extended characters, then it supports them both +as multibyte characters and as wide characters. The library includes +functions you can use to convert between the two representations. +These functions are described in this chapter. 
+ +@node Locales and Extended Chars, Multibyte Char Intro, Extended Char Intro, Extended Characters +@section Locales and Extended Characters + +A computer system can support more than one multibyte character code, +and more than one wide character code. The user controls the choice of +codes through the current locale for character classification +(@pxref{Locales}). Each locale specifies a particular multibyte +character code and a particular wide character code. The choice of locale +influences the behavior of the conversion functions in the library. + +Some locales support neither wide characters nor nontrivial multibyte +characters. In these locales, the library conversion functions still +work, even though what they do is basically trivial. + +If you select a new locale for character classification, the internal +shift state maintained by these functions can become confused, so it's +not a good idea to change the locale while you are in the middle of +processing a string. + +@node Multibyte Char Intro, Wide Char Intro, Locales and Extended Chars, Extended Characters +@section Multibyte Characters +@cindex multibyte characters + +In the ordinary ASCII code, a sequence of characters is a sequence of +bytes, and each character is one byte. This is very simple, but +allows for only 256 distinct characters. + +In a @dfn{multibyte character code}, a sequence of characters is a +sequence of bytes, but each character may occupy one or more consecutive +bytes of the sequence. + +@cindex basic byte sequence +There are many different ways of designing a multibyte character code; +different systems use different codes. To specify a particular code +means designating the @dfn{basic} byte sequences---those which represent +a single character---and what characters they stand for. A code that a +computer can actually use must have a finite number of these basic +sequences, and typically none of them is more than a few characters +long. 
+ +These sequences need not all have the same length. In fact, many of +them are just one byte long. Because the basic ASCII characters in the +range from @code{0} to @code{0177} are so important, they stand for +themselves in all multibyte character codes. That is to say, a byte +whose value is @code{0} through @code{0177} is always a character in +itself. The characters which are more than one byte must always start +with a byte in the range from @code{0200} through @code{0377}. + +The byte value @code{0} can be used to terminate a string, just as it is +often used in a string of ASCII characters. + +Specifying the basic byte sequences that represent single characters +automatically gives meanings to many longer byte sequences, as more than +one character. For example, if the two byte sequence @code{0205 049} +stands for the Greek letter alpha, then @code{0205 049 065} must stand +for an alpha followed by an @samp{A} (ASCII code 065), and @code{0205 049 +0205 049} must stand for two alphas in a row. + +If any byte sequence can have more than one meaning as a sequence of +characters, then the multibyte code is ambiguous---and no good. The +codes that systems actually use are all unambiguous. + +In most codes, there are certain sequences of bytes that have no meaning +as a character or characters. These are called @dfn{invalid}. + +The simplest possible multibyte code is a trivial one: + +@quotation +The basic sequences consist of single bytes. +@end quotation + +This particular code is equivalent to not using multibyte characters at +all. It has no invalid sequences. But it can handle only 256 different +characters. + +Here is another possible code which can handle 9376 different +characters: + +@quotation +The basic sequences consist of + +@itemize @bullet +@item +single bytes with values in the range @code{0} through @code{0237}. + +@item +two-byte sequences, in which both of the bytes have values in the range +from @code{0240} through @code{0377}. 
+@end itemize +@end quotation + +@noindent +This code or a similar one is used on some systems to represent Japanese +characters. The invalid sequences are those which consist of an odd +number of consecutive bytes in the range from @code{0240} through +@code{0377}. + +Here is another multibyte code which can handle more distinct extended +characters---in fact, almost thirty million: + +@quotation +The basic sequences consist of + +@itemize @bullet +@item +single bytes with values in the range @code{0} through @code{0177}. + +@item +sequences of up to four bytes in which the first byte is in the range +from @code{0200} through @code{0237}, and the remaining bytes are in the +range from @code{0240} through @code{0377}. +@end itemize +@end quotation + +@noindent +In this code, any sequence that starts with a byte in the range +from @code{0240} through @code{0377} is invalid. + +And here is another variant which has the advantage that removing the +last byte or bytes from a valid character can never produce another +valid character. (This property is convenient when you want to search +strings for particular characters.) + +@quotation +The basic sequences consist of + +@itemize @bullet +@item +single bytes with values in the range @code{0} through @code{0177}. + +@item +two-byte sequences in which the first byte is in the range from +@code{0200} through @code{0207}, and the second byte is in the range +from @code{0240} through @code{0377}. + +@item +three-byte sequences in which the first byte is in the range from +@code{0210} through @code{0217}, and the other bytes are in the range +from @code{0240} through @code{0377}. + +@item +four-byte sequences in which the first byte is in the range from +@code{0220} through @code{0227}, and the other bytes are in the range +from @code{0240} through @code{0377}. 
+@end itemize +@end quotation + +@noindent +The list of invalid sequences for this code is long and not worth +stating in full; examples of invalid sequences include @code{0240} and +@code{0220 0300 065}. + +The number of @emph{possible} multibyte codes is astronomical. But a +given computer system will support at most a few different codes. (One +of these codes may allow for thousands of different characters.) +Another computer system may support a completely different code. The +library facilities described in this chapter are helpful because they +package up the knowledge of the details of a particular computer +system's multibyte code, so your programs need not know them. + +You can use special standard macros to find out the maximum possible +number of bytes in a character in the currently selected multibyte +code with @code{MB_CUR_MAX}, and the maximum for @emph{any} multibyte +code supported on your computer with @code{MB_LEN_MAX}. + +@comment limits.h +@comment ANSI +@deftypevr Macro int MB_LEN_MAX +This is the maximum length of a multibyte character for any supported +locale. It is defined in @file{limits.h}. +@pindex limits.h +@end deftypevr + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int MB_CUR_MAX +This macro expands into a (possibly non-constant) positive integer +expression that is the maximum number of bytes in a multibyte character +in the current locale. The value is never greater than @code{MB_LEN_MAX}. + +@pindex stdlib.h +@code{MB_CUR_MAX} is defined in @file{stdlib.h}. +@end deftypevr + +Normally, each basic sequence in a particular character code stands for +one character, the same character regardless of context. Some multibyte +character codes have a concept of @dfn{shift state}; certain codes, +called @dfn{shift sequences}, change to a different shift state, and the +meaning of some or all basic sequences varies according to the current +shift state. 
In fact, the set of basic sequences might even be +different depending on the current shift state. @xref{Shift State}, for +more information on handling this sort of code. + +What happens if you try to pass a string containing multibyte characters +to a function that doesn't know about them? Normally, such a function +treats a string as a sequence of bytes, and interprets certain byte +values specially; all other byte values are ``ordinary''. As long as a +multibyte character doesn't contain any of the special byte values, the +function should pass it through as if it were several ordinary +characters. + +For example, let's figure out what happens if you use multibyte +characters in a file name. The functions such as @code{open} and +@code{unlink} that operate on file names treat the name as a sequence of +byte values, with @samp{/} as the only special value. Any other byte +values are copied, or compared, in sequence, and all byte values are +treated alike. Thus, you may think of the file name as a sequence of +bytes or as a string containing multibyte characters; the same behavior +makes sense equally either way, provided no multibyte character contains +a @samp{/}. + +@node Wide Char Intro, Wide String Conversion, Multibyte Char Intro, Extended Characters +@section Wide Character Introduction + +@dfn{Wide characters} are much simpler than multibyte characters. They +are simply characters with more than eight bits, so that they have room +for more than 256 distinct codes. The wide character data type, +@code{wchar_t}, has a range large enough to hold extended character +codes as well as old-fashioned ASCII codes. + +An advantage of wide characters is that each character is a single data +object, just like ordinary ASCII characters. Wide characters also have +some disadvantages: + +@itemize @bullet +@item +A program must be modified and recompiled in order to use wide +characters at all. 
+ +@item +Files of wide characters cannot be read by programs that expect ordinary +characters. +@end itemize + +Wide character values @code{0} through @code{0177} are always identical +in meaning to the ASCII character codes. The wide character value zero +is often used to terminate a string of wide characters, just as a single +byte with value zero often terminates a string of ordinary characters. + +@comment stddef.h +@comment ANSI +@deftp {Data Type} wchar_t +This is the ``wide character'' type, an integer type whose range is +large enough to represent all distinct values in any extended character +set in the supported locales. @xref{Locales}, for more information +about locales. This type is defined in the header file @file{stddef.h}. +@pindex stddef.h +@end deftp + +If your system supports extended characters, then each extended +character has both a wide character code and a corresponding multibyte +basic sequence. + +@cindex code, character +@cindex character code +In this chapter, the term @dfn{code} is used to refer to a single +extended character object to emphasize the distinction from the +@code{char} data type. + +@node Wide String Conversion, Length of Char, Wide Char Intro, Extended Characters +@section Conversion of Extended Strings +@cindex extended strings, converting representations +@cindex converting extended strings + +@pindex stdlib.h +The @code{mbstowcs} function converts a string of multibyte characters +to a wide character array. The @code{wcstombs} function does the +reverse. These functions are declared in the header file +@file{stdlib.h}. + +In most programs, these functions are the only ones you need for +conversion between wide strings and multibyte character strings. But +they have limitations. If your data is not null-terminated or is not +all in core at once, you probably need to use the low-level conversion +functions to convert one character at a time. @xref{Converting One +Char}. 
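For a quick check of how the two string functions cooperate, the sketch below converts a multibyte string to wide characters and back again. The helper name @code{round_trip_ok} is hypothetical, and the buffer sizing rests on two assumptions: a wide string never has more characters than the multibyte string has bytes, and no wide character needs more than @code{MB_CUR_MAX} bytes. In a code with shift states the bytes that come back need not be identical to the input, so treat this as an illustration for simple codes only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert STRING to wide characters and back,
   returning 1 if the result matches the original and 0 otherwise.  */
int
round_trip_ok (const char *string)
{
  size_t max_wide = strlen (string) + 1;
  wchar_t *wide = malloc (max_wide * sizeof (wchar_t));
  char *back = NULL;
  size_t nwide, nbytes, max_bytes;
  int ok = 0;

  if (wide == NULL)
    return 0;

  /* Multibyte to wide; (size_t) -1 signals an invalid sequence.  */
  nwide = mbstowcs (wide, string, max_wide);
  if (nwide != (size_t) -1)
    {
      /* Worst case: every wide character needs MB_CUR_MAX bytes,
         plus one byte for the terminating null character.  */
      max_bytes = nwide * MB_CUR_MAX + 1;
      back = malloc (max_bytes);
      if (back != NULL)
        {
          /* Wide back to multibyte, then compare with the input.  */
          nbytes = wcstombs (back, wide, max_bytes);
          ok = nbytes != (size_t) -1 && strcmp (back, string) == 0;
        }
    }

  free (back);
  free (wide);
  return ok;
}
```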
+ +@comment stdlib.h +@comment ANSI +@deftypefun size_t mbstowcs (wchar_t *@var{wstring}, const char *@var{string}, size_t @var{size}) +The @code{mbstowcs} (``multibyte string to wide character string'') +function converts the null-terminated string of multibyte characters +@var{string} to an array of wide character codes, storing not more than +@var{size} wide characters into the array beginning at @var{wstring}. +The terminating null character counts towards the size, so if @var{size} +is less than the actual number of wide characters resulting from +@var{string}, no terminating null character is stored. + +The conversion of characters from @var{string} begins in the initial +shift state. + +If an invalid multibyte character sequence is found, this function +returns a value of @code{-1}. Otherwise, it returns the number of wide +characters stored in the array @var{wstring}. This number does not +include the terminating null character, which is present if the number +is less than @var{size}. + +Here is an example showing how to convert a string of multibyte +characters, allocating enough space for the result. + +@smallexample +wchar_t * +mbstowcs_alloc (const char *string) +@{ + size_t size = strlen (string) + 1; + wchar_t *buf = xmalloc (size * sizeof (wchar_t)); + + size = mbstowcs (buf, string, size); + if (size == (size_t) -1) + @{ + /* @r{Do not leak the buffer on an invalid sequence.} */ + free (buf); + return NULL; + @} + buf = xrealloc (buf, (size + 1) * sizeof (wchar_t)); + return buf; +@} +@end smallexample + +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun size_t wcstombs (char *@var{string}, const wchar_t *@var{wstring}, size_t @var{size}) +The @code{wcstombs} (``wide character string to multibyte string'') +function converts the null-terminated wide character array @var{wstring} +into a string containing multibyte characters, storing not more than +@var{size} bytes starting at @var{string}, followed by a terminating +null character if there is room. The conversion of characters begins in +the initial shift state.
+ +The terminating null character counts towards the size, so if @var{size} +is less than or equal to the number of bytes needed for @var{wstring}, no +terminating null character is stored. + +If a code that does not correspond to a valid multibyte character is +found, this function returns a value of @code{-1}. Otherwise, the +return value is the number of bytes stored in the array @var{string}. +This number does not include the terminating null character, which is +present if the number is less than @var{size}. +@end deftypefun + +@node Length of Char, Converting One Char, Wide String Conversion, Extended Characters +@section Multibyte Character Length +@cindex multibyte character, length of +@cindex length of multibyte character + +This section describes how to scan a string containing multibyte +characters, one character at a time. The difficulty in doing this +is to know how many bytes each character contains. Your program +can use @code{mblen} to find this out. + +@comment stdlib.h +@comment ANSI +@deftypefun int mblen (const char *@var{string}, size_t @var{size}) +The @code{mblen} function with a non-null @var{string} argument returns +the number of bytes that make up the multibyte character beginning at +@var{string}, never examining more than @var{size} bytes. (The idea is +to supply for @var{size} the number of bytes of data you have in hand.) + +The return value of @code{mblen} distinguishes three possibilities: the +first @var{size} bytes at @var{string} start with a valid multibyte +character, they start with an invalid byte sequence or just part of a +character, or @var{string} points to an empty string (a null character). + +For a valid multibyte character, @code{mblen} returns the number of +bytes in that character (always at least @code{1}, and never more than +@var{size}). For an invalid byte sequence, @code{mblen} returns +@code{-1}. For an empty string, it returns @code{0}.
+ +If the multibyte character code uses shift characters, then @code{mblen} +maintains and updates a shift state as it scans. If you call +@code{mblen} with a null pointer for @var{string}, that initializes the +shift state to its standard initial value. It also returns nonzero if +the multibyte character code in use actually has a shift state. +@xref{Shift State}. + +@pindex stdlib.h +The function @code{mblen} is declared in @file{stdlib.h}. +@end deftypefun + +@node Converting One Char, Example of Conversion, Length of Char, Extended Characters +@section Conversion of Extended Characters One by One +@cindex extended characters, converting +@cindex converting extended characters + +@pindex stdlib.h +You can convert multibyte characters one at a time to wide characters +with the @code{mbtowc} function. The @code{wctomb} function does the +reverse. These functions are declared in @file{stdlib.h}. + +@comment stdlib.h +@comment ANSI +@deftypefun int mbtowc (wchar_t *@var{result}, const char *@var{string}, size_t @var{size}) +The @code{mbtowc} (``multibyte to wide character'') function when called +with non-null @var{string} converts the first multibyte character +beginning at @var{string} to its corresponding wide character code. It +stores the result in @code{*@var{result}}. + +@code{mbtowc} never examines more than @var{size} bytes. (The idea is +to supply for @var{size} the number of bytes of data you have in hand.) + +@code{mbtowc} with non-null @var{string} distinguishes three +possibilities: the first @var{size} bytes at @var{string} start with +a valid multibyte character, they start with an invalid byte sequence or +just part of a character, or @var{string} points to an empty string (a +null character). + +For a valid multibyte character, @code{mbtowc} converts it to a wide +character and stores that in @code{*@var{result}}, and returns the +number of bytes in that character (always at least @code{1}, and never +more than @var{size}).
+ +For an invalid byte sequence, @code{mbtowc} returns @code{-1}. For an +empty string, it returns @code{0}, also storing @code{0} in +@code{*@var{result}}. + +If the multibyte character code uses shift characters, then +@code{mbtowc} maintains and updates a shift state as it scans. If you +call @code{mbtowc} with a null pointer for @var{string}, that +initializes the shift state to its standard initial value. It also +returns nonzero if the multibyte character code in use actually has a +shift state. @xref{Shift State}. +@end deftypefun + +@comment stdlib.h +@comment ANSI +@deftypefun int wctomb (char *@var{string}, wchar_t @var{wchar}) +The @code{wctomb} (``wide character to multibyte'') function converts +the wide character code @var{wchar} to its corresponding multibyte +character sequence, and stores the result in bytes starting at +@var{string}. At most @code{MB_CUR_MAX} characters are stored. + +@code{wctomb} with non-null @var{string} distinguishes three +possibilities for @var{wchar}: a valid wide character code (one that can +be translated to a multibyte character), an invalid code, and @code{0}. + +Given a valid code, @code{wctomb} converts it to a multibyte character, +storing the bytes starting at @var{string}. Then it returns the number +of bytes in that character (always at least @code{1}, and never more +than @code{MB_CUR_MAX}). + +If @var{wchar} is an invalid wide character code, @code{wctomb} returns +@code{-1}. If @var{wchar} is @code{0}, it returns @code{0}, also +storing @code{0} in @code{*@var{string}}. + +If the multibyte character code uses shift characters, then +@code{wctomb} maintains and updates a shift state as it scans. If you +call @code{wctomb} with a null pointer for @var{string}, that +initializes the shift state to its standard initial value. It also +returns nonzero if the multibyte character code in use actually has a +shift state. @xref{Shift State}. 
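As an illustration of @code{wctomb}, here is a minimal sketch that converts a wide string to multibyte form one code at a time. The helper name @code{wide_to_multibyte} and its return convention are hypothetical. The sketch stops at the terminating wide null without passing it to @code{wctomb}, so in a code with shift states it would not emit a final sequence returning to the initial state; that simplification is an assumption made to keep the example short.

```c
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert the null-terminated wide string WS
   into OUT, which has room for OUTSIZE bytes.  Return the number of
   bytes stored, not counting the terminating null character, or -1
   if a code is invalid or the result does not fit.  */
long
wide_to_multibyte (char *out, size_t outsize, const wchar_t *ws)
{
  size_t used = 0;

  /* Start from the initial shift state.  */
  wctomb (NULL, 0);

  for (; *ws != L'\0'; ws++)
    {
      /* MB_LEN_MAX bytes are enough for one character in any locale.  */
      char tmp[MB_LEN_MAX];
      int len = wctomb (tmp, *ws);

      if (len < 0)
        return -1;
      /* Always leave room for the terminating null character.  */
      if (used + (size_t) len >= outsize)
        return -1;
      memcpy (out + used, tmp, (size_t) len);
      used += (size_t) len;
    }

  if (used >= outsize)
    return -1;
  out[used] = '\0';
  return (long) used;
}
```

Converting into a small temporary array first means a character is copied to @code{out} only once it is known to fit, so the output never ends in a partial character.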
+ +Calling this function with a @var{wchar} argument of zero when +@var{string} is not null has the side-effect of reinitializing the +stored shift state @emph{as well as} storing the multibyte character +@code{0} and returning @code{0}. +@end deftypefun + +@node Example of Conversion, Shift State, Converting One Char, Extended Characters +@section Character-by-Character Conversion Example + +Here is an example that reads multibyte character text from descriptor +@code{input} and writes the corresponding wide characters to descriptor +@code{output}. We need to convert characters one by one for this +example because @code{mbstowcs} is unable to continue past a null +character, and cannot cope with an apparently invalid partial character +by reading more input. + +@smallexample +int +file_mbstowcs (int input, int output) +@{ + char buffer[BUFSIZ + MB_LEN_MAX]; + int filled = 0; + int eof = 0; + + while (!eof) + @{ + int nread; + int nwrite; + char *inp = buffer; + wchar_t outbuf[BUFSIZ]; + wchar_t *outp = outbuf; + + /* @r{Fill up the buffer from the input file.} */ + nread = read (input, buffer + filled, BUFSIZ); + if (nread < 0) + @{ + perror ("read"); + return 0; + @} + /* @r{If we reach end of file, make a note to read no more.} */ + if (nread == 0) + eof = 1; + + /* @r{@code{filled} is now the number of bytes in @code{buffer}.} */ + filled += nread; + + /* @r{Convert those bytes to wide characters--as many as we can.} */ + while (1) + @{ + int thislen = mbtowc (outp, inp, filled); + /* Stop converting at invalid character; + this can mean we have read just the first part + of a valid character. 
*/ + if (thislen == -1) + break; + /* @r{Treat null character like any other,} + @r{but also reset shift state.} */ + if (thislen == 0) @{ + thislen = 1; + mbtowc (NULL, NULL, 0); + @} + /* @r{Advance past this character.} */ + inp += thislen; + filled -= thislen; + outp++; + @} + + /* @r{Write the wide characters we just made.} */ + nwrite = write (output, outbuf, + (outp - outbuf) * sizeof (wchar_t)); + if (nwrite < 0) + @{ + perror ("write"); + return 0; + @} + + /* @r{See if we have a @emph{real} invalid character.} */ + if ((eof && filled > 0) || filled >= MB_CUR_MAX) + @{ + error ("invalid multibyte character"); + return 0; + @} + + /* @r{If any characters must be carried forward,} + @r{put them at the beginning of @code{buffer}.} */ + if (filled > 0) + memmove (buffer, inp, filled); + @} + + return 1; +@} +@end smallexample + +@node Shift State, , Example of Conversion, Extended Characters +@section Multibyte Codes Using Shift Sequences + +In some multibyte character codes, the @emph{meaning} of any particular +byte sequence is not fixed; it depends on what other sequences have come +earlier in the same string. Typically there are just a few sequences +that can change the meaning of other sequences; these few are called +@dfn{shift sequences} and we say that they set the @dfn{shift state} for +other sequences that follow. + +To illustrate shift state and shift sequences, suppose we decide that +the sequence @code{0200} (just one byte) enters Japanese mode, in which +pairs of bytes in the range from @code{0240} to @code{0377} are single +characters, while @code{0201} enters Latin-1 mode, in which single bytes +in the range from @code{0240} to @code{0377} are characters, and +interpreted according to the ISO Latin-1 character set. This is a +multibyte code which has two alternative shift states (``Japanese mode'' +and ``Latin-1 mode''), and two shift sequences that specify particular +shift states.
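The two-state code just described can be mimicked with a small decoder, which may make the idea of shift state more concrete. The sketch below is not a library facility; the names @code{count_chars} and @code{enum shift_state} are hypothetical. It also makes two assumptions that the description leaves open: the initial state is Latin-1 mode, and bytes from @code{0202} through @code{0237} are invalid. Bytes from @code{0} through @code{0177} count as single characters in either mode.

```c
#include <stddef.h>

/* The two shift states of the illustrative code.  */
enum shift_state { LATIN1_MODE, JAPANESE_MODE };

/* Hypothetical helper: count the characters in the LEN bytes at S,
   or return -1 if the bytes are not valid in the toy code.  */
long
count_chars (const unsigned char *s, size_t len)
{
  enum shift_state state = LATIN1_MODE;   /* Assumed initial state.  */
  long count = 0;
  size_t i = 0;

  while (i < len)
    {
      unsigned char b = s[i];

      if (b == 0200)
        {
          /* Shift sequence: enter Japanese mode; not a character.  */
          state = JAPANESE_MODE;
          i++;
        }
      else if (b == 0201)
        {
          /* Shift sequence: enter Latin-1 mode; not a character.  */
          state = LATIN1_MODE;
          i++;
        }
      else if (b <= 0177)
        {
          /* ASCII bytes stand for themselves in both modes.  */
          count++;
          i++;
        }
      else if (b < 0240)
        return -1;          /* 0202 through 0237: assumed invalid.  */
      else if (state == LATIN1_MODE)
        {
          /* One byte in 0240..0377 is one Latin-1 character.  */
          count++;
          i++;
        }
      else
        {
          /* Japanese mode needs a second byte in 0240..0377.  */
          if (i + 1 >= len || s[i + 1] < 0240)
            return -1;
          count++;
          i += 2;
        }
    }

  return count;
}
```

The same two bytes @code{0241 0242} count as two characters after @code{0201} but as one character after @code{0200}, which is exactly why a scanner for such a code has to carry state from one character to the next.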
+ +When the multibyte character code in use has shift states, then +@code{mblen}, @code{mbtowc} and @code{wctomb} must maintain and update +the current shift state as they scan the string. To make this work +properly, you must follow these rules: + +@itemize @bullet +@item +Before starting to scan a string, call the function with a null pointer +for the multibyte character address---for example, @code{mblen (NULL, +0)}. This initializes the shift state to its standard initial value. + +@item +Scan the string one character at a time, in order. Do not ``back up'' +and rescan characters already scanned, and do not intersperse the +processing of different strings. +@end itemize + +Here is an example of using @code{mblen} following these rules: + +@smallexample +void +scan_string (char *s) +@{ + int length = strlen (s); + + /* @r{Initialize shift state.} */ + mblen (NULL, 0); + + while (1) + @{ + int thischar = mblen (s, length); + /* @r{Deal with end of string and invalid characters.} */ + if (thischar == 0) + break; + if (thischar == -1) + @{ + error ("invalid multibyte character"); + break; + @} + /* @r{Advance past this character.} */ + s += thischar; + length -= thischar; + @} +@} +@end smallexample + +The functions @code{mblen}, @code{mbtowc} and @code{wctomb} are not +reentrant when using a multibyte code that uses a shift state. However, +no other library functions call these functions, so you don't have to +worry that the shift state will be changed mysteriously. diff --git a/manual/memory.texi b/manual/memory.texi new file mode 100644 index 0000000000..9269380e1d --- /dev/null +++ b/manual/memory.texi @@ -0,0 +1,1751 @@ +@comment !!! describe mmap et al (here?) +@c !!! doc brk/sbrk + +@node Memory Allocation, Character Handling, Error Reporting, Top +@chapter Memory Allocation +@cindex memory allocation +@cindex storage allocation + +The GNU system provides several methods for allocating memory space +under explicit program control. 
They vary in generality and in +efficiency. + +@iftex +@itemize @bullet +@item +The @code{malloc} facility allows fully general dynamic allocation. +@xref{Unconstrained Allocation}. + +@item +Obstacks are another facility, less general than @code{malloc} but more +efficient and convenient for stacklike allocation. @xref{Obstacks}. + +@item +The function @code{alloca} lets you allocate storage dynamically that +will be freed automatically. @xref{Variable Size Automatic}. +@end itemize +@end iftex + +@menu +* Memory Concepts:: An introduction to concepts and terminology. +* Dynamic Allocation and C:: How to get different kinds of allocation in C. +* Unconstrained Allocation:: The @code{malloc} facility allows fully general + dynamic allocation. +* Obstacks:: Obstacks are less general than malloc + but more efficient and convenient. +* Variable Size Automatic:: Allocation of variable-sized blocks + of automatic storage that are freed when the + calling function returns. +* Relocating Allocator:: Waste less memory, if you can tolerate + automatic relocation of the blocks you get. +* Memory Warnings:: Getting warnings when memory is nearly full. +@end menu + +@node Memory Concepts +@section Dynamic Memory Allocation Concepts +@cindex dynamic allocation +@cindex static allocation +@cindex automatic allocation + +@dfn{Dynamic memory allocation} is a technique in which programs +determine as they are running where to store some information. You need +dynamic allocation when the number of memory blocks you need, or how +long you continue to need them, depends on the data you are working on. + +For example, you may need a block to store a line read from an input file; +since there is no limit to how long a line can be, you must allocate the +storage dynamically and make it dynamically larger as you read more of the +line. 
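A minimal sketch of that first case, reading a line of unknown length, might look as follows. It uses @code{malloc}, @code{realloc} and @code{free}, which are described later in this chapter. The function name @code{read_line}, the initial size of 16 bytes and the doubling policy are assumptions made for the example, not part of the library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line from STREAM into a newly
   allocated buffer, without the newline.  Return a null pointer at
   end of file when nothing was read, or if memory runs out.  The
   caller frees the result.  */
char *
read_line (FILE *stream)
{
  size_t size = 16;             /* Current size of the block.  */
  size_t used = 0;              /* Bytes stored so far.  */
  char *line = malloc (size);
  int c;

  if (line == NULL)
    return NULL;

  while ((c = getc (stream)) != EOF && c != '\n')
    {
      /* Grow the block before it fills, keeping one byte spare
         for the terminating null character.  */
      if (used + 1 >= size)
        {
          char *bigger = realloc (line, size * 2);
          if (bigger == NULL)
            {
              free (line);
              return NULL;
            }
          line = bigger;
          size *= 2;
        }
      line[used++] = (char) c;
    }

  if (c == EOF && used == 0)
    {
      free (line);
      return NULL;
    }

  line[used] = '\0';
  return line;
}
```

Doubling the size each time keeps the number of @code{realloc} calls small even for very long lines, at the cost of some unused space at the end of the block.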
+ +Or, you may need a block for each record or each definition in the input +data; since you can't know in advance how many there will be, you must +allocate a new block for each record or definition as you read it. + +When you use dynamic allocation, the allocation of a block of memory is an +action that the program requests explicitly. You call a function or macro +when you want to allocate space, and specify the size with an argument. If +you want to free the space, you do so by calling another function or macro. +You can do these things whenever you want, as often as you want. + +@node Dynamic Allocation and C +@section Dynamic Allocation and C + +The C language supports two kinds of memory allocation through the variables +in C programs: + +@itemize @bullet +@item +@dfn{Static allocation} is what happens when you declare a static or +global variable. Each static or global variable defines one block of +space, of a fixed size. The space is allocated once, when your program +is started, and is never freed. + +@item +@dfn{Automatic allocation} happens when you declare an automatic +variable, such as a function argument or a local variable. The space +for an automatic variable is allocated when the compound statement +containing the declaration is entered, and is freed when that +compound statement is exited. + +In GNU C, the length of the automatic storage can be an expression +that varies. In other C implementations, it must be a constant. +@end itemize + +Dynamic allocation is not supported by C variables; there is no storage +class ``dynamic'', and there can never be a C variable whose value is +stored in dynamically allocated space. The only way to refer to +dynamically allocated space is through a pointer. Because it is less +convenient, and because the actual process of dynamic allocation +requires more computation time, programmers generally use dynamic +allocation only when neither static nor automatic allocation will serve. 
+ +For example, if you want to allocate dynamically some space to hold a +@code{struct foobar}, you cannot declare a variable of type @code{struct +foobar} whose contents are the dynamically allocated space. But you can +declare a variable of pointer type @code{struct foobar *} and assign it the +address of the space. Then you can use the operators @samp{*} and +@samp{->} on this pointer variable to refer to the contents of the space: + +@smallexample +@{ + struct foobar *ptr + = (struct foobar *) malloc (sizeof (struct foobar)); + ptr->name = x; + ptr->next = current_foobar; + current_foobar = ptr; +@} +@end smallexample + +@node Unconstrained Allocation +@section Unconstrained Allocation +@cindex unconstrained storage allocation +@cindex @code{malloc} function +@cindex heap, dynamic allocation from + +The most general dynamic allocation facility is @code{malloc}. It +allows you to allocate blocks of memory of any size at any time, make +them bigger or smaller at any time, and free the blocks individually at +any time (or never). + +@menu +* Basic Allocation:: Simple use of @code{malloc}. +* Malloc Examples:: Examples of @code{malloc}. @code{xmalloc}. +* Freeing after Malloc:: Use @code{free} to free a block you + got with @code{malloc}. +* Changing Block Size:: Use @code{realloc} to make a block + bigger or smaller. +* Allocating Cleared Space:: Use @code{calloc} to allocate a + block and clear it. +* Efficiency and Malloc:: Efficiency considerations in use of + these functions. +* Aligned Memory Blocks:: Allocating specially aligned memory: + @code{memalign} and @code{valloc}. +* Heap Consistency Checking:: Automatic checking for errors. +* Hooks for Malloc:: You can use these hooks for debugging + programs that use @code{malloc}. +* Statistics of Malloc:: Getting information about how much + memory your program is using. +* Summary of Malloc:: Summary of @code{malloc} and related functions. 
+@end menu + +@node Basic Allocation +@subsection Basic Storage Allocation +@cindex allocation of memory with @code{malloc} + +To allocate a block of memory, call @code{malloc}. The prototype for +this function is in @file{stdlib.h}. +@pindex stdlib.h + +@comment malloc.h stdlib.h +@comment ANSI +@deftypefun {void *} malloc (size_t @var{size}) +This function returns a pointer to a newly allocated block @var{size} +bytes long, or a null pointer if the block could not be allocated. +@end deftypefun + +The contents of the block are undefined; you must initialize it yourself +(or use @code{calloc} instead; @pxref{Allocating Cleared Space}). +Normally you would cast the value as a pointer to the kind of object +that you want to store in the block. Here we show an example of doing +so, and of initializing the space with zeros using the library function +@code{memset} (@pxref{Copying and Concatenation}): + +@smallexample +struct foo *ptr; +@dots{} +ptr = (struct foo *) malloc (sizeof (struct foo)); +if (ptr == 0) abort (); +memset (ptr, 0, sizeof (struct foo)); +@end smallexample + +You can store the result of @code{malloc} into any pointer variable +without a cast, because ANSI C automatically converts the type +@code{void *} to another type of pointer when necessary. But the cast +is necessary in contexts other than assignment operators or if you might +want your code to run in traditional C. + +Remember that when allocating space for a string, the argument to +@code{malloc} must be one plus the length of the string. This is +because a string is terminated with a null character that doesn't count +in the ``length'' of the string but does need space. For example: + +@smallexample +char *ptr; +@dots{} +ptr = (char *) malloc (length + 1); +@end smallexample + +@noindent +@xref{Representation of Strings}, for more information about this. + +@node Malloc Examples +@subsection Examples of @code{malloc} + +If no more space is available, @code{malloc} returns a null pointer. 
+You should check the value of @emph{every} call to @code{malloc}. It is +useful to write a subroutine that calls @code{malloc} and reports an +error if the value is a null pointer, returning only if the value is +nonzero. This function is conventionally called @code{xmalloc}. Here +it is: + +@smallexample +void * +xmalloc (size_t size) +@{ + register void *value = malloc (size); + if (value == 0) + fatal ("virtual memory exhausted"); + return value; +@} +@end smallexample + +Here is a real example of using @code{malloc} (by way of @code{xmalloc}). +The function @code{savestring} will copy a sequence of characters into +a newly allocated null-terminated string: + +@smallexample +@group +char * +savestring (const char *ptr, size_t len) +@{ + register char *value = (char *) xmalloc (len + 1); + memcpy (value, ptr, len); + value[len] = '\0'; + return value; +@} +@end group +@end smallexample + +The block that @code{malloc} gives you is guaranteed to be aligned so +that it can hold any type of data. In the GNU system, the address is +always a multiple of eight; if the size of block is 16 or more, then the +address is always a multiple of 16. Only rarely is any higher boundary +(such as a page boundary) necessary; for those cases, use +@code{memalign} or @code{valloc} (@pxref{Aligned Memory Blocks}). + +Note that the memory located after the end of the block is likely to be +in use for something else; perhaps a block already allocated by another +call to @code{malloc}. If you attempt to treat the block as longer than +you asked for it to be, you are liable to destroy the data that +@code{malloc} uses to keep track of its blocks, or you may destroy the +contents of another block. If you have already allocated a block and +discover you want it to be bigger, use @code{realloc} (@pxref{Changing +Block Size}). 
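The `xmalloc` and `savestring` examples above call `fatal`, which is not a standard library function and is not defined in this section. The following is a minimal self-contained sketch, assuming that `fatal` simply prints its message to `stderr` and exits; that definition is an assumption made for illustration, and a real program may prefer its own error reporting.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-in for `fatal': print MESSAGE and terminate.  */
static void
fatal (const char *message)
{
  fprintf (stderr, "%s\n", message);
  exit (EXIT_FAILURE);
}

/* Like malloc, but never returns a null pointer.  */
void *
xmalloc (size_t size)
{
  void *value = malloc (size);
  if (value == NULL)
    fatal ("virtual memory exhausted");
  return value;
}

/* Copy LEN characters starting at PTR into a newly allocated,
   null-terminated string.  */
char *
savestring (const char *ptr, size_t len)
{
  char *value = xmalloc (len + 1);   /* One extra byte for the terminator.  */
  memcpy (value, ptr, len);
  value[len] = '\0';
  return value;
}
```

For instance, `savestring ("hello, world", 5)` yields a fresh six-byte block containing `hello` and its terminating null character; the caller frees it with `free`.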
+ +@node Freeing after Malloc +@subsection Freeing Memory Allocated with @code{malloc} +@cindex freeing memory allocated with @code{malloc} +@cindex heap, freeing memory from + +When you no longer need a block that you got with @code{malloc}, use the +function @code{free} to make the block available to be allocated again. +The prototype for this function is in @file{stdlib.h}. +@pindex stdlib.h + +@comment malloc.h stdlib.h +@comment ANSI +@deftypefun void free (void *@var{ptr}) +The @code{free} function deallocates the block of storage pointed at +by @var{ptr}. +@end deftypefun + +@comment stdlib.h +@comment Sun +@deftypefun void cfree (void *@var{ptr}) +This function does the same thing as @code{free}. It's provided for +backward compatibility with SunOS; you should use @code{free} instead. +@end deftypefun + +Freeing a block alters the contents of the block. @strong{Do not expect to +find any data (such as a pointer to the next block in a chain of blocks) in +the block after freeing it.} Copy whatever you need out of the block before +freeing it! Here is an example of the proper way to free all the blocks in +a chain, and the strings that they point to: + +@smallexample +struct chain + @{ + struct chain *next; + char *name; + @}; + +void +free_chain (struct chain *chain) +@{ + while (chain != 0) + @{ + struct chain *next = chain->next; + free (chain->name); + free (chain); + chain = next; + @} +@} +@end smallexample + +Occasionally, @code{free} can actually return memory to the operating +system and make the process smaller. Usually, all it can do is allow a +later call to @code{malloc} to reuse the space. In the meantime, the +space remains in your program as part of a free-list used internally by +@code{malloc}. + +There is no point in freeing blocks at the end of a program, because all +of the program's space is given back to the system when the process +terminates.
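The rule about copying data out of a block before freeing it also applies when you release a single block rather than a whole chain. Here is a sketch of removing just the first block of a chain; `pop_name` is a hypothetical helper written for this illustration, and it assumes the same `struct chain` layout as the example above.

```c
#include <stdlib.h>

struct chain
{
  struct chain *next;
  char *name;
};

/* Hypothetical helper: unlink and free the first block of *HEAD and
   return its name.  The caller becomes responsible for the returned
   string.  Returns a null pointer if the chain is empty.  */
char *
pop_name (struct chain **head)
{
  struct chain *first = *head;
  char *name;

  if (first == NULL)
    return NULL;

  /* Copy out everything still needed before the block is freed.  */
  name = first->name;
  *head = first->next;

  free (first);
  return name;
}
```

Reading `first->next` or `first->name` after the call to `free` would be an error, which is why both values are saved first.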
+ +@node Changing Block Size +@subsection Changing the Size of a Block +@cindex changing the size of a block (@code{malloc}) + +Often you do not know for certain how big a block you will ultimately need +at the time you must begin to use the block. For example, the block might +be a buffer that you use to hold a line being read from a file; no matter +how long you make the buffer initially, you may encounter a line that is +longer. + +You can make the block longer by calling @code{realloc}. This function +is declared in @file{stdlib.h}. +@pindex stdlib.h + +@comment malloc.h stdlib.h +@comment ANSI +@deftypefun {void *} realloc (void *@var{ptr}, size_t @var{newsize}) +The @code{realloc} function changes the size of the block whose address is +@var{ptr} to be @var{newsize}. + +Since the space after the end of the block may be in use, @code{realloc} +may find it necessary to copy the block to a new address where more free +space is available. The value of @code{realloc} is the new address of the +block. If the block needs to be moved, @code{realloc} copies the old +contents. + +If you pass a null pointer for @var{ptr}, @code{realloc} behaves just +like @samp{malloc (@var{newsize})}. This can be convenient, but beware +that older implementations (before ANSI C) may not support this +behavior, and will probably crash when @code{realloc} is passed a null +pointer. +@end deftypefun + +Like @code{malloc}, @code{realloc} may return a null pointer if no +memory space is available to make the block bigger. When this happens, +the original block is untouched; it has not been modified or relocated. + +In most cases it makes no difference what happens to the original block +when @code{realloc} fails, because the application program cannot continue +when it is out of memory, and the only thing to do is to give a fatal error +message. 
Often it is convenient to write and use a subroutine, +conventionally called @code{xrealloc}, that takes care of the error message +as @code{xmalloc} does for @code{malloc}: + +@smallexample +void * +xrealloc (void *ptr, size_t size) +@{ + register void *value = realloc (ptr, size); + if (value == 0) + fatal ("Virtual memory exhausted"); + return value; +@} +@end smallexample + +You can also use @code{realloc} to make a block smaller. The reason you +would do this is to avoid tying up a lot of memory space when only a little +is needed. Making a block smaller sometimes necessitates copying it, so it +can fail if no other space is available. + +If the new size you specify is the same as the old size, @code{realloc} +is guaranteed to change nothing and return the same address that you gave. + +@node Allocating Cleared Space +@subsection Allocating Cleared Space + +The function @code{calloc} allocates memory and clears it to zero. It +is declared in @file{stdlib.h}. +@pindex stdlib.h + +@comment malloc.h stdlib.h +@comment ANSI +@deftypefun {void *} calloc (size_t @var{count}, size_t @var{eltsize}) +This function allocates a block long enough to contain a vector of +@var{count} elements, each of size @var{eltsize}. Its contents are +cleared to zero before @code{calloc} returns. +@end deftypefun + +You could define @code{calloc} as follows: + +@smallexample +void * +calloc (size_t count, size_t eltsize) +@{ + size_t size = count * eltsize; + void *value = malloc (size); + if (value != 0) + memset (value, 0, size); + return value; +@} +@end smallexample + +@node Efficiency and Malloc +@subsection Efficiency Considerations for @code{malloc} +@cindex efficiency and @code{malloc} + +To make the best use of @code{malloc}, it helps to know that the GNU +version of @code{malloc} always dispenses small amounts of memory in +blocks whose sizes are powers of two. It keeps separate pools for each +power of two. This holds for sizes up to a page size. 
Therefore, if +you are free to choose the size of a small block in order to make +@code{malloc} more efficient, make it a power of two. +@c !!! xref getpagesize + +Once a page is split up for a particular block size, it can't be reused +for another size unless all the blocks in it are freed. In many +programs, this is unlikely to happen. Thus, you can sometimes make a +program use memory more efficiently by using blocks of the same size for +many different purposes. + +When you ask for memory blocks of a page or larger, @code{malloc} uses a +different strategy; it rounds the size up to a multiple of a page, and +it can coalesce and split blocks as needed. + +The reason for the two strategies is that it is important to allocate +and free small blocks as fast as possible, but speed is less important +for a large block since the program normally spends a fair amount of +time using it. Also, large blocks are normally fewer in number. +Therefore, for large blocks, it makes sense to use a method which takes +more time to minimize the wasted space. + +@node Aligned Memory Blocks +@subsection Allocating Aligned Memory Blocks + +@cindex page boundary +@cindex alignment (with @code{malloc}) +@pindex stdlib.h +The address of a block returned by @code{malloc} or @code{realloc} in +the GNU system is always a multiple of eight. If you need a block whose +address is a multiple of a higher power of two than that, use +@code{memalign} or @code{valloc}. These functions are declared in +@file{stdlib.h}. + +With the GNU library, you can use @code{free} to free the blocks that +@code{memalign} and @code{valloc} return. That does not work in BSD, +however---BSD does not provide any way to free such blocks. + +@comment malloc.h stdlib.h +@comment BSD +@deftypefun {void *} memalign (size_t @var{size}, size_t @var{boundary}) +The @code{memalign} function allocates a block of @var{size} bytes whose +address is a multiple of @var{boundary}. The @var{boundary} must be a +power of two! 
The function @code{memalign} works by calling +@code{malloc} to allocate a somewhat larger block, and then returning an +address within the block that is on the specified boundary. +@end deftypefun + +@comment malloc.h stdlib.h +@comment BSD +@deftypefun {void *} valloc (size_t @var{size}) +Using @code{valloc} is like using @code{memalign} and passing the page size +as the value of the second argument. It is implemented like this: + +@smallexample +void * +valloc (size_t size) +@{ + return memalign (size, getpagesize ()); +@} +@end smallexample +@c !!! xref getpagesize +@end deftypefun + +@node Heap Consistency Checking +@subsection Heap Consistency Checking + +@cindex heap consistency checking +@cindex consistency checking, of heap + +You can ask @code{malloc} to check the consistency of dynamic storage by +using the @code{mcheck} function. This function is a GNU extension, +declared in @file{malloc.h}. +@pindex malloc.h + +@comment malloc.h +@comment GNU +@deftypefun int mcheck (void (*@var{abortfn}) (enum mcheck_status @var{status})) +Calling @code{mcheck} tells @code{malloc} to perform occasional +consistency checks. These will catch things such as writing +past the end of a block that was allocated with @code{malloc}. + +The @var{abortfn} argument is the function to call when an inconsistency +is found. If you supply a null pointer, then @code{mcheck} uses a +default function which prints a message and calls @code{abort} +(@pxref{Aborting a Program}). The function you supply is called with +one argument, which says what sort of inconsistency was detected; its +type is described below. + +It is too late to begin allocation checking once you have allocated +anything with @code{malloc}. So @code{mcheck} does nothing in that +case. The function returns @code{-1} if you call it too late, and +@code{0} otherwise (when it is successful). 
+ +The easiest way to arrange to call @code{mcheck} early enough is to use +the option @samp{-lmcheck} when you link your program; then you don't +need to modify your program source at all. +@end deftypefun + +@deftypefun {enum mcheck_status} mprobe (void *@var{pointer}) +The @code{mprobe} function lets you explicitly check for inconsistencies +in a particular allocated block. You must have already called +@code{mcheck} at the beginning of the program, to do its occasional +checks; calling @code{mprobe} requests an additional consistency check +to be done at the time of the call. + +The argument @var{pointer} must be a pointer returned by @code{malloc} +or @code{realloc}. @code{mprobe} returns a value that says what +inconsistency, if any, was found. The values are described below. +@end deftypefun + +@deftp {Data Type} {enum mcheck_status} +This enumerated type describes what kind of inconsistency was detected +in an allocated block, if any. Here are the possible values: + +@table @code +@item MCHECK_DISABLED +@code{mcheck} was not called before the first allocation. +No consistency checking can be done. +@item MCHECK_OK +No inconsistency detected. +@item MCHECK_HEAD +The data immediately before the block was modified. +This commonly happens when an array index or pointer +is decremented too far. +@item MCHECK_TAIL +The data immediately after the block was modified. +This commonly happens when an array index or pointer +is incremented too far. +@item MCHECK_FREE +The block was already freed. +@end table +@end deftp + +@node Hooks for Malloc +@subsection Storage Allocation Hooks +@cindex allocation hooks, for @code{malloc} + +The GNU C library lets you modify the behavior of @code{malloc}, +@code{realloc}, and @code{free} by specifying appropriate hook +functions. You can use these hooks to help you debug programs that use +dynamic storage allocation, for example. + +The hook variables are declared in @file{malloc.h}. 
+@pindex malloc.h + +@comment malloc.h +@comment GNU +@defvar __malloc_hook +The value of this variable is a pointer to function that @code{malloc} +uses whenever it is called. You should define this function to look +like @code{malloc}; that is, like: + +@smallexample +void *@var{function} (size_t @var{size}) +@end smallexample +@end defvar + +@comment malloc.h +@comment GNU +@defvar __realloc_hook +The value of this variable is a pointer to function that @code{realloc} +uses whenever it is called. You should define this function to look +like @code{realloc}; that is, like: + +@smallexample +void *@var{function} (void *@var{ptr}, size_t @var{size}) +@end smallexample +@end defvar + +@comment malloc.h +@comment GNU +@defvar __free_hook +The value of this variable is a pointer to function that @code{free} +uses whenever it is called. You should define this function to look +like @code{free}; that is, like: + +@smallexample +void @var{function} (void *@var{ptr}) +@end smallexample +@end defvar + +You must make sure that the function you install as a hook for one of +these functions does not call that function recursively without restoring +the old value of the hook first! Otherwise, your program will get stuck +in an infinite recursion. + +Here is an example showing how to use @code{__malloc_hook} properly. It +installs a function that prints out information every time @code{malloc} +is called. + +@smallexample +static void *(*old_malloc_hook) (size_t); +static void * +my_malloc_hook (size_t size) +@{ + void *result; + __malloc_hook = old_malloc_hook; + result = malloc (size); + /* @r{@code{printf} might call @code{malloc}, so protect it too.} */ + printf ("malloc (%u) returns %p\n", (unsigned int) size, result); + __malloc_hook = my_malloc_hook; + return result; +@} + +main () +@{ + ... + old_malloc_hook = __malloc_hook; + __malloc_hook = my_malloc_hook; + ... 
+@} +@end smallexample + +The @code{mcheck} function (@pxref{Heap Consistency Checking}) works by +installing such hooks. + +@c __morecore, __after_morecore_hook are undocumented +@c It's not clear whether to document them. + +@node Statistics of Malloc +@subsection Statistics for Storage Allocation with @code{malloc} + +@cindex allocation statistics +You can get information about dynamic storage allocation by calling the +@code{mstats} function. This function and its associated data type are +declared in @file{malloc.h}; they are a GNU extension. +@pindex malloc.h + +@comment malloc.h +@comment GNU +@deftp {Data Type} {struct mstats} +This structure type is used to return information about the dynamic +storage allocator. It contains the following members: + +@table @code +@item size_t bytes_total +This is the total size of memory managed by @code{malloc}, in bytes. + +@item size_t chunks_used +This is the number of chunks in use. (The storage allocator internally +gets chunks of memory from the operating system, and then carves them up +to satisfy individual @code{malloc} requests; see @ref{Efficiency and +Malloc}.) + +@item size_t bytes_used +This is the number of bytes in use. + +@item size_t chunks_free +This is the number of chunks which are free -- that is, that have been +allocated by the operating system to your program, but which are not +now being used. + +@item size_t bytes_free +This is the number of bytes which are free. +@end table +@end deftp + +@comment malloc.h +@comment GNU +@deftypefun {struct mstats} mstats (void) +This function returns information about the current dynamic memory usage +in a structure of type @code{struct mstats}. +@end deftypefun + +@node Summary of Malloc +@subsection Summary of @code{malloc}-Related Functions + +Here is a summary of the functions that work with @code{malloc}: + +@table @code +@item void *malloc (size_t @var{size}) +Allocate a block of @var{size} bytes. @xref{Basic Allocation}. 
+ +@item void free (void *@var{addr}) +Free a block previously allocated by @code{malloc}. @xref{Freeing after +Malloc}. + +@item void *realloc (void *@var{addr}, size_t @var{size}) +Make a block previously allocated by @code{malloc} larger or smaller, +possibly by copying it to a new location. @xref{Changing Block Size}. + +@item void *calloc (size_t @var{count}, size_t @var{eltsize}) +Allocate a block of @var{count} * @var{eltsize} bytes using +@code{malloc}, and set its contents to zero. @xref{Allocating Cleared +Space}. + +@item void *valloc (size_t @var{size}) +Allocate a block of @var{size} bytes, starting on a page boundary. +@xref{Aligned Memory Blocks}. + +@item void *memalign (size_t @var{size}, size_t @var{boundary}) +Allocate a block of @var{size} bytes, starting at an address that is a +multiple of @var{boundary}. @xref{Aligned Memory Blocks}. + +@item int mcheck (void (*@var{abortfn}) (enum mcheck_status @var{status})) +Tell @code{malloc} to perform occasional consistency checks on +dynamically allocated memory, and to call @var{abortfn} when an +inconsistency is found. @xref{Heap Consistency Checking}. + +@item void *(*__malloc_hook) (size_t @var{size}) +A pointer to a function that @code{malloc} uses whenever it is called. + +@item void *(*__realloc_hook) (void *@var{ptr}, size_t @var{size}) +A pointer to a function that @code{realloc} uses whenever it is called. + +@item void (*__free_hook) (void *@var{ptr}) +A pointer to a function that @code{free} uses whenever it is called. + +@item struct mstats mstats (void) +Return information about the current dynamic memory usage. +@xref{Statistics of Malloc}. +@end table + +@node Obstacks +@section Obstacks +@cindex obstacks + +An @dfn{obstack} is a pool of memory containing a stack of objects. You +can create any number of separate obstacks, and then allocate objects in +specified obstacks. Within each obstack, the last object allocated must +always be the first one freed, but distinct obstacks are independent of +each other.
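The last-allocated, first-freed rule can be seen in a short sketch that uses functions described later in this chapter. It assumes the GNU `obstack.h` header and, for brevity, maps the chunk routines directly to `malloc` and `free` rather than to a checking wrapper. The function `obstack_lifo_demo` is hypothetical. The address reuse it tests for is what the GNU implementation does when the objects share one chunk; treat it as an observation, not a documented guarantee.

```c
#include <obstack.h>
#include <stdint.h>
#include <stdlib.h>

/* The obstack macros use these two names to get and release chunks.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical demo: return 1 if freeing the second of three objects
   makes its space available to the next allocation.  */
int
obstack_lifo_demo (void)
{
  struct obstack pool;
  char *first, *second, *third, *reused;
  uintptr_t second_addr;
  int same;

  obstack_init (&pool);

  first = obstack_alloc (&pool, 16);
  second = obstack_alloc (&pool, 16);
  third = obstack_alloc (&pool, 16);
  (void) first;
  (void) third;
  second_addr = (uintptr_t) second;

  /* This frees SECOND and everything allocated after it, so THIRD is
     gone as well.  FIRST remains valid.  */
  obstack_free (&pool, second);

  /* The next object is placed where SECOND used to be.  */
  reused = obstack_alloc (&pool, 16);
  same = ((uintptr_t) reused == second_addr);

  /* Release everything, including the chunk itself.  */
  obstack_free (&pool, NULL);
  return same;
}
```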
+ +Aside from this one constraint of order of freeing, obstacks are totally +general: an obstack can contain any number of objects of any size. They +are implemented with macros, so allocation is usually very fast as long as +the objects are usually small. And the only space overhead per object is +the padding needed to start each object on a suitable boundary. + +@menu +* Creating Obstacks:: How to declare an obstack in your program. +* Preparing for Obstacks:: Preparations needed before you can + use obstacks. +* Allocation in an Obstack:: Allocating objects in an obstack. +* Freeing Obstack Objects:: Freeing objects in an obstack. +* Obstack Functions:: The obstack functions are both + functions and macros. +* Growing Objects:: Making an object bigger by stages. +* Extra Fast Growing:: Extra-high-efficiency (though more + complicated) growing objects. +* Status of an Obstack:: Inquiries about the status of an obstack. +* Obstacks Data Alignment:: Controlling alignment of objects in obstacks. +* Obstack Chunks:: How obstacks obtain and release chunks; + efficiency considerations. +* Summary of Obstacks:: +@end menu + +@node Creating Obstacks +@subsection Creating Obstacks + +The utilities for manipulating obstacks are declared in the header +file @file{obstack.h}. +@pindex obstack.h + +@comment obstack.h +@comment GNU +@deftp {Data Type} {struct obstack} +An obstack is represented by a data structure of type @code{struct +obstack}. This structure has a small fixed size; it records the status +of the obstack and how to find the space in which objects are allocated. +It does not contain any of the objects themselves. You should not try +to access the contents of the structure directly; use only the functions +described in this chapter. +@end deftp + +You can declare variables of type @code{struct obstack} and use them as +obstacks, or you can allocate obstacks dynamically like any other kind +of object. 
Dynamic allocation of obstacks allows your program to have a +variable number of different stacks. (You can even allocate an +obstack structure in another obstack, but this is rarely useful.) + +All the functions that work with obstacks require you to specify which +obstack to use. You do this with a pointer of type @code{struct obstack +*}. In the following, we often say ``an obstack'' when strictly +speaking the object at hand is such a pointer. + +The objects in the obstack are packed into large blocks called +@dfn{chunks}. The @code{struct obstack} structure points to a chain of +the chunks currently in use. + +The obstack library obtains a new chunk whenever you allocate an object +that won't fit in the previous chunk. Since the obstack library manages +chunks automatically, you don't need to pay much attention to them, but +you do need to supply a function which the obstack library should use to +get a chunk. Usually you supply a function which uses @code{malloc} +directly or indirectly. You must also supply a function to free a chunk. +These matters are described in the following section. + +@node Preparing for Obstacks +@subsection Preparing for Using Obstacks + +Each source file in which you plan to use the obstack functions +must include the header file @file{obstack.h}, like this: + +@smallexample +#include <obstack.h> +@end smallexample + +@findex obstack_chunk_alloc +@findex obstack_chunk_free +Also, if the source file uses the macro @code{obstack_init}, it must +declare or define two functions or macros that will be called by the +obstack library. One, @code{obstack_chunk_alloc}, is used to allocate +the chunks of memory into which objects are packed. The other, +@code{obstack_chunk_free}, is used to return chunks when the objects in +them are freed. These macros should appear before any use of obstacks +in the source file. + +Usually these are defined to use @code{malloc} via the intermediary +@code{xmalloc} (@pxref{Unconstrained Allocation}). 
This is done with +the following pair of macro definitions: + +@smallexample +#define obstack_chunk_alloc xmalloc +#define obstack_chunk_free free +@end smallexample + +@noindent +Though the storage you get using obstacks really comes from @code{malloc}, +using obstacks is faster because @code{malloc} is called less often, for +larger blocks of memory. @xref{Obstack Chunks}, for full details. + +At run time, before the program can use a @code{struct obstack} object +as an obstack, it must initialize the obstack by calling +@code{obstack_init}. + +@comment obstack.h +@comment GNU +@deftypefun int obstack_init (struct obstack *@var{obstack-ptr}) +Initialize obstack @var{obstack-ptr} for allocation of objects. This +function calls the obstack's @code{obstack_chunk_alloc} function. It +returns 0 if @code{obstack_chunk_alloc} returns a null pointer, meaning +that it is out of memory. Otherwise, it returns 1. If you supply an +@code{obstack_chunk_alloc} function that calls @code{exit} +(@pxref{Program Termination}) or @code{longjmp} (@pxref{Non-Local +Exits}) when out of memory, you can safely ignore the value that +@code{obstack_init} returns. +@end deftypefun + +Here are two examples of how to allocate the space for an obstack and +initialize it. First, an obstack that is a static variable: + +@smallexample +static struct obstack myobstack; +@dots{} +obstack_init (&myobstack); +@end smallexample + +@noindent +Second, an obstack that is itself dynamically allocated: + +@smallexample +struct obstack *myobstack_ptr + = (struct obstack *) xmalloc (sizeof (struct obstack)); + +obstack_init (myobstack_ptr); +@end smallexample + +@node Allocation in an Obstack +@subsection Allocation in an Obstack +@cindex allocation (obstacks) + +The most direct way to allocate an object in an obstack is with +@code{obstack_alloc}, which is invoked almost like @code{malloc}. 
+ +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_alloc (struct obstack *@var{obstack-ptr}, int @var{size}) +This allocates an uninitialized block of @var{size} bytes in an obstack +and returns its address. Here @var{obstack-ptr} specifies which obstack +to allocate the block in; it is the address of the @code{struct obstack} +object which represents the obstack. Each obstack function or macro +requires you to specify an @var{obstack-ptr} as the first argument. + +This function calls the obstack's @code{obstack_chunk_alloc} function if +it needs to allocate a new chunk of memory; it returns a null pointer if +@code{obstack_chunk_alloc} returns one. In that case, it has not +changed the amount of memory allocated in the obstack. If you supply an +@code{obstack_chunk_alloc} function that calls @code{exit} +(@pxref{Program Termination}) or @code{longjmp} (@pxref{Non-Local +Exits}) when out of memory, then @code{obstack_alloc} will never return +a null pointer. +@end deftypefun + +For example, here is a function that allocates a copy of a string @var{string} +in a specific obstack, which is in the variable @code{string_obstack}: + +@smallexample +struct obstack string_obstack; + +char * +copystring (char *string) +@{ + size_t len = strlen (string) + 1; + char *s = (char *) obstack_alloc (&string_obstack, len); + memcpy (s, string, len); + return s; +@} +@end smallexample + +To allocate a block with specified contents, use the function +@code{obstack_copy}, declared like this: + +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_copy (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +This allocates a block and initializes it by copying @var{size} +bytes of data starting at @var{address}. It can return a null pointer +under the same conditions as @code{obstack_alloc}.
+@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_copy0 (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +Like @code{obstack_copy}, but appends an extra byte containing a null +character. This extra byte is not counted in the argument @var{size}. +@end deftypefun + +The @code{obstack_copy0} function is convenient for copying a sequence +of characters into an obstack as a null-terminated string. Here is an +example of its use: + +@smallexample +char * +obstack_savestring (char *addr, int size) +@{ + return obstack_copy0 (&myobstack, addr, size); +@} +@end smallexample + +@noindent +Contrast this with the previous example of @code{savestring} using +@code{malloc} (@pxref{Malloc Examples}). + +@node Freeing Obstack Objects +@subsection Freeing Objects in an Obstack +@cindex freeing (obstacks) + +To free an object allocated in an obstack, use the function +@code{obstack_free}. Since the obstack is a stack of objects, freeing +one object automatically frees all other objects allocated more recently +in the same obstack. + +@comment obstack.h +@comment GNU +@deftypefun void obstack_free (struct obstack *@var{obstack-ptr}, void *@var{object}) +If @var{object} is a null pointer, everything allocated in the obstack +is freed. Otherwise, @var{object} must be the address of an object +allocated in the obstack. Then @var{object} is freed, along with +everything allocated in @var{obstack-ptr} since @var{object}. +@end deftypefun + +Note that if @var{object} is a null pointer, the result is an +uninitialized obstack. To free all storage in an obstack but leave it +valid for further allocation, call @code{obstack_free} with the address +of the first object allocated on the obstack: + +@smallexample +obstack_free (obstack_ptr, first_object_allocated_ptr); +@end smallexample + +Recall that the objects in an obstack are grouped into chunks.
When all +the objects in a chunk become free, the obstack library automatically +frees the chunk (@pxref{Preparing for Obstacks}). Then other +obstacks, or non-obstack allocation, can reuse the space of the chunk. + +@node Obstack Functions +@subsection Obstack Functions and Macros +@cindex macros + +The interfaces for using obstacks may be defined either as functions or +as macros, depending on the compiler. The obstack facility works with +all C compilers, including both ANSI C and traditional C, but there are +precautions you must take if you plan to use compilers other than GNU C. + +If you are using an old-fashioned non-ANSI C compiler, all the obstack +``functions'' are actually defined only as macros. You can call these +macros like functions, but you cannot use them in any other way (for +example, you cannot take their address). + +Calling the macros requires a special precaution: namely, the first +operand (the obstack pointer) may not contain any side effects, because +it may be computed more than once. For example, if you write this: + +@smallexample +obstack_alloc (get_obstack (), 4); +@end smallexample + +@noindent +you will find that @code{get_obstack} may be called several times. +If you use @code{*obstack_list_ptr++} as the obstack pointer argument, +you will get very strange results since the incrementation may occur +several times. + +In ANSI C, each function has both a macro definition and a function +definition. The function definition is used if you take the address of the +function without calling it. An ordinary call uses the macro definition by +default, but you can request the function definition instead by writing the +function name in parentheses, as shown here: + +@smallexample +char *x; +void *(*funcp) (); +/* @r{Use the macro}. */ +x = (char *) obstack_alloc (obptr, size); +/* @r{Call the function}. */ +x = (char *) (obstack_alloc) (obptr, size); +/* @r{Take the address of the function}. 
*/ +funcp = obstack_alloc; +@end smallexample + +@noindent +This is the same situation that exists in ANSI C for the standard library +functions. @xref{Macro Definitions}. + +@strong{Warning:} When you do use the macros, you must observe the +precaution of avoiding side effects in the first operand, even in ANSI +C. + +If you use the GNU C compiler, this precaution is not necessary, because +various language extensions in GNU C permit defining the macros so as to +compute each argument only once. + +@node Growing Objects +@subsection Growing Objects +@cindex growing objects (in obstacks) +@cindex changing the size of a block (obstacks) + +Because storage in obstack chunks is used sequentially, it is possible to +build up an object step by step, adding one or more bytes at a time to the +end of the object. With this technique, you do not need to know how much +data you will put in the object until you come to the end of it. We call +this the technique of @dfn{growing objects}. The special functions +for adding data to the growing object are described in this section. + +You don't need to do anything special when you start to grow an object. +Using one of the functions to add data to the object automatically +starts it. However, it is necessary to say explicitly when the object is +finished. This is done with the function @code{obstack_finish}. + +The actual address of the object thus built up is not known until the +object is finished. Until then, it always remains possible that you will +add so much data that the object must be copied into a new chunk. + +While the obstack is in use for a growing object, you cannot use it for +ordinary allocation of another object. If you try to do so, the space +already added to the growing object will become part of the other object. 
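As a sketch of the overall pattern, the following function builds one string out of two pieces with the growing-object functions described in this section and then finishes it. The function name @code{join_words} is hypothetical and exists only for this illustration. The sketch assumes that chunks come from @code{malloc} and does not check for allocation failure.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: chunks are obtained with malloc.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: grow an object holding A, a space, and B,
   followed by a null character, then finish it and return its
   final address.  OB must already be initialized and must not
   have another object growing in it.  */
static char *
join_words (struct obstack *ob, const char *a, const char *b)
{
  obstack_grow (ob, a, strlen (a));    /* Starts the object.  */
  obstack_1grow (ob, ' ');             /* Adds a single byte.  */
  obstack_grow0 (ob, b, strlen (b));   /* Adds B and a null.  */
  return (char *) obstack_finish (ob); /* Address is now fixed.  */
}
```

Only the address returned by @code{obstack_finish} should be kept. An address observed while the object was still growing may no longer be valid if the object had to move to a new chunk.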
+ +@comment obstack.h +@comment GNU +@deftypefun void obstack_blank (struct obstack *@var{obstack-ptr}, int @var{size}) +The most basic function for adding to a growing object is +@code{obstack_blank}, which adds space without initializing it. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun void obstack_grow (struct obstack *@var{obstack-ptr}, void *@var{data}, int @var{size}) +To add a block of initialized space, use @code{obstack_grow}, which is +the growing-object analogue of @code{obstack_copy}. It adds @var{size} +bytes of data to the growing object, copying the contents from +@var{data}. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun void obstack_grow0 (struct obstack *@var{obstack-ptr}, void *@var{data}, int @var{size}) +This is the growing-object analogue of @code{obstack_copy0}. It adds +@var{size} bytes copied from @var{data}, followed by an additional null +character. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun void obstack_1grow (struct obstack *@var{obstack-ptr}, char @var{c}) +To add one character at a time, use the function @code{obstack_1grow}. +It adds a single byte containing @var{c} to the growing object. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_finish (struct obstack *@var{obstack-ptr}) +When you are finished growing the object, use the function +@code{obstack_finish} to close it off and return its final address. + +Once you have finished the object, the obstack is available for ordinary +allocation or for growing another object. + +This function can return a null pointer under the same conditions as +@code{obstack_alloc} (@pxref{Allocation in an Obstack}). +@end deftypefun + +When you build an object by growing it, you will probably need to know +afterward how long it became. 
You need not keep track of this as you grow +the object, because you can find out the length from the obstack just +before finishing the object with the function @code{obstack_object_size}, +declared as follows: + +@comment obstack.h +@comment GNU +@deftypefun int obstack_object_size (struct obstack *@var{obstack-ptr}) +This function returns the current size of the growing object, in bytes. +Remember to call this function @emph{before} finishing the object. +After it is finished, @code{obstack_object_size} will return zero. +@end deftypefun + +If you have started growing an object and wish to cancel it, you should +finish it and then free it, like this: + +@smallexample +obstack_free (obstack_ptr, obstack_finish (obstack_ptr)); +@end smallexample + +@noindent +This has no effect if no object was growing. + +@cindex shrinking objects +You can use @code{obstack_blank} with a negative size argument to make +the current object smaller. Just don't try to shrink it beyond zero +length---there's no telling what will happen if you do that. + +@node Extra Fast Growing +@subsection Extra Fast Growing Objects +@cindex efficiency and obstacks + +The usual functions for growing objects incur overhead for checking +whether there is room for the new growth in the current chunk. If you +are frequently constructing objects in small steps of growth, this +overhead can be significant. + +You can reduce the overhead by using special ``fast growth'' +functions that grow the object without checking. In order to have a +robust program, you must do the checking yourself. If you do this checking +in the simplest way each time you are about to add data to the object, you +have not saved anything, because that is what the ordinary growth +functions do. But if you can arrange to check less often, or check +more efficiently, then you make the program faster. + +The function @code{obstack_room} returns the amount of room available +in the current chunk. 
It is declared as follows: + +@comment obstack.h +@comment GNU +@deftypefun int obstack_room (struct obstack *@var{obstack-ptr}) +This returns the number of bytes that can be added safely to the current +growing object (or to an object about to be started) in obstack +@var{obstack-ptr} using the fast growth functions. +@end deftypefun + +While you know there is room, you can use these fast growth functions +for adding data to a growing object: + +@comment obstack.h +@comment GNU +@deftypefun void obstack_1grow_fast (struct obstack *@var{obstack-ptr}, char @var{c}) +The function @code{obstack_1grow_fast} adds one byte containing the +character @var{c} to the growing object in obstack @var{obstack-ptr}. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun void obstack_blank_fast (struct obstack *@var{obstack-ptr}, int @var{size}) +The function @code{obstack_blank_fast} adds @var{size} bytes to the +growing object in obstack @var{obstack-ptr} without initializing them. +@end deftypefun + +When you check for space using @code{obstack_room} and there is not +enough room for what you want to add, the fast growth functions +are not safe. In this case, simply use the corresponding ordinary +growth function instead. Very soon this will copy the object to a +new chunk; then there will be lots of room available again. + +So, each time you use an ordinary growth function, check afterward for +sufficient space using @code{obstack_room}. Once the object is copied +to a new chunk, there will be plenty of space again, so the program will +start using the fast growth functions again. + +Here is an example: + +@smallexample +@group +void +add_string (struct obstack *obstack, const char *ptr, int len) +@{ + while (len > 0) + @{ + int room = obstack_room (obstack); + if (room == 0) + @{ + /* @r{Not enough room.
Add one character slowly,} + @r{which may copy to a new chunk and make room.} */ + obstack_1grow (obstack, *ptr++); + len--; + @} + else + @{ + if (room > len) + room = len; + /* @r{Add fast as much as we have room for.} */ + len -= room; + while (room-- > 0) + obstack_1grow_fast (obstack, *ptr++); + @} + @} +@} +@end group +@end smallexample + +@node Status of an Obstack +@subsection Status of an Obstack +@cindex obstack status +@cindex status of obstack + +Here are functions that provide information on the current status of +allocation in an obstack. You can use them to learn about an object while +still growing it. + +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_base (struct obstack *@var{obstack-ptr}) +This function returns the tentative address of the beginning of the +currently growing object in @var{obstack-ptr}. If you finish the object +immediately, it will have that address. If you make it larger first, it +may outgrow the current chunk---then its address will change! + +If no object is growing, this value says where the next object you +allocate will start (once again assuming it fits in the current +chunk). +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun {void *} obstack_next_free (struct obstack *@var{obstack-ptr}) +This function returns the address of the first free byte in the current +chunk of obstack @var{obstack-ptr}. This is the end of the currently +growing object. If no object is growing, @code{obstack_next_free} +returns the same value as @code{obstack_base}. +@end deftypefun + +@comment obstack.h +@comment GNU +@deftypefun int obstack_object_size (struct obstack *@var{obstack-ptr}) +This function returns the size in bytes of the currently growing object. 
+This is equivalent to + +@smallexample +obstack_next_free (@var{obstack-ptr}) - obstack_base (@var{obstack-ptr}) +@end smallexample +@end deftypefun + +@node Obstacks Data Alignment +@subsection Alignment of Data in Obstacks +@cindex alignment (in obstacks) + +Each obstack has an @dfn{alignment boundary}; each object allocated in +the obstack automatically starts on an address that is a multiple of the +specified boundary. By default, this boundary is 4 bytes. + +To access an obstack's alignment boundary, use the macro +@code{obstack_alignment_mask}, whose function prototype looks like +this: + +@comment obstack.h +@comment GNU +@deftypefn Macro int obstack_alignment_mask (struct obstack *@var{obstack-ptr}) +The value is a bit mask; a bit that is 1 indicates that the corresponding +bit in the address of an object should be 0. The mask value should be one +less than a power of 2; the effect is that all object addresses are +multiples of that power of 2. The default value of the mask is 3, so that +addresses are multiples of 4. A mask value of 0 means an object can start +on any multiple of 1 (that is, no alignment is required). + +The expansion of the macro @code{obstack_alignment_mask} is an lvalue, +so you can alter the mask by assignment. For example, this statement: + +@smallexample +obstack_alignment_mask (obstack_ptr) = 0; +@end smallexample + +@noindent +has the effect of turning off alignment processing in the specified obstack. +@end deftypefn + +Note that a change in alignment mask does not take effect until +@emph{after} the next time an object is allocated or finished in the +obstack. If you are not growing an object, you can make the new +alignment mask take effect immediately by calling @code{obstack_finish}. +This will finish a zero-length object and then do proper alignment for +the next object. 
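To make the mask arithmetic concrete, here is a minimal sketch of how an address can be rounded up to the next multiple of the boundary when the mask is one less than a power of 2. It illustrates the arithmetic only and is not taken from the library's own code. The function name @code{align_up} is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only: round ADDR up to the
   next multiple of MASK + 1.  MASK must be one less than a power
   of 2, as described for obstack_alignment_mask.  */
static uintptr_t
align_up (uintptr_t addr, uintptr_t mask)
{
  /* Adding MASK carries into the higher bits unless ADDR is already
     aligned; clearing the mask bits then yields a multiple.  */
  return (addr + mask) & ~mask;
}
```

With the default mask of 3, an address of 5 is rounded up to 8, and an address that is already a multiple of 4 is left unchanged. With a mask of 0, every address is returned as it is.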
+ +@node Obstack Chunks +@subsection Obstack Chunks +@cindex efficiency of chunks +@cindex chunks + +Obstacks work by allocating space for themselves in large chunks, and +then parceling out space in the chunks to satisfy your requests. Chunks +are normally 4096 bytes long unless you specify a different chunk size. +The chunk size includes 8 bytes of overhead that are not actually used +for storing objects. Regardless of the specified size, longer chunks +will be allocated when necessary for long objects. + +The obstack library allocates chunks by calling the function +@code{obstack_chunk_alloc}, which you must define. When a chunk is no +longer needed because you have freed all the objects in it, the obstack +library frees the chunk by calling @code{obstack_chunk_free}, which you +must also define. + +These two must be defined (as macros) or declared (as functions) in each +source file that uses @code{obstack_init} (@pxref{Creating Obstacks}). +Most often they are defined as macros like this: + +@smallexample +#define obstack_chunk_alloc xmalloc +#define obstack_chunk_free free +@end smallexample + +Note that these are simple macros (no arguments). Macro definitions with +arguments will not work! It is necessary that @code{obstack_chunk_alloc} +or @code{obstack_chunk_free}, alone, expand into a function name if it is +not itself a function name. + +If you allocate chunks with @code{malloc}, the chunk size should be a +power of 2. The default chunk size, 4096, was chosen because it is long +enough to satisfy many typical requests on the obstack yet short enough +not to waste too much memory in the portion of the last chunk not yet used. + +@comment obstack.h +@comment GNU +@deftypefn Macro int obstack_chunk_size (struct obstack *@var{obstack-ptr}) +This returns the chunk size of the given obstack. +@end deftypefn + +Since this macro expands to an lvalue, you can specify a new chunk size by +assigning it a new value. 
Doing so does not affect the chunks already +allocated, but will change the size of chunks allocated for that particular +obstack in the future. It is unlikely to be useful to make the chunk size +smaller, but making it larger might improve efficiency if you are +allocating many objects whose size is comparable to the chunk size. Here +is how to do so cleanly: + +@smallexample +if (obstack_chunk_size (obstack_ptr) < @var{new-chunk-size}) + obstack_chunk_size (obstack_ptr) = @var{new-chunk-size}; +@end smallexample + +@node Summary of Obstacks +@subsection Summary of Obstack Functions + +Here is a summary of all the functions associated with obstacks. Each +takes the address of an obstack (@code{struct obstack *}) as its first +argument. + +@table @code +@item void obstack_init (struct obstack *@var{obstack-ptr}) +Initialize use of an obstack. @xref{Creating Obstacks}. + +@item void *obstack_alloc (struct obstack *@var{obstack-ptr}, int @var{size}) +Allocate an object of @var{size} uninitialized bytes. +@xref{Allocation in an Obstack}. + +@item void *obstack_copy (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +Allocate an object of @var{size} bytes, with contents copied from +@var{address}. @xref{Allocation in an Obstack}. + +@item void *obstack_copy0 (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +Allocate an object of @var{size}+1 bytes, with @var{size} of them copied +from @var{address}, followed by a null character at the end. +@xref{Allocation in an Obstack}. + +@item void obstack_free (struct obstack *@var{obstack-ptr}, void *@var{object}) +Free @var{object} (and everything allocated in the specified obstack +more recently than @var{object}). @xref{Freeing Obstack Objects}. + +@item void obstack_blank (struct obstack *@var{obstack-ptr}, int @var{size}) +Add @var{size} uninitialized bytes to a growing object. +@xref{Growing Objects}. 
+ +@item void obstack_grow (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +Add @var{size} bytes, copied from @var{address}, to a growing object. +@xref{Growing Objects}. + +@item void obstack_grow0 (struct obstack *@var{obstack-ptr}, void *@var{address}, int @var{size}) +Add @var{size} bytes, copied from @var{address}, to a growing object, +and then add another byte containing a null character. @xref{Growing +Objects}. + +@item void obstack_1grow (struct obstack *@var{obstack-ptr}, char @var{data-char}) +Add one byte containing @var{data-char} to a growing object. +@xref{Growing Objects}. + +@item void *obstack_finish (struct obstack *@var{obstack-ptr}) +Finalize the object that is growing and return its permanent address. +@xref{Growing Objects}. + +@item int obstack_object_size (struct obstack *@var{obstack-ptr}) +Get the current size of the currently growing object. @xref{Growing +Objects}. + +@item void obstack_blank_fast (struct obstack *@var{obstack-ptr}, int @var{size}) +Add @var{size} uninitialized bytes to a growing object without checking +that there is enough room. @xref{Extra Fast Growing}. + +@item void obstack_1grow_fast (struct obstack *@var{obstack-ptr}, char @var{data-char}) +Add one byte containing @var{data-char} to a growing object without +checking that there is enough room. @xref{Extra Fast Growing}. + +@item int obstack_room (struct obstack *@var{obstack-ptr}) +Get the amount of room now available for growing the current object. +@xref{Extra Fast Growing}. + +@item int obstack_alignment_mask (struct obstack *@var{obstack-ptr}) +The mask used for aligning the beginning of an object. This is an +lvalue. @xref{Obstacks Data Alignment}. + +@item int obstack_chunk_size (struct obstack *@var{obstack-ptr}) +The size for allocating chunks. This is an lvalue. @xref{Obstack Chunks}. + +@item void *obstack_base (struct obstack *@var{obstack-ptr}) +Tentative starting address of the currently growing object. 
+@xref{Status of an Obstack}. + +@item void *obstack_next_free (struct obstack *@var{obstack-ptr}) +Address just after the end of the currently growing object. +@xref{Status of an Obstack}. +@end table + +@node Variable Size Automatic +@section Automatic Storage with Variable Size +@cindex automatic freeing +@cindex @code{alloca} function +@cindex automatic storage with variable size + +The function @code{alloca} supports a kind of half-dynamic allocation in +which blocks are allocated dynamically but freed automatically. + +Allocating a block with @code{alloca} is an explicit action; you can +allocate as many blocks as you wish, and compute the size at run time. But +all the blocks are freed when you exit the function that @code{alloca} was +called from, just as if they were automatic variables declared in that +function. There is no way to free the space explicitly. + +The prototype for @code{alloca} is in @file{stdlib.h}. This function is +a BSD extension. +@pindex stdlib.h + +@comment stdlib.h +@comment GNU, BSD +@deftypefun {void *} alloca (size_t @var{size}) +The return value of @code{alloca} is the address of a block of @var{size} +bytes of storage, allocated in the stack frame of the calling function. +@end deftypefun + +Do not use @code{alloca} inside the arguments of a function call---you +will get unpredictable results, because the stack space for the +@code{alloca} would appear on the stack in the middle of the space for +the function arguments. An example of what to avoid is @code{foo (x, +alloca (4), y)}. +@c This might get fixed in future versions of GCC, but that won't make +@c it safe with compilers generally. + +@menu +* Alloca Example:: Example of using @code{alloca}. +* Advantages of Alloca:: Reasons to use @code{alloca}. +* Disadvantages of Alloca:: Reasons to avoid @code{alloca}. +* GNU C Variable-Size Arrays:: Only in GNU C, here is an alternative + method of allocating dynamically and + freeing automatically.
+@end menu + +@node Alloca Example +@subsection @code{alloca} Example + +As an example of use of @code{alloca}, here is a function that opens a file +name made from concatenating two argument strings, and returns a file +descriptor or minus one signifying failure: + +@smallexample +int +open2 (char *str1, char *str2, int flags, int mode) +@{ + char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1); + strcpy (name, str1); + strcat (name, str2); + return open (name, flags, mode); +@} +@end smallexample + +@noindent +Here is how you would get the same results with @code{malloc} and +@code{free}: + +@smallexample +int +open2 (char *str1, char *str2, int flags, int mode) +@{ + char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1); + int desc; + if (name == 0) + fatal ("virtual memory exceeded"); + strcpy (name, str1); + strcat (name, str2); + desc = open (name, flags, mode); + free (name); + return desc; +@} +@end smallexample + +As you can see, it is simpler with @code{alloca}. But @code{alloca} has +other, more important advantages, and some disadvantages. + +@node Advantages of Alloca +@subsection Advantages of @code{alloca} + +Here are the reasons why @code{alloca} may be preferable to @code{malloc}: + +@itemize @bullet +@item +Using @code{alloca} wastes very little space and is very fast. (It is +open-coded by the GNU C compiler.) + +@item +Since @code{alloca} does not have separate pools for different sizes of +block, space used for any size block can be reused for any other size. +@code{alloca} does not cause storage fragmentation. + +@item +@cindex longjmp +Nonlocal exits done with @code{longjmp} (@pxref{Non-Local Exits}) +automatically free the space allocated with @code{alloca} when they exit +through the function that called @code{alloca}. This is the most +important reason to use @code{alloca}. 
+ +To illustrate this, suppose you have a function +@code{open_or_report_error} which returns a descriptor, like +@code{open}, if it succeeds, but does not return to its caller if it +fails. If the file cannot be opened, it prints an error message and +jumps out to the command level of your program using @code{longjmp}. +Let's change @code{open2} (@pxref{Alloca Example}) to use this +subroutine:@refill + +@smallexample +int +open2 (char *str1, char *str2, int flags, int mode) +@{ + char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1); + strcpy (name, str1); + strcat (name, str2); + return open_or_report_error (name, flags, mode); +@} +@end smallexample + +@noindent +Because of the way @code{alloca} works, the storage it allocates is +freed even when an error occurs, with no special effort required. + +By contrast, the previous definition of @code{open2} (which uses +@code{malloc} and @code{free}) would develop a storage leak if it were +changed in this way. Even if you are willing to make more changes to +fix it, there is no easy way to do so. +@end itemize + +@node Disadvantages of Alloca +@subsection Disadvantages of @code{alloca} + +@cindex @code{alloca} disadvantages +@cindex disadvantages of @code{alloca} +These are the disadvantages of @code{alloca} in comparison with +@code{malloc}: + +@itemize @bullet +@item +If you try to allocate more storage than the machine can provide, you +don't get a clean error message. Instead you get a fatal signal like +the one you would get from an infinite recursion; probably a +segmentation violation (@pxref{Program Error Signals}). + +@item +Some non-GNU systems fail to support @code{alloca}, so it is less +portable. However, a slower emulation of @code{alloca} written in C +is available for use on systems with this deficiency. 
+@end itemize + +@node GNU C Variable-Size Arrays +@subsection GNU C Variable-Size Arrays +@cindex variable-sized arrays + +In GNU C, you can replace most uses of @code{alloca} with an array of +variable size. Here is how @code{open2} would look then: + +@smallexample +int open2 (char *str1, char *str2, int flags, int mode) +@{ + char name[strlen (str1) + strlen (str2) + 1]; + strcpy (name, str1); + strcat (name, str2); + return open (name, flags, mode); +@} +@end smallexample + +But @code{alloca} is not always equivalent to a variable-sized array, for +several reasons: + +@itemize @bullet +@item +A variable size array's space is freed at the end of the scope of the +name of the array. The space allocated with @code{alloca} +remains until the end of the function. + +@item +It is possible to use @code{alloca} within a loop, allocating an +additional block on each iteration. This is impossible with +variable-sized arrays. +@end itemize + +@strong{Note:} If you mix use of @code{alloca} and variable-sized arrays +within one function, exiting a scope in which a variable-sized array was +declared frees all blocks allocated with @code{alloca} during the +execution of that scope. + + +@node Relocating Allocator +@section Relocating Allocator + +@cindex relocating memory allocator +Any system of dynamic memory allocation has overhead: the amount of +space it uses is more than the amount the program asks for. The +@dfn{relocating memory allocator} achieves very low overhead by moving +blocks in memory as necessary, on its own initiative. + +@menu +* Relocator Concepts:: How to understand relocating allocation. +* Using Relocator:: Functions for relocating allocation. +@end menu + +@node Relocator Concepts +@subsection Concepts of Relocating Allocation + +@ifinfo +The @dfn{relocating memory allocator} achieves very low overhead by +moving blocks in memory as necessary, on its own initiative. 
+@end ifinfo + +When you allocate a block with @code{malloc}, the address of the block +never changes unless you use @code{realloc} to change its size. Thus, +you can safely store the address in various places, temporarily or +permanently, as you like. This is not safe when you use the relocating +memory allocator, because any and all relocatable blocks can move +whenever you allocate memory in any fashion. Even calling @code{malloc} +or @code{realloc} can move the relocatable blocks. + +@cindex handle +For each relocatable block, you must make a @dfn{handle}---a pointer +object in memory, designated to store the address of that block. The +relocating allocator knows where each block's handle is, and updates the +address stored there whenever it moves the block, so that the handle +always points to the block. Each time you access the contents of the +block, you should fetch its address anew from the handle. + +To call any of the relocating allocator functions from a signal handler +is almost certainly incorrect, because the signal could happen at any +time and relocate all the blocks. The only way to make this safe is to +block the signal around any access to the contents of any relocatable +block---not a convenient mode of operation. @xref{Nonreentrancy}. + +@node Using Relocator +@subsection Allocating and Freeing Relocatable Blocks + +@pindex malloc.h +In the descriptions below, @var{handleptr} designates the address of the +handle. All the functions are declared in @file{malloc.h}; all are GNU +extensions. + +@comment malloc.h +@comment GNU +@deftypefun {void *} r_alloc (void **@var{handleptr}, size_t @var{size}) +This function allocates a relocatable block of size @var{size}. It +stores the block's address in @code{*@var{handleptr}} and returns +a non-null pointer to indicate success. + +If @code{r_alloc} can't get the space needed, it stores a null pointer +in @code{*@var{handleptr}}, and returns a null pointer. 
+@end deftypefun + +@comment malloc.h +@comment GNU +@deftypefun void r_alloc_free (void **@var{handleptr}) +This function is the way to free a relocatable block. It frees the +block that @code{*@var{handleptr}} points to, and stores a null pointer +in @code{*@var{handleptr}} to show it doesn't point to an allocated +block any more. +@end deftypefun + +@comment malloc.h +@comment GNU +@deftypefun {void *} r_re_alloc (void **@var{handleptr}, size_t @var{size}) +The function @code{r_re_alloc} adjusts the size of the block that +@code{*@var{handleptr}} points to, making it @var{size} bytes long. It +stores the address of the resized block in @code{*@var{handleptr}} and +returns a non-null pointer to indicate success. + +If enough memory is not available, this function returns a null pointer +and does not modify @code{*@var{handleptr}}. +@end deftypefun + +@node Memory Warnings +@section Memory Usage Warnings +@cindex memory usage warnings +@cindex warnings of memory almost full + +@pindex malloc.c +You can ask for warnings as the program approaches running out of memory +space, by calling @code{memory_warnings}. This tells @code{malloc} to +check memory usage every time it asks for more memory from the operating +system. This is a GNU extension declared in @file{malloc.h}. + +@comment malloc.h +@comment GNU +@deftypefun void memory_warnings (void *@var{start}, void (*@var{warn-func}) (const char *)) +Call this function to request warnings for nearing exhaustion of virtual +memory. + +The argument @var{start} says where data space begins, in memory. The +allocator compares this against the last address used and against the +limit of data space, to determine the fraction of available memory in +use. If you supply zero for @var{start}, then a default value is used +which is right in most circumstances. + +For @var{warn-func}, supply a function that @code{malloc} can call to +warn you. It is called with a string (a warning message) as argument. 
+Normally it ought to display the string for the user to read. +@end deftypefun + +The warnings come when memory becomes 75% full, when it becomes 85% +full, and when it becomes 95% full. Above 95% you get another warning +each time memory usage increases. + diff --git a/manual/pattern.texi b/manual/pattern.texi new file mode 100644 index 0000000000..903aa48073 --- /dev/null +++ b/manual/pattern.texi @@ -0,0 +1,1189 @@ +@node Pattern Matching, I/O Overview, Searching and Sorting, Top +@chapter Pattern Matching + +The GNU C Library provides pattern matching facilities for two kinds of +patterns: regular expressions and file-name wildcards. The library also +provides a facility for expanding variable and command references and +parsing text into words in the way the shell does. + +@menu +* Wildcard Matching:: Matching a wildcard pattern against a single string. +* Globbing:: Finding the files that match a wildcard pattern. +* Regular Expressions:: Matching regular expressions against strings. +* Word Expansion:: Expanding shell variables, nested commands, + arithmetic, and wildcards. + This is what the shell does with shell commands. +@end menu + +@node Wildcard Matching +@section Wildcard Matching + +@pindex fnmatch.h +This section describes how to match a wildcard pattern against a +particular string. The result is a yes or no answer: does the +string fit the pattern or not. The symbols described here are all +declared in @file{fnmatch.h}. + +@comment fnmatch.h +@comment POSIX.2 +@deftypefun int fnmatch (const char *@var{pattern}, const char *@var{string}, int @var{flags}) +This function tests whether the string @var{string} matches the pattern +@var{pattern}. It returns @code{0} if they do match; otherwise, it +returns the nonzero value @code{FNM_NOMATCH}. The arguments +@var{pattern} and @var{string} are both strings. + +The argument @var{flags} is a combination of flag bits that alter the +details of matching. See below for a list of the defined flags. 
+ +In the GNU C Library, @code{fnmatch} cannot experience an ``error''---it +always returns an answer for whether the match succeeds. However, other +implementations of @code{fnmatch} might sometimes report ``errors''. +They would do so by returning nonzero values that are not equal to +@code{FNM_NOMATCH}. +@end deftypefun + +These are the available flags for the @var{flags} argument: + +@table @code +@comment fnmatch.h +@comment GNU +@item FNM_FILE_NAME +Treat the @samp{/} character specially, for matching file names. If +this flag is set, wildcard constructs in @var{pattern} cannot match +@samp{/} in @var{string}. Thus, the only way to match @samp{/} is with +an explicit @samp{/} in @var{pattern}. + +@comment fnmatch.h +@comment POSIX.2 +@item FNM_PATHNAME +This is an alias for @code{FNM_FILE_NAME}; it comes from POSIX.2. We +don't recommend this name because we don't use the term ``pathname'' for +file names. + +@comment fnmatch.h +@comment POSIX.2 +@item FNM_PERIOD +Treat the @samp{.} character specially if it appears at the beginning of +@var{string}. If this flag is set, wildcard constructs in @var{pattern} +cannot match @samp{.} as the first character of @var{string}. + +If you set both @code{FNM_PERIOD} and @code{FNM_FILE_NAME}, then the +special treatment applies to @samp{.} following @samp{/} as well as to +@samp{.} at the beginning of @var{string}. (The shell uses the +@code{FNM_PERIOD} and @code{FNM_FILE_NAME} flags together for matching +file names.) + +@comment fnmatch.h +@comment POSIX.2 +@item FNM_NOESCAPE +Don't treat the @samp{\} character specially in patterns. Normally, +@samp{\} quotes the following character, turning off its special meaning +(if any) so that it matches only itself. When quoting is enabled, the +pattern @samp{\?} matches only the string @samp{?}, because the question +mark in the pattern acts like an ordinary character. + +If you use @code{FNM_NOESCAPE}, then @samp{\} is an ordinary character. 
+ +@comment fnmatch.h +@comment GNU +@item FNM_LEADING_DIR +Ignore a trailing sequence of characters starting with a @samp{/} in +@var{string}; that is to say, test whether @var{string} starts with a +directory name that @var{pattern} matches. + +If this flag is set, either @samp{foo*} or @samp{foobar} as a pattern +would match the string @samp{foobar/frobozz}. + +@comment fnmatch.h +@comment GNU +@item FNM_CASEFOLD +Ignore case in comparing @var{string} to @var{pattern}. +@end table + +@node Globbing +@section Globbing + +@cindex globbing +The archetypal use of wildcards is for matching against the files in a +directory, and making a list of all the matches. This is called +@dfn{globbing}. + +You could do this using @code{fnmatch}, by reading the directory entries +one by one and testing each one with @code{fnmatch}. But that would be +slow (and complex, since you would have to handle subdirectories by +hand). + +The library provides a function @code{glob} to make this particular use +of wildcards convenient. @code{glob} and the other symbols in this +section are declared in @file{glob.h}. + +@menu +* Calling Glob:: Basic use of @code{glob}. +* Flags for Globbing:: Flags that enable various options in @code{glob}. +@end menu + +@node Calling Glob +@subsection Calling @code{glob} + +The result of globbing is a vector of file names (strings). To return +this vector, @code{glob} uses a special data type, @code{glob_t}, which +is a structure. You pass @code{glob} the address of the structure, and +it fills in the structure's fields to tell you about the results. + +@comment glob.h +@comment POSIX.2 +@deftp {Data Type} glob_t +This data type holds a pointer to a word vector. More precisely, it +records both the address of the word vector and its size. + +@table @code +@item gl_pathc +The number of elements in the vector. + +@item gl_pathv +The address of the vector. This field has type @w{@code{char **}}. 
+ +@item gl_offs +The offset of the first real element of the vector, from its nominal +address in the @code{gl_pathv} field. Unlike the other fields, this +is always an input to @code{glob}, rather than an output from it. + +If you use a nonzero offset, then that many elements at the beginning of +the vector are left empty. (The @code{glob} function fills them with +null pointers.) + +The @code{gl_offs} field is meaningful only if you use the +@code{GLOB_DOOFFS} flag. Otherwise, the offset is always zero +regardless of what is in this field, and the first real element comes at +the beginning of the vector. +@end table +@end deftp + +@comment glob.h +@comment POSIX.2 +@deftypefun int glob (const char *@var{pattern}, int @var{flags}, int (*@var{errfunc}) (const char *@var{filename}, int @var{error-code}), glob_t *@var{vector-ptr}) +The function @code{glob} does globbing using the pattern @var{pattern} +in the current directory. It puts the result in a newly allocated +vector, and stores the size and address of this vector into +@code{*@var{vector-ptr}}. The argument @var{flags} is a combination of +bit flags; see @ref{Flags for Globbing}, for details of the flags. + +The result of globbing is a sequence of file names. The function +@code{glob} allocates a string for each resulting word, then +allocates a vector of type @code{char **} to store the addresses of +these strings. The last element of the vector is a null pointer. +This vector is called the @dfn{word vector}. + +To return this vector, @code{glob} stores both its address and its +length (number of elements, not counting the terminating null pointer) +into @code{*@var{vector-ptr}}. + +Normally, @code{glob} sorts the file names alphabetically before +returning them. You can turn this off with the flag @code{GLOB_NOSORT} +if you want to get the information as fast as possible. 
Usually it's +a good idea to let @code{glob} sort them---if you process the files in +alphabetical order, the users will have a feel for the rate of progress +that your application is making. + +If @code{glob} succeeds, it returns 0. Otherwise, it returns one +of these error codes: + +@table @code +@comment glob.h +@comment POSIX.2 +@item GLOB_ABORTED +There was an error opening a directory, and you used the flag +@code{GLOB_ERR} or your specified @var{errfunc} returned a nonzero +value. +@iftex +See below +@end iftex +@ifinfo +@xref{Flags for Globbing}, +@end ifinfo +for an explanation of the @code{GLOB_ERR} flag and @var{errfunc}. + +@comment glob.h +@comment POSIX.2 +@item GLOB_NOMATCH +The pattern didn't match any existing files. If you use the +@code{GLOB_NOCHECK} flag, then you never get this error code, because +that flag tells @code{glob} to @emph{pretend} that the pattern matched +at least one file. + +@comment glob.h +@comment POSIX.2 +@item GLOB_NOSPACE +It was impossible to allocate memory to hold the result. +@end table + +In the event of an error, @code{glob} stores information in +@code{*@var{vector-ptr}} about all the matches it has found so far. +@end deftypefun + +@node Flags for Globbing +@subsection Flags for Globbing + +This section describes the flags that you can specify in the +@var{flags} argument to @code{glob}. Choose the flags you want, +and combine them with the C bitwise OR operator @code{|}. + +@table @code +@comment glob.h +@comment POSIX.2 +@item GLOB_APPEND +Append the words from this expansion to the vector of words produced by +previous calls to @code{glob}. This way you can effectively expand +several words as if they were concatenated with spaces between them. + +In order for appending to work, you must not modify the contents of the +word vector structure between calls to @code{glob}. And, if you set +@code{GLOB_DOOFFS} in the first call to @code{glob}, you must also +set it when you append to the results. 
+ +Note that the pointer stored in @code{gl_pathv} may no longer be valid +after you call @code{glob} the second time, because @code{glob} might +have relocated the vector. So always fetch @code{gl_pathv} from the +@code{glob_t} structure after each @code{glob} call; @strong{never} save +the pointer across calls. + +@comment glob.h +@comment POSIX.2 +@item GLOB_DOOFFS +Leave blank slots at the beginning of the vector of words. +The @code{gl_offs} field says how many slots to leave. +The blank slots contain null pointers. + +@comment glob.h +@comment POSIX.2 +@item GLOB_ERR +Give up right away and report an error if there is any difficulty +reading the directories that must be read in order to expand @var{pattern} +fully. Such difficulties might include a directory in which you don't +have the requisite access. Normally, @code{glob} tries its best to keep +on going despite any errors, reading whatever directories it can. + +You can exercise even more control than this by specifying an +error-handler function @var{errfunc} when you call @code{glob}. If +@var{errfunc} is not a null pointer, then @code{glob} doesn't give up +right away when it can't read a directory; instead, it calls +@var{errfunc} with two arguments, like this: + +@smallexample +(*@var{errfunc}) (@var{filename}, @var{error-code}) +@end smallexample + +@noindent +The argument @var{filename} is the name of the directory that +@code{glob} couldn't open or couldn't read, and @var{error-code} is the +@code{errno} value that was reported to @code{glob}. + +If the error handler function returns nonzero, then @code{glob} gives up +right away. Otherwise, it continues. + +@comment glob.h +@comment POSIX.2 +@item GLOB_MARK +If the pattern matches the name of a directory, append @samp{/} to the +directory's name when returning it. + +@comment glob.h +@comment POSIX.2 +@item GLOB_NOCHECK +If the pattern doesn't match any file names, return the pattern itself +as if it were a file name that had been matched. 
(Normally, when the +pattern doesn't match anything, @code{glob} returns that there were no +matches.) + +@comment glob.h +@comment POSIX.2 +@item GLOB_NOSORT +Don't sort the file names; return them in no particular order. +(In practice, the order will depend on the order of the entries in +the directory.) The only reason @emph{not} to sort is to save time. + +@comment glob.h +@comment POSIX.2 +@item GLOB_NOESCAPE +Don't treat the @samp{\} character specially in patterns. Normally, +@samp{\} quotes the following character, turning off its special meaning +(if any) so that it matches only itself. When quoting is enabled, the +pattern @samp{\?} matches only the string @samp{?}, because the question +mark in the pattern acts like an ordinary character. + +If you use @code{GLOB_NOESCAPE}, then @samp{\} is an ordinary character. + +@code{glob} does its work by calling the function @code{fnmatch} +repeatedly. It handles the flag @code{GLOB_NOESCAPE} by turning on the +@code{FNM_NOESCAPE} flag in calls to @code{fnmatch}. +@end table + +@node Regular Expressions +@section Regular Expression Matching + +The GNU C library supports two interfaces for matching regular +expressions. One is the standard POSIX.2 interface, and the other is +what the GNU system has had for many years. + +Both interfaces are declared in the header file @file{regex.h}. +If you define @w{@code{_POSIX_C_SOURCE}}, then only the POSIX.2 +functions, structures, and constants are declared. +@c !!! we only document the POSIX.2 interface here!! + +@menu +* POSIX Regexp Compilation:: Using @code{regcomp} to prepare to match. +* Flags for POSIX Regexps:: Syntax variations for @code{regcomp}. +* Matching POSIX Regexps:: Using @code{regexec} to match the compiled + pattern that you get from @code{regcomp}. +* Regexp Subexpressions:: Finding which parts of the string were matched. +* Subexpression Complications:: Find points of which parts were matched. +* Regexp Cleanup:: Freeing storage; reporting errors. 
+@end menu + +@node POSIX Regexp Compilation +@subsection POSIX Regular Expression Compilation + +Before you can actually match a regular expression, you must +@dfn{compile} it. This is not true compilation---it produces a special +data structure, not machine instructions. But it is like ordinary +compilation in that its purpose is to enable you to ``execute'' the +pattern fast. (@xref{Matching POSIX Regexps}, for how to use the +compiled regular expression for matching.) + +There is a special data type for compiled regular expressions: + +@comment regex.h +@comment POSIX.2 +@deftp {Data Type} regex_t +This type of object holds a compiled regular expression. +It is actually a structure. It has just one field that your programs +should look at: + +@table @code +@item re_nsub +This field holds the number of parenthetical subexpressions in the +regular expression that was compiled. +@end table + +There are several other fields, but we don't describe them here, because +only the functions in the library should use them. +@end deftp + +After you create a @code{regex_t} object, you can compile a regular +expression into it by calling @code{regcomp}. + +@comment regex.h +@comment POSIX.2 +@deftypefun int regcomp (regex_t *@var{compiled}, const char *@var{pattern}, int @var{cflags}) +The function @code{regcomp} ``compiles'' a regular expression into a +data structure that you can use with @code{regexec} to match against a +string. The compiled regular expression format is designed for +efficient matching. @code{regcomp} stores it into @code{*@var{compiled}}. + +It's up to you to allocate an object of type @code{regex_t} and pass its +address to @code{regcomp}. + +The argument @var{cflags} lets you specify various options that control +the syntax and semantics of regular expressions. @xref{Flags for POSIX +Regexps}. 
+ +If you use the flag @code{REG_NOSUB}, then @code{regcomp} omits from +the compiled regular expression the information necessary to record +how subexpressions actually match. In this case, you might as well +pass @code{0} for the @var{matchptr} and @var{nmatch} arguments when +you call @code{regexec}. + +If you don't use @code{REG_NOSUB}, then the compiled regular expression +does have the capacity to record how subexpressions match. Also, +@code{regcomp} tells you how many subexpressions @var{pattern} has, by +storing the number in @code{@var{compiled}->re_nsub}. You can use that +value to decide how long an array to allocate to hold information about +subexpression matches. + +@code{regcomp} returns @code{0} if it succeeds in compiling the regular +expression; otherwise, it returns a nonzero error code (see the table +below). You can use @code{regerror} to produce an error message string +describing the reason for a nonzero value; see @ref{Regexp Cleanup}. + +@end deftypefun + +Here are the possible nonzero values that @code{regcomp} can return: + +@table @code +@comment regex.h +@comment POSIX.2 +@item REG_BADBR +There was an invalid @samp{\@{@dots{}\@}} construct in the regular +expression. A valid @samp{\@{@dots{}\@}} construct must contain either +a single number, or two numbers in increasing order separated by a +comma. + +@comment regex.h +@comment POSIX.2 +@item REG_BADPAT +There was a syntax error in the regular expression. + +@comment regex.h +@comment POSIX.2 +@item REG_BADRPT +A repetition operator such as @samp{?} or @samp{*} appeared in a bad +position (with no preceding subexpression to act on). + +@comment regex.h +@comment POSIX.2 +@item REG_ECOLLATE +The regular expression referred to an invalid collating element (one not +defined in the current locale for string collation). @xref{Locale +Categories}. + +@comment regex.h +@comment POSIX.2 +@item REG_ECTYPE +The regular expression referred to an invalid character class name. 
+ +@comment regex.h +@comment POSIX.2 +@item REG_EESCAPE +The regular expression ended with @samp{\}. + +@comment regex.h +@comment POSIX.2 +@item REG_ESUBREG +There was an invalid number in the @samp{\@var{digit}} construct. + +@comment regex.h +@comment POSIX.2 +@item REG_EBRACK +There were unbalanced square brackets in the regular expression. + +@comment regex.h +@comment POSIX.2 +@item REG_EPAREN +An extended regular expression had unbalanced parentheses, +or a basic regular expression had unbalanced @samp{\(} and @samp{\)}. + +@comment regex.h +@comment POSIX.2 +@item REG_EBRACE +The regular expression had unbalanced @samp{\@{} and @samp{\@}}. + +@comment regex.h +@comment POSIX.2 +@item REG_ERANGE +One of the endpoints in a range expression was invalid. + +@comment regex.h +@comment POSIX.2 +@item REG_ESPACE +@code{regcomp} ran out of memory. +@end table + +@node Flags for POSIX Regexps +@subsection Flags for POSIX Regular Expressions + +These are the bit flags that you can use in the @var{cflags} argument when +compiling a regular expression with @code{regcomp}. + +@table @code +@comment regex.h +@comment POSIX.2 +@item REG_EXTENDED +Treat the pattern as an extended regular expression, rather than as a +basic regular expression. + +@comment regex.h +@comment POSIX.2 +@item REG_ICASE +Ignore case when matching letters. + +@comment regex.h +@comment POSIX.2 +@item REG_NOSUB +Don't bother storing the contents of the @var{matchptr} array that you +pass to @code{regexec}. + +@comment regex.h +@comment POSIX.2 +@item REG_NEWLINE +Treat a newline in @var{string} as dividing @var{string} into multiple +lines, so that @samp{$} can match before the newline and @samp{^} can +match after. Also, don't permit @samp{.} to match a newline, and don't +permit @samp{[^@dots{}]} to match a newline. + +Otherwise, newline acts like any other ordinary character. 
+@end table + +@node Matching POSIX Regexps +@subsection Matching a Compiled POSIX Regular Expression + +Once you have compiled a regular expression, as described in @ref{POSIX +Regexp Compilation}, you can match it against strings using +@code{regexec}. A match anywhere inside the string counts as success, +unless the regular expression contains anchor characters (@samp{^} or +@samp{$}). + +@comment regex.h +@comment POSIX.2 +@deftypefun int regexec (regex_t *@var{compiled}, char *@var{string}, size_t @var{nmatch}, regmatch_t @var{matchptr} @t{[]}, int @var{eflags}) +This function tries to match the compiled regular expression +@code{*@var{compiled}} against @var{string}. + +@code{regexec} returns @code{0} if the regular expression matches; +otherwise, it returns a nonzero value. See the table below for +what nonzero values mean. You can use @code{regerror} to produce an +error message string describing the reason for a nonzero value; +see @ref{Regexp Cleanup}. + +The argument @var{eflags} is a word of bit flags that enable various +options. + +If you want to get information about what part of @var{string} actually +matched the regular expression or its subexpressions, use the arguments +@var{matchptr} and @var{nmatch}. Otherwise, pass @code{0} for +@var{nmatch}, and @code{NULL} for @var{matchptr}. @xref{Regexp +Subexpressions}. +@end deftypefun + +You must match the regular expression with the same set of current +locales that were in effect when you compiled the regular expression. + +The function @code{regexec} accepts the following flags in the +@var{eflags} argument: + +@table @code +@comment regex.h +@comment POSIX.2 +@item REG_NOTBOL +Do not regard the beginning of the specified string as the beginning of +a line; more generally, don't make any assumptions about what text might +precede it. 
+ +@comment regex.h +@comment POSIX.2 +@item REG_NOTEOL +Do not regard the end of the specified string as the end of a line; more +generally, don't make any assumptions about what text might follow it. +@end table + +Here are the possible nonzero values that @code{regexec} can return: + +@table @code +@comment regex.h +@comment POSIX.2 +@item REG_NOMATCH +The pattern didn't match the string. This isn't really an error. + +@comment regex.h +@comment POSIX.2 +@item REG_ESPACE +@code{regexec} ran out of memory. +@end table + +@node Regexp Subexpressions +@subsection Match Results with Subexpressions + +When @code{regexec} matches parenthetical subexpressions of +@var{pattern}, it records which parts of @var{string} they match. It +returns that information by storing the offsets into an array whose +elements are structures of type @code{regmatch_t}. The first element of +the array (index @code{0}) records the part of the string that matched +the entire regular expression. Each other element of the array records +the beginning and end of the part that matched a single parenthetical +subexpression. + +@comment regex.h +@comment POSIX.2 +@deftp {Data Type} regmatch_t +This is the data type of the @var{matchptr} array that you pass to +@code{regexec}. It contains two structure fields, as follows: + +@table @code +@item rm_so +The offset in @var{string} of the beginning of a substring. Add this +value to @var{string} to get the address of that part. + +@item rm_eo +The offset in @var{string} of the end of the substring. +@end table +@end deftp + +@comment regex.h +@comment POSIX.2 +@deftp {Data Type} regoff_t +@code{regoff_t} is an alias for another signed integer type. +The fields of @code{regmatch_t} have type @code{regoff_t}. +@end deftp + +The @code{regmatch_t} elements correspond to subexpressions +positionally; the element at index @code{1} records where the first +subexpression matched, the element at index @code{2} records the second +subexpression, and so on.
The order of the subexpressions is the order +in which they begin. + +When you call @code{regexec}, you specify how long the @var{matchptr} +array is, with the @var{nmatch} argument. This tells @code{regexec} how +many elements to store. If the actual regular expression has more than +@var{nmatch} subexpressions, then you won't get offset information about +the rest of them. But this doesn't alter whether the pattern matches a +particular string or not. + +If you don't want @code{regexec} to return any information about where +the subexpressions matched, you can either supply @code{0} for +@var{nmatch}, or use the flag @code{REG_NOSUB} when you compile the +pattern with @code{regcomp}. + +@node Subexpression Complications +@subsection Complications in Subexpression Matching + +Sometimes a subexpression matches a substring of no characters. This +happens when @samp{f\(o*\)} matches the string @samp{fum}. (It really +matches just the @samp{f}.) In this case, both of the offsets identify +the point in the string where the null substring was found. In this +example, the offsets are both @code{1}. + +Sometimes the entire regular expression can match without using some of +its subexpressions at all---for example, when @samp{ba\(na\)*} matches the +string @samp{ba}, the parenthetical subexpression is not used. When +this happens, @code{regexec} stores @code{-1} in both fields of the +element for that subexpression. + +Sometimes matching the entire regular expression can match a particular +subexpression more than once---for example, when @samp{ba\(na\)*} +matches the string @samp{bananana}, the parenthetical subexpression +matches three times. When this happens, @code{regexec} usually stores +the offsets of the last part of the string that matched the +subexpression. In the case of @samp{bananana}, these offsets are +@code{6} and @code{8}. + +But the last match is not always the one that is chosen. 
It's more +accurate to say that the last @emph{opportunity} to match is the one +that takes precedence. What this means is that when one subexpression +appears within another, then the results reported for the inner +subexpression reflect whatever happened on the last match of the outer +subexpression. For an example, consider @samp{\(ba\(na\)*s \)*} matching +the string @samp{bananas bas }. The last time the inner expression +actually matches is near the end of the first word. But it is +@emph{considered} again in the second word, and fails to match there. +@code{regexec} reports nonuse of the ``na'' subexpression. + +Another place where this rule applies is when the regular expression +@w{@samp{\(ba\(na\)*s \|nefer\(ti\)* \)*}} matches @samp{bananas nefertiti}. +The ``na'' subexpression does match in the first word, but it doesn't +match in the second word because the other alternative is used there. +Once again, the second repetition of the outer subexpression overrides +the first, and within that second repetition, the ``na'' subexpression +is not used. So @code{regexec} reports nonuse of the ``na'' +subexpression. + +@node Regexp Cleanup +@subsection POSIX Regexp Matching Cleanup + +When you are finished using a compiled regular expression, you can +free the storage it uses by calling @code{regfree}. + +@comment regex.h +@comment POSIX.2 +@deftypefun void regfree (regex_t *@var{compiled}) +Calling @code{regfree} frees all the storage that @code{*@var{compiled}} +points to. This includes various internal fields of the @code{regex_t} +structure that aren't documented in this manual. + +@code{regfree} does not free the object @code{*@var{compiled}} itself. +@end deftypefun + +You should always free the space in a @code{regex_t} structure with +@code{regfree} before using the structure to compile another regular +expression. + +When @code{regcomp} or @code{regexec} reports an error, you can use +the function @code{regerror} to turn it into an error message string. 
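The whole cycle described in this section (compile with @code{regcomp}, match with @code{regexec}, report failures with @code{regerror}, release storage with @code{regfree}) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name @code{match_first_group} is hypothetical and not part of the library, the pattern is compiled as a basic regular expression (@var{cflags} of 0), and only the first parenthetical subexpression is reported.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the library: compile PATTERN as a
   basic regular expression, match it once against STRING, and store
   the offsets of subexpression 1 in *SO and *EO.  Returns 0 on a
   match, REG_NOMATCH if nothing matched, or another nonzero code if
   the pattern could not be compiled.  */
int
match_first_group (const char *pattern, const char *string,
                   regoff_t *so, regoff_t *eo)
{
  regex_t compiled;
  regmatch_t matches[2];  /* [0] = whole match, [1] = subexpression 1 */
  int status = regcomp (&compiled, pattern, 0);

  if (status != 0)
    {
      /* Ask regerror how much room the message needs, then fetch it.  */
      size_t length = regerror (status, &compiled, NULL, 0);
      char *buffer = malloc (length);
      if (buffer != NULL)
        {
          regerror (status, &compiled, buffer, length);
          fprintf (stderr, "regcomp: %s\n", buffer);
          free (buffer);
        }
      return status;
    }

  status = regexec (&compiled, string, 2, matches, 0);
  if (status == 0)
    {
      *so = matches[1].rm_so;
      *eo = matches[1].rm_eo;
    }

  /* Release the storage held inside COMPILED.  The object itself is
     an automatic variable, so there is nothing else to free.  */
  regfree (&compiled);
  return status;
}
```

The sketch uses plain @code{malloc} and silently skips the message if allocation fails; a real program would probably want a firmer policy there. When the subexpression takes no part in the match, both offsets come back as @code{-1}.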
+ +@comment regex.h +@comment POSIX.2 +@deftypefun size_t regerror (int @var{errcode}, regex_t *@var{compiled}, char *@var{buffer}, size_t @var{length}) +This function produces an error message string for the error code +@var{errcode}, and stores the string in @var{length} bytes of memory +starting at @var{buffer}. For the @var{compiled} argument, supply the +same compiled regular expression structure that @code{regcomp} or +@code{regexec} was working with when it got the error. Alternatively, +you can supply @code{NULL} for @var{compiled}; you will still get a +meaningful error message, but it might not be as detailed. + +If the error message can't fit in @var{length} bytes (including a +terminating null character), then @code{regerror} truncates it. +The string that @code{regerror} stores is always null-terminated +even if it has been truncated. + +The return value of @code{regerror} is the minimum length needed to +store the entire error message. If this is less than @var{length}, then +the error message was not truncated, and you can use it. Otherwise, you +should call @code{regerror} again with a larger buffer. + +Here is a function which uses @code{regerror}, but always dynamically +allocates a buffer for the error message: + +@smallexample +char *get_regerror (int errcode, regex_t *compiled) +@{ + size_t length = regerror (errcode, compiled, NULL, 0); + char *buffer = xmalloc (length); + (void) regerror (errcode, compiled, buffer, length); + return buffer; +@} +@end smallexample +@end deftypefun + +@c !!!! this is not actually in the library.... +@node Word Expansion +@section Shell-Style Word Expansion +@cindex word expansion +@cindex expansion of shell words + +@dfn{Word expansion} means the process of splitting a string into +@dfn{words} and substituting for variables, commands, and wildcards +just as the shell does. + +For example, when you write @samp{ls -l foo.c}, this string is split +into three separate words---@samp{ls}, @samp{-l} and @samp{foo.c}. 
+This is the most basic function of word expansion. + +When you write @samp{ls *.c}, this can become many words, because +the word @samp{*.c} can be replaced with any number of file names. +This is called @dfn{wildcard expansion}, and it is also a part of +word expansion. + +When you use @samp{echo $PATH} to print your path, you are taking +advantage of @dfn{variable substitution}, which is also part of word +expansion. + +Ordinary programs can perform word expansion just like the shell by +calling the library function @code{wordexp}. + +@menu +* Expansion Stages:: What word expansion does to a string. +* Calling Wordexp:: How to call @code{wordexp}. +* Flags for Wordexp:: Options you can enable in @code{wordexp}. +* Wordexp Example:: A sample program that does word expansion. +@end menu + +@node Expansion Stages +@subsection The Stages of Word Expansion + +When word expansion is applied to a sequence of words, it performs the +following transformations in the order shown here: + +@enumerate +@item +@cindex tilde expansion +@dfn{Tilde expansion}: Replacement of @samp{~foo} with the name of +the home directory of @samp{foo}. + +@item +Next, three different transformations are applied in the same step, +from left to right: + +@itemize @bullet +@item +@cindex variable substitution +@cindex substitution of variables and commands +@dfn{Variable substitution}: Environment variables are substituted for +references such as @samp{$foo}. + +@item +@cindex command substitution +@dfn{Command substitution}: Constructs such as @w{@samp{`cat foo`}} and +the equivalent @w{@samp{$(cat foo)}} are replaced with the output from +the inner command. + +@item +@cindex arithmetic expansion +@dfn{Arithmetic expansion}: Constructs such as @samp{$(($x-1))} are +replaced with the result of the arithmetic computation. +@end itemize + +@item +@cindex field splitting +@dfn{Field splitting}: subdivision of the text into @dfn{words}. 
+ +@item +@cindex wildcard expansion +@dfn{Wildcard expansion}: The replacement of a construct such as @samp{*.c} +with a list of @samp{.c} file names. Wildcard expansion applies to an +entire word at a time, and replaces that word with 0 or more file names +that are themselves words. + +@item +@cindex quote removal +@cindex removal of quotes +@dfn{Quote removal}: The deletion of string-quotes, now that they have +done their job by inhibiting the above transformations when appropriate. +@end enumerate + +For the details of these transformations, and how to write the constructs +that use them, see @w{@cite{The BASH Manual}} (to appear). + +@node Calling Wordexp +@subsection Calling @code{wordexp} + +All the functions, constants and data types for word expansion are +declared in the header file @file{wordexp.h}. + +Word expansion produces a vector of words (strings). To return this +vector, @code{wordexp} uses a special data type, @code{wordexp_t}, which +is a structure. You pass @code{wordexp} the address of the structure, +and it fills in the structure's fields to tell you about the results. + +@comment wordexp.h +@comment POSIX.2 +@deftp {Data Type} {wordexp_t} +This data type holds a pointer to a word vector. More precisely, it +records both the address of the word vector and its size. + +@table @code +@item we_wordc +The number of elements in the vector. + +@item we_wordv +The address of the vector. This field has type @w{@code{char **}}. + +@item we_offs +The offset of the first real element of the vector, from its nominal +address in the @code{we_wordv} field. Unlike the other fields, this +is always an input to @code{wordexp}, rather than an output from it. + +If you use a nonzero offset, then that many elements at the beginning of +the vector are left empty. (The @code{wordexp} function fills them with +null pointers.) + +The @code{we_offs} field is meaningful only if you use the +@code{WRDE_DOOFFS} flag. 
Otherwise, the offset is always zero +regardless of what is in this field, and the first real element comes at +the beginning of the vector. +@end table +@end deftp + +@comment wordexp.h +@comment POSIX.2 +@deftypefun int wordexp (const char *@var{words}, wordexp_t *@var{word-vector-ptr}, int @var{flags}) +Perform word expansion on the string @var{words}, putting the result in +a newly allocated vector, and store the size and address of this vector +into @code{*@var{word-vector-ptr}}. The argument @var{flags} is a +combination of bit flags; see @ref{Flags for Wordexp}, for details of +the flags. + +You shouldn't use any of the characters @samp{|&;<>} in the string +@var{words} unless they are quoted; likewise for newline. If you use +these characters unquoted, you will get the @code{WRDE_BADCHAR} error +code. Don't use parentheses or braces unless they are quoted or part of +a word expansion construct. If you use quotation characters @samp{'"`}, +they should come in pairs that balance. + +The results of word expansion are a sequence of words. The function +@code{wordexp} allocates a string for each resulting word, then +allocates a vector of type @code{char **} to store the addresses of +these strings. The last element of the vector is a null pointer. +This vector is called the @dfn{word vector}. + +To return this vector, @code{wordexp} stores both its address and its +length (number of elements, not counting the terminating null pointer) +into @code{*@var{word-vector-ptr}}. + +If @code{wordexp} succeeds, it returns 0. Otherwise, it returns one +of these error codes: + +@table @code +@comment wordexp.h +@comment POSIX.2 +@item WRDE_BADCHAR +The input string @var{words} contains an unquoted invalid character such +as @samp{|}. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_BADVAL +The input string refers to an undefined shell variable, and you used the flag +@code{WRDE_UNDEF} to forbid such references. 
+ +@comment wordexp.h +@comment POSIX.2 +@item WRDE_CMDSUB +The input string uses command substitution, and you used the flag +@code{WRDE_NOCMD} to forbid command substitution. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_NOSPACE +It was impossible to allocate memory to hold the result. In this case, +@code{wordexp} can store part of the results---as much as it could +allocate room for. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_SYNTAX +There was a syntax error in the input string. For example, an unmatched +quoting character is a syntax error. +@end table +@end deftypefun + +@comment wordexp.h +@comment POSIX.2 +@deftypefun void wordfree (wordexp_t *@var{word-vector-ptr}) +Free the storage used for the word-strings and vector that +@code{*@var{word-vector-ptr}} points to. This does not free the +structure @code{*@var{word-vector-ptr}} itself---only the other +data it points to. +@end deftypefun + +@node Flags for Wordexp +@subsection Flags for Word Expansion + +This section describes the flags that you can specify in the +@var{flags} argument to @code{wordexp}. Choose the flags you want, +and combine them with the C operator @code{|}. + +@table @code +@comment wordexp.h +@comment POSIX.2 +@item WRDE_APPEND +Append the words from this expansion to the vector of words produced by +previous calls to @code{wordexp}. This way you can effectively expand +several words as if they were concatenated with spaces between them. + +In order for appending to work, you must not modify the contents of the +word vector structure between calls to @code{wordexp}. And, if you set +@code{WRDE_DOOFFS} in the first call to @code{wordexp}, you must also +set it when you append to the results. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_DOOFFS +Leave blank slots at the beginning of the vector of words. +The @code{we_offs} field says how many slots to leave. +The blank slots contain null pointers. 
+ +@comment wordexp.h +@comment POSIX.2 +@item WRDE_NOCMD +Don't do command substitution; if the input requests command substitution, +report an error. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_REUSE +Reuse a word vector made by a previous call to @code{wordexp}. +Instead of allocating a new vector of words, this call to @code{wordexp} +will use the vector that already exists (making it larger if necessary). + +Note that the vector may move, so it is not safe to save an old pointer +and use it again after calling @code{wordexp}. You must fetch +@code{we_wordv} anew after each call. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_SHOWERR +Do show any error messages printed by commands run by command substitution. +More precisely, allow these commands to inherit the standard error output +stream of the current process. By default, @code{wordexp} gives these +commands a standard error stream that discards all output. + +@comment wordexp.h +@comment POSIX.2 +@item WRDE_UNDEF +If the input refers to a shell variable that is not defined, report an +error. +@end table + +@node Wordexp Example +@subsection @code{wordexp} Example + +Here is an example of using @code{wordexp} to expand several strings +and use the results to run a shell command. It also shows the use of +@code{WRDE_APPEND} to concatenate the expansions and of @code{wordfree} +to free the space allocated by @code{wordexp}. + +@smallexample +int +expand_and_execute (const char *program, const char **options) +@{ + wordexp_t result; + pid_t pid; + int status, i; + + /* @r{Expand the string for the program to run.} */ + switch (wordexp (program, &result, 0)) + @{ + case 0: /* @r{Successful}. 
*/ + break; + case WRDE_NOSPACE: + /* @r{If the error was @code{WRDE_NOSPACE},} + @r{then perhaps part of the result was allocated.} */ + wordfree (&result); + default: /* @r{Some other error.} */ + return -1; + @} + + /* @r{Expand the strings specified for the arguments.} */ + for (i = 0; options[i] != NULL; i++) + @{ + if (wordexp (options[i], &result, WRDE_APPEND)) + @{ + wordfree (&result); + return -1; + @} + @} + + pid = fork (); + if (pid == 0) + @{ + /* @r{This is the child process. Execute the command.} */ + execv (result.we_wordv[0], result.we_wordv); + exit (EXIT_FAILURE); + @} + else if (pid < 0) + /* @r{The fork failed. Report failure.} */ + status = -1; + else + /* @r{This is the parent process. Wait for the child to complete.} */ + if (waitpid (pid, &status, 0) != pid) + status = -1; + + wordfree (&result); + return status; +@} +@end smallexample + +In practice, since @code{wordexp} is executed by running a subshell, it +would be faster to do this by concatenating the strings with spaces +between them and running that as a shell command using @samp{sh -c}. + +@c No sense finishing this for here. +@ignore +@node Tilde Expansion +@subsection Details of Tilde Expansion + +It's a standard part of shell syntax that you can use @samp{~} at the +beginning of a file name to stand for your own home directory. You +can use @samp{~@var{user}} to stand for @var{user}'s home directory. + +@dfn{Tilde expansion} is the process of converting these abbreviations +to the directory names that they stand for. + +Tilde expansion applies to the @samp{~} plus all following characters up +to whitespace or a slash. It takes place only at the beginning of a +word, and only if none of the characters to be transformed is quoted in +any way. + +Plain @samp{~} uses the value of the environment variable @code{HOME} +as the proper home directory name. @samp{~} followed by a user name +uses @code{getpwnam} to look up that user in the user database, and +uses whatever directory is recorded there. 
Thus, @samp{~} followed +by your own name can give different results from plain @samp{~}, if +the value of @code{HOME} is not really your home directory. + +@node Variable Substitution +@subsection Details of Variable Substitution + +Part of ordinary shell syntax is the use of @samp{$@var{variable}} to +substitute the value of a shell variable into a command. This is called +@dfn{variable substitution}, and it is one part of doing word expansion. + +There are two basic ways you can write a variable reference for +substitution: + +@table @code +@item $@{@var{variable}@} +If you write braces around the variable name, then it is completely +unambiguous where the variable name ends. You can concatenate +additional letters onto the end of the variable value by writing them +immediately after the close brace. For example, @samp{$@{foo@}s} +expands into @samp{tractors}. + +@item $@var{variable} +If you do not put braces around the variable name, then the variable +name consists of all the alphanumeric characters and underscores that +follow the @samp{$}. The next punctuation character ends the variable +name. Thus, @samp{$foo-bar} refers to the variable @code{foo} and expands +into @samp{tractor-bar}. +@end table + +When you use braces, you can also use various constructs to modify the +value that is substituted, or test it in various ways. + +@table @code +@item $@{@var{variable}:-@var{default}@} +Substitute the value of @var{variable}, but if that is empty or +undefined, use @var{default} instead. + +@item $@{@var{variable}:=@var{default}@} +Substitute the value of @var{variable}, but if that is empty or +undefined, use @var{default} instead and set the variable to +@var{default}. + +@item $@{@var{variable}:?@var{message}@} +If @var{variable} is defined and not empty, substitute its value. + +Otherwise, print @var{message} as an error message on the standard error +stream, and consider word expansion a failure. + +@c ??? How does wordexp report such an error? 
+ +@item $@{@var{variable}:+@var{replacement}@} +Substitute @var{replacement}, but only if @var{variable} is defined and +nonempty. Otherwise, substitute nothing for this construct. +@end table + +@table @code +@item $@{#@var{variable}@} +Substitute a numeral which expresses in base ten the number of +characters in the value of @var{variable}. @samp{$@{#foo@}} stands for +@samp{7}, because @samp{tractor} is seven characters. +@end table + +These variants of variable substitution let you remove part of the +variable's value before substituting it. The @var{prefix} and +@var{suffix} are not mere strings; they are wildcard patterns, just +like the patterns that you use to match multiple file names. But +in this context, they match against parts of the variable value +rather than against file names. + +@table @code +@item $@{@var{variable}%%@var{suffix}@} +Substitute the value of @var{variable}, but first discard from that +variable any portion at the end that matches the pattern @var{suffix}. + +If there is more than one alternative for how to match against +@var{suffix}, this construct uses the longest possible match. + +Thus, @samp{$@{foo%%r*@}} substitutes @samp{t}, because the largest +match for @samp{r*} at the end of @samp{tractor} is @samp{ractor}. + +@item $@{@var{variable}%@var{suffix}@} +Substitute the value of @var{variable}, but first discard from that +variable any portion at the end that matches the pattern @var{suffix}. + +If there is more than one alternative for how to match against +@var{suffix}, this construct uses the shortest possible alternative. + +Thus, @samp{$@{foo%r*@}} substitutes @samp{tracto}, because the shortest +match for @samp{r*} at the end of @samp{tractor} is just @samp{r}. + +@item $@{@var{variable}##@var{prefix}@} +Substitute the value of @var{variable}, but first discard from that +variable any portion at the beginning that matches the pattern @var{prefix}. 
+ +If there is more than one alternative for how to match against +@var{prefix}, this construct uses the longest possible match. + +Thus, @samp{$@{foo##*t@}} substitutes @samp{or}, because the largest +match for @samp{*t} at the beginning of @samp{tractor} is @samp{tract}. + +@item $@{@var{variable}#@var{prefix}@} +Substitute the value of @var{variable}, but first discard from that +variable any portion at the beginning that matches the pattern @var{prefix}. + +If there is more than one alternative for how to match against +@var{prefix}, this construct uses the shortest possible alternative. + +Thus, @samp{$@{foo#*t@}} substitutes @samp{ractor}, because the shortest +match for @samp{*t} at the beginning of @samp{tractor} is just @samp{t}. + +@end ignore diff --git a/manual/pipe.texi b/manual/pipe.texi new file mode 100644 index 0000000000..773dc4aac8 --- /dev/null +++ b/manual/pipe.texi @@ -0,0 +1,208 @@ +@node Pipes and FIFOs, Sockets, File System Interface, Top +@chapter Pipes and FIFOs + +@cindex pipe +A @dfn{pipe} is a mechanism for interprocess communication; data written +to the pipe by one process can be read by another process. The data is +handled in a first-in, first-out (FIFO) order. The pipe has no name; it +is created for one use and both ends must be inherited from the single +process which created the pipe. + +@cindex FIFO special file +A @dfn{FIFO special file} is similar to a pipe, but instead of being an +anonymous, temporary connection, a FIFO has a name or names like any +other file. Processes open the FIFO by name in order to communicate +through it. + +A pipe or FIFO has to be open at both ends simultaneously. If you read +from a pipe or FIFO file that doesn't have any processes writing to it +(perhaps because they have all closed the file, or exited), the read +returns end-of-file. 
Writing to a pipe or FIFO that doesn't have a +reading process is treated as an error condition; it generates a +@code{SIGPIPE} signal, and fails with error code @code{EPIPE} if the +signal is handled or blocked. + +Neither pipes nor FIFO special files allow file positioning. Both +reading and writing operations happen sequentially, reading from the +beginning of the file and writing at the end. + +@menu +* Creating a Pipe:: Making a pipe with the @code{pipe} function. +* Pipe to a Subprocess:: Using a pipe to communicate with a + child process. +* FIFO Special Files:: Making a FIFO special file. +* Pipe Atomicity:: When pipe (or FIFO) I/O is atomic. +@end menu + +@node Creating a Pipe +@section Creating a Pipe +@cindex creating a pipe +@cindex opening a pipe +@cindex interprocess communication, with pipes + +The primitive for creating a pipe is the @code{pipe} function. This +creates both the reading and writing ends of the pipe. It is not very +useful for a single process to use a pipe to talk to itself. In typical +use, a process creates a pipe just before it forks one or more child +processes (@pxref{Creating a Process}). The pipe is then used for +communication either between the parent and child processes, or between +two sibling processes. + +The @code{pipe} function is declared in the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun int pipe (int @var{filedes}@t{[2]}) +The @code{pipe} function creates a pipe and puts the file descriptors +for the reading and writing ends of the pipe (respectively) into +@code{@var{filedes}[0]} and @code{@var{filedes}[1]}. + +An easy way to remember that the input end comes first is that file +descriptor @code{0} is standard input, and file descriptor @code{1} is +standard output. + +If successful, @code{pipe} returns a value of @code{0}. On failure, +@code{-1} is returned. 
The following @code{errno} error conditions are +defined for this function: + +@table @code +@item EMFILE +The process has too many files open. + +@item ENFILE +There are too many open files in the entire system. @xref{Error Codes}, +for more information about @code{ENFILE}. This error never occurs in +the GNU system. +@end table +@end deftypefun + +Here is an example of a simple program that creates a pipe. This program +uses the @code{fork} function (@pxref{Creating a Process}) to create +a child process. The parent process writes data to the pipe, which is +read by the child process. + +@smallexample +@include pipe.c.texi +@end smallexample + +@node Pipe to a Subprocess +@section Pipe to a Subprocess +@cindex creating a pipe to a subprocess +@cindex pipe to a subprocess +@cindex filtering i/o through subprocess + +A common use of pipes is to send data to or receive data from a program +being run as a subprocess. One way of doing this is by using a combination of +@code{pipe} (to create the pipe), @code{fork} (to create the subprocess), +@code{dup2} (to force the subprocess to use the pipe as its standard input +or output channel), and @code{exec} (to execute the new program). Or, +you can use @code{popen} and @code{pclose}. + +The advantage of using @code{popen} and @code{pclose} is that the +interface is much simpler and easier to use. But it doesn't offer as +much flexibility as using the low-level functions directly. + +@comment stdio.h +@comment POSIX.2, SVID, BSD +@deftypefun {FILE *} popen (const char *@var{command}, const char *@var{mode}) +The @code{popen} function is closely related to the @code{system} +function; see @ref{Running a Command}. It executes the shell command +@var{command} as a subprocess. However, instead of waiting for the +command to complete, it creates a pipe to the subprocess and returns a +stream that corresponds to that pipe. 
+ +If you specify a @var{mode} argument of @code{"r"}, you can read from the +stream to retrieve data from the standard output channel of the subprocess. +The subprocess inherits its standard input channel from the parent process. + +Similarly, if you specify a @var{mode} argument of @code{"w"}, you can +write to the stream to send data to the standard input channel of the +subprocess. The subprocess inherits its standard output channel from +the parent process. + +In the event of an error, @code{popen} returns a null pointer. This +might happen if the pipe or stream cannot be created, if the subprocess +cannot be forked, or if the program cannot be executed. +@end deftypefun + +@comment stdio.h +@comment POSIX.2, SVID, BSD +@deftypefun int pclose (FILE *@var{stream}) +The @code{pclose} function is used to close a stream created by @code{popen}. +It waits for the child process to terminate and returns its status value, +as for the @code{system} function. +@end deftypefun + +Here is an example showing how to use @code{popen} and @code{pclose} to +filter output through another program, in this case the paging program +@code{more}. + +@smallexample +@include popen.c.texi +@end smallexample + +@node FIFO Special Files +@section FIFO Special Files +@cindex creating a FIFO special file +@cindex interprocess communication, with FIFO + +A FIFO special file is similar to a pipe, except that it is created in a +different way. Instead of being an anonymous communications channel, a +FIFO special file is entered into the file system by calling +@code{mkfifo}. + +Once you have created a FIFO special file in this way, any process can +open it for reading or writing, in the same way as an ordinary file. +However, it has to be open at both ends simultaneously before you can +proceed to do any input or output operations on it. Opening a FIFO for +reading normally blocks until some other process opens the same FIFO for +writing, and vice versa. 
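The blocking behavior described above can be sketched as follows. This is a minimal illustration, not part of the library: the helper name @code{fifo_round_trip} and its parameters are hypothetical, and it assumes a POSIX system where the caller may create a file at the given path. A child process opens the FIFO for writing while the parent opens it for reading; each @code{open} waits for the other end.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: create a FIFO at PATH, let a child process write
   MSG into it, and read the data back in the parent.
   Returns the number of bytes read, or -1 on error.  */
ssize_t
fifo_round_trip (const char *path, const char *msg, char *buf, size_t size)
{
  pid_t pid;
  int fd, status;
  ssize_t total = 0, n;

  if (mkfifo (path, S_IRUSR | S_IWUSR) < 0)
    return -1;

  pid = fork ();
  if (pid < 0)
    {
      unlink (path);
      return -1;
    }
  if (pid == 0)
    {
      /* Child: this open blocks until the parent opens the read end.  */
      int wfd = open (path, O_WRONLY);
      if (wfd < 0 || write (wfd, msg, strlen (msg)) < 0)
        _exit (1);
      close (wfd);
      _exit (0);
    }

  /* Parent: this open blocks until the child opens the write end.  */
  fd = open (path, O_RDONLY);
  if (fd < 0)
    kill (pid, SIGKILL);        /* Don't leave the child blocked forever.  */
  else
    {
      /* Read until end-of-file, which arrives once the writer closes.  */
      while ((size_t) total < size
             && (n = read (fd, buf + total, size - total)) > 0)
        total += n;
      close (fd);
    }

  waitpid (pid, &status, 0);
  unlink (path);
  return fd < 0 ? -1 : total;
}
```

The FIFO is removed with @code{unlink} once both sides are done, since unlike an anonymous pipe it would otherwise stay in the file system.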
+ +The @code{mkfifo} function is declared in the header file +@file{sys/stat.h}. +@pindex sys/stat.h + +@comment sys/stat.h +@comment POSIX.1 +@deftypefun int mkfifo (const char *@var{filename}, mode_t @var{mode}) +The @code{mkfifo} function makes a FIFO special file with name +@var{filename}. The @var{mode} argument is used to set the file's +permissions; see @ref{Setting Permissions}. + +The normal, successful return value from @code{mkfifo} is @code{0}. In +the case of an error, @code{-1} is returned. In addition to the usual +file name errors (@pxref{File Name Errors}), the following +@code{errno} error conditions are defined for this function: + +@table @code +@item EEXIST +The named file already exists. + +@item ENOSPC +The directory or file system cannot be extended. + +@item EROFS +The directory that would contain the file resides on a read-only file +system. +@end table +@end deftypefun + +@node Pipe Atomicity +@section Atomicity of Pipe I/O + +Reading or writing pipe data is @dfn{atomic} if the size of data written +is not greater than @code{PIPE_BUF}. This means that the data transfer seems +to be an instantaneous unit, in that nothing else in the system can +observe a state in which it is partially complete. Atomic I/O may not +begin right away (it may need to wait for buffer space or for data), but +once it does begin, it finishes immediately. + +Reading or writing a larger amount of data may not be atomic; for +example, output data from other processes sharing the descriptor may be +interspersed. Also, once @code{PIPE_BUF} characters have been written, +further writes will block until some characters are read. + +@xref{Limits for Files}, for information about the @code{PIPE_BUF} +parameter. 
diff --git a/manual/process.texi b/manual/process.texi new file mode 100644 index 0000000000..2f5ba65af5 --- /dev/null +++ b/manual/process.texi @@ -0,0 +1,775 @@ +@node Processes +@chapter Processes + +@cindex process +@dfn{Processes} are the primitive units for allocation of system +resources. Each process has its own address space and (usually) one +thread of control. A process executes a program; you can have multiple +processes executing the same program, but each process has its own copy +of the program within its own address space and executes it +independently of the other copies. + +@cindex child process +@cindex parent process +Processes are organized hierarchically. Each process has a @dfn{parent +process} which explicitly arranged to create it. The processes created +by a given parent are called its @dfn{child processes}. A child +inherits many of its attributes from the parent process. + +This chapter describes how a program can create, terminate, and control +child processes. Actually, there are three distinct operations +involved: creating a new child process, causing the new process to +execute a program, and coordinating the completion of the child process +with the original program. + +The @code{system} function provides a simple, portable mechanism for +running another program; it does all three steps automatically. If you +need more control over the details of how this is done, you can use the +primitive functions to do each step individually instead. + +@menu +* Running a Command:: The easy way to run another program. +* Process Creation Concepts:: An overview of the hard way to do it. +* Process Identification:: How to get the process ID of a process. +* Creating a Process:: How to fork a child process. +* Executing a File:: How to make a process execute another program. +* Process Completion:: How to tell when a child process has completed. +* Process Completion Status:: How to interpret the status value + returned from a child process. 
+* BSD Wait Functions:: More functions, for backward compatibility. +* Process Creation Example:: A complete example program. +@end menu + + +@node Running a Command +@section Running a Command +@cindex running a command + +The easy way to run another program is to use the @code{system} +function. This function does all the work of running a subprogram, but +it doesn't give you much control over the details: you have to wait +until the subprogram terminates before you can do anything else. + +@comment stdlib.h +@comment ANSI +@deftypefun int system (const char *@var{command}) +@pindex sh +This function executes @var{command} as a shell command. In the GNU C +library, it always uses the default shell @code{sh} to run the command. +In particular, it searches the directories in @code{PATH} to find +programs to execute. The return value is @code{-1} if it wasn't +possible to create the shell process, and otherwise is the status of the +shell process. @xref{Process Completion}, for details on how this +status code can be interpreted. + +@pindex stdlib.h +The @code{system} function is declared in the header file +@file{stdlib.h}. +@end deftypefun + +@strong{Portability Note:} Some C implementations may not have any +notion of a command processor that can execute other programs. You can +determine whether a command processor exists by executing +@w{@code{system (NULL)}}; if the return value is nonzero, a command +processor is available. + +The @code{popen} and @code{pclose} functions (@pxref{Pipe to a +Subprocess}) are closely related to the @code{system} function. They +allow the parent process to communicate with the standard input and +output channels of the command being executed. + +@node Process Creation Concepts +@section Process Creation Concepts + +This section gives an overview of processes and of the steps involved in +creating a process and making it run another program. 
+ +@cindex process ID +@cindex process lifetime +Each process is named by a @dfn{process ID} number. A unique process ID +is allocated to each process when it is created. The @dfn{lifetime} of +a process ends when its termination is reported to its parent process; +at that time, all of the process resources, including its process ID, +are freed. + +@cindex creating a process +@cindex forking a process +@cindex child process +@cindex parent process +Processes are created with the @code{fork} system call (so the operation +of creating a new process is sometimes called @dfn{forking} a process). +The @dfn{child process} created by @code{fork} is a copy of the original +@dfn{parent process}, except that it has its own process ID. + +After forking a child process, both the parent and child processes +continue to execute normally. If you want your program to wait for a +child process to finish executing before continuing, you must do this +explicitly after the fork operation, by calling @code{wait} or +@code{waitpid} (@pxref{Process Completion}). These functions give you +limited information about why the child terminated---for example, its +exit status code. + +A newly forked child process continues to execute the same program as +its parent process, at the point where the @code{fork} call returns. +You can use the return value from @code{fork} to tell whether the program +is running in the parent process or the child. + +@cindex process image +Having several processes run the same program is only occasionally +useful. But the child can execute another program using one of the +@code{exec} functions; see @ref{Executing a File}. The program that the +process is executing is called its @dfn{process image}. Starting +execution of a new program causes the process to forget all about its +previous process image; when the new program exits, the process exits +too, instead of returning to the previous process image. 
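The three steps described in this section (forking, executing another program, and waiting for completion) can be sketched as below. This is only an illustration under stated assumptions: the helper name @code{run_and_wait} is hypothetical, @code{execvp} is chosen here so that the program is looked up in @code{PATH}, and the exit code 127 for a failed @code{exec} follows shell convention rather than anything this manual requires.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run FILE with arguments ARGV in a child process
   and wait for it.  Returns the child's exit status, or -1 if the child
   could not be created or did not exit normally.  */
int
run_and_wait (const char *file, char *const argv[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    /* The fork failed; no child exists.  */
    return -1;

  if (pid == 0)
    {
      /* Child: fork returned 0.  Replace the process image.  */
      execvp (file, argv);
      /* Reached only if the exec failed.  */
      _exit (127);
    }

  /* Parent: fork returned the child's process ID.  Wait for it.  */
  if (waitpid (pid, &status, 0) != pid)
    return -1;

  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

For instance, calling @code{run_and_wait ("sh", argv)} with @code{argv} set to @code{"sh"}, @code{"-c"}, @code{"exit 3"} and a terminating null pointer should return 3.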
+ +@node Process Identification +@section Process Identification + +The @code{pid_t} data type represents process IDs. You can get the +process ID of a process by calling @code{getpid}. The function +@code{getppid} returns the process ID of the parent of the current +process (this is also known as the @dfn{parent process ID}). Your +program should include the header files @file{unistd.h} and +@file{sys/types.h} to use these functions. +@pindex sys/types.h +@pindex unistd.h + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} pid_t +The @code{pid_t} data type is a signed integer type which is capable +of representing a process ID. In the GNU library, this is an @code{int}. +@end deftp + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t getpid (void) +The @code{getpid} function returns the process ID of the current process. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t getppid (void) +The @code{getppid} function returns the process ID of the parent of the +current process. +@end deftypefun + +@node Creating a Process +@section Creating a Process + +The @code{fork} function is the primitive for creating a process. +It is declared in the header file @file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun pid_t fork (void) +The @code{fork} function creates a new process. + +If the operation is successful, there are then both parent and child +processes and both see @code{fork} return, but with different values: it +returns a value of @code{0} in the child process and returns the child's +process ID in the parent process. + +If process creation failed, @code{fork} returns a value of @code{-1} in +the parent process. The following @code{errno} error conditions are +defined for @code{fork}: + +@table @code +@item EAGAIN +There aren't enough system resources to create another process, or the +user already has too many processes running. 
This means exceeding the +@code{RLIMIT_NPROC} resource limit, which can usually be increased; +@pxref{Limits on Resources}. + +@item ENOMEM +The process requires more space than the system can supply. +@end table +@end deftypefun + +The specific attributes of the child process that differ from the +parent process are: + +@itemize @bullet +@item +The child process has its own unique process ID. + +@item +The parent process ID of the child process is the process ID of its +parent process. + +@item +The child process gets its own copies of the parent process's open file +descriptors. Subsequently changing attributes of the file descriptors +in the parent process won't affect the file descriptors in the child, +and vice versa. @xref{Control Operations}. However, the file position +associated with each descriptor is shared by both processes; +@pxref{File Position}. + +@item +The elapsed processor times for the child process are set to zero; +see @ref{Processor Time}. + +@item +The child doesn't inherit file locks set by the parent process. +@c !!! flock locks shared +@xref{Control Operations}. + +@item +The child doesn't inherit alarms set by the parent process. +@xref{Setting an Alarm}. + +@item +The set of pending signals (@pxref{Delivery of Signal}) for the child +process is cleared. (The child process inherits its mask of blocked +signals and signal actions from the parent process.) +@end itemize + + +@comment unistd.h +@comment BSD +@deftypefun pid_t vfork (void) +The @code{vfork} function is similar to @code{fork} but on some systems it +is more efficient; however, there are restrictions you must follow to +use it safely. + +While @code{fork} makes a complete copy of the calling process's +address space and allows both the parent and child to execute +independently, @code{vfork} does not make this copy. Instead, the +child process created with @code{vfork} shares its parent's address +space until it calls @code{_exit} or one of the @code{exec} functions. 
In the +meantime, the parent process suspends execution. + +You must be very careful not to allow the child process created with +@code{vfork} to modify any global data or even local variables shared +with the parent. Furthermore, the child process cannot return from (or +do a long jump out of) the function that called @code{vfork}! This +would leave the parent process's control information very confused. If +in doubt, use @code{fork} instead. + +Some operating systems don't really implement @code{vfork}. The GNU C +library permits you to use @code{vfork} on all systems, but actually +executes @code{fork} if @code{vfork} isn't available. If you follow +the proper precautions for using @code{vfork}, your program will still +work even if the system uses @code{fork} instead. +@end deftypefun + +@node Executing a File +@section Executing a File +@cindex executing a file +@cindex @code{exec} functions + +This section describes the @code{exec} family of functions, for executing +a file as a process image. You can use these functions to make a child +process execute a new program after it has been forked. + +@pindex unistd.h +The functions in this family differ in how you specify the arguments, +but otherwise they all do the same thing. They are declared in the +header file @file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execv (const char *@var{filename}, char *const @var{argv}@t{[]}) +The @code{execv} function executes the file named by @var{filename} as a +new process image. + +The @var{argv} argument is an array of null-terminated strings that is +used to provide a value for the @code{argv} argument to the @code{main} +function of the program to be executed. The last element of this array +must be a null pointer. By convention, the first element of this array +is the file name of the program sans directory names. @xref{Program +Arguments}, for full details on how programs can access these arguments. 
+ +The environment for the new process image is taken from the +@code{environ} variable of the current process image; see +@ref{Environment Variables}, for information about environments. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execl (const char *@var{filename}, const char *@var{arg0}, @dots{}) +This is similar to @code{execv}, but the @var{argv} strings are +specified individually instead of as an array. A null pointer must be +passed as the last such argument. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execve (const char *@var{filename}, char *const @var{argv}@t{[]}, char *const @var{env}@t{[]}) +This is similar to @code{execv}, but permits you to specify the environment +for the new program explicitly as the @var{env} argument. This should +be an array of strings in the same format as for the @code{environ} +variable; see @ref{Environment Access}. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execle (const char *@var{filename}, const char *@var{arg0}, @dots{}, char *const @var{env}@t{[]}) +This is similar to @code{execl}, but permits you to specify the +environment for the new program explicitly. The environment argument is +passed following the null pointer that marks the last @var{argv} +argument, and should be an array of strings in the same format as for +the @code{environ} variable. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execvp (const char *@var{filename}, char *const @var{argv}@t{[]}) +The @code{execvp} function is similar to @code{execv}, except that it +searches the directories listed in the @code{PATH} environment variable +(@pxref{Standard Environment}) to find the full file name of a +file from @var{filename} if @var{filename} does not contain a slash. + +This function is useful for executing system utility programs, because +it looks for them in the places that the user has chosen. 
Shells use it +to run the commands that users type. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int execlp (const char *@var{filename}, const char *@var{arg0}, @dots{}) +This function is like @code{execl}, except that it performs the same +file name searching as the @code{execvp} function. +@end deftypefun + +The size of the argument list and environment list taken together must +not be greater than @code{ARG_MAX} bytes. @xref{General Limits}. In +the GNU system, the size (which compares against @code{ARG_MAX}) +includes, for each string, the number of characters in the string, plus +the size of a @code{char *}, plus one, rounded up to a multiple of the +size of a @code{char *}. Other systems may have somewhat different +rules for counting. + +These functions normally don't return, since execution of a new program +causes the currently executing program to go away completely. A value +of @code{-1} is returned in the event of a failure. In addition to the +usual file name errors (@pxref{File Name Errors}), the following +@code{errno} error conditions are defined for these functions: + +@table @code +@item E2BIG +The combined size of the new program's argument list and environment +list is larger than @code{ARG_MAX} bytes. The GNU system has no +specific limit on the argument list size, so this error code cannot +result, but you may get @code{ENOMEM} instead if the arguments are too +big for available memory. + +@item ENOEXEC +The specified file can't be executed because it isn't in the right format. + +@item ENOMEM +Executing the specified file requires more storage than is available. +@end table + +If execution of the new file succeeds, it updates the access time field +of the file as if the file had been read. @xref{File Times}, for more +details about access times of files. + +The point at which the file is closed again is not specified, but +is at some point before the process exits or before another process +image is executed. 
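The failure convention described above (the @code{exec} functions return only when they fail, with a value of @code{-1} and @code{errno} set) can be sketched as follows. This is a minimal illustration rather than part of the library: the helper name @code{run_program} and the choice of 127 as the "could not execute" status are assumptions made for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (looked up along PATH) and wait for it.
   Returns the child's exit status, 127 if the exec failed in the child,
   or -1 if fork or waitpid failed or the child did not exit normally.  */
int
run_program (char *const argv[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      execvp (argv[0], argv);
      /* Control reaches this point only if execvp failed;
         errno tells why (for example ENOENT or ENOEXEC).  */
      fprintf (stderr, "execvp %s: %s\n", argv[0], strerror (errno));
      _exit (127);
    }

  /* Parent: wait for this particular child.  */
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

The child calls @code{_exit} rather than returning, so that a failed @code{exec} does not leave two processes running the caller's code.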
+ +Executing a new process image completely changes the contents of memory, +copying only the argument and environment strings to new locations. But +many other attributes of the process are unchanged: + +@itemize @bullet +@item +The process ID and the parent process ID. @xref{Process Creation Concepts}. + +@item +Session and process group membership. @xref{Concepts of Job Control}. + +@item +Real user ID and group ID, and supplementary group IDs. @xref{Process +Persona}. + +@item +Pending alarms. @xref{Setting an Alarm}. + +@item +Current working directory and root directory. @xref{Working +Directory}. In the GNU system, the root directory is not copied when +executing a setuid program; instead the system default root directory +is used for the new program. + +@item +File mode creation mask. @xref{Setting Permissions}. + +@item +Process signal mask; see @ref{Process Signal Mask}. + +@item +Pending signals; see @ref{Blocking Signals}. + +@item +Elapsed processor time associated with the process; see @ref{Processor Time}. +@end itemize + +If the set-user-ID and set-group-ID mode bits of the process image file +are set, this affects the effective user ID and effective group ID +(respectively) of the process. These concepts are discussed in detail +in @ref{Process Persona}. + +Signals that are set to be ignored in the existing process image are +also set to be ignored in the new process image. All other signals are +set to the default action in the new process image. For more +information about signals, see @ref{Signal Handling}. + +File descriptors open in the existing process image remain open in the +new process image, unless they have the @code{FD_CLOEXEC} +(close-on-exec) flag set. The files that remain open inherit all +attributes of the open file description from the existing process image, +including file locks. File descriptors are discussed in @ref{Low-Level I/O}. 
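As a sketch of how a program might keep a descriptor from leaking into the new process image, the following hypothetical helper (the name @code{set_cloexec} is an assumption for this example) sets the @code{FD_CLOEXEC} flag with @code{fcntl} while preserving any other descriptor flags.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark FD close-on-exec.
   Returns 0 on success, or -1 on failure with errno set by fcntl.  */
int
set_cloexec (int fd)
{
  /* Read the current descriptor flags so that they are preserved.  */
  int flags = fcntl (fd, F_GETFD);
  if (flags < 0)
    return -1;
  /* Add the close-on-exec bit and store the flags back.  */
  return fcntl (fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

A descriptor marked this way is closed automatically when one of the @code{exec} functions succeeds; descriptors without the flag stay open in the new program.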
+ +Streams, by contrast, cannot survive through @code{exec} functions, +because they are located in the memory of the process itself. The new +process image has no streams except those it creates afresh. Each of +the streams in the pre-@code{exec} process image has a descriptor inside +it, and these descriptors do survive through @code{exec} (provided that +they do not have @code{FD_CLOEXEC} set). The new process image can +reconnect these to new streams using @code{fdopen} (@pxref{Descriptors +and Streams}). + +@node Process Completion +@section Process Completion +@cindex process completion +@cindex waiting for completion of child process +@cindex testing exit status of child process + +The functions described in this section are used to wait for a child +process to terminate or stop, and determine its status. These functions +are declared in the header file @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment POSIX.1 +@deftypefun pid_t waitpid (pid_t @var{pid}, int *@var{status-ptr}, int @var{options}) +The @code{waitpid} function is used to request status information from a +child process whose process ID is @var{pid}. Normally, the calling +process is suspended until the child process makes status information +available by terminating. + +Other values for the @var{pid} argument have special interpretations. A +value of @code{-1} or @code{WAIT_ANY} requests status information for +any child process; a value of @code{0} or @code{WAIT_MYPGRP} requests +information for any child process in the same process group as the +calling process; and any other negative value @minus{} @var{pgid} +requests information for any child process whose process group ID is +@var{pgid}. + +If status information for a child process is available immediately, this +function returns immediately without waiting. If more than one eligible +child process has status information available, one of them is chosen +randomly, and its status is returned immediately. 
To get the status +from the other eligible child processes, you need to call @code{waitpid} +again. + +The @var{options} argument is a bit mask. Its value should be the +bitwise OR (that is, the @samp{|} operator) of zero or more of the +@code{WNOHANG} and @code{WUNTRACED} flags. You can use the +@code{WNOHANG} flag to indicate that the parent process shouldn't wait; +and the @code{WUNTRACED} flag to request status information from stopped +processes as well as processes that have terminated. + +The status information from the child process is stored in the object +that @var{status-ptr} points to, unless @var{status-ptr} is a null pointer. + +The return value is normally the process ID of the child process whose +status is reported. If the @code{WNOHANG} option was specified and no +child process is waiting to be noticed, the value is zero. A value of +@code{-1} is returned in case of error. The following @code{errno} +error conditions are defined for this function: + +@table @code +@item EINTR +The function was interrupted by delivery of a signal to the calling +process. @xref{Interrupted Primitives}. + +@item ECHILD +There are no child processes to wait for, or the specified @var{pid} +is not a child of the calling process. + +@item EINVAL +An invalid value was provided for the @var{options} argument. +@end table +@end deftypefun + +These symbolic constants are defined as values for the @var{pid} argument +to the @code{waitpid} function. + +@comment Extra blank lines make it look better. +@table @code +@item WAIT_ANY + +This constant macro (whose value is @code{-1}) specifies that +@code{waitpid} should return status information about any child process. + + +@item WAIT_MYPGRP +This constant (with value @code{0}) specifies that @code{waitpid} should +return status information about any child process in the same process +group as the calling process. 
+@end table + +These symbolic constants are defined as flags for the @var{options} +argument to the @code{waitpid} function. You can bitwise-OR the flags +together to obtain a value to use as the argument. + +@table @code +@item WNOHANG + +This flag specifies that @code{waitpid} should return immediately +instead of waiting, if there is no child process ready to be noticed. + +@item WUNTRACED + +This flag specifies that @code{waitpid} should report the status of any +child processes that have been stopped as well as those that have +terminated. +@end table + +@comment sys/wait.h +@comment POSIX.1 +@deftypefun pid_t wait (int *@var{status-ptr}) +This is a simplified version of @code{waitpid}, and is used to wait +until any one child process terminates. The call: + +@smallexample +wait (&status) +@end smallexample + +@noindent +is exactly equivalent to: + +@smallexample +waitpid (-1, &status, 0) +@end smallexample +@end deftypefun + +@comment sys/wait.h +@comment BSD +@deftypefun pid_t wait4 (pid_t @var{pid}, int *@var{status-ptr}, int @var{options}, struct rusage *@var{usage}) +If @var{usage} is a null pointer, @code{wait4} is equivalent to +@code{waitpid (@var{pid}, @var{status-ptr}, @var{options})}. + +If @var{usage} is not null, @code{wait4} stores usage figures for the +child process in @code{*@var{usage}} (but only if the child has +terminated, not if it has stopped). @xref{Resource Usage}. + +This function is a BSD extension. +@end deftypefun + +Here's an example of how to use @code{waitpid} to get the status from +all child processes that have terminated, without ever waiting. This +function is designed to be a handler for @code{SIGCHLD}, the signal that +indicates that at least one child process has terminated.
+ +@smallexample +@group +void +sigchld_handler (int signum) +@{ + int pid; + int status; + while (1) + @{ + pid = waitpid (WAIT_ANY, &status, WNOHANG); + if (pid < 0) + @{ + perror ("waitpid"); + break; + @} + if (pid == 0) + break; + notice_termination (pid, status); + @} +@} +@end group +@end smallexample + +@node Process Completion Status +@section Process Completion Status + +If the exit status value (@pxref{Program Termination}) of the child +process is zero, then the status value reported by @code{waitpid} or +@code{wait} is also zero. You can test for other kinds of information +encoded in the returned status value using the following macros. +These macros are defined in the header file @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFEXITED (int @var{status}) +This macro returns a nonzero value if the child process terminated +normally with @code{exit} or @code{_exit}. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WEXITSTATUS (int @var{status}) +If @code{WIFEXITED} is true of @var{status}, this macro returns the +low-order 8 bits of the exit status value from the child process. +@xref{Exit Status}. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFSIGNALED (int @var{status}) +This macro returns a nonzero value if the child process terminated +because it received a signal that was not handled. +@xref{Signal Handling}. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WTERMSIG (int @var{status}) +If @code{WIFSIGNALED} is true of @var{status}, this macro returns the +signal number of the signal that terminated the child process. +@end deftypefn + +@comment sys/wait.h +@comment BSD +@deftypefn Macro int WCOREDUMP (int @var{status}) +This macro returns a nonzero value if the child process terminated +and produced a core dump. 
+@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WIFSTOPPED (int @var{status}) +This macro returns a nonzero value if the child process is stopped. +@end deftypefn + +@comment sys/wait.h +@comment POSIX.1 +@deftypefn Macro int WSTOPSIG (int @var{status}) +If @code{WIFSTOPPED} is true of @var{status}, this macro returns the +signal number of the signal that caused the child process to stop. +@end deftypefn + + +@node BSD Wait Functions +@section BSD Process Wait Functions + +The GNU library also provides these related facilities for compatibility +with BSD Unix. BSD uses the @code{union wait} data type to represent +status values rather than an @code{int}. The two representations are +actually interchangeable; they describe the same bit patterns. The GNU +C Library defines macros such as @code{WEXITSTATUS} so that they will +work on either kind of object, and the @code{wait} function is defined +to accept either type of pointer as its @var{status-ptr} argument. + +These functions are declared in @file{sys/wait.h}. +@pindex sys/wait.h + +@comment sys/wait.h +@comment BSD +@deftp {Data Type} {union wait} +This data type represents program termination status values. It has +the following members: + +@table @code +@item int w_termsig +The value of this member is the same as the result of the +@code{WTERMSIG} macro. + +@item int w_coredump +The value of this member is the same as the result of the +@code{WCOREDUMP} macro. + +@item int w_retcode +The value of this member is the same as the result of the +@code{WEXITSTATUS} macro. + +@item int w_stopsig +The value of this member is the same as the result of the +@code{WSTOPSIG} macro. +@end table + +Instead of accessing these members directly, you should use the +equivalent macros. +@end deftp + +The @code{wait3} function is the predecessor to @code{wait4}, which is +more flexible. @code{wait3} is now obsolete. 
+ +@comment sys/wait.h +@comment BSD +@deftypefun pid_t wait3 (union wait *@var{status-ptr}, int @var{options}, struct rusage *@var{usage}) +If @var{usage} is a null pointer, @code{wait3} is equivalent to +@code{waitpid (-1, @var{status-ptr}, @var{options})}. + +If @var{usage} is not null, @code{wait3} stores usage figures for the +child process in @code{*@var{usage}} (but only if the child has +terminated, not if it has stopped). @xref{Resource Usage}. +@end deftypefun + +@node Process Creation Example +@section Process Creation Example + +Here is an example program showing how you might write a function +similar to the built-in @code{system}. It executes its @var{command} +argument using the equivalent of @samp{sh -c @var{command}}. + +@smallexample +#include <stddef.h> +#include <stdlib.h> +#include <unistd.h> +#include <sys/types.h> +#include <sys/wait.h> + +/* @r{Execute the command using this shell program.} */ +#define SHELL "/bin/sh" + +@group +int +my_system (const char *command) +@{ + int status; + pid_t pid; +@end group + + pid = fork (); + if (pid == 0) + @{ + /* @r{This is the child process. Execute the shell command.} */ + execl (SHELL, SHELL, "-c", command, NULL); + _exit (EXIT_FAILURE); + @} + else if (pid < 0) + /* @r{The fork failed. Report failure.} */ + status = -1; + else + /* @r{This is the parent process. Wait for the child to complete.} */ + if (waitpid (pid, &status, 0) != pid) + status = -1; + return status; +@} +@end smallexample + +@comment Yes, this example has been tested. + +There are a couple of things you should pay attention to in this +example. + +Remember that the first @code{argv} argument supplied to the program +represents the name of the program being executed. That is why, in the +call to @code{execl}, @code{SHELL} is supplied once to name the program +to execute and a second time to supply a value for @code{argv[0]}. + +The @code{execl} call in the child process doesn't return if it is +successful.
If it fails, you must do something to make the child +process terminate. Just returning a bad status code with @code{return} +would leave two processes running the original program. Instead, the +right behavior is for the child process to report failure to its parent +process. + +Call @code{_exit} to accomplish this. The reason for using @code{_exit} +instead of @code{exit} is to avoid flushing fully buffered streams such +as @code{stdout}. The buffers of these streams probably contain data +that was copied from the parent process by the @code{fork}, data that +will be output eventually by the parent process. Calling @code{exit} in +the child would output the data twice. @xref{Termination Internals}. diff --git a/manual/search.texi b/manual/search.texi new file mode 100644 index 0000000000..d914135297 --- /dev/null +++ b/manual/search.texi @@ -0,0 +1,195 @@ +@node Searching and Sorting, Pattern Matching, Locales, Top +@chapter Searching and Sorting + +This chapter describes functions for searching and sorting arrays of +arbitrary objects. You pass the appropriate comparison function to be +applied as an argument, along with the size of the objects in the array +and the total number of elements. + +@menu +* Comparison Functions:: Defining how to compare two objects. + Since the sort and search facilities + are general, you have to specify the + ordering. +* Array Search Function:: The @code{bsearch} function. +* Array Sort Function:: The @code{qsort} function. +* Search/Sort Example:: An example program. +@end menu + +@node Comparison Functions, Array Search Function, , Searching and Sorting +@section Defining the Comparison Function +@cindex Comparison Function + +In order to use the sorted array library functions, you have to describe +how to compare the elements of the array. + +To do this, you supply a comparison function to compare two elements of +the array. 
The library will call this function, passing as arguments +pointers to two array elements to be compared. Your comparison function +should return a value the way @code{strcmp} (@pxref{String/Array +Comparison}) does: negative if the first argument is ``less'' than the +second, zero if they are ``equal'', and positive if the first argument +is ``greater''. + +Here is an example of a comparison function which works with an array of +numbers of type @code{double}: + +@smallexample +int +compare_doubles (const void *a, const void *b) +@{ + const double *da = (const double *) a; + const double *db = (const double *) b; + + return (*da > *db) - (*da < *db); +@} +@end smallexample + +The header file @file{stdlib.h} defines a name for the data type of +comparison functions. This type is a GNU extension. + +@comment stdlib.h +@comment GNU +@tindex comparison_fn_t +@smallexample +int comparison_fn_t (const void *, const void *); +@end smallexample + +@node Array Search Function, Array Sort Function, Comparison Functions, Searching and Sorting +@section Array Search Function +@cindex search function (for arrays) +@cindex binary search function (for arrays) +@cindex array search function + +To search a sorted array for an element matching the key, use the +@code{bsearch} function. The prototype for this function is in +the header file @file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun {void *} bsearch (const void *@var{key}, const void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) +The @code{bsearch} function searches the sorted array @var{array} for an object +that is equivalent to @var{key}. The array contains @var{count} elements, +each of which is of size @var{size} bytes. + +The @var{compare} function is used to perform the comparison. This +function is called with two pointer arguments and should return an +integer less than, equal to, or greater than zero corresponding to +whether its first argument is considered less than, equal to, or greater +than its second argument.
The elements of the @var{array} must already +be sorted in ascending order according to this comparison function. + +The return value is a pointer to the matching array element, or a null +pointer if no match is found. If the array contains more than one element +that matches, the one that is returned is unspecified. + +This function derives its name from the fact that it is implemented +using the binary search algorithm. +@end deftypefun + +@node Array Sort Function, Search/Sort Example, Array Search Function, Searching and Sorting +@section Array Sort Function +@cindex sort function (for arrays) +@cindex quick sort function (for arrays) +@cindex array sort function + +To sort an array using an arbitrary comparison function, use the +@code{qsort} function. The prototype for this function is in +@file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun void qsort (void *@var{array}, size_t @var{count}, size_t @var{size}, comparison_fn_t @var{compare}) +The @code{qsort} function sorts the array @var{array}. The array contains +@var{count} elements, each of which is of size @var{size}. + +The @var{compare} function is used to perform the comparison on the +array elements. This function is called with two pointer arguments and +should return an integer less than, equal to, or greater than zero +corresponding to whether its first argument is considered less than, +equal to, or greater than its second argument. + +@cindex stable sorting +@strong{Warning:} If two objects compare as equal, their order after +sorting is unpredictable. That is to say, the sorting is not stable. +This can make a difference when the comparison considers only part of +the elements. Two elements with the same sort key may differ in other +respects. + +If you want the effect of a stable sort, you can get this result by +writing the comparison function so that, lacking other reason to +distinguish between two elements, it compares them by their addresses.
+Note that doing this may make the sorting algorithm less efficient, so +do it only if necessary. + +Here is a simple example of sorting an array of doubles in numerical +order, using the comparison function defined above (@pxref{Comparison +Functions}): + +@smallexample +@{ + double *array; + int size; + @dots{} + qsort (array, size, sizeof (double), compare_doubles); +@} +@end smallexample + +The @code{qsort} function derives its name from the fact that it was +originally implemented using the ``quick sort'' algorithm. +@end deftypefun + +@node Search/Sort Example, , Array Sort Function, Searching and Sorting +@section Searching and Sorting Example + +Here is an example showing the use of @code{qsort} and @code{bsearch} +with an array of structures. The objects in the array are sorted +by comparing their @code{name} fields with the @code{strcmp} function. +Then, we can look up individual objects based on their names. + +@comment This example is dedicated to the memory of Jim Henson. RIP. +@smallexample +@include search.c.texi +@end smallexample + +@cindex Kermit the frog +The output from this program looks like: + +@smallexample +Kermit, the frog +Piggy, the pig +Gonzo, the whatever +Fozzie, the bear +Sam, the eagle +Robin, the frog +Animal, the animal +Camilla, the chicken +Sweetums, the monster +Dr. Strangepork, the pig +Link Hogthrob, the pig +Zoot, the human +Dr. Bunsen Honeydew, the human +Beaker, the human +Swedish Chef, the human + +Animal, the animal +Beaker, the human +Camilla, the chicken +Dr. Bunsen Honeydew, the human +Dr. Strangepork, the pig +Fozzie, the bear +Gonzo, the whatever +Kermit, the frog +Link Hogthrob, the pig +Piggy, the pig +Robin, the frog +Sam, the eagle +Swedish Chef, the human +Sweetums, the monster +Zoot, the human + +Kermit, the frog +Gonzo, the whatever +Couldn't find Janice. 
+@end smallexample + + diff --git a/manual/setjmp.texi b/manual/setjmp.texi new file mode 100644 index 0000000000..dfdac1c4cd --- /dev/null +++ b/manual/setjmp.texi @@ -0,0 +1,213 @@ +@node Non-Local Exits, Signal Handling, Date and Time, Top +@chapter Non-Local Exits +@cindex non-local exits +@cindex long jumps + +Sometimes when your program detects an unusual situation inside a deeply +nested set of function calls, you would like to be able to immediately +return to an outer level of control. This section describes how you can +do such @dfn{non-local exits} using the @code{setjmp} and @code{longjmp} +functions. + +@menu +* Intro: Non-Local Intro. When and how to use these facilities. +* Details: Non-Local Details. Functions for nonlocal exits. +* Non-Local Exits and Signals:: Portability issues. +@end menu + +@node Non-Local Intro, Non-Local Details, , Non-Local Exits +@section Introduction to Non-Local Exits + +As an example of a situation where a non-local exit can be useful, +suppose you have an interactive program that has a ``main loop'' that +prompts for and executes commands. Suppose the ``read'' command reads +input from a file, doing some lexical analysis and parsing of the input +while processing it. If a low-level input error is detected, it would +be useful to be able to return immediately to the ``main loop'' instead +of having to make each of the lexical analysis, parsing, and processing +phases all have to explicitly deal with error situations initially +detected by nested calls. + +(On the other hand, if each of these phases has to do a substantial +amount of cleanup when it exits---such as closing files, deallocating +buffers or other data structures, and the like---then it can be more +appropriate to do a normal return and have each phase do its own +cleanup, because a non-local exit would bypass the intervening phases and +their associated cleanup code entirely. 
Alternatively, you could use a +non-local exit but do the cleanup explicitly either before or after +returning to the ``main loop''.) + +In some ways, a non-local exit is similar to using the @samp{return} +statement to return from a function. But while @samp{return} abandons +only a single function call, transferring control back to the point at +which it was called, a non-local exit can potentially abandon many +levels of nested function calls. + +You identify return points for non-local exits by calling the function +@code{setjmp}. This function saves information about the execution +environment in which the call to @code{setjmp} appears in an object of +type @code{jmp_buf}. Execution of the program continues normally after +the call to @code{setjmp}, but if an exit is later made to this return +point by calling @code{longjmp} with the corresponding @w{@code{jmp_buf}} +object, control is transferred back to the point where @code{setjmp} was +called. The return value from @code{setjmp} is used to distinguish +between an ordinary return and a return made by a call to +@code{longjmp}, so calls to @code{setjmp} usually appear in an @samp{if} +statement. + +Here is how the example program described above might be set up: + +@smallexample +@include setjmp.c.texi +@end smallexample + +The function @code{abort_to_main_loop} causes an immediate transfer of +control back to the main loop of the program, no matter where it is +called from. + +The flow of control inside the @code{main} function may appear a little +mysterious at first, but it is actually a common idiom with +@code{setjmp}. A normal call to @code{setjmp} returns zero, so the +``else'' clause of the conditional is executed. If +@code{abort_to_main_loop} is called somewhere within the execution of +@code{do_command}, then it actually appears as if the @emph{same} call +to @code{setjmp} in @code{main} were returning a second time with a value +of @code{-1}.
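The control flow just described can be reduced to a small self-contained sketch. This is not the contents of the included example file; it only borrows the names @code{abort_to_main_loop} and @code{do_command} from the discussion above, while @code{main_loop}, @code{run_command}, and the rule that a negative argument counts as an error are assumptions made for illustration.

```c
#include <setjmp.h>

/* Return point established by run_command (hypothetical name).  */
static jmp_buf main_loop;

/* Transfer control back to the return point, wherever we are.  */
static void
abort_to_main_loop (int status)
{
  longjmp (main_loop, status);
}

/* Stand-in for real command processing: a negative argument is
   treated as a low-level error detected in a nested call.  */
static void
do_command (int n)
{
  if (n < 0)
    abort_to_main_loop (-1);
}

/* One iteration of a main loop.  Returns 0 if the command ran to
   completion, or -1 if it was abandoned through longjmp.  */
int
run_command (int n)
{
  if (setjmp (main_loop))
    /* We get here only by way of longjmp.  */
    return -1;
  else
    {
      do_command (n);
      return 0;
    }
}
```

Calling @code{run_command (1)} takes the ordinary path and yields 0, while @code{run_command (-1)} makes the same @code{setjmp} call appear to return a second time and yields -1.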
+ +@need 250 +So, the general pattern for using @code{setjmp} looks something like: + +@smallexample +if (setjmp (@var{buffer})) + /* @r{Code to clean up after premature return.} */ + @dots{} +else + /* @r{Code to be executed normally after setting up the return point.} */ + @dots{} +@end smallexample + +@node Non-Local Details, Non-Local Exits and Signals, Non-Local Intro, Non-Local Exits +@section Details of Non-Local Exits + +Here are the details on the functions and data structures used for +performing non-local exits. These facilities are declared in +@file{setjmp.h}. +@pindex setjmp.h + +@comment setjmp.h +@comment ANSI +@deftp {Data Type} jmp_buf +Objects of type @code{jmp_buf} hold the state information to +be restored by a non-local exit. The contents of a @code{jmp_buf} +identify a specific place to return to. +@end deftp + +@comment setjmp.h +@comment ANSI +@deftypefn Macro int setjmp (jmp_buf @var{state}) +When called normally, @code{setjmp} stores information about the +execution state of the program in @var{state} and returns zero. If +@code{longjmp} is later used to perform a non-local exit to this +@var{state}, @code{setjmp} returns a nonzero value. +@end deftypefn + +@comment setjmp.h +@comment ANSI +@deftypefun void longjmp (jmp_buf @var{state}, int @var{value}) +This function restores current execution to the state saved in +@var{state}, and continues execution from the call to @code{setjmp} that +established that return point. Returning from @code{setjmp} by means of +@code{longjmp} returns the @var{value} argument that was passed to +@code{longjmp}, rather than @code{0}. (But if @var{value} is given as +@code{0}, @code{setjmp} returns @code{1}).@refill +@end deftypefun + +There are a lot of obscure but important restrictions on the use of +@code{setjmp} and @code{longjmp}. 
Most of these restrictions are +present because non-local exits require a fair amount of magic on the +part of the C compiler and can interact with other parts of the language +in strange ways. + +The @code{setjmp} function is actually a macro without an actual +function definition, so you shouldn't try to @samp{#undef} it or take +its address. In addition, calls to @code{setjmp} are safe in only the +following contexts: + +@itemize @bullet +@item +As the test expression of a selection or iteration +statement (such as @samp{if}, @samp{switch}, or @samp{while}). + +@item +As one operand of an equality or comparison operator that appears as the +test expression of a selection or iteration statement. The other +operand must be an integer constant expression. + +@item +As the operand of a unary @samp{!} operator that appears as the +test expression of a selection or iteration statement. + +@item +By itself as an expression statement. +@end itemize + +Return points are valid only during the dynamic extent of the function +that called @code{setjmp} to establish them. If you @code{longjmp} to +a return point that was established in a function that has already +returned, unpredictable and disastrous things are likely to happen. + +You should use a nonzero @var{value} argument to @code{longjmp}. While +@code{longjmp} refuses to pass back a zero argument as the return value +from @code{setjmp}, this is intended as a safety net against accidental +misuse and is not really good programming style. + +When you perform a non-local exit, accessible objects generally retain +whatever values they had at the time @code{longjmp} was called. The +exception is that the values of automatic variables local to the +function containing the @code{setjmp} call that have been changed since +the call to @code{setjmp} are indeterminate, unless you have declared +them @code{volatile}.
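The rule about automatic variables can be illustrated with a short sketch. The function name @code{count_attempts} and the variable @code{retry_point} are hypothetical; the point is only that a local variable modified between the @code{setjmp} call and the @code{longjmp} is declared @code{volatile} so that its value is reliable after the jump.

```c
#include <setjmp.h>

/* Hypothetical return point used to retry an operation.  */
static jmp_buf retry_point;

/* Jump back to the return point until LIMIT attempts have been made,
   then return the number of attempts.  */
int
count_attempts (int limit)
{
  /* Modified after setjmp and read after longjmp, so it must be
     volatile; otherwise its value would be indeterminate.  */
  volatile int attempts = 0;

  /* setjmp used by itself as an expression statement.  */
  setjmp (retry_point);

  if (attempts < limit)
    {
      attempts++;
      /* Still inside the function that called setjmp, so the
         return point is valid.  */
      longjmp (retry_point, 1);
    }
  return attempts;
}
```

Without the @code{volatile} qualifier, a compiler would be free to keep @code{attempts} in a register that @code{longjmp} restores to its old contents, and the loop might never terminate.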
+ +@node Non-Local Exits and Signals,, Non-Local Details, Non-Local Exits +@section Non-Local Exits and Signals + +In BSD Unix systems, @code{setjmp} and @code{longjmp} also save and +restore the set of blocked signals; see @ref{Blocking Signals}. However, +the POSIX.1 standard requires @code{setjmp} and @code{longjmp} not to +change the set of blocked signals, and provides an additional pair of +functions (@code{sigsetjmp} and @code{siglongjmp}) to get the BSD +behavior. + +The behavior of @code{setjmp} and @code{longjmp} in the GNU library is +controlled by feature test macros; see @ref{Feature Test Macros}. The +default in the GNU system is the POSIX.1 behavior rather than the BSD +behavior. + +The facilities in this section are declared in the header file +@file{setjmp.h}. +@pindex setjmp.h + +@comment setjmp.h +@comment POSIX.1 +@deftp {Data Type} sigjmp_buf +This is similar to @code{jmp_buf}, except that it can also store state +information about the set of blocked signals. +@end deftp + +@comment setjmp.h +@comment POSIX.1 +@deftypefun int sigsetjmp (sigjmp_buf @var{state}, int @var{savesigs}) +This is similar to @code{setjmp}. If @var{savesigs} is nonzero, the set +of blocked signals is saved in @var{state} and will be restored if a +@code{siglongjmp} is later performed with this @var{state}. +@end deftypefun + +@comment setjmp.h +@comment POSIX.1 +@deftypefun void siglongjmp (sigjmp_buf @var{state}, int @var{value}) +This is similar to @code{longjmp} except for the type of its @var{state} +argument. If the @code{sigsetjmp} call that set this @var{state} used a +nonzero @var{savesigs} flag, @code{siglongjmp} also restores the set of +blocked signals.
+@end deftypefun + diff --git a/manual/signal.texi b/manual/signal.texi new file mode 100644 index 0000000000..bca02c528b --- /dev/null +++ b/manual/signal.texi @@ -0,0 +1,3316 @@ +@node Signal Handling, Process Startup, Non-Local Exits, Top +@chapter Signal Handling + +@cindex signal +A @dfn{signal} is a software interrupt delivered to a process. The +operating system uses signals to report exceptional situations to an +executing program. Some signals report errors such as references to +invalid memory addresses; others report asynchronous events, such as +disconnection of a phone line. + +The GNU C library defines a variety of signal types, each for a +particular kind of event. Some kinds of events make it inadvisable or +impossible for the program to proceed as usual, and the corresponding +signals normally abort the program. Other kinds of signals that report +harmless events are ignored by default. + +If you anticipate an event that causes signals, you can define a handler +function and tell the operating system to run it when that particular +type of signal arrives. + +Finally, one process can send a signal to another process; this allows a +parent process to abort a child, or two related processes to communicate +and synchronize. + +@menu +* Concepts of Signals:: Introduction to the signal facilities. +* Standard Signals:: Particular kinds of signals with + standard names and meanings. +* Signal Actions:: Specifying what happens when a + particular signal is delivered. +* Defining Handlers:: How to write a signal handler function. +* Interrupted Primitives:: Signal handlers affect use of @code{open}, + @code{read}, @code{write} and other functions. +* Generating Signals:: How to send a signal to a process. +* Blocking Signals:: Making the system hold signals temporarily. +* Waiting for a Signal:: Suspending your program until a signal + arrives. +* Signal Stack:: Using a Separate Signal Stack. 
+* BSD Signal Handling:: Additional functions for backward + compatibility with BSD. +@end menu + +@node Concepts of Signals +@section Basic Concepts of Signals + +This section explains basic concepts of how signals are generated, what +happens after a signal is delivered, and how programs can handle +signals. + +@menu +* Kinds of Signals:: Some examples of what can cause a signal. +* Signal Generation:: Concepts of why and how signals occur. +* Delivery of Signal:: Concepts of what a signal does to the + process. +@end menu + +@node Kinds of Signals +@subsection Some Kinds of Signals + +A signal reports the occurrence of an exceptional event. These are some +of the events that can cause (or @dfn{generate}, or @dfn{raise}) a +signal: + +@itemize @bullet +@item +A program error such as dividing by zero or issuing an address outside +the valid range. + +@item +A user request to interrupt or terminate the program. Most environments +are set up to let a user suspend the program by typing @kbd{C-z}, or +terminate it with @kbd{C-c}. Whatever key sequence is used, the +operating system sends the proper signal to interrupt the process. + +@item +The termination of a child process. + +@item +Expiration of a timer or alarm. + +@item +A call to @code{kill} or @code{raise} by the same process. + +@item +A call to @code{kill} from another process. Signals are a limited but +useful form of interprocess communication. + +@item +An attempt to perform an I/O operation that cannot be done. Examples +are writing to a pipe that has no reader (@pxref{Pipes and FIFOs}), +and reading or writing to a terminal in certain situations (@pxref{Job +Control}). +@end itemize + +Each of these kinds of events (excepting explicit calls to @code{kill} +and @code{raise}) generates its own particular kind of signal. The +various kinds of signals are listed and described in detail in +@ref{Standard Signals}.
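As a minimal sketch of one item in the list above, a call to @code{raise} by the same process, the following C fragment installs a handler with @code{signal}, raises a signal, and reports which signal number the handler saw. The names @code{last_signal}, @code{record_signal} and @code{raise_and_observe} are hypothetical and chosen only for this illustration.

```c
#include <signal.h>

/* Set by the handler; sig_atomic_t can be written safely from a
   signal handler.  */
static volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: remember which signal arrived.  */
static void
record_signal (int signum)
{
  last_signal = signum;
}

/* Raise signum in this process and return the signal number the
   handler observed, or -1 if the handler could not be installed
   or the signal could not be raised.  */
int
raise_and_observe (int signum)
{
  last_signal = 0;

  if (signal (signum, record_signal) == SIG_ERR)
    return -1;

  /* When a handler is called, raise does not return until the
     handler has returned.  */
  if (raise (signum) != 0)
    return -1;

  signal (signum, SIG_DFL);     /* put the default action back */
  return last_signal;
}
```

For example, @code{raise_and_observe (SIGINT)} yields @code{SIGINT} without the user ever typing the INTR character, which shows that the same kind of signal can come from an explicit request as well as from its usual event.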
+ +@node Signal Generation +@subsection Concepts of Signal Generation +@cindex generation of signals + +In general, the events that generate signals fall into three major +categories: errors, external events, and explicit requests. + +An error means that a program has done something invalid and cannot +continue execution. But not all kinds of errors generate signals---in +fact, most do not. For example, opening a nonexistent file is an error, +but it does not raise a signal; instead, @code{open} returns @code{-1}. +In general, errors that are necessarily associated with certain library +functions are reported by returning a value that indicates an error. +The errors which raise signals are those which can happen anywhere in +the program, not just in library calls. These include division by zero +and invalid memory addresses. + +An external event generally has to do with I/O or other processes. +These include the arrival of input, the expiration of a timer, and the +termination of a child process. + +An explicit request means the use of a library function such as +@code{kill} whose purpose is specifically to generate a signal. + +Signals may be generated @dfn{synchronously} or @dfn{asynchronously}. A +synchronous signal pertains to a specific action in the program, and is +delivered (unless blocked) during that action. Most errors generate +signals synchronously, and so do explicit requests by a process to +generate a signal for that same process. On some machines, certain +kinds of hardware errors (usually floating-point exceptions) are not +reported completely synchronously, but may arrive a few instructions +later. + +Asynchronous signals are generated by events outside the control of the +process that receives them. These signals arrive at unpredictable times +during execution. External events generate signals asynchronously, and +so do explicit requests that apply to some other process. 
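The point made earlier in this subsection, that most errors are reported by a return value rather than by a signal, can be sketched as follows. This is an illustration under stated assumptions: @code{try_open} is a hypothetical helper, and the fragment assumes a POSIX system where @code{open} is declared in @file{fcntl.h}.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to open path for reading.  Returns 0 on
   success, or the errno code that open reported on failure.  No
   signal is involved either way.  */
int
try_open (const char *path)
{
  int fd = open (path, O_RDONLY);

  if (fd == -1)
    return errno;       /* the error arrives as a return value */

  close (fd);
  return 0;
}
```

Opening a file that does not exist makes @code{try_open} return @code{ENOENT}; the program simply continues, and no handler needs to be installed for this kind of error.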
+ +A given type of signal is either typically synchronous or typically +asynchronous. For example, signals for errors are typically synchronous +because errors generate signals synchronously. But any type of signal +can be generated synchronously or asynchronously with an explicit +request. + +@node Delivery of Signal +@subsection How Signals Are Delivered +@cindex delivery of signals +@cindex pending signals +@cindex blocked signals + +When a signal is generated, it becomes @dfn{pending}. Normally it +remains pending for just a short period of time and then is +@dfn{delivered} to the process that was signaled. However, if that kind +of signal is currently @dfn{blocked}, it may remain pending +indefinitely---until signals of that kind are @dfn{unblocked}. Once +unblocked, it will be delivered immediately. @xref{Blocking Signals}. + +@cindex specified action (for a signal) +@cindex default action (for a signal) +@cindex signal action +@cindex catching signals +When the signal is delivered, whether right away or after a long delay, +the @dfn{specified action} for that signal is taken. For certain +signals, such as @code{SIGKILL} and @code{SIGSTOP}, the action is fixed, +but for most signals, the program has a choice: ignore the signal, +specify a @dfn{handler function}, or accept the @dfn{default action} for +that kind of signal. The program specifies its choice using functions +such as @code{signal} or @code{sigaction} (@pxref{Signal Actions}). We +sometimes say that a handler @dfn{catches} the signal. While the +handler is running, that particular signal is normally blocked. + +If the specified action for a kind of signal is to ignore it, then any +such signal which is generated is discarded immediately. This happens +even if the signal is also blocked at the time. A signal discarded in +this way will never be delivered, not even if the program subsequently +specifies a different action for that kind of signal and then unblocks +it.
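The pending and blocked states described above can be observed directly. The following is a minimal sketch, assuming a single-threaded POSIX program; the names @code{delivered}, @code{note_delivery} and @code{pending_then_delivered} are hypothetical. It blocks @code{SIGUSR1}, raises it, checks that the signal is pending and that the handler has not yet run, and then unblocks the signal so that it is delivered.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t delivered = 0;

/* Hypothetical handler: note that the signal was delivered.  */
static void
note_delivery (int signum)
{
  (void) signum;
  delivered = 1;
}

/* Returns 1 if SIGUSR1 stayed pending while blocked and was
   delivered only after being unblocked; 0 otherwise.  */
int
pending_then_delivered (void)
{
  struct sigaction action, old_action;
  sigset_t block, old_mask, pending;
  int was_pending, ran_while_blocked;

  action.sa_handler = note_delivery;
  sigemptyset (&action.sa_mask);
  action.sa_flags = 0;
  if (sigaction (SIGUSR1, &action, &old_action) != 0)
    return 0;

  /* Block SIGUSR1, remembering the previous mask.  */
  sigemptyset (&block);
  sigaddset (&block, SIGUSR1);
  sigprocmask (SIG_BLOCK, &block, &old_mask);

  delivered = 0;
  raise (SIGUSR1);              /* generated, but blocked */

  sigpending (&pending);
  was_pending = sigismember (&pending, SIGUSR1);
  ran_while_blocked = delivered;

  /* Unblocking delivers the pending signal before sigprocmask
     returns; then restore the caller's mask and action.  */
  sigprocmask (SIG_UNBLOCK, &block, NULL);
  sigprocmask (SIG_SETMASK, &old_mask, NULL);
  sigaction (SIGUSR1, &old_action, NULL);

  return was_pending == 1 && !ran_while_blocked && delivered;
}
```

Had the action for @code{SIGUSR1} been set to ignore the signal instead, the signal would have been discarded at the time it was raised and nothing would be delivered on unblocking.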
+ +If a signal arrives which the program has neither handled nor ignored, +its @dfn{default action} takes place. Each kind of signal has its own +default action, documented below (@pxref{Standard Signals}). For most kinds +of signals, the default action is to terminate the process. For certain +kinds of signals that represent ``harmless'' events, the default action +is to do nothing. + +When a signal terminates a process, its parent process can determine the +cause of termination by examining the termination status code reported +by the @code{wait} or @code{waitpid} functions. (This is discussed in +more detail in @ref{Process Completion}.) The information it can get +includes the fact that termination was due to a signal, and the kind of +signal involved. If a program you run from a shell is terminated by a +signal, the shell typically prints some kind of error message. + +The signals that normally represent program errors have a special +property: when one of these signals terminates the process, it also +writes a @dfn{core dump file} which records the state of the process at +the time of termination. You can examine the core dump with a debugger +to investigate what caused the error. + +If you raise a ``program error'' signal by explicit request, and this +terminates the process, it makes a core dump file just as if the signal +had been due directly to an error. + +@node Standard Signals +@section Standard Signals +@cindex signal names +@cindex names of signals + +@pindex signal.h +@cindex signal number +This section lists the names for various standard kinds of signals and +describes what kind of event they mean. Each signal name is a macro +which stands for a positive integer---the @dfn{signal number} for that +kind of signal. Your programs should never make assumptions about the +numeric code for a particular kind of signal, but rather refer to them +always by the names defined here. 
This is because the number for a +given kind of signal can vary from system to system, but the meanings of +the names are standardized and fairly uniform. + +The signal names are defined in the header file @file{signal.h}. + +@comment signal.h +@comment BSD +@deftypevr Macro int NSIG +The value of this symbolic constant is the total number of signals +defined. Since the signal numbers are allocated consecutively, +@code{NSIG} is also one greater than the largest defined signal number. +@end deftypevr + +@menu +* Program Error Signals:: Used to report serious program errors. +* Termination Signals:: Used to interrupt and/or terminate the + program. +* Alarm Signals:: Used to indicate expiration of timers. +* Asynchronous I/O Signals:: Used to indicate input is available. +* Job Control Signals:: Signals used to support job control. +* Operation Error Signals:: Used to report operational system errors. +* Miscellaneous Signals:: Miscellaneous Signals. +* Signal Messages:: Printing a message describing a signal. +@end menu + +@node Program Error Signals +@subsection Program Error Signals +@cindex program error signals + +The following signals are generated when a serious program error is +detected by the operating system or the computer itself. In general, +all of these signals are indications that your program is seriously +broken in some way, and there's usually no way to continue the +computation which encountered the error. + +Some programs handle program error signals in order to tidy up before +terminating; for example, programs that turn off echoing of terminal +input should handle program error signals in order to turn echoing back +on. The handler should end by specifying the default action for the +signal that happened and then reraising it; this will cause the program +to terminate with that signal, as if it had not had a handler. +(@xref{Termination in Handler}.) + +Termination is the sensible ultimate outcome from a program error in +most programs. 
However, programming systems such as Lisp that can load +compiled user programs might need to keep executing even if a user +program incurs an error. These programs have handlers which use +@code{longjmp} to return control to the command level. + +The default action for all of these signals is to cause the process to +terminate. If you block or ignore these signals or establish handlers +for them that return normally, your program will probably break horribly +when such signals happen, unless they are generated by @code{raise} or +@code{kill} instead of a real error. + +@vindex COREFILE +When one of these program error signals terminates a process, it also +writes a @dfn{core dump file} which records the state of the process at +the time of termination. The core dump file is named @file{core} and is +written in whichever directory is current in the process at the time. +(On the GNU system, you can specify the file name for core dumps with +the environment variable @code{COREFILE}.) The purpose of core dump +files is so that you can examine them with a debugger to investigate +what caused the error. + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGFPE +The @code{SIGFPE} signal reports a fatal arithmetic error. Although the +name is derived from ``floating-point exception'', this signal actually +covers all arithmetic errors, including division by zero and overflow. +If a program stores integer data in a location which is then used in a +floating-point operation, this often causes an ``invalid operation'' +exception, because the processor cannot recognize the data as a +floating-point number. +@cindex exception +@cindex floating-point exception + +Actual floating-point exceptions are a complicated subject because there +are many types of exceptions with subtly different meanings, and the +@code{SIGFPE} signal doesn't distinguish between them. 
The @cite{IEEE +Standard for Binary Floating-Point Arithmetic (ANSI/IEEE Std 754-1985)} +defines various floating-point exceptions and requires conforming +computer systems to report their occurrences. However, this standard +does not specify how the exceptions are reported, or what kinds of +handling and control the operating system can offer to the programmer. +@end deftypevr + +BSD systems provide the @code{SIGFPE} handler with an extra argument +that distinguishes various causes of the exception. In order to access +this argument, you must define the handler to accept two arguments, +which means you must cast it to a one-argument function type in order to +establish the handler. The GNU library does provide this extra +argument, but the value is meaningful only on operating systems that +provide the information (BSD systems and GNU systems). + +@table @code +@comment signal.h +@comment BSD +@item FPE_INTOVF_TRAP +@vindex FPE_INTOVF_TRAP +Integer overflow (impossible in a C program unless you enable overflow +trapping in a hardware-specific fashion). +@comment signal.h +@comment BSD +@item FPE_INTDIV_TRAP +@vindex FPE_INTDIV_TRAP +Integer division by zero. +@comment signal.h +@comment BSD +@item FPE_SUBRNG_TRAP +@vindex FPE_SUBRNG_TRAP +Subscript-range (something that C programs never check for). +@comment signal.h +@comment BSD +@item FPE_FLTOVF_TRAP +@vindex FPE_FLTOVF_TRAP +Floating overflow trap. +@comment signal.h +@comment BSD +@item FPE_FLTDIV_TRAP +@vindex FPE_FLTDIV_TRAP +Floating/decimal division by zero. +@comment signal.h +@comment BSD +@item FPE_FLTUND_TRAP +@vindex FPE_FLTUND_TRAP +Floating underflow trap. (Trapping on floating underflow is not +normally enabled.) +@comment signal.h +@comment BSD +@item FPE_DECOVF_TRAP +@vindex FPE_DECOVF_TRAP +Decimal overflow trap. (Only a few machines have decimal arithmetic and +C never uses it.) 
+@ignore @c These seem redundant +@comment signal.h +@comment BSD +@item FPE_FLTOVF_FAULT +@vindex FPE_FLTOVF_FAULT +Floating overflow fault. +@comment signal.h +@comment BSD +@item FPE_FLTDIV_FAULT +@vindex FPE_FLTDIV_FAULT +Floating divide by zero fault. +@comment signal.h +@comment BSD +@item FPE_FLTUND_FAULT +@vindex FPE_FLTUND_FAULT +Floating underflow fault. +@end ignore +@end table + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGILL +The name of this signal is derived from ``illegal instruction''; it +usually means your program is trying to execute garbage or a privileged +instruction. Since the C compiler generates only valid instructions, +@code{SIGILL} typically indicates that the executable file is corrupted, +or that you are trying to execute data. Some common ways of getting +into the latter situation are by passing an invalid object where a +pointer to a function was expected, or by writing past the end of an +automatic array (or similar problems with pointers to automatic +variables) and corrupting other data on the stack such as the return +address of a stack frame. + +@code{SIGILL} can also be generated when the stack overflows, or when +the system has trouble running the handler for a signal. +@end deftypevr +@cindex illegal instruction + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGSEGV +@cindex segmentation violation +This signal is generated when a program tries to read or write outside +the memory that is allocated for it, or to write memory that can only be +read. (Actually, the signals only occur when the program goes far +enough outside to be detected by the system's memory protection +mechanism.) The name is an abbreviation for ``segmentation violation''. + +Common ways of getting a @code{SIGSEGV} condition include dereferencing +a null or uninitialized pointer, or when you use a pointer to step +through an array, but fail to check for the end of the array. 
It varies +among systems whether dereferencing a null pointer generates +@code{SIGSEGV} or @code{SIGBUS}. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGBUS +This signal is generated when an invalid pointer is dereferenced. Like +@code{SIGSEGV}, this signal is typically the result of dereferencing an +uninitialized pointer. The difference between the two is that +@code{SIGSEGV} indicates an invalid access to valid memory, while +@code{SIGBUS} indicates an access to an invalid address. In particular, +@code{SIGBUS} signals often result from dereferencing a misaligned +pointer, such as referring to a four-word integer at an address not +divisible by four. (Each kind of computer has its own requirements for +address alignment.) + +The name of this signal is an abbreviation for ``bus error''. +@end deftypevr +@cindex bus error + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGABRT +@cindex abort signal +This signal indicates an error detected by the program itself and +reported by calling @code{abort}. @xref{Aborting a Program}. +@end deftypevr + +@comment signal.h +@comment Unix +@deftypevr Macro int SIGIOT +Generated by the PDP-11 ``iot'' instruction. On most machines, this is +just another name for @code{SIGABRT}. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGTRAP +Generated by the machine's breakpoint instruction, and possibly other +trap instructions. This signal is used by debuggers. Your program will +probably only see @code{SIGTRAP} if it is somehow executing bad +instructions. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGEMT +Emulator trap; this results from certain unimplemented instructions +which might be emulated in software, or the operating system's +failure to properly emulate them. 
+@end deftypevr + +@comment signal.h +@comment Unix +@deftypevr Macro int SIGSYS +Bad system call; that is to say, the instruction to trap to the +operating system was executed, but the code number for the system call +to perform was invalid. +@end deftypevr + +@node Termination Signals +@subsection Termination Signals +@cindex program termination signals + +These signals are all used to tell a process to terminate, in one way +or another. They have different names because they're used for slightly +different purposes, and programs might want to handle them differently. + +The reason for handling these signals is usually so your program can +tidy up as appropriate before actually terminating. For example, you +might want to save state information, delete temporary files, or restore +the previous terminal modes. Such a handler should end by specifying +the default action for the signal that happened and then reraising it; +this will cause the program to terminate with that signal, as if it had +not had a handler. (@xref{Termination in Handler}.) + +The (obvious) default action for all of these signals is to cause the +process to terminate. + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGTERM +@cindex termination signal +The @code{SIGTERM} signal is a generic signal used to cause program +termination. Unlike @code{SIGKILL}, this signal can be blocked, +handled, and ignored. It is the normal way to politely ask a program to +terminate. + +The shell command @code{kill} generates @code{SIGTERM} by default. +@pindex kill +@end deftypevr + +@comment signal.h +@comment ANSI +@deftypevr Macro int SIGINT +@cindex interrupt signal +The @code{SIGINT} (``program interrupt'') signal is sent when the user +types the INTR character (normally @kbd{C-c}). @xref{Special +Characters}, for information about terminal driver support for +@kbd{C-c}. 
+@end deftypevr + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGQUIT +@cindex quit signal +The @code{SIGQUIT} signal is similar to @code{SIGINT}, except that it's +controlled by a different key---the QUIT character, usually +@kbd{C-\}---and produces a core dump when it terminates the process, +just like a program error signal. You can think of this as a +program error condition ``detected'' by the user. + +@xref{Program Error Signals}, for information about core dumps. +@xref{Special Characters}, for information about terminal driver +support. + +Certain kinds of cleanups are best omitted in handling @code{SIGQUIT}. +For example, if the program creates temporary files, it should handle +the other termination requests by deleting the temporary files. But it +is better for @code{SIGQUIT} not to delete them, so that the user can +examine them in conjunction with the core dump. +@end deftypevr + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGKILL +The @code{SIGKILL} signal is used to cause immediate program termination. +It cannot be handled or ignored, and is therefore always fatal. It is +also not possible to block this signal. + +This signal is usually generated only by explicit request. Since it +cannot be handled, you should generate it only as a last resort, after +first trying a less drastic method such as @kbd{C-c} or @code{SIGTERM}. +If a process does not respond to any other termination signals, sending +it a @code{SIGKILL} signal will almost always cause it to go away. + +In fact, if @code{SIGKILL} fails to terminate a process, that by itself +constitutes an operating system bug which you should report. + +The system will generate @code{SIGKILL} for a process itself under some +unusual conditions where the program cannot possibly continue to run +(even to run a signal handler).
+@end deftypevr +@cindex kill signal + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGHUP +@cindex hangup signal +The @code{SIGHUP} (``hang-up'') signal is used to report that the user's +terminal is disconnected, perhaps because a network or telephone +connection was broken. For more information about this, see @ref{Control +Modes}. + +This signal is also used to report the termination of the controlling +process on a terminal to jobs associated with that session; this +termination effectively disconnects all processes in the session from +the controlling terminal. For more information, see @ref{Termination +Internals}. +@end deftypevr + +@node Alarm Signals +@subsection Alarm Signals + +These signals are used to indicate the expiration of timers. +@xref{Setting an Alarm}, for information about functions that cause +these signals to be sent. + +The default behavior for these signals is to cause program termination. +This default is rarely useful, but no other default would be useful; +most of the ways of using these signals would require handler functions +in any case. + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGALRM +This signal typically indicates expiration of a timer that measures real +or clock time. It is used by the @code{alarm} function, for example. +@end deftypevr +@cindex alarm signal + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGVTALRM +This signal typically indicates expiration of a timer that measures CPU +time used by the current process. The name is an abbreviation for +``virtual time alarm''. +@end deftypevr +@cindex virtual time alarm signal + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGPROF +This signal typically indicates expiration of a timer that measures +both CPU time used by the current process, and CPU time expended on +behalf of the process by the system. Such a timer is used to implement +code profiling facilities, hence the name of this signal.
+@end deftypevr +@cindex profiling alarm signal + + +@node Asynchronous I/O Signals +@subsection Asynchronous I/O Signals + +The signals listed in this section are used in conjunction with +asynchronous I/O facilities. You have to take explicit action by +calling @code{fcntl} to enable a particular file descriptor to generate +these signals (@pxref{Interrupt Input}). The default action for these +signals is to ignore them. + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGIO +@cindex input available signal +@cindex output possible signal +This signal is sent when a file descriptor is ready to perform input +or output. + +On most operating systems, terminals and sockets are the only kinds of +files that can generate @code{SIGIO}; other kinds, including ordinary +files, never generate @code{SIGIO} even if you ask them to. + +In the GNU system @code{SIGIO} will always be generated properly +if you successfully set asynchronous mode with @code{fcntl}. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGURG +@cindex urgent data signal +This signal is sent when ``urgent'' or out-of-band data arrives on a +socket. @xref{Out-of-Band Data}. +@end deftypevr + +@comment signal.h +@comment SVID +@deftypevr Macro int SIGPOLL +This is a System V signal name, more or less similar to @code{SIGIO}. +It is defined only for compatibility. +@end deftypevr + +@node Job Control Signals +@subsection Job Control Signals +@cindex job control signals + +These signals are used to support job control. If your system +doesn't support job control, then these macros are defined but the +signals themselves can't be raised or handled. + +You should generally leave these signals alone unless you really +understand how job control works. @xref{Job Control}. + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGCHLD +@cindex child process signal +This signal is sent to a parent process whenever one of its child +processes terminates or stops.
+ +The default action for this signal is to ignore it. If you establish a +handler for this signal while there are child processes that have +terminated but not reported their status via @code{wait} or +@code{waitpid} (@pxref{Process Completion}), whether your new handler +applies to those processes or not depends on the particular operating +system. +@end deftypevr + +@comment signal.h +@comment SVID +@deftypevr Macro int SIGCLD +This is an obsolete name for @code{SIGCHLD}. +@end deftypevr + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGCONT +@cindex continue signal +You can send a @code{SIGCONT} signal to a process to make it continue. +This signal is special---it always makes the process continue if it is +stopped, before the signal is delivered. The default behavior is to do +nothing else. You cannot block this signal. You can set a handler, but +@code{SIGCONT} always makes the process continue regardless. + +Most programs have no reason to handle @code{SIGCONT}; they simply +resume execution without realizing they were ever stopped. You can use +a handler for @code{SIGCONT} to make a program do something special when +it is stopped and continued---for example, to reprint a prompt when it +is suspended while waiting for input. +@end deftypevr + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGSTOP +The @code{SIGSTOP} signal stops the process. It cannot be handled, +ignored, or blocked. +@end deftypevr +@cindex stop signal + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGTSTP +The @code{SIGTSTP} signal is an interactive stop signal. Unlike +@code{SIGSTOP}, this signal can be handled and ignored. + +Your program should handle this signal if you have a special need to +leave files or system tables in a secure state when a process is +stopped. For example, programs that turn off echoing should handle +@code{SIGTSTP} so they can turn echoing back on before stopping. 
+ +This signal is generated when the user types the SUSP character +(normally @kbd{C-z}). For more information about terminal driver +support, see @ref{Special Characters}. +@end deftypevr +@cindex interactive stop signal + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGTTIN +A process cannot read from the user's terminal while it is running +as a background job. When any process in a background job tries to +read from the terminal, all of the processes in the job are sent a +@code{SIGTTIN} signal. The default action for this signal is to +stop the process. For more information about how this interacts with +the terminal driver, see @ref{Access to the Terminal}. +@end deftypevr +@cindex terminal input signal + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGTTOU +This is similar to @code{SIGTTIN}, but is generated when a process in a +background job attempts to write to the terminal or set its modes. +Again, the default action is to stop the process. @code{SIGTTOU} is +only generated for an attempt to write to the terminal if the +@code{TOSTOP} output mode is set; @pxref{Output Modes}. +@end deftypevr +@cindex terminal output signal + +While a process is stopped, no more signals can be delivered to it until +it is continued, except @code{SIGKILL} signals and (obviously) +@code{SIGCONT} signals. The signals are marked as pending, but not +delivered until the process is continued. The @code{SIGKILL} signal +always causes termination of the process and can't be blocked, handled +or ignored. You can ignore @code{SIGCONT}, but it always causes the +process to be continued anyway if it is stopped. Sending a +@code{SIGCONT} signal to a process causes any pending stop signals for +that process to be discarded. Likewise, any pending @code{SIGCONT} +signals for a process are discarded when it receives a stop signal.
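The statement above that @code{SIGKILL} reaches even a stopped process can be checked with a short experiment. The following is a sketch only, assuming a POSIX system where @code{fork} and @code{waitpid} with the @code{WUNTRACED} option are available; @code{stop_then_kill_child} is a hypothetical name. The parent stops a child with @code{SIGSTOP}, waits until the stop is reported, and then sends @code{SIGKILL} without continuing the child first.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a stopped child was terminated by SIGKILL,
   0 if something else happened, -1 if fork failed.  */
int
stop_then_kill_child (void)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: do nothing until a signal ends the process.  */
      for (;;)
        pause ();
    }

  /* Stop the child and wait for the stop to be reported.  */
  if (kill (pid, SIGSTOP) != 0
      || waitpid (pid, &status, WUNTRACED) != pid
      || !WIFSTOPPED (status))
    {
      kill (pid, SIGKILL);      /* clean up on unexpected results */
      waitpid (pid, &status, 0);
      return 0;
    }

  /* The child is stopped; SIGKILL is delivered all the same.  */
  if (kill (pid, SIGKILL) != 0 || waitpid (pid, &status, 0) != pid)
    return 0;

  return WIFSIGNALED (status) && WTERMSIG (status) == SIGKILL;
}
```

Any other signal sent in place of @code{SIGKILL}, except @code{SIGCONT}, would merely become pending until the child was continued, so the second @code{waitpid} call would not return.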
+ +When a process in an orphaned process group (@pxref{Orphaned Process +Groups}) receives a @code{SIGTSTP}, @code{SIGTTIN}, or @code{SIGTTOU} +signal and does not handle it, the process does not stop. Stopping the +process would probably not be very useful, since there is no shell +program that will notice it stop and allow the user to continue it. +What happens instead depends on the operating system you are using. +Some systems may do nothing; others may deliver another signal instead, +such as @code{SIGKILL} or @code{SIGHUP}. In the GNU system, the process +dies with @code{SIGKILL}; this avoids the problem of many stopped, +orphaned processes lying around the system. + +@ignore +On the GNU system, it is possible to reattach to the orphaned process +group and continue it, so stop signals do stop the process as usual on +a GNU system unless you have requested POSIX compatibility ``till it +hurts.'' +@end ignore + +@node Operation Error Signals +@subsection Operation Error Signals + +These signals are used to report various errors generated by an +operation done by the program. They do not necessarily indicate a +programming error in the program, but an error that prevents an +operating system call from completing. The default action for all of +them is to cause the process to terminate. + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGPIPE +@cindex pipe signal +@cindex broken pipe signal +Broken pipe. If you use pipes or FIFOs, you have to design your +application so that one process opens the pipe for reading before +another starts writing. If the reading process never starts, or +terminates unexpectedly, writing to the pipe or FIFO raises a +@code{SIGPIPE} signal. If @code{SIGPIPE} is blocked, handled or +ignored, the offending call fails with @code{EPIPE} instead. + +Pipes and FIFO special files are discussed in more detail in @ref{Pipes +and FIFOs}. + +Another cause of @code{SIGPIPE} is when you try to output to a socket +that isn't connected. 
@xref{Sending Data}. +@end deftypevr + +@comment signal.h +@comment GNU +@deftypevr Macro int SIGLOST +@cindex lost resource signal +Resource lost. This signal is generated when you have an advisory lock +on an NFS file, and the NFS server reboots and forgets about your lock. + +In the GNU system, @code{SIGLOST} is generated when any server program +dies unexpectedly. It is usually fine to ignore the signal; whatever +call was made to the server that died just returns an error. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGXCPU +CPU time limit exceeded. This signal is generated when the process +exceeds its soft resource limit on CPU time. @xref{Limits on Resources}. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGXFSZ +File size limit exceeded. This signal is generated when the process +attempts to extend a file so it exceeds the process's soft resource +limit on file size. @xref{Limits on Resources}. +@end deftypevr + +@node Miscellaneous Signals +@subsection Miscellaneous Signals + +These signals are used for various other purposes. In general, they +will not affect your program unless it explicitly uses them for something. + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGUSR1 +@end deftypevr +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SIGUSR2 +@cindex user signals +The @code{SIGUSR1} and @code{SIGUSR2} signals are set aside for you to +use any way you want. They're useful for simple interprocess +communication, if you write a signal handler for them in the program +that receives the signal. + +There is an example showing the use of @code{SIGUSR1} and @code{SIGUSR2} +in @ref{Signaling Another Process}. + +The default action is to terminate the process. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGWINCH +Window size change. 
This is generated on some systems (including GNU) +when the terminal driver's record of the number of rows and columns on +the screen is changed. The default action is to ignore it. + +If a program does full-screen display, it should handle @code{SIGWINCH}. +When the signal arrives, it should fetch the new screen size and +reformat its display accordingly. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SIGINFO +Information request. In 4.4 BSD and the GNU system, this signal is sent +to all the processes in the foreground process group of the controlling +terminal when the user types the STATUS character in canonical mode; +@pxref{Signal Characters}. + +If the process is the leader of the process group, the default action is +to print some status information about the system and what the process +is doing. Otherwise the default is to do nothing. +@end deftypevr + +@node Signal Messages +@subsection Signal Messages +@cindex signal messages + +We mentioned above that the shell prints a message describing the signal +that terminated a child process. The clean way to print a message +describing a signal is to use the functions @code{strsignal} and +@code{psignal}. These functions use a signal number to specify which +kind of signal to describe. The signal number may come from the +termination status of a child process (@pxref{Process Completion}) or it +may come from a signal handler in the same process. + +@comment string.h +@comment GNU +@deftypefun {char *} strsignal (int @var{signum}) +This function returns a pointer to a statically-allocated string +containing a message describing the signal @var{signum}. You +should not modify the contents of this string; and, since it can be +rewritten on subsequent calls, you should save a copy of it if you need +to reference it later. + +@pindex string.h +This function is a GNU extension, declared in the header file +@file{string.h}. 
+@end deftypefun + +@comment signal.h +@comment BSD +@deftypefun void psignal (int @var{signum}, const char *@var{message}) +This function prints a message describing the signal @var{signum} to the +standard error output stream @code{stderr}; see @ref{Standard Streams}. + +If you call @code{psignal} with a @var{message} that is either a null +pointer or an empty string, @code{psignal} just prints the message +corresponding to @var{signum}, adding a trailing newline. + +If you supply a non-null @var{message} argument, then @code{psignal} +prefixes its output with this string. It adds a colon and a space +character to separate the @var{message} from the string corresponding +to @var{signum}. + +@pindex signal.h +This function is a BSD feature, declared in the header file @file{signal.h}. +@end deftypefun + +@vindex sys_siglist +There is also an array @code{sys_siglist} which contains the messages +for the various signal codes. This array exists on BSD systems, unlike +@code{strsignal}. + +@node Signal Actions +@section Specifying Signal Actions +@cindex signal actions +@cindex establishing a handler + +The simplest way to change the action for a signal is to use the +@code{signal} function. You can specify a built-in action (such as to +ignore the signal), or you can @dfn{establish a handler}. + +The GNU library also implements the more versatile @code{sigaction} +facility. This section describes both facilities and gives suggestions +on which to use when. + +@menu +* Basic Signal Handling:: The simple @code{signal} function. +* Advanced Signal Handling:: The more powerful @code{sigaction} function. +* Signal and Sigaction:: How those two functions interact. +* Sigaction Function Example:: An example of using the sigaction function. +* Flags for Sigaction:: Specifying options for signal handling. +* Initial Signal Actions:: How programs inherit signal actions.
+@end menu + +@node Basic Signal Handling +@subsection Basic Signal Handling +@cindex @code{signal} function + +The @code{signal} function provides a simple interface for establishing +an action for a particular signal. The function and associated macros +are declared in the header file @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment GNU +@deftp {Data Type} sighandler_t +This is the type of signal handler functions. Signal handlers take one +integer argument specifying the signal number, and have return type +@code{void}. So, you should define handler functions like this: + +@smallexample +void @var{handler} (int @code{signum}) @{ @dots{} @} +@end smallexample + +The name @code{sighandler_t} for this data type is a GNU extension. +@end deftp + +@comment signal.h +@comment ANSI +@deftypefun sighandler_t signal (int @var{signum}, sighandler_t @var{action}) +The @code{signal} function establishes @var{action} as the action for +the signal @var{signum}. + +The first argument, @var{signum}, identifies the signal whose behavior +you want to control, and should be a signal number. The proper way to +specify a signal number is with one of the symbolic signal names +described in @ref{Standard Signals}---don't use an explicit number, because +the numerical code for a given kind of signal may vary from operating +system to operating system. + +The second argument, @var{action}, specifies the action to use for the +signal @var{signum}. This can be one of the following: + +@table @code +@item SIG_DFL +@vindex SIG_DFL +@cindex default action for a signal +@code{SIG_DFL} specifies the default action for the particular signal. +The default actions for various kinds of signals are stated in +@ref{Standard Signals}. + +@item SIG_IGN +@vindex SIG_IGN +@cindex ignore action for a signal +@code{SIG_IGN} specifies that the signal should be ignored. 
+ +Your program generally should not ignore signals that represent serious +events or that are normally used to request termination. You cannot +ignore the @code{SIGKILL} or @code{SIGSTOP} signals at all. You can +ignore program error signals like @code{SIGSEGV}, but ignoring the error +won't enable the program to continue executing meaningfully. Ignoring +user requests such as @code{SIGINT}, @code{SIGQUIT}, and @code{SIGTSTP} +is unfriendly. + +When you do not wish signals to be delivered during a certain part of +the program, the thing to do is to block them, not ignore them. +@xref{Blocking Signals}. + +@item @var{handler} +Supply the address of a handler function in your program, to specify +running this handler as the way to deliver the signal. + +For more information about defining signal handler functions, +see @ref{Defining Handlers}. +@end table + +If you set the action for a signal to @code{SIG_IGN}, or if you set it +to @code{SIG_DFL} and the default action is to ignore that signal, then +any pending signals of that type are discarded (even if they are +blocked). Discarding the pending signals means that they will never be +delivered, not even if you subsequently specify another action and +unblock this kind of signal. + +The @code{signal} function returns the action that was previously in +effect for the specified @var{signum}. You can save this value and +restore it later by calling @code{signal} again. + +If @code{signal} can't honor the request, it returns @code{SIG_ERR} +instead. The following @code{errno} error conditions are defined for +this function: + +@table @code +@item EINVAL +You specified an invalid @var{signum}; or you tried to ignore or provide +a handler for @code{SIGKILL} or @code{SIGSTOP}. 
+@end table +@end deftypefun + +Here is a simple example of setting up a handler to delete temporary +files when certain fatal signals happen: + +@smallexample +#include <signal.h> + +void +termination_handler (int signum) +@{ + struct temp_file *p; + + for (p = temp_file_list; p; p = p->next) + unlink (p->name); +@} + +int +main (void) +@{ + @dots{} + if (signal (SIGINT, termination_handler) == SIG_IGN) + signal (SIGINT, SIG_IGN); + if (signal (SIGHUP, termination_handler) == SIG_IGN) + signal (SIGHUP, SIG_IGN); + if (signal (SIGTERM, termination_handler) == SIG_IGN) + signal (SIGTERM, SIG_IGN); + @dots{} +@} +@end smallexample + +@noindent +Note how if a given signal was previously set to be ignored, this code +avoids altering that setting. This is because non-job-control shells +often ignore certain signals when starting children, and it is important +for the children to respect this. + +We do not handle @code{SIGQUIT} or the program error signals in this +example because these are designed to provide information for debugging +(a core dump), and the temporary files may give useful information. + +@comment signal.h +@comment SVID +@deftypefun sighandler_t ssignal (int @var{signum}, sighandler_t @var{action}) +The @code{ssignal} function does the same thing as @code{signal}; it is +provided only for compatibility with SVID. +@end deftypefun + +@comment signal.h +@comment ANSI +@deftypevr Macro sighandler_t SIG_ERR +The value of this macro is used as the return value from @code{signal} +to indicate an error. +@end deftypevr + +@ignore +@comment RMS says that ``we don't do this''. +Implementations might define additional macros for built-in signal +actions that are suitable as a @var{action} argument to @code{signal}, +besides @code{SIG_IGN} and @code{SIG_DFL}. Identifiers whose names +begin with @samp{SIG_} followed by an uppercase letter are reserved for +this purpose. 
+@end ignore + + +@node Advanced Signal Handling +@subsection Advanced Signal Handling +@cindex @code{sigaction} function + +The @code{sigaction} function has the same basic effect as +@code{signal}: to specify how a signal should be handled by the process. +However, @code{sigaction} offers more control, at the expense of more +complexity. In particular, @code{sigaction} allows you to specify +additional flags to control when the signal is generated and how the +handler is invoked. + +The @code{sigaction} function is declared in @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment POSIX.1 +@deftp {Data Type} {struct sigaction} +Structures of type @code{struct sigaction} are used in the +@code{sigaction} function to specify all the information about how to +handle a particular signal. This structure contains at least the +following members: + +@table @code +@item sighandler_t sa_handler +This is used in the same way as the @var{action} argument to the +@code{signal} function. The value can be @code{SIG_DFL}, +@code{SIG_IGN}, or a function pointer. @xref{Basic Signal Handling}. + +@item sigset_t sa_mask +This specifies a set of signals to be blocked while the handler runs. +Blocking is explained in @ref{Blocking for Handler}. Note that the +signal that was delivered is automatically blocked by default before its +handler is started; this is true regardless of the value in +@code{sa_mask}. If you want that signal not to be blocked within its +handler, you must write code in the handler to unblock it. + +@item int sa_flags +This specifies various flags which can affect the behavior of +the signal. These are described in more detail in @ref{Flags for Sigaction}. 
+@end table +@end deftp + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigaction (int @var{signum}, const struct sigaction *@var{action}, struct sigaction *@var{old-action}) +The @var{action} argument is used to set up a new action for the signal +@var{signum}, while the @var{old-action} argument is used to return +information about the action previously associated with this signal. +(In other words, @var{old-action} has the same purpose as the +@code{signal} function's return value---you can check to see what the +old action in effect for the signal was, and restore it later if you +want.) + +Either @var{action} or @var{old-action} can be a null pointer. If +@var{old-action} is a null pointer, this simply suppresses the return +of information about the old action. If @var{action} is a null pointer, +the action associated with the signal @var{signum} is unchanged; this +allows you to inquire about how a signal is being handled without changing +that handling. + +The return value from @code{sigaction} is zero if it succeeds, and +@code{-1} on failure. The following @code{errno} error conditions are +defined for this function: + +@table @code +@item EINVAL +The @var{signum} argument is not valid, or you are trying to +trap or ignore @code{SIGKILL} or @code{SIGSTOP}. +@end table +@end deftypefun + +@node Signal and Sigaction +@subsection Interaction of @code{signal} and @code{sigaction} + +It's possible to use both the @code{signal} and @code{sigaction} +functions within a single program, but you have to be careful because +they can interact in slightly strange ways. + +The @code{sigaction} function specifies more information than the +@code{signal} function, so the return value from @code{signal} cannot +express the full range of @code{sigaction} possibilities. Therefore, if +you use @code{signal} to save and later reestablish an action, it may +not be able to reestablish properly a handler that was established with +@code{sigaction}.
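+
+As a rough sketch of what a full save and restore can look like (an
+illustration only, not code from the library), the following saves the
+complete action for @code{SIGUSR1} with @code{sigaction}, replaces it
+temporarily, and puts it back. The names @code{on_usr1} and
+@code{save_and_restore_demo} are hypothetical, and the choice of
+@code{SIGUSR1}, @code{SA_RESTART} and a mask containing @code{SIGUSR2}
+is arbitrary.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical handler, used only so there is an action to save.  */
static void
on_usr1 (int signum)
{
  (void) signum;
}

/* Establish on_usr1 with a flag and a mask, ignore SIGUSR1 for a while,
   then restore the saved action.  Returns 1 if the handler, the
   SA_RESTART flag and the mask all survived, 0 if not, -1 on error.  */
int
save_and_restore_demo (void)
{
  struct sigaction original, ignore, saved, restored;

  original.sa_handler = on_usr1;
  sigemptyset (&original.sa_mask);
  sigaddset (&original.sa_mask, SIGUSR2);
  original.sa_flags = SA_RESTART;
  if (sigaction (SIGUSR1, &original, NULL) < 0)
    return -1;

  /* Replace the action and keep the complete old one in `saved'.  */
  ignore.sa_handler = SIG_IGN;
  sigemptyset (&ignore.sa_mask);
  ignore.sa_flags = 0;
  if (sigaction (SIGUSR1, &ignore, &saved) < 0)
    return -1;

  /* ... work that should not be disturbed by SIGUSR1 ... */

  /* Restore everything that was saved, then read it back to check.  */
  if (sigaction (SIGUSR1, &saved, NULL) < 0)
    return -1;
  if (sigaction (SIGUSR1, NULL, &restored) < 0)
    return -1;

  return restored.sa_handler == on_usr1
         && (restored.sa_flags & SA_RESTART) != 0
         && sigismember (&restored.sa_mask, SIGUSR2) == 1;
}
```

+The return value of @code{signal} could carry only the handler address
+here; the flags and the mask travel with the @code{struct sigaction}.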
+ +To avoid having problems as a result, always use @code{sigaction} to +save and restore a handler if your program uses @code{sigaction} at all. +Since @code{sigaction} is more general, it can properly save and +reestablish any action, regardless of whether it was established +originally with @code{signal} or @code{sigaction}. + +On some systems if you establish an action with @code{signal} and then +examine it with @code{sigaction}, the handler address that you get may +not be the same as what you specified with @code{signal}. It may not +even be suitable for use as an action argument with @code{signal}. But +you can rely on using it as an argument to @code{sigaction}. This +problem never happens on the GNU system. + +So, you're better off using one or the other of the mechanisms +consistently within a single program. + +@strong{Portability Note:} The basic @code{signal} function is a feature +of ANSI C, while @code{sigaction} is part of the POSIX.1 standard. If +you are concerned about portability to non-POSIX systems, then you +should use the @code{signal} function instead. + +@node Sigaction Function Example +@subsection @code{sigaction} Function Example + +In @ref{Basic Signal Handling}, we gave an example of establishing a +simple handler for termination signals using @code{signal}. 
Here is an +equivalent example using @code{sigaction}: + +@smallexample +#include <signal.h> + +void +termination_handler (int signum) +@{ + struct temp_file *p; + + for (p = temp_file_list; p; p = p->next) + unlink (p->name); +@} + +int +main (void) +@{ + @dots{} + struct sigaction new_action, old_action; + + /* @r{Set up the structure to specify the new action.} */ + new_action.sa_handler = termination_handler; + sigemptyset (&new_action.sa_mask); + new_action.sa_flags = 0; + + sigaction (SIGINT, NULL, &old_action); + if (old_action.sa_handler != SIG_IGN) + sigaction (SIGINT, &new_action, NULL); + sigaction (SIGHUP, NULL, &old_action); + if (old_action.sa_handler != SIG_IGN) + sigaction (SIGHUP, &new_action, NULL); + sigaction (SIGTERM, NULL, &old_action); + if (old_action.sa_handler != SIG_IGN) + sigaction (SIGTERM, &new_action, NULL); + @dots{} +@} +@end smallexample + +The program just loads the @code{new_action} structure with the desired +parameters and passes it in the @code{sigaction} call. The usage of +@code{sigemptyset} is described later; see @ref{Blocking Signals}. + +As in the example using @code{signal}, we avoid handling signals +previously set to be ignored. Here we can avoid altering the signal +handler even momentarily, by using the feature of @code{sigaction} that +lets us examine the current action without specifying a new one. + +Here is another example. It retrieves information about the current +action for @code{SIGINT} without changing that action. 
+ +@smallexample +struct sigaction query_action; + +if (sigaction (SIGINT, NULL, &query_action) < 0) + /* @r{@code{sigaction} returns -1 in case of error.} */ +else if (query_action.sa_handler == SIG_DFL) + /* @r{@code{SIGINT} is handled in the default, fatal manner.} */ +else if (query_action.sa_handler == SIG_IGN) + /* @r{@code{SIGINT} is ignored.} */ +else + /* @r{A programmer-defined signal handler is in effect.} */ +@end smallexample + +@node Flags for Sigaction +@subsection Flags for @code{sigaction} +@cindex signal flags +@cindex flags for @code{sigaction} +@cindex @code{sigaction} flags + +The @code{sa_flags} member of the @code{sigaction} structure is a +catch-all for special features. Most of the time, @code{SA_RESTART} is +a good value to use for this field. + +The value of @code{sa_flags} is interpreted as a bit mask. Thus, you +should choose the flags you want to set, @sc{or} those flags together, +and store the result in the @code{sa_flags} member of your +@code{sigaction} structure. + +Each signal number has its own set of flags. Each call to +@code{sigaction} affects one particular signal number, and the flags +that you specify apply only to that particular signal. + +In the GNU C library, establishing a handler with @code{signal} sets all +the flags to zero except for @code{SA_RESTART}, whose value depends on +the settings you have made with @code{siginterrupt}. @xref{Interrupted +Primitives}, to see what this is about. + +@pindex signal.h +These macros are defined in the header file @file{signal.h}. + +@comment signal.h +@comment POSIX.1 +@deftypevr Macro int SA_NOCLDSTOP +This flag is meaningful only for the @code{SIGCHLD} signal. When the +flag is set, the system delivers the signal for a terminated child +process but not for one that is stopped. By default, @code{SIGCHLD} is +delivered for both terminated children and stopped children. + +Setting this flag for a signal other than @code{SIGCHLD} has no effect. 
+@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SA_ONSTACK +If this flag is set for a particular signal number, the system uses the +signal stack when delivering that kind of signal. @xref{Signal Stack}. +If a signal with this flag arrives and you have not set a signal stack, +the system terminates the program with @code{SIGILL}. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SA_RESTART +This flag controls what happens when a signal is delivered during +certain primitives (such as @code{open}, @code{read} or @code{write}), +and the signal handler returns normally. There are two alternatives: +the library function can resume, or it can return failure with error +code @code{EINTR}. + +The choice is controlled by the @code{SA_RESTART} flag for the +particular kind of signal that was delivered. If the flag is set, +returning from a handler resumes the library function. If the flag is +clear, returning from a handler makes the function fail. +@xref{Interrupted Primitives}. +@end deftypevr + +@node Initial Signal Actions +@subsection Initial Signal Actions +@cindex initial signal actions + +When a new process is created (@pxref{Creating a Process}), it inherits +handling of signals from its parent process. However, when you load a +new process image using the @code{exec} function (@pxref{Executing a +File}), any signals that you've defined your own handlers for revert to +their @code{SIG_DFL} handling. (If you think about it a little, this +makes sense; the handler functions from the old program are specific to +that program, and aren't even present in the address space of the new +program image.) Of course, the new program can establish its own +handlers. + +When a program is run by a shell, the shell normally sets the initial +actions for the child process to @code{SIG_DFL} or @code{SIG_IGN}, as +appropriate. 
It's a good idea to check to make sure that the shell has +not set up an initial action of @code{SIG_IGN} before you establish your +own signal handlers. + +Here is an example of how to establish a handler for @code{SIGHUP}, but +not if @code{SIGHUP} is currently ignored: + +@smallexample +@group +@dots{} +struct sigaction temp; + +sigaction (SIGHUP, NULL, &temp); + +if (temp.sa_handler != SIG_IGN) + @{ + temp.sa_handler = handle_sighup; + sigemptyset (&temp.sa_mask); + sigaction (SIGHUP, &temp, NULL); + @} +@end group +@end smallexample + +@node Defining Handlers +@section Defining Signal Handlers +@cindex signal handler function + +This section describes how to write a signal handler function that can +be established with the @code{signal} or @code{sigaction} functions. + +A signal handler is just a function that you compile together with the +rest of the program. Instead of directly invoking the function, you use +@code{signal} or @code{sigaction} to tell the operating system to call +it when a signal arrives. This is known as @dfn{establishing} the +handler. @xref{Signal Actions}. + +There are two basic strategies you can use in signal handler functions: + +@itemize @bullet +@item +You can have the handler function note that the signal arrived by +tweaking some global data structures, and then return normally. + +@item +You can have the handler function terminate the program or transfer +control to a point where it can recover from the situation that caused +the signal. +@end itemize + +You need to take special care in writing handler functions because they +can be called asynchronously. That is, a handler might be called at any +point in the program, unpredictably. If two signals arrive during a +very short interval, one handler can run within another. This section +describes what your handler should do, and what you should avoid. + +@menu +* Handler Returns:: Handlers that return normally, and what + this means. 
+* Termination in Handler:: How handler functions terminate a program. +* Longjmp in Handler:: Nonlocal transfer of control out of a + signal handler. +* Signals in Handler:: What happens when signals arrive while + the handler is already occupied. +* Merged Signals:: When a second signal arrives before the + first is handled. +* Nonreentrancy:: Do not call any functions unless you know they + are reentrant with respect to signals. +* Atomic Data Access:: A single handler can run in the middle of + reading or writing a single object. +@end menu + +@node Handler Returns +@subsection Signal Handlers that Return + +Handlers which return normally are usually used for signals such as +@code{SIGALRM} and the I/O and interprocess communication signals. But +a handler for @code{SIGINT} might also return normally after setting a +flag that tells the program to exit at a convenient time. + +It is not safe to return normally from the handler for a program error +signal, because the behavior of the program when the handler function +returns is not defined after a program error. @xref{Program Error +Signals}. + +Handlers that return normally must modify some global variable in order +to have any effect. Typically, the variable is one that is examined +periodically by the program during normal operation. Its data type +should be @code{sig_atomic_t} for reasons described in @ref{Atomic +Data Access}. + +Here is a simple example of such a program. It executes the body of +the loop until it has noticed that a @code{SIGALRM} signal has arrived. +This technique is useful because it allows the iteration in progress +when the signal arrives to complete before the loop exits. + +@smallexample +@include sigh1.c.texi +@end smallexample + +@node Termination in Handler +@subsection Handlers That Terminate the Process + +Handler functions that terminate the program are typically used to cause +orderly cleanup or recovery from program error signals and interactive +interrupts. 
+ +The cleanest way for a handler to terminate the process is to raise the +same signal that ran the handler in the first place. Here is how to do +this: + +@smallexample +volatile sig_atomic_t fatal_error_in_progress = 0; + +void +fatal_error_signal (int sig) +@{ +@group + /* @r{Since this handler is established for more than one kind of signal, } + @r{it might still get invoked recursively by delivery of some other kind} + @r{of signal. Use a static variable to keep track of that.} */ + if (fatal_error_in_progress) + raise (sig); + fatal_error_in_progress = 1; +@end group + +@group + /* @r{Now do the clean up actions:} + @r{- reset terminal modes} + @r{- kill child processes} + @r{- remove lock files} */ + @dots{} +@end group + +@group + /* @r{Now reraise the signal. We reactivate the signal's default} + @r{handling, which is to terminate the process. We could just call} + @r{@code{exit} or @code{abort}, but reraising the signal} + @r{sets the return status from the process correctly.} */ + signal (sig, SIG_DFL); + raise (sig); +@} +@end group +@end smallexample + +@node Longjmp in Handler +@subsection Nonlocal Control Transfer in Handlers +@cindex non-local exit, from signal handler + +You can do a nonlocal transfer of control out of a signal handler using +the @code{setjmp} and @code{longjmp} facilities (@pxref{Non-Local +Exits}). + +When the handler does a nonlocal control transfer, the part of the +program that was running will not continue. If this part of the program +was in the middle of updating an important data structure, the data +structure will remain inconsistent. Since the program does not +terminate, the inconsistency is likely to be noticed later on. + +There are two ways to avoid this problem. One is to block the signal +for the parts of the program that update important data structures. +Blocking the signal delays its delivery until it is unblocked, once the +critical updating is finished. @xref{Blocking Signals}.
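+
+A minimal sketch of the blocking approach might look like this. It
+assumes a pair of variables, @code{pair_first} and @code{pair_second},
+that must always change together; these, the handler
+@code{note_sigint}, and both functions are hypothetical names made up
+for the illustration. The call to @code{raise} only simulates a signal
+arriving in the middle of the update.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical data: the two values must always be equal.  */
int pair_first = 0;
int pair_second = 0;

/* Set by the handler so the caller can tell when it ran.  */
volatile sig_atomic_t sigint_seen = 0;

static void
note_sigint (int signum)
{
  (void) signum;
  sigint_seen = 1;
}

/* Establish note_sigint for SIGINT.  Returns 0 on success.  */
int
install_note_sigint (void)
{
  struct sigaction sa;

  sa.sa_handler = note_sigint;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  return sigaction (SIGINT, &sa, NULL);
}

/* Update both values with SIGINT blocked.  *seen_inside receives the
   value of sigint_seen as observed in the middle of the update.  */
int
update_pair_blocked (int value, int *seen_inside)
{
  sigset_t block, previous;

  sigemptyset (&block);
  sigaddset (&block, SIGINT);
  if (sigprocmask (SIG_BLOCK, &block, &previous) < 0)
    return -1;

  pair_first = value;
  raise (SIGINT);               /* Simulated arrival; stays pending.  */
  *seen_inside = sigint_seen;
  pair_second = value;

  /* Restoring the old mask lets the pending signal be delivered.  */
  if (sigprocmask (SIG_SETMASK, &previous, NULL) < 0)
    return -1;
  return 0;
}
```

+If @code{SIGINT} was not already blocked by the caller, the handler runs
+only once the old mask is restored, so it never observes the two values
+out of step.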
+ +The other way is to reinitialize the crucial data structures in the signal +handler, or to make their values consistent. + +Here is a rather schematic example showing the reinitialization of one +global variable. + +@smallexample +@group +#include <signal.h> +#include <setjmp.h> + +jmp_buf return_to_top_level; + +volatile sig_atomic_t waiting_for_input; + +void +handle_sigint (int signum) +@{ + /* @r{We may have been waiting for input when the signal arrived,} + @r{but we are no longer waiting once we transfer control.} */ + waiting_for_input = 0; + longjmp (return_to_top_level, 1); +@} +@end group + +@group +int +main (void) +@{ + @dots{} + signal (SIGINT, handle_sigint); + @dots{} + while (1) @{ + prepare_for_command (); + if (setjmp (return_to_top_level) == 0) + read_and_execute_command (); + @} +@} +@end group + +@group +/* @r{Imagine this is a subroutine used by various commands.} */ +char * +read_data () +@{ + if (input_from_terminal) @{ + waiting_for_input = 1; + @dots{} + waiting_for_input = 0; + @} else @{ + @dots{} + @} +@} +@end group +@end smallexample + + +@node Signals in Handler +@subsection Signals Arriving While a Handler Runs +@cindex race conditions, relating to signals + +What happens if another signal arrives while your signal handler +function is running? + +When the handler for a particular signal is invoked, that signal is +automatically blocked until the handler returns. That means that if two +signals of the same kind arrive close together, the second one will be +held until the first has been handled. (The handler can explicitly +unblock the signal using @code{sigprocmask}, if you want to allow more +signals of this type to arrive; see @ref{Process Signal Mask}.) + +However, your handler can still be interrupted by delivery of another +kind of signal. To avoid this, you can use the @code{sa_mask} member of +the action structure passed to @code{sigaction} to explicitly specify +which signals should be blocked while the signal handler runs.
These +signals are in addition to the signal for which the handler was invoked, +and any other signals that are normally blocked by the process. +@xref{Blocking for Handler}. + +When the handler returns, the set of blocked signals is restored to the +value it had before the handler ran. So using @code{sigprocmask} inside +the handler only affects what signals can arrive during the execution of +the handler itself, not what signals can arrive once the handler returns. + +@strong{Portability Note:} Always use @code{sigaction} to establish a +handler for a signal that you expect to receive asynchronously, if you +want your program to work properly on System V Unix. On this system, +the handling of a signal whose handler was established with +@code{signal} automatically sets the signal's action back to +@code{SIG_DFL}, and the handler must re-establish itself each time it +runs. This practice, while inconvenient, does work when signals cannot +arrive in succession. However, if another signal can arrive right away, +it may arrive before the handler can re-establish itself. Then the +second signal would receive the default handling, which could terminate +the process. + +@node Merged Signals +@subsection Signals Close Together Merge into One +@cindex handling multiple signals +@cindex successive signals +@cindex merging of signals + +If multiple signals of the same type are delivered to your process +before your signal handler has a chance to be invoked at all, the +handler may only be invoked once, as if only a single signal had +arrived. In effect, the signals merge into one. This situation can +arise when the signal is blocked, or in a multiprocessing environment +where the system is busy running some other processes while the signals +are delivered. This means, for example, that you cannot reliably use a +signal handler to count signals. The only distinction you can reliably +make is whether at least one signal has arrived since a given time in +the past. 
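+
+The merging can be observed with a small experiment, sketched here
+under the assumption that the system does not queue ordinary signals
+(whether it does is up to the implementation). The names
+@code{count_usr1} and @code{count_merged_deliveries} are hypothetical.
+The function blocks @code{SIGUSR1}, raises it several times, unblocks
+it, and reports how many times the handler actually ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Number of times the handler has run.  */
static volatile sig_atomic_t deliveries = 0;

static void
count_usr1 (int signum)
{
  (void) signum;
  ++deliveries;
}

/* Raise SIGUSR1 `sends' times while it is blocked, then unblock it.
   Returns the number of handler invocations, or -1 on error.  */
int
count_merged_deliveries (int sends)
{
  struct sigaction sa;
  sigset_t block, previous;
  int i;

  deliveries = 0;
  sa.sa_handler = count_usr1;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = 0;
  if (sigaction (SIGUSR1, &sa, NULL) < 0)
    return -1;

  sigemptyset (&block);
  sigaddset (&block, SIGUSR1);
  if (sigprocmask (SIG_BLOCK, &block, &previous) < 0)
    return -1;

  /* Each of these only marks the signal as pending.  */
  for (i = 0; i < sends; i++)
    raise (SIGUSR1);

  /* Restoring the mask delivers whatever is still pending.  */
  if (sigprocmask (SIG_SETMASK, &previous, NULL) < 0)
    return -1;

  return deliveries;
}
```

+On a system that merges pending signals, a call such as
+@code{count_merged_deliveries (3)} is likely to report fewer than three
+handler invocations, typically just one.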
+ +Here is an example of a handler for @code{SIGCHLD} that compensates for +the fact that the number of signals received may not equal the number of +child processes that generate them. It assumes that the program keeps track +of all the child processes with a chain of structures as follows: + +@smallexample +struct process +@{ + struct process *next; + /* @r{The process ID of this child.} */ + int pid; + /* @r{The descriptor of the pipe or pseudo terminal} + @r{on which output comes from this child.} */ + int input_descriptor; + /* @r{Nonzero if this process has stopped or terminated.} */ + sig_atomic_t have_status; + /* @r{The status of this child; 0 if running,} + @r{otherwise a status value from @code{waitpid}.} */ + int status; +@}; + +struct process *process_list; +@end smallexample + +This example also uses a flag to indicate whether signals have arrived +since some time in the past---whenever the program last cleared it to +zero. + +@smallexample +/* @r{Nonzero means some child's status has changed} + @r{so look at @code{process_list} for the details.} */ +int process_status_change; +@end smallexample + +Here is the handler itself: + +@smallexample +void +sigchld_handler (int signo) +@{ + int old_errno = errno; + + while (1) @{ + register int pid; + int w; + struct process *p; + + /* @r{Keep asking for a status until we get a definitive result.} */ + do + @{ + errno = 0; + pid = waitpid (WAIT_ANY, &w, WNOHANG | WUNTRACED); + @} + while (pid <= 0 && errno == EINTR); + + if (pid <= 0) @{ + /* @r{A real failure means there are no more} + @r{stopped or terminated child processes, so return.} */ + errno = old_errno; + return; + @} + + /* @r{Find the process that signaled us, and record its status.} */ + + for (p = process_list; p; p = p->next) + if (p->pid == pid) @{ + p->status = w; + /* @r{Indicate that the @code{status} field} + @r{has data to look at.
We do this only after storing it.} */ + p->have_status = 1; + + /* @r{If process has terminated, stop waiting for its output.} */ + if (WIFSIGNALED (w) || WIFEXITED (w)) + if (p->input_descriptor) + FD_CLR (p->input_descriptor, &input_wait_mask); + + /* @r{The program should check this flag from time to time} + @r{to see if there is any news in @code{process_list}.} */ + ++process_status_change; + @} + + /* @r{Loop around to handle all the processes} + @r{that have something to tell us.} */ + @} +@} +@end smallexample + +Here is the proper way to check the flag @code{process_status_change}: + +@smallexample +if (process_status_change) @{ + struct process *p; + process_status_change = 0; + for (p = process_list; p; p = p->next) + if (p->have_status) @{ + @dots{} @r{Examine @code{p->status}} @dots{} + @} +@} +@end smallexample + +@noindent +It is vital to clear the flag before examining the list; otherwise, if a +signal were delivered just before the clearing of the flag, and after +the appropriate element of the process list had been checked, the status +change would go unnoticed until the next signal arrived to set the flag +again. You could, of course, avoid this problem by blocking the signal +while scanning the list, but it is much more elegant to guarantee +correctness by doing things in the right order. + +The loop which checks process status avoids examining @code{p->status} +until it sees that status has been validly stored. This is to make sure +that the status cannot change in the middle of accessing it. Once +@code{p->have_status} is set, it means that the child process is stopped +or terminated, and in either case, it cannot stop or terminate again +until the program has taken notice. @xref{Atomic Usage}, for more +information about coping with interruptions during accessings of a +variable. + +Here is another way you can test whether the handler has run since the +last time you checked. 
This technique uses a counter which is never +changed outside the handler. Instead of clearing the count, the program +remembers the previous value and sees whether it has changed since the +previous check. The advantage of this method is that different parts of +the program can check independently, each part checking whether there +has been a signal since that part last checked. + +@smallexample +sig_atomic_t process_status_change; + +sig_atomic_t last_process_status_change; + +@dots{} +@{ + sig_atomic_t prev = last_process_status_change; + last_process_status_change = process_status_change; + if (last_process_status_change != prev) @{ + struct process *p; + for (p = process_list; p; p = p->next) + if (p->have_status) @{ + @dots{} @r{Examine @code{p->status}} @dots{} + @} + @} +@} +@end smallexample + +@node Nonreentrancy +@subsection Signal Handling and Nonreentrant Functions +@cindex restrictions on signal handler functions + +Handler functions usually don't do very much. The best practice is to +write a handler that does nothing but set an external variable that the +program checks regularly, and leave all serious work to the program. +This is best because the handler can be called asynchronously, at +unpredictable times---perhaps in the middle of a primitive function, or +even between the beginning and the end of a C operator that requires +multiple instructions. The data structures being manipulated might +therefore be in an inconsistent state when the handler function is +invoked. Even copying one @code{int} variable into another can take two +instructions on most machines. + +This means you have to be very careful about what you do in a signal +handler. + +@itemize @bullet +@item +@cindex @code{volatile} declarations +If your handler needs to access any global variables from your program, +declare those variables @code{volatile}. 
This tells the compiler that +the value of the variable might change asynchronously, and inhibits +certain optimizations that would be invalidated by such modifications. + +@item +@cindex reentrant functions +If you call a function in the handler, make sure it is @dfn{reentrant} +with respect to signals, or else make sure that the signal cannot +interrupt a call to a related function. +@end itemize + +A function can be non-reentrant if it uses memory that is not on the +stack. + +@itemize @bullet +@item +If a function uses a static variable or a global variable, or a +dynamically-allocated object that it finds for itself, then it is +non-reentrant and any two calls to the function can interfere. + +For example, suppose that the signal handler uses @code{gethostbyname}. +This function returns its value in a static object, reusing the same +object each time. If the signal happens to arrive during a call to +@code{gethostbyname}, or even after one (while the program is still +using the value), it will clobber the value that the program asked for. + +However, if the program does not use @code{gethostbyname} or any other +function that returns information in the same object, or if it always +blocks signals around each use, then you are safe. + +There are a large number of library functions that return values in a +fixed object, always reusing the same object in this fashion, and all of +them cause the same problem. The description of a function in this +manual always mentions this behavior. + +@item +If a function uses and modifies an object that you supply, then it is +potentially non-reentrant; two calls can interfere if they use the same +object. + +This case arises when you do I/O using streams. Suppose that the +signal handler prints a message with @code{fprintf}. Suppose that the +program was in the middle of an @code{fprintf} call using the same +stream when the signal was delivered. 
Both the signal handler's message +and the program's data could be corrupted, because both calls operate on +the same data structure---the stream itself. + +However, if you know that the stream that the handler uses cannot +possibly be used by the program at a time when signals can arrive, then +you are safe. It is no problem if the program uses some other stream. + +@item +On most systems, @code{malloc} and @code{free} are not reentrant, +because they use a static data structure which records what memory +blocks are free. As a result, no library functions that allocate or +free memory are reentrant. This includes functions that allocate space +to store a result. + +The best way to avoid the need to allocate memory in a handler is to +allocate in advance space for signal handlers to use. + +The best way to avoid freeing memory in a handler is to flag or record +the objects to be freed, and have the program check from time to time +whether anything is waiting to be freed. But this must be done with +care, because placing an object on a chain is not atomic, and if it is +interrupted by another signal handler that does the same thing, you +could ``lose'' one of the objects. + +@ignore +!!! not true +On the GNU system, @code{malloc} and @code{free} are safe to use in +signal handlers because they block signals. As a result, the library +functions that allocate space for a result are also safe in signal +handlers. The obstack allocation functions are safe as long as you +don't use the same obstack both inside and outside of a signal handler. +@end ignore + +The relocating allocation functions (@pxref{Relocating Allocator}) +are certainly not safe to use in a signal handler. + +@item +Any function that modifies @code{errno} is non-reentrant, but you can +correct for this: in the handler, save the original value of +@code{errno} and restore it before returning normally. 
This prevents +errors that occur within the signal handler from being confused with +errors from system calls at the point the program is interrupted to run +the handler. + +This technique is generally applicable; if you want to call in a handler +a function that modifies a particular object in memory, you can make +this safe by saving and restoring that object. + +@item +Merely reading from a memory object is safe provided that you can deal +with any of the values that might appear in the object at a time when +the signal can be delivered. Keep in mind that assignment to some data +types requires more than one instruction, which means that the handler +could run ``in the middle of'' an assignment to the variable if its type +is not atomic. @xref{Atomic Data Access}. + +@item +Merely writing into a memory object is safe as long as a sudden change +in the value, at any time when the handler might run, will not disturb +anything. +@end itemize + +@node Atomic Data Access +@subsection Atomic Data Access and Signal Handling + +Whether the data in your application concerns atoms, or mere text, you +have to be careful about the fact that access to a single datum is not +necessarily @dfn{atomic}. This means that it can take more than one +instruction to read or write a single object. In such cases, a signal +handler might run in the middle of reading or writing the object. + +There are three ways you can cope with this problem. You can use data +types that are always accessed atomically; you can carefully arrange +that nothing untoward happens if an access is interrupted; or you can +block all signals around any access that had better not be interrupted +(@pxref{Blocking Signals}). + +@menu +* Non-atomic Example:: A program illustrating interrupted access. +* Types: Atomic Types. Data types that guarantee no interruption. +* Usage: Atomic Usage. Proving that interruption is harmless. 
+@end menu + +@node Non-atomic Example +@subsubsection Problems with Non-Atomic Access + +Here is an example which shows what can happen if a signal handler runs +in the middle of modifying a variable. (Interrupting the reading of a +variable can also lead to paradoxical results, but here we only show +writing.) + +@smallexample +#include <signal.h> +#include <stdio.h> +#include <unistd.h> + +struct two_words @{ int a, b; @} memory; + +void +handler (int signum) +@{ + printf ("%d,%d\n", memory.a, memory.b); + alarm (1); +@} + +@group +int +main (void) +@{ + static struct two_words zeros = @{ 0, 0 @}, ones = @{ 1, 1 @}; + signal (SIGALRM, handler); + memory = zeros; + alarm (1); + while (1) + @{ + memory = zeros; + memory = ones; + @} +@} +@end group +@end smallexample + +This program fills @code{memory} with zeros, ones, zeros, ones, +alternating forever; meanwhile, once per second, the alarm signal handler +prints the current contents. (Calling @code{printf} in the handler is +safe in this program because it is certainly not being called outside +the handler when the signal happens.) + +Clearly, this program can print a pair of zeros or a pair of ones. But +that's not all it can do! On most machines, it takes several +instructions to store a new value in @code{memory}, and the value is +stored one word at a time. If the signal is delivered in between these +instructions, the handler might find that @code{memory.a} is zero and +@code{memory.b} is one (or vice versa). + +On some machines it may be possible to store a new value in +@code{memory} with just one instruction that cannot be interrupted. On +these machines, the handler will always print two zeros or two ones. + +@node Atomic Types +@subsubsection Atomic Types + +To avoid uncertainty about interrupting access to a variable, you can +use a particular data type for which access is always atomic: +@code{sig_atomic_t}. 
Reading and writing this data type is guaranteed +to happen in a single instruction, so there's no way for a handler to +run ``in the middle'' of an access. + +The type @code{sig_atomic_t} is always an integer data type, but which +one it is, and how many bits it contains, may vary from machine to +machine. + +@comment signal.h +@comment ANSI +@deftp {Data Type} sig_atomic_t +This is an integer data type. Objects of this type are always accessed +atomically. +@end deftp + +In practice, you can assume that @code{int} and other integer types no +longer than @code{int} are atomic. You can also assume that pointer +types are atomic; that is very convenient. Both of these are true on +all of the machines that the GNU C library supports, and on all POSIX +systems we know of. +@c ??? This might fail on a 386 that uses 64-bit pointers. + +@node Atomic Usage +@subsubsection Atomic Usage Patterns + +Certain patterns of access avoid any problem even if an access is +interrupted. For example, a flag which is set by the handler, and +tested and cleared by the main program from time to time, is always safe +even if access actually requires two instructions. To show that this is +so, we must consider each access that could be interrupted, and show +that there is no problem if it is interrupted. + +An interrupt in the middle of testing the flag is safe because either it's +recognized to be nonzero, in which case the precise value doesn't +matter, or it will be seen to be nonzero the next time it's tested. + +An interrupt in the middle of clearing the flag is no problem because +either the value ends up zero, which is what happens if a signal comes +in just before the flag is cleared, or the value ends up nonzero, and +subsequent events occur as if the signal had come in just after the flag +was cleared. As long as the code handles both of these cases properly, +it can also handle a signal in the middle of clearing the flag. 
(This +is an example of the sort of reasoning you need to do to figure out +whether non-atomic usage is safe.) + +Sometimes you can insure uninterrupted access to one object by +protecting its use with another object, perhaps one whose type +guarantees atomicity. @xref{Merged Signals}, for an example. + +@node Interrupted Primitives +@section Primitives Interrupted by Signals + +A signal can arrive and be handled while an I/O primitive such as +@code{open} or @code{read} is waiting for an I/O device. If the signal +handler returns, the system faces the question: what should happen next? + +POSIX specifies one approach: make the primitive fail right away. The +error code for this kind of failure is @code{EINTR}. This is flexible, +but usually inconvenient. Typically, POSIX applications that use signal +handlers must check for @code{EINTR} after each library function that +can return it, in order to try the call again. Often programmers forget +to check, which is a common source of error. + +The GNU library provides a convenient way to retry a call after a +temporary failure, with the macro @code{TEMP_FAILURE_RETRY}: + +@comment unistd.h +@comment GNU +@defmac TEMP_FAILURE_RETRY (@var{expression}) +This macro evaluates @var{expression} once. If it fails and reports +error code @code{EINTR}, @code{TEMP_FAILURE_RETRY} evaluates it again, +and over and over until the result is not a temporary failure. + +The value returned by @code{TEMP_FAILURE_RETRY} is whatever value +@var{expression} produced. +@end defmac + +BSD avoids @code{EINTR} entirely and provides a more convenient +approach: to restart the interrupted primitive, instead of making it +fail. If you choose this approach, you need not be concerned with +@code{EINTR}. + +You can choose either approach with the GNU library. If you use +@code{sigaction} to establish a signal handler, you can specify how that +handler should behave. 
If you specify the @code{SA_RESTART} flag, +return from that handler will resume a primitive; otherwise, return from +that handler will cause @code{EINTR}. @xref{Flags for Sigaction}. + +Another way to specify the choice is with the @code{siginterrupt} +function. @xref{BSD Handler}. + +@c !!! not true now about _BSD_SOURCE +When you don't specify with @code{sigaction} or @code{siginterrupt} what +a particular handler should do, it uses a default choice. The default +choice in the GNU library depends on the feature test macros you have +defined. If you define @code{_BSD_SOURCE} or @code{_GNU_SOURCE} before +calling @code{signal}, the default is to resume primitives; otherwise, +the default is to make them fail with @code{EINTR}. (The library +contains alternate versions of the @code{signal} function, and the +feature test macros determine which one you really call.) @xref{Feature +Test Macros}. +@cindex EINTR, and restarting interrupted primitives +@cindex restarting interrupted primitives +@cindex interrupting primitives +@cindex primitives, interrupting +@c !!! want to have @cindex system calls @i{see} primitives [no page #] + +The description of each primitive affected by this issue +lists @code{EINTR} among the error codes it can return. + +There is one situation where resumption never happens no matter which +choice you make: when a data-transfer function such as @code{read} or +@code{write} is interrupted by a signal after transferring part of the +data. In this case, the function returns the number of bytes already +transferred, indicating partial success. + +This might at first appear to cause unreliable behavior on +record-oriented devices (including datagram sockets; @pxref{Datagrams}), +where splitting one @code{read} or @code{write} into two would read or +write two records. 
Actually, there is no problem, because interruption +after a partial transfer cannot happen on such devices; they always +transfer an entire record in one burst, with no waiting once data +transfer has started. + +@node Generating Signals +@section Generating Signals +@cindex sending signals +@cindex raising signals +@cindex signals, generating + +Besides signals that are generated as a result of a hardware trap or +interrupt, your program can explicitly send signals to itself or to +another process. + +@menu +* Signaling Yourself:: A process can send a signal to itself. +* Signaling Another Process:: Send a signal to another process. +* Permission for kill:: Permission for using @code{kill}. +* Kill Example:: Using @code{kill} for Communication. +@end menu + +@node Signaling Yourself +@subsection Signaling Yourself + +A process can send itself a signal with the @code{raise} function. This +function is declared in @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment ANSI +@deftypefun int raise (int @var{signum}) +The @code{raise} function sends the signal @var{signum} to the calling +process. It returns zero if successful and a nonzero value if it fails. +About the only reason for failure would be if the value of @var{signum} +is invalid. +@end deftypefun + +@comment signal.h +@comment SVID +@deftypefun int gsignal (int @var{signum}) +The @code{gsignal} function does the same thing as @code{raise}; it is +provided only for compatibility with SVID. +@end deftypefun + +One convenient use for @code{raise} is to reproduce the default behavior +of a signal that you have trapped. For instance, suppose a user of your +program types the SUSP character (usually @kbd{C-z}; @pxref{Special +Characters}) to send it an interactive stop signal +(@code{SIGTSTP}), and you want to clean up some internal data buffers +before stopping. You might set this up like this: + +@comment RMS suggested getting rid of the handler for SIGCONT in this function. 
+@comment But that would require that the handler for SIGTSTP unblock the +@comment signal before doing the call to raise. We haven't covered that +@comment topic yet, and I don't want to distract from the main point of +@comment the example with a digression to explain what is going on. As +@comment the example is written, the signal that is raise'd will be delivered +@comment as soon as the SIGTSTP handler returns, which is fine. + +@smallexample +#include <signal.h> + +/* @r{When a stop signal arrives, set the action back to the default + and then resend the signal after doing cleanup actions.} */ + +void +tstp_handler (int sig) +@{ + signal (SIGTSTP, SIG_DFL); + /* @r{Do cleanup actions here.} */ + @dots{} + raise (SIGTSTP); +@} + +/* @r{When the process is continued again, restore the signal handler.} */ + +void +cont_handler (int sig) +@{ + signal (SIGCONT, cont_handler); + signal (SIGTSTP, tstp_handler); +@} + +@group +/* @r{Enable both handlers during program initialization.} */ + +int +main (void) +@{ + signal (SIGCONT, cont_handler); + signal (SIGTSTP, tstp_handler); + @dots{} +@} +@end group +@end smallexample + +@strong{Portability note:} @code{raise} was invented by the ANSI C +committee. Older systems may not support it, so using @code{kill} may +be more portable. @xref{Signaling Another Process}. + +@node Signaling Another Process +@subsection Signaling Another Process + +@cindex killing a process +The @code{kill} function can be used to send a signal to another process. +In spite of its name, it can be used for a lot of things other than +causing a process to terminate. Some examples of situations where you +might want to send signals between processes are: + +@itemize @bullet +@item +A parent process starts a child to perform a task---perhaps having the +child running an infinite loop---and then terminates the child when the +task is no longer needed. 
+ +@item +A process executes as part of a group, and needs to terminate or notify +the other processes in the group when an error or other event occurs. + +@item +Two processes need to synchronize while working together. +@end itemize + +This section assumes that you know a little bit about how processes +work. For more information on this subject, see @ref{Processes}. + +The @code{kill} function is declared in @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment POSIX.1 +@deftypefun int kill (pid_t @var{pid}, int @var{signum}) +The @code{kill} function sends the signal @var{signum} to the process +or process group specified by @var{pid}. Besides the signals listed in +@ref{Standard Signals}, @var{signum} can also have a value of zero to +check the validity of the @var{pid}. + +The @var{pid} specifies the process or process group to receive the +signal: + +@table @code +@item @var{pid} > 0 +The process whose identifier is @var{pid}. + +@item @var{pid} == 0 +All processes in the same process group as the sender. + +@item @var{pid} < -1 +The process group whose identifier is @minus{}@var{pid}. + +@item @var{pid} == -1 +If the process is privileged, send the signal to all processes except +for some special system processes. Otherwise, send the signal to all +processes with the same effective user ID. +@end table + +A process can send a signal @var{signum} to itself with a call like +@w{@code{kill (getpid(), @var{signum})}}. If @code{kill} is used by a +process to send a signal to itself, and the signal is not blocked, then +@code{kill} delivers at least one signal (which might be some other +pending unblocked signal instead of the signal @var{signum}) to that +process before it returns. + +The return value from @code{kill} is zero if the signal can be sent +successfully. Otherwise, no signal is sent, and a value of @code{-1} is +returned. 
If @var{pid} specifies sending a signal to several processes, +@code{kill} succeeds if it can send the signal to at least one of them. +There's no way you can tell which of the processes got the signal +or whether all of them did. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The @var{signum} argument is an invalid or unsupported number. + +@item EPERM +You do not have the privilege to send a signal to the process or any of +the processes in the process group named by @var{pid}. + +@item ESRCH +The @var{pid} argument does not refer to an existing process or group. +@end table +@end deftypefun + +@comment signal.h +@comment BSD +@deftypefun int killpg (int @var{pgid}, int @var{signum}) +This is similar to @code{kill}, but sends signal @var{signum} to the +process group @var{pgid}. This function is provided for compatibility +with BSD; using @code{kill} to do this is more portable. +@end deftypefun + +As a simple example of @code{kill}, the call @w{@code{kill (getpid (), +@var{sig})}} has the same effect as @w{@code{raise (@var{sig})}}. + +@node Permission for kill +@subsection Permission for using @code{kill} + +There are restrictions that prevent you from using @code{kill} to send +signals to any random process. These are intended to prevent antisocial +behavior such as arbitrarily killing off processes belonging to another +user. In typical use, @code{kill} is used to pass signals between +parent, child, and sibling processes, and in these situations you +normally do have permission to send signals. The only common exception +is when you run a setuid program in a child process; if the program +changes its real UID as well as its effective UID, you may not have +permission to send a signal. The @code{su} program does this. + +Whether a process has permission to send a signal to another process +is determined by the user IDs of the two processes. 
This concept is +discussed in detail in @ref{Process Persona}. + +Generally, for a process to be able to send a signal to another process, +either the sending process must belong to a privileged user (like +@samp{root}), or the real or effective user ID of the sending process +must match the real or effective user ID of the receiving process. If +the receiving process has changed its effective user ID from the +set-user-ID mode bit on its process image file, then the owner of the +process image file is used in place of its current effective user ID. +In some implementations, a parent process might be able to send signals +to a child process even if the user ID's don't match, and other +implementations might enforce other restrictions. + +The @code{SIGCONT} signal is a special case. It can be sent if the +sender is part of the same session as the receiver, regardless of +user IDs. + +@node Kill Example +@subsection Using @code{kill} for Communication +@cindex interprocess communication, with signals +Here is a longer example showing how signals can be used for +interprocess communication. This is what the @code{SIGUSR1} and +@code{SIGUSR2} signals are provided for. Since these signals are fatal +by default, the process that is supposed to receive them must trap them +through @code{signal} or @code{sigaction}. + +In this example, a parent process forks a child process and then waits +for the child to complete its initialization. The child process tells +the parent when it is ready by sending it a @code{SIGUSR1} signal, using +the @code{kill} function. + +@smallexample +@include sigusr.c.texi +@end smallexample + +This example uses a busy wait, which is bad, because it wastes CPU +cycles that other programs could otherwise use. It is better to ask the +system to wait until the signal arrives. See the example in +@ref{Waiting for a Signal}. 
+ +@node Blocking Signals +@section Blocking Signals +@cindex blocking signals + +Blocking a signal means telling the operating system to hold it and +deliver it later. Generally, a program does not block signals +indefinitely---it might as well ignore them by setting their actions to +@code{SIG_IGN}. But it is useful to block signals briefly, to prevent +them from interrupting sensitive operations. For instance: + +@itemize @bullet +@item +You can use the @code{sigprocmask} function to block signals while you +modify global variables that are also modified by the handlers for these +signals. + +@item +You can set @code{sa_mask} in your @code{sigaction} call to block +certain signals while a particular signal handler runs. This way, the +signal handler can run without being interrupted itself by signals. +@end itemize + +@menu +* Why Block:: The purpose of blocking signals. +* Signal Sets:: How to specify which signals to + block. +* Process Signal Mask:: Blocking delivery of signals to your + process during normal execution. +* Testing for Delivery:: Blocking to Test for Delivery of + a Signal. +* Blocking for Handler:: Blocking additional signals while a + handler is being run. +* Checking for Pending Signals:: Checking for Pending Signals +* Remembering a Signal:: How you can get almost the same + effect as blocking a signal, by + handling it and setting a flag + to be tested later. +@end menu + +@node Why Block +@subsection Why Blocking Signals is Useful + +Temporary blocking of signals with @code{sigprocmask} gives you a way to +prevent interrupts during critical parts of your code. If signals +arrive in that part of the program, they are delivered later, after you +unblock them. + +One example where this is useful is for sharing data between a signal +handler and the rest of the program. 
If the type of the data is not +@code{sig_atomic_t} (@pxref{Atomic Data Access}), then the signal +handler could run when the rest of the program has only half finished +reading or writing the data. This would lead to confusing consequences. + +To make the program reliable, you can prevent the signal handler from +running while the rest of the program is examining or modifying that +data---by blocking the appropriate signal around the parts of the +program that touch the data. + +Blocking signals is also necessary when you want to perform a certain +action only if a signal has not arrived. Suppose that the handler for +the signal sets a flag of type @code{sig_atomic_t}; you would like to +test the flag and perform the action if the flag is not set. This is +unreliable. Suppose the signal is delivered immediately after you test +the flag, but before the consequent action: then the program will +perform the action even though the signal has arrived. + +The only way to test reliably for whether a signal has yet arrived is to +test while the signal is blocked. + +@node Signal Sets +@subsection Signal Sets + +All of the signal blocking functions use a data structure called a +@dfn{signal set} to specify what signals are affected. Thus, every +activity involves two stages: creating the signal set, and then passing +it as an argument to a library function. +@cindex signal set + +These facilities are declared in the header file @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment POSIX.1 +@deftp {Data Type} sigset_t +The @code{sigset_t} data type is used to represent a signal set. +Internally, it may be implemented as either an integer or structure +type. + +For portability, use only the functions described in this section to +initialize, change, and retrieve information from @code{sigset_t} +objects---don't try to manipulate them directly. +@end deftp + +There are two ways to initialize a signal set. 
You can initially +specify it to be empty with @code{sigemptyset} and then add specified +signals individually. Or you can specify it to be full with +@code{sigfillset} and then delete specified signals individually. + +You must always initialize the signal set with one of these two +functions before using it in any other way. Don't try to set all the +signals explicitly because the @code{sigset_t} object might include some +other information (like a version field) that needs to be initialized as +well. (In addition, it's not wise to put into your program an +assumption that the system has no signals aside from the ones you know +about.) + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigemptyset (sigset_t *@var{set}) +This function initializes the signal set @var{set} to exclude all of the +defined signals. It always returns @code{0}. +@end deftypefun + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigfillset (sigset_t *@var{set}) +This function initializes the signal set @var{set} to include +all of the defined signals. Again, the return value is @code{0}. +@end deftypefun + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigaddset (sigset_t *@var{set}, int @var{signum}) +This function adds the signal @var{signum} to the signal set @var{set}. +All @code{sigaddset} does is modify @var{set}; it does not block or +unblock any signals. + +The return value is @code{0} on success and @code{-1} on failure. +The following @code{errno} error condition is defined for this function: + +@table @code +@item EINVAL +The @var{signum} argument doesn't specify a valid signal. +@end table +@end deftypefun + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigdelset (sigset_t *@var{set}, int @var{signum}) +This function removes the signal @var{signum} from the signal set +@var{set}. All @code{sigdelset} does is modify @var{set}; it does not +block or unblock any signals. The return value and error conditions are +the same as for @code{sigaddset}. 
+@end deftypefun + +Finally, there is a function to test what signals are in a signal set: + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigismember (const sigset_t *@var{set}, int @var{signum}) +The @code{sigismember} function tests whether the signal @var{signum} is +a member of the signal set @var{set}. It returns @code{1} if the signal +is in the set, @code{0} if not, and @code{-1} if there is an error. + +The following @code{errno} error condition is defined for this function: + +@table @code +@item EINVAL +The @var{signum} argument doesn't specify a valid signal. +@end table +@end deftypefun + +@node Process Signal Mask +@subsection Process Signal Mask +@cindex signal mask +@cindex process signal mask + +The collection of signals that are currently blocked is called the +@dfn{signal mask}. Each process has its own signal mask. When you +create a new process (@pxref{Creating a Process}), it inherits its +parent's mask. You can block or unblock signals with total flexibility +by modifying the signal mask. + +The prototype for the @code{sigprocmask} function is in @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigprocmask (int @var{how}, const sigset_t *@var{set}, sigset_t *@var{oldset}) +The @code{sigprocmask} function is used to examine or change the calling +process's signal mask. The @var{how} argument determines how the signal +mask is changed, and must be one of the following values: + +@table @code +@comment signal.h +@comment POSIX.1 +@vindex SIG_BLOCK +@item SIG_BLOCK +Block the signals in @var{set}---add them to the existing mask. In +other words, the new mask is the union of the existing mask and +@var{set}. + +@comment signal.h +@comment POSIX.1 +@vindex SIG_UNBLOCK +@item SIG_UNBLOCK +Unblock the signals in @var{set}---remove them from the existing mask.
+ +@comment signal.h +@comment POSIX.1 +@vindex SIG_SETMASK +@item SIG_SETMASK +Use @var{set} for the mask; ignore the previous value of the mask. +@end table + +The last argument, @var{oldset}, is used to return information about the +old process signal mask. If you just want to change the mask without +looking at it, pass a null pointer as the @var{oldset} argument. +Similarly, if you want to know what's in the mask without changing it, +pass a null pointer for @var{set} (in this case the @var{how} argument +is not significant). The @var{oldset} argument is often used to +remember the previous signal mask in order to restore it later. (Since +the signal mask is inherited over @code{fork} and @code{exec} calls, you +can't predict what its contents are when your program starts running.) + +If invoking @code{sigprocmask} causes any pending signals to be +unblocked, at least one of those signals is delivered to the process +before @code{sigprocmask} returns. The order in which pending signals +are delivered is not specified, but you can control the order explicitly +by making multiple @code{sigprocmask} calls to unblock various signals +one at a time. + +The @code{sigprocmask} function returns @code{0} if successful, and @code{-1} +to indicate an error. The following @code{errno} error conditions are +defined for this function: + +@table @code +@item EINVAL +The @var{how} argument is invalid. +@end table + +You can't block the @code{SIGKILL} and @code{SIGSTOP} signals, but +if the signal set includes these, @code{sigprocmask} just ignores +them instead of returning an error status. + +Remember, too, that blocking program error signals such as @code{SIGFPE} +leads to undesirable results for signals generated by an actual program +error (as opposed to signals sent with @code{raise} or @code{kill}). +This is because your program may be too broken to be able to continue +executing to a point where the signal is unblocked again. +@xref{Program Error Signals}. 
+@end deftypefun + +@node Testing for Delivery +@subsection Blocking to Test for Delivery of a Signal + +Now for a simple example. Suppose you establish a handler for +@code{SIGALRM} signals that sets a flag whenever a signal arrives, and +your main program checks this flag from time to time and then resets it. +You can prevent additional @code{SIGALRM} signals from arriving in the +meantime by wrapping the critical part of the code with calls to +@code{sigprocmask}, like this: + +@smallexample +/* @r{This variable is set by the SIGALRM signal handler.} */ +volatile sig_atomic_t flag = 0; + +int +main (void) +@{ + sigset_t block_alarm; + + @dots{} + + /* @r{Initialize the signal mask.} */ + sigemptyset (&block_alarm); + sigaddset (&block_alarm, SIGALRM); + +@group + while (1) + @{ + /* @r{Check if a signal has arrived; if so, reset the flag.} */ + sigprocmask (SIG_BLOCK, &block_alarm, NULL); + if (flag) + @{ + @var{actions-if-arrived} + flag = 0; + @} + sigprocmask (SIG_UNBLOCK, &block_alarm, NULL); + + @dots{} + @} +@} +@end group +@end smallexample + +@node Blocking for Handler +@subsection Blocking Signals for a Handler +@cindex blocking signals, in a handler + +When a signal handler is invoked, you usually want it to be able to +finish without being interrupted by another signal. From the moment the +handler starts until the moment it finishes, you must block signals that +might confuse it or corrupt its data. + +When a handler function is invoked on a signal, that signal is +automatically blocked (in addition to any other signals that are already +in the process's signal mask) during the time the handler is running. +If you set up a handler for @code{SIGTSTP}, for instance, then the +arrival of that signal forces further @code{SIGTSTP} signals to wait +during the execution of the handler. + +However, by default, other kinds of signals are not blocked; they can +arrive during handler execution.
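
The automatic blocking described above can be observed directly. The following is only a minimal sketch, not something the library provides: the names @code{record_mask} and @code{probe_handler_mask} are made up for illustration, and the code assumes a single-threaded process in which @code{SIGUSR1} and @code{SIGUSR2} are free for the program's own use. The handler asks @code{sigprocmask} for the current mask and records whether each of the two signals is in it.

```c
#define _POSIX_C_SOURCE 200809L  /* Ask for the POSIX.1 declarations.  */
#include <signal.h>
#include <stddef.h>

/* Filled in by the handler: 1 if the signal was blocked while the
   handler ran, 0 if it was not, -1 if the handler has not run yet.  */
volatile sig_atomic_t usr1_blocked_in_handler = -1;
volatile sig_atomic_t usr2_blocked_in_handler = -1;

/* Hypothetical handler: record the mask in effect while it runs.  */
static void
record_mask (int signum)
{
  sigset_t current;

  (void) signum;
  /* A null SET argument leaves the mask alone and just reports it.  */
  if (sigprocmask (SIG_BLOCK, NULL, &current) == 0)
    {
      usr1_blocked_in_handler = sigismember (&current, SIGUSR1);
      usr2_blocked_in_handler = sigismember (&current, SIGUSR2);
    }
}

/* Install the handler for SIGUSR1 with an empty sa_mask and send
   SIGUSR1 to ourselves.  Returns 0 on success, nonzero on failure.  */
int
probe_handler_mask (void)
{
  struct sigaction action;
  sigset_t both;

  /* Start with both signals unblocked so the result is predictable.  */
  sigemptyset (&both);
  sigaddset (&both, SIGUSR1);
  sigaddset (&both, SIGUSR2);
  if (sigprocmask (SIG_UNBLOCK, &both, NULL) < 0)
    return -1;

  action.sa_handler = record_mask;
  sigemptyset (&action.sa_mask);   /* No additional signals are blocked.  */
  action.sa_flags = 0;
  if (sigaction (SIGUSR1, &action, NULL) < 0)
    return -1;

  /* The handler runs before raise returns.  */
  return raise (SIGUSR1);
}
```

Under these assumptions the handler should find @code{SIGUSR1} in the mask and @code{SIGUSR2} absent from it, since nothing asked for @code{SIGUSR2} to be blocked.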
+ +The reliable way to block other kinds of signals during the execution of +the handler is to use the @code{sa_mask} member of the @code{sigaction} +structure. + +Here is an example: + +@smallexample +#include <signal.h> +#include <stddef.h> + +void catch_stop (int); + +void +install_handler (void) +@{ + struct sigaction setup_action; + sigset_t block_mask; + + sigemptyset (&block_mask); + /* @r{Block other terminal-generated signals while handler runs.} */ + sigaddset (&block_mask, SIGINT); + sigaddset (&block_mask, SIGQUIT); + setup_action.sa_handler = catch_stop; + setup_action.sa_mask = block_mask; + setup_action.sa_flags = 0; + sigaction (SIGTSTP, &setup_action, NULL); +@} +@end smallexample + +This is more reliable than blocking the other signals explicitly in the +code for the handler. If you block signals explicitly in the handler, +you can't avoid at least a short interval at the beginning of the +handler where they are not yet blocked. + +You cannot remove signals from the process's current mask using this +mechanism. However, you can make calls to @code{sigprocmask} within +your handler to block or unblock signals as you wish. + +In any case, when the handler returns, the system restores the mask that +was in place before the handler was entered. If any signals that become +unblocked by this restoration are pending, the process will receive +those signals immediately, before returning to the code that was +interrupted. + +@node Checking for Pending Signals +@subsection Checking for Pending Signals +@cindex pending signals, checking for +@cindex blocked signals, checking for +@cindex checking for pending signals + +You can find out which signals are pending at any time by calling +@code{sigpending}. This function is declared in @file{signal.h}. +@pindex signal.h + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigpending (sigset_t *@var{set}) +The @code{sigpending} function stores information about pending signals +in @var{set}.
If there is a pending signal that is blocked from +delivery, then that signal is a member of the returned set. (You can +test whether a particular signal is a member of this set using +@code{sigismember}; see @ref{Signal Sets}.) + +The return value is @code{0} if successful, and @code{-1} on failure. +@end deftypefun + +Testing whether a signal is pending is not often useful. Testing when +that signal is not blocked is almost certainly bad design. + +Here is an example. + +@smallexample +#include <signal.h> +#include <stddef.h> + +sigset_t base_mask, waiting_mask; + +sigemptyset (&base_mask); +sigaddset (&base_mask, SIGINT); +sigaddset (&base_mask, SIGTSTP); + +/* @r{Block user interrupts while doing other processing.} */ +sigprocmask (SIG_SETMASK, &base_mask, NULL); +@dots{} + +/* @r{After a while, check to see whether any signals are pending.} */ +sigpending (&waiting_mask); +if (sigismember (&waiting_mask, SIGINT)) @{ + /* @r{User has tried to kill the process.} */ +@} +else if (sigismember (&waiting_mask, SIGTSTP)) @{ + /* @r{User has tried to stop the process.} */ +@} +@end smallexample + +Remember that if there is a particular signal pending for your process, +additional signals of that same type that arrive in the meantime might +be discarded. For example, if a @code{SIGINT} signal is pending when +another @code{SIGINT} signal arrives, your program will probably only +see one of them when you unblock this signal. + +@strong{Portability Note:} The @code{sigpending} function is new in +POSIX.1. Older systems have no equivalent facility. + +@node Remembering a Signal +@subsection Remembering a Signal to Act On Later + +Instead of blocking a signal using the library facilities, you can get +almost the same results by making the handler set a flag to be tested +later, when you ``unblock''. 
Here is an example: + +@smallexample +/* @r{If this flag is nonzero, don't handle the signal right away.} */ +volatile sig_atomic_t defer_signal; + +/* @r{This is nonzero if a signal arrived and was not handled.} */ +volatile sig_atomic_t signal_pending; + +void +handler (int signum) +@{ + if (defer_signal) + signal_pending = signum; + else + @dots{} /* @r{``Really'' handle the signal.} */ +@} + +@dots{} + +void +update_mumble (int frob) +@{ + /* @r{Prevent signals from having immediate effect.} */ + defer_signal++; + /* @r{Now update @code{mumble}, without worrying about interruption.} */ + mumble.a = 1; + mumble.b = hack (); + mumble.c = frob; + /* @r{We have updated @code{mumble}. Handle any signal that came in.} */ + defer_signal--; + if (defer_signal == 0 && signal_pending != 0) + raise (signal_pending); +@} +@end smallexample + +Note how the particular signal that arrives is stored in +@code{signal_pending}. That way, we can handle several types of +inconvenient signals with the same mechanism. + +We increment and decrement @code{defer_signal} so that nested critical +sections will work properly; thus, if @code{update_mumble} were called +with @code{defer_signal} already nonzero, signals would be deferred +not only within @code{update_mumble}, but also within the caller. This +is also why we do not check @code{signal_pending} if @code{defer_signal} +is still nonzero. + +The incrementing and decrementing of @code{defer_signal} require more +than one instruction; it is possible for a signal to happen in the +middle. But that does not cause any problem. If the signal happens +early enough to see the value from before the increment or decrement, +that is equivalent to a signal which came before the beginning of the +increment or decrement, which is a case that works properly. + +It is absolutely vital to decrement @code{defer_signal} before testing +@code{signal_pending}, because this avoids a subtle bug.
If we did +these things in the other order, like this, + +@smallexample + if (defer_signal == 1 && signal_pending != 0) + raise (signal_pending); + defer_signal--; +@end smallexample + +@noindent +then a signal arriving in between the @code{if} statement and the decrement +would be effectively ``lost'' for an indefinite amount of time. The +handler would merely set @code{signal_pending}, but since the program has +already tested this variable, it would not test the variable again. + +@cindex timing error in signal handling +Bugs like these are called @dfn{timing errors}. They are especially bad +because they happen only rarely and are nearly impossible to reproduce. +You can't expect to find them with a debugger as you would find a +reproducible bug. So it is worth being especially careful to avoid +them. + +(You would not be tempted to write the code in this order, given the use +of @code{defer_signal} as a counter which must be tested along with +@code{signal_pending}. After all, testing for zero is cleaner than +testing for one. But if you did not use @code{defer_signal} as a +counter, and gave it values of zero and one only, then either order +might seem equally simple. This is a further advantage of using a +counter for @code{defer_signal}: it will reduce the chance you will +write the code in the wrong order and create a subtle bug.) + +@node Waiting for a Signal +@section Waiting for a Signal +@cindex waiting for a signal +@cindex @code{pause} function + +If your program is driven by external events, or uses signals for +synchronization, then when it has nothing to do it should probably wait +until a signal arrives. + +@menu +* Using Pause:: The simple way, using @code{pause}. +* Pause Problems:: Why the simple way is often not very good. +* Sigsuspend:: Reliably waiting for a specific signal. +@end menu + +@node Using Pause +@subsection Using @code{pause} + +The simple way to wait until a signal arrives is to call @code{pause}.
+Please read about its disadvantages, in the following section, before +you use it. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int pause () +The @code{pause} function suspends program execution until a signal +arrives whose action is either to execute a handler function, or to +terminate the process. + +If the signal causes a handler function to be executed, then +@code{pause} returns. This is considered an unsuccessful return (since +``successful'' behavior would be to suspend the program forever), so the +return value is @code{-1}. Even if you specify that other primitives +should resume when a signal handler returns (@pxref{Interrupted +Primitives}), this has no effect on @code{pause}; it always fails when a +signal is handled. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EINTR +The function was interrupted by delivery of a signal. +@end table + +If the signal causes program termination, @code{pause} doesn't return +(obviously). + +The @code{pause} function is declared in @file{unistd.h}. +@end deftypefun + +@node Pause Problems +@subsection Problems with @code{pause} + +The simplicity of @code{pause} can conceal serious timing errors that +can make a program hang mysteriously. + +It is safe to use @code{pause} if the real work of your program is done +by the signal handlers themselves, and the ``main program'' does nothing +but call @code{pause}. Each time a signal is delivered, the handler +will do the next batch of work that is to be done, and then return, so +that the main loop of the program can call @code{pause} again. + +You can't safely use @code{pause} to wait until one more signal arrives, +and then resume real work. Even if you arrange for the signal handler +to cooperate by setting a flag, you still can't use @code{pause} +reliably.
Here is an example of this problem: + +@smallexample +/* @r{@code{usr_interrupt} is set by the signal handler.} */ +if (!usr_interrupt) + pause (); + +/* @r{Do work once the signal arrives.} */ +@dots{} +@end smallexample + +@noindent +This has a bug: the signal could arrive after the variable +@code{usr_interrupt} is checked, but before the call to @code{pause}. +If no further signals arrive, the process would never wake up again. + +You can put an upper limit on the excess waiting by using @code{sleep} +in a loop, instead of using @code{pause}. (@xref{Sleeping}, for more +about @code{sleep}.) Here is what this looks like: + +@smallexample +/* @r{@code{usr_interrupt} is set by the signal handler.} */ +while (!usr_interrupt) + sleep (1); + +/* @r{Do work once the signal arrives.} */ +@dots{} +@end smallexample + +For some purposes, that is good enough. But with a little more +complexity, you can wait reliably until a particular signal handler is +run, using @code{sigsuspend}. +@ifinfo +@xref{Sigsuspend}. +@end ifinfo + +@node Sigsuspend +@subsection Using @code{sigsuspend} + +The clean and reliable way to wait for a signal to arrive is to block it +and then use @code{sigsuspend}. By using @code{sigsuspend} in a loop, +you can wait for certain kinds of signals, while letting other kinds of +signals be handled by their handlers. + +@comment signal.h +@comment POSIX.1 +@deftypefun int sigsuspend (const sigset_t *@var{set}) +This function replaces the process's signal mask with @var{set} and then +suspends the process until a signal is delivered whose action is either +to terminate the process or invoke a signal handling function. In other +words, the program is effectively suspended until one of the signals that +is not a member of @var{set} arrives. + +If the process is woken up by delivery of a signal that invokes a handler +function, and the handler function returns, then @code{sigsuspend} also +returns.
+ +The mask remains @var{set} only as long as @code{sigsuspend} is waiting. +The function @code{sigsuspend} always restores the previous signal mask +when it returns. + +The return value and error conditions are the same as for @code{pause}. +@end deftypefun + +With @code{sigsuspend}, you can replace the @code{pause} or @code{sleep} +loop in the previous section with something completely reliable: + +@smallexample +sigset_t mask, oldmask; + +@dots{} + +/* @r{Set up the mask of signals to temporarily block.} */ +sigemptyset (&mask); +sigaddset (&mask, SIGUSR1); + +@dots{} + +/* @r{Wait for a signal to arrive.} */ +sigprocmask (SIG_BLOCK, &mask, &oldmask); +while (!usr_interrupt) + sigsuspend (&oldmask); +sigprocmask (SIG_UNBLOCK, &mask, NULL); +@end smallexample + +This last piece of code is a little tricky. The key point to remember +here is that when @code{sigsuspend} returns, it resets the process's +signal mask to the original value, the value from before the call to +@code{sigsuspend}---in this case, the @code{SIGUSR1} signal is once +again blocked. The second call to @code{sigprocmask} is +necessary to explicitly unblock this signal. + +One other point: you may be wondering why the @code{while} loop is +necessary at all, since the program is apparently only waiting for one +@code{SIGUSR1} signal. The answer is that the mask passed to +@code{sigsuspend} permits the process to be woken up by the delivery of +other kinds of signals, as well---for example, job control signals. If +the process is woken up by a signal that doesn't set +@code{usr_interrupt}, it just suspends itself again until the ``right'' +kind of signal eventually arrives. + +This technique takes a few more lines of preparation, but that is needed +just once for each kind of wait criterion you want to use. The code +that actually waits is just four lines. 
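
For readers who want something they can compile, here is one way the fragment above might be completed. It is a sketch under stated assumptions rather than the only correct arrangement: the names @code{note_usr1} and @code{wait_for_usr1} are hypothetical, the process is assumed to be single-threaded, and the call to @code{raise} merely stands in for some other process sending @code{SIGUSR1}. Unlike the fragment, it removes @code{SIGUSR1} from a copy of the old mask before waiting, in case the signal was already blocked when the function was entered, and it restores the old mask with @code{SIG_SETMASK}.

```c
#define _POSIX_C_SOURCE 200809L  /* Ask for the POSIX.1 declarations.  */
#include <signal.h>
#include <stddef.h>

/* Set by the SIGUSR1 handler.  */
volatile sig_atomic_t usr_interrupt = 0;

/* Hypothetical handler: just note that the signal arrived.  */
static void
note_usr1 (int signum)
{
  (void) signum;
  usr_interrupt = 1;
}

/* Wait until SIGUSR1 has been handled.  Returns 0 on success and
   -1 on failure.  */
int
wait_for_usr1 (void)
{
  struct sigaction action;
  sigset_t mask, oldmask, waitmask;

  action.sa_handler = note_usr1;
  sigemptyset (&action.sa_mask);
  action.sa_flags = 0;
  if (sigaction (SIGUSR1, &action, NULL) < 0)
    return -1;

  /* Block SIGUSR1 so it cannot slip in between the test of the flag
     and the call to sigsuspend.  */
  sigemptyset (&mask);
  sigaddset (&mask, SIGUSR1);
  if (sigprocmask (SIG_BLOCK, &mask, &oldmask) < 0)
    return -1;

  /* Stand-in for the sender.  The signal is blocked, so it stays
     pending instead of being handled here.  */
  if (raise (SIGUSR1) != 0)
    {
      sigprocmask (SIG_SETMASK, &oldmask, NULL);
      return -1;
    }

  /* Wait with the old mask, minus SIGUSR1 in case it was already
     blocked on entry.  */
  waitmask = oldmask;
  sigdelset (&waitmask, SIGUSR1);
  while (!usr_interrupt)
    sigsuspend (&waitmask);

  /* sigsuspend put back the mask that blocks SIGUSR1; restore the
     caller's original mask.  */
  return sigprocmask (SIG_SETMASK, &oldmask, NULL);
}
```

Because the signal is raised while it is blocked, this version exercises exactly the case that defeats @code{pause}: the signal is already pending when the wait begins, and @code{sigsuspend} still wakes up.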
+ +@node Signal Stack +@section Using a Separate Signal Stack + +A signal stack is a special area of memory to be used as the execution +stack during signal handlers. It should be fairly large, to avoid any +danger that it will overflow in turn; the macro @code{SIGSTKSZ} is +defined to a canonical size for signal stacks. You can use +@code{malloc} to allocate the space for the stack. Then call +@code{sigaltstack} or @code{sigstack} to tell the system to use that +space for the signal stack. + +You don't need to write signal handlers differently in order to use a +signal stack. Switching from one stack to the other happens +automatically. (Some non-GNU debuggers on some machines may get +confused if you examine a stack trace while a handler that uses the +signal stack is running.) + +There are two interfaces for telling the system to use a separate signal +stack. @code{sigstack} is the older interface, which comes from 4.2 +BSD. @code{sigaltstack} is the newer interface, and comes from 4.4 +BSD. The @code{sigaltstack} interface has the advantage that it does +not require your program to know which direction the stack grows, which +depends on the specific machine and operating system. + +@comment signal.h +@comment BSD +@deftp {Data Type} {struct sigaltstack} +This structure describes a signal stack. It contains the following members: + +@table @code +@item void *ss_sp +This points to the base of the signal stack. + +@item size_t ss_size +This is the size (in bytes) of the signal stack which @samp{ss_sp} points to. +You should set this to however much space you allocated for the stack. + +There are two macros defined in @file{signal.h} that you should use in +calculating this size: + +@vtable @code +@item SIGSTKSZ +This is the canonical size for a signal stack. It is judged to be +sufficient for normal uses. + +@item MINSIGSTKSZ +This is the amount of signal stack space the operating system needs just +to implement signal delivery. 
The size of a signal stack @strong{must} +be greater than this. + +For most cases, just using @code{SIGSTKSZ} for @code{ss_size} is +sufficient. But if you know how much stack space your program's signal +handlers will need, you may want to use a different size. In this case, +you should allocate @code{MINSIGSTKSZ} additional bytes for the signal +stack and increase @code{ss_size} accordingly. +@end vtable + +@item int ss_flags +This field contains the bitwise @sc{or} of these flags: + +@vtable @code +@item SA_DISABLE +This tells the system that it should not use the signal stack. + +@item SA_ONSTACK +This is set by the system, and indicates that the signal stack is +currently in use. If this bit is not set, then signals will be +delivered on the normal user stack. +@end vtable +@end table +@end deftp + +@comment signal.h +@comment BSD +@deftypefun int sigaltstack (const struct sigaltstack *@var{stack}, struct sigaltstack *@var{oldstack}) +The @code{sigaltstack} function specifies an alternate stack for use +during signal handling. When a signal is received by the process and +its action indicates that the signal stack is used, the system arranges +a switch to the currently installed signal stack while the handler for +that signal is executed. + +If @var{oldstack} is not a null pointer, information about the currently +installed signal stack is returned in the location it points to. If +@var{stack} is not a null pointer, then this is installed as the new +stack for use by signal handlers. + +The return value is @code{0} on success and @code{-1} on failure. If +@code{sigaltstack} fails, it sets @code{errno} to one of these values: + +@table @code +@item EINVAL +You tried to disable a stack that was in fact currently in use. + +@item ENOMEM +The size of the alternate stack was too small. +It must be greater than @code{MINSIGSTKSZ}. +@end table +@end deftypefun + +Here is the older @code{sigstack} interface.
You should use +@code{sigaltstack} instead on systems that have it. + +@comment signal.h +@comment BSD +@deftp {Data Type} {struct sigstack} +This structure describes a signal stack. It contains the following members: + +@table @code +@item void *ss_sp +This is the stack pointer. If the stack grows downwards on your +machine, this should point to the top of the area you allocated. If the +stack grows upwards, it should point to the bottom. + +@item int ss_onstack +This field is true if the process is currently using this stack. +@end table +@end deftp + +@comment signal.h +@comment BSD +@deftypefun int sigstack (const struct sigstack *@var{stack}, struct sigstack *@var{oldstack}) +The @code{sigstack} function specifies an alternate stack for use during +signal handling. When a signal is received by the process and its +action indicates that the signal stack is used, the system arranges a +switch to the currently installed signal stack while the handler for +that signal is executed. + +If @var{oldstack} is not a null pointer, information about the currently +installed signal stack is returned in the location it points to. If +@var{stack} is not a null pointer, then this is installed as the new +stack for use by signal handlers. + +The return value is @code{0} on success and @code{-1} on failure. +@end deftypefun + +@node BSD Signal Handling +@section BSD Signal Handling + +This section describes alternative signal handling functions derived +from BSD Unix. These facilities were an advance, in their time; today, +they are mostly obsolete, and supported mainly for compatibility with +BSD Unix. + +There are many similarities between the BSD and POSIX signal handling +facilities, because the POSIX facilities were inspired by the BSD +facilities. 
Besides having different names for all the functions to +avoid conflicts, the main differences between the two are: + +@itemize @bullet +@item +BSD Unix represents signal masks as an @code{int} bit mask, rather than +as a @code{sigset_t} object. + +@item +The BSD facilities use a different default for whether an interrupted +primitive should fail or resume. The POSIX facilities make system +calls fail unless you specify that they should resume. With the BSD +facility, the default is to make system calls resume unless you say they +should fail. @xref{Interrupted Primitives}. +@end itemize + +The BSD facilities are declared in @file{signal.h}. +@pindex signal.h + +@menu +* BSD Handler:: BSD Function to Establish a Handler. +* Blocking in BSD:: BSD Functions for Blocking Signals. +@end menu + +@node BSD Handler +@subsection BSD Function to Establish a Handler + +@comment signal.h +@comment BSD +@deftp {Data Type} {struct sigvec} +This data type is the BSD equivalent of @code{struct sigaction} +(@pxref{Advanced Signal Handling}); it is used to specify signal actions +to the @code{sigvec} function. It contains the following members: + +@table @code +@item sighandler_t sv_handler +This is the handler function. + +@item int sv_mask +This is the mask of additional signals to be blocked while the handler +function is being called. + +@item int sv_flags +This is a bit mask used to specify various flags which affect the +behavior of the signal. You can also refer to this field as +@code{sv_onstack}. +@end table +@end deftp + +These symbolic constants can be used to provide values for the +@code{sv_flags} field of a @code{sigvec} structure. This field is a bit +mask value, so you bitwise-OR the flags of interest to you together. + +@comment signal.h +@comment BSD +@deftypevr Macro int SV_ONSTACK +If this bit is set in the @code{sv_flags} field of a @code{sigvec} +structure, it means to use the signal stack when delivering the signal. 
+@end deftypevr + +@comment signal.h +@comment BSD +@deftypevr Macro int SV_INTERRUPT +If this bit is set in the @code{sv_flags} field of a @code{sigvec} +structure, it means that system calls interrupted by this kind of signal +should not be restarted if the handler returns; instead, the system +calls should return with an @code{EINTR} error status. @xref{Interrupted +Primitives}. +@end deftypevr + +@comment signal.h +@comment Sun +@deftypevr Macro int SV_RESETHAND +If this bit is set in the @code{sv_flags} field of a @code{sigvec} +structure, it means to reset the action for the signal back to +@code{SIG_DFL} when the signal is received. +@end deftypevr + +@comment signal.h +@comment BSD +@deftypefun int sigvec (int @var{signum}, const struct sigvec *@var{action}, struct sigvec *@var{old-action}) +This function is the equivalent of @code{sigaction} (@pxref{Advanced Signal +Handling}); it installs the action @var{action} for the signal @var{signum}, +returning information about the previous action in effect for that signal +in @var{old-action}. +@end deftypefun + +@comment signal.h +@comment BSD +@deftypefun int siginterrupt (int @var{signum}, int @var{failflag}) +This function specifies which approach to use when certain primitives +are interrupted by handling signal @var{signum}. If @var{failflag} is +false, signal @var{signum} restarts primitives. If @var{failflag} is +true, handling @var{signum} causes these primitives to fail with error +code @code{EINTR}. @xref{Interrupted Primitives}. +@end deftypefun + +@node Blocking in BSD +@subsection BSD Functions for Blocking Signals + +@comment signal.h +@comment BSD +@deftypefn Macro int sigmask (int @var{signum}) +This macro returns a signal mask that has the bit for signal @var{signum} +set. You can bitwise-OR the results of several calls to @code{sigmask} +together to specify more than one signal.
For example, + +@smallexample +(sigmask (SIGTSTP) | sigmask (SIGSTOP) + | sigmask (SIGTTIN) | sigmask (SIGTTOU)) +@end smallexample + +@noindent +specifies a mask that includes all the job-control stop signals. +@end deftypefn + +@comment signal.h +@comment BSD +@deftypefun int sigblock (int @var{mask}) +This function is equivalent to @code{sigprocmask} (@pxref{Process Signal +Mask}) with a @var{how} argument of @code{SIG_BLOCK}: it adds the +signals specified by @var{mask} to the calling process's set of blocked +signals. The return value is the previous set of blocked signals. +@end deftypefun + +@comment signal.h +@comment BSD +@deftypefun int sigsetmask (int @var{mask}) +This function is equivalent to @code{sigprocmask} (@pxref{Process +Signal Mask}) with a @var{how} argument of @code{SIG_SETMASK}: it sets +the calling process's signal mask to @var{mask}. The return value is +the previous set of blocked signals. +@end deftypefun + +@comment signal.h +@comment BSD +@deftypefun int sigpause (int @var{mask}) +This function is the equivalent of @code{sigsuspend} (@pxref{Waiting +for a Signal}): it sets the calling process's signal mask to @var{mask}, +and waits for a signal to arrive. On return the previous set of blocked +signals is restored. +@end deftypefun diff --git a/manual/socket.texi b/manual/socket.texi new file mode 100644 index 0000000000..0b338fca82 --- /dev/null +++ b/manual/socket.texi @@ -0,0 +1,2748 @@ +@node Sockets, Low-Level Terminal Interface, Pipes and FIFOs, Top +@chapter Sockets + +This chapter describes the GNU facilities for interprocess +communication using sockets. + +@cindex socket +@cindex interprocess communication, with sockets +A @dfn{socket} is a generalized interprocess communication channel. +Like a pipe, a socket is represented as a file descriptor. But, +unlike pipes, sockets support communication between unrelated +processes, and even between processes running on different machines +that communicate over a network.
Sockets are the primary means of +communicating with other machines; @code{telnet}, @code{rlogin}, +@code{ftp}, @code{talk}, and the other familiar network programs use +sockets. + +Not all operating systems support sockets. In the GNU library, the +header file @file{sys/socket.h} exists regardless of the operating +system, and the socket functions always exist, but if the system does +not really support sockets, these functions always fail. + +@strong{Incomplete:} We do not currently document the facilities for +broadcast messages or for configuring Internet interfaces. + +@menu +* Socket Concepts:: Basic concepts you need to know about. +* Communication Styles::Stream communication, datagrams, and other styles. +* Socket Addresses:: How socket names (``addresses'') work. +* File Namespace:: Details about the file namespace. +* Internet Namespace:: Details about the Internet namespace. +* Misc Namespaces:: Other namespaces not documented fully here. +* Open/Close Sockets:: Creating sockets and destroying them. +* Connections:: Operations on sockets with connection state. +* Datagrams:: Operations on datagram sockets. +* Inetd:: Inetd is a daemon that starts servers on request. + The most convenient way to write a server + is to make it work with Inetd. +* Socket Options:: Miscellaneous low-level socket options. +* Networks Database:: Accessing the database of network names. +@end menu + +@node Socket Concepts +@section Socket Concepts + +@cindex communication style (of a socket) +@cindex style of communication (of a socket) +When you create a socket, you must specify the style of communication +you want to use and the type of protocol that should implement it. +The @dfn{communication style} of a socket defines the user-level +semantics of sending and receiving data on the socket. 
Choosing a +communication style specifies the answers to questions such as these: + +@itemize @bullet +@item +@cindex packet +@cindex byte stream +@cindex stream (sockets) +@strong{What are the units of data transmission?} Some communication +styles regard the data as a sequence of bytes, with no larger +structure; others group the bytes into records (which are known in +this context as @dfn{packets}). + +@item +@cindex loss of data on sockets +@cindex data loss on sockets +@strong{Can data be lost during normal operation?} Some communication +styles guarantee that all the data sent arrives in the order it was +sent (barring system or network crashes); other styles occasionally +lose data as a normal part of operation, and may sometimes deliver +packets more than once or in the wrong order. + +Designing a program to use unreliable communication styles usually +involves taking precautions to detect lost or misordered packets and +to retransmit data as needed. + +@item +@strong{Is communication entirely with one partner?} Some +communication styles are like a telephone call---you make a +@dfn{connection} with one remote socket, and then exchange data +freely. Other styles are like mailing letters---you specify a +destination address for each message you send. +@end itemize + +@cindex namespace (of socket) +@cindex domain (of socket) +@cindex socket namespace +@cindex socket domain +You must also choose a @dfn{namespace} for naming the socket. A socket +name (``address'') is meaningful only in the context of a particular +namespace. In fact, even the data type to use for a socket name may +depend on the namespace. Namespaces are also called ``domains'', but we +avoid that word as it can be confused with other usage of the same +term. Each namespace has a symbolic name that starts with @samp{PF_}. +A corresponding symbolic name starting with @samp{AF_} designates the +address format for that namespace. 
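The pairing of @samp{PF_} namespace names with @samp{AF_} address format names can be sketched in C. This is only an illustration and not part of the library: the helper @code{address_format_for} is a hypothetical name, and @code{PF_UNIX} is used here as the widely available spelling of the file namespace.

```c
#include <sys/socket.h>

/* Hypothetical helper (not a library function): return the AF_ address
   format code that goes with the PF_ namespace NAMESPACE, or AF_UNSPEC
   if this sketch does not know the namespace.  */
int
address_format_for (int namespace)
{
  switch (namespace)
    {
    case PF_INET:               /* Internet namespace */
      return AF_INET;
    case PF_UNIX:               /* file namespace */
      return AF_UNIX;
    default:
      return AF_UNSPEC;
    }
}
```

A program would pass the @samp{PF_} value when creating a socket and store the matching @samp{AF_} value in the family member of each address it builds for that socket.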
+ +@cindex network protocol +@cindex protocol (of socket) +@cindex socket protocol +@cindex protocol family +Finally you must choose the @dfn{protocol} to carry out the +communication. The protocol determines what low-level mechanism is used +to transmit and receive data. Each protocol is valid for a particular +namespace and communication style; a namespace is sometimes called a +@dfn{protocol family} because of this, which is why the namespace names +start with @samp{PF_}. + +The rules of a protocol apply to the data passing between two programs, +perhaps on different computers; most of these rules are handled by the +operating system, and you need not know about them. What you do need to +know about protocols is this: + +@itemize @bullet +@item +In order to have communication between two sockets, they must specify +the @emph{same} protocol. + +@item +Each protocol is meaningful with particular style/namespace +combinations and cannot be used with inappropriate combinations. For +example, the TCP protocol fits only the byte stream style of +communication and the Internet namespace. + +@item +For each combination of style and namespace, there is a @dfn{default +protocol} which you can request by specifying 0 as the protocol +number. And that's what you should normally do---use the default. +@end itemize + +@node Communication Styles +@section Communication Styles + +The GNU library includes support for several different kinds of sockets, +each with different characteristics. This section describes the +supported socket types. The symbolic constants listed here are +defined in @file{sys/socket.h}. +@pindex sys/socket.h + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int SOCK_STREAM +The @code{SOCK_STREAM} style is like a pipe (@pxref{Pipes and FIFOs}); +it operates over a connection with a particular remote socket, and +transmits data reliably as a stream of bytes. + +Use of this style is covered in detail in @ref{Connections}. 
+@end deftypevr + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int SOCK_DGRAM +The @code{SOCK_DGRAM} style is used for sending +individually-addressed packets, unreliably. +It is the diametrical opposite of @code{SOCK_STREAM}. + +Each time you write data to a socket of this kind, that data becomes +one packet. Since @code{SOCK_DGRAM} sockets do not have connections, +you must specify the recipient address with each packet. + +The only guarantee that the system makes about your requests to +transmit data is that it will try its best to deliver each packet you +send. It may succeed with the sixth packet after failing with the +fourth and fifth packets; the seventh packet may arrive before the +sixth, and may arrive a second time after the sixth. + +The typical use for @code{SOCK_DGRAM} is in situations where it is +acceptable to simply resend a packet if no response is seen in a +reasonable amount of time. + +@xref{Datagrams}, for detailed information about how to use datagram +sockets. +@end deftypevr + +@ignore +@c This appears to be only for the NS domain, which we aren't +@c discussing and probably won't support either. +@comment sys/socket.h +@comment BSD +@deftypevr Macro int SOCK_SEQPACKET +This style is like @code{SOCK_STREAM} except that the data is +structured into packets. + +A program that receives data over a @code{SOCK_SEQPACKET} socket +should be prepared to read the entire message packet in a single call +to @code{read}; if it only reads part of the message, the remainder of +the message is simply discarded instead of being available for +subsequent calls to @code{read}. + +Many protocols do not support this communication style. +@end deftypevr +@end ignore + +@ignore +@comment sys/socket.h +@comment BSD +@deftypevr Macro int SOCK_RDM +This style is a reliable version of @code{SOCK_DGRAM}: it sends +individually addressed packets, but guarantees that each packet sent +arrives exactly once. 
+ +@strong{Warning:} It is not clear this is actually supported +by any operating system. +@end deftypevr +@end ignore + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int SOCK_RAW +This style provides access to low-level network protocols and +interfaces. Ordinary user programs usually have no need to use this +style. +@end deftypevr + +@node Socket Addresses +@section Socket Addresses + +@cindex address of socket +@cindex name of socket +@cindex binding a socket address +@cindex socket address (name) binding +The name of a socket is normally called an @dfn{address}. The +functions and symbols for dealing with socket addresses were named +inconsistently, sometimes using the term ``name'' and sometimes using +``address''. You can regard these terms as synonymous where sockets +are concerned. + +A socket newly created with the @code{socket} function has no +address. Other processes can find it for communication only if you +give it an address. We call this @dfn{binding} the address to the +socket, and the way to do it is with the @code{bind} function. + +You need be concerned with the address of a socket if other processes +are to find it and start communicating with it. You can specify an +address for other sockets, but this is usually pointless; the first time +you send data from a socket, or use it to initiate a connection, the +system assigns an address automatically if you have not specified one. + +Occasionally a client needs to specify an address because the server +discriminates based on addresses; for example, the rsh and rlogin +protocols look at the client's socket address and don't bypass password +checking unless it is less than @code{IPPORT_RESERVED} (@pxref{Ports}). + +The details of socket addresses vary depending on what namespace you are +using. @xref{File Namespace}, or @ref{Internet Namespace}, for specific +information. 
+ +Regardless of the namespace, you use the same functions @code{bind} and +@code{getsockname} to set and examine a socket's address. These +functions use a phony data type, @code{struct sockaddr *}, to accept the +address. In practice, the address lives in a structure of some other +data type appropriate to the address format you are using, but you cast +its address to @code{struct sockaddr *} when you pass it to +@code{bind}. + +@menu +* Address Formats:: About @code{struct sockaddr}. +* Setting Address:: Binding an address to a socket. +* Reading Address:: Reading the address of a socket. +@end menu + +@node Address Formats +@subsection Address Formats + +The functions @code{bind} and @code{getsockname} use the generic data +type @code{struct sockaddr *} to represent a pointer to a socket +address. You can't use this data type effectively to interpret an +address or construct one; for that, you must use the proper data type +for the socket's namespace. + +Thus, the usual practice is to construct an address in the proper +namespace-specific type, then cast a pointer to @code{struct sockaddr *} +when you call @code{bind} or @code{getsockname}. + +The one piece of information that you can get from the @code{struct +sockaddr} data type is the @dfn{address format} designator which tells +you which data type to use to understand the address fully. + +@pindex sys/socket.h +The symbols in this section are defined in the header file +@file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftp {Data Type} {struct sockaddr} +The @code{struct sockaddr} type itself has the following members: + +@table @code +@item short int sa_family +This is the code for the address format of this address. It +identifies the format of the data which follows. + +@item char sa_data[14] +This is the actual socket address data, which is format-dependent. Its +length also depends on the format, and may well be more than 14. The +length 14 of @code{sa_data} is essentially arbitrary.
+@end table +@end deftp + +Each address format has a symbolic name which starts with @samp{AF_}. +Each of them corresponds to a @samp{PF_} symbol which designates the +corresponding namespace. Here is a list of address format names: + +@table @code +@comment sys/socket.h +@comment GNU +@item AF_FILE +@vindex AF_FILE +This designates the address format that goes with the file namespace. +(@code{PF_FILE} is the name of that namespace.) @xref{File Namespace +Details}, for information about this address format. + +@comment sys/socket.h +@comment BSD +@item AF_UNIX +@vindex AF_UNIX +This is a synonym for @code{AF_FILE}, for compatibility. +(@code{PF_UNIX} is likewise a synonym for @code{PF_FILE}.) + +@comment sys/socket.h +@comment BSD +@item AF_INET +@vindex AF_INET +This designates the address format that goes with the Internet +namespace. (@code{PF_INET} is the name of that namespace.) +@xref{Internet Address Format}. + +@comment sys/socket.h +@comment BSD +@item AF_UNSPEC +@vindex AF_UNSPEC +This designates no particular address format. It is used only in rare +cases, such as to clear out the default destination address of a +``connected'' datagram socket. @xref{Sending Datagrams}. + +The corresponding namespace designator symbol @code{PF_UNSPEC} exists +for completeness, but there is no reason to use it in a program. +@end table + +@file{sys/socket.h} defines symbols starting with @samp{AF_} for many +different kinds of networks, all or most of which are not actually +implemented. We will document those that really work, as we receive +information about how to use them. + +@node Setting Address +@subsection Setting the Address of a Socket + +@pindex sys/socket.h +Use the @code{bind} function to assign an address to a socket. The +prototype for @code{bind} is in the header file @file{sys/socket.h}. +For examples of use, see @ref{File Namespace}, or see @ref{Inet Example}. 
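As a minimal sketch of binding, the following C fragment creates an Internet stream socket and binds it to the loopback address. Several things here are assumptions made for illustration: the helper name @code{make_loopback_socket} is hypothetical, port number 0 is used to ask the system to choose a free port, and the length is declared as @code{socklen_t}, which is the type current systems use for this argument. The call to @code{getsockname} only reads back the port that was chosen.

```c
#define _DEFAULT_SOURCE 1       /* ask for the BSD socket declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a stream socket bound to the loopback
   address on a system-chosen port.  Store the port (host byte order)
   in *PORT_OUT and return the descriptor, or return -1 on failure.  */
int
make_loopback_socket (unsigned short int *port_out)
{
  struct sockaddr_in name;
  socklen_t length = sizeof name;
  int sock = socket (PF_INET, SOCK_STREAM, 0);

  if (sock < 0)
    return -1;

  /* Build the address in the namespace-specific type.  */
  memset (&name, 0, sizeof name);
  name.sin_family = AF_INET;
  name.sin_addr.s_addr = htonl (INADDR_LOOPBACK);
  name.sin_port = htons (0);    /* assumption: 0 means "pick a port" */

  /* Cast to the generic pointer type when calling bind.  */
  if (bind (sock, (struct sockaddr *) &name, sizeof name) < 0
      || getsockname (sock, (struct sockaddr *) &name, &length) < 0)
    {
      close (sock);
      return -1;
    }

  *port_out = ntohs (name.sin_port);
  return sock;
}
```

The caller owns the returned descriptor and should close it when it is no longer needed.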
+ +@comment sys/socket.h +@comment BSD +@deftypefun int bind (int @var{socket}, struct sockaddr *@var{addr}, size_t @var{length}) +The @code{bind} function assigns an address to the socket +@var{socket}. The @var{addr} and @var{length} arguments specify the +address; the detailed format of the address depends on the namespace. +The first part of the address is always the format designator, which +specifies a namespace, and says that the address is in the format for +that namespace. + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item EADDRNOTAVAIL +The specified address is not available on this machine. + +@item EADDRINUSE +Some other socket is already using the specified address. + +@item EINVAL +The socket @var{socket} already has an address. + +@item EACCES +You do not have permission to access the requested address. (In the +Internet domain, only the super-user is allowed to specify a port number +in the range 0 through @code{IPPORT_RESERVED} minus one; see +@ref{Ports}.) +@end table + +Additional conditions may be possible depending on the particular namespace +of the socket. +@end deftypefun + +@node Reading Address +@subsection Reading the Address of a Socket + +@pindex sys/socket.h +Use the function @code{getsockname} to examine the address of an +Internet socket. The prototype for this function is in the header file +@file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypefun int getsockname (int @var{socket}, struct sockaddr *@var{addr}, size_t *@var{length-ptr}) +The @code{getsockname} function returns information about the +address of the socket @var{socket} in the locations specified by the +@var{addr} and @var{length-ptr} arguments. 
Note that the +@var{length-ptr} is a pointer; you should initialize it to be the +allocation size of @var{addr}, and on return it contains the actual +size of the address data. + +The format of the address data depends on the socket namespace. The +length of the information is usually fixed for a given namespace, so +normally you can know exactly how much space is needed and can provide +that much. The usual practice is to allocate a place for the value +using the proper data type for the socket's namespace, then cast its +address to @code{struct sockaddr *} to pass it to @code{getsockname}. + +The return value is @code{0} on success and @code{-1} on error. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item ENOBUFS +There are not enough internal buffers available for the operation. +@end table +@end deftypefun + +You can't read the address of a socket in the file namespace. This is +consistent with the rest of the system; in general, there's no way to +find a file's name from a descriptor for that file. + +@node File Namespace +@section The File Namespace +@cindex file namespace, for sockets + +This section describes the details of the file namespace, whose +symbolic name (required when you create a socket) is @code{PF_FILE}. + +@menu +* Concepts: File Namespace Concepts. What you need to understand. +* Details: File Namespace Details. Address format, symbolic names, etc. +* Example: File Socket Example. Example of creating a socket. +@end menu + +@node File Namespace Concepts +@subsection File Namespace Concepts + +In the file namespace, socket addresses are file names. You can specify +any file name you want as the address of the socket, but you must have +write permission on the directory containing it. In order to connect to +a socket, you must have read permission for it. 
It's common to put +these files in the @file{/tmp} directory. + +One peculiarity of the file namespace is that the name is only used when +opening the connection; once that is over with, the address is not +meaningful and may not exist. + +Another peculiarity is that you cannot connect to such a socket from +another machine--not even if the other machine shares the file system +which contains the name of the socket. You can see the socket in a +directory listing, but connecting to it never succeeds. Some programs +take advantage of this, such as by asking the client to send its own +process ID, and using the process IDs to distinguish between clients. +However, we recommend you not use this method in protocols you design, +as we might someday permit connections from other machines that mount +the same file systems. Instead, send each new client an identifying +number if you want it to have one. + +After you close a socket in the file namespace, you should delete the +file name from the file system. Use @code{unlink} or @code{remove} to +do this; see @ref{Deleting Files}. + +The file namespace supports just one protocol for any communication +style; it is protocol number @code{0}. + +@node File Namespace Details +@subsection Details of File Namespace + +@pindex sys/socket.h +To create a socket in the file namespace, use the constant +@code{PF_FILE} as the @var{namespace} argument to @code{socket} or +@code{socketpair}. This constant is defined in @file{sys/socket.h}. + +@comment sys/socket.h +@comment GNU +@deftypevr Macro int PF_FILE +This designates the file namespace, in which socket addresses are file +names, and its associated family of protocols. +@end deftypevr + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int PF_UNIX +This is a synonym for @code{PF_FILE}, for compatibility's sake. 
+@end deftypevr + +The structure for specifying socket names in the file namespace is +defined in the header file @file{sys/un.h}: +@pindex sys/un.h + +@comment sys/un.h +@comment BSD +@deftp {Data Type} {struct sockaddr_un} +This structure is used to specify file namespace socket addresses. It has +the following members: + +@table @code +@item short int sun_family +This identifies the address family or format of the socket address. +You should store the value @code{AF_FILE} to designate the file +namespace. @xref{Socket Addresses}. + +@item char sun_path[108] +This is the file name to use. + +@strong{Incomplete:} Why is 108 a magic number? RMS suggests making +this a zero-length array and tweaking the example following to use +@code{alloca} to allocate an appropriate amount of storage based on +the length of the filename. +@end table +@end deftp + +You should compute the @var{length} parameter for a socket address in +the file namespace as the sum of the size of the @code{sun_family} +component and the string length (@emph{not} the allocation size!) of +the file name string. + +@node File Socket Example +@subsection Example of File-Namespace Sockets + +Here is an example showing how to create and name a socket in the file +namespace. + +@smallexample +@include mkfsock.c.texi +@end smallexample + +@node Internet Namespace +@section The Internet Namespace +@cindex Internet namespace, for sockets + +This section describes the details of the protocols and socket naming +conventions used in the Internet namespace. + +To create a socket in the Internet namespace, use the symbolic name +@code{PF_INET} of this namespace as the @var{namespace} argument to +@code{socket} or @code{socketpair}. This macro is defined in +@file{sys/socket.h}. +@pindex sys/socket.h + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int PF_INET +This designates the Internet namespace and associated family of +protocols.
+@end deftypevr + +A socket address for the Internet namespace includes the following components: + +@itemize @bullet +@item +The address of the machine you want to connect to. Internet addresses +can be specified in several ways; these are discussed in @ref{Internet +Address Format}, @ref{Host Addresses}, and @ref{Host Names}. + +@item +A port number for that machine. @xref{Ports}. +@end itemize + +You must ensure that the address and port number are represented in a +canonical format called @dfn{network byte order}. @xref{Byte Order}, +for information about this. + +@menu +* Internet Address Format:: How socket addresses are specified in the + Internet namespace. +* Host Addresses:: All about the addresses of Internet hosts. +* Protocols Database:: Referring to protocols by name. +* Ports:: Internet port numbers. +* Services Database:: Ports may have symbolic names. +* Byte Order:: Different hosts may use different byte + ordering conventions; you need to + canonicalize host address and port number. +* Inet Example:: Putting it all together. +@end menu + +@node Internet Address Format +@subsection Internet Socket Address Format + +In the Internet namespace, a socket address consists of a host address +and a port on that host. In addition, the protocol you choose serves +effectively as a part of the address because local port numbers are +meaningful only within a particular protocol. + +The data type for representing socket addresses in the Internet namespace +is defined in the header file @file{netinet/in.h}. +@pindex netinet/in.h + +@comment netinet/in.h +@comment BSD +@deftp {Data Type} {struct sockaddr_in} +This is the data type used to represent socket addresses in the +Internet namespace. It has the following members: + +@table @code +@item short int sin_family +This identifies the address family or format of the socket address. +You should store the value of @code{AF_INET} in this member. +@xref{Socket Addresses}.
+ +@item struct in_addr sin_addr +This is the Internet address of the host machine. @xref{Host +Addresses}, and @ref{Host Names}, for how to get a value to store +here. + +@item unsigned short int sin_port +This is the port number. @xref{Ports}. +@end table +@end deftp + +When you call @code{bind} or @code{getsockname}, you should specify +@code{sizeof (struct sockaddr_in)} as the @var{length} parameter if +you are using an Internet namespace socket address. + +@node Host Addresses +@subsection Host Addresses + +Each computer on the Internet has one or more @dfn{Internet addresses}, +numbers which identify that computer among all those on the Internet. +Users typically write numeric host addresses as sequences of four +numbers, separated by periods, as in @samp{128.52.46.32}. + +Each computer also has one or more @dfn{host names}, which are strings +of words separated by periods, as in @samp{churchy.gnu.ai.mit.edu}. + +Programs that let the user specify a host typically accept both numeric +addresses and host names. But the program needs a numeric address to +open a connection; to use a host name, you must convert it to the +numeric address it stands for. + +@menu +* Abstract Host Addresses:: What a host number consists of. +* Data type: Host Address Data Type. Data type for a host number. +* Functions: Host Address Functions. Functions to operate on them. +* Names: Host Names. Translating host names to host numbers. +@end menu + +@node Abstract Host Addresses +@subsubsection Internet Host Addresses +@cindex host address, Internet +@cindex Internet host address + +@ifinfo +Each computer on the Internet has one or more Internet addresses, +numbers which identify that computer among all those on the Internet. +@end ifinfo + +@cindex network number +@cindex local network address number +An Internet host address is a number containing four bytes of data. 
These are divided into two parts, a @dfn{network number} and a +@dfn{local network address number} within that network. The network +number consists of the first one, two or three bytes; the rest of the +bytes are the local address. + +Network numbers are registered with the Network Information Center +(NIC), and are divided into three classes---A, B, and C. The local +network address numbers of individual machines are registered with the +administrator of the particular network. + +Class A networks have single-byte numbers in the range 0 to 127. There +are only a small number of Class A networks, but they can each support a +very large number of hosts. Medium-sized Class B networks have two-byte +network numbers, with the first byte in the range 128 to 191. Class C +networks are the smallest; they have three-byte network numbers, with +the first byte in the range 192 to 255. Thus, the first 1, 2, or 3 bytes +of an Internet address specify a network. The remaining bytes of the +Internet address specify the address within that network. + +The Class A network 0 is reserved for broadcast to all networks. In +addition, the host number 0 within each network is reserved for broadcast +to all hosts in that network. + +The Class A network 127 is reserved for loopback; you can always use +the Internet address @samp{127.0.0.1} to refer to the host machine. + +Since a single machine can be a member of multiple networks, it can +have multiple Internet host addresses. However, there is never +supposed to be more than one machine with the same host address. + +@c !!! this section could document the IN_CLASS* macros in <netinet/in.h>. + +@cindex standard dot notation, for Internet addresses +@cindex dot notation, for Internet addresses +There are four forms of the @dfn{standard numbers-and-dots notation} +for Internet addresses: + +@table @code +@item @var{a}.@var{b}.@var{c}.@var{d} +This specifies all four bytes of the address individually.
+ +@item @var{a}.@var{b}.@var{c} +The last part of the address, @var{c}, is interpreted as a 2-byte quantity. +This is useful for specifying host addresses in a Class B network with +network address number @code{@var{a}.@var{b}}. + +@item @var{a}.@var{b} +The last part of the address, @var{b}, is interpreted as a 3-byte quantity. +This is useful for specifying host addresses in a Class A network with +network address number @var{a}. + +@item @var{a} +If only one part is given, this corresponds directly to the host address +number. +@end table + +Within each part of the address, the usual C conventions for specifying +the radix apply. In other words, a leading @samp{0x} or @samp{0X} implies +hexadecimal radix; a leading @samp{0} implies octal; and otherwise decimal +radix is assumed. + +@node Host Address Data Type +@subsubsection Host Address Data Type + +Internet host addresses are represented in some contexts as integers +(type @code{unsigned long int}). In other contexts, the integer is +packaged inside a structure of type @code{struct in_addr}. It would +be better if the usage were made consistent, but it is not hard to extract +the integer from the structure or put the integer into a structure. + +The following basic definitions for Internet addresses appear in the +header file @file{netinet/in.h}: +@pindex netinet/in.h + +@comment netinet/in.h +@comment BSD +@deftp {Data Type} {struct in_addr} +This data type is used in certain contexts to contain an Internet host +address. It has just one field, named @code{s_addr}, which records the +host address number as an @code{unsigned long int}. +@end deftp + +@comment netinet/in.h +@comment BSD +@deftypevr Macro {unsigned long int} INADDR_LOOPBACK +You can use this constant to stand for ``the address of this machine,'' +instead of finding its actual address. It is the Internet address +@samp{127.0.0.1}, which is usually called @samp{localhost}.
This +special constant saves you the trouble of looking up the address of your +own machine. Also, the system usually implements @code{INADDR_LOOPBACK} +specially, avoiding any network traffic for the case of one machine +talking to itself. +@end deftypevr + +@comment netinet/in.h +@comment BSD +@deftypevr Macro {unsigned long int} INADDR_ANY +You can use this constant to stand for ``any incoming address,'' when +binding to an address. @xref{Setting Address}. This is the usual +address to give in the @code{sin_addr} member of @w{@code{struct +sockaddr_in}} when you want to accept Internet connections. +@end deftypevr + +@comment netinet/in.h +@comment BSD +@deftypevr Macro {unsigned long int} INADDR_BROADCAST +This constant is the address you use to send a broadcast message. +@c !!! broadcast needs further documented +@end deftypevr + +@comment netinet/in.h +@comment BSD +@deftypevr Macro {unsigned long int} INADDR_NONE +This constant is returned by some functions to indicate an error. +@end deftypevr + +@node Host Address Functions +@subsubsection Host Address Functions + +@pindex arpa/inet.h +These additional functions for manipulating Internet addresses are +declared in @file{arpa/inet.h}. They represent Internet addresses in +network byte order; they represent network numbers and +local-address-within-network numbers in host byte order. +@xref{Byte Order}, for an explanation of network and host byte order. + +@comment arpa/inet.h +@comment BSD +@deftypefun {int} inet_aton (const char *@var{name}, struct in_addr *@var{addr}) +This function converts the Internet host address @var{name} +from the standard numbers-and-dots notation into binary data and stores +it in the @code{struct in_addr} that @var{addr} points to. +@code{inet_aton} returns nonzero if the address is valid, zero if not. 
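A short sketch of checking the return value of @code{inet_aton} follows. The wrapper name @code{parse_host_address} is hypothetical and exists only for this illustration; the feature-test macro at the top is an assumption about what a strict compilation mode may need in order to see the BSD declarations.

```c
#define _DEFAULT_SOURCE 1       /* ask for the BSD declarations */
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical wrapper: convert TEXT, written in numbers-and-dots
   notation, and store the result (network byte order) in *ADDR.
   Return 1 if TEXT was a valid address, 0 if not.  */
int
parse_host_address (const char *text, struct in_addr *addr)
{
  /* inet_aton returns nonzero for a valid address, zero otherwise.  */
  return inet_aton (text, addr) != 0;
}
```

Because validity is reported separately from the converted value, every four-byte address, including @samp{255.255.255.255}, can be returned without ambiguity.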
+@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun {unsigned long int} inet_addr (const char *@var{name}) +This function converts the Internet host address @var{name} from the +standard numbers-and-dots notation into binary data. If the input is +not valid, @code{inet_addr} returns @code{INADDR_NONE}. This is an +obsolete interface to @code{inet_aton}, described immediately above; it +is obsolete because @code{INADDR_NONE} is a valid address +(255.255.255.255), and @code{inet_aton} provides a cleaner way to +indicate error return. +@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun {unsigned long int} inet_network (const char *@var{name}) +This function extracts the network number from the address @var{name}, +given in the standard numbers-and-dots notation. +If the input is not valid, @code{inet_network} returns @code{-1}. +@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun {char *} inet_ntoa (struct in_addr @var{addr}) +This function converts the Internet host address @var{addr} to a +string in the standard numbers-and-dots notation. The return value is +a pointer into a statically-allocated buffer. Subsequent calls will +overwrite the same buffer, so you should copy the string if you need +to save it. +@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun {struct in_addr} inet_makeaddr (int @var{net}, int @var{local}) +This function makes an Internet host address by combining the network +number @var{net} with the local-address-within-network number +@var{local}. +@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun int inet_lnaof (struct in_addr @var{addr}) +This function returns the local-address-within-network part of the +Internet host address @var{addr}. +@end deftypefun + +@comment arpa/inet.h +@comment BSD +@deftypefun int inet_netof (struct in_addr @var{addr}) +This function returns the network number part of the Internet host +address @var{addr}. 
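The three functions @code{inet_netof}, @code{inet_lnaof} and @code{inet_makeaddr} fit together as sketched below. The helper name @code{split_and_rejoin} is hypothetical; the sketch assumes the class-based division of addresses described earlier, so for the loopback address @samp{127.0.0.1} the network number is 127 and the local part is 1.

```c
#define _DEFAULT_SOURCE 1       /* ask for the BSD declarations */
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper: split ADDR into its network number and its
   local-address-within-network number, then combine the two parts
   again into *REJOINED.  Return nonzero if the result equals ADDR.  */
int
split_and_rejoin (struct in_addr addr, struct in_addr *rejoined)
{
  /* Both parts are returned in host byte order.  */
  unsigned long int net = inet_netof (addr);
  unsigned long int local = inet_lnaof (addr);

  /* The combined address is again in network byte order.  */
  *rejoined = inet_makeaddr (net, local);
  return rejoined->s_addr == addr.s_addr;
}
```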
+@end deftypefun + +@node Host Names +@subsubsection Host Names +@cindex hosts database +@cindex converting host name to address +@cindex converting host address to name + +Besides the standard numbers-and-dots notation for Internet addresses, +you can also refer to a host by a symbolic name. The advantage of a +symbolic name is that it is usually easier to remember. For example, +the machine with Internet address @samp{128.52.46.32} is also known as +@samp{churchy.gnu.ai.mit.edu}; and other machines in the @samp{gnu.ai.mit.edu} +domain can refer to it simply as @samp{churchy}. + +@pindex /etc/hosts +@pindex netdb.h +Internally, the system uses a database to keep track of the mapping +between host names and host numbers. This database is usually either +the file @file{/etc/hosts} or an equivalent provided by a name server. +The functions and other symbols for accessing this database are declared +in @file{netdb.h}. They are BSD features, defined unconditionally if +you include @file{netdb.h}. + +@comment netdb.h +@comment BSD +@deftp {Data Type} {struct hostent} +This data type is used to represent an entry in the hosts database. It +has the following members: + +@table @code +@item char *h_name +This is the ``official'' name of the host. + +@item char **h_aliases +These are alternative names for the host, represented as a null-terminated +vector of strings. + +@item int h_addrtype +This is the host address type; in practice, its value is always +@code{AF_INET}. In principle other kinds of addresses could be +represented in the data base as well as Internet addresses; if this were +done, you might find a value in this field other than @code{AF_INET}. +@xref{Socket Addresses}. + +@item int h_length +This is the length, in bytes, of each address. + +@item char **h_addr_list +This is the vector of addresses for the host. (Recall that the host +might be connected to multiple networks and have different addresses on +each one.) 
The vector is terminated by a null pointer. + +@item char *h_addr +This is a synonym for @code{h_addr_list[0]}; in other words, it is the +first host address. +@end table +@end deftp + +As far as the host database is concerned, each address is just a block +of memory @code{h_length} bytes long. But in other contexts there is an +implicit assumption that you can convert this to a @code{struct in_addr} or +an @code{unsigned long int}. Host addresses in a @code{struct hostent} +structure are always given in network byte order; see @ref{Byte Order}. + +You can use @code{gethostbyname} or @code{gethostbyaddr} to search the +hosts database for information about a particular host. The information +is returned in a statically-allocated structure; you must copy the +information if you need to save it across calls. + +@comment netdb.h +@comment BSD +@deftypefun {struct hostent *} gethostbyname (const char *@var{name}) +The @code{gethostbyname} function returns information about the host +named @var{name}. If the lookup fails, it returns a null pointer. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct hostent *} gethostbyaddr (const char *@var{addr}, int @var{length}, int @var{format}) +The @code{gethostbyaddr} function returns information about the host +with Internet address @var{addr}. The @var{length} argument is the +size (in bytes) of the address at @var{addr}. @var{format} specifies +the address format; for an Internet address, specify a value of +@code{AF_INET}. + +If the lookup fails, @code{gethostbyaddr} returns a null pointer. +@end deftypefun + +@vindex h_errno +If the name lookup by @code{gethostbyname} or @code{gethostbyaddr} +fails, you can find out the reason by looking at the value of the +variable @code{h_errno}. (It would be cleaner design for these +functions to set @code{errno}, but use of @code{h_errno} is compatible +with other systems.) 
Before using @code{h_errno}, you must declare it +like this: + +@smallexample +extern int h_errno; +@end smallexample + +Here are the error codes that you may find in @code{h_errno}: + +@table @code +@comment netdb.h +@comment BSD +@item HOST_NOT_FOUND +@vindex HOST_NOT_FOUND +No such host is known in the data base. + +@comment netdb.h +@comment BSD +@item TRY_AGAIN +@vindex TRY_AGAIN +This condition happens when the name server could not be contacted. If +you try again later, you may succeed then. + +@comment netdb.h +@comment BSD +@item NO_RECOVERY +@vindex NO_RECOVERY +A non-recoverable error occurred. + +@comment netdb.h +@comment BSD +@item NO_ADDRESS +@vindex NO_ADDRESS +The host database contains an entry for the name, but it doesn't have an +associated Internet address. +@end table + +You can also scan the entire hosts database one entry at a time using +@code{sethostent}, @code{gethostent}, and @code{endhostent}. Be careful +in using these functions, because they are not reentrant. + +@comment netdb.h +@comment BSD +@deftypefun void sethostent (int @var{stayopen}) +This function opens the hosts database to begin scanning it. You can +then call @code{gethostent} to read the entries. + +@c There was a rumor that this flag has different meaning if using the DNS, +@c but it appears this description is accurate in that case also. +If the @var{stayopen} argument is nonzero, this sets a flag so that +subsequent calls to @code{gethostbyname} or @code{gethostbyaddr} will +not close the database (as they usually would). This makes for more +efficiency if you call those functions several times, by avoiding +reopening the database for each call. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct hostent *} gethostent () +This function returns the next entry in the hosts database. It +returns a null pointer if there are no more entries. 
+@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun void endhostent () +This function closes the hosts database. +@end deftypefun + +@node Ports +@subsection Internet Ports +@cindex port number + +A socket address in the Internet namespace consists of a machine's +Internet address plus a @dfn{port number} which distinguishes the +sockets on a given machine (for a given protocol). Port numbers range +from 0 to 65,535. + +Port numbers less than @code{IPPORT_RESERVED} are reserved for standard +servers, such as @code{finger} and @code{telnet}. There is a database +that keeps track of these, and you can use the @code{getservbyname} +function to map a service name onto a port number; see @ref{Services +Database}. + +If you write a server that is not one of the standard ones defined in +the database, you must choose a port number for it. Use a number +greater than @code{IPPORT_USERRESERVED}; such numbers are reserved for +servers and won't ever be generated automatically by the system. +Avoiding conflicts with servers being run by other users is up to you. + +When you use a socket without specifying its address, the system +generates a port number for it. This number is between +@code{IPPORT_RESERVED} and @code{IPPORT_USERRESERVED}. + +On the Internet, it is actually legitimate to have two different +sockets with the same port number, as long as they never both try to +communicate with the same socket address (host address plus port +number). You shouldn't duplicate a port number except in special +circumstances where a higher-level protocol requires it. Normally, +the system won't let you do it; @code{bind} normally insists on +distinct port numbers. To reuse a port number, you must set the +socket option @code{SO_REUSEADDR}. @xref{Socket-Level Options}. + +@pindex netinet/in.h +These macros are defined in the header file @file{netinet/in.h}. 
+ +@comment netinet/in.h +@comment BSD +@deftypevr Macro int IPPORT_RESERVED +Port numbers less than @code{IPPORT_RESERVED} are reserved for +superuser use. +@end deftypevr + +@comment netinet/in.h +@comment BSD +@deftypevr Macro int IPPORT_USERRESERVED +Port numbers greater than or equal to @code{IPPORT_USERRESERVED} are +reserved for explicit use; they will never be allocated automatically. +@end deftypevr + +@node Services Database +@subsection The Services Database +@cindex services database +@cindex converting service name to port number +@cindex converting port number to service name + +@pindex /etc/services +The database that keeps track of ``well-known'' services is usually +either the file @file{/etc/services} or an equivalent from a name server. +You can use these utilities, declared in @file{netdb.h}, to access +the services database. +@pindex netdb.h + +@comment netdb.h +@comment BSD +@deftp {Data Type} {struct servent} +This data type holds information about entries from the services database. +It has the following members: + +@table @code +@item char *s_name +This is the ``official'' name of the service. + +@item char **s_aliases +These are alternate names for the service, represented as an array of +strings. A null pointer terminates the array. + +@item int s_port +This is the port number for the service. Port numbers are given in +network byte order; see @ref{Byte Order}. + +@item char *s_proto +This is the name of the protocol to use with this service. +@xref{Protocols Database}. +@end table +@end deftp + +To get information about a particular service, use the +@code{getservbyname} or @code{getservbyport} functions. The information +is returned in a statically-allocated structure; you must copy the +information if you need to save it across calls. 
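The lookup described above can be sketched as follows. This is a minimal illustration, not part of the library: @code{service_port} is a hypothetical helper name, and the sketch assumes the caller only needs the port number, so it copies that one value out of the statically-allocated structure before another lookup can overwrite it.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical helper (not a library function): return the port
   number, in host byte order, of the service NAME using protocol
   PROTO, or -1 if the services database has no such entry.  */
int
service_port (const char *name, const char *proto)
{
  struct servent *entry = getservbyname (name, proto);

  if (entry == NULL)
    return -1;

  /* s_port is stored in network byte order.  Convert it and return
     only the value, because ENTRY points into static storage that a
     later lookup may overwrite.  */
  return ntohs ((unsigned short int) entry->s_port);
}
```

A call such as @code{service_port ("telnet", "tcp")} yields whatever port the local services database lists for that service. To store the result in a @code{sockaddr_in} structure, convert it back with @code{htons}, or copy @code{s_port} directly instead.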
+ +@comment netdb.h +@comment BSD +@deftypefun {struct servent *} getservbyname (const char *@var{name}, const char *@var{proto}) +The @code{getservbyname} function returns information about the +service named @var{name} using protocol @var{proto}. If it can't find +such a service, it returns a null pointer. + +This function is useful for servers as well as for clients; servers +use it to determine which port they should listen on (@pxref{Listening}). +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct servent *} getservbyport (int @var{port}, const char *@var{proto}) +The @code{getservbyport} function returns information about the +service at port @var{port} using protocol @var{proto}. If it can't +find such a service, it returns a null pointer. +@end deftypefun + +You can also scan the services database using @code{setservent}, +@code{getservent}, and @code{endservent}. Be careful in using these +functions, because they are not reentrant. + +@comment netdb.h +@comment BSD +@deftypefun void setservent (int @var{stayopen}) +This function opens the services database to begin scanning it. + +If the @var{stayopen} argument is nonzero, this sets a flag so that +subsequent calls to @code{getservbyname} or @code{getservbyport} will +not close the database (as they usually would). This makes for more +efficiency if you call those functions several times, by avoiding +reopening the database for each call. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct servent *} getservent (void) +This function returns the next entry in the services database. If +there are no more entries, it returns a null pointer. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun void endservent (void) +This function closes the services database. 
+@end deftypefun + +@node Byte Order +@subsection Byte Order Conversion +@cindex byte order conversion, for socket +@cindex converting byte order + +@cindex big-endian +@cindex little-endian +Different kinds of computers use different conventions for the +ordering of bytes within a word. Some computers put the most +significant byte within a word first (this is called ``big-endian'' +order), and others put it last (``little-endian'' order). + +@cindex network byte order +So that machines with different byte order conventions can +communicate, the Internet protocols specify a canonical byte order +convention for data transmitted over the network. This is known +as the @dfn{network byte order}. + +When establishing an Internet socket connection, you must make sure that +the data in the @code{sin_port} and @code{sin_addr} members of the +@code{sockaddr_in} structure are represented in the network byte order. +If you are encoding integer data in the messages sent through the +socket, you should convert this to network byte order too. If you don't +do this, your program may fail when running on or talking to other kinds +of machines. + +If you use @code{getservbyname} and @code{gethostbyname} or +@code{inet_addr} to get the port number and host address, the values are +already in the network byte order, and you can copy them directly into +the @code{sockaddr_in} structure. + +Otherwise, you have to convert the values explicitly. Use +@code{htons} and @code{ntohs} to convert values for the @code{sin_port} +member. Use @code{htonl} and @code{ntohl} to convert values for the +@code{sin_addr} member. (Remember, @code{struct in_addr} is equivalent +to @code{unsigned long int}.) These functions are declared in +@file{netinet/in.h}. +@pindex netinet/in.h + +@comment netinet/in.h +@comment BSD +@deftypefun {unsigned short int} htons (unsigned short int @var{hostshort}) +This function converts the @code{short} integer @var{hostshort} from +host byte order to network byte order. 
+@end deftypefun + +@comment netinet/in.h +@comment BSD +@deftypefun {unsigned short int} ntohs (unsigned short int @var{netshort}) +This function converts the @code{short} integer @var{netshort} from +network byte order to host byte order. +@end deftypefun + +@comment netinet/in.h +@comment BSD +@deftypefun {unsigned long int} htonl (unsigned long int @var{hostlong}) +This function converts the @code{long} integer @var{hostlong} from +host byte order to network byte order. +@end deftypefun + +@comment netinet/in.h +@comment BSD +@deftypefun {unsigned long int} ntohl (unsigned long int @var{netlong}) +This function converts the @code{long} integer @var{netlong} from +network byte order to host byte order. +@end deftypefun + +@node Protocols Database +@subsection Protocols Database +@cindex protocols database + +The communications protocol used with a socket controls low-level +details of how data is exchanged. For example, the protocol implements +things like checksums to detect errors in transmissions, and routing +instructions for messages. Normal user programs have little reason to +mess with these details directly. + +@cindex TCP (Internet protocol) +The default communications protocol for the Internet namespace depends on +the communication style. For stream communication, the default is TCP +(``transmission control protocol''). For datagram communication, the +default is UDP (``user datagram protocol''). For reliable datagram +communication, the default is RDP (``reliable datagram protocol''). +You should nearly always use the default. + +@pindex /etc/protocols +Internet protocols are generally specified by a name instead of a +number. The network protocols that a host knows about are stored in a +database. This is usually either derived from the file +@file{/etc/protocols}, or it may be an equivalent provided by a name +server. You look up the protocol number associated with a named +protocol in the database using the @code{getprotobyname} function. 
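As a rough sketch of such a lookup, the helper below maps a protocol name to its number. The name @code{protocol_number} is hypothetical, and the choice of @code{-1} as the failure value is an assumption made for this illustration.

```c
#include <netdb.h>
#include <stddef.h>

/* Hypothetical helper (not a library function): return the number of
   the protocol called NAME, or -1 if the protocols database does not
   list it.  */
int
protocol_number (const char *name)
{
  struct protoent *entry = getprotobyname (name);

  if (entry == NULL)
    return -1;

  /* p_proto is in host byte order, so no conversion is needed.  */
  return entry->p_proto;
}
```

The value returned on success is suitable as the @var{protocol} argument to @code{socket}, although zero (the default protocol for the style) is nearly always the better choice.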
+ +Here are detailed descriptions of the utilities for accessing the +protocols database. These are declared in @file{netdb.h}. +@pindex netdb.h + +@comment netdb.h +@comment BSD +@deftp {Data Type} {struct protoent} +This data type is used to represent entries in the network protocols +database. It has the following members: + +@table @code +@item char *p_name +This is the official name of the protocol. + +@item char **p_aliases +These are alternate names for the protocol, specified as an array of +strings. The last element of the array is a null pointer. + +@item int p_proto +This is the protocol number (in host byte order); use this member as the +@var{protocol} argument to @code{socket}. +@end table +@end deftp + +You can use @code{getprotobyname} and @code{getprotobynumber} to search +the protocols database for a specific protocol. The information is +returned in a statically-allocated structure; you must copy the +information if you need to save it across calls. + +@comment netdb.h +@comment BSD +@deftypefun {struct protoent *} getprotobyname (const char *@var{name}) +The @code{getprotobyname} function returns information about the +network protocol named @var{name}. If there is no such protocol, it +returns a null pointer. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct protoent *} getprotobynumber (int @var{protocol}) +The @code{getprotobynumber} function returns information about the +network protocol with number @var{protocol}. If there is no such +protocol, it returns a null pointer. +@end deftypefun + +You can also scan the whole protocols database one protocol at a time by +using @code{setprotoent}, @code{getprotoent}, and @code{endprotoent}. +Be careful in using these functions, because they are not reentrant. + +@comment netdb.h +@comment BSD +@deftypefun void setprotoent (int @var{stayopen}) +This function opens the protocols database to begin scanning it. 
+ +If the @var{stayopen} argument is nonzero, this sets a flag so that +subsequent calls to @code{getprotobyname} or @code{getprotobynumber} will +not close the database (as they usually would). This makes for more +efficiency if you call those functions several times, by avoiding +reopening the database for each call. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct protoent *} getprotoent (void) +This function returns the next entry in the protocols database. It +returns a null pointer if there are no more entries. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun void endprotoent (void) +This function closes the protocols database. +@end deftypefun + +@node Inet Example +@subsection Internet Socket Example + +Here is an example showing how to create and name a socket in the +Internet namespace. The newly created socket exists on the machine that +the program is running on. Rather than finding and using the machine's +Internet address, this example specifies @code{INADDR_ANY} as the host +address; the system replaces that with the machine's actual address. + +@smallexample +@include mkisock.c.texi +@end smallexample + +Here is another example, showing how you can fill in a @code{sockaddr_in} +structure, given a host name string and a port number: + +@smallexample +@include isockad.c.texi +@end smallexample + +@node Misc Namespaces +@section Other Namespaces + +@vindex PF_NS +@vindex PF_ISO +@vindex PF_CCITT +@vindex PF_IMPLINK +@vindex PF_ROUTE +Certain other namespaces and associated protocol families are supported +but not documented yet because they are not often used. @code{PF_NS} +refers to the Xerox Network Software protocols. @code{PF_ISO} stands +for Open Systems Interconnect. @code{PF_CCITT} refers to protocols from +CCITT. @file{socket.h} defines these symbols and others naming protocols +not actually implemented. + +@code{PF_IMPLINK} is used for communicating between hosts and Internet +Message Processors. 
For information on this, and on @code{PF_ROUTE}, an +occasionally-used local area routing protocol, see the GNU Hurd Manual +(to appear in the future). + +@node Open/Close Sockets +@section Opening and Closing Sockets + +This section describes the actual library functions for opening and +closing sockets. The same functions work for all namespaces and +connection styles. + +@menu +* Creating a Socket:: How to open a socket. +* Closing a Socket:: How to close a socket. +* Socket Pairs:: These are created like pipes. +@end menu + +@node Creating a Socket +@subsection Creating a Socket +@cindex creating a socket +@cindex socket, creating +@cindex opening a socket + +The primitive for creating a socket is the @code{socket} function, +declared in @file{sys/socket.h}. +@pindex sys/socket.h + +@comment sys/socket.h +@comment BSD +@deftypefun int socket (int @var{namespace}, int @var{style}, int @var{protocol}) +This function creates a socket and specifies communication style +@var{style}, which should be one of the socket styles listed in +@ref{Communication Styles}. The @var{namespace} argument specifies +the namespace; it must be @code{PF_FILE} (@pxref{File Namespace}) or +@code{PF_INET} (@pxref{Internet Namespace}). @var{protocol} +designates the specific protocol (@pxref{Socket Concepts}); zero is +usually right for @var{protocol}. + +The return value from @code{socket} is the file descriptor for the new +socket, or @code{-1} in case of error. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EPROTONOSUPPORT +The @var{protocol} or @var{style} is not supported by the +@var{namespace} specified. + +@item EMFILE +The process already has too many file descriptors open. + +@item ENFILE +The system already has too many file descriptors open. + +@item EACCES +The process does not have privilege to create a socket of the specified +@var{style} or @var{protocol}. + +@item ENOBUFS +The system ran out of internal buffer space.
+@end table + +The file descriptor returned by the @code{socket} function supports both +read and write operations. But, like pipes, sockets do not support file +positioning operations. +@end deftypefun + +For examples of how to call the @code{socket} function, +see @ref{File Namespace}, or @ref{Inet Example}. + + +@node Closing a Socket +@subsection Closing a Socket +@cindex socket, closing +@cindex closing a socket +@cindex shutting down a socket +@cindex socket shutdown + +When you are finished using a socket, you can simply close its +file descriptor with @code{close}; see @ref{Opening and Closing Files}. +If there is still data waiting to be transmitted over the connection, +normally @code{close} tries to complete this transmission. You +can control this behavior using the @code{SO_LINGER} socket option to +specify a timeout period; see @ref{Socket Options}. + +@pindex sys/socket.h +You can also shut down only reception or only transmission on a +connection by calling @code{shutdown}, which is declared in +@file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypefun int shutdown (int @var{socket}, int @var{how}) +The @code{shutdown} function shuts down the connection of socket +@var{socket}. The argument @var{how} specifies what action to +perform: + +@table @code +@item 0 +Stop receiving data for this socket. If further data arrives, +reject it. + +@item 1 +Stop trying to transmit data from this socket. Discard any data +waiting to be sent. Stop looking for acknowledgement of data already +sent; don't retransmit it if it is lost. + +@item 2 +Stop both reception and transmission. +@end table + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +@var{socket} is not a valid file descriptor. + +@item ENOTSOCK +@var{socket} is not a socket. + +@item ENOTCONN +@var{socket} is not connected. 
+@end table +@end deftypefun + +@node Socket Pairs +@subsection Socket Pairs +@cindex creating a socket pair +@cindex socket pair +@cindex opening a socket pair + +@pindex sys/socket.h +A @dfn{socket pair} consists of a pair of connected (but unnamed) +sockets. It is very similar to a pipe and is used in much the same +way. Socket pairs are created with the @code{socketpair} function, +declared in @file{sys/socket.h}. A socket pair is much like a pipe; the +main difference is that the socket pair is bidirectional, whereas the +pipe has one input-only end and one output-only end (@pxref{Pipes and +FIFOs}). + +@comment sys/socket.h +@comment BSD +@deftypefun int socketpair (int @var{namespace}, int @var{style}, int @var{protocol}, int @var{filedes}@t{[2]}) +This function creates a socket pair, returning the file descriptors in +@code{@var{filedes}[0]} and @code{@var{filedes}[1]}. The socket pair +is a full-duplex communications channel, so that both reading and writing +may be performed at either end. + +The @var{namespace}, @var{style}, and @var{protocol} arguments are +interpreted as for the @code{socket} function. @var{style} should be +one of the communication styles listed in @ref{Communication Styles}. +The @var{namespace} argument specifies the namespace, which must be +@code{AF_FILE} (@pxref{File Namespace}); @var{protocol} specifies the +communications protocol, but zero is the only meaningful value. + +If @var{style} specifies a connectionless communication style, then +the two sockets you get are not @emph{connected}, strictly speaking, +but each of them knows the other as the default destination address, +so they can send packets to each other. + +The @code{socketpair} function returns @code{0} on success and @code{-1} +on failure. The following @code{errno} error conditions are defined +for this function: + +@table @code +@item EMFILE +The process has too many file descriptors open. + +@item EAFNOSUPPORT +The specified namespace is not supported. 
+ +@item EPROTONOSUPPORT +The specified protocol is not supported. + +@item EOPNOTSUPP +The specified protocol does not support the creation of socket pairs. +@end table +@end deftypefun + +@node Connections +@section Using Sockets with Connections + +@cindex connection +@cindex client +@cindex server +The most common communication styles involve making a connection to a +particular other socket, and then exchanging data with that socket +over and over. Making a connection is asymmetric; one side (the +@dfn{client}) acts to request a connection, while the other side (the +@dfn{server}) makes a socket and waits for the connection request. + +@iftex +@itemize @bullet +@item +@ref{Connecting}, describes what the client program must do to +initiate a connection with a server. + +@item +@ref{Listening}, and @ref{Accepting Connections}, describe what the +server program must do to wait for and act upon connection requests +from clients. + +@item +@ref{Transferring Data}, describes how data is transferred through the +connected socket. +@end itemize +@end iftex + +@menu +* Connecting:: What the client program must do. +* Listening:: How a server program waits for requests. +* Accepting Connections:: What the server does when it gets a request. +* Who is Connected:: Getting the address of the + other side of a connection. +* Transferring Data:: How to send and receive data. +* Byte Stream Example:: An example program: a client for communicating + over a byte stream socket in the Internet namespace. +* Server Example:: A corresponding server program. +* Out-of-Band Data:: This is an advanced feature. +@end menu + +@node Connecting +@subsection Making a Connection +@cindex connecting a socket +@cindex socket, connecting +@cindex socket, initiating a connection +@cindex socket, client actions + +To establish a connection, the client initiates it while the server +waits for and accepts it.
Here we discuss what the client +program must do, using the @code{connect} function, which is declared in +@file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypefun int connect (int @var{socket}, struct sockaddr *@var{addr}, size_t @var{length}) +The @code{connect} function initiates a connection from the socket +with file descriptor @var{socket} to the socket whose address is +specified by the @var{addr} and @var{length} arguments. (This socket +is typically on another machine, and it must be already set up as a +server.) @xref{Socket Addresses}, for information about how these +arguments are interpreted. + +Normally, @code{connect} waits until the server responds to the request +before it returns. You can set nonblocking mode on the socket +@var{socket} to make @code{connect} return immediately without waiting +for the response. @xref{File Status Flags}, for information about +nonblocking mode. +@c !!! how do you tell when it has finished connecting? I suspect the +@c way you do it is select for writing. + +The normal return value from @code{connect} is @code{0}. If an error +occurs, @code{connect} returns @code{-1}. The following @code{errno} +error conditions are defined for this function: + +@table @code +@item EBADF +The socket @var{socket} is not a valid file descriptor. + +@item ENOTSOCK +The socket @var{socket} is not a socket. + +@item EADDRNOTAVAIL +The specified address is not available on the remote machine. + +@item EAFNOSUPPORT +The namespace of the @var{addr} is not supported by this socket. + +@item EISCONN +The socket @var{socket} is already connected. + +@item ETIMEDOUT +The attempt to establish the connection timed out. + +@item ECONNREFUSED +The server has actively refused to establish the connection. + +@item ENETUNREACH +The network of the given @var{addr} isn't reachable from this host. + +@item EADDRINUSE +The socket address of the given @var{addr} is already in use. 
+ +@item EINPROGRESS +The socket @var{socket} is non-blocking and the connection could not be +established immediately. You can determine when the connection is +completely established with @code{select}; @pxref{Waiting for I/O}. +Another @code{connect} call on the same socket, before the connection is +completely established, will fail with @code{EALREADY}. + +@item EALREADY +The socket @var{socket} is non-blocking and already has a pending +connection in progress (see @code{EINPROGRESS} above). +@end table +@end deftypefun + +@node Listening +@subsection Listening for Connections +@cindex listening (sockets) +@cindex sockets, server actions +@cindex sockets, listening + +Now let us consider what the server process must do to accept +connections on a socket. First it must use the @code{listen} function +to enable connection requests on the socket, and then accept each +incoming connection with a call to @code{accept} (@pxref{Accepting +Connections}). Once connection requests are enabled on a server socket, +the @code{select} function reports when the socket has a connection +ready to be accepted (@pxref{Waiting for I/O}). + +The @code{listen} function is not allowed for sockets using +connectionless communication styles. + +You can write a network server that does not even start running until a +connection to it is requested. @xref{Inetd Servers}. + +In the Internet namespace, there are no special protection mechanisms +for controlling access to connect to a port; any process on any machine +can make a connection to your server. If you want to restrict access to +your server, make it examine the addresses associated with connection +requests or implement some other handshaking or identification +protocol. + +In the File namespace, the ordinary file protection bits control who has +access to connect to the socket. 
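The first server step described above, enabling connection requests, might look like the sketch below. It is an illustration under stated assumptions rather than a recommended server: @code{make_listener} is a hypothetical helper, the socket is bound to the loopback address so that only local clients can reach it, and it uses @code{bind}, which is documented with socket addresses rather than in this section.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not a library function): create a stream
   socket bound to PORT (given in host byte order) on the loopback
   address and enable connection requests on it.  QUEUE_LENGTH is
   passed to listen.  Returns the descriptor, or -1 on failure.  */
int
make_listener (unsigned short int port, int queue_length)
{
  struct sockaddr_in name;
  int sock = socket (PF_INET, SOCK_STREAM, 0);

  if (sock < 0)
    return -1;

  /* Both the port and the host address must be stored in network
     byte order.  */
  memset (&name, 0, sizeof name);
  name.sin_family = AF_INET;
  name.sin_port = htons (port);
  name.sin_addr.s_addr = htonl (INADDR_LOOPBACK);

  if (bind (sock, (struct sockaddr *) &name, sizeof name) < 0
      || listen (sock, queue_length) < 0)
    {
      close (sock);
      return -1;
    }

  return sock;
}
```

Passing zero for the port asks the system to choose an unused port number, which is convenient for experiments. A real server would normally pass the port it obtained from the services database.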
+ +@comment sys/socket.h +@comment BSD +@deftypefun int listen (int @var{socket}, unsigned int @var{n}) +The @code{listen} function enables the socket @var{socket} to accept +connections, thus making it a server socket. + +The argument @var{n} specifies the length of the queue for pending +connections. When the queue fills, new clients attempting to connect +fail with @code{ECONNREFUSED} until the server calls @code{accept} to +accept a connection from the queue. + +The @code{listen} function returns @code{0} on success and @code{-1} +on failure. The following @code{errno} error conditions are defined +for this function: + +@table @code +@item EBADF +The argument @var{socket} is not a valid file descriptor. + +@item ENOTSOCK +The argument @var{socket} is not a socket. + +@item EOPNOTSUPP +The socket @var{socket} does not support this operation. +@end table +@end deftypefun + +@node Accepting Connections +@subsection Accepting Connections +@cindex sockets, accepting connections +@cindex accepting connections + +When a server receives a connection request, it can complete the +connection by accepting the request. Use the function @code{accept} +to do this. + +A socket that has been established as a server can accept connection +requests from multiple clients. The server's original socket +@emph{does not become part} of the connection; instead, @code{accept} +makes a new socket which participates in the connection. +@code{accept} returns the descriptor for this socket. The server's +original socket remains available for listening for further connection +requests. + +The number of pending connection requests on a server socket is finite. +If connection requests arrive from clients faster than the server can +act upon them, the queue can fill up and additional requests are refused +with an @code{ECONNREFUSED} error.
You can specify the maximum length of +this queue as an argument to the @code{listen} function, although the +system may also impose its own internal limit on the length of this +queue. + +@comment sys/socket.h +@comment BSD +@deftypefun int accept (int @var{socket}, struct sockaddr *@var{addr}, size_t *@var{length-ptr}) +This function is used to accept a connection request on the server +socket @var{socket}. + +The @code{accept} function waits if there are no connections pending, +unless the socket @var{socket} has nonblocking mode set. (You can use +@code{select} to wait for a pending connection, with a nonblocking +socket.) @xref{File Status Flags}, for information about nonblocking +mode. + +The @var{addr} and @var{length-ptr} arguments are used to return +information about the name of the client socket that initiated the +connection. @xref{Socket Addresses}, for details of the address +format. + +Accepting a connection does not make @var{socket} part of the +connection. Instead, it creates a new socket which becomes +connected. The normal return value of @code{accept} is the file +descriptor for the new socket. + +After @code{accept}, the original socket @var{socket} remains open and +unconnected, and continues listening until you close it. You can +accept further connections with @var{socket} by calling @code{accept} +again. + +If an error occurs, @code{accept} returns @code{-1}. The following +@code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item EOPNOTSUPP +The descriptor @var{socket} does not support this operation. + +@item EWOULDBLOCK +@var{socket} has nonblocking mode set, and there are no pending +connections immediately available. +@end table +@end deftypefun + +The @code{accept} function is not allowed for sockets using +connectionless communication styles.
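To see how the listening socket and the socket returned by @code{accept} differ, consider this small self-contained sketch. It plays both roles in one process over the loopback address, which is an assumption made purely to keep the example short; real clients and servers are separate programs. The helper name @code{accept_one_byte} is hypothetical. The sketch also uses @code{getsockname} to learn the port the system chose, and declares the length variable as @code{socklen_t}, the type current systems use for this argument.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: pass BYTE from a client socket to the socket
   created by accept.  Returns the byte the server side read, or -1
   on failure.  */
int
accept_one_byte (char byte)
{
  struct sockaddr_in name, peer;
  socklen_t length = sizeof name;
  int listener, client = -1, conn = -1;
  int result = -1;
  char received;

  listener = socket (PF_INET, SOCK_STREAM, 0);
  if (listener < 0)
    return -1;

  /* Bind to the loopback address; port zero lets the system pick.  */
  memset (&name, 0, sizeof name);
  name.sin_family = AF_INET;
  name.sin_port = htons (0);
  name.sin_addr.s_addr = htonl (INADDR_LOOPBACK);

  if (bind (listener, (struct sockaddr *) &name, sizeof name) < 0
      || listen (listener, 1) < 0
      || getsockname (listener, (struct sockaddr *) &name, &length) < 0)
    goto done;

  /* The client's request waits in the listener's queue.  */
  client = socket (PF_INET, SOCK_STREAM, 0);
  if (client < 0
      || connect (client, (struct sockaddr *) &name, sizeof name) < 0)
    goto done;

  /* accept makes a new connected socket; LISTENER stays unconnected.  */
  length = sizeof peer;
  conn = accept (listener, (struct sockaddr *) &peer, &length);
  if (conn < 0)
    goto done;

  if (write (client, &byte, 1) == 1 && read (conn, &received, 1) == 1)
    result = (unsigned char) received;

done:
  if (conn >= 0)
    close (conn);
  if (client >= 0)
    close (client);
  close (listener);
  return result;
}
```

Data written by the client arrives on the descriptor returned by @code{accept}, never on the listening socket itself, which could go on to accept further connections.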
+ +@node Who is Connected +@subsection Who is Connected to Me? + +@comment sys/socket.h +@comment BSD +@deftypefun int getpeername (int @var{socket}, struct sockaddr *@var{addr}, size_t *@var{length-ptr}) +The @code{getpeername} function returns the address of the socket that +@var{socket} is connected to; it stores the address in the memory space +specified by @var{addr} and @var{length-ptr}. It stores the length of +the address in @code{*@var{length-ptr}}. + +@xref{Socket Addresses}, for information about the format of the +address. In some operating systems, @code{getpeername} works only for +sockets in the Internet domain. + +The return value is @code{0} on success and @code{-1} on error. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The argument @var{socket} is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item ENOTCONN +The socket @var{socket} is not connected. + +@item ENOBUFS +There are not enough internal buffers available. +@end table +@end deftypefun + + +@node Transferring Data +@subsection Transferring Data +@cindex reading from a socket +@cindex writing to a socket + +Once a socket has been connected to a peer, you can use the ordinary +@code{read} and @code{write} operations (@pxref{I/O Primitives}) to +transfer data. A socket is a two-way communications channel, so read +and write operations can be performed at either end. + +There are also some I/O modes that are specific to socket operations. +In order to specify these modes, you must use the @code{recv} and +@code{send} functions instead of the more generic @code{read} and +@code{write} functions. The @code{recv} and @code{send} functions take +an additional argument which you can use to specify various flags to +control the special I/O modes. 
For example, you can specify the +@code{MSG_OOB} flag to read or write out-of-band data, the +@code{MSG_PEEK} flag to peek at input, or the @code{MSG_DONTROUTE} flag +to control inclusion of routing information on output. + +@menu +* Sending Data:: Sending data with @code{send}. +* Receiving Data:: Reading data with @code{recv}. +* Socket Data Options:: Using @code{send} and @code{recv}. +@end menu + +@node Sending Data +@subsubsection Sending Data + +@pindex sys/socket.h +The @code{send} function is declared in the header file +@file{sys/socket.h}. If your @var{flags} argument is zero, you can just +as well use @code{write} instead of @code{send}; see @ref{I/O +Primitives}. If the socket was connected but the connection has broken, +you get a @code{SIGPIPE} signal for any use of @code{send} or +@code{write} (@pxref{Miscellaneous Signals}). + +@comment sys/socket.h +@comment BSD +@deftypefun int send (int @var{socket}, void *@var{buffer}, size_t @var{size}, int @var{flags}) +The @code{send} function is like @code{write}, but with the additional +flags @var{flags}. The possible values of @var{flags} are described +in @ref{Socket Data Options}. + +This function returns the number of bytes transmitted, or @code{-1} on +failure. If the socket is nonblocking, then @code{send} (like +@code{write}) can return after sending just part of the data. +@xref{File Status Flags}, for information about nonblocking mode. + +Note, however, that a successful return value merely indicates that +the message has been sent without error, not necessarily that it has +been received without error. + +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item EINTR +The operation was interrupted by a signal before any data was sent. +@xref{Interrupted Primitives}. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. 
+ +@item EMSGSIZE +The socket type requires that the message be sent atomically, but the +message is too large for this to be possible. + +@item EWOULDBLOCK +Nonblocking mode has been set on the socket, and the write operation +would block. (Normally @code{send} blocks until the operation can be +completed.) + +@item ENOBUFS +There is not enough internal buffer space available. + +@item ENOTCONN +You never connected this socket. + +@item EPIPE +This socket was connected but the connection is now broken. In this +case, @code{send} generates a @code{SIGPIPE} signal first; if that +signal is ignored or blocked, or if its handler returns, then +@code{send} fails with @code{EPIPE}. +@end table +@end deftypefun + +@node Receiving Data +@subsubsection Receiving Data + +@pindex sys/socket.h +The @code{recv} function is declared in the header file +@file{sys/socket.h}. If your @var{flags} argument is zero, you can +just as well use @code{read} instead of @code{recv}; see @ref{I/O +Primitives}. + +@comment sys/socket.h +@comment BSD +@deftypefun int recv (int @var{socket}, void *@var{buffer}, size_t @var{size}, int @var{flags}) +The @code{recv} function is like @code{read}, but with the additional +flags @var{flags}. The possible values of @var{flags} are described +in @ref{Socket Data Options}. + +If nonblocking mode is set for @var{socket}, and no data is available to +be read, @code{recv} fails immediately rather than waiting. @xref{File +Status Flags}, for information about nonblocking mode. + +This function returns the number of bytes received, or @code{-1} on failure. +The following @code{errno} error conditions are defined for this function: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item EWOULDBLOCK +Nonblocking mode has been set on the socket, and the read operation +would block. (Normally, @code{recv} blocks until there is input +available to be read.)
+ +@item EINTR +The operation was interrupted by a signal before any data was read. +@xref{Interrupted Primitives}. + +@item ENOTCONN +You never connected this socket. +@end table +@end deftypefun + +@node Socket Data Options +@subsubsection Socket Data Options + +@pindex sys/socket.h +The @var{flags} argument to @code{send} and @code{recv} is a bit +mask. You can bitwise-OR the values of the following macros together +to obtain a value for this argument. All are defined in the header +file @file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int MSG_OOB +Send or receive out-of-band data. @xref{Out-of-Band Data}. +@end deftypevr + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int MSG_PEEK +Look at the data but don't remove it from the input queue. This is +only meaningful with input functions such as @code{recv}, not with +@code{send}. +@end deftypevr + +@comment sys/socket.h +@comment BSD +@deftypevr Macro int MSG_DONTROUTE +Don't include routing information in the message. This is only +meaningful with output operations, and is usually only of interest for +diagnostic or routing programs. We don't try to explain it here. +@end deftypevr + +@node Byte Stream Example +@subsection Byte Stream Socket Example + +Here is an example client program that makes a connection for a byte +stream socket in the Internet namespace. It doesn't do anything +particularly interesting once it has connected to the server; it just +sends a text string to the server and exits. + +@smallexample +@include inetcli.c.texi +@end smallexample + +@node Server Example +@subsection Byte Stream Connection Server Example + +The server end is much more complicated. Since we want to allow +multiple clients to be connected to the server at the same time, it +would be incorrect to wait for input from a single client by simply +calling @code{read} or @code{recv}. 
Instead, the right thing to do is +to use @code{select} (@pxref{Waiting for I/O}) to wait for input on +all of the open sockets. This also allows the server to deal with +additional connection requests. + +This particular server doesn't do anything interesting once it has +gotten a message from a client. It does close the socket for that +client when it detects an end-of-file condition (resulting from the +client shutting down its end of the connection). + +This program uses @code{make_socket} and @code{init_sockaddr} to set +up the socket address; see @ref{Inet Example}. + +@smallexample +@include inetsrv.c.texi +@end smallexample + +@node Out-of-Band Data +@subsection Out-of-Band Data + +@cindex out-of-band data +@cindex high-priority data +Streams with connections permit @dfn{out-of-band} data that is +delivered with higher priority than ordinary data. Typically the +reason for sending out-of-band data is to send notice of an +exceptional condition. The way to send out-of-band data is using +@code{send}, specifying the flag @code{MSG_OOB} (@pxref{Sending +Data}). + +Out-of-band data is received with higher priority because the +receiving process need not read it in sequence; to read the next +available out-of-band data, use @code{recv} with the @code{MSG_OOB} +flag (@pxref{Receiving Data}). Ordinary read operations do not read +out-of-band data; they read only the ordinary data. + +@cindex urgent socket condition +When a socket finds that out-of-band data is on its way, it sends a +@code{SIGURG} signal to the owner process or process group of the +socket. You can specify the owner using the @code{F_SETOWN} command +to the @code{fcntl} function; see @ref{Interrupt Input}. You must +also establish a handler for this signal, as described in @ref{Signal +Handling}, in order to take appropriate action such as reading the +out-of-band data. 
+ +Alternatively, you can test for pending out-of-band data, or wait +until there is out-of-band data, using the @code{select} function; it +can wait for an exceptional condition on the socket. @xref{Waiting +for I/O}, for more information about @code{select}. + +Notification of out-of-band data (whether with @code{SIGURG} or with +@code{select}) indicates that out-of-band data is on the way; the data +may not actually arrive until later. If you try to read the +out-of-band data before it arrives, @code{recv} fails with an +@code{EWOULDBLOCK} error. + +Sending out-of-band data automatically places a ``mark'' in the stream +of ordinary data, showing where in the sequence the out-of-band data +``would have been''. This is useful when the meaning of out-of-band +data is ``cancel everything sent so far''. Here is how you can test, +in the receiving process, whether all the ordinary data sent before +the mark has been read; the call stores a nonzero value in +@code{result} when the socket is at the mark: + +@smallexample +success = ioctl (socket, SIOCATMARK, &result); +@end smallexample + +Here's a function to discard any ordinary data preceding the +out-of-band mark; it returns @code{0} once the mark is reached, or +@code{-1} on error: + +@smallexample +int +discard_until_mark (int socket) +@{ + while (1) + @{ + /* @r{This is not an arbitrary limit; any size will do.} */ + char buffer[1024]; + int result, success; + + /* @r{If we have reached the mark, return.} */ + success = ioctl (socket, SIOCATMARK, &result); + if (success < 0) + @{ + perror ("ioctl"); + return -1; + @} + if (result) + return 0; + + /* @r{Otherwise, read a bunch of ordinary data and discard it.} + @r{This is guaranteed not to read past the mark} + @r{if it starts before the mark.} */ + success = read (socket, buffer, sizeof buffer); + if (success < 0) + @{ + perror ("read"); + return -1; + @} + @} +@} +@end smallexample + +If you don't want to discard the ordinary data preceding the mark, you +may need to read some of it anyway, to make room in internal system +buffers for the out-of-band data.
If you try to read out-of-band data +and get an @code{EWOULDBLOCK} error, try reading some ordinary data +(saving it so that you can use it when you want it) and see if that +makes room. Here is an example: + +@smallexample +#define BUF_SZ 1024 + +struct buffer +@{ + char *buffer; + int size; + struct buffer *next; +@}; + +/* @r{Read the out-of-band data from SOCKET and return it} + @r{as a `struct buffer', which records the address of the data} + @r{and its size.} + + @r{It may be necessary to read some ordinary data} + @r{in order to make room for the out-of-band data.} + @r{If so, the ordinary data is saved as a chain of buffers} + @r{found in the `next' field of the value.} */ + +struct buffer * +read_oob (int socket) +@{ + struct buffer *tail = 0; + struct buffer *list = 0; + + while (1) + @{ + /* @r{This is an arbitrary limit.} + @r{Does anyone know how to do this without a limit?} */ + char *buffer = (char *) xmalloc (BUF_SZ); + int success; + int result; + + /* @r{Try again to read the out-of-band data.} + @r{Pass BUF_SZ, not `sizeof buffer', which is only the size} + @r{of the pointer.} */ + success = recv (socket, buffer, BUF_SZ, MSG_OOB); + if (success >= 0) + @{ + /* @r{We got it, so return it.} */ + struct buffer *link + = (struct buffer *) xmalloc (sizeof (struct buffer)); + link->buffer = buffer; + link->size = success; + link->next = list; + return link; + @} + + /* @r{If we fail, see if we are at the mark.} */ + success = ioctl (socket, SIOCATMARK, &result); + if (success < 0) + perror ("ioctl"); + if (result) + @{ + /* @r{At the mark; skipping past more ordinary data cannot help.} + @r{So release the unused buffer and just wait a while.} */ + free (buffer); + sleep (1); + continue; + @} + + /* @r{Otherwise, read a bunch of ordinary data and save it.} + @r{This is guaranteed not to read past the mark} + @r{if it starts before the mark.} */ + success = read (socket, buffer, BUF_SZ); + if (success < 0) + perror ("read"); + + /* @r{Save this data in the buffer list.} */ + @{ + struct buffer *link + = (struct buffer *) xmalloc (sizeof (struct buffer)); + link->buffer =
buffer; + link->size = success; + /* @r{This link is the new end of the chain.} */ + link->next = 0; + + /* @r{Add the new link to the end of the list.} */ + if (tail) + tail->next = link; + else + list = link; + tail = link; + @} + @} +@} +@end smallexample + +@node Datagrams +@section Datagram Socket Operations + +@cindex datagram socket +This section describes how to use communication styles that don't use +connections (styles @code{SOCK_DGRAM} and @code{SOCK_RDM}). Using +these styles, you group data into packets and each packet is an +independent communication. You specify the destination for each +packet individually. + +Datagram packets are like letters: you send each one independently, +with its own destination address, and they may arrive in the wrong +order or not at all. + +The @code{listen} and @code{accept} functions are not allowed for +sockets using connectionless communication styles. + +@menu +* Sending Datagrams:: Sending packets on a datagram socket. +* Receiving Datagrams:: Receiving packets on a datagram socket. +* Datagram Example:: An example program: packets sent over a + datagram socket in the file namespace. +* Example Receiver:: Another program that receives those packets. +@end menu + +@node Sending Datagrams +@subsection Sending Datagrams +@cindex sending a datagram +@cindex transmitting datagrams +@cindex datagrams, transmitting + +@pindex sys/socket.h +The normal way of sending data on a datagram socket is by using the +@code{sendto} function, declared in @file{sys/socket.h}. + +You can call @code{connect} on a datagram socket, but this only +specifies a default destination for further data transmission on the +socket. When a socket has a default destination, then you can use +@code{send} (@pxref{Sending Data}) or even @code{write} (@pxref{I/O +Primitives}) to send a packet there. You can cancel the default +destination by calling @code{connect} using an address format of +@code{AF_UNSPEC} in the @var{addr} argument. @xref{Connecting}, for +more information about the @code{connect} function.
+ +@comment sys/socket.h +@comment BSD +@deftypefun int sendto (int @var{socket}, void *@var{buffer}, size_t @var{size}, int @var{flags}, struct sockaddr *@var{addr}, size_t @var{length}) +The @code{sendto} function transmits the data in the @var{buffer} +through the socket @var{socket} to the destination address specified +by the @var{addr} and @var{length} arguments. The @var{size} argument +specifies the number of bytes to be transmitted. + +The @var{flags} are interpreted the same way as for @code{send}; see +@ref{Socket Data Options}. + +The return value and error conditions are also the same as for +@code{send}, but you cannot rely on the system to detect errors and +report them; the most common error is that the packet is lost or there +is no one at the specified address to receive it, and the operating +system on your machine usually does not know this. + +It is also possible for one call to @code{sendto} to report an error +due to a problem related to a previous call. +@end deftypefun + +@node Receiving Datagrams +@subsection Receiving Datagrams +@cindex receiving datagrams + +The @code{recvfrom} function reads a packet from a datagram socket and +also tells you where it was sent from. This function is declared in +@file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypefun int recvfrom (int @var{socket}, void *@var{buffer}, size_t @var{size}, int @var{flags}, struct sockaddr *@var{addr}, size_t *@var{length-ptr}) +The @code{recvfrom} function reads one packet from the socket +@var{socket} into the buffer @var{buffer}. The @var{size} argument +specifies the maximum number of bytes to be read. + +If the packet is longer than @var{size} bytes, then you get the first +@var{size} bytes of the packet, and the rest of the packet is lost. +There's no way to read the rest of the packet. Thus, when you use a +packet protocol, you must always know how long a packet to expect.
+ +The @var{addr} and @var{length-ptr} arguments are used to return the +address where the packet came from. @xref{Socket Addresses}. For a +socket in the file domain, the address information won't be meaningful, +since you can't read the address of such a socket (@pxref{File +Namespace}). You can specify a null pointer as the @var{addr} argument +if you are not interested in this information. + +The @var{flags} are interpreted the same way as for @code{recv} +(@pxref{Socket Data Options}). The return value and error conditions +are also the same as for @code{recv}. +@end deftypefun + +You can use plain @code{recv} (@pxref{Receiving Data}) instead of +@code{recvfrom} if you don't need to find out who sent the packet +(either because you know where it should come from or because you +treat all possible senders alike). Even @code{read} can be used if +you don't want to specify @var{flags} (@pxref{I/O Primitives}). + +@ignore +@c sendmsg and recvmsg are like readv and writev in that they +@c use a series of buffers. It's not clear this is worth +@c supporting or that we support them. +@c !!! they can do more; it is hairy + +@comment sys/socket.h +@comment BSD +@deftp {Data Type} {struct msghdr} +@end deftp + +@comment sys/socket.h +@comment BSD +@deftypefun int sendmsg (int @var{socket}, const struct msghdr *@var{message}, int @var{flags}) +@end deftypefun + +@comment sys/socket.h +@comment BSD +@deftypefun int recvmsg (int @var{socket}, struct msghdr *@var{message}, int @var{flags}) +@end deftypefun +@end ignore + +@node Datagram Example +@subsection Datagram Socket Example + +Here is a set of example programs that send messages over a datagram +socket in the file namespace. Both the client and server programs use the +@code{make_named_socket} function that was presented in @ref{File +Namespace}, to create and name their sockets. + +First, here is the server program. It sits in a loop waiting for +messages to arrive, bouncing each message back to the sender.
+Obviously, this isn't a particularly useful program, but it does show +the general ideas involved. + +@smallexample +@include filesrv.c.texi +@end smallexample + +@node Example Receiver +@subsection Example of Reading Datagrams + +Here is the client program corresponding to the server above. + +It sends a datagram to the server and then waits for a reply. Notice +that the socket for the client (as well as for the server) in this +example has to be given a name. This is so that the server can direct +a message back to the client. Since the socket has no associated +connection state, the only way the server can do this is by +referencing the name of the client. + +@smallexample +@include filecli.c.texi +@end smallexample + +Keep in mind that datagram socket communications are unreliable. In +this example, the client program waits indefinitely if the message +never reaches the server or if the server's response never comes +back. It's up to the user running the program to kill it and restart +it, if desired. A more automatic solution could be to use +@code{select} (@pxref{Waiting for I/O}) to establish a timeout period +for the reply, and in case of timeout either resend the message or +shut down the socket and exit. + +@node Inetd +@section The @code{inetd} Daemon + +We've explained above how to write a server program that does its own +listening. Such a server must already be running in order for anyone +to connect to it. + +Another way to provide service for an Internet port is to let the daemon +program @code{inetd} do the listening. @code{inetd} is a program that +runs all the time and waits (using @code{select}) for messages on a +specified set of ports. When it receives a message, it accepts the +connection (if the socket style calls for connections) and then forks a +child process to run the corresponding server program. You specify the +ports and their programs in the file @file{/etc/inetd.conf}. 
+ +@menu +* Inetd Servers:: +* Configuring Inetd:: +@end menu + +@node Inetd Servers +@subsection @code{inetd} Servers + +Writing a server program to be run by @code{inetd} is very simple. Each time +someone requests a connection to the appropriate port, a new server +process starts. The connection already exists at this time; the +socket is available as the standard input descriptor and as the +standard output descriptor (descriptors 0 and 1) in the server +process. So the server program can begin reading and writing data +right away. Often the program needs only the ordinary I/O facilities; +in fact, a general-purpose filter program that knows nothing about +sockets can work as a byte stream server run by @code{inetd}. + +You can also use @code{inetd} for servers that use connectionless +communication styles. For these servers, @code{inetd} does not try to accept +a connection, since no connection is possible. It just starts the +server program, which can read the incoming datagram packet from +descriptor 0. The server program can handle one request and then +exit, or you can choose to write it to keep reading more requests +until no more arrive, and then exit. You must specify which of these +two techniques the server uses, when you configure @code{inetd}. + +@node Configuring Inetd +@subsection Configuring @code{inetd} + +The file @file{/etc/inetd.conf} tells @code{inetd} which ports to listen to +and what server programs to run for them. Normally each entry in the +file is one line, but you can split it onto multiple lines provided +all but the first line of the entry start with whitespace. Lines that +start with @samp{#} are comments. 
+ +Here are two standard entries in @file{/etc/inetd.conf}: + +@smallexample +ftp stream tcp nowait root /libexec/ftpd ftpd +talk dgram udp wait root /libexec/talkd talkd +@end smallexample + +An entry has this format: + +@smallexample +@var{service} @var{style} @var{protocol} @var{wait} @var{username} @var{program} @var{arguments} +@end smallexample + +The @var{service} field says which service this program provides. It +should be the name of a service defined in @file{/etc/services}. +@code{inetd} uses @var{service} to decide which port to listen on for +this entry. + +The fields @var{style} and @var{protocol} specify the communication +style and the protocol to use for the listening socket. The style +should be the name of a communication style, converted to lower case +and with @samp{SOCK_} deleted---for example, @samp{stream} or +@samp{dgram}. @var{protocol} should be one of the protocols listed in +@file{/etc/protocols}. The typical protocol names are @samp{tcp} for +byte stream connections and @samp{udp} for unreliable datagrams. + +The @var{wait} field should be either @samp{wait} or @samp{nowait}. +Use @samp{wait} if @var{style} is a connectionless style and the +server, once started, handles multiple requests, as many as come in. +Use @samp{nowait} if @code{inetd} should start a new process for each message +or request that comes in. If @var{style} uses connections, then +@var{wait} @strong{must} be @samp{nowait}. + +@var{username} is the user name that the server should run as. @code{inetd} runs +as root, so it can set the user ID of its children arbitrarily. It's +best to avoid using @samp{root} for @var{username} if you can; but some +servers, such as Telnet and FTP, read a username and password +themselves. These servers need to be root initially so they can log +in as commanded by the data coming over the network. + +@var{program} together with @var{arguments} specifies the command to +run to start the server.
@var{program} should be an absolute file +name specifying the executable file to run. @var{arguments} consists +of any number of whitespace-separated words, which become the +command-line arguments of @var{program}. The first word in +@var{arguments} is argument zero, which should by convention be the +program name itself (sans directories). + +If you edit @file{/etc/inetd.conf}, you can tell @code{inetd} to reread the +file and obey its new contents by sending the @code{inetd} process the +@code{SIGHUP} signal. You'll have to use @code{ps} to determine the +process ID of the @code{inetd} process, as it is not fixed. + +@c !!! could document /etc/inetd.sec + +@node Socket Options +@section Socket Options +@cindex socket options + +This section describes how to read or set various options that modify +the behavior of sockets and their underlying communications protocols. + +@cindex level, for socket options +@cindex socket option level +When you are manipulating a socket option, you must specify which +@dfn{level} the option pertains to. This describes whether the option +applies to the socket interface, or to a lower-level communications +protocol interface. + +@menu +* Socket Option Functions:: The basic functions for setting and getting + socket options. +* Socket-Level Options:: Details of the options at the socket level. +@end menu + +@node Socket Option Functions +@subsection Socket Option Functions + +@pindex sys/socket.h +Here are the functions for examining and modifying socket options. +They are declared in @file{sys/socket.h}. + +@comment sys/socket.h +@comment BSD +@deftypefun int getsockopt (int @var{socket}, int @var{level}, int @var{optname}, void *@var{optval}, size_t *@var{optlen-ptr}) +The @code{getsockopt} function gets information about the value of +option @var{optname} at level @var{level} for socket @var{socket}. + +The option value is stored in a buffer that @var{optval} points to. 
+Before the call, you should supply in @code{*@var{optlen-ptr}} the +size of this buffer; on return, it contains the number of bytes of +information actually stored in the buffer. + +Most options interpret the @var{optval} buffer as a single @code{int} +value. + +The actual return value of @code{getsockopt} is @code{0} on success +and @code{-1} on failure. The following @code{errno} error conditions +are defined: + +@table @code +@item EBADF +The @var{socket} argument is not a valid file descriptor. + +@item ENOTSOCK +The descriptor @var{socket} is not a socket. + +@item ENOPROTOOPT +The @var{optname} doesn't make sense for the given @var{level}. +@end table +@end deftypefun + +@comment sys/socket.h +@comment BSD +@deftypefun int setsockopt (int @var{socket}, int @var{level}, int @var{optname}, void *@var{optval}, size_t @var{optlen}) +This function is used to set the socket option @var{optname} at level +@var{level} for socket @var{socket}. The value of the option is passed +in the buffer @var{optval}, which has size @var{optlen}. + +The return value and error codes for @code{setsockopt} are the same as +for @code{getsockopt}. +@end deftypefun + +@node Socket-Level Options +@subsection Socket-Level Options + +@comment sys/socket.h +@comment BSD +@deftypevr Constant int SOL_SOCKET +Use this constant as the @var{level} argument to @code{getsockopt} or +@code{setsockopt} to manipulate the socket-level options described in +this section. +@end deftypevr + +@pindex sys/socket.h +Here is a table of socket-level option names; all are defined in the +header file @file{sys/socket.h}. + +@table @code +@comment sys/socket.h +@comment BSD +@item SO_DEBUG +@c Extra blank line here makes the table look better. + +This option toggles recording of debugging information in the underlying +protocol modules. The value has type @code{int}; a nonzero value means +``yes''. +@c !!! should say how this is used +@c Ok, anyone who knows, please explain. 
+ +@comment sys/socket.h +@comment BSD +@item SO_REUSEADDR +This option controls whether @code{bind} (@pxref{Setting Address}) +should permit reuse of local addresses for this socket. If you enable +this option, you can actually have two sockets with the same Internet +port number; but the system won't allow you to use the two +identically-named sockets in a way that would confuse the Internet. The +reason for this option is that some higher-level Internet protocols, +including FTP, require you to keep reusing the same socket number. + +The value has type @code{int}; a nonzero value means ``yes''. + +@comment sys/socket.h +@comment BSD +@item SO_KEEPALIVE +This option controls whether the underlying protocol should +periodically transmit messages on a connected socket. If the peer +fails to respond to these messages, the connection is considered +broken. The value has type @code{int}; a nonzero value means +``yes''. + +@comment sys/socket.h +@comment BSD +@item SO_DONTROUTE +This option controls whether outgoing messages bypass the normal +message routing facilities. If set, messages are sent directly to the +network interface instead. The value has type @code{int}; a nonzero +value means ``yes''. + +@comment sys/socket.h +@comment BSD +@item SO_LINGER +This option specifies what should happen when the socket of a type +that promises reliable delivery still has untransmitted messages when +it is closed; see @ref{Closing a Socket}. The value has type +@code{struct linger}. + +@comment sys/socket.h +@comment BSD +@deftp {Data Type} {struct linger} +This structure type has the following members: + +@table @code +@item int l_onoff +This field is interpreted as a boolean. If nonzero, @code{close} +blocks until the data is transmitted or the timeout period has expired. + +@item int l_linger +This specifies the timeout period, in seconds. 
+@end table +@end deftp + +@comment sys/socket.h +@comment BSD +@item SO_BROADCAST +This option controls whether datagrams may be broadcast from the socket. +The value has type @code{int}; a nonzero value means ``yes''. + +@comment sys/socket.h +@comment BSD +@item SO_OOBINLINE +If this option is set, out-of-band data received on the socket is +placed in the normal input queue. This permits it to be read using +@code{read} or @code{recv} without specifying the @code{MSG_OOB} +flag. @xref{Out-of-Band Data}. The value has type @code{int}; a +nonzero value means ``yes''. + +@comment sys/socket.h +@comment BSD +@item SO_SNDBUF +This option gets or sets the size of the output buffer. The value is a +@code{size_t}, which is the size in bytes. + +@comment sys/socket.h +@comment BSD +@item SO_RCVBUF +This option gets or sets the size of the input buffer. The value is a +@code{size_t}, which is the size in bytes. + +@comment sys/socket.h +@comment GNU +@item SO_STYLE +@comment sys/socket.h +@comment BSD +@itemx SO_TYPE +This option can be used with @code{getsockopt} only. It is used to +get the socket's communication style. @code{SO_TYPE} is the +historical name, and @code{SO_STYLE} is the preferred name in GNU. +The value has type @code{int} and its value designates a communication +style; see @ref{Communication Styles}. + +@comment sys/socket.h +@comment BSD +@item SO_ERROR +@c Extra blank line here makes the table look better. + +This option can be used with @code{getsockopt} only. It fetches the +pending error status of the socket and resets it. The value is an +@code{int}, which holds the error status as it was before the reset. +@c !!! what is "socket error status"? this is never defined.
+@end table + +@node Networks Database +@section Networks Database +@cindex networks database +@cindex converting network number to network name +@cindex converting network name to network number + +@pindex /etc/networks +@pindex netdb.h +Many systems come with a database that records a list of networks known +to the system developer. This is usually kept either in the file +@file{/etc/networks} or in an equivalent from a name server. This data +base is useful for routing programs such as @code{route}, but it is not +useful for programs that simply communicate over the network. We +provide functions to access this data base, which are declared in +@file{netdb.h}. + +@comment netdb.h +@comment BSD +@deftp {Data Type} {struct netent} +This data type is used to represent information about entries in the +networks database. It has the following members: + +@table @code +@item char *n_name +This is the ``official'' name of the network. + +@item char **n_aliases +These are alternative names for the network, represented as a vector +of strings. A null pointer terminates the array. + +@item int n_addrtype +This is the type of the network number; this is always equal to +@code{AF_INET} for Internet networks. + +@item unsigned long int n_net +This is the network number. Network numbers are returned in host +byte order; see @ref{Byte Order}. +@end table +@end deftp + +Use the @code{getnetbyname} or @code{getnetbyaddr} functions to search +the networks database for information about a specific network. The +information is returned in a statically-allocated structure; you must +copy the information if you need to save it. + +@comment netdb.h +@comment BSD +@deftypefun {struct netent *} getnetbyname (const char *@var{name}) +The @code{getnetbyname} function returns information about the network +named @var{name}. It returns a null pointer if there is no such +network. 
+@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct netent *} getnetbyaddr (long @var{net}, int @var{type}) +The @code{getnetbyaddr} function returns information about the network +of type @var{type} with number @var{net}. You should specify a value of +@code{AF_INET} for the @var{type} argument for Internet networks. + +@code{getnetbyaddr} returns a null pointer if there is no such +network. +@end deftypefun + +You can also scan the networks database using @code{setnetent}, +@code{getnetent}, and @code{endnetent}. Be careful in using these +functions, because they are not reentrant. + +@comment netdb.h +@comment BSD +@deftypefun void setnetent (int @var{stayopen}) +This function opens and rewinds the networks database. + +If the @var{stayopen} argument is nonzero, this sets a flag so that +subsequent calls to @code{getnetbyname} or @code{getnetbyaddr} will +not close the database (as they usually would). This makes for more +efficiency if you call those functions several times, by avoiding +reopening the database for each call. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun {struct netent *} getnetent (void) +This function returns the next entry in the networks database. It +returns a null pointer if there are no more entries. +@end deftypefun + +@comment netdb.h +@comment BSD +@deftypefun void endnetent (void) +This function closes the networks database. +@end deftypefun diff --git a/manual/startup.texi b/manual/startup.texi new file mode 100644 index 0000000000..c4f2b2f03f --- /dev/null +++ b/manual/startup.texi @@ -0,0 +1,908 @@ +@node Process Startup +@chapter Process Startup and Termination + +@cindex process +@dfn{Processes} are the primitive units for allocation of system +resources. Each process has its own address space and (usually) one +thread of control. 
A process executes a program; you can have multiple +processes executing the same program, but each process has its own copy +of the program within its own address space and executes it +independently of the other copies. + +This chapter explains what your program should do to handle the startup +of a process, to terminate its process, and to receive information +(arguments and the environment) from the parent process. + +@menu +* Program Arguments:: Parsing your program's command-line arguments. +* Environment Variables:: How to access parameters inherited from + a parent process. +* Program Termination:: How to cause a process to terminate and + return status information to its parent. +@end menu + +@node Program Arguments +@section Program Arguments +@cindex program arguments +@cindex command line arguments +@cindex arguments, to program + +@cindex program startup +@cindex startup of program +@cindex invocation of program +@cindex @code{main} function +@findex main +The system starts a C program by calling the function @code{main}. It +is up to you to write a function named @code{main}---otherwise, you +won't even be able to link your program without errors. + +In ANSI C you can define @code{main} either to take no arguments, or to +take two arguments that represent the command line arguments to the +program, like this: + +@smallexample +int main (int @var{argc}, char *@var{argv}[]) +@end smallexample + +@cindex argc (program argument count) +@cindex argv (program argument vector) +The command line arguments are the whitespace-separated tokens given in +the shell command used to invoke the program; thus, in @samp{cat foo +bar}, the arguments are @samp{foo} and @samp{bar}. The only way a +program can look at its command line arguments is via the arguments of +@code{main}. If @code{main} doesn't take arguments, then you cannot get +at the command line. + +The value of the @var{argc} argument is the number of command line +arguments. 
The @var{argv} argument is a vector of C strings; its +elements are the individual command line argument strings. The file +name of the program being run is also included in the vector as the +first element; the value of @var{argc} counts this element. A null +pointer always follows the last element: @code{@var{argv}[@var{argc}]} +is this null pointer. + +For the command @samp{cat foo bar}, @var{argc} is 3 and @var{argv} has +three elements, @code{"cat"}, @code{"foo"} and @code{"bar"}. + +If the syntax for the command line arguments to your program is simple +enough, you can simply pick the arguments off from @var{argv} by hand. +But unless your program takes a fixed number of arguments, or all of the +arguments are interpreted in the same way (as file names, for example), +you are usually better off using @code{getopt} to do the parsing. + +In Unix systems you can define @code{main} a third way, using three arguments: + +@smallexample +int main (int @var{argc}, char *@var{argv}[], char *@var{envp}[]) +@end smallexample + +The first two arguments are just the same. The third argument +@var{envp} gives the process's environment; it is the same as the value +of @code{environ}. @xref{Environment Variables}. POSIX.1 does not +allow this three-argument form, so to be portable it is best to write +@code{main} to take two arguments, and use the value of @code{environ}. + +@menu +* Argument Syntax:: By convention, options start with a hyphen. +* Parsing Options:: The @code{getopt} function. +* Example of Getopt:: An example of parsing options with @code{getopt}. +* Long Options:: GNU suggests utilities accept long-named options. + Here is how to do that. +* Long Option Example:: An example of using @code{getopt_long}. +@end menu + +@node Argument Syntax +@subsection Program Argument Syntax Conventions +@cindex program argument syntax +@cindex syntax, for program arguments +@cindex command argument syntax + +POSIX recommends these conventions for command line arguments.
+@code{getopt} (@pxref{Parsing Options}) makes it easy to implement them. + +@itemize @bullet +@item +Arguments are options if they begin with a hyphen delimiter (@samp{-}). + +@item +Multiple options may follow a hyphen delimiter in a single token if +the options do not take arguments. Thus, @samp{-abc} is equivalent to +@samp{-a -b -c}. + +@item +Option names are single alphanumeric characters (as for @code{isalnum}; +see @ref{Classification of Characters}). + +@item +Certain options require an argument. For example, the @samp{-o} option +of the @code{ld} command requires an argument---an output file name. + +@item +An option and its argument may or may not appear as separate tokens. (In +other words, the whitespace separating them is optional.) Thus, +@w{@samp{-o foo}} and @samp{-ofoo} are equivalent. + +@item +Options typically precede other non-option arguments. + +The implementation of @code{getopt} in the GNU C library normally makes +it appear as if all the option arguments were specified before all the +non-option arguments for the purposes of parsing, even if the user of +your program intermixed option and non-option arguments. It does this +by reordering the elements of the @var{argv} array. This behavior is +nonstandard; if you want to suppress it, define the +@code{_POSIX_OPTION_ORDER} environment variable. @xref{Standard +Environment}. + +@item +The argument @samp{--} terminates all options; any following arguments +are treated as non-option arguments, even if they begin with a hyphen. + +@item +A token consisting of a single hyphen character is interpreted as an +ordinary non-option argument. By convention, it is used to specify +input from or output to the standard input and output streams. + +@item +Options may be supplied in any order, or appear multiple times. The +interpretation is left up to the particular application program. +@end itemize + +@cindex long-named options +GNU adds @dfn{long options} to these conventions.
Long options consist +of @samp{--} followed by a name made of alphanumeric characters and +dashes. Option names are typically one to three words long, with +hyphens to separate words. Users can abbreviate the option names as +long as the abbreviations are unique. + +To specify an argument for a long option, write +@samp{--@var{name}=@var{value}}. This syntax enables a long option to +accept an argument that is itself optional. + +Eventually, the GNU system will provide completion for long option names +in the shell. + +@node Parsing Options +@subsection Parsing Program Options +@cindex program arguments, parsing +@cindex command arguments, parsing +@cindex parsing program arguments + +Here are the details about how to call the @code{getopt} function. To +use this facility, your program must include the header file +@file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.2 +@deftypevar int opterr +If the value of this variable is nonzero, then @code{getopt} prints an +error message to the standard error stream if it encounters an unknown +option character or an option with a missing required argument. This is +the default behavior. If you set this variable to zero, @code{getopt} +does not print any messages, but it still returns the character @code{?} +to indicate an error. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar int optopt +When @code{getopt} encounters an unknown option character or an option +with a missing required argument, it stores that option character in +this variable. You can use this for providing your own diagnostic +messages. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar int optind +This variable is set by @code{getopt} to the index of the next element +of the @var{argv} array to be processed. Once @code{getopt} has found +all of the option arguments, you can use this variable to determine +where the remaining non-option arguments begin. The initial value of +this variable is @code{1}. 
+@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypevar {char *} optarg +This variable is set by @code{getopt} to point at the value of the +option argument, for those options that accept arguments. +@end deftypevar + +@comment unistd.h +@comment POSIX.2 +@deftypefun int getopt (int @var{argc}, char **@var{argv}, const char *@var{options}) +The @code{getopt} function gets the next option argument from the +argument list specified by the @var{argv} and @var{argc} arguments. +Normally these values come directly from the arguments received by +@code{main}. + +The @var{options} argument is a string that specifies the option +characters that are valid for this program. An option character in this +string can be followed by a colon (@samp{:}) to indicate that it takes a +required argument. + +If the @var{options} argument string begins with a hyphen (@samp{-}), this +is treated specially. It permits arguments that are not options to be +returned as if they were associated with option character @samp{\0}. + +The @code{getopt} function returns the option character for the next +command line option. When no more option arguments are available, it +returns @code{-1}. There may still be more non-option arguments; you +must compare the external variable @code{optind} against the @var{argc} +parameter to check this. + +If the option has an argument, @code{getopt} returns the argument by +storing it in the variable @code{optarg}. You don't ordinarily need to +copy the @code{optarg} string, since it is a pointer into the original +@var{argv} array, not into a static area that might be overwritten. + +If @code{getopt} finds an option character in @var{argv} that was not +included in @var{options}, or a missing option argument, it returns +@samp{?} and sets the external variable @code{optopt} to the actual +option character.
If the first character of @var{options} is a colon +(@samp{:}), then @code{getopt} returns @samp{:} instead of @samp{?} to +indicate a missing option argument. In addition, if the external +variable @code{opterr} is nonzero (which is the default), @code{getopt} +prints an error message. +@end deftypefun + +@node Example of Getopt +@subsection Example of Parsing Arguments with @code{getopt} + +Here is an example showing how @code{getopt} is typically used. The +key points to notice are: + +@itemize @bullet +@item +Normally, @code{getopt} is called in a loop. When @code{getopt} returns +@code{-1}, indicating no more options are present, the loop terminates. + +@item +A @code{switch} statement is used to dispatch on the return value from +@code{getopt}. In typical use, each case just sets a variable that +is used later in the program. + +@item +A second loop is used to process the remaining non-option arguments. +@end itemize + +@smallexample +@include testopt.c.texi +@end smallexample + +Here are some examples showing what this program prints with different +combinations of arguments: + +@smallexample +% testopt +aflag = 0, bflag = 0, cvalue = (null) + +% testopt -a -b +aflag = 1, bflag = 1, cvalue = (null) + +% testopt -ab +aflag = 1, bflag = 1, cvalue = (null) + +% testopt -c foo +aflag = 0, bflag = 0, cvalue = foo + +% testopt -cfoo +aflag = 0, bflag = 0, cvalue = foo + +% testopt arg1 +aflag = 0, bflag = 0, cvalue = (null) +Non-option argument arg1 + +% testopt -a arg1 +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument arg1 + +% testopt -c foo arg1 +aflag = 0, bflag = 0, cvalue = foo +Non-option argument arg1 + +% testopt -a -- -b +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument -b + +% testopt -a - +aflag = 1, bflag = 0, cvalue = (null) +Non-option argument - +@end smallexample + +@node Long Options +@subsection Parsing Long Options + +To accept GNU-style long options as well as single-character options, +use @code{getopt_long} instead of 
@code{getopt}. You should make every +program accept long options if it uses any options, for this takes +little extra work and helps beginners remember how to use the program. + +@comment getopt.h +@comment GNU +@deftp {Data Type} {struct option} +This structure describes a single long option name for the sake of +@code{getopt_long}. The argument @var{longopts} must be an array of +these structures, one for each long option. Terminate the array with an +element containing all zeros. + +The @code{struct option} structure has these fields: + +@table @code +@item const char *name +This field is the name of the option. It is a string. + +@item int has_arg +This field says whether the option takes an argument. It is an integer, +and there are three legitimate values: @w{@code{no_argument}}, +@code{required_argument} and @code{optional_argument}. + +@item int *flag +@itemx int val +These fields control how to report or act on the option when it occurs. + +If @code{flag} is a null pointer, then the @code{val} is a value which +identifies this option. Often these values are chosen to uniquely +identify particular long options. + +If @code{flag} is not a null pointer, it should be the address of an +@code{int} variable which is the flag for this option. The value in +@code{val} is the value to store in the flag to indicate that the option +was seen. +@end table +@end deftp + +@comment getopt.h +@comment GNU +@deftypefun int getopt_long (int @var{argc}, char **@var{argv}, const char *@var{shortopts}, struct option *@var{longopts}, int *@var{indexptr}) +Decode options from the vector @var{argv} (whose length is @var{argc}). +The argument @var{shortopts} describes the short options to accept, just as +it does in @code{getopt}. The argument @var{longopts} describes the long +options to accept (see above). 
+ +When @code{getopt_long} encounters a short option, it does the same +thing that @code{getopt} would do: it returns the character code for the +option, and stores the option's argument (if it has one) in @code{optarg}. + +When @code{getopt_long} encounters a long option, it takes actions based +on the @code{flag} and @code{val} fields of the definition of that +option. + +If @code{flag} is a null pointer, then @code{getopt_long} returns the +contents of @code{val} to indicate which option it found. You should +arrange distinct values in the @code{val} field for options with +different meanings, so you can decode these values after +@code{getopt_long} returns. If the long option is equivalent to a short +option, you can use the short option's character code in @code{val}. + +If @code{flag} is not a null pointer, that means this option should just +set a flag in the program. The flag is a variable of type @code{int} +that you define. Put the address of the flag in the @code{flag} field. +Put in the @code{val} field the value you would like this option to +store in the flag. In this case, @code{getopt_long} returns @code{0}. + +For any long option, @code{getopt_long} tells you the index in the array +@var{longopts} of the option's definition, by storing it into +@code{*@var{indexptr}}. You can get the name of the option with +@code{@var{longopts}[*@var{indexptr}].name}. So you can distinguish among +long options either by the values in their @code{val} fields or by their +indices. You can also distinguish in this way among long options that +set flags. + +When a long option has an argument, @code{getopt_long} puts the argument +value in the variable @code{optarg} before returning. When the option +has no argument, the value in @code{optarg} is a null pointer. This is +how you can tell whether an optional argument was supplied.
+ +When @code{getopt_long} has no more options to handle, it returns +@code{-1}, and leaves in the variable @code{optind} the index in +@var{argv} of the next remaining argument. +@end deftypefun + +@node Long Option Example +@subsection Example of Parsing Long Options + +@smallexample +@include longopt.c.texi +@end smallexample + +@node Environment Variables +@section Environment Variables + +@cindex environment variable +When a program is executed, it receives information about the context in +which it was invoked in two ways. The first mechanism uses the +@var{argv} and @var{argc} arguments to its @code{main} function, and is +discussed in @ref{Program Arguments}. The second mechanism uses +@dfn{environment variables} and is discussed in this section. + +The @var{argv} mechanism is typically used to pass command-line +arguments specific to the particular program being invoked. The +environment, on the other hand, keeps track of information that is +shared by many programs, changes infrequently, and that is less +frequently used. + +The environment variables discussed in this section are the same +environment variables that you set using assignments and the +@code{export} command in the shell. Programs executed from the shell +inherit all of the environment variables from the shell. +@c !!! xref to right part of bash manual when it exists + +@cindex environment +Standard environment variables are used for information about the user's +home directory, terminal type, current locale, and so on; you can define +additional variables for other purposes. The set of all environment +variables that have values is collectively known as the +@dfn{environment}. + +Names of environment variables are case-sensitive and must not contain +the character @samp{=}. System-defined environment variables are +invariably uppercase. + +The values of environment variables can be anything that can be +represented as a string. 
A value must not contain an embedded null +character, since this is assumed to terminate the string. + + +@menu +* Environment Access:: How to get and set the values of + environment variables. +* Standard Environment:: These environment variables have + standard interpretations. +@end menu + +@node Environment Access +@subsection Environment Access +@cindex environment access +@cindex environment representation + +The value of an environment variable can be accessed with the +@code{getenv} function. This is declared in the header file +@file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun {char *} getenv (const char *@var{name}) +This function returns a string that is the value of the environment +variable @var{name}. You must not modify this string. In some non-Unix +systems not using the GNU library, it might be overwritten by subsequent +calls to @code{getenv} (but not by any other library function). If the +environment variable @var{name} is not defined, the value is a null +pointer. +@end deftypefun + + +@comment stdlib.h +@comment SVID +@deftypefun int putenv (const char *@var{string}) +The @code{putenv} function adds or removes definitions from the environment. +If the @var{string} is of the form @samp{@var{name}=@var{value}}, the +definition is added to the environment. Otherwise, the @var{string} is +interpreted as the name of an environment variable, and any definition +for this variable in the environment is removed. + +The GNU library provides this function for compatibility with SVID; it +may not be available in other systems. +@end deftypefun + +@c !!! BSD function setenv + +You can deal directly with the underlying representation of environment +objects to add more variables to the environment (for example, to +communicate with another program you are about to execute; see +@ref{Executing a File}). 
+ +@comment unistd.h +@comment POSIX.1 +@deftypevar {char **} environ +The environment is represented as an array of strings. Each string is +of the format @samp{@var{name}=@var{value}}. The order in which +strings appear in the environment is not significant, but the same +@var{name} must not appear more than once. The last element of the +array is a null pointer. + +This variable is declared in the header file @file{unistd.h}. + +If you just want to get the value of an environment variable, use +@code{getenv}. +@end deftypevar + +Unix systems, and the GNU system, pass the initial value of +@code{environ} as the third argument to @code{main}. +@xref{Program Arguments}. + +@node Standard Environment +@subsection Standard Environment Variables +@cindex standard environment variables + +These environment variables have standard meanings. This doesn't mean +that they are always present in the environment; but if these variables +@emph{are} present, they have these meanings. You shouldn't try to use +these environment variable names for some other purpose. + +@comment Extra blank lines make it look better. +@table @code +@item HOME +@cindex HOME environment variable +@cindex home directory + +This is a string representing the user's @dfn{home directory}, or +initial default working directory. + +The user can set @code{HOME} to any value. +If you need to make sure to obtain the proper home directory +for a particular user, you should not use @code{HOME}; instead, +look up the user's name in the user database (@pxref{User Database}). + +For most purposes, it is better to use @code{HOME}, precisely because +this lets the user specify the value. + +@c !!! also USER +@item LOGNAME +@cindex LOGNAME environment variable + +This is the name that the user used to log in. 
Since the value in the +environment can be tweaked arbitrarily, this is not a reliable way to +identify the user who is running a process; a function like +@code{getlogin} (@pxref{Who Logged In}) is better for that purpose. + +For most purposes, it is better to use @code{LOGNAME}, precisely because +this lets the user specify the value. + +@item PATH +@cindex PATH environment variable + +A @dfn{path} is a sequence of directory names which is used for +searching for a file. The variable @code{PATH} holds a path used +for searching for programs to be run. + +The @code{execlp} and @code{execvp} functions (@pxref{Executing a File}) +use this environment variable, as do many shells and other utilities +which are implemented in terms of those functions. + +The syntax of a path is a sequence of directory names separated by +colons. An empty string instead of a directory name stands for the +current directory (@pxref{Working Directory}). + +A typical value for this environment variable might be a string like: + +@smallexample +:/bin:/etc:/usr/bin:/usr/new/X11:/usr/new:/usr/local/bin +@end smallexample + +This means that if the user tries to execute a program named @code{foo}, +the system will look for files named @file{foo}, @file{/bin/foo}, +@file{/etc/foo}, and so on. The first of these files that exists is +the one that is executed. + +@c !!! also TERMCAP +@item TERM +@cindex TERM environment variable + +This specifies the kind of terminal that is receiving program output. +Some programs can make use of this information to take advantage of +special escape sequences or terminal modes supported by particular kinds +of terminals. Many programs which use the termcap library +(@pxref{Finding a Terminal Description,Find,,termcap,The Termcap Library +Manual}) use the @code{TERM} environment variable, for example. + +@item TZ +@cindex TZ environment variable + +This specifies the time zone. 
@xref{TZ Variable}, for information about +the format of this string and how it is used. + +@item LANG +@cindex LANG environment variable + +This specifies the default locale to use for attribute categories where +neither @code{LC_ALL} nor the specific environment variable for that +category is set. @xref{Locales}, for more information about +locales. + +@ignore +@c I doubt this really exists +@item LC_ALL +@cindex LC_ALL environment variable + +This is similar to the @code{LANG} environment variable. However, its +value takes precedence over any values provided for the individual +attribute category environment variables, or for the @code{LANG} +environment variable. +@end ignore + +@item LC_COLLATE +@cindex LC_COLLATE environment variable + +This specifies what locale to use for string sorting. + +@item LC_CTYPE +@cindex LC_CTYPE environment variable + +This specifies what locale to use for character sets and character +classification. + +@item LC_MONETARY +@cindex LC_MONETARY environment variable + +This specifies what locale to use for formatting monetary values. + +@item LC_NUMERIC +@cindex LC_NUMERIC environment variable + +This specifies what locale to use for formatting numbers. + +@item LC_TIME +@cindex LC_TIME environment variable + +This specifies what locale to use for formatting date/time values. + +@item _POSIX_OPTION_ORDER +@cindex _POSIX_OPTION_ORDER environment variable. + +If this environment variable is defined, it suppresses the usual +reordering of command line arguments by @code{getopt}. @xref{Argument Syntax}. + +@c !!! GNU also has COREFILE, CORESERVER, EXECSERVERS +@end table + +@node Program Termination +@section Program Termination +@cindex program termination +@cindex process termination + +@cindex exit status value +The usual way for a program to terminate is simply for its @code{main} +function to return. 
The @dfn{exit status value} returned from the +@code{main} function is used to report information back to the process's +parent process or shell. + +A program can also terminate normally by calling the @code{exit} +function. + +In addition, programs can be terminated by signals; this is discussed in +more detail in @ref{Signal Handling}. The @code{abort} function causes +a signal that kills the program. + +@menu +* Normal Termination:: If a program calls @code{exit}, a + process terminates normally. +* Exit Status:: The @code{exit status} provides information + about why the process terminated. +* Cleanups on Exit:: A process can run its own cleanup + functions upon normal termination. +* Aborting a Program:: The @code{abort} function causes + abnormal program termination. +* Termination Internals:: What happens when a process terminates. +@end menu + +@node Normal Termination +@subsection Normal Termination + +A process terminates normally when the program calls @code{exit}. +Returning from @code{main} is equivalent to calling @code{exit}, and +the value that @code{main} returns is used as the argument to @code{exit}. + +@comment stdlib.h +@comment ANSI +@deftypefun void exit (int @var{status}) +The @code{exit} function terminates the process with status +@var{status}. This function does not return. +@end deftypefun + +Normal termination causes the following actions: + +@enumerate +@item +Functions that were registered with the @code{atexit} or @code{on_exit} +functions are called in the reverse order of their registration. This +mechanism allows your application to specify its own ``cleanup'' actions +to be performed at program termination. Typically, this is used to do +things like saving program state information in a file, or unlocking +locks in shared data bases. + +@item +All open streams are closed, writing out any buffered output data. See +@ref{Closing Streams}. 
In addition, temporary files opened +with the @code{tmpfile} function are removed; see @ref{Temporary Files}. + +@item +@code{_exit} is called, terminating the program. @xref{Termination Internals}. +@end enumerate + +@node Exit Status +@subsection Exit Status +@cindex exit status + +When a program exits, it can return to the parent process a small +amount of information about the cause of termination, using the +@dfn{exit status}. This is a value between 0 and 255 that the exiting +process passes as an argument to @code{exit}. + +Normally you should use the exit status to report very broad information +about success or failure. You can't provide a lot of detail about the +reasons for the failure, and most parent processes would not want much +detail anyway. + +There are conventions for what sorts of status values certain programs +should return. The most common convention is simply 0 for success and 1 +for failure. Programs that perform comparison use a different +convention: they use status 1 to indicate a mismatch, and status 2 to +indicate an inability to compare. Your program should follow an +existing convention if an existing convention makes sense for it. + +A general convention reserves status values 128 and up for special +purposes. In particular, the value 128 is used to indicate failure to +execute another program in a subprocess. This convention is not +universally obeyed, but it is a good idea to follow it in your programs. + +@strong{Warning:} Don't try to use the number of errors as the exit +status. This is actually not very useful; a parent process would +generally not care how many errors occurred. Worse than that, it does +not work, because the status value is truncated to eight bits. +Thus, if the program tried to report 256 errors, the parent would +receive a report of 0 errors---that is, success. + +For the same reason, it does not work to use the value of @code{errno} +as the exit status---these can exceed 255. 
+ +@strong{Portability note:} Some non-POSIX systems use different +conventions for exit status values. For greater portability, you can +use the macros @code{EXIT_SUCCESS} and @code{EXIT_FAILURE} for the +conventional status value for success and failure, respectively. They +are declared in the file @file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int EXIT_SUCCESS +This macro can be used with the @code{exit} function to indicate +successful program completion. + +On POSIX systems, the value of this macro is @code{0}. On other +systems, the value might be some other (possibly non-constant) integer +expression. +@end deftypevr + +@comment stdlib.h +@comment ANSI +@deftypevr Macro int EXIT_FAILURE +This macro can be used with the @code{exit} function to indicate +unsuccessful program completion in a general sense. + +On POSIX systems, the value of this macro is @code{1}. On other +systems, the value might be some other (possibly non-constant) integer +expression. Other nonzero status values also indicate failure. Certain +programs use different nonzero status values to indicate particular +kinds of ``non-success''. For example, @code{diff} uses status value +@code{1} to mean that the files are different, and @code{2} or more to +mean that there was difficulty in opening the files. +@end deftypevr + +@node Cleanups on Exit +@subsection Cleanups on Exit + +Your program can arrange to run its own cleanup functions if normal +termination happens. If you are writing a library for use in various +application programs, then it is unreliable to insist that all +applications call the library's cleanup functions explicitly before +exiting. It is much more robust to make the cleanup invisible to the +application, by setting up a cleanup function in the library itself +using @code{atexit} or @code{on_exit}.
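As a rough sketch of that idea, a library might register its cleanup function the first time it is initialized. The names @code{example_lib_init} and @code{example_lib_cleanup} are hypothetical, and the cleanup body is only a placeholder for whatever the library really needs to release.

```c
#include <stdlib.h>

static int example_lib_initialized = 0;
static int example_lib_cleanup_runs = 0;

/* Hypothetical cleanup: a real library would flush its state or
   release its locks here.  */
static void
example_lib_cleanup (void)
{
  example_lib_cleanup_runs++;
}

/* Hypothetical initializer: registers the cleanup exactly once.
   Returns 0 on success, -1 if atexit could not register it.  */
int
example_lib_init (void)
{
  if (example_lib_initialized)
    return 0;
  if (atexit (example_lib_cleanup) != 0)
    return -1;
  example_lib_initialized = 1;
  return 0;
}
```

The application only calls @code{example_lib_init}; it never needs to know that a cleanup function exists.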
+ +@comment stdlib.h +@comment ANSI +@deftypefun int atexit (void (*@var{function}) (void)) +The @code{atexit} function registers the function @var{function} to be +called at normal program termination. The @var{function} is called with +no arguments. + +The return value from @code{atexit} is zero on success and nonzero if +the function cannot be registered. +@end deftypefun + +@comment stdlib.h +@comment SunOS +@deftypefun int on_exit (void (*@var{function})(int @var{status}, void *@var{arg}), void *@var{arg}) +This function is a somewhat more powerful variant of @code{atexit}. It +accepts two arguments, a function @var{function} and an arbitrary +pointer @var{arg}. At normal program termination, the @var{function} is +called with two arguments: the @var{status} value passed to @code{exit}, +and the @var{arg}. + +This function is included in the GNU C library only for compatibility +with SunOS, and may not be supported by other implementations. +@end deftypefun + +Here's a trivial program that illustrates the use of @code{exit} and +@code{atexit}: + +@smallexample +@include atexit.c.texi +@end smallexample + +@noindent +When this program is executed, it just prints the message and exits. + +@node Aborting a Program +@subsection Aborting a Program +@cindex aborting a program + +You can abort your program using the @code{abort} function. The prototype +for this function is in @file{stdlib.h}. +@pindex stdlib.h + +@comment stdlib.h +@comment ANSI +@deftypefun void abort (void) +The @code{abort} function causes abnormal program termination. This +does not execute cleanup functions registered with @code{atexit} or +@code{on_exit}. + +This function actually terminates the process by raising a +@code{SIGABRT} signal, and your program can include a handler to +intercept this signal; see @ref{Signal Handling}. +@end deftypefun + +@c Put in by rms. Don't remove.
+@cartouche +@strong{Future Change Warning:} Proposed Federal censorship regulations +may prohibit us from giving you information about the possibility of +calling this function. We would be required to say that this is not an +acceptable way of terminating a program. +@end cartouche + +@node Termination Internals +@subsection Termination Internals + +The @code{_exit} function is the primitive used for process termination +by @code{exit}. It is declared in the header file @file{unistd.h}. +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun void _exit (int @var{status}) +The @code{_exit} function is the primitive for causing a process to +terminate with status @var{status}. Calling this function does not +execute cleanup functions registered with @code{atexit} or +@code{on_exit}. +@end deftypefun + +When a process terminates for any reason---either by an explicit +termination call, or termination as a result of a signal---the +following things happen: + +@itemize @bullet +@item +All open file descriptors in the process are closed. @xref{Low-Level I/O}. +Note that streams are not flushed automatically when the process +terminates; see @ref{I/O on Streams}. + +@item +The low-order 8 bits of the return status code are saved to be reported +back to the parent process via @code{wait} or @code{waitpid}; see +@ref{Process Completion}. + +@item +Any child processes of the process being terminated are assigned a new +parent process. (On most systems, including GNU, this is the @code{init} +process, with process ID 1.) + +@item +A @code{SIGCHLD} signal is sent to the parent process. + +@item +If the process is a session leader that has a controlling terminal, then +a @code{SIGHUP} signal is sent to each process in the foreground job, +and the controlling terminal is disassociated from that session. +@xref{Job Control}.
+ +@item +If termination of a process causes a process group to become orphaned, +and any member of that process group is stopped, then a @code{SIGHUP} +signal and a @code{SIGCONT} signal are sent to each process in the +group. @xref{Job Control}. +@end itemize diff --git a/manual/stdio.texi b/manual/stdio.texi new file mode 100644 index 0000000000..411d94a242 --- /dev/null +++ b/manual/stdio.texi @@ -0,0 +1,3635 @@ +@node I/O on Streams, Low-Level I/O, I/O Overview, Top +@chapter Input/Output on Streams + +This chapter describes the functions for creating streams and performing +input and output operations on them. As discussed in @ref{I/O +Overview}, a stream is a fairly abstract, high-level concept +representing a communications channel to a file, device, or process. + +@menu +* Streams:: About the data type representing a stream. +* Standard Streams:: Streams to the standard input and output + devices are created for you. +* Opening Streams:: How to create a stream to talk to a file. +* Closing Streams:: Close a stream when you are finished with it. +* Simple Output:: Unformatted output by characters and lines. +* Character Input:: Unformatted input by characters and words. +* Line Input:: Reading a line or a record from a stream. +* Unreading:: Peeking ahead/pushing back input just read. +* Block Input/Output:: Input and output operations on blocks of data. +* Formatted Output:: @code{printf} and related functions. +* Customizing Printf:: You can define new conversion specifiers for + @code{printf} and friends. +* Formatted Input:: @code{scanf} and related functions. +* EOF and Errors:: How you can tell if an I/O error happens. +* Binary Streams:: Some systems distinguish between text files + and binary files. +* File Positioning:: About random-access streams. +* Portable Positioning:: Random access on peculiar ANSI C systems. +* Stream Buffering:: How to control buffering of streams. 
+* Other Kinds of Streams:: Streams that do not necessarily correspond + to an open file. +@end menu + +@node Streams +@section Streams + +For historical reasons, the type of the C data structure that represents +a stream is called @code{FILE} rather than ``stream''. Since most of +the library functions deal with objects of type @code{FILE *}, sometimes +the term @dfn{file pointer} is also used to mean ``stream''. This leads +to unfortunate confusion over terminology in many books on C. This +manual, however, is careful to use the terms ``file'' and ``stream'' +only in the technical sense. +@cindex file pointer + +@pindex stdio.h +The @code{FILE} type is declared in the header file @file{stdio.h}. + +@comment stdio.h +@comment ANSI +@deftp {Data Type} FILE +This is the data type used to represent stream objects. A @code{FILE} +object holds all of the internal state information about the connection +to the associated file, including such things as the file position +indicator and buffering information. Each stream also has error and +end-of-file status indicators that can be tested with the @code{ferror} +and @code{feof} functions; see @ref{EOF and Errors}. +@end deftp + +@code{FILE} objects are allocated and managed internally by the +input/output library functions. Don't try to create your own objects of +type @code{FILE}; let the library do it. Your programs should +deal only with pointers to these objects (that is, @code{FILE *} values) +rather than the objects themselves. +@c !!! should say that FILE's have "No user-servicable parts inside." + +@node Standard Streams +@section Standard Streams +@cindex standard streams +@cindex streams, standard + +When the @code{main} function of your program is invoked, it already has +three predefined streams open and available for use. These represent +the ``standard'' input and output channels that have been established +for the process. + +These streams are declared in the header file @file{stdio.h}. 
+@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stdin +The @dfn{standard input} stream, which is the normal source of input for the +program. +@end deftypevar +@cindex standard input stream + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stdout +The @dfn{standard output} stream, which is used for normal output from +the program. +@end deftypevar +@cindex standard output stream + +@comment stdio.h +@comment ANSI +@deftypevar {FILE *} stderr +The @dfn{standard error} stream, which is used for error messages and +diagnostics issued by the program. +@end deftypevar +@cindex standard error stream + +In the GNU system, you can specify what files or processes correspond to +these streams using the pipe and redirection facilities provided by the +shell. (The primitives shells use to implement these facilities are +described in @ref{File System Interface}.) Most other operating systems +provide similar mechanisms, but the details of how to use them can vary. + +In the GNU C library, @code{stdin}, @code{stdout}, and @code{stderr} are +normal variables which you can set just like any others. For example, to redirect +the standard output to a file, you could do: + +@smallexample +fclose (stdout); +stdout = fopen ("standard-output-file", "w"); +@end smallexample + +Note however, that in other systems @code{stdin}, @code{stdout}, and +@code{stderr} are macros that you cannot assign to in the normal way. +But you can use @code{freopen} to get the effect of closing one and +reopening it. @xref{Opening Streams}. + +@node Opening Streams +@section Opening Streams + +@cindex opening a stream +Opening a file with the @code{fopen} function creates a new stream and +establishes a connection between the stream and a file. This may +involve creating a new file. + +@pindex stdio.h +Everything described in this section is declared in the header file +@file{stdio.h}. 
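As a first illustration, here is a minimal sketch that opens an existing file for reading and counts its characters. The helper name @code{count_chars} is hypothetical; the point is only that the result of @code{fopen} should be checked before the stream is used.

```c
#include <stdio.h>

/* Hypothetical helper: return the number of characters in FILENAME,
   or -1 if the file cannot be opened or a read error occurs.  */
long
count_chars (const char *filename)
{
  FILE *stream = fopen (filename, "r");
  long count = 0;
  int failed;

  /* fopen returns a null pointer when the open fails.  */
  if (stream == NULL)
    return -1;

  while (getc (stream) != EOF)
    count++;

  /* EOF can mean end of file or a read error; ferror tells which.  */
  failed = ferror (stream);
  fclose (stream);
  return failed ? -1 : count;
}
```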
+ +@comment stdio.h +@comment ANSI +@deftypefun {FILE *} fopen (const char *@var{filename}, const char *@var{opentype}) +The @code{fopen} function opens a stream for I/O to the file +@var{filename}, and returns a pointer to the stream. + +The @var{opentype} argument is a string that controls how the file is +opened and specifies attributes of the resulting stream. It must begin +with one of the following sequences of characters: + +@table @samp +@item r +Open an existing file for reading only. + +@item w +Open the file for writing only. If the file already exists, it is +truncated to zero length. Otherwise a new file is created. + +@item a +Open a file for append access; that is, writing at the end of file only. +If the file already exists, its initial contents are unchanged and +output to the stream is appended to the end of the file. +Otherwise, a new, empty file is created. + +@item r+ +Open an existing file for both reading and writing. The initial contents +of the file are unchanged and the initial file position is at the +beginning of the file. + +@item w+ +Open a file for both reading and writing. If the file already exists, it +is truncated to zero length. Otherwise, a new file is created. + +@item a+ +Open or create file for both reading and appending. If the file exists, +its initial contents are unchanged. Otherwise, a new file is created. +The initial file position for reading is at the beginning of the file, +but output is always appended to the end of the file. +@end table + +As you can see, @samp{+} requests a stream that can do both input and +output. The ANSI standard says that when using such a stream, you must +call @code{fflush} (@pxref{Stream Buffering}) or a file positioning +function such as @code{fseek} (@pxref{File Positioning}) when switching +from reading to writing or vice versa. Otherwise, internal buffers +might not be emptied properly. 
The GNU C library does not have this +limitation; you can do arbitrary reading and writing operations on a +stream in whatever order. + +Additional characters may appear after these to specify flags for the +call. Always put the mode (@samp{r}, @samp{w+}, etc.) first; that is +the only part you are guaranteed will be understood by all systems. + +The GNU C library defines one additional character for use in +@var{opentype}: the character @samp{x} insists on creating a new +file---if a file @var{filename} already exists, @code{fopen} fails +rather than opening it. If you use @samp{x} you are guaranteed that +you will not clobber an existing file. This is equivalent to the +@code{O_EXCL} option to the @code{open} function (@pxref{Opening and +Closing Files}). + +The character @samp{b} in @var{opentype} has a standard meaning; it +requests a binary stream rather than a text stream. But this makes no +difference in POSIX systems (including the GNU system). If both +@samp{+} and @samp{b} are specified, they can appear in either order. +@xref{Binary Streams}. + +Any other characters in @var{opentype} are simply ignored. They may be +meaningful in other systems. + +If the open fails, @code{fopen} returns a null pointer. +@end deftypefun + +You can have multiple streams (or file descriptors) pointing to the same +file open at the same time. If you do only input, this works +straightforwardly, but you must be careful if any output streams are +included. @xref{Stream/Descriptor Precautions}. This is equally true +whether the streams are in one program (not usual) or in several +programs (which can easily happen). It may be advantageous to use the +file locking facilities to avoid simultaneous access. @xref{File +Locks}. + +@comment stdio.h +@comment ANSI +@deftypevr Macro int FOPEN_MAX +The value of this macro is an integer constant expression that +represents the minimum number of streams that the implementation +guarantees can be open simultaneously.
You might be able to open more +than this many streams, but that is not guaranteed. The value of this +constant is at least eight, which includes the three standard streams +@code{stdin}, @code{stdout}, and @code{stderr}. In POSIX.1 systems this +value is determined by the @code{OPEN_MAX} parameter; @pxref{General +Limits}. In BSD and GNU, it is controlled by the @code{RLIMIT_NOFILE} +resource limit; @pxref{Limits on Resources}. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun {FILE *} freopen (const char *@var{filename}, const char *@var{opentype}, FILE *@var{stream}) +This function is like a combination of @code{fclose} and @code{fopen}. +It first closes the stream referred to by @var{stream}, ignoring any +errors that are detected in the process. (Because errors are ignored, +you should not use @code{freopen} on an output stream if you have +actually done any output using the stream.) Then the file named by +@var{filename} is opened with mode @var{opentype} as for @code{fopen}, +and associated with the same stream object @var{stream}. + +If the operation fails, a null pointer is returned; otherwise, +@code{freopen} returns @var{stream}. + +@code{freopen} has traditionally been used to connect a standard stream +such as @code{stdin} with a file of your own choice. This is useful in +programs in which use of a standard stream for certain purposes is +hard-coded. In the GNU C library, you can simply close the standard +streams and open new ones with @code{fopen}. But other systems lack +this ability, so using @code{freopen} is more portable. +@end deftypefun + + +@node Closing Streams +@section Closing Streams + +@cindex closing a stream +When a stream is closed with @code{fclose}, the connection between the +stream and the file is cancelled. After you have closed a stream, you +cannot perform any additional operations on it. 
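A minimal sketch of writing a file and closing it carefully might look like this. The helper name @code{save_text} is hypothetical; it treats a failure reported by @code{fclose} the same way as a failed write, since buffered output may only reach the file when the stream is closed.

```c
#include <stdio.h>

/* Hypothetical helper: write TEXT to FILENAME, replacing any old
   contents.  Returns 0 on success and -1 on any failure.  */
int
save_text (const char *filename, const char *text)
{
  FILE *stream = fopen (filename, "w");
  int ok;

  if (stream == NULL)
    return -1;

  ok = (fputs (text, stream) != EOF);

  /* Close the stream even if the write failed, and count a close
     error as a failure too.  */
  if (fclose (stream) == EOF)
    ok = 0;

  return ok ? 0 : -1;
}
```

After @code{save_text} returns, the stream no longer exists, so the function deliberately keeps no pointer to it.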
+ +@comment stdio.h +@comment ANSI +@deftypefun int fclose (FILE *@var{stream}) +This function causes @var{stream} to be closed and the connection to +the corresponding file to be broken. Any buffered output is written +and any buffered input is discarded. The @code{fclose} function returns +a value of @code{0} if the file was closed successfully, and @code{EOF} +if an error was detected. + +It is important to check for errors when you call @code{fclose} to close +an output stream, because real, everyday errors can be detected at this +time. For example, when @code{fclose} writes the remaining buffered +output, it might get an error because the disk is full. Even if you +know the buffer is empty, errors can still occur when closing a file if +you are using NFS. + +The function @code{fclose} is declared in @file{stdio.h}. +@end deftypefun + +If the @code{main} function to your program returns, or if you call the +@code{exit} function (@pxref{Normal Termination}), all open streams are +automatically closed properly. If your program terminates in any other +manner, such as by calling the @code{abort} function (@pxref{Aborting a +Program}) or from a fatal signal (@pxref{Signal Handling}), open streams +might not be closed properly. Buffered output might not be flushed and +files may be incomplete. For more information on buffering of streams, +see @ref{Stream Buffering}. + +@node Simple Output +@section Simple Output by Characters or Lines + +@cindex writing to a stream, by characters +This section describes functions for performing character- and +line-oriented output. + +These functions are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int fputc (int @var{c}, FILE *@var{stream}) +The @code{fputc} function converts the character @var{c} to type +@code{unsigned char}, and writes it to the stream @var{stream}. +@code{EOF} is returned if a write error occurs; otherwise the +character @var{c} is returned. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int putc (int @var{c}, FILE *@var{stream}) +This is just like @code{fputc}, except that most systems implement it as +a macro, making it faster. One consequence is that it may evaluate the +@var{stream} argument more than once, which is an exception to the +general rule for macros. @code{putc} is usually the best function to +use for writing a single character. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int putchar (int @var{c}) +The @code{putchar} function is equivalent to @code{putc} with +@code{stdout} as the value of the @var{stream} argument. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fputs (const char *@var{s}, FILE *@var{stream}) +The function @code{fputs} writes the string @var{s} to the stream +@var{stream}. The terminating null character is not written. +This function does @emph{not} add a newline character, either. +It outputs only the characters in the string. + +This function returns @code{EOF} if a write error occurs, and otherwise +a non-negative value. + +For example: + +@smallexample +fputs ("Are ", stdout); +fputs ("you ", stdout); +fputs ("hungry?\n", stdout); +@end smallexample + +@noindent +outputs the text @samp{Are you hungry?} followed by a newline. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int puts (const char *@var{s}) +The @code{puts} function writes the string @var{s} to the stream +@code{stdout} followed by a newline. The terminating null character of +the string is not written. (Note that @code{fputs} does @emph{not} +write a newline as this function does.) + +@code{puts} is the most convenient function for printing simple +messages. For example: + +@smallexample +puts ("This is a message."); +@end smallexample +@end deftypefun + +@comment stdio.h +@comment SVID +@deftypefun int putw (int @var{w}, FILE *@var{stream}) +This function writes the word @var{w} (that is, an @code{int}) to +@var{stream}. 
It is provided for compatibility with SVID, but we +recommend you use @code{fwrite} instead (@pxref{Block Input/Output}). +@end deftypefun + +@node Character Input +@section Character Input + +@cindex reading from a stream, by characters +This section describes functions for performing character-oriented input. +These functions are declared in the header file @file{stdio.h}. +@pindex stdio.h + +These functions return an @code{int} value that is either a character of +input, or the special value @code{EOF} (usually -1). It is important to +store the result of these functions in a variable of type @code{int} +instead of @code{char}, even when you plan to use it only as a +character. Storing @code{EOF} in a @code{char} variable truncates its +value to the size of a character, so that it is no longer +distinguishable from the valid character @samp{(char) -1}. So always +use an @code{int} for the result of @code{getc} and friends, and check +for @code{EOF} after the call; once you've verified that the result is +not @code{EOF}, you can be sure that it will fit in a @samp{char} +variable without loss of information. + +@comment stdio.h +@comment ANSI +@deftypefun int fgetc (FILE *@var{stream}) +This function reads the next character as an @code{unsigned char} from +the stream @var{stream} and returns its value, converted to an +@code{int}. If an end-of-file condition or read error occurs, +@code{EOF} is returned instead. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int getc (FILE *@var{stream}) +This is just like @code{fgetc}, except that it is permissible (and +typical) for it to be implemented as a macro that evaluates the +@var{stream} argument more than once. @code{getc} is often highly +optimized, so it is usually the best function to use to read a single +character. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int getchar (void) +The @code{getchar} function is equivalent to @code{getc} with @code{stdin} +as the value of the @var{stream} argument. +@end deftypefun + +Here is an example of a function that does input using @code{fgetc}. It +would work just as well using @code{getc} instead, or using +@code{getchar ()} instead of @w{@code{fgetc (stdin)}}. + +@smallexample +int +y_or_n_p (const char *question) +@{ + fputs (question, stdout); + while (1) + @{ + int c, answer; + /* @r{Write a space to separate answer from question.} */ + fputc (' ', stdout); + /* @r{Read the first character of the line.} + @r{This should be the answer character, but might not be.} */ + c = tolower (fgetc (stdin)); + answer = c; + /* @r{Discard rest of input line.} */ + while (c != '\n' && c != EOF) + c = fgetc (stdin); + /* @r{At end of file or on a read error, no answer can arrive;} + @r{treat that as ``no'' rather than asking forever.} */ + if (answer == EOF) + return 0; + /* @r{Obey the answer if it was valid.} */ + if (answer == 'y') + return 1; + if (answer == 'n') + return 0; + /* @r{Answer was invalid: ask for valid answer.} */ + fputs ("Please answer y or n:", stdout); + @} +@} +@end smallexample + +@comment stdio.h +@comment SVID +@deftypefun int getw (FILE *@var{stream}) +This function reads a word (that is, an @code{int}) from @var{stream}. +It's provided for compatibility with SVID. We recommend you use +@code{fread} instead (@pxref{Block Input/Output}). Unlike @code{getc}, +any @code{int} value could be a valid result. @code{getw} returns +@code{EOF} when it encounters end-of-file or an error, but there is no +way to distinguish this from an input word with value -1. +@end deftypefun + +@node Line Input +@section Line-Oriented Input + +Since many programs interpret input on the basis of lines, it's +convenient to have functions to read a line of text from a stream. + +Standard C has functions to do this, but they aren't very safe: null +characters and even (for @code{gets}) long lines can confuse them.
So +the GNU library provides the nonstandard @code{getline} function that +makes it easy to read lines reliably. + +Another GNU extension, @code{getdelim}, generalizes @code{getline}. It +reads a delimited record, defined as everything through the next +occurrence of a specified delimiter character. + +All these functions are declared in @file{stdio.h}. + +@comment stdio.h +@comment GNU +@deftypefun ssize_t getline (char **@var{lineptr}, size_t *@var{n}, FILE *@var{stream}) +This function reads an entire line from @var{stream}, storing the text +(including the newline and a terminating null character) in a buffer +and storing the buffer address in @code{*@var{lineptr}}. + +Before calling @code{getline}, you should place in @code{*@var{lineptr}} +the address of a buffer @code{*@var{n}} bytes long, allocated with +@code{malloc}. If this buffer is long enough to hold the line, +@code{getline} stores the line in this buffer. Otherwise, +@code{getline} makes the buffer bigger using @code{realloc}, storing the +new buffer address back in @code{*@var{lineptr}} and the increased size +back in @code{*@var{n}}. +@xref{Unconstrained Allocation}. + +If you set @code{*@var{lineptr}} to a null pointer, and @code{*@var{n}} +to zero, before the call, then @code{getline} allocates the initial +buffer for you by calling @code{malloc}. + +In either case, when @code{getline} returns, @code{*@var{lineptr}} is +a @code{char *} which points to the text of the line. + +When @code{getline} is successful, it returns the number of characters +read (including the newline, but not including the terminating null). +This value enables you to distinguish null characters that are part of +the line from the null character inserted as a terminator. + +This function is a GNU extension, but it is the recommended way to read +lines from a stream. The alternative standard functions are unreliable. + +If an error occurs or end of file is reached, @code{getline} returns +@code{-1}. 
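A typical calling pattern can be sketched as follows. The helper name @code{longest_line} is hypothetical, and the feature-test macro at the top is an assumption: some systems need it before they declare @code{getline}.

```c
#define _POSIX_C_SOURCE 200809L /* Assumption: may be needed to declare getline.  */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: return the length of the longest line in
   STREAM, counting the newline, or -1 if a read error occurs.  */
long
longest_line (FILE *stream)
{
  char *line = NULL;   /* Null pointer and zero size: getline allocates.  */
  size_t size = 0;
  ssize_t length;
  long longest = 0;

  while ((length = getline (&line, &size, stream)) != -1)
    if (length > longest)
      longest = length;

  /* The same buffer is reused for every line, so free it once.  */
  free (line);
  return ferror (stream) ? -1 : longest;
}
```

Because @code{getline} returns @code{-1} both at end of file and on error, the sketch uses @code{ferror} afterwards to tell the two cases apart.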
+@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun ssize_t getdelim (char **@var{lineptr}, size_t *@var{n}, int @var{delimiter}, FILE *@var{stream}) +This function is like @code{getline} except that the character which +tells it to stop reading is not necessarily newline. The argument +@var{delimiter} specifies the delimiter character; @code{getdelim} keeps +reading until it sees that character (or end of file). + +The text is stored in @var{lineptr}, including the delimiter character +and a terminating null. Like @code{getline}, @code{getdelim} makes +@var{lineptr} bigger if it isn't big enough. + +@code{getline} is in fact implemented in terms of @code{getdelim}, just +like this: + +@smallexample +ssize_t +getline (char **lineptr, size_t *n, FILE *stream) +@{ + return getdelim (lineptr, n, '\n', stream); +@} +@end smallexample +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun {char *} fgets (char *@var{s}, int @var{count}, FILE *@var{stream}) +The @code{fgets} function reads characters from the stream @var{stream} +up to and including a newline character and stores them in the string +@var{s}, adding a null character to mark the end of the string. You +must supply @var{count} characters worth of space in @var{s}, but the +number of characters read is at most @var{count} @minus{} 1. The extra +character space is used to hold the null character at the end of the +string. + +If the system is already at end of file when you call @code{fgets}, then +the contents of the array @var{s} are unchanged and a null pointer is +returned. A null pointer is also returned if a read error occurs. +Otherwise, the return value is the pointer @var{s}. + +@strong{Warning:} If the input data has a null character, you can't tell. +So don't use @code{fgets} unless you know the data cannot contain a null. 
+Don't use it to read files edited by the user because, if the user inserts +a null character, you should either handle it properly or print a clear +error message. We recommend using @code{getline} instead of @code{fgets}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefn {Deprecated function} {char *} gets (char *@var{s}) +The function @code{gets} reads characters from the stream @code{stdin} +up to the next newline character, and stores them in the string @var{s}. +The newline character is discarded (note that this differs from the +behavior of @code{fgets}, which copies the newline character into the +string). If @code{gets} encounters a read error or end-of-file, it +returns a null pointer; otherwise it returns @var{s}. + +@strong{Warning:} The @code{gets} function is @strong{very dangerous} +because it provides no protection against overflowing the string +@var{s}. The GNU library includes it for compatibility only. You +should @strong{always} use @code{fgets} or @code{getline} instead. To +remind you of this, the linker (if using GNU @code{ld}) will issue a +warning whenever you use @code{gets}. +@end deftypefn + +@node Unreading +@section Unreading +@cindex peeking at input +@cindex unreading characters +@cindex pushing input back + +In parser programs it is often useful to examine the next character in +the input stream without removing it from the stream. This is called +``peeking ahead'' at the input because your program gets a glimpse of +the input it will read next. + +Using stream I/O, you can peek ahead at input by first reading it and +then @dfn{unreading} it (also called @dfn{pushing it back} on the stream). +Unreading a character makes it available to be input again from the stream, +by the next call to @code{fgetc} or other input function on that stream. + +@menu +* Unreading Idea:: An explanation of unreading with pictures. +* How Unread:: How to call @code{ungetc} to do unreading. 
+@end menu + +@node Unreading Idea +@subsection What Unreading Means + +Here is a pictorial explanation of unreading. Suppose you have a +stream reading a file that contains just six characters, the letters +@samp{foobar}. Suppose you have read three characters so far. The +situation looks like this: + +@smallexample +f o o b a r + ^ +@end smallexample + +@noindent +so the next input character will be @samp{b}. + +@c @group Invalid outside @example +If instead of reading @samp{b} you unread the letter @samp{o}, you get a +situation like this: + +@smallexample +f o o b a r + | + o-- + ^ +@end smallexample + +@noindent +so that the next input characters will be @samp{o} and @samp{b}. +@c @end group + +@c @group +If you unread @samp{9} instead of @samp{o}, you get this situation: + +@smallexample +f o o b a r + | + 9-- + ^ +@end smallexample + +@noindent +so that the next input characters will be @samp{9} and @samp{b}. +@c @end group + +@node How Unread +@subsection Using @code{ungetc} To Do Unreading + +The function to unread a character is called @code{ungetc}, because it +reverses the action of @code{getc}. + +@comment stdio.h +@comment ANSI +@deftypefun int ungetc (int @var{c}, FILE *@var{stream}) +The @code{ungetc} function pushes back the character @var{c} onto the +input stream @var{stream}. So the next input from @var{stream} will +read @var{c} before anything else. + +If @var{c} is @code{EOF}, @code{ungetc} does nothing and just returns +@code{EOF}. This lets you call @code{ungetc} with the return value of +@code{getc} without needing to check for an error from @code{getc}. + +The character that you push back doesn't have to be the same as the last +character that was actually read from the stream. In fact, it isn't +necessary to actually read any characters from the stream before +unreading them with @code{ungetc}! But that is a strange way to write +a program; usually @code{ungetc} is used only to unread a character +that was just read from the same stream. 
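Peeking ahead can be sketched in a few lines. The helper name @code{peek_char} is hypothetical; it relies only on the behavior described above, namely that @code{ungetc} does nothing and returns @code{EOF} when it is given @code{EOF}.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character of STREAM without
   consuming it, or EOF at end of file or on a read error.  */
int
peek_char (FILE *stream)
{
  int c = getc (stream);

  /* On success ungetc returns the character it pushed back; if C is
     EOF it pushes nothing back and returns EOF.  */
  return ungetc (c, stream);
}
```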
+ +The GNU C library only supports one character of pushback---in other +words, it does not work to call @code{ungetc} twice without doing input +in between. Other systems might let you push back multiple characters; +then reading from the stream retrieves the characters in the reverse +order that they were pushed. + +Pushing back characters doesn't alter the file; only the internal +buffering for the stream is affected. If a file positioning function +(such as @code{fseek} or @code{rewind}; @pxref{File Positioning}) is +called, any pending pushed-back characters are discarded. + +Unreading a character on a stream that is at end of file clears the +end-of-file indicator for the stream, because it makes the character of +input available. After you read that character, trying to read again +will encounter end of file. +@end deftypefun + +Here is an example showing the use of @code{getc} and @code{ungetc} to +skip over whitespace characters. When this function reaches a +non-whitespace character, it unreads that character to be seen again on +the next read operation on the stream. + +@smallexample +#include <stdio.h> +#include <ctype.h> + +void +skip_whitespace (FILE *stream) +@{ + int c; + do + /* @r{No need to check for @code{EOF} because it is not} + @r{@code{isspace}, and @code{ungetc} ignores @code{EOF}.} */ + c = getc (stream); + while (isspace (c)); + ungetc (c, stream); +@} +@end smallexample + +@node Block Input/Output +@section Block Input/Output + +This section describes how to do input and output operations on blocks +of data. You can use these functions to read and write binary data, as +well as to read and write text in fixed-size blocks instead of by +characters or lines. 
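As a rough preview of block I/O, here is a minimal sketch that writes an array of @code{double} values to a temporary file with @code{fwrite} and reads it back with @code{fread}. The helper name @code{round_trip_doubles} is hypothetical, and the use of @code{tmpfile} is only an assumption made to keep the sketch self-contained.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write COUNT doubles from IN to a temporary
   binary file, then read them back into OUT.  Returns the number of
   objects read back, or 0 if the temporary file could not be used.  */
size_t
round_trip_doubles (const double *in, double *out, size_t count)
{
  FILE *stream = tmpfile ();
  size_t nread = 0;

  if (stream == NULL)
    return 0;

  /* fwrite returns COUNT only if every object was written.  */
  if (fwrite (in, sizeof (double), count, stream) == count)
    {
      /* Go back to the start of the file before reading.  */
      rewind (stream);
      nread = fread (out, sizeof (double), count, stream);
    }

  fclose (stream);
  return nread;
}
```

Because the data is stored in the in-memory representation, the values read back compare equal to the originals, with no rounding from a text conversion.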
+@cindex binary I/O to a stream +@cindex block I/O to a stream +@cindex reading from a stream, by blocks +@cindex writing to a stream, by blocks + +Binary files are typically used to read and write blocks of data in the +same format as is used to represent the data in a running program. In +other words, arbitrary blocks of memory---not just character or string +objects---can be written to a binary file, and meaningfully read in +again by the same program. + +Storing data in binary form is often considerably more efficient than +using the formatted I/O functions. Also, for floating-point numbers, +the binary form avoids possible loss of precision in the conversion +process. On the other hand, binary files can't be examined or modified +easily using many standard file utilities (such as text editors), and +are not portable between different implementations of the language, or +different kinds of computers. + +These functions are declared in @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun size_t fread (void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream}) +This function reads up to @var{count} objects of size @var{size} into +the array @var{data}, from the stream @var{stream}. It returns the +number of objects actually read, which might be less than @var{count} if +a read error occurs or the end of the file is reached. This function +returns a value of zero (and doesn't read anything) if either @var{size} +or @var{count} is zero. + +If @code{fread} encounters end of file in the middle of an object, it +returns the number of complete objects read, and discards the partial +object. Therefore, the stream remains at the actual end of the file. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun size_t fwrite (const void *@var{data}, size_t @var{size}, size_t @var{count}, FILE *@var{stream}) +This function writes up to @var{count} objects of size @var{size} from +the array @var{data}, to the stream @var{stream}. The return value is +normally @var{count}, if the call succeeds. Any other value indicates +some sort of error, such as running out of space. +@end deftypefun + +@node Formatted Output +@section Formatted Output + +@cindex format string, for @code{printf} +@cindex template, for @code{printf} +@cindex formatted output to a stream +@cindex writing to a stream, formatted +The functions described in this section (@code{printf} and related +functions) provide a convenient way to perform formatted output. You +call @code{printf} with a @dfn{format string} or @dfn{template string} +that specifies how to format the values of the remaining arguments. + +Unless your program is a filter that specifically performs line- or +character-oriented processing, using @code{printf} or one of the other +related functions described in this section is usually the easiest and +most concise way to perform output. These functions are especially +useful for printing error messages, tables of data, and the like. + +@menu +* Formatted Output Basics:: Some examples to get you started. +* Output Conversion Syntax:: General syntax of conversion + specifications. +* Table of Output Conversions:: Summary of output conversions and + what they do. +* Integer Conversions:: Details about formatting of integers. +* Floating-Point Conversions:: Details about formatting of + floating-point numbers. +* Other Output Conversions:: Details about formatting of strings, + characters, pointers, and the like. +* Formatted Output Functions:: Descriptions of the actual functions. +* Dynamic Output:: Functions that allocate memory for the output. +* Variable Arguments Output:: @code{vprintf} and friends. 
+* Parsing a Template String:: What kinds of args does a given template + call for? +* Example of Parsing:: Sample program using @code{parse_printf_format}. +@end menu + +@node Formatted Output Basics +@subsection Formatted Output Basics + +The @code{printf} function can be used to print any number of arguments. +The template string argument you supply in a call provides +information not only about the number of additional arguments, but also +about their types and what style should be used for printing them. + +Ordinary characters in the template string are simply written to the +output stream as-is, while @dfn{conversion specifications} introduced by +a @samp{%} character in the template cause subsequent arguments to be +formatted and written to the output stream. For example, +@cindex conversion specifications (@code{printf}) + +@smallexample +int pct = 37; +char filename[] = "foo.txt"; +printf ("Processing of `%s' is %d%% finished.\nPlease be patient.\n", + filename, pct); +@end smallexample + +@noindent +produces output like + +@smallexample +Processing of `foo.txt' is 37% finished. +Please be patient. +@end smallexample + +This example shows the use of the @samp{%d} conversion to specify that +an @code{int} argument should be printed in decimal notation, the +@samp{%s} conversion to specify printing of a string argument, and +the @samp{%%} conversion to print a literal @samp{%} character. + +There are also conversions for printing an integer argument as an +unsigned value in octal, decimal, or hexadecimal radix (@samp{%o}, +@samp{%u}, or @samp{%x}, respectively); or as a character value +(@samp{%c}). + +Floating-point numbers can be printed in normal, fixed-point notation +using the @samp{%f} conversion or in exponential notation using the +@samp{%e} conversion. The @samp{%g} conversion uses either @samp{%e} +or @samp{%f} format, depending on what is more appropriate for the +magnitude of the particular number. 
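The conversions mentioned above can be tried out with a small sketch like the following. The helper names @code{format_radixes} and @code{format_floats} are hypothetical; they use @code{snprintf} (described later in this chapter) so that the result lands in a buffer, and they assume the buffer is large enough for the whole result.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format N as unsigned octal, decimal and
   hexadecimal, separated by spaces.  Assumes BUF is large enough.  */
int
format_radixes (char *buf, size_t size, unsigned int n)
{
  int len = snprintf (buf, size, "%o %u %x", n, n, n);
  assert (len >= 0 && (size_t) len < size);
  return len;
}

/* Hypothetical helper: format X in fixed-point, exponential and
   "whichever fits best" notation.  Assumes BUF is large enough.  */
int
format_floats (char *buf, size_t size, double x)
{
  int len = snprintf (buf, size, "%f %e %g", x, x, x);
  assert (len >= 0 && (size_t) len < size);
  return len;
}
```

For instance, passing 255 to @code{format_radixes} yields @samp{377 255 ff}, and passing 1234.5 to @code{format_floats} yields @samp{1234.500000 1.234500e+03 1234.5}.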
+ +You can control formatting more precisely by writing @dfn{modifiers} +between the @samp{%} and the character that indicates which conversion +to apply. These slightly alter the ordinary behavior of the conversion. +For example, most conversion specifications permit you to specify a +minimum field width and a flag indicating whether you want the result +left- or right-justified within the field. + +The specific flags and modifiers that are permitted and their +interpretation vary depending on the particular conversion. They're all +described in more detail in the following sections. Don't worry if this +all seems excessively complicated at first; you can almost always get +reasonable free-format output without using any of the modifiers at all. +The modifiers are mostly used to make the output look ``prettier'' in +tables. + +@node Output Conversion Syntax +@subsection Output Conversion Syntax + +This section provides details about the precise syntax of conversion +specifications that can appear in a @code{printf} template +string. + +Characters in the template string that are not part of a +conversion specification are printed as-is to the output stream. +Multibyte character sequences (@pxref{Extended Characters}) are permitted in +a template string. + +The conversion specifications in a @code{printf} template string have +the general form: + +@example +% @var{flags} @var{width} @r{[} . @var{precision} @r{]} @var{type} @var{conversion} +@end example + +For example, in the conversion specifier @samp{%-10.8ld}, the @samp{-} +is a flag, @samp{10} specifies the field width, the precision is +@samp{8}, the letter @samp{l} is a type modifier, and @samp{d} specifies +the conversion style. (This particular conversion specifier says to +print a @code{long int} argument in decimal notation, with a minimum of +8 digits left-justified in a field at least 10 characters wide.) 
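The @samp{%-10.8ld} specification discussed above can be checked with a minimal sketch like this one. The helper name @code{format_long_field} is hypothetical, and the surrounding @samp{|} characters are added only to make the padding visible.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format VALUE with the %-10.8ld specification,
   between '|' characters so that the field padding can be seen.
   Assumes BUF is large enough for the whole result.  */
int
format_long_field (char *buf, size_t size, long int value)
{
  int len = snprintf (buf, size, "|%-10.8ld|", value);
  assert (len >= 0 && (size_t) len < size);
  return len;
}
```

With a value of 42 this produces @samp{|00000042  |}: the precision forces eight digits, and the @samp{-} flag puts the two padding spaces on the right.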
+ +In more detail, output conversion specifications consist of an +initial @samp{%} character followed in sequence by: + +@itemize @bullet +@item +Zero or more @dfn{flag characters} that modify the normal behavior of +the conversion specification. +@cindex flag character (@code{printf}) + +@item +An optional decimal integer specifying the @dfn{minimum field width}. +If the normal conversion produces fewer characters than this, the field +is padded with spaces to the specified width. This is a @emph{minimum} +value; if the normal conversion produces more characters than this, the +field is @emph{not} truncated. Normally, the output is right-justified +within the field. +@cindex minimum field width (@code{printf}) + +You can also specify a field width of @samp{*}. This means that the +next argument in the argument list (before the actual value to be +printed) is used as the field width. The value must be an @code{int}. +If the value is negative, this means to set the @samp{-} flag (see +below) and to use the absolute value as the field width. + +@item +An optional @dfn{precision} to specify the number of digits to be +written for the numeric conversions. If the precision is specified, it +consists of a period (@samp{.}) followed optionally by a decimal integer +(which defaults to zero if omitted). +@cindex precision (@code{printf}) + +You can also specify a precision of @samp{*}. This means that the next +argument in the argument list (before the actual value to be printed) is +used as the precision. The value must be an @code{int}, and is ignored +if it is negative. If you specify @samp{*} for both the field width and +precision, the field width argument precedes the precision argument. +Other C library versions may not recognize this syntax. + +@item +An optional @dfn{type modifier character}, which is used to specify the +data type of the corresponding argument if it differs from the default +type. 
(For example, the integer conversions assume a type of @code{int}, +but you can specify @samp{h}, @samp{l}, or @samp{L} for other integer +types.) +@cindex type modifier character (@code{printf}) + +@item +A character that specifies the conversion to be applied. +@end itemize + +The exact options that are permitted and how they are interpreted vary +between the different conversion specifiers. See the descriptions of the +individual conversions for information about the particular options that +they use. + +With the @samp{-Wformat} option, the GNU C compiler checks calls to +@code{printf} and related functions. It examines the format string and +verifies that the correct number and types of arguments are supplied. +There is also a GNU C syntax to tell the compiler that a function you +write uses a @code{printf}-style format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Table of Output Conversions +@subsection Table of Output Conversions +@cindex output conversions, for @code{printf} + +Here is a table summarizing what all the different conversions do: + +@table @asis +@item @samp{%d}, @samp{%i} +Print an integer as a signed decimal number. @xref{Integer +Conversions}, for details. @samp{%d} and @samp{%i} are synonymous for +output, but are different when used with @code{scanf} for input +(@pxref{Table of Input Conversions}). + +@item @samp{%o} +Print an integer as an unsigned octal number. @xref{Integer +Conversions}, for details. + +@item @samp{%u} +Print an integer as an unsigned decimal number. @xref{Integer +Conversions}, for details. + +@item @samp{%x}, @samp{%X} +Print an integer as an unsigned hexadecimal number. @samp{%x} uses +lower-case letters and @samp{%X} uses upper-case. @xref{Integer +Conversions}, for details. + +@item @samp{%f} +Print a floating-point number in normal (fixed-point) notation. +@xref{Floating-Point Conversions}, for details. 
+ +@item @samp{%e}, @samp{%E} +Print a floating-point number in exponential notation. @samp{%e} uses +lower-case letters and @samp{%E} uses upper-case. @xref{Floating-Point +Conversions}, for details. + +@item @samp{%g}, @samp{%G} +Print a floating-point number in either normal or exponential notation, +whichever is more appropriate for its magnitude. @samp{%g} uses +lower-case letters and @samp{%G} uses upper-case. @xref{Floating-Point +Conversions}, for details. + +@item @samp{%c} +Print a single character. @xref{Other Output Conversions}. + +@item @samp{%s} +Print a string. @xref{Other Output Conversions}. + +@item @samp{%p} +Print the value of a pointer. @xref{Other Output Conversions}. + +@item @samp{%n} +Get the number of characters printed so far. @xref{Other Output Conversions}. +Note that this conversion specification never produces any output. + +@item @samp{%m} +Print the string corresponding to the value of @code{errno}. +(This is a GNU extension.) +@xref{Other Output Conversions}. + +@item @samp{%%} +Print a literal @samp{%} character. @xref{Other Output Conversions}. +@end table + +If the syntax of a conversion specification is invalid, unpredictable +things will happen, so don't do this. If there aren't enough function +arguments provided to supply values for all the conversion +specifications in the template string, or if the arguments are not of +the correct types, the results are unpredictable. If you supply more +arguments than conversion specifications, the extra argument values are +simply ignored; this is sometimes useful. + +@node Integer Conversions +@subsection Integer Conversions + +This section describes the options for the @samp{%d}, @samp{%i}, +@samp{%o}, @samp{%u}, @samp{%x}, and @samp{%X} conversion +specifications. These conversions print integers in various formats. 
+ +The @samp{%d} and @samp{%i} conversion specifications both print an +@code{int} argument as a signed decimal number; while @samp{%o}, +@samp{%u}, and @samp{%x} print the argument as an unsigned octal, +decimal, or hexadecimal number (respectively). The @samp{%X} conversion +specification is just like @samp{%x} except that it uses the characters +@samp{ABCDEF} as digits instead of @samp{abcdef}. + +The following flags are meaningful: + +@table @asis +@item @samp{-} +Left-justify the result in the field (instead of the normal +right-justification). + +@item @samp{+} +For the signed @samp{%d} and @samp{%i} conversions, print a +plus sign if the value is positive. + +@item @samp{ } +For the signed @samp{%d} and @samp{%i} conversions, if the result +doesn't start with a plus or minus sign, prefix it with a space +character instead. Since the @samp{+} flag ensures that the result +includes a sign, this flag is ignored if you supply both of them. + +@item @samp{#} +For the @samp{%o} conversion, this forces the leading digit to be +@samp{0}, as if by increasing the precision. For @samp{%x} or +@samp{%X}, this prefixes a leading @samp{0x} or @samp{0X} (respectively) +to the result. This doesn't do anything useful for the @samp{%d}, +@samp{%i}, or @samp{%u} conversions. Using this flag produces output +which can be parsed by the @code{strtoul} function (@pxref{Parsing of +Integers}) and @code{scanf} with the @samp{%i} conversion +(@pxref{Numeric Input Conversions}). + +@item @samp{'} +Separate the digits into groups as specified by the locale specified for +the @code{LC_NUMERIC} category; @pxref{General Numeric}. This flag is a +GNU extension. + +@item @samp{0} +Pad the field with zeros instead of spaces. The zeros are placed after +any indication of sign or base. This flag is ignored if the @samp{-} +flag is also specified, or if a precision is specified. 
+@end table + +If a precision is supplied, it specifies the minimum number of digits to +appear; leading zeros are produced if necessary. If you don't specify a +precision, the number is printed with as many digits as it needs. If +you convert a value of zero with an explicit precision of zero, then no +characters at all are produced. + +Without a type modifier, the corresponding argument is treated as an +@code{int} (for the signed conversions @samp{%i} and @samp{%d}) or +@code{unsigned int} (for the unsigned conversions @samp{%o}, @samp{%u}, +@samp{%x}, and @samp{%X}). Recall that since @code{printf} and friends +are variadic, any @code{char} and @code{short} arguments are +automatically converted to @code{int} by the default argument +promotions. For arguments of other integer types, you can use these +modifiers: + +@table @samp +@item h +Specifies that the argument is a @code{short int} or @code{unsigned +short int}, as appropriate. A @code{short} argument is converted to an +@code{int} or @code{unsigned int} by the default argument promotions +anyway, but the @samp{h} modifier says to convert it back to a +@code{short} again. + +@item l +Specifies that the argument is a @code{long int} or @code{unsigned long +int}, as appropriate. Two @samp{l} characters is like the @samp{L} +modifier, below. + +@item L +@itemx ll +@itemx q +Specifies that the argument is a @code{long long int}. (This type is +an extension supported by the GNU C compiler. On systems that don't +support extra-long integers, this is the same as @code{long int}.) + +The @samp{q} modifier is another name for the same thing, which comes +from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad'' +@code{int}. + +@item Z +Specifies that the argument is a @code{size_t}. This is a GNU extension. +@end table + +Here is an example. 
Using the template string: + +@smallexample +"|%5d|%-5d|%+5d|%+-5d|% 5d|%05d|%5.0d|%5.2d|%d|\n" +@end smallexample + +@noindent +to print numbers using the different options for the @samp{%d} +conversion gives results like: + +@smallexample +|    0|0    |   +0|+0   |    0|00000|     |   00|0| +|    1|1    |   +1|+1   |    1|00001|    1|   01|1| +|   -1|-1   |   -1|-1   |   -1|-0001|   -1|  -01|-1| +|100000|100000|+100000|+100000| 100000|100000|100000|100000|100000| +@end smallexample + +In particular, notice what happens in the last case where the number +is too large to fit in the minimum field width specified. + +Here are some more examples showing how unsigned integers print under +various format options, using the template string: + +@smallexample +"|%5u|%5o|%5x|%5X|%#5o|%#5x|%#5X|%#10.8x|\n" +@end smallexample + +@smallexample +|    0|    0|    0|    0|    0|    0|    0|  00000000| +|    1|    1|    1|    1|   01|  0x1|  0X1|0x00000001| +|100000|303240|186a0|186A0|0303240|0x186a0|0X186A0|0x000186a0| +@end smallexample + + +@node Floating-Point Conversions +@subsection Floating-Point Conversions + +This section discusses the conversion specifications for floating-point +numbers: the @samp{%f}, @samp{%e}, @samp{%E}, @samp{%g}, and @samp{%G} +conversions. + +The @samp{%f} conversion prints its argument in fixed-point notation, +producing output of the form +@w{[@code{-}]@var{ddd}@code{.}@var{ddd}}, +where the number of digits following the decimal point is controlled +by the precision you specify. + +The @samp{%e} conversion prints its argument in exponential notation, +producing output of the form +@w{[@code{-}]@var{d}@code{.}@var{ddd}@code{e}[@code{+}|@code{-}]@var{dd}}. +Again, the number of digits following the decimal point is controlled by +the precision. The exponent always contains at least two digits. The +@samp{%E} conversion is similar but the exponent is marked with the letter +@samp{E} instead of @samp{e}. 
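To see how the precision controls the digits after the decimal point, here is a minimal sketch that formats one value with @samp{%f}, @samp{%e}, and @samp{%E}. The helper name @code{format_fixed_and_exp} is hypothetical; it passes the precision through the @samp{*} syntax described earlier and assumes the buffer is large enough.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format X with %f, %e and %E, each using
   PRECISION digits after the decimal point.  The precision is
   supplied as an int argument by way of the '*' syntax.  */
int
format_fixed_and_exp (char *buf, size_t size, double x, int precision)
{
  int len = snprintf (buf, size, "%.*f %.*e %.*E",
                      precision, x, precision, x, precision, x);
  assert (len >= 0 && (size_t) len < size);
  return len;
}
```

For example, formatting 1234.5678 with a precision of 3 gives @samp{1234.568 1.235e+03 1.235E+03}.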
+ +The @samp{%g} and @samp{%G} conversions print the argument in the style +of @samp{%e} or @samp{%E} (respectively) if the exponent would be less +than -4 or greater than or equal to the precision; otherwise they use the +@samp{%f} style. Trailing zeros are removed from the fractional portion +of the result and a decimal-point character appears only if it is +followed by a digit. + +The following flags can be used to modify the behavior: + +@comment We use @asis instead of @samp so we can have ` ' as an item. +@table @asis +@item @samp{-} +Left-justify the result in the field. Normally the result is +right-justified. + +@item @samp{+} +Always include a plus or minus sign in the result. + +@item @samp{ } +If the result doesn't start with a plus or minus sign, prefix it with a +space instead. Since the @samp{+} flag ensures that the result includes +a sign, this flag is ignored if you supply both of them. + +@item @samp{#} +Specifies that the result should always include a decimal point, even +if no digits follow it. For the @samp{%g} and @samp{%G} conversions, +this also forces trailing zeros after the decimal point to be left +in place where they would otherwise be removed. + +@item @samp{'} +Separate the digits of the integer part of the result into groups as +specified by the locale specified for the @code{LC_NUMERIC} category; +@pxref{General Numeric}. This flag is a GNU extension. + +@item @samp{0} +Pad the field with zeros instead of spaces; the zeros are placed +after any sign. This flag is ignored if the @samp{-} flag is also +specified. +@end table + +The precision specifies how many digits follow the decimal-point +character for the @samp{%f}, @samp{%e}, and @samp{%E} conversions. For +these conversions, the default precision is @code{6}. If the precision +is explicitly @code{0}, this suppresses the decimal point character +entirely. For the @samp{%g} and @samp{%G} conversions, the precision +specifies how many significant digits to print. 
Significant digits are +the first digit before the decimal point, and all the digits after it. +If the precision is @code{0} for @samp{%g} or @samp{%G}, +it is treated like a value of @code{1}; if it is not specified, the +default of @code{6} is used. If the value being printed +cannot be expressed accurately in the specified number of digits, the +value is rounded to the nearest number that fits. + +Without a type modifier, the floating-point conversions use an argument +of type @code{double}. (By the default argument promotions, any +@code{float} arguments are automatically converted to @code{double}.) +The following type modifier is supported: + +@table @samp +@item L +An uppercase @samp{L} specifies that the argument is a @code{long +double}. +@end table + +Here are some examples showing how numbers print using the various +floating-point conversions. All of the numbers were printed using +this template string: + +@smallexample +"|%12.4f|%12.4e|%12.4g|\n" +@end smallexample + +Here is the output: + +@smallexample +|      0.0000|  0.0000e+00|           0| +|      1.0000|  1.0000e+00|           1| +|     -1.0000| -1.0000e+00|          -1| +|    100.0000|  1.0000e+02|         100| +|   1000.0000|  1.0000e+03|        1000| +|  10000.0000|  1.0000e+04|       1e+04| +|  12345.0000|  1.2345e+04|   1.234e+04| +| 100000.0000|  1.0000e+05|       1e+05| +| 123456.0000|  1.2346e+05|   1.235e+05| +@end smallexample + +Notice how the @samp{%g} conversion drops trailing zeros. + +@node Other Output Conversions +@subsection Other Output Conversions + +This section describes miscellaneous conversions for @code{printf}. + +The @samp{%c} conversion prints a single character. The @code{int} +argument is first converted to an @code{unsigned char}. The @samp{-} +flag can be used to specify left-justification in the field, but no +other flags are defined, and no precision or type modifier can be given. +For example: + +@smallexample +printf ("%c%c%c%c%c", 'h', 'e', 'l', 'l', 'o'); +@end smallexample + +@noindent +prints @samp{hello}. + +The @samp{%s} conversion prints a string. 
The corresponding argument +must be of type @code{char *} (or @code{const char *}). A precision can +be specified to indicate the maximum number of characters to write; +otherwise characters in the string up to but not including the +terminating null character are written to the output stream. The +@samp{-} flag can be used to specify left-justification in the field, +but no other flags or type modifiers are defined for this conversion. +For example: + +@smallexample +printf ("%3s%-6s", "no", "where"); +@end smallexample + +@noindent +prints @samp{ nowhere }. + +If you accidentally pass a null pointer as the argument for a @samp{%s} +conversion, the GNU library prints it as @samp{(null)}. We think this +is more useful than crashing. But it's not good practice to pass a null +argument intentionally. + +The @samp{%m} conversion prints the string corresponding to the error +code in @code{errno}. @xref{Error Messages}. Thus: + +@smallexample +fprintf (stderr, "can't open `%s': %m\n", filename); +@end smallexample + +@noindent +is equivalent to: + +@smallexample +fprintf (stderr, "can't open `%s': %s\n", filename, strerror (errno)); +@end smallexample + +@noindent +The @samp{%m} conversion is a GNU C library extension. + +The @samp{%p} conversion prints a pointer value. The corresponding +argument must be of type @code{void *}. In practice, you can use any +type of pointer. + +In the GNU system, non-null pointers are printed as unsigned integers, +as if a @samp{%#x} conversion were used. Null pointers print as +@samp{(nil)}. (Pointers might print differently in other systems.) + +For example: + +@smallexample +printf ("%p", "testing"); +@end smallexample + +@noindent +prints @samp{0x} followed by a hexadecimal number---the address of the +string constant @code{"testing"}. It does not print the word +@samp{testing}. 
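Since @samp{%p} expects a @code{void *} argument, portable code usually casts the pointer explicitly. The following minimal sketch shows this; the helper name @code{format_pointer} is hypothetical, and the exact text produced is system-dependent, so the sketch makes no assumption about it beyond fitting in the buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: store the printed form of pointer P in BUF.
   The %p conversion takes a void *, so the const qualifier is cast
   away for the call.  The output format varies between systems.  */
int
format_pointer (char *buf, size_t size, const void *p)
{
  int len = snprintf (buf, size, "%p", (void *) p);
  assert (len >= 0 && (size_t) len < size);
  return len;
}
```

A caller might pass the address of any object, for example @code{format_pointer (buf, sizeof buf, &some_variable)}, and then print or log the resulting string.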
+ +You can supply the @samp{-} flag with the @samp{%p} conversion to +specify left-justification, but no other flags, precision, or type +modifiers are defined. + +The @samp{%n} conversion is unlike any of the other output conversions. +It uses an argument which must be a pointer to an @code{int}, but +instead of printing anything it stores the number of characters printed +so far by this call at that location. The @samp{h} and @samp{l} type +modifiers are permitted to specify that the argument is of type +@code{short int *} or @code{long int *} instead of @code{int *}, but no +flags, field width, or precision are permitted. + +For example, + +@smallexample +int nchar; +printf ("%d %s%n\n", 3, "bears", &nchar); +@end smallexample + +@noindent +prints: + +@smallexample +3 bears +@end smallexample + +@noindent +and sets @code{nchar} to @code{7}, because @samp{3 bears} is seven +characters. + + +The @samp{%%} conversion prints a literal @samp{%} character. This +conversion doesn't use an argument, and no flags, field width, +precision, or type modifiers are permitted. + + +@node Formatted Output Functions +@subsection Formatted Output Functions + +This section describes how to call @code{printf} and related functions. +Prototypes for these functions are in the header file @file{stdio.h}. +Because these functions take a variable number of arguments, you +@emph{must} declare prototypes for them before using them. Of course, +the easiest way to make sure you have all the right prototypes is to +just include @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int printf (const char *@var{template}, @dots{}) +The @code{printf} function prints the optional arguments under the +control of the template string @var{template} to the stream +@code{stdout}. It returns the number of characters printed, or a +negative value if there was an output error. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fprintf (FILE *@var{stream}, const char *@var{template}, @dots{}) +This function is just like @code{printf}, except that the output is +written to the stream @var{stream} instead of @code{stdout}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int sprintf (char *@var{s}, const char *@var{template}, @dots{}) +This is like @code{printf}, except that the output is stored in the character +array @var{s} instead of written to a stream. A null character is written +to mark the end of the string. + +The @code{sprintf} function returns the number of characters stored in +the array @var{s}, not including the terminating null character. + +The behavior of this function is undefined if copying takes place +between objects that overlap---for example, if @var{s} is also given +as an argument to be printed under control of the @samp{%s} conversion. +@xref{Copying and Concatenation}. + +@strong{Warning:} The @code{sprintf} function can be @strong{dangerous} +because it can potentially output more characters than can fit in the +allocation size of the string @var{s}. Remember that the field width +given in a conversion specification is only a @emph{minimum} value. + +To avoid this problem, you can use @code{snprintf} or @code{asprintf}, +described below. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int snprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, @dots{}) +The @code{snprintf} function is similar to @code{sprintf}, except that +the @var{size} argument specifies the maximum number of characters to +produce. The trailing null character is counted towards this limit, so +you should allocate at least @var{size} characters for the string @var{s}. + +The return value is the number of characters stored, not including the +terminating null. If this value equals @code{@var{size} - 1}, then +there was not enough space in @var{s} for all the output. 
You should +try again with a bigger output string. Here is an example of doing +this: + +@smallexample +@group +/* @r{Construct a message describing the value of a variable} +   @r{whose name is @var{name} and whose value is @var{value}.} */ +char * +make_message (char *name, char *value) +@{ +  /* @r{Guess we need no more than 100 chars of space.} */ +  int size = 100; +  char *buffer = (char *) xmalloc (size); +@end group +@group +  while (1) +    @{ +      /* @r{Try to print in the allocated space.} */ +      int nchars = snprintf (buffer, size, +                             "value of %s is %s", +                             name, value); +      /* @r{If the output left room to spare, return the string.} */ +      if (nchars < size - 1) +        return buffer; +      /* @r{Else try again with twice as much space.} */ +      size *= 2; +      buffer = (char *) xrealloc (buffer, size); +    @} +@} +@end group +@end smallexample + +In practice, it is often easier just to use @code{asprintf}, below. +@end deftypefun + +@node Dynamic Output +@subsection Dynamically Allocating Formatted Output + +The functions in this section do formatted output and place the results +in dynamically allocated memory. + +@comment stdio.h +@comment GNU +@deftypefun int asprintf (char **@var{ptr}, const char *@var{template}, @dots{}) +This function is similar to @code{sprintf}, except that it dynamically +allocates a string (as with @code{malloc}; @pxref{Unconstrained +Allocation}) to hold the output, instead of putting the output in a +buffer you allocate in advance. The @var{ptr} argument should be the +address of a @code{char *} object, and @code{asprintf} stores a pointer +to the newly allocated string at that location. 
+ +Here is how to use @code{asprintf} to get the same result as the +@code{snprintf} example, but more easily: + +@smallexample +/* @r{Construct a message describing the value of a variable} +   @r{whose name is @var{name} and whose value is @var{value}.} */ +char * +make_message (char *name, char *value) +@{ +  char *result; +  asprintf (&result, "value of %s is %s", name, value); +  return result; +@} +@end smallexample +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int obstack_printf (struct obstack *@var{obstack}, const char *@var{template}, @dots{}) +This function is similar to @code{asprintf}, except that it uses the +obstack @var{obstack} to allocate the space. @xref{Obstacks}. + +The characters are written onto the end of the current object. +To get at them, you must finish the object with @code{obstack_finish} +(@pxref{Growing Objects}).@refill +@end deftypefun + +@node Variable Arguments Output +@subsection Variable Arguments Output Functions + +The functions @code{vprintf} and friends are provided so that you can +define your own variadic @code{printf}-like functions that make use of +the same internals as the built-in formatted output functions. + +The most natural way to define such functions would be to use a language +construct to say, ``Call @code{printf} and pass this template plus all +of my arguments after the first five.'' But there is no way to do this +in C, and it would be hard to provide a way, since at the C language +level there is no way to tell how many arguments your function received. + +Since that method is impossible, we provide alternative functions, the +@code{vprintf} series, which let you pass a @code{va_list} to describe +``all of my arguments after the first five.'' + +When it is sufficient to define a macro rather than a real function, +the GNU C compiler provides a way to do this much more easily. +For example: + +@smallexample +#define myprintf(a, b, c, d, e, rest...) printf (mytemplate , ## rest) 
+@end smallexample + +@noindent +@xref{Macro Varargs, , Macros with Variable Numbers of Arguments, +gcc.info, Using GNU CC}, for details. But this is limited to macros, +and does not apply to real functions at all. + +Before calling @code{vprintf} or the other functions listed in this +section, you @emph{must} call @code{va_start} (@pxref{Variadic +Functions}) to initialize a pointer to the variable arguments. Then you +can call @code{va_arg} to fetch the arguments that you want to handle +yourself. This advances the pointer past those arguments. + +Once your @code{va_list} pointer is pointing at the argument of your +choice, you are ready to call @code{vprintf}. That argument and all +subsequent arguments that were passed to your function are used by +@code{vprintf} along with the template that you specified separately. + +In some other systems, the @code{va_list} pointer may become invalid +after the call to @code{vprintf}, so you must not use @code{va_arg} +after you call @code{vprintf}. Instead, you should call @code{va_end} +to retire the pointer from service. However, you can safely call +@code{va_start} on another pointer variable and begin fetching the +arguments again through that pointer. Calling @code{vprintf} does not +destroy the argument list of your function, merely the particular +pointer that you passed to it. + +GNU C does not have such restrictions. You can safely continue to fetch +arguments from a @code{va_list} pointer after passing it to +@code{vprintf}, and @code{va_end} is a no-op. (Note, however, that +subsequent @code{va_arg} calls will fetch the same arguments which +@code{vprintf} previously used.) + +Prototypes for these functions are declared in @file{stdio.h}. 
+@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int vprintf (const char *@var{template}, va_list @var{ap}) +This function is similar to @code{printf} except that, instead of taking +a variable number of arguments directly, it takes an argument list +pointer @var{ap}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int vfprintf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{fprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int vsprintf (char *@var{s}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{sprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vsnprintf (char *@var{s}, size_t @var{size}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{snprintf} with the variable argument list +specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vasprintf (char **@var{ptr}, const char *@var{template}, va_list @var{ap}) +The @code{vasprintf} function is the equivalent of @code{asprintf} with the +variable argument list specified directly as for @code{vprintf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int obstack_vprintf (struct obstack *@var{obstack}, const char *@var{template}, va_list @var{ap}) +The @code{obstack_vprintf} function is the equivalent of +@code{obstack_printf} with the variable argument list specified directly +as for @code{vprintf}.@refill +@end deftypefun + +Here's an example showing how you might use @code{vfprintf}. This is a +function that prints error messages to the stream @code{stderr}, along +with a prefix indicating the name of the program +(@pxref{Error Messages}, for a description of +@code{program_invocation_short_name}). 
+ +@smallexample +@group +#include <stdio.h> +#include <stdarg.h> + +void +eprintf (const char *template, ...) +@{ + va_list ap; + extern char *program_invocation_short_name; + + fprintf (stderr, "%s: ", program_invocation_short_name); + va_start (ap, template); + vfprintf (stderr, template, ap); + va_end (ap); +@} +@end group +@end smallexample + +@noindent +You could call @code{eprintf} like this: + +@smallexample +eprintf ("file `%s' does not exist\n", filename); +@end smallexample + +In GNU C, there is a special construct you can use to let the compiler +know that a function uses a @code{printf}-style format string. Then it +can check the number and types of arguments in each call to the +function, and warn you when they do not match the format string. +For example, take this declaration of @code{eprintf}: + +@smallexample +void eprintf (const char *template, ...) + __attribute__ ((format (printf, 1, 2))); +@end smallexample + +@noindent +This tells the compiler that @code{eprintf} uses a format string like +@code{printf} (as opposed to @code{scanf}; @pxref{Formatted Input}); +the format string appears as the first argument; +and the arguments to satisfy the format begin with the second. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Parsing a Template String +@subsection Parsing a Template String +@cindex parsing a template string + +You can use the function @code{parse_printf_format} to obtain +information about the number and types of arguments that are expected by +a given template string. This function permits interpreters that +provide interfaces to @code{printf} to avoid passing along invalid +arguments from the user's program, which could cause a crash. + +All the symbols described in this section are declared in the header +file @file{printf.h}. 
+ +@comment printf.h +@comment GNU +@deftypefun size_t parse_printf_format (const char *@var{template}, size_t @var{n}, int *@var{argtypes}) +This function returns information about the number and types of +arguments expected by the @code{printf} template string @var{template}. +The information is stored in the array @var{argtypes}; each element of +this array describes one argument. This information is encoded using +the various @samp{PA_} macros, listed below. + +The @var{n} argument specifies the number of elements in the array +@var{argtypes}. This is the most elements that +@code{parse_printf_format} will try to write. + +@code{parse_printf_format} returns the total number of arguments required +by @var{template}. If this number is greater than @var{n}, then the +information returned describes only the first @var{n} arguments. If you +want information about more than that many arguments, allocate a bigger +array and call @code{parse_printf_format} again. +@end deftypefun + +The argument types are encoded as a combination of a basic type and +modifier flag bits. + +@comment printf.h +@comment GNU +@deftypevr Macro int PA_FLAG_MASK +This macro is a bitmask for the type modifier flag bits. You can write +the expression @code{(argtypes[i] & PA_FLAG_MASK)} to extract just the +flag bits for an argument, or @code{(argtypes[i] & ~PA_FLAG_MASK)} to +extract just the basic type code. +@end deftypevr + +Here are symbolic constants that represent the basic types; they stand +for integer values. + +@table @code +@comment printf.h +@comment GNU +@item PA_INT +@vindex PA_INT +This specifies that the base type is @code{int}. + +@comment printf.h +@comment GNU +@item PA_CHAR +@vindex PA_CHAR +This specifies that the base type is @code{int}, cast to @code{char}. + +@comment printf.h +@comment GNU +@item PA_STRING +@vindex PA_STRING +This specifies that the base type is @code{char *}, a null-terminated string. 
+ +@comment printf.h +@comment GNU +@item PA_POINTER +@vindex PA_POINTER +This specifies that the base type is @code{void *}, an arbitrary pointer. + +@comment printf.h +@comment GNU +@item PA_FLOAT +@vindex PA_FLOAT +This specifies that the base type is @code{float}. + +@comment printf.h +@comment GNU +@item PA_DOUBLE +@vindex PA_DOUBLE +This specifies that the base type is @code{double}. + +@comment printf.h +@comment GNU +@item PA_LAST +@vindex PA_LAST +You can define additional base types for your own programs as offsets +from @code{PA_LAST}. For example, if you have data types @samp{foo} +and @samp{bar} with their own specialized @code{printf} conversions, +you could define encodings for these types as: + +@smallexample +#define PA_FOO PA_LAST +#define PA_BAR (PA_LAST + 1) +@end smallexample +@end table + +Here are the flag bits that modify a basic type. They are combined with +the code for the basic type using inclusive-or. + +@table @code +@comment printf.h +@comment GNU +@item PA_FLAG_PTR +@vindex PA_FLAG_PTR +If this bit is set, it indicates that the encoded type is a pointer to +the base type, rather than an immediate value. +For example, @samp{PA_INT|PA_FLAG_PTR} represents the type @samp{int *}. + +@comment printf.h +@comment GNU +@item PA_FLAG_SHORT +@vindex PA_FLAG_SHORT +If this bit is set, it indicates that the base type is modified with +@code{short}. (This corresponds to the @samp{h} type modifier.) + +@comment printf.h +@comment GNU +@item PA_FLAG_LONG +@vindex PA_FLAG_LONG +If this bit is set, it indicates that the base type is modified with +@code{long}. (This corresponds to the @samp{l} type modifier.) + +@comment printf.h +@comment GNU +@item PA_FLAG_LONG_LONG +@vindex PA_FLAG_LONG_LONG +If this bit is set, it indicates that the base type is modified with +@code{long long}. (This corresponds to the @samp{L} type modifier.) 
+ +@comment printf.h +@comment GNU +@item PA_FLAG_LONG_DOUBLE +@vindex PA_FLAG_LONG_DOUBLE +This is a synonym for @code{PA_FLAG_LONG_LONG}, used by convention with +a base type of @code{PA_DOUBLE} to indicate a type of @code{long double}. +@end table + +@ifinfo +For an example of using these facilities, see @ref{Example of Parsing}. +@end ifinfo + +@node Example of Parsing +@subsection Example of Parsing a Template String + +Here is an example of decoding argument types for a format string. We +assume this is part of an interpreter which contains arguments of type +@code{NUMBER}, @code{CHAR}, @code{STRING} and @code{STRUCTURE} (and +perhaps others which are not valid here). + +@smallexample +/* @r{Test whether the @var{nargs} specified objects} + @r{in the vector @var{args} are valid} + @r{for the format string @var{format}:} + @r{if so, return 1.} + @r{If not, return 0 after printing an error message.} */ + +int +validate_args (char *format, int nargs, OBJECT *args) +@{ + int *argtypes; + int nelts; + int nwanted; + int i; + + /* @r{Get the information about the arguments.} + @r{Each conversion specification must be at least two characters} + @r{long, so there cannot be more specifications than half the} + @r{length of the string.} */ + + nelts = strlen (format) / 2; + argtypes = (int *) alloca (nelts * sizeof (int)); + nwanted = parse_printf_format (format, nelts, argtypes); + + /* @r{Check the number of arguments.} */ + if (nwanted > nargs) + @{ + error ("too few arguments (at least %d required)", nwanted); + return 0; + @} + + /* @r{Check the C type wanted for each argument} + @r{and see if the object given is suitable.} */ + for (i = 0; i < nwanted; i++) + @{ + int wanted; + + if (argtypes[i] & PA_FLAG_PTR) + wanted = STRUCTURE; + else + switch (argtypes[i] & ~PA_FLAG_MASK) + @{ + case PA_INT: + case PA_FLOAT: + case PA_DOUBLE: + wanted = NUMBER; + break; + case PA_CHAR: + wanted = CHAR; + break; + case PA_STRING: + wanted = STRING; + break; + case PA_POINTER: + wanted = STRUCTURE; + break; + @} + 
if (TYPE (args[i]) != wanted) + @{ + error ("type mismatch for arg number %d", i); + return 0; + @} + @} + return 1; +@} +@end smallexample + +@node Customizing Printf +@section Customizing @code{printf} +@cindex customizing @code{printf} +@cindex defining new @code{printf} conversions +@cindex extending @code{printf} + +The GNU C library lets you define your own custom conversion specifiers +for @code{printf} template strings, to teach @code{printf} clever ways +to print the important data structures of your program. + +The way you do this is by registering the conversion with the function +@code{register_printf_function}; see @ref{Registering New Conversions}. +One of the arguments you pass to this function is a pointer to a handler +function that produces the actual output; see @ref{Defining the Output +Handler}, for information on how to write this function. + +You can also install a function that just returns information about the +number and type of arguments expected by the conversion specifier. +@xref{Parsing a Template String}, for information about this. + +The facilities of this section are declared in the header file +@file{printf.h}. + +@menu +* Registering New Conversions:: Using @code{register_printf_function} + to register a new output conversion. +* Conversion Specifier Options:: The handler must be able to get + the options specified in the + template when it is called. +* Defining the Output Handler:: Defining the handler and arginfo + functions that are passed as arguments + to @code{register_printf_function}. +* Printf Extension Example:: How to define a @code{printf} + handler function. +@end menu + +@strong{Portability Note:} The ability to extend the syntax of +@code{printf} template strings is a GNU extension. ANSI standard C has +nothing similar. + +@node Registering New Conversions +@subsection Registering New Conversions + +The function to register a new output conversion is +@code{register_printf_function}, declared in @file{printf.h}. 
+@pindex printf.h + +@comment printf.h +@comment GNU +@deftypefun int register_printf_function (int @var{spec}, printf_function @var{handler-function}, printf_arginfo_function @var{arginfo-function}) +This function defines the conversion specifier character @var{spec}. +Thus, if @var{spec} is @code{'z'}, it defines the conversion @samp{%z}. +You can redefine the built-in conversions like @samp{%s}, but flag +characters like @samp{#} and type modifiers like @samp{l} can never be +used as conversions; calling @code{register_printf_function} for those +characters has no effect. + +The @var{handler-function} is the function called by @code{printf} and +friends when this conversion appears in a template string. +@xref{Defining the Output Handler}, for information about how to define +a function to pass as this argument. If you specify a null pointer, any +existing handler function for @var{spec} is removed. + +The @var{arginfo-function} is the function called by +@code{parse_printf_format} when this conversion appears in a +template string. @xref{Parsing a Template String}, for information +about this. + +Normally, you install both functions for a conversion at the same time, +but if you are never going to call @code{parse_printf_format}, you do +not need to define an arginfo function. + +The return value is @code{0} on success, and @code{-1} on failure +(which occurs if @var{spec} is out of range). + +You can redefine the standard output conversions, but this is probably +not a good idea because of the potential for confusion. Library routines +written by other people could break if you do this. +@end deftypefun + +@node Conversion Specifier Options +@subsection Conversion Specifier Options + +If you define a meaning for @samp{%q}, what if the template contains +@samp{%+23q} or @samp{%-#q}? To implement a sensible meaning for these, +the handler when called needs to be able to get the options specified in +the template. 
+ +Both the @var{handler-function} and @var{arginfo-function} arguments +to @code{register_printf_function} accept an argument that points to a +@code{struct printf_info}, which contains information about the options +appearing in an instance of the conversion specifier. This data type +is declared in the header file @file{printf.h}. +@pindex printf.h + +@comment printf.h +@comment GNU +@deftp {Type} {struct printf_info} +This structure is used to pass information about the options appearing +in an instance of a conversion specifier in a @code{printf} template +string to the handler and arginfo functions for that specifier. It +contains the following members: + +@table @code +@item int prec +This is the precision specified. The value is @code{-1} if no precision +was specified. If the precision was given as @samp{*}, the +@code{printf_info} structure passed to the handler function contains the +actual value retrieved from the argument list. But the structure passed +to the arginfo function contains a value of @code{INT_MIN}, since the +actual value is not known. + +@item int width +This is the minimum field width specified. The value is @code{0} if no +width was specified. If the field width was given as @samp{*}, the +@code{printf_info} structure passed to the handler function contains the +actual value retrieved from the argument list. But the structure passed +to the arginfo function contains a value of @code{INT_MIN}, since the +actual value is not known. + +@item char spec +This is the conversion specifier character specified. It's stored in +the structure so that you can register the same handler function for +multiple characters, but still have a way to tell them apart when the +handler function is called. + +@item unsigned int is_long_double +This is a boolean that is true if the @samp{L}, @samp{ll}, or @samp{q} +type modifier was specified. 
For integer conversions, this indicates +@code{long long int}, as opposed to @code{long double} for floating +point conversions. + +@item unsigned int is_short +This is a boolean that is true if the @samp{h} type modifier was specified. + +@item unsigned int is_long +This is a boolean that is true if the @samp{l} type modifier was specified. + +@item unsigned int alt +This is a boolean that is true if the @samp{#} flag was specified. + +@item unsigned int space +This is a boolean that is true if the @samp{ } flag was specified. + +@item unsigned int left +This is a boolean that is true if the @samp{-} flag was specified. + +@item unsigned int showsign +This is a boolean that is true if the @samp{+} flag was specified. + +@item unsigned int group +This is a boolean that is true if the @samp{'} flag was specified. + +@item char pad +This is the character to use for padding the output to the minimum field +width. The value is @code{'0'} if the @samp{0} flag was specified, and +@code{' '} otherwise. +@end table +@end deftp + + +@node Defining the Output Handler +@subsection Defining the Output Handler + +Now let's look at how to define the handler and arginfo functions +which are passed as arguments to @code{register_printf_function}. + +You should define your handler functions with a prototype like: + +@smallexample +int @var{function} (FILE *stream, const struct printf_info *info, + va_list *ap_pointer) +@end smallexample + +The @code{stream} argument passed to the handler function is the stream to +which it should write output. + +The @code{info} argument is a pointer to a structure that contains +information about the various options that were included with the +conversion in the template string. You should not modify this structure +inside your handler function. @xref{Conversion Specifier Options}, for +a description of this data structure. 
+ +The @code{ap_pointer} argument is used to pass the tail of the variable +argument list containing the values to be printed to your handler. +Unlike most other functions that can be passed an explicit variable +argument list, this is a @emph{pointer} to a @code{va_list}, rather than +the @code{va_list} itself. Thus, you should fetch arguments by +means of @code{va_arg (*ap_pointer, @var{type})}. + +(Passing a pointer here allows the function that calls your handler +function to update its own @code{va_list} variable to account for the +arguments that your handler processes. @xref{Variadic Functions}.) + +Your handler function should return a value just like @code{printf} +does: it should return the number of characters it has written, or a +negative value to indicate an error. + +@comment printf.h +@comment GNU +@deftp {Data Type} printf_function +This is the data type that a handler function should have. +@end deftp + +If you are going to use @w{@code{parse_printf_format}} in your +application, you should also define a function to pass as the +@var{arginfo-function} argument for each new conversion you install with +@code{register_printf_function}. + +You should define these functions with a prototype like: + +@smallexample +int @var{function} (const struct printf_info *info, + size_t n, int *argtypes) +@end smallexample + +The return value from the function should be the number of arguments the +conversion expects. The function should also fill in no more than +@var{n} elements of the @var{argtypes} array with information about the +types of each of these arguments. This information is encoded using the +various @samp{PA_} macros. (You will notice that this is the same +calling convention @code{parse_printf_format} itself uses.) + +@comment printf.h +@comment GNU +@deftp {Data Type} printf_arginfo_function +This type is used to describe functions that return information about +the number and type of arguments used by a conversion specifier. 
+@end deftp + +@node Printf Extension Example +@subsection @code{printf} Extension Example + +Here is an example showing how to define a @code{printf} handler function. +This program defines a data structure called a @code{Widget} and +defines the @samp{%W} conversion to print information about @w{@code{Widget *}} +arguments, including the pointer value and the name stored in the data +structure. The @samp{%W} conversion supports the minimum field width and +left-justification options, but ignores everything else. + +@smallexample +@include rprintf.c.texi +@end smallexample + +The output produced by this program looks like: + +@smallexample +|<Widget 0xffeffb7c: mywidget>| +| <Widget 0xffeffb7c: mywidget>| +|<Widget 0xffeffb7c: mywidget> | +@end smallexample + +@node Formatted Input +@section Formatted Input + +@cindex formatted input from a stream +@cindex reading from a stream, formatted +@cindex format string, for @code{scanf} +@cindex template, for @code{scanf} +The functions described in this section (@code{scanf} and related +functions) provide facilities for formatted input analogous to the +formatted output facilities. These functions provide a mechanism for +reading arbitrary values under the control of a @dfn{format string} or +@dfn{template string}. + +@menu +* Formatted Input Basics:: Some basics to get you started. +* Input Conversion Syntax:: Syntax of conversion specifications. +* Table of Input Conversions:: Summary of input conversions and what they do. +* Numeric Input Conversions:: Details of conversions for reading numbers. +* String Input Conversions:: Details of conversions for reading strings. +* Dynamic String Input:: String conversions that @code{malloc} the buffer. +* Other Input Conversions:: Details of miscellaneous other conversions. +* Formatted Input Functions:: Descriptions of the actual functions. +* Variable Arguments Input:: @code{vscanf} and friends. 
+@end menu + +@node Formatted Input Basics +@subsection Formatted Input Basics + +Calls to @code{scanf} are superficially similar to calls to +@code{printf} in that arbitrary arguments are read under the control of +a template string. While the syntax of the conversion specifications in +the template is very similar to that for @code{printf}, the +interpretation of the template is oriented more towards free-format +input and simple pattern matching, rather than fixed-field formatting. +For example, most @code{scanf} conversions skip over any amount of +``white space'' (including spaces, tabs, and newlines) in the input +file, and there is no concept of precision for the numeric input +conversions as there is for the corresponding output conversions. +Ordinarily, non-whitespace characters in the template are expected to +match characters in the input stream exactly, but a matching failure is +distinct from an input error on the stream. +@cindex conversion specifications (@code{scanf}) + +Another area of difference between @code{scanf} and @code{printf} is +that you must remember to supply pointers rather than immediate values +as the optional arguments to @code{scanf}; the values that are read are +stored in the objects that the pointers point to. Even experienced +programmers tend to forget this occasionally, so if your program is +getting strange errors that seem to be related to @code{scanf}, you +might want to double-check this. + +When a @dfn{matching failure} occurs, @code{scanf} returns immediately, +leaving the first non-matching character as the next character to be +read from the stream. The normal return value from @code{scanf} is the +number of values that were assigned, so you can use this to determine if +a matching error happened before all the expected values were read. +@cindex matching failure, in @code{scanf} + +The @code{scanf} function is typically used for things like reading in +the contents of tables. 
For example, here is a function that uses +@code{scanf} to initialize an array of @code{double}: + +@smallexample +void +readarray (double *array, int n) +@{ + int i; + for (i=0; i<n; i++) + if (scanf (" %lf", &(array[i])) != 1) + invalid_input_error (); +@} +@end smallexample + +The formatted input functions are not used as frequently as the +formatted output functions. Partly, this is because it takes some care +to use them properly. Another reason is that it is difficult to recover +from a matching error. + +If you are trying to read input that doesn't match a single, fixed +pattern, you may be better off using a tool such as Flex to generate a +lexical scanner, or Bison to generate a parser, rather than using +@code{scanf}. For more information about these tools, see @ref{, , , +flex.info, Flex: The Lexical Scanner Generator}, and @ref{, , , +bison.info, The Bison Reference Manual}. + +@node Input Conversion Syntax +@subsection Input Conversion Syntax + +A @code{scanf} template string is a string that contains ordinary +multibyte characters interspersed with conversion specifications that +start with @samp{%}. + +Any whitespace character (as defined by the @code{isspace} function; +@pxref{Classification of Characters}) in the template causes any number +of whitespace characters in the input stream to be read and discarded. +The whitespace characters that are matched need not be exactly the same +whitespace characters that appear in the template string. For example, +write @samp{ , } in the template to recognize a comma with optional +whitespace before and after. + +Other characters in the template string that are not part of conversion +specifications must match characters in the input stream exactly; if +this is not the case, a matching failure occurs. 
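The whitespace and literal-character rules above can be tried out with a small sketch. It uses @code{sscanf} on a fixed string rather than @code{scanf}, so that the input is visible in the example; the function name @code{parse_pair} is a hypothetical helper, not part of the library.

```c
#include <stdio.h>

/* Hypothetical helper: parse two integers separated by a comma.
   The template " , " lets any amount of whitespace (including none)
   appear on either side of the comma; the comma itself must match
   exactly.  Returns the number of values assigned by sscanf.  */
int
parse_pair (const char *text, int *x, int *y)
{
  return sscanf (text, "%d , %d", x, y);
}
```

Under these assumptions, both @code{"3,4"} and @code{"3 , 4"} assign two values. An input such as @code{"3 ; 4"} assigns only the first, because the @samp{;} fails to match the comma in the template and the call stops there with a matching failure.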
+ +The conversion specifications in a @code{scanf} template string +have the general form: + +@smallexample +% @var{flags} @var{width} @var{type} @var{conversion} +@end smallexample + +In more detail, an input conversion specification consists of an initial +@samp{%} character followed in sequence by: + +@itemize @bullet +@item +An optional @dfn{flag character} @samp{*}, which says to ignore the text +read for this specification. When @code{scanf} finds a conversion +specification that uses this flag, it reads input as directed by the +rest of the conversion specification, but it discards this input, does +not use a pointer argument, and does not increment the count of +successful assignments. +@cindex flag character (@code{scanf}) + +@item +An optional flag character @samp{a} (valid with string conversions only) +which requests allocation of a buffer long enough to store the string in. +(This is a GNU extension.) +@xref{Dynamic String Input}. + +@item +An optional decimal integer that specifies the @dfn{maximum field +width}. Reading of characters from the input stream stops either when +this maximum is reached or when a non-matching character is found, +whichever happens first. Most conversions discard initial whitespace +characters (those that don't are explicitly documented), and these +discarded characters don't count towards the maximum field width. +String input conversions store a null character to mark the end of the +input; the maximum field width does not include this terminator. +@cindex maximum field width (@code{scanf}) + +@item +An optional @dfn{type modifier character}. For example, you can +specify a type modifier of @samp{l} with integer conversions such as +@samp{%d} to specify that the argument is a pointer to a @code{long int} +rather than a pointer to an @code{int}. +@cindex type modifier character (@code{scanf}) + +@item +A character that specifies the conversion to be applied. 
+@end itemize + +The exact options that are permitted and how they are interpreted vary +between the different conversion specifiers. See the descriptions of the +individual conversions for information about the particular options that +they allow. + +With the @samp{-Wformat} option, the GNU C compiler checks calls to +@code{scanf} and related functions. It examines the format string and +verifies that the correct number and types of arguments are supplied. +There is also a GNU C syntax to tell the compiler that a function you +write uses a @code{scanf}-style format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for more information. + +@node Table of Input Conversions +@subsection Table of Input Conversions +@cindex input conversions, for @code{scanf} + +Here is a table that summarizes the various conversion specifications: + +@table @asis +@item @samp{%d} +Matches an optionally signed integer written in decimal. @xref{Numeric +Input Conversions}. + +@item @samp{%i} +Matches an optionally signed integer in any of the formats that the C +language defines for specifying an integer constant. @xref{Numeric +Input Conversions}. + +@item @samp{%o} +Matches an unsigned integer written in octal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%u} +Matches an unsigned integer written in decimal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%x}, @samp{%X} +Matches an unsigned integer written in hexadecimal radix. +@xref{Numeric Input Conversions}. + +@item @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, @samp{%G} +Matches an optionally signed floating-point number. @xref{Numeric Input +Conversions}. + +@item @samp{%s} +Matches a string containing only non-whitespace characters. +@xref{String Input Conversions}. + +@item @samp{%[} +Matches a string of characters that belong to a specified set. +@xref{String Input Conversions}. 
+ +@item @samp{%c} +Matches a string of one or more characters; the number of characters +read is controlled by the maximum field width given for the conversion. +@xref{String Input Conversions}. + +@item @samp{%p} +Matches a pointer value in the same implementation-defined format used +by the @samp{%p} output conversion for @code{printf}. @xref{Other Input +Conversions}. + +@item @samp{%n} +This conversion doesn't read any characters; it records the number of +characters read so far by this call. @xref{Other Input Conversions}. + +@item @samp{%%} +This matches a literal @samp{%} character in the input stream. No +corresponding argument is used. @xref{Other Input Conversions}. +@end table + +If the syntax of a conversion specification is invalid, the behavior is +undefined. If there aren't enough function arguments provided to supply +addresses for all the conversion specifications in the template strings +that perform assignments, or if the arguments are not of the correct +types, the behavior is also undefined. On the other hand, extra +arguments are simply ignored. + +@node Numeric Input Conversions +@subsection Numeric Input Conversions + +This section describes the @code{scanf} conversions for reading numeric +values. + +The @samp{%d} conversion matches an optionally signed integer in decimal +radix. The syntax that is recognized is the same as that for the +@code{strtol} function (@pxref{Parsing of Integers}) with the value +@code{10} for the @var{base} argument. + +The @samp{%i} conversion matches an optionally signed integer in any of +the formats that the C language defines for specifying an integer +constant. The syntax that is recognized is the same as that for the +@code{strtol} function (@pxref{Parsing of Integers}) with the value +@code{0} for the @var{base} argument. (You can print integers in this +syntax with @code{printf} by using the @samp{#} flag character with the +@samp{%x}, @samp{%o}, or @samp{%d} conversion. @xref{Integer Conversions}.) 
+ +For example, any of the strings @samp{10}, @samp{0xa}, or @samp{012} +could be read in as integers under the @samp{%i} conversion. Each of +these specifies a number with decimal value @code{10}. + +The @samp{%o}, @samp{%u}, and @samp{%x} conversions match unsigned +integers in octal, decimal, and hexadecimal radices, respectively. The +syntax that is recognized is the same as that for the @code{strtoul} +function (@pxref{Parsing of Integers}) with the appropriate value +(@code{8}, @code{10}, or @code{16}) for the @var{base} argument. + +The @samp{%X} conversion is identical to the @samp{%x} conversion. They +both permit either uppercase or lowercase letters to be used as digits. + +The default type of the corresponding argument for the @code{%d} and +@code{%i} conversions is @code{int *}, and @code{unsigned int *} for the +other integer conversions. You can use the following type modifiers to +specify other sizes of integer: + +@table @samp +@item h +Specifies that the argument is a @code{short int *} or @code{unsigned +short int *}. + +@item l +Specifies that the argument is a @code{long int *} or @code{unsigned +long int *}. Two @samp{l} characters is like the @samp{L} modifier, below. + +@need 100 +@item ll +@itemx L +@itemx q +Specifies that the argument is a @code{long long int *} or @code{unsigned long long int *}. (The @code{long long} type is an extension supported by the +GNU C compiler. For systems that don't provide extra-long integers, this +is the same as @code{long int}.) + +The @samp{q} modifier is another name for the same thing, which comes +from 4.4 BSD; a @w{@code{long long int}} is sometimes called a ``quad'' +@code{int}. +@end table + +All of the @samp{%e}, @samp{%f}, @samp{%g}, @samp{%E}, and @samp{%G} +input conversions are interchangeable. They all match an optionally +signed floating point number, in the same syntax as for the +@code{strtod} function (@pxref{Parsing of Floats}). 
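The integer conversions and type modifiers described above can be sketched as follows. This is a minimal illustration, not part of the library: the helper names `parse_c_integer` and `parse_short_and_long` are hypothetical, and `sscanf` is used so that the input can be given as a string.

```c
#include <stdio.h>

/* Hypothetical helper: parse an integer written in any of the C
   constant syntaxes (decimal, 0x-prefixed hexadecimal, or octal with
   a leading 0) by using the %i conversion.
   Returns 1 if a value was stored in *OUT, 0 otherwise.  */
int
parse_c_integer (const char *text, int *out)
{
  return sscanf (text, "%i", out) == 1;
}

/* Hypothetical helper: read a short int and a long int, using the
   h and l type modifiers so that the pointer types match.
   Returns 1 if both values were assigned, 0 otherwise.  */
int
parse_short_and_long (const char *text, short int *s, long int *l)
{
  return sscanf (text, "%hd %ld", s, l) == 2;
}
```

With these helpers, the strings `10`, `0xa`, and `012` should all yield the same value through `parse_c_integer`, since `%i` accepts each of those syntaxes.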
+ +For the floating-point input conversions, the default argument type is +@code{float *}. (This is different from the corresponding output +conversions, where the default type is @code{double}; remember that +@code{float} arguments to @code{printf} are converted to @code{double} +by the default argument promotions, but @code{float *} arguments are +not promoted to @code{double *}.) You can specify other sizes of float +using these type modifiers: + +@table @samp +@item l +Specifies that the argument is of type @code{double *}. + +@item L +Specifies that the argument is of type @code{long double *}. +@end table + +@node String Input Conversions +@subsection String Input Conversions + +This section describes the @code{scanf} input conversions for reading +string and character values: @samp{%s}, @samp{%[}, and @samp{%c}. + +You have two options for how to receive the input from these +conversions: + +@itemize @bullet +@item +Provide a buffer to store it in. This is the default. You +should provide an argument of type @code{char *}. + +@strong{Warning:} To make a robust program, you must make sure that the +input (plus its terminating null) cannot possibly exceed the size of the +buffer you provide. In general, the only way to do this is to specify a +maximum field width one less than the buffer size. @strong{If you +provide the buffer, always specify a maximum field width to prevent +overflow.} + +@item +Ask @code{scanf} to allocate a big enough buffer, by specifying the +@samp{a} flag character. This is a GNU extension. You should provide +an argument of type @code{char **} for the buffer address to be stored +in. @xref{Dynamic String Input}. +@end itemize + +The @samp{%c} conversion is the simplest: it matches a fixed number of +characters, always. The maximum field width says how many characters to +read; if you don't specify the maximum, the default is 1. This +conversion doesn't append a null character to the end of the text it +reads.
It also does not skip over initial whitespace characters. It +reads precisely the next @var{n} characters, and fails if it cannot get +that many. Since there is always a maximum field width with @samp{%c} +(whether specified, or 1 by default), you can always prevent overflow by +making the buffer long enough. + +The @samp{%s} conversion matches a string of non-whitespace characters. +It skips and discards initial whitespace, but stops when it encounters +more whitespace after having read something. It stores a null character +at the end of the text that it reads. + +For example, reading the input: + +@smallexample + hello, world +@end smallexample + +@noindent +with the conversion @samp{%10c} produces @code{" hello, wo"}, but +reading the same input with the conversion @samp{%10s} produces +@code{"hello,"}. + +@strong{Warning:} If you do not specify a field width for @samp{%s}, +then the number of characters read is limited only by where the next +whitespace character appears. This almost certainly means that invalid +input can make your program crash---which is a bug. + +To read in characters that belong to an arbitrary set of your choice, +use the @samp{%[} conversion. You specify the set between the @samp{[} +character and a following @samp{]} character, using the same syntax used +in regular expressions. As special cases: + +@itemize @bullet +@item +A literal @samp{]} character can be specified as the first character +of the set. + +@item +An embedded @samp{-} character (that is, one that is not the first or +last character of the set) is used to specify a range of characters. + +@item +If a caret character @samp{^} immediately follows the initial @samp{[}, +then the set of allowed input characters is everything @emph{except} +the characters listed. +@end itemize + +The @samp{%[} conversion does not skip over initial whitespace +characters.
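The differences between the three string conversions can be sketched as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, each buffer is one byte larger than the maximum field width, and `sscanf` stands in for reading from a stream.

```c
#include <stdio.h>

/* Hypothetical helper: read exactly 10 characters with %10c.
   %c does not skip whitespace and does not append a null character,
   so the terminator is stored explicitly.  Returns 1 on success.  */
int
read_ten_chars (const char *input, char buf[11])
{
  int ok = sscanf (input, "%10c", buf) == 1;

  if (!ok)
    buf[0] = '\0';      /* Fewer than 10 characters were available.  */
  buf[10] = '\0';
  return ok;
}

/* Hypothetical helper: read one word of at most 10 characters with
   %10s, which skips initial whitespace and appends a null itself.  */
int
read_word (const char *input, char buf[11])
{
  return sscanf (input, "%10s", buf) == 1;
}

/* Hypothetical helper: read up to 25 lowercase letters with %25[a-z].
   %[ does not skip initial whitespace, so input that begins with a
   space is a matching failure.  */
int
read_lowercase (const char *input, char buf[26])
{
  return sscanf (input, "%25[a-z]", buf) == 1;
}
```

In each case the field width is what keeps the conversion from writing past the end of the buffer, which is why the buffers are declared one element longer than the width.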
+ +Here are some examples of @samp{%[} conversions and what they mean: + +@table @samp +@item %25[1234567890] +Matches a string of up to 25 digits. + +@item %25[][] +Matches a string of up to 25 square brackets. + +@item %25[^ \f\n\r\t\v] +Matches a string up to 25 characters long that doesn't contain any of +the standard whitespace characters. This is slightly different from +@samp{%s}, because if the input begins with a whitespace character, +@samp{%[} reports a matching failure while @samp{%s} simply discards the +initial whitespace. + +@item %25[a-z] +Matches up to 25 lowercase characters. +@end table + +One more reminder: the @samp{%s} and @samp{%[} conversions are +@strong{dangerous} if you don't specify a maximum width or use the +@samp{a} flag, because input too long would overflow whatever buffer you +have provided for it. No matter how long your buffer is, a user could +supply input that is longer. A well-written program reports invalid +input with a comprehensible error message, not with a crash. + +@node Dynamic String Input +@subsection Dynamically Allocating String Conversions + +A GNU extension to formatted input lets you safely read a string with no +maximum size. Using this feature, you don't supply a buffer; instead, +@code{scanf} allocates a buffer big enough to hold the data and gives +you its address. To use this feature, write @samp{a} as a flag +character, as in @samp{%as} or @samp{%a[0-9a-z]}. + +The pointer argument you supply for where to store the input should have +type @code{char **}. The @code{scanf} function allocates a buffer and +stores its address in the word that the argument points to. You should +free the buffer with @code{free} when you no longer need it. + +Here is an example of using the @samp{a} flag with the @samp{%[@dots{}]} +conversion specification to read a ``variable assignment'' of the form +@samp{@var{variable} = @var{value}}. 
+ +@smallexample +@{ + char *variable, *value; + + if (2 > scanf ("%a[a-zA-Z0-9] = %a[^\n]\n", + &variable, &value)) + @{ + invalid_input_error (); + return 0; + @} + + @dots{} +@} +@end smallexample + +@node Other Input Conversions +@subsection Other Input Conversions + +This section describes the miscellaneous input conversions. + +The @samp{%p} conversion is used to read a pointer value. It recognizes +the same syntax as is used by the @samp{%p} output conversion for +@code{printf} (@pxref{Other Output Conversions}); that is, a hexadecimal +number just as the @samp{%x} conversion accepts. The corresponding +argument should be of type @code{void **}; that is, the address of a +place to store a pointer. + +The resulting pointer value is not guaranteed to be valid if it was not +originally written during the same program execution that reads it in. + +The @samp{%n} conversion produces the number of characters read so far +by this call. The corresponding argument should be of type @code{int *}. +This conversion works in the same way as the @samp{%n} conversion for +@code{printf}; see @ref{Other Output Conversions}, for an example. + +The @samp{%n} conversion is the only mechanism for determining the +success of literal matches or conversions with suppressed assignments. +If the @samp{%n} follows the locus of a matching failure, then no value +is stored for it since @code{scanf} returns before processing the +@samp{%n}. If you store @code{-1} in that argument slot before calling +@code{scanf}, the presence of @code{-1} after @code{scanf} indicates an +error occurred before the @samp{%n} was reached. + +Finally, the @samp{%%} conversion matches a literal @samp{%} character +in the input stream, without using an argument. This conversion does +not permit any flags, field width, or type modifier to be specified. + +@node Formatted Input Functions +@subsection Formatted Input Functions + +Here are the descriptions of the functions for performing formatted +input. 
+Prototypes for these functions are in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int scanf (const char *@var{template}, @dots{}) +The @code{scanf} function reads formatted input from the stream +@code{stdin} under the control of the template string @var{template}. +The optional arguments are pointers to the places which receive the +resulting values. + +The return value is normally the number of successful assignments. If +an end-of-file condition is detected before any matches are performed +(including matches against whitespace and literal characters in the +template), then @code{EOF} is returned. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fscanf (FILE *@var{stream}, const char *@var{template}, @dots{}) +This function is just like @code{scanf}, except that the input is read +from the stream @var{stream} instead of @code{stdin}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int sscanf (const char *@var{s}, const char *@var{template}, @dots{}) +This is like @code{scanf}, except that the characters are taken from the +null-terminated string @var{s} instead of from a stream. Reaching the +end of the string is treated as an end-of-file condition. + +The behavior of this function is undefined if copying takes place +between objects that overlap---for example, if @var{s} is also given +as an argument to receive a string read under control of the @samp{%s} +conversion. +@end deftypefun + +@node Variable Arguments Input +@subsection Variable Arguments Input Functions + +The functions @code{vscanf} and friends are provided so that you can +define your own variadic @code{scanf}-like functions that make use of +the same internals as the built-in formatted input functions. +These functions are analogous to the @code{vprintf} series of output +functions. @xref{Variable Arguments Output}, for important +information on how to use them.
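A variadic @code{scanf}-like function of the kind mentioned above might look like the following sketch. The name `scan_expect` and its parameters are hypothetical; the only library call assumed is `vsscanf`, which takes the same string and template as `sscanf` plus a `va_list`.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: scan TEXT under control of TMPL and report
   whether exactly EXPECTED assignments were made.  The pointer
   arguments after TMPL are passed through to vsscanf unchanged.  */
int
scan_expect (const char *text, int expected, const char *tmpl, ...)
{
  va_list ap;
  int assigned;

  va_start (ap, tmpl);
  assigned = vsscanf (text, tmpl, ap);
  va_end (ap);

  return assigned == expected;
}
```

A caller would write something like `scan_expect ("3 4", 2, "%d %d", &x, &y)` and treat a zero result as invalid input.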
+ +@strong{Portability Note:} The functions listed in this section are GNU +extensions. + +@comment stdio.h +@comment GNU +@deftypefun int vscanf (const char *@var{template}, va_list @var{ap}) +This function is similar to @code{scanf} except that, instead of taking +a variable number of arguments directly, it takes an argument list +pointer @var{ap} of type @code{va_list} (@pxref{Variadic Functions}). +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vfscanf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{fscanf} with the variable argument list +specified directly as for @code{vscanf}. +@end deftypefun + +@comment stdio.h +@comment GNU +@deftypefun int vsscanf (const char *@var{s}, const char *@var{template}, va_list @var{ap}) +This is the equivalent of @code{sscanf} with the variable argument list +specified directly as for @code{vscanf}. +@end deftypefun + +In GNU C, there is a special construct you can use to let the compiler +know that a function uses a @code{scanf}-style format string. Then it +can check the number and types of arguments in each call to the +function, and warn you when they do not match the format string. +@xref{Function Attributes, , Declaring Attributes of Functions, +gcc.info, Using GNU CC}, for details. + +@node EOF and Errors +@section End-Of-File and Errors + +@cindex end of file, on a stream +Many of the functions described in this chapter return the value of the +macro @code{EOF} to indicate unsuccessful completion of the operation. +Since @code{EOF} is used to report both end of file and random errors, +it's often better to use the @code{feof} function to check explicitly +for end of file and @code{ferror} to check for errors. These functions +check indicators that are part of the internal state of the stream +object, indicators set if the appropriate condition was detected by a +previous I/O operation on that stream. 
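One way to tell the two meanings of @code{EOF} apart is sketched below. The function name `count_remaining` is hypothetical; it reads with `getc` until `EOF` is returned and then consults `ferror` to decide which condition ended the loop.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining on STREAM.
   Returns the count if reading stopped at end of file, or -1 if it
   stopped because the error indicator was set.  */
long int
count_remaining (FILE *stream)
{
  long int count = 0;
  int c;

  while ((c = getc (stream)) != EOF)
    count++;

  /* getc returns EOF for both conditions, so check the indicators.  */
  if (ferror (stream))
    return -1;

  return count;
}
```

After a successful return, `feof` on the same stream is expected to report that the end-of-file indicator is set.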
+ +These symbols are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypevr Macro int EOF +This macro is an integer value that is returned by a number of functions +to indicate an end-of-file condition, or some other error situation. +With the GNU library, @code{EOF} is @code{-1}. In other libraries, its +value may be some other negative number. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void clearerr (FILE *@var{stream}) +This function clears the end-of-file and error indicators for the +stream @var{stream}. + +The file positioning functions (@pxref{File Positioning}) also clear the +end-of-file indicator for the stream. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int feof (FILE *@var{stream}) +The @code{feof} function returns nonzero if and only if the end-of-file +indicator for the stream @var{stream} is set. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int ferror (FILE *@var{stream}) +The @code{ferror} function returns nonzero if and only if the error +indicator for the stream @var{stream} is set, indicating that an error +has occurred on a previous operation on the stream. +@end deftypefun + +In addition to setting the error indicator associated with the stream, +the functions that operate on streams also set @code{errno} in the same +way as the corresponding low-level functions that operate on file +descriptors. For example, all of the functions that perform output to a +stream---such as @code{fputc}, @code{printf}, and @code{fflush}---are +implemented in terms of @code{write}, and all of the @code{errno} error +conditions defined for @code{write} are meaningful for these functions. +For more information about the descriptor-level I/O functions, see +@ref{Low-Level I/O}. + +@node Binary Streams +@section Text and Binary Streams + +The GNU system and other POSIX-compatible operating systems organize all +files as uniform sequences of characters. 
However, some other systems +make a distinction between files containing text and files containing +binary data, and the input and output facilities of ANSI C provide for +this distinction. This section tells you how to write programs portable +to such systems. + +@cindex text stream +@cindex binary stream +When you open a stream, you can specify either a @dfn{text stream} or a +@dfn{binary stream}. You indicate that you want a binary stream by +specifying the @samp{b} modifier in the @var{opentype} argument to +@code{fopen}; see @ref{Opening Streams}. Without this +option, @code{fopen} opens the file as a text stream. + +Text and binary streams differ in several ways: + +@itemize @bullet +@item +The data read from a text stream is divided into @dfn{lines} which are +terminated by newline (@code{'\n'}) characters, while a binary stream is +simply a long series of characters. A text stream might on some systems +fail to handle lines more than 254 characters long (including the +terminating newline character). +@cindex lines (in a text file) + +@item +On some systems, text files can contain only printing characters, +horizontal tab characters, and newlines, and so text streams may not +support other characters. However, binary streams can handle any +character value. + +@item +Space characters that are written immediately preceding a newline +character in a text stream may disappear when the file is read in again. + +@item +More generally, there need not be a one-to-one mapping between +characters that are read from or written to a text stream, and the +characters in the actual file. +@end itemize + +Since a binary stream is always more capable and more predictable than a +text stream, you might wonder what purpose text streams serve. Why not +simply always use binary streams? 
The answer is that on these operating +systems, text and binary streams use different file formats, and the +only way to read or write ``an ordinary file of text'' that can work +with other text-oriented programs is through a text stream. + +In the GNU library, and on all POSIX systems, there is no difference +between text streams and binary streams. When you open a stream, you +get the same kind of stream regardless of whether you ask for binary. +This stream can handle any file content, and has none of the +restrictions that text streams sometimes have. + +@node File Positioning +@section File Positioning +@cindex file positioning on a stream +@cindex positioning a stream +@cindex seeking on a stream + +The @dfn{file position} of a stream describes where in the file the +stream is currently reading or writing. I/O on the stream advances the +file position through the file. In the GNU system, the file position is +represented as an integer, which counts the number of bytes from the +beginning of the file. @xref{File Position}. + +During I/O to an ordinary disk file, you can change the file position +whenever you wish, so as to read or write any portion of the file. Some +other kinds of files may also permit this. Files which support changing +the file position are sometimes referred to as @dfn{random-access} +files. + +You can use the functions in this section to examine or modify the file +position indicator associated with a stream. The symbols listed below +are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun {long int} ftell (FILE *@var{stream}) +This function returns the current file position of the stream +@var{stream}. + +This function can fail if the stream doesn't support file positioning, +or if the file position can't be represented in a @code{long int}, and +possibly for other reasons as well. If a failure occurs, a value of +@code{-1} is returned. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fseek (FILE *@var{stream}, long int @var{offset}, int @var{whence}) +The @code{fseek} function is used to change the file position of the +stream @var{stream}. The value of @var{whence} must be one of the +constants @code{SEEK_SET}, @code{SEEK_CUR}, or @code{SEEK_END}, to +indicate whether the @var{offset} is relative to the beginning of the +file, the current file position, or the end of the file, respectively. + +This function returns a value of zero if the operation was successful, +and a nonzero value to indicate failure. A successful call also clears +the end-of-file indicator of @var{stream} and discards any characters +that were ``pushed back'' by the use of @code{ungetc}. + +@code{fseek} either flushes any buffered output before setting the file +position or else remembers it so it will be written later in its proper +place in the file. +@end deftypefun + +@strong{Portability Note:} In non-POSIX systems, @code{ftell} and +@code{fseek} might work reliably only on binary streams. @xref{Binary +Streams}. + +The following symbolic constants are defined for use as the @var{whence} +argument to @code{fseek}. They are also used with the @code{lseek} +function (@pxref{I/O Primitives}) and to specify offsets for file locks +(@pxref{Control Operations}). + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_SET +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the beginning of the file. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_CUR +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the current file position. 
+@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int SEEK_END +This is an integer constant which, when used as the @var{whence} +argument to the @code{fseek} function, specifies that the offset +provided is relative to the end of the file. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void rewind (FILE *@var{stream}) +The @code{rewind} function positions the stream @var{stream} at the +beginning of the file. It is equivalent to calling @code{fseek} on the +@var{stream} with an @var{offset} argument of @code{0L} and a +@var{whence} argument of @code{SEEK_SET}, except that the return +value is discarded and the error indicator for the stream is reset. +@end deftypefun + +These three aliases for the @samp{SEEK_@dots{}} constants exist for the +sake of compatibility with older BSD systems. They are defined in two +different header files: @file{fcntl.h} and @file{sys/file.h}. + +@table @code +@comment sys/file.h +@comment BSD +@item L_SET +@vindex L_SET +An alias for @code{SEEK_SET}. + +@comment sys/file.h +@comment BSD +@item L_INCR +@vindex L_INCR +An alias for @code{SEEK_CUR}. + +@comment sys/file.h +@comment BSD +@item L_XTND +@vindex L_XTND +An alias for @code{SEEK_END}. +@end table + +@node Portable Positioning +@section Portable File-Position Functions + +On the GNU system, the file position is truly a character count. You +can specify any character count value as an argument to @code{fseek} and +get reliable results for any random access file. However, some ANSI C +systems do not represent file positions in this way. + +On some systems where text streams truly differ from binary streams, it +is impossible to represent the file position of a text stream as a count +of characters from the beginning of the file. For example, the file +position on some systems must encode both a record offset within the +file, and a character offset within the record.
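Even on such systems, a value obtained from @code{ftell} can be handed back to @code{fseek} with @code{SEEK_SET} to return to the same place. The sketch below shows that pattern; the function name `peek_char` is hypothetical, and the sketch assumes the stream supports file positioning.

```c
#include <stdio.h>

/* Hypothetical helper: return the next character on STREAM without
   changing the file position.  Returns EOF if the position cannot be
   saved or restored, or if there is nothing left to read.  */
int
peek_char (FILE *stream)
{
  long int saved = ftell (stream);
  int c;

  if (saved == -1L)
    return EOF;         /* The stream doesn't support positioning.  */

  c = getc (stream);

  /* Go back to the saved position.  A successful fseek also clears
     the end-of-file indicator if getc set it.  */
  if (fseek (stream, saved, SEEK_SET) != 0)
    return EOF;

  return c;
}
```

The offset is never computed or adjusted here; it is used only as an opaque value returned by an earlier `ftell` call on the same stream.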
+ +As a consequence, if you want your programs to be portable to these +systems, you must observe certain rules: + +@itemize @bullet +@item +The value returned from @code{ftell} on a text stream has no predictable +relationship to the number of characters you have read so far. The only +thing you can rely on is that you can use it subsequently as the +@var{offset} argument to @code{fseek} to move back to the same file +position. + +@item +In a call to @code{fseek} on a text stream, either the @var{offset} must +be zero, or @var{whence} must be @code{SEEK_SET} and the +@var{offset} must be the result of an earlier call to @code{ftell} on +the same stream. + +@item +The value of the file position indicator of a text stream is undefined +while there are characters that have been pushed back with @code{ungetc} +that haven't been read or discarded. @xref{Unreading}. +@end itemize + +But even if you observe these rules, you may still have trouble for long +files, because @code{ftell} and @code{fseek} use a @code{long int} value +to represent the file position. This type may not have room to encode +all the file positions in a large file. + +So if you do want to support systems with peculiar encodings for the +file positions, it is better to use the functions @code{fgetpos} and +@code{fsetpos} instead. These functions represent the file position +using the data type @code{fpos_t}, whose internal representation varies +from system to system. + +These symbols are declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftp {Data Type} fpos_t +This is the type of an object that can encode information about the +file position of a stream, for use by the functions @code{fgetpos} and +@code{fsetpos}. + +In the GNU system, @code{fpos_t} is equivalent to @code{off_t} or +@code{long int}. In other systems, it might have a different internal +representation.
+@end deftp + +@comment stdio.h +@comment ANSI +@deftypefun int fgetpos (FILE *@var{stream}, fpos_t *@var{position}) +This function stores the value of the file position indicator for the +stream @var{stream} in the @code{fpos_t} object pointed to by +@var{position}. If successful, @code{fgetpos} returns zero; otherwise +it returns a nonzero value and stores an implementation-defined positive +value in @code{errno}. +@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypefun int fsetpos (FILE *@var{stream}, const fpos_t *@var{position}) +This function sets the file position indicator for the stream @var{stream} +to the position stored in the object pointed to by @var{position}, which must have been set by a previous +call to @code{fgetpos} on the same stream. If successful, @code{fsetpos} +clears the end-of-file indicator on the stream, discards any characters +that were ``pushed back'' by the use of @code{ungetc}, and returns a value +of zero. Otherwise, @code{fsetpos} returns a nonzero value and stores +an implementation-defined positive value in @code{errno}. +@end deftypefun + +@node Stream Buffering +@section Stream Buffering + +@cindex buffering of streams +Characters that are written to a stream are normally accumulated and +transmitted asynchronously to the file in a block, instead of appearing +as soon as they are output by the application program. Similarly, +streams often retrieve input from the host environment in blocks rather +than on a character-by-character basis. This is called @dfn{buffering}. + +If you are writing programs that do interactive input and output using +streams, you need to understand how buffering works when you design the +user interface to your program. Otherwise, you might find that output +(such as progress or prompt messages) doesn't appear when you intended +it to, or other unexpected behavior.
+ +This section deals only with controlling when characters are transmitted +between the stream and the file or device, and @emph{not} with how +things like echoing, flow control, and the like are handled on specific +classes of devices. For information on common control operations on +terminal devices, see @ref{Low-Level Terminal Interface}. + +You can bypass the stream buffering facilities altogether by using the +low-level input and output functions that operate on file descriptors +instead. @xref{Low-Level I/O}. + +@menu +* Buffering Concepts:: Terminology is defined here. +* Flushing Buffers:: How to ensure that output buffers are flushed. +* Controlling Buffering:: How to specify what kind of buffering to use. +@end menu + +@node Buffering Concepts +@subsection Buffering Concepts + +There are three different kinds of buffering strategies: + +@itemize @bullet +@item +Characters written to or read from an @dfn{unbuffered} stream are +transmitted individually to or from the file as soon as possible. +@cindex unbuffered stream + +@item +Characters written to a @dfn{line buffered} stream are transmitted to +the file in blocks when a newline character is encountered. +@cindex line buffered stream + +@item +Characters written to or read from a @dfn{fully buffered} stream are +transmitted to or from the file in blocks of arbitrary size. +@cindex fully buffered stream +@end itemize + +Newly opened streams are normally fully buffered, with one exception: a +stream connected to an interactive device such as a terminal is +initially line buffered. @xref{Controlling Buffering}, for information +on how to select a different kind of buffering. Usually the automatic +selection gives you the most convenient kind of buffering for the file +or device you open. + +The use of line buffering for interactive devices implies that output +messages ending in a newline will appear immediately---which is usually +what you want. 
Output that doesn't end in a newline might or might not +show up immediately, so if you want it to appear immediately, you +should flush buffered output explicitly with @code{fflush}, as described +in @ref{Flushing Buffers}. + +@node Flushing Buffers +@subsection Flushing Buffers + +@cindex flushing a stream +@dfn{Flushing} output on a buffered stream means transmitting all +accumulated characters to the file. There are many circumstances when +buffered output on a stream is flushed automatically: + +@itemize @bullet +@item +When you try to do output and the output buffer is full. + +@item +When the stream is closed. @xref{Closing Streams}. + +@item +When the program terminates by calling @code{exit}. +@xref{Normal Termination}. + +@item +When a newline is written, if the stream is line buffered. + +@item +Whenever an input operation on @emph{any} stream actually reads data +from its file. +@end itemize + +If you want to flush the buffered output at another time, call +@code{fflush}, which is declared in the header file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int fflush (FILE *@var{stream}) +This function causes any buffered output on @var{stream} to be delivered +to the file. If @var{stream} is a null pointer, then +@code{fflush} causes buffered output on @emph{all} open output streams +to be flushed. + +This function returns @code{EOF} if a write error occurs, or zero +otherwise. +@end deftypefun + +@strong{Compatibility Note:} Some brain-damaged operating systems have +been known to be so thoroughly fixated on line-oriented input and output +that flushing a line buffered stream causes a newline to be written! +Fortunately, this ``feature'' seems to be becoming less common. You do +not need to worry about this in the GNU system.
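The typical case for an explicit flush is a prompt that does not end in a newline. The following is a minimal sketch; the function name `show_prompt` is hypothetical, and the stream is passed in so the same code works for `stdout` or any other output stream.

```c
#include <stdio.h>

/* Hypothetical helper: write TEXT to OUT and flush it right away, so
   that a prompt without a trailing newline is delivered even when
   OUT is line buffered or fully buffered.
   Returns 0 on success, EOF if writing or flushing failed.  */
int
show_prompt (FILE *out, const char *text)
{
  if (fputs (text, out) == EOF)
    return EOF;

  return fflush (out);
}
```

A program would call something like `show_prompt (stdout, "Name: ")` immediately before reading the reply.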
+ + +@node Controlling Buffering +@subsection Controlling Which Kind of Buffering + +After opening a stream (but before any other operations have been +performed on it), you can explicitly specify what kind of buffering you +want it to have using the @code{setvbuf} function. +@cindex buffering, controlling + +The facilities listed in this section are declared in the header +file @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment ANSI +@deftypefun int setvbuf (FILE *@var{stream}, char *@var{buf}, int @var{mode}, size_t @var{size}) +This function is used to specify that the stream @var{stream} should +have the buffering mode @var{mode}, which can be either @code{_IOFBF} +(for full buffering), @code{_IOLBF} (for line buffering), or +@code{_IONBF} (for unbuffered input/output). + +If you specify a null pointer as the @var{buf} argument, then @code{setvbuf} +allocates a buffer itself using @code{malloc}. This buffer will be freed +when you close the stream. + +Otherwise, @var{buf} should be a character array that can hold at least +@var{size} characters. You should not free the space for this array as +long as the stream remains open and this array remains its buffer. You +should usually either allocate it statically, or @code{malloc} +(@pxref{Unconstrained Allocation}) the buffer. Using an automatic array +is not a good idea unless you close the file before exiting the block +that declares the array. + +While the array remains a stream buffer, the stream I/O functions will +use the buffer for their internal purposes. You shouldn't try to access +the values in the array directly while the stream is using it for +buffering. + +The @code{setvbuf} function returns zero on success, or a nonzero value +if the value of @var{mode} is not valid or if the request could not +be honored. 
+@end deftypefun + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IOFBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be fully buffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IOLBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be line buffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int _IONBF +The value of this macro is an integer constant expression that can be +used as the @var{mode} argument to the @code{setvbuf} function to +specify that the stream should be unbuffered. +@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypevr Macro int BUFSIZ +The value of this macro is an integer constant expression that is good +to use for the @var{size} argument to @code{setvbuf}. This value is +guaranteed to be at least @code{256}. + +The value of @code{BUFSIZ} is chosen on each system so as to make stream +I/O efficient. So it is a good idea to use @code{BUFSIZ} as the size +for the buffer when you call @code{setvbuf}. + +Actually, you can get an even better value to use for the buffer size +by means of the @code{fstat} system call: it is found in the +@code{st_blksize} field of the file attributes. @xref{Attribute Meanings}. + +Sometimes people also use @code{BUFSIZ} as the allocation size of +buffers used for related purposes, such as strings used to receive a +line of input with @code{fgets} (@pxref{Character Input}). There is no +particular reason to use @code{BUFSIZ} for this instead of any other +integer, except that it might lead to doing I/O in chunks of an +efficient size. 
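
As a sketch of the @code{st_blksize} approach, the helper below asks @code{fstat} for the preferred block size of the file behind a stream and falls back to @code{BUFSIZ} when that fails. It assumes a POSIX system that provides @code{fileno}; the name @code{preferred_buffer_size} is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* For fileno and st_blksize.  */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return the preferred I/O block size for the file behind STREAM,
   falling back to BUFSIZ if it cannot be determined.  The name
   `preferred_buffer_size' is hypothetical.  */
size_t
preferred_buffer_size (FILE *stream)
{
  struct stat st;

  assert (stream != NULL);
  if (fstat (fileno (stream), &st) == 0 && st.st_blksize > 0)
    return (size_t) st.st_blksize;
  return BUFSIZ;
}
```

The result can be passed as the @var{size} argument to @code{setvbuf}, together with a buffer of at least that many bytes or a null pointer.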
+@end deftypevr + +@comment stdio.h +@comment ANSI +@deftypefun void setbuf (FILE *@var{stream}, char *@var{buf}) +If @var{buf} is a null pointer, the effect of this function is +equivalent to calling @code{setvbuf} with a @var{mode} argument of +@code{_IONBF}. Otherwise, it is equivalent to calling @code{setvbuf} +with @var{buf}, and a @var{mode} of @code{_IOFBF} and a @var{size} +argument of @code{BUFSIZ}. + +The @code{setbuf} function is provided for compatibility with old code; +use @code{setvbuf} in all new programs. +@end deftypefun + +@comment stdio.h +@comment BSD +@deftypefun void setbuffer (FILE *@var{stream}, char *@var{buf}, size_t @var{size}) +If @var{buf} is a null pointer, this function makes @var{stream} unbuffered. +Otherwise, it makes @var{stream} fully buffered using @var{buf} as the +buffer. The @var{size} argument specifies the length of @var{buf}. + +This function is provided for compatibility with old BSD code. Use +@code{setvbuf} instead. +@end deftypefun + +@comment stdio.h +@comment BSD +@deftypefun void setlinebuf (FILE *@var{stream}) +This function makes @var{stream} be line buffered, and allocates the +buffer for you. + +This function is provided for compatibility with old BSD code. Use +@code{setvbuf} instead. +@end deftypefun + +@node Other Kinds of Streams +@section Other Kinds of Streams + +The GNU library provides ways for you to define additional kinds of +streams that do not necessarily correspond to an open file. + +One such type of stream takes input from or writes output to a string. +These kinds of streams are used internally to implement the +@code{sprintf} and @code{sscanf} functions. You can also create such a +stream explicitly, using the functions described in @ref{String Streams}. + +More generally, you can define streams that do input/output to arbitrary +objects using functions supplied by your program. This protocol is +discussed in @ref{Custom Streams}. 
+ +@strong{Portability Note:} The facilities described in this section are +specific to GNU. Other systems or C implementations might or might not +provide equivalent functionality. + +@menu +* String Streams:: Streams that get data from or put data in + a string or memory buffer. +* Obstack Streams:: Streams that store data in an obstack. +* Custom Streams:: Defining your own streams with an arbitrary + input data source and/or output data sink. +@end menu + +@node String Streams +@subsection String Streams + +@cindex stream, for I/O to a string +@cindex string stream +The @code{fmemopen} and @code{open_memstream} functions allow you to do +I/O to a string or memory buffer. These facilities are declared in +@file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} fmemopen (void *@var{buf}, size_t @var{size}, const char *@var{opentype}) +This function opens a stream that allows the access specified by the +@var{opentype} argument, that reads from or writes to the buffer specified +by the argument @var{buf}. This array must be at least @var{size} bytes long. + +If you specify a null pointer as the @var{buf} argument, @code{fmemopen} +dynamically allocates (as with @code{malloc}; @pxref{Unconstrained +Allocation}) an array @var{size} bytes long. This is really only useful +if you are going to write things to the buffer and then read them back +in again, because you have no way of actually getting a pointer to the +buffer (for this, try @code{open_memstream}, below). The buffer is +freed when the stream is closed. + +The argument @var{opentype} is the same as in @code{fopen} +(@pxref{Opening Streams}). If the @var{opentype} specifies +append mode, then the initial file position is set to the first null +character in the buffer. Otherwise the initial file position is at the +beginning of the buffer. + +When a stream open for writing is flushed or closed, a null character +(zero byte) is written at the end of the buffer if it fits.
You +should add an extra byte to the @var{size} argument to account for this. +Attempts to write more than @var{size} bytes to the buffer result +in an error. + +For a stream open for reading, null characters (zero bytes) in the +buffer do not count as ``end of file''. Read operations indicate end of +file only when the file position advances past @var{size} bytes. So, if +you want to read characters from a null-terminated string, you should +supply the length of the string as the @var{size} argument. +@end deftypefun + +Here is an example of using @code{fmemopen} to create a stream for +reading from a string: + +@smallexample +@include memopen.c.texi +@end smallexample + +This program produces the following output: + +@smallexample +Got f +Got o +Got o +Got b +Got a +Got r +@end smallexample + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} open_memstream (char **@var{ptr}, size_t *@var{sizeloc}) +This function opens a stream for writing to a buffer. The buffer is +allocated dynamically (as with @code{malloc}; @pxref{Unconstrained +Allocation}) and grown as necessary. + +When the stream is closed with @code{fclose} or flushed with +@code{fflush}, the locations @var{ptr} and @var{sizeloc} are updated to +contain the pointer to the buffer and its size. The values thus stored +remain valid only as long as no further output on the stream takes +place. If you do more output, you must flush the stream again to store +new values before you use them again. + +A null character is written at the end of the buffer. This null character +is @emph{not} included in the size value stored at @var{sizeloc}. + +You can move the stream's file position with @code{fseek} (@pxref{File +Positioning}). Moving the file position past the end of the data +already written fills the intervening space with zeroes. 
+@end deftypefun + +Here is an example of using @code{open_memstream}: + +@smallexample +@include memstrm.c.texi +@end smallexample + +This program produces the following output: + +@smallexample +buf = `hello', size = 5 +buf = `hello, world', size = 12 +@end smallexample + +@c @group Invalid outside @example. +@node Obstack Streams +@subsection Obstack Streams + +You can open an output stream that puts its data in an obstack. +@xref{Obstacks}. + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} open_obstack_stream (struct obstack *@var{obstack}) +This function opens a stream for writing data into the obstack @var{obstack}. +This starts an object in the obstack and makes it grow as data is +written (@pxref{Growing Objects}). +@c @end group Doubly invalid because not nested right. + +Calling @code{fflush} on this stream updates the current size of the +object to match the amount of data that has been written. After a call +to @code{fflush}, you can examine the object temporarily. + +You can move the file position of an obstack stream with @code{fseek} +(@pxref{File Positioning}). Moving the file position past the end of +the data written fills the intervening space with zeros. + +To make the object permanent, update the obstack with @code{fflush}, and +then use @code{obstack_finish} to finalize the object and get its address. +The following write to the stream starts a new object in the obstack, +and later writes add to that object until you do another @code{fflush} +and @code{obstack_finish}. + +But how do you find out how long the object is? You can get the length +in bytes by calling @code{obstack_object_size} (@pxref{Status of an +Obstack}), or you can null-terminate the object like this: + +@smallexample +obstack_1grow (@var{obstack}, 0); +@end smallexample + +Whichever one you do, you must do it @emph{before} calling +@code{obstack_finish}. (You can do both if you wish.)
+@end deftypefun + +Here is a sample function that uses @code{open_obstack_stream}: + +@smallexample +char * +make_message_string (const char *a, int b) +@{ + FILE *stream = open_obstack_stream (&message_obstack); + output_task (stream); + fprintf (stream, ": "); + fprintf (stream, a, b); + fprintf (stream, "\n"); + fclose (stream); + obstack_1grow (&message_obstack, 0); + return obstack_finish (&message_obstack); +@} +@end smallexample + +@node Custom Streams +@subsection Programming Your Own Custom Streams +@cindex custom streams +@cindex programming your own streams + +This section describes how you can make a stream that gets input from an +arbitrary data source or writes output to an arbitrary data sink +programmed by you. We call these @dfn{custom streams}. + +@c !!! this does not talk at all about the higher-level hooks + +@menu +* Streams and Cookies:: The @dfn{cookie} records where to fetch or + store data that is read or written. +* Hook Functions:: How you should define the four @dfn{hook + functions} that a custom stream needs. +@end menu + +@node Streams and Cookies +@subsubsection Custom Streams and Cookies +@cindex cookie, for custom stream + +Inside every custom stream is a special object called the @dfn{cookie}. +This is an object supplied by you which records where to fetch or store +the data read or written. It is up to you to define a data type to use +for the cookie. The stream functions in the library never refer +directly to its contents, and they don't even know what the type is; +they record its address with type @code{void *}. + +To implement a custom stream, you must specify @emph{how} to fetch or +store the data in the specified place. You do this by defining +@dfn{hook functions} to read, write, change ``file position'', and close +the stream. All four of these functions will be passed the stream's +cookie so they can tell where to fetch or store the data. 
The library +functions don't know what's inside the cookie, but your functions will +know. + +When you create a custom stream, you must specify the cookie pointer, +and also the four hook functions stored in a structure of type +@code{cookie_io_functions_t}. + +These facilities are declared in @file{stdio.h}. +@pindex stdio.h + +@comment stdio.h +@comment GNU +@deftp {Data Type} {cookie_io_functions_t} +This is a structure type that holds the functions that define the +communications protocol between the stream and its cookie. It has +the following members: + +@table @code +@item cookie_read_function_t *read +This is the function that reads data from the cookie. If the value is a +null pointer instead of a function, then read operations on this stream +always return @code{EOF}. + +@item cookie_write_function_t *write +This is the function that writes data to the cookie. If the value is a +null pointer instead of a function, then data written to the stream is +discarded. + +@item cookie_seek_function_t *seek +This is the function that performs the equivalent of file positioning on +the cookie. If the value is a null pointer instead of a function, calls +to @code{fseek} on this stream can only seek to locations within the +buffer; any attempt to seek outside the buffer will return an +@code{ESPIPE} error. + +@item cookie_close_function_t *close +This function performs any appropriate cleanup on the cookie when +closing the stream. If the value is a null pointer instead of a +function, nothing special is done to close the cookie when the stream is +closed. +@end table +@end deftp + +@comment stdio.h +@comment GNU +@deftypefun {FILE *} fopencookie (void *@var{cookie}, const char *@var{opentype}, cookie_io_functions_t @var{io-functions}) +This function actually creates the stream for communicating with the +@var{cookie} using the functions in the @var{io-functions} argument. +The @var{opentype} argument is interpreted as for @code{fopen}; +see @ref{Opening Streams}.
(But note that the ``truncate on +open'' option is ignored.) The new stream is fully buffered. + +The @code{fopencookie} function returns the newly created stream, or a null +pointer in case of an error. +@end deftypefun + +@node Hook Functions +@subsubsection Custom Stream Hook Functions +@cindex hook functions (of custom streams) + +Here are more details on how you should define the four hook functions +that a custom stream needs. + +You should define the function to read data from the cookie as: + +@smallexample +ssize_t @var{reader} (void *@var{cookie}, void *@var{buffer}, size_t @var{size}) +@end smallexample + +This is very similar to the @code{read} function; see @ref{I/O +Primitives}. Your function should transfer up to @var{size} bytes into +the @var{buffer}, and return the number of bytes read, or zero to +indicate end-of-file. You can return a value of @code{-1} to indicate +an error. + +You should define the function to write data to the cookie as: + +@smallexample +ssize_t @var{writer} (void *@var{cookie}, const void *@var{buffer}, size_t @var{size}) +@end smallexample + +This is very similar to the @code{write} function; see @ref{I/O +Primitives}. Your function should transfer up to @var{size} bytes from +the buffer, and return the number of bytes written. You can return a +value of @code{-1} to indicate an error. + +You should define the function to perform seek operations on the cookie +as: + +@smallexample +int @var{seeker} (void *@var{cookie}, fpos_t *@var{position}, int @var{whence}) +@end smallexample + +For this function, the @var{position} and @var{whence} arguments are +interpreted as for @code{fgetpos}; see @ref{Portable Positioning}. In +the GNU library, @code{fpos_t} is equivalent to @code{off_t} or +@code{long int}, and simply represents the number of bytes from the +beginning of the file. + +After doing the seek operation, your function should store the resulting +file position relative to the beginning of the file in @var{position}. 
+Your function should return a value of @code{0} on success and @code{-1} +to indicate an error. + +You should define the function to do cleanup operations on the cookie +appropriate for closing the stream as: + +@smallexample +int @var{cleaner} (void *@var{cookie}) +@end smallexample + +Your function should return @code{-1} to indicate an error, and @code{0} +otherwise. + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_read_function_t +This is the data type that the read function for a custom stream should have. +If you declare the function as shown above, this is the type it will have. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_write_function_t +The data type of the write function for a custom stream. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_seek_function_t +The data type of the seek function for a custom stream. +@end deftp + +@comment stdio.h +@comment GNU +@deftp {Data Type} cookie_close_function_t +The data type of the close function for a custom stream. +@end deftp + +@ignore +Roland says: + +@quotation +There is another set of functions one can give a stream, the +input-room and output-room functions. These functions must +understand stdio internals. To describe how to use these +functions, you also need to document lots of how stdio works +internally (which isn't relevant for other uses of stdio). +Perhaps I can write an interface spec from which you can write +good documentation. But it's pretty complex and deals with lots +of nitty-gritty details. I think it might be better to let this +wait until the rest of the manual is more done and polished. +@end quotation +@end ignore + +@c ??? This section could use an example.
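
Here is a minimal sketch of a write-only custom stream whose cookie is a fixed-size memory area. It is written against the hook types as current versions of the GNU library declare them, where the write hook takes a @code{const char *} buffer and reports failure by returning zero; those details are assumptions that differ slightly from the prototypes shown above. All of the @code{membuf} names are hypothetical.

```c
#define _GNU_SOURCE             /* fopencookie is a GNU extension.  */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* The cookie: a fixed-size memory area plus a count of the bytes
   stored so far.  All names here are hypothetical.  */
struct membuf
{
  char data[64];
  size_t used;
  int closed;
};

/* Write hook: append up to SIZE bytes and return the number actually
   stored.  Returning 0 tells the stream that nothing more fits.  */
static ssize_t
membuf_write (void *cookie, const char *buffer, size_t size)
{
  struct membuf *m = cookie;
  size_t room = sizeof m->data - m->used;

  if (size > room)
    size = room;
  memcpy (m->data + m->used, buffer, size);
  m->used += size;
  return (ssize_t) size;
}

/* Close hook: there is nothing to release, so just record the close.  */
static int
membuf_close (void *cookie)
{
  struct membuf *m = cookie;

  m->closed = 1;
  return 0;
}

/* Open a write-only custom stream on M.  The read and seek hooks are
   left as null pointers.  Returns a null pointer on failure.  */
FILE *
membuf_open (struct membuf *m)
{
  cookie_io_functions_t hooks = { .write = membuf_write,
                                  .close = membuf_close };

  assert (m != NULL);
  m->used = 0;
  m->closed = 0;
  return fopencookie (m, "w", hooks);
}
```

Since the new stream is fully buffered, data written with @code{fprintf} reaches @code{membuf_write} only when the stream is flushed or closed.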
diff --git a/manual/string.texi b/manual/string.texi new file mode 100644 index 0000000000..c638912229 --- /dev/null +++ b/manual/string.texi @@ -0,0 +1,947 @@ +@node String and Array Utilities, Extended Characters, Character Handling, Top +@chapter String and Array Utilities + +Operations on strings (or arrays of characters) are an important part of +many programs. The GNU C library provides an extensive set of string +utility functions, including functions for copying, concatenating, +comparing, and searching strings. Many of these functions can also +operate on arbitrary regions of storage; for example, the @code{memcpy} +function can be used to copy the contents of any kind of array. + +It's fairly common for beginning C programmers to ``reinvent the wheel'' +by duplicating this functionality in their own code, but it pays to +become familiar with the library functions and to make use of them, +since this offers benefits in maintenance, efficiency, and portability. + +For instance, you could easily compare one string to another in two +lines of C code, but if you use the built-in @code{strcmp} function, +you're less likely to make a mistake. And, since these library +functions are typically highly optimized, your program may run faster +too. + +@menu +* Representation of Strings:: Introduction to basic concepts. +* String/Array Conventions:: Whether to use a string function or an + arbitrary array function. +* String Length:: Determining the length of a string. +* Copying and Concatenation:: Functions to copy the contents of strings + and arrays. +* String/Array Comparison:: Functions for byte-wise and character-wise + comparison. +* Collation Functions:: Functions for collating strings. +* Search Functions:: Searching for a specific element or substring. +* Finding Tokens in a String:: Splitting a string into tokens by looking + for delimiters. 
+@end menu + +@node Representation of Strings, String/Array Conventions, , String and Array Utilities +@section Representation of Strings +@cindex string, representation of + +This section is a quick summary of string concepts for beginning C +programmers. It describes how character strings are represented in C +and some common pitfalls. If you are already familiar with this +material, you can skip this section. + +@cindex string +@cindex null character +A @dfn{string} is an array of @code{char} objects. But string-valued +variables are usually declared to be pointers of type @code{char *}. +Such variables do not include space for the text of a string; that has +to be stored somewhere else---in an array variable, a string constant, +or dynamically allocated memory (@pxref{Memory Allocation}). It's up to +you to store the address of the chosen memory space into the pointer +variable. Alternatively you can store a @dfn{null pointer} in the +pointer variable. The null pointer does not point anywhere, so +attempting to reference the string it points to gets an error. + +By convention, a @dfn{null character}, @code{'\0'}, marks the end of a +string. For example, in testing to see whether the @code{char *} +variable @var{p} points to a null character marking the end of a string, +you can write @code{!*@var{p}} or @code{*@var{p} == '\0'}. + +A null character is quite different conceptually from a null pointer, +although both are represented by the integer @code{0}. + +@cindex string literal +@dfn{String literals} appear in C program source as strings of +characters between double-quote characters (@samp{"}). In ANSI C, +string literals can also be formed by @dfn{string concatenation}: +@code{"a" "b"} is the same as @code{"ab"}. Modification of string +literals is not allowed by the GNU C compiler, because literals +are placed in read-only storage. + +Character arrays that are declared @code{const} cannot be modified +either. 
It's generally good style to declare non-modifiable string +pointers to be of type @code{const char *}, since this often allows the +C compiler to detect accidental modifications as well as providing some +amount of documentation about what your program intends to do with the +string. + +The amount of memory allocated for the character array may extend past +the null character that normally marks the end of the string. In this +document, the term @dfn{allocation size} is always used to refer to the +total amount of memory allocated for the string, while the term +@dfn{length} refers to the number of characters up to (but not +including) the terminating null character. +@cindex length of string +@cindex allocation size of string +@cindex size of string +@cindex string length +@cindex string allocation + +A notorious source of program bugs is trying to put more characters in a +string than fit in its allocated size. When writing code that extends +strings or moves characters into a pre-allocated array, you should be +very careful to keep track of the length of the text and make explicit +checks for overflowing the array. Many of the library functions +@emph{do not} do this for you! Remember also that you need to allocate +an extra byte to hold the null character that marks the end of the +string. + +@node String/Array Conventions, String Length, Representation of Strings, String and Array Utilities +@section String and Array Conventions + +This chapter describes both functions that work on arbitrary arrays or +blocks of memory, and functions that are specific to null-terminated +arrays of characters. + +Functions that operate on arbitrary blocks of memory have names +beginning with @samp{mem} (such as @code{memcpy}) and invariably take an +argument which specifies the size (in bytes) of the block of memory to +operate on. 
The array arguments and return values for these functions +have type @code{void *}, and as a matter of style, the elements of these +arrays are referred to as ``bytes''. You can pass any kind of pointer +to these functions, and the @code{sizeof} operator is useful in +computing the value for the size argument. + +In contrast, functions that operate specifically on strings have names +beginning with @samp{str} (such as @code{strcpy}) and look for a null +character to terminate the string instead of requiring an explicit size +argument to be passed. (Some of these functions accept a specified +maximum length, but they also check for premature termination with a +null character.) The array arguments and return values for these +functions have type @code{char *}, and the array elements are referred +to as ``characters''. + +In many cases, there are both @samp{mem} and @samp{str} versions of a +function. The one that is more appropriate to use depends on the exact +situation. When your program is manipulating arbitrary arrays or blocks of +storage, then you should always use the @samp{mem} functions. On the +other hand, when you are manipulating null-terminated strings it is +usually more convenient to use the @samp{str} functions, unless you +already know the length of the string in advance. + +@node String Length, Copying and Concatenation, String/Array Conventions, String and Array Utilities +@section String Length + +You can get the length of a string using the @code{strlen} function. +This function is declared in the header file @file{string.h}. +@pindex string.h + +@comment string.h +@comment ANSI +@deftypefun size_t strlen (const char *@var{s}) +The @code{strlen} function returns the length of the null-terminated +string @var{s}. (In other words, it returns the offset of the terminating +null character within the array.) 
+ +For example, +@smallexample +strlen ("hello, world") + @result{} 12 +@end smallexample + +When applied to a character array, the @code{strlen} function returns +the length of the string stored there, not its allocation size. You can +get the allocation size of the character array that holds a string using +the @code{sizeof} operator: + +@smallexample +char string[32] = "hello, world"; +sizeof (string) + @result{} 32 +strlen (string) + @result{} 12 +@end smallexample +@end deftypefun + +@node Copying and Concatenation, String/Array Comparison, String Length, String and Array Utilities +@section Copying and Concatenation + +You can use the functions described in this section to copy the contents +of strings and arrays, or to append the contents of one string to +another. These functions are declared in the header file +@file{string.h}. +@pindex string.h +@cindex copying strings and arrays +@cindex string copy functions +@cindex array copy functions +@cindex concatenating strings +@cindex string concatenation functions + +A helpful way to remember the ordering of the arguments to the functions +in this section is that it corresponds to an assignment expression, with +the destination array specified to the left of the source array. All +of these functions return the address of the destination array. + +Most of these functions do not work properly if the source and +destination arrays overlap. For example, if the beginning of the +destination array overlaps the end of the source array, the original +contents of that part of the source array may get overwritten before it +is copied. Even worse, in the case of the string functions, the null +character marking the end of the string may be lost, and the copy +function might get stuck in a loop trashing all the memory allocated to +your program. + +All functions that have problems copying between overlapping arrays are +explicitly identified in this manual. 
In addition to functions in this +section, there are a few others like @code{sprintf} (@pxref{Formatted +Output Functions}) and @code{scanf} (@pxref{Formatted Input +Functions}). + +@comment string.h +@comment ANSI +@deftypefun {void *} memcpy (void *@var{to}, const void *@var{from}, size_t @var{size}) +The @code{memcpy} function copies @var{size} bytes from the object +beginning at @var{from} into the object beginning at @var{to}. The +behavior of this function is undefined if the two arrays @var{to} and +@var{from} overlap; use @code{memmove} instead if overlapping is possible. + +The value returned by @code{memcpy} is the value of @var{to}. + +Here is an example of how you might use @code{memcpy} to copy the +contents of an array: + +@smallexample +struct foo *oldarray, *newarray; +int arraysize; +@dots{} +memcpy (newarray, oldarray, arraysize * sizeof (struct foo)); +@end smallexample +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {void *} memmove (void *@var{to}, const void *@var{from}, size_t @var{size}) +@code{memmove} copies the @var{size} bytes at @var{from} into the +@var{size} bytes at @var{to}, even if those two blocks of space +overlap. In the case of overlap, @code{memmove} is careful to copy the +original values of the bytes in the block at @var{from}, including those +bytes which also belong to the block at @var{to}. +@end deftypefun + +@comment string.h +@comment SVID +@deftypefun {void *} memccpy (void *@var{to}, const void *@var{from}, int @var{c}, size_t @var{size}) +This function copies no more than @var{size} bytes from @var{from} to +@var{to}, stopping if a byte matching @var{c} is found. The return +value is a pointer into @var{to} one byte past where @var{c} was copied, +or a null pointer if no byte matching @var{c} appeared in the first +@var{size} bytes of @var{from}.
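
As a sketch of how the return value of @code{memccpy} can be used, the helper below extracts the first field of a colon-separated record. The name @code{first_field} and the record layout are hypothetical.

```c
#define _XOPEN_SOURCE 700       /* Declares memccpy in strict ISO modes.  */
#include <assert.h>
#include <string.h>

/* Copy the first colon-terminated field of RECORD into FIELD, which
   has room for SIZE bytes, and null-terminate it.  Returns the length
   of the field, or -1 if no colon appears within the first SIZE bytes.
   RECORD must contain a colon or be at least SIZE bytes long.  The
   name `first_field' is hypothetical.  */
long
first_field (char *field, const char *record, size_t size)
{
  char *end;

  assert (field != NULL && record != NULL);
  end = memccpy (field, record, ':', size);
  if (end == NULL)
    return -1;
  end[-1] = '\0';               /* Overwrite the copied colon.  */
  return (long) (end - field) - 1;
}
```

For the record @code{"user:x:1000"} and an eight-byte destination, this stores @code{"user"} and returns 4.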
+@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {void *} memset (void *@var{block}, int @var{c}, size_t @var{size}) +This function copies the value of @var{c} (converted to an +@code{unsigned char}) into each of the first @var{size} bytes of the +object beginning at @var{block}. It returns the value of @var{block}. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strcpy (char *@var{to}, const char *@var{from}) +This copies characters from the string @var{from} (up to and including +the terminating null character) into the string @var{to}. Like +@code{memcpy}, this function has undefined results if the strings +overlap. The return value is the value of @var{to}. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strncpy (char *@var{to}, const char *@var{from}, size_t @var{size}) +This function is similar to @code{strcpy} but always copies exactly +@var{size} characters into @var{to}. + +If the length of @var{from} is more than @var{size}, then @code{strncpy} +copies just the first @var{size} characters. Note that in this case +there is no null terminator written into @var{to}. + +If the length of @var{from} is less than @var{size}, then @code{strncpy} +copies all of @var{from}, followed by enough null characters to add up +to @var{size} characters in all. This behavior is rarely useful, but it +is specified by the ANSI C standard. + +The behavior of @code{strncpy} is undefined if the strings overlap. + +Using @code{strncpy} as opposed to @code{strcpy} is a way to avoid bugs +relating to writing past the end of the allocated space for @var{to}. +However, it can also make your program much slower in one common case: +copying a string which is probably small into a potentially large buffer. +In this case, @var{size} may be large, and when it is, @code{strncpy} will +waste a considerable amount of time copying null characters. 
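
Because @code{strncpy} writes no null terminator when @var{from} is too long, callers usually add one themselves. The following sketch shows one common way to do that; the name @code{copy_truncated} is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Copy FROM into TO, which has room for SIZE bytes (SIZE > 0),
   truncating if necessary and always null-terminating the result.
   Returns nonzero if the string was truncated.  The name
   `copy_truncated' is hypothetical.  */
int
copy_truncated (char *to, const char *from, size_t size)
{
  assert (to != NULL && from != NULL && size > 0);
  strncpy (to, from, size - 1);
  to[size - 1] = '\0';          /* strncpy omits this on truncation.  */
  return strlen (from) >= size;
}
```

Copying @code{"hello, world"} into a six-byte array this way leaves @code{"hello"} in the array and returns 1.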
+@end deftypefun + +@comment string.h +@comment SVID +@deftypefun {char *} strdup (const char *@var{s}) +This function copies the null-terminated string @var{s} into a newly +allocated string. The string is allocated using @code{malloc}; see +@ref{Unconstrained Allocation}. If @code{malloc} cannot allocate space +for the new string, @code{strdup} returns a null pointer. Otherwise it +returns a pointer to the new string. +@end deftypefun + +@comment string.h +@comment Unknown origin +@deftypefun {char *} stpcpy (char *@var{to}, const char *@var{from}) +This function is like @code{strcpy}, except that it returns a pointer to +the end of the string @var{to} (that is, the address of the terminating +null character) rather than the beginning. + +For example, this program uses @code{stpcpy} to concatenate @samp{foo} +and @samp{bar} to produce @samp{foobar}, which it then prints. + +@smallexample +@include stpcpy.c.texi +@end smallexample + +This function is not part of the ANSI or POSIX standards, and is not +customary on Unix systems, but we did not invent it either. Perhaps it +comes from MS-DOG. + +Its behavior is undefined if the strings overlap. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strcat (char *@var{to}, const char *@var{from}) +The @code{strcat} function is similar to @code{strcpy}, except that the +characters from @var{from} are concatenated or appended to the end of +@var{to}, instead of overwriting it. That is, the first character from +@var{from} overwrites the null character marking the end of @var{to}. + +An equivalent definition for @code{strcat} would be: + +@smallexample +char * +strcat (char *to, const char *from) +@{ + strcpy (to + strlen (to), from); + return to; +@} +@end smallexample + +This function has undefined results if the strings overlap. 
+@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strncat (char *@var{to}, const char *@var{from}, size_t @var{size}) +This function is like @code{strcat} except that not more than @var{size} +characters from @var{from} are appended to the end of @var{to}. A +single null character is also always appended to @var{to}, so the total +allocated size of @var{to} must be at least @code{@var{size} + 1} bytes +longer than its initial length. + +The @code{strncat} function could be implemented like this: + +@smallexample +@group +char * +strncat (char *to, const char *from, size_t size) +@{ + to[strlen (to) + size] = '\0'; + strncpy (to + strlen (to), from, size); + return to; +@} +@end group +@end smallexample + +The behavior of @code{strncat} is undefined if the strings overlap. +@end deftypefun + +Here is an example showing the use of @code{strncpy} and @code{strncat}. +Notice how, in the call to @code{strncat}, the @var{size} parameter +is computed to avoid overflowing the character array @code{buffer}. + +@smallexample +@include strncat.c.texi +@end smallexample + +@noindent +The output produced by this program looks like: + +@smallexample +hello +hello, wo +@end smallexample + +@comment string.h +@comment BSD +@deftypefun void bcopy (const void *@var{from}, void *@var{to}, size_t @var{size}) +This is a partially obsolete alternative for @code{memmove}, derived from +BSD. Note that it is not quite equivalent to @code{memmove}, because the +arguments are not in the same order. +@end deftypefun + +@comment string.h +@comment BSD +@deftypefun void bzero (void *@var{block}, size_t @var{size}) +This is a partially obsolete alternative for @code{memset}, derived from +BSD. Note that it is not as general as @code{memset}, because the only +value it can store is zero.
+@end deftypefun + +@node String/Array Comparison, Collation Functions, Copying and Concatenation, String and Array Utilities +@section String/Array Comparison +@cindex comparing strings and arrays +@cindex string comparison functions +@cindex array comparison functions +@cindex predicates on strings +@cindex predicates on arrays + +You can use the functions in this section to perform comparisons on the +contents of strings and arrays. As well as checking for equality, these +functions can also be used as the ordering functions for sorting +operations. @xref{Searching and Sorting}, for an example of this. + +Unlike most comparison operations in C, the string comparison functions +return a nonzero value if the strings are @emph{not} equivalent rather +than if they are. The sign of the value indicates the relative ordering +of the first characters in the strings that are not equivalent: a +negative value indicates that the first string is ``less'' than the +second, while a positive value indicates that the first string is +``greater''. + +The most common use of these functions is to check only for equality. +This is canonically done with an expression like @w{@samp{! strcmp (s1, s2)}}. + +All of these functions are declared in the header file @file{string.h}. +@pindex string.h + +@comment string.h +@comment ANSI +@deftypefun int memcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size}) +The function @code{memcmp} compares the @var{size} bytes of memory +beginning at @var{a1} against the @var{size} bytes of memory beginning +at @var{a2}. The value returned has the same sign as the difference +between the first differing pair of bytes (interpreted as @code{unsigned +char} objects, then promoted to @code{int}). + +If the contents of the two blocks are equal, @code{memcmp} returns +@code{0}. +@end deftypefun + +On arbitrary arrays, the @code{memcmp} function is mostly useful for +testing equality. 
It usually isn't meaningful to do byte-wise ordering +comparisons on arrays of things other than bytes. For example, a +byte-wise comparison on the bytes that make up floating-point numbers +isn't likely to tell you anything about the relationship between the +values of the floating-point numbers. + +You should also be careful about using @code{memcmp} to compare objects +that can contain ``holes'', such as the padding inserted into structure +objects to enforce alignment requirements, extra space at the end of +unions, and extra characters at the ends of strings whose length is less +than their allocated size. The contents of these ``holes'' are +indeterminate and may cause strange behavior when performing byte-wise +comparisons. For more predictable results, perform an explicit +component-wise comparison. + +For example, given a structure type definition like: + +@smallexample +struct foo + @{ + unsigned char tag; + union + @{ + double f; + long i; + char *p; + @} value; + @}; +@end smallexample + +@noindent +you are better off writing a specialized comparison function to compare +@code{struct foo} objects instead of comparing them with @code{memcmp}. + +@comment string.h +@comment ANSI +@deftypefun int strcmp (const char *@var{s1}, const char *@var{s2}) +The @code{strcmp} function compares the string @var{s1} against +@var{s2}, returning a value that has the same sign as the difference +between the first differing pair of characters (interpreted as +@code{unsigned char} objects, then promoted to @code{int}). + +If the two strings are equal, @code{strcmp} returns @code{0}. + +A consequence of the ordering used by @code{strcmp} is that if @var{s1} +is an initial substring of @var{s2}, then @var{s1} is considered to be +``less than'' @var{s2}. +@end deftypefun + +@comment string.h +@comment BSD +@deftypefun int strcasecmp (const char *@var{s1}, const char *@var{s2}) +This function is like @code{strcmp}, except that differences in case +are ignored. 
+ +@code{strcasecmp} is derived from BSD. +@end deftypefun + +@comment string.h +@comment BSD +@deftypefun int strncasecmp (const char *@var{s1}, const char *@var{s2}, size_t @var{n}) +This function is like @code{strncmp}, except that differences in case +are ignored. + +@code{strncasecmp} is a GNU extension. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun int strncmp (const char *@var{s1}, const char *@var{s2}, size_t @var{size}) +This function is similar to @code{strcmp}, except that no more than +@var{size} characters are compared. In other words, if the two strings are +the same in their first @var{size} characters, the return value is zero. +@end deftypefun + +Here are some examples showing the use of @code{strcmp} and @code{strncmp}. +These examples assume the use of the ASCII character set. (If some +other character set---say, EBCDIC---is used instead, then the glyphs +are associated with different numeric codes, and the return values +and ordering may differ.) + +@smallexample +strcmp ("hello", "hello") + @result{} 0 /* @r{These two strings are the same.} */ +strcmp ("hello", "Hello") + @result{} 32 /* @r{Comparisons are case-sensitive.} */ +strcmp ("hello", "world") + @result{} -15 /* @r{The character @code{'h'} comes before @code{'w'}.} */ +strcmp ("hello", "hello, world") + @result{} -44 /* @r{Comparing a null character against a comma.} */ +strncmp ("hello", "hello, world", 5) + @result{} 0 /* @r{The initial 5 characters are the same.} */ +strncmp ("hello, world", "hello, stupid world!!!", 5) + @result{} 0 /* @r{The initial 5 characters are the same.} */ +@end smallexample + +@comment string.h +@comment BSD +@deftypefun int bcmp (const void *@var{a1}, const void *@var{a2}, size_t @var{size}) +This is an obsolete alias for @code{memcmp}, derived from BSD.
+@end deftypefun + +@node Collation Functions, Search Functions, String/Array Comparison, String and Array Utilities +@section Collation Functions + +@cindex collating strings +@cindex string collation functions + +In some locales, the conventions for lexicographic ordering differ from +the strict numeric ordering of character codes. For example, in Spanish +most glyphs with diacritical marks such as accents are not considered +distinct letters for the purposes of collation. On the other hand, the +two-character sequence @samp{ll} is treated as a single letter that is +collated immediately after @samp{l}. + +You can use the functions @code{strcoll} and @code{strxfrm} (declared in +the header file @file{string.h}) to compare strings using a collation +ordering appropriate for the current locale. The locale used by these +functions in particular can be specified by setting the locale for the +@code{LC_COLLATE} category; see @ref{Locales}. +@pindex string.h + +In the standard C locale, the collation sequence for @code{strcoll} is +the same as that for @code{strcmp}. + +Effectively, the way these functions work is by applying a mapping to +transform the characters in a string to a byte sequence that represents +the string's position in the collating sequence of the current locale. +Comparing two such byte sequences in a simple fashion is equivalent to +comparing the strings with the locale's collating sequence. + +The function @code{strcoll} performs this translation implicitly, in +order to do one comparison. By contrast, @code{strxfrm} performs the +mapping explicitly. If you are making multiple comparisons using the +same string or set of strings, it is likely to be more efficient to use +@code{strxfrm} to transform all the strings just once, and subsequently +compare the transformed strings with @code{strcmp}. 
+ +@comment string.h +@comment ANSI +@deftypefun int strcoll (const char *@var{s1}, const char *@var{s2}) +The @code{strcoll} function is similar to @code{strcmp} but uses the +collating sequence of the current locale for collation (the +@code{LC_COLLATE} locale). +@end deftypefun + +Here is an example of sorting an array of strings, using @code{strcoll} +to compare them. The actual sort algorithm is not written here; it +comes from @code{qsort} (@pxref{Array Sort Function}). The job of the +code shown here is to say how to compare the strings while sorting them. +(Later on in this section, we will show a way to do this more +efficiently using @code{strxfrm}.) + +@smallexample +/* @r{This is the comparison function used with @code{qsort}.} */ + +int +compare_elements (const void *v1, const void *v2) +@{ + char * const *p1 = v1; + char * const *p2 = v2; + + return strcoll (*p1, *p2); +@} + +/* @r{This is the entry point---the function to sort} + @r{strings using the locale's collating sequence.} */ + +void +sort_strings (char **array, int nstrings) +@{ + /* @r{Sort @code{array} by comparing the strings.} */ + qsort (array, nstrings, + sizeof (char *), compare_elements); +@} +@end smallexample + +@cindex converting string to collation order +@comment string.h +@comment ANSI +@deftypefun size_t strxfrm (char *@var{to}, const char *@var{from}, size_t @var{size}) +The function @code{strxfrm} transforms the string @var{from} using the collation +transformation determined by the locale currently selected for +collation, and stores the transformed string in the array @var{to}. Up +to @var{size} characters (including a terminating null character) are +stored. + +The behavior is undefined if the strings @var{to} and @var{from} +overlap; see @ref{Copying and Concatenation}. + +The return value is the length of the entire transformed string. This +value is not affected by the value of @var{size}, but if it is greater +than @var{size}, it means that the transformed string did not entirely +fit in the array @var{to}.
In this case, only as much of the string as +actually fits was stored. To get the whole transformed string, call +@code{strxfrm} again with a bigger output array. + +The transformed string may be longer than the original string, and it +may also be shorter. + +If @var{size} is zero, no characters are stored in @var{to}. In this +case, @code{strxfrm} simply returns the number of characters that would +be the length of the transformed string. This is useful for determining +what size string to allocate. It does not matter what @var{to} is if +@var{size} is zero; @var{to} may even be a null pointer. +@end deftypefun + +Here is an example of how you can use @code{strxfrm} when +you plan to do many comparisons. It does the same thing as the previous +example, but much faster, because it has to transform each string only +once, no matter how many times it is compared with other strings. Even +the time needed to allocate and free storage is much less than the time +we save, when there are many strings. + +@smallexample +struct sorter @{ char *input; char *transformed; @}; + +/* @r{This is the comparison function used with @code{qsort}} + @r{to sort an array of @code{struct sorter}.} */ + +int +compare_elements (const void *v1, const void *v2) +@{ + const struct sorter *p1 = v1; + const struct sorter *p2 = v2; + + return strcmp (p1->transformed, p2->transformed); +@} + +/* @r{This is the entry point---the function to sort} + @r{strings using the locale's collating sequence.} */ + +void +sort_strings_fast (char **array, int nstrings) +@{ + struct sorter temp_array[nstrings]; + int i; + + /* @r{Set up @code{temp_array}.
Each element contains} + @r{one input string and its transformed string.} */ + for (i = 0; i < nstrings; i++) + @{ + /* @r{Add one so that the size is never zero, even for an empty string.} */ + size_t length = strlen (array[i]) * 2 + 1; + + temp_array[i].input = array[i]; + + /* @r{Transform @code{array[i]}.} + @r{First try a buffer probably big enough.} */ + while (1) + @{ + char *transformed = (char *) xmalloc (length); + if (strxfrm (transformed, array[i], length) < length) + @{ + temp_array[i].transformed = transformed; + break; + @} + /* @r{Try again with a bigger buffer.} */ + free (transformed); + length *= 2; + @} + @} + + /* @r{Sort @code{temp_array} by comparing transformed strings.} */ + qsort (temp_array, nstrings, + sizeof (struct sorter), compare_elements); + + /* @r{Put the elements back in the permanent array} + @r{in their sorted order.} */ + for (i = 0; i < nstrings; i++) + array[i] = temp_array[i].input; + + /* @r{Free the strings we allocated.} */ + for (i = 0; i < nstrings; i++) + free (temp_array[i].transformed); +@} +@end smallexample + +@strong{Compatibility Note:} The string collation functions are a new +feature of ANSI C. Older C dialects have no equivalent feature. + +@node Search Functions, Finding Tokens in a String, Collation Functions, String and Array Utilities +@section Search Functions + +This section describes library functions which perform various kinds +of searching operations on strings and arrays. These functions are +declared in the header file @file{string.h}. +@pindex string.h +@cindex search functions (for strings) +@cindex string search functions + +@comment string.h +@comment ANSI +@deftypefun {void *} memchr (const void *@var{block}, int @var{c}, size_t @var{size}) +This function finds the first occurrence of the byte @var{c} (converted +to an @code{unsigned char}) in the initial @var{size} bytes of the +object beginning at @var{block}. The return value is a pointer to the +located byte, or a null pointer if no match was found.
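Because @code{memchr} is given an explicit size, it can search data that contains embedded null characters. The following is only a sketch of that use; the helper name @code{byte_offset} is hypothetical, and it converts the returned pointer into an offset.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a library function: return the offset of
   the first byte equal to C within the SIZE bytes at BLOCK, or SIZE
   if no such byte is present.  */
size_t
byte_offset (const void *block, int c, size_t size)
{
  const unsigned char *start = block;
  const unsigned char *found = memchr (block, c, size);

  /* memchr returns a null pointer when there is no match.  */
  return found != NULL ? (size_t) (found - start) : size;
}
```

Returning @var{size} for the "not found" case is a design choice of this sketch; a caller could equally test the pointer returned by @code{memchr} directly.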
+@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strchr (const char *@var{string}, int @var{c}) +The @code{strchr} function finds the first occurrence of the character +@var{c} (converted to a @code{char}) in the null-terminated string +beginning at @var{string}. The return value is a pointer to the located +character, or a null pointer if no match was found. + +For example, +@smallexample +strchr ("hello, world", 'l') + @result{} "llo, world" +strchr ("hello, world", '?') + @result{} NULL +@end smallexample + +The terminating null character is considered to be part of the string, +so you can use this function to get a pointer to the end of a string by +specifying a null character as the value of the @var{c} argument. +@end deftypefun + +@comment string.h +@comment BSD +@deftypefun {char *} index (const char *@var{string}, int @var{c}) +@code{index} is another name for @code{strchr}; they are exactly the same. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strrchr (const char *@var{string}, int @var{c}) +The function @code{strrchr} is like @code{strchr}, except that it searches +backwards from the end of the string @var{string} (instead of forwards +from the front). + +For example, +@smallexample +strrchr ("hello, world", 'l') + @result{} "ld" +@end smallexample +@end deftypefun + +@comment string.h +@comment BSD +@deftypefun {char *} rindex (const char *@var{string}, int @var{c}) +@code{rindex} is another name for @code{strrchr}; they are exactly the same. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strstr (const char *@var{haystack}, const char *@var{needle}) +This is like @code{strchr}, except that it searches @var{haystack} for a +substring @var{needle} rather than just a single character. It +returns a pointer into the string @var{haystack} that is the first +character of the substring, or a null pointer if no match was found.
If +@var{needle} is an empty string, the function returns @var{haystack}. + +For example, +@smallexample +strstr ("hello, world", "l") + @result{} "llo, world" +strstr ("hello, world", "wo") + @result{} "world" +@end smallexample +@end deftypefun + + +@comment string.h +@comment GNU +@deftypefun {void *} memmem (const void *@var{needle}, size_t @var{needle-len},@*const void *@var{haystack}, size_t @var{haystack-len}) +This is like @code{strstr}, but @var{needle} and @var{haystack} are byte +arrays rather than null-terminated strings. @var{needle-len} is the +length of @var{needle} and @var{haystack-len} is the length of +@var{haystack}.@refill + +This function is a GNU extension. +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun size_t strspn (const char *@var{string}, const char *@var{skipset}) +The @code{strspn} (``string span'') function returns the length of the +initial substring of @var{string} that consists entirely of characters that +are members of the set specified by the string @var{skipset}. The order +of the characters in @var{skipset} is not important. + +For example, +@smallexample +strspn ("hello, world", "abcdefghijklmnopqrstuvwxyz") + @result{} 5 +@end smallexample +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun size_t strcspn (const char *@var{string}, const char *@var{stopset}) +The @code{strcspn} (``string complement span'') function returns the length +of the initial substring of @var{string} that consists entirely of characters +that are @emph{not} members of the set specified by the string @var{stopset}. +(In other words, it returns the offset of the first character in @var{string} +that is a member of the set @var{stopset}.) 
+ +For example, +@smallexample +strcspn ("hello, world", " \t\n,.;!?") + @result{} 5 +@end smallexample +@end deftypefun + +@comment string.h +@comment ANSI +@deftypefun {char *} strpbrk (const char *@var{string}, const char *@var{stopset}) +The @code{strpbrk} (``string pointer break'') function is related to +@code{strcspn}, except that it returns a pointer to the first character +in @var{string} that is a member of the set @var{stopset} instead of the +length of the initial substring. It returns a null pointer if no such +character from @var{stopset} is found. + +@c @group Invalid outside the example. +For example, + +@smallexample +strpbrk ("hello, world", " \t\n,.;!?") + @result{} ", world" +@end smallexample +@c @end group +@end deftypefun + +@node Finding Tokens in a String, , Search Functions, String and Array Utilities +@section Finding Tokens in a String + +@c !!! Document strsep, which is a better thing to use than strtok. + +@cindex tokenizing strings +@cindex breaking a string into tokens +@cindex parsing tokens from a string +It's fairly common for programs to have a need to do some simple kinds +of lexical analysis and parsing, such as splitting a command string up +into tokens. You can do this with the @code{strtok} function, declared +in the header file @file{string.h}. +@pindex string.h + +@comment string.h +@comment ANSI +@deftypefun {char *} strtok (char *@var{newstring}, const char *@var{delimiters}) +A string can be split into tokens by making a series of calls to the +function @code{strtok}. + +The string to be split up is passed as the @var{newstring} argument on +the first call only. The @code{strtok} function uses this to set up +some internal state information. Subsequent calls to get additional +tokens from the same string are indicated by passing a null pointer as +the @var{newstring} argument. Calling @code{strtok} with another +non-null @var{newstring} argument reinitializes the state information. 
+It is guaranteed that no other library function ever calls @code{strtok} +behind your back (which would mess up this internal state information). + +The @var{delimiters} argument is a string that specifies a set of delimiters +that may surround the token being extracted. All the initial characters +that are members of this set are discarded. The first character that is +@emph{not} a member of this set of delimiters marks the beginning of the +next token. The end of the token is found by looking for the next +character that is a member of the delimiter set. This character in the +original string @var{newstring} is overwritten by a null character, and the +pointer to the beginning of the token in @var{newstring} is returned. + +On the next call to @code{strtok}, the searching begins at the next +character beyond the one that marked the end of the previous token. +Note that the set of delimiters @var{delimiters} does not have to be the +same on every call in a series of calls to @code{strtok}. + +If the end of the string @var{newstring} is reached, or if the remainder of +the string consists only of delimiter characters, @code{strtok} returns +a null pointer. +@end deftypefun + +@strong{Warning:} Since @code{strtok} alters the string it is parsing, +you should always copy the string to a temporary buffer before parsing it with +@code{strtok}. If you allow @code{strtok} to modify a string that came +from another part of your program, you are asking for trouble; that +string may be part of a data structure that could be used for other +purposes during the parsing, when alteration by @code{strtok} makes the +data structure temporarily inaccurate. + +The string that you are operating on might even be a constant. Then +when @code{strtok} tries to modify it, your program will get a fatal +signal for writing in read-only memory. @xref{Program Error Signals}.
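One way to follow this advice is to let @code{strtok} work only on a private copy. The sketch below assumes a hypothetical helper named @code{count_tokens}; it allocates the copy with @code{malloc}, so the caller's string is never modified, even if it is a constant.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a library function: count the tokens in
   STRING without modifying it.  Returns -1 if memory for the working
   copy cannot be allocated.  */
int
count_tokens (const char *string, const char *delimiters)
{
  size_t size = strlen (string) + 1;
  char *copy = malloc (size);
  char *token;
  int count = 0;

  if (copy == NULL)
    return -1;
  memcpy (copy, string, size);

  /* strtok overwrites delimiters in COPY, never in STRING.  */
  for (token = strtok (copy, delimiters);
       token != NULL;
       token = strtok (NULL, delimiters))
    count++;

  free (copy);
  return count;
}
```

Like any use of @code{strtok}, this helper resets the function's internal state, so it should not be called in the middle of another series of @code{strtok} calls.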
+ +This is a special case of a general principle: if a part of a program +does not have as its purpose the modification of a certain data +structure, then it is error-prone to modify the data structure +temporarily. + +The function @code{strtok} is not reentrant. @xref{Nonreentrancy}, for +a discussion of where and why reentrancy is important. + +Here is a simple example showing the use of @code{strtok}. + +@comment Yes, this example has been tested. +@smallexample +#include <string.h> +#include <stddef.h> + +@dots{} + +char string[] = "words separated by spaces -- and, punctuation!"; +const char delimiters[] = " .,;:!-"; +char *token; + +@dots{} + +token = strtok (string, delimiters); /* token => "words" */ +token = strtok (NULL, delimiters); /* token => "separated" */ +token = strtok (NULL, delimiters); /* token => "by" */ +token = strtok (NULL, delimiters); /* token => "spaces" */ +token = strtok (NULL, delimiters); /* token => "and" */ +token = strtok (NULL, delimiters); /* token => "punctuation" */ +token = strtok (NULL, delimiters); /* token => NULL */ +@end smallexample diff --git a/manual/summary.awk b/manual/summary.awk new file mode 100644 index 0000000000..2eade0c20d --- /dev/null +++ b/manual/summary.awk @@ -0,0 +1,110 @@ +# awk script to create summary.texinfo from the library texinfo files. + +# Copyright (C) 1992, 1993 Free Software Foundation, Inc. +# This file is part of the GNU C Library. + +# The GNU C Library is free software; you can redistribute it and/or +# modify it under the terms of the GNU Library General Public License +# as published by the Free Software Foundation; either version 2 of +# the License, or (at your option) any later version. + +# The GNU C Library is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +# Library General Public License for more details. 
+ +# You should have received a copy of the GNU Library General Public +# License along with the GNU C Library; see the file COPYING.LIB. If +# not, write to the Free Software Foundation, Inc., 675 Mass Ave, +# Cambridge, MA 02139, USA. + +# This script recognizes sequences that look like: +# @comment HEADER.h +# @comment STANDARD +# @def... ITEM | @item ITEM | @vindex ITEM + +BEGIN { header = 0; +nameword["@defun"]=1 +nameword["@defmac"]=1 +nameword["@defspec"]=1 +nameword["@defvar"]=1 +nameword["@defopt"]=1 +nameword["@deffn"]=2 +nameword["@defvr"]=2 +nameword["@deftp"]=2 +nameword["@deftypefun"]=2 +nameword["@deftypevar"]=2 +nameword["@deftypefn"]=3 +nameword["@deftypevr"]=3 +firstword["@defun"]=1 +firstword["@defmac"]=1 +firstword["@defspec"]=1 +firstword["@defvar"]=1 +firstword["@defopt"]=1 +firstword["@deffn"]=2 +firstword["@defvr"]=2 +firstword["@deftp"]=2 +firstword["@deftypefun"]=1 +firstword["@deftypevar"]=1 +firstword["@deftypefn"]=2 +firstword["@deftypevr"]=2 +nameword["@item"]=1 +firstword["@item"]=1 +nameword["@itemx"]=1 +firstword["@itemx"]=1 +nameword["@vindex"]=1 +firstword["@vindex"]=1 + +print "@c DO NOT EDIT THIS FILE!" +print "@c This file is generated by summary.awk from the Texinfo sources." +} + +$1 == "@node" { node=$2; + for (i = 3; i <= NF; ++i) + { node=node " " $i; if ( $i ~ /,/ ) break; } + } + +$1 == "@comment" && $2 ~ /\.h$/ { header="@file{" $2 "}"; + for (i = 3; i <= NF; ++i) + header=header ", @file{" $i "}" + } + +$1 == "@comment" && $2 == "(none)" { header = -1; } + +$1 == "@comment" && header != 0 { std=$2; + for (i=3;i<=NF;++i) std=std " " $i } + +header != 0 && $1 ~ /@def|@item|@vindex/ \ + { defn=""; name=""; curly=0; n=1; + for (i = 2; i <= NF; ++i) { + if ($i ~ /^{/ && $i !~ /}/) { + curly=1 + word=substr ($i, 2, length ($i)) + } + else { + if (curly) { + if ($i ~ /}$/) { + curly=0 + word=word " " substr ($i, 1, length ($i) - 1) + } else + word=word " " $i + } + # Handle a single word in braces. 
+ else if ($i ~ /^{.*}$/) + word=substr ($i, 2, length ($i) - 2) + else + word=$i + if (!curly) { + if (n >= firstword[$1]) + defn=defn " " word + if (n == nameword[$1]) + name=word + ++n + } + } + } + printf "@comment %s%c", name, 012 # FF + printf "@item%s%c%c", defn, 012, 012 + if (header != -1) printf "%s ", header; + printf "(%s): @ref{%s}.%c\n", std, node, 012; + header = 0 } diff --git a/manual/sysinfo.texi b/manual/sysinfo.texi new file mode 100644 index 0000000000..a30536db6e --- /dev/null +++ b/manual/sysinfo.texi @@ -0,0 +1,180 @@ +@node System Information, System Configuration, Users and Groups, Top +@chapter System Information + +This chapter describes functions that return information about the +particular machine that is in use---the type of hardware, the type of +software, and the individual machine's name. + +@menu +* Host Identification:: Determining the name of the machine. +* Hardware/Software Type ID:: Determining the hardware type of the + machine and what operating system it is + running. +@end menu + + +@node Host Identification +@section Host Identification + +This section explains how to identify the particular machine that your +program is running on. The identification of a machine consists of its +Internet host name and Internet address; see @ref{Internet Namespace}. +The host name should always be a fully qualified domain name, like +@w{@samp{crispy-wheats-n-chicken.ai.mit.edu}}, not a simple name like +just @w{@samp{crispy-wheats-n-chicken}}. + +@pindex hostname +@pindex hostid +@pindex unistd.h +Prototypes for these functions appear in @file{unistd.h}. The shell +commands @code{hostname} and @code{hostid} work by calling them. + +@comment unistd.h +@comment BSD +@deftypefun int gethostname (char *@var{name}, size_t @var{size}) +This function returns the name of the host machine in the array +@var{name}. The @var{size} argument specifies the size of this array, +in bytes. 
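As a minimal sketch of this calling convention, a caller passes the array together with its size. The wrapper name @code{get_host_name} is hypothetical. The wrapper also stores a null character in the last byte, as a defensive assumption that a truncated name might not be terminated.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper, not a library function: store the host name
   in the SIZE-byte array NAME.  Returns 0 on success and -1 on
   failure, following the convention of gethostname itself.  */
int
get_host_name (char *name, size_t size)
{
  if (size == 0 || gethostname (name, size) != 0)
    return -1;

  /* Defensive assumption: make sure NAME is terminated even if the
     system filled the whole array.  */
  name[size - 1] = '\0';
  return 0;
}
```

A caller might declare @code{char name[256];} and pass @code{sizeof name} as the size; the figure 256 is an arbitrary choice for illustration, not a documented limit.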
+ +The return value is @code{0} on success and @code{-1} on failure. In +the GNU C library, @code{gethostname} fails if @var{size} is not large +enough; then you can try again with a larger array. The following +@code{errno} error condition is defined for this function: + +@table @code +@item ENAMETOOLONG +The @var{size} argument is less than the size of the host name plus one. +@end table + +@pindex sys/param.h +On some systems, there is a symbol for the maximum possible host name +length: @code{MAXHOSTNAMELEN}. It is defined in @file{sys/param.h}. +But you can't count on this to exist, so it is cleaner to handle +failure and try again. + +@code{gethostname} stores the beginning of the host name in @var{name} +even if the host name won't entirely fit. For some purposes, a +truncated host name is good enough. If it is, you can ignore the +error code. +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int sethostname (const char *@var{name}, size_t @var{length}) +The @code{sethostname} function sets the name of the host machine to +@var{name}, a string with length @var{length}. Only privileged +processes are allowed to do this. Usually it happens just once, at +system boot time. + +The return value is @code{0} on success and @code{-1} on failure. +The following @code{errno} error condition is defined for this function: + +@table @code +@item EPERM +This process cannot set the host name because it is not privileged. +@end table +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun {long int} gethostid (void) +This function returns the ``host ID'' of the machine the program is +running on. By convention, this is usually the primary Internet address +of that machine, converted to a @w{@code{long int}}. However, on some +systems it is a meaningless but unique number which is hard-coded for +each machine.
+@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int sethostid (long int @var{id}) +The @code{sethostid} function sets the ``host ID'' of the host machine +to @var{id}. Only privileged processes are allowed to do this. Usually +it happens just once, at system boot time. + +The return value is @code{0} on success and @code{-1} on failure. +The following @code{errno} error condition is defined for this function: + +@table @code +@item EPERM +This process cannot set the host ID because it is not privileged. + +@item ENOSYS +The operating system does not support setting the host ID. On some +systems, the host ID is a meaningless but unique number hard-coded for +each machine. +@end table +@end deftypefun + +@node Hardware/Software Type ID +@section Hardware/Software Type Identification + +You can use the @code{uname} function to find out some information about +the type of computer your program is running on. This function and the +associated data type are declared in the header file +@file{sys/utsname.h}. +@pindex sys/utsname.h + +@comment sys/utsname.h +@comment POSIX.1 +@deftp {Data Type} {struct utsname} +The @code{utsname} structure is used to hold information returned +by the @code{uname} function. It has the following members: + +@table @code +@item char sysname[] +This is the name of the operating system in use. + +@item char nodename[] +This is the network name of this particular computer. In the GNU +library, the value is the same as that returned by @code{gethostname}; +see @ref{Host Identification}. + +@item char release[] +This is the current release level of the operating system implementation. + +@item char version[] +This is the current version level within the release of the operating +system. + +@item char machine[] +This is a description of the type of hardware that is in use. + +Some systems provide a mechanism to interrogate the kernel directly for +this information.
On systems without such a mechanism, the GNU C +library fills in this field based on the configuration name that was +specified when building and installing the library. + +GNU uses a three-part name to describe a system configuration; the three +parts are @var{cpu}, @var{manufacturer} and @var{system-type}, and they +are separated with dashes. Any possible combination of three names is +potentially meaningful, but most such combinations are meaningless in +practice and even the meaningful ones are not necessarily supported by +any particular GNU program. + +Since the value in @code{machine} is supposed to describe just the +hardware, it consists of the first two parts of the configuration name: +@samp{@var{cpu}-@var{manufacturer}}. For example, it might be one of these: + +@quotation +@code{"sparc-sun"}, +@code{"i386-@var{anything}"}, +@code{"m68k-hp"}, +@code{"m68k-sony"}, +@code{"m68k-sun"}, +@code{"mips-dec"} +@end quotation +@end table +@end deftp + +@comment sys/utsname.h +@comment POSIX.1 +@deftypefun int uname (struct utsname *@var{info}) +The @code{uname} function fills in the structure pointed to by +@var{info} with information about the operating system and host machine. +A non-negative value indicates that the data was successfully stored. + +@code{-1} as the value indicates an error. The only error possible is +@code{EFAULT}, which we normally don't mention as it is always a +possibility. +@end deftypefun diff --git a/manual/terminal.texi b/manual/terminal.texi new file mode 100644 index 0000000000..a9593ccfc5 --- /dev/null +++ b/manual/terminal.texi @@ -0,0 +1,1787 @@ +@node Low-Level Terminal Interface +@chapter Low-Level Terminal Interface + +This chapter describes functions that are specific to terminal devices. 
+You can use these functions to do things like turn off input echoing; +set serial line characteristics such as line speed and flow control; and +change which characters are used for end-of-file, command-line editing, +sending signals, and similar control functions. + +Most of the functions in this chapter operate on file descriptors. +@xref{Low-Level I/O}, for more information about what a file +descriptor is and how to open a file descriptor for a terminal device. + +@menu +* Is It a Terminal:: How to determine if a file is a terminal + device, and what its name is. +* I/O Queues:: About flow control and typeahead. +* Canonical or Not:: Two basic styles of input processing. +* Terminal Modes:: How to examine and modify flags controlling + details of terminal I/O: echoing, + signals, editing. +* Line Control:: Sending break sequences, clearing + terminal buffers @dots{} +* Noncanon Example:: How to read single characters without echo. +@end menu + +@node Is It a Terminal +@section Identifying Terminals +@cindex terminal identification +@cindex identifying terminals + +The functions described in this chapter only work on files that +correspond to terminal devices. You can find out whether a file +descriptor is associated with a terminal by using the @code{isatty} +function. + +@pindex unistd.h +Prototypes for both @code{isatty} and @code{ttyname} are declared in +the header file @file{unistd.h}. + +@comment unistd.h +@comment POSIX.1 +@deftypefun int isatty (int @var{filedes}) +This function returns @code{1} if @var{filedes} is a file descriptor +associated with an open terminal device, and @code{0} otherwise. +@end deftypefun + +If a file descriptor is associated with a terminal, you can get its +associated file name using the @code{ttyname} function. See also the +@code{ctermid} function, described in @ref{Identifying the Terminal}. 
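As a minimal sketch of how the two functions fit together, the following fragment reports whether a descriptor refers to a terminal and, if so, which one. The helper names terminal_name and describe_descriptor are hypothetical and chosen only for illustration; they are not part of the library.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of the library: return the file name
   of the terminal open on DESC, or a null pointer if DESC is not a
   terminal or its name cannot be determined.  The string returned by
   ttyname is statically allocated, so copy it if you need to keep it
   across another call to ttyname.  */
const char *
terminal_name (int desc)
{
  if (!isatty (desc))
    return NULL;
  return ttyname (desc);
}

/* Hypothetical helper: print a one-line description of DESC.  */
void
describe_descriptor (int desc)
{
  const char *name = terminal_name (desc);

  if (name != NULL)
    printf ("descriptor %d is the terminal %s\n", desc, name);
  else
    printf ("descriptor %d is not a terminal\n", desc);
}
```

A program might call describe_descriptor (STDIN_FILENO) at startup to decide whether to prompt interactively.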
+ +@comment unistd.h +@comment POSIX.1 +@deftypefun {char *} ttyname (int @var{filedes}) +If the file descriptor @var{filedes} is associated with a terminal +device, the @code{ttyname} function returns a pointer to a +statically-allocated, null-terminated string containing the file name of +the terminal file. The value is a null pointer if the file descriptor +isn't associated with a terminal, or the file name cannot be determined. +@end deftypefun + +@node I/O Queues +@section I/O Queues + +Many of the remaining functions in this section refer to the input and +output queues of a terminal device. These queues implement a form of +buffering @emph{within the kernel} independent of the buffering +implemented by I/O streams (@pxref{I/O on Streams}). + +@cindex terminal input queue +@cindex typeahead buffer +The @dfn{terminal input queue} is also sometimes referred to as its +@dfn{typeahead buffer}. It holds the characters that have been received +from the terminal but not yet read by any process. + +The size of the terminal's input queue is described by the +@code{MAX_INPUT} and @w{@code{_POSIX_MAX_INPUT}} parameters; see @ref{Limits +for Files}. You are guaranteed a queue size of at least +@code{MAX_INPUT}, but the queue might be larger, and might even +dynamically change size. If input flow control is enabled by setting +the @code{IXOFF} input mode bit (@pxref{Input Modes}), the terminal +driver transmits STOP and START characters to the terminal when +necessary to prevent the queue from overflowing. Otherwise, input may +be lost if it comes in too fast from the terminal. In canonical mode, +all input stays in the queue until a newline character is received, so +the terminal input queue can fill up when you type a very long line. +@xref{Canonical or Not}. + +@cindex terminal output queue +The @dfn{terminal output queue} is like the input queue, but for output; +it contains characters that have been written by processes, but not yet +transmitted to the terminal. 
If output flow control is enabled by +setting the @code{IXON} input mode bit (@pxref{Input Modes}), the +terminal driver obeys START and STOP characters sent by the terminal to +stop and restart transmission of output. + +@dfn{Clearing} the terminal input queue means discarding any characters +that have been received but not yet read. Similarly, clearing the +terminal output queue means discarding any characters that have been +written but not yet transmitted. + +@node Canonical or Not +@section Two Styles of Input: Canonical or Not + +POSIX systems support two basic modes of input: canonical and +noncanonical. + +@cindex canonical input processing +In @dfn{canonical input processing} mode, terminal input is processed in +lines terminated by newline (@code{'\n'}), EOF, or EOL characters. No +input can be read until an entire line has been typed by the user, and +the @code{read} function (@pxref{I/O Primitives}) returns at most a +single line of input, no matter how many bytes are requested. + +In canonical input mode, the operating system provides input editing +facilities: some characters are interpreted specially to perform editing +operations within the current line of text, such as ERASE and KILL. +@xref{Editing Characters}. + +The constants @code{_POSIX_MAX_CANON} and @code{MAX_CANON} parameterize +the maximum number of bytes which may appear in a single line of +canonical input. @xref{Limits for Files}. You are guaranteed a maximum +line length of at least @code{MAX_CANON} bytes, but the maximum might be +larger, and might even dynamically change size. + +@cindex noncanonical input processing +In @dfn{noncanonical input processing} mode, characters are not grouped +into lines, and ERASE and KILL processing is not performed. The +granularity with which bytes are read in noncanonical input mode is +controlled by the MIN and TIME settings. @xref{Noncanonical Input}.
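A program can find out which of the two styles is currently in effect by reading the terminal attributes and testing the ICANON bit of the c_lflag member. The sketch below assumes a hypothetical helper name, input_is_canonical, which is not a library function.

```c
#include <termios.h>

/* Hypothetical helper, not part of the library: return 1 if the
   terminal open on DESC is in canonical input mode, 0 if it is in
   noncanonical mode, and -1 if the attributes cannot be read (for
   example, because DESC is not a terminal).  */
int
input_is_canonical (int desc)
{
  struct termios settings;

  if (tcgetattr (desc, &settings) < 0)
    return -1;
  return (settings.c_lflag & ICANON) != 0;
}
```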
+ +Most programs use canonical input mode, because this gives the user a +way to edit input line by line. The usual reason to use noncanonical +mode is when the program accepts single-character commands or provides +its own editing facilities. + +The choice of canonical or noncanonical input is controlled by the +@code{ICANON} flag in the @code{c_lflag} member of @code{struct termios}. +@xref{Local Modes}. + +@node Terminal Modes +@section Terminal Modes + +@pindex termios.h +This section describes the various terminal attributes that control how +input and output are done. The functions, data structures, and symbolic +constants are all declared in the header file @file{termios.h}. +@c !!! should mention terminal attributes are distinct from file attributes + +@menu +* Mode Data Types:: The data type @code{struct termios} and + related types. +* Mode Functions:: Functions to read and set the terminal + attributes. +* Setting Modes:: The right way to set terminal attributes + reliably. +* Input Modes:: Flags controlling low-level input handling. +* Output Modes:: Flags controlling low-level output handling. +* Control Modes:: Flags controlling serial port behavior. +* Local Modes:: Flags controlling high-level input handling. +* Line Speed:: How to read and set the terminal line speed. +* Special Characters:: Characters that have special effects, + and how to change them. +* Noncanonical Input:: Controlling how long to wait for input. +@end menu + +@node Mode Data Types +@subsection Terminal Mode Data Types +@cindex terminal mode data types + +The entire collection of attributes of a terminal is stored in a +structure of type @code{struct termios}. This structure is used +with the functions @code{tcgetattr} and @code{tcsetattr} to read +and set the attributes. + +@comment termios.h +@comment POSIX.1 +@deftp {Data Type} {struct termios} +Structure that records all the I/O attributes of a terminal. 
The +structure includes at least the following members: + +@table @code +@item tcflag_t c_iflag +A bit mask specifying flags for input modes; see @ref{Input Modes}. + +@item tcflag_t c_oflag +A bit mask specifying flags for output modes; see @ref{Output Modes}. + +@item tcflag_t c_cflag +A bit mask specifying flags for control modes; see @ref{Control Modes}. + +@item tcflag_t c_lflag +A bit mask specifying flags for local modes; see @ref{Local Modes}. + +@item cc_t c_cc[NCCS] +An array specifying which characters are associated with various +control functions; see @ref{Special Characters}. +@end table + +The @code{struct termios} structure also contains members which +encode input and output transmission speeds, but the representation is +not specified. @xref{Line Speed}, for how to examine and store the +speed values. +@end deftp + +The following sections describe the details of the members of the +@code{struct termios} structure. + +@comment termios.h +@comment POSIX.1 +@deftp {Data Type} tcflag_t +This is an unsigned integer type used to represent the various +bit masks for terminal flags. +@end deftp + +@comment termios.h +@comment POSIX.1 +@deftp {Data Type} cc_t +This is an unsigned integer type used to represent characters associated +with various terminal control functions. +@end deftp + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int NCCS +The value of this macro is the number of elements in the @code{c_cc} +array. +@end deftypevr + +@node Mode Functions +@subsection Terminal Mode Functions +@cindex terminal mode functions + +@comment termios.h +@comment POSIX.1 +@deftypefun int tcgetattr (int @var{filedes}, struct termios *@var{termios-p}) +This function is used to examine the attributes of the terminal +device with file descriptor @var{filedes}. The attributes are returned +in the structure that @var{termios-p} points to. + +If successful, @code{tcgetattr} returns @code{0}. A return value of @code{-1} +indicates an error. 
The following @code{errno} error conditions are +defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item ENOTTY +The @var{filedes} is not associated with a terminal. +@end table +@end deftypefun + +@comment termios.h +@comment POSIX.1 +@deftypefun int tcsetattr (int @var{filedes}, int @var{when}, const struct termios *@var{termios-p}) +This function sets the attributes of the terminal device with file +descriptor @var{filedes}. The new attributes are taken from the +structure that @var{termios-p} points to. + +The @var{when} argument specifies how to deal with input and output +already queued. It can be one of the following values: + +@table @code +@comment termios.h +@comment POSIX.1 +@item TCSANOW +@vindex TCSANOW +Make the change immediately. + +@comment termios.h +@comment POSIX.1 +@item TCSADRAIN +@vindex TCSADRAIN +Make the change after waiting until all queued output has been written. +You should usually use this option when changing parameters that affect +output. + +@comment termios.h +@comment POSIX.1 +@item TCSAFLUSH +@vindex TCSAFLUSH +This is like @code{TCSADRAIN}, but also discards any queued input. + +@comment termios.h +@comment BSD +@item TCSASOFT +@vindex TCSASOFT +This is a flag bit that you can add to any of the above alternatives. +Its meaning is to inhibit alteration of the state of the terminal +hardware. It is a BSD extension; it is only supported on BSD systems +and the GNU system. + +Using @code{TCSASOFT} is exactly the same as setting the @code{CIGNORE} +bit in the @code{c_cflag} member of the structure @var{termios-p} points +to. @xref{Control Modes}, for a description of @code{CIGNORE}. +@end table + +If this function is called from a background process on its controlling +terminal, normally all processes in the process group are sent a +@code{SIGTTOU} signal, in the same way as if the process were trying to +write to the terminal. 
The exception is if the calling process itself +is ignoring or blocking @code{SIGTTOU} signals, in which case the +operation is performed and no signal is sent. @xref{Job Control}. + +If successful, @code{tcsetattr} returns @code{0}. A return value of +@code{-1} indicates an error. The following @code{errno} error +conditions are defined for this function: + +@table @code +@item EBADF +The @var{filedes} argument is not a valid file descriptor. + +@item ENOTTY +The @var{filedes} is not associated with a terminal. + +@item EINVAL +Either the value of the @code{when} argument is not valid, or there is +something wrong with the data in the @var{termios-p} argument. +@end table +@end deftypefun + +Although @code{tcgetattr} and @code{tcsetattr} specify the terminal +device with a file descriptor, the attributes are those of the terminal +device itself and not of the file descriptor. This means that the +effects of changing terminal attributes are persistent; if another +process opens the terminal file later on, it will see the changed +attributes even though it doesn't have anything to do with the open file +descriptor you originally specified in changing the attributes. + +Similarly, if a single process has multiple or duplicated file +descriptors for the same terminal device, changing the terminal +attributes affects input and output to all of these file +descriptors. This means, for example, that you can't open one file +descriptor or stream to read from a terminal in the normal +line-buffered, echoed mode; and simultaneously have another file +descriptor for the same terminal that you use to read from it in +single-character, non-echoed mode. Instead, you have to explicitly +switch the terminal back and forth between the two modes. 
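Switching back and forth can be sketched as a pair of functions: one saves the current attributes and enters single-character, non-echoed mode, and the other restores what was saved. The names enter_single_char_mode, restore_saved_modes and saved_settings are hypothetical, and the choice of MIN = 1 and TIME = 0 is an assumption made for this illustration.

```c
#include <termios.h>

/* Attributes saved by enter_single_char_mode (hypothetical name).  */
static struct termios saved_settings;

/* Save the current attributes of the terminal open on DESC, then turn
   off canonical input and echoing.  Return 0 on success, -1 on error.  */
int
enter_single_char_mode (int desc)
{
  struct termios settings;

  if (tcgetattr (desc, &saved_settings) < 0)
    return -1;

  /* Start from the current settings and change only what matters.  */
  settings = saved_settings;
  settings.c_lflag &= ~(ICANON | ECHO);
  settings.c_cc[VMIN] = 1;      /* Assumed: return after one byte.  */
  settings.c_cc[VTIME] = 0;     /* Assumed: no read timeout.  */

  /* Discard pending input so it is not read under the new mode.  */
  return tcsetattr (desc, TCSAFLUSH, &settings);
}

/* Put back the attributes saved by enter_single_char_mode.  */
int
restore_saved_modes (int desc)
{
  return tcsetattr (desc, TCSANOW, &saved_settings);
}
```

Because the attributes belong to the terminal device and not to the descriptor, a program that uses such a pair should make sure the restore step also runs on abnormal exit paths.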
+ +@node Setting Modes +@subsection Setting Terminal Modes Properly + +When you set terminal modes, you should call @code{tcgetattr} first to +get the current modes of the particular terminal device, modify only +those modes that you are really interested in, and store the result with +@code{tcsetattr}. + +It's a bad idea to simply initialize a @code{struct termios} structure +to a chosen set of attributes and pass it directly to @code{tcsetattr}. +Your program may be run years from now, on systems that support members +not documented in this manual. The way to avoid setting these members +to unreasonable values is to avoid changing them. + +What's more, different terminal devices may require different mode +settings in order to function properly. So you should avoid blindly +copying attributes from one terminal device to another. + +When a member contains a collection of independent flags, as the +@code{c_iflag}, @code{c_oflag} and @code{c_cflag} members do, even +setting the entire member is a bad idea, because particular operating +systems have their own flags. Instead, you should start with the +current value of the member and alter only the flags whose values matter +in your program, leaving any other flags unchanged. 
+ +Here is an example of how to set one flag (@code{ISTRIP}) in the +@code{struct termios} structure while properly preserving all the other +data in the structure: + +@smallexample +@group +int +set_istrip (int desc, int value) +@{ + struct termios settings; + int result; +@end group + +@group + result = tcgetattr (desc, &settings); + if (result < 0) + @{ + perror ("error in tcgetattr"); + return 0; + @} +@end group +@group + settings.c_iflag &= ~ISTRIP; + if (value) + settings.c_iflag |= ISTRIP; +@end group +@group + result = tcsetattr (desc, TCSANOW, &settings); + if (result < 0) + @{ + perror ("error in tcsetattr"); + return 0; + @} + return 1; +@} +@end group +@end smallexample + +@node Input Modes +@subsection Input Modes + +This section describes the terminal attribute flags that control +fairly low-level aspects of input processing: handling of parity errors, +break signals, flow control, and @key{RET} and @key{LFD} characters. + +All of these flags are bits in the @code{c_iflag} member of the +@code{struct termios} structure. The member is an integer, and you +change flags using the operators @code{&}, @code{|} and @code{^}. Don't +try to specify the entire value for @code{c_iflag}---instead, change +only specific flags and leave the rest untouched (@pxref{Setting +Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t INPCK +@cindex parity checking +If this bit is set, input parity checking is enabled. If it is not set, +no checking at all is done for parity errors on input; the +characters are simply passed through to the application. + +Parity checking on input processing is independent of whether parity +detection and generation on the underlying terminal hardware is enabled; +see @ref{Control Modes}. For example, you could clear the @code{INPCK} +input mode flag and set the @code{PARENB} control mode flag to ignore +parity errors on input, but still generate parity on output.
+ +If this bit is set, what happens when a parity error is detected depends +on whether the @code{IGNPAR} or @code{PARMRK} bits are set. If neither +of these bits is set, a byte with a parity error is passed to the +application as a @code{'\0'} character. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IGNPAR +If this bit is set, any byte with a framing or parity error is ignored. +This is only useful if @code{INPCK} is also set. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t PARMRK +If this bit is set, input bytes with parity or framing errors are marked +when passed to the program. This bit is meaningful only when +@code{INPCK} is set and @code{IGNPAR} is not set. + +The way erroneous bytes are marked is with two preceding bytes, +@code{0377} and @code{0}. Thus, the program actually reads three bytes +for one erroneous byte received from the terminal. + +If a valid byte has the value @code{0377}, and @code{ISTRIP} (see below) +is not set, the program might confuse it with the prefix that marks a +parity error. So a valid byte @code{0377} is passed to the program as +two bytes, @code{0377} @code{0377}, in this case. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ISTRIP +If this bit is set, valid input bytes are stripped to seven bits; +otherwise, all eight bits are available for programs to read. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IGNBRK +If this bit is set, break conditions are ignored. + +@cindex break condition, detecting +A @dfn{break condition} is defined in the context of asynchronous +serial data transmission as a series of zero-value bits longer than a +single byte.
+@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t BRKINT +If this bit is set and @code{IGNBRK} is not set, a break condition +clears the terminal input and output queues and raises a @code{SIGINT} +signal for the foreground process group associated with the terminal. + +If neither @code{BRKINT} nor @code{IGNBRK} are set, a break condition is +passed to the application as a single @code{'\0'} character if +@code{PARMRK} is not set, or otherwise as a three-character sequence +@code{'\377'}, @code{'\0'}, @code{'\0'}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IGNCR +If this bit is set, carriage return characters (@code{'\r'}) are +discarded on input. Discarding carriage return may be useful on +terminals that send both carriage return and linefeed when you type the +@key{RET} key. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ICRNL +If this bit is set and @code{IGNCR} is not set, carriage return characters +(@code{'\r'}) received as input are passed to the application as newline +characters (@code{'\n'}). +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t INLCR +If this bit is set, newline characters (@code{'\n'}) received as input +are passed to the application as carriage return characters (@code{'\r'}). +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IXOFF +If this bit is set, start/stop control on input is enabled. In other +words, the computer sends STOP and START characters as necessary to +prevent input from coming in faster than programs are reading it. The +idea is that the actual terminal hardware that is generating the input +data responds to a STOP character by suspending transmission, and to a +START character by resuming transmission. @xref{Start/Stop Characters}. 
+@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IXON +If this bit is set, start/stop control on output is enabled. In other +words, if the computer receives a STOP character, it suspends output +until a START character is received. In this case, the STOP and START +characters are never passed to the application program. If this bit is +not set, then START and STOP can be read as ordinary characters. +@xref{Start/Stop Characters}. +@c !!! mention this interferes with using C-s and C-q for programs like emacs +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t IXANY +If this bit is set, any input character restarts output when output has +been suspended with the STOP character. Otherwise, only the START +character restarts output. + +This is a BSD extension; it exists only on BSD systems and the GNU system. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t IMAXBEL +If this bit is set, then filling up the terminal input buffer sends a +BEL character (code @code{007}) to the terminal to ring the bell. + +This is a BSD extension. +@end deftypevr + +@node Output Modes +@subsection Output Modes + +This section describes the terminal flags and fields that control how +output characters are translated and padded for display. All of these +are contained in the @code{c_oflag} member of the @w{@code{struct termios}} +structure. + +The @code{c_oflag} member itself is an integer, and you change the flags +and fields using the operators @code{&}, @code{|}, and @code{^}. Don't +try to specify the entire value for @code{c_oflag}---instead, change +only specific flags and leave the rest untouched (@pxref{Setting +Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t OPOST +If this bit is set, output data is processed in some unspecified way so +that it is displayed appropriately on the terminal device. 
This +typically includes mapping newline characters (@code{'\n'}) onto +carriage return and linefeed pairs. + +If this bit isn't set, the characters are transmitted as-is. +@end deftypevr + +The following three bits are BSD features, and they exist only on BSD +systems and the GNU system. They are effective only if @code{OPOST} is +set. + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ONLCR +If this bit is set, convert the newline character on output into a pair +of characters, carriage return followed by linefeed. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t OXTABS +If this bit is set, convert tab characters on output into the appropriate +number of spaces to emulate a tab stop every eight columns. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ONOEOT +If this bit is set, discard @kbd{C-d} characters (code @code{004}) on +output. These characters cause many dial-up terminals to disconnect. +@end deftypevr + +@node Control Modes +@subsection Control Modes + +This section describes the terminal flags and fields that control +parameters usually associated with asynchronous serial data +transmission. These flags may not make sense for other kinds of +terminal ports (such as a network connection pseudo-terminal). All of +these are contained in the @code{c_cflag} member of the @code{struct +termios} structure. + +The @code{c_cflag} member itself is an integer, and you change the flags +and fields using the operators @code{&}, @code{|}, and @code{^}. Don't +try to specify the entire value for @code{c_cflag}---instead, change +only specific flags and leave the rest untouched (@pxref{Setting +Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CLOCAL +If this bit is set, it indicates that the terminal is connected +``locally'' and that the modem status lines (such as carrier detect) +should be ignored.
+@cindex modem status lines +@cindex carrier detect + +On many systems if this bit is not set and you call @code{open} without +the @code{O_NONBLOCK} flag set, @code{open} blocks until a modem +connection is established. + +If this bit is not set and a modem disconnect is detected, a +@code{SIGHUP} signal is sent to the controlling process group for the +terminal (if it has one). Normally, this causes the process to exit; +see @ref{Signal Handling}. Reading from the terminal after a disconnect +causes an end-of-file condition, and writing causes an @code{EIO} error +to be returned. The terminal device must be closed and reopened to +clear the condition. +@cindex modem disconnect +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t HUPCL +If this bit is set, a modem disconnect is generated when all processes +that have the terminal device open have either closed the file or exited. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CREAD +If this bit is set, input can be read from the terminal. Otherwise, +input is discarded when it arrives. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CSTOPB +If this bit is set, two stop bits are used. Otherwise, only one stop bit +is used. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t PARENB +If this bit is set, generation and detection of a parity bit are enabled. +@xref{Input Modes}, for information on how input parity errors are handled. + +If this bit is not set, no parity bit is added to output characters, and +input characters are not checked for correct parity. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t PARODD +This bit is only useful if @code{PARENB} is set. If @code{PARODD} is set, +odd parity is used, otherwise even parity is used. +@end deftypevr + +The control mode flags also include a field for the number of bits per +character.
You can use the @code{CSIZE} macro as a mask to extract the +value, like this: @code{settings.c_cflag & CSIZE}. + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CSIZE +This is a mask for the number of bits per character. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CS5 +This specifies five bits per byte. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CS6 +This specifies six bits per byte. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CS7 +This specifies seven bits per byte. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t CS8 +This specifies eight bits per byte. +@end deftypevr + +The following four bits are BSD extensions; these exist only on BSD +systems and the GNU system. + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t CCTS_OFLOW +If this bit is set, enable flow control of output based on the CTS wire +(RS232 protocol). +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t CRTS_IFLOW +If this bit is set, enable flow control of input based on the RTS wire +(RS232 protocol). +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t MDMBUF +If this bit is set, enable carrier-based flow control of output. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t CIGNORE +If this bit is set, it says to ignore the control modes and line speed +values entirely. This is only meaningful in a call to @code{tcsetattr}. + +The @code{c_cflag} member and the line speed values returned by +@code{cfgetispeed} and @code{cfgetospeed} will be unaffected by the +call. @code{CIGNORE} is useful if you want to set all the software +modes in the other members, but leave the hardware details in +@code{c_cflag} unchanged. (This is how the @code{TCSASOFT} flag to +@code{tcsetattr} works.)
+ +This bit is never set in the structure filled in by @code{tcgetattr}. +@end deftypevr + +@node Local Modes +@subsection Local Modes + +This section describes the flags for the @code{c_lflag} member of the +@code{struct termios} structure. These flags generally control +higher-level aspects of input processing than the input modes flags +described in @ref{Input Modes}, such as echoing, signals, and the choice +of canonical or noncanonical input. + +The @code{c_lflag} member itself is an integer, and you change the flags +and fields using the operators @code{&}, @code{|}, and @code{^}. Don't +try to specify the entire value for @code{c_lflag}---instead, change +only specific flags and leave the rest untouched (@pxref{Setting +Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ICANON +This bit, if set, enables canonical input processing mode. Otherwise, +input is processed in noncanonical mode. @xref{Canonical or Not}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ECHO +If this bit is set, echoing of input characters back to the terminal +is enabled. +@cindex echo of terminal input +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ECHOE +If this bit is set, echoing indicates erasure of input with the ERASE +character by erasing the last character in the current line from the +screen. Otherwise, the character erased is re-echoed to show what has +happened (suitable for a printing terminal). + +This bit only controls the display behavior; the @code{ICANON} bit by +itself controls actual recognition of the ERASE character and erasure of +input, without which @code{ECHOE} is simply irrelevant. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ECHOPRT +This bit, like @code{ECHOE}, enables display of the ERASE character in +a way that is geared to a hardcopy terminal.
When you type the ERASE +character, a @samp{\} character is printed followed by the first +character erased. Typing the ERASE character again just prints the next +character erased. Then, the next time you type a normal character, a +@samp{/} character is printed before the character echoes. + +This is a BSD extension, and exists only in BSD systems and the +GNU system. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ECHOK +This bit enables special display of the KILL character by moving to a +new line after echoing the KILL character normally. The behavior of +@code{ECHOKE} (below) is nicer to look at. + +If this bit is not set, the KILL character echoes just as it would if it +were not the KILL character. Then it is up to the user to remember that +the KILL character has erased the preceding input; there is no +indication of this on the screen. + +This bit only controls the display behavior; the @code{ICANON} bit by +itself controls actual recognition of the KILL character and erasure of +input, without which @code{ECHOK} is simply irrelevant. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ECHOKE +This bit is similar to @code{ECHOK}. It enables special display of the +KILL character by erasing on the screen the entire line that has been +killed. This is a BSD extension, and exists only in BSD systems and the +GNU system. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ECHONL +If this bit is set and the @code{ICANON} bit is also set, then the +newline (@code{'\n'}) character is echoed even if the @code{ECHO} bit +is not set. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ECHOCTL +If this bit is set and the @code{ECHO} bit is also set, echo control +characters with @samp{^} followed by the corresponding text character. +Thus, control-A echoes as @samp{^A}. 
This is usually the preferred mode +for interactive input, because echoing a control character back to the +terminal could have some undesired effect on the terminal. + +This is a BSD extension, and exists only in BSD systems and the +GNU system. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t ISIG +This bit controls whether the INTR, QUIT, and SUSP characters are +recognized. The functions associated with these characters are performed +if and only if this bit is set. Being in canonical or noncanonical +input mode has no effect on the interpretation of these characters. + +You should use caution when disabling recognition of these characters. +Programs that cannot be interrupted interactively are very +user-unfriendly. If you clear this bit, your program should provide +some alternate interface that allows the user to interactively send the +signals associated with these characters, or to escape from the program. +@cindex interactive signals, from terminal + +@xref{Signal Characters}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t IEXTEN +POSIX.1 gives @code{IEXTEN} implementation-defined meaning, +so you cannot rely on this interpretation on all systems. + +On BSD systems and the GNU system, it enables the LNEXT and DISCARD characters. +@xref{Other Special}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t NOFLSH +Normally, the INTR, QUIT, and SUSP characters cause input and output +queues for the terminal to be cleared. If this bit is set, the queues +are not cleared. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro tcflag_t TOSTOP +If this bit is set and the system supports job control, then +@code{SIGTTOU} signals are generated by background processes that +attempt to write to the terminal. @xref{Access to the Terminal}. +@end deftypevr + +The following bits are BSD extensions; they exist only in BSD systems +and the GNU system.
+ +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t ALTWERASE +This bit determines how far the WERASE character should erase. The +WERASE character erases back to the beginning of a word; the question +is, where do words begin? + +If this bit is clear, then the beginning of a word is a nonwhitespace +character following a whitespace character. If the bit is set, then the +beginning of a word is an alphanumeric character or underscore following +a character which is none of those. + +@xref{Editing Characters}, for more information about the WERASE character. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t FLUSHO +This is the bit that toggles when the user types the DISCARD character. +While this bit is set, all output is discarded. @xref{Other Special}. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t NOKERNINFO +Setting this bit disables handling of the STATUS character. +@xref{Other Special}. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro tcflag_t PENDIN +If this bit is set, it indicates that there is a line of input that +needs to be reprinted. Typing the REPRINT character sets this bit; the +bit remains set until reprinting is finished. @xref{Editing Characters}. +@end deftypevr + +@c EXTPROC is too obscure to document now. --roland + +@node Line Speed +@subsection Line Speed +@cindex line speed +@cindex baud rate +@cindex terminal line speed +@cindex terminal line speed + +The terminal line speed tells the computer how fast to read and write +data on the terminal. + +If the terminal is connected to a real serial line, the terminal speed +you specify actually controls the line---if it doesn't match the +terminal's own idea of the speed, communication does not work. Real +serial ports accept only certain standard speeds. Also, particular +hardware may not support even all the standard speeds. 
Specifying a +speed of zero hangs up a dialup connection and turns off modem control +signals. + +If the terminal is not a real serial line (for example, if it is a +network connection), then the line speed won't really affect data +transmission speed, but some programs will use it to determine the +amount of padding needed. It's best to specify a line speed value that +matches the actual speed of the actual terminal, but you can safely +experiment with different values to vary the amount of padding. + +There are actually two line speeds for each terminal, one for input and +one for output. You can set them independently, but most often +terminals use the same speed for both directions. + +The speed values are stored in the @code{struct termios} structure, but +don't try to access them in the @code{struct termios} structure +directly. Instead, you should use the following functions to read and +store them: + +@comment termios.h +@comment POSIX.1 +@deftypefun speed_t cfgetospeed (const struct termios *@var{termios-p}) +This function returns the output line speed stored in the structure +@code{*@var{termios-p}}. +@end deftypefun + +@comment termios.h +@comment POSIX.1 +@deftypefun speed_t cfgetispeed (const struct termios *@var{termios-p}) +This function returns the input line speed stored in the structure +@code{*@var{termios-p}}. +@end deftypefun + +@comment termios.h +@comment POSIX.1 +@deftypefun int cfsetospeed (struct termios *@var{termios-p}, speed_t @var{speed}) +This function stores @var{speed} in @code{*@var{termios-p}} as the output +speed. The normal return value is @code{0}; a value of @code{-1} +indicates an error. If @var{speed} is not a speed, @code{cfsetospeed} +returns @code{-1}. +@end deftypefun + +@comment termios.h +@comment POSIX.1 +@deftypefun int cfsetispeed (struct termios *@var{termios-p}, speed_t @var{speed}) +This function stores @var{speed} in @code{*@var{termios-p}} as the input +speed. 
The normal return value is @code{0}; a value of @code{-1} +indicates an error. If @var{speed} is not a speed, @code{cfsetispeed} +returns @code{-1}. +@end deftypefun + +@comment termios.h +@comment BSD +@deftypefun int cfsetspeed (struct termios *@var{termios-p}, speed_t @var{speed}) +This function stores @var{speed} in @code{*@var{termios-p}} as both the +input and output speeds. The normal return value is @code{0}; a value +of @code{-1} indicates an error. If @var{speed} is not a speed, +@code{cfsetspeed} returns @code{-1}. This function is an extension in +4.4 BSD. +@end deftypefun + +@comment termios.h +@comment POSIX.1 +@deftp {Data Type} speed_t +The @code{speed_t} type is an unsigned integer data type used to +represent line speeds. +@end deftp + +The functions @code{cfsetospeed} and @code{cfsetispeed} report errors +only for speed values that the system simply cannot handle. If you +specify a speed value that is basically acceptable, then those functions +will succeed. But they do not check that a particular hardware device +can actually support the specified speeds---in fact, they don't know +which device you plan to set the speed for. If you use @code{tcsetattr} +to set the speed of a particular device to a value that it cannot +handle, @code{tcsetattr} returns @code{-1}. + +@strong{Portability note:} In the GNU library, the functions above +accept speeds measured in bits per second as input, and return speed +values measured in bits per second. Other libraries require speeds to +be indicated by special codes. For POSIX.1 portability, you must use +one of the following symbols to represent the speed; their precise +numeric values are system-dependent, but each name has a fixed meaning: +@code{B110} stands for 110 bps, @code{B300} for 300 bps, and so on. +There is no portable way to represent any speed but these, but these are +the only speeds that typical serial lines can support.
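
As a minimal sketch of the portable usage described in this note, the following stores 9600 bps as both the input and the output speed, using the symbol @code{B9600} (one of the symbols listed after this example). The helper name @code{set_speed_9600} is hypothetical and not part of any library. The structure would normally be filled in by @code{tcgetattr} first, and nothing reaches a device until the structure is passed to @code{tcsetattr}.

```c
#include <termios.h>

/* Hypothetical helper (not a library function): store 9600 bps in *T
   as both the input and the output speed, using the portable symbol
   B9600.  Returns 0 on success and -1 if either call fails.  Only the
   structure is modified; no terminal device is touched.  */
int
set_speed_9600 (struct termios *t)
{
  if (cfsetispeed (t, B9600) < 0)
    return -1;
  if (cfsetospeed (t, B9600) < 0)
    return -1;
  return 0;
}
```

Because the speed is written as @code{B9600} rather than the number 9600, the same source should work both with libraries that measure speeds in bits per second and with those that use special codes.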
+ +@comment termios.h +@comment POSIX.1 +@vindex B0 +@comment termios.h +@comment POSIX.1 +@vindex B50 +@comment termios.h +@comment POSIX.1 +@vindex B75 +@comment termios.h +@comment POSIX.1 +@vindex B110 +@comment termios.h +@comment POSIX.1 +@vindex B134 +@comment termios.h +@comment POSIX.1 +@vindex B150 +@comment termios.h +@comment POSIX.1 +@vindex B200 +@comment termios.h +@comment POSIX.1 +@vindex B300 +@comment termios.h +@comment POSIX.1 +@vindex B600 +@comment termios.h +@comment POSIX.1 +@vindex B1200 +@comment termios.h +@comment POSIX.1 +@vindex B1800 +@comment termios.h +@comment POSIX.1 +@vindex B2400 +@comment termios.h +@comment POSIX.1 +@vindex B4800 +@comment termios.h +@comment POSIX.1 +@vindex B9600 +@comment termios.h +@comment POSIX.1 +@vindex B19200 +@comment termios.h +@comment POSIX.1 +@vindex B38400 +@smallexample +B0 B50 B75 B110 B134 B150 B200 +B300 B600 B1200 B1800 B2400 B4800 +B9600 B19200 B38400 +@end smallexample + +@vindex EXTA +@vindex EXTB +BSD defines two additional speed symbols as aliases: @code{EXTA} is an +alias for @code{B19200} and @code{EXTB} is an alias for @code{B38400}. +These aliases are obsolete. + +@node Special Characters +@subsection Special Characters + +In canonical input, the terminal driver recognizes a number of special +characters which perform various control functions. These include the +ERASE character (usually @key{DEL}) for editing input, and other editing +characters. The INTR character (normally @kbd{C-c}) for sending a +@code{SIGINT} signal, and other signal-raising characters, may be +available in either canonical or noncanonical input mode. All these +characters are described in this section. + +The particular characters used are specified in the @code{c_cc} member +of the @code{struct termios} structure. This member is an array; each +element specifies the character for a particular role. 
Each element has +a symbolic constant that stands for the index of that element---for +example, @code{VINTR} is the index of the element that specifies the INTR +character, so storing @code{'='} in @code{@var{termios}.c_cc[VINTR]} +specifies @samp{=} as the INTR character. + +@vindex _POSIX_VDISABLE +On some systems, you can disable a particular special character function +by specifying the value @code{_POSIX_VDISABLE} for that role. This +value is unequal to any possible character code. @xref{Options for +Files}, for more information about how to tell whether the operating +system you are using supports @code{_POSIX_VDISABLE}. + +@menu +* Editing Characters:: Special characters that terminate lines and + delete text, and other editing functions. +* Signal Characters:: Special characters that send or raise signals + to or for certain classes of processes. +* Start/Stop Characters:: Special characters that suspend or resume + suspended output. +* Other Special:: Other special characters for BSD systems: + they can discard output, and print status. +@end menu + +@node Editing Characters +@subsubsection Characters for Input Editing + +These special characters are active only in canonical input mode. +@xref{Canonical or Not}. + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VEOF +@cindex EOF character +This is the subscript for the EOF character in the special control +character array. @code{@var{termios}.c_cc[VEOF]} holds the character +itself. + +The EOF character is recognized only in canonical input mode. It acts +as a line terminator in the same way as a newline character, but if the +EOF character is typed at the beginning of a line it causes @code{read} +to return a byte count of zero, indicating end-of-file. The EOF +character itself is discarded. + +Usually, the EOF character is @kbd{C-d}.
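
As a minimal sketch of how the @code{VEOF} subscript is used, the following changes the EOF character. Both helper names are hypothetical and not part of any library. The first helper only edits the structure. The second applies the change to a terminal through @code{tcgetattr} and @code{tcsetattr}, so that all other settings are left as they were.

```c
#include <termios.h>

/* Hypothetical helper: record C as the EOF character in *T.
   Only the structure changes; no terminal is touched.  */
void
set_eof_char (struct termios *t, cc_t c)
{
  t->c_cc[VEOF] = c;
}

/* Hypothetical helper: make C the EOF character of the terminal open
   on FILEDES.  Returns 0 on success and -1 on error (for example, if
   FILEDES is not a valid descriptor or is not a terminal).  */
int
apply_eof_char (int filedes, cc_t c)
{
  struct termios t;

  /* Start from the current settings so nothing else is disturbed.  */
  if (tcgetattr (filedes, &t) < 0)
    return -1;
  set_eof_char (&t, c);
  return tcsetattr (filedes, TCSANOW, &t);
}
```

For instance, @code{apply_eof_char (STDIN_FILENO, '\004')} (with @code{STDIN_FILENO} from @file{unistd.h}) would select @kbd{C-d}, the usual value.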
+@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VEOL +@cindex EOL character +This is the subscript for the EOL character in the special control +character array. @code{@var{termios}.c_cc[VEOL]} holds the character +itself. + +The EOL character is recognized only in canonical input mode. It acts +as a line terminator, just like a newline character. The EOL character +is not discarded; it is read as the last character in the input line. + +@c !!! example: this is set to ESC by 4.3 csh with "set filec" so it can +@c complete partial lines without using cbreak or raw mode. + +You don't need to use the EOL character to make @key{RET} end a line. +Just set the ICRNL flag. In fact, this is the default state of +affairs. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro int VEOL2 +@cindex EOL2 character +This is the subscript for the EOL2 character in the special control +character array. @code{@var{termios}.c_cc[VEOL2]} holds the character +itself. + +The EOL2 character works just like the EOL character (see above), but it +can be a different character. Thus, you can specify two characters to +terminate an input line, by setting EOL to one of them and EOL2 to the +other. + +The EOL2 character is a BSD extension; it exists only on BSD systems +and the GNU system. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VERASE +@cindex ERASE character +This is the subscript for the ERASE character in the special control +character array. @code{@var{termios}.c_cc[VERASE]} holds the +character itself. + +The ERASE character is recognized only in canonical input mode. When +the user types the erase character, the previous character typed is +discarded. (If the terminal generates multibyte character sequences, +this may cause more than one byte of input to be discarded.) This +cannot be used to erase past the beginning of the current line of text. +The ERASE character itself is discarded. +@c !!! 
mention ECHOE here + +Usually, the ERASE character is @key{DEL}. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro int VWERASE +@cindex WERASE character +This is the subscript for the WERASE character in the special control +character array. @code{@var{termios}.c_cc[VWERASE]} holds the character +itself. + +The WERASE character is recognized only in canonical mode. It erases an +entire word of prior input, and any whitespace after it; whitespace +characters before the word are not erased. + +The definition of a ``word'' depends on the setting of the +@code{ALTWERASE} mode; @pxref{Local Modes}. + +If the @code{ALTWERASE} mode is not set, a word is defined as a sequence +of any characters except space or tab. + +If the @code{ALTWERASE} mode is set, a word is defined as a sequence of +characters containing only letters, numbers, and underscores, optionally +followed by one character that is not a letter, number, or underscore. + +The WERASE character is usually @kbd{C-w}. + +This is a BSD extension. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VKILL +@cindex KILL character +This is the subscript for the KILL character in the special control +character array. @code{@var{termios}.c_cc[VKILL]} holds the character +itself. + +The KILL character is recognized only in canonical input mode. When the +user types the kill character, the entire contents of the current line +of input are discarded. The kill character itself is discarded too. + +The KILL character is usually @kbd{C-u}. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro int VREPRINT +@cindex REPRINT character +This is the subscript for the REPRINT character in the special control +character array. @code{@var{termios}.c_cc[VREPRINT]} holds the character +itself. + +The REPRINT character is recognized only in canonical mode. It reprints +the current input line. 
If some asynchronous output has come while you +are typing, this lets you see the line you are typing clearly again. + +The REPRINT character is usually @kbd{C-r}. + +This is a BSD extension. +@end deftypevr + +@node Signal Characters +@subsubsection Characters that Cause Signals + +These special characters may be active in either canonical or noncanonical +input mode, but only when the @code{ISIG} flag is set (@pxref{Local +Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VINTR +@cindex INTR character +@cindex interrupt character +This is the subscript for the INTR character in the special control +character array. @code{@var{termios}.c_cc[VINTR]} holds the character +itself. + +The INTR (interrupt) character raises a @code{SIGINT} signal for all +processes in the foreground job associated with the terminal. The INTR +character itself is then discarded. @xref{Signal Handling}, for more +information about signals. + +Typically, the INTR character is @kbd{C-c}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VQUIT +@cindex QUIT character +This is the subscript for the QUIT character in the special control +character array. @code{@var{termios}.c_cc[VQUIT]} holds the character +itself. + +The QUIT character raises a @code{SIGQUIT} signal for all processes in +the foreground job associated with the terminal. The QUIT character +itself is then discarded. @xref{Signal Handling}, for more information +about signals. + +Typically, the QUIT character is @kbd{C-\}. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VSUSP +@cindex SUSP character +@cindex suspend character +This is the subscript for the SUSP character in the special control +character array. @code{@var{termios}.c_cc[VSUSP]} holds the character +itself. + +The SUSP (suspend) character is recognized only if the implementation +supports job control (@pxref{Job Control}). 
It causes a @code{SIGTSTP} +signal to be sent to all processes in the foreground job associated with +the terminal. The SUSP character itself is then discarded. +@xref{Signal Handling}, for more information about signals. + +Typically, the SUSP character is @kbd{C-z}. +@end deftypevr + +Few applications disable the normal interpretation of the SUSP +character. If your program does this, it should provide some other +mechanism for the user to stop the job. When the user invokes this +mechanism, the program should send a @code{SIGTSTP} signal to the +process group of the process, not just to the process itself. +@xref{Signaling Another Process}. + +@comment termios.h +@comment BSD +@deftypevr Macro int VDSUSP +@cindex DSUSP character +@cindex delayed suspend character +This is the subscript for the DSUSP character in the special control +character array. @code{@var{termios}.c_cc[VDSUSP]} holds the character +itself. + +The DSUSP (delayed suspend) character is recognized only if the implementation +supports job control (@pxref{Job Control}). It sends a @code{SIGTSTP} +signal, like the SUSP character, but not right away---only when the +program tries to read it as input. Not all systems with job control +support DSUSP; only BSD-compatible systems (including the GNU system). + +@xref{Signal Handling}, for more information about signals. + +Typically, the DSUSP character is @kbd{C-y}. +@end deftypevr + +@node Start/Stop Characters +@subsubsection Special Characters for Flow Control + +These special characters may be active in either canonical or noncanonical +input mode, but their use is controlled by the flags @code{IXON} and +@code{IXOFF} (@pxref{Input Modes}). + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VSTART +@cindex START character +This is the subscript for the START character in the special control +character array. @code{@var{termios}.c_cc[VSTART]} holds the +character itself.
+ +The START character is used to support the @code{IXON} and @code{IXOFF} +input modes. If @code{IXON} is set, receiving a START character resumes +suspended output; the START character itself is discarded. If +@code{IXANY} is set, receiving any character at all resumes suspended +output; the resuming character is not discarded unless it is the START +character. If @code{IXOFF} is set, the system may also transmit START +characters to the terminal. + +The usual value for the START character is @kbd{C-q}. You may not be +able to change this value---the hardware may insist on using @kbd{C-q} +regardless of what you specify. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VSTOP +@cindex STOP character +This is the subscript for the STOP character in the special control +character array. @code{@var{termios}.c_cc[VSTOP]} holds the character +itself. + +The STOP character is used to support the @code{IXON} and @code{IXOFF} +input modes. If @code{IXON} is set, receiving a STOP character causes +output to be suspended; the STOP character itself is discarded. If +@code{IXOFF} is set, the system may also transmit STOP characters to the +terminal, to prevent the input queue from overflowing. + +The usual value for the STOP character is @kbd{C-s}. You may not be +able to change this value---the hardware may insist on using @kbd{C-s} +regardless of what you specify. +@end deftypevr + +@node Other Special +@subsubsection Other Special Characters + +These special characters exist only in BSD systems and the GNU system. + +@comment termios.h +@comment BSD +@deftypevr Macro int VLNEXT +@cindex LNEXT character +This is the subscript for the LNEXT character in the special control +character array. @code{@var{termios}.c_cc[VLNEXT]} holds the character +itself. + +The LNEXT character is recognized only when @code{IEXTEN} is set, but in +both canonical and noncanonical mode. It disables any special +significance of the next character the user types.
Even if the +character would normally perform some editing function or generate a +signal, it is read as a plain character. This is the analogue of the +@kbd{C-q} command in Emacs. ``LNEXT'' stands for ``literal next.'' + +The LNEXT character is usually @kbd{C-v}. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro int VDISCARD +@cindex DISCARD character +This is the subscript for the DISCARD character in the special control +character array. @code{@var{termios}.c_cc[VDISCARD]} holds the character +itself. + +The DISCARD character is recognized only when @code{IEXTEN} is set, but +in both canonical and noncanonical mode. Its effect is to toggle the +discard-output flag. When this flag is set, all program output is +discarded. Setting the flag also discards all output currently in the +output buffer. Typing any other character resets the flag. +@end deftypevr + +@comment termios.h +@comment BSD +@deftypevr Macro int VSTATUS +@cindex STATUS character +This is the subscript for the STATUS character in the special control +character array. @code{@var{termios}.c_cc[VSTATUS]} holds the character +itself. + +The STATUS character's effect is to print out a status message about how +the current process is running. + +The STATUS character is recognized only in canonical mode, and only if +@code{NOKERNINFO} is not set. +@end deftypevr + +@node Noncanonical Input +@subsection Noncanonical Input + +In noncanonical input mode, the special editing characters such as +ERASE and KILL are ignored. The system facilities for the user to edit +input are disabled in noncanonical mode, so that all input characters +(unless they are special for signal or flow-control purposes) are passed +to the application program exactly as typed. It is up to the +application program to give the user ways to edit the input, if +appropriate. + +Noncanonical mode offers special parameters called MIN and TIME for +controlling whether and how long to wait for input to be available.
You +can even use them to avoid ever waiting---to return immediately with +whatever input is available, or with no input. + +The MIN and TIME values are stored in elements of the @code{c_cc} array, which +is a member of the @w{@code{struct termios}} structure. Each element of +this array has a particular role, and each element has a symbolic +constant that stands for the index of that element. @code{VMIN} and +@code{VTIME} are the names for the indices in the array of the MIN and +TIME slots. + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VMIN +@cindex MIN termios slot +This is the subscript for the MIN slot in the @code{c_cc} array. Thus, +@code{@var{termios}.c_cc[VMIN]} is the value itself. + +The MIN slot is only meaningful in noncanonical input mode; it +specifies the minimum number of bytes that must be available in the +input queue in order for @code{read} to return. +@end deftypevr + +@comment termios.h +@comment POSIX.1 +@deftypevr Macro int VTIME +@cindex TIME termios slot +This is the subscript for the TIME slot in the @code{c_cc} array. Thus, +@code{@var{termios}.c_cc[VTIME]} is the value itself. + +The TIME slot is only meaningful in noncanonical input mode; it +specifies how long to wait for input before returning, in units of 0.1 +seconds. +@end deftypevr + +The MIN and TIME values interact to determine the criterion for when +@code{read} should return; their precise meanings depend on which of +them are nonzero. There are four possible cases: + +@itemize @bullet +@item +Both TIME and MIN are nonzero. + +In this case, TIME specifies how long to wait after each input character +to see if more input arrives. After the first character received, +@code{read} keeps waiting until either MIN bytes have arrived in all, or +TIME elapses with no further input. + +@code{read} always blocks until the first character arrives, even if +TIME elapses first. @code{read} can return more than MIN characters if +more than MIN happen to be in the queue.
+ +@item +Both MIN and TIME are zero. + +In this case, @code{read} always returns immediately with as many +characters as are available in the queue, up to the number requested. +If no input is immediately available, @code{read} returns a value of +zero. + +@item +MIN is zero but TIME has a nonzero value. + +In this case, @code{read} waits for time TIME for input to become +available; the availability of a single byte is enough to satisfy the +read request and cause @code{read} to return. When it returns, it +returns as many characters as are available, up to the number requested. +If no input is available before the timer expires, @code{read} returns a +value of zero. + +@item +TIME is zero but MIN has a nonzero value. + +In this case, @code{read} waits until at least MIN bytes are available +in the queue. At that time, @code{read} returns as many characters as +are available, up to the number requested. @code{read} can return more +than MIN characters if more than MIN happen to be in the queue. +@end itemize + +What happens if MIN is 50 and you ask to read just 10 bytes? +Normally, @code{read} waits until there are 50 bytes in the buffer (or, +more generally, the wait condition described above is satisfied), and +then reads 10 of them, leaving the other 40 buffered in the operating +system for a subsequent call to @code{read}. + +@strong{Portability note:} On some systems, the MIN and TIME slots are +actually the same as the EOF and EOL slots. This causes no serious +problem because the MIN and TIME slots are used only in noncanonical +input and the EOF and EOL slots are used only in canonical input, but it +isn't very clean. The GNU library allocates separate slots for these +uses. + +@comment termios.h +@comment BSD +@deftypefun int cfmakeraw (struct termios *@var{termios-p}) +This function provides an easy way to set up @code{*@var{termios-p}} for +what has traditionally been called ``raw mode'' in BSD. 
This uses +noncanonical input, and turns off most processing to give an unmodified +channel to the terminal. + +It does exactly this: +@smallexample + @var{termios-p}->c_iflag &= ~(IGNBRK|BRKINT|PARMRK|ISTRIP + |INLCR|IGNCR|ICRNL|IXON); + @var{termios-p}->c_oflag &= ~OPOST; + @var{termios-p}->c_lflag &= ~(ECHO|ECHONL|ICANON|ISIG|IEXTEN); + @var{termios-p}->c_cflag &= ~(CSIZE|PARENB); + @var{termios-p}->c_cflag |= CS8; +@end smallexample +@end deftypefun + +@node Line Control +@section Line Control Functions +@cindex terminal line control functions + +These functions perform miscellaneous control actions on terminal +devices. As regards terminal access, they are treated like doing +output: if any of these functions is used by a background process on its +controlling terminal, normally all processes in the process group are +sent a @code{SIGTTOU} signal. The exception is if the calling process +itself is ignoring or blocking @code{SIGTTOU} signals, in which case the +operation is performed and no signal is sent. @xref{Job Control}. + +@cindex break condition, generating +@comment termios.h +@comment POSIX.1 +@deftypefun int tcsendbreak (int @var{filedes}, int @var{duration}) +This function generates a break condition by transmitting a stream of +zero bits on the terminal associated with the file descriptor +@var{filedes}. The duration of the break is controlled by the +@var{duration} argument. If zero, the duration is between 0.25 and 0.5 +seconds. The meaning of a nonzero value depends on the operating system. + +This function does nothing if the terminal is not an asynchronous serial +data port. + +The return value is normally zero. In the event of an error, a value +of @code{-1} is returned. The following @code{errno} error conditions +are defined for this function: + +@table @code +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@item ENOTTY +The @var{filedes} is not associated with a terminal device. 
+@end table +@end deftypefun + + +@cindex flushing terminal output queue +@cindex terminal output queue, flushing +@comment termios.h +@comment POSIX.1 +@deftypefun int tcdrain (int @var{filedes}) +The @code{tcdrain} function waits until all queued +output to the terminal @var{filedes} has been transmitted. + +The return value is normally zero. In the event of an error, a value +of @code{-1} is returned. The following @code{errno} error conditions +are defined for this function: + +@table @code +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@item ENOTTY +The @var{filedes} is not associated with a terminal device. + +@item EINTR +The operation was interrupted by delivery of a signal. +@xref{Interrupted Primitives}. +@end table +@end deftypefun + + +@cindex clearing terminal input queue +@cindex terminal input queue, clearing +@comment termios.h +@comment POSIX.1 +@deftypefun int tcflush (int @var{filedes}, int @var{queue}) +The @code{tcflush} function is used to clear the input and/or output +queues associated with the terminal file @var{filedes}. The @var{queue} +argument specifies which queue(s) to clear, and can be one of the +following values: + +@c Extra blank lines here make it look better. +@table @code +@vindex TCIFLUSH +@item TCIFLUSH + +Clear any input data received, but not yet read. + +@vindex TCOFLUSH +@item TCOFLUSH + +Clear any output data written, but not yet transmitted. + +@vindex TCIOFLUSH +@item TCIOFLUSH + +Clear both queued input and output. +@end table + +The return value is normally zero. In the event of an error, a value +of @code{-1} is returned. The following @code{errno} error conditions +are defined for this function: + +@table @code +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@item ENOTTY +The @var{filedes} is not associated with a terminal device. + +@item EINVAL +A bad value was supplied as the @var{queue} argument. 
+@end table + +It is unfortunate that this function is named @code{tcflush}, because +the term ``flush'' is normally used for quite another operation---waiting +until all output is transmitted---and using it for discarding input or +output would be confusing. Unfortunately, the name @code{tcflush} comes +from POSIX and we cannot change it. +@end deftypefun + +@cindex flow control, terminal +@cindex terminal flow control +@comment termios.h +@comment POSIX.1 +@deftypefun int tcflow (int @var{filedes}, int @var{action}) +The @code{tcflow} function is used to perform operations relating to +XON/XOFF flow control on the terminal file specified by @var{filedes}. + +The @var{action} argument specifies what operation to perform, and can +be one of the following values: + +@table @code +@vindex TCOOFF +@item TCOOFF +Suspend transmission of output. + +@vindex TCOON +@item TCOON +Restart transmission of output. + +@vindex TCIOFF +@item TCIOFF +Transmit a STOP character. + +@vindex TCION +@item TCION +Transmit a START character. +@end table + +For more information about the STOP and START characters, see @ref{Special +Characters}. + +The return value is normally zero. In the event of an error, a value +of @code{-1} is returned. The following @code{errno} error conditions +are defined for this function: + +@table @code +@vindex EBADF +@item EBADF +The @var{filedes} is not a valid file descriptor. + +@vindex ENOTTY +@item ENOTTY +The @var{filedes} is not associated with a terminal device. + +@vindex EINVAL +@item EINVAL +A bad value was supplied as the @var{action} argument. +@end table +@end deftypefun + +@node Noncanon Example +@section Noncanonical Mode Example + +Here is an example program that shows how you can set up a terminal +device to read single characters in noncanonical input mode, without +echo. 
+ +@smallexample +@include termios.c.texi +@end smallexample + +This program is careful to restore the original terminal modes before +exiting or terminating with a signal. It uses the @code{atexit} +function (@pxref{Cleanups on Exit}) to make sure this is done +by @code{exit}. + +@ignore +@c !!!! the example doesn't handle any signals! +The signals handled in the example are the ones that typically occur due +to actions of the user. It might be desirable to handle other signals +such as SIGSEGV that can result from bugs in the program. +@end ignore + +The shell is supposed to take care of resetting the terminal modes when +a process is stopped or continued; see @ref{Job Control}. But some +existing shells do not actually do this, so you may wish to establish +handlers for job control signals that reset terminal modes. The above +example does so. diff --git a/manual/time.texi b/manual/time.texi new file mode 100644 index 0000000000..767c318a42 --- /dev/null +++ b/manual/time.texi @@ -0,0 +1,1574 @@ +@node Date and Time, Non-Local Exits, Arithmetic, Top +@chapter Date and Time + +This chapter describes functions for manipulating dates and times, +including functions for determining what the current time is and +conversion between different time representations. + +The time functions fall into three main categories: + +@itemize @bullet +@item +Functions for measuring elapsed CPU time are discussed in @ref{Processor +Time}. + +@item +Functions for measuring absolute clock or calendar time are discussed in +@ref{Calendar Time}. + +@item +Functions for setting alarms and timers are discussed in @ref{Setting +an Alarm}. +@end itemize + +@menu +* Processor Time:: Measures processor time used by a program. +* Calendar Time:: Manipulation of ``real'' dates and times. +* Setting an Alarm:: Sending a signal after a specified time. +* Sleeping:: Waiting for a period of time. +* Resource Usage:: Measuring various resources used. 
+* Limits on Resources:: Specifying limits on resource usage. +* Priority:: Reading or setting process run priority. +@end menu + +@node Processor Time +@section Processor Time + +If you're trying to optimize your program or measure its efficiency, it's +very useful to be able to know how much @dfn{processor time} or @dfn{CPU +time} it has used at any given point. Processor time is different from +actual wall clock time because it doesn't include any time spent waiting +for I/O or when some other process is running. Processor time is +represented by the data type @code{clock_t}, and is given as a number of +@dfn{clock ticks} relative to an arbitrary base time marking the beginning +of a single program invocation. +@cindex CPU time +@cindex processor time +@cindex clock ticks +@cindex ticks, clock +@cindex time, elapsed CPU + +@menu +* Basic CPU Time:: The @code{clock} function. +* Detailed CPU Time:: The @code{times} function. +@end menu + +@node Basic CPU Time +@subsection Basic CPU Time Inquiry + +To get the elapsed CPU time used by a process, you can use the +@code{clock} function. This facility is declared in the header file +@file{time.h}. +@pindex time.h + +In typical usage, you call the @code{clock} function at the beginning and +end of the interval you want to time, subtract the values, and then divide +by @code{CLOCKS_PER_SEC} (the number of clock ticks per second), like this: + +@smallexample +@group +#include <time.h> + +clock_t start, end; +double elapsed; + +start = clock(); +@dots{} /* @r{Do the work.} */ +end = clock(); +elapsed = ((double) (end - start)) / CLOCKS_PER_SEC; +@end group +@end smallexample + +Different computers and operating systems vary wildly in how they keep +track of processor time. It's common for the internal processor clock +to have a resolution somewhere between hundredths and millionths of a +second. + +In the GNU system, @code{clock_t} is equivalent to @code{long int} and +@code{CLOCKS_PER_SEC} is an integer value. 
But in other systems, both +@code{clock_t} and the type of the macro @code{CLOCKS_PER_SEC} can be +either integer or floating-point types. Casting processor time values +to @code{double}, as in the example above, makes sure that operations +such as arithmetic and printing work properly and consistently no matter +what the underlying representation is. + +@comment time.h +@comment ANSI +@deftypevr Macro int CLOCKS_PER_SEC +The value of this macro is the number of clock ticks per second measured +by the @code{clock} function. +@end deftypevr + +@comment time.h +@comment POSIX.1 +@deftypevr Macro int CLK_TCK +This is an obsolete name for @code{CLOCKS_PER_SEC}. +@end deftypevr + +@comment time.h +@comment ANSI +@deftp {Data Type} clock_t +This is the type of the value returned by the @code{clock} function. +Values of type @code{clock_t} are in units of clock ticks. +@end deftp + +@comment time.h +@comment ANSI +@deftypefun clock_t clock (void) +This function returns the elapsed processor time. The base time is +arbitrary but doesn't change within a single process. If the processor +time is not available or cannot be represented, @code{clock} returns the +value @code{(clock_t)(-1)}. +@end deftypefun + + +@node Detailed CPU Time +@subsection Detailed Elapsed CPU Time Inquiry + +The @code{times} function returns more detailed information about +elapsed processor time in a @w{@code{struct tms}} object. You should +include the header file @file{sys/times.h} to use this facility. +@pindex sys/times.h + +@comment sys/times.h +@comment POSIX.1 +@deftp {Data Type} {struct tms} +The @code{tms} structure is used to return information about process +times. It contains at least the following members: + +@table @code +@item clock_t tms_utime +This is the CPU time used in executing the instructions of the calling +process. + +@item clock_t tms_stime +This is the CPU time used by the system on behalf of the calling process. 
+ +@item clock_t tms_cutime +This is the sum of the @code{tms_utime} values and the @code{tms_cutime} +values of all terminated child processes of the calling process, whose +status has been reported to the parent process by @code{wait} or +@code{waitpid}; see @ref{Process Completion}. In other words, it +represents the total CPU time used in executing the instructions of all +the terminated child processes of the calling process, excluding child +processes which have not yet been reported by @code{wait} or +@code{waitpid}. + +@item clock_t tms_cstime +This is similar to @code{tms_cutime}, but represents the total CPU time +used by the system on behalf of all the terminated child processes of the +calling process. +@end table + +All of the times are given in clock ticks. These are absolute values; in a +newly created process, they are all zero. @xref{Creating a Process}. +@end deftp + +@comment sys/times.h +@comment POSIX.1 +@deftypefun clock_t times (struct tms *@var{buffer}) +The @code{times} function stores the processor time information for +the calling process in @var{buffer}. + +The return value is the elapsed real time, in clock ticks, relative to +an arbitrary base. Unlike the value of @code{clock}, it measures wall +clock time rather than processor time. The base is a constant within a +particular process, and typically represents the time since system +start-up. A value of @code{(clock_t)(-1)} is returned to indicate failure. +@end deftypefun + +@strong{Portability Note:} The @code{clock} function described in +@ref{Basic CPU Time} is specified by the ANSI C standard. The +@code{times} function is a feature of POSIX.1. In the GNU system, the +value returned by the @code{clock} function is equivalent to the sum of +the @code{tms_utime} and @code{tms_stime} fields returned by +@code{times}. + +@node Calendar Time +@section Calendar Time + +This section describes facilities for keeping track of dates and times +according to the Gregorian calendar.
+@cindex Gregorian calendar +@cindex time, calendar +@cindex date and time + +There are three representations for date and time information: + +@itemize @bullet +@item +@dfn{Calendar time} (the @code{time_t} data type) is a compact +representation, typically giving the number of seconds elapsed since +some implementation-specific base time. +@cindex calendar time + +@item +There is also a @dfn{high-resolution time} representation (the @code{struct +timeval} data type) that includes fractions of a second. Use this time +representation instead of ordinary calendar time when you need greater +precision. +@cindex high-resolution time + +@item +@dfn{Local time} or @dfn{broken-down time} (the @code{struct +tm} data type) represents the date and time as a set of components +specifying the year, month, and so on, for a specific time zone. +This time representation is usually used in conjunction with formatting +date and time values. +@cindex local time +@cindex broken-down time +@end itemize + +@menu +* Simple Calendar Time:: Facilities for manipulating calendar time. +* High-Resolution Calendar:: A time representation with greater precision. +* Broken-down Time:: Facilities for manipulating local time. +* Formatting Date and Time:: Converting times to strings. +* TZ Variable:: How users specify the time zone. +* Time Zone Functions:: Functions to examine or specify the time zone. +* Time Functions Example:: An example program showing use of some of + the time functions. +@end menu + +@node Simple Calendar Time +@subsection Simple Calendar Time + +This section describes the @code{time_t} data type for representing +calendar time, and the functions which operate on calendar time objects. +These facilities are declared in the header file @file{time.h}. +@pindex time.h + +@cindex epoch +@comment time.h +@comment ANSI +@deftp {Data Type} time_t +This is the data type used to represent calendar time. 
In the GNU C +library and other POSIX-compliant implementations, @code{time_t} is +equivalent to @code{long int}. When interpreted as an absolute time +value, it represents the number of seconds elapsed since 00:00:00 on +January 1, 1970, Coordinated Universal Time. (This date is sometimes +referred to as the @dfn{epoch}.) + +In other systems, @code{time_t} might be either an integer or +floating-point type. +@end deftp + +@comment time.h +@comment ANSI +@deftypefun double difftime (time_t @var{time1}, time_t @var{time0}) +The @code{difftime} function returns the number of seconds elapsed +between time @var{time1} and time @var{time0}, as a value of type +@code{double}. + +In the GNU system, you can simply subtract @code{time_t} values. But on +other systems, the @code{time_t} data type might use some other encoding +where subtraction doesn't work directly. +@end deftypefun + +@comment time.h +@comment ANSI +@deftypefun time_t time (time_t *@var{result}) +The @code{time} function returns the current time as a value of type +@code{time_t}. If the argument @var{result} is not a null pointer, the +time value is also stored in @code{*@var{result}}. If the calendar +time is not available, the value @w{@code{(time_t)(-1)}} is returned. +@end deftypefun + + +@node High-Resolution Calendar +@subsection High-Resolution Calendar + +The @code{time_t} data type used to represent calendar times has a +resolution of only one second. Some applications need more precision. + +So, the GNU C library also contains functions which are capable of +representing calendar times to a higher resolution than one second. The +functions and the associated data types described in this section are +declared in @file{sys/time.h}. +@pindex sys/time.h + +@comment sys/time.h +@comment BSD +@deftp {Data Type} {struct timeval} +The @code{struct timeval} structure represents a calendar time. 
It +has the following members: + +@table @code +@item long int tv_sec +This represents the number of seconds since the epoch. It is equivalent +to a normal @code{time_t} value. + +@item long int tv_usec +This is the fractional second value, represented as the number of +microseconds. + +Sometimes @code{struct timeval} values are used for time intervals. In +that case the @code{tv_sec} member is the number of seconds in the +interval, and @code{tv_usec} is the number of additional microseconds. +@end table +@end deftp + +@comment sys/time.h +@comment BSD +@deftp {Data Type} {struct timezone} +The @code{struct timezone} structure is used to hold minimal information +about the local time zone. It has the following members: + +@table @code +@item int tz_minuteswest +This is the number of minutes west of GMT. + +@item int tz_dsttime +If nonzero, daylight savings time applies during some part of the year. +@end table + +The @code{struct timezone} type is obsolete and should never be used. +Instead, use the facilities described in @ref{Time Zone Functions}. +@end deftp + +It is often necessary to subtract two values of type @w{@code{struct +timeval}}. Here is the best way to do this. It works even on some +peculiar operating systems where the @code{tv_sec} member has an +unsigned type.
+ +@smallexample +/* @r{Subtract the `struct timeval' values X and Y,} + @r{storing the result in RESULT.} + @r{Return 1 if the difference is negative, otherwise 0.} */ + +int +timeval_subtract (result, x, y) + struct timeval *result, *x, *y; +@{ + /* @r{Perform the carry for the later subtraction by updating @var{y}.} */ + if (x->tv_usec < y->tv_usec) @{ + int nsec = (y->tv_usec - x->tv_usec) / 1000000 + 1; + y->tv_usec -= 1000000 * nsec; + y->tv_sec += nsec; + @} + if (x->tv_usec - y->tv_usec > 1000000) @{ + int nsec = (x->tv_usec - y->tv_usec) / 1000000; + y->tv_usec += 1000000 * nsec; + y->tv_sec -= nsec; + @} + + /* @r{Compute the difference.} + @r{@code{tv_usec} is certainly positive.} */ + result->tv_sec = x->tv_sec - y->tv_sec; + result->tv_usec = x->tv_usec - y->tv_usec; + + /* @r{Return 1 if result is negative.} */ + return x->tv_sec < y->tv_sec; +@} +@end smallexample + +@comment sys/time.h +@comment BSD +@deftypefun int gettimeofday (struct timeval *@var{tp}, struct timezone *@var{tzp}) +The @code{gettimeofday} function returns the current date and time in the +@code{struct timeval} structure indicated by @var{tp}. Information about the +time zone is returned in the structure pointed at by @var{tzp}. If the @var{tzp} +argument is a null pointer, time zone information is ignored. + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error condition is defined for this function: + +@table @code +@item ENOSYS +The operating system does not support getting time zone information, and +@var{tzp} is not a null pointer. The GNU operating system does not +support using @w{@code{struct timezone}} to represent time zone +information; that is an obsolete feature of 4.3 BSD. +Instead, use the facilities described in @ref{Time Zone Functions}.
+@end table +@end deftypefun + +@comment sys/time.h +@comment BSD +@deftypefun int settimeofday (const struct timeval *@var{tp}, const struct timezone *@var{tzp}) +The @code{settimeofday} function sets the current date and time +according to the arguments. As for @code{gettimeofday}, time zone +information is ignored if @var{tzp} is a null pointer. + +You must be a privileged user in order to use @code{settimeofday}. + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EPERM +This process cannot set the time because it is not privileged. + +@item ENOSYS +The operating system does not support setting time zone information, and +@var{tzp} is not a null pointer. +@end table +@end deftypefun + +@comment sys/time.h +@comment BSD +@deftypefun int adjtime (const struct timeval *@var{delta}, struct timeval *@var{olddelta}) +This function speeds up or slows down the system clock in order to make +gradual adjustments in the current time. This ensures that the time +reported by the system clock is always monotonically increasing, which +might not happen if you simply set the current time. + +The @var{delta} argument specifies a relative adjustment to be made to +the current time. If negative, the system clock is slowed down for a +while until it has lost this much time. If positive, the system clock +is speeded up for a while. + +If the @var{olddelta} argument is not a null pointer, the @code{adjtime} +function returns information about any previous time adjustment that +has not yet completed. + +This function is typically used to synchronize the clocks of computers +in a local network. You must be a privileged user to use it. +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error condition is defined for this function: + +@table @code +@item EPERM +You do not have privilege to set the time. 
+@end table +@end deftypefun + +@strong{Portability Note:} The @code{gettimeofday}, @code{settimeofday}, +and @code{adjtime} functions are derived from BSD. + + +@node Broken-down Time +@subsection Broken-down Time +@cindex broken-down time +@cindex calendar time and broken-down time + +Calendar time is represented as a number of seconds. This is convenient +for calculation, but has no resemblance to the way people normally +represent dates and times. By contrast, @dfn{broken-down time} is a binary +representation separated into year, month, day, and so on. Broken down +time values are not useful for calculations, but they are useful for +printing human readable time. + +A broken-down time value is always relative to a choice of local time +zone, and it also indicates which time zone was used. + +The symbols in this section are declared in the header file @file{time.h}. + +@comment time.h +@comment ANSI +@deftp {Data Type} {struct tm} +This is the data type used to represent a broken-down time. The structure +contains at least the following members, which can appear in any order: + +@table @code +@item int tm_sec +This is the number of seconds after the minute, normally in the range +@code{0} to @code{59}. (The actual upper limit is @code{61}, to allow +for ``leap seconds''.) +@cindex leap second + +@item int tm_min +This is the number of minutes after the hour, in the range @code{0} to +@code{59}. + +@item int tm_hour +This is the number of hours past midnight, in the range @code{0} to +@code{23}. + +@item int tm_mday +This is the day of the month, in the range @code{1} to @code{31}. + +@item int tm_mon +This is the number of months since January, in the range @code{0} to +@code{11}. + +@item int tm_year +This is the number of years since @code{1900}. + +@item int tm_wday +This is the number of days since Sunday, in the range @code{0} to @code{6}. + +@item int tm_yday +This is the number of days since January 1, in the range @code{0} to +@code{365}. 
+ +@item int tm_isdst +@cindex Daylight Saving Time +@cindex summer time +This is a flag that indicates whether Daylight Saving Time is (or was, or +will be) in effect at the time described. The value is positive if +Daylight Saving Time is in effect, zero if it is not, and negative if the +information is not available. + +@item long int tm_gmtoff +This field describes the time zone that was used to compute this +broken-down time value; it is the amount you must add to the local time +in that zone to get GMT, in units of seconds. The value is like that of +the variable @code{timezone} (@pxref{Time Zone Functions}). You can +also think of this as the ``number of seconds west'' of GMT. The +@code{tm_gmtoff} field is a GNU library extension. + +@item const char *tm_zone +This field is the three-letter name for the time zone that was used to +compute this broken-down time value. It is a GNU library extension. +@end table +@end deftp + +@comment time.h +@comment ANSI +@deftypefun {struct tm *} localtime (const time_t *@var{time}) +The @code{localtime} function converts the calendar time pointed to by +@var{time} to broken-down time representation, expressed relative to the +user's specified time zone. + +The return value is a pointer to a static broken-down time structure, which +might be overwritten by subsequent calls to any of the date and time +functions. (But no other library function overwrites the contents of this +object.) + +Calling @code{localtime} has one other effect: it sets the variable +@code{tzname} with information about the current time zone. @xref{Time +Zone Functions}. +@end deftypefun + +@comment time.h +@comment ANSI +@deftypefun {struct tm *} gmtime (const time_t *@var{time}) +This function is similar to @code{localtime}, except that the broken-down +time is expressed as Coordinated Universal Time (UTC)---that is, as +Greenwich Mean Time (GMT) rather than relative to the local time zone. 
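As a rough illustration of converting a calendar time to broken-down UTC, the following sketch combines @code{gmtime} with @code{strftime}. The helper name @code{format_utc} and the ISO 8601 style template are assumptions made for this example; they are not part of the library.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper (not a library function): write the calendar
   time T into BUF as UTC, for example "1991-07-31T17:02:36Z".
   Return 0 on success, -1 if the conversion or formatting fails.  */
int
format_utc (time_t t, char *buf, size_t size)
{
  struct tm utc;
  struct tm *tmp = gmtime (&t);

  if (tmp == NULL)
    return -1;

  /* Copy the result at once, because the static object may be
     overwritten by later calls to the date and time functions.  */
  utc = *tmp;

  /* strftime yields 0 here when BUF cannot hold the whole string.  */
  if (strftime (buf, size, "%Y-%m-%dT%H:%M:%SZ", &utc) == 0)
    return -1;
  return 0;
}
```

The same calendar time passed to @code{localtime} instead would give the components for the user's time zone; only the conversion function changes.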
+ +Recall that calendar times are @emph{always} expressed in coordinated +universal time. +@end deftypefun + +@comment time.h +@comment ANSI +@deftypefun time_t mktime (struct tm *@var{brokentime}) +The @code{mktime} function is used to convert a broken-down time structure +to a calendar time representation. It also ``normalizes'' the contents of +the broken-down time structure, by filling in the day of week and day of +year based on the other date and time components. + +The @code{mktime} function ignores the specified contents of the +@code{tm_wday} and @code{tm_yday} members of the broken-down time +structure. It uses the values of the other components to compute the +calendar time; it's permissible for these components to have +unnormalized values outside of their normal ranges. The last thing that +@code{mktime} does is adjust the components of the @var{brokentime} +structure (including the @code{tm_wday} and @code{tm_yday}). + +If the specified broken-down time cannot be represented as a calendar time, +@code{mktime} returns a value of @code{(time_t)(-1)} and does not modify +the contents of @var{brokentime}. + +Calling @code{mktime} also sets the variable @code{tzname} with +information about the current time zone. @xref{Time Zone Functions}. +@end deftypefun + +@node Formatting Date and Time +@subsection Formatting Date and Time + +The functions described in this section format time values as strings. +These functions are declared in the header file @file{time.h}. +@pindex time.h + +@comment time.h +@comment ANSI +@deftypefun {char *} asctime (const struct tm *@var{brokentime}) +The @code{asctime} function converts the broken-down time value that +@var{brokentime} points to into a string in a standard format: + +@smallexample +"Tue May 21 13:46:22 1991\n" +@end smallexample + +The abbreviations for the days of week are: @samp{Sun}, @samp{Mon}, +@samp{Tue}, @samp{Wed}, @samp{Thu}, @samp{Fri}, and @samp{Sat}. 
+ +The abbreviations for the months are: @samp{Jan}, @samp{Feb}, +@samp{Mar}, @samp{Apr}, @samp{May}, @samp{Jun}, @samp{Jul}, @samp{Aug}, +@samp{Sep}, @samp{Oct}, @samp{Nov}, and @samp{Dec}. + +The return value points to a statically allocated string, which might be +overwritten by subsequent calls to any of the date and time functions. +(But no other library function overwrites the contents of this +string.) +@end deftypefun + +@comment time.h +@comment ANSI +@deftypefun {char *} ctime (const time_t *@var{time}) +The @code{ctime} function is similar to @code{asctime}, except that the +time value is specified as a @code{time_t} calendar time value rather +than in broken-down local time format. It is equivalent to + +@smallexample +asctime (localtime (@var{time})) +@end smallexample + +@code{ctime} sets the variable @code{tzname}, because @code{localtime} +does so. @xref{Time Zone Functions}. +@end deftypefun + +@comment time.h +@comment ANSI +@deftypefun size_t strftime (char *@var{s}, size_t @var{size}, const char *@var{template}, const struct tm *@var{brokentime}) +This function is similar to the @code{sprintf} function (@pxref{Formatted +Input}), but the conversion specifications that can appear in the format +template @var{template} are specialized for printing components of the date +and time @var{brokentime} according to the locale currently specified for +time conversion (@pxref{Locales}). + +Ordinary characters appearing in the @var{template} are copied to the +output string @var{s}; this can include multibyte character sequences. +Conversion specifiers are introduced by a @samp{%} character, and are +replaced in the output string as follows: + +@table @code +@item %a +The abbreviated weekday name according to the current locale. + +@item %A +The full weekday name according to the current locale. + +@item %b +The abbreviated month name according to the current locale. + +@item %B +The full month name according to the current locale. 
+ +@item %c +The preferred date and time representation for the current locale. + +@item %d +The day of the month as a decimal number (range @code{01} to @code{31}). + +@item %H +The hour as a decimal number, using a 24-hour clock (range @code{00} to +@code{23}). + +@item %I +The hour as a decimal number, using a 12-hour clock (range @code{01} to +@code{12}). + +@item %j +The day of the year as a decimal number (range @code{001} to @code{366}). + +@item %m +The month as a decimal number (range @code{01} to @code{12}). + +@item %M +The minute as a decimal number (range @code{00} to @code{59}). + +@item %p +Either @samp{am} or @samp{pm}, according to the given time value; or the +corresponding strings for the current locale. + +@item %S +The second as a decimal number. + +@item %U +The week number of the current year as a decimal number, starting with +the first Sunday as the first day of the first week. + +@item %W +The week number of the current year as a decimal number, starting with +the first Monday as the first day of the first week. + +@item %w +The day of the week as a decimal number, Sunday being @code{0}. + +@item %x +The preferred date representation for the current locale, but without the +time. + +@item %X +The preferred time representation for the current locale, but with no date. + +@item %y +The year as a decimal number, but without a century (range @code{00} to +@code{99}). + +@item %Y +The year as a decimal number, including the century. + +@item %Z +The time zone name or abbreviation (empty if the time zone can't be +determined). + +@item %% +A literal @samp{%} character. +@end table + +The @var{size} parameter can be used to specify the maximum number of +characters to be stored in the array @var{s}, including the terminating +null character. If the formatted time requires more than @var{size} +characters, the excess characters are discarded.
The return value from +@code{strftime} is the number of characters placed in the array @var{s}, +not including the terminating null character. If the value equals +@var{size}, it means that the array @var{s} was too small; you should +repeat the call, providing a bigger array. + +If @var{s} is a null pointer, @code{strftime} does not actually write +anything, but instead returns the number of characters it would have written. + +For an example of @code{strftime}, see @ref{Time Functions Example}. +@end deftypefun + +@node TZ Variable +@subsection Specifying the Time Zone with @code{TZ} + +In POSIX systems, a user can specify the time zone by means of the +@code{TZ} environment variable. For information about how to set +environment variables, see @ref{Environment Variables}. The functions +for accessing the time zone are declared in @file{time.h}. +@pindex time.h +@cindex time zone + +You should not normally need to set @code{TZ}. If the system is +configured properly, the default time zone will be correct. You might +set @code{TZ} if you are using a computer over the network from a +different time zone, and would like times reported to you in the time zone +that is local for you, rather than what is local for the computer. + +In POSIX.1 systems the value of the @code{TZ} variable can be in one of +three formats. With the GNU C library, the most common format is the +last one, which can specify a selection from a large database of time +zone information for many regions of the world. The first two formats +are used to describe the time zone information directly, which is both +more cumbersome and less precise. But the POSIX.1 standard only +specifies the details of the first two formats, so it is good to be +familiar with them in case you come across a POSIX.1 system that doesn't +support a time zone information database.
+ +The first format is used when there is no Daylight Saving Time (or +summer time) in the local time zone: + +@smallexample +@r{@var{std} @var{offset}} +@end smallexample + +The @var{std} string specifies the name of the time zone. It must be +three or more characters long and must not contain a leading colon or +embedded digits, commas, or plus or minus signs. There is no space +character separating the time zone name from the @var{offset}, so these +restrictions are necessary to parse the specification correctly. + +The @var{offset} specifies the time value one must add to the local time +to get a Coordinated Universal Time value. It has syntax like +[@code{+}|@code{-}]@var{hh}[@code{:}@var{mm}[@code{:}@var{ss}]]. This +is positive if the local time zone is west of the Prime Meridian and +negative if it is east. The hour must be between @code{0} and +@code{24}, and the minute and seconds between @code{0} and @code{59}. + +For example, here is how we would specify Eastern Standard Time, but +without any daylight savings time alternative: + +@smallexample +EST+5 +@end smallexample + +The second format is used when there is Daylight Saving Time: + +@smallexample +@r{@var{std} @var{offset} @var{dst} [@var{offset}]@code{,}@var{start}[@code{/}@var{time}]@code{,}@var{end}[@code{/}@var{time}]} +@end smallexample + +The initial @var{std} and @var{offset} specify the standard time zone, as +described above. The @var{dst} string and @var{offset} specify the name +and offset for the corresponding daylight savings time time zone; if the +@var{offset} is omitted, it defaults to one hour ahead of standard time. + +The remainder of the specification describes when daylight savings time is +in effect. The @var{start} field is when daylight savings time goes into +effect and the @var{end} field is when the change is made back to standard +time. 
The following formats are recognized for these fields: + +@table @code +@item J@var{n} +This specifies the Julian day, with @var{n} between @code{1} and @code{365}. +February 29 is never counted, even in leap years. + +@item @var{n} +This specifies the Julian day, with @var{n} between @code{0} and @code{365}. +February 29 is counted in leap years. + +@item M@var{m}.@var{w}.@var{d} +This specifies day @var{d} of week @var{w} of month @var{m}. The day +@var{d} must be between @code{0} (Sunday) and @code{6}. The week +@var{w} must be between @code{1} and @code{5}; week @code{1} is the +first week in which day @var{d} occurs, and week @code{5} specifies the +@emph{last} @var{d} day in the month. The month @var{m} should be +between @code{1} and @code{12}. +@end table + +The @var{time} fields specify when, in the local time currently in +effect, the change to the other time occurs. If omitted, the default is +@code{02:00:00}. + +For example, here is how one would specify the Eastern time zone in the +United States, including the appropriate daylight saving time and its dates +of applicability. The normal offset from GMT is 5 hours; since this is +west of the prime meridian, the sign is positive. Summer time begins on +the first Sunday in April at 2:00am, and ends on the last Sunday in October +at 2:00am. + +@smallexample +EST+5EDT,M4.1.0/2,M10.5.0/2 +@end smallexample + +The schedule of daylight savings time in any particular jurisdiction has +changed over the years. To be strictly correct, the conversion of dates +and times in the past should be based on the schedule that was in effect +then. However, this format has no facilities to let you specify how the +schedule has changed from year to year. The most you can do is specify +one particular schedule---usually the present day schedule---and this is +used to convert any date, no matter when. For precise time zone +specifications, it is best to use the time zone information database +(see below).
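To see how a specification in one of the first two formats affects conversion, a program can set @code{TZ} itself and call @code{tzset} before @code{localtime}. The sketch below is only an illustration: it assumes the POSIX @code{setenv} function is available, and the helper name @code{breakdown_in_zone} is made up for this example.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper (not a library function): convert the calendar
   time T to broken-down time under the TZ specification SPEC, storing
   the result in *OUT.  Return 0 on success, -1 on failure.
   Note that this changes TZ for the whole process.  */
int
breakdown_in_zone (const char *spec, time_t t, struct tm *out)
{
  struct tm *tmp;

  /* Assumption: setenv comes from POSIX, not from ANSI C.  */
  if (setenv ("TZ", spec, 1) != 0)
    return -1;

  /* Make the library reread TZ before converting.  */
  tzset ();

  tmp = localtime (&t);
  if (tmp == NULL)
    return -1;

  /* Copy the static result before any later call overwrites it.  */
  *out = *tmp;
  return 0;
}
```

For instance, passing the specification @samp{EST+5EDT,M4.1.0/2,M10.5.0/2} and a calendar time that falls in July should give a positive @code{tm_isdst} and an hour four behind UTC, while @samp{EST+5} alone gives an hour five behind UTC and a zero @code{tm_isdst}.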
+ +The third format looks like this: + +@smallexample +:@var{characters} +@end smallexample + +Each operating system interprets this format differently; in the GNU C +library, @var{characters} is the name of a file which describes the time +zone. + +@pindex /etc/localtime +@pindex localtime +If the @code{TZ} environment variable does not have a value, the +operation chooses a time zone by default. In the GNU C library, the +default time zone is like the specification @samp{TZ=:/etc/localtime} +(or @samp{TZ=:/usr/local/etc/localtime}, depending on how GNU C library +was configured; @pxref{Installation}). Other C libraries use their own +rule for choosing the default time zone, so there is little we can say +about them. + +@cindex time zone database +@pindex /share/lib/zoneinfo +@pindex zoneinfo +If @var{characters} begins with a slash, it is an absolute file name; +otherwise the library looks for the file +@w{@file{/share/lib/zoneinfo/@var{characters}}}. The @file{zoneinfo} +directory contains data files describing local time zones in many +different parts of the world. The names represent major cities, with +subdirectories for geographical areas; for example, +@file{America/New_York}, @file{Europe/London}, @file{Asia/Hong_Kong}. +These data files are installed by the system administrator, who also +sets @file{/etc/localtime} to point to the data file for the local time +zone. The GNU C library comes with a large database of time zone +information for most regions of the world, which is maintained by a +community of volunteers and put in the public domain. + +@node Time Zone Functions +@subsection Functions and Variables for Time Zones + +@comment time.h +@comment POSIX.1 +@deftypevar char * tzname [2] +The array @code{tzname} contains two strings, which are the standard +three-letter names of the pair of time zones (standard and daylight +savings) that the user has selected. 
@code{tzname[0]} is the name of +the standard time zone (for example, @code{"EST"}), and @code{tzname[1]} +is the name for the time zone when daylight savings time is in use (for +example, @code{"EDT"}). These correspond to the @var{std} and @var{dst} +strings (respectively) from the @code{TZ} environment variable. + +The @code{tzname} array is initialized from the @code{TZ} environment +variable whenever @code{tzset}, @code{ctime}, @code{strftime}, +@code{mktime}, or @code{localtime} is called. +@end deftypevar + +@comment time.h +@comment POSIX.1 +@deftypefun void tzset (void) +The @code{tzset} function initializes the @code{tzname} variable from +the value of the @code{TZ} environment variable. It is not usually +necessary for your program to call this function, because it is called +automatically when you use the other time conversion functions that +depend on the time zone. +@end deftypefun + +The following variables are defined for compatibility with System V +Unix. These variables are set by calling @code{localtime}. + +@comment time.h +@comment SVID +@deftypevar {long int} timezone +This contains the difference between GMT and local standard time, in +seconds. For example, in the U.S. Eastern time zone, the value is +@code{5*60*60}. +@end deftypevar + +@comment time.h +@comment SVID +@deftypevar int daylight +This variable has a nonzero value if the standard U.S. daylight savings +time rules apply. +@end deftypevar + +@node Time Functions Example +@subsection Time Functions Example + +Here is an example program showing the use of some of the local time and +calendar time functions. + +@smallexample +@include strftim.c.texi +@end smallexample + +It produces output like this: + +@smallexample +Wed Jul 31 13:02:36 1991 +Today is Wednesday, July 31. +The time is 01:02 PM. 
+@end smallexample + + +@node Setting an Alarm +@section Setting an Alarm + +The @code{alarm} and @code{setitimer} functions provide a mechanism for a +process to interrupt itself at some future time. They do this by setting a +timer; when the timer expires, the process receives a signal. + +@cindex setting an alarm +@cindex interval timer, setting +@cindex alarms, setting +@cindex timers, setting +Each process has three independent interval timers available: + +@itemize @bullet +@item +A real-time timer that counts clock time. This timer sends a +@code{SIGALRM} signal to the process when it expires. +@cindex real-time timer +@cindex timer, real-time + +@item +A virtual timer that counts CPU time used by the process. This timer +sends a @code{SIGVTALRM} signal to the process when it expires. +@cindex virtual timer +@cindex timer, virtual + +@item +A profiling timer that counts both CPU time used by the process, and CPU +time spent in system calls on behalf of the process. This timer sends a +@code{SIGPROF} signal to the process when it expires. +@cindex profiling timer +@cindex timer, profiling + +This timer is useful for profiling in interpreters. The interval timer +mechanism does not have the fine granularity necessary for profiling +native code. +@c @xref{profil} !!! +@end itemize + +You can only have one timer of each kind set at any given time. If you +set a timer that has not yet expired, that timer is simply reset to the +new value. + +You should establish a handler for the appropriate alarm signal using +@code{signal} or @code{sigaction} before issuing a call to @code{setitimer} +or @code{alarm}. Otherwise, an unusual chain of events could cause the +timer to expire before your program establishes the handler, and in that +case it would be terminated, since that is the default action for the alarm +signals. @xref{Signal Handling}. + +The @code{setitimer} function is the primary means for setting an alarm. 
+This facility is declared in the header file @file{sys/time.h}. The +@code{alarm} function, declared in @file{unistd.h}, provides a somewhat +simpler interface for setting the real-time timer. +@pindex unistd.h +@pindex sys/time.h + +@comment sys/time.h +@comment BSD +@deftp {Data Type} {struct itimerval} +This structure is used to specify when a timer should expire. It contains +the following members: +@table @code +@item struct timeval it_interval +This is the interval between successive timer interrupts. If zero, the +alarm will only be sent once. + +@item struct timeval it_value +This is the interval to the first timer interrupt. If zero, the alarm is +disabled. +@end table + +The @code{struct timeval} data type is described in @ref{High-Resolution +Calendar}. +@end deftp + +@comment sys/time.h +@comment BSD +@deftypefun int setitimer (int @var{which}, struct itimerval *@var{new}, struct itimerval *@var{old}) +The @code{setitimer} function sets the timer specified by @var{which} +according to @var{new}. The @var{which} argument can have a value of +@code{ITIMER_REAL}, @code{ITIMER_VIRTUAL}, or @code{ITIMER_PROF}. + +If @var{old} is not a null pointer, @code{setitimer} returns information +about any previous unexpired timer of the same kind in the structure it +points to. + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item EINVAL +The timer interval was too large. +@end table +@end deftypefun + +@comment sys/time.h +@comment BSD +@deftypefun int getitimer (int @var{which}, struct itimerval *@var{old}) +The @code{getitimer} function stores information about the timer specified +by @var{which} in the structure pointed at by @var{old}. + +The return value and error conditions are the same as for @code{setitimer}. 
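
A minimal sketch of how @code{setitimer} and @code{getitimer} might be used together follows. The helper name @code{arm_real_timer} is hypothetical. The sketch installs no signal handler, so a caller would either establish one for @code{SIGALRM} first or disable the timer (by passing zero) before it expires.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper (not a library function): set the real-time timer
   to expire once, SECONDS seconds from now (zero disables it), then read
   the timer back.  Returns the whole seconds remaining, or -1 on
   failure.  */
long int
arm_real_timer (long int seconds)
{
  struct itimerval new_value, current;

  /* A zero it_interval means the timer does not repeat.  */
  new_value.it_interval.tv_sec = 0;
  new_value.it_interval.tv_usec = 0;
  new_value.it_value.tv_sec = seconds;
  new_value.it_value.tv_usec = 0;

  /* The previous setting is not needed here, so OLD is a null pointer.  */
  if (setitimer (ITIMER_REAL, &new_value, NULL) < 0)
    return -1;

  /* Ask the system how much time is left on the timer just set.  */
  if (getitimer (ITIMER_REAL, &current) < 0)
    return -1;
  return (long int) current.it_value.tv_sec;
}
```

The value read back is typically slightly less than the value that was set, because some time passes between the two calls.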
+@end deftypefun + +@comment sys/time.h +@comment BSD +@table @code +@item ITIMER_REAL +@findex ITIMER_REAL +This constant can be used as the @var{which} argument to the +@code{setitimer} and @code{getitimer} functions to specify the real-time +timer. + +@comment sys/time.h +@comment BSD +@item ITIMER_VIRTUAL +@findex ITIMER_VIRTUAL +This constant can be used as the @var{which} argument to the +@code{setitimer} and @code{getitimer} functions to specify the virtual +timer. + +@comment sys/time.h +@comment BSD +@item ITIMER_PROF +@findex ITIMER_PROF +This constant can be used as the @var{which} argument to the +@code{setitimer} and @code{getitimer} functions to specify the profiling +timer. +@end table + +@comment unistd.h +@comment POSIX.1 +@deftypefun {unsigned int} alarm (unsigned int @var{seconds}) +The @code{alarm} function sets the real-time timer to expire in +@var{seconds} seconds. If you want to cancel any existing alarm, you +can do this by calling @code{alarm} with a @var{seconds} argument of +zero. + +The return value indicates how many seconds remain before the previous +alarm would have been sent. If there is no previous alarm, @code{alarm} +returns zero. +@end deftypefun + +The @code{alarm} function could be defined in terms of @code{setitimer} +like this: + +@smallexample +unsigned int +alarm (unsigned int seconds) +@{ + struct itimerval old, new; + new.it_interval.tv_usec = 0; + new.it_interval.tv_sec = 0; + new.it_value.tv_usec = 0; + new.it_value.tv_sec = (long int) seconds; + if (setitimer (ITIMER_REAL, &new, &old) < 0) + return 0; + else + return old.it_value.tv_sec; +@} +@end smallexample + +There is an example showing the use of the @code{alarm} function in +@ref{Handler Returns}. + +If you simply want your process to wait for a given number of seconds, +you should use the @code{sleep} function. @xref{Sleeping}. + +You shouldn't count on the signal arriving precisely when the timer +expires. 
In a multiprocessing environment there is typically some +amount of delay involved. + +@strong{Portability Note:} The @code{setitimer} and @code{getitimer} +functions are derived from BSD Unix, while the @code{alarm} function is +specified by the POSIX.1 standard. @code{setitimer} is more powerful than +@code{alarm}, but @code{alarm} is more widely used. + +@node Sleeping +@section Sleeping + +The function @code{sleep} gives a simple way to make the program wait +for short periods of time. If your program doesn't use signals (except +to terminate), then you can expect @code{sleep} to wait reliably for +the specified amount of time. Otherwise, @code{sleep} can return sooner +if a signal arrives; if you want to wait for a given period regardless +of signals, use @code{select} (@pxref{Waiting for I/O}) and don't +specify any descriptors to wait for. +@c !!! select can get EINTR; using SA_RESTART makes sleep win too. + +@comment unistd.h +@comment POSIX.1 +@deftypefun {unsigned int} sleep (unsigned int @var{seconds}) +The @code{sleep} function waits for @var{seconds} seconds or until a +signal is delivered, whichever happens first. + +If the @code{sleep} function returns because the requested time has +elapsed, it returns a value of zero. If it returns because of delivery +of a signal, its return value is the time remaining in the sleep period, +in seconds. + +The @code{sleep} function is declared in @file{unistd.h}. +@end deftypefun + +Resist the temptation to implement a sleep for a fixed amount of time by +using the return value of @code{sleep}, when nonzero, to call +@code{sleep} again. This will work with a certain amount of accuracy as +long as signals arrive infrequently. But each signal can cause the +eventual wakeup time to be off by an additional second or so. Suppose a +few signals happen to arrive in rapid succession by bad luck---there is +no limit on how much this could shorten or lengthen the wait.
+ +Instead, compute the time at which the program should stop waiting, and +keep trying to wait until that time. This won't be off by more than a +second. With just a little more work, you can use @code{select} and +make the waiting period quite accurate. (Of course, heavy system load +can cause unavoidable additional delays---unless the machine is +dedicated to one application, there is no way you can avoid this.) + +On some systems, @code{sleep} can do strange things if your program uses +@code{SIGALRM} explicitly. Even if @code{SIGALRM} signals are being +ignored or blocked when @code{sleep} is called, @code{sleep} might +return prematurely on delivery of a @code{SIGALRM} signal. If you have +established a handler for @code{SIGALRM} signals and a @code{SIGALRM} +signal is delivered while the process is sleeping, the action taken +might be just to cause @code{sleep} to return instead of invoking your +handler. And, if @code{sleep} is interrupted by delivery of a signal +whose handler requests an alarm or alters the handling of @code{SIGALRM}, +this handler and @code{sleep} will interfere. + +On the GNU system, it is safe to use @code{sleep} and @code{SIGALRM} in +the same program, because @code{sleep} does not work by means of +@code{SIGALRM}. + +@node Resource Usage +@section Resource Usage + +@pindex sys/resource.h +The function @code{getrusage} and the data type @code{struct rusage} +are used for examining the usage figures of a process. They are declared +in @file{sys/resource.h}. + +@comment sys/resource.h +@comment BSD +@deftypefun int getrusage (int @var{processes}, struct rusage *@var{rusage}) +This function reports the usage totals for processes specified by +@var{processes}, storing the information in @code{*@var{rusage}}. + +In most systems, @var{processes} has only two valid values: + +@table @code +@comment sys/resource.h +@comment BSD +@item RUSAGE_SELF +Just the current process. 
+ +@comment sys/resource.h +@comment BSD +@item RUSAGE_CHILDREN +All child processes (direct and indirect) that have terminated already. +@end table + +In the GNU system, you can also inquire about a particular child process +by specifying its process ID. + +The return value of @code{getrusage} is zero for success, and @code{-1} +for failure. + +@table @code +@item EINVAL +The argument @var{processes} is not valid. +@end table +@end deftypefun + +One way of getting usage figures for a particular child process is with +the function @code{wait4}, which returns totals for a child when it +terminates. @xref{BSD Wait Functions}. + +@comment sys/resource.h +@comment BSD +@deftp {Data Type} {struct rusage} +This data type records a collection of usage amounts for various sorts of +resources. It has the following members, and possibly others: + +@table @code +@item struct timeval ru_utime +Time spent executing user instructions. + +@item struct timeval ru_stime +Time spent in operating system code on behalf of @var{processes}. + +@item long int ru_maxrss +The maximum resident set size used, in kilobytes. That is, the maximum +number of kilobytes that @var{processes} used in real memory simultaneously. + +@item long int ru_ixrss +An integral value expressed in kilobytes times ticks of execution, which +indicates the amount of memory used by text that was shared with other +processes. + +@item long int ru_idrss +An integral value expressed the same way, which is the amount of +unshared memory used in data. + +@item long int ru_isrss +An integral value expressed the same way, which is the amount of +unshared memory used in stack space. + +@item long int ru_minflt +The number of page faults which were serviced without requiring any I/O. + +@item long int ru_majflt +The number of page faults which were serviced by doing I/O. + +@item long int ru_nswap +The number of times @var{processes} was swapped entirely out of main memory.
+ +@item long int ru_inblock +The number of times the file system had to read from the disk on behalf +of @var{processes}. + +@item long int ru_oublock +The number of times the file system had to write to the disk on behalf +of @var{processes}. + +@item long int ru_msgsnd +Number of IPC messages sent. + +@item long int ru_msgrcv +Number of IPC messages received. + +@item long int ru_nsignals +Number of signals received. + +@item long int ru_nvcsw +The number of times @var{processes} voluntarily invoked a context switch +(usually to wait for some service). + +@item long int ru_nivcsw +The number of times an involuntary context switch took place (because +the time slice expired, or another process of higher priority became +runnable). +@end table +@end deftp + +An additional historical function for examining usage figures, +@code{vtimes}, is supported but not documented here. It is declared in +@file{sys/vtimes.h}. + +@node Limits on Resources +@section Limiting Resource Usage +@cindex resource limits +@cindex limits on resource usage +@cindex usage limits + +You can specify limits for the resource usage of a process. When the +process tries to exceed a limit, it may get a signal, or the system call +by which it tried to do so may fail, depending on the limit. Each +process initially inherits its limit values from its parent, but it can +subsequently change them. + +@pindex sys/resource.h +The symbols in this section are defined in @file{sys/resource.h}. + +@comment sys/resource.h +@comment BSD +@deftypefun int getrlimit (int @var{resource}, struct rlimit *@var{rlp}) +Read the current value and the maximum value of resource @var{resource} +and store them in @code{*@var{rlp}}. + +The return value is @code{0} on success and @code{-1} on failure. The +only possible @code{errno} error condition is @code{EFAULT}.
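
As a hedged sketch of reading a limit, the following prints the two values that @code{getrlimit} stores in a @code{struct rlimit} (described below). The helper names @code{print_limit_value} and @code{show_limit} are hypothetical, and the sketch assumes the limit fields have the POSIX type @code{rlim_t}.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Print one limit value, spelling out RLIM_INFINITY.  */
static void
print_limit_value (rlim_t value)
{
  if (value == RLIM_INFINITY)
    printf ("unlimited");
  else
    printf ("%llu", (unsigned long long) value);
}

/* Hypothetical helper (not a library function): print the current and
   maximum limits of RESOURCE, labeled with NAME.  Returns 0 on success
   and -1 if getrlimit fails.  */
int
show_limit (const char *name, int resource)
{
  struct rlimit rl;

  if (getrlimit (resource, &rl) < 0)
    return -1;

  printf ("%s: current ", name);
  print_limit_value (rl.rlim_cur);
  printf (", maximum ");
  print_limit_value (rl.rlim_max);
  printf ("\n");
  return 0;
}
```

A call such as @code{show_limit ("RLIMIT_NOFILE", RLIMIT_NOFILE)} prints one line; the numbers depend entirely on how the system and the parent process are configured.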
+@end deftypefun + +@comment sys/resource.h +@comment BSD +@deftypefun int setrlimit (int @var{resource}, struct rlimit *@var{rlp}) +Set the current value and the maximum value of resource @var{resource} +to the values stored in @code{*@var{rlp}}. + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error condition is possible: + +@table @code +@item EPERM +You tried to change the maximum permissible limit value, +but you don't have privileges to do so. +@end table +@end deftypefun + +@comment sys/resource.h +@comment BSD +@deftp {Data Type} {struct rlimit} +This structure is used with @code{getrlimit} to receive limit values, +and with @code{setrlimit} to specify limit values. It has two fields: + +@table @code +@item rlim_cur +The current value of the limit in question. +This is also called the ``soft limit''. +@cindex soft limit + +@item rlim_max +The maximum permissible value of the limit in question. You cannot set +the current value of the limit to a larger number than this maximum. +Only the super user can increase the maximum permissible value. +This is also called the ``hard limit''. +@cindex hard limit +@end table + +In @code{getrlimit}, the structure is an output; it receives the current +values. In @code{setrlimit}, it specifies the new values. +@end deftp + +Here is a list of resources that you can specify a limit for. +Those that are sizes are measured in bytes. + +@table @code +@comment sys/resource.h +@comment BSD +@item RLIMIT_CPU +@vindex RLIMIT_CPU +The maximum amount of CPU time the process can use. If it runs for +longer than this, it gets a signal: @code{SIGXCPU}. The value is +measured in seconds. @xref{Operation Error Signals}. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_FSIZE +@vindex RLIMIT_FSIZE +The maximum size of file the process can create. Trying to write a +larger file causes a signal: @code{SIGXFSZ}. @xref{Operation Error +Signals}.
+ +@comment sys/resource.h +@comment BSD +@item RLIMIT_DATA +@vindex RLIMIT_DATA +The maximum size of data memory for the process. If the process tries +to allocate data memory beyond this amount, the allocation function +fails. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_STACK +@vindex RLIMIT_STACK +The maximum stack size for the process. If the process tries to extend +its stack past this size, it gets a @code{SIGSEGV} signal. +@xref{Program Error Signals}. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_CORE +@vindex RLIMIT_CORE +The maximum size core file that this process can create. If the process +terminates and would dump a core file larger than this maximum size, +then no core file is created. So setting this limit to zero prevents +core files from ever being created. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_RSS +@vindex RLIMIT_RSS +The maximum amount of physical memory that this process should get. +This parameter is a guide for the system's scheduler and memory +allocator; the system may give the process more memory when there is a +surplus. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_MEMLOCK +The maximum amount of memory that can be locked into physical memory (so +it will never be paged out). + +@comment sys/resource.h +@comment BSD +@item RLIMIT_NPROC +The maximum number of processes that can be created with the same user ID. +If you have reached the limit for your user ID, @code{fork} will fail +with @code{EAGAIN}. @xref{Creating a Process}. + +@comment sys/resource.h +@comment BSD +@item RLIMIT_NOFILE +@vindex RLIMIT_NOFILE +@itemx RLIMIT_OFILE +@vindex RLIMIT_OFILE +The maximum number of files that the process can open. If it tries to +open more files than this, it gets error code @code{EMFILE}. +@xref{Error Codes}. Not all systems support this limit; GNU does, and +4.4 BSD does. + +@comment sys/resource.h +@comment BSD +@item RLIM_NLIMITS +@vindex RLIM_NLIMITS +The number of different resource limits. 
Any valid @var{resource} +operand must be less than @code{RLIM_NLIMITS}. +@end table + +@comment sys/resource.h +@comment BSD +@defvr Constant int RLIM_INFINITY +This constant stands for a value of ``infinity'' when supplied as +the limit value in @code{setrlimit}. +@end defvr + +@c ??? Someone want to finish these? +Two historical functions for setting resource limits, @code{ulimit} and +@code{vlimit}, are not documented here. The latter is declared in +@file{sys/vlimit.h} and comes from BSD. + +@node Priority +@section Process Priority +@cindex process priority +@cindex priority of a process + +@pindex sys/resource.h +When several processes try to run, their respective priorities determine +what share of the CPU each process gets. This section describes how you +can read and set the priority of a process. All these functions and +macros are declared in @file{sys/resource.h}. + +The range of valid priority values depends on the operating system, but +typically it runs from @code{-20} to @code{20}. A lower priority value +means the process runs more often. These constants describe the range of +priority values: + +@table @code +@comment sys/resource.h +@comment BSD +@item PRIO_MIN +@vindex PRIO_MIN +The smallest valid priority value. + +@comment sys/resource.h +@comment BSD +@item PRIO_MAX +@vindex PRIO_MAX +The largest valid priority value. +@end table + +@comment sys/resource.h +@comment BSD +@deftypefun int getpriority (int @var{class}, int @var{id}) +Read the priority of a class of processes; @var{class} and @var{id} +specify which ones (see below). If the processes specified do not all +have the same priority, this returns the smallest value that any of them +has. + +The return value is the priority value on success, and @code{-1} on +failure. The following @code{errno} error conditions are possible for +this function: + +@table @code +@item ESRCH +The combination of @var{class} and @var{id} does not match any existing +process.
+ +@item EINVAL +The value of @var{class} is not valid. +@end table + +When the return value is @code{-1}, it could indicate failure, or it +could be the priority value. The only way to make certain is to set +@code{errno = 0} before calling @code{getpriority}, then use @code{errno +!= 0} afterward as the criterion for failure. +@end deftypefun + +@comment sys/resource.h +@comment BSD +@deftypefun int setpriority (int @var{class}, int @var{id}, int @var{priority}) +Set the priority of a class of processes to @var{priority}; @var{class} +and @var{id} specify which ones (see below). + +The return value is @code{0} on success and @code{-1} on failure. The +following @code{errno} error conditions are defined for this function: + +@table @code +@item ESRCH +The combination of @var{class} and @var{id} does not match any existing +process. + +@item EINVAL +The value of @var{class} is not valid. + +@item EPERM +You tried to set the priority of some other user's process, and you +don't have privileges for that. + +@item EACCES +You tried to lower the priority value of a process (so that it runs +more often), and you don't have privileges for that. +@end table +@end deftypefun + +The arguments @var{class} and @var{id} together specify a set of +processes you are interested in. These are the possible values for +@var{class}: + +@table @code +@comment sys/resource.h +@comment BSD +@item PRIO_PROCESS +@vindex PRIO_PROCESS +Read or set the priority of one process. The argument @var{id} is a +process ID. + +@comment sys/resource.h +@comment BSD +@item PRIO_PGRP +@vindex PRIO_PGRP +Read or set the priority of one process group. The argument @var{id} is +a process group ID. + +@comment sys/resource.h +@comment BSD +@item PRIO_USER +@vindex PRIO_USER +Read or set the priority of one user's processes. The argument @var{id} +is a user ID. +@end table + +If the argument @var{id} is 0, it stands for the current process, +current process group, or the current user, according to @var{class}. + +@c ???
I don't know where we should say this comes from. +@comment Unix +@comment dunno.h +@deftypefun int nice (int @var{increment}) +Increment the priority of the current process by @var{increment}. +The return value is the same as for @code{setpriority}. + +Here is an equivalent definition for @code{nice}: + +@smallexample +int +nice (int increment) +@{ + int old = getpriority (PRIO_PROCESS, 0); + return setpriority (PRIO_PROCESS, 0, old + increment); +@} +@end smallexample +@end deftypefun diff --git a/manual/users.texi b/manual/users.texi new file mode 100644 index 0000000000..c35e8b6a5b --- /dev/null +++ b/manual/users.texi @@ -0,0 +1,1012 @@ +@node Users and Groups, System Information, Job Control, Top +@chapter Users and Groups + +Every user who can log in on the system is identified by a unique number +called the @dfn{user ID}. Each process has an effective user ID which +says which user's access permissions it has. + +Users are classified into @dfn{groups} for access control purposes. Each +process has one or more @dfn{group ID values} which say which groups the +process can use for access to files. + +The effective user and group IDs of a process collectively form its +@dfn{persona}. This determines which files the process can access. +Normally, a process inherits its persona from the parent process, but +under special circumstances a process can change its persona and thus +change its access permissions. + +Each file in the system also has a user ID and a group ID. Access +control works by comparing the user and group IDs of the file with those +of the running process. + +The system keeps a database of all the registered users, and another +database of all the defined groups. There are library functions you +can use to examine these databases. + +@menu +* User and Group IDs:: Each user has a unique numeric ID; + likewise for groups. +* Process Persona:: The user IDs and group IDs of a process. 
+* Why Change Persona:: Why a program might need to change + its user and/or group IDs. +* How Change Persona:: Changing the user and group IDs. +* Reading Persona:: How to examine the user and group IDs. + +* Setting User ID:: Functions for setting the user ID. +* Setting Groups:: Functions for setting the group IDs. + +* Enable/Disable Setuid:: Turning setuid access on and off. +* Setuid Program Example:: The pertinent parts of one sample program. +* Tips for Setuid:: How to avoid granting unlimited access. + +* Who Logged In:: Getting the name of the user who logged in, + or of the real user ID of the current process. + +* User Database:: Functions and data structures for + accessing the user database. +* Group Database:: Functions and data structures for + accessing the group database. +* Database Example:: Example program showing use of database + inquiry functions. +@end menu + +@node User and Group IDs +@section User and Group IDs + +@cindex login name +@cindex user name +@cindex user ID +Each user account on a computer system is identified by a @dfn{user +name} (or @dfn{login name}) and @dfn{user ID}. Normally, each user name +has a unique user ID, but it is possible for several login names to have +the same user ID. The user names and corresponding user IDs are stored +in a database which you can access as described in @ref{User Database}. + +@cindex group name +@cindex group ID +Users are classified in @dfn{groups}. Each user name also belongs to +one or more groups, and has one @dfn{default group}. Users who are +members of the same group can share resources (such as files) that are +not accessible to users who are not members of that group. Each group +has a @dfn{group name} and @dfn{group ID}. @xref{Group Database}, +for how to find information about a group ID or group name. + +@node Process Persona +@section The Persona of a Process +@cindex persona +@cindex effective user ID +@cindex effective group ID + +@c !!! bogus; not single ID.
set of effective group IDs (and, in GNU, +@c set of effective UIDs) determines privilege. lying here and then +@c telling the truth below is confusing. +At any time, each process has a single user ID and a group ID which +determine the privileges of the process. These are collectively called +the @dfn{persona} of the process, because they determine ``who it is'' +for purposes of access control. These IDs are also called the +@dfn{effective user ID} and @dfn{effective group ID} of the process. + +Your login shell starts out with a persona which consists of your user +ID and your default group ID. +@c !!! also supplementary group IDs. +In normal circumstances, all your other processes inherit these values. + +@cindex real user ID +@cindex real group ID +A process also has a @dfn{real user ID} which identifies the user who +created the process, and a @dfn{real group ID} which identifies that +user's default group. These values do not play a role in access +control, so we do not consider them part of the persona. But they are +also important. + +Both the real and effective user ID can be changed during the lifetime +of a process. @xref{Why Change Persona}. + +@cindex supplementary group IDs +In addition, a user can belong to multiple groups, so the persona +includes @dfn{supplementary group IDs} that also contribute to access +permission. + +For details on how a process's effective user IDs and group IDs affect +its permission to access files, see @ref{Access Permission}. + +The user ID of a process also controls permissions for sending signals +using the @code{kill} function. @xref{Signaling Another Process}. + +@node Why Change Persona +@section Why Change the Persona of a Process? + +The most obvious situation where it is necessary for a process to change +its user and/or group IDs is the @code{login} program. When +@code{login} starts running, its user ID is @code{root}. Its job is to +start a shell whose user and group IDs are those of the user who is +logging in. 
(To accomplish this fully, @code{login} must set the real +user and group IDs as well as its persona. But this is a special case.) + +The more common case of changing persona is when an ordinary user +program needs access to a resource that wouldn't ordinarily be +accessible to the user actually running it. + +For example, you may have a file that is controlled by your program but +that shouldn't be read or modified directly by other users, either +because it implements some kind of locking protocol, or because you want +to preserve the integrity or privacy of the information it contains. +This kind of restricted access can be implemented by having the program +change its effective user or group ID to match that of the resource. + +Thus, imagine a game program that saves scores in a file. The game +program itself needs to be able to update this file no matter who is +running it, but if users can write the file without going through the +game, they can give themselves any scores they like. Some people +consider this undesirable, or even reprehensible. It can be prevented +by creating a new user ID and login name (say, @code{games}) to own the +scores file, and making the file writable only by this user. Then, when +the game program wants to update this file, it can change its effective +user ID to be that for @code{games}. In effect, the program must +adopt the persona of @code{games} so it can write the scores file. + +@node How Change Persona +@section How an Application Can Change Persona +@cindex @code{setuid} programs + +The ability to change the persona of a process can be a source of +unintentional privacy violations, or even intentional abuse. Because of +the potential for problems, changing persona is restricted to special +circumstances. + +You can't arbitrarily set your user ID or group ID to anything you want; +only privileged processes can do that.
Instead, the normal way for a +program to change its persona is that it has been set up in advance to +change to a particular user or group. This is the function of the setuid +and setgid bits of a file's access mode. @xref{Permission Bits}. + +When the setuid bit of an executable file is set, executing that file +automatically changes the effective user ID to the user that owns the +file. Likewise, executing a file whose setgid bit is set changes the +effective group ID to the group of the file. @xref{Executing a File}. +Creating a file that changes to a particular user or group ID thus +requires full access to that user or group ID. + +@xref{File Attributes}, for a more general discussion of file modes and +accessibility. + +A process can always change its effective user (or group) ID back to its +real ID. Programs do this so as to turn off their special privileges +when they are not needed, which makes for more robustness. + +@c !!! talk about _POSIX_SAVED_IDS + +@node Reading Persona +@section Reading the Persona of a Process + +Here are detailed descriptions of the functions for reading the user and +group IDs of a process, both real and effective. To use these +facilities, you must include the header files @file{sys/types.h} and +@file{unistd.h}. +@pindex unistd.h +@pindex sys/types.h + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} uid_t +This is an integer data type used to represent user IDs. In the GNU +library, this is an alias for @code{unsigned int}. +@end deftp + +@comment sys/types.h +@comment POSIX.1 +@deftp {Data Type} gid_t +This is an integer data type used to represent group IDs. In the GNU +library, this is an alias for @code{unsigned int}. +@end deftp + +@comment unistd.h +@comment POSIX.1 +@deftypefun uid_t getuid (void) +The @code{getuid} function returns the real user ID of the process. 
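As a small illustration, a program can compare its real and effective user IDs to tell whether it is currently running under a changed persona. This is only a sketch: the name @code{persona_differs} is hypothetical and not part of the library, and @code{geteuid} is described below.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not a library function: return nonzero if the
   effective user ID differs from the real user ID, as it does while
   a setuid program is using its special access.  */
int
persona_differs (void)
{
  uid_t real = getuid ();
  uid_t effective = geteuid ();

  return real != effective;
}
```

A program that is not installed setuid would normally see a result of zero here.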
+@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun gid_t getgid (void) +The @code{getgid} function returns the real group ID of the process. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun uid_t geteuid (void) +The @code{geteuid} function returns the effective user ID of the process. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun gid_t getegid (void) +The @code{getegid} function returns the effective group ID of the process. +@end deftypefun + +@comment unistd.h +@comment POSIX.1 +@deftypefun int getgroups (int @var{count}, gid_t *@var{groups}) +The @code{getgroups} function is used to inquire about the supplementary +group IDs of the process. Up to @var{count} of these group IDs are +stored in the array @var{groups}; the return value from the function is +the number of group IDs actually stored. If @var{count} is smaller than +the total number of supplementary group IDs, then @code{getgroups} +returns a value of @code{-1} and @code{errno} is set to @code{EINVAL}. + +If @var{count} is zero, then @code{getgroups} just returns the total +number of supplementary group IDs. On systems that do not support +supplementary groups, this will always be zero. + +Here's how to use @code{getgroups} to read all the supplementary group +IDs: + +@smallexample +@group +gid_t * +read_all_groups (void) +@{ + int ngroups = getgroups (0, NULL); + gid_t *groups + = (gid_t *) xmalloc (ngroups * sizeof (gid_t)); + int val = getgroups (ngroups, groups); + if (val < 0) + @{ + free (groups); + return NULL; + @} + return groups; +@} +@end group +@end smallexample +@end deftypefun + +@node Setting User ID +@section Setting the User ID + +This section describes the functions for altering the user ID (real +and/or effective) of a process. To use these facilities, you must +include the header files @file{sys/types.h} and @file{unistd.h}.
+@pindex unistd.h +@pindex sys/types.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun int setuid (uid_t @var{newuid}) +This function sets both the real and effective user ID of the process +to @var{newuid}, provided that the process has appropriate privileges. +@c !!! also sets saved-id + +If the process is not privileged, then @var{newuid} must either be equal +to the real user ID or the saved user ID (if the system supports the +@code{_POSIX_SAVED_IDS} feature). In this case, @code{setuid} sets only +the effective user ID and not the real user ID. +@c !!! xref to discussion of _POSIX_SAVED_IDS + +The @code{setuid} function returns a value of @code{0} to indicate +successful completion, and a value of @code{-1} to indicate an error. +The following @code{errno} error conditions are defined for this +function: + +@table @code +@item EINVAL +The value of the @var{newuid} argument is invalid. + +@item EPERM +The process does not have the appropriate privileges; you do not +have permission to change to the specified ID. +@end table +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int setreuid (uid_t @var{ruid}, uid_t @var{euid}) +This function sets the real user ID of the process to @var{ruid} and the +effective user ID to @var{euid}. If @var{ruid} is @code{-1}, it means +not to change the real user ID; likewise if @var{euid} is @code{-1}, it +means not to change the effective user ID. + +The @code{setreuid} function exists for compatibility with 4.3 BSD Unix, +which does not support saved IDs. You can use this function to swap the +effective and real user IDs of the process. (Privileged processes are +not limited to this particular usage.) If saved IDs are supported, you +should use that feature instead of this function. @xref{Enable/Disable +Setuid}. + +The return value is @code{0} on success and @code{-1} on failure. 
+The following @code{errno} error conditions are defined for this +function: + +@table @code +@item EPERM +The process does not have the appropriate privileges; you do not +have permission to change to the specified ID. +@end table +@end deftypefun + +@node Setting Groups +@section Setting the Group IDs + +This section describes the functions for altering the group IDs (real +and effective) of a process. To use these facilities, you must include +the header files @file{sys/types.h} and @file{unistd.h}. +@pindex unistd.h +@pindex sys/types.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun int setgid (gid_t @var{newgid}) +This function sets both the real and effective group ID of the process +to @var{newgid}, provided that the process has appropriate privileges. +@c !!! also sets saved-id + +If the process is not privileged, then @var{newgid} must be equal to +either the real group ID or the saved group ID. In this case, @code{setgid} +sets only the effective group ID and not the real group ID. + +The return values and error conditions for @code{setgid} are the same +as those for @code{setuid}. +@end deftypefun + +@comment unistd.h +@comment BSD +@deftypefun int setregid (gid_t @var{rgid}, gid_t @var{egid}) +This function sets the real group ID of the process to @var{rgid} and +the effective group ID to @var{egid}. If @var{rgid} is @code{-1}, it +means not to change the real group ID; likewise if @var{egid} is +@code{-1}, it means not to change the effective group ID. + +The @code{setregid} function is provided for compatibility with 4.3 BSD +Unix, which does not support saved IDs. You can use this function to +swap the effective and real group IDs of the process. (Privileged +processes are not limited to this usage.) If saved IDs are supported, +you should use that feature instead of using this function. +@xref{Enable/Disable Setuid}. + +The return values and error conditions for @code{setregid} are the same +as those for @code{setreuid}.
+@end deftypefun + +The GNU system also lets privileged processes change their supplementary +group IDs. To use @code{setgroups} or @code{initgroups}, your programs +should include the header file @file{grp.h}. +@pindex grp.h + +@comment grp.h +@comment BSD +@deftypefun int setgroups (size_t @var{count}, gid_t *@var{groups}) +This function sets the process's supplementary group IDs. It can only +be called from privileged processes. The @var{count} argument specifies +the number of group IDs in the array @var{groups}. + +This function returns @code{0} if successful and @code{-1} on error. +The following @code{errno} error conditions are defined for this +function: + +@table @code +@item EPERM +The calling process is not privileged. +@end table +@end deftypefun + +@comment grp.h +@comment BSD +@deftypefun int initgroups (const char *@var{user}, gid_t @var{gid}) +The @code{initgroups} function effectively calls @code{setgroups} to +set the process's supplementary group IDs to be the normal default for +the user name @var{user}. The group ID @var{gid} is also included. +@c !!! explain that this works by reading the group file looking for +@c groups USER is a member of. +@end deftypefun + +@node Enable/Disable Setuid +@section Enabling and Disabling Setuid Access + +A typical setuid program does not need its special access all of the +time. It's a good idea to turn off this access when it isn't needed, +so it can't possibly give unintended access. + +If the system supports the saved user ID feature, you can accomplish +this with @code{setuid}. When the game program starts, its real user ID +is @code{jdoe}, its effective user ID is @code{games}, and its saved +user ID is also @code{games}. 
The program should record both user ID +values once at the beginning, like this: + +@smallexample +user_user_id = getuid (); +game_user_id = geteuid (); +@end smallexample + +Then it can turn off game file access with + +@smallexample +setuid (user_user_id); +@end smallexample + +@noindent +and turn it on with + +@smallexample +setuid (game_user_id); +@end smallexample + +@noindent +Throughout this process, the real user ID remains @code{jdoe} and the +saved user ID remains @code{games}, so the program can always set its +effective user ID to either one. + +On other systems that don't support the saved user ID feature, you can +turn setuid access on and off by using @code{setreuid} to swap the real +and effective user IDs of the process, as follows: + +@smallexample +setreuid (geteuid (), getuid ()); +@end smallexample + +@noindent +This special case is always allowed---it cannot fail. + +Why does this have the effect of toggling the setuid access? Suppose a +game program has just started, and its real user ID is @code{jdoe} while +its effective user ID is @code{games}. In this state, the game can +write the scores file. If it swaps the two uids, the real becomes +@code{games} and the effective becomes @code{jdoe}; now the program has +only @code{jdoe} access. Another swap brings @code{games} back to +the effective user ID and restores access to the scores file. + +In order to handle both kinds of systems, test for the saved user ID +feature with a preprocessor conditional, like this: + +@smallexample +#ifdef _POSIX_SAVED_IDS + setuid (user_user_id); +#else + setreuid (geteuid (), getuid ()); +#endif +@end smallexample + +@node Setuid Program Example +@section Setuid Program Example + +Here's an example showing how to set up a program that changes its +effective user ID. + +This is part of a game program called @code{caber-toss} that +manipulates a file @file{scores} that should be writable only by the game +program itself. 
The program assumes that its executable +file will be installed with the set-user-ID bit set and owned by the +same user as the @file{scores} file. Typically, a system +administrator will set up an account like @code{games} for this purpose. + +The executable file is given mode @code{4755}, so that doing an +@samp{ls -l} on it produces output like: + +@smallexample +-rwsr-xr-x 1 games 184422 Jul 30 15:17 caber-toss +@end smallexample + +@noindent +The set-user-ID bit shows up in the file modes as the @samp{s}. + +The scores file is given mode @code{644}, and doing an @samp{ls -l} on +it shows: + +@smallexample +-rw-r--r-- 1 games 0 Jul 31 15:33 scores +@end smallexample + +Here are the parts of the program that show how to set up the changed +user ID. This program is conditionalized so that it makes use of the +saved IDs feature if it is supported, and otherwise uses @code{setreuid} +to swap the effective and real user IDs. + +@smallexample +#include <stdio.h> +#include <sys/types.h> +#include <unistd.h> +#include <stdlib.h> + + +/* @r{Save the effective and real UIDs.} */ + +static uid_t euid, ruid; + + +/* @r{Restore the effective UID to its original value.} */ + +void +do_setuid (void) +@{ + int status; + +#ifdef _POSIX_SAVED_IDS + status = setuid (euid); +#else + status = setreuid (ruid, euid); +#endif + if (status < 0) @{ + fprintf (stderr, "Couldn't set uid.\n"); + exit (status); + @} +@} + + +@group +/* @r{Set the effective UID to the real UID.} */ + +void +undo_setuid (void) +@{ + int status; + +#ifdef _POSIX_SAVED_IDS + status = setuid (ruid); +#else + status = setreuid (euid, ruid); +#endif + if (status < 0) @{ + fprintf (stderr, "Couldn't set uid.\n"); + exit (status); + @} +@} +@end group + +/* @r{Main program.} */ + +int +main (void) +@{ + /* @r{Save the real and effective user IDs.} */ + ruid = getuid (); + euid = geteuid (); + undo_setuid (); + + /* @r{Do the game and record the score.} */ + @dots{} +@} +@end smallexample + +Notice how the first 
thing the @code{main} function does is to set the +effective user ID back to the real user ID. This is so that any other +file accesses that are performed while the user is playing the game use +the real user ID for determining permissions. Only when the program +needs to open the scores file does it switch back to the original +effective user ID, like this: + +@smallexample +/* @r{Record the score.} */ + +int +record_score (int score) +@{ + FILE *stream; + char *myname; + + /* @r{Open the scores file.} */ + do_setuid (); + stream = fopen (SCORES_FILE, "a"); + undo_setuid (); + +@group + /* @r{Write the score to the file.} */ + if (stream) + @{ + myname = cuserid (NULL); + if (score < 0) + fprintf (stream, "%10s: Couldn't lift the caber.\n", myname); + else + fprintf (stream, "%10s: %d feet.\n", myname, score); + fclose (stream); + return 0; + @} + else + return -1; +@} +@end group +@end smallexample + +@node Tips for Setuid +@section Tips for Writing Setuid Programs + +It is easy for setuid programs to give the user access that isn't +intended---in fact, if you want to avoid this, you need to be careful. +Here are some guidelines for preventing unintended access and +minimizing its consequences when it does occur: + +@itemize @bullet +@item +Don't have @code{setuid} programs with privileged user IDs such as +@code{root} unless it is absolutely necessary. If the resource is +specific to your particular program, it's better to define a new, +nonprivileged user ID or group ID just to manage that resource. + +@item +Be cautious about using the @code{system} and @code{exec} functions in +combination with changing the effective user ID. Don't let users of +your program execute arbitrary programs under a changed user ID. +Executing a shell is especially bad news. Less obviously, the +@code{execlp} and @code{execvp} functions are a potential risk (since +the program they execute depends on the user's @code{PATH} environment +variable). 
+ +If you must @code{exec} another program under a changed ID, specify an +absolute file name (@pxref{File Name Resolution}) for the executable, +and make sure that the protections on that executable and @emph{all} +containing directories are such that ordinary users cannot replace it +with some other program. + +@item +Only use the user ID controlling the resource in the part of the program +that actually uses that resource. When you're finished with it, restore +the effective user ID back to the actual user's user ID. +@xref{Enable/Disable Setuid}. + +@item +If the @code{setuid} part of your program needs to access other files +besides the controlled resource, it should verify that the real user +would ordinarily have permission to access those files. You can use the +@code{access} function (@pxref{Access Permission}) to check this; it +uses the real user and group IDs, rather than the effective IDs. +@end itemize + +@node Who Logged In +@section Identifying Who Logged In +@cindex login name, determining +@cindex user ID, determining + +You can use the functions listed in this section to determine the login +name of the user who is running a process, and the name of the user who +logged in the current session. See also the function @code{getuid} and +friends (@pxref{Reading Persona}). + +The @code{getlogin} function is declared in @file{unistd.h}, while +@code{cuserid} and @code{L_cuserid} are declared in @file{stdio.h}. +@pindex stdio.h +@pindex unistd.h + +@comment unistd.h +@comment POSIX.1 +@deftypefun {char *} getlogin (void) +The @code{getlogin} function returns a pointer to a string containing the +name of the user logged in on the controlling terminal of the process, +or a null pointer if this information cannot be determined. The string +is statically allocated and might be overwritten on subsequent calls to +this function or to @code{cuserid}. 
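Because the string returned by @code{getlogin} is statically allocated, a program that needs the name later should copy it at once. Here is a minimal sketch of one way to do that; the helper name @code{copy_login_name} is hypothetical and not part of the library.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not a library function: copy the login name
   into BUF, which holds SIZE bytes.  Return 0 on success, or -1 if
   the name cannot be determined or does not fit in BUF.  */
int
copy_login_name (char *buf, size_t size)
{
  const char *name = getlogin ();

  if (name == NULL || strlen (name) >= size)
    return -1;
  /* Copy immediately; a later call may overwrite the static string.  */
  strcpy (buf, name);
  return 0;
}
```

A process with no controlling terminal may get a null pointer from @code{getlogin}, which this sketch reports as @code{-1}.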
+@end deftypefun + +@comment stdio.h +@comment POSIX.1 +@deftypefun {char *} cuserid (char *@var{string}) +The @code{cuserid} function returns a pointer to a string containing a +user name associated with the effective ID of the process. If +@var{string} is not a null pointer, it should be an array that can hold +at least @code{L_cuserid} characters; the string is returned in this +array. Otherwise, a pointer to a string in a static area is returned. +This string is statically allocated and might be overwritten on +subsequent calls to this function or to @code{getlogin}. +@end deftypefun + +@comment stdio.h +@comment POSIX.1 +@deftypevr Macro int L_cuserid +An integer constant that indicates how long an array you might need to +store a user name. +@end deftypevr + +These functions let your program identify positively the user who is +running or the user who logged in this session. (These can differ when +setuid programs are involved; see @ref{Process Persona}.) The user cannot +do anything to fool these functions. + +For most purposes, it is more useful to use the environment variable +@code{LOGNAME} to find out who the user is. This is more flexible +precisely because the user can set @code{LOGNAME} arbitrarily. +@xref{Standard Environment}. + +@node User Database +@section User Database +@cindex user database +@cindex password database +@pindex /etc/passwd + +This section describes how to search and scan the database of +registered users. The database itself is kept in the file +@file{/etc/passwd} on most systems, but on some systems a special +network server gives access to it. + +@menu +* User Data Structure:: What each user record contains. +* Lookup User:: How to look for a particular user. +* Scanning All Users:: Scanning the list of all users, one by one. +* Writing a User Entry:: How a program can rewrite a user's record.
+@end menu + +@node User Data Structure +@subsection The Data Structure that Describes a User + +The functions and data structures for accessing the system user database +are declared in the header file @file{pwd.h}. +@pindex pwd.h + +@comment pwd.h +@comment POSIX.1 +@deftp {Data Type} {struct passwd} +The @code{passwd} data structure is used to hold information about +entries in the system user data base. It has at least the following members: + +@table @code +@item char *pw_name +The user's login name. + +@item char *pw_passwd +The encrypted password string. + +@item uid_t pw_uid +The user ID number. + +@item gid_t pw_gid +The user's default group ID number. + +@item char *pw_gecos +A string typically containing the user's real name, and possibly other +information such as a phone number. + +@item char *pw_dir +The user's home directory, or initial working directory. This might be +a null pointer, in which case the interpretation is system-dependent. + +@item char *pw_shell +The user's default shell, or the initial program run when the user logs in. +This might be a null pointer, indicating that the system default should +be used. +@end table +@end deftp + +@node Lookup User +@subsection Looking Up One User +@cindex converting user ID to user name +@cindex converting user name to user ID + +You can search the system user database for information about a +specific user using @code{getpwuid} or @code{getpwnam}. These +functions are declared in @file{pwd.h}. + +@comment pwd.h +@comment POSIX.1 +@deftypefun {struct passwd *} getpwuid (uid_t @var{uid}) +This function returns a pointer to a statically-allocated structure +containing information about the user whose user ID is @var{uid}. This +structure may be overwritten on subsequent calls to @code{getpwuid}. + +A null pointer value indicates there is no user in the data base with +user ID @var{uid}.
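Here is a sketch of how @code{getpwuid} might be used to find a user's home directory. Since the result points into static storage, the helper copies the string before returning it. The name @code{home_dir_of} is hypothetical and not part of the library.

```c
#include <pwd.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper, not a library function: return a newly
   allocated copy of the home directory of user UID, or NULL if there
   is no such user, the record has no home directory, or memory runs
   out.  The caller frees the result.  */
char *
home_dir_of (uid_t uid)
{
  struct passwd *pw = getpwuid (uid);
  char *copy;

  if (pw == NULL || pw->pw_dir == NULL)
    return NULL;
  copy = malloc (strlen (pw->pw_dir) + 1);
  if (copy != NULL)
    strcpy (copy, pw->pw_dir);
  return copy;
}
```

Returning a private copy means a later call to @code{getpwuid} cannot change the string the caller is holding.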
+@end deftypefun + +@comment pwd.h +@comment POSIX.1 +@deftypefun {struct passwd *} getpwnam (const char *@var{name}) +This function returns a pointer to a statically-allocated structure +containing information about the user whose user name is @var{name}. +This structure may be overwritten on subsequent calls to +@code{getpwnam}. + +A null pointer value indicates there is no user named @var{name}. +@end deftypefun + +@node Scanning All Users +@subsection Scanning the List of All Users +@cindex scanning the user list + +This section explains how a program can read the list of all users in +the system, one user at a time. The functions described here are +declared in @file{pwd.h}. + +You can use the @code{fgetpwent} function to read user entries from a +particular file. + +@comment pwd.h +@comment SVID +@deftypefun {struct passwd *} fgetpwent (FILE *@var{stream}) +This function reads the next user entry from @var{stream} and returns a +pointer to the entry. The structure is statically allocated and is +rewritten on subsequent calls to @code{fgetpwent}. You must copy the +contents of the structure if you wish to save the information. + +This stream must correspond to a file in the same format as the standard +password database file. This function comes from System V. +@end deftypefun + +The way to scan all the entries in the user database is with +@code{setpwent}, @code{getpwent}, and @code{endpwent}. + +@comment pwd.h +@comment SVID, BSD +@deftypefun void setpwent (void) +This function initializes a stream which @code{getpwent} uses to read +the user database. +@end deftypefun + +@comment pwd.h +@comment POSIX.1 +@deftypefun {struct passwd *} getpwent (void) +The @code{getpwent} function reads the next entry from the stream +initialized by @code{setpwent}. It returns a pointer to the entry. The +structure is statically allocated and is rewritten on subsequent calls +to @code{getpwent}. You must copy the contents of the structure if you +wish to save the information. 
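A scan of the whole user database with @code{setpwent}, @code{getpwent}, and @code{endpwent} could look like the following sketch. The helper name @code{count_users} is hypothetical; counting the entries merely stands in for whatever a real program would do with each one.

```c
#include <pwd.h>
#include <stddef.h>

/* Hypothetical helper, not a library function: count the entries
   in the user database.  */
int
count_users (void)
{
  struct passwd *pw;
  int count = 0;

  setpwent ();                  /* Start from the first entry.  */
  while ((pw = getpwent ()) != NULL)
    count++;                    /* Use or copy *pw here.  */
  endpwent ();                  /* Close the internal stream.  */
  return count;
}
```

Any field the program wants to keep must be copied inside the loop, because each call to @code{getpwent} may rewrite the structure.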
+@end deftypefun + +@comment pwd.h +@comment SVID, BSD +@deftypefun void endpwent (void) +This function closes the internal stream used by @code{getpwent}. +@end deftypefun + +@node Writing a User Entry +@subsection Writing a User Entry + +@comment pwd.h +@comment SVID +@deftypefun int putpwent (const struct passwd *@var{p}, FILE *@var{stream}) +This function writes the user entry @code{*@var{p}} to the stream +@var{stream}, in the format used for the standard user database +file. The return value is zero on success and nonzero on failure. + +This function exists for compatibility with SVID. We recommend that you +avoid using it, because it makes sense only on the assumption that the +@code{struct passwd} structure has no members except the standard ones; +on a system which merges the traditional Unix data base with other +extended information about users, adding an entry using this function +would inevitably leave out much of the important information. + +The function @code{putpwent} is declared in @file{pwd.h}. +@end deftypefun + +@node Group Database +@section Group Database +@cindex group database +@pindex /etc/group + +This section describes how to search and scan the database of +registered groups. The database itself is kept in the file +@file{/etc/group} on most systems, but on some systems a special network +service provides access to it. + +@menu +* Group Data Structure:: What each group record contains. +* Lookup Group:: How to look for a particular group. +* Scanning All Groups:: Scanning the list of all groups. +@end menu + +@node Group Data Structure +@subsection The Data Structure for a Group + +The functions and data structures for accessing the system group +database are declared in the header file @file{grp.h}. +@pindex grp.h + +@comment grp.h +@comment POSIX.1 +@deftp {Data Type} {struct group} +The @code{group} structure is used to hold information about an entry in +the system group database.
It has at least the following members: + +@table @code +@item char *gr_name +The name of the group. + +@item gid_t gr_gid +The group ID of the group. + +@item char **gr_mem +A vector of pointers to the names of users in the group. Each user name +is a null-terminated string, and the vector itself is terminated by a +null pointer. +@end table +@end deftp + +@node Lookup Group +@subsection Looking Up One Group +@cindex converting group name to group ID +@cindex converting group ID to group name + +You can search the group database for information about a specific +group using @code{getgrgid} or @code{getgrnam}. These functions are +declared in @file{grp.h}. + +@comment grp.h +@comment POSIX.1 +@deftypefun {struct group *} getgrgid (gid_t @var{gid}) +This function returns a pointer to a statically-allocated structure +containing information about the group whose group ID is @var{gid}. +This structure may be overwritten by subsequent calls to +@code{getgrgid}. + +A null pointer indicates there is no group with ID @var{gid}. +@end deftypefun + +@comment grp.h +@comment SVID, BSD +@deftypefun {struct group *} getgrnam (const char *@var{name}) +This function returns a pointer to a statically-allocated structure +containing information about the group whose group name is @var{name}. +This structure may be overwritten by subsequent calls to +@code{getgrnam}. + +A null pointer indicates there is no group named @var{name}. +@end deftypefun + +@node Scanning All Groups +@subsection Scanning the List of All Groups +@cindex scanning the group list + +This section explains how a program can read the list of all groups in +the system, one group at a time. The functions described here are +declared in @file{grp.h}. + +You can use the @code{fgetgrent} function to read group entries from a +particular file. + +@comment grp.h +@comment SVID +@deftypefun {struct group *} fgetgrent (FILE *@var{stream}) +The @code{fgetgrent} function reads the next entry from @var{stream}. 
+It returns a pointer to the entry. The structure is statically +allocated and is rewritten on subsequent calls to @code{fgetgrent}. You +must copy the contents of the structure if you wish to save the +information. + +The stream must correspond to a file in the same format as the standard +group database file. +@end deftypefun + +The way to scan all the entries in the group database is with +@code{setgrent}, @code{getgrent}, and @code{endgrent}. + +@comment grp.h +@comment SVID, BSD +@deftypefun void setgrent (void) +This function initializes a stream for reading from the group data base. +You use this stream by calling @code{getgrent}. +@end deftypefun + +@comment grp.h +@comment SVID, BSD +@deftypefun {struct group *} getgrent (void) +The @code{getgrent} function reads the next entry from the stream +initialized by @code{setgrent}. It returns a pointer to the entry. The +structure is statically allocated and is rewritten on subsequent calls +to @code{getgrent}. You must copy the contents of the structure if you +wish to save the information. +@end deftypefun + +@comment grp.h +@comment SVID, BSD +@deftypefun void endgrent (void) +This function closes the internal stream used by @code{getgrent}. +@end deftypefun + +@node Database Example +@section User and Group Database Example + +Here is an example program showing the use of the system database inquiry +functions. The program prints some information about the user running +the program. + +@smallexample +@include db.c.texi +@end smallexample + +Here is some output from this program: + +@smallexample +I am Throckmorton Snurd. +My login name is snurd. +My uid is 31093. +My home directory is /home/fsg/snurd. +My default shell is /bin/sh. +My default group is guest (12). +The members of this group are: + friedman + tami +@end smallexample
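The included example file is not reproduced in this section. As a rough sketch only, a function that prints output in the format shown above might look like the following. The names @code{print_user_info} and @code{text} are hypothetical, and the real example may well differ.

```c
#include <grp.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: substitute a placeholder for a null string.  */
static const char *
text (const char *s)
{
  return s != NULL ? s : "(unknown)";
}

/* Hypothetical sketch, not the manual's own example: print
   information about the user running the program on OUT.
   Return 0 on success, or -1 if a database lookup fails.  */
int
print_user_info (FILE *out)
{
  struct passwd *pw = getpwuid (getuid ());
  struct group *gr;
  char **member;

  if (pw == NULL)
    return -1;

  fprintf (out, "I am %s.\n", text (pw->pw_gecos));
  fprintf (out, "My login name is %s.\n", text (pw->pw_name));
  fprintf (out, "My uid is %lu.\n", (unsigned long) pw->pw_uid);
  fprintf (out, "My home directory is %s.\n", text (pw->pw_dir));
  fprintf (out, "My default shell is %s.\n", text (pw->pw_shell));

  /* Look up the user's default group in the group database.  */
  gr = getgrgid (pw->pw_gid);
  if (gr == NULL)
    return -1;

  fprintf (out, "My default group is %s (%lu).\n",
           text (gr->gr_name), (unsigned long) gr->gr_gid);
  fprintf (out, "The members of this group are:\n");
  for (member = gr->gr_mem; member != NULL && *member != NULL; member++)
    fprintf (out, "  %s\n", *member);
  return 0;
}
```

A caller would typically invoke @code{print_user_info (stdout)} from @code{main} and report an error if the result is @code{-1}.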