Diffstat (limited to 'manual/crypt.texi')
-rw-r--r-- | manual/crypt.texi | 392 |
1 files changed, 251 insertions, 141 deletions
diff --git a/manual/crypt.texi b/manual/crypt.texi
index 0f04ee9899..c41b911c8f 100644
--- a/manual/crypt.texi
+++ b/manual/crypt.texi
@@ -1,121 +1,200 @@
 @node Cryptographic Functions, Debugging Support, System Configuration, Top
 @chapter Cryptographic Functions
-@c %MENU% Password storage and strongly unpredictable bytes
+@c %MENU% Passphrase storage and strongly unpredictable bytes.
+
+@Theglibc{} includes only a few special-purpose cryptographic
+functions: one-way hash functions for passphrase storage, and access
+to a cryptographic randomness source, if one is provided by the
+operating system. Programs that need general-purpose cryptography
+should use a dedicated cryptography library, such as
+@uref{https://www.gnu.org/software/libgcrypt/,,libgcrypt}.
+
+Many countries place legal restrictions on the import, export,
+possession, or use of cryptographic software. We deplore these
+restrictions, but we must still warn you that @theglibc{} may be
+subject to them, even if you do not use the functions in this chapter
+yourself. The restrictions vary from place to place and are changed
+often, so we cannot give any more specific advice than this warning.
 
 @menu
-* crypt:: A one-way function for passwords.
-* Unpredictable Bytes:: Randomness for cryptography purposes.
+* Passphrase Storage:: One-way hashing for passphrases.
+* Unpredictable Bytes:: Randomness for cryptographic purposes.
 @end menu
 
-@node crypt
-@section Encrypting Passwords
+@node Passphrase Storage
+@section Passphrase Storage
+@cindex passphrase hashing
+@cindex one-way hashing
+@cindex hashing, passphrase
 
-On many systems, it is unnecessary to have any kind of user
-authentication; for instance, a workstation which is not connected to a
-network probably does not need any user authentication, because to use
-the machine an intruder must have physical access.
-
-Sometimes, however, it is necessary to be sure that a user is authorized
+Sometimes it is necessary to be sure that a user is authorized
 to use some service a machine provides---for instance, to log in as a
 particular user id (@pxref{Users and Groups}). One traditional way of
-doing this is for each user to choose a secret @dfn{password}; then, the
-system can ask someone claiming to be a user what the user's password
-is, and if the person gives the correct password then the system can
-grant the appropriate privileges.
-
-If all the passwords are just stored in a file somewhere, then this file
-has to be very carefully protected. To avoid this, passwords are run
-through a @dfn{one-way function}, a function which makes it difficult to
-work out what its input was by looking at its output, before storing in
-the file.
-
-@Theglibc{} provides a one-way function that is compatible with
-the behavior of the @code{crypt} function introduced in FreeBSD 2.0.
-It supports two one-way algorithms: one based on the MD5
-message-digest algorithm that is compatible with modern BSD systems,
-and the other based on the Data Encryption Standard (DES) that is
-compatible with Unix systems.
-
-@deftypefun {char *} crypt (const char *@var{key}, const char *@var{salt})
-@standards{BSD, crypt.h}
-@standards{SVID, crypt.h}
+doing this is for each user to choose a secret @dfn{passphrase}; then, the
+system can ask someone claiming to be a user what the user's passphrase
+is, and if the person gives the correct passphrase then the system can
+grant the appropriate privileges. (Traditionally, these were called
+``passwords,'' but nowadays a single word is too easy to guess.)
+
+Programs that handle passphrases must take special care not to reveal
+them to anyone, no matter what. It is not enough to keep them in a
+file that is only accessible with special privileges. The file might
+be ``leaked'' via a bug or misconfiguration, and system administrators
+shouldn't learn everyone's passphrase even if they have to edit that
+file for some reason. To avoid this, passphrases should also be
+converted into @dfn{one-way hashes}, using a @dfn{one-way function},
+before they are stored.
+
+A one-way function is easy to compute, but there is no known way to
+compute its inverse. This means the system can easily check
+passphrases, by hashing them and comparing the result with the stored
+hash. But an attacker who discovers someone's passphrase hash can
+only discover the passphrase it corresponds to by guessing and
+checking. The one-way functions are designed to make this process
+impractically slow, for all but the most obvious guesses. (Do not use
+a word from the dictionary as your passphrase.)
+
+@Theglibc{} provides an interface to four one-way functions, based on
+the SHA-2-512, SHA-2-256, MD5, and DES cryptographic primitives. New
+passphrases should be hashed with either of the SHA-based functions.
+The others are too weak for newly set passphrases, but we continue to
+support them for verifying old passphrases. The DES-based hash is
+especially weak, because it ignores all but the first eight characters
+of its input.
+
+@deftypefun {char *} crypt (const char *@var{phrase}, const char *@var{salt})
+@standards{X/Open, unistd.h}
+@standards{GNU, crypt.h}
 @safety{@prelim{}@mtunsafe{@mtasurace{:crypt}}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
 @c Besides the obvious problem of returning a pointer into static
 @c storage, the DES initializer takes an internal lock with the usual
-@c set of problems for AS- and AC-Safety. The FIPS mode checker and the
-@c NSS implementations of may leak file descriptors if canceled. The
+@c set of problems for AS- and AC-Safety.
+@c The NSS implementations may leak file descriptors if cancelled.
 @c The MD5, SHA256 and SHA512 implementations will malloc on long keys,
 @c and NSS relies on dlopening, which brings about another can of worms.
 
-The @code{crypt} function takes a password, @var{key}, as a string, and
-a @var{salt} character array which is described below, and returns a
-printable ASCII string which starts with another salt. It is believed
-that, given the output of the function, the best way to find a @var{key}
-that will produce that output is to guess values of @var{key} until the
-original value of @var{key} is found.
-
-The @var{salt} parameter does two things. Firstly, it selects which
-algorithm is used, the MD5-based one or the DES-based one. Secondly, it
-makes life harder for someone trying to guess passwords against a file
-containing many passwords; without a @var{salt}, an intruder can make a
-guess, run @code{crypt} on it once, and compare the result with all the
-passwords. With a @var{salt}, the intruder must run @code{crypt} once
-for each different salt.
-
-For the MD5-based algorithm, the @var{salt} should consist of the string
-@code{$1$}, followed by up to 8 characters, terminated by either
-another @code{$} or the end of the string. The result of @code{crypt}
-will be the @var{salt}, followed by a @code{$} if the salt didn't end
-with one, followed by 22 characters from the alphabet
-@code{./0-9A-Za-z}, up to 34 characters total. Every character in the
-@var{key} is significant.
-
-For the DES-based algorithm, the @var{salt} should consist of two
-characters from the alphabet @code{./0-9A-Za-z}, and the result of
-@code{crypt} will be those two characters followed by 11 more from the
-same alphabet, 13 in total. Only the first 8 characters in the
-@var{key} are significant.
-
-The MD5-based algorithm has no limit on the useful length of the
-password used, and is slightly more secure. It is therefore preferred
-over the DES-based algorithm.
-
-When the user enters their password for the first time, the @var{salt}
-should be set to a new string which is reasonably random. To verify a
-password against the result of a previous call to @code{crypt}, pass
-the result of the previous call as the @var{salt}.
+The function @code{crypt} converts a passphrase string, @var{phrase},
+into a one-way hash suitable for storage in the user database. The
+string that it returns will consist entirely of printable ASCII
+characters. It will not contain whitespace, nor any of the characters
+@samp{:}, @samp{;}, @samp{*}, @samp{!}, or @samp{\}.
+
+The @var{salt} parameter controls which one-way function is used, and
+it also ensures that the output of the one-way function is different
+for every user, even if they have the same passphrase. This makes it
+harder to guess passphrases from a large user database. Without salt,
+the attacker could make a guess, run @code{crypt} on it once, and
+compare the result with all the hashes. Salt forces the attacker to
+make separate calls to @code{crypt} for each user.
+
+To verify a passphrase, pass the previously hashed passphrase as the
+@var{salt}. To hash a new passphrase for storage, set @var{salt} to a
+string consisting of a prefix plus a sequence of randomly chosen
+characters, according to this table:
+
+@multitable @columnfractions .2 .1 .3
+@headitem One-way function @tab Prefix @tab Random sequence
+@item SHA-2-512
+@tab @samp{$6$}
+@tab 16 characters
+@item SHA-2-256
+@tab @samp{$5$}
+@tab 16 characters
+@item MD5
+@tab @samp{$1$}
+@tab 8 characters
+@item DES
+@tab @samp{}
+@tab 2 characters
+@end multitable
+
+In all cases, the random characters should be chosen from the alphabet
+@code{./0-9A-Za-z}.
+
+With all of the hash functions @emph{except} DES, @var{phrase} can be
+arbitrarily long, and all eight bits of each byte are significant.
+With DES, only the first eight characters of @var{phrase} affect the
+output, and the eighth bit of each byte is also ignored.
+
+@code{crypt} can fail. Some implementations return @code{NULL} on
+failure, and others return an @emph{invalid} hashed passphrase, which
+will begin with a @samp{*} and will not be the same as @var{salt}. In
+either case, @code{errno} will be set to indicate the problem. Some
+of the possible error codes are:
+
+@table @code
+@item EINVAL
+@var{salt} is invalid; neither a previously hashed passphrase, nor a
+well-formed new salt for any of the supported hash functions.
+
+@item EPERM
+The system configuration forbids use of the hash function selected by
+@var{salt}.
+
+@item ENOMEM
+Failed to allocate internal scratch storage.
+
+@item ENOSYS
+@itemx EOPNOTSUPP
+Hashing passphrases is not supported at all, or the hash function
+selected by @var{salt} is not supported. @Theglibc{} does not use
+these error codes, but they may be encountered on other operating
+systems.
+@end table
+
+@code{crypt} uses static storage for both internal scratchwork and the
+string it returns. It is not safe to call @code{crypt} from multiple
+threads simultaneously, and the string it returns will be overwritten
+by any subsequent call to @code{crypt}.
+
+@code{crypt} is specified in the X/Open Portability Guide and is
+present on nearly all historical Unix systems. However, the XPG does
+not specify any one-way functions.
+
+@code{crypt} is declared in @file{unistd.h}. @Theglibc{} also
+declares this function in @file{crypt.h}.
 @end deftypefun
 
-@deftypefun {char *} crypt_r (const char *@var{key}, const char *@var{salt}, {struct crypt_data *} @var{data})
+@deftypefun {char *} crypt_r (const char *@var{phrase}, const char *@var{salt}, struct crypt_data *@var{data})
 @standards{GNU, crypt.h}
 @safety{@prelim{}@mtsafe{}@asunsafe{@asucorrupt{} @asulock{} @ascuheap{} @ascudlopen{}}@acunsafe{@aculock{} @acsmem{}}}
+@tindex struct crypt_data
 @c Compared with crypt, this function fixes the @mtasurace:crypt
 @c problem, but nothing else.
 
-The @code{crypt_r} function does the same thing as @code{crypt}, but
-takes an extra parameter which includes space for its result (among
-other things), so it can be reentrant. @code{data@w{->}initialized} must be
-cleared to zero before the first time @code{crypt_r} is called.
-
-The @code{crypt_r} function is a GNU extension.
+The function @code{crypt_r} is a thread-safe version of @code{crypt}.
+Instead of static storage, it uses the memory pointed to by its
+@var{data} argument for both scratchwork and the string it returns.
+It can safely be used from multiple threads, as long as different
+@var{data} objects are used in each thread. The string it returns
+will still be overwritten by another call with the same @var{data}.
+
+@var{data} must point to a @code{struct crypt_data} object allocated
+by the caller. All of the fields of @code{struct crypt_data} are
+private, but before one of these objects is used for the first time,
+it must be initialized to all zeroes, using @code{memset} or similar.
+After that, it can be reused for many calls to @code{crypt_r} without
+erasing it again. @code{struct crypt_data} is very large, so it is
+best to allocate it with @code{malloc} rather than as a local
+variable. @xref{Memory Allocation}.
+
+@code{crypt_r} is a GNU extension. It is declared in @file{crypt.h},
+as is @code{struct crypt_data}.
 @end deftypefun
 
-The @code{crypt} and @code{crypt_r} functions are prototyped in the
-header @file{crypt.h}.
-
-The following short program is an example of how to use @code{crypt} the
-first time a password is entered. Note that the @var{salt} generation
-is just barely acceptable; in particular, it is not unique between
-machines, and in many applications it would not be acceptable to let an
-attacker know what time the user's password was last set.
+The following program shows how to use @code{crypt} the first time a
+passphrase is entered. It uses @code{getentropy} to make the salt as
+unpredictable as possible; @pxref{Unpredictable Bytes}.
 
 @smallexample
 @include genpass.c.texi
 @end smallexample
 
-The next program shows how to verify a password. It prompts the user
-for a password and prints ``Access granted.'' if the user types
-@code{GNU libc manual}.
+The next program demonstrates how to verify a passphrase. It checks a
+hash hardcoded into the program, because looking up real users' hashed
+passphrases may require special privileges (@pxref{User Database}).
+It also shows that different one-way functions produce different
+hashes for the same passphrase.
 
 @smallexample
 @include testpass.c.texi
@@ -123,93 +202,121 @@ for a password and prints ``Access granted.'' if the user types
 
 @node Unpredictable Bytes
 @section Generating Unpredictable Bytes
-
-Some cryptographic applications (such as session key generation) need
-unpredictable bytes.
-
-In general, application code should use a deterministic random bit
-generator, which could call the @code{getentropy} function described
-below internally to obtain randomness to seed the generator. The
-@code{getrandom} function is intended for low-level applications which
-need additional control over the blocking behavior.
+@cindex randomness source
+@cindex random numbers, cryptographic
+@cindex pseudo-random numbers, cryptographic
+@cindex cryptographic random number generator
+@cindex deterministic random bit generator
+@cindex CRNG
+@cindex CSPRNG
+@cindex DRBG
+
+Cryptographic applications often need some random data that will be as
+difficult as possible for a hostile eavesdropper to guess. For
+instance, encryption keys should be chosen at random, and the ``salt''
+strings used by @code{crypt} (@pxref{Passphrase Storage}) should also
+be chosen at random.
+
+Some pseudo-random number generators do not provide unpredictable-enough
+output for cryptographic applications; @pxref{Pseudo-Random Numbers}.
+Such applications need to use a @dfn{cryptographic random number
+generator} (CRNG), also sometimes called a @dfn{cryptographically strong
+pseudo-random number generator} (CSPRNG) or @dfn{deterministic random
+bit generator} (DRBG).
+
+Currently, @theglibc{} does not provide a cryptographic random number
+generator, but it does provide functions that read random data from a
+@dfn{randomness source} supplied by the operating system. The
+randomness source is a CRNG at heart, but it also continually
+``re-seeds'' itself from physical sources of randomness, such as
+electronic noise and clock jitter. This means applications do not need
+to do anything to ensure that the random numbers it produces are
+different on each run.
+
+The catch, however, is that these functions will only produce
+relatively short random strings in any one call. Often this is not a
+problem, but applications that need more than a few kilobytes of
+cryptographically strong random data should call these functions once
+and use their output to seed a CRNG.
+
+Most applications should use @code{getentropy}. The @code{getrandom}
+function is intended for low-level applications which need additional
+control over blocking behavior.
 
 @deftypefun int getentropy (void *@var{buffer}, size_t @var{length})
 @standards{GNU, sys/random.h}
 @safety{@mtsafe{}@assafe{}@acsafe{}}
 
-This function writes @var{length} bytes of random data to the array
-starting at @var{buffer}, which must be at most 256 bytes long. The
-function returns zero on success. On failure, it returns @code{-1} and
-@code{errno} is updated accordingly.
-
-The @code{getentropy} function is declared in the header file
-@file{sys/random.h}. It is derived from OpenBSD.
-
-The @code{getentropy} function is not a cancellation point. A call to
-@code{getentropy} can block if the system has just booted and the kernel
-entropy pool has not yet been initialized. In this case, the function
-will keep blocking even if a signal arrives, and return only after the
-entropy pool has been initialized.
-
-The @code{getentropy} function can fail with several errors, some of
-which are listed below.
+This function writes exactly @var{length} bytes of random data to the
+array starting at @var{buffer}. @var{length} can be no more than 256.
+On success, it returns zero. On failure, it returns @math{-1}, and
+@code{errno} is set to indicate the problem. Some of the possible
+errors are listed below.
 
 @table @code
 @item ENOSYS
-The kernel does not implement the required system call.
+The operating system does not implement a randomness source, or does
+not support this way of accessing it. (For instance, the system call
+used by this function was added to the Linux kernel in version 3.17.)
 
 @item EFAULT
 The combination of @var{buffer} and @var{length} arguments specifies
 an invalid memory range.
 
 @item EIO
-More than 256 bytes of randomness have been requested, or the buffer
-could not be overwritten with random data for an unspecified reason.
-
+@var{length} is larger than 256, or the kernel entropy pool has
+suffered a catastrophic failure.
 @end table
 
+A call to @code{getentropy} can only block when the system has just
+booted and the randomness source has not yet been initialized.
+However, if it does block, it cannot be interrupted by signals or
+thread cancellation. Programs intended to run in very early stages of
+the boot process may need to use @code{getrandom} in non-blocking mode
+instead, and be prepared to cope with random data not being available
+at all.
+
+The @code{getentropy} function is declared in the header file
+@file{sys/random.h}. It is derived from OpenBSD.
 @end deftypefun
 
 @deftypefun ssize_t getrandom (void *@var{buffer}, size_t @var{length}, unsigned int @var{flags})
 @standards{GNU, sys/random.h}
 @safety{@mtsafe{}@assafe{}@acsafe{}}
 
-This function writes @var{length} bytes of random data to the array
-starting at @var{buffer}. On success, this function returns the number
-of bytes which have been written to the buffer (which can be less than
-@var{length}). On error, @code{-1} is returned, and @code{errno} is
-updated accordingly.
-
-The @code{getrandom} function is declared in the header file
-@file{sys/random.h}. It is a GNU extension.
-
-The following flags are defined for the @var{flags} argument:
+This function writes up to @var{length} bytes of random data to the
+array starting at @var{buffer}. The @var{flags} argument should be
+either zero, or the bitwise OR of some of the following flags:
 
 @table @code
 @item GRND_RANDOM
-Use the @file{/dev/random} (blocking) pool instead of the
-@file{/dev/urandom} (non-blocking) pool to obtain randomness. If the
-@code{GRND_RANDOM} flag is specified, the @code{getrandom} function can
-block even after the randomness source has been initialized.
+Use the @file{/dev/random} (blocking) source instead of the
+@file{/dev/urandom} (non-blocking) source to obtain randomness.
+
+If this flag is specified, the call may block, potentially for quite
+some time, even after the randomness source has been initialized. If it
+is not specified, the call can only block when the system has just
+booted and the randomness source has not yet been initialized.
 
 @item GRND_NONBLOCK
 Instead of blocking, return to the caller immediately if no data is
 available.
 @end table
 
-The @code{getrandom} function is a cancellation point.
+Unlike @code{getentropy}, the @code{getrandom} function is a
+cancellation point, and if it blocks, it can be interrupted by
+signals.
 
-Obtaining randomness from the @file{/dev/urandom} pool (i.e., a call
-without the @code{GRND_RANDOM} flag) can block if the system has just
-booted and the pool has not yet been initialized.
-
-The @code{getrandom} function can fail with several errors, some of
-which are listed below. In addition, the function may not fill the
-buffer completely and return a value less than @var{length}.
+On success, @code{getrandom} returns the number of bytes which have
+been written to the buffer, which may be less than @var{length}. On
+error, it returns @math{-1}, and @code{errno} is set to indicate the
+problem. Some of the possible errors are:
 
 @table @code
 @item ENOSYS
-The kernel does not implement the @code{getrandom} system call.
+The operating system does not implement a randomness source, or does
+not support this way of accessing it. (For instance, the system call
+used by this function was added to the Linux kernel in version 3.17.)
 
 @item EAGAIN
 No random data was available and @code{GRND_NONBLOCK} was specified in
@@ -228,4 +335,7 @@ the kernel randomness pool is initialized, this can happen even if
 The @var{flags} argument contains an invalid combination of flags.
 @end table
 
+The @code{getrandom} function is declared in the header file
+@file{sys/random.h}. It is a GNU extension.
+
 @end deftypefun
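Illustration (not part of the patch): the salt table added for crypt and the getentropy description can be combined roughly as follows. This is a minimal sketch, assuming a glibc new enough to declare getentropy in sys/random.h. The helper name make_sha512_salt is hypothetical, and this is not the genpass.c program that the patch includes by reference.

```c
#include <stddef.h>
#include <string.h>
#include <sys/random.h>

/* Hypothetical helper, not part of glibc: writes "$6$" followed by 16
   random characters from the ./0-9A-Za-z alphabet into OUT, which must
   have room for at least 20 bytes.  Returns 0 on success, or -1 with
   errno set by getentropy on failure.  */
int
make_sha512_salt (char out[20])
{
  static const char alphabet[] =
    "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
  unsigned char raw[16];

  /* 16 bytes is well under the 256-byte limit of getentropy.  */
  if (getentropy (raw, sizeof raw) != 0)
    return -1;

  memcpy (out, "$6$", 3);
  for (size_t i = 0; i < sizeof raw; i++)
    /* The alphabet has exactly 64 entries, so masking to six bits
       picks each character with equal probability.  */
    out[3 + i] = alphabet[raw[i] & 0x3f];
  out[19] = '\0';
  return 0;
}
```

A caller would pass the resulting string as the salt argument of crypt or crypt_r when hashing a new passphrase. To verify a passphrase later, the stored hash itself is passed as the salt, as the revised text describes.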