author | Paul Eggert <eggert@cs.ucla.edu> | 2021-04-11 19:06:00 -0700 |
---|---|---|
committer | Paul Eggert <eggert@cs.ucla.edu> | 2021-04-13 12:17:56 -0700 |
commit | bdc674d97ba8b59e22b1f45fa1a37862764fcc75 | |
tree | 66b8438f974eb3910663d1a0f047f256de376f50 | |
parent | cedbf6d5f3f70ca911176de87d6e453eeab4b7a1 | |
Improve documentation for malloc etc. (BZ#27719)
Cover key corner cases (e.g., whether errno is set) that are well
settled in glibc, fix some examples to avoid integer overflow, and
update some other dated examples (code needed for K&R C, e.g.).
* manual/charset.texi (Non-reentrant String Conversion):
* manual/filesys.texi (Symbolic Links):
* manual/memory.texi (Allocating Cleared Space):
* manual/socket.texi (Host Names):
* manual/string.texi (Concatenating Strings):
* manual/users.texi (Setting Groups):
Use reallocarray instead of realloc, to avoid integer overflow issues.
* manual/filesys.texi (Scanning Directory Content):
* manual/memory.texi (The GNU Allocator, Hooks for Malloc):
* manual/tunables.texi:
Use code font for 'malloc' instead of roman font.
(Symbolic Links): Don't assume readlink return value fits in 'int'.
* manual/memory.texi (Memory Allocation and C, Basic Allocation)
(Malloc Examples, Alloca Example):
* manual/stdio.texi (Formatted Output Functions):
* manual/string.texi (Concatenating Strings, Collation Functions):
Omit pointer casts that are needed only in ancient K&R C.
* manual/memory.texi (Basic Allocation):
Say that malloc sets errno on failure.
Say "convert" rather than "cast", since casts are no longer needed.
* manual/memory.texi (Basic Allocation):
* manual/string.texi (Concatenating Strings):
In examples, use C99 declarations after statements for brevity.
* manual/memory.texi (Malloc Examples): Add portability notes for
malloc (0), errno setting, and PTRDIFF_MAX.
(Changing Block Size): Say that realloc (p, 0) acts like
(p ? (free (p), NULL) : malloc (0)).
Add xreallocarray example, since other examples can use it.
Add portability notes for realloc (0, 0), realloc (p, 0),
PTRDIFF_MAX, and improve notes for reallocating to the same size.
(Allocating Cleared Space): Reword now-confusing discussion
about replacement, and xref "Replacing malloc".
* manual/stdio.texi (Formatted Output Functions):
Don't assume message size fits in 'int'.
* manual/string.texi (Concatenating Strings):
Fix undefined behavior involving arithmetic on a freed pointer.
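To illustrate the overflow that the switch from realloc to reallocarray avoids, here is a minimal sketch in plain C. The helper names `grow_unchecked` and `grow_checked` are hypothetical and do not appear in the manual; the sketch assumes a C library that provides `reallocarray`, such as a recent glibc.

```c
#define _GNU_SOURCE /* Exposes reallocarray on older glibc releases.  */
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper showing the pattern the commit removes: the
   multiplication is done by the caller and can wrap around before
   realloc ever sees the requested size.  */
void *
grow_unchecked (void *ptr, size_t nmemb, size_t size)
{
  return realloc (ptr, nmemb * size);
}

/* Hypothetical helper showing the replacement pattern: reallocarray
   does the multiplication itself and returns a null pointer if
   nmemb * size would overflow, leaving the original block alone.  */
void *
grow_checked (void *ptr, size_t nmemb, size_t size)
{
  return reallocarray (ptr, nmemb, size);
}
```

With `nmemb` set to `SIZE_MAX / 2 + 2` and `size` set to 2, the product wraps around to 2, so `grow_unchecked` typically returns a block far smaller than the caller expects, whereas `grow_checked` returns a null pointer.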
-rw-r--r-- | manual/charset.texi | 2 |
-rw-r--r-- | manual/filesys.texi | 10 |
-rw-r--r-- | manual/memory.texi | 125 |
-rw-r--r-- | manual/socket.texi | 2 |
-rw-r--r-- | manual/stdio.texi | 30 |
-rw-r--r-- | manual/string.texi | 41 |
-rw-r--r-- | manual/tunables.texi | 14 |
-rw-r--r-- | manual/users.texi | 2 |
8 files changed, 136 insertions, 90 deletions
diff --git a/manual/charset.texi b/manual/charset.texi
index b638323fc2..a9b5cb4a37 100644
--- a/manual/charset.texi
+++ b/manual/charset.texi
@@ -1469,7 +1469,7 @@ mbstowcs_alloc (const char *string)
   size = mbstowcs (buf, string, size);
   if (size == (size_t) -1)
     return NULL;
-  buf = xrealloc (buf, (size + 1) * sizeof (wchar_t));
+  buf = xreallocarray (buf, size + 1, sizeof *buf);
   return buf;
 @}
 @end smallexample
diff --git a/manual/filesys.texi b/manual/filesys.texi
index 73e630842e..47d929744e 100644
--- a/manual/filesys.texi
+++ b/manual/filesys.texi
@@ -735,7 +735,7 @@ the functions @code{alphasort} and @code{versionsort} below.
 
 The return value of the function is the number of entries placed in
 *@var{namelist}. If it is @code{-1} an error occurred (either the
-directory could not be opened for reading or the malloc call failed) and
+directory could not be opened for reading or memory allocation failed) and
 the global variable @code{errno} contains more information on the error.
 @end deftypefun
 
@@ -1378,13 +1378,14 @@ call @code{readlink} again. Here is an example:
 char *
 readlink_malloc (const char *filename)
 @{
-  int size = 100;
+  size_t size = 50;
   char *buffer = NULL;
 
   while (1)
     @{
-      buffer = (char *) xrealloc (buffer, size);
-      int nchars = readlink (filename, buffer, size);
+      buffer = xreallocarray (buffer, size, 2);
+      size *= 2;
+      ssize_t nchars = readlink (filename, buffer, size);
       if (nchars < 0)
         @{
           free (buffer);
@@ -1392,7 +1393,6 @@ readlink_malloc (const char *filename)
         @}
       if (nchars < size)
         return buffer;
-      size *= 2;
     @}
 @}
 @end smallexample
diff --git a/manual/memory.texi b/manual/memory.texi
index b2cc65228a..28ec2e4e63 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -254,8 +254,7 @@ address of the space. Then you can use the operators @samp{*} and
 
 @smallexample
 @{
-  struct foobar *ptr
-     = (struct foobar *) malloc (sizeof (struct foobar));
+  struct foobar *ptr = malloc (sizeof *ptr);
   ptr->name = x;
   ptr->next = current_foobar;
   current_foobar = ptr;
@@ -268,7 +267,8 @@ address of the space. Then you can use the operators @samp{*} and
 
 The @code{malloc} implementation in @theglibc{} is derived from ptmalloc
 (pthreads malloc), which in turn is derived from dlmalloc (Doug Lea malloc).
-This malloc may allocate memory in two different ways depending on their size
+This @code{malloc} may allocate memory
+in two different ways depending on their size
 and certain parameters that may be controlled by users. The most common way is
 to allocate portions of memory (called chunks) from a large contiguous area of
 memory and manage these areas to optimize their use and reduce wastage in the
@@ -583,29 +583,27 @@ this function is in @file{stdlib.h}.
 @c chunk_non_main_arena ok
 @c heap_for_ptr ok
 This function returns a pointer to a newly allocated block @var{size}
-bytes long, or a null pointer if the block could not be allocated.
+bytes long, or a null pointer (setting @code{errno})
+if the block could not be allocated.
 @end deftypefun
 
 The contents of the block are undefined; you must initialize it yourself
 (or use @code{calloc} instead; @pxref{Allocating Cleared Space}).
-Normally you would cast the value as a pointer to the kind of object
+Normally you would convert the value to a pointer to the kind of object
 that you want to store in the block. Here we show an example of doing
 so, and of initializing the space with zeros using the library function
 @code{memset} (@pxref{Copying Strings and Arrays}):
 
 @smallexample
-struct foo *ptr;
-@dots{}
-ptr = (struct foo *) malloc (sizeof (struct foo));
+struct foo *ptr = malloc (sizeof *ptr);
 if (ptr == 0) abort ();
 memset (ptr, 0, sizeof (struct foo));
 @end smallexample
 
 You can store the result of @code{malloc} into any pointer variable
 without a cast, because @w{ISO C} automatically converts the type
-@code{void *} to another type of pointer when necessary. But the cast
-is necessary in contexts other than assignment operators or if you might
-want your code to run in traditional C.
+@code{void *} to another type of pointer when necessary. However, a cast
+is necessary if the type is needed but not specified by context.
 
 Remember that when allocating space for a string, the argument to
 @code{malloc} must be one plus the length of the string. This is
@@ -613,9 +611,7 @@ because a string is terminated with a null character that doesn't count
 in the ``length'' of the string but does need space. For example:
 
 @smallexample
-char *ptr;
-@dots{}
-ptr = (char *) malloc (length + 1);
+char *ptr = malloc (length + 1);
 @end smallexample
 
 @noindent
@@ -630,6 +626,7 @@ useful to write a subroutine that calls @code{malloc} and reports an
 error if the value is a null pointer, returning only if the value is
 nonzero. This function is conventionally called @code{xmalloc}. Here
 it is:
+@cindex @code{xmalloc} function
 
 @smallexample
 void *
@@ -651,9 +648,9 @@ a newly allocated null-terminated string:
 char *
 savestring (const char *ptr, size_t len)
 @{
-  char *value = (char *) xmalloc (len + 1);
+  char *value = xmalloc (len + 1);
   value[len] = '\0';
-  return (char *) memcpy (value, ptr, len);
+  return memcpy (value, ptr, len);
 @}
 @end group
 @end smallexample
@@ -674,6 +671,27 @@ contents of another block. If you have already allocated a block and
 discover you want it to be bigger, use @code{realloc} (@pxref{Changing
 Block Size}).
 
+@strong{Portability Notes:}
+
+@itemize @bullet
+@item
+In @theglibc{}, a successful @code{malloc (0)}
+returns a non-null pointer to a newly allocated size-zero block;
+other implementations may return @code{NULL} instead.
+POSIX and the ISO C standard allow both behaviors.
+
+@item
+In @theglibc{}, a failed @code{malloc} call sets @code{errno},
+but ISO C does not require this and non-POSIX implementations
+need not set @code{errno} when failing.
+
+@item
+In @theglibc{}, @code{malloc} always fails when @var{size} exceeds
+@code{PTRDIFF_MAX}, to avoid problems with programs that subtract
+pointers or use signed indexes. Other implementations may succeed in
+this case, leading to undefined behavior later.
+@end itemize
+
 @node Freeing after Malloc
 @subsubsection Freeing Memory Allocated with @code{malloc}
 @cindex freeing memory allocated with @code{malloc}
@@ -817,10 +835,12 @@ block. If the block needs to be moved, @code{realloc} copies the old
 contents.
 
 If you pass a null pointer for @var{ptr}, @code{realloc} behaves just
-like @samp{malloc (@var{newsize})}. This can be convenient, but beware
-that older implementations (before @w{ISO C}) may not support this
-behavior, and will probably crash when @code{realloc} is passed a null
-pointer.
+like @samp{malloc (@var{newsize})}.
+Otherwise, if @var{newsize} is zero
+@code{realloc} frees the block and returns @code{NULL}.
+Otherwise, if @code{realloc} cannot reallocate the requested size
+it returns @code{NULL} and sets @code{errno}; the original block
+is left undisturbed.
 @end deftypefun
 
 @deftypefun {void *} reallocarray (void *@var{ptr}, size_t @var{nmemb}, size_t @var{size})
@@ -850,19 +870,27 @@ relocated.
 In most cases it makes no difference what happens to the original block
 when @code{realloc} fails, because the application program cannot continue
 when it is out of memory, and the only thing to do is to give a fatal error
-message. Often it is convenient to write and use a subroutine,
-conventionally called @code{xrealloc}, that takes care of the error message
+message. Often it is convenient to write and use subroutines,
+conventionally called @code{xrealloc} and @code{xreallocarray},
+that take care of the error message
 as @code{xmalloc} does for @code{malloc}:
+@cindex @code{xrealloc} and @code{xreallocarray} functions
 
 @smallexample
 void *
-xrealloc (void *ptr, size_t size)
+xreallocarray (void *ptr, size_t nmemb, size_t size)
 @{
-  void *value = realloc (ptr, size);
+  void *value = reallocarray (ptr, nmemb, size);
   if (value == 0)
     fatal ("Virtual memory exhausted");
   return value;
 @}
+
+void *
+xrealloc (void *ptr, size_t size)
+@{
+  return xreallocarray (ptr, 1, size);
+@}
 @end smallexample
 
 You can also use @code{realloc} or @code{reallocarray} to make a block
@@ -873,9 +901,28 @@ space when only a little is needed.
 In several allocation implementations, making a block smaller sometimes
 necessitates copying it, so it can fail if no other space is available.
 
-If the new size you specify is the same as the old size, @code{realloc} and
+@strong{Portability Notes:}
+
+@itemize @bullet
+@item
+Portable programs should not attempt to reallocate blocks to be size zero.
+On other implementations if @var{ptr} is non-null, @code{realloc (ptr, 0)}
+might free the block and return a non-null pointer to a size-zero
+object, or it might fail and return @code{NULL} without freeing the block.
+The ISO C17 standard allows these variations.
+
+@item
+In @theglibc{}, reallocation fails if the resulting block
+would exceed @code{PTRDIFF_MAX} in size, to avoid problems with programs
+that subtract pointers or use signed indexes. Other implementations may
+succeed, leading to undefined behavior later.
+
+@item
+In @theglibc{}, if the new size is the same as the old, @code{realloc} and
 @code{reallocarray} are guaranteed to change nothing and return the same
-address that you gave.
+address that you gave. However, POSIX and ISO C allow the functions
+to relocate the object or fail in this situation.
+@end itemize
 
 @node Allocating Cleared Space
 @subsubsection Allocating Cleared Space
@@ -916,18 +963,20 @@ You could define @code{calloc} as follows:
 void *
 calloc (size_t count, size_t eltsize)
 @{
-  size_t size = count * eltsize;
-  void *value = malloc (size);
+  void *value = reallocarray (0, count, eltsize);
   if (value != 0)
-    memset (value, 0, size);
+    memset (value, 0, count * eltsize);
   return value;
 @}
 @end smallexample
 
 But in general, it is not guaranteed that @code{calloc} calls
-@code{malloc} internally. Therefore, if an application provides its own
-@code{malloc}/@code{realloc}/@code{free} outside the C library, it
-should always define @code{calloc}, too.
+@code{reallocarray} and @code{memset} internally. For example, if the
+@code{calloc} implementation knows for other reasons that the new
+memory block is zero, it need not zero out the block again with
+@code{memset}. Also, if an application provides its own
+@code{reallocarray} outside the C library, @code{calloc} might not use
+that redefinition. @xref{Replacing malloc}.
 
 @node Aligned Memory Blocks
 @subsubsection Allocating Aligned Memory Blocks
@@ -1421,15 +1470,15 @@ should make sure to restore all the hooks to their previous value.
 When coming back from the recursive call, all the hooks should be
 resaved since a hook might modify itself.
 
-An issue to look out for is the time at which the malloc hook functions
-can be safely installed. If the hook functions call the malloc-related
-functions recursively, it is necessary that malloc has already properly
+An issue to look out for is the time at which the hook functions
+can be safely installed. If the hook functions call the @code{malloc}-related
+functions recursively, it is necessary that @code{malloc} has already properly
 initialized itself at the time when @code{__malloc_hook} etc. is
 assigned to. On the other hand, if the hook functions provide a
-complete malloc implementation of their own, it is vital that the hooks
+complete @code{malloc} implementation of their own, it is vital that the hooks
 are assigned to @emph{before} the very first @code{malloc} call has
 completed, because otherwise a chunk obtained from the ordinary,
-un-hooked malloc may later be handed to @code{__free_hook}, for example.
+un-hooked @code{malloc} may later be handed to @code{__free_hook}, for example.
 
 Here is an example showing how to use @code{__malloc_hook} and
 @code{__free_hook} properly. It installs a function that prints out
@@ -2867,7 +2916,7 @@ Here is how you would get the same results with @code{malloc} and
 int
 open2 (char *str1, char *str2, int flags, int mode)
 @{
-  char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1);
+  char *name = malloc (strlen (str1) + strlen (str2) + 1);
   int desc;
   if (name == 0)
     fatal ("virtual memory exceeded");
diff --git a/manual/socket.texi b/manual/socket.texi
index cd7c0e7b12..68c930b552 100644
--- a/manual/socket.texi
+++ b/manual/socket.texi
@@ -1539,8 +1539,8 @@ gethostname (char *host)
                                 &hp, &herr)) == ERANGE)
     @{
       /* Enlarge the buffer. */
+      tmphstbuf = reallocarray (tmphstbuf, hstbuflen, 2);
       hstbuflen *= 2;
-      tmphstbuf = realloc (tmphstbuf, hstbuflen);
     @}
 
   free (tmphstbuf);
diff --git a/manual/stdio.texi b/manual/stdio.texi
index 6ff1806281..fd7ed0cedc 100644
--- a/manual/stdio.texi
+++ b/manual/stdio.texi
@@ -2428,31 +2428,29 @@ string. Here is an example of doing this:
 char *
 make_message (char *name, char *value)
 @{
-  /* @r{Guess we need no more than 100 chars of space.} */
-  int size = 100;
-  char *buffer = (char *) xmalloc (size);
-  int nchars;
+  /* @r{Guess we need no more than 100 bytes of space.} */
+  size_t size = 100;
+  char *buffer = xmalloc (size);
 @end group
 @group
-  if (buffer == NULL)
-    return NULL;
-
   /* @r{Try to print in the allocated space.} */
-  nchars = snprintf (buffer, size, "value of %s is %s",
-                     name, value);
+  int buflen = snprintf (buffer, size, "value of %s is %s",
+                         name, value);
+  if (! (0 <= buflen && buflen < SIZE_MAX))
+    fatal ("integer overflow");
 @end group
 @group
-  if (nchars >= size)
+  if (buflen >= size)
     @{
       /* @r{Reallocate buffer now that we know
          how much space is needed.} */
-      size = nchars + 1;
-      buffer = (char *) xrealloc (buffer, size);
+      size = buflen;
+      size++;
+      buffer = xrealloc (buffer, size);
 
-      if (buffer != NULL)
-        /* @r{Try again.} */
-        snprintf (buffer, size, "value of %s is %s",
-                  name, value);
+      /* @r{Try again.} */
+      snprintf (buffer, size, "value of %s is %s",
+                name, value);
     @}
   /* @r{The last call worked, return the string.} */
   return buffer;
diff --git a/manual/string.texi b/manual/string.texi
index ad11519377..7ca5ff6c94 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -744,19 +744,17 @@ concat (const char *str, @dots{})
 @{
   va_list ap, ap2;
   size_t total = 1;
-  const char *s;
-  char *result;
 
   va_start (ap, str);
   va_copy (ap2, ap);
 
   /* @r{Determine how much space we need.} */
-  for (s = str; s != NULL; s = va_arg (ap, const char *))
+  for (const char *s = str; s != NULL; s = va_arg (ap, const char *))
     total += strlen (s);
 
   va_end (ap);
 
-  result = (char *) malloc (total);
+  char *result = malloc (total);
   if (result != NULL)
     @{
       result[0] = '\0';
@@ -786,45 +784,44 @@ efficiently:
 char *
 concat (const char *str, @dots{})
 @{
-  va_list ap;
   size_t allocated = 100;
-  char *result = (char *) malloc (allocated);
+  char *result = malloc (allocated);
 
   if (result != NULL)
     @{
+      va_list ap;
+      size_t resultlen = 0;
       char *newp;
-      char *wp;
-      const char *s;
 
       va_start (ap, str);
 
-      wp = result;
-      for (s = str; s != NULL; s = va_arg (ap, const char *))
+      for (const char *s = str; s != NULL; s = va_arg (ap, const char *))
         @{
           size_t len = strlen (s);
 
           /* @r{Resize the allocated memory if necessary.} */
-          if (wp + len + 1 > result + allocated)
+          if (resultlen + len + 1 > allocated)
             @{
-              allocated = (allocated + len) * 2;
-              newp = (char *) realloc (result, allocated);
+              allocated += len;
+              newp = reallocarray (result, allocated, 2);
+              allocated *= 2;
               if (newp == NULL)
                 @{
                   free (result);
                   return NULL;
                 @}
-              wp = newp + (wp - result);
               result = newp;
             @}
 
-          wp = mempcpy (wp, s, len);
+          memcpy (result + resultlen, s, len);
+          resultlen += len;
         @}
 
       /* @r{Terminate the result string.} */
-      *wp++ = '\0';
+      result[resultlen++] = '\0';
 
       /* @r{Resize memory to the optimal size.} */
-      newp = realloc (result, wp - result);
+      newp = realloc (result, resultlen);
       if (newp != NULL)
         result = newp;
 
@@ -1619,8 +1616,8 @@ sort_strings_fast (char **array, int nstrings)
         @{
           /* @r{Allocate the needed space. +1 for terminating}
              @r{@code{'\0'} byte.} */
-          transformed = (char *) xrealloc (transformed,
-                                           transformed_length + 1);
+          transformed = xrealloc (transformed,
+                                  transformed_length + 1);
 
           /* @r{The return value is not interesting because we know}
              @r{how long the transformed string is.} */
@@ -1663,9 +1660,9 @@ sort_strings_fast (wchar_t **array, int nstrings)
         @{
           /* @r{Allocate the needed space. +1 for terminating}
              @r{@code{L'\0'} wide character.} */
-          transformed = (wchar_t *) xrealloc (transformed,
-                                              (transformed_length + 1)
-                                              * sizeof (wchar_t));
+          transformed = xreallocarray (transformed,
+                                       transformed_length + 1,
+                                       sizeof *transformed);
 
           /* @r{The return value is not interesting because we know}
              @r{how long the transformed string is.} */
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 1b746c0fa1..6de647b426 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -10,7 +10,8 @@ their workload. These are implemented as a set of switches that may be
 modified in different ways. The current default method to do this is via the
 @env{GLIBC_TUNABLES} environment variable by setting it to a string of
 colon-separated @var{name}=@var{value} pairs. For example, the following
-example enables malloc checking and sets the malloc trim threshold to 128
+example enables @code{malloc} checking and sets the @code{malloc}
+trim threshold to 128
 bytes:
 
 @example
@@ -115,7 +116,7 @@ This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
 identical in features.
 
 Setting this tunable to a non-zero value enables a special (less
-efficient) memory allocator for the malloc family of functions that is
+efficient) memory allocator for the @code{malloc} family of functions that is
 designed to be tolerant against simple errors such as double calls of
 free with the same argument, or overruns of a single byte (off-by-one
 bugs). Not all such errors can be protected against, however, and memory
@@ -149,7 +150,7 @@ identical in features.
 
 If set to a non-zero value, memory blocks are initialized with values depending
 on some low order bits of this tunable when they are allocated (except when
-allocated by calloc) and freed. This can be used to debug the use of
+allocated by @code{calloc}) and freed. This can be used to debug the use of
 uninitialized or freed heap memory. Note that this option does not guarantee
 that the freed block will have any specific values. It only guarantees that the
 content the block had before it was freed will be overwritten.
@@ -256,13 +257,13 @@ is no limit.
 @end deftp
 
 @deftp Tunable glibc.malloc.mxfast
-One of the optimizations malloc uses is to maintain a series of ``fast
+One of the optimizations @code{malloc} uses is to maintain a series of ``fast
 bins'' that hold chunks up to a specific size. The default and
 maximum size which may be held this way is 80 bytes on 32-bit systems
 or 160 bytes on 64-bit systems. Applications which value size over
 speed may choose to reduce the size of requests which are serviced
 from fast bins with this tunable. Note that the value specified
-includes malloc's internal overhead, which is normally the size of one
+includes @code{malloc}'s internal overhead, which is normally the size of one
 pointer, so add 4 on 32-bit systems or 8 on 64-bit systems to the size
 passed to @code{malloc} for the largest bin size to enable.
 @end deftp
@@ -543,7 +544,8 @@ all other systems.
 This tunable takes a value between 0 and 255 and acts as a bitmask
 that enables various capabilities.
 
-Bit 0 (the least significant bit) causes the malloc subsystem to allocate
+Bit 0 (the least significant bit) causes the @code{malloc}
+subsystem to allocate
 tagged memory, with each allocation being assigned a random tag.
 
 Bit 1 enables precise faulting mode for tag violations on systems that
diff --git a/manual/users.texi b/manual/users.texi
index ec22ce6c1c..72da3fb714 100644
--- a/manual/users.texi
+++ b/manual/users.texi
@@ -578,7 +578,7 @@ supplementary_groups (char *user)
 
   if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
     @{
-      groups = xrealloc (ngroups * sizeof (gid_t));
+      groups = xreallocarray (ngroups, sizeof *groups);
       getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups);
     @}
   return groups;
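The string.texi hunk for the second concat example replaces the write pointer `wp` with the offset `resultlen`, so that no arithmetic is performed on `result` after `reallocarray` may have freed it. A standalone sketch of the patched function in plain C follows, with the Texinfo `@{` and `@}` escapes removed. The parts outside the hunk (the `va_end` calls and the final `return`) do not appear in the diff and are filled in here as assumptions; the sketch also assumes a C library that provides `reallocarray`.

```c
#define _GNU_SOURCE /* Exposes reallocarray on older glibc releases.  */
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate a NULL-terminated list of strings into one newly
   allocated string.  Return NULL if memory cannot be allocated.  */
char *
concat (const char *str, ...)
{
  size_t allocated = 100;
  char *result = malloc (allocated);

  if (result != NULL)
    {
      va_list ap;
      size_t resultlen = 0;  /* Offset of the next byte to write.  */
      char *newp;

      va_start (ap, str);

      for (const char *s = str; s != NULL; s = va_arg (ap, const char *))
        {
          size_t len = strlen (s);

          /* Resize the allocated memory if necessary.  */
          if (resultlen + len + 1 > allocated)
            {
              allocated += len;
              newp = reallocarray (result, allocated, 2);
              allocated *= 2;
              if (newp == NULL)
                {
                  free (result);
                  va_end (ap);  /* Assumption: not shown in the hunk.  */
                  return NULL;
                }
              /* Only the offset survives the move, so the old value
                 of result is never used in pointer arithmetic.  */
              result = newp;
            }

          memcpy (result + resultlen, s, len);
          resultlen += len;
        }

      va_end (ap);  /* Assumption: not shown in the hunk.  */

      /* Terminate the result string.  */
      result[resultlen++] = '\0';

      /* Resize memory to the optimal size.  */
      newp = realloc (result, resultlen);
      if (newp != NULL)
        result = newp;
    }

  return result;
}
```

A caller ends the argument list with a null pointer of the right type, for example `concat ("foo", "bar", (const char *) NULL)`, and frees the returned string when done.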