author: Zack Weinberg <zackw@panix.com> 2016-09-15 07:29:44 -0400
committer: Zack Weinberg <zackw@panix.com> 2016-12-16 16:21:54 -0500
commit: ea1bd74defcf9d5291d14972e63105168ca9eb4f
tree: 98e2a212f799a95dd8c2976448d4d492a07c716b /manual/string.texi
parent: c0b4353654e635e280ec0c1972f251bf696abb36
New string function explicit_bzero (from OpenBSD).
explicit_bzero(s, n) is the same as memset(s, 0, n), except that the
compiler is not allowed to delete a call to explicit_bzero even if the
memory pointed to by 's' is dead after the call.  Right now, this effect
is achieved externally by having explicit_bzero be a function whose
semantics are unknown to the compiler, and internally, with a no-op
asm statement that clobbers memory.  This does mean that small
explicit_bzero operations cannot be expanded inline as small memset
operations can, but on the other hand, small memset operations do get
deleted by the compiler.  Hopefully full compiler support for
explicit_bzero will happen relatively soon.
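
The barrier technique described above can be sketched roughly as follows.
This is an illustration only, not the code added by this commit: the name
sketch_explicit_bzero is hypothetical, the asm syntax is a GCC/Clang
extension, and passing the pointer as an asm input is an extra precaution
assumed here so that the barrier still covers the buffer if the function
is ever inlined.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for explicit_bzero; not the glibc symbol.  */
void
sketch_explicit_bzero (void *s, size_t n)
{
  /* Ordinary zeroing; on its own the compiler could delete this
     when the buffer is dead afterwards.  */
  memset (s, 0, n);

  /* Empty asm statement with a "memory" clobber.  It emits no
     instructions, but the compiler must assume it may read any
     memory reachable from S, so the stores above cannot be
     treated as dead.  */
  __asm__ __volatile__ ("" : : "r" (s) : "memory");
}
```

Keeping the real function out of line in libc is what supplies the
"semantics unknown to the compiler" half of the scheme; the sketch only
shows the internal half.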

There are two new tests: test-explicit_bzero.c verifies the
visible semantics in the same way as the existing test-bzero.c,
and tst-xbzero-opt.c verifies the not-being-optimized-out property.
The latter is conceptually based on a test written by Matthew Dempsky
for the OpenBSD regression suite.

The crypt() implementation has an immediate use for this new feature.
We avoid having to add a GLIBC_PRIVATE alias for explicit_bzero
by running all of libcrypt's calls through the fortified variant,
__explicit_bzero_chk, which is in the impl namespace anyway.  Currently
I'm not aware of anything in libc proper that needs this, but the
glue is all in place if it does become necessary.  The legacy DES
implementation wasn't bothering to clear its buffers, so I added that,
mostly for consistency's sake.
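
For readers unfamiliar with fortified variants, the general shape of a
_chk function can be sketched as below.  This is an assumption-laden
illustration rather than the contents of debug/explicit_bzero_chk.c: the
names sketch_explicit_bzero_chk and SKETCH_BZERO are hypothetical, abort()
stands in for whatever failure handler the library uses, and
__builtin_object_size is a GCC/Clang builtin.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fortified variant: DSTLEN is the size of the
   destination object as far as the compiler could determine,
   or (size_t) -1 when it is unknown.  */
void
sketch_explicit_bzero_chk (void *dst, size_t len, size_t dstlen)
{
  /* Refuse to write past the end of the destination object.  */
  if (dstlen < len)
    abort ();

  memset (dst, 0, len);

  /* Same no-op barrier as the unfortified sketch: keeps the
     stores from being removed as dead.  */
  __asm__ __volatile__ ("" : : "r" (dst) : "memory");
}

/* Hypothetical wrapper that supplies the object size automatically.  */
#define SKETCH_BZERO(buf, len) \
  sketch_explicit_bzero_chk ((buf), (len), __builtin_object_size ((buf), 0))
```

A caller would write SKETCH_BZERO (key, sizeof key); when the object size
is unknown the builtin yields (size_t) -1 and the check always passes.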

	* string/explicit_bzero.c: New routine.
	* string/test-explicit_bzero.c, string/tst-xbzero-opt.c: New tests.
	* string/Makefile (routines, strop-tests, tests): Add them.
	* string/test-memset.c: Add ifdeffage for testing explicit_bzero.
	* string/string.h [__USE_MISC]: Declare explicit_bzero.

	* debug/explicit_bzero_chk.c: New routine.
	* debug/Makefile (routines): Add it.
	* debug/tst-chk1.c: Test fortification of explicit_bzero.
	* string/bits/string3.h: Fortify explicit_bzero.

	* manual/string.texi: Document explicit_bzero.
	* NEWS: Mention addition of explicit_bzero.

	* crypt/crypt-entry.c (__crypt_r): Clear key-dependent intermediate
	data before returning, using explicit_bzero.
	* crypt/md5-crypt.c (__md5_crypt_r): Likewise.
	* crypt/sha256-crypt.c (__sha256_crypt_r): Likewise.
	* crypt/sha512-crypt.c (__sha512_crypt_r): Likewise.

	* include/string.h: Redirect internal uses of explicit_bzero
	to __explicit_bzero_chk[_internal].
	* string/Versions [GLIBC_2.25]: Add explicit_bzero.
	* debug/Versions [GLIBC_2.25]: Add __explicit_bzero_chk.
	* sysdeps/arm/nacl/libc.abilist
	* sysdeps/unix/sysv/linux/aarch64/libc.abilist
	* sysdeps/unix/sysv/linux/alpha/libc.abilist
	* sysdeps/unix/sysv/linux/arm/libc.abilist
	* sysdeps/unix/sysv/linux/hppa/libc.abilist
	* sysdeps/unix/sysv/linux/i386/libc.abilist
	* sysdeps/unix/sysv/linux/ia64/libc.abilist
	* sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
	* sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
	* sysdeps/unix/sysv/linux/microblaze/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
	* sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
	* sysdeps/unix/sysv/linux/nios2/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
	* sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
	* sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
	* sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
	* sysdeps/unix/sysv/linux/sh/libc.abilist
	* sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
	* sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx32/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilegx/tilegx64/libc.abilist
	* sysdeps/unix/sysv/linux/tile/tilepro/libc.abilist
	* sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
	* sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist:
	Add entries for explicit_bzero and __explicit_bzero_chk.
Diffstat (limited to 'manual/string.texi')
-rw-r--r-- manual/string.texi 99
1 file changed, 99 insertions, 0 deletions
diff --git a/manual/string.texi b/manual/string.texi
index 1986357ee8..b8810d66b7 100644
--- a/manual/string.texi
+++ b/manual/string.texi
@@ -34,6 +34,8 @@ too.
 * Search Functions::            Searching for a specific element or substring.
 * Finding Tokens in a String::  Splitting a string into tokens by looking
 				 for delimiters.
+* Erasing Sensitive Data::      Clearing memory which contains sensitive
+                                 data, after it's no longer needed.
 * strfry::                      Function for flash-cooking a string.
 * Trivial Encryption::          Obscuring data.
 * Encode Binary Data::          Encoding and Decoding of Binary Data.
@@ -2404,6 +2406,103 @@ contains no '/' bytes, then "." is returned.  The prototype for this
 function can be found in @file{libgen.h}.
 @end deftypefun
 
+@node Erasing Sensitive Data
+@section Erasing Sensitive Data
+
+Sensitive data, such as cryptographic keys, should be erased from
+memory after use, to reduce the risk that a bug will expose it to the
+outside world.  However, compiler optimizations may determine that an
+erasure operation is ``unnecessary,'' and remove it from the generated
+code, because no @emph{correct} program could access the variable or
+heap object containing the sensitive data after it's deallocated.
+Since erasure is a precaution against bugs, this optimization is
+inappropriate.
+
+The function @code{explicit_bzero} erases a block of memory, and
+guarantees that the compiler will not remove the erasure as
+``unnecessary.''
+
+@smallexample
+@group
+#include <string.h>
+
+extern void encrypt (const char *key, const char *in,
+                     char *out, size_t n);
+extern void genkey (const char *phrase, char *key);
+
+void encrypt_with_phrase (const char *phrase, const char *in,
+                          char *out, size_t n)
+@{
+  char key[16];
+  genkey (phrase, key);
+  encrypt (key, in, out, n);
+  explicit_bzero (key, 16);
+@}
+@end group
+@end smallexample
+
+@noindent
+In this example, if @code{memset}, @code{bzero}, or a hand-written
+loop had been used, the compiler might remove them as ``unnecessary.''
+
+@strong{Warning:} @code{explicit_bzero} does not guarantee that
+sensitive data is @emph{completely} erased from the computer's memory.
+There may be copies in temporary storage areas, such as registers and
+``scratch'' stack space; since these are invisible to the source code,
+a library function cannot erase them.
+
+Also, @code{explicit_bzero} only operates on RAM.  If a sensitive data
+object never needs to have its address taken other than to call
+@code{explicit_bzero}, it might be stored entirely in CPU registers
+@emph{until} the call to @code{explicit_bzero}.  Then it will be
+copied into RAM, the copy will be erased, and the original will remain
+intact.  Data in RAM is more likely to be exposed by a bug than data
+in registers, so this creates a brief window where the data is at
+greater risk of exposure than it would have been if the program didn't
+try to erase it at all.
+
+Declaring sensitive variables as @code{volatile} will make both the
+above problems @emph{worse}; a @code{volatile} variable will be stored
+in memory for its entire lifetime, and the compiler will make
+@emph{more} copies of it than it would otherwise have.  Attempting to
+erase a normal variable ``by hand'' through a
+@code{volatile}-qualified pointer doesn't work at all---because the
+variable itself is not @code{volatile}, some compilers will ignore the
+qualification on the pointer and remove the erasure anyway.
+
+Having said all that, in most situations, using @code{explicit_bzero}
+is better than not using it.  At present, the only way to do a more
+thorough job is to write the entire sensitive operation in assembly
+language.  We anticipate that future compilers will recognize calls to
+@code{explicit_bzero} and take appropriate steps to erase all the
+copies of the affected data, wherever they may be.
+
+@comment string.h
+@comment BSD
+@deftypefun void explicit_bzero (void *@var{block}, size_t @var{len})
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+@code{explicit_bzero} writes zero into @var{len} bytes of memory
+beginning at @var{block}, just as @code{bzero} would.  The zeroes are
+always written, even if the compiler could determine that this is
+``unnecessary'' because no correct program could read them back.
+
+@strong{Note:} The @emph{only} optimization that @code{explicit_bzero}
+disables is removal of ``unnecessary'' writes to memory.  The compiler
+can perform all the other optimizations that it could for a call to
+@code{memset}.  For instance, it may replace the function call with
+inline memory writes, and it may assume that @var{block} cannot be a
+null pointer.
+
+@strong{Portability Note:} This function first appeared in OpenBSD 5.5
+and has not been standardized.  Other systems may provide the same
+functionality under a different name, such as @code{explicit_memset},
+@code{memset_s}, or @code{SecureZeroMemory}.
+
+@Theglibc{} declares this function in @file{string.h}, but on other
+systems it may be in @file{strings.h} instead.
+@end deftypefun
+
 @node strfry
 @section strfry