author DJ Delorie <dj@delorie.com> 2017-07-06 13:37:30 -0400
committer DJ Delorie <dj@delorie.com> 2017-07-06 13:37:30 -0400
commit d5c3fafc4307c9b7a4c7d5cb381fcdbfad340bcc
tree 380cfbc329860434d6b29825bd02ba5f0c7d4b30 /manual
parent 3cefdd7310a5d1fad45648d9346e47df9c185fdc
Add per-thread cache to malloc
* config.make.in: Enable experimental malloc option.
* configure.ac: Likewise.
* configure: Regenerate.
* manual/install.texi: Document it.
* INSTALL: Regenerate.
* malloc/Makefile: Likewise.
* malloc/malloc.c: Add per-thread cache (tcache).
(tcache_put): New.
(tcache_get): New.
(tcache_thread_freeres): New.
(tcache_init): New.
(__libc_malloc): Use cached chunks if available.
(__libc_free): Initialize tcache if needed.
(__libc_realloc): Likewise.
(__libc_calloc): Likewise.
(_int_malloc): Prefill tcache when appropriate.
(_int_free): Likewise.
(do_set_tcache_max): New.
(do_set_tcache_count): New.
(do_set_tcache_unsorted_limit): New.
* manual/probes.texi: Document new probes.
* malloc/arena.c: Add new tcache tunables.
* elf/dl-tunables.list: Likewise.
* manual/tunables.texi: Document them.
* NEWS: Mention the per-thread cache.
Diffstat (limited to 'manual')
-rw-r--r-- manual/install.texi 6
-rw-r--r-- manual/probes.texi 19
-rw-r--r-- manual/tunables.texi 32
3 files changed, 57 insertions, 0 deletions
diff --git a/manual/install.texi b/manual/install.texi
index 03eb2dd93b..b8deb9ceba 100644
--- a/manual/install.texi
+++ b/manual/install.texi
@@ -232,6 +232,12 @@ libnss_nisplus are not built at all.
 Use this option to enable libnsl with all depending NSS modules and
 header files.
 
+@item --disable-experimental-malloc
+By default, a per-thread cache is enabled in @code{malloc}.  While
+this cache can be disabled on a per-application basis using tunables
+(set @code{glibc.malloc.tcache_count} to zero), this option can be used to
+remove it from the build completely.
+
 @item --build=@var{build-system}
 @itemx --host=@var{host-system}
 These options are for cross-compiling.  If you specify both options and
diff --git a/manual/probes.texi b/manual/probes.texi
index eb91c62703..96acaed206 100644
--- a/manual/probes.texi
+++ b/manual/probes.texi
@@ -231,6 +231,25 @@ dynamic brk/mmap thresholds.  Argument @var{$arg1} and @var{$arg2} are
 the adjusted mmap and trim thresholds, respectively.
 @end deftp
 
+@deftp Probe memory_tunable_tcache_max_bytes (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_max}
+tunable is set.  Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_count (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the @code{glibc.malloc.tcache_count}
+tunable is set.  Argument @var{$arg1} is the requested value, and
+@var{$arg2} is the previous value of this tunable.
+@end deftp
+
+@deftp Probe memory_tunable_tcache_unsorted_limit (int @var{$arg1}, int @var{$arg2})
+This probe is triggered when the
+@code{glibc.malloc.tcache_unsorted_limit} tunable is set.  Argument
+@var{$arg1} is the requested value, and @var{$arg2} is the previous
+value of this tunable.
+@end deftp
+
 @node Mathematical Function Probes
 @section Mathematical Function Probes
 
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 9331b03702..b16d591b90 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -193,6 +193,38 @@ systems the limit is twice the number of cores online and on 64-bit systems, it
 is 8 times the number of cores online.
 @end deftp
 
+@deftp Tunable glibc.malloc.tcache_max
+The maximum size of a request (in bytes) which may be met via the
+per-thread cache.  The default (and maximum) value is 1032 bytes on
+64-bit systems and 516 bytes on 32-bit systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_count
+The maximum number of chunks of each size to cache.  The default is 7.
+There is no upper limit, other than available system memory.  If set
+to zero, the per-thread cache is effectively disabled.
+
+The approximate maximum overhead of the per-thread cache is thus equal
+to the number of bins times the chunk count in each bin times the size
+of each chunk.  With defaults, the maximum overhead of the per-thread
+cache is approximately 236 KB on 64-bit systems and 118 KB on 32-bit
+systems.
+@end deftp
+
+@deftp Tunable glibc.malloc.tcache_unsorted_limit
+When the user requests memory and the request cannot be met via the
+per-thread cache, the arenas are used to meet the request.  At this
+time, additional chunks will be moved from existing arena lists to
+pre-fill the corresponding cache.  While copies from the fastbins,
+smallbins, and regular bins are bounded and predictable due to the bin
+sizes, copies from the unsorted bin are not bounded, and incur
+additional time penalties as they need to be sorted as they're
+scanned.  To make scanning the unsorted list more predictable and
+bounded, the user may set this tunable to limit the number of chunks
+that are scanned from the unsorted list while searching for chunks to
+pre-fill the per-thread cache with.  By default, or when set to zero,
+there is no limit.
+@end deftp
 @node Hardware Capability Tunables
 @section Hardware Capability Tunables
 @cindex hardware capability tunables