about summary refs log tree commit diff
path: root/manual/tunables.texi
Commit message (Collapse)AuthorAgeFilesLines
* Reversing calculation of __x86_shared_non_temporal_thresholdPatrick McGehearty2020-09-281-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The __x86_shared_non_temporal_threshold determines when memcpy on x86 uses non_temporal stores to avoid pushing other data out of the last level cache. This patch proposes to revert the calculation change made by H.J. Lu's patch of June 2, 2017. H.J. Lu's patch selected a threshold suitable for a single thread getting maximum performance. It was tuned using the single threaded large memcpy micro benchmark on an 8 core processor. The last change changes the threshold from using 3/4 of one thread's share of the cache to using 3/4 of the entire cache of a multi-threaded system before switching to non-temporal stores. Multi-threaded systems with more than a few threads are server-class and typically have many active threads. If one thread consumes 3/4 of the available cache for all threads, it will cause other active threads to have data removed from the cache. Two examples show the range of the effect. John McCalpin's widely parallel Stream benchmark, which runs in parallel and fetches data sequentially, saw a 20% slowdown with this patch on an internal system test of 128 threads. This regression was discovered when comparing OL8 performance to OL7. An example that compares normal stores to non-temporal stores may be found at https://vgatherps.github.io/2018-09-02-nontemporal/. A simple test shows performance loss of 400 to 500% due to a failure to use nontemporal stores. These performance losses are most likely to occur when the system load is heaviest and good performance is critical. The tunable x86_non_temporal_threshold can be used to override the default for the knowledgable user who really wants maximum cache allocation to a single thread in a multi-threaded system. The manual entry for the tunable has been expanded to provide more information about its purpose. modified: sysdeps/x86/cacheinfo.c modified: manual/tunables.texi
* rtld: Avoid using up static TLS surplus for optimizations [BZ #25051]Szabolcs Nagy2020-07-081-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | On some targets static TLS surplus area can be used opportunistically for dynamically loaded modules such that the TLS access then becomes faster (TLSDESC and powerpc TLS optimization). However we don't want all surplus TLS to be used for this optimization because dynamically loaded modules with initial-exec model TLS can only use surplus TLS. The new contract for surplus static TLS use is: - libc.so can have up to 192 bytes of IE TLS, - other system libraries together can have up to 144 bytes of IE TLS. - Some "optional" static TLS is available for opportunistic use. The optional TLS is now tunable: rtld.optional_static_tls, so users can directly affect the allocated static TLS size. (Note that module unloading with dlclose does not reclaim static TLS. After the optional TLS runs out, TLS access is no longer optimized to use static TLS.) The default setting of rtld.optional_static_tls is 512 so the surplus TLS is 3*192 + 4*144 + 512 = 1664 by default, the same as before. Fixes BZ #25051. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* rtld: Account static TLS surplus for audit modulesSzabolcs Nagy2020-07-081-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | The new static TLS surplus size computation is surplus_tls = 192 * (nns-1) + 144 * nns + 512 where nns is controlled via the rtld.nns tunable. This commit accounts audit modules too so nns = rtld.nns + audit modules. rtld.nns should only include the namespaces required by the application, namespaces for audit modules are accounted on top of that so audit modules don't use up the static TLS that is reserved for the application. This allows loading many audit modules without tuning rtld.nns or using up static TLS, and it fixes FAIL: elf/tst-auditmany Note that DL_NNS is currently a hard upper limit for nns, and if rtld.nns + audit modules go over the limit that's a fatal error. By default rtld.nns is 4 which allows 12 audit modules. Counting the audit modules is based on existing audit string parsing code, we cannot use GLRO(dl_naudit) before the modules are actually loaded.
* rtld: Add rtld.nns tunable for the number of supported namespacesSzabolcs Nagy2020-07-081-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | TLS_STATIC_SURPLUS is 1664 bytes currently which is not enough to support DL_NNS (== 16) number of dynamic link namespaces, if we assume 192 bytes of TLS are reserved for libc use and 144 bytes are reserved for other system libraries that use IE TLS. A new tunable is introduced to control the number of supported namespaces and to adjust the surplus static TLS size as follows: surplus_tls = 192 * (rtld.nns-1) + 144 * rtld.nns + 512 The default is rtld.nns == 4 and then the surplus TLS size is the same as before, so the behaviour is unchanged by default. If an application creates more namespaces than the rtld.nns setting allows, then it is not guaranteed to work, but the limit is not checked. So existing usage will continue to work, but in the future if an application creates more than 4 dynamic link namespaces then the tunable will need to be set. In this patch DL_NNS is a fixed value and provides a maximum to the rtld.nns setting. Static linking used fixed 2048 bytes surplus TLS, this is changed so the same contract is used as for dynamic linking. With static linking DL_NNS == 1 so rtld.nns tunable is forced to 1, so by default the surplus TLS is reduced to 144 + 512 = 656 bytes. This change is not expected to cause problems. Tested on aarch64-linux-gnu and x86_64-linux-gnu. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86: Add thresholds for "rep movsb/stosb" to tunablesH.J. Lu2020-07-061-0/+16
| | | | | | | | | | Add x86_rep_movsb_threshold and x86_rep_stosb_threshold to tunables to update thresholds for "rep movsb" and "rep stosb" at run-time. Note that the user specified threshold for "rep movsb" smaller than the minimum threshold will be ignored. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* aarch64: Add Huawei Kunpeng to tunable cpu listXuelei Zhang2019-12-191-1/+1
| | | | | | | | | | | Kunpeng processer is a 64-bit Arm-compatible CPU released by Huawei, and we have already signed a copyright assignement with the FSF. This patch adds its to cpu list, and related macro for IFUNC. Checked on aarch64-linux-gnu. Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
* Add glibc.malloc.mxfast tunableDJ Delorie2019-08-091-0/+12
| | | | | | | | | | | | * elf/dl-tunables.list: Add glibc.malloc.mxfast. * manual/tunables.texi: Document it. * malloc/malloc.c (do_set_mxfast): New. (__libc_mallopt): Call it. * malloc/arena.c: Add mxfast tunable. * malloc/tst-mxfast.c: New. * malloc/Makefile: Add it. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Small tcache improvementsWilco Dijkstra2019-05-171-1/+1
| | | | | | | | | | | | | | | | Change the tcache->counts[] entries to uint16_t - this removes the limit set by char and allows a larger tcache. Remove a few redundant asserts. bench-malloc-thread with 4 threads is ~15% faster on Cortex-A72. Reviewed-by: DJ Delorie <dj@redhat.com> * malloc/malloc.c (MAX_TCACHE_COUNT): Increase to UINT16_MAX. (tcache_put): Remove redundant assert. (tcache_get): Remove redundant asserts. (__libc_malloc): Check tcache count is not zero. * manual/tunables.texi (glibc.malloc.tcache_count): Update maximum.
* Fix tcache count maximum (BZ #24531)Wilco Dijkstra2019-05-101-2/+2
| | | | | | | | | | | | The tcache counts[] array is a char, which has a very small range and thus may overflow. When setting tcache_count tunable, there is no overflow check. However the tunable must not be larger than the maximum value of the tcache counts[] array, otherwise it can overflow when filling the tcache. [BZ #24531] * malloc/malloc.c (MAX_TCACHE_COUNT): New define. (do_set_tcache_count): Only update if count is small enough. * manual/tunables.texi (glibc.malloc.tcache_count): Document max value.
* aarch64: Add AmpereComputing emag to tunable cpu listFeng Xue2019-02-011-1/+1
| | | | | | | | | | | | | Emag is a 64-bit CPU core released by AmpereComputing. Add its name to cpu list, and corresponding macro as utilities for later IFUNC dispatch. * manual/tunables.texi (Tunable glibc.cpu.name): Add emag. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add emag. * sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_EMAG): New macro.
* [AArch64] Add ifunc support for AresWilco Dijkstra2019-01-091-1/+1
| | | | | | | | | | | | | | | Add Ares to the midr_el0 list and support ifunc dispatch. Since Ares supports 2 128-bit loads/stores, use Neon registers for memcpy by selecting __memcpy_falkor by default (we should rename this to __memcpy_simd or similar). * manual/tunables.texi (glibc.cpu.name): Add ares tunable. * sysdeps/aarch64/multiarch/memcpy.c (__libc_memcpy): Use __memcpy_falkor for ares. * sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_ARES): Add new define. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add ares cpu.
* Mutex: Add pthread mutex tunablesKemi Wang2018-12-011-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch does not have any functionality change, we only provide a spin count tunes for pthread adaptive spin mutex. The tunable glibc.pthread.mutex_spin_count tunes can be used by system administrator to squeeze system performance according to different hardware capabilities and workload characteristics. The maximum value of spin count is limited to 32767 to avoid the overflow of mutex->__data.__spins variable with the possible type of short in pthread_mutex_lock (). The default value of spin count is set to 100 with the reference to the previous number of times of spinning via trylock. This value would be architecture-specific and can be tuned with kinds of benchmarks to fit most cases in future. I would extend my appreciation sincerely to H.J.Lu for his help to refine this patch series. * manual/tunables.texi (POSIX Thread Tunables): New node. * nptl/Makefile (libpthread-routines): Add pthread_mutex_conf. * nptl/nptl-init.c: Include pthread_mutex_conf.h (__pthread_initialize_minimal_internal) [HAVE_TUNABLES]: Call __pthread_tunables_init. * nptl/pthreadP.h (MAX_ADAPTIVE_COUNT): Remove. (max_adaptive_count): Define. * nptl/pthread_mutex_conf.c: New file. * nptl/pthread_mutex_conf.h: New file. * sysdeps/generic/adaptive_spin_count.h: New file. * sysdeps/nptl/dl-tunables.list: New file. * nptl/pthread_mutex_lock.c (__pthread_mutex_lock): Use max_adaptive_count () not MAX_ADAPTIVE_COUNT. * nptl/pthread_mutex_timedlock.c (__pthrad_mutex_timedlock): Likewise. Suggested-by: Andi Kleen <andi.kleen@intel.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Signed-off-by: Kemi.wang <kemi.wang@intel.com>
* Rename the glibc.tune namespace to glibc.cpuSiddhesh Poyarekar2018-08-021-20/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The glibc.tune namespace is vaguely named since it is a 'tunable', so give it a more specific name that describes what it refers to. Rename the tunable namespace to 'cpu' to more accurately reflect what it encompasses. Also rename glibc.tune.cpu to glibc.cpu.name since glibc.cpu.cpu is weird. * NEWS: Mention the change. * elf/dl-tunables.list: Rename tune namespace to cpu. * sysdeps/powerpc/dl-tunables.list: Likewise. * sysdeps/x86/dl-tunables.list: Likewise. * sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to cpu.name. * elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust. * elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise. * manual/README.tunables: Likewise. * manual/tunables.texi: Likewise. * sysdeps/powerpc/cpu-features.c: Likewise. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (init_cpu_features): Likewise. * sysdeps/x86/cpu-features.c: Likewise. * sysdeps/x86/cpu-features.h: Likewise. * sysdeps/x86/cpu-tunables.c: Likewise. * sysdeps/x86_64/Makefile: Likewise. * sysdeps/x86/dl-cet.c: Likewise. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* x86/CET: Document glibc.tune.x86_ibt and glibc.tune.x86_shstkH.J. Lu2018-07-181-0/+28
| | | | | * manual/tunables.texi: Document glibc.tune.x86_ibt and glibc.tune.x86_shstk.
* manual: Fix spelling of "Auxiliary."Tobias Klauser2018-01-231-1/+1
|
* powerpc: POWER8 memcpy optimization for cached memoryAdhemerval Zanella2017-12-111-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On POWER8, unaligned memory accesses to cached memory has little impact on performance as opposed to its ancestors. It is disabled by default and will only be available when the tunable glibc.tune.cached_memopt is set to 1. __memcpy_power8_cached __memcpy_power7 ============================================================ max-size=4096: 33325.70 ( 12.65%) 38153.00 max-size=8192: 32878.20 ( 11.17%) 37012.30 max-size=16384: 33782.20 ( 11.61%) 38219.20 max-size=32768: 33296.20 ( 11.30%) 37538.30 max-size=65536: 33765.60 ( 10.53%) 37738.40 * manual/tunables.texi (Hardware Capability Tunables): Document glibc.tune.cached_memopt. * sysdeps/powerpc/cpu-features.c: New file. * sysdeps/powerpc/cpu-features.h: New file. * sysdeps/powerpc/dl-procinfo.c [!IS_IN(ldconfig)]: Add _dl_powerpc_cpu_features. * sysdeps/powerpc/dl-tunables.list: New file. * sysdeps/powerpc/ldsodefs.h: Include cpu-features.h. * sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h (INIT_ARCH): Initialize use_aligned_memopt. * sysdeps/powerpc/powerpc64/dl-machine.h [defined(SHARED && IS_IN(rtld))]: Restrict dl_platform_init availability and initialize CPU features used by tunables. * sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add memcpy-power8-cached. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Add __memcpy_power8_cached. * sysdeps/powerpc/powerpc64/multiarch/memcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-power8-cached.S: New file. Reviewed-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* Add elision tunablesRogerio Alves2017-12-051-0/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds several new tunables to control the behavior of elision on supported platforms[1]. Since elision now depends on tunables, we should always *compile* with elision enabled, and leave the code disabled, but available for runtime selection. This gives us *much* better compile-time testing of the existing code to avoid bit-rot[2]. Tested on ppc, ppc64, ppc64le, s390x and x86_64. [1] This part of the patch was initially proposed by Paul Murphy but was "staled" because the framework have changed since the patch was originally proposed: https://patchwork.sourceware.org/patch/10342/ [2] This part of the patch was inititally proposed as a RFC by Carlos O'Donnell. Make sense to me integrate this on the patch: https://sourceware.org/ml/libc-alpha/2017-05/msg00335.html * elf/dl-tunables.list: Add elision parameters. * manual/tunables.texi: Add entries about elision tunable. * sysdeps/unix/sysv/linux/powerpc/elision-conf.c: Add callback functions to dynamically enable/disable elision. Add multiple callbacks functions to set elision parameters. Deleted __libc_enable_secure check. * sysdeps/unix/sysv/linux/s390/elision-conf.c: Likewise. * sysdeps/unix/sysv/linux/x86/elision-conf.c: Likewise. * configure: Regenerated. * configure.ac: Option enable_lock_elision was deleted. * config.h.in: ENABLE_LOCK_ELISION flag was deleted. * config.make.in: Remove references to enable_lock_elision. * manual/install.texi: Elision configure option was removed. * INSTALL: Regenerated to remove enable_lock_elision. * nptl/Makefile: Disable elision so it can verify error case for destroying a mutex. * sysdeps/powerpc/nptl/elide.h: Cleanup ENABLE_LOCK_ELISION check. Deleted macros for the case when ENABLE_LOCK_ELISION was not defined. * sysdeps/s390/configure: Regenerated. * sysdeps/s390/configure.ac: Remove references to enable_lock_elision.. * nptl/tst-mutex8.c: Deleted all #ifndef ENABLE_LOCK_ELISION from the test. * sysdeps/powerpc/powerpc32/sysdep.h: Deleted all ENABLE_LOCK_ELISION checks. * sysdeps/powerpc/powerpc64/sysdep.h: Likewise. * sysdeps/powerpc/sysdep.h: Likewise. * sysdeps/s390/nptl/bits/pthreadtypes-arch.h: Likewise. * sysdeps/unix/sysv/linux/powerpc/force-elision.h: Likewise. * sysdeps/unix/sysv/linux/s390/elision-conf.h: Likewise. * sysdeps/unix/sysv/linux/s390/force-elision.h: Likewise. * sysdeps/unix/sysv/linux/s390/lowlevellock.h: Likewise. * sysdeps/unix/sysv/linux/s390/Makefile: Remove references to enable-lock-elision. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
* Add thunderx2t99 and thunderx2t99p1 CPU names to tunables listSteve Ellcey2017-09-081-1/+2
| | | | | | | * manual/tunables.texi (glibc.tune.cpu): Add thunderx2t99 and thunderx2t99p1 to list of cpu names. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add thunderx2t99 and thunderx2t99p1 entries to cpu_list.
* malloc: Abort on heap corruption, without a backtrace [BZ #21754]Florian Weimer2017-08-301-21/+7
| | | | | The stack trace printing caused deadlocks and has been itself been targeted by code execution exploits.
* aarch64: Optimized memcpy for Qualcomm Falkor processorSiddhesh Poyarekar2017-08-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is an optimized implementation of the memcpy routine that gives a significant gain in performance for all sizes of copies on the Qualcomm Falkor processor. A detailed rationale of the implementation is written in a comment in the patch. This implementation improves time for copies up to 128 bytes by up to 15% and for larger copies by up to 35% in the glibc microbenchmark. The memcpy-random benchmark sees improvements in all sizes in the range of 13%-18%. Here are the full numbers extracted from the glibc microbenchmark using the commands: ../benchtests/scripts/compare_strings.py benchtests/bench-memcpy.out \ ../benchtests/scripts/benchout_strings.schema.json \ -base=__memcpy_generic length align1 align2 ../benchtests/scripts/compare_strings.py benchtests/bench-memcpy-large.out \ ../benchtests/scripts/benchout_strings.schema.json \ -base=__memcpy_generic length align1 align2 ../benchtests/scripts/compare_strings.py benchtests/bench-memcpy-random.out \ ../benchtests/scripts/benchout_strings.schema.json \ -base=__memcpy_generic max-size Function: memcpy __memcpy_thunderx __memcpy_falkor __memcpy_generic Variant: default ================================================================================ length=1,align1=0,align2=0: 33.59 (-115.00%) 15.62 (0.00%) 15.62 length=1,align1=0,align2=0: 16.41 (-10.53%) 14.06 (5.26%) 14.84 length=1,align1=0,align2=0: 14.84 (0.00%) 14.84 (0.00%) 14.84 length=1,align1=0,align2=0: 15.62 (-5.26%) 14.06 (5.26%) 14.84 length=2,align1=0,align2=0: 15.62 (-5.26%) 14.06 (5.26%) 14.84 length=2,align1=1,align2=0: 15.62 (-5.26%) 14.06 (5.26%) 14.84 length=2,align1=0,align2=1: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=2,align1=1,align2=1: 14.84 (-5.56%) 14.06 (0.00%) 14.06 length=4,align1=0,align2=0: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=4,align1=2,align2=0: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=4,align1=0,align2=2: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=4,align1=2,align2=2: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=8,align1=0,align2=0: 14.84 (-5.56%) 13.28 (5.56%) 14.06 length=8,align1=3,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=8,align1=0,align2=3: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=8,align1=3,align2=3: 13.28 (-6.25%) 13.28 (-6.25%) 12.50 length=16,align1=0,align2=0: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=16,align1=4,align2=0: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=16,align1=0,align2=4: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=16,align1=4,align2=4: 13.28 (-6.25%) 12.50 (0.00%) 12.50 length=32,align1=0,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=32,align1=5,align2=0: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=32,align1=0,align2=5: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=32,align1=5,align2=5: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=64,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=64,align1=6,align2=0: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=64,align1=0,align2=6: 14.06 (5.26%) 14.06 (5.26%) 14.84 length=64,align1=6,align2=6: 14.84 (-11.77%) 14.06 (-5.88%) 13.28 length=128,align1=0,align2=0: 17.19 (-4.76%) 14.84 (9.52%) 16.41 length=128,align1=7,align2=0: 16.41 (4.55%) 15.62 (9.09%) 17.19 length=128,align1=0,align2=7: 16.41 (0.00%) 14.06 (14.29%) 16.41 length=128,align1=7,align2=7: 16.41 (4.55%) 15.62 (9.09%) 17.19 length=256,align1=0,align2=0: 21.88 (-3.70%) 21.09 (0.00%) 21.09 length=256,align1=8,align2=0: 21.09 (-3.85%) 21.09 (-3.85%) 20.31 length=256,align1=0,align2=8: 20.31 (-4.00%) 20.31 (-4.00%) 19.53 length=256,align1=8,align2=8: 21.88 (-7.69%) 20.31 (0.00%) 20.31 length=512,align1=0,align2=0: 28.91 (-2.78%) 28.91 (-2.78%) 28.12 length=512,align1=9,align2=0: 30.47 (-2.63%) 30.47 (-2.63%) 29.69 length=512,align1=0,align2=9: 29.69 (0.00%) 29.69 (0.00%) 29.69 length=512,align1=9,align2=9: 28.12 (-2.86%) 28.12 (-2.86%) 27.34 length=1024,align1=0,align2=0: 44.53 (0.00%) 44.53 (0.00%) 44.53 length=1024,align1=10,align2=0: 50.00 (0.00%) 50.00 (0.00%) 50.00 length=1024,align1=0,align2=10: 49.22 (1.56%) 50.78 (-1.56%) 50.00 length=1024,align1=10,align2=10: 44.53 (-1.79%) 43.75 (0.00%) 43.75 length=2048,align1=0,align2=0: 77.34 (-1.02%) 76.56 (0.00%) 76.56 length=2048,align1=11,align2=0: 89.84 (0.00%) 89.84 (0.00%) 89.84 length=2048,align1=0,align2=11: 89.84 (0.00%) 89.84 (0.00%) 89.84 length=2048,align1=11,align2=11: 75.78 (0.00%) 75.78 (0.00%) 75.78 length=4096,align1=0,align2=0: 141.41 (-0.56%) 140.62 (0.00%) 140.62 length=4096,align1=12,align2=0: 171.09 (-0.46%) 170.31 (0.00%) 170.31 length=4096,align1=0,align2=12: 170.31 (0.00%) 170.31 (0.00%) 170.31 length=4096,align1=12,align2=12: 140.62 (0.00%) 140.62 (0.00%) 140.62 length=8192,align1=0,align2=0: 278.91 (-0.28%) 275.78 (0.84%) 278.12 length=8192,align1=13,align2=0: 338.28 (0.23%) 335.94 (0.92%) 339.06 length=8192,align1=0,align2=13: 338.28 (0.00%) 455.47 (-34.64%) 338.28 length=8192,align1=13,align2=13: 278.12 (-0.28%) 275.78 (0.56%) 277.34 length=16384,align1=0,align2=0: 535.94 (-0.15%) 531.25 (0.73%) 535.16 length=16384,align1=14,align2=0: 659.38 (0.12%) 659.38 (0.12%) 660.16 length=16384,align1=0,align2=14: 659.38 (0.00%) 657.03 (0.36%) 659.38 length=16384,align1=14,align2=14: 535.16 (0.44%) 532.81 (0.87%) 537.50 length=32768,align1=0,align2=0: 1260.94 (10.68%) 1121.88 (20.53%) 1411.72 length=32768,align1=15,align2=0: 1368.75 (10.02%) 1376.56 (9.50%) 1521.09 length=32768,align1=0,align2=15: 1333.59 (10.91%) 1373.44 (8.25%) 1496.88 length=32768,align1=15,align2=15: 1256.25 (13.96%) 1125.78 (22.90%) 1460.16 length=65536,align1=0,align2=0: 2853.91 (30.11%) 2589.06 (36.60%) 4083.59 length=65536,align1=16,align2=0: 2850.00 (30.14%) 2589.84 (36.52%) 4079.69 length=65536,align1=0,align2=16: 2853.12 (30.60%) 2589.84 (37.00%) 4110.94 length=65536,align1=16,align2=16: 2850.78 (30.07%) 2589.06 (36.49%) 4076.56 length=0,align1=0,align2=0: 15.62 (-5.26%) 16.41 (-10.53%) 14.84 length=0,align1=0,align2=0: 14.84 (-5.56%) 14.84 (-5.56%) 14.06 length=0,align1=0,align2=0: 14.84 (0.00%) 14.84 (0.00%) 14.84 length=0,align1=0,align2=0: 16.41 (-16.67%) 14.84 (-5.56%) 14.06 length=1,align1=0,align2=0: 15.62 (4.76%) 15.62 (4.76%) 16.41 length=1,align1=1,align2=0: 15.62 (0.00%) 14.84 (5.00%) 15.62 length=1,align1=0,align2=1: 14.84 (0.00%) 14.84 (0.00%) 14.84 length=1,align1=1,align2=1: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=2,align1=0,align2=0: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=2,align1=2,align2=0: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=2,align1=0,align2=2: 14.84 (-5.56%) 14.06 (0.00%) 14.06 length=2,align1=2,align2=2: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=3,align1=0,align2=0: 14.84 (0.00%) 14.84 (0.00%) 14.84 length=3,align1=3,align2=0: 14.84 (-5.56%) 14.06 (0.00%) 14.06 length=3,align1=0,align2=3: 15.62 (-11.11%) 14.06 (0.00%) 14.06 length=3,align1=3,align2=3: 14.84 (0.00%) 14.06 (5.26%) 14.84 length=4,align1=0,align2=0: 17.97 (-27.78%) 14.06 (0.00%) 14.06 length=4,align1=4,align2=0: 13.28 (5.56%) 14.06 (0.00%) 14.06 length=4,align1=0,align2=4: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=4,align1=4,align2=4: 13.28 (5.56%) 13.28 (5.56%) 14.06 length=5,align1=0,align2=0: 13.28 (5.56%) 13.28 (5.56%) 14.06 length=5,align1=5,align2=0: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=5,align1=0,align2=5: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=5,align1=5,align2=5: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=6,align1=0,align2=0: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=6,align1=6,align2=0: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=6,align1=0,align2=6: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=6,align1=6,align2=6: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=7,align1=0,align2=0: 14.84 (-11.77%) 14.06 (-5.88%) 13.28 length=7,align1=7,align2=0: 13.28 (0.00%) 14.06 (-5.88%) 13.28 length=7,align1=0,align2=7: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=7,align1=7,align2=7: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=8,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=8,align1=8,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=8,align1=0,align2=8: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=8,align1=8,align2=8: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=9,align1=0,align2=0: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=9,align1=9,align2=0: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=9,align1=0,align2=9: 13.28 (0.00%) 14.06 (-5.88%) 13.28 length=9,align1=9,align2=9: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=10,align1=0,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=10,align1=10,align2=0: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=10,align1=0,align2=10: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=10,align1=10,align2=10: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=11,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=11,align1=11,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=11,align1=0,align2=11: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=11,align1=11,align2=11: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=12,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=12,align1=12,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=12,align1=0,align2=12: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=12,align1=12,align2=12: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=13,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=13,align1=13,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=13,align1=0,align2=13: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=13,align1=13,align2=13: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=14,align1=0,align2=0: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=14,align1=14,align2=0: 13.28 (5.56%) 13.28 (5.56%) 14.06 length=14,align1=0,align2=14: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=14,align1=14,align2=14: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=15,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=15,align1=15,align2=0: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=15,align1=0,align2=15: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=15,align1=15,align2=15: 13.28 (0.00%) 14.06 (-5.88%) 13.28 length=16,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=16,align1=16,align2=0: 13.28 (5.56%) 14.06 (0.00%) 14.06 length=16,align1=0,align2=16: 14.84 (-11.77%) 13.28 (0.00%) 13.28 length=16,align1=16,align2=16: 13.28 (-6.25%) 12.50 (0.00%) 12.50 length=17,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=17,align1=17,align2=0: 14.84 (-11.77%) 12.50 (5.88%) 13.28 length=17,align1=0,align2=17: 14.84 (-5.56%) 12.50 (11.11%) 14.06 length=17,align1=17,align2=17: 14.84 (-11.77%) 12.50 (5.88%) 13.28 length=18,align1=0,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=18,align1=18,align2=0: 13.28 (5.56%) 12.50 (11.11%) 14.06 length=18,align1=0,align2=18: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=18,align1=18,align2=18: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=19,align1=0,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=19,align1=19,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=19,align1=0,align2=19: 14.84 (-5.56%) 12.50 (11.11%) 14.06 length=19,align1=19,align2=19: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=20,align1=0,align2=0: 14.84 (-11.77%) 12.50 (5.88%) 13.28 length=20,align1=20,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=20,align1=0,align2=20: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=20,align1=20,align2=20: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=21,align1=0,align2=0: 14.84 (-5.56%) 12.50 (11.11%) 14.06 length=21,align1=21,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=21,align1=0,align2=21: 14.84 (-11.77%) 12.50 (5.88%) 13.28 length=21,align1=21,align2=21: 13.28 (5.56%) 13.28 (5.56%) 14.06 length=22,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=22,align1=22,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=22,align1=0,align2=22: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=22,align1=22,align2=22: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=23,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=23,align1=23,align2=0: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=23,align1=0,align2=23: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=23,align1=23,align2=23: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=24,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=24,align1=24,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=24,align1=0,align2=24: 14.84 (-11.77%) 12.50 (5.88%) 13.28 length=24,align1=24,align2=24: 14.06 (-5.88%) 13.28 (0.00%) 13.28 length=25,align1=0,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=25,align1=25,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=25,align1=0,align2=25: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=25,align1=25,align2=25: 13.28 (0.00%) 13.28 (0.00%) 13.28 length=26,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=26,align1=26,align2=0: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=26,align1=0,align2=26: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=26,align1=26,align2=26: 14.06 (0.00%) 13.28 (5.56%) 14.06 length=27,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=27,align1=27,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=27,align1=0,align2=27: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=27,align1=27,align2=27: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=28,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=28,align1=28,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=28,align1=0,align2=28: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=28,align1=28,align2=28: 14.84 (-11.77%) 13.28 (0.00%) 13.28 length=29,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=29,align1=29,align2=0: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=29,align1=0,align2=29: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=29,align1=29,align2=29: 13.28 (5.56%) 12.50 (11.11%) 14.06 length=30,align1=0,align2=0: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=30,align1=30,align2=0: 13.28 (5.56%) 12.50 (11.11%) 14.06 length=30,align1=0,align2=30: 14.06 (-5.88%) 12.50 (5.88%) 13.28 length=30,align1=30,align2=30: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=31,align1=0,align2=0: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=31,align1=31,align2=0: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=31,align1=0,align2=31: 13.28 (0.00%) 12.50 (5.88%) 13.28 length=31,align1=31,align2=31: 14.06 (0.00%) 12.50 (11.11%) 14.06 length=48,align1=0,align2=0: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=48,align1=3,align2=0: 14.06 (0.00%) 14.06 (0.00%) 14.06 length=48,align1=0,align2=3: 14.06 (-5.88%) 14.06 (-5.88%) 13.28 length=48,align1=3,align2=3: 13.28 (5.56%) 14.06 (0.00%) 14.06 length=80,align1=0,align2=0: 15.62 (-11.11%) 14.84 (-5.56%) 14.06 length=80,align1=5,align2=0: 15.62 (-11.11%) 16.41 (-16.67%) 14.06 length=80,align1=0,align2=5: 14.06 (0.00%) 15.62 (-11.11%) 14.06 length=80,align1=5,align2=5: 15.62 (-5.26%) 17.19 (-15.79%) 14.84 length=96,align1=0,align2=0: 14.06 (0.00%) 14.84 (-5.56%) 14.06 length=96,align1=6,align2=0: 14.84 (-5.56%) 16.41 (-16.67%) 14.06 length=96,align1=0,align2=6: 14.06 (0.00%) 14.84 (-5.56%) 14.06 length=96,align1=6,align2=6: 14.84 (-5.56%) 17.19 (-22.22%) 14.06 length=112,align1=0,align2=0: 17.19 (-4.76%) 14.06 (14.29%) 16.41 length=112,align1=7,align2=0: 17.19 (0.00%) 16.41 (4.55%) 17.19 length=112,align1=0,align2=7: 16.41 (0.00%) 14.84 (9.52%) 16.41 length=112,align1=7,align2=7: 17.19 (0.00%) 17.19 (0.00%) 17.19 length=144,align1=0,align2=0: 17.19 (-10.00%) 17.97 (-15.00%) 15.62 length=144,align1=9,align2=0: 17.19 (-4.76%) 18.75 (-14.29%) 16.41 length=144,align1=0,align2=9: 20.31 (-8.33%) 18.75 (0.00%) 18.75 length=144,align1=9,align2=9: 18.75 (-4.35%) 18.75 (-4.35%) 17.97 length=160,align1=0,align2=0: 18.75 (-4.35%) 17.97 (0.00%) 17.97 length=160,align1=10,align2=0: 18.75 (4.00%) 18.75 (4.00%) 19.53 length=160,align1=0,align2=10: 19.53 (-4.17%) 17.97 (4.17%) 18.75 length=160,align1=10,align2=10: 18.75 (-4.35%) 18.75 (-4.35%) 17.97 length=176,align1=0,align2=0: 18.75 (-4.35%) 17.19 (4.35%) 17.97 length=176,align1=11,align2=0: 19.53 (0.00%) 19.53 (0.00%) 19.53 length=176,align1=0,align2=11: 19.53 (-4.17%) 18.75 (0.00%) 18.75 length=176,align1=11,align2=11: 18.75 (0.00%) 17.97 (4.17%) 18.75 length=192,align1=0,align2=0: 18.75 (0.00%) 17.97 (4.17%) 18.75 length=192,align1=12,align2=0: 21.09 (-8.00%) 18.75 (4.00%) 19.53 length=192,align1=0,align2=12: 18.75 (0.00%) 18.75 (0.00%) 18.75 length=192,align1=12,align2=12: 18.75 (0.00%) 17.97 (4.17%) 18.75 length=208,align1=0,align2=0: 17.97 (0.00%) 20.31 (-13.04%) 17.97 length=208,align1=13,align2=0: 19.53 (7.41%) 21.09 (0.00%) 21.09 length=208,align1=0,align2=13: 23.44 (-11.11%) 21.09 (0.00%) 21.09 length=208,align1=13,align2=13: 21.09 (-3.85%) 21.09 (-3.85%) 20.31 length=224,align1=0,align2=0: 21.09 (-8.00%) 20.31 (-4.00%) 19.53 length=224,align1=14,align2=0: 23.44 (-11.11%) 20.31 (3.70%) 21.09 length=224,align1=0,align2=14: 21.09 (3.57%) 20.31 (7.14%) 21.88 length=224,align1=14,align2=14: 20.31 (0.00%) 19.53 (3.85%) 20.31 length=240,align1=0,align2=0: 20.31 (-4.00%) 19.53 (0.00%) 19.53 length=240,align1=15,align2=0: 22.66 (0.00%) 20.31 (10.34%) 22.66 length=240,align1=0,align2=15: 20.31 (-4.00%) 20.31 (-4.00%) 19.53 length=240,align1=15,align2=15: 21.88 (0.00%) 21.09 (3.57%) 21.88 length=272,align1=0,align2=0: 20.31 (0.00%) 28.12 (-38.46%) 20.31 length=272,align1=17,align2=0: 22.66 (0.00%) 27.34 (-20.69%) 22.66 length=272,align1=0,align2=17: 25.78 (-10.00%) 28.12 (-20.00%) 23.44 length=272,align1=17,align2=17: 22.66 (-3.57%) 27.34 (-25.00%) 21.88 length=288,align1=0,align2=0: 23.44 (-7.14%) 27.34 (-25.00%) 21.88 length=288,align1=18,align2=0: 22.66 (0.00%) 27.34 (-20.69%) 22.66 length=288,align1=0,align2=18: 23.44 (-3.45%) 25.00 (-10.35%) 22.66 length=288,align1=18,align2=18: 22.66 (-3.57%) 21.88 (0.00%) 21.88 length=304,align1=0,align2=0: 21.88 (0.00%) 21.88 (0.00%) 21.88 length=304,align1=19,align2=0: 23.44 (-3.45%) 22.66 (0.00%) 22.66 length=304,align1=0,align2=19: 22.66 (0.00%) 22.66 (0.00%) 22.66 length=304,align1=19,align2=19: 22.66 (-3.57%) 21.88 (0.00%) 21.88 length=320,align1=0,align2=0: 22.66 (-3.57%) 21.88 (0.00%) 21.88 length=320,align1=20,align2=0: 22.66 (0.00%) 22.66 (0.00%) 22.66 length=320,align1=0,align2=20: 22.66 (0.00%) 22.66 (0.00%) 22.66 length=320,align1=20,align2=20: 22.66 (-3.57%) 21.88 (0.00%) 21.88 length=336,align1=0,align2=0: 21.88 (0.00%) 24.22 (-10.71%) 21.88 length=336,align1=21,align2=0: 22.66 (0.00%) 25.00 (-10.35%) 22.66 length=336,align1=0,align2=21: 25.78 (0.00%) 25.00 (3.03%) 25.78 length=336,align1=21,align2=21: 25.00 (0.00%) 23.44 (6.25%) 25.00 length=352,align1=0,align2=0: 24.22 (0.00%) 24.22 (0.00%) 24.22 length=352,align1=22,align2=0: 25.00 (0.00%) 25.00 (0.00%) 25.00 length=352,align1=0,align2=22: 25.00 (-3.23%) 25.00 (-3.23%) 24.22 length=352,align1=22,align2=22: 25.00 (-3.23%) 24.22 (0.00%) 24.22 length=368,align1=0,align2=0: 25.00 (-3.23%) 23.44 (3.23%) 24.22 length=368,align1=23,align2=0: 25.00 (0.00%) 24.22 (3.12%) 25.00 length=368,align1=0,align2=23: 25.00 (-3.23%) 25.00 (-3.23%) 24.22 length=368,align1=23,align2=23: 25.00 (-6.67%) 23.44 (0.00%) 23.44 length=384,align1=0,align2=0: 24.22 (0.00%) 24.22 (0.00%) 24.22 length=384,align1=24,align2=0: 25.00 (0.00%) 24.22 (3.12%) 25.00 length=384,align1=0,align2=24: 25.00 (0.00%) 25.78 (-3.12%) 25.00 length=384,align1=24,align2=24: 24.22 (-3.33%) 23.44 (0.00%) 23.44 length=400,align1=0,align2=0: 25.00 (-3.23%) 26.56 (-9.68%) 24.22 length=400,align1=25,align2=0: 25.78 (-3.12%) 27.34 (-9.38%) 25.00 length=400,align1=0,align2=25: 27.34 (0.00%) 27.34 (0.00%) 27.34 length=400,align1=25,align2=25: 26.56 (0.00%) 25.78 (2.94%) 26.56 length=416,align1=0,align2=0: 26.56 (-3.03%) 25.78 (0.00%) 25.78 length=416,align1=26,align2=0: 28.12 (-2.86%) 27.34 (0.00%) 27.34 length=416,align1=0,align2=26: 27.34 (-2.94%) 28.12 (-5.88%) 26.56 length=416,align1=26,align2=26: 25.78 (0.00%) 26.56 (-3.03%) 25.78 length=432,align1=0,align2=0: 27.34 (-2.94%) 25.78 (2.94%) 26.56 length=432,align1=27,align2=0: 28.12 (-2.86%) 27.34 (0.00%) 27.34 length=432,align1=0,align2=27: 27.34 (0.00%) 28.12 (-2.86%) 27.34 length=432,align1=27,align2=27: 25.78 (0.00%) 25.78 (0.00%) 25.78 length=448,align1=0,align2=0: 26.56 (-3.03%) 25.78 (0.00%) 25.78 length=448,align1=28,align2=0: 27.34 (0.00%) 27.34 (0.00%) 27.34 length=448,align1=0,align2=28: 27.34 (0.00%) 28.12 (-2.86%) 27.34 length=448,align1=28,align2=28: 25.78 (0.00%) 25.78 (0.00%) 25.78 length=464,align1=0,align2=0: 25.78 (0.00%) 28.12 (-9.09%) 25.78 length=464,align1=29,align2=0: 28.12 (-2.86%) 29.69 (-8.57%) 27.34 length=464,align1=0,align2=29: 30.47 (0.00%) 30.47 (0.00%) 30.47 length=464,align1=29,align2=29: 28.12 (0.00%) 27.34 (2.78%) 28.12 length=480,align1=0,align2=0: 29.69 (-5.56%) 28.12 (0.00%) 28.12 length=480,align1=30,align2=0: 31.25 (-2.56%) 29.69 (2.56%) 30.47 length=480,align1=0,align2=30: 29.69 (0.00%) 30.47 (-2.63%) 29.69 length=480,align1=30,align2=30: 28.12 (0.00%) 28.12 (0.00%) 28.12 length=496,align1=0,align2=0: 28.12 (0.00%) 27.34 (2.78%) 28.12 length=496,align1=31,align2=0: 30.47 (-2.63%) 29.69 (0.00%) 29.69 length=496,align1=0,align2=31: 29.69 (0.00%) 30.47 (-2.63%) 29.69 length=496,align1=31,align2=31: 28.12 (-2.86%) 28.12 (-2.86%) 27.34 length=1024,align1=0,align2=0: 44.53 (0.00%) 44.53 (0.00%) 44.53 length=1024,align1=32,align2=0: 44.53 (-1.79%) 44.53 (-1.79%) 43.75 length=1024,align1=0,align2=32: 44.53 (-1.79%) 43.75 (0.00%) 43.75 length=1024,align1=32,align2=32: 43.75 (1.75%) 43.75 (1.75%) 44.53 length=1056,align1=0,align2=0: 46.88 (-1.69%) 46.88 (-1.69%) 46.09 length=1056,align1=33,align2=0: 53.12 (0.00%) 52.34 (1.47%) 53.12 length=1056,align1=0,align2=33: 52.34 (0.00%) 53.12 (-1.49%) 52.34 length=1056,align1=33,align2=33: 46.09 (0.00%) 46.88 (-1.69%) 46.09 length=1088,align1=0,align2=0: 46.88 (-1.69%) 46.09 (0.00%) 46.09 length=1088,align1=34,align2=0: 52.34 (0.00%) 52.34 (0.00%) 52.34 length=1088,align1=0,align2=34: 53.12 (-3.03%) 53.12 (-3.03%) 51.56 length=1088,align1=34,align2=34: 46.09 (0.00%) 46.88 (-1.69%) 46.09 length=1120,align1=0,align2=0: 49.22 (-1.61%) 48.44 (0.00%) 48.44 length=1120,align1=35,align2=0: 54.69 (1.41%) 55.47 (0.00%) 55.47 length=1120,align1=0,align2=35: 57.03 (0.00%) 55.47 (2.74%) 57.03 length=1120,align1=35,align2=35: 48.44 (0.00%) 49.22 (-1.61%) 48.44 length=1152,align1=0,align2=0: 47.66 (1.61%) 48.44 (0.00%) 48.44 length=1152,align1=36,align2=0: 55.47 (-1.43%) 55.47 (-1.43%) 54.69 length=1152,align1=0,align2=36: 58.59 (-1.35%) 55.47 (4.05%) 57.81 length=1152,align1=36,align2=36: 48.44 (0.00%) 49.22 (-1.61%) 48.44 length=1184,align1=0,align2=0: 53.12 (-3.03%) 50.78 (1.52%) 51.56 length=1184,align1=37,align2=0: 61.72 (-2.60%) 57.03 (5.19%) 60.16 length=1184,align1=0,align2=37: 62.50 (-1.27%) 57.03 (7.60%) 61.72 length=1184,align1=37,align2=37: 53.12 (-1.49%) 50.78 (2.99%) 52.34 length=1216,align1=0,align2=0: 53.91 (-4.55%) 50.78 (1.52%) 51.56 length=1216,align1=38,align2=0: 60.94 (0.00%) 57.03 (6.41%) 60.94 length=1216,align1=0,align2=38: 60.16 (0.00%) 57.81 (3.90%) 60.16 length=1216,align1=38,align2=38: 52.34 (-1.52%) 50.00 (3.03%) 51.56 length=1248,align1=0,align2=0: 54.69 (-2.94%) 53.12 (0.00%) 53.12 length=1248,align1=39,align2=0: 64.06 (-1.23%) 60.16 (4.94%) 63.28 length=1248,align1=0,align2=39: 60.94 (-2.63%) 60.16 (-1.32%) 59.38 length=1248,align1=39,align2=39: 53.12 (0.00%) 52.34 (1.47%) 53.12 length=1280,align1=0,align2=0: 52.34 (-1.52%) 52.34 (-1.52%) 51.56 length=1280,align1=40,align2=0: 61.72 (3.66%) 59.38 (7.32%) 64.06 length=1280,align1=0,align2=40: 60.94 (-2.63%) 60.16 (-1.32%) 59.38 length=1280,align1=40,align2=40: 52.34 (-1.52%) 52.34 (-1.52%) 51.56 length=1312,align1=0,align2=0: 54.69 (-1.45%) 55.47 (-2.90%) 53.91 length=1312,align1=41,align2=0: 63.28 (0.00%) 62.50 (1.23%) 63.28 length=1312,align1=0,align2=41: 62.50 (0.00%) 62.50 (0.00%) 62.50 length=1312,align1=41,align2=41: 53.91 (0.00%) 54.69 (-1.45%) 53.91 length=1344,align1=0,align2=0: 54.69 (0.00%) 54.69 (0.00%) 54.69 length=1344,align1=42,align2=0: 62.50 (0.00%) 62.50 (0.00%) 62.50 length=1344,align1=0,align2=42: 62.50 (-1.27%) 62.50 (-1.27%) 61.72 length=1344,align1=42,align2=42: 53.91 (0.00%) 53.91 (0.00%) 53.91 length=1376,align1=0,align2=0: 65.62 (-16.67%) 68.75 (-22.22%) 56.25 length=1376,align1=43,align2=0: 71.88 (-9.52%) 73.44 (-11.90%) 65.62 length=1376,align1=0,align2=43: 72.66 (-12.05%) 74.22 (-14.46%) 64.84 length=1376,align1=43,align2=43: 64.06 (-13.89%) 67.97 (-20.83%) 56.25 length=1408,align1=0,align2=0: 57.03 (-1.39%) 68.75 (-22.22%) 56.25 length=1408,align1=44,align2=0: 65.62 (-1.20%) 73.44 (-13.25%) 64.84 length=1408,align1=0,align2=44: 64.84 (0.00%) 74.22 (-14.46%) 64.84 length=1408,align1=44,align2=44: 56.25 (-1.41%) 68.75 (-23.94%) 55.47 length=1440,align1=0,align2=0: 67.97 (-14.47%) 64.84 (-9.21%) 59.38 length=1440,align1=45,align2=0: 74.22 (-10.47%) 68.75 (-2.33%) 67.19 length=1440,align1=0,align2=45: 72.66 (-6.90%) 69.53 (-2.30%) 67.97 length=1440,align1=45,align2=45: 65.62 (-13.51%) 58.59 (-1.35%) 57.81 length=1472,align1=0,align2=0: 66.41 (-14.86%) 58.59 (-1.35%) 57.81 length=1472,align1=46,align2=0: 73.44 (-9.30%) 67.19 (0.00%) 67.19 length=1472,align1=0,align2=46: 70.31 (-4.65%) 67.97 (-1.16%) 67.19 length=1472,align1=46,align2=46: 57.81 (0.00%) 58.59 (-1.35%) 57.81 length=1504,align1=0,align2=0: 60.94 (0.00%) 60.94 (0.00%) 60.94 length=1504,align1=47,align2=0: 71.09 (-1.11%) 70.31 (0.00%) 70.31 length=1504,align1=0,align2=47: 70.31 (-1.12%) 70.31 (-1.12%) 69.53 length=1504,align1=47,align2=47: 60.94 (-1.30%) 60.16 (0.00%) 60.16 length=1536,align1=0,align2=0: 62.50 (-3.90%) 60.16 (0.00%) 60.16 length=1536,align1=48,align2=0: 60.94 (-1.30%) 60.16 (0.00%) 60.16 length=1536,align1=0,align2=48: 61.72 (-3.95%) 60.16 (-1.32%) 59.38 length=1536,align1=48,align2=48: 60.94 (-1.30%) 60.16 (0.00%) 60.16 length=1568,align1=0,align2=0: 80.47 (-27.16%) 63.28 (0.00%) 63.28 length=1568,align1=49,align2=0: 86.72 (-18.09%) 72.66 (1.06%) 73.44 length=1568,align1=0,align2=49: 74.22 (-3.26%) 74.22 (-3.26%) 71.88 length=1568,align1=49,align2=49: 62.50 (0.00%) 61.72 (1.25%) 62.50 length=1600,align1=0,align2=0: 62.50 (-1.27%) 62.50 (-1.27%) 61.72 length=1600,align1=50,align2=0: 73.44 (0.00%) 71.88 (2.13%) 73.44 length=1600,align1=0,align2=50: 72.66 (0.00%) 73.44 (-1.08%) 72.66 length=1600,align1=50,align2=50: 62.50 (-1.27%) 62.50 (-1.27%) 61.72 length=1632,align1=0,align2=0: 64.84 (0.00%) 64.84 (0.00%) 64.84 length=1632,align1=51,align2=0: 75.78 (0.00%) 75.00 (1.03%) 75.78 length=1632,align1=0,align2=51: 78.91 (0.00%) 75.78 (3.96%) 78.91 length=1632,align1=51,align2=51: 64.84 (-2.47%) 64.84 (-2.47%) 63.28 length=1664,align1=0,align2=0: 64.84 (-1.22%) 64.84 (-1.22%) 64.06 length=1664,align1=52,align2=0: 75.78 (0.00%) 75.00 (1.03%) 75.78 length=1664,align1=0,align2=52: 80.47 (-0.98%) 75.78 (4.90%) 79.69 length=1664,align1=52,align2=52: 64.06 (-1.23%) 65.62 (-3.70%) 63.28 length=1696,align1=0,align2=0: 69.53 (-3.49%) 72.66 (-8.14%) 67.19 length=1696,align1=53,align2=0: 80.47 (-0.98%) 82.03 (-2.94%) 79.69 length=1696,align1=0,align2=53: 80.47 (0.96%) 82.03 (-0.96%) 81.25 length=1696,align1=53,align2=53: 68.75 (-2.33%) 72.66 (-8.14%) 67.19 length=1728,align1=0,align2=0: 67.97 (0.00%) 72.66 (-6.90%) 67.97 length=1728,align1=54,align2=0: 80.47 (-0.98%) 82.81 (-3.92%) 79.69 length=1728,align1=0,align2=54: 78.91 (-1.00%) 82.03 (-5.00%) 78.12 length=1728,align1=54,align2=54: 68.75 (0.00%) 72.66 (-5.68%) 68.75 length=1760,align1=0,align2=0: 77.34 (-12.50%) 68.75 (0.00%) 68.75 length=1760,align1=55,align2=0: 91.41 (-8.33%) 79.69 (5.56%) 84.38 length=1760,align1=0,align2=55: 88.28 (-10.78%) 80.47 (-0.98%) 79.69 length=1760,align1=55,align2=55: 77.34 (-11.24%) 68.75 (1.12%) 69.53 length=1792,align1=0,align2=0: 78.12 (-14.94%) 68.75 (-1.15%) 67.97 length=1792,align1=56,align2=0: 88.28 (-4.63%) 79.69 (5.56%) 84.38 length=1792,align1=0,align2=56: 88.28 (-9.71%) 80.47 (0.00%) 80.47 length=1792,align1=56,align2=56: 77.34 (-11.24%) 68.75 (1.12%) 69.53 length=1824,align1=0,align2=0: 72.66 (7.92%) 70.31 (10.89%) 78.91 length=1824,align1=57,align2=0: 85.94 (5.17%) 82.03 (9.48%) 90.62 length=1824,align1=0,align2=57: 82.03 (3.67%) 82.81 (2.75%) 85.16 length=1824,align1=57,align2=57: 70.31 (-1.12%) 70.31 (-1.12%) 69.53 length=1856,align1=0,align2=0: 70.31 (-1.12%) 70.31 (-1.12%) 69.53 length=1856,align1=58,align2=0: 83.59 (-0.94%) 82.03 (0.94%) 82.81 length=1856,align1=0,align2=58: 178.12 (-115.09%) 82.81 (0.00%) 82.81 length=1856,align1=58,align2=58: 70.31 (-1.12%) 70.31 (-1.12%) 69.53 length=1888,align1=0,align2=0: 73.44 (-1.08%) 78.91 (-8.60%) 72.66 length=1888,align1=59,align2=0: 85.94 (0.00%) 89.84 (-4.55%) 85.94 length=1888,align1=0,align2=59: 84.38 (0.00%) 89.06 (-5.56%) 84.38 length=1888,align1=59,align2=59: 72.66 (-1.09%) 78.12 (-8.70%) 71.88 length=1920,align1=0,align2=0: 72.66 (-1.09%) 78.12 (-8.70%) 71.88 length=1920,align1=60,align2=0: 85.94 (0.00%) 89.84 (-4.55%) 85.94 length=1920,align1=0,align2=60: 85.16 (0.00%) 89.06 (-4.59%) 85.16 length=1920,align1=60,align2=60: 72.66 (-1.09%) 78.91 (-9.78%) 71.88 length=1952,align1=0,align2=0: 75.00 (-1.05%) 75.00 (-1.05%) 74.22 length=1952,align1=61,align2=0: 88.28 (0.00%) 87.50 (0.88%) 88.28 length=1952,align1=0,align2=61: 87.50 (0.00%) 88.28 (-0.89%) 87.50 length=1952,align1=61,align2=61: 74.22 (0.00%) 74.22 (0.00%) 74.22 length=1984,align1=0,align2=0: 75.00 (-1.05%) 73.44 (1.05%) 74.22 length=1984,align1=62,align2=0: 89.06 (-0.89%) 87.50 (0.88%) 88.28 length=1984,align1=0,align2=62: 87.50 (0.00%) 88.28 (-0.89%) 87.50 length=1984,align1=62,align2=62: 74.22 (0.00%) 74.22 (0.00%) 74.22 length=2016,align1=0,align2=0: 77.34 (-1.02%) 76.56 (0.00%) 76.56 length=2016,align1=63,align2=0: 91.41 (-0.86%) 90.62 (0.00%) 90.62 length=2016,align1=0,align2=63: 89.84 (0.00%) 90.62 (-0.87%) 89.84 length=2016,align1=63,align2=63: 77.34 (-1.02%) 76.56 (0.00%) 76.56 length=4096,align1=0,align2=0: 141.41 (-0.56%) 146.88 (-4.44%) 140.62 Function: memcpy __memcpy_thunderx __memcpy_falkor __memcpy_generic Variant: large ================================================================================ length=65543,align1=0,align2=0: 4018.75 (3.09%) 2634.38 (36.47%) 4146.88 length=65551,align1=0,align2=3: 4425.00 (-6.47%) 3134.38 (24.59%) 4156.25 length=65567,align1=3,align2=0: 2909.38 (29.95%) 3134.38 (24.53%) 4153.12 length=65599,align1=3,align2=5: 4415.62 (-6.16%) 3134.38 (24.64%) 4159.38 length=131079,align1=0,align2=0: 5765.62 (30.38%) 5240.62 (36.72%) 8281.25 length=131087,align1=0,align2=3: 8831.25 (-6.56%) 6271.88 (24.32%) 8287.50 length=131103,align1=3,align2=0: 5793.75 (29.05%) 6268.75 (23.23%) 8165.62 length=131135,align1=3,align2=5: 5806.25 (29.97%) 6259.38 (24.50%) 8290.62 length=262151,align1=0,align2=0: 11850.00 (28.91%) 10762.50 (35.43%) 16668.80 length=262159,align1=0,align2=3: 12043.80 (27.72%) 12700.00 (23.78%) 16662.50 length=262175,align1=3,align2=0: 12046.90 (27.90%) 12687.50 (24.07%) 16709.40 length=262207,align1=3,align2=5: 11984.40 (28.08%) 12678.10 (23.91%) 16662.50 length=524295,align1=0,align2=0: 24825.00 (25.00%) 24268.80 (27.34%) 33400.00 length=524303,align1=0,align2=3: 35731.20 (-6.53%) 25678.10 (23.44%) 33540.60 length=524319,align1=3,align2=0: 25893.80 (22.71%) 25725.00 (23.22%) 33503.10 length=524351,align1=3,align2=5: 25887.50 (22.86%) 25690.60 (23.45%) 33559.40 length=1048583,align1=0,align2=0: 50621.90 (0.30%) 50600.00 (0.34%) 50771.90 length=1048591,align1=0,align2=3: 53206.20 (0.54%) 51081.20 (4.51%) 53493.80 length=1048607,align1=3,align2=0: 53221.90 (0.32%) 51975.00 (2.66%) 53393.80 length=1048639,align1=3,align2=5: 53240.60 (0.36%) 51953.10 (2.77%) 53431.20 length=2097159,align1=0,align2=0: 103744.00 (-2.00%) 102447.00 (-1.00%) 102425.00 length=2097167,align1=0,align2=3: 108588.00 (-1.00%) 105159.00 (2.00%) 107606.00 length=2097183,align1=3,align2=0: 107678.00 (0.00%) 105250.00 (2.00%) 108125.00 length=2097215,align1=3,align2=5: 107906.00 (1.00%) 105841.00 (3.00%) 109475.00 length=4194311,align1=0,align2=0: 202994.00 (0.00%) 202500.00 (1.00%) 204809.00 length=4194319,align1=0,align2=3: 213350.00 (0.00%) 205997.00 (3.00%) 213384.00 length=4194335,align1=3,align2=0: 212653.00 (0.00%) 206444.00 (3.00%) 212900.00 length=4194367,align1=3,align2=5: 213044.00 (0.00%) 206084.00 (3.00%) 213847.00 length=8388615,align1=0,align2=0: 401294.00 (0.00%) 401231.00 (0.00%) 401944.00 length=8388623,align1=0,align2=3: 480872.00 (-14.00%) 406444.00 (3.00%) 422900.00 length=8388639,align1=3,align2=0: 422147.00 (0.00%) 407750.00 (3.00%) 422803.00 length=8388671,align1=3,align2=5: 442003.00 (-5.00%) 407125.00 (3.00%) 423509.00 length=16777223,align1=0,align2=0: 799809.00 (0.00%) 800000.00 (0.00%) 801756.00 length=16777231,align1=0,align2=3: 841184.00 (0.00%) 808525.00 (4.00%) 843775.00 length=16777247,align1=3,align2=0: 841166.00 (0.00%) 810147.00 (3.00%) 843147.00 length=16777279,align1=3,align2=5: 972569.00 (-16.00%) 808588.00 (4.00%) 843731.00 length=33554439,align1=0,align2=0: 1842240.00 (-0.01%) 1863590.00 (-1.17%) 1841990.00 length=33554447,align1=0,align2=3: 2103470.00 (-2.74%) 1919460.00 (6.25%) 2047440.00 length=33554463,align1=3,align2=0: 2075690.00 (-1.07%) 1930040.00 (6.02%) 2053720.00 length=33554495,align1=3,align2=5: 2110590.00 (-2.82%) 1924440.00 (6.25%) 2052650.00 Function: memcpy __memcpy_thunderx __memcpy_falkor __memcpy_generic Variant: random ================================================================================ max-size=4096: 44061.90 (5.85%) 38568.20 (17.59%) 46799.90 max-size=8192: 42790.90 (5.27%) 38158.90 (15.52%) 45171.50 max-size=16384: 44912.10 (2.25%) 38710.40 (15.75%) 45945.00 max-size=32768: 43577.90 (1.23%) 37975.10 (13.93%) 44120.00 max-size=65536: 44375.50 (1.04%) 38474.20 (14.20%) 44840.60 * manual/tunables.texi (Tunable glibc.tune.cpu): Add falkor. * sysdeps/aarch64/multiarch/Makefile (sysdep_routines): Add memcpy_falkor. * sysdeps/aarch64/multiarch/ifunc-impl-list.c (MAX_IFUNC): Bump. (__libc_ifunc_impl_list): Add __memcpy_falkor. * sysdeps/aarch64/multiarch/memcpy.c: Likewise. * sysdeps/aarch64/multiarch/memcpy_falkor.S: New file. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (cpu_list): Add falkor. * sysdeps/unix/sysv/linux/aarch64/cpu-features.h (IS_FALKOR): New macro.
* * manual/tunables.texi: Add missing @end deftp.DJ Delorie2017-07-061-0/+1
|
* Add per-thread cache to mallocDJ Delorie2017-07-061-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | * config.make.in: Enable experimental malloc option. * configure.ac: Likewise. * configure: Regenerate. * manual/install.texi: Document it. * INSTALL: Regenerate. * malloc/Makefile: Likewise. * malloc/malloc.c: Add per-thread cache (tcache). (tcache_put): New. (tcache_get): New. (tcache_thread_freeres): New. (tcache_init): New. (__libc_malloc): Use cached chunks if available. (__libc_free): Initialize tcache if needed. (__libc_realloc): Likewise. (__libc_calloc): Likewise. (_int_malloc): Prefill tcache when appropriate. (_int_free): Likewise. (do_set_tcache_max): New. (do_set_tcache_count): New. (do_set_tcache_unsorted_limit): New. * manual/probes.texi: Document new probes. * malloc/arena.c: Add new tcache tunables. * elf/dl-tunables.list: Likewise. * manual/tunables.texi: Document them. * NEWS: Mention the per-thread cache.
* tunables, aarch64: New tunable to override cpuSiddhesh Poyarekar2017-06-301-0/+8
| | | | | | | | | | | | | | | | | | | | Add a new tunable (glibc.tune.cpu) to override CPU identification on aarch64. This is useful in two cases: one where it is desirable to pretend to be another CPU for purposes of testing or because routines written for that CPU are beneficial for specific workloads and second where the underlying kernel does not support emulation of MRS to get the MIDR of the CPU. * elf/dl-tunables.h (tunable_is_name): Move from... * elf/dl-tunables.c (is_name): ... here. (parse_tunables, __tunables_init): Adjust. * manual/tunables.texi: Document glibc.tune.cpu. * sysdeps/aarch64/dl-tunables.list: New file. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (struct cpu_list): New type. (cpu_list): New list of CPU names and their MIDR. (get_midr_from_mcpu): New function. (init_cpu_features): Override MIDR if necessary.
* x86: Rename glibc.tune.ifunc to glibc.tune.hwcapsH.J. Lu2017-06-211-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | Rename glibc.tune.ifunc to glibc.tune.hwcaps and move it to sysdeps/x86/dl-tunables.list since it is x86 specicifc. Also change type of data_cache_size, data_cache_size and non_temporal_threshold to unsigned long int to match size_t. Remove usage DEFAULT_STRLEN from cpu-tunables.c. * elf/dl-tunables.list (glibc.tune.ifunc): Removed. * sysdeps/x86/dl-tunables.list (glibc.tune.hwcaps): New. Remove security_level on all fields. * manual/tunables.texi: Replace ifunc with hwcaps. * sysdeps/x86/cpu-features.c (TUNABLE_CALLBACK (set_ifunc)): Renamed to .. (TUNABLE_CALLBACK (set_hwcaps)): This. (init_cpu_features): Updated. * sysdeps/x86/cpu-features.h (cpu_features): Change type of data_cache_size, data_cache_size and non_temporal_threshold to unsigned long int. * sysdeps/x86/cpu-tunables.c (DEFAULT_STRLEN): Removed. (TUNABLE_CALLBACK (set_ifunc)): Renamed to ... (TUNABLE_CALLBACK (set_hwcaps)): This. Update comments. Don't use DEFAULT_STRLEN.
* tunables: Add IFUNC selection and cache sizesH.J. Lu2017-06-201-0/+36
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current IFUNC selection is based on microbenchmarks in glibc. It should give the best performance for most workloads. But other choices may have better performance for a particular workload or on the hardware which wasn't available at the selection was made. The environment variable, GLIBC_TUNABLES=glibc.tune.ifunc=-xxx,yyy,-zzz...., can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature yyy and zzz, where the feature name is case-sensitive and has to match the ones in cpu-features.h. It can be used by glibc developers to override the IFUNC selection to tune for a new processor or improve performance for a particular workload. It isn't intended for normal end users. NOTE: the IFUNC selection may change over time. Please check all multiarch implementations when experimenting. Also, GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=NUMBER is provided to set threshold to use non temporal store to NUMBER, GLIBC_TUNABLES=glibc.tune.x86_data_cache_size=NUMBER to set data cache size, GLIBC_TUNABLES=glibc.tune.x86_shared_cache_size=NUMBER to set shared cache size. * elf/dl-tunables.list (tune): Add ifunc, x86_non_temporal_threshold, x86_data_cache_size and x86_shared_cache_size. * manual/tunables.texi: Document glibc.tune.ifunc, glibc.tune.x86_data_cache_size, glibc.tune.x86_shared_cache_size and glibc.tune.x86_non_temporal_threshold. * sysdeps/unix/sysv/linux/x86/dl-sysdep.c: New file. * sysdeps/x86/cpu-tunables.c: Likewise. * sysdeps/x86/cacheinfo.c (init_cacheinfo): Check and get data cache size, shared cache size and non temporal threshold from cpu_features. * sysdeps/x86/cpu-features.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE): New. [HAVE_TUNABLES] Include <unistd.h>. [HAVE_TUNABLES] Include <elf/dl-tunables.h>. [HAVE_TUNABLES] (TUNABLE_CALLBACK (set_ifunc)): Likewise. [HAVE_TUNABLES] (init_cpu_features): Use TUNABLE_GET to set IFUNC selection, data cache size, shared cache size and non temporal threshold. * sysdeps/x86/cpu-features.h (cpu_features): Add data_cache_size, shared_cache_size and non_temporal_threshold.
* tunables: Add LD_HWCAP_MASK to tunablesSiddhesh Poyarekar2017-06-071-0/+23
| | | | | | | | | | | | | | | | Add LD_HWCAP_MASK to tunables in preparation of it being removed from rtld.c. This allows us to read LD_HWCAP_MASK much earlier so that it can influence IFUNC resolution in aarch64. This patch does not actually do anything other than read the LD_HWCAP_MASK variable and add the tunables way to set the LD_HWCAP_MASK, i.e. via the glibc.tune.hwcap_mask tunable. In a follow-up patch, the _dl_hwcap_mask will be replaced with glibc.tune.hwcap_mask to complete the transition. * elf/dl-tunables.list: Add glibc.tune.hwcap_mask. * scripts/gen-tunables.awk: Include dl-procinfo.h. * manual/tunables.texi: Document glibc.tune.hwcap_mask.
* User manual documentation for tunablesSiddhesh Poyarekar2016-12-311-0/+192
Create a new node for tunables documentation and add notes for the malloc tunables. * manual/tunables.texi: New chapter. * manual/Makefile (chapters): Add it. * manual/probes.texi (@node): Point to the Tunables chapter.