From 83d776f979342f923b5c3d2a5b43afab841c6086 Mon Sep 17 00:00:00 2001
From: Andrew Senkevich
Date: Sat, 19 Dec 2015 02:47:28 +0300
Subject: Added memset optimized with AVX512 for KNL hardware.

It shows improvement up to 28% over AVX2 memset (performance results
attached at ).

    * sysdeps/x86_64/multiarch/memset-avx512-no-vzeroupper.S: New file.
    * sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Added new file.
    * sysdeps/x86_64/multiarch/ifunc-impl-list.c: Added new tests.
    * sysdeps/x86_64/multiarch/memset.S: Added new IFUNC branch.
    * sysdeps/x86_64/multiarch/memset_chk.S: Likewise.
    * sysdeps/x86/cpu-features.h (bit_Prefer_No_VZEROUPPER,
    index_Prefer_No_VZEROUPPER): New.
    * sysdeps/x86/cpu-features.c (init_cpu_features): Set the
    Prefer_No_VZEROUPPER for Knights Landing.
---
 sysdeps/x86_64/multiarch/Makefile | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

(limited to 'sysdeps/x86_64/multiarch/Makefile')

diff --git a/sysdeps/x86_64/multiarch/Makefile b/sysdeps/x86_64/multiarch/Makefile
index bb811c2dfb..b2e31efe02 100644
--- a/sysdeps/x86_64/multiarch/Makefile
+++ b/sysdeps/x86_64/multiarch/Makefile
@@ -18,7 +18,8 @@ sysdep_routines += strncat-c stpncpy-c strncpy-c strcmp-ssse3 \
                    stpcpy-sse2-unaligned stpncpy-sse2-unaligned \
                    strcat-sse2-unaligned strncat-sse2-unaligned \
                    strchr-sse2-no-bsf memcmp-ssse3 strstr-sse2-unaligned \
-                   strcspn-c strpbrk-c strspn-c varshift memset-avx2
+                   strcspn-c strpbrk-c strspn-c varshift memset-avx2 \
+                   memset-avx512-no-vzeroupper
 CFLAGS-varshift.c += -msse4
 CFLAGS-strcspn-c.c += -msse4
 CFLAGS-strpbrk-c.c += -msse4
-- 
cgit 1.4.1