author Amit Pawar <Amit.Pawar@amd.com> 2016-01-14 20:06:02 +0530
committer H.J. Lu <hjl.tools@gmail.com> 2016-01-14 08:14:31 -0800
commit d7890e6947114785755ae5b1cf5310491092ee0b
tree d69fa16cd7eec8008f4224a8006128555916262c
parent a4b5177ca83ca97c562a7138923dafe0cb92d1a0
Set index_Fast_Unaligned_Load for Excavator family CPUs
GLIBC benchtests show that the SSE2_Unaligned based implementations
perform faster than the SSE2 based implementations for these
routines: strcmp, strcat, strncat, stpcpy, stpncpy, strcpy, strncpy
and strstr. Set the index_Fast_Unaligned_Load flag for Excavator
(family 0x15) CPUs, which makes the SSE2_Unaligned based
implementations the default for these routines.

	[BZ #19467]
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
	index_Fast_Unaligned_Load flag for Excavator family CPUs.
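
The effect described in the message above can be pictured with a minimal
sketch, assuming a single feature word and a two-way choice. The bit
position, the enum and the function name are hypothetical and chosen only
for illustration; glibc's actual selection is done by its multiarch
selectors, which this commit does not touch.

```c
/* Hypothetical bit position; the real value of bit_Fast_Unaligned_Load
   is not shown in this commit.  */
#define SKETCH_FAST_UNALIGNED_LOAD (1u << 4)

/* Hypothetical labels for the two implementation families named in the
   commit message.  */
enum strcpy_variant_sketch
{
  STRCPY_SSE2_SKETCH,
  STRCPY_SSE2_UNALIGNED_SKETCH
};

/* Pick the unaligned variant when the feature bit is set, otherwise
   fall back to the plain SSE2 variant.  */
enum strcpy_variant_sketch
select_strcpy_variant_sketch (unsigned int feature_word)
{
  if (feature_word & SKETCH_FAST_UNALIGNED_LOAD)
    return STRCPY_SSE2_UNALIGNED_SKETCH;
  return STRCPY_SSE2_SKETCH;
}
```

With the bit clear the sketch returns the SSE2 label; once the flag is
OR-ed into the feature word, as the patch does for Excavator, it returns
the SSE2_Unaligned label.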
-rw-r--r--  ChangeLog                   6
-rw-r--r--  sysdeps/x86/cpu-features.c  8
2 files changed, 14 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 424f7312c5..054998fd42 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2016-01-14  Amit Pawar  <amit.pawar@amd.com>
+
+	[BZ #19467]
+	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
+	index_Fast_Unaligned_Load flag for Excavator family CPUs.
+
 2016-01-02  Marcin Kościelnicki  <koriakin@0x04.net>
 
 	* sysdeps/s390/nptl/tls.h (struct tcbhead_t): Add __private_ss field.
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index e6bd4c909f..218ff2bd86 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -154,6 +154,14 @@ init_cpu_features (struct cpu_features *cpu_features)
 		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ebx,
 		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
 		 cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
+
+      if (family == 0x15)
+	{
+	  /* "Excavator"   */
+	  if (model >= 0x60 && model <= 0x7f)
+	    cpu_features->feature[index_Fast_Unaligned_Load]
+	      |= bit_Fast_Unaligned_Load;
+	}
     }
   else
     kind = arch_kind_other;
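
The family and model test in the hunk above can be tried in isolation
with a small sketch. It assumes the usual CPUID leaf 1 EAX layout (base
family in bits 11:8, base model in bits 7:4, extended model in bits
19:16, extended family in bits 27:20) and that the caller has already
established that the vendor is AMD. The struct and function names are
hypothetical and are not part of glibc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded values.  */
struct cpu_signature_sketch
{
  unsigned int family;
  unsigned int model;
};

/* Decode the display family and model from a raw CPUID leaf 1 EAX
   value.  The extended fields are folded in only when the base family
   is 0x0f, which is the case for family 0x15 parts.  */
struct cpu_signature_sketch
decode_amd_signature_sketch (uint32_t eax)
{
  struct cpu_signature_sketch sig;
  sig.family = (eax >> 8) & 0x0f;
  sig.model = (eax >> 4) & 0x0f;
  if (sig.family == 0x0f)
    {
      sig.family += (eax >> 20) & 0xff;
      sig.model += (eax >> 12) & 0xf0;
    }
  return sig;
}

/* Same range check as the patch: family 0x15, model 0x60 to 0x7f.  */
bool
is_excavator_sketch (uint32_t eax)
{
  struct cpu_signature_sketch sig = decode_amd_signature_sketch (eax);
  return sig.family == 0x15 && sig.model >= 0x60 && sig.model <= 0x7f;
}
```

For example, an EAX value of 0x00660F01 decodes to family 0x15 and model
0x60, so the check passes, while 0x00600F12 decodes to family 0x15 and
model 0x01, which falls outside the range. Both values are illustrative
inputs picked to exercise the boundaries.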