author H.J. Lu <hjl.tools@gmail.com> 2019-07-24 14:48:33 -0700
committer H.J. Lu <hjl.tools@gmail.com> 2019-07-24 14:48:43 -0700
commit 7e681561a3aea7aa8f21fb031a7c778147dfdf5b
tree 9d70b934aeae381ec82fa7b21481728bbf0ad59a /ChangeLog
parent 82c664ed751f52a3074a9d6d366e87086f10b2f4

x86-64: Compile branred.c with -mprefer-vector-width=128 [BZ #24603]

When compiled with -O3 and AVX enabled, GCC 8 and 9 vectorize some loops
in sysdeps/ieee754/dbl-64/branred.c with 256-bit vector instructions,
which leads to store-forwarding stalls:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90579

There is no easy fix in the compiler.  This patch limits the vector
width to 128 bits to work around the issue.  It improves the performance
of sin and cos by more than 40% on Skylake when glibc is compiled with
-O3 -march=skylake.
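
One rough way to see the effect of the flag is to count how many lines of
generated assembly mention a 256-bit (ymm) register, with and without
-mprefer-vector-width=128.  The sketch below is only an illustration: the
function name count_ymm_lines and the small scale() loop are hypothetical
and are not taken from branred.c, and whether the loop is vectorized with
ymm registers at all depends on the compiler version and target.

```shell
# Hypothetical helper: print the number of assembly lines that mention a
# ymm register when a small loop is compiled with -O3 -mavx2 plus the
# optional extra flag given in $1.  Prints 0 if the compiler is missing
# or rejects the flags.
count_ymm_lines() {
  ${CC:-gcc} -O3 -mavx2 $1 -x c -S -o - - 2>/dev/null <<'EOF' | grep -c ymm || true
void scale(double *restrict dst, const double *restrict src, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * 2.0;
}
EOF
}
```

Calling count_ymm_lines "" and then
count_ymm_lines -mprefer-vector-width=128 and comparing the two counts
gives a quick, if crude, indication of whether the compiler switched from
256-bit to 128-bit vectors for this loop.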

Tested with GCC 7/8/9 on x86-64.

	[BZ #24603]
	* sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128
	works.
	* sysdeps/x86_64/configure: Regenerated.
	* sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New.  Set
	to -mprefer-vector-width=128 if supported.
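
The configure check described above amounts to asking whether the
compiler accepts the option at all (GCC 7, for example, does not know
it).  A minimal sketch of such a probe follows; it is not the code in
sysdeps/x86_64/configure.ac, and the function name
probe_prefer_vector_width is a hypothetical one chosen for illustration.

```shell
# Hypothetical probe: print "yes" if the compiler accepts
# -mprefer-vector-width=128, "no" otherwise (including when no compiler
# is found).  An empty translation unit read from /dev/null is enough,
# since only option parsing is being tested.
probe_prefer_vector_width() {
  if ${CC:-gcc} -Werror -mprefer-vector-width=128 \
       -x c /dev/null -S -o /dev/null 2>/dev/null
  then
    echo yes
  else
    echo no
  fi
}
```

A build system could then add the flag to the per-file CFLAGS for
branred.c only when the probe prints "yes", so that older compilers keep
building without it.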

Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog | 9
1 file changed, 9 insertions, 0 deletions

diff --git a/ChangeLog b/ChangeLog
index 88108d1e8b..31a6b38bd5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2019-07-24  H.J. Lu  <hongjiu.lu@intel.com>
+
+	[BZ #24603]
+	* sysdeps/x86_64/configure.ac: Check if -mprefer-vector-width=128
+	works.
+	* sysdeps/x86_64/configure: Regenerated.
+	* sysdeps/x86_64/fpu/Makefile (CFLAGS-branred.c): New.  Set
+	to -mprefer-vector-width=128 if supported.
+
 2019-07-24  Florian Weimer  <fweimer@redhat.com>
 
 	* scripts/build-many-glibcs.py (Context.checkout): Default to