From 0f04fc07f6a36e761ec76100043798ea5c7a590c Mon Sep 17 00:00:00 2001 From: Joseph Myers Date: Thu, 20 Oct 2016 23:24:44 +0000 Subject: Use VSQRT instruction for ARM sqrt (bug 20660). This patch makes ARM sqrt and sqrtf use the VSQRT VFP square root instruction when available, instead of much larger generic code for computing square roots. Now, GCC will normally inline sqrt calls except for negative arguments where errno needs to be set, and because the benchtests fail to use -fno-builtin that means no significant difference in benchmark results for sqrt (note, however, there are lots of __ieee754_sqrt calls internally in libm, which are *not* inlined - although some architectures define __ieee754_sqrt in their math_private.h for that purpose, ARM doesn't - so improving out-of-line sqrt performance is still relevant to those other functions, if not for most ordinary direct users of sqrt). With the benchtests changed to use -fno-builtin for sqrt tests, typical performance results before the change are ("max" is wildly varying in any case): "duration": 9.88358e+09, "iterations": 4.8783e+07, "max": 457.764, "min": 183.105, "mean": 202.603 and after it are: "duration": 9.45663e+09, "iterations": 2.24385e+08, "max": 274.659, "min": 30.517, "mean": 42.1447 Tested for ARM (hard-float and soft-float). [BZ #20660] * sysdeps/arm/e_sqrt.c: New file. * sysdeps/arm/e_sqrtf.c: Likewise. --- ChangeLog | 6 ++++++ 1 file changed, 6 insertions(+) (limited to 'ChangeLog') diff --git a/ChangeLog b/ChangeLog index 49436bcf4b..ef710b4194 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,9 @@ +2016-10-20 Joseph Myers + + [BZ #20660] + * sysdeps/arm/e_sqrt.c: New file. + * sysdeps/arm/e_sqrtf.c: Likewise. + 2016-10-19 Joseph Myers [BZ #20718] -- cgit 1.4.1