From 71ae86478edc7b21872464f43fb29ff650c1681a Mon Sep 17 00:00:00 2001
From: Adhemerval Zanella
Date: Tue, 15 Jul 2014 12:19:09 -0400
Subject: PowerPC: memset optimization for POWER8/PPC64

This patch adds an optimized memset implementation for POWER8.  For
sizes from 0 to 255 bytes, it uses a word/doubleword algorithm similar
to the POWER7-optimized one.  For sizes greater than 255 bytes, two
strategies are used:

1. If the constant is nonzero, memory is written with Altivec vector
   instructions;

2. If the constant is 0, dcbz instructions are used.  The loop is
   unrolled to clear 512 bytes at a time.

Using vector instructions increases throughput considerably, doubling
performance for sizes larger than 1024 bytes.  The unrolled dcbz loop
also improves performance, doubling throughput for sizes larger than
8192 bytes.
---
 benchtests/bench-memset.c | 5 +++++
 1 file changed, 5 insertions(+)

(limited to 'benchtests')

diff --git a/benchtests/bench-memset.c b/benchtests/bench-memset.c
index 5304113e3d..20265936b9 100644
--- a/benchtests/bench-memset.c
+++ b/benchtests/bench-memset.c
@@ -150,6 +150,11 @@ test_main (void)
           if (i & (i - 1))
             do_test (0, c, i);
         }
+      for (i = 32; i < 512; i+=32)
+        {
+          do_test (0, c, i);
+          do_test (i, c, i);
+        }
       do_test (1, c, 14);
       do_test (3, c, 1024);
       do_test (4, c, 64);
-- 
cgit 1.4.1
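
The size-based strategy selection described in the commit message, and the lengths covered by the new benchmark loop, can be sketched in plain C. This is only an illustration under stated assumptions: the real implementation is PowerPC assembly that is not part of this diff, and `memset_path`, `pick_path`, and `count_bench_sizes` are hypothetical names that do not appear in glibc.

```c
#include <stddef.h>

/* Illustrative labels only; the actual code paths are PowerPC assembly. */
enum memset_path { PATH_WORD, PATH_VECTOR, PATH_DCBZ };

/* Hypothetical helper: maps (length, fill value) to the strategy named
   in the commit message.  memset stores the value converted to
   unsigned char, so the zero test is done on that byte.  */
static enum memset_path
pick_path (size_t n, int c)
{
  if (n <= 255)
    return PATH_WORD;           /* word/doubleword algorithm */
  return (unsigned char) c != 0 ? PATH_VECTOR : PATH_DCBZ;
}

/* Walks the same lengths as the loop added to test_main and counts how
   many fall at or below 255 bytes and how many fall above.  */
static size_t
count_bench_sizes (size_t *small, size_t *large)
{
  size_t total = 0;
  *small = 0;
  *large = 0;
  for (size_t i = 32; i < 512; i += 32)
    {
      if (i <= 255)
        ++*small;
      else
        ++*large;
      ++total;
    }
  return total;
}
```

Under these assumptions the loop visits 15 lengths (32 through 480 in steps of 32): 7 at or below 255 bytes and 8 above. Each length is passed to `do_test` twice, once with a first argument of 0 and once with a first argument equal to the length.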