author | Noah Goldstein <goldstein.w.n@gmail.com> | 2021-11-10 16:18:56 -0600 |
---|---|---|
committer | Noah Goldstein <goldstein.w.n@gmail.com> | 2021-11-10 20:12:10 -0600 |
commit | 2f9062d7171850451e6044ef78d91ff8c017b9c0 (patch) | |
tree | e17665196e1f1d851601c54ad794b0bae8d5e50f /support | |
parent | 309548bec3b89022bbc81a372ec3e9240211d799 (diff) | |
x86: Shrink memcmp-sse4.S code size
No bug.

This implementation refactors memcmp-sse4.S primarily with minimizing code size in mind. It does this by removing the lookup table logic and removing the unrolled check from (256, 512] bytes.

memcmp-sse4 code size reduction : -3487 bytes
wmemcmp-sse4 code size reduction: -1472 bytes

The current memcmp-sse4.S implementation has a large code size cost. This has serious adverse effects on the ICache / ITLB. While the implementation appears fast in micro-benchmarks, traces of real-world code have shown that the speed in micro-benchmarks does not translate when the ICache/ITLB are not primed, and that the cost of the code size has measurable negative effects on overall application performance.

See https://research.google/pubs/pub48320/ for more details.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
Diffstat (limited to 'support')
0 files changed, 0 insertions, 0 deletions