author | Anton Youdkevitch <anton.youdkevitch@bell-sw.com> | 2019-04-05 13:59:54 -0700 |
---|---|---|
committer | Steve Ellcey <sellcey@caviumnetworks.com> | 2019-04-05 13:59:54 -0700 |
commit | 94e358f6d490650c714edb1ffc3a52f56ffe086e (patch) | |
tree | 0d3fe4130b57765fc8b4806d37a43cd517ac6ede /libio/getwchar_u.c | |
parent | f82ed45d7f77838bc8cff4c0a4ff33e76bb18a35 (diff) | |
download | glibc-94e358f6d490650c714edb1ffc3a52f56ffe086e.tar.gz glibc-94e358f6d490650c714edb1ffc3a52f56ffe086e.tar.xz glibc-94e358f6d490650c714edb1ffc3a52f56ffe086e.zip |
aarch64: thunderx2 memcpy implementation cleanup and streamlining
Here is the updated patch for improving the long unaligned code path (the one using the "ext" instruction).

1. The always-taken conditional branch at the beginning is removed.
2. The epilogue code is placed after the end of the loop to reduce the number of branches.
3. The redundant "mov" instructions inside the loop are gone because the order of the registers in the "ext" instructions inside the loop was changed; the prologue has an additional "ext" instruction.
4. Updating count in the prologue was hoisted out, as it is the same update for each prologue.
5. Invariant code of the loop epilogue was hoisted out.
6. As the current size of the ext chunk is exactly 16 instructions long, a "nop" was added at the beginning of the code sequence so that the loop entry for all the chunks is aligned.

* sysdeps/aarch64/multiarch/memcpy_thunderx2.S: Cleanup branching and remove redundant code.
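For readers unfamiliar with the "ext"-based path, the following C sketch models the general idea only. It is not the code from memcpy_thunderx2.S, and the names `ext16` and `copy_shifted` are hypothetical. It assumes the source is read in aligned 16-byte chunks and that each output vector is extracted from two neighbouring chunks at a fixed byte offset. The loop body is unrolled twice so that the two temporaries swap roles instead of being copied, which is one plausible way to picture why a per-iteration "mov" becomes unnecessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { VEC = 16 };  /* bytes per vector register in this model */

/* Model of "ext vd.16b, vn.16b, vm.16b, #idx": take 16 bytes starting at
   byte idx of the 32-byte sequence formed by lo followed by hi.  */
static void ext16(uint8_t *dst, const uint8_t *lo, const uint8_t *hi,
                  unsigned idx)
{
    uint8_t tmp[2 * VEC];
    memcpy(tmp, lo, VEC);
    memcpy(tmp + VEC, hi, VEC);
    memcpy(dst, tmp + idx, VEC);
}

/* Hypothetical helper: copy nvec 16-byte vectors to dst from
   src_aligned + shift while loading only whole 16-byte chunks.
   It reads nvec + 1 chunks from src_aligned, and nvec must be even
   because the loop is unrolled twice.  */
static void copy_shifted(uint8_t *dst, const uint8_t *src_aligned,
                         unsigned shift, size_t nvec)
{
    uint8_t a[VEC], b[VEC];

    assert(shift > 0 && shift < VEC);
    assert(nvec % 2 == 0);

    /* Prologue: load the first chunk before entering the loop.  */
    memcpy(a, src_aligned, VEC);
    src_aligned += VEC;

    for (size_t i = 0; i < nvec; i += 2) {
        memcpy(b, src_aligned, VEC);
        ext16(dst, a, b, shift);        /* a holds the older chunk */

        memcpy(a, src_aligned + VEC, VEC);
        ext16(dst + VEC, b, a, shift);  /* roles swapped, nothing moved */

        src_aligned += 2 * VEC;
        dst += 2 * VEC;
    }
}
```

Calling `copy_shifted(dst, src, 5, 4)` on an 80-byte source buffer fills `dst` with the 64 bytes that start at `src + 5`.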
Diffstat (limited to 'libio/getwchar_u.c')
0 files changed, 0 insertions, 0 deletions