| Commit message (Collapse) | Author | Age | Files | Lines |
| |
For strings >16B and <32B, the existing algorithm takes more time than the default
implementation when the strings are placed close to the end of a page. This is due
to the byte-by-byte access used to handle page crossing. This is improved by
following the >32B code path, where the address is adjusted to aligned memory
before doing a load doubleword operation instead of loading bytes.
Tested on powerpc64 and powerpc64le.
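
The idea of aligning the address before a doubleword load can be sketched in
portable C. This is only an illustration, not the assembly from the patch: the
helper names crosses_page and load_head_aligned are hypothetical, and the 8-byte
doubleword size is an assumption made for the example. An aligned 8-byte load can
never straddle a page boundary, so the bytes before the string start are loaded
along with it and then discarded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { DWORD = 8 };  /* assumed doubleword size in bytes */

/* Hypothetical helper: would a len-byte load at addr cross a page boundary? */
static int crosses_page(uintptr_t addr, size_t len, size_t page_size)
{
    return (addr & (page_size - 1)) + len > page_size;
}

/* Hypothetical helper: load the aligned doubleword that contains p, then
   keep only the bytes from p up to the end of that doubleword.
   Returns how many bytes were written to out (1..8).  The caller must
   ensure the whole aligned doubleword belongs to readable memory. */
static size_t load_head_aligned(const unsigned char *p, unsigned char out[DWORD])
{
    size_t off = (size_t)((uintptr_t)p & (DWORD - 1)); /* misalignment of p */
    const unsigned char *base = p - off;               /* round down to 8B */
    unsigned char tmp[DWORD];

    memcpy(tmp, base, DWORD);       /* one aligned doubleword load */
    memcpy(out, tmp + off, DWORD - off); /* drop bytes before the start */
    return DWORD - off;
}
```

After this first step the pointer is doubleword aligned, so later loads can
proceed eight bytes at a time without any page-cross check.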
|
| |
Unlike the power8 optimization, vectorized loops are used for
strings > 32B.
Tested on power9 ppc64le simulator.
| |
Unlike the power8 optimization, vectorized loops are used for
strings > 32B.
Tested on power9 ppc64le simulator.
| |
Fix the multiarch build for POWER9 by correcting the order of the
directories listed in the sysnames configure variable.
|
|
This patch adds the minimum changes for supporting the POWER9 processor.
|