Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | Add support for SSSE3 and SSE4.2 versions of strcasecmp on x86-64. | Ulrich Drepper | 2010-07-31 | 4 | -20/+373 | |
| | ||||||
* | Pretty printing x86-64 SSE4.2 strcmp. | Ulrich Drepper | 2010-07-30 | 1 | -29/+29 | |
| | ||||||
* | Fix tolower operation in strcasestr. | Ulrich Drepper | 2010-07-30 | 1 | -1/+1 | |
| | ||||||
* | Avoid compiling unneeded file in ld.so. | Ulrich Drepper | 2010-07-27 | 1 | -3/+5 | |
| | ||||||
* | Speed up x86-64 strcasestr a bit more. | Ulrich Drepper | 2010-07-24 | 1 | -5/+11 | |
| | | | | | Using the new SSE4.2 instructions is cool but not really the fastest. Some older SSE instructions can do the trick faster. | |||||
* | Add strcasestr-nonascii to i386 build | Andreas Schwab | 2010-07-21 | 2 | -7/+10 | |
| | ||||||
* | Fix non-ASCII case of SSE4.2 strcasestr. | Ulrich Drepper | 2010-07-16 | 1 | -0/+2 | |
| | ||||||
* | Speed up SSE4.2 strcasestr by avoiding indirect function call. | Ulrich Drepper | 2010-07-16 | 4 | -49/+76 | |
| | ||||||
* | Improve 64bit memcpy/memmove for Atom, Core 2 and Core i7 | H.J. Lu | 2010-06-30 | 16 | -6/+6635 | |
| | | | | | | | This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7. | |||||
* | Incorrect x86 CPU family and model check. | H.J. Lu | 2010-05-27 | 1 | -3/+3 | |
| | ||||||
* | Check DATA_CACHE_SIZE_HALF | H.J. Lu | 2010-04-14 | 1 | -2/+2 | |
| | ||||||
* | Optimize x86-64 SSE4 memcmp for unaligned data. | H.J. Lu | 2010-04-14 | 1 | -6/+371 | |
| | ||||||
* | x86-64 SSE4 optimized memcmp | H.J. Lu | 2010-04-14 | 4 | -1/+1331 | |
| | | | | | This is a 64bit SSE4 optimized memcmp. It improves memcmp by up to 3X on Intel Core i7. | |||||
* | Update x86-64 cpu multiarch selection header. | Ulrich Drepper | 2010-04-13 | 1 | -17/+21 | |
| | ||||||
* | Fix concurrent handling of __cpu_features. | Ulrich Drepper | 2010-04-04 | 2 | -14/+23 | |
| | ||||||
* | Don't define __strpbrk_sse42 in static library | H.J. Lu | 2010-03-24 | 1 | -4/+8 | |
| | ||||||
* | Unroll the loop x86-64 SSE4.2 strlen. | H.J. Lu | 2010-01-13 | 1 | -15/+45 | |
| | ||||||
* | Optimize 32bit memset/memcpy with SSE2/SSSE3. | H.J. Lu | 2010-01-12 | 3 | -1/+34 | |
| | ||||||
* | Define bit_SSE2 and index_SSE2. | H.J. Lu | 2009-12-13 | 1 | -0/+2 | |
| | ||||||
* | Define bit_XXX and index_XXX. | H.J. Lu | 2009-12-13 | 9 | -17/+31 | |
| | | | | | | This patch defines bit_XXX and index_XXX and uses them to check processor features in assembly code. This helps prevent typos in processor feature checks. | |||||
* | Fix whitespaces. | Ulrich Drepper | 2009-10-22 | 2 | -11/+11 | |
| | ||||||
* | Implement SSE4.2 optimized strchr and strrchr. | H.J. Lu | 2009-10-22 | 4 | -1/+506 | |
| | ||||||
* | Clean up unnecessary libc_hidden_builtin_def fiddling in x86 multiarch definitions. | Roland McGrath | 2009-10-06 | 2 | -5/+4 | |
| | ||||||
* | Clean up x86 multiarch HAS_FOO macros. | Roland McGrath | 2009-10-06 | 2 | -23/+10 | |
| | ||||||
* | Fix strstr/strcasestr/fma/fmaf on x86_64. | Jakub Jelinek | 2009-09-02 | 4 | -6/+8 | |
| | ||||||
* | Remove ENABLE_SSSE3_ON_ATOM. | H.J. Lu | 2009-08-28 | 1 | -9/+1 | |
| | | | | | It turns out that SSSE3 isn't slow on Atom. The problem is bsf. This patch removes ENABLE_SSSE3_ON_ATOM. | |||||
* | Move SSE4.2 functions together. | Ulrich Drepper | 2009-08-08 | 2 | -0/+2 | |
| | ||||||
* | Add SSSE3-optimized implementation of str{,n}cmp for x86-64. | Ulrich Drepper | 2009-08-07 | 4 | -4/+17 | |
| | ||||||
* | Avoid warning through fake initialization. | Ulrich Drepper | 2009-08-07 | 1 | -0/+2 | |
| | ||||||
* | Add x86 32-bit SSE4.2 string functions. | H.J. Lu | 2009-08-04 | 2 | -4/+4 | |
| | | | | | | This patch adds 32bit SSE4.2 string functions. It uses -16L instead of 0xfffffffffffffff0L, which works for both 32bit and 64bit long. Tested on 32bit Core i7 and Core 2. | |||||
* | Support multiarch for i686. | H.J. Lu | 2009-07-31 | 3 | -11/+15 | |
| | | | | | | This patch adds multiarch support when configured for i686. I modified some x86-64 functions to support 32bit. I will contribute 32bit SSE string and memory functions later. | |||||
* | Add support for x86-64 fma instruction. | Ulrich Drepper | 2009-07-29 | 3 | -0/+90 | |
| | | | | Use it to implement fma and fmaf, if possible. | |||||
* | Prepare use of IFUNC functions outside libc.so. | Ulrich Drepper | 2009-07-29 | 2 | -2/+30 | |
| | | | | | | We use a callback function into libc.so to get access to the data structure with the information and have special versions of the test macros which automatically use this function. | |||||
* | Refine testing for xmm/ymm register use in x86-64 ld.so. | Ulrich Drepper | 2009-07-27 | 1 | -1/+0 | |
| | | | | | | | | | The test now takes the callgraph into account. Only code called during runtime relocation is affected by the limitation. We now determine the affected object files as closely as possible from the outside. This allowed us to remove some of the specializations for some of the string functions as they are only used in other code paths. | |||||
* | Make sure no code in ld.so uses xmm/ymm registers on x86-64. | Ulrich Drepper | 2009-07-26 | 2 | -0/+2 | |
| | | | | | | | | | | This patch introduces a test to make sure no function modifies the xmm/ymm registers, with the exception of the auditing functions. The test is probably too pessimistic. All code linked into ld.so is checked. Perhaps at some point the callgraph starting from _dl_fixup and _dl_profile_fixup is checked and we can start using faster SSE-using functions in parts of ld.so. | |||||
* | Add SSE2 support to str{,n}cmp for x86-64. | H.J. Lu | 2009-07-26 | 3 | -262/+109 | |
| | ||||||
* | Some optimizations for x86-64 strcmp. | H.J. Lu | 2009-07-25 | 1 | -9/+4 | |
| | ||||||
* | Optimize x86-64 SSE4.2 strcmp. | Ulrich Drepper | 2009-07-25 | 1 | -0/+5 | |
| | | | | | The file contained some code which was never used. Don't compile it in. | |||||
* | Perform test for Atom x86-64 in central place and handle it. | Ulrich Drepper | 2009-07-23 | 2 | -11/+10 | |
| | | | | | | | There will be more than one function which, in multiarch mode, wants to use SSSE3. We should not test in each of them for Atoms with slow SSSE3. Instead, disable the SSSE3 bit in the startup code for such machines. | |||||
* | Minor cleanups in x86-64 strstr. | Ulrich Drepper | 2009-07-21 | 1 | -78/+55 | |
| | ||||||
* | Better check for optimization in new x86-64 strstr/strcasestr. | Ulrich Drepper | 2009-07-20 | 1 | -11/+15 | |
| | ||||||
* | SSE4.2 strstr/strcasestr for x86-64. | H.J. Lu | 2009-07-20 | 5 | -1/+519 | |
| | | | | | This patch implements SSE4.2 strstr/strcasestr, using the Knuth-Morris-Pratt string searching algorithm. | |||||
* | Minor cleanups in recently added files. | Ulrich Drepper | 2009-07-03 | 2 | -79/+57 | |
| | ||||||
* | Align functions to 16-byte boundary. | Ulrich Drepper | 2009-07-03 | 4 | -0/+4 | |
| | | | | | | Some of the new multi-arch string functions for x86-64 were not aligned to 16 byte boundaries, possibly creating unnecessary cache line misses and delays. | |||||
* | Add SSE4.2 support for strcspn, strpbrk, and strspn on x86-64. | H.J. Lu | 2009-07-03 | 7 | -0/+776 | |
| | ||||||
* | Whitespace fixes in last patch. | Ulrich Drepper | 2009-07-02 | 1 | -31/+31 | |
| | ||||||
* | SSSE3 strcpy/stpcpy for x86-64 | H.J. Lu | 2009-07-02 | 7 | -1/+1950 | |
| | | | | | | This patch adds SSSE3 strcpy/stpcpy. I got up to 4X speed up on Core 2 and Core i7. I disabled it on Atom since SSSE3 version is slower for shorter (<64byte) data. | |||||
* | Fix little checkin problem in last patch. | Ulrich Drepper | 2009-06-30 | 1 | -2/+2 | |
| | ||||||
* | Determine and store processor family and model on x86-64. | H.J. Lu | 2009-06-30 | 3 | -8/+33 | |
| | ||||||
* | Clean up whitespaces in last patch. | Ulrich Drepper | 2009-06-22 | 1 | -1/+1 | |
| |
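
The 2009-08-04 entry above ("Add x86 32-bit SSE4.2 string functions.") mentions using -16L instead of 0xfffffffffffffff0L so that the same mask works for both 32bit and 64bit long. The following is a minimal sketch of why that is portable. It is not code from the patch: the helper names `align_down_16` and `offset_in_16` are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper (not from glibc): round an address down to the
   previous 16-byte boundary, the granularity of an SSE register load.

   -16L has all bits set except the low four, at whatever width long
   happens to have. Converting it to uintptr_t is defined modulo 2^N,
   so the mask has the right shape on 32-bit and 64-bit targets alike.
   The literal 0xfffffffffffffff0L only fits a 64-bit type. */
static inline uintptr_t align_down_16(uintptr_t addr)
{
    return addr & (uintptr_t) -16L;
}

/* Hypothetical helper: byte offset of addr inside its 16-byte block. */
static inline unsigned offset_in_16(uintptr_t addr)
{
    return (unsigned) (addr & 15);
}
```

A string routine that reads 16 bytes at a time could, under these assumptions, start at `align_down_16(p)` and skip the first `offset_in_16(p)` bytes of the first block.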