| Commit message | Author | Age | Files | Lines |
| |
This patch enables SSE2 memset for AMD's upcoming Orochi processor.
It also fixes the following bug: for misaligned blocks larger than
144 bytes, memset branches into the integer code path depending on the
misalignment value, even if the startup code chose the SSE2 code path
up front, when multiarch is enabled.
| |
The 32-bit memset-sse2.S assumes the cache size is a multiple of 128 bytes.
If it isn't, memset-sse2.S will fail. For example, a processor can
have a 24576 KB L3 cache and 20 cores. That is 2516582 bytes per core. Half
of it is 1258291, which isn't helpful for vector instructions. This
patch rounds cache sizes to a multiple of 256 bytes and adds "raw" cache
sizes.
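The rounding described above can be sketched as a single mask operation. This is a minimal illustration, not the patch's actual code: the helper name round_cache_size is hypothetical, and rounding down (rather than up) is an assumption.

```c
/* Hypothetical helper (not taken from the patch): round a cache size
   down to a multiple of 256 bytes by clearing the low 8 bits.  A
   multiple of 256 is also a multiple of 128, which is what the 32-bit
   memset-sse2.S is said to assume.  */
static long round_cache_size(long size)
{
    return size & ~255L;
}
```

With the odd half-size quoted in the message, round_cache_size(1258291) yields 1258240, which is 4915 * 256.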
| |
The meaning of bits 25-14 in EAX returned from cpuid with EAX = 4
has changed from "the maximum number of threads sharing the cache"
to "the maximum number of addressable IDs for logical processors sharing
the cache" when cpuid supports EAX = 11. We need to use results from both
EAX = 4 and EAX = 11 to get the number of threads sharing the cache.
Bits 25-14 of EAX on Core i7 hold 15 although the number of logical
processors is 8. Here is a white paper on this:
http://software.intel.com/en-us/articles/intel-64-architecture-processor-topology-enumeration/
This patch correctly counts the number of logical processors on Intel CPUs
that support cpuid with EAX = 11. Tested on Dunnington, Core i7 and
Nehalem EX/EP.
It also fixes the Pentium D workaround, since EBX may not have the right
value returned from cpuid with EAX = 1.
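To make the bit field concrete, here is a small sketch of pulling bits 25-14 out of an EAX value. The function names are hypothetical and the code takes the register value as a plain argument instead of executing cpuid, so it only illustrates the field layout, not the patch's counting logic.

```c
/* Hypothetical helpers (not taken from the patch).  */

/* Extract bits 25-14 (a 12-bit field) of the EAX value returned by
   cpuid with EAX = 4.  */
static unsigned int cache_sharing_field(unsigned int eax)
{
    return (eax >> 14) & 0xfffu;
}

/* Intel documents this field as "count minus one", so the number of
   addressable IDs is the field value plus 1.  */
static unsigned int addressable_ids(unsigned int eax)
{
    return cache_sharing_field(eax) + 1;
}
```

For the Core i7 case in the message, a field value of 15 gives 16 addressable IDs even though only 8 logical processors exist, which is why the EAX = 11 results are needed as well.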
| |
This patch adds multiarch support when configured for i686. I modified
some x86-64 functions to support 32-bit. I will contribute 32-bit SSE string
and memory functions later.
| |
When multiarch is enabled we have this information stored. Use it.
| |
The most recent AP-485 describes a few more cache descriptors for
L3 caches with 24-way associativity.
| |
So far Intel and AMD use exactly the same bits with the same
meaning in CPUID index 1. Simplify the code. Should an architecture
come along which doesn't use the same semantics, it must use a
different index value than COMMON_CPUID_INDEX_1.
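As an illustration of why a shared index simplifies things, the sketch below keeps one register set per CPUID index and tests a feature bit without checking the vendor. Only COMMON_CPUID_INDEX_1 comes from the message; the struct layout, the other names, and the choice of the SSE2 bit as the example are assumptions.

```c
/* Hypothetical layout: one saved register set per CPUID index, used
   for every vendor.  Only the name COMMON_CPUID_INDEX_1 appears in the
   commit message; everything else here is illustrative.  */
enum { COMMON_CPUID_INDEX_1 = 0, COMMON_CPUID_INDEX_MAX };

struct cpuid_registers {
    unsigned int eax, ebx, ecx, edx;
};

struct cpu_features {
    struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
};

/* SSE2 is reported in bit 26 of EDX for CPUID leaf 1 on both Intel and
   AMD, so no vendor-specific branch is needed.  */
static int has_sse2(const struct cpu_features *f)
{
    return (int) ((f->cpuid[COMMON_CPUID_INDEX_1].edx >> 26) & 1u);
}
```

A vendor with different leaf 1 semantics would get its own index in the enum instead of reusing this one.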
| |
This saves about 1.5kB in the DSO.
| |
* nss/getXXbyYY_r.c: If NO_COMPAT_NEEDED is defined don't define any
compatibility functions.
* nss/getXXent_r.c: Likewise.
* gshadow/getsgent_r.c: Define NO_COMPAT_NEEDED.
* gshadow/getsgnam_r.c: Likewise.
* gshadow/Version: Remove duplicate entries.
* sysdeps/x86_64/cacheinfo.c (intel_02_cache_info): Add missing entries
for recent processor.
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_cache_info):
Likewise.
| |
* configure.in: Handle --enable-multi-arch.
* elf/dl-runtime.c (_dl_fixup): Handle STT_GNU_IFUNC.
(_dl_fixup_profile): Likewise.
* elf/do-lookup.c (dl_lookup_x): Likewise.
* sysdeps/x86_64/dl-machine.h: Handle STT_GNU_IFUNC.
* elf/elf.h (STT_GNU_IFUNC): Define.
* include/libc-symbols.h (libc_ifunc): Define.
* sysdeps/x86_64/cacheinfo.c: If USE_MULTIARCH is defined, use the
framework in init-arch.h to get CPUID values.
* sysdeps/x86_64/multiarch/Makefile: New file.
* sysdeps/x86_64/multiarch/init-arch.c: New file.
* sysdeps/x86_64/multiarch/init-arch.h: New file.
* sysdeps/x86_64/multiarch/sched_cpucount.c: New file.
* config.make.in (experimental-malloc): Define.
* configure.in: Handle --enable-experimental-malloc.
* malloc/Makefile: Handle experimental-malloc flag.
* malloc/malloc.c: Implement PER_THREAD and ATOMIC_FASTBINS features.
* malloc/arena.c: Likewise.
* malloc/hooks.c: Likewise.
* malloc/malloc.h: Define M_ARENA_TEST and M_ARENA_MAX.
| |
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_known): Likewise.
| |
2008-02-26 Harsha Jagasia <harsha.jagasia@amd.com>
* sysdeps/x86_64/cacheinfo.c (NOT_USED_RIGHT_NOW): Remove ifdef guards.
* sysdeps/x86_64/memset.S: Rewrite non-SSE code path as tuned for AMD
Barcelona machine. Make default fall through branch of
__x86_64_preferred_memory_instruction check as the integer code path.
2007-10-15 H.J. Lu <hongjiu.lu@intel.com>
* sysdeps/x86_64/cacheinfo.c
(__x86_64_preferred_memory_instruction): New variable.
(init_cacheinfo): Initialize __x86_64_preferred_memory_instruction.
* sysdeps/x86_64/memset.S: Rewrite.
2008-01-08 Jakub Jelinek <jakub@redhat.com>
* malloc/malloc.c (public_cALLOc): For arenas other than
| |
new memset.
too high for the improvements. Implement bzero unconditionally for
use in libc.
| |
(init_cacheinfo): Initialize it.
* sysdeps/x86_64/memset.S: Use __x86_64_shared_cache_size.
Always define bzero.
Remove non-glibc code.
* sysdeps/x86_64/bzero.S: Make an empty file.
2007-10-15 H.J. Lu <hongjiu.lu@intel.com>
* sysdeps/x86_64/cacheinfo.c
(__x86_64_preferred_memory_instruction): New.
(init_cacheinfo): Initialize __x86_64_preferred_memory_instruction.
* sysdeps/x86_64/memset.S: Rewrite.
* nss/getXXbyYY_r.c (REENTRANT_NAME): Mangle startp and start_fct
| |
with some Pentium Ds.
| |
from __x86_64_core_cache_size_half.
(init_cacheinfo): Compute shared cache size for AMD processors with
shared L3 correctly.
* sysdeps/x86_64/memcpy.S: Adjust for __x86_64_data_cache_size_half
name change.
Patch in large parts by Evandro Menezes.
| |
associativity for fully-associative caches.
| |
requests. Fill in more associativity values for L2.
Patch mostly by Evandro Menezes.
| |
* sysdeps/unix/sysv/linux/i386/sysconf.c (intel_02_known): Likewise.
| |
as second parameter to handle_intel.
|
handling to ...
* sysdeps/x86_64/cacheinfo.c: ... here. New file.
* sysdeps/x86_64/Makefile [subdir=string] (sysdep_routines): Add
cacheinfo.
* sysdeps/x86_64/memcpy.S: Complete rewrite.
* sysdeps/x86_64/mempcpy.S: Adjust appropriately.
Patch by Evandro Menezes <evandro.menezes@amd.com>.
* sysdeps/unix/sysv/linux/i386/epoll_pwait.S: New file.
|