mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Add narrowing square root functions	Joseph Myers	2021-09-10	64	-1/+994
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the narrowing square root functions from TS 18661-1 / TS 18661-3 / C2X to glibc's libm: fsqrt, fsqrtl, dsqrtl, f32sqrtf64, f32sqrtf32x, f32xsqrtf64 for all configurations; f32sqrtf64x, f32sqrtf128, f64sqrtf64x, f64sqrtf128, f32xsqrtf64x, f32xsqrtf128, f64xsqrtf128 for configurations with _Float64x and _Float128; __f32sqrtieee128 and __f64sqrtieee128 aliases in the powerpc64le case (for calls to fsqrtl and dsqrtl when long double is IEEE binary128). Corresponding tgmath.h macro support is also added. The changes are mostly similar to those for the other narrowing functions previously added, so the description of those generally applies to this patch as well. However, the not-actually-narrowing cases (where the two types involved in the function have the same floating-point format) are aliased to sqrt, sqrtl or sqrtf128 rather than needing a separately built not-actually-narrowing function such as was needed for add / sub / mul / div. Thus, there is no __nldbl_dsqrtl name for ldbl-opt because no such name was needed (whereas the other functions needed such a name since the only other name for that entry point was e.g. f32xaddf64, not reserved by TS 18661-1); the headers are made to arrange for sqrt to be called in that case instead. The DIAG_* calls in sysdeps/ieee754/soft-fp/s_dsqrtl.c are because they were observed to be needed in GCC 7 testing of riscv32-linux-gnu-rv32imac-ilp32. The other sysdeps/ieee754/soft-fp/ files added didn't need such DIAG_* in any configuration I tested with build-many-glibcs.py, but if they do turn out to be needed in more files with some other configuration / GCC version, they can always be added there. I reused the same test inputs in auto-libm-test-in as for non-narrowing sqrt rather than adding extra or separate inputs for narrowing sqrt. The tests in libm-test-narrow-sqrt.inc also follow those for non-narrowing sqrt. Tested as followed: natively with the full glibc testsuite for x86_64 (GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC 11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32 hard float, mips64 (all three ABIs, both hard and soft float). The different GCC versions are to cover the different cases in tgmath.h and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in glibc headers, GCC 7 has proper _Float* support, GCC 8 adds __builtin_tgmath).
*	Update syscall lists for Linux 5.14	Joseph Myers	2021-09-08	26	-2/+33
\| \| \| \| \| \| \| \|	Linux 5.14 has two new syscalls, memfd_secret (on some architectures only) and quotactl_fd. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.
*	MIPS: Setup errno for {f,l,}xstat	Jiaxun Yang	2021-09-07	3	-3/+9
\| \| \| \| \| \| \| \| \| \| \|	{f,l,}xstat stub for MIPS is using INTERNAL_SYSCALL to do xstat syscall for glibc ver, However it leaves errno untouched and thus giving bad errno output. Setup errno properly when syscall returns non-zero. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
*	Update hppa libm-test-ulps	John David Anglin	2021-09-06	1	-1/+1
\|
*	AArch64: Update A64FX memset not to degrade at 16KB	Naohiro Tamura	2021-09-06	1	-1/+8
\| \| \| \| \| \| \| \| \| \|	This patch updates unroll8 code so as not to degrade at the peak performance 16KB for both FX1000 and FX700. Inserted 2 instructions at the beginning of the unroll8 loop, cmp and branch, are a workaround that is found heuristically. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
*	Revert "AArch64: Update A64FX memset not to degrade at 16KB"	Szabolcs Nagy	2021-09-06	1	-8/+1
\| \| \| \| \| \|	Because of wrong commit author. Will recommit it with right author. This reverts commit 23777232c23f80809613bdfa329f63aadf992922.
*	Remove "Contributed by" lines	Siddhesh Poyarekar	2021-09-03	1611	-1979/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
*	AArch64: Update A64FX memset not to degrade at 16KB	Naohiro Tamura via Libc-alpha	2021-09-03	1	-1/+8
\| \| \| \| \| \| \| \| \| \|	This patch updates unroll8 code so as not to degrade at the peak performance 16KB for both FX1000 and FX700. Inserted 2 instructions at the beginning of the unroll8 loop, cmp and branch, are a workaround that is found heuristically. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
*	configure: Allow LD to be LLD 13.0.0 or above [BZ #26558]	Fangrui Song	2021-08-31	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using LLD (LLVM linker) as the linker, configure prints a confusing message. *** These critical programs are missing or too old: GNU ld LLD>=13.0.0 can build glibc --enable-static-pie. (8.0.0 needs one workaround for -Wl,-defsym=_begin=0. 9.0.0 works with --disable-static-pie). XFAIL two tests sysdeps/x86/tst-ifunc-isa-* which have the BZ #28154 issue (LLD follows the PowerPC port of GNU ld for ifunc by placing IRELATIVE relocations in .rela.dyn, triggering a glibc ifunc fragility). The set of dynamic symbols is the same with GNU ld and LLD, modulo unused SHN_ABS version node symbols. For comparison, gold does not support --enable-static-pie yet (--no-dynamic-linker is unsupported BZ #22221), yet has 6 failures more than LLD. gold linked libc.so has larger .dynsym differences with GNU ld and LLD (non-default version symbols are changed to default versions by a version script BZ #28196).
*	hurd msync: Drop bogus test	Samuel Thibault	2021-08-31	1	-3/+0
\| \| \| \| \|	MS_SYNC is actually 0, so we cannot test that both MS_SYNC and MS_ASYNC are set.
*	hurd: Fix typo in msync	Samuel Thibault	2021-08-31	1	-1/+1
\| \| \| \|	== has higher priority than &
*	x86-64: Use testl to check __x86_string_control	H.J. Lu	2021-08-30	1	-2/+2
\| \| \| \| \| \| \|	Use testl, instead of andl, to check __x86_string_control to avoid updating __x86_string_control. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
*	i686: Don't include multiarch memove in libc.a	H.J. Lu	2021-08-30	1	-1/+1
\| \| \| \| \|	On i686, there is no multiarch memove in libc.a, don't include multiarch memove in ifunc-impl-list.c in libc.a.
*	Use support_open_dev_null_range io/tst-closefrom, misc/tst-close_range, and ↵	Adhemerval Zanella	2021-08-26	1	-21/+10
\| \| \| \| \| \| \| \| \|	posix/tst-spawn5 (BZ #28260) It ensures a continuous range of file descriptor and avoid hitting the RLIMIT_NOFILE. Checked on x86_64-linux-gnu.
*	powerpc: Use --no-tls-get-addr-optimize in test only if the linker supports it	Fangrui Song	2021-08-24	3	-0/+40
\| \| \| \| \| \|	LLD doesn't support --{,no-}tls-get-addr-optimize. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
*	x86-64: Remove assembler AVX512DQ check	H.J. Lu	2021-08-24	14	-138/+0
\| \| \| \| \|	The minimum GNU binutils requirement is 2.25 which supports AVX512DQ. Remove assembler AVX512DQ check.
*	x86-64: Remove compiler -mavx512f check	H.J. Lu	2021-08-24	4	-39/+0
\| \| \| \| \|	The minimum GCC requirement is GCC 6.2 which supports -mavx512f. Remove compiler -mavx512f check. Tested with GCC 6.4.1 on Linux/x86-64.
*	hurd: Remove old test-err_np.c file	Samuel Thibault	2021-08-23	1	-4/+0
\| \| \| \|	This is not referenced any more and includes a non-existing file.
*	x86-64: Optimize load of all bits set into ZMM register [BZ #28252]	H.J. Lu	2021-08-22	10	-64/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Optimize loads of all bits set into ZMM register in AVX512 SVML codes by replacing vpbroadcastq .L_2il0floatpacket.16(%rip), %zmmX and vmovups .L_2il0floatpacket.13(%rip), %zmmX with vpternlogd $0xff, %zmmX, %zmmX, %zmmX This fixes BZ #28252.
*	x86: fix Autoconf caching of instruction support checks [BZ #27991]	Matt Whitlock	2021-08-19	2	-37/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The Autoconf documentation for the AC_CACHE_CHECK macro states: The commands-to-set-it must have no side effects except for setting the variable cache-id, see below. However, the tests for support of -msahf and -mmovbe were embedded in the commands-to-set-it for lib_cv_include_x86_isa_level. This had the consequence that libc_cv_have_x86_lahf_sahf and libc_cv_have_x86_movbe were not defined whenever lib_cv_include_x86_isa_level was read from cache. These variables' being undefined meant that their unquoted use in later test expressions led to the 'test' built-in's misparsing its arguments and emitting errors like "test: =: unexpected operator" or "test: =: unary operator expected", depending on the particular shell. This commit refactors the tests for LAHF/SAHF and MOVBE instruction support into their own AC_CACHE_CHECK macro invocations to obey the rule that the commands-to-set-it must have no side effects other than setting the variable named by cache-id. Signed-off-by: Matt Whitlock <sourceware@mattwhitlock.name> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
*	arm: Simplify elf_machine_{load_address,dynamic}	Fangrui Song	2021-08-18	1	-37/+10
\| \| \| \| \| \| \| \| \| \|	and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time address of _DYNAMIC. &__ehdr_start is a better way to get the load address. This is similar to commits b37b75d269883a2c553bb7019a813094eb4e2dd1 (x86-64) and 43d06ed218fc8be58987bdfd00e21e5720f0b862 (aarch64). Reviewed-by: Joseph Myers <joseph@codesourcery.com>
*	riscv: Drop reliance on _GLOBAL_OFFSET_TABLE_[0]	Fangrui Song	2021-08-18	1	-11/+10
\| \| \| \| \| \| \| \| \|	&__ehdr_start is a better way to get the load address. This is similar to commits b37b75d269883a2c553bb7019a813094eb4e2dd1 (x86-64) and 43d06ed218fc8be58987bdfd00e21e5720f0b862 (aarch64). Reviewed-by: Palmer Dabbelt <palmerdabbelt@google.com>
*	Remove sysdeps/*/tls-macros.h	Fangrui Song	2021-08-18	23	-1429/+0
\| \| \| \| \| \| \| \|	They provide TLS_GD/TLS_LD/TLS_IE/TLS_IE macros for TLS testing. Now that we have migrated to __thread and tls_model attributes, these macros are unused and the tls-macros.h files can retire. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
*	x86_64: Simplify elf_machine_{load_address,dynamic}	Fangrui Song	2021-08-17	1	-14/+7
\| \| \| \| \| \| \|	and drop reliance on _GLOBAL_OFFSET_TABLE_[0] being the link-time address of _DYNAMIC. &__ehdr_start is a better way to get the load address. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
*	elf: Drop elf/tls-macros.h in favor of __thread and tls_model attributes [BZ ↵	Fangrui Song	2021-08-16	2	-6/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	#28152] [BZ #28205] elf/tls-macros.h was added for TLS testing when GCC did not support __thread. __thread and tls_model attributes are mature now and have been used by many newer tests. Also delete tst-tls2.c which tests .tls_common (unused by modern GCC and unsupported by Clang/LLD). .tls_common and .tbss definition are almost identical after linking, so the runtime test doesn't add additional coverage. Assembler and linker tests should be on the binutils side. When LLD 13.0.0 is allowed in configure.ac (https://sourceware.org/pipermail/libc-alpha/2021-August/129866.html), `make check` result is on par with glibc built with GNU ld on aarch64 and x86_64. As a future clean-up, TLS_GD/TLS_LD/TLS_IE/TLS_IE macros can be removed from sysdeps/*/tls-macros.h. We can add optional -mtls-dialect={gnu2,trad} tests to ensure coverage. Tested on aarch64-linux-gnu, powerpc64le-linux-gnu, and x86_64-linux-gnu. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
*	hurd: Drop fmh kludge	Samuel Thibault	2021-08-16	1	-35/+0
\| \| \| \| \| \|	Gnumach's 0650a4ee30e3 implements support for high bits being set in the mask parameter of vm_map. This allows to remove the fmh kludge that was masking away the address range by mapping a dumb area there.
*	mips: increase stack alignment in clone to match the ABI	Xi Ruoyao	2021-08-13	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	In "mips: align stack in clone [BZ #28223]" (commit 1f51cd9a860ee45eee8a56fb2ba925267a2a7bfe) I made a mistake: I misbelieved one "word" was 2-byte and "doubleword" should be 4-byte. But in MIPS ABI one "word" is defined 32-bit (4-byte), so "doubleword" is 8-byte [1], and "quadword" is 16-byte [2]. [1]: "System V Application Binary Interface: MIPS(R) RISC Processor Supplement, 3rd edition", page 3-31 [2]: "MIPSpro(TM) 64-Bit Porting and Transition Guide", page 23
*	mips: align stack in clone [BZ #28223]	Xi Ruoyao	2021-08-12	1	-0/+7
\| \| \| \| \| \| \| \|	The MIPS O32 ABI requires 4 byte aligned stack, and the MIPS N64 and N32 ABI require 8 byte aligned stack. Previously if the caller passed an unaligned stack to clone the the child misbehaved. Fixes bug 28223.
*	hurd mmap: Reduce the requested max vmprot	Sergey Bugaev	2021-08-11	1	-4/+18
\| \| \| \| \| \| \|	When the memory object is read-only, the kernel would be right in refusing max vmprot containing VM_PROT_WRITE. Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
*	hurd mmap: Factorize MAP_SHARED flag check	Sergey Bugaev	2021-08-11	1	-12/+10
\| \| \| \|	Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
*	aarch64: Make elf_machine_{load_address,dynamic} robust [BZ #28203]	Fangrui Song	2021-08-11	1	-15/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The AArch64 ABI is largely platform agnostic and does not specify _GLOBAL_OFFSET_TABLE_[0] ([1]). glibc ld.so turns out to be probably the only user of _GLOBAL_OFFSET_TABLE_[0] and GNU ld defines the value to the link-time address _DYNAMIC. [2] In 2012, __ehdr_start was implemented in GNU ld and gold in binutils 2.23. Using adrp+add / (-mcmodel=tiny) adr to access __ehdr_start/_DYNAMIC gives us a robust way to get the load address and the link-time address of _DYNAMIC. [1]: From a psABI maintainer, https://bugs.llvm.org/show_bug.cgi?id=49672#c2 [2]: LLD's aarch64 port does not set _GLOBAL_OFFSET_TABLE_[0] to the link-time address _DYNAMIC. LLD is widely used on aarch64 Android and ChromeOS devices. Software just works without the need for _GLOBAL_OFFSET_TABLE_[0]. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
*	[5/5] AArch64: Improve A64FX memset medium loops	Wilco Dijkstra	2021-08-10	1	-26/+19
\| \| \| \| \| \| \|	Simplify the code for memsets smaller than L1. Improve the unroll8 and L1_prefetch loops. Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
*	[4/5] AArch64: Improve A64FX memset by removing unroll32	Wilco Dijkstra	2021-08-10	1	-17/+1
\| \| \| \| \| \|	Remove unroll32 code since it doesn't improve performance. Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
*	[3/5] AArch64: Improve A64FX memset for remaining bytes	Wilco Dijkstra	2021-08-10	1	-33/+13
\| \| \| \| \| \| \|	Simplify handling of remaining bytes. Avoid lots of taken branches and complex whilelo computations, instead unconditionally write vectors from the end. Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
*	[2/5] AArch64: Improve A64FX memset for large sizes	Wilco Dijkstra	2021-08-10	1	-60/+25
\| \| \| \| \| \| \| \|	Improve performance of large memsets. Simplify alignment code. For zero memset use DC ZVA, which almost doubles performance. For non-zero memsets use the unroll8 loop which is about 10% faster. Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
*	[1/5] AArch64: Improve A64FX memset for small sizes	Wilco Dijkstra	2021-08-10	1	-60/+36
\| \| \| \| \| \| \| \|	Improve performance of small memsets by reducing instruction counts and improving code alignment. Bench-memset shows 35-45% performance gain for small sizes. Reviewed-by: Naohiro Tamura <naohirot@fujitsu.com>
*	Add PTRACE_GET_RSEQ_CONFIGURATION from Linux 5.13 to sys/ptrace.h	Joseph Myers	2021-08-09	9	-7/+52
\| \| \| \| \| \| \| \| \| \| \|	Linux 5.13 adds a PTRACE_GET_RSEQ_CONFIGURATION constant, with an associated ptrace_rseq_configuration structure. Add this constant to the various sys/ptrace.h headers in glibc, with the structure in bits/ptrace-shared.h (named struct __ptrace_rseq_configuration in glibc, as with other such structures). Tested for x86_64, and with build-many-glibcs.py.
*	librt: fix NULL pointer dereference (bug 28213)	Nikita Popov	2021-08-09	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Helper thread frees copied attribute on NOTIFY_REMOVED message received from the OS kernel. Unfortunately, it fails to check whether copied attribute actually exists (data.attr != NULL). This worked earlier because free() checks passed pointer before actually attempting to release corresponding memory. But __pthread_attr_destroy assumes pointer is not NULL. So passing NULL pointer to __pthread_attr_destroy will result in segmentation fault. This scenario is possible if notification->sigev_notify_attributes == NULL (which means default thread attributes should be used). Signed-off-by: Nikita Popov <npv1310@gmail.com> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
*	powerpc64: Add checks for Altivec and VSX in ifunc selection	Anton Blanchard	2021-08-06	26	-68/+139
\| \| \| \| \| \| \|	We'd like to support processors without Altivec or VSX, so check the relevant hwcap bits before selecting them. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
*	powerpc64: Check cacheline size before using optimised memset routines	Anton Blanchard	2021-08-06	2	-10/+23
\| \| \| \| \| \| \|	A number of optimised memset routines assume the cacheline size is 128B, so we better check before using them. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
*	powerpc64: Replace some PPC_FEATURE_HAS_VSX with PPC_FEATURE_ARCH_2_06	Anton Blanchard	2021-08-06	20	-38/+38
\| \| \| \| \| \| \| \|	We use PPC_FEATURE_HAS_VSX to select a number of POWER7 optimised functions. These functions don't use any VSX instructions, so PPC_FEATURE_ARCH_2_06 seems like a better fit. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
*	Linux: Fix fcntl, ioctl, prctl redirects for _TIME_BITS=64 (bug 28182)	Florian Weimer	2021-08-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	__REDIRECT and __THROW are not compatible with C++ due to the ordering of the __asm__ alias and the throw specifier. __REDIRECT_NTH has to be used instead. Fixes commit 8a40aff86ba5f64a3a84883e539cb67b ("io: Add time64 alias for fcntl"), commit 82c395d91ea4f69120d453aeec398e30 ("misc: Add time64 alias for ioctl"), commit b39ffab860cd743a82c91946619f1b8158 ("Linux: Add time64 alias for prctl"). Reviewed-by: Carlos O'Donell <carlos@redhat.com>
*	Update sparc libm-test-ulps	Adhemerval Zanella	2021-08-04	1	-1/+1
\|
*	linux: Add sparck brk implementation	Adhemerval Zanella	2021-08-04	1	-0/+58
\| \| \| \| \| \| \| \| \| \| \|	It turned that the generic implementation of brk() does not work for sparc, since on failure kernel will just return the previous input value without setting the conditional register. This patches adds back a sparc32 and sparc64 implementation removed by 720480934ab9107. Checked on sparc64-linux-gnu and sparcv9-linux-gnu.
*	gethosts: Remove unused argument _type	Siddhesh Poyarekar	2021-08-04	1	-3/+3
\| \| \| \|	The generated code is unchanged.
*	gaiconf_init: Avoid double-free in label and precedence lists	Siddhesh Poyarekar	2021-08-03	1	-0/+2
\| \| \| \| \| \| \|	labellist and precedencelist could get freed a second time if there are allocation failures, so set them to NULL to avoid a double-free. Reviewed-by: Arjun Shankar <arjun@redhat.com>
*	x86-64: Add Avoid_Short_Distance_REP_MOVSB	H.J. Lu	2021-07-28	5	-0/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit 3ec5d83d2a237d39e7fd6ef7a0bc8ac4c171a4a5 Author: H.J. Lu <hjl.tools@gmail.com> Date: Sat Jan 25 14:19:40 2020 -0800 x86-64: Avoid rep movsb with short distance [BZ #27130] introduced some regressions on Intel processors without Fast Short REP MOV (FSRM). Add Avoid_Short_Distance_REP_MOVSB to avoid rep movsb with short distance only on Intel processors with FSRM. bench-memmove-large on Skylake server shows that cycles of __memmove_evex_unaligned_erms improves for the following data size: before after Improvement length=4127, align1=3, align2=0: 479.38 349.25 27% length=4223, align1=9, align2=5: 405.62 333.25 18% length=8223, align1=3, align2=0: 786.12 496.38 37% length=8319, align1=9, align2=5: 727.50 501.38 31% length=16415, align1=3, align2=0: 1436.88 840.00 41% length=16511, align1=9, align2=5: 1375.50 836.38 39% length=32799, align1=3, align2=0: 2890.00 1860.12 36% length=32895, align1=9, align2=5: 2891.38 1931.88 33%
*	Typo: Rename HAVE_CLONE3_WAPPER to HAVE_CLONE3_WRAPPER	H.J. Lu	2021-07-28	3	-3/+3
\|
*	hurd: _Fork: unlock malloc before calling fork child hooks	Samuel Thibault	2021-07-27	1	-0/+1
\| \| \| \| \|	The setitimer fork hook, fork_itimer, needs to call malloc inside __mach_setup_tls, so we need to unlock malloc before calling it.
*	i386: Regenerate ulps	Arjun Shankar	2021-07-25	2	-61/+61
\| \| \| \| \|	These failures were caught while building glibc master for Fedora Rawhide which is built with `-mtune=generic -msse2 -mfpmath=sse'.