mirror/glibc - mirror of git://sourceware.org/git/glibc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Merge branch release/2.24/master into ibm/2.24/master ibm/2.24/master	Gabriel F. T. Gomes	2017-12-19	60	-861/+2119
\|\
\| *	elf: Count components of the expanded path in _dl_init_path [BZ #22607]	Florian Weimer	2017-12-16	3	-9/+17
\| \| \| \| \| \| \| \|	(cherry picked from commit 3ff3dfa5af313a6ea33f3393916f30eece4f0171)
\| *	elf: Compute correct array size in _dl_init_paths [BZ #22606]	Florian Weimer	2017-12-16	3	-7/+20
\| \| \| \| \| \| \| \|	(cherry picked from commit 8a0b17e48b83e933960dfeb8fa08b259f03f310e)
\| *	<array_length.h>: New array_length and array_end macros	Florian Weimer	2017-12-16	2	-0/+41
\| \| \| \| \| \| \| \|	(cherry picked from commit c94a5688fb1228a862b2d4a3f1239cdc0e3349e5)
\| *	Update NEWS to add CVE-2017-15804 entry	Aurelien Jarno	2017-12-02	1	-2/+2
\| \| \| \| \| \| \| \|	(cherry picked from commit 15e84c63c05e0652047ba5e738c54d79d62ba74b)
\| *	posix/tst-glob-tilde.c: Add test for bug 22332	Florian Weimer	2017-12-02	2	-23/+37
\| \| \| \| \| \| \| \|	(cherry picked from commit 2fac6a6cd50c22ac28c97d0864306594807ade3e)
\| *	glob: Fix buffer overflow during GLOB_TILDE unescaping [BZ #22332]	Paul Eggert	2017-12-02	3	-2/+12
\| \| \| \| \| \| \| \|	(cherry picked from commit a159b53fa059947cc2548e3b0d5bdcf7b9630ba8)
\| *	Update NEWS and ChangeLog for CVE-2017-15671	Florian Weimer	2017-12-02	2	-0/+6
\| \| \| \| \| \| \| \|	(cherry picked from commit 914c9994d27b80bc3b71c483e801a4f04e269ba6)
\| *	glob: Add new test tst-glob-tilde	Florian Weimer	2017-12-02	3	-2/+153
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new test checks for memory leaks (see bug 22325) and attempts to trigger the buffer overflow in bug 22320. (cherry picked from commit e80fc1fc98bf614eb01cf8325503df3a1451a99c)
\| *	CVE-2017-15670: glob: Fix one-byte overflow [BZ #22320]	Paul Eggert	2017-12-02	3	-1/+11
\| \| \| \| \| \| \| \|	(cherry picked from commit c369d66e5426a30e4725b100d5cd28e372754f90)
\| *	posix: Sync glob with gnulib [BZ #1062]	Adhemerval Zanella	2017-12-02	22	-459/+739
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch syncs posix/glob.c implementation with gnulib version b5ec983 (glob: simplify symlink detection). The only difference to gnulib code is * DT_UNKNOWN, DT_DIR, and DT_LNK definition in the case there were not already defined. Gnulib code which uses HAVE_STRUCT_DIRENT_D_TYPE will redefine them wrongly because GLIBC does not define HAVE_STRUCT_DIRENT_D_TYPE. Instead the patch check for each definition instead. Also, the patch requires additional globfree and globfree64 files for compatibility version on some architectures. Also the code simplification leads to not macro simplification (not need for NO_GLOB_PATTERN_P anymore). Checked on x86_64-linux-gnu and on a build using build-many-glibcs.py for all major architectures. [BZ #1062] * posix/Makefile (routines): Add globfree, globfree64, and glob_pattern_p. * posix/flexmember.h: New file. * posix/glob_internal.h: Likewise. * posix/glob_pattern_p.c: Likewise. * posix/globfree.c: Likewise. * posix/globfree64.c: Likewise. * sysdeps/gnu/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/alpha/globfree.c: Likewise. * sysdeps/unix/sysv/linux/mips/mips64/n64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/oldglob.c: Likewise. * sysdeps/unix/sysv/linux/wordsize-64/globfree64.c: Likewise. * sysdeps/unix/sysv/linux/x86_64/x32/globfree.c: Likewise. * sysdeps/wordsize-64/globfree.c: Likewise. * sysdeps/wordsize-64/globfree64.c: Likewise. * posix/glob.c (HAVE_CONFIG_H): Use !_LIBC instead. [NDEBUG): Remove comments. (GLOB_ONLY_P, _AMIGA, VMS): Remove define. (dirent_type): New type. Use uint_fast8_t not uint8_t, as C99 does not require uint8_t. (DT_UNKNOWN, DT_DIR, DT_LNK): New macros. (struct readdir_result): Use dirent_type. Do not define skip_entry unless it is needed; this saves a byte on platforms lacking d_ino. (readdir_result_type, readdir_result_skip_entry): New functions, replacing ... (readdir_result_might_be_symlink, readdir_result_might_be_dir): these functions, which were removed. This makes the callers easier to read. All callers changed. (D_INO_TO_RESULT): Now empty if there is no d_ino. (size_add_wrapv, glob_use_alloca): New static functions. (glob, glob_in_dir): Check for size_t overflow in several places, and fix some size_t checks that were not quite right. Remove old code using SHELL since Bash no longer uses this. (glob, prefix_array): Separate MS code better. (glob_in_dir): Remove old Amiga and VMS code. (globfree, __glob_pattern_type, __glob_pattern_p): Move to separate files. (glob_in_dir): Do not rely on undefined behavior in accessing struct members beyond their bounds. Use a flexible array member instead (link_stat): Rename from link_exists2_p and return -1/0 instead of 0/1. Caller changed. (glob): Fix memory leaks. * posix/glob64 (globfree64): Move to separate file. * sysdeps/gnu/glob64.c (NO_GLOB_PATTERN_P): Remove define. (globfree64): Remove hidden alias. * sysdeps/unix/sysv/linux/Makefile (sysdeps_routines): Add oldglob. * sysdeps/unix/sysv/linux/alpha/glob.c (__new_globfree): Move to separate file. * sysdeps/unix/sysv/linux/i386/glob64.c (NO_GLOB_PATTERN_P): Remove define. Move compat code to separate file. * sysdeps/wordsize-64/glob.c (globfree): Move definitions to separate file. (cherry picked from commit c66c908230169c1bab1f83b071eb585baa214b9f)
\| *	i386: Hide __old_glob64 [BZ #18822]	H.J. Lu	2017-12-02	2	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Hide internal __old_glob64 function to allow direct access within libc.so and libc.a without using GOT nor PLT. [BZ #18822] * sysdeps/unix/sysv/linux/i386/glob64.c (__old_glob64): Add libc_hidden_proto and libc_hidden_def. (cherry picked from commit 2585d7b839559e665d5723734862fbe62264b25d) (cherry picked from commit 2b54f16a8a237a1f3e6f8b974cafda09ed75d292)
\| *	x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]	H.J. Lu	2017-10-22	8	-297/+265
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector, mask and bound registers. It simplifies _dl_runtime_resolve and supports different calling conventions. ld.so code size is reduced by more than 1 KB. However, use fxsave/xsave/xsavec takes a little bit more cycles than saving and restoring vector and bound registers individually. Latency for _dl_runtime_resolve to lookup the function, foo, from one shared library plus libc.so: Before After Change Westmere (SSE)/fxsave 345 866 151% IvyBridge (AVX)/xsave 420 643 53% Haswell (AVX)/xsave 713 1252 75% Skylake (AVX+MPX)/xsavec 559 719 28% Skylake (AVX512+MPX)/xsavec 145 272 87% Ryzen (AVX)/xsavec 280 553 97% This is the worst case where portion of time spent for saving and restoring registers is bigger than majority of cases. With smaller _dl_runtime_resolve code size, overall performance impact is negligible. On IvyBridge, differences in build and test time of binutils with lazy binding GCC and binutils are noises. On Westmere, differences in bootstrap and "makc check" time of GCC 7 with lazy binding GCC and binutils are also noises. [BZ #21265] * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET): New. * sysdeps/x86/cpu-features.c: Include <libc-internal.h>. (get_common_indeces): Set xsave_state_size and bit_arch_XSAVEC_Usable if needed. (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow and bit_arch_Use_dl_runtime_resolve_opt. * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt): Removed. (bit_arch_Use_dl_runtime_resolve_slow): Likewise. (bit_arch_Prefer_No_AVX512): Updated. (bit_arch_MathVec_Prefer_No_AVX512): Likewise. (bit_arch_XSAVEC_Usable): New. (STATE_SAVE_OFFSET): Likewise. (STATE_SAVE_MASK): Likewise. [__ASSEMBLER__]: Include <cpu-features-offsets.h>. (cpu_features): Add xsave_state_size. (index_arch_Use_dl_runtime_resolve_opt): Removed. (index_arch_Use_dl_runtime_resolve_slow): Likewise. (index_arch_XSAVEC_Usable): New. * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup): Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx, _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt, _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and _dl_runtime_resolve_xsavec. * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE): Removed. (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT instead of VEC_SIZE. (REGISTER_SAVE_BND0): Removed. (REGISTER_SAVE_BND1): Likewise. (REGISTER_SAVE_BND3): Likewise. (REGISTER_SAVE_RAX): Always defined to 0. (VMOV): Removed. (_dl_runtime_resolve_avx): Likewise. (_dl_runtime_resolve_avx_slow): Likewise. (_dl_runtime_resolve_avx_opt): Likewise. (_dl_runtime_resolve_avx512): Likewise. (_dl_runtime_resolve_avx512_opt): Likewise. (_dl_runtime_resolve_sse): Likewise. (_dl_runtime_resolve_sse_vex): Likewise. (USE_FXSAVE): New. (_dl_runtime_resolve_fxsave): Likewise. (USE_XSAVE): Likewise. (_dl_runtime_resolve_xsave): Likewise. (USE_XSAVEC): Likewise. (_dl_runtime_resolve_xsavec): Likewise. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512): Removed. (_dl_runtime_resolve_avx512_opt): Likewise. (_dl_runtime_resolve_avx): Likewise. (_dl_runtime_resolve_avx_opt): Likewise. (_dl_runtime_resolve_sse): Likewise. (_dl_runtime_resolve_sse_vex): Likewise. (_dl_runtime_resolve_fxsave): New. (_dl_runtime_resolve_xsave): Likewise. (_dl_runtime_resolve_xsavec): Likewise. (cherry picked from commit b52b0d793dcb226ecb0ecca1e672ca265973233c)
\| *	x86-64: Verify that _dl_runtime_resolve preserves vector registers	H.J. Lu	2017-10-19	10	-4/+428
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On x86-64, _dl_runtime_resolve must preserve the first 8 vector registers. Add 3 _dl_runtime_resolve tests to verify that SSE, AVX and AVX512 registers are preserved. * sysdeps/x86_64/Makefile (tests): Add tst-sse, tst-avx and tst-avx512. (test-extras): Add tst-avx-aux and tst-avx512-aux. (extra-test-objs): Add tst-avx-aux.o and tst-avx512-aux.o. (modules-names): Add tst-ssemod, tst-avxmod and tst-avx512mod. ($(objpfx)tst-sse): New rule. ($(objpfx)tst-avx): Likewise. ($(objpfx)tst-avx512): Likewise. (CFLAGS-tst-avx-aux.c): New. (CFLAGS-tst-avxmod.c): Likewise. (CFLAGS-tst-avx512-aux.c): Likewise. (CFLAGS-tst-avx512mod.c): Likewise. * sysdeps/x86_64/tst-avx-aux.c: New file. * sysdeps/x86_64/tst-avx.c: Likewise. * sysdeps/x86_64/tst-avx512-aux.c: Likewise. * sysdeps/x86_64/tst-avx512.c: Likewise. * sysdeps/x86_64/tst-avx512mod.c: Likewise. * sysdeps/x86_64/tst-avxmod.c: Likewise. * sysdeps/x86_64/tst-sse.c: Likewise. * sysdeps/x86_64/tst-ssemod.c: Likewise. (cherry picked from commit 3403a17fea8ccef7dc5f99553a13231acf838744)
\| *	X86-64: Correct CFA in _dl_runtime_resolve	H.J. Lu	2017-10-19	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When stack is re-aligned in _dl_runtime_resolve, there is no need to adjust CFA when allocating register save area on stack. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve): Don't adjust CFA when allocating register save area on re-aligned stack. (cherry picked from commit 0ac8ee53e8efbfd6e1c37094b4653f5c2dad65b5)
\| *	Fix nss_nisplus build with mainline GCC (bug 20978).	Joseph Myers	2017-10-07	3	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	glibc build with current mainline GCC fails because nis/nss_nisplus/nisplus-alias.c contains code if (name != NULL) { errnop = EINVAL; return NSS_STATUS_UNAVAIL; } char buf[strlen (name) + 9 + tablename_len]; producing an error about strlen being called on a pointer that is always NULL (and a subsequent use of that pointer with a %s format in snprintf). As Andreas noted, the bogus conditional comes from a 1997 change: - if (name == NULL \|\| strlen(name) > 8) - return NSS_STATUS_NOTFOUND; - else + if (name != NULL \|\| strlen(name) <= 8) So the intention is clearly to return an error for NULL name. This patch duly inverts the sense of the conditional. It fixes the build with GCC mainline, and passes usual glibc testsuite testing for x86_64. However, I have not tried any actual substantive nisplus testing, do not have an environment for such testing, and do not know whether it is possible that strlen (name) or tablename_len might be large so that the VLA for buf is actually a security issue. However, if it is a security issue, there are plenty of other similar instances in the nisplus code (that haven't been hidden by a bogus comparison with NULL) - and nis_table.c:__create_ib_request uses strdupa on the string passed to nis_list, so a local fix in the caller wouldn't suffice anyway (see bug 20987). (Calls to strdupa and other such macros that use alloca must be considered equally questionable regarding stack overflow issues as direct calls to alloca and VLA declarations.) [BZ #20978] nis/nss_nisplus/nisplus-alias.c (_nss_nisplus_getaliasbyname_r): Compare name == NULL, not name != NULL. (cherry picked from commit f88759ea9bd3c8d8fef28f123ba9767cb0e421a3)
\| *	Fix rpcgen buffer overrun (bug 20790).	Joseph Myers	2017-10-07	5	-1/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Building with GCC 7 produces an error building rpcgen: rpc_parse.c: In function 'get_prog_declaration': rpc_parse.c:543:25: error: may write a terminating nul past the end of the destination [-Werror=format-length=] sprintf (name, "%s%d", ARGNAME, num); /* default name of argument / ~~~~^ rpc_parse.c:543:5: note: format output between 5 and 14 bytes into a destination of size 10 sprintf (name, "%s%d", ARGNAME, num); / default name of argument / ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ That buffer overrun is for the case where the .x file declares a program with a million arguments. The strcpy two lines above can generate a buffer overrun much more simply for a long argument name. The limit on length of line read by rpcgen (MAXLINESIZE == 1024) provides a bound on the buffer size needed, so this patch just changes the buffer size to MAXLINESIZE to avoid both possible buffer overruns. A testcase is added that rpcgen does not crash with a 500-character argument name, where it previously crashed. It would not at all surprise me if there are many other ways of crashing rpcgen with either valid or invalid input; fuzz testing would likely find various such bugs, though I don't think they are that important to fix (rpcgen is not that likely to be used with untrusted .x files as input). (As well as fuzz-findable bugs there are probably also issues when various int variables get overflowed on very large input.) The test infrastructure for rpcgen-not-crashing tests would need extending if tests are to be added for cases where rpcgen should produce an error, as opposed to cases where it should succeed. Tested for x86_64 and x86. [BZ #20790] sunrpc/rpc_parse.c (get_prog_declaration): Increase buffer size to MAXLINESIZE. * sunrpc/bug20790.x: New file. * sunrpc/Makefile [$(run-built-tests) = yes] (rpcgen-tests): New variable. [$(run-built-tests) = yes] (tests-special): Add $(rpcgen-tests). [$(run-built-tests) = yes] ($(rpcgen-tests)): New rule. (cherry picked from commit 5874510faaf3cbd0bb112aaacab9f225002beed1)
\| *	Fix warnings from latest GCC.	steve ellcey-CA Eng-Software	2017-10-07	2	-4/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	* sysdeps/ieee754/dbl-64/e_pow.c (checkint) Make conditions explicitly boolean. (cherry picked from commit e223d1fe72e820d96f43831412ab267a1ace04d0)
\| *	Fix cast-after-dereference	DJ Delorie	2017-10-07	3	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Original code was dereferencing a char, then casting the value to size_t. Should cast the pointer to size_t then deference. (cherry picked from commit f8cef4d07d9641e27629bd3ce2d13f5d702fb251)
\| *	Fix BZ #21654 - grp-merge.c alignment	DJ Delorie	2017-10-07	2	-0/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* grp/grp_merge.c (__copy_grp): Align char** to minimum pointer alignment not char alignment. (__merge_grp): Likewise. (cherry picked from commit 4fa8ae49aa169fb8d97882938e8bee3ed9ce5410)
\| *	x86-64: Use _dl_runtime_resolve_opt only with AVX512F [BZ #21871]	H.J. Lu	2017-08-06	2	-2/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On AVX machines with XGETBV (ECX == 1) like Skylake processors, (gdb) disass _dl_runtime_resolve_avx_opt Dump of assembler code for function _dl_runtime_resolve_avx_opt: 0x0000000000015890 <+0>: push %rax 0x0000000000015891 <+1>: push %rcx 0x0000000000015892 <+2>: push %rdx 0x0000000000015893 <+3>: mov $0x1,%ecx 0x0000000000015898 <+8>: xgetbv 0x000000000001589b <+11>: mov %eax,%r11d 0x000000000001589e <+14>: pop %rdx 0x000000000001589f <+15>: pop %rcx 0x00000000000158a0 <+16>: pop %rax 0x00000000000158a1 <+17>: and $0x4,%r11d 0x00000000000158a5 <+21>: bnd je 0x16200 <_dl_runtime_resolve_sse_vex> End of assembler dump. is slower than: (gdb) disass _dl_runtime_resolve_avx_slow Dump of assembler code for function _dl_runtime_resolve_avx_slow: 0x0000000000015850 <+0>: vorpd %ymm0,%ymm1,%ymm8 0x0000000000015854 <+4>: vorpd %ymm2,%ymm3,%ymm9 0x0000000000015858 <+8>: vorpd %ymm4,%ymm5,%ymm10 0x000000000001585c <+12>: vorpd %ymm6,%ymm7,%ymm11 0x0000000000015860 <+16>: vorpd %ymm8,%ymm9,%ymm9 0x0000000000015865 <+21>: vorpd %ymm10,%ymm11,%ymm10 0x000000000001586a <+26>: vpcmpeqd %xmm8,%xmm8,%xmm8 0x000000000001586f <+31>: vorpd %ymm9,%ymm10,%ymm10 0x0000000000015874 <+36>: vptest %ymm10,%ymm8 0x0000000000015879 <+41>: bnd jae 0x158b0 <_dl_runtime_resolve_avx> 0x000000000001587c <+44>: vzeroupper 0x000000000001587f <+47>: bnd jmpq 0x16200 <_dl_runtime_resolve_sse_vex> End of assembler dump. (gdb) since xgetbv takes much more cycles than single cycle operations like vpord/vvpcmpeq/ptest. _dl_runtime_resolve_opt should be used only with AVX512 where AVX512 instructions lead to lower CPU frequency on Skylake server. [BZ #21871] * sysdeps/x86/cpu-features.c (init_cpu_features): Set bit_arch_Use_dl_runtime_resolve_opt only with AVX512F. (cherry picked from commit d2cf37c0a2a375cf2fde69f1afbcc49e45368fc4)
\| *	sunrpc: Avoid use-after-free read access in clntudp_call [BZ #21115]	Florian Weimer	2017-08-05	4	-2/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After commit bc779a1a5b3035133024b21e2f339fe4219fb11c (CVE-2016-4429: sunrpc: Do not use alloca in clntudp_call [BZ #20112]), ancillary data is stored on the heap, but it is accessed after it has been freed. The test case must be run under a heap debugger such as valgrind to observe the invalid access. A malloc implementation which immediately calls munmap on free would catch this bug as well. (cherry picked from commit d42eed4a044e5e10dfb885cf9891c2518a72a491)
\| *	Bug 21053: sh: Reduce namespace pollution from sys/ucontext.h	James Clarke	2017-07-31	4	-68/+78
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The problem is basically that sys/ucontext.h is defining R0..R15 which happens to conflict with some packages like Firefox when trying to build on SH. The very same problem existed on arm back then [1] and it was fixed by renaming R0..R15 to REG_R0..REG_R15. This patch imploy a similar strategy for SH. Checked on sh4-linux-gnu with run-built-tests=no and I also got reports that it fixes Firefox build on Debian sh4. * sysdeps/unix/sysv/linux/sh/sh3/ucontext_i.sym: Use new REG_R* constants instead of the old R* ones. * sysdeps/unix/sysv/linux/sh/sh4/ucontext_i.sym: Likewise. * sysdeps/unix/sysv/linux/sh/sys/ucontext.h (NGPREG): Rename... (NGREG): ... to this, to fit in with other architectures. (gpregset_t): Use new NGREG macro. [__USE_GNU]: Remove condition; all architectures other than tile are unconditional. (R): Rename to REG_R. (cherry picked from commit 3e1b518550634792de13332edaab0ad722322c2b)
\| *	Avoid .symver on common symbols [BZ #21666]	H.J. Lu	2017-07-26	2	-4/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The .symver directive on common symbol just creates a new common symbol, not an alias and the newer assembler with the bug fix for https://sourceware.org/bugzilla/show_bug.cgi?id=21661 will issue an error. Before the fix, we got $ readelf -sW libc.so \| grep "loc[12s]" 5109: 00000000003a0608 8 OBJECT LOCAL DEFAULT 36 loc1 5188: 00000000003a0610 8 OBJECT LOCAL DEFAULT 36 loc2 5455: 00000000003a0618 8 OBJECT LOCAL DEFAULT 36 locs 6575: 00000000003a05f0 8 OBJECT GLOBAL DEFAULT 36 locs@GLIBC_2.2.5 7156: 00000000003a05f8 8 OBJECT GLOBAL DEFAULT 36 loc1@GLIBC_2.2.5 7312: 00000000003a0600 8 OBJECT GLOBAL DEFAULT 36 loc2@GLIBC_2.2.5 in libc.so. The versioned loc1, loc2 and locs have the wrong addresses. After the fix, we got $ readelf -sW libc.so \| grep "loc[12s]" 6570: 000000000039e3b8 8 OBJECT GLOBAL DEFAULT 34 locs@GLIBC_2.2.5 7151: 000000000039e3c8 8 OBJECT GLOBAL DEFAULT 34 loc1@GLIBC_2.2.5 7307: 000000000039e3c0 8 OBJECT GLOBAL DEFAULT 34 loc2@GLIBC_2.2.5 [BZ #21666] * misc/regexp.c (loc1): Add __attribute__ ((nocommon)); (loc2): Likewise. (locs): Likewise. (cherry picked from commit 388b4f1a02f3a801965028bbfcd48d905638b797)
\| *	[AArch64] Use hidden __GI__dl_argv in rtld startup code	Szabolcs Nagy	2017-07-12	2	-2/+7
\| \| \| \| \| \| \| \| \| \|	We rely on the symbol being locally defined so using extern symbol is not correct and the linker may complain about the relocations.
\| *	conform tests: call perl with '-I.'	Aurelien Jarno	2017-07-06	2	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Historically perl includes the current directory in the module search path. Over the time this has been considered as a security issue and the recent vulnerabilities [1] made people to reconsider this behaviour. It is almost sure that this will be removed in the future [2], possibly for the 5.26 release, although this is not yet firmly decided. Debian has decided to backport the patches [3], so the perl binary in unstable do not have '.' in @INC anymore. This behaviour is used in the conform perl scripts to include the GlibcConform module. This patch fixes that by calling perl with '-I.'. This is not a security issue in this case as make ensures that the current directory is $(srcdir)/conform/ when the scripts are called. Passing the full path would do exactly the same. [1] CVE-2016-1238 CVE-2016-6185 [2] https://rt.perl.org/Public/Bug/Display.html?id=127810 [3] https://lists.debian.org/debian-devel-announce/2016/08/msg00013.html Changelog: * conform/Makefile (conformtest-header-tests): Pass -I. to $(PERL). (linknamespace-symlists-tests): Likewise. (linknamespace-header-tests): Likewise. (cherry picked from commit 6d5336211d2e823d4d431a01e62a80d9be4cbc9d)
\| *	x86-64: Align the stack in __tls_get_addr [BZ #21609]	H.J. Lu	2017-07-06	8	-2/+144
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change forces realignment of the stack pointer in __tls_get_addr, so that binaries compiled by GCCs older than GCC 4.9: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58066 continue to work even if vector instructions are used in glibc which require the ABI stack realignment. __tls_get_addr_slow is added to handle the slow paths in the default implementation of__tls_get_addr in elf/dl-tls.c. The new __tls_get_addr calls __tls_get_addr_slow after realigning the stack. Internal calls within ld.so go directly to the default implementation of __tls_get_addr because they do not need stack realignment. [BZ #21609] * sysdeps/x86_64/Makefile (sysdep-dl-routines): Add tls_get_addr. (gen-as-const-headers): Add rtld-offsets.sym. * sysdeps/x86_64/dl-tls.c: New file. * sysdeps/x86_64/rtld-offsets.sym: Likwise. * sysdeps/x86_64/tls_get_addr.S: Likewise. * sysdeps/x86_64/dl-tls.h: Add multiple inclusion guards. * sysdeps/x86_64/tlsdesc.sym (TI_MODULE_OFFSET): New. (TI_OFFSET_OFFSET): Likwise. (cherry picked from commit 031e519c95c069abe4e4c7c59e2b4b67efccdee5)
* \|	Merge branch 'release/2.24/master' into ibm/2.24/master	Tulio Magno Quites Machado Filho	2017-06-21	208	-903/+7909
\|\\|
\| *	i686: Add missing IS_IN (libc) guards to vectorized strcspn	Florian Weimer	2017-06-20	3	-3/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit d957c4d3fa48d685ff2726c605c988127ef99395 (i386: Compile rtld-*.os with -mno-sse -mno-mmx -mfpmath=387), vector intrinsics can no longer be used in ld.so, even if the compiled code never makes it into the final ld.so link. This commit adds the missing IS_IN (libc) guard to the SSE 4.2 strcspn implementation, so that it can be used from ld.so in the future. (cherry picked from commit 69052a3a95da37169a08f9e59b2cc1808312753c)
\| *	Ignore and remove LD_HWCAP_MASK for AT_SECURE programs (bug #21209)	Siddhesh Poyarekar	2017-06-20	4	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The LD_HWCAP_MASK environment variable may alter the selection of function variants for some architectures. For AT_SECURE process it means that if an outdated routine has a bug that would otherwise not affect newer platforms by default, LD_HWCAP_MASK will allow that bug to be exploited. To be on the safe side, ignore and disable LD_HWCAP_MASK for setuid binaries. [BZ #21209] * elf/rtld.c (process_envvars): Ignore LD_HWCAP_MASK for AT_SECURE processes. * sysdeps/generic/unsecvars.h: Add LD_HWCAP_MASK. (cherry picked from commit 1c1243b6fc33c029488add276e56570a07803bfd)
\| *	ld.so: Reject overly long LD_AUDIT path elements	Florian Weimer	2017-06-19	2	-15/+106
\| \| \| \| \| \| \| \| \| \| \| \|	Also only process the last LD_AUDIT entry. (cherry picked from commit 81b82fb966ffbd94353f793ad17116c6088dedd9)
\| *	ld.so: Reject overly long LD_PRELOAD path elements	Florian Weimer	2017-06-19	2	-16/+73
\| \| \| \| \| \| \| \|	(cherry picked from commit 6d0ba622891bed9d8394eef1935add53003b12e8)
\| *	CVE-2017-1000366: Ignore LD_LIBRARY_PATH for AT_SECURE=1 programs [BZ #21624]	Florian Weimer	2017-06-19	3	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	LD_LIBRARY_PATH can only be used to reorder system search paths, which is not useful functionality. This makes an exploitable unbounded alloca in _dl_init_paths unreachable for AT_SECURE=1 programs. (cherry picked from commit f6110a8fee2ca36f8e2d2abecf3cba9fa7b8ea7d)
\| *	m68k: fix 64bit atomic ops	Andreas Schwab	2017-06-17	2	-6/+15
\| \| \| \| \| \| \| \|	(cherry picked from commit 64ae9fe45662c8994b0e56ab469b01967408a154)
\| *	Correct collation rules for Malayalam.	Santhosh Thottingal	2017-06-11	2	-4/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[BZ #19922] * locales/iso14651_t1_common: Add collation rules for U+07DA to U+07DF. [BZ #19919] * locales/iso14651_t1_common: Correct collation of U+0D36 and U+0D37.
\| *	fork: Remove bogus parent PID assertions [BZ #21386]	Florian Weimer	2017-06-09	3	-8/+8
\| \| \| \| \| \| \| \|	(cherry picked from commit 1d2bc2eae969543b89850e35e532f3144122d80a)
\| *	Use test-driver in sysdeps/unix/sysv/linux/tst-clone2.c	Arjun Shankar	2017-06-07	2	-4/+9
\| \| \| \| \| \| \| \|	(cherry picked from commit fdc543919a3d8578631a492e1227c2cd8f5ecec7)
\| *	Remove cached PID/TID in clone	Adhemerval Zanella	2017-06-07	70	-827/+165
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch remove the PID cache and usage in current GLIBC code. Current usage is mainly used a performance optimization to avoid the syscall, however it adds some issues: - The exposed clone syscall will try to set pid/tid to make the new thread somewhat compatible with current GLIBC assumptions. This cause a set of issue with new workloads and usecases (such as BZ#17214 and [1]) as well for new internal usage of clone to optimize other algorithms (such as clone plus CLONE_VM for posix_spawn, BZ#19957). - The caching complexity also added some bugs in the past [2] [3] and requires more effort of each port to handle such requirements (for both clone and vfork implementation). - Caching performance gain in mainly on getpid and some specific code paths. The getpid performance leverage is questionable [4], either by the idea of getpid being a hotspot as for the getpid implementation itself (if it is indeed a justifiable hotspot a vDSO symbol could let to a much more simpler solution). Other usage is mainly for non usual code paths, such as pthread cancellation signal and handling. For thread creation (on stack allocation) the code simplification in fact adds some performance gain due the no need of transverse the stack cache and invalidate each element pid. Other thread usages will require a direct getpid syscall, such as cancellation/setxid signal, thread cancellation, thread fail path (at create_thread), and thread signal (pthread_kill and pthread_sigqueue). However these are hardly usual hotspots and I think adding a syscall is justifiable. It also simplifies both the clone and vfork arch-specific implementation. And by review each fork implementation there are some discrepancies that this patch also solves: - microblaze clone/vfork does not set/reset the pid/tid field - hppa uses the default vfork implementation that fallback to fork. Since vfork is deprecated I do not think we should bother with it. The patch also removes the TID caching in clone. My understanding for such semantic is try provide some pthread usage after a user program issue clone directly (as done by thread creation with CLONE_PARENT_SETTID and pthread tid member). However, as stated before in multiple discussions threads, GLIBC provides clone syscalls without further supporting all this semantics. I ran a full make check on x86_64, x32, i686, armhf, aarch64, and powerpc64le. For sparc32, sparc64, and mips I ran the basic fork and vfork tests from posix/ folder (on a qemu system). So it would require further testing on alpha, hppa, ia64, m68k, nios2, s390, sh, and tile (I excluded microblaze because it is already implementing the patch semantic regarding clone/vfork). [1] https://codereview.chromium.org/800183004/ [2] https://sourceware.org/ml/libc-alpha/2006-07/msg00123.html [3] https://sourceware.org/bugzilla/show_bug.cgi?id=15368 [4] http://yarchive.net/comp/linux/getpid_caching.html * sysdeps/nptl/fork.c (__libc_fork): Remove pid cache setting. * nptl/allocatestack.c (allocate_stack): Likewise. (__reclaim_stacks): Likewise. (setxid_signal_thread): Obtain pid through syscall. * nptl/nptl-init.c (sigcancel_handler): Likewise. (sighandle_setxid): Likewise. * nptl/pthread_cancel.c (pthread_cancel): Likewise. * sysdeps/unix/sysv/linux/pthread_kill.c (__pthread_kill): Likewise. * sysdeps/unix/sysv/linux/pthread_sigqueue.c (pthread_sigqueue): Likewise. * sysdeps/unix/sysv/linux/createthread.c (create_thread): Likewise. * sysdeps/unix/sysv/linux/getpid.c: Remove file. * nptl/descr.h (struct pthread): Change comment about pid value. * nptl/pthread_getattr_np.c (pthread_getattr_np): Remove thread pid assert. * sysdeps/unix/sysv/linux/pthread-pids.h (__pthread_initialize_pids): Do not set pid value. * nptl_db/td_ta_thr_iter.c (iterate_thread_list): Remove thread pid cache check. * nptl_db/td_thr_validate.c (td_thr_validate): Likewise. * sysdeps/aarch64/nptl/tcb-offsets.sym: Remove pid offset. * sysdeps/alpha/nptl/tcb-offsets.sym: Likewise. * sysdeps/arm/nptl/tcb-offsets.sym: Likewise. * sysdeps/hppa/nptl/tcb-offsets.sym: Likewise. * sysdeps/i386/nptl/tcb-offsets.sym: Likewise. * sysdeps/ia64/nptl/tcb-offsets.sym: Likewise. * sysdeps/m68k/nptl/tcb-offsets.sym: Likewise. * sysdeps/microblaze/nptl/tcb-offsets.sym: Likewise. * sysdeps/mips/nptl/tcb-offsets.sym: Likewise. * sysdeps/nios2/nptl/tcb-offsets.sym: Likewise. * sysdeps/powerpc/nptl/tcb-offsets.sym: Likewise. * sysdeps/s390/nptl/tcb-offsets.sym: Likewise. * sysdeps/sh/nptl/tcb-offsets.sym: Likewise. * sysdeps/sparc/nptl/tcb-offsets.sym: Likewise. * sysdeps/tile/nptl/tcb-offsets.sym: Likewise. * sysdeps/x86_64/nptl/tcb-offsets.sym: Likewise. * sysdeps/unix/sysv/linux/aarch64/clone.S: Remove pid and tid caching. * sysdeps/unix/sysv/linux/alpha/clone.S: Likewise. * sysdeps/unix/sysv/linux/arm/clone.S: Likewise. * sysdeps/unix/sysv/linux/hppa/clone.S: Likewise. * sysdeps/unix/sysv/linux/i386/clone.S: Likewise. * sysdeps/unix/sysv/linux/ia64/clone2.S: Likewise. * sysdeps/unix/sysv/linux/mips/clone.S: Likewise. * sysdeps/unix/sysv/linux/nios2/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/clone.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/clone.S: Likewise. * sysdeps/unix/sysv/linux/sh/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/clone.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/clone.S: Likewise. * sysdeps/unix/sysv/linux/tile/clone.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/clone.S: Likewise. * sysdeps/unix/sysv/linux/aarch64/vfork.S: Remove pid set and reset. * sysdeps/unix/sysv/linux/alpha/vfork.S: Likewise. * sysdeps/unix/sysv/linux/arm/vfork.S: Likewise. * sysdeps/unix/sysv/linux/i386/vfork.S: Likewise. * sysdeps/unix/sysv/linux/ia64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/m68k/clone.S: Likewise. * sysdeps/unix/sysv/linux/m68k/vfork.S: Likewise. * sysdeps/unix/sysv/linux/mips/vfork.S: Likewise. * sysdeps/unix/sysv/linux/nios2/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/powerpc/powerpc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/s390/s390-64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sh/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc32/vfork.S: Likewise. * sysdeps/unix/sysv/linux/sparc/sparc64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tile/vfork.S: Likewise. * sysdeps/unix/sysv/linux/x86_64/vfork.S: Likewise. * sysdeps/unix/sysv/linux/tst-clone2.c (f): Remove direct pthread struct access. (clone_test): Remove function. (do_test): Rewrite to take in consideration pid is not cached anymore. (cherry picked from commit c579f48edba88380635ab98cb612030e3ed8691e)
\| *	Add INTERNAL_SYSCALL_CALL	Adhemerval Zanella	2017-06-07	2	-18/+83
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds two new macros for internal and inline syscall to use within GLIBC: INTERNAL_SYSCALL_CALL and INLINE_SYSCALL_CALL. They are similar to the old INTERNAL_SYSCALL and INLINE_SYSCALL with the difference the new macros accept a variable argument call and do not require to pass the expected argument size. The advantage is it is possible to use variable argument macros like SYSCALL_LL{64} without the need to also handle the argument size. So for an ABI where SYSCALL_LL might split the argument in high and low parts, instead of: INTERNAL_SYSCALL_DECL (err); #if ... INTERNAL_SYSCALL (syscall, err, 2, SYSCALL_LL (len)); #else INTERNAL_SYSCALL (syscall, err, 1, SYSCALL_LL (len)); #endif It will be just: INTERNAL_SYSCALL_CALL (syscall, err, SYSCALL_LL (len)); The INLINE_SYSCALL_CALL follows the same semanthic regarding the argument and is similar to INLINE_SYSCALL regarding setting errno. Checked with a build for x86_64, i386, aach64, armhf, powerpc64le, powerpc32, and mips32. No code generation changed. * sysdeps/unix/sysdep.h (__INTERNAL_SYSCALL0): New macro. (__INTERNAL_SYSCALL1): Likewise. (__INTERNAL_SYSCALL2): Likewise. (__INTERNAL_SYSCALL3): Likewise. (__INTERNAL_SYSCALL4): Likewise. (__INTERNAL_SYSCALL5): Likewise. (__INTERNAL_SYSCALL6): Likewise. (__INTERNAL_SYSCALL7): Likewise. (__INTERNAL_SYSCALL_NARGS_X): Likewise. (__INTERNAL_SYSCALL_NARGS): Likewise. (__INTERNAL_SYSCALL_DISP): Likewise. (INTERNAL_SYSCALL_CALL): Likewise. (__SYSCALL0): Rename to __INLINE_SYSCALL0. (__SYSCALL1): Rename to __INLINE_SYSCALL1. (__SYSCALL2): Rename to __INLINE_SYSCALL2. (__SYSCALL3): Rename to __INLINE_SYSCALL3. (__SYSCALL4): Rename to __INLINE_SYSCALL4. (__SYSCALL5): Rename to __INLINE_SYSCALL5. (__SYSCALL6): Rename to __INLINE_SYSCALL6. (__SYSCALL7): Rename to __INLINE_SYSCALL7. (__SYSCALL_NARGS_X): Rename to __INLINE_SYSCALL_NARGS_X. (__SYSCALL_NARGS): Rename to __INLINE_SYSCALL_NARGS. (__SYSCALL_DISP): Rename to __INLINE_SYSCALL_DISP. (__SYSCALL_CALL): Rename to INLINE_SYSCALL_CALL. (SYSCALL_CANCEL): Replace __SYSCALL_CALL with INLINE_SYSCALL_CALL. (cherry picked from commit e33a23fbe8c2dba04fe05678c584d3efcb6c9951)
\| *	Synchronize support/ infrastructure with master	Arjun Shankar	2017-06-07	119	-3/+7306
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit updates the support/ subdirectory to commit 2714c5f3c95f90977167c1d21326d907fb76b419 on the master branch and modifies Makeconfig, Rules, and extra-lib.mk accordingly.
\| *	x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396]	H.J. Lu	2017-04-28	11	-1/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Skylake server, AVX512 load/store instructions in memcpy/memset may lead to lower CPU turbo frequency in certain situations. Use of AVX2 in memcpy/memset has been observed to have improved overall performance in many workloads due to the higher frequency. Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512 if AVX512ER isn't available so that AVX2 versions of memcpy/memset are used on Skylake server. [BZ #21396] * sysdeps/x86/cpu-features.c (init_cpu_features): Set Prefer_No_AVX512 if AVX512ER isn't available. * sysdeps/x86/cpu-features.h (bit_arch_Prefer_No_AVX512): New. (index_arch_Prefer_No_AVX512): Likewise. * sysdeps/x86_64/multiarch/memcpy.S (__new_memcpy): Don't use AVX512 version if Prefer_No_AVX512 is set. * sysdeps/x86_64/multiarch/memcpy_chk.S (__memcpy_chk): Likewise. * sysdeps/x86_64/multiarch/memmove.S (__libc_memmove): Likewise. * sysdeps/x86_64/multiarch/memmove_chk.S (__memmove_chk): Likewise. * sysdeps/x86_64/multiarch/mempcpy.S (__mempcpy): Likewise. * sysdeps/x86_64/multiarch/mempcpy_chk.S (__mempcpy_chk): Likewise. * sysdeps/x86_64/multiarch/memset.S (memset): Likewise. * sysdeps/x86_64/multiarch/memset_chk.S (__memset_chk): Likewise. (cherry picked from commit 4cb334c4d6249686653137ec273d081371b3672d)
\| *	x86: Set Prefer_No_VZEROUPPER if AVX512ER is available	H.J. Lu	2017-04-28	3	-2/+42
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AVX512ER won't be implemented in any Xeon processors and will be in all Xeon Phi processors. Don't check CPU model number when setting Prefer_No_VZEROUPPER for Xeon Phi. Instead, set Prefer_No_VZEROUPPER if AVX512ER is available. It works with current and future Xeon Phi and non-Xeon Phi processors. * sysdeps/x86/cpu-features.c (init_cpu_features): Set Prefer_No_VZEROUPPER if AVX512ER is available. * sysdeps/x86/cpu-features.h (bit_cpu_AVX512PF): New. (bit_cpu_AVX512ER): Likewise. (bit_cpu_AVX512CD): Likewise. (bit_cpu_AVX512BW): Likewise. (bit_cpu_AVX512VL): Likewise. (index_cpu_AVX512PF): Likewise. (index_cpu_AVX512ER): Likewise. (index_cpu_AVX512CD): Likewise. (index_cpu_AVX512BW): Likewise. (index_cpu_AVX512VL): Likewise. (reg_AVX512PF): Likewise. (reg_AVX512ER): Likewise. (reg_AVX512CD): Likewise. (reg_AVX512BW): Likewise. (reg_AVX512VL): Likewise. (cherry picked from commit 1c53cb49de6d82d9469ccbd5aa0c55924502bd8b)
* \|	malloc: Simplify static malloc interposition [BZ #20432]	Florian Weimer	2017-05-26	9	-6/+390
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Existing interposed mallocs do not define the glibc-internal fork callbacks (and they should not), so statically interposed mallocs lead to link failures because the strong reference from fork pulls in glibc's malloc, resulting in multiple definitions of malloc-related symbols. (cherry picked from commit ef4f97648dc95849e417dd3e6328165de4c22185) Conflicts: nptl/tst-tls3-malloc.c
* \|	malloc: Run tests without calling mallopt [BZ #19469]	Florian Weimer	2017-05-26	3	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The compiled tests no longer refer to the mallopt symbol from their main functions. (Some tests still call mallopt explicitly, which is fine.) (cherry picked from commit f690b56979dea81340a397c1b5e44827a6fb06e7)
* \|	Merge branch release/2.24/master into ibm/2.24/master	Tulio Magno Quites Machado Filho	2017-04-25	16	-25/+303
\|\\|
\| *	Fix MIPS n64 readahead (bug 21026).	Joseph Myers	2017-04-09	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As noted in bug 20126, MIPS n64 uses an incorrect implementation of readahead intended for 32-bit systems. This patch adds a syscalls.list entry to fix this. An updated version of the consolidation patch <https://sourceware.org/ml/libc-alpha/2016-09/msg00527.html> could remove this syscalls.list entry again. Tested with compilation (only) for mips64; the nature of the syscall doesn't allow for a glibc test to detect this issue. [BZ #21026] * sysdeps/unix/sysv/linux/mips/mips64/n64/syscalls.list (readahead): New syscall entry. (cherry picked from commit 30733525c6867c160261db1afade6326000f9f75)
\| *	x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]	H.J. Lu	2017-04-07	3	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve the first 8 vector registers. The code layout is if only %xmm0 - %xmm7 registers are used preserve %xmm0 - %xmm7 registers if only %ymm0 - %ymm7 registers are used preserve %ymm0 - %ymm7 registers preserve %zmm0 - %zmm7 registers Branch predication always executes the fallthrough code path to preserve %zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7 registers are used. This leads to lower CPU frequency on Skylake server. This patch changes the fallthrough code path to preserve %xmm0 - %xmm7 registers instead: if whole %zmm0 - %zmm7 registers are used preserve %zmm0 - %zmm7 registers if only %ymm0 - %ymm7 registers are used preserve %ymm0 - %ymm7 registers preserve %xmm0 - %xmm7 registers Tested on Skylake server. [BZ #21258] * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt): Define only if _dl_runtime_resolve is defined to _dl_runtime_resolve_sse_vex. * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt): Fallthrough to _dl_runtime_resolve_sse_vex. (cherry picked from commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00)
\| *	posix_spawn: use a larger min stack for -fstack-check [BZ #21253]	Mike Frysinger	2017-04-03	2	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When glibc is built with -fstack-check, trying to use posix_spawn can lead to segfaults due to gcc internally probing stack memory too far. The new spawn API will allocate a minimum of 1 page, but the stack checking logic might probe a couple of pages. When it tries to walk them, everything falls apart. The gcc internal docs [1] state the default interval checking is one page. Which means we need two pages (the current one, and the next probed). No target currently defines it larger. Further, it mentions that the default minimum stack size needed to recover from an overflow is 4/8KiB for sjlj or 8/12KiB for others. But some Linux targets (like mips and ppc) go up to 16KiB (and some non-Linux targets go up to 24KiB). Let's create each child with a minimum of 32KiB slack space to support them all, and give us future breathing room. No test is added as existing ones crash. Even a simple call is enough to trigger the problem: char *argv[] = { "/bin/ls", NULL }; posix_spawn(NULL, "/bin/ls", NULL, NULL, argv, NULL); [1] https://gcc.gnu.org/onlinedocs/gcc-6.3.0/gccint/Stack-Checking.html (cherry picked from commit 21f042c804835d1f7a4a8e06f2c93ca35a182042)
\| *	fts: Fix symbol redirect for fts_set [BZ #21289]	Slava Barinov	2017-03-31	3	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In a 32-bit environment with _FILE_OFFSET_BITS=64, the __REDIRECT macro combined with __THROW generates an invalid C++ declaration. (cherry picked from commit ce39613205dc47ceaeea76710d49e7a483b503ab)
\| *	posix_spawn: fix stack setup on ia64 [BZ #21275]	Mike Frysinger	2017-03-20	2	-5/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The ia64-specific clone2 call expects the base of the stack mapping and the stack size as sep arguments, not an initial stack value as on other stack-grows-down architectures. Reuse the stack-grows-up macro so we pass in the right stack base. Reported-by: Matt Turner <mattst88@gentoo.org> (cherry picked from commit ddc3fb333469c2997798742dc0509dc1e3201d91)