about summary refs log tree commit diff
path: root/sysdeps/unix/sysv
Commit message (Collapse)AuthorAgeFilesLines
* aarch64: Add half-width versions of AdvSIMD f32 libmvec routinesJoe Ramsay2023-12-201-0/+15
| | | | | | | | | | | Compilers may emit calls to 'half-width' routines (two-lane single-precision variants). These have been added in the form of wrappers around the full-width versions, where the low half of the vector is simply duplicated. This will perform poorly when one lane triggers the special-case handler, as there will be a redundant call to the scalar version, however this is expected to be rare at Ofast. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* elf: Do not duplicate the GLIBC_TUNABLES stringAdhemerval Zanella2023-12-193-46/+38
| | | | | | | | | | | | | | | | | | | | | The tunable parsing duplicates the tunable environment variable so it null-terminates each one since it simplifies the later parsing. It has the drawback of adding another point of failure (__minimal_malloc failing), and the memory copy requires tuning the compiler to avoid mem operations calls. The parsing now tracks the tunable start and its size. The dl-tunable-parse.h adds helper functions to help parsing, like a strcmp that also checks for size and an iterator for suboptions that are comma-separated (used on hwcap parsing by x86, powerpc, and s390x). Since the environment variable is allocated on the stack by the kernel, it is safe to keep the references to the suboptions for later parsing of string tunables (as done by set_hwcaps by multiple architectures). Checked on x86_64-linux-gnu, powerpc64le-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* s390: Set psw addr field in getcontext and friends.Stefan Liebler2023-12-196-0/+34
| | | | | | | | | | | | | | | | | | | | So far if the ucontext structure was obtained by getcontext and co, the return address was stored in general purpose register 14 as it is defined as return address in the ABI. In contrast, the context passed to a signal handler contains the address in psw.addr field. If somebody e.g. wants to dump the address of the context, the origin needs to be known. Now this patch adjusts getcontext and friends and stores the return address also in psw.addr field. Note that setcontext isn't adjusted and it is not supported to pass a ucontext structure from signal-handler to setcontext. We are not able to restore all registers and branching to psw.addr without clobbering one register.
* powerpc: Add space for HWCAP3/HWCAP4 in the TCB for future Power.Manjunath Matti2023-12-153-0/+3
| | | | | | | | | | | | | | This patch reserves space for HWCAP3/HWCAP4 in the TCB of powerpc. These hardware capabilities bits will be used by future Power architectures. Versioned symbol '__parse_hwcap_3_4_and_convert_at_platform' advertises the availability of the new HWCAP3/HWCAP4 data in the TCB. This is an ABI change for GLIBC 2.39. Suggested-by: Peter Bergner <bergner@linux.ibm.com> Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
* linux: Make fdopendir fail with O_PATH (BZ 30373)Adhemerval Zanella2023-11-303-1/+56
| | | | | | | | | It is not strictly required by the POSIX, since O_PATH is a Linux extension, but it is QoI to fail early instead of at readdir. Also the check is free, since fdopendir already checks if the file descriptor is opened for read. Checked on x86_64-linux-gnu.
* Remove __access_noerrnoJoseph Myers2023-11-231-17/+0
| | | | | | | | | | | | | | | | A recent commit, apparently commit 6c6fce572fb8f583f14d898e54fd7d25ae91cf56 "elf: Remove /etc/suid-debug support", resulted in localplt failures for i686-gnu and x86_64-gnu: Missing required PLT reference: ld.so: __access_noerrno After that commit, __access_noerrno is actually no longer used at all. So rather than just removing the localplt expectation for that symbol for Hurd, completely remove all definitions of and references to that symbol. Tested for x86_64, and with build-many-glibcs.py for i686-gnu and x86_64-gnu.
* malloc: Use __get_nprocs on arena_get2 (BZ 30945)Adhemerval Zanella2023-11-221-1/+1
| | | | | | | | | | | | | | | | This restore the 2.33 semantic for arena_get2. It was changed by 11a02b035b46 to avoid arena_get2 call malloc (back when __get_nproc was refactored to use an scratch_buffer - 903bc7dcc2acafc). The __get_nproc was refactored over then and now it also avoid to call malloc. The 11a02b035b46 did not take in consideration any performance implication, which should have been discussed properly. The __get_nprocs_sched is still used as a fallback mechanism if procfs and sysfs is not acessible. Checked on x86_64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* elf: Fix _dl_debug_vdprintf to work before self-relocationAdhemerval Zanella2023-11-211-0/+24
| | | | | | | | | | | | | The strlen might trigger and invalid GOT entry if it used before the process is self-relocated (for instance on dl-tunables if any error occurs). For i386, _dl_writev with PIE requires to use the old 'int $0x80' syscall mode because the calling the TLS register (gs) is not yet initialized. Checked on x86_64-linux-gnu. Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* aarch64: Add vector implementations of expm1 routinesJoe Ramsay2023-11-201-0/+4
| | | | May discard sign of 0 - auto tests for -0 and -0x1p-10000 updated accordingly.
* linux: Use fchmodat2 on fchmod for flags different than 0 (BZ 26401)Adhemerval Zanella2023-11-202-53/+75
| | | | | | | | | | | | | | | | | | Linux 6.6 (09da082b07bbae1c) added support for fchmodat2, which has similar semantics as fchmodat with an extra flag argument. This allows fchmodat to implement AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH without the need for procfs. The syscall is registered on all architectures (with value of 452 except on alpha which is 562, commit 78252deb023cf087). The tst-lchmod.c requires a small fix where fchmodat checks two contradictory assertions ('(st.st_mode & 0777) == 2' and '(st.st_mode & 0777) == 3'). Checked on x86_64-linux-gnu on a 6.6 kernel. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* AArch64: Remove Falkor memcpyWilco Dijkstra2023-11-132-7/+0
| | | | | | | | | | The latest implementations of memcpy are actually faster than the Falkor implementations [1], so remove the falkor/phecda ifuncs for memcpy and the now unused IS_FALKOR/IS_PHECDA defines. [1] https://sourceware.org/pipermail/libc-alpha/2022-December/144227.html Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* aarch64: Add vector implementations of log1p routinesJoe Ramsay2023-11-101-0/+4
| | | | May discard sign of zero.
* aarch64: Add vector implementations of atan2 routinesJoe Ramsay2023-11-101-0/+4
|
* aarch64: Add vector implementations of atan routinesJoe Ramsay2023-11-101-0/+4
|
* aarch64: Add vector implementations of acos routinesJoe Ramsay2023-11-101-0/+4
|
* aarch64: Add vector implementations of asin routinesJoe Ramsay2023-11-101-0/+4
|
* elf: Add glibc.mem.decorate_maps tunableAdhemerval Zanella2023-11-071-5/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PR_SET_VMA_ANON_NAME support is only enabled through a configurable kernel switch, mainly because assigning a name to a anonymous virtual memory area might prevent that area from being merged with adjacent virtual memory areas. For instance, with the following code: void *p1 = mmap (NULL, 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); void *p2 = mmap (p1 + (1024 * 4096), 1024 * 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); The kernel will potentially merge both mappings resulting in only one segment of size 0x800000. If the segment is names with PR_SET_VMA_ANON_NAME with different names, it results in two mappings. Although this will unlikely be an issue for pthread stacks and malloc arenas (since for pthread stacks the guard page will result in a PROT_NONE segment, similar to the alignment requirement for the arena block), it still might prevent the mmap memory allocated for detail malloc. There is also another potential scalability issue, where the prctl requires to take the mmap global lock which is still not fully fixed in Linux [1] (for pthread stacks and arenas, it is mitigated by the stack cached and the arena reuse). So this patch disables anonymous mapping annotations as default and add a new tunable, glibc.mem.decorate_maps, can be used to enable it. [1] https://lwn.net/Articles/906852/ Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>
* linux: Add PR_SET_VMA_ANON_NAME supportAdhemerval Zanella2023-11-073-0/+81
| | | | | | | | | | | | | | Linux 5.17 added support to naming anonymous virtual memory areas through the prctl syscall. The __set_vma_name is a wrapper to avoid optimizing the prctl call if the kernel does not support it. If the kernel does not support PR_SET_VMA_ANON_NAME, prctl returns EINVAL. And it also returns the same error for an invalid argument. Since it is an internal-only API, it assumes well-formatted input: aligned START, with (START, START+LEN) being a valid memory range, and NAME with a limit of 80 characters without an invalid one ("\\`$[]"). Reviewed-by: DJ Delorie <dj@redhat.com>
* Add SEGV_CPERR from Linux 6.6 to bits/siginfo-consts.hJoseph Myers2023-11-031-1/+3
| | | | | | | Linux 6.6 adds the constant SEGV_CPERR. Add it to glibc's bits/siginfo-consts.h. Tested for x86_64.
* linux: Add HWCAP2_HBC from Linux 6.6 to AArch64 bits/hwcap.hAdhemerval Zanella2023-11-031-0/+1
|
* linux: Add FSCONFIG_CMD_CREATE_EXCL from Linux 6.6 to sys/mount.hAdhemerval Zanella2023-11-031-0/+2
| | | | | The tst-mount-consts.py does not need to be updated because kernel exports it as an enum (compare_macro_consts can not parse it).
* linux: Add MMAP_ABOVE4G from Linux 6.6 to sys/mman.hAdhemerval Zanella2023-11-032-1/+2
| | | | | | x86 added the flag (29f890d1050fc099f) for CET enabled. Also update tst-mman-consts.py test.
* Update kernel version to 6.6 in header constant testsAdhemerval Zanella2023-11-032-3/+3
| | | | | There are no new constants covered, the tst-mman-consts.py is updated separately along with a header constant addition.
* Update syscall lists for Linux 6.6Adhemerval Zanella2023-11-0328-2/+32
| | | | | Linux 6.6 has one new syscall for all architectures, fchmodat2, and the map_shadow_stack on x86_64.
* Use correct subdir when building tst-rfc3484* for mach and armArjun Shankar2023-11-011-3/+0
| | | | | | | | | Commit 7f602256ab5b85db1dbfb5f40bd109c4b37b68c8 moved the tst-rfc3484* tests from posix/ to nss/, but didn't correct references to point to their new subdir when building for mach and arm. This commit fixes that. Tested with build-many-glibcs.sh for i686-gnu.
* crypt: Remove libcrypt supportAdhemerval Zanella2023-10-3034-271/+0
| | | | | | | | | | | | | | | | | | All the crypt related functions, cryptographic algorithms, and make requirements are removed, with only the exception of md5 implementation which is moved to locale folder since it is required by localedef for integrity protection (libc's locale-reading code does not check these, but localedef does generate them). Besides thec code itself, both internal documentation and the manual is also adjusted. This allows to remove both --enable-crypt and --enable-nss-crypt configure options. Checked with a build for all affected ABIs. Co-authored-by: Zack Weinberg <zack@owlfolio.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* LoongArch: Update hwcap.h to sync with LoongArch kernel.caiyinyu2023-10-261-0/+1
|
* AArch64: Add support for MOPS memcpy/memmove/memsetWilco Dijkstra2023-10-242-0/+4
| | | | | | | | Add support for MOPS in cpu_features and INIT_ARCH. Add ifuncs using MOPS for memcpy, memmove and memset (use .inst for now so it works with all binutils versions without needing complex configure and conditional compilation). Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* aarch64: Add vector implementations of exp10 routinesJoe Ramsay2023-10-231-0/+4
| | | | | Double-precision routines either reuse the exp table (AdvSIMD) or use SVE FEXPA intruction.
* aarch64: Add vector implementations of log10 routinesJoe Ramsay2023-10-231-0/+4
| | | | A table is also added, which is shared between AdvSIMD and SVE log10.
* aarch64: Add vector implementations of log2 routinesJoe Ramsay2023-10-231-0/+4
| | | | A table is also added, which is shared between AdvSIMD and SVE log2.
* aarch64: Add vector implementations of exp2 routinesJoe Ramsay2023-10-231-0/+4
| | | | Some routines reuse table from v_exp_data.c
* aarch64: Add vector implementations of tan routinesJoe Ramsay2023-10-231-0/+4
| | | | | This includes some utility headers for evaluating polynomials using various schemes.
* tst-spawn-cgroup.c: Fix argument order of UNSUPPORTED message.Stefan Liebler2023-10-201-3/+3
| | | | | | | The arguments for "expected" and "got" are mismatched. Furthermore this patch is dumping both values as hex. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Florian Weimer <fweimer@redhat.com>
* Add HWCAP2_MOPS from Linux 6.5 to AArch64 bits/hwcap.hJoseph Myers2023-10-171-0/+1
| | | | | | | Linux 6.5 adds a new AArch64 HWCAP2 value, HWCAP2_MOPS. Add it to glibc's bits/hwcap.h. Tested with build-many-glibcs.py for aarch64-linux-gnu.
* Add SCM_SECURITY, SCM_PIDFD to bits/socket.hJoseph Myers2023-10-161-0/+4
| | | | | | | | | | | | Linux 6.5 adds a constant SCM_PIDFD (recall that the non-uapi linux/socket.h, where this constant is added, is in fact a header providing many constants that are part of the kernel/userspace interface). This shows up that SCM_SECURITY, from the same set of definitions and added in Linux 2.6.17, is also missing from glibc, although glibc has the first two constants from this set, SCM_RIGHTS and SCM_CREDENTIALS; add both missing constants to glibc. Tested for x86_64.
* Add AT_HANDLE_FID from Linux 6.5 to bits/fcntl-linux.hJoseph Myers2023-10-161-0/+11
| | | | | | | | | Linux 6.5 adds a constant AT_HANDLE_FID; add it to glibc. Because this is a flag for the function name_to_handle_at declared in bits/fcntl-linux.h, put the flag there rather than alongside other AT_* flags in (OS-independent) fcntl.h. Tested for x86_64.
* Fix FORTIFY_SOURCE false positiveVolker Weißmann2023-10-041-1/+3
| | | | | | | | | | | | | | | | When -D_FORTIFY_SOURCE=2 was given during compilation, sprintf and similar functions will check if their first argument is in read-only memory and exit with *** %n in writable segment detected *** otherwise. To check if the memory is read-only, glibc reads frpm the file "/proc/self/maps". If opening this file fails due to too many open files (EMFILE), glibc will now ignore this error. Fixes [BZ #30932] Signed-off-by: Volker Weißmann <volker.weissmann@gmx.de> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Linux: add ST_NOSYMFOLLOWKir Kolyshkin2023-10-021-1/+3
| | | | | | | | | | Linux v5.10 added a mount option MS_NOSYMFOLLOW, which was added to glibc in commit 0ca21427d950755b. Add the corresponding statfs/statvfs flag bit, ST_NOSYMFOLLOW. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* io: Do not implement fstat with fstatatAdhemerval Zanella2023-09-273-12/+68
| | | | | | | | | | | | | | | | AT_EMPTY_PATH is a requirement to implement fstat over fstatat, however it does not prevent the kernel to read the path argument. It is not an issue, but on x86-64 with SMAP-capable CPUs the kernel is forced to perform expensive user memory access. After that regular lookup is performed which adds even more overhead. Instead, issue the fstat syscall directly on LFS fstat implementation (32 bit architectures will still continue to use statx, which is required to have 64 bit time_t support). it should be even a small performance gain on non x86_64, since there is no need to handle the path argument. Checked on x86_64-linux-gnu.
* Revert "LoongArch: Add glibc.cpu.hwcap support."caiyinyu2023-09-215-158/+4
| | | | This reverts commit a53451559dc9cce765ea5bcbb92c4007e058e92b.
* Update kernel version to 6.5 in header constant testsJoseph Myers2023-09-202-2/+2
| | | | | | | | | | This patch updates the kernel version in the tests tst-mman-consts.py and tst-pidfd-consts.py to 6.5. (There are no new constants covered by these tests in 6.5 that need any other header changes; tst-mount-consts.py was updated separately along with a header constant addition.) Tested with build-many-glibcs.py.
* LoongArch: Add glibc.cpu.hwcap support.caiyinyu2023-09-195-4/+158
| | | | | | | | | | | | | | | | | | | | | | | | | Key Points: 1. On lasx & lsx platforms, We must use _dl_runtime_{profile, resolve}_{lsx, lasx} to save vector registers. 2. Via "tunables", users can choose str/mem_{lasx,lsx,unaligned} functions with `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,...`. Note: glibc.cpu.hwcaps doesn't affect _dl_runtime_{profile, resolve}_{lsx, lasx} selection. Usage Notes: 1. Only valid inputs: LASX, LSX, UAL. Case-sensitive, comma-separated, no spaces. 2. Example: `export GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL` turns on LASX & UAL. Unmentioned features turn off. With default ifunc: lasx > lsx > unaligned > aligned > generic, effect is: lasx > unaligned > aligned > generic; lsx off. 3. Incorrect GLIBC_TUNABLES settings will show error messages. For example: On lsx platforms, you cannot enable lasx features. If you do that, you will get error messages. 4. Valid input examples: - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX: lasx > aligned > generic. - GLIBC_TUNABLES=glibc.cpu.hwcaps=LSX,UAL: lsx > unaligned > aligned > generic. - GLIBC_TUNABLES=glibc.cpu.hwcaps=LASX,UAL,LASX,UAL,LSX,LASX,UAL: Repetitions allowed but not recommended. Results in: lasx > lsx > unaligned > aligned > generic.
* Add MOVE_MOUNT_BENEATH from Linux 6.5 to sys/mount.hJoseph Myers2023-09-142-2/+3
| | | | | | | | This patch adds the MOVE_MOUNT_BENEATH constant from Linux 6.5 to glibc's sys/mount.h and updates tst-mount-consts.py to reflect these constants being up to date with that Linux kernel version. Tested with build-many-glibcs.py.
* Update syscall lists for Linux 6.5Joseph Myers2023-09-1228-2/+31
| | | | | | | | Linux 6.5 has one new syscall, cachestat, and also enables the cacheflush syscall for hppa. Update syscall-names.list and regenerate the arch-syscall.h headers with build-many-glibcs.py update-syscalls. Tested with build-many-glibcs.py.
* ia64: Work around miscompilation and fix build on ia64's gcc-10 and laterSergei Trofimovich2023-09-111-1/+3
| | | | | | | | | | | | | Needed since gcc-10 enabled -fno-common by default. [In use in Gentoo since gcc-10, no problems observed. Also discussed with and reviewed by Jessica Clarke from Debian. Andreas] Bug: https://bugs.gentoo.org/723268 Reviewed-by: Carlos O'Donell <carlos@redhat.com> Signed-off-by: Sergei Trofimovich <slyich@gmail.com> Signed-off-by: Andreas K. Hüttel <dilfridge@gentoo.org>
* io: Fix record locking contants for powerpc64 with __USE_FILE_OFFSET64Aurelien Jarno2023-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 5f828ff824e3b7cd1 ("io: Fix F_GETLK, F_SETLK, and F_SETLKW for powerpc64") fixed an issue with the value of the lock constants on powerpc64 when not using __USE_FILE_OFFSET64, but it ended-up also changing the value when using __USE_FILE_OFFSET64 causing an API change. Fix that by also checking that define, restoring the pre 4d0fe291aed3a476a commit values: Default values: - F_GETLK: 5 - F_SETLK: 6 - F_SETLKW: 7 With -D_FILE_OFFSET_BITS=64: - F_GETLK: 12 - F_SETLK: 13 - F_SETLKW: 14 At the same time, it has been noticed that there was no test for io lock with __USE_FILE_OFFSET64, so just add one. Tested on x86_64-linux-gnu, i686-linux-gnu and powerpc64le-unknown-linux-gnu. Resolves: BZ #30804. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* linux: Add pidfd_getpidAdhemerval Zanella Netto2023-09-0542-0/+478
| | | | | | | | | | | | | | | | | | | | | | | | This interface allows to obtain the associated process ID from the process file descriptor. It is done by parsing the procps fdinfo information. Its prototype is: pid_t pidfd_getpid (int fd) It returns the associated pid or -1 in case of an error and sets the errno accordingly. The possible errno values are those from open, read, and close (used on procps parsing), along with: - EBADF if the FD is negative, does not have a PID associated, or if the fdinfo fields contain a value larger than pid_t. - EREMOTE if the PID is in a separate namespace. - ESRCH if the process is already terminated. Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid support), Linux 5.4 (full support), and Linux 6.2. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* posix: Add pidfd_spawn and pidfd_spawnp (BZ 30349)Adhemerval Zanella Netto2023-09-0551-3/+499
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Returning a pidfd allows a process to keep a race-free handle for a child process, otherwise, the caller will need to either use pidfd_open (which still might be subject to TOCTOU) or keep the old racy interface base on pid_t. To correct use pifd_spawn, the kernel must support not only returning the pidfd with clone/clone3 but also waitid (P_PIDFD) (added on Linux 5.4). If kernel does not support the waitid, pidfd return ENOSYS. It avoids the need to racy workarounds, such as reading the procfs fdinfo to get the pid to use along with other wait interfaces. These interfaces are similar to the posix_spawn and posix_spawnp, with the only difference being it returns a process file descriptor (int) instead of a process ID (pid_t). Their prototypes are: int pidfd_spawn (int *restrict pidfd, const char *restrict file, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict], char *const envp[restrict]) int pidfd_spawnp (int *restrict pidfd, const char *restrict path, const posix_spawn_file_actions_t *restrict facts, const posix_spawnattr_t *restrict attrp, char *const argv[restrict_arr], char *const envp[restrict_arr]); A new symbol is used instead of a posix_spawn extension to avoid possible issues with language bindings that might track the return argument lifetime. Although on Linux pid_t and int are interchangeable, POSIX only states that pid_t should be a signed integer. Both symbols reuse the posix_spawn posix_spawn_file_actions_t and posix_spawnattr_t, to void rehash posix_spawn API or add a new one. It also means that both interfaces support the same attribute and file actions, and a new flag or file action on posix_spawn is also added automatically for pidfd_spawn. Also, using posix_spawn plumbing allows the reusing of most of the current testing with some changes: - waitid is used instead of waitpid since it is a more generic interface. - tst-posix_spawn-setsid.c is adapted to take into consideration that the caller can check for session id directly. The test now spawns itself and writes the session id as a file instead. - tst-spawn3.c need to know where pidfd_spawn is used so it keeps an extra file description unused. Checked on x86_64-linux-gnu on Linux 4.15 (no CLONE_PIDFD or waitid support), Linux 5.4 (full support), and Linux 6.2. Reviewed-by: Florian Weimer <fweimer@redhat.com>
* linux: Add posix_spawnattr_{get, set}cgroup_np (BZ 26371)Adhemerval Zanella Netto2023-09-0541-3/+414
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | These functions allow to posix_spawn and posix_spawnp to use CLONE_INTO_CGROUP with clone3, allowing the child process to be created in a different cgroup version 2. These are GNU extensions that are available only for Linux, and also only for the architectures that implement clone3 wrapper (HAVE_CLONE3_WRAPPER). To create a process on a different cgroupv2, one can use the: posix_spawnattr_t attr; posix_spawnattr_init (&attr); posix_spawnattr_setflags (&attr, POSIX_SPAWN_SETCGROUP); posix_spawnattr_setcgroup_np (&attr, cgroup); posix_spawn (...) Similar to other posix_spawn flags, POSIX_SPAWN_SETCGROUP control whether the cgroup file descriptor will be used or not with clone3. There is no fallback if either clone3 does not support the flag or if the architecture does not provide the clone3 wrapper, in this case posix_spawn returns EOPNOTSUPP. Checked on x86_64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com>