From b116855de71098ef7dd2875dd3237f8f3ecc12c2 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy
Date: Tue, 16 Feb 2021 12:55:13 +0000
Subject: RFC elf: Fix slow tls access after dlopen [BZ #19924]

In short: __tls_get_addr checks the global generation counter,
_dl_update_slotinfo updates up to the generation of the accessed
module. If the global generation is newer than the generation of the
module then __tls_get_addr keeps hitting the slow path that updates
the dtv.

Possible approaches I can see:

1. update to global generation instead of module,
2. check the module generation in the fast path.

This patch is 1.: it needs additional sync (load acquire) so the
slotinfo list is up to date with the observed global generation.

Approach 2. would require walking the slotinfo list at all times.
I don't know how to make that fast with many modules.

Note: in the x86_64 version of dl-tls.c the generation is only loaded
once, since relaxed mo is not faster than acquire mo load.

I have not benchmarked this yet.
---
 elf/dl-close.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'elf/dl-close.c')

diff --git a/elf/dl-close.c b/elf/dl-close.c
index 9f31532f41..45f8a7fe31 100644
--- a/elf/dl-close.c
+++ b/elf/dl-close.c
@@ -780,7 +780,7 @@ _dl_close_worker (struct link_map *map, bool force)
       if (__glibc_unlikely (newgen == 0))
         _dl_fatal_printf ("TLS generation counter wrapped! Please report as described in "REPORT_BUGS_TO".\n");
       /* Can be read concurrently. */
-      atomic_store_relaxed (&GL(dl_tls_generation), newgen);
+      atomic_store_release (&GL(dl_tls_generation), newgen);
 
       if (tls_free_end == GL(dl_tls_static_used))
         GL(dl_tls_static_used) = tls_free_start;
-- 
cgit 1.4.1
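
The release store in the hunk above only helps if the reader pairs it with an acquire load, as the commit message notes. The following is a minimal sketch of that pairing and of approach 1 (catching up to the observed global generation), written with C11 <stdatomic.h> rather than glibc's internal atomic macros. The names slot_gen, global_gen, publish, struct reader and reader_access are hypothetical stand-ins for illustration and are not glibc's data structures or functions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 4

/* Hypothetical stand-in for per-module slotinfo generations.  */
static _Atomic size_t slot_gen[NSLOTS];
/* Hypothetical stand-in for GL(dl_tls_generation).  */
static _Atomic size_t global_gen;

/* Writer side (assumed to be serialized by a lock elsewhere):
   update the slot first, then publish the new generation with
   release ordering so the slot update is visible to any reader
   that observes the new generation via an acquire load.  */
static void
publish (size_t slot, size_t newgen)
{
  assert (slot < NSLOTS);
  atomic_store_explicit (&slot_gen[slot], newgen, memory_order_relaxed);
  atomic_store_explicit (&global_gen, newgen, memory_order_release);
}

/* Hypothetical per-thread state, loosely modelled on a dtv.  */
struct reader
{
  size_t dtv_gen;          /* generation this reader is updated to */
  unsigned slow_path_hits; /* how often the slow path was taken */
};

/* Reader side: the acquire load pairs with the release store in
   publish.  On a mismatch the reader catches up to the observed
   global generation (approach 1), so the next access takes the
   fast path again.  */
static void
reader_access (struct reader *r)
{
  size_t gen = atomic_load_explicit (&global_gen, memory_order_acquire);
  if (r->dtv_gen == gen)
    return;                /* fast path */
  r->slow_path_hits++;     /* slow path: a real dtv update would go here */
  r->dtv_gen = gen;
}
```

If the reader instead caught up only to the generation of the module it accessed, dtv_gen could stay below global_gen and every later call would take the slow path, which is the behaviour described in BZ #19924.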