author | Rich Felker <dalias@aerifal.cx> | 2014-10-10 18:20:33 -0400 |
---|---|---|
committer | Rich Felker <dalias@aerifal.cx> | 2014-10-10 18:21:31 -0400 |
commit | df37d3960abec482e17fad2274a99b790f6cc08b (patch) | |
tree | ff2d8efd637e564dd846a3c96a425ea483246734 /src/thread/pthread_cond_wait.c | |
parent | 867b1822f30a76cb9c8342da29eb28ed75908fa9 (diff) | |
fix missing barrier in pthread_once/call_once shortcut path
these functions need to be fast when the init routine has already run, since they may be called very often from code which depends on global initialization having taken place. as such, a fast path bypassing atomic cas on the once control object was used to avoid heavy memory contention. however, on archs with weakly ordered memory, the fast path failed to ensure that the caller actually observes the side effects of the init routine.

preliminary performance testing showed that simply removing the fast path was not practical; a performance drop of roughly 85x was observed with 20 threads hammering the same once control on a 24-core machine. so the new explicit barrier operation from atomic.h is used to retain the fast path while ensuring memory visibility.

performance may be reduced on some archs where the barrier actually makes a difference, but the previous behavior was unsafe and incorrect on these archs. future improvements to the implementation of a_barrier should reduce the impact.
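The pattern described above can be sketched in portable C11. This is an illustration only, not the code changed by this commit: `sketch_once_t`, `sketch_once`, and `sketch_demo` are hypothetical names, the state values 0/1/2 are an assumption, and `atomic_thread_fence` stands in for musl's internal `a_barrier`. The slow path here simply spins, where a real implementation would block.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical once-control object (assumed states):
 * 0 = not started, 1 = init running, 2 = init finished. */
typedef struct { atomic_int state; } sketch_once_t;

void sketch_once(sketch_once_t *c, void (*init)(void))
{
	/* Fast path: a plain load instead of a cas, so callers do not
	 * contend for write access to the control object. */
	if (atomic_load_explicit(&c->state, memory_order_relaxed) == 2) {
		/* Stand-in for a_barrier(): without this fence, a weakly
		 * ordered arch could let the caller's later reads of
		 * init's data be satisfied with stale values. */
		atomic_thread_fence(memory_order_acquire);
		return;
	}

	/* Slow path: exactly one thread wins the cas and runs init. */
	int expected = 0;
	if (atomic_compare_exchange_strong(&c->state, &expected, 1)) {
		init();
		/* Release store publishes init's side effects. */
		atomic_store_explicit(&c->state, 2, memory_order_release);
		return;
	}

	/* Another thread is running init; wait for it to finish.
	 * (Simplified: spins rather than sleeping.) */
	while (atomic_load_explicit(&c->state, memory_order_acquire) != 2)
		sched_yield();
}

static int init_calls;
static void count_init(void) { init_calls++; }

/* Single-threaded usage: the second call takes the fast path. */
int sketch_demo(void)
{
	sketch_once_t ctl;
	atomic_init(&ctl.state, 0);
	init_calls = 0;
	sketch_once(&ctl, count_init);
	sketch_once(&ctl, count_init);
	assert(atomic_load(&ctl.state) == 2);
	return init_calls;
}
```

The fence costs nothing extra on strongly ordered archs where an acquire fence compiles to no instruction, which matches the commit's note that performance is only affected where the barrier actually makes a difference.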
Diffstat (limited to 'src/thread/pthread_cond_wait.c')
0 files changed, 0 insertions, 0 deletions