author | Sajan Karumanchi <sajan.karumanchi@amd.com> | 2021-02-02 12:42:14 +0100 |
---|---|---|
committer | Florian Weimer <fweimer@redhat.com> | 2021-02-02 12:42:15 +0100 |
commit | 6e02b3e9327b7dbb063958d2b124b64fcb4bbe3f (patch) | |
tree | f5fa119e5c2db62c16cdbaaa01d856da390e607a /sysdeps/x86/cacheinfo.h | |
parent | caa60b79f8c98e97455078542a14b4c750e48ede (diff) | |
x86: Adding an upper bound for Enhanced REP MOVSB.
While optimizing memcpy for AMD machines, we found that vector move operations outperform Enhanced REP MOVSB for data transfers above the L2 cache size on Zen3 architectures. To handle this case, we are adding an upper-bound parameter for Enhanced REP MOVSB: '__x86_rep_movsb_stop_threshold'.

Based on large-bench results, this parameter is set to the L2 cache size on AMD machines, starting with Zen3 architectures that support the ERMS feature. For architectures other than AMD, it is set to the computed value of the non-temporal threshold parameter.

Reviewed-by: Premachandra Mallappa <premachandra.mallappa@amd.com>
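The following is a minimal sketch of how such an upper bound could enter the choice of copy strategy. It is not the glibc implementation (the real selection happens in the assembly memmove variants). The type and function names here (copy_thresholds, pick_stop_threshold, pick_strategy) are hypothetical and exist only for illustration. The sketch assumes REP MOVSB is used for sizes from the existing lower threshold up to, but not including, the new stop threshold.

```c
#include <stddef.h>

/* Hypothetical names for illustration only; not glibc's internal API.  */
enum copy_strategy { COPY_VECTOR, COPY_REP_MOVSB };

struct copy_thresholds
{
  size_t rep_movsb_threshold;       /* Lower bound for REP MOVSB.  */
  size_t rep_movsb_stop_threshold;  /* Upper bound described in the commit.  */
};

/* Assumed configuration rule, following the commit message: the L2 cache
   size on AMD Zen3 (and later) with ERMS, otherwise the computed
   non-temporal threshold.  */
static size_t
pick_stop_threshold (int is_amd_zen3_erms, size_t l2_size,
                     size_t non_temporal_threshold)
{
  return is_amd_zen3_erms ? l2_size : non_temporal_threshold;
}

/* Assumed selection rule: REP MOVSB only inside the half-open range
   [rep_movsb_threshold, rep_movsb_stop_threshold); vector moves
   everywhere else.  */
static enum copy_strategy
pick_strategy (const struct copy_thresholds *t, size_t n)
{
  if (n >= t->rep_movsb_threshold && n < t->rep_movsb_stop_threshold)
    return COPY_REP_MOVSB;
  return COPY_VECTOR;
}
```

With a 512 KiB L2 cache and the default lower threshold of 2048, this sketch would pick REP MOVSB for a 4096-byte copy and fall back to vector moves for a copy of 512 KiB or more.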
Diffstat (limited to 'sysdeps/x86/cacheinfo.h')
-rw-r--r-- | sysdeps/x86/cacheinfo.h | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/sysdeps/x86/cacheinfo.h b/sysdeps/x86/cacheinfo.h
index 68c253542f..0f0ca7c08c 100644
--- a/sysdeps/x86/cacheinfo.h
+++ b/sysdeps/x86/cacheinfo.h
@@ -54,6 +54,9 @@ long int __x86_rep_movsb_threshold attribute_hidden = 2048;
 /* Threshold to use Enhanced REP STOSB. */
 long int __x86_rep_stosb_threshold attribute_hidden = 2048;
 
+/* Threshold to stop using Enhanced REP MOVSB. */
+long int __x86_rep_movsb_stop_threshold attribute_hidden;
+
 static void
 init_cacheinfo (void)
 {
@@ -79,5 +82,6 @@ init_cacheinfo (void)
 
   __x86_rep_movsb_threshold = cpu_features->rep_movsb_threshold;
   __x86_rep_stosb_threshold = cpu_features->rep_stosb_threshold;
+  __x86_rep_movsb_stop_threshold = cpu_features->rep_movsb_stop_threshold;
 }
 #endif