Commit message | Author | Age | Files | Lines
* brx_IN locale: Fix yesexpr and noexpr | Mike FABIAN | 2017-10-25 | 2 | -4/+8
    * localedata/locales/brx_IN (LC_MESSAGES): Fix yesexpr and noexpr
    (Use first letters of yesstr and nostr correctly instead of using
    full words).
* ta_IN locale: Fix yesexpr and noexpr | Mike FABIAN | 2017-10-25 | 2 | -2/+7
    * localedata/locales/ta_IN (LC_MESSAGES): Fix yesexpr and noexpr
    (Use first letters of yesstr and nostr correctly).
* hi_IN, kn_IN, ks_IN@devanagari locales: In yesexpr and noexpr, also check for the first characters of yesstr and nostr | Mike FABIAN | 2017-10-25 | 4 | -6/+13
    * localedata/locales/hi_IN (LC_MESSAGES): In yesexpr and noexpr,
    also check for the first characters of yesstr and nostr.
    * localedata/locales/kn_IN (LC_MESSAGES): Likewise.
    * localedata/locales/ks_IN@devanagari (LC_MESSAGES): Likewise.
* cmn_TW locale: Improve yesexpr and noexpr | Mike FABIAN | 2017-10-25 | 2 | -4/+7
    * localedata/locales/cmn_TW (LC_MESSAGES): In yesexpr and noexpr,
    also check for Chinese characters.
* chr_US locale: Fix yesexpr and noexpr | Mike FABIAN | 2017-10-25 | 2 | -2/+9
    * localedata/locales/chr_US (LC_MESSAGES): In yesexpr and noexpr,
    match also for the contents of yesstr and nostr. As the first
    letter of yesstr and nostr is equal, checking only for the first
    letter is not enough.
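The chr_US case can be illustrated with a small sketch using the POSIX regex API. The words "sure" and "skip" and the four patterns are hypothetical stand-ins for a yesstr and nostr that share a first letter; they are not the real chr_US data, and `response_matches` is a helper invented for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if RESPONSE matches the POSIX extended regular
   expression PATTERN (the form used by yesexpr and noexpr).  */
static bool
response_matches (const char *pattern, const char *response)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);  /* The illustrative patterns below are valid.  */
  bool matched = regexec (&re, response, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* Hypothetical patterns: "sure" and "skip" stand in for a yesstr and
   nostr that begin with the same letter.  */
static const char first_letter_yes[] = "^[+1sS]";
static const char first_letter_no[] = "^[-0sS]";
static const char full_word_yes[] = "^([+1]|sure)";
static const char full_word_no[] = "^([-0]|skip)";
```

With the first-letter patterns, the answer "skip" matches both the yes and the no expression, so the response is ambiguous; matching the full words removes the overlap.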
* ber_DZ locale: Use copy "kab_DZ" in LC_MESSAGES. | Mike FABIAN | 2017-10-25 | 2 | -5/+7
    * localedata/locales/ber_DZ (LC_MESSAGES): Use copy "kab_DZ", it is
    the same according to Belkacem Mohammed <belkacem77@gmail.com>.
* kab_DZ locale: Add e-mail of main contributor | Mike FABIAN | 2017-10-25 | 2 | -1/+7
    * localedata/locales/kab_DZ (LC_IDENTIFICATION): Add e-mail of main
    contributor.
* zh_SG locale: Use copy "zh_CN" in LC_MESSAGES instead of English | Mike FABIAN | 2017-10-25 | 2 | -4/+6
    * localedata/locales/zh_SG (LC_MESSAGES): Use copy "zh_CN" instead
    of using English.
* ug_CN locale: Fix noexpr and yesexpr | Mike FABIAN | 2017-10-25 | 2 | -2/+8
    * localedata/locales/ug_CN (LC_MESSAGES): Fix noexpr and yesexpr by
    including the first letters of nostr and yesstr in the regexp.
    Also make it more readable by using ASCII where possible.
* te_IN locale: Fix noexpr | Mike FABIAN | 2017-10-25 | 2 | -2/+8
    * localedata/locales/te_IN (LC_MESSAGES): Fix noexpr by including
    the first letter of nostr in the regexp. It agrees with CLDR now.
    Also make it more readable by using ASCII where possible.
* km_KH locale: Fix yesstr and nostr. | Mike FABIAN | 2017-10-25 | 2 | -2/+10
    * localedata/locales/km_KH (LC_MESSAGES): Fix yesstr and nostr.

    The yesstr and nostr apparently came from CLDR. And CLDR has a bug
    there: these strings contain a U+17D6 (which somewhat looks like a
    colon) instead of a real colon to separate the full words for “yes”
    and “no” from the single letter responses.
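A short sketch of why the look-alike character matters: code that splits such a string at an ASCII colon finds no separator when U+17D6 is used instead. The strings below are placeholders, not the real km_KH data, and `has_ascii_colon` is a helper made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if S contains an ASCII colon (U+003A), the separator
   expected between the full word and the single-letter response.  */
static bool
has_ascii_colon (const char *s)
{
  return strchr (s, ':') != NULL;
}

/* Placeholder strings, not the real km_KH data.  U+17D6 is encoded in
   UTF-8 as the three bytes E1 9F 96, none of which is 0x3A.  */
static const char with_colon[] = "word:w";
static const char with_u17d6[] = "word\xe1\x9f\x96" "w";
```

The same reasoning applies to any other character that merely resembles a colon, because its UTF-8 encoding never contains the byte 0x3A.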
* ka_GE locale: Fix yesexpr to make it agree with CLDR. | Mike FABIAN | 2017-10-25 | 2 | -2/+8
    * localedata/locales/ka_GE (LC_MESSAGES): Fix yesexpr to make it
    agree with CLDR (include the first letter of yesstr). Also make it
    more readable by using ASCII where possible.
* mr_IN locale: Fix yesstr and nostr and improve yesexpr and noexpr. | Mike FABIAN | 2017-10-25 | 2 | -6/+13
    * localedata/locales/mr_IN (LC_MESSAGES): Fix yesstr and nostr and
    improve yesexpr and noexpr.

    The yesstr and nostr apparently came from CLDR. And CLDR has a bug
    there: these strings contain a U+0903 (which looks like a colon)
    instead of a real colon to separate the full words for “yes” and
    “no” from the single letter responses.
* bn_BD locale: Use only the first letters of the full yesstr and nostr in yesexpr and noexpr | Mike FABIAN | 2017-10-25 | 2 | -2/+7
    Using all characters of the full words for yes and no in yesexpr
    and noexpr makes no sense here, especially because the words for
    yes and no share one character.

    * localedata/locales/bn_BD (LC_MESSAGES): Use only the first
    letters of the full yesstr and nostr in yesexpr and noexpr.
* Add yesstr, nostr, lang_term, lang_lib to an_ES locale | Mike FABIAN | 2017-10-25 | 2 | -46/+51
    * localedata/locales/an_ES (LC_MESSAGES): Add yesstr and nostr.
    * localedata/locales/an_ES (LC_ADDRESS): Add lang_term and lang_lib.
    * localedata/locales/an_ES: Make source more readable by using
    ASCII where possible.
* Add new locale yuw_PG [BZ #20952] | Mike FABIAN | 2017-10-25 | 4 | -0/+162
    [BZ #20952]
    * localedata/locales/yuw_PG: New file.
    * localedata/SUPPORTED: Add yuw_PG/UTF-8.
    * locale/iso-639.def: Add Yau (Uruwa).
* Add single-threaded path to _int_malloc | Wilco Dijkstra | 2017-10-24 | 2 | -25/+42
    This patch adds single-threaded fast paths to _int_malloc.

    * malloc/malloc.c (_int_malloc): Add SINGLE_THREAD_P path.
* Add single-threaded path to malloc/realloc/calloc/memalign | Wilco Dijkstra | 2017-10-24 | 2 | -9/+48
    This patch adds a single-threaded fast path to malloc, realloc,
    calloc and memalign. When we're single-threaded, we can bypass
    arena_get (which always locks the arena it returns) and just use
    the main arena. Also avoid retrying a different arena since there
    is just the main arena.

    * malloc/malloc.c (__libc_malloc): Add SINGLE_THREAD_P path.
    (__libc_realloc): Likewise.
    (_mid_memalign): Likewise.
    (__libc_calloc): Likewise.
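The idea of skipping the lock when only one thread exists can be sketched with a toy bump allocator. This is not glibc's code: `single_threaded` stands in for SINGLE_THREAD_P, the static `arena` stands in for the main arena, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: glibc uses SINGLE_THREAD_P and real arenas.  */
static bool single_threaded = true;
static atomic_flag arena_lock = ATOMIC_FLAG_INIT;
static unsigned long lock_acquisitions;

static _Alignas (max_align_t) unsigned char arena[4096];
static size_t arena_used;

/* Carve SIZE bytes out of the arena.  The caller must hold the lock
   or be the only thread.  */
static void *
bump_alloc_unlocked (size_t size)
{
  size_t align = _Alignof (max_align_t);
  size_t rounded = (size + align - 1) / align * align;
  if (rounded < size || rounded > sizeof arena - arena_used)
    return NULL;
  void *p = arena + arena_used;
  arena_used += rounded;
  return p;
}

static void *
toy_malloc (size_t size)
{
  if (single_threaded)
    return bump_alloc_unlocked (size);   /* Fast path: no locking.  */

  while (atomic_flag_test_and_set_explicit (&arena_lock,
                                            memory_order_acquire))
    ;                                     /* Spin until the lock is free.  */
  lock_acquisitions++;
  void *p = bump_alloc_unlocked (size);
  atomic_flag_clear_explicit (&arena_lock, memory_order_release);
  return p;
}
```

The fast path is only safe while the single-threaded condition really holds; in this sketch the flag is a plain variable that the caller flips, whereas a real implementation has to track thread creation itself.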
* Fixes for tpi_PG locale | Mike FABIAN | 2017-10-24 | 2 | -92/+75
    * localedata/locales/tpi_PG (LC_MESSAGES): Fix yesexpr and noexpr
    by adding the generic +1 and -0 as in all other locales.
    * localedata/locales/tpi_PG (LC_TIME): Fix some typos in the month
    and day names and make it more readable by using ASCII where
    possible.
* Update x86 fix-fp-int-compare-invalid.h for GCC 8. | Joseph Myers | 2017-10-24 | 2 | -2/+11
    The glibc implementation of iseqsig relies on ordered comparison
    operators raising the "invalid" exception for quiet NaN operands,
    with a workaround on platforms where a GCC bug means that exception
    is not raised. For x86, that bug has now been fixed for GCC 8, so
    this patch disables the workaround in that case. If and when the
    corresponding bugs for powerpc and s390 are fixed, the headers for
    those platforms should of course be updated similarly.

    Tested for x86_64 and x86, including with GCC mainline. Note that
    other failures appear with GCC mainline because of spurious use of
    ordered comparison instructions for unordered operations
    <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82692>.

    * sysdeps/x86/fpu/fix-fp-int-compare-invalid.h
    (FIX_COMPARE_INVALID): Define to 0 if [__GNUC_PREREQ (8, 0)].
* posix: Do not use WNOHANG in waitpid call for Linux posix_spawn | Adhemerval Zanella | 2017-10-23 | 2 | -5/+10
    As shown in some buildbot issues on aarch64 and powerpc, calling
    clone (VFORK) and waitpid (WNOHANG) does not guarantee the child is
    ready to be collected. This patch changes the waitpid options back
    to 0, as they were before the fe05e1cb6d64 fix.

    This change can lead to scenario 4.3 described in that commit,
    where the waitpid call can hang indefinitely. However, this is also
    a very unlikely and undefined situation, where the caller is trying
    to terminate a pid before posix_spawn returns and the pid reuse
    race is triggered as well. I don't see how to correctly handle this
    specific situation within posix_spawn.

    Checked on x86_64-linux-gnu, aarch64-linux-gnu and
    powerpc64-linux-gnu.

    * sysdeps/unix/sysv/linux/spawni.c (__spawnix): Use 0 instead of
    WNOHANG in waitpid call.
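For readers less familiar with the two waitpid modes, here is a caller-side sketch (not the code inside spawni.c) that spawns a child and collects it with a blocking waitpid, that is, with options set to 0. The helper name `run_shell_command` and the use of `sh -c` are assumptions made for this example.

```c
#include <assert.h>
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Run "sh -c COMMAND" and return its exit status, or -1 on spawn
   failure or abnormal termination.  */
static int
run_shell_command (const char *command)
{
  char *const argv[] = { "sh", "-c", (char *) command, NULL };
  pid_t pid;

  int err = posix_spawnp (&pid, "sh", NULL, NULL, argv, environ);
  if (err != 0)
    return -1;

  int status;
  pid_t r;
  do
    r = waitpid (pid, &status, 0);   /* 0: block until the child exits.  */
  while (r == -1 && errno == EINTR);

  if (r != pid || !WIFEXITED (status))
    return -1;
  return WEXITSTATUS (status);
}
```

With WNOHANG in place of 0, waitpid may return 0 immediately when the child has not yet changed state, which is the "not ready to be collected" situation the commit message describes.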
* aarch64: Document _SC_LEVEL1_DCACHE_LINESIZE caveat | Siddhesh Poyarekar | 2017-10-23 | 2 | -0/+15
    The _SC_LEVEL1_DCACHE_LINESIZE is reported using the contents of
    the ctr_el0 register, which tells us the minimum observable cache
    line size by userspace. This typically is the same as the L1 cache
    line size, but that may not always be true. It could be a higher
    level cache line size as long as cache cleaning and invalidation
    work correctly with that line size in userspace. The falkor core
    for example reports the L2 line size as the dcache line size in
    CTR_EL0 while also reporting the correct L1 dcache line size via
    CCSIDR_EL1.

    * manual/conf.texi (_SC_LEVEL1_DCACHE_LINESIZE,
    _SC_LEVEL1_ICACHE_LINESIZE): Document aarch64 caveat.

    Reviewed-by: Rical Jasan <ricaljasan@pacific.net>
    Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
* Document cache information sysconf variables | Siddhesh Poyarekar | 2017-10-23 | 2 | -0/+73
    Write short descriptions for each of the cache information sysconf
    variables.

    * manual/conf.texi (_SC_LEVEL1_ICACHE_SIZE,
    _SC_LEVEL1_ICACHE_ASSOC, _SC_LEVEL1_ICACHE_LINESIZE,
    _SC_LEVEL1_DCACHE_SIZE, _SC_LEVEL1_DCACHE_ASSOC,
    _SC_LEVEL1_DCACHE_LINESIZE, _SC_LEVEL2_CACHE_SIZE,
    _SC_LEVEL2_CACHE_ASSOC, _SC_LEVEL2_CACHE_LINESIZE,
    _SC_LEVEL3_CACHE_SIZE, _SC_LEVEL3_CACHE_ASSOC,
    _SC_LEVEL3_CACHE_LINESIZE, _SC_LEVEL4_CACHE_SIZE,
    _SC_LEVEL4_CACHE_ASSOC, _SC_LEVEL4_CACHE_LINESIZE): New variables.

    Reviewed-by: Rical Jasan <ricaljasan@pacific.net>
* aarch64: Add missing math Makefile for recent commit | Szabolcs Nagy | 2017-10-23 | 2 | -1/+10
    Without -fno-math-errno, the builtins just do a call instead of
    inlining a single instruction.
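A minimal sketch of querying this value from user code follows. The helper `dcache_line_size` and its fallback parameter are assumptions for illustration; the guard covers platforms where the glibc-specific name is not defined.

```c
#include <assert.h>
#include <unistd.h>

/* Return the reported L1 data cache line size in bytes, or FALLBACK
   if the value is unavailable (sysconf returned -1 or 0, or the name
   is not defined on this platform).  */
static long
dcache_line_size (long fallback)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
  long n = sysconf (_SC_LEVEL1_DCACHE_LINESIZE);
  if (n > 0)
    return n;
#endif
  return fallback;
}
```

Given the caveat above, code that needs the size for padding against false sharing on aarch64 should treat the result as the smallest line size visible to userspace rather than as a guaranteed property of the L1 cache.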
* aarch64: Implement math acceleration via builtins | Michael Collison | 2017-10-23 | 31 | -288/+309
    This patch converts asm statements into builtins for AArch64. As
    an example for the file sysdeps/aarch64/fpu/s_ceil.c, we convert
    the function from

        double
        __ceil (double x)
        {
          double result;
          asm ("frintp\t%d0, %d1" :
               "=w" (result) : "w" (x) );
          return result;
        }

    into

        double
        __ceil (double x)
        {
          return __builtin_ceil (x);
        }

    Tested on aarch64-linux-gnu with gcc-4.9.4 and gcc-6.

    * sysdeps/aarch64/fpu/e_sqrt.c (ieee754_sqrt): Replace asm statements with __builtin_sqrt.
    * sysdeps/aarch64/fpu/e_sqrtf.c (ieee754_sqrtf): Replace asm statements with __builtin_sqrtf.
    * sysdeps/aarch64/fpu/s_ceil.c (__ceil): Replace asm statements with __builtin_ceil.
    * sysdeps/aarch64/fpu/s_ceilf.c (__ceilf): Replace asm statements with __builtin_ceilf.
    * sysdeps/aarch64/fpu/s_floor.c (__floor): Replace asm statements with __builtin_floor.
    * sysdeps/aarch64/fpu/s_floorf.c (__floorf): Replace asm statements with __builtin_floorf.
    * sysdeps/aarch64/fpu/s_fma.c (__fma): Replace asm statements with __builtin_fma.
    * sysdeps/aarch64/fpu/s_fmaf.c (__fmaf): Replace asm statements with __builtin_fmaf.
    * sysdeps/aarch64/fpu/s_fmax.c (__fmax): Replace asm statements with __builtin_fmax.
    * sysdeps/aarch64/fpu/s_fmaxf.c (__fmaxf): Replace asm statements with __builtin_fmaxf.
    * sysdeps/aarch64/fpu/s_fmin.c (__fmin): Replace asm statements with __builtin_fmin.
    * sysdeps/aarch64/fpu/s_fminf.c (__fminf): Replace asm statements with __builtin_fminf.
    * sysdeps/aarch64/fpu/s_frint.c: Delete file.
    * sysdeps/aarch64/fpu/s_frintf.c: Delete file.
    * sysdeps/aarch64/fpu/s_llrint.c (__llrint): Replace asm statements with builtin_rint and conversion to int.
    * sysdeps/aarch64/fpu/s_llrintf.c (__llrintf): Likewise.
    * sysdeps/aarch64/fpu/s_llround.c (__llround): Replace asm statements with builtin_llround.
    * sysdeps/aarch64/fpu/s_llroundf.c (__llroundf): Likewise.
    * sysdeps/aarch64/fpu/s_lrint.c (__lrint): Replace asm statements with builtin_rint and conversion to long int.
    * sysdeps/aarch64/fpu/s_lrintf.c (__lrintf): Likewise.
    * sysdeps/aarch64/fpu/s_lround.c (__lround): Replace asm statements with builtin_lround.
    * sysdeps/aarch64/fpu/s_lroundf.c (__lroundf): Replace asm statements with builtin_lroundf.
    * sysdeps/aarch64/fpu/s_nearbyint.c (__nearbyint): Replace asm statements with __builtin_nearbyint.
    * sysdeps/aarch64/fpu/s_nearbyintf.c (__nearbyintf): Replace asm statements with __builtin_nearbyintf.
    * sysdeps/aarch64/fpu/s_rint.c (__rint): Replace asm statements with __builtin_rint.
    * sysdeps/aarch64/fpu/s_rintf.c (__rintf): Replace asm statements with __builtin_rintf.
    * sysdeps/aarch64/fpu/s_round.c (__round): Replace asm statements with __builtin_round.
    * sysdeps/aarch64/fpu/s_roundf.c (__roundf): Replace asm statements with __builtin_roundf.
    * sysdeps/aarch64/fpu/s_trunc.c (__trunc): Replace asm statements with __builtin_trunc.
    * sysdeps/aarch64/fpu/s_truncf.c (__truncf): Replace asm statements with __builtin_truncf.
    * sysdeps/aarch64/fpu/Makefile: Build e_sqrt[f].c with -fno-math-errno.
* PowerPC64 power8 strncpy cfi fixes | Alan Modra | 2017-10-23 | 2 | -13/+19
    cfi info for stack adjust needs to be on the insn doing the adjust.
    cfi describing register saves can be anywhere after the save insn
    but before the reg is altered. Fewer locations with cfi result in
    smaller cfi programs and possibly slightly faster exception
    handling. Thus the LR cfi_offset move.

    The idea behind adjusting sp after restoring regs is to break a
    register dependency chain, in this case by not using r1 immediately
    after it is modified.

    The missing LR cfi_restore meant that code after the blr,
    unaligned_lt_16 and other labels, would have cfi that said LR was
    at cfa+16, but that code is reached without LR being saved.

    * sysdeps/powerpc/powerpc64/power8/strncpy.S: Move LR cfi. Adjust
    stack after restoring regs. Add missing LR cfi_restore.

    Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
* PowerPC64 power7 strncpy stack handling and cfi | Alan Modra | 2017-10-23 | 2 | -10/+19
    This patch moves the frame setup and teardown to immediately around
    the single memset call, as has been done for power8. I've also
    decreased FRAMESIZE to that needed to save the two callee-saved
    registers used. Plus added cfi.

    * sysdeps/powerpc/powerpc64/power7/strncpy.S: Decrease FRAMESIZE.
    Move LR save and frame setup/teardown and LR restore to immediately
    around memset call. Provide cfi.

    Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
* i386: Replace assembly versions of e_powf with generic e_powf.c | H.J. Lu | 2017-10-22 | 9 | -401/+79
    This patch replaces i386 assembly versions of e_powf with generic
    e_powf.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  230.855   78.3358   194%
        latency                231.685   94.1259   146%

    On Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  239.858   47.4713   405%
        latency                247.57    93.8798   163%

    On IvyBridge with --disable-multi-arch, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  269.078   63.3758   324%
        latency                271.473   102.091   165%

    * sysdeps/i386/fpu/e_powf.S: Removed.
    * sysdeps/i386/fpu/e_powf_log2_data.c: Likewise.
    * sysdeps/i386/fpu/w_powf.c: Likewise.
    * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_powf.c.
    * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_powf-sse2.
    (CFLAGS-e_powf-sse2.c): New.
    * sysdeps/i386/i686/fpu/multiarch/e_powf-sse2.c: New file.
    * sysdeps/i386/i686/fpu/multiarch/e_powf.c: Likewise.
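The commit message does not say how the "Improvement" column is derived. The figures appear consistent with (Before / After - 1) expressed as a percentage and truncated to a whole number; the sketch below encodes that inferred formula, and `improvement_percent` is a name chosen for this example.

```c
#include <assert.h>

/* Relative speedup in percent: how much larger BEFORE is than AFTER.
   This formula is inferred from the table, not stated in the commit.  */
static double
improvement_percent (double before, double after)
{
  assert (after > 0.0);
  return (before / after - 1.0) * 100.0;
}
```

Under that reading, an improvement of 100% means the measured time was halved.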
* i386: Replace assembly versions of e_log2f with generic e_log2f.c | H.J. Lu | 2017-10-22 | 9 | -72/+62
    This patch replaces i386 assembly versions of e_log2f with generic
    e_log2f.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  92.3845   30.8752   199%
        latency                112.855   54.8645   105%

    On Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  98.7488   22.7507   334%
        latency                118.01    51.6083   128%

    On IvyBridge with --disable-multi-arch, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  106.635   28.8596   269%
        latency                129.888   56.9187   128%

    * sysdeps/i386/fpu/e_log2f.S: Removed.
    * sysdeps/i386/fpu/e_log2f_data.c: Likewise.
    * sysdeps/i386/fpu/w_log2f.c: Likewise.
    * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_log2f.c.
    * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_log2f-sse2.
    (CFLAGS-e_log2f-sse2.c): New.
    * sysdeps/i386/i686/fpu/multiarch/e_log2f-sse2.c: New file.
    * sysdeps/i386/i686/fpu/multiarch/e_log2f.c: Likewise.
* x86-64: Add powf with FMA | H.J. Lu | 2017-10-22 | 4 | -1/+57
    For workload-spec2017.wrf, on Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  35.4713   27.3842   29%
        latency                82.4537   66.3175   24%

    * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_powf-fma.
    (CFLAGS-e_powf-fma.c): New.
    * sysdeps/x86_64/fpu/multiarch/e_powf-fma.c: New file.
    * sysdeps/x86_64/fpu/multiarch/e_powf.c: Likewise.
* x86-64: Add log2f with FMA | H.J. Lu | 2017-10-22 | 4 | -1/+53
    For workload-spec2017.wrf, on Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  16.5937   14.0789   17%
        latency                41.7755   35.3586   18%

    * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_log2f-fma.
    (CFLAGS-e_log2f-fma.c): New.
    * sysdeps/x86_64/fpu/multiarch/e_log2f-fma.c: New file.
    * sysdeps/x86_64/fpu/multiarch/e_log2f.c: Likewise.
* x86-64: Add logf with FMA | H.J. Lu | 2017-10-22 | 4 | -1/+53
    For workload-spec2017.wrf, on Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  16.1534   13.8874   16%
        latency                41.9642   34.3072   22%

    * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-fma.
    (CFLAGS-e_logf-fma.c): New.
    * sysdeps/x86_64/fpu/multiarch/e_logf-fma.c: New file.
    * sysdeps/x86_64/fpu/multiarch/e_logf.c: Likewise.
* i386: Replace assembly versions of e_logf with generic e_logf.c | H.J. Lu | 2017-10-22 | 10 | -143/+76
    This patch replaces i386 assembly versions of e_logf with generic
    e_logf.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  73.3865   40.0454   83%
        latency                90.0985   54.4479   65%

    On Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  75.1384   22.1452   239%
        latency                91.9441   50.7925   81%

    On IvyBridge with --disable-multi-arch, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  84.5575   28.7879   193%
        latency                103.971   57.5231   80%

    * sysdeps/i386/fpu/e_logf.S: Removed.
    * sysdeps/i386/fpu/e_logf_data.c: Likewise.
    * sysdeps/i386/fpu/w_logf.c: Likewise.
    * sysdeps/i386/i686/fpu/e_logf.S: Likewise.
    * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_logf.c.
    * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_logf-sse2.
    (CFLAGS-e_logf-sse2.c): New.
    * sysdeps/i386/i686/fpu/multiarch/e_logf-sse2.c: New file.
    * sysdeps/i386/i686/fpu/multiarch/e_logf.c: Likewise.
* i386: Replace assembly versions of e_exp2f with generic e_exp2f.c | H.J. Lu | 2017-10-22 | 8 | -54/+58
    This patch replaces i386 assembly versions of e_exp2f with generic
    e_exp2f.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  112.996   40.0454   182%
        latency                126.581   54.4479   132%

    On Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  113.14    39.447    186%
        latency                136.068   55.684    144%

    On IvyBridge with --disable-multi-arch, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  132.521   40.3759   228%
        latency                145.791   58.4587   149%

    * sysdeps/i386/fpu/e_exp2f.S: Removed.
    * sysdeps/i386/fpu/w_exp2f.c: Likewise.
    * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_exp2f.c.
    * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-sse2.
    (CFLAGS-e_exp2f-sse2.c): New.
    * sysdeps/i386/i686/fpu/multiarch/e_exp2f-sse2.c: New file.
    * sysdeps/i386/i686/fpu/multiarch/e_exp2f.c: Likewise.
* x86-64: Add exp2f with FMA | H.J. Lu | 2017-10-22 | 4 | -1/+50
    For workload-spec2017.wrf, on Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  13.0291   11.2225   16%
        latency                44.5154   37.5766   18%

    * sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add e_exp2f-fma.
    (CFLAGS-e_exp2f-fma.c): New.
    * sysdeps/x86_64/fpu/multiarch/e_exp2f-fma.c: New file.
    * sysdeps/x86_64/fpu/multiarch/e_exp2f.c: Likewise.
* i386: Replace assembly versions of e_expf with generic e_expf.c | H.J. Lu | 2017-10-22 | 13 | -442/+52
    This patch replaces i386 assembly versions of e_expf with generic
    e_expf.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  55.5724   40.2664   38%
        latency                80.0687   60.8517   31%

    On Skylake, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  62.4056   39.4188   58%
        latency                85.5496   59.6377   43%

    On IvyBridge with --disable-multi-arch, it improves performance by:

                               Before    After     Improvement
        reciprocal-throughput  133.707   40.3778   231%
        latency                149.191   63.2515   135%

    * sysdeps/i386/fpu/e_exp2f_data.c: Removed.
    * sysdeps/i386/fpu/e_expf.S: Likewise.
    * sysdeps/i386/fpu/math_errf.c: Likewise.
    * sysdeps/i386/fpu/w_expf.c: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/e_expf-ia32.S: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.S: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/w_expf.c: Likewise.
    * sysdeps/i386/fpu/libm-test-ulps: Updated for generic e_expf.c.
    * sysdeps/i386/i686/fpu/multiarch/libm-test-ulps: Likewise.
    * sysdeps/i386/i686/fpu/multiarch/Makefile (libm-sysdep_routines): Remove e_expf-ia32.
    (CFLAGS-e_expf-sse2.c): New.
    * sysdeps/i386/i686/fpu/multiarch/e_expf-sse2.c: New file.
    * sysdeps/i386/i686/fpu/multiarch/e_expf.c: Rewritten.
* x86-64: Replace assembly versions of e_expf with generic e_expf.c | H.J. Lu | 2017-10-22 | 8 | -529/+51
    This patch replaces x86-64 assembly versions of e_expf with generic
    e_expf.c. For workload-spec2017.wrf, on Nehalem, it improves
    performance by:

                               Before    After     Improvement
        reciprocal-throughput  36.039    20.7749   73%
        latency                58.8096   40.8715   43%

    On Skylake, it improves

                               Before    After     Improvement
        reciprocal-throughput  18.4436   11.1693   65%
        latency                47.5162   37.5411   26%

    * sysdeps/x86_64/fpu/e_expf.S: Removed.
    * sysdeps/x86_64/fpu/multiarch/e_expf-fma.S: Likewise.
    * sysdeps/x86_64/fpu/w_expf.c: Likewise.
    * sysdeps/x86_64/fpu/libm-test-ulps: Updated for generic e_expf.c.
    * sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_expf-fma.c): New.
    * sysdeps/x86_64/fpu/multiarch/e_expf-fma.c: New file.
    * sysdeps/x86_64/fpu/multiarch/e_expf.c (__redirect_ieee754_expf): Renamed to ...
    (__redirect_expf): This.
    (SYMBOL_NAME): Changed to expf.
    (__ieee754_expf): Renamed to ...
    (__expf): This.
    (__GI___expf): This.
    (__ieee754_expf): Add strong_alias.
    (__expf_finite): Likewise.
    (__expf): New.
    Include <sysdeps/ieee754/flt-32/e_expf.c>.
* glob: Fix buffer overflow during GLOB_TILDE unescaping [BZ #22332] | Paul Eggert | 2017-10-22 | 3 | -2/+12
* Update NEWS and ChangeLog for CVE-2017-15671 | Florian Weimer | 2017-10-22 | 2 | -0/+6
* glob: Add new test tst-glob-tilde | Florian Weimer | 2017-10-21 | 3 | -2/+154
    The new test checks for memory leaks (see bug 22325) and attempts
    to trigger the buffer overflow in bug 22320.
* Add bits/floatn.h defines for more _FloatN / _FloatNx types. | Joseph Myers | 2017-10-20 | 9 | -1/+316
    The bits/floatn.h header currently only has defines relating to
    _Float128. This patch adds defines relating to other _FloatN /
    _FloatNx types.

    The approach taken is to add defines for all _FloatN / _FloatNx
    types known to GCC, and to put them in a common
    bits/floatn-common.h header included at the end of all the
    individual bits/floatn.h headers. If in future some defines become
    different for different glibc configurations, they will move out
    into the separate bits/floatn.h headers.

    Some defines are expected always to be the same across glibc ports.
    Corresponding defines are nevertheless put in this header. The
    intent is that where there are conditionals (in headers or in
    non-installed files) that can just repeat the same or nearly the
    same logic for each floating-point type, they should do so, even if
    in fact the cases for some types could be unconditionally present
    or absent because the same conditionals are true or false for all
    glibc configurations. This should make the glibc code with such
    conditionals easier to read, because the reader can just see that
    the same conditionals are repeated for each type, rather than
    seeing different conditionals for different types and needing to
    reason, at each location with such differences, why those
    differences are indeed correct there. (Cases involving per-format
    rather than per-type logic are more likely still to need
    differences in how they handle different types.)

    Having such defines and conditionals also helps in incremental
    preparation for adding _Float32 / _Float64 / _Float32x / _Float64x
    function aliases. I intend subsequent patches to add such
    conditionals corresponding to those already present for _Float128,
    as well as making more architecture-specific function
    implementations use common macros to define aliases in preparation
    for adding such _FloatN / _FloatNx aliases.

    Tested for x86_64.

    * bits/floatn-common.h: New file.
    * math/Makefile (headers): Add bits/floatn-common.h.
    * bits/floatn.h: Include <bits/floatn-common.h>.
    * sysdeps/ia64/bits/floatn.h: Likewise.
    * sysdeps/ieee754/ldbl-128/bits/floatn.h: Likewise.
    * sysdeps/mips/ieee754/bits/floatn.h: Likewise.
    * sysdeps/powerpc/bits/floatn.h: Likewise.
    * sysdeps/x86/bits/floatn.h: Likewise.
* Avoid build multiarch if compiler warns about mismatched alias | Adhemerval Zanella | 2017-10-20 | 3 | -6/+80
    GCC 8 emits a warning for aliases between functions with
    incompatible types, and such aliases are used extensively in ifunc
    resolver implementations in C (for instance with weak_alias, from
    the internal symbol name to the external one, or with
    libc_hidden_def to set the ifunc for internal usage).

    This breaks the build when the ifunc resolver is not defined using
    GCC attribute extensions (HAVE_GCC_IFUNC being 0). Although this
    compiler option is enabled by default for all architectures that
    currently have multiarch support, the user might still try to build
    glibc with a compiler that lacks support for the extension. In this
    case this patch just disables the multiarch folder in the sysdeps
    selection.

    GCC 7 and earlier still build IFUNCs regardless of compiler support
    (although without attribute support the debug information would not
    be optimal).

    Checked with a build on architectures with multiarch support
    (aarch64, arm, sparc, s390, powerpc, x86_64, i386) with multiarch
    enabled and disabled, and with GCC 7 and GCC 8.

    * configure.ac (libc_cv_gcc_incompatbile_alias): New define:
    indicates whether compiler emits a warning for alias for functions
    with incompatible types.
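For context, here is a small sketch of the GCC alias attribute itself, outside glibc's macros. It assumes GCC or a compatible compiler on an ELF target; `internal_add` and `public_add` are hypothetical names, and the alias is declared with the same type as its target, which is the case that does not trigger the warning discussed above.

```c
#include <assert.h>

/* Implementation under an internal name (hypothetical).  */
int
internal_add (int a, int b)
{
  return a + b;
}

/* Second name for the same function.  The declared type matches the
   target, so no alias-mismatch warning is expected.  Declaring the
   alias with a different type, for example
     extern long public_add (void) __attribute__ ((alias ("internal_add")));
   is the kind of mismatch the commit message refers to.  */
extern int public_add (int a, int b)
  __attribute__ ((alias ("internal_add")));
```

Calling `public_add (2, 3)` runs the body of `internal_add`, since both names refer to the same symbol.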
* posix: Fix improper assert in Linux posix_spawn (BZ#22273) | Adhemerval Zanella | 2017-10-20 | 2 | -6/+25
    As noted by Florian Weimer, the current Linux posix_spawn
    implementation can trigger an assert if the auxiliary process is
    terminated before actually setting the err member:

        /* Child must set args.err to something non-negative - we rely on
           the parent and child sharing VM.  */
        args.err = -1;
        [...]
        new_pid = CLONE (__spawni_child, STACK (stack, stack_size), stack_size,
                         CLONE_VM | CLONE_VFORK | SIGCHLD, &args);

        if (new_pid > 0)
          {
            ec = args.err;
            assert (ec >= 0);

    Another possible issue is killing the child between setting the err
    and actually calling execve. In this case the process will not run,
    but posix_spawn also will not report any error:

        args->err = 0;
        args->exec (args->file, args->argv, args->envp);

    As suggested by Andreas Schwab, this patch removes the faulty
    assert and also handles any signal that happens before fork and
    execve as if the spawn was successful (thus leaving it to the
    caller to figure this out). Unlike Florian, I cannot see why using
    atomics to set err would help here: the code essentially runs
    sequentially (due to CLONE_VFORK), and I think it would not be
    legal for the compiler to evaluate ec without checking the new_pid
    result (thus there is no need for a compiler barrier).

    Summarizing the possible scenarios on posix_spawn execution, we
    have:

    1. For the default case with a successful execution, args.err will
       be 0, the pid will not be collected and it will be reported to
       the caller.

    2. For the default failure case, args.err will be positive and the
       pid will be collected by the waitpid. An error will be reported
       to the caller.

    3. For the unlikely case where the process was terminated and not
       collected by a caller signal handler, it will be reported as a
       successful execution and not be collected by posix_spawn (since
       args.err will be 0). The caller will need to actually handle
       this case.

    4. For the unlikely case where the process was terminated and
       collected by the caller, we have 3 other possible scenarios:

       4.1. The auxiliary process was terminated with args.err equal
            to 0: it will be handled as 1. (so it does not matter if we
            hit the pid reuse race, since we cannot possibly collect an
            unexpected process).

       4.2. The auxiliary process was terminated after execve (due to a
            failure in calling it) and before setting args.err to -1:
            it will also be handled as 1., but with the issue of not
            being able to report possible execve failures to the
            caller.

       4.3. The auxiliary process was terminated after args.err is set
            to -1: this is the case where it will be possible to hit
            the pid reuse race, where we will need to collect the
            auxiliary pid but cannot be sure it will be the expected
            one. I think for this case we need to actually change
            waitpid to use WNOHANG to avoid hanging indefinitely on the
            call and report an error to the caller, since we can't
            differentiate between a default failure as in 2. and a
            possible pid reuse race issue.

    Checked on x86_64-linux-gnu.

    * sysdeps/unix/sysv/linux/spawni.c (__spawnix): Handle the case
    where the auxiliary process is terminated by a signal before
    calling _exit or execve.
* x86-64: Use fxsave/xsave/xsavec in _dl_runtime_resolve [BZ #21265]H.J. Lu2017-10-209-306/+296
| In _dl_runtime_resolve, use fxsave/xsave/xsavec to preserve all vector,
| mask and bound registers. This simplifies _dl_runtime_resolve and supports
| different calling conventions. ld.so code size is reduced by more than
| 1 KB. However, using fxsave/xsave/xsavec takes slightly more cycles
| than saving and restoring vector and bound registers individually.
|
| Latency for _dl_runtime_resolve to look up the function, foo, from one
| shared library plus libc.so:
|
|                              Before   After   Change
| Westmere (SSE)/fxsave          345     866     151%
| IvyBridge (AVX)/xsave          420     643      53%
| Haswell (AVX)/xsave            713    1252      75%
| Skylake (AVX+MPX)/xsavec       559     719      28%
| Skylake (AVX512+MPX)/xsavec    145     272      87%
| Ryzen (AVX)/xsavec             280     553      97%
|
| This is the worst case, where the portion of time spent saving and
| restoring registers is larger than in the majority of cases. With the
| smaller _dl_runtime_resolve code size, the overall performance impact is
| negligible.
|
| On IvyBridge, differences in build and test time of binutils with lazy
| binding GCC and binutils are within noise. On Westmere, differences in
| bootstrap and "make check" time of GCC 7 with lazy binding GCC and
| binutils are also within noise.
|
| [BZ #21265]
| * sysdeps/x86/cpu-features-offsets.sym (XSAVE_STATE_SIZE_OFFSET):
|   New.
| * sysdeps/x86/cpu-features.c: Include <libc-pointer-arith.h>.
|   (get_common_indeces): Set xsave_state_size, xsave_state_full_size
|   and bit_arch_XSAVEC_Usable if needed.
|   (init_cpu_features): Remove bit_arch_Use_dl_runtime_resolve_slow
|   and bit_arch_Use_dl_runtime_resolve_opt.
| * sysdeps/x86/cpu-features.h (bit_arch_Use_dl_runtime_resolve_opt):
|   Removed.
|   (bit_arch_Use_dl_runtime_resolve_slow): Likewise.
|   (bit_arch_Prefer_No_AVX512): Updated.
|   (bit_arch_MathVec_Prefer_No_AVX512): Likewise.
|   (bit_arch_XSAVEC_Usable): New.
|   (STATE_SAVE_OFFSET): Likewise.
|   (STATE_SAVE_MASK): Likewise.
|   [__ASSEMBLER__]: Include <cpu-features-offsets.h>.
|   (cpu_features): Add xsave_state_size and xsave_state_full_size.
|   (index_arch_Use_dl_runtime_resolve_opt): Removed.
|   (index_arch_Use_dl_runtime_resolve_slow): Likewise.
|   (index_arch_XSAVEC_Usable): New.
| * sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
|   Support XSAVEC_Usable. Remove Use_dl_runtime_resolve_slow.
| * sysdeps/x86_64/Makefile (tst-x86_64-1-ENV): New if tunables
|   is enabled.
| * sysdeps/x86_64/dl-machine.h (elf_machine_runtime_setup):
|   Replace _dl_runtime_resolve_sse, _dl_runtime_resolve_avx,
|   _dl_runtime_resolve_avx_slow, _dl_runtime_resolve_avx_opt,
|   _dl_runtime_resolve_avx512 and _dl_runtime_resolve_avx512_opt
|   with _dl_runtime_resolve_fxsave, _dl_runtime_resolve_xsave and
|   _dl_runtime_resolve_xsavec.
| * sysdeps/x86_64/dl-trampoline.S (DL_RUNTIME_UNALIGNED_VEC_SIZE):
|   Removed.
|   (DL_RUNTIME_RESOLVE_REALIGN_STACK): Check STATE_SAVE_ALIGNMENT
|   instead of VEC_SIZE.
|   (REGISTER_SAVE_BND0): Removed.
|   (REGISTER_SAVE_BND1): Likewise.
|   (REGISTER_SAVE_BND3): Likewise.
|   (REGISTER_SAVE_RAX): Always defined to 0.
|   (VMOV): Removed.
|   (_dl_runtime_resolve_avx): Likewise.
|   (_dl_runtime_resolve_avx_slow): Likewise.
|   (_dl_runtime_resolve_avx_opt): Likewise.
|   (_dl_runtime_resolve_avx512): Likewise.
|   (_dl_runtime_resolve_avx512_opt): Likewise.
|   (_dl_runtime_resolve_sse): Likewise.
|   (_dl_runtime_resolve_sse_vex): Likewise.
|   (USE_FXSAVE): New.
|   (_dl_runtime_resolve_fxsave): Likewise.
|   (USE_XSAVE): Likewise.
|   (_dl_runtime_resolve_xsave): Likewise.
|   (USE_XSAVEC): Likewise.
|   (_dl_runtime_resolve_xsavec): Likewise.
| * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_avx512):
|   Removed.
|   (_dl_runtime_resolve_avx512_opt): Likewise.
|   (_dl_runtime_resolve_avx): Likewise.
|   (_dl_runtime_resolve_avx_opt): Likewise.
|   (_dl_runtime_resolve_sse): Likewise.
|   (_dl_runtime_resolve_sse_vex): Likewise.
|   (_dl_runtime_resolve_fxsave): New.
|   (_dl_runtime_resolve_xsave): Likewise.
|   (_dl_runtime_resolve_xsavec): Likewise.
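The ChangeLog for this commit notes that get_common_indeces now sets xsave_state_size and that cpu-features.c includes <libc-pointer-arith.h>. The following is a minimal sketch of the kind of size computation this suggests, assuming the raw state size reported by the CPU is padded by some fixed offset and rounded up to the 64-byte alignment that the xsave instruction requires of its save area. The helper names align_up and state_save_size, and the save_offset parameter, are hypothetical and are not taken from the glibc sources.

```c
#include <assert.h>
#include <stddef.h>

/* Round SIZE up to the next multiple of ALIGN.
   ALIGN must be a non-zero power of two.  */
static size_t align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Hypothetical helper: number of bytes to reserve for a register
   save area.  RAW_STATE_SIZE stands for the size the CPU reports for
   its extended state; SAVE_OFFSET stands for extra space kept in
   front of it (an assumption made for illustration).  The result is
   a multiple of 64 so that a 64-byte aligned base address leaves the
   whole area usable by xsave.  */
static size_t state_save_size(size_t raw_state_size, size_t save_offset)
{
    return align_up(raw_state_size + save_offset, 64);
}
```

Rounding the size once at startup lets the trampoline subtract a single precomputed value from the stack pointer instead of recomputing the layout on every lazy symbol resolution.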
* Mention Tim Rühsen as the reporter for CVE-2017-15670  Florian Weimer  2017-10-20  1  -3/+4
|
* CVE-2017-15670: glob: Fix one-byte overflow [BZ #22320]  Paul Eggert  2017-10-20  3  -1/+11
|
* Fix build issue with SINGLE_THREAD_P  Wilco Dijkstra  2017-10-20  2  -0/+7
| Add sysdep-cancel.h include.
|
| * malloc/malloc.c (sysdep-cancel.h): Add include.
* Add single-threaded path to _int_free  Wilco Dijkstra  2017-10-20  2  -14/+33
| This patch adds single-threaded fast paths to _int_free.
| Bypass the explicit locking for larger allocations.
|
| * malloc/malloc.c (_int_free): Add SINGLE_THREAD_P fast paths.
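To illustrate what a single-threaded fast path can look like in general, here is a small sketch using C11 atomics. It is not the code from malloc/malloc.c: the struct chunk type, the bin_head list, the bin_push function and the single_thread_p flag (a stand-in for the SINGLE_THREAD_P check) are all hypothetical names chosen for the example. The idea is that when only one thread exists, a plain load and store can replace the compare-and-swap loop.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical free-list node; real malloc chunks carry more state.  */
struct chunk {
    struct chunk *next;
};

/* Hypothetical stand-in for a SINGLE_THREAD_P style check.  */
static bool single_thread_p = true;

/* Head of a singly linked list of free chunks.  */
static _Atomic(struct chunk *) bin_head = NULL;

/* Push C onto the free list.  */
static void bin_push(struct chunk *c)
{
    assert(c != NULL);

    if (single_thread_p) {
        /* Fast path: no other thread can race with us, so relaxed
           load and store are enough.  */
        c->next = atomic_load_explicit(&bin_head, memory_order_relaxed);
        atomic_store_explicit(&bin_head, c, memory_order_relaxed);
        return;
    }

    /* Slow path: retry with compare-and-swap until the head we linked
       against is still the current head.  */
    struct chunk *old = atomic_load_explicit(&bin_head, memory_order_relaxed);
    do {
        c->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &bin_head, &old, c,
                 memory_order_release, memory_order_relaxed));
}
```

Both paths leave the list in the same state; the fast path only avoids the cost of the atomic read-modify-write, which is the kind of saving a single-threaded check is meant to capture.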
* resolv: Remove bogus targets that build ga_test  Will Hawkins  2017-10-20  3  -105/+6
| Remove the bogus targets (and source) that supposedly build ga_test.
| This code was added to resolv very early in the development process
| but does not appear to be an actual test program. The target for
| building this file is tests, but because of the way the glibc Make
| system is built, that target is overridden by higher-level tests
| targets and, therefore, the ga_test program is never built.
|
| Removing the target and the source code makes the resolv/Makefile less
| confusing.
|
| Tested by building and running 'make check' on a 64-bit host running
| kernel 4.10.0-19, configured with
|   --prefix=/home/hawkinsw/code/glibc-build/install
|   --enable-hardcoded-path-in-tests
|   --disable-mathvec
|
| Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Add new locale kab_DZ [BZ #18812]  Mike FABIAN  2017-10-20  3  -0/+175
| [BZ #18812]
| * localedata/SUPPORTED: Add kab_DZ/UTF-8.
| * localedata/locales/kab_DZ: New file.