summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* stdlib: Make atexit to not act as __cxa_atexitazanella/atexit-orderAdhemerval Zanella2019-07-1115-235/+332
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is patch to addresses the issue brought in recent discussion regarding atexit and shared libraries [1] [2]. As indicated in the libc-alpha discussion, the issue is since atexit uses __cxa_atexit its interaction __cxa_finalize could lead to atexit handlers being executed in a different order than the expected one. The github project gives a small example that triggers it [3]. The changes I could come with changes slight the atexit semantic as described in my last email [4]. So basically the changes are: 1. Add the __atexit symbol which is linked as __cxa_finalize in static mode (so __dso_handle is correctly set). The __atexit symbol adds an ef_at exit_function entry on __exit_funcs, different than an ef_cxa one from __cxa_atexit. Old binaries would still call __cxa_atexit, so we do not actually need to add a compat symbol. 2. Make __cxa_finalize to handle ef_at as well, similar to ef_cxa. 3. Change how the internal exit handler are organized, so ef_at and ef_on handles (registered by atexit and on_exit) are executed before ef_cxa (registered by __cxa_atexit). Each entry set (struct exit_function_list) has on type associated (el_at or el_cxa) to represent the internal handle it contains. New insertions (done by the __atexit, __cxa_atexit, etc.) keep the node orders, with following constraints: 3.1. el_at nodes should be prior el_cxa. 3.2. el_at should contain only ef_at, ef_on, or ef_free elements. 3.3. el_cxa should contain only ef_cxa or ef_free elements. 3.4. new insertions on each node type should be be kept in lifo order. 3.5. the original first element should be last one (since it is static allocated and 'exit' will deallocated the nodes in order. */ So the execution on both __cxa_finalize, exit, or quick_exit will iterate over the list by executing first atexit/on_exit handlers and then __cxa_atexit ones. New handlers added by registered functions are handled as before, by using the ef_free entry and reseting the list iteration. [1] https://sourceware.org/ml/libc-alpha/2019-06/msg00229.html [2] https://sourceware.org/ml/libc-help/2019-06/msg00025.html [3] https://github.com/mulle-nat/ld-so-breakage [4] https://sourceware.org/ml/libc-alpha/2019-06/msg00231.html
* stdlib: Testcase to show wrong atexit execution orderAdhemerval Zanella2019-07-117-1/+265
| | | | | | | | | | | | | | | This testcase is based on the one discussed on libc-help [1] [1] https://github.com/mulle-nat/ld-so-breakage The atexit calls are done by the DSO constructors and program expects the handlers to be executed in the reverse order the library are loaded. The issue is _dl_fini might re-sort the order when libraries are unloaded and the __cxa_finalize (called by __do_global_dtors_aux set in arrayfini from DSO) might in turn call its registered atexit in a wrong order.
* Call _dl_open_check after relocation [BZ #24259]H.J. Lu2019-07-0118-5/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a workaround for [BZ #20839] which doesn't remove the NODELETE object when _dl_open_check throws an exception. Move it after relocation in dl_open_worker to avoid leaving the NODELETE object mapped without relocation. [BZ #24259] * elf/dl-open.c (dl_open_worker): Call _dl_open_check after relocation. * sysdeps/x86/Makefile (tests): Add tst-cet-legacy-5a, tst-cet-legacy-5b, tst-cet-legacy-6a and tst-cet-legacy-6b. (modules-names): Add tst-cet-legacy-mod-5a, tst-cet-legacy-mod-5b, tst-cet-legacy-mod-5c, tst-cet-legacy-mod-6a, tst-cet-legacy-mod-6b and tst-cet-legacy-mod-6c. (CFLAGS-tst-cet-legacy-5a.c): New. (CFLAGS-tst-cet-legacy-5b.c): Likewise. (CFLAGS-tst-cet-legacy-mod-5a.c): Likewise. (CFLAGS-tst-cet-legacy-mod-5b.c): Likewise. (CFLAGS-tst-cet-legacy-mod-5c.c): Likewise. (CFLAGS-tst-cet-legacy-6a.c): Likewise. (CFLAGS-tst-cet-legacy-6b.c): Likewise. (CFLAGS-tst-cet-legacy-mod-6a.c): Likewise. (CFLAGS-tst-cet-legacy-mod-6b.c): Likewise. (CFLAGS-tst-cet-legacy-mod-6c.c): Likewise. ($(objpfx)tst-cet-legacy-5a): Likewise. ($(objpfx)tst-cet-legacy-5a.out): Likewise. ($(objpfx)tst-cet-legacy-mod-5a.so): Likewise. ($(objpfx)tst-cet-legacy-mod-5b.so): Likewise. ($(objpfx)tst-cet-legacy-5b): Likewise. ($(objpfx)tst-cet-legacy-5b.out): Likewise. (tst-cet-legacy-5b-ENV): Likewise. ($(objpfx)tst-cet-legacy-6a): Likewise. ($(objpfx)tst-cet-legacy-6a.out): Likewise. ($(objpfx)tst-cet-legacy-mod-6a.so): Likewise. ($(objpfx)tst-cet-legacy-mod-6b.so): Likewise. ($(objpfx)tst-cet-legacy-6b): Likewise. ($(objpfx)tst-cet-legacy-6b.out): Likewise. (tst-cet-legacy-6b-ENV): Likewise. * sysdeps/x86/tst-cet-legacy-5.c: New file. * sysdeps/x86/tst-cet-legacy-5a.c: Likewise. * sysdeps/x86/tst-cet-legacy-5b.c: Likewise. * sysdeps/x86/tst-cet-legacy-6.c: Likewise. * sysdeps/x86/tst-cet-legacy-6a.c: Likewise. * sysdeps/x86/tst-cet-legacy-6b.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-5.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-5a.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-5b.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-5c.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-6.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-6a.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-6b.c: Likewise. * sysdeps/x86/tst-cet-legacy-mod-6c.c: Likewise.
* powerpc: Use faster means to access FPSCR when possible in some casesPaul A. Clarke2019-06-306-30/+94
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Using 'mffs' instruction to read the Floating Point Status Control Register (FPSCR) can force a processor flush in some cases, with undesirable performance impact. If the values of the bits in the FPSCR which force the flush are not needed, an instruction that is new to POWER9 (ISA version 3.0), 'mffsl' can be used instead. Cases included: get_rounding_mode, fegetround, fegetmode, fegetexcept. * sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use __fegetround_ISA300() or __fegetround_ISA2() as appropriate. (__fegetround_ISA300) New. (__fegetround_ISA2) New. * sysdeps/powerpc/fpu_control.h (IS_ISA300): New. (_FPU_MFFS): Move implementation... (_FPU_GETCW): Here. (_FPU_MFFSL): Move implementation.... (_FPU_GET_RC_ISA300): Here. New. (_FPU_GET_RC): Use _FPU_GET_RC_ISA300() or _FPU_GETCW() as appropriate. * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status_ISA300): New. (fegetenv_status): New. * sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status() instead of fegetenv_register(). * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* Further improve string bench timingWilco Dijkstra2019-06-2813-13/+31
| | | | | | | | | | | | | | | | | | | | | | | Further improve the timings of the string benchmarks. Ensure most take between 1 and 4 seconds to improve accuracy. Overall time taken increases by 35%. Tested on AArch64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> * benchtests/bench-math-inlines.c: Increase iterations. * benchtests/bench-memcmp.c: Likewise. * benchtests/bench-rawmemchr.c: Likewise. * benchtests/bench-strcmp.c: Likewise. * benchtests/bench-strcpy_chk.c: Likewise. * benchtests/bench-string.h (INNER_LOOP_ITERS8): Add define. (INNER_LOOP_ITERS_MEDIUM): Increase iterations. (INNER_LOOP_ITERS_SMALL): Likewise. * benchtests/bench-strncat.c: Increase iterations. * benchtests/bench-strncmp.c: Increase iterations. * benchtests/bench-strncpy.c: Reduce iterations for wide strings. * benchtests/bench-strrchr.c: Increase iterations. * benchtests/bench-strstr.c: Keep iterations unchanged. * benchtests/bench-strtod.c: Increase iterations.
* Bump up the runtime for "short" benchmarksAnton Youdkevitch2019-06-2810-8/+22
| | | | | | | | | | | | | | | | | | | | | Some benchmarks with a very short runtime show significantly different results across runs on Aarch64 - up to tens of percents. Increasing the runtime to 100ms+ makes the deviation under 5%. Tested on Aarch64 and x86-64. Reviewed-by: Carlos O'Donell <carlos@redhat.com> * benchtests/bench-memccpy.c: Replace INNER_LOOP_ITERS with INNER_LOOP_ITERS_LARGE. * benchtests/bench-memchr.c: Likewise. * benchtests/bench-rawmemchr.c: Likewise. * benchtests/bench-strcat.c: Likewise. * benchtests/bench-strchr.c: Likewise. * benchtests/bench-string.h: Likewise. * benchtests/bench-strlen.c: Likewise. * benchtests/bench-strncpy.c: Likewise. * benchtests/bench-strnlen.c: Likewise.
* Linux: Use mmap instead of malloc in dirent/tst-getdents64Florian Weimer2019-06-282-3/+15
| | | | | malloc dirties the entire allocated memory region due to M_PERTURB in the test harness.
* Replace PREPARE_VERSION macro with inline functionTobias Klauser2019-06-282-9/+16
| | | | | | | | | * sysdeps/unix/sysv/linux/dl-vdso.h (PREPARE_VERSION): Remove macro. (prepare_version_base): New helper inline function. (prepare_version): New macro replacing PREPARE_VERSION. (PREPARE_VERSION_KNOWN): Use prepare_version instead of PREPARE_VERSION. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* ld.so: Support moving versioned symbols between sonames [BZ #24741]Florian Weimer2019-06-2812-35/+223
| | | | | | | | | | | | | | | | | | | This change should be fully backwards-compatible because the old code aborted the load if a soname mismatch was encountered (instead of searching further for a matching symbol). This means that no different symbols are found. The soname check was explicitly disabled for the skip_map != NULL case. However, this only happens with dl(v)sym and RTLD_NEXT, and those lookups do not come with a verneed entry that could be used for the check. The error check was already explicitly disabled for the skip_map != NULL case, that is, when dl(v)sym was called with RTLD_NEXT. But _dl_vsym always sets filename in the struct r_found_version argument to NULL, so the check was not active anyway. This means that symbol lookup results for the skip_map != NULL case do not change, either.
* support: Add xdlvsym functionFlorian Weimer2019-06-283-0/+26
|
* io: Remove copy_file_range emulation [BZ #24744]Florian Weimer2019-06-2814-783/+81
| | | | | | | | | | The kernel is evolving this interface (e.g., removal of the restriction on cross-device copies), and keeping up with that is difficult. Applications which need the function should run kernels which support the system call instead of relying on the imperfect glibc emulation. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Prepare vfprintf to use __printf_fp/__printf_fphex with float128 argGabriel F. T. Gomes2019-06-274-17/+95
| | | | | | | | | | | | | | | | | | | | | | | | On powerpc64le, long double can currently take two formats: the same as double (-mlong-double-64) or IBM Extended Precision (default with -mlong-double-128 or explicitly with -mabi=ibmlongdouble). The internal implementation of printf-like functions is aware of these possibilities and properly parses floating-point values from the variable arguments, before making calls to __printf_fp and __printf_fphex. These functions are also aware of the format possibilities and know how to convert both formats to string. When library support for TS 18661-3 was added to glibc, __printf_fp and __printf_fphex were extended with support for an additional type (__float128/_Float128) with a different format (binary128). Now that powerpc64le is getting support for its third long double format, and taking into account that this format is the same as the format of __float128/_Float128, this patch extends __vfprintf_internal to properly call __printf_fp and __printf_fphex with this new format. Tested for powerpc64le (with additional patches to actually enable the use of these preparations) and for x86_64. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* Linux: Adjust gedents64 buffer size to int range [BZ #24740]Florian Weimer2019-06-274-0/+67
| | | | | | | | | The kernel interface uses type unsigned int, but there is an internal conversion to int, so INT_MAX is the correct limit. Part of the buffer will always be unused, but this is not a problem. Such huge buffers do not occur in practice anyway. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* x86: Add sysdeps/x86/dl-lookupcfg.hH.J. Lu2019-06-263-31/+6
| | | | | | | | | Since sysdeps/i386/dl-lookupcfg.h and sysdeps/x86_64/dl-lookupcfg.h are identical, we can replace them with sysdeps/x86/dl-lookupcfg.h. * sysdeps/i386/dl-lookupcfg.h: Moved to ... * sysdeps/x86/dl-lookupcfg.h: Here. * sysdeps/x86_64/dl-lookupcfg.h: Removed.
* powerpc: Use generic e_expfAdhemerval Zanella2019-06-268-383/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Generic implementation is faster on both power8 and power9: POWER9: - sysdeps/ieee754/flt-32/e_expf.c "expf": { "workload-spec2017.wrf": { "duration": 5.1236e+09, "iterations": 7.53344e+08, "reciprocal-throughput": 5.9436, "latency": 7.65869, "max-throughput": 1.68248e+08, "min-throughput": 1.30571e+08 } } - sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S "expf": { "workload-spec2017.wrf": { "duration": 5.14429e+09, "iterations": 5.29248e+08, "reciprocal-throughput": 8.05372, "latency": 11.3863, "max-throughput": 1.24166e+08, "min-throughput": 8.78249e+07 } } Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove e_expf-power8 and expf-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/w_expf.c: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/w_expf.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: Refactor powerpc32 lround/lroundf/llround/llroundfAdhemerval Zanella2019-06-2623-547/+182
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llround{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llround{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lround.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llround.c (__llround): Add power5+ and fctidz optimization. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lround.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llround-power6.c, CFLAGS-s_llround-power5+.c, CFLAGS-s_llround-ppc32.c, CFLAGS-s_lround-ppc32.c, CFLAGS-s_lround-power5+.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* Linux: Add nds32 specific syscalls to syscall-names.listVincent Chen2019-06-262-0/+7
| | | | | | The nds32 creates two specific syscalls, udftrap and fp_udfiex_crtl, in kernel v5.0 and v5.2, respectively. Add these two syscalls to syscall-names.list.
* Fix build warnings in nptl/tst-eintr1.cStefan Liebler2019-06-262-3/+7
| | | | | | | | | | | | | | | | | This patch fixes the gcc warnings seen with gcc 9.1 -O3 on s390x: tst-eintr1.c: In function ‘tf1’: tst-eintr1.c:46:1: error: no return statement in function returning non-void [-Werror=return-type] 46 | } | ^ tst-eintr1.c: In function ‘do_test’: tst-eintr1.c:57:17: error: unused variable ‘th’ [-Werror=unused-variable] 57 | pthread_t th = xpthread_create (NULL, tf1, NULL); | ^~ ChangeLog: * nptl/tst-eintr1.c (tf1): Add return statement. (do_test): Remove unused th variable.
* Fix build warnings in locale/programs/ld-ctype.cStefan Liebler2019-06-262-1/+7
| | | | | | | | | | | | | | | | | | | | | | This patch fixes the gcc warnings seen with gcc 9 -march>=z13 on s390x: programs/ld-ctype.c: In function ‘ctype_read’: programs/ld-ctype.c:1392:13: error: ‘wch’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 1392 | uint32_t wch; | ^~~ programs/ld-ctype.c:1401:7: error: ‘seq’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 1401 | if (seq != NULL && seq->nbytes == 1) | ^ programs/ld-ctype.c:1391:20: note: ‘seq’ was declared here 1391 | struct charseq *seq; | ^~~ Both seq and wch are uninitialized if get_character fails. Thus we are now returning with an error. ChangeLog: * locale/programs/ld-ctype.c (charclass_symbolic_ellipsis): Return error if get_character fails.
* S390: Regenerate ULPs.Stefan Liebler2019-06-252-6/+10
| | | | | | | | The update is needed for builds with -O3 and -march>=z13. ChangeLog: * sysdeps/s390/fpu/libm-test-ulps: Regenerated.
* szl_PL locale: Fix a typo in the previous commit (bug 24652).Rafal Luzynski2019-06-242-1/+7
| | | | | | | | | | | | The Unicode sequences in the format <Uxxxx> should be used instead of non-ASCII characters. Reported by Piotr Drąg: https://sourceware.org/bugzilla/show_bug.cgi?id=24652#c8 [BZ #24652] * localedata/locales/szl_PL (day): Use the correct Unicode sequences instead of non-ASCII characters.
* szl_PL locale: Spelling corrections (bug 24652).Grzegorz Kulik2019-06-242-15/+37
| | | | | | | | | | | | | | This commit also provides the correct month names in both nominative and genitive case for Silesian language, as required by the fix for the bug 10871. [BZ #24652] * localedata/locales/szl_PL (abday): Spelling corrections. (day): Likewise. (abmon): Likewise. (mon): Rename to... (alt_mon): This, then apply spelling corrections. (mon): New entry, month names in the genitive case.
* nl_{AW,NL}: Correct the thousands separator and grouping (bug 23831).Rafal Luzynski2019-06-213-4/+12
| | | | | | | | | | | | According to CLDR 35.1 and the bug report the thousands grouping separator should be always "." (a single dot) and digits should be grouped by 3. [BZ #23831] * localedata/locales/nl_AW (mon_thousands_sep): Set to ".". * localedata/locales/nl_NL (mon_thousands_sep): Likewise. (thousands_sep): Likewise. (grouping): Set to 3;3.
* Add missing VDSO_{NAME,HASH}_* macros and use them for PREPARE_VERSION_KNOWNTobias Klauser2019-06-219-11/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | Define all currently used Linux versions used for PREPARE_VERSION{,_KNOWN} in sysdeps/unix/sysv/linux/dl-vdso.h and use them instead of duplicating the versions and precomputed hashes across architecture specific files. * sysdeps/unix/sysv/linux/aarch64/gettimeofday.c (INIT_ARCH): Use PREPARE_VERSION_KNOWN. * sysdeps/unix/sysv/linux/aarch64/init-first.c: Likewise. * sysdeps/unix/sysv/linux/dl-vdso.h (VDSO_NAME_LINUX_2_6_39): New define. (VDSO_HASH_LINUX_2_6_39): Likewise. (VDSO_NAME_LINUX_4_9): Likewise. (VDSO_HASH_LINUX_4_9): Likewise. * sysdeps/unix/sysv/linux/powerpc/gettimeofday.c (INIT_ARCH): Likewise. * sysdeps/unix/sysv/linux/powerpc/init-first.c (_libc_vdso_platform_setup): Likewise. * sysdeps/unix/sysv/linux/powerpc/time.c (INIT_ARCH): Likewise. * sysdeps/unix/sysv/linux/s390/init-first.c (_libc_vdso_platform_setup): Likewise. * sysdeps/unix/sysv/linux/x86_64/init-first.c (__vdso_platform_setup): Likewise. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* nptl: Convert various tests to use libsupportMike Crowe2019-06-2110-409/+126
| | | | | | | | | | | * nptl/eintr.c: Use libsupport. * nptl/tst-eintr1.c: Likewise. * nptl/tst-eintr2.c: Likewise. * nptl/tst-eintr3.c: Likewise. * nptl/tst-eintr4.c: Likewise. * nptl/tst-eintr5.c: Likewise. * nptl/tst-mutex-errorcheck.c: Likewise. * nptl/tst-mutex5.c: Likewise.
* support: Invent verbose_printf macroMike Crowe2019-06-212-0/+10
| | | | | | | Make it easier for tests to emit progress messages only when --verbose is specified. * support/test-driver.h: Add verbose_printf macro.
* support: Add xclock_now helper function.Mike Crowe2019-06-212-0/+14
| | | | | | | | | | | | | It's easier to read and write tests with: const struct timespec ts = xclock_now(CLOCK_REALTIME); than struct timespec ts; xclock_gettime(CLOCK_REALTIME, &ts); * support/xtime.h: Add xclock_now() helper function.
* libio: do not attempt to free wide buffers of legacy streams [BZ #24228]Dmitry V. Levin2019-06-205-5/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a601b74d31ca086de38441d316a3dee24c866305 aka glibc-2.23~693 ("In preparation for fixing BZ#16734, fix failure in misc/tst-error1-mem when _G_HAVE_MMAP is turned off.") introduced a regression: _IO_unbuffer_all now invokes _IO_wsetb to free wide buffers of all files, including legacy standard files which are small statically allocated objects that do not have wide buffers and the _mode member, causing memory corruption. Another memory corruption in _IO_unbuffer_all happens when -1 is assigned to the _mode member of legacy standard files that do not have it. [BZ #24228] * libio/genops.c (_IO_unbuffer_all) [SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)]: Do not attempt to free wide buffers and access _IO_FILE_complete members of legacy libio streams. * libio/tst-bz24228.c: New file. * libio/tst-bz24228.map: Likewise. * libio/Makefile [build-shared] (tests): Add tst-bz24228. [build-shared] (generated): Add tst-bz24228.mtrace and tst-bz24228.check. [run-built-tests && build-shared] (tests-special): Add $(objpfx)tst-bz24228-mem.out. (LDFLAGS-tst-bz24228, tst-bz24228-ENV): New variables. ($(objpfx)tst-bz24228-mem.out): New rule.
* [powerpc] add 'volatile' to asmPaul A. Clarke2019-06-193-5/+12
| | | | | | | | | | | | | | Add 'volatile' keyword to a few asm statements, to force the compiler to generate the instructions therein. Some instances were implicitly volatile, but adding keyword for consistency. 2019-06-19 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (relax_fenv_state): Add 'volatile'. * sysdeps/powerpc/fpu/fpu_control.h (__FPU_MFFS): Likewise. (__FPU_MFFSL): Likewise. (_FPU_SETCW): Likewise.
* powerpc: Fix static-linked version of __ppc_get_timebase_freq [BZ #24640]Stan Shebs2019-06-194-1/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __ppc_get_timebase_freq() always return 0 when using static linked glibc. This is a minimal example.c to reproduce: /******************************/ #include <inttypes.h> #include <stdint.h> #include <stdio.h> #include <sys/platform/ppc.h> int main() { uint64_t freq = __ppc_get_timebase_freq(); printf("Time Base frequency = %"PRIu64" Hz\n", freq); if (freq == 0) return -1; return 0; } /******************************/ Compile command: gcc -static example.c This bug has been reproduced, fixed and tested on all powerpc platforms (ppc32, ppc64 and ppc64le). The underlying code of __ppc_get_timebase_freq uses __get_timebase_freq that has a different implementation for shared and static version of glibc. In the static version, there is an incorrect sense in the if check for the fd returned when opening /proc/cpuinfo. This solution is mostly a cherry-pick from: commit 4791e4f773d060c1a37b27aac5b03cdfa9327afc Author: Stan Shebs <stanshebs@google.com> Date: Fri May 17 12:25:19 2019 -0700 Subject: Fix sense of a test in the static-linking version of ppc get_clockfreq That is in branch glibc/google/grte/v5-2.27/master and was mentioned for inclusion on master here: https://www.sourceware.org/ml/libc-alpha/2019-05/msg00409.html Adapted from original fix for get_clockfreq. That code was moved to get_timebase_freq. Also added a static-build testcase for __ppc_get_timebase_freq since the underlying function has different implementations for shared and static build. [BZ #24640] * sysdeps/unix/sysv/linux/powerpc/get_timebase_freq.c [!SHARED] (__get_timebase_freq): Fix sense of a test in the static-linking version. * sysdeps/unix/sysv/linux/powerpc/Makefile (tests-static): Add test-gettimebasefreq-static. (tests): Likewise. * sysdeps/unix/sysv/linux/powerpc/test-gettimebasefreq-static.c: New file.
* nl_AW locale: Correct the negative monetary format (bug 24614).Rafal Luzynski2019-06-192-2/+9
| | | | | | | | | | | | Follow the same changes as made in the commit 02d8b5ab1c because the respective entries in nl_NL and nl_AW had been the same before the change so they should be the same after. CLDR does not provide complete data for nl_AW, it says it is missing and displays a copy of nl_NL. [BZ #24614] * localedata/locales/nl_AW (n_sep_by_space): Set to 2 (a space between the currency symbol and the minus sign). (n_sign_posn): Set to 4 (the minus sign after the currency symbol).
* Fix gcc 9 build errors for make xcheck. [BZ #24556]Stefan Liebler2019-06-197-6/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the following gcc 9 warnings for "make xcheck" / "make bench": -string/tst-strcasestr.c: ../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=] -argp/argp-test.c: argp-test.c:130:20: error: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Werror=format-overflow=] argp-test.c:130:19: note: directive argument in the range [-2147483648, 122] argp-test.c:130:5: note: ‘sprintf’ output between 2 and 12 bytes into a destination of size 10 -nss/tst-field.c: tst-field.c:52:7: error: ‘%s’ directive argument is null [-Werror=format-overflow=] -benchtests/bench-strstr.c: ../include/bits/../../misc/bits/error.h:42:5: error: ‘%s’ directive argument is null [-Werror=format-overflow=] -benchtests/bench-malloc-simple.c: bench-malloc-simple.c:93:16: error: iteration 3 invokes undefined behavior [-Werror=aggressive-loop-optimizations] ChangeLog: [BZ #24556] * string/test-strcasestr.c (check_result): Add NULL check. * nss/tst-field.c (check_rewrite): Likewise. * benchtests/bench-strstr.c (do_one_test): Likewise. * string/test-strstr.c (check_result): Likewise. * argp/argp-test.c (popt): Increase size of buf to 12. * benchtests/bench-malloc-simple.c (bench): Do not initialize tests array out of bounds.
* dlfcn: Avoid one-element flexible array in Dl_serinfo [BZ #24166]Florian Weimer2019-06-192-0/+18
| | | | | | | | | The dls_serpath path field, as an array of length 1, introduces unexpected array subscript checks with some compilers. GCC versions before 3.0 treat the nested anonymous union as a declaration of an unnamed type, and not as a member declaration, so this construct cannot be used for these compilers.
* elf: Refuse to dlopen PIE objects [BZ #24323]Florian Weimer2019-06-185-6/+79
| | | | | Another executable has already been mapped, so the dynamic linker cannot perform relocations correctly for the second executable.
* nl_NL locale: Correct the negative monetary format (bug 24614).Rafal Luzynski2019-06-174-3/+14
| | | | | | | | | | | | | | | According to CLDR 35.1 and the bug report the correct monetary format for negative amounts should be "EUR -1 234,56" while previously it was "EUR 1 234,56-". This patch does not change the thousands (grouping) separator. [BZ #24614] * localedata/Makefile (LOCALES): Add nl_NL.UTF-8. * localedata/locales/nl_NL (n_sep_by_space): Set to 2 (a space between the currency symbol and the minus sign). (n_sign_posn): Set to 4 (the minus sign after the currency symbol). * localedata/tst-strfmon1.c (tests): Add test data for nl_NL.UTF-8.
* m68k: Remove vDSO supportAdhemerval Zanella2019-06-1710-321/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | Although defined in initial TLS/NPTL ABI for m68k and ColdFire [1], kernel support was never pushed upstream. This patch removes the unused m68k vDSO support. Checked with a build against m68k and m68k-coldfire and some basic tests on ARAnyM. * sysdeps/unix/sysv/linux/m68k/Makefile (sysdep_routines, sysdep-rtld-routines): Remove rules. * sysdeps/unix/sysv/linux/m68k/Versions (libc) [GLIBC_PRIVATE]: Remove __vdso_atomic_cmpxchg_32 and __vdso_atomic_barrier. (ld) [GLIBC_PRIVATE]: __rtld___vdso_read_tp, __rtld___vdso_atomic_cmpxchg_32, and __rtld___vdso_atomic_barrier. * sysdeps/unix/sysv/linux/m68k/coldfire/atomic-machine.h (atomic_compare_and_exchange_val_acq, atomic_full_barrier): Remove vDSO path for SHARED. * sysdeps/unix/sysv/linux/m68k/init-first.c: Remove file. * sysdeps/unix/sysv/linux/m68k/libc-m68k-vdso.c: Likewise. * sysdeps/unix/sysv/linux/m68k/m68k-helpers.S: Likewise. * sysdeps/unix/sysv/linux/m68k/m68k-vdso.c: Likewise. * sysdeps/unix/sysv/linux/m68k/m68k-vdso.h: Likewise. * sysdeps/unix/sysv/linux/m68k/m68k-helpers.c: New file. [1] https://lists.debian.org/debian-68k/2007/11/msg00071.html
* powerpc: Refactor powerpc64 lround/lroundf/llround/llroundfAdhemerval Zanella2019-06-1731-490/+240
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc {l}lround{f} implementations on the generic sysdeps/powerpc/fpu/s_{l}lround{f}.c. The IFUNC support is also moved only to powerpc64 only, since for powerpc64le generic implementation resulting in optimized code. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_llround-power8, s_llround-power6x, s_llround-power5+, s_llround-ppc64, and s_llroundf-ppc64. (CFLAGS-s_llround-power8.c, CFLAGS-s_llround-power6x.c, CFLAGS-s_llround-power5+.c): New rule. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power5+.c: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power6x.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-power8.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llround.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llroundf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_lround.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lround.c: ... here. * sysdeps/powerpc/powerpc64/fpu/Makefile [$(subdir) == math] (CFLAGS-s_llround.c): New rule. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_llround-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lround.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lroundf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llround.c: New file. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lround.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lroundf.c: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llroundf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Refactor powerpc32 lrint/lrintf/llrint/llrintfAdhemerval Zanella2019-06-1723-342/+116
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llrint{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llrint{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_lrintf.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Move to ... * sysdeps/powerpc/fpu/s_lrintf.c: ... here. * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lrint.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Add power4 optimization. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lrint.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llrintf-power6.c, CFLAGS-s_llrintf-ppc32.c, CFLAGS-s_llrint-power6.c, CFLAGS-s_llrint-ppc32.c, CFLAGS-s_lrint-ppc32.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: refactor powerpc64 lrint/lrintf/llrint/llrintfAdhemerval Zanella2019-06-1722-225/+113
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llrint{f} implementations on the generic sysdeps/powerpc/fpu/s_llrint{f}. The IFUNC support is also moved only to powerpc64 only, since for powerpc64le generic implementation resulting in optimized code. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_llrint-power8, s_llrint-power6x, and s_llrint-ppc64. (CFLAGS-s_llrint-power8.c, CFLAGS-s_llrint-power6x.c): New rule. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power6x.c: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-power8.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_lrint.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrint.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_llrintf.c: ... here. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_lrint.c: New file. * sysdeps/powerpc/powerpc64/fpu/Makefile: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_llrint-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llrint.c: New file. * sysdeps/powerpc/powerpc64/fpu/s_llrintf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lrint.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_lrint.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* Linux: Fix __glibc_has_include use for <sys/stat.h> and statxFlorian Weimer2019-06-142-2/+10
| | | | | | | | | | | | | | | The identifier linux is used as a predefined macro, so the actually used path is 1/stat.h or 1/stat64.h. Using the quote-based version triggers a file lookup for /usr/include/bits/linux/stat.h (or whatever directory is used to store bits/statx.h), but since bits/ is pretty much reserved by glibc, this appears to be acceptable. This is related to GCC PR 80005: incorrect macro expansion of the argument of __has_include. Suggested by Zack Weinberg. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* <sys/cdefs.h>: Inhibit macro expansion for __glibc_has_includeFlorian Weimer2019-06-142-1/+9
| | | | | | | | | This is currently ineffective with GCC because of GCC PR 80005, but it makes sense to anticipate a fix for this defect. Suggested by Zack Weinberg. Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Add IPV6_ROUTER_ALERT_ISOLATE from Linux 5.1 to bits/in.h.Joseph Myers2019-06-132-0/+4
| | | | | | | | | | This patch adds the new constant IPV6_ROUTER_ALERT_ISOLATE from Linux 5.1 to sysdeps/unix/sysv/linux/bits/in.h. Tested for x86_64. * sysdeps/unix/sysv/linux/bits/in.h (IPV6_ROUTER_ALERT_ISOLATE): New macro.
* Allow memset local PLT reference for powerpc soft-float.Joseph Myers2019-06-132-0/+6
| | | | | | | | | | | | | | | | | | | | Some recent change on GCC mainline resulted in the localplt test failing for powerpc soft-float (not sure exactly when, as the failure appeared when there were other build test failures as well; <https://sourceware.org/ml/libc-testresults/2019-q2/msg00261.html> shows it remaining when other failures went away). The problem is a call to memset that GCC now generates in the libgcc long double code. Since memset is documented as a function GCC may always implicitly generate calls to, it seems reasonable to allow that local PLT reference (just like those for libgcc functions that GCC implicitly generates calls to and that are also exported from libc.so), which this patch does. Tested for powerpc soft-float with build-many-glibcs.py. * sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/localplt.data: Allow memset in libc.so.
* aarch64: handle STO_AARCH64_VARIANT_PCSSzabolcs Nagy2019-06-134-4/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Avoid lazy binding of symbols that may follow a variant PCS with different register usage convention from the base PCS. Currently the lazy binding entry code does not preserve all the registers required for AdvSIMD and SVE vector calls. Saving and restoring all registers unconditionally may break existing binaries, even if they never use vector calls, because of the larger stack requirement for lazy resolution, which can be significant on an SVE system. The solution is to mark all symbols in the symbol table that may follow a variant PCS so the dynamic linker can handle them specially. In this patch such symbols are always resolved at load time, not lazily. So currently LD_AUDIT for variant PCS symbols are not supported, for that the _dl_runtime_profile entry needs to be changed e.g. to unconditionally save/restore all registers (but pass down arg and retval registers to pltentry/exit callbacks according to the base PCS). This patch also removes a __builtin_expect from the modified code because the branch prediction hint did not seem useful. * sysdeps/aarch64/dl-dtprocnum.h: New file. * sysdeps/aarch64/dl-machine.h (DT_AARCH64): Define. (elf_machine_runtime_setup): Handle DT_AARCH64_VARIANT_PCS. (elf_machine_lazy_rel): Check STO_AARCH64_VARIANT_PCS and bind such symbols at load time. * sysdeps/aarch64/linkmap.h (struct link_map_machine): Add variant_pcs.
* aarch64: add STO_AARCH64_VARIANT_PCS and DT_AARCH64_VARIANT_PCSSzabolcs Nagy2019-06-132-0/+12
| | | | | | | | | | | | | STO_AARCH64_VARIANT_PCS is a non-visibility st_other flag for marking symbols that reference functions that may follow a variant PCS with different register usage convention from the base PCS. DT_AARCH64_VARIANT_PCS is a dynamic tag that marks ELF modules that have R_*_JUMP_SLOT relocations for symbols marked with STO_AARCH64_VARIANT_PCS (i.e. have variant PCS calls via a PLT). * elf/elf.h (STO_AARCH64_VARIANT_PCS): Define. (DT_AARCH64_VARIANT_PCS): Define.
* powerpc: Remove optimized finiteAdhemerval Zanella2019-06-1220-654/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc finite optimization do not show much gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc64-linux-gnu-power7 (--with-cpu=power7): - Generic sysdeps/ieee754 implementation: "isfinite": { "": { "duration": 5.0082e+09, "iterations": 2.45299e+09, "max": 43.824, "min": 2.008, "mean": 2.04167 }, "INF": { "duration": 4.66554e+09, "iterations": 2.28288e+09, "max": 35.73, "min": 2.008, "mean": 2.04371 }, "NAN": { "duration": 4.66274e+09, "iterations": 2.28716e+09, "max": 34.161, "min": 2.009, "mean": 2.03866 } } - power7 optimized one: "isfinite": { "": { "duration": 4.99111e+09, "iterations": 2.65566e+09, "max": 25.015, "min": 1.716, "mean": 1.87942 }, "INF": { "duration": 4.6783e+09, "iterations": 2.0999e+09, "max": 35.264, "min": 1.868, "mean": 2.22787 }, "NAN": { "duration": 4.67915e+09, "iterations": 2.08678e+09, "max": 38.099, "min": 1.869, "mean": 2.24228 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_finite* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_finite* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* math: Use wordsize-64 version for finiteAdhemerval Zanella2019-06-123-58/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra finite call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_finite.c: Move to ... * sysdeps/ieee754/dbl-64/s_finite.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized isinfAdhemerval Zanella2019-06-1220-634/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc isinf optimizations onyl adds complexity: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input pattern and branch implementation for INF and denormal that does: return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000) Although it does show slight better latency than generic algorithm (as below), it is only for power7 and requires it to override it for power8. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* math: Use wordsize-64 version for isinfAdhemerval Zanella2019-06-123-43/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | - math.h will use compiler builtin for gcc 4.4 when built without -fsignaling-nans and the builtin is expanded inline for all support architectures. As an example, there is no intra isinf call on libm for the architecture I checked, x86, arm, aarch64, and powerpc. - The resulting binary difference on 32 bits architecture is minimum for the non hotspot symbol. - It helps wordsize-64 architectures that use ldbl-opt. - It add some code simplification with reduction of duplicated implementations. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/ieee754/dbl-64/wordsize-64/s_isinf.c: Move to ... * sysdeps/ieee754/dbl-64/s_isinf.c: ... here and format code. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized isnanAdhemerval Zanella2019-06-1236-1381/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc isnan optimizations are not really a gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power5, power6, and power6x are just micro-optimization to improve the Load-Hit-Store hazards from floating-point to general register transfer, and current GCC already has support to minimize it by inserting either extra nops or group dispatch instructions. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc-linux-gnu-power4 (which uses the hp-timing support on benchtests): - Generic sysdeps/ieee754 implementation: "isnan": { "": { "duration": 4.98415e+09, "iterations": 2.34516e+09, "max": 45.925, "min": 2.052, "mean": 2.12529 }, "INF": { "duration": 4.74057e+09, "iterations": 1.69761e+09, "max": 91.01, "min": 2.052, "mean": 2.79249 }, "NAN": { "duration": 4.74071e+09, "iterations": 1.68768e+09, "max": 282.343, "min": 2.052, "mean": 2.809 } } - power7 optimized one: $ ./testrun.sh benchtests/bench-isnan "isnan": { "": { "duration": 4.96842e+09, "iterations": 2.56297e+09, "max": 50.048, "min": 1.872, "mean": 1.93854 }, "INF": { "duration": 4.76648e+09, "iterations": 1.54213e+09, "max": 373.408, "min": 2.661, "mean": 3.09084 }, "NAN": { "duration": 4.76845e+09, "iterations": 1.54515e+09, "max": 51.016, "min": 2.736, "mean": 3.08607 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The generic implementation requires getting the floating point status, disable the invalid operation bit, and restore the floating-point status. Each operation is costly and requires flushing the FP pipeline. Using the same scenarion for the previous analysis: "isnan": { "": { "duration": 5.08284e+09, "iterations": 6.2898e+08, "max": 41.844, "min": 8.057, "mean": 8.08108 }, "INF": { "duration": 4.97904e+09, "iterations": 6.16176e+08, "max": 39.661, "min": 8.057, "mean": 8.08055 }, "NAN": { "duration": 4.98695e+09, "iterations": 5.95866e+08, "max": 29.728, "min": 8.345, "mean": 8.36925 } } - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_isnan.c: Remove file. * sysdeps/powerpc/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S: Remove file * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>