| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
__strlen_power9 and __stpcpy_power9 were added to their ifunc lists
using the wrong function names.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This started as a trivial change to Anton's rawmemchr. I got
carried away. This is a hybrid between P8's asympotically
faster 64B checks with extremely efficient small string checks
e.g <64B (and sometimes a little bit more depending on alignment).
The second trick is to align to 64B by running a 48B checking loop
16B at a time until we naturally align to 64B (i.e checking 48/96/144
bytes/iteration based on the alignment after the first 5 comparisons).
This allieviates the need to check page boundaries.
Finally, explicly use the P7 strlen with the runtime loader when building
P9. We need to be cautious about vector/vsx extensions here on P9 only
builds.
|
|
|
|
|
|
|
|
|
|
|
| |
This version uses vector instructions and is up to 60% faster on medium
matches and up to 90% faster on long matches, compared to the POWER7
version. A few examples:
__rawmemchr_power9 __rawmemchr_power7
Length 32, alignment 0: 2.27566 3.77765
Length 64, alignment 2: 2.46231 3.51064
Length 1024, alignment 0: 17.3059 32.6678
|
|
|
|
|
|
|
|
|
|
| |
Add stpcpy support to the POWER9 strcpy. This is up to 40% faster on
small strings and up to 90% faster on long relatively unaligned strings,
compared to the POWER8 version. A few examples:
__stpcpy_power9 __stpcpy_power8
Length 20, alignments in bytes 4/ 4: 2.58246 4.8788
Length 1024, alignments in bytes 1/ 6: 24.8186 47.8528
|
|
|
|
|
|
|
|
|
|
| |
This version uses VSX store vector with length instructions and is
significantly faster on small strings and relatively unaligned large
strings, compared to the POWER8 version. A few examples:
__strcpy_power9 __strcpy_power8
Length 16, alignments in bytes 0/ 0: 2.52454 4.62695
Length 412, alignments in bytes 4/ 0: 11.6 22.9185
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also, change sources.redhat.com to sourceware.org.
This patch was automatically generated by running the following shell
script, which uses GNU sed, and which avoids modifying files imported
from upstream:
sed -ri '
s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g
s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g
' \
$(find $(git ls-files) -prune -type f \
! -name '*.po' \
! -name 'ChangeLog*' \
! -path COPYING ! -path COPYING.LIB \
! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \
! -path manual/texinfo.tex ! -path scripts/config.guess \
! -path scripts/config.sub ! -path scripts/install-sh \
! -path scripts/mkinstalldirs ! -path scripts/move-if-change \
! -path INSTALL ! -path locale/programs/charmap-kw.h \
! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \
! '(' -name configure \
-execdir test -f configure.ac -o -f configure.in ';' ')' \
! '(' -name preconfigure \
-execdir test -f preconfigure.ac ';' ')' \
-print)
and then by running 'make dist-prepare' to regenerate files built
from the altered files, and then executing the following to cleanup:
chmod a+x sysdeps/unix/sysv/linux/riscv/configure
# Omit irrelevant whitespace and comment-only changes,
# perhaps from a slightly-different Autoconf version.
git checkout -f \
sysdeps/csky/configure \
sysdeps/hppa/configure \
sysdeps/riscv/configure \
sysdeps/unix/sysv/linux/csky/configure
# Omit changes that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines
git checkout -f \
sysdeps/powerpc/powerpc64/ppc-mcount.S \
sysdeps/unix/sysv/linux/s390/s390-64/syscall.S
# Omit change that caused a pre-commit check to fail like this:
# remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline
git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the power6 wcsrchr optimization and uses generic
implementation instead. Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.
Checked on powerpc64-linux-gnu.
* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcsrchr.c):
New rule.
* sysdeps/powerpc/power6/wcsrchr.c: Remove file.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power6.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power7.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power6.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power7.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcsrchr-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcsrchr.c: Likewise.
* sysdeps/powerpc/powerpc64/power6/wcsrchr.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcsrchr-power6 and
wcsrchr-power7.
(CFLAGS-wcsrchr-power7.c, CFLAGS-wcsrchr-power6.c): Remove rule.
* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
Remove wcsrchr optimizations.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the power6 wcschr optimization and uses generic
implementation instead. Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.
Checked on powerpc64-linux-gnu.
* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcschr.c):
New rule.
* sysdeps/powerpc/power6/wcschr.c: Remove file.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power6.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power7.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcschr-power6.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcschr-power7.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcschr-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise.
* sysdeps/powerpc/powerpc64/power6/wcschr.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcschr-power6 and
wcschr-power7.
(CFLAGS-wcschr-power7.c, CFLAGS-wcschr-power6.c): Remove rule.
* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
Remove wcschr optimizations.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the power6 wcscpy optimization and uses generic
implementation instead. Currently, both power6 and power7 IFUNC variant
resulting binary are essentially the same and the generic implementation
with unrolling loop set to 8 also results in similar performance.
Checked on powerpc64-linux-gnu.
* sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcscpy.c):
New rule.
* sysdeps/powerpc/power6/wcscpy.c: Remove file.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power6.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power7.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c:
Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy-power6.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy-power7.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c: Likewise.
* sysdeps/powerpc/powerpc64/power6/wcscpy.c: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/Makefile
[$(subdir) == wcsmbs] (sysdeps_routines): Remove wcscpy-power6 and
wcscpy-power7.
(CFLAGS-wcscpy-power7.c, CFLAGS-wcscpy-power6.c): Remove rule.
* sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise.
* sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c:
Remove wcscpy optimizations.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch rewrites wcscat using wcslen and wcscpy. This is similar to
the optimization done on strcat by 6e46de42fe.
The strcpy changes are mainly to add the internal alias to avoid PLT
calls.
Checked on x86_64-linux-gnu and a build against the affected
architectures.
* include/wchar.h (__wcscpy): New prototype.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c
(__wcscpy): Route internal symbol to generic implementation.
* sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy):
Add internal __wcscpy alias.
* sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise.
* sysdeps/s390/wcscpy.c (wcscpy): Likewise.
* sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise.
* wcsmbs/wcscpy.c (wcscpy): Add
* sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to
use generic implementation.
* wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
|
|
|
|
|
|
|
|
|
|
|
| |
A single underscore was omitted in
sysdeps/powerpc/powerpc64/multiarch/strncmp.c, resulting in use of
power8 version of strncmp instead of power9 version, with significant
performance degradation.
* sysdeps/powerpc/powerpc64/multiarch/strncmp.c: Fix #ifdef.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
| |
This patch moves little endian specific POWER9 optimization files to
sysdeps/powerpc/powerpc64/le and creates POWER9 ifunc functions
only for little endian.
|
|
|
|
|
|
|
| |
* All files with FSF copyright notices: Update copyright dates
using scripts/update-copyrights.
* locale/programs/charmap-kw.h: Regenerated.
* locale/programs/locfile-kw.h: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On POWER8, unaligned memory accesses to cached memory has little impact
on performance as opposed to its ancestors.
It is disabled by default and will only be available when the tunable
glibc.tune.cached_memopt is set to 1.
__memcpy_power8_cached __memcpy_power7
============================================================
max-size=4096: 33325.70 ( 12.65%) 38153.00
max-size=8192: 32878.20 ( 11.17%) 37012.30
max-size=16384: 33782.20 ( 11.61%) 38219.20
max-size=32768: 33296.20 ( 11.30%) 37538.30
max-size=65536: 33765.60 ( 10.53%) 37738.40
* manual/tunables.texi (Hardware Capability Tunables): Document
glibc.tune.cached_memopt.
* sysdeps/powerpc/cpu-features.c: New file.
* sysdeps/powerpc/cpu-features.h: New file.
* sysdeps/powerpc/dl-procinfo.c [!IS_IN(ldconfig)]: Add
_dl_powerpc_cpu_features.
* sysdeps/powerpc/dl-tunables.list: New file.
* sysdeps/powerpc/ldsodefs.h: Include cpu-features.h.
* sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h
(INIT_ARCH): Initialize use_aligned_memopt.
* sysdeps/powerpc/powerpc64/dl-machine.h [defined(SHARED &&
IS_IN(rtld))]: Restrict dl_platform_init availability and
initialize CPU features used by tunables.
* sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines):
Add memcpy-power8-cached.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Add
__memcpy_power8_cached.
* sysdeps/powerpc/powerpc64/multiarch/memcpy.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power8-cached.S:
New file.
Reviewed-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
|
|
|
|
|
|
|
| |
Update strcasestr-power8 to use power8 version of strnlen for
calculating length.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the .c/.S file neither uses nor modifies macros defined in
sysdep.h there is no point to #include it. The same goes for
math_ldbl_opt.h except that it includes shlib-compat.h, and if
compat_symbol is redefined we need to include shlib-compat.h first.
* sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf-power8.S: Don't
include sysdep.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf-ppc64.c: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-a2.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-cell.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/mempcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcspn-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncase-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strspn-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S: Don't
include sysdep.h and math_ldbl_opt.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Don't
include sysdep.h and math_ldbl_opt.h. Include shlib-compat.h.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrint-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power6x.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llround-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise.
|
|
|
|
|
|
|
|
|
|
| |
This is another one where we'll be wanting the base symbols for
powerpc64le rather than just a power7 variant.
* sysdeps/powerpc/powerpc64/multiarch/strncase_l-power7.c: Include
string/strncase_l.c, not string/strncase.c.
(USE_IN_EXTENDED_LOCALE_MODEL): Don't define.
(libc_hidden_def): Redefine.
|
|
|
|
|
|
|
|
|
|
|
| |
The routine being assembled here is strcasecmp_l, so ask for that via
__STRCMP and STRCMP defines. That change means tweaking the power7
override. Needed for later powerpc64le changes where we want the base
symbols, not just a power7 variant.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S:
(__STRCMP, STRCMP, __strcasecmp_l): Define.
(__strcasecmp): Don't define.
|
|
|
|
|
|
|
|
|
|
|
| |
These functions aren't used in ld.so at the moment since we don't have
strcmp or strncmp ifuncs for them there. Remove the ld.so bloat.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S: Wrap in
IS_IN (libc).
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
|
|
|
|
|
|
|
|
| |
USE_AS_STPNCPY is defined by sysdeps/powerpc/powerpc64/power8/stpncpy.S,
included by this file.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Don't define
USE_AS_STPNCPY.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recent commit 59ba2d2b5421 missed to add __memrchr_power8 in
ifunc list. Also handled discarding unwanted bytes for
unaligned inputs in power8 optimization.
2017-10-05 Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* sysdeps/powerpc/powerpc64/multiarch/memrchr-ppc64.c: Revert
back to powerpc32 file.
* sysdeps/powerpc/powerpc64/multiarch/memrchr.c
(memrchr): Add __memrchr_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/power8/memrchr.S: Mask
extra bytes for unaligned inputs.
|
|
|
|
|
|
|
| |
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimization. This shows as an average
of 25% improvement depending on the position of search
character. The performance is same for shorter strings.
|
|
|
|
|
|
| |
As done in commit 6d15a5c2e9450a1e926d5b4991759e1cfa50fccf
clean up IFUNC implementation for power8 in order to remove
unneeded macro definitions.
|
|
|
|
|
| |
Vectorized loops are used for sizes greater than 32B to improve
performance over power7 optimiztion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These machine-dependent inline string functions have never been on by
default, and even if they were a good idea at the time they were
introduced, they haven't really been touched in ten to fifteen years
and probably aren't a good idea on current-gen processors. Current
thinking is that this class of optimization is best left to the
compiler.
* bits/string.h, string/bits/string.h
* sysdeps/aarch64/bits/string.h
* sysdeps/m68k/m680x0/m68020/bits/string.h
* sysdeps/s390/bits/string.h, sysdeps/sparc/bits/string.h
* sysdeps/x86/bits/string.h: Delete file.
* string/string.h: Don't include bits/string.h.
* string/bits/string3.h: Rename to bits/string_fortified.h.
No need to undef various symbols that the removed headers
might have defined as macros.
* string/Makefile (headers): Remove bits/string.h, change
bits/string3.h to bits/string_fortified.h.
* string/string-inlines.c: Update commentary. Remove definitions
of various macros that nothing looks at anymore. Don't directly
include bits/string.h. Set _STRING_INLINE_unaligned here, based on
compiler-predefined macros.
* string/strncat.c: If STRNCAT is not defined, or STRNCAT_PRIMARY
_is_ defined, provide internal hidden alias __strncat.
* include/string.h: Declare internal hidden alias __strncat.
Only forward __stpcpy to __builtin_stpcpy if __NO_STRING_INLINES is
not defined.
* include/bits/string3.h: Rename to bits/string_fortified.h,
update to match above.
* sysdeps/i386/string-inlines.c: Define compat symbols for
everything formerly defined by sysdeps/x86/bits/string.h.
Make existing definitions into compat symbols as well.
Remove some no-longer-necessary messing around with macros.
* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
* sysdeps/s390/multiarch/mempcpy.c
No need to define _HAVE_STRING_ARCH_mempcpy.
Do define __NO_STRING_INLINES and NO_MEMPCPY_STPCPY_REDIRECT.
* sysdeps/i386/i686/multiarch/strncat-c.c
* sysdeps/s390/multiarch/strncat-c.c
* sysdeps/x86_64/multiarch/strncat-c.c
Define STRNCAT_PRIMARY. Don't change definition of libc_hidden_def.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A number of functions in the sysdeps/powerpc/powerpc64/ tree don't use
or change r2, yet declare a global entry that sets up r2. This patch
fixes that problem, and consolidates the ENTRY and EALIGN macros.
* sysdeps/powerpc/powerpc64/sysdep.h: Formatting.
(NOPS, ENTRY_3): New macros.
(ENTRY): Rewrite.
(ENTRY_TOCLESS): Define.
(EALIGN, EALIGN_W_0, EALIGN_W_1, EALIGN_W_2, EALIGN_W_4, EALIGN_W_5,
EALIGN_W_6, EALIGN_W_7, EALIGN_W_8): Delete.
* sysdeps/powerpc/powerpc64/a2/memcpy.S: Replace EALIGN with ENTRY.
* sysdeps/powerpc/powerpc64/dl-trampoline.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise.
* sysdeps/powerpc/powerpc64/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/e_expf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
* sysdeps/powerpc/powerpc64/addmul_1.S: Use ENTRY_TOCLESS.
* sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_copysignl.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_fabsl.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Likewise.
* sysdeps/powerpc/powerpc64/fpu/s_llrintf.S: Likewise.
* sysdeps/powerpc/powerpc64/lshift.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/mul_1.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise.
* sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise.
* sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/add_n.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S (strcasecmp_l):
Likewise.
* sysdeps/powerpc/powerpc64/power7/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strnlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strspn.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/ppc-mcount.S: Store LR earlier. Don't
add nop when SHARED.
* sysdeps/powerpc/powerpc64/start.S: Fix comment.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S (ENTRY): Don't
define.
(ENTRY_TOCLESS): Define.
* sysdeps/powerpc/powerpc32/sysdep.h (ENTRY_TOCLESS): Define.
* sysdeps/powerpc/fpu/s_fma.S: Use ENTRY_TOCLESS.
* sysdeps/powerpc/fpu/s_fmaf.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Makes __stpncpy_power8 call __memset_power8 directly rather than via an
IFUNC. Fixes a missing _mcount, and removes some redundant NOPS. The
*_is_local defines are also used in a followup patch.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Define
MEMSET_is_local.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
Define MEMSET.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
STRLEN_is_local, STRNLEN_is_local, and STRCHR_is_local.
* sysdeps/powerpc/powerpc64/power7/strstr.S: Likewise. Don't add
nop after local calls.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Define MEMSET_is_local.
Don't add nop after local call.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise. Add missing
CALL_MCOUNT.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
.align on some targets takes a byte alignment, on others like powerpc,
log2 of the byte alignment. It's a good idea to avoid .align,
particularly since x86 and powerpc are different. This patch fixes
the occurrences of .align in powerpc64/sysdep.h, renames DOT_LABEL
since the macro doesn't have anything to do with adding dots, removes
extraneous semicolons, and fixes some formatting.
* sysdeps/powerpc/powerpc64/sysdep.h: Formatting.
(FUNC_LABEL): Rename from DOT_LABEL.
(ENTRY_1): Use FUNC_LABEL and remove leading space from label.
Use .p2align rather than .align.
(TRACEBACK, TRACEBACK_MASK): Use .p2align rather than .align.
(ABORT_TRANSACTION): Likewise.
(ENTRY_1, ENTRY_2, END_2, LOCALENTRY): Remove unnecessary semicolons,
particularly at end. Add semicolon at invocation as necessary.
(TRACEBACK, TRACEBACK_MASK, PSEUDO, PSEUDO_NOERRNO): Likewise.
(PSEUDO_ERRVAL, PPC64_LOAD_FUNCPTR, OPD_ENT): Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power8.S (ENTRY,
END): Adjust to suit.
|
|
|
|
|
| |
Vectorization improves performance over the current implementation.
Tested on powerpc64 and powerpc64le.
|
|
|
|
|
|
| |
Correct hwcap usage in strncat introduced by commit
249dcdb71b79e4c488a46c9027e0014c0bc27044.
Tested on power7 and power8 systems
|
|
|
|
|
|
|
| |
P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character. The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With new optimized strnlen for POWER8 [1], this patch adds
strncat for power8 to make use of optimized strlen and strnlen.
This is faster than POWER7 current implementation for larger strings.
Tested on powerpc64 and powerpc64le.
[1] https://sourceware.org/ml/libc-alpha/2017-03/msg00491.html
* sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add
strncat-power8.
* sysdeps/powerpc/powerpc64/multiarch/strncat.c (strncat): Add
__strncat_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
(strncat): Add __strncat_power8 to list of strncat functions.
* sysdeps/powerpc/powerpc64/multiarch/strncat-power8.c: New file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power4.S: Define the
implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/memcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memmove-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcmp.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/memcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memmove.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-a2.S: Define the
implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-cell.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memcpy-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/mempcpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/a2/memcpy.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/cell/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/mempcpy.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memchr-power7.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/memrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/rawmemchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memchr.S: Set a default
function name if not defined and pass as parameter to macros
accordingly.
* sysdeps/powerpc/powerpc64/power7/memrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/rawmemchr.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/memset-power4.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/memset-power6.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/memset-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/memset.S: Set a default function name if
not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power4/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power6/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/memset.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/memset.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strcasestr-power8.S: Define the
strcasestr implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strstr-power7.S: Define
strstr implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/power7/strstr.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power8/strcasestr.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power7.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/strchr-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strchrnul-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strchr.S: Set a default
function name if not defined and pass as parameter to macros
accordingly.
* sysdeps/powerpc/powerpc64/power7/strchrnul.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strrchr.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strchr.S: Likewise.
* sysdeps/powerpc/powerpc64/strchr.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power7.S: Define
the strlen implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strlen-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strlen-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power7.S: Define
the strnlen implementation name and remove unneeded macros definition.
* sysdeps/powerpc/powerpc64/power7/strlen.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/strnlen.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strlen.S: Likewise.
* sysdeps/powerpc/powerpc64/strlen.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/strcasecmp_l-power7.S: Define
the implementation-specific function name and remove unneeded
macros definition.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power8.S Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power4.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-power9.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncmp-ppc64.S: Likewise.
* sysdeps/powerpc/powerpc64/power4/strncmp.S: Set a default function
name if not defined and pass as parameter to macros accordingly.
* sysdeps/powerpc/powerpc64/power7/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/power9/strncmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strcmp.S: Likewise.
* sysdeps/powerpc/powerpc64/strncmp.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up the IFUNC implementations for powerpc in order to remove
unneeded macro definitions.
Tested on ppc64le with and without --disable-multi-arch flag.
* sysdeps/powerpc/powerpc64/multiarch/stpcpy-power8.S: Define the
implementation-specific function name and remove unneeded macros
definition.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/stpncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power7.S: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strncpy-power8.S: Likewise.
* sysdeps/powerpc/powerpc64/power7/strncpy.S: Set a default
function name if not defined.
* sysdeps/powerpc/powerpc64/power8/strcpy.S: Likewise.
* sysdeps/powerpc/powerpc64/power8/strncpy.S: Likewise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added strnlen POWER8 otimized for long strings. It delivers
same performance as POWER7 implementation for short strings.
This takes advantage of reasonably performing unaligned loads
and bit permutes to check the first 1-16 bytes until
quadword aligned, then checks in 64 bytes strides until unsafe,
then 16 bytes, truncating the count if need be.
Likewise, the POWER7 code is recycled for less than 32 bytes strings.
Tested on ppc64 and ppc64le.
* sysdeps/powerpc/powerpc64/multiarch/Makefile
(sysdep_routines): Add strnlen-power8.
* sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c
(strnlen): Add __strnlen_power8 to list of strnlen functions.
* sysdeps/powerpc/powerpc64/multiarch/strnlen-power8.S:
New file.
* sysdeps/powerpc/powerpc64/multiarch/strnlen.c
(__strnlen): Add __strnlen_power8 to ifunc list.
* sysdeps/powerpc/powerpc64/power8/strnlen.S: New file.
|
|
|
|
|
| |
Some of the power8 strings optimizations are not updated to use the latest
version of other string optimizations
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 6e46de42fe16 default strcat implementation is essentially
the same for specialized ia64 and powerpc ones. This patch removes the
redundant implementation and adjust powerpc64 ifunc code to use the
default one.
Checked on powerpc32-linux-gnu (default and power4) and ia64-linux build
and on powerpc64le-linux-gnu.
* sysdeps/ia64/strcat.c: Remove file.
* sysdeps/powerpc/strcat.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcat-power7.c: Use default
C implementation.
* sysdeps/powerpc/powerpc64/multiarch/strcat-power8.c: Likewise.
* sysdeps/powerpc/powerpc64/multiarch/strcat-ppc64.c: Likewise.
|
| |
|
|
|
|
|
|
|
| |
The P7 code is used for <=32B strings and for > 32B vectorized loops are used.
This shows as an average 25% improvement depending on the position of search
character. The performance is same for shorter strings.
Tested on ppc64 and ppc64le.
|
|
|
|
|
|
|
| |
Vectorized loops are used for strings > 32B when compared
to power8 optimization.
Tested on power9 ppc64le simulator.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit c7debbdfacb redirected the internal strrch to default powerpc64
implementation by redefining the weak_alias at
sysdeps/powerpc/powerpc64/multiarch/strchr-ppc64.c:
#undef weak_alias
#define weak_alias(name, aliasname) \
extern __typeof (__strrchr_ppc) aliasname \
__attribute__ ((weak, alias ("__strrchr_ppc")));
This creates a __GI_strchr alias that clashes with the IFUNC symbol in
stprchr.os. There is not need to define the default version for internal
version, since ifunc should work internally for powerpc64. This patch
removes the weak_alias indirection.
Checked on powerpc64le.
* sysdeps/powerpc/powerpc64/multiarch/strrchr-ppc64.c (weak_alias):
Remove redirection to __strrchr_ppc.
|