| Commit message | Author | Age | Files | Lines |
|
When a fatal error (unaligned memory etc.) is detected, gf-complete should
call assert(3) instead of exit(3) to give the calling program a chance to
catch the abort and display a stack trace. Although gdb can break on exit
and display the stack trace, libraries are not usually expected to
terminate the calling program in this way.
Signed-off-by: Loic Dachary <loic@dachary.org>
(cherry picked from commit 29427efac2ce362fce8e4f5f5f1030feba942b73)
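A minimal sketch of the idea, not the actual gf-complete code: the helper
names is_aligned() and check_region() are hypothetical. On a fatal alignment
error the check calls assert(0), which raises SIGABRT so that a debugger or
a signal handler in the calling program can show a stack trace. Note that
assert() is compiled out when NDEBUG is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: true if p is a multiple of `alignment` bytes. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Hypothetical check: abort (instead of exit) on unaligned regions. */
static void check_region(const void *src, const void *dest, size_t alignment)
{
    if (!is_aligned(src, alignment) || !is_aligned(dest, alignment)) {
        fprintf(stderr, "Error: src/dest not aligned to %zu bytes\n",
                alignment);
        assert(0); /* raises SIGABRT unless NDEBUG is defined */
    }
}
```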
|
arm neon optimisations
|
Optimisations for 4,64 split table region multiplications. Only used on
ARMv8-A since it is not faster on ARMv7-A.
|
Optimisations for 4,32 split table multiplications.
Selected time_tool.sh results on a 1.7 GHz cortex-a9:
Region Best (MB/s): 346.67 W-Method: 32 -m SPLIT 32 4 -r SIMD -
Region Best (MB/s): 92.89 W-Method: 32 -m SPLIT 32 4 -r NOSIMD -
Region Best (MB/s): 258.17 W-Method: 32 -m SPLIT 32 4 -r SIMD -r ALTMAP -
Region Best (MB/s): 162.00 W-Method: 32 -m SPLIT 32 8 -
Region Best (MB/s): 160.53 W-Method: 32 -m SPLIT 8 8 -
Region Best (MB/s): 32.74 W-Method: 32 -m COMPOSITE 2 - -
Region Best (MB/s): 199.79 W-Method: 32 -m COMPOSITE 2 - -r ALTMAP -
|
Optimisations for the 4,16 split table region multiplications.
Selected time_tool.sh 16 -A -B results for a 1.7 GHz cortex-a9:
Region Best (MB/s): 532.14 W-Method: 16 -m SPLIT 16 4 -r SIMD -
Region Best (MB/s): 212.34 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -
Region Best (MB/s): 801.36 W-Method: 16 -m SPLIT 16 4 -r SIMD -r ALTMAP -
Region Best (MB/s): 93.20 W-Method: 16 -m SPLIT 16 4 -r NOSIMD -r ALTMAP -
Region Best (MB/s): 273.99 W-Method: 16 -m SPLIT 16 8 -
Region Best (MB/s): 270.81 W-Method: 16 -m SPLIT 8 8 -
Region Best (MB/s): 70.42 W-Method: 16 -m COMPOSITE 2 - -
Region Best (MB/s): 393.54 W-Method: 16 -m COMPOSITE 2 - -r ALTMAP -
|
Optimisations for the 4,4 split table region multiplication and carry-less
multiplication using NEON's polynomial long multiplication.
arm: w8: NEON carry-less multiplication
Selected time_tool.sh results for a 1.7GHz cortex-a9:
Region Best (MB/s): 375.86 W-Method: 8 -m CARRY_FREE -
Region Best (MB/s): 142.94 W-Method: 8 -m TABLE -
Region Best (MB/s): 225.01 W-Method: 8 -m TABLE -r DOUBLE -
Region Best (MB/s): 211.23 W-Method: 8 -m TABLE -r DOUBLE -r LAZY -
Region Best (MB/s): 160.09 W-Method: 8 -m LOG -
Region Best (MB/s): 123.61 W-Method: 8 -m LOG_ZERO -
Region Best (MB/s): 123.85 W-Method: 8 -m LOG_ZERO_EXT -
Region Best (MB/s): 1183.79 W-Method: 8 -m SPLIT 8 4 -r SIMD -
Region Best (MB/s): 177.68 W-Method: 8 -m SPLIT 8 4 -r NOSIMD -
Region Best (MB/s): 87.85 W-Method: 8 -m COMPOSITE 2 - -
Region Best (MB/s): 428.59 W-Method: 8 -m COMPOSITE 2 - -r ALTMAP -
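For illustration, the split table idea behind SPLIT 8 4 can be sketched in
portable scalar C. This is not the NEON code from the commit: the function
names are hypothetical and the primitive polynomial 0x11d for w=8 is an
assumption. Each source byte is split into two 4-bit halves, each half
indexes a 16-entry table of precomputed products, and the two results are
XORed. A SIMD version would do the same two lookups for many bytes at once.

```c
#include <stddef.h>
#include <stdint.h>

#define GF8_POLY 0x11d /* assumed primitive polynomial for w=8 */

/* Reference shift-and-add multiplication in GF(2^8). */
static uint8_t gf8_mult(uint8_t a, uint8_t b)
{
    unsigned int acc = 0, x = a;

    while (b != 0) {
        if (b & 1)
            acc ^= x;
        x <<= 1;
        if (x & 0x100)
            x ^= GF8_POLY; /* reduce back to 8 bits */
        b >>= 1;
    }
    return (uint8_t)acc;
}

/* Multiply a region by the constant c using two 16-entry tables. */
static void gf8_split_region(uint8_t c, const uint8_t *src, uint8_t *dst,
                             size_t len)
{
    uint8_t low[16], high[16];
    unsigned int i;
    size_t j;

    for (i = 0; i < 16; i++) {
        low[i] = gf8_mult(c, (uint8_t)i);         /* c * low nibble  */
        high[i] = gf8_mult(c, (uint8_t)(i << 4)); /* c * high nibble */
    }
    for (j = 0; j < len; j++)
        dst[j] = (uint8_t)(low[src[j] & 0x0f] ^ high[src[j] >> 4]);
}
```

The split works because multiplication by a constant is linear over XOR, so
c * (hi ^ lo) equals (c * hi) ^ (c * lo).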
|
Optimisations for the single table region multiplication and carry-less
multiplication using NEON's polynomial multiplication of 8-bit values.
The single polynomial multiplication is not that useful, but the vector
version is useful for region multiplication.
Selected time_tool.sh results for a 1.7GHz cortex-a9:
Region Best (MB/s): 672.72 W-Method: 4 -m CARRY_FREE -
Region Best (MB/s): 265.84 W-Method: 4 -m BYTWO_p -
Region Best (MB/s): 329.41 W-Method: 4 -m TABLE -r DOUBLE -
Region Best (MB/s): 278.63 W-Method: 4 -m TABLE -r QUAD -
Region Best (MB/s): 329.81 W-Method: 4 -m TABLE -r QUAD -r LAZY -
Region Best (MB/s): 1318.03 W-Method: 4 -m TABLE -r SIMD -
Region Best (MB/s): 165.15 W-Method: 4 -m TABLE -r NOSIMD -
Region Best (MB/s): 99.73 W-Method: 4 -m LOG -
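As a rough scalar illustration of what a polynomial (carry-less) multiply
of 8-bit values computes per lane, here is a sketch of CARRY_FREE style
multiplication for w=4. The function names are hypothetical and the
reduction polynomial 0x13 (x^4 + x + 1) is an assumption; the NEON code in
the commit is not reproduced here.

```c
#include <stdint.h>

#define GF4_POLY 0x13 /* assumed primitive polynomial x^4 + x + 1 */

/* Carry-less product of two 8-bit polynomials over GF(2): XOR of shifts. */
static uint16_t clmul8(uint8_t a, uint8_t b)
{
    uint16_t r = 0;
    int i;

    for (i = 0; i < 8; i++)
        if ((b >> i) & 1)
            r ^= (uint16_t)((uint16_t)a << i);
    return r;
}

/* Multiply two 4-bit field elements: carry-less multiply, then reduce. */
static uint8_t gf4_mult(uint8_t a, uint8_t b)
{
    uint16_t r = clmul8(a & 0x0f, b & 0x0f); /* degree at most 6 */
    int i;

    for (i = 6; i >= 4; i--)
        if (r & (1u << i))
            r ^= (uint16_t)(GF4_POLY << (i - 4)); /* cancel bit i */
    return (uint8_t)r;
}
```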
|
Properly emulate aligned allocation if posix_memalign is not available.
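One common way to emulate aligned allocation with plain malloc() is
sketched below. It is not the code from this commit, and the names
emu_aligned_malloc() and emu_aligned_free() are hypothetical. The sketch
over-allocates, rounds the address up to the requested alignment, and
stores the original pointer just before the aligned block so that it can
be freed later.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: `alignment` must be a power of two. */
static void *emu_aligned_malloc(size_t size, size_t alignment)
{
    size_t extra;
    void *raw;
    uintptr_t start, aligned;

    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;
    /* Worst-case padding plus room for the saved raw pointer. */
    extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;
    raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;
    start = (uintptr_t)raw + sizeof(void *);
    aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void **)aligned)[-1] = raw; /* remember what to free */
    return (void *)aligned;
}

/* Must be paired with emu_aligned_malloc(), never with plain free(). */
static void emu_aligned_free(void *p)
{
    if (p != NULL)
        free(((void **)p)[-1]);
}
```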
|
Checks for arm_neon.h header.
|
SSE is not the only supported SIMD instruction set. Keep the old names
for backward compatibility.
|
There is no need to force the non-default CFLAGS on users trying to set
them via an environment variable or on the configure command line.
|
static code analysis fixes
|
Since there can only be one -m, base cannot be set by -m COMPOSITE and
then deallocated on the second -m if it is bogus. The second -m will
exit on error at _gf_errno = GF_E_TWOMULT;.
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
|
Because shifting a 64-bit value with >> 64 has undefined behavior.
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
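A minimal sketch of one way to avoid the problem, using a hypothetical
helper name (the actual fix in the commit may differ): in C, shifting a
64-bit value by 64 or more is undefined, so the full-width case has to be
handled explicitly.

```c
#include <stdint.h>

/* Hypothetical helper: right shift that is well defined for n >= 64. */
static uint64_t shr64(uint64_t v, unsigned int n)
{
    return (n >= 64) ? 0 : (v >> n);
}
```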
|
On CPUs that do not support the SSE4.2 instruction set, this will fail
|
because the incorrect header is included.
smmintrin.h => SSE4.1
nmmintrin.h => SSE4.2
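The mapping above can be sketched as a guarded include. This assumes the
GCC/Clang predefined macros __SSE4_2__ and __SSE4_1__; the SIMD_LEVEL macro
and simd_level() function are hypothetical names used only for
illustration, not gf-complete's own configuration macros.

```c
/* Include only the header that matches what the compiler targets. */
#if defined(__SSE4_2__)
#include <nmmintrin.h> /* SSE4.2 intrinsics */
#define SIMD_LEVEL "SSE4.2"
#elif defined(__SSE4_1__)
#include <smmintrin.h> /* SSE4.1 intrinsics */
#define SIMD_LEVEL "SSE4.1"
#else
#define SIMD_LEVEL "none"
#endif

/* Hypothetical helper: report which header was selected at build time. */
static const char *simd_level(void)
{
    return SIMD_LEVEL;
}
```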
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Fix dead assignment in case of INTEL_SSSE3 defined.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
The 'm2' variable in gf_w64_clm_multiply_region_from_single_2() is only
used in calculations on 'm2' itself, whose results are not used later in
the code.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
These assignments are never used and are directly overwritten later
in the function.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
According to the malloc man page, the result of an allocation of size
0 bytes is implementation-defined: "If size was equal to 0, either NULL or a
pointer suitable to be passed to free() is returned"
Fix for clang scan-build report:
Unix API Undefined allocation of 0 bytes (CERT MEM04-C; CWE-131)
210 poly = (gf_general_t *) malloc(sizeof(gf_general_t)*(n+1));
9 Call to 'malloc' has an allocation size of 0 bytes
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
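A minimal sketch of a guard against the zero-byte allocation, under stated
assumptions: term_t is a hypothetical stand-in for gf_general_t, and
alloc_poly() is an illustrative helper, not the function changed by the
commit. The element count n+1 is checked before the allocation so that
malloc is never asked for 0 bytes.

```c
#include <stdlib.h>

/* Hypothetical stand-in for gf_general_t. */
typedef struct {
    unsigned long long w64;
} term_t;

/* Allocate n+1 terms, refusing counts that would give 0 bytes or less. */
static term_t *alloc_poly(int n)
{
    if (n < 0)
        return NULL; /* n+1 <= 0: nothing sensible to allocate */
    return (term_t *)malloc(sizeof(term_t) * ((size_t)n + 1));
}
```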
|
Check for array boundaries of 't' in while loop header.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
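The general pattern can be sketched as follows. The function and its
arguments are hypothetical, since the loop from the commit is not shown
here. The bounds test comes first in the loop header, so t[i] is never
read once i has reached the end of the array.

```c
#include <stddef.h>

/* Hypothetical example: count leading non-zero entries of t. */
static size_t count_until_zero(const int *t, size_t len)
{
    size_t i = 0;

    /* Check the index before dereferencing, in this order. */
    while (i < len && t[i] != 0)
        i++;
    return i;
}
```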
|
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Free all malloc-allocated memory before exit. Change the
if checks against 'w' into an if-else chain so that no further checks
run after one has already matched.
Signed-off-by: Danny Al-Gaaf <danny.al-gaaf@bisect.de>
|
Conflicts:
.gitignore
INSTALL
Makefile.in
aclocal.m4
config.guess
config.sub
configure
examples/Makefile.in
include/config.h.in
include/config.h.in~
install-sh
ltmain.sh
m4/libtool.m4
m4/ltversion.m4
missing
src/Makefile.in
test/Makefile.in
tools/Makefile.in
|
https://bitbucket.org/jayrde/gf-complete into wip-autoconf-cleanup
|
get the manual.
|
for easy navigation.
|
Fixes for some issues found via Coverity in the Ceph project.
|