path: root/gcm-simd.cpp
Commit message | Author | Age | Files | Lines
* Remove INLINE used for debugging | Jeffrey Walton | 2018-08-10 | 1 | -27/+14
  We needed to switch inlining off manually. GDB was not stepping into code for us. No longer needed.
* Cleanup GCM code | Jeffrey Walton | 2018-08-10 | 1 | -6/+4
  I always thought the SSE code in GCM_ReverseHashBufferIfNeeded_CLMUL was a wart.
* Switch to vector shifts instead of vector merge | Jeffrey Walton | 2018-08-10 | 1 | -4/+4
* Cleanup GCM mode | Jeffrey Walton | 2018-08-10 | 1 | -5/+10
* Cleanup Aarch64 GCM mode | Jeffrey Walton | 2018-08-10 | 1 | -32/+30
* Add POWER8 GCM mode (GH #698) | Jeffrey Walton | 2018-08-09 | 1 | -52/+55
  Commit 3ed38e42f619 added the POWER8 infrastructure for GCM mode. It also added GCM_SetKeyWithoutResync_VMULL, GCM_Multiply_VMULL and GCM_Reduce_VMULL. This commit adds the remainder, which includes GCM_AuthenticateBlocks_VMULL. GCC is OK on Linux (ppc64-le) and AIX (ppc64-be). We may need some touchups for the XLC compiler.
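For readers unfamiliar with what the multiply routines named above build on, here is a minimal portable sketch of a 64x64 carry-less multiply, the operation that hardware instructions such as vpmsum, PMULL and CLMUL provide. `Poly128` and `ClmulPortable` are hypothetical names for illustration only; this is not the code in gcm-simd.cpp, and it omits the GCM reduction step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type (not from gcm-simd.cpp): a 128-bit polynomial over GF(2)
// stored as two 64-bit halves.
struct Poly128 { uint64_t hi; uint64_t lo; };

// Hypothetical helper: multiply two 64-bit polynomials over GF(2).
// It is a shift-and-XOR loop, like schoolbook multiplication but with
// XOR in place of addition, so no carries propagate between bit positions.
inline Poly128 ClmulPortable(uint64_t a, uint64_t b)
{
    Poly128 r = {0, 0};
    for (unsigned i = 0; i < 64; ++i)
    {
        if ((b >> i) & 1)
        {
            r.lo ^= a << i;
            if (i != 0)                 // avoid an undefined 64-bit shift
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}
```

For example, `ClmulPortable(3, 3)` yields `lo == 5`, because (x + 1)^2 = x^2 + 1 when coefficients are reduced modulo 2. A SIMD implementation would replace the loop with a single instruction per 64-bit pair.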
* Update comments | Jeffrey Walton | 2018-08-09 | 1 | -11/+25
* Add POWER8 GCM mode (GH #698) | Jeffrey Walton | 2018-08-09 | 1 | -59/+252
  GCM_SetKeyWithoutResync_VMULL, GCM_Multiply_VMULL and GCM_Reduce_VMULL work as expected on Linux (ppc64-le) and AIX (ppc64-be). We are still working on GCM_AuthenticateBlocks_VMULL.
* Update comments | Jeffrey Walton | 2018-08-06 | 1 | -4/+8
* Cleanup VPMSUM probes | Jeffrey Walton | 2018-08-06 | 1 | -41/+40
* Update documentation | Jeffrey Walton | 2018-08-06 | 1 | -7/+4
* Prepare for POWER8 carryless multiplies using vpmsum | Jeffrey Walton | 2018-08-06 | 1 | -8/+107
* Whitespace check-in | Jeffrey Walton | 2018-08-05 | 1 | -4/+6
* Remove s_clmulConstants table in GCM mode | Jeffrey Walton | 2018-07-16 | 1 | -46/+23
  Local scopes and loading the constants with _mm_set_epi32 saves about 0.03 cpb. It does not sound like much, but it improves GMAC by about 500 MB/s. GMAC is just shy of 8 GB/s.
* Fix "error C2719: formal parameter with requested alignment of 16 won't be aligned" | Jeffrey Walton | 2018-07-16 | 1 | -1/+1
  This was somewhat expected due to the Solaris knob turning.
* Disable CLMUL again on SunStudio (GH #188, GH #224) | Jeffrey Walton | 2018-07-16 | 1 | -3/+4
  We got reports that x86_64 was producing incorrect results. Also, the problem persisted in i386 builds. I don't think we can work around this issue. Oracle must fix it.
* Fix SunStudio 12.4 compile on Solaris | Jeffrey Walton | 2018-07-16 | 1 | -1/+3
* Fix SunStudio 12.6 GCM compile on Solaris (GH #188, GH #224) | Jeffrey Walton | 2018-07-15 | 1 | -18/+10
  I think we have this issue somewhat sorted out. First, there is a compiler bug. Second, it seems to be triggered when function parameters mix const and non-const references. Third, to work around it, all parameters need to be non-const (as in this patch). I'm really glad we kind of got to the bottom of things. The crash when compiling GCM has been bothering me for nearly 3 years.
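The workaround described above can be illustrated with a small sketch. `Block128` and `XorInto` are hypothetical stand-ins, not the actual functions changed by the patch; the point is only the shape of the signature, where every reference parameter is made non-const.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit block type (not from gcm-simd.cpp).
struct Block128 { uint64_t lo; uint64_t hi; };

// Signature shape that reportedly triggered the compiler crash:
// a mix of non-const and const reference parameters.
//   void XorInto(Block128& acc, const Block128& in);

// Workaround shape described in the commit: all reference parameters
// are non-const, even though `in` is only read.
inline void XorInto(Block128& acc, Block128& in)
{
    acc.lo ^= in.lo;
    acc.hi ^= in.hi;
}
```

The cost of this design choice is that callers can no longer pass temporaries or const objects as `in`, so it trades some const-correctness for a build that works on the affected compiler.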
* Fix SunStudio compile on Solaris (GH #226) | Jeffrey Walton | 2018-07-15 | 1 | -3/+3
* Add ARMv8.4 cpu feature detection support (GH #685) (#687) | Jeffrey Walton | 2018-07-15 | 1 | -11/+4
  This PR adds ARMv8.4 cpu feature detection support. Previously we only needed ARMv8.1 and things were much easier. For example, ARMv8.1 `__ARM_FEATURE_CRYPTO` meant PMULL, AES, SHA-1 and SHA-256 were available. ARMv8.4 `__ARM_FEATURE_CRYPTO` means PMULL, AES, SHA-1, SHA-256, SHA-512, SHA-3, SM3 and SM4 are available.
  We still use the same pattern as before. We make something available based on compiler version and/or preprocessor macros. But this time around we had to tighten things up a bit to ensure ARMv8.4 did not cross-pollinate down into ARMv8.1.
  ARMv8.4 is largely untested at the moment. There is no hardware in the field and CI lacks QEMU with the relevant patches/support. We will probably have to revisit some of this stuff in the future.
  Since this update applies to ARM gadgets we took the time to expand Android and iOS testing on Travis. Travis now tests more platforms, and includes Autotools and CMake builds, too.
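One way to keep ARMv8.4 features from leaking into ARMv8.1 builds is to gate each newer feature on its own ACLE macro instead of on `__ARM_FEATURE_CRYPTO` alone. The sketch below assumes that approach; the `SKETCH_*` names and `SketchFeatureSummary` are hypothetical and are not the macros Crypto++ defines, and the real logic also considers compiler versions.

```cpp
#include <cassert>
#include <string>

// ARMv8.1-era bundle: per the commit message, this macro covered
// PMULL, AES, SHA-1 and SHA-256.
#if defined(__ARM_FEATURE_CRYPTO)
# define SKETCH_ARM_PMULL_AVAILABLE 1
#endif

// ARMv8.4-era features: gate each one on its dedicated ACLE macro so
// that an ARMv8.1 build never enables them by accident.
#if defined(__ARM_FEATURE_SHA3)
# define SKETCH_ARM_SHA3_AVAILABLE 1
#endif
#if defined(__ARM_FEATURE_SHA512)
# define SKETCH_ARM_SHA512_AVAILABLE 1
#endif

// Reports which code paths this translation unit was compiled with.
inline std::string SketchFeatureSummary()
{
    std::string s;
#if defined(SKETCH_ARM_PMULL_AVAILABLE)
    s += "pmull=1";
#else
    s += "pmull=0";
#endif
#if defined(SKETCH_ARM_SHA3_AVAILABLE)
    s += " sha3=1";
#else
    s += " sha3=0";
#endif
#if defined(SKETCH_ARM_SHA512_AVAILABLE)
    s += " sha512=1";
#else
    s += " sha512=0";
#endif
    return s;
}
```

On a non-ARM build all three flags stay off, so the same source compiles everywhere and simply reports that no ARM code path was enabled.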
* Squash MS LNK4221 and libtool warnings | Jeffrey Walton | 2018-07-06 | 1 | -0/+3
* Remove extra ; from gcm-simd.cpp (PR #618) | Ilja | 2018-03-31 | 1 | -1/+1
* Clear GCC -Wcast-align warnings on ARM | Jeffrey Walton | 2018-01-20 | 1 | -1/+5
  The buffers and workspaces are aligned.
* Improve logic for <arm_acle.h> include (GH #568) | Jeffrey Walton | 2018-01-20 | 1 | -1/+4
* Fix "Internal compiler error: max number of generated reload insns ..." (GH #554) | Jeffrey Walton | 2018-01-07 | 1 | -1/+1
* Fix "impossible register constraint in ASM" (GH #554) | Jeffrey Walton | 2018-01-02 | 1 | -1/+1
  Thanks to Eduardo Miravalls for reporting the issue.
* Fix crash on VIA C7-D when using GCM | Jeffrey Walton | 2017-11-24 | 1 | -1/+1
  This was interesting... The C7-D is an early 2000s 32-bit processor with SSE2 and SSSE3. Using a destination register constraint of "xm" caused a crash, while a constraint of "m" did not.
* Fix "impossible constraint in 'asm'" on i686 | Jeffrey Walton | 2017-11-24 | 1 | -2/+5
  gcm.cpp:89:50: error: impossible constraint in 'asm' : "=xm" (a[0]) : "xm"(b[0]), "xm"(c[0]));
* Fix SunCC 12.2 compiler crash with GCM_Xor16_SSE2 | Jeffrey Walton | 2017-11-16 | 1 | -0/+17
  SunCC 12.3 through 12.5 still cannot handle CLMUL, though. It would be nice if Sun fixed the regression.
* Switch to intrinsic operation instead of casts for GCM SSE2 XORs | Jeffrey Walton | 2017-11-15 | 1 | -1/+1
* Clear missing newline warning | Jeffrey Walton | 2017-10-12 | 1 | -1/+1
* Add CRYPTOPP_NO_CPU_FEATURE_PROBES (GH #511) | Jeffrey Walton | 2017-09-19 | 1 | -1/+3
  We determine machine capabilities by performing an OS/platform *query* first, like getauxval(). If the *query* fails, we move on to a cpu *probe*. The cpu *probe* tries to execute an instruction and then catches a SIGILL on Linux or the exception EXCEPTION_ILLEGAL_INSTRUCTION on Windows. Some OSes, like Apple OSes, fail to handle a SIGILL gracefully. Apple machines corrupt memory and variables around the probe.
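The query-then-probe pattern described above can be sketched for POSIX systems as follows. Every name here except `CRYPTOPP_NO_CPU_FEATURE_PROBES` is hypothetical, the query is passed in as a plain function pointer, and `raise(SIGILL)` stands in for executing an unsupported instruction. This is a sketch of the general technique under those assumptions, not the library's implementation, and it does not cover the Windows exception path.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>

typedef bool (*SketchQueryFn)();   // asks the OS; false means "no" or "cannot tell"
typedef void (*SketchProbeFn)();   // executes the instruction under test

static sigjmp_buf s_sketchProbeEnv;
static void SketchSigillHandler(int) { siglongjmp(s_sketchProbeEnv, 1); }

// Returns true if `probe` ran to completion without raising SIGILL.
inline bool SketchCpuProbe(SketchProbeFn probe)
{
#if defined(CRYPTOPP_NO_CPU_FEATURE_PROBES)
    (void)probe;
    return false;                  // probes compiled out, e.g. on Apple platforms
#else
    struct sigaction sa, old;
    sa.sa_handler = SketchSigillHandler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGILL, &sa, &old) != 0)
        return false;

    volatile bool ok = false;
    if (sigsetjmp(s_sketchProbeEnv, 1) == 0)   // 1: save and restore the signal mask
    {
        probe();
        ok = true;
    }
    sigaction(SIGILL, &old, NULL); // always restore the previous handler
    return ok;
#endif
}

// Query first; fall back to the probe only when the query does not succeed.
inline bool SketchHasFeature(SketchQueryFn query, SketchProbeFn probe)
{
    return query() || SketchCpuProbe(probe);
}

// Stand-ins used to exercise the sketch.
static bool SketchQueryFails()     { return false; }
static void SketchSupportedInsn()  { /* a real probe would run the instruction */ }
static void SketchUnsupportedInsn(){ raise(SIGILL); }
```

With these stand-ins, `SketchHasFeature(SketchQueryFails, SketchSupportedInsn)` reports the feature as present, while `SketchHasFeature(SketchQueryFails, SketchUnsupportedInsn)` reports it as absent because the handler unwinds out of the probe.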
* Fix armeabi and armv7-a for Android (GH #509) | Jeffrey Walton | 2017-09-17 | 1 | -1/+1
* Add Aarch64 specific defines to Android cross-compile | Jeffrey Walton | 2017-09-13 | 1 | -5/+1
  Move <arm_acle.h> logic into "config.h". Detecting when we can/should include <arm_acle.h> is proving to be troublesome.
* Guard <arm_acle.h> include for GCC 4.8 | Jeffrey Walton | 2017-09-12 | 1 | -6/+7
  Use system includes for <arm_neon.h> and <arm_acle.h>.
* Fix SunCC crash when compiling GCM | Jeffrey Walton | 2017-08-27 | 1 | -2/+2
* Support Base Implementation + SIMD implementation on Solaris (PR #461) | Jeffrey Walton | 2017-08-24 | 1 | -0/+6
* Remove BOOL macro value (GH #462) | Jeffrey Walton | 2017-08-20 | 1 | -1/+1
  Currently the CRYPTOPP_BOOL_XXX macros set the macro value to 0 or 1. If we remove setting the 0 value (the #else part of the expression), then the self tests speed up by about 0.3 seconds. I can't explain it, but I have observed it repeatedly. This check-in prepares for the removal in Upstream master.
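The change described above relies on a standard preprocessor rule: an identifier that is not defined evaluates to 0 inside `#if`, so dropping the `#else` branch does not change which code is selected. A minimal sketch with hypothetical `SKETCH_*` macros (not the real CRYPTOPP_BOOL_XXX definitions):

```cpp
#include <cassert>

// Pretend platform test: condition A holds, condition B does not.
#define SKETCH_CONDITION_A 1
// SKETCH_CONDITION_B is intentionally left undefined.

// Only the "1" case is defined; there is no #else setting the macro to 0.
#if defined(SKETCH_CONDITION_A)
# define SKETCH_BOOL_A 1
#endif
#if defined(SKETCH_CONDITION_B)
# define SKETCH_BOOL_B 1
#endif

inline bool SketchUsesA()
{
#if SKETCH_BOOL_A          // defined as 1, so this branch is compiled
    return true;
#else
    return false;
#endif
}

inline bool SketchUsesB()
{
#if SKETCH_BOOL_B          // undefined, so it evaluates to 0 in #if
    return true;
#else
    return false;
#endif
}
```

One trade-off to be aware of: compilers invoked with a warning such as GCC's `-Wundef` will flag the `#if SKETCH_BOOL_B` test, and the macro can no longer be used as a value in ordinary C++ expressions when it is undefined.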
* Update comments | Jeffrey Walton | 2017-08-19 | 1 | -1/+1
* Guard use of SIGILL probes on Apple platforms | Jeffrey Walton | 2017-08-17 | 1 | -0/+5
* Split source files to support Base Implementation + SIMD implementation (GH #461) | Jeffrey Walton | 2017-08-17 | 1 | -0/+610