summaryrefslogtreecommitdiff
path: root/cipher/cipher-internal.h
Commit message (Collapse)AuthorAgeFilesLines
* Add x86 HW acceleration for GCM-SIV counter modeJussi Kivilinna2021-08-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm-siv.c (do_ctr_le32): Use bulk function if available. * cipher/cipher-internal.h (cipher_bulk_ops): Add 'ctr32le_enc'. * cipher/rijndael-aesni.c (_gcry_aes_aesni_ctr32le_enc): New. * cipher/rijndael-vaes-avx2-amd64.S (_gcry_vaes_avx2_ctr32le_enc_amd64, .Lle_addd_*): New. * cipher/rijndael-vaes.c (_gcry_vaes_avx2_ctr32le_enc_amd64) (_gcry_aes_vaes_ctr32le_enc): New. * cipher/rijndael.c (_gcry_aes_aesni_ctr32le_enc) (_gcry_aes_vaes_ctr32le_enc): New prototypes. (do_setkey): Add setup of 'bulk_ops->ctr32le_enc' for AES-NI and VAES. * tests/basic.c (check_gcm_siv_cipher): Add large test-vector for bulk ops testing. -- Counter mode in GCM-SIV is little-endian on first 4 bytes of of counter block, unlike regular CTR mode which works on big-endian full block. Benchmark on AMD Ryzen 7 5800X: Before: AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GCM-SIV enc | 1.00 ns/B 953.2 MiB/s 4.85 c/B 4850 GCM-SIV dec | 1.01 ns/B 940.1 MiB/s 4.92 c/B 4850 GCM-SIV auth | 0.118 ns/B 8051 MiB/s 0.575 c/B 4850 After (~6x faster): AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GCM-SIV enc | 0.150 ns/B 6367 MiB/s 0.727 c/B 4850 GCM-SIV dec | 0.161 ns/B 5909 MiB/s 0.783 c/B 4850 GCM-SIV auth | 0.118 ns/B 8051 MiB/s 0.574 c/B 4850 GnuPG-bug-id: T4485 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add AES-GCM-SIV mode (RFC 8452)Jussi Kivilinna2021-08-261-1/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-siv.c'. * cipher/cipher-gcm-siv.c: New. * cipher/cipher-gcm.c (_gcry_cipher_gcm_setupM): New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'siv_keylen'. (_gcry_cipher_gcm_setupM, _gcry_cipher_gcm_siv_encrypt) (_gcry_cipher_gcm_siv_decrypt, _gcry_cipher_gcm_siv_set_nonce) (_gcry_cipher_gcm_siv_authenticate) (_gcry_cipher_gcm_siv_set_decryption_tag) (_gcry_cipher_gcm_siv_get_tag, _gcry_cipher_gcm_siv_check_tag) (_gcry_cipher_gcm_siv_setkey): New prototypes. (cipher_block_bswap): New helper function. * cipher/cipher.c (_gcry_cipher_open_internal): Add 'GCRY_CIPHER_MODE_GCM_SIV'; Refactor mode requirement checks for better size optimization (check pointers & blocksize in same order for all). (cipher_setkey, cipher_reset, _gcry_cipher_setup_mode_ops) (_gcry_cipher_setup_mode_ops, _gcry_cipher_info): Add GCM-SIV. (_gcry_cipher_ctl): Handle 'set decryption tag' for GCM-SIV. * doc/gcrypt.texi: Add GCM-SIV. * src/gcrypt.h.in (GCRY_CIPHER_MODE_GCM_SIV): New. (GCRY_SIV_BLOCK_LEN, gcry_cipher_set_decryption_tag): Add to comment that these are also for GCM-SIV in addition to SIV mode. * tests/basic.c (check_gcm_siv_cipher): New. (check_cipher_modes): Check for GCM-SIV. * tests/bench-slope.c (bench_gcm_siv_encrypt_do_bench) (bench_gcm_siv_decrypt_do_bench, bench_gcm_siv_authenticate_do_bench) (gcm_siv_encrypt_ops, gcm_siv_decrypt_ops) (gcm_siv_authenticate_ops): New. (cipher_modes): Add GCM-SIV. (cipher_bench_one): Check key length requirement for GCM-SIV. -- GnuPG-bug-id: T4485 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add SIV mode (RFC 5297)Jussi Kivilinna2021-08-261-0/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-siv.c'. * cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Rename to _gcry_cipher_ctr_encrypt_ctx and add algo context parameter. (_gcry_cipher_ctr_encrypt): New using _gcry_cipher_ctr_encrypt_ctx. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.siv'. (_gcry_cipher_ctr_encrypt_ctx, _gcry_cipher_siv_encrypt) (_gcry_cipher_siv_decrypt, _gcry_cipher_siv_set_nonce) (_gcry_cipher_siv_authenticate, _gcry_cipher_siv_set_decryption_tag) (_gcry_cipher_siv_get_tag, _gcry_cipher_siv_check_tag) (_gcry_cipher_siv_setkey): New. * cipher/cipher-siv.c: New. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_reset, _gcry_cipher_setup_mode_ops, _gcry_cipher_info): Add GCRY_CIPHER_MODE_SIV handling. (_gcry_cipher_ctl): Add GCRYCTL_SET_DECRYPTION_TAG handling. * doc/gcrypt.texi: Add documentation for SIV mode. * src/gcrypt.h.in (GCRYCTL_SET_DECRYPTION_TAG): New. (GCRY_CIPHER_MODE_SIV): New. (gcry_cipher_set_decryption_tag): New. * tests/basic.c (check_siv_cipher): New. (check_cipher_modes): Add call for 'check_siv_cipher'. * tests/bench-slope.c (bench_encrypt_init): Use double size key for SIV mode. (bench_aead_encrypt_do_bench, bench_aead_decrypt_do_bench) (bench_aead_authenticate_do_bench): Reset cipher context on each run. (bench_aead_authenticate_do_bench): Support nonce-less operation. (bench_siv_encrypt_do_bench, bench_siv_decrypt_do_bench) (bench_siv_authenticate_do_bench, siv_encrypt_ops) (siv_decrypt_ops, siv_authenticate_ops): New. (cipher_modes): Add SIV mode benchmarks. (cipher_bench_one): Restrict SIV mode testing to 16 byte block-size. -- GnuPG-bug-id: T4486 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher-gcm-ppc: add big-endian supportJussi Kivilinna2021-04-011-1/+1
| | | | | | | | | | | | | | | | | | * cipher/cipher-gcm-ppc.c (ALIGNED_16): New. (vec_store_he, vec_load_he): Remove WORDS_BIGENDIAN ifdef. (vec_dup_byte_elem): New. (_gcry_ghash_setup_ppc_vpmsum): Match function declaration with prototype in cipher-gcm.c; Load C2 with VEC_LOAD_BE; Use vec_dup_byte_elem; Align constants to 16 bytes. (_gcry_ghash_ppc_vpmsum): Match function declaration with prototype in cipher-gcm.c; Align constant to 16 bytes. * cipher/cipher-gcm.c (ghash_ppc_vpmsum): Return value from _gcry_ghash_ppc_vpmsum. * cipher/cipher-internal.h (GCM_USE_PPC_VPMSUM): Remove requirement for !WORDS_BIGENDIAN. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* VPMSUMD acceleration for GCM mode on PPCShawn Landden2021-03-071-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-ppc.c'. * cipher/cipher-gcm-ppc.c: New. * cipher/cipher-gcm.c [GCM_USE_PPC_VPMSUM] (_gcry_ghash_setup_ppc_vpmsum) (_gcry_ghash_ppc_vpmsum, ghash_setup_ppc_vpsum, ghash_ppc_vpmsum): New. (setupM) [GCM_USE_PPC_VPMSUM]: Select ppc-vpmsum implementation if HW feature "ppc-vcrypto" is available. * cipher/cipher-internal.h (GCM_USE_PPC_VPMSUM): New. (gcry_cipher_handle): Move 'ghash_fn' at end of 'gcm' block to align 'gcm_table' to 16 bytes. * configure.ac: Add 'cipher-gcm-ppc.lo'. * tests/basic.c (_check_gcm_cipher): New AES256 test vector. * AUTHORS: Add 'CRYPTOGAMS'. * LICENSES: Add original license to 3-clause-BSD section. -- https://dev.gnupg.org/D501: 10-20X speed. However this Power 9 machine is faster than the last Power 9 benchmarks on the optimized versions, so while better than the last patch, it is not all due to the code. Before: GCM enc | 4.23 ns/B 225.3 MiB/s - c/B GCM dec | 3.58 ns/B 266.2 MiB/s - c/B GCM auth | 3.34 ns/B 285.3 MiB/s - c/B After: GCM enc | 0.370 ns/B 2578 MiB/s - c/B GCM dec | 0.371 ns/B 2571 MiB/s - c/B GCM auth | 0.159 ns/B 6003 MiB/s - c/B Signed-off-by: Shawn Landden <shawn@git.icu> [jk: coding style fixes, Makefile.am integration, patch from Differential to git, commit changelog, fixed few compiler warnings] GnuPG-bug-id: 5040 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add bulk AES-GCM acceleration for s390x/zSeriesJussi Kivilinna2020-12-181-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'asm-inline-s390x.h'. * cipher/asm-inline-s390x.h: New. * cipher/cipher-gcm.c [GCM_USE_S390X_CRYPTO] (ghash_s390x_kimd): New. (setupM) [GCM_USE_S390X_CRYPTO]: Add setup for s390x GHASH function. * cipher/cipher-internal.h (GCM_USE_S390X_CRYPTO): New. * cipher/rijndael-s390x.c (u128_t, km_functions_e): Move to 'asm-inline-s390x.h'. (aes_s390x_gcm_crypt): New. (_gcry_aes_s390x_setup_acceleration): Use 'km_function_to_mask'; Add setup for GCM bulk function. -- This patch adds zSeries acceleration for GHASH and AES-GCM. Benchmarks (z15, 5.2Ghz): Before: AES | nanosecs/byte mebibytes/sec cycles/byte GCM enc | 2.64 ns/B 361.6 MiB/s 13.71 c/B GCM dec | 2.64 ns/B 361.3 MiB/s 13.72 c/B GCM auth | 2.58 ns/B 370.1 MiB/s 13.40 c/B After: AES | nanosecs/byte mebibytes/sec cycles/byte GCM enc | 0.059 ns/B 16066 MiB/s 0.309 c/B GCM dec | 0.059 ns/B 16114 MiB/s 0.308 c/B GCM auth | 0.057 ns/B 16747 MiB/s 0.296 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add bulk function interface for GCM modeJussi Kivilinna2020-12-181-0/+2
| | | | | | | | | | | | * cipher/cipher-gcm.c (do_ghash_buf): Proper handling for the case where 'unused' gets filled to full blocksize. (gcm_crypt_inner): New. (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Use 'gcm_crypt_inner'. * cipher/cipher-internal.h (cipher_bulk_ops_t): Add 'gcm_crypt'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add bulk function interface for OFB modeJussi Kivilinna2020-12-181-0/+2
| | | | | | | | | | * cipher/cipher-internal.h (cipher_bulk_ops): Add 'ofb_enc'. * cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt): Use bulk encryption function if defined. * cipher/basic.c (check_bulk_cipher_modes): Add OFB-AES test vectors. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher: setup bulk functions at each algorithms key setupJussi Kivilinna2020-09-271-45/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (cipher_mode_ops_t, cipher_bulk_ops_t): New. (gcry_cipher_handle): Define members 'mode_ops' and 'bulk' using new types. * cipher/cipher.c (_gcry_cipher_open_internal): Remove bulk function setup. (cipher_setkey): Pass context bulk function pointer to algorithm setkey function. * cipher/cipher-selftest.c (_gcry_selftest_helper_cbc) (_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Remove bulk function parameter; Use bulk function returned by setkey function. * cipher/cipher-selftest.h (_gcry_selftest_helper_cbc) (_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Remove bulk function parameter. * cipher/arcfour.c (arcfour_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/blowfish.c (bf_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec) (_gcry_blowfish_cfb_dec): Make static. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. (selftest): Pass 'bulk_ops' to setkey function. * cipher/camellia.c (camellia_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_camellia_ctr_enc, _gcry_camellia_cbc_dec) (_gcry_camellia_cfb_dec, _gcry_camellia_ocb_crypt) (_gcry_camellia_ocb_auth): Make static. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. (selftest): Pass 'bulk_ops' to setkey function. * cipher/cast5.c (cast_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec, _gcry_cast5_cfb_dec): Make static. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. (selftest): Pass 'bulk_ops' to setkey function. * cipher/chacha20.c (chacha20_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/cast5.c (do_tripledes_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_3des_ctr_enc, _gcry_3des_cbc_dec, _gcry_3des_cfb_dec): Make static. (bulk_selftest_setkey): Change 'hd' parameter to 'bulk_ops'. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. (do_des_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/gost28147.c (gost_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/idea.c (idea_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/rfc2268.c (do_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/rijndael.c (do_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (rijndael_setkey): Change 'hd' parameter to 'bulk_ops'. (_gcry_aes_cfb_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_enc) (_gcry_aes_cbc_dec, _gcry_aes_ctr_enc, _gcry_aes_ocb_crypt) (_gcry_aes_ocb_auth, _gcry_aes_xts_crypt): Make static. (selftest_basic_128, selftest_basic_192, selftest_basic_256): Pass 'bulk_ops' to setkey function. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. * cipher/salsa20.c (salsa20_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/seed.c (seed_setkey): Change 'hd' parameter to 'bulk_ops'. * cipher/serpent.c (serpent_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec, _gcry_serpent_cfb_dec) (_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Make static. (selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Do not pass bulk function to selftest helper. * cipher/sm4.c (sm4_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec) (_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth): Make static. (selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Do not pass bulk function to selftest helper. * cipher/twofish.c (twofish_setkey): Change 'hd' parameter to 'bulk_ops'; Setup 'bulk_ops' with bulk acceleration functions. (_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec) (_gcry_twofish_cfb_dec, _gcry_twofish_ocb_crypt) (_gcry_twofish_ocb_auth): Make static. (selftest_ctr, selftest_cbc, selftest_cfb): Do not pass bulk function to selftest helper. (selftest, main): Pass 'bulk_ops' to setkey function. * src/cipher-proto.h: Forward declare 'cipher_bulk_ops_t'. (gcry_cipher_setkey_t): Replace 'hd' with 'bulk_ops'. * src/cipher.h: Remove bulk acceleration function prototypes for 'aes', 'blowfish', 'cast5', 'camellia', '3des', 'serpent', 'sm4' and 'twofish'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add gcry_cipher_ctl command to allow weak keys in testing use-casesJussi Kivilinna2020-02-021-0/+1
| | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Add 'marks.allow_weak_key' flag. * cipher/cipher.c (cipher_setkey): Do not handle weak key as error when weak keys are allowed. (cipher_reset): Preserve 'marks.allow_weak_key' flag on object reset. (_gcry_cipher_ctl): Add handling for GCRYCTL_SET_ALLOW_WEAK_KEY. * src/gcrypt.h.in (gcry_ctl_cmds): Add GCRYCTL_SET_ALLOW_WEAK_KEY. * tests/basic.c (check_ecb_cipher): Add tests for weak key errors and for GCRYCTL_SET_ALLOW_WEAK_KEY. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Optimizations for generic table-based GCM implementationsJussi Kivilinna2019-04-271-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c [GCM_TABLES_USE_U64] (do_fillM): Precalculate M[32..63] values. [GCM_TABLES_USE_U64] (do_ghash): Split processing of two 64-bit halfs of the input to two separate loops; Use precalculated M[] values. [GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_fillM): Precalculate M[64..127] values. [GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_ghash): Use precalculated M[] values. [GCM_USE_TABLES] (bshift): Avoid conditional execution for mask calculation. * cipher/cipher-internal.h (gcry_cipher_handle): Double gcm_table size. -- Benchmark on Intel Haswell (amd64, --disable-hwf all): Before: | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GMAC_AES | 2.79 ns/B 341.3 MiB/s 11.17 c/B 3998 After (~36% faster): | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GMAC_AES | 2.05 ns/B 464.7 MiB/s 8.20 c/B 3998 Benchmark on Intel Haswell (win32, --disable-hwf all): Before: | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GMAC_AES | 4.90 ns/B 194.8 MiB/s 19.57 c/B 3997 After (~36% faster): | nanosecs/byte mebibytes/sec cycles/byte auto Mhz GMAC_AES | 3.58 ns/B 266.4 MiB/s 14.31 c/B 3999 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add helper function for adding value to cipher blockJussi Kivilinna2019-03-311-0/+23
| | | | | | | | | | | | | | | | * cipher/cipher-internal.h (cipher_block_add): New. * cipher/blowfish.c (_gcry_blowfish_ctr_enc): Use new helper function for CTR block increment. * cipher/camellia-glue.c (_gcry_camellia_ctr_enc): Ditto. * cipher/cast5.c (_gcry_cast5_ctr_enc): Ditto. * cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto. * cipher/des.c (_gcry_3des_ctr_enc): Ditto. * cipher/rijndael.c (_gcry_aes_ctr_enc): Ditto. * cipher/serpent.c (_gcry_serpent_ctr_enc): Ditto. * cipher/twofish.c (_gcry_twofish_ctr_enc): Ditto. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add ARMv7/NEON accelerated GCM implementationJussi Kivilinna2019-03-231-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-armv7-neon.S'. * cipher/cipher-gcm-armv7-neon.S: New. * cipher/cipher-gcm.c [GCM_USE_ARM_NEON] (_gcry_ghash_setup_armv7_neon) (_gcry_ghash_armv7_neon, ghash_setup_armv7_neon) (ghash_armv7_neon): New. (setupM) [GCM_USE_ARM_NEON]: Use armv7/neon implementation if have HWF_ARM_NEON. * cipher/cipher-internal.h (GCM_USE_ARM_NEON): New. -- Benchmark on Cortex-A53 (816 Mhz): Before: | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 34.81 ns/B 27.40 MiB/s 28.41 c/B After (3.0x faster): | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 11.49 ns/B 82.99 MiB/s 9.38 c/B Reported-by: Yuriy M. Kaminskiy <yumkam@gmail.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Do not precalculate OCB offset L0+L1+L0Jussi Kivilinna2019-01-271-1/+0
| | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Remove OCB L0L1L0. * cipher/cipher-ocb.c (_gcry_cipher_ocb_setkey): Ditto. * cipher/rijndael-aesni.c (aesni_ocb_enc, aesni_ocb_dec) (_gcry_aes_aesni_ocb_auth): Replace L0L1L0 use with L1. -- Patch fixes L0+L1+L0 thinko. This is same as L1 (L0 xor L1 xor L0). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Calculate OCB L-tables when setting key instead of when setting nonceJussi Kivilinna2019-01-271-0/+6
| | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Mark areas of u_mode.ocb that are and are not cleared by gcry_cipher_reset. (_gcry_cipher_ocb_setkey): New. * cipher/cipher-ocb.c (_gcry_cipher_ocb_set_nonce): Split L-table generation to ... (_gcry_cipher_ocb_setkey): ... this new function. * cipher/cipher.c (cipher_setkey): Add handling for OCB mode. (cipher_reset): Do not clear L-values for OCB mode. -- OCB L-tables do not depend on nonce value, but only on cipher key. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add stitched ChaCha20-Poly1305 SSSE3 and AVX2 implementationsJussi Kivilinna2019-01-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/asm-poly1305-amd64.h: New. * cipher/Makefile.am: Add 'asm-poly1305-amd64.h'. * cipher/chacha20-amd64-avx2.S (QUATERROUND2): Add interleave operators. (_gcry_chacha20_poly1305_amd64_avx2_blocks8): New. * cipher/chacha20-amd64-ssse3.S (QUATERROUND2): Add interleave operators. (_gcry_chacha20_poly1305_amd64_ssse3_blocks4) (_gcry_chacha20_poly1305_amd64_ssse3_blocks1): New. * cipher/chacha20.c (_gcry_chacha20_poly1305_amd64_ssse3_blocks4) (_gcry_chacha20_poly1305_amd64_ssse3_blocks1) (_gcry_chacha20_poly1305_amd64_avx2_blocks8): New prototypes. (chacha20_encrypt_stream): Split tail to... (do_chacha20_encrypt_stream_tail): ... new function. (_gcry_chacha20_poly1305_encrypt) (_gcry_chacha20_poly1305_decrypt): New. * cipher/cipher-internal.h (_gcry_chacha20_poly1305_encrypt) (_gcry_chacha20_poly1305_decrypt): New prototypes. * cipher/cipher-poly1305.c (_gcry_cipher_poly1305_encrypt): Call '_gcry_chacha20_poly1305_encrypt' if cipher is ChaCha20. (_gcry_cipher_poly1305_decrypt): Call '_gcry_chacha20_poly1305_decrypt' if cipher is ChaCha20. * cipher/poly1305-internal.h (_gcry_cipher_poly1305_update_burn): New prototype. * cipher/poly1305.c (poly1305_blocks): Make static. (_gcry_poly1305_update): Split main function body to ... (_gcry_poly1305_update_burn): ... new function. -- Benchmark on Intel Skylake (i5-6500, 3200 Mhz): Before, 8-way AVX2: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.378 ns/B 2526 MiB/s 1.21 c/B STREAM dec | 0.373 ns/B 2560 MiB/s 1.19 c/B POLY1305 enc | 0.685 ns/B 1392 MiB/s 2.19 c/B POLY1305 dec | 0.686 ns/B 1390 MiB/s 2.20 c/B POLY1305 auth | 0.315 ns/B 3031 MiB/s 1.01 c/B After, 8-way AVX2 (~36% faster): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte POLY1305 enc | 0.503 ns/B 1896 MiB/s 1.61 c/B POLY1305 dec | 0.485 ns/B 1965 MiB/s 1.55 c/B Benchmark on Intel Haswell (i7-4790K, 3998 Mhz): Before, 8-way AVX2: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.318 ns/B 2999 MiB/s 1.27 c/B STREAM dec | 0.317 ns/B 3004 MiB/s 1.27 c/B POLY1305 enc | 0.586 ns/B 1627 MiB/s 2.34 c/B POLY1305 dec | 0.586 ns/B 1627 MiB/s 2.34 c/B POLY1305 auth | 0.271 ns/B 3524 MiB/s 1.08 c/B After, 8-way AVX2 (~30% faster): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte POLY1305 enc | 0.452 ns/B 2108 MiB/s 1.81 c/B POLY1305 dec | 0.440 ns/B 2167 MiB/s 1.76 c/B Before, 4-way SSSE3: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.627 ns/B 1521 MiB/s 2.51 c/B STREAM dec | 0.626 ns/B 1523 MiB/s 2.50 c/B POLY1305 enc | 0.895 ns/B 1065 MiB/s 3.58 c/B POLY1305 dec | 0.896 ns/B 1064 MiB/s 3.58 c/B POLY1305 auth | 0.271 ns/B 3521 MiB/s 1.08 c/B After, 4-way SSSE3 (~20% faster): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte POLY1305 enc | 0.733 ns/B 1301 MiB/s 2.93 c/B POLY1305 dec | 0.726 ns/B 1314 MiB/s 2.90 c/B Before, 1-way SSSE3: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte POLY1305 enc | 1.56 ns/B 609.6 MiB/s 6.25 c/B POLY1305 dec | 1.56 ns/B 609.4 MiB/s 6.26 c/B After, 1-way SSSE3 (~18% faster): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte POLY1305 enc | 1.31 ns/B 725.4 MiB/s 5.26 c/B POLY1305 dec | 1.31 ns/B 727.3 MiB/s 5.24 c/B For comparison to other libraries (on Intel i7-4790K, 3998 Mhz): bench-slope-openssl: OpenSSL 1.1.1 11 Sep 2018 Cipher: chacha20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.301 ns/B 3166.4 MiB/s 1.20 c/B STREAM dec | 0.300 ns/B 3174.7 MiB/s 1.20 c/B POLY1305 enc | 0.463 ns/B 2060.6 MiB/s 1.85 c/B POLY1305 dec | 0.462 ns/B 2063.8 MiB/s 1.85 c/B POLY1305 auth | 0.162 ns/B 5899.3 MiB/s 0.646 c/B bench-slope-nettle: Nettle 3.4 Cipher: chacha | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 1.65 ns/B 578.2 MiB/s 6.59 c/B STREAM dec | 1.65 ns/B 578.2 MiB/s 6.59 c/B POLY1305 enc | 2.05 ns/B 464.8 MiB/s 8.20 c/B POLY1305 dec | 2.05 ns/B 464.7 MiB/s 8.20 c/B POLY1305 auth | 0.404 ns/B 2359.1 MiB/s 1.62 c/B bench-slope-botan: Botan 2.6.0 Cipher: ChaCha | nanosecs/byte mebibytes/sec cycles/byte STREAM enc/dec | 0.855 ns/B 1116.0 MiB/s 3.42 c/B POLY1305 enc | 1.60 ns/B 595.4 MiB/s 6.40 c/B POLY1305 dec | 1.60 ns/B 595.8 MiB/s 6.40 c/B POLY1305 auth | 0.752 ns/B 1268.3 MiB/s 3.01 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Optimizations for AES-NI OCBJussi Kivilinna2018-11-201-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): New pre-computed OCB values L0L1 and L0L1L0; Swap dimensions for OCB L table. * cipher/cipher-ocb.c (_gcry_cipher_ocb_set_nonce): Setup L0L1 and L0L1L0 values. (ocb_crypt): Process input in 24KiB chunks for better cache locality for checksumming. * cipher/rijndael-aesni.c (ALWAYS_INLINE): New macro for always inlining functions, change all functions with 'inline' to use ALWAYS_INLINE. (NO_INLINE): New macro. (aesni_prepare_2_6_variable, aesni_prepare_7_15_variable): Rename to... (aesni_prepare_2_7_variable, aesni_prepare_8_15_variable): ...these and adjust accordingly (xmm7 moved from *_7_15 to *_2_7). (aesni_prepare_2_6, aesni_prepare_7_15): Rename to... (aesni_prepare_2_7, aesni_prepare_8_15): ...these and adjust accordingly. (aesni_cleanup_2_6, aesni_cleanup_7_15): Rename to... (aesni_cleanup_2_7, aesni_cleanup_8_15): ...these and adjust accordingly. (aesni_ocb_checksum): New. (aesni_ocb_enc, aesni_ocb_dec): Calculate OCB offsets in parallel with help of pre-computed offsets L0+L1 ja L0+L1+L0; Do checksum calculation as separate pass instead of inline; Use NO_INLINE. (_gcry_aes_aesni_ocb_auth): Calculate OCB offsets in parallel with help of pre-computed offsets L0+L1 ja L0+L1+L0. * cipher/rijndael-internal.h (RIJNDAEL_context_s) [USE_AESNI]: Add 'use_avx2' and 'use_avx'. * cipher/rijndael.c (do_setkey) [USE_AESNI]: Set 'use_avx2' if Intel AVX2 HW feature is available and 'use_avx' if Intel AVX HW feature is available. * tests/basic.c (do_check_ocb_cipher): New test vector; increase size of temporary buffers for new test vector. (check_ocb_cipher_largebuf_split): Make test plaintext non-uniform for better checksum testing. (check_ocb_cipher_checksum): New. (check_ocb_cipher_largebuf): Call check_ocb_cipher_checksum. (check_ocb_cipher): New expected tags for check_ocb_cipher_largebuf test runs. -- Benchmark on Haswell i7-4970k @ 4.0Ghz: Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 0.175 ns/B 5436 MiB/s 0.702 c/B OCB dec | 0.184 ns/B 5184 MiB/s 0.736 c/B OCB auth | 0.156 ns/B 6097 MiB/s 0.626 c/B After (enc +2% faster, dec +7% faster): OCB enc | 0.172 ns/B 5547 MiB/s 0.688 c/B OCB dec | 0.171 ns/B 5582 MiB/s 0.683 c/B OCB auth | 0.156 ns/B 6097 MiB/s 0.626 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add size optimized cipher block copy and xor functionsJussi Kivilinna2018-07-211-0/+141
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/bufhelp.h (buf_get_he32, buf_put_he32, buf_get_he64) (buf_put_he64): New. * cipher/cipher-internal.h (cipher_block_cpy, cipher_block_xor) (cipher_block_xor_1, cipher_block_xor_2dst, cipher_block_xor_n_copy_2) (cipher_block_xor_n_copy): New. * cipher/cipher-gcm-intel-pclmul.c (_gcry_ghash_setup_intel_pclmul): Use assembly for swapping endianness instead of buf_get_be64 and buf_cpy. * cipher/blowfish.c: Use new cipher_block_* functions for cipher block sized buf_cpy/xor* operations. * cipher/camellia-glue.c: Ditto. * cipher/cast5.c: Ditto. * cipher/cipher-aeswrap.c: Ditto. * cipher/cipher-cbc.c: Ditto. * cipher/cipher-ccm.c: Ditto. * cipher/cipher-cfb.c: Ditto. * cipher/cipher-cmac.c: Ditto. * cipher/cipher-ctr.c: Ditto. * cipher/cipher-eax.c: Ditto. * cipher/cipher-gcm.c: Ditto. * cipher/cipher-ocb.c: Ditto. * cipher/cipher-ofb.c: Ditto. * cipher/cipher-xts.c: Ditto. * cipher/des.c: Ditto. * cipher/rijndael.c: Ditto. * cipher/serpent.c: Ditto. * cipher/twofish.c: Ditto. -- This commit adds size-optimized functions for copying and xoring cipher block sized buffers. These functions also allow GCC to use inline auto-vectorization for block cipher copying and xoring on higher optimization levels. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Access cipher mode routines through routine pointersJussi Kivilinna2018-06-191-2/+24
| | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Add function pointers for mode operations. (_gcry_cipher_xts_crypt): Remove. (_gcry_cipher_xts_encrypt, _gcry_cipher_xts_decrypt): New. * cipher/cipher-xts.c (_gcry_cipher_xts_encrypt) (_gcry_cipher_xts_decrypt): New. * cipher/cipher.c (_gcry_cipher_setup_mode_ops): New. (_gcry_cipher_open_internal): Setup mode routines. (cipher_encrypt, cipher_decrypt): Remove. (do_stream_encrypt, do_stream_decrypt, do_encrypt_none_unknown) (do_decrypt_none_unknown): New. (_gcry_cipher_encrypt, _gcry_cipher_decrypt, _gcry_cipher_setiv) (_gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Adapted to use mode routines through pointers. -- Change to use mode operations through pointers to reduce per call overhead for cipher operations. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add separate handlers for CBC-CTS variantJussi Kivilinna2018-06-191-0/+8
| | | | | | | | | | | | | | | | | * cipher/cipher-cbc.c (cbc_encrypt_inner, cbc_decrypt_inner) (_gcry_cipher_cbc_cts_encrypt, _gcry_cipher_cbc_cts_decrypt): New. (_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt): Remove CTS handling. * cipher/cipher-internal.h (_gcry_cipher_cbc_cts_encrypt) (_gcry_cipher_cbc_cts_decrypt): New. * cipher/cipher.c (cipher_encrypt, cipher_decrypt): Call CBC-CTS handler if CBC-CTS flag is set. -- Separate CTS handling to separate function for small decrease in CBC per call overhead. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Avoid division by spec->blocksize in cipher mode handlersJussi Kivilinna2018-06-191-0/+10
| | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (_gcry_blocksize_shift): New. * cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt) (_gcry_cipherp_cbc_decrypt): Use bit-level operations instead of division to get number of blocks and check input length against blocksize. * cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt) (_gcry_cipher_cfb_decrypt): Ditto. * cipher/cipher-cmac.c (_gcry_cmac_write): Ditto. * cipher/cipher-ctr.c (_gcry_cipher_ctr_crypt): Ditto. * cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt) (_gcry_cipher_ofb_decrypt): Ditto. -- Integer division was causing 10 to 20 cycles per call overhead for cipher modes on x86-64. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add EAX modeJussi Kivilinna2018-01-201-7/+64
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-eax.c'. * cipher/cipher-cmac.c (cmac_write): Rename to ... (_gcry_cmac_write): ... this; Take CMAC context as new input parameter; Return error code. (cmac_generate_subkeys): Rename to ... (_gcry_cmac_generate_subkeys): ... this; Take CMAC context as new input parameter; Return error code. (cmac_final): Rename to ... (_gcry_cmac_final): ... this; Take CMAC context as new input parameter; Return error code. (cmac_tag): Take CMAC context as new input parameter. (_gcry_cmac_reset): New. (_gcry_cipher_cmac_authenticate): Remove duplicate tag flag check; Adapt to changes above. (_gcry_cipher_cmac_get_tag): Adapt to changes above. (_gcry_cipher_cmac_check_tag): Ditto. (_gcry_cipher_cmac_set_subkeys): Ditto. * cipher-eax.c: New. * cipher-internal.h (gcry_cmac_context_t): New. (gcry_cipher_handle): Update u_mode.cmac; Add u_mode.eax. (_gcry_cmac_write, _gcry_cmac_generate_subkeys, _gcry_cmac_final) (_gcry_cmac_reset, _gcry_cipher_eax_encrypt, _gcry_cipher_eax_decrypt) (_gcry_cipher_eax_set_nonce, _gcry_cipher_eax_authenticate) (_gcry_cipher_eax_get_tag, _gcry_cipher_eax_check_tag) (_gcry_cipher_eax_setkey): New prototypes. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_reset, cipher_encrypt, cipher_decrypt, _gcry_cipher_setiv) (_gcry_cipher_authenticate, _gcry_cipher_gettag, _gcry_cipher_checktag) (_gcry_cipher_info): Add EAX mode. * doc/gcrypt.texi: Add EAX mode. * src/gcrypt.h.in (GCRY_CIPHER_MODE_EAX): New. * tests/basic.c (_check_gcm_cipher, _check_poly1305_cipher): Constify test vectors array. (_check_eax_cipher, check_eax_cipher): New. (check_ciphers, check_cipher_modes): Add EAX mode. * tests/bench-slope.c (bench_eax_encrypt_do_bench) (bench_eax_decrypt_do_bench, bench_eax_authenticate_do_bench) (eax_encrypt_ops, eax_decrypt_ops, eax_authenticate_ops): New. (cipher_modes): Add EAX mode. * tests/benchmark.c (cipher_bench): Add EAX mode. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add AES-NI acceleration for AES-XTSJussi Kivilinna2018-01-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Change bulk XTS function to take cipher context. * cipher/cipher-xts.c (_gcry_cipher_xts_crypt): Ditto. * cipher/cipher.c (_gcry_cipher_open_internal): Setup AES-NI XTS bulk function. * cipher/rijndael-aesni.c (xts_gfmul_const, _gcry_aes_aesni_xts_enc) (_gcry_aes_aesni_xts_enc, _gcry_aes_aesni_xts_crypt): New. * cipher/rijndael.c (_gcry_aes_aesni_xts_crypt) (_gcry_aes_xts_crypt): New. * src/cipher.h (_gcry_aes_xts_crypt): New. -- Benchmarks on Intel Core i7-4790K, 4.0Ghz (no turbo): Before: XTS enc | 1.66 ns/B 575.7 MiB/s 6.63 c/B XTS dec | 1.66 ns/B 575.5 MiB/s 6.63 c/B After (~6x faster): XTS enc | 0.270 ns/B 3528.5 MiB/s 1.08 c/B XTS dec | 0.272 ns/B 3511.5 MiB/s 1.09 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Spelling fixes in docs and comments.NIIBE Yutaka2017-04-281-1/+1
| | | | | | | | -- GnuPG-bug-id: 3120 Reported-by: ka7 (klemens) Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
* Implement CFB with 8-bit modeMathias L. Baumann2017-02-041-0/+8
| | | | | | | | | | | | | | | * cipher/cipher-cfb.c (_gcry_cipher_cfb8_encrypt) (_gcry_cipher_cfg8_decrypt): Add 8-bit variants of decrypt/encrypt functions. * cipher/cipher-internal.h (_gcry_cipher_cfb8_encrypt) (_gcry_cipher_cfg8_decrypt): Ditto. * cipher/cipher.c: Adjust code flow to work with GCRY_CIPHER_MODE_CFB8. * tests/basic.c: Add tests for cfb8 with AES and 3DES. -- Signed-off-by: Mathias L. Baumann <mathias.baumann at sociomantic.com> [JK: edit changelog, fix email malformed patch] Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add XTS cipher modeJussi Kivilinna2017-01-061-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-xts.c'. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'bulk.xts_crypt' and 'u_mode.xts' members. (_gcry_cipher_xts_crypt): New prototype. * cipher/cipher-xts.c: New. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_reset, cipher_encrypt, cipher_decrypt): Add XTS mode handling. * doc/gcrypt.texi: Add XTS mode to documentation. * src/gcrypt.h.in (GCRY_CIPHER_MODE_XTS, GCRY_XTS_BLOCK_LEN): New. * tests/basic.c (do_check_xts_cipher, check_xts_cipher): New. (check_bulk_cipher_modes): Add XTS test-vectors. (check_one_cipher_core, check_one_cipher, check_ciphers): Add XTS testing support. (check_cipher_modes): Add XTS test. * tests/bench-slope.c (bench_xts_encrypt_init) (bench_xts_encrypt_do_bench, bench_xts_decrypt_do_bench) (xts_encrypt_ops, xts_decrypt_ops): New. (cipher_modes, cipher_bench_one): Add XTS. * tests/benchmark.c (cipher_bench): Add XTS testing. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* OCB: Move large L handling from bottom to upper levelJussi Kivilinna2016-12-101-18/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-ocb.c (_gcry_cipher_ocb_get_l): Remove. (ocb_get_L_big): New. (_gcry_cipher_ocb_authenticate): L-big handling done in upper processing loop, so that lower level never sees the case where 'aad_nblocks % 65536 == 0'; Add missing stack burn. (ocb_aad_finalize): Add missing stack burn. (ocb_crypt): L-big handling done in upper processing loop, so that lower level never sees the case where 'data_nblocks % 65536 == 0'. * cipher/cipher-internal.h (_gcry_cipher_ocb_get_l): Remove. (ocb_get_l): Remove 'l_tmp' usage and simplify since input is more limited now, 'N is not multiple of 65536'. * cipher/rijndael-aesni.c (get_l): Remove. (aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/rijndael-ssse3-amd64.c (get_l): Remove. (ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_auth): Remove l_tmp; Use 'ocb_get_l'. * cipher/camellia-glue.c: Remove OCB l_tmp usage. * cipher/rijndael-armv8-ce.c: Ditto. * cipher/rijndael.c: Ditto. * cipher/serpent.c: Ditto. * cipher/twofish.c: Ditto. -- Move large L value generation to up-most level to simplify lower level ocb_get_l for greater performance and simpler implementation. This helps implementing OCB in assembly as 'ocb_get_l' no longer has function call on slow-path. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add ARMv8/AArch64 Crypto Extension implementation of GCMJussi Kivilinna2016-09-051-0/+4
| | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch64-ce.S'. * cipher/cipher-gcm-armv8-aarch64-ce.S: New. * cipher/cipher-internal.h (GCM_USE_ARM_PMULL): Enable on ARMv8/AArch64. -- Benchmark on Cortex-A53 (1152 Mhz): Before: | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 15.54 ns/B 61.36 MiB/s 17.91 c/B After (11.9x faster): | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 1.30 ns/B 731.5 MiB/s 1.50 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add ARMv8/AArch32 Crypto Extension implementation of GCMJussi Kivilinna2016-07-141-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch32-ce.S'. * cipher/cipher-gcm-armv8-aarch32-ce.S: New. * cipher/cipher-gcm.c [GCM_USE_ARM_PMULL] (_gcry_ghash_setup_armv8_ce_pmull, _gcry_ghash_armv8_ce_pmull) (ghash_setup_armv8_ce_pmull, ghash_armv8_ce_pmull): New. (setupM) [GCM_USE_ARM_PMULL]: Enable ARM PMULL implementation if HWF_ARM_PULL HW feature flag is enabled. * cipher/cipher-gcm.h (GCM_USE_ARM_PMULL): New. -- Benchmark on Cortex-A53 (1152 Mhz): Before: | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 24.10 ns/B 39.57 MiB/s 27.76 c/B After (~26x faster): | nanosecs/byte mebibytes/sec cycles/byte GMAC_AES | 0.924 ns/B 1032.2 MiB/s 1.06 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher: Buffer data from gcry_cipher_authenticate in OCB mode.Werner Koch2016-04-121-0/+6
| | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Add fields aad_leftover and aad_nleftover to u_mode.ocb. * cipher/cipher-ocb.c (_gcry_cipher_ocb_set_nonce): Clear aad_nleftover. (_gcry_cipher_ocb_authenticate): Add buffering and facor some code out to ... (ocb_aad_finalize): new. (compute_tag_if_needed): Call new function. * tests/basic.c (check_ocb_cipher_splitaad): New. (check_ocb_cipher): Call new function. (main): Also call check_cipher_modes with --ciper-modes. -- It is more convenient to not require full blocks for gcry_cipher_authenticate. Other modes than OCB do this as well. Note that the size of the context structure is not increased because other modes require more context data. Signed-off-by: Werner Koch <wk@gnupg.org>
* Always require a 64 bit integer typeWerner Koch2016-03-181-6/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * configure.ac (available_digests_64): Merge with available_digests. (available_kdfs_64): Merge with available_kdfs. <64 bit datatype test>: Bail out if no such type is available. * src/types.h: Emit #error if no u64 can be defined. (PROPERLY_ALIGNED_TYPE): Always add u64 type. * cipher/bithelp.h: Remove all code paths which handle the case of !HAVE_U64_TYPEDEF. * cipher/bufhelp.h: Ditto. * cipher/cipher-ccm.c: Ditto. * cipher/cipher-gcm.c: Ditto. * cipher/cipher-internal.h: Ditto. * cipher/cipher.c: Ditto. * cipher/hash-common.h: Ditto. * cipher/md.c: Ditto. * cipher/poly1305.c: Ditto. * cipher/scrypt.c: Ditto. * cipher/tiger.c: Ditto. * src/g10lib.h: Ditto. * tests/basic.c: Ditto. * tests/bench-slope.c: Ditto. * tests/benchmark.c: Ditto. -- Given that SHA-2 and some other algorithms require a 64 bit type it does not make anymore sense to conditionally compile some part when the platform does not provide such a type. GnuPG-bug-id: 1815. Signed-off-by: Werner Koch <wk@gnupg.org>
* Optimize OCB offset calculationJussi Kivilinna2015-08-101-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (ocb_get_l): New. * cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate) (ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'. * cipher/camellia-glue.c (get_l): Remove. (_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate offset array when block count matches parallel operation size; Use 'ocb_get_l' instead of 'get_l'. * cipher/rijndael-aesni.c (get_l): Add fast path for 75% most common offsets. (aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate offset array when block count matches parallel operation size. * cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for 75% most common offsets. * cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'. * cipher/serpent.c (get_l): Remove. (_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate offset array when block count matches parallel operation size; Use 'ocb_get_l' instead of 'get_l'. * cipher/twofish.c (get_l): Remove. (_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l' instead of 'get_l'. -- Patch optimizes OCB offset calculation for generic code and assembly implementations with parallel block processing. Benchmark of OCB AES-NI on Intel Haswell: $ tests/bench-slope --cpu-mhz 3201 cipher aes Before: AES | nanosecs/byte mebibytes/sec cycles/byte CTR enc | 0.274 ns/B 3483.9 MiB/s 0.876 c/B CTR dec | 0.273 ns/B 3490.0 MiB/s 0.875 c/B OCB enc | 0.289 ns/B 3296.1 MiB/s 0.926 c/B OCB dec | 0.299 ns/B 3189.9 MiB/s 0.957 c/B OCB auth | 0.260 ns/B 3670.0 MiB/s 0.832 c/B After: AES | nanosecs/byte mebibytes/sec cycles/byte CTR enc | 0.273 ns/B 3489.4 MiB/s 0.875 c/B CTR dec | 0.273 ns/B 3487.5 MiB/s 0.875 c/B OCB enc | 0.248 ns/B 3852.8 MiB/s 0.792 c/B OCB dec | 0.261 ns/B 3659.5 MiB/s 0.834 c/B OCB auth | 0.227 ns/B 4205.5 MiB/s 0.726 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Reduce amount of duplicated code in OCB bulk implementationsJussi Kivilinna2015-07-271-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate) (ocb_crypt): Change bulk function to return number of unprocessed blocks. * src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) (_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth) (_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth) (_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type to 'size_t'. * cipher/camellia-glue.c (get_l): Only if USE_AESNI_AVX or USE_AESNI_AVX2 defined. (_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Change return type to 'size_t' and return remaining blocks; Remove unaccelerated common code path. Enable remaining common code only if USE_AESNI_AVX or USE_AESNI_AVX2 defined; Remove unaccelerated common code. * cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Change return type to 'size_t' and return zero. * cipher/serpent.c (get_l): Only if USE_SSE2, USE_AVX2 or USE_NEON defined. (_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Change return type to 'size_t' and return remaining blocks; Remove unaccelerated common code path. Enable remaining common code only if USE_SSE2, USE_AVX2 or USE_NEON defined; Remove unaccelerated common code. * cipher/twofish.c (get_l): Only if USE_AMD64_ASM defined. (_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type to 'size_t' and return remaining blocks; Remove unaccelerated common code path. Enable remaining common code only if USE_AMD64_ASM defined; Remove unaccelerated common code. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Enable AES/AES-NI, AES/SSSE3 and GCM/PCLMUL implementations on WIN64Jussi Kivilinna2015-05-011-3/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm-intel-pclmul.c (_gcry_ghash_intel_pclmul) ( _gcry_ghash_intel_pclmul) [__WIN64__]: Store non-volatile vector registers before use and restore after. * cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Remove dependency on !defined(__WIN64__). * cipher/rijndael-aesni.c [__WIN64__] (aesni_prepare_2_6_variable, aesni_prepare, aesni_prepare_2_6, aesni_cleanup) ( aesni_cleanup_2_6): New. [!__WIN64__] (aesni_prepare_2_6_variable, aesni_prepare_2_6): New. (_gcry_aes_aesni_do_setkey, _gcry_aes_aesni_cbc_enc) (_gcry_aesni_ctr_enc, _gcry_aesni_cfb_dec, _gcry_aesni_cbc_dec) (_gcry_aesni_ocb_crypt, _gcry_aesni_ocb_auth): Use 'aesni_prepare_2_6'. * cipher/rijndael-internal.h (USE_SSSE3): Enable if HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS or HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS. (USE_AESNI): Remove dependency on !defined(__WIN64__) * cipher/rijndael-ssse3-amd64.c [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare, vpaes_ssse3_cleanup): New. [!HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): New. (vpaes_ssse3_prepare_enc, vpaes_ssse3_prepare_dec): Use 'vpaes_ssse3_prepare'. (_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption): Use 'vpaes_ssse3_prepare' and 'vpaes_ssse3_cleanup'. [HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (X): Add masking macro to exclude '.type' and '.size' markers from assembly code, as they are not support on WIN64/COFF objects. * configure.ac (gcry_cv_gcc_attribute_ms_abi) (gcry_cv_gcc_attribute_sysv_abi, gcry_cv_gcc_default_abi_is_ms_abi) (gcry_cv_gcc_default_abi_is_sysv_abi) (gcry_cv_gcc_win64_platform_as_ok): New checks. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Disable GCM and AES-NI assembly implementations for WIN64Jussi Kivilinna2015-05-011-1/+3
| | | | | | | | | * cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): Do not enable when __WIN64__ defined. * cipher/rijndael-internal.h (USE_AESNI): Ditto. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add OCB bulk crypt/auth functions for AES/AES-NIJussi Kivilinna2015-04-181-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-internal.h (gcry_cipher_handle): Add bulk.ocb_crypt and bulk.ocb_auth. (_gcry_cipher_ocb_get_l): New prototype. * cipher/cipher-ocb.c (get_l): Rename to ... (_gcry_cipher_ocb_get_l): ... this. (_gcry_cipher_ocb_authenticate, ocb_crypt): Use bulk function when available. * cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk functions for AES. * cipher/rijndael-aesni.c (get_l, aesni_ocb_enc, aes_ocb_dec) (_gcry_aes_aesni_ocb_crypt, _gcry_aes_aesni_ocb_auth): New. * cipher/rijndael.c [USE_AESNI] (_gcry_aes_aesni_ocb_crypt) (_gcry_aes_aesni_ocb_auth): New prototypes. (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): New. * src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): New prototypes. * tests/basic.c (check_ocb_cipher_largebuf): New. (check_ocb_cipher): Add large buffer encryption/decryption test. -- Patch adds bulk encryption/decryption/authentication code for AES-NI accelerated AES. Benchmark on Intel i5-4570 (3200 Mhz, turbo off): Before: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 2.12 ns/B 449.7 MiB/s 6.79 c/B OCB dec | 2.12 ns/B 449.6 MiB/s 6.79 c/B OCB auth | 2.07 ns/B 459.9 MiB/s 6.64 c/B After: AES | nanosecs/byte mebibytes/sec cycles/byte OCB enc | 0.292 ns/B 3262.5 MiB/s 0.935 c/B OCB dec | 0.297 ns/B 3212.2 MiB/s 0.950 c/B OCB auth | 0.260 ns/B 3666.1 MiB/s 0.832 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add OCB cipher modeWerner Koch2015-01-161-2/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-ocb.c: New. * cipher/Makefile.am (libcipher_la_SOURCES): Add cipher-ocb.c * cipher/cipher-internal.h (OCB_BLOCK_LEN, OCB_L_TABLE_SIZE): New. (gcry_cipher_handle): Add fields marks.finalize and u_mode.ocb. * cipher/cipher.c (_gcry_cipher_open_internal): Add OCB mode. (_gcry_cipher_open_internal): Setup default taglen of OCB. (cipher_reset): Clear OCB specific data. (cipher_encrypt, cipher_decrypt, _gcry_cipher_authenticate) (_gcry_cipher_gettag, _gcry_cipher_checktag): Call OCB functions. (_gcry_cipher_setiv): Add OCB specific nonce setting. (_gcry_cipher_ctl): Add GCRYCTL_FINALIZE and GCRYCTL_SET_TAGLEN * src/gcrypt.h.in (GCRYCTL_SET_TAGLEN): New. (gcry_cipher_final): New. * cipher/bufhelp.h (buf_xor_1): New. * tests/basic.c (hex2buffer): New. (check_ocb_cipher): New. (main): Call it here. Add option --cipher-modes. * tests/bench-slope.c (bench_aead_encrypt_do_bench): Call gcry_cipher_final. (bench_aead_decrypt_do_bench): Ditto. (bench_aead_authenticate_do_bench): Ditto. Check error code. (bench_ocb_encrypt_do_bench): New. (bench_ocb_decrypt_do_bench): New. (bench_ocb_authenticate_do_bench): New. (ocb_encrypt_ops): New. (ocb_decrypt_ops): New. (ocb_authenticate_ops): New. (cipher_modes): Add them. (cipher_bench_one): Skip wrong block length for OCB. * tests/benchmark.c (cipher_bench): Add field noncelen to MODES. Add OCB support. -- See the comments on top of cipher/cipher-ocb.c for the patent status of the OCB mode. The implementation has not yet been optimized and as such is not faster that the other AEAD modes. A first candidate for optimization is the double_block function. Large improvements can be expected by writing an AES ECB function to work on multiple blocks. Signed-off-by: Werner Koch <wk@gnupg.org>
* Poly1305-AEAD: updated implementation to match ↵Jussi Kivilinna2014-12-231-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | draft-irtf-cfrg-chacha20-poly1305-03 * cipher/cipher-internal.h (gcry_cipher_handle): Use separate byte counters for AAD and data in Poly1305. * cipher/cipher-poly1305.c (poly1305_fill_bytecount): Remove. (poly1305_fill_bytecounts, poly1305_do_padding): New. (poly1305_aad_finish): Fill padding to Poly1305 and do not fill AAD length. (_gcry_cipher_poly1305_authenticate, _gcry_cipher_poly1305_encrypt) (_gcry_cipher_poly1305_decrypt): Update AAD and data length separately. (_gcry_cipher_poly1305_tag): Fill padding and bytecounts to Poly1305. (_gcry_cipher_poly1305_setkey, _gcry_cipher_poly1305_setiv): Reset AAD and data byte counts; only allow 96-bit IV. * cipher/cipher.c (_gcry_cipher_open_internal): Limit Poly1305-AEAD to ChaCha20 cipher. * tests/basic.c (_check_poly1305_cipher): Update test-vectors. (check_ciphers): Limit Poly1305-AEAD checks to ChaCha20. * tests/bench-slope.c (cipher_bench_one): Ditto. -- Latest Internet-Draft version for "ChaCha20 and Poly1305 for IETF protocols" has added additional padding to Poly1305-AEAD and limited support IV size to 96-bits: https://www.ietf.org/rfcdiff?url1=draft-nir-cfrg-chacha20-poly1305-03&difftype=--html&submit=Go!&url2=draft-irtf-cfrg-chacha20-poly1305-03 Patch makes Poly1305-AEAD implementation to match the changes and limits Poly1305-AEAD to ChaCha20 only. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: move Intel PCLMUL accelerated implementation to separate fileJussi Kivilinna2014-12-121-5/+8
| | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm-intel-pclmul.c'. * cipher/cipher-gcm-intel-pclmul.c: New. * cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL] (_gcry_ghash_setup_intel_pclmul, _gcry_ghash_intel_pclmul): New prototypes. [GCM_USE_INTEL_PCLMUL] (gfmul_pclmul, gfmul_pclmul_aggr4): Move to 'cipher-gcm-intel-pclmul.c'. (ghash): Rename to... (ghash_internal): ...this and move GCM_USE_INTEL_PCLMUL part to new function in 'cipher-gcm-intel-pclmul.c'. (setupM): Move GCM_USE_INTEL_PCLMUL part to new function in 'cipher-gcm-intel-pclmul.c'; Add selection of ghash function based on available HW acceleration. (do_ghash_buf): Change use of 'ghash' to 'c->u_mode.gcm.ghash_fn'. * cipher/internal.h (ghash_fn_t): New. (gcry_cipher_handle): Remove 'use_intel_pclmul'; Add 'ghash_fn'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add Poly1305 based cipher AEAD modeJussi Kivilinna2014-05-121-0/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-poly1305.c'. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.poly1305'. (_gcry_cipher_poly1305_encrypt, _gcry_cipher_poly1305_decrypt) (_gcry_cipher_poly1305_setiv, _gcry_cipher_poly1305_authenticate) (_gcry_cipher_poly1305_get_tag, _gcry_cipher_poly1305_check_tag): New. * cipher/cipher-poly1305.c: New. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_reset, cipher_encrypt, cipher_decrypt, _gcry_cipher_setiv) (_gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Handle 'GCRY_CIPHER_MODE_POLY1305'. (cipher_setiv): Move handling of 'GCRY_CIPHER_MODE_GCM' to ... (_gcry_cipher_setiv): ... here, as with other modes. * src/gcrypt.h.in: Add 'GCRY_CIPHER_MODE_POLY1305'. * tests/basic.c (_check_poly1305_cipher, check_poly1305_cipher): New. (check_ciphers): Add Poly1305 check. (check_cipher_modes): Call 'check_poly1305_cipher'. * tests/bench-slope.c (bench_gcm_encrypt_do_bench): Rename to bench_aead_... and take nonce as argument. (bench_gcm_decrypt_do_bench, bench_gcm_authenticate_do_bench): Ditto. (bench_gcm_encrypt_do_bench, bench_gcm_decrypt_do_bench) (bench_gcm_authenticate_do_bench, bench_poly1305_encrypt_do_bench) (bench_poly1305_decrypt_do_bench) (bench_poly1305_authenticate_do_bench, poly1305_encrypt_ops) (poly1305_decrypt_ops, poly1305_authenticate_ops): New. (cipher_modes): Add Poly1305. (cipher_bench_one): Add special handling for Poly1305. -- Patch adds Poly1305 based AEAD cipher mode to libgcrypt. ChaCha20 variant of this mode is proposed for use in TLS and ipsec: https://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-04 http://tools.ietf.org/html/draft-nir-ipsecme-chacha20-poly1305-02 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Use u64 for CCM data lengthsJussi Kivilinna2013-12-151-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-ccm.c: Move code inside [HAVE_U64_TYPEDEF]. [HAVE_U64_TYPEDEF] (_gcry_cipher_ccm_set_lengths): Use 'u64' for data lengths. [!HAVE_U64_TYPEDEF] (_gcry_cipher_ccm_encrypt) (_gcry_cipher_ccm_decrypt, _gcry_cipher_ccm_set_nonce) (_gcry_cipher_ccm_authenticate, _gcry_cipher_ccm_get_tag) (_gcry_cipher_ccm_check_tag): Dummy functions returning GPG_ERROR_NOT_SUPPORTED. * cipher/cipher-internal.h (gcry_cipher_handle.u_mode.ccm) (_gcry_cipher_ccm_set_lengths): Move inside [HAVE_U64_TYPEDEF] and use u64 instead of size_t for CCM data lengths. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_reset) (_gcry_cipher_ctl) [!HAVE_U64_TYPEDEF]: Return GPG_ERR_NOT_SUPPORTED for CCM. (_gcry_cipher_ctl) [HAVE_U64_TYPEDEF]: Use u64 for GCRYCTL_SET_CCM_LENGTHS length parameters. * tests/basic.c: Do not use CCM if !HAVE_U64_TYPEDEF. * tests/bench-slope.c: Ditto. * tests/benchmark.c: Ditto. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: Move gcm_table initialization to setkeyJussi Kivilinna2013-11-211-9/+21
| | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c: Change all 'c->u_iv.iv' to 'c->u_mode.gcm.u_ghash_key.key'. (_gcry_cipher_gcm_setkey): New. (_gcry_cipher_gcm_initiv): Move ghash initialization to function above. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.gcm.u_ghash_key'; Reorder 'u_mode.gcm' members for partial clearing in gcry_cipher_reset. (_gcry_cipher_gcm_setkey): New prototype. * cipher/cipher.c (cipher_setkey): Add GCM setkey. (cipher_reset): Clear 'u_mode' only partially for GCM. -- GHASH tables can be generated at setkey time. No need to regenerate for every new IV. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: Add support for split data buffers and online operationJussi Kivilinna2013-11-201-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c (do_ghash_buf): Add buffering for less than blocksize length input and padding handling. (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Add handling for AAD padding and check if data has already being padded. (_gcry_cipher_gcm_authenticate): Check that AAD or data has not being padded yet. (_gcry_cipher_gcm_initiv): Clear padding marks. (_gcry_cipher_gcm_tag): Add finalization and padding; Clear sensitive data from cipher handle, since they are not used after generating tag. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.gcm.macbuf', 'u_mode.gcm.mac_unused', 'u_mode.gcm.ghash_data_finalized' and 'u_mode.gcm.ghash_aad_finalized'. * tests/basic.c (check_gcm_cipher): Rename to... (_check_gcm_cipher): ...this and add handling for different buffer step lengths; Enable per byte buffer testing. (check_gcm_cipher): Call _check_gcm_cipher with different buffer step sizes. -- Until now, GCM was expecting full data to be input in one go. This patch adds support for feeding data continuously (for encryption/decryption/aad). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: Use size_t for buffer sizesJussi Kivilinna2013-11-201-5/+6
| | | | | | | | | | | | | * cipher/cipher-gcm.c (ghash, gcm_bytecounter_add, do_ghash_buf) (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt) (_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_geniv) (_gcry_cipher_gcm_tag): Use size_t for buffer lengths. * cipher/cipher-internal.h (_gcry_cipher_gcm_encrypt) (_gcry_cipher_gcm_decrypt, _gcry_cipher_gcm_authenticate): Use size_t for buffer lengths. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: add FIPS mode restrictionsJussi Kivilinna2013-11-201-0/+1
| | | | | | | | | | | | | | | * cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt) (_gcry_cipher_gcm_get_tag): Do not allow using in FIPS mode is setiv was invocated directly. (_gcry_cipher_gcm_setiv): Rename to... (_gcry_cipher_gcm_initiv): ...this. (_gcry_cipher_gcm_setiv): New setiv function with check for FIPS mode. [TODO] (_gcry_cipher_gcm_getiv): New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.gcm.disallow_encryption_because_of_setiv_in_fips_mode'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: Use counter mode code for speed-upJussi Kivilinna2013-11-201-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c (ghash): Add process for multiple blocks. (gcm_bytecounter_add, gcm_add32_be128, gcm_check_datalen) (gcm_check_aadlen_or_ivlen, do_ghash_buf): New functions. (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt) (_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_set_iv) (_gcry_cipher_gcm_tag): Adjust to use above new functions and counter mode functions for encryption/decryption. * cipher/cipher-internal.h (gcry_cipher_handle): Remove 'length'; Add 'u_mode.gcm.(addlen|datalen|tagiv|datalen_over_limits)'. (_gcry_cipher_gcm_setiv): Return gcry_err_code_t. * cipher/cipher.c (cipher_setiv): Return error code. (_gcry_cipher_setiv): Handle error code from 'cipher_setiv'. -- Patch changes GCM to use counter mode code for bulk speed up and also adds data length checks as given in NIST SP-800-38D section 5.2.1.1. Bit length requirements from section 5.2.1.1: len(plaintext) <= 2^39-256 bits == 2^36-32 bytes == 2^32-2 blocks len(aad) <= 2^64-1 bits ~= 2^61-1 bytes len(iv) <= 2^64-1 bit ~= 2^61-1 bytes Intel Haswell: Old: AES GCM enc | 3.00 ns/B 317.4 MiB/s 9.61 c/B GCM dec | 1.96 ns/B 486.9 MiB/s 6.27 c/B GCM auth | 0.848 ns/B 1124.7 MiB/s 2.71 c/B New: AES GCM enc | 1.12 ns/B 851.8 MiB/s 3.58 c/B GCM dec | 1.12 ns/B 853.7 MiB/s 3.57 c/B GCM auth | 0.843 ns/B 1131.4 MiB/s 2.70 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add Intel PCLMUL acceleration for GCMJussi Kivilinna2013-11-201-17/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c (fillM): Rename... (do_fillM): ...to this. (ghash): Remove. (fillM): New macro. (GHASH): Use 'do_ghash' instead of 'ghash'. [GCM_USE_INTEL_PCLMUL] (do_ghash_pclmul): New. (ghash): New. (setupM): New. (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt) (_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_setiv) (_gcry_cipher_gcm_tag): Use 'ghash' instead of 'GHASH' and 'c->u_mode.gcm.u_tag.tag' instead of 'c->u_tag.tag'. * cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): New. (gcry_cipher_handle): Move 'u_tag' and 'gcm_table' under 'u_mode.gcm'. * configure.ac (pclmulsupport, gcry_cv_gcc_inline_asm_pclmul): New. * src/g10lib.h (HWF_INTEL_PCLMUL): New. * src/global.c: Add "intel-pclmul". * src/hwf-x86.c (detect_x86_gnuc): Add check for Intel PCLMUL. -- Speed-up GCM for Intel CPUs. Intel Haswell (x86-64): Old: AES GCM enc | 5.17 ns/B 184.4 MiB/s 16.55 c/B GCM dec | 4.38 ns/B 218.0 MiB/s 14.00 c/B GCM auth | 3.17 ns/B 300.4 MiB/s 10.16 c/B New: AES GCM enc | 3.01 ns/B 317.2 MiB/s 9.62 c/B GCM dec | 1.96 ns/B 486.9 MiB/s 6.27 c/B GCM auth | 0.848 ns/B 1124.8 MiB/s 2.71 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: GHASH optimizationsJussi Kivilinna2013-11-201-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/cipher-gcm.c [GCM_USE_TABLES] (gcmR, ghash): Replace with new. [GCM_USE_TABLES] [GCM_TABLES_USE_U64] (bshift, fillM, do_ghash): New. [GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (bshift, fillM): Replace with new. [GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (do_ghash): New. (_gcry_cipher_gcm_tag): Remove extra memcpy to outbuf and use buf_eq_const for comparing authentication tag. * cipher/cipher-internal.h (gcry_cipher_handle): Different 'gcm_table' for 32-bit and 64-bit platforms. -- Patch improves GHASH speed. Intel Haswell (x86-64): Old: GCM auth | 26.22 ns/B 36.38 MiB/s 83.89 c/B New: GCM auth | 3.18 ns/B 300.0 MiB/s 10.17 c/B Intel Haswell (mingw32): Old: GCM auth | 27.27 ns/B 34.97 MiB/s 87.27 c/B New: GCM auth | 7.58 ns/B 125.7 MiB/s 24.27 c/B Cortex-A8: Old: GCM auth | 231.4 ns/B 4.12 MiB/s 233.3 c/B New: GCM auth | 30.82 ns/B 30.94 MiB/s 31.07 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Initial implementation of GCMDmitry Eremin-Solenikov2013-11-191-1/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-gcm.c'. * cipher/cipher-ccm.c (_gcry_ciphert_ccm_set_lengths) (_gcry_cipher_ccm_authenticate, _gcry_cipher_ccm_tag) (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt): Change 'c->u_mode.ccm.tag' to 'c->marks.tag'. * cipher/cipher-gcm.c: New. * cipher/cipher-internal.h (GCM_USE_TABLES): New. (gcry_cipher_handle): Add 'marks.tag', 'u_tag', 'length' and 'gcm_table'; Remove 'u_mode.ccm.tag'. (_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt) (_gcry_cipher_gcm_setiv, _gcry_cipher_gcm_authenticate) (_gcry_cipher_gcm_get_tag, _gcry_cipher_gcm_check_tag): New. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_encrypt, cipher_decrypt, _gcry_cipher_authenticate) (_gcry_cipher_gettag, _gcry_cipher_checktag): Add GCM mode handling. * src/gcrypt.h.in (gcry_cipher_modes): Add GCRY_CIPHER_MODE_GCM. (GCRY_GCM_BLOCK_LEN): New. * tests/basic.c (check_gcm_cipher): New. (check_ciphers): Add GCM check. (check_cipher_modes): Call 'check_gcm_cipher'. * tests/bench-slope.c (bench_gcm_encrypt_do_bench) (bench_gcm_decrypt_do_bench, bench_gcm_authenticate_do_bench) (gcm_encrypt_ops, gcm_decrypt_ops, gcm_authenticate_ops): New. (cipher_modes): Add GCM enc/dec/auth. (cipher_bench_one): Limit GCM to block ciphers with 16 byte block-size. * tests/benchmark.c (cipher_bench): Add GCM. -- Currently it is still quite slow. Still no support for generate_iv(). Is it really necessary? TODO: Merge/reuse cipher-internal state used by CCM. Changelog entry will be present in final patch submission. Changes since v1: - 6x-7x speedup. - added bench-slope support Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> [jk: mangle new file throught 'indent -nut'] [jk: few fixes] [jk: changelog]
* Add CMAC (Cipher-based MAC) to MAC APIJussi Kivilinna2013-11-191-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'cipher-cmac.c' and 'mac-cmac.c'. * cipher/cipher-cmac.c: New. * cipher/cipher-internal.h (gcry_cipher_handle.u_mode): Add 'cmac'. * cipher/cipher.c (gcry_cipher_open): Rename to... (_gcry_cipher_open_internal): ...this and add CMAC. (gcry_cipher_open): New wrapper that disallows use of internal modes (CMAC) from outside. (cipher_setkey, cipher_encrypt, cipher_decrypt) (_gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Add handling for CMAC mode. (cipher_reset): Do not reset 'marks.key' and do not clear subkeys in 'u_mode' in CMAC mode. * cipher/mac-cmac.c: New. * cipher/mac-internal.h: Add CMAC support and algorithms. * cipher/mac.c: Add CMAC algorithms. * doc/gcrypt.texi: Add documentation for CMAC. * src/cipher.h (gcry_cipher_internal_modes): New. (_gcry_cipher_open_internal, _gcry_cipher_cmac_authenticate) (_gcry_cipher_cmac_get_tag, _gcry_cipher_cmac_check_tag) (_gcry_cipher_cmac_set_subkeys): New prototypes. * src/gcrypt.h.in (gcry_mac_algos): Add CMAC algorithms. * tests/basic.c (check_mac): Add CMAC test vectors. -- Patch adds CMAC (Cipher-based MAC) as defined in RFC 4493 and NIST Special Publication 800-38B. Internally CMAC is added to cipher module, but is available to outside only through MAC API. [v2]: - Add documentation. [v3]: - CMAC algorithm ids start from 201. - Coding style fixes. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>