kernel_optimize_test/arch/x86/crypto
Johannes Goetzfried 4ea1277d30 crypto: cast6 - add x86_64/avx assembler implementation
This patch adds a x86_64/avx assembler implementation of the Cast6 block
cipher. The implementation processes eight blocks in parallel (two 4 block
chunk AVX operations). The table-lookups are done in general-purpose registers.
For small blocksizes the functions from the generic module are called. A good
performance increase is provided for blocksizes greater or equal to 128B.

Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

Intel Core i5-2500 CPU (fam:6, model:42, step:7)

cast6-avx-x86_64 vs. cast6-generic
128bit key:                                             (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   1.00x   1.01x   1.01x   0.99x   0.97x   0.98x   1.01x   0.96x   0.98x
64B     0.98x   0.99x   1.02x   1.01x   0.99x   1.00x   1.01x   0.99x   1.00x   0.99x
256B    1.77x   1.84x   0.99x   1.85x   1.77x   1.77x   1.70x   1.74x   1.69x   1.72x
1024B   1.93x   1.95x   0.99x   1.96x   1.93x   1.93x   1.84x   1.85x   1.89x   1.87x
8192B   1.91x   1.95x   0.99x   1.97x   1.95x   1.91x   1.86x   1.87x   1.93x   1.90x

256bit key:                                             (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     0.97x   0.99x   1.02x   1.01x   0.98x   0.99x   1.00x   1.00x   0.98x   0.98x
64B     0.98x   0.99x   1.01x   1.00x   1.00x   1.00x   1.01x   1.01x   0.97x   1.00x
256B    1.77x   1.83x   1.00x   1.86x   1.79x   1.78x   1.70x   1.76x   1.71x   1.69x
1024B   1.92x   1.95x   0.99x   1.96x   1.93x   1.93x   1.83x   1.86x   1.89x   1.87x
8192B   1.94x   1.95x   0.99x   1.97x   1.95x   1.95x   1.87x   1.87x   1.93x   1.91x

Signed-off-by: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-08-01 17:47:30 +08:00
..
ablk_helper.c crypto: aes_ni - change to use shared ablk_* functions 2012-06-27 14:42:01 +08:00
aes_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
aes-i586-asm_32.S
aes-x86_64-asm_64.S
aesni-intel_asm.S crypto: aesni-intel - fix unaligned cbc decrypt for x86-32 2012-05-31 20:53:22 +10:00
aesni-intel_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
blowfish_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
blowfish-x86_64-asm_64.S
camellia_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
camellia-x86_64-asm_64.S
cast5_avx_glue.c crypto: cast5 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast5-avx-x86_64-asm_64.S crypto: cast5 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast6_avx_glue.c crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast6-avx-x86_64-asm_64.S crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
crc32c-intel.c
fpu.c
ghash-clmulni-intel_asm.S
ghash-clmulni-intel_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
glue_helper.c crypto: serpent-sse2 - split generic glue code to new helper module 2012-06-27 14:42:01 +08:00
Makefile crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
salsa20_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
salsa20-i586-asm_32.S
salsa20-x86_64-asm_64.S
serpent_avx_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
serpent_sse2_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
serpent-avx-x86_64-asm_64.S crypto: serpent-sse2/avx - allow both to be built into kernel 2012-06-14 10:09:03 +08:00
serpent-sse2-i586-asm_32.S
serpent-sse2-x86_64-asm_64.S
sha1_ssse3_asm.S crypto: sha1 - use Kbuild supplied flags for AVX test 2012-06-12 16:37:16 +08:00
sha1_ssse3_glue.c crypto: sha1 - use Kbuild supplied flags for AVX test 2012-06-12 16:37:16 +08:00
twofish_avx_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish_glue_3way.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish-avx-x86_64-asm_64.S crypto: twofish-avx - remove useless instruction 2012-07-11 11:08:30 +08:00
twofish-i586-asm_32.S
twofish-x86_64-asm_64-3way.S
twofish-x86_64-asm_64.S