kernel_optimize_test/arch/x86/crypto
Jussi Kivilinna f94a73f8dd crypto: twofish-avx - tune assembler code for more performance
This patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies and interleaves instructions better for out-of-order scheduling.
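
The difference between the two instructions can be sketched in plain C. This is only a model of the architectural semantics, not code from the patch: the function names `movb_model`, `movzbl_model` and `check_models` are hypothetical. A byte move into an 8-bit register leaves the upper bits of the full register untouched, so its result depends on the register's previous contents even when the caller only wants the new byte. The zero-extending move overwrites the whole 32-bit register and has no such input.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `movb src8, dst8`: only the low byte of the destination is
 * written, so the result still depends on the old destination value.
 * When the caller does not care about the upper bits, that dependency
 * is a false one. (Hypothetical helper, for illustration only.) */
uint32_t movb_model(uint32_t old_dst, uint32_t src)
{
    return (old_dst & 0xffffff00u) | (src & 0xffu);
}

/* Model of `movzbl src8, dst32`: the low byte is zero-extended into the
 * full 32-bit destination, so the old destination value is not an input. */
uint32_t movzbl_model(uint32_t src)
{
    return src & 0xffu;
}

/* Small self-check of the two models. */
void check_models(void)
{
    assert(movb_model(0xdeadbeefu, 0x12345678u) == 0xdeadbe78u);
    assert(movzbl_model(0x12345678u) == 0x78u);
}
```

In the model, `movb_model` takes the old destination as a parameter while `movzbl_model` does not; that missing input is the dependency the patch aims to remove. How much this matters in practice depends on the microarchitecture.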

Tested on Intel Core i5-2450M and AMD FX-8100.

tcrypt ECB results:

Intel Core i5-2450M:

size    old-vs-new      new-vs-3way     old-vs-3way
        enc     dec     enc     dec     enc     dec
256     1.12x   1.13x   1.36x   1.37x   1.21x   1.22x
1k      1.14x   1.14x   1.48x   1.49x   1.29x   1.31x
8k      1.14x   1.14x   1.50x   1.52x   1.32x   1.33x

AMD FX-8100:

size    old-vs-new      new-vs-3way     old-vs-3way
        enc     dec     enc     dec     enc     dec
256     1.10x   1.11x   1.01x   1.01x   0.92x   0.91x
1k      1.11x   1.12x   1.08x   1.07x   0.97x   0.96x
8k      1.11x   1.13x   1.10x   1.08x   0.99x   0.97x
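
One way to read the three column groups, assuming each ratio is a throughput speedup of the first-named AVX variant over the second-named implementation (with old-vs-new read as new over old): new-vs-3way should then be roughly old-vs-new multiplied by old-vs-3way. The following sketch checks that reading against the figures above. The struct and function names are hypothetical, and the interpretation itself is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/* One enc or dec entry from the tables above. */
struct ratios {
    double old_vs_new;   /* new AVX code relative to old AVX code */
    double new_vs_3way;  /* new AVX code relative to the 3way code */
    double old_vs_3way;  /* old AVX code relative to the 3way code */
};

/* Values copied from the tcrypt ECB results (enc, then dec, per size). */
static const struct ratios table[] = {
    /* Intel Core i5-2450M */
    {1.12, 1.36, 1.21}, {1.13, 1.37, 1.22},  /* 256 */
    {1.14, 1.48, 1.29}, {1.14, 1.49, 1.31},  /* 1k  */
    {1.14, 1.50, 1.32}, {1.14, 1.52, 1.33},  /* 8k  */
    /* AMD FX-8100 */
    {1.10, 1.01, 0.92}, {1.11, 1.01, 0.91},  /* 256 */
    {1.11, 1.08, 0.97}, {1.12, 1.07, 0.96},  /* 1k  */
    {1.11, 1.10, 0.99}, {1.13, 1.08, 0.97},  /* 8k  */
};

/* Largest absolute gap between old_vs_new * old_vs_3way and new_vs_3way. */
double max_ratio_mismatch(void)
{
    double worst = 0.0;
    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        double diff = table[i].old_vs_new * table[i].old_vs_3way
                      - table[i].new_vs_3way;
        if (diff < 0)
            diff = -diff;
        if (diff > worst)
            worst = diff;
    }
    return worst;
}

/* The published figures are rounded to two decimals, so allow some slack. */
void check_table(void)
{
    assert(max_ratio_mismatch() < 0.02);
}
```

Under this reading, the AMD FX-8100 rows say that the old AVX code was slightly slower than the 3way code (ratios below 1.00x) and that the tuned code moves ahead of it.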

[v2]
 - Interleave instructions in a different way to avoid adding new FPU<=>CPU
   register moves, since these cause a performance drop on Bulldozer.
 - Further interleaving improvements for better out-of-order scheduling.

Tested-by: Borislav Petkov <bp@alien8.de>
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2012-09-07 04:17:04 +08:00
ablk_helper.c crypto: aes_ni - change to use shared ablk_* functions 2012-06-27 14:42:01 +08:00
aes_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
aes-i586-asm_32.S
aes-x86_64-asm_64.S
aesni-intel_asm.S crypto: aesni-intel - fix unaligned cbc decrypt for x86-32 2012-05-31 20:53:22 +10:00
aesni-intel_glue.c crypto: aesni_intel - improve lrw and xts performance by utilizing parallel AES-NI hardware pipelines 2012-08-20 16:28:10 +08:00
blowfish_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
blowfish-x86_64-asm_64.S
camellia_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
camellia-x86_64-asm_64.S
cast5_avx_glue.c crypto: cast5 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast5-avx-x86_64-asm_64.S crypto: cast5 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast6_avx_glue.c crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
cast6-avx-x86_64-asm_64.S crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
crc32c-intel.c
fpu.c
ghash-clmulni-intel_asm.S
ghash-clmulni-intel_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
glue_helper.c crypto: serpent-sse2 - split generic glue code to new helper module 2012-06-27 14:42:01 +08:00
Makefile crypto: cast6 - add x86_64/avx assembler implementation 2012-08-01 17:47:30 +08:00
salsa20_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
salsa20-i586-asm_32.S
salsa20-x86_64-asm_64.S
serpent_avx_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
serpent_sse2_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
serpent-avx-x86_64-asm_64.S crypto: serpent-sse2/avx - allow both to be built into kernel 2012-06-14 10:09:03 +08:00
serpent-sse2-i586-asm_32.S
serpent-sse2-x86_64-asm_64.S
sha1_ssse3_asm.S crypto: sha1 - use Kbuild supplied flags for AVX test 2012-06-12 16:37:16 +08:00
sha1_ssse3_glue.c crypto: sha1 - use Kbuild supplied flags for AVX test 2012-06-12 16:37:16 +08:00
twofish_avx_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish_glue_3way.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish_glue.c crypto: arch/x86 - cleanup - remove unneeded crypto_alg.cra_list initializations 2012-08-01 17:47:27 +08:00
twofish-avx-x86_64-asm_64.S crypto: twofish-avx - tune assembler code for more performance 2012-09-07 04:17:04 +08:00
twofish-i586-asm_32.S
twofish-x86_64-asm_64-3way.S
twofish-x86_64-asm_64.S