forked from luck/tmp_suning_uos_patched
f94a73f8dd
Patch replaces 'movb' instructions with 'movzbl' to break false register dependencies and interleaves instructions better for out-of-order scheduling. Tested on Intel Core i5-2450M and AMD FX-8100. tcrypt ECB results: Intel Core i5-2450M: size old-vs-new new-vs-3way old-vs-3way enc dec enc dec enc dec 256 1.12x 1.13x 1.36x 1.37x 1.21x 1.22x 1k 1.14x 1.14x 1.48x 1.49x 1.29x 1.31x 8k 1.14x 1.14x 1.50x 1.52x 1.32x 1.33x AMD FX-8100: size old-vs-new new-vs-3way old-vs-3way enc dec enc dec enc dec 256 1.10x 1.11x 1.01x 1.01x 0.92x 0.91x 1k 1.11x 1.12x 1.08x 1.07x 0.97x 0.96x 8k 1.11x 1.13x 1.10x 1.08x 0.99x 0.97x [v2] - Do instruction interleaving another way to avoid adding new FPU<=>CPU register moves as these cause performance drop on Bulldozer. - Further interleaving improvements for better out-of-order scheduling. Tested-by: Borislav Petkov <bp@alien8.de> Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> |
||
---|---|---|
.. | ||
ablk_helper.c | ||
aes_glue.c | ||
aes-i586-asm_32.S | ||
aes-x86_64-asm_64.S | ||
aesni-intel_asm.S | ||
aesni-intel_glue.c | ||
blowfish_glue.c | ||
blowfish-x86_64-asm_64.S | ||
camellia_glue.c | ||
camellia-x86_64-asm_64.S | ||
cast5_avx_glue.c | ||
cast5-avx-x86_64-asm_64.S | ||
cast6_avx_glue.c | ||
cast6-avx-x86_64-asm_64.S | ||
crc32c-intel.c | ||
fpu.c | ||
ghash-clmulni-intel_asm.S | ||
ghash-clmulni-intel_glue.c | ||
glue_helper.c | ||
Makefile | ||
salsa20_glue.c | ||
salsa20-i586-asm_32.S | ||
salsa20-x86_64-asm_64.S | ||
serpent_avx_glue.c | ||
serpent_sse2_glue.c | ||
serpent-avx-x86_64-asm_64.S | ||
serpent-sse2-i586-asm_32.S | ||
serpent-sse2-x86_64-asm_64.S | ||
sha1_ssse3_asm.S | ||
sha1_ssse3_glue.c | ||
twofish_avx_glue.c | ||
twofish_glue_3way.c | ||
twofish_glue.c | ||
twofish-avx-x86_64-asm_64.S | ||
twofish-i586-asm_32.S | ||
twofish-x86_64-asm_64-3way.S | ||
twofish-x86_64-asm_64.S |