kernel_optimize_test

Author	SHA1	Message	Date
Jussi Kivilinna	64b94ceae8	crypto: blowfish - add x86_64 assembly implementation Patch adds x86_64 assembly implementation of blowfish. Two set of assembler functions are provided. First set is regular 'one-block at time' encrypt/decrypt functions. Second is 'four-block at time' functions that gain performance increase on out-of-order CPUs. Performance of 4-way functions should be equal to 1-way functions with in-order CPUs. Summary of the tcrypt benchmarks: Blowfish assembler vs blowfish C (256bit 8kb block ECB) encrypt: 2.2x speed decrypt: 2.3x speed Blowfish assembler vs blowfish C (256bit 8kb block CBC) encrypt: 1.12x speed decrypt: 2.5x speed Blowfish assembler vs blowfish C (256bit 8kb block CTR) encrypt: 2.5x speed Full output: http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-asm-x86_64.txt http://koti.mbnet.fi/axh/kernel/crypto/tcrypt-speed-blowfish-c-x86_64.txt Tests were run on: vendor_id : AuthenticAMD cpu family : 16 model : 10 model name : AMD Phenom(tm) II X6 1055T Processor stepping : 0 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-09-22 21:25:26 +10:00
Jussi Kivilinna	52ba867c8c	crypto: blowfish - split generic and common c code Patch splits up the blowfish crypto routine into a common part (key setup) which will be used by blowfish crypto modules (x86_64 assembly and generic-c). Also fixes errors/warnings reported by checkpatch. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-09-22 21:25:25 +10:00
Mathias Krause	66be895158	crypto: sha1 - SSSE3 based SHA1 implementation for x86-64 This is an assembler implementation of the SHA1 algorithm using the Supplemental SSE3 (SSSE3) instructions or, when available, the Advanced Vector Extensions (AVX). Testing with the tcrypt module shows the raw hash performance is up to 2.3 times faster than the C implementation, using 8k data blocks on a Core 2 Duo T5500. For the smalest data set (16 byte) it is still 25% faster. Since this implementation uses SSE/YMM registers it cannot safely be used in every situation, e.g. while an IRQ interrupts a kernel thread. The implementation falls back to the generic SHA1 variant, if using the SSE/YMM registers is not possible. With this algorithm I was able to increase the throughput of a single IPsec link from 344 Mbit/s to 464 Mbit/s on a Core 2 Quad CPU using the SSSE3 variant -- a speedup of +34.8%. Saving and restoring SSE/YMM state might make the actual throughput fluctuate when there are FPU intensive userland applications running. For example, meassuring the performance using iperf2 directly on the machine under test gives wobbling numbers because iperf2 uses the FPU for each packet to check if the reporting interval has expired (in the above test I got min/max/avg: 402/484/464 MBit/s). Using this algorithm on a IPsec gateway gives much more reasonable and stable numbers, albeit not as high as in the directly connected case. Here is the result from an RFC 2544 test run with a EXFO Packet Blazer FTB-8510: frame size sha1-generic sha1-ssse3 delta 64 byte 37.5 MBit/s 37.5 MBit/s 0.0% 128 byte 56.3 MBit/s 62.5 MBit/s +11.0% 256 byte 87.5 MBit/s 100.0 MBit/s +14.3% 512 byte 131.3 MBit/s 150.0 MBit/s +14.2% 1024 byte 162.5 MBit/s 193.8 MBit/s +19.3% 1280 byte 175.0 MBit/s 212.5 MBit/s +21.4% 1420 byte 175.0 MBit/s 218.7 MBit/s +25.0% 1518 byte 150.0 MBit/s 181.2 MBit/s +20.8% The throughput for the largest frame size is lower than for the previous size because the IP packets need to be fragmented in this case to make there way through the IPsec tunnel. Signed-off-by: Mathias Krause <minipli@googlemail.com> Cc: Maxim Locktyukhin <maxim.locktyukhin@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-08-10 19:00:29 +08:00
Linus Torvalds	d3ec4844d4	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) fs: Merge split strings treewide: fix potentially dangerous trailing ';' in #defined values/expressions uwb: Fix misspelling of neighbourhood in comment net, netfilter: Remove redundant goto in ebt_ulog_packet trivial: don't touch files that are removed in the staging tree lib/vsprintf: replace link to Draft by final RFC number doc: Kconfig: `to be' -> `be' doc: Kconfig: Typo: square -> squared doc: Konfig: Documentation/power/{pm => apm-acpi}.txt drivers/net: static should be at beginning of declaration drivers/media: static should be at beginning of declaration drivers/i2c: static should be at beginning of declaration XTENSA: static should be at beginning of declaration SH: static should be at beginning of declaration MIPS: static should be at beginning of declaration ARM: static should be at beginning of declaration rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check Update my e-mail address PCIe ASPM: forcedly -> forcibly gma500: push through device driver tree ... Fix up trivial conflicts: - arch/arm/mach-ep93xx/dma-m2p.c (deleted) - drivers/gpio/gpio-ep93xx.c (renamed and context nearby) - drivers/net/r8169.c (just context changes)	2011-07-25 13:56:39 -07:00
Michael Witten	35ed4b35be	doc: Kconfig: `to be' ->` be' Also, a comma was inserted to offset a modifier. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-07-11 14:23:35 +02:00
Richard Weinberger	8af00860c9	crypto: UML build fixes CRYPTO_GHASH_CLMUL_NI_INTEL and CRYPTO_AES_NI_INTEL cannot be used on UML. Commit `3e02e5cb` and `54b6a1b` enabled them by accident. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-06-30 07:44:01 +08:00
Andy Lutomirski	b23b645165	crypto: aesni-intel - Merge with fpu.ko Loading fpu without aesni-intel does nothing. Loading aesni-intel without fpu causes modes like xts to fail. (Unloading aesni-intel will restore those modes.) One solution would be to make aesni-intel depend on fpu, but it seems cleaner to just combine the modules. This is probably responsible for bugs like: https://bugzilla.redhat.com/show_bug.cgi?id=589390 Signed-off-by: Andy Lutomirski <luto@mit.edu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2011-05-16 15:12:47 +10:00
Herbert Xu	8ad225e8e4	crypto: gf128mul - Remove experimental tag This feature no longer needs the experimental tag. Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-12-28 22:56:26 +11:00
Herbert Xu	7451708f39	crypto: af_alg - Add dependency on NET Add missing dependency on NET since we require sockets for our interface. Should really be a select but kconfig doesn't like that: net/Kconfig:6:error: found recursive dependency: NET -> NETWORK_FILESYSTEMS -> AFS_FS -> AF_RXRPC -> CRYPTO -> CRYPTO_USER_API_HASH -> CRYPTO_USER_API -> NET Reported-by: Zimny Lech <napohybelskurwysynom2010@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-11-29 22:56:03 +08:00
Mathias Krause	0d258efb6a	crypto: aesni-intel - Ported implementation to x86-32 The AES-NI instructions are also available in legacy mode so the 32-bit architecture may profit from those, too. To illustrate the performance gain here's a short summary of a dm-crypt speed test on a Core i7 M620 running at 2.67GHz comparing both assembler implementations: x86: i568 aes-ni delta ECB, 256 bit: 93.8 MB/s 123.3 MB/s +31.4% CBC, 256 bit: 84.8 MB/s 262.3 MB/s +209.3% LRW, 256 bit: 108.6 MB/s 222.1 MB/s +104.5% XTS, 256 bit: 105.0 MB/s 205.5 MB/s +95.7% Additionally, due to some minor optimizations, the 64-bit version also got a minor performance gain as seen below: x86-64: old impl. new impl. delta ECB, 256 bit: 121.1 MB/s 123.0 MB/s +1.5% CBC, 256 bit: 285.3 MB/s 290.8 MB/s +1.9% LRW, 256 bit: 263.7 MB/s 265.3 MB/s +0.6% XTS, 256 bit: 251.1 MB/s 255.3 MB/s +1.7% Signed-off-by: Mathias Krause <minipli@googlemail.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-11-27 16:34:46 +08:00
Herbert Xu	8ff590903d	crypto: algif_skcipher - User-space interface for skcipher operations This patch adds the af_alg plugin for symmetric key ciphers, corresponding to the ablkcipher kernel operation type. Keys can optionally be set through the setsockopt interface. Once a sendmsg call occurs without MSG_MORE no further writes may be made to the socket until all previous data has been read. IVs and and whether encryption/decryption is performed can be set through the setsockopt interface or as a control message to sendmsg. The interface is completely synchronous, all operations are carried out in recvmsg(2) and will complete prior to the system call returning. The splice(2) interface support reading the user-space data directly without copying (except that the Crypto API itself may copy the data if alignment is off). The recvmsg(2) interface supports directly writing to user-space without additional copying, i.e., the kernel crypto interface will receive the user-space address as its output SG list. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net>	2010-11-26 20:53:59 +08:00
Herbert Xu	fe869cdb89	crypto: algif_hash - User-space interface for hash operations This patch adds the af_alg plugin for hash, corresponding to the ahash kernel operation type. Keys can optionally be set through the setsockopt interface. Each sendmsg call will finalise the hash unless sent with a MSG_MORE flag. Partial hash states can be cloned using accept(2). The interface is completely synchronous, all operations will complete prior to the system call returning. Both sendmsg(2) and splice(2) support reading the user-space data directly without copying (except that the Crypto API itself may copy the data if alignment is off). For now only the splice(2) interface supports performing digest instead of init/update/final. In future the sendmsg(2) interface will also be modified to use digest/finup where possible so that hardware that cannot return a partial hash state can still benefit from this interface. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Martin Willi <martin@strongswan.org>	2010-11-19 17:47:58 +08:00
Herbert Xu	03c8efc1ff	crypto: af_alg - User-space interface for Crypto API This patch creates the backbone of the user-space interface for the Crypto API, through a new socket family AF_ALG. Each session corresponds to one or more connections obtained from that socket. The number depends on the number of inputs/outputs of that particular type of operation. For most types there will be a s ingle connection/file descriptor that is used for both input and output. AEAD is one of the few that require two inputs. Each algorithm type will provide its own implementation that plugs into af_alg. They're keyed using a string such as "skcipher" or "hash". IOW this patch only contains the boring bits that is required to hold everything together. Thakns to Miloslav Trmac for reviewing this and contributing fixes and improvements. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: David S. Miller <davem@davemloft.net> Tested-by: Martin Willi <martin@strongswan.org>	2010-11-19 17:47:57 +08:00
Justin P. Mattock	6d8de74c5c	crypto: Kconfig - update broken web addresses Below is a patch to update the broken web addresses, in crypto/* that I could locate. Some are just simple typos that needed to be fixed, and some had a change in location altogether.. let me know if any of them need to be changed and such. Signed-off-by: Justin P. Mattock <justinmattock@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-09-12 10:42:47 +08:00
Chuck Ebbert	e84c5480b7	crypto: fips - FIPS requires algorithm self-tests Signed-off-by: Chuck Ebbert <cebbert@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-09-03 19:17:49 +08:00
Herbert Xu	00ca28a507	crypto: testmgr - Default to no tests On Thu, Aug 05, 2010 at 07:01:21PM -0700, Linus Torvalds wrote: > On Thu, Aug 5, 2010 at 6:40 PM, Herbert Xu <herbert@gondor.hengli.com.au> wrote: > > > > -config CRYPTO_MANAGER_TESTS > > - bool "Run algolithms' self-tests" > > - default y > > - depends on CRYPTO_MANAGER2 > > +config CRYPTO_MANAGER_DISABLE_TESTS > > + bool "Disable run-time self tests" > > + depends on CRYPTO_MANAGER2 && EMBEDDED > > Why do you still want to force-enable those tests? I was going to > complain about the "default y" anyway, now I'm _really_ complaining, > because you've now made it impossible to disable those tests. Why? As requested, this patch sets the default to y and removes the EMBEDDED dependency. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-08-06 10:34:00 +08:00
Herbert Xu	326a6346ff	crypto: testmgr - Fix test disabling option This patch fixes a serious bug in the test disabling patch where it can cause an spurious load of the cryptomgr module even when it's compiled in. It also negates the test disabling option so that its absence causes tests to be enabled. The Kconfig option is also now behind EMBEDDED. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-08-06 09:40:28 +08:00
Alexander Shishkin	0b767f9616	crypto: testmgr - add an option to disable cryptoalgos' self-tests By default, CONFIG_CRYPTO_MANAGER_TESTS will be enabled and thus self-tests will still run, but it is now possible to disable them to gain some time during bootup. Signed-off-by: Alexander Shishkin <virtuoso@slind.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-06-03 20:53:43 +10:00
Herbert Xu	bc94e59662	crypto: pcomp - Fix illegal Kconfig configuration The PCOMP Kconfig entry current allows the following combination which is illegal: ZLIB=y PCOMP=y ALGAPI=m ALGAPI2=y MANAGER=m MANAGER2=m This patch fixes this by adding PCOMP2 so that PCOMP can select ALGAPI to propagate the setting to MANAGER2. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-06-03 20:33:06 +10:00
Gilles Espinasse	f77f13e22d	Fix comment and Kconfig typos for 'require' and 'fragment' Signed-off-by: Gilles Espinasse <g.esp@free.fr> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-03-29 15:41:47 +02:00
Jiri Kosina	318ae2edc3	Merge branch 'for-next' into for-linus Conflicts: Documentation/filesystems/proc.txt arch/arm/mach-u300/include/mach/debug-macro.S drivers/net/qlge/qlge_ethtool.c drivers/net/qlge/qlge_main.c drivers/net/typhoon.c	2010-03-08 16:55:37 +01:00
Jiri Kosina	7dd607e82d	crypto: fix typo in Kconfig help text Reported-by: Toralf Förster <toralf.foerster@gmx.de> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-02-05 12:22:41 +01:00
Steffen Klassert	5068c7a883	crypto: pcrypt - Add pcrypt crypto parallelization wrapper This patch adds a parallel crypto template that takes a crypto algorithm and converts it to process the crypto transforms in parallel. For the moment only aead algorithms are supported. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-01-07 15:57:19 +11:00
Huang Ying	3e02e5cb47	crypto: ghash-intel - Fix building failure on x86_32 CLMUL-NI accelerated GHASH should be turned off on non-x86_64 machine. Reported-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-10-27 19:07:24 +08:00
Huang Ying	0e1227d356	crypto: ghash - Add PCLMULQDQ accelerated implementation PCLMULQDQ is used to accelerate the most time-consuming part of GHASH, carry-less multiplication. More information about PCLMULQDQ can be found at: http://software.intel.com/en-us/articles/carry-less-multiplication-and-its-usage-for-computing-the-gcm-mode/ Because PCLMULQDQ changes XMM state, its usage must be enclosed with kernel_fpu_begin/end, which can be used only in process context, the acceleration is implemented as crypto_ahash. That is, request in soft IRQ context will be defered to the cryptd kernel thread. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-10-19 11:53:06 +09:00
Shane Wang	f1939f7c56	crypto: vmac - New hash algorithm for intel_txt support This patch adds VMAC (a fast MAC) support into crypto framework. Signed-off-by: Shane Wang <shane.wang@intel.com> Signed-off-by: Joseph Cihula <joseph.cihula@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-09-02 20:05:22 +10:00
Neil Horman	4e4ed83be6	crypto: fips - Depend on ansi_cprng What about something like this? It defaults the CPRNG to m and makes FIPS dependent on the CPRNG. That way you get a module build by default, but you can change it to y manually during config and still satisfy the dependency, and if you select N it disables FIPS as well. I rather like that better than making FIPS a tristate. I just tested it out here and it seems to work well. Let me know what you think Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-08-20 17:54:16 +10:00
Herbert Xu	73fec12094	Revert crypto: fips - Select CPRNG This reverts commit `215ccd6f55`. It causes CPRNG and everything selected by it to be built-in whenever FIPS is enabled. The problem is that it is selecting a tristate from a bool, which is usually not what is intended. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-08-13 22:41:25 +10:00
Huang Ying	9382d97af5	crypto: gcm - Use GHASH digest algorithm Remove the dedicated GHASH implementation in GCM, and uses the GHASH digest algorithm instead. This will make GCM uses hardware accelerated GHASH implementation automatically if available. ahash instead of shash interface is used, because some hardware accelerated GHASH implementation needs asynchronous interface. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-08-06 15:34:26 +10:00
Huang Ying	2cdc6899a8	crypto: ghash - Add GHASH digest algorithm for GCM GHASH is implemented as a shash algorithm. The actual implementation is copied from gcm.c. This makes it possible to add architecture/hardware accelerated GHASH implementation. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-08-06 15:32:38 +10:00
Neil Horman	215ccd6f55	crypto: fips - Select CPRNG The ANSI CPRNG has no dependence on FIPS support. FIPS support however, requires the use of the CPRNG. Adjust that depedency relationship in Kconfig. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-21 21:38:03 +08:00
Herbert Xu	27300176d7	crypto: ansi_cprng - Do not select FIPS The RNG should work with FIPS disabled. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-19 20:32:58 +08:00
Huang Ying	2cf4ac8beb	crypto: aes-ni - Add support for more modes Because kernel_fpu_begin() and kernel_fpu_end() operations are too slow, the performance gain of general mode implementation + aes-aesni is almost all compensated. The AES-NI support for more modes are implemented as follow: - Add a new AES algorithm implementation named __aes-aesni without kernel_fpu_begin/end() - Use fpu(<mode>(AES)) to provide kenrel_fpu_begin/end() invoking - Add <mode>(AES) ablkcipher, which uses cryptd(fpu(<mode>(AES))) to defer cryption to cryptd context in soft_irq context. Now the ctr, lrw, pcbc and xts support are added. Performance testing based on dm-crypt shows that cryption time can be reduced to 50% of general mode implementation + aes-aesni implementation. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-02 14:04:16 +10:00
Huang Ying	150c7e8552	crypto: fpu - Add template for blkcipher touching FPU Blkcipher touching FPU need to be enclosed by kernel_fpu_begin() and kernel_fpu_end(). If they are invoked in cipher algorithm implementation, they will be invoked for each block, so that performance will be hurt, because they are "slow" operations. This patch implements "fpu" template, which makes these operations to be invoked for each request. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-06-02 14:04:15 +10:00
Geert Uytterhoeven	0c01aed50d	crypto: testmgr - add zlib test Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:42:15 +08:00
Geert Uytterhoeven	bf68e65ec9	crypto: zlib - New zlib crypto module, using pcomp Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:16:19 +08:00
Geert Uytterhoeven	a1d2f09544	crypto: compress - Add pcomp interface The current "comp" crypto interface supports one-shot (de)compression only, i.e. the whole data buffer to be (de)compressed must be passed at once, and the whole (de)compressed data buffer will be received at once. In several use-cases (e.g. compressed file systems that store files in big compressed blocks), this workflow is not suitable. Furthermore, the "comp" type doesn't provide for the configuration of (de)compression parameters, and always allocates workspace memory for both compression and decompression, which may waste memory. To solve this, add a "pcomp" partial (de)compression interface that provides the following operations: - crypto_compress_{init,update,final}() for compression, - crypto_decompress_{init,update,final}() for decompression, - crypto_{,de}compress_setup(), to configure (de)compression parameters (incl. allocating workspace memory). The (de)compression methods take a struct comp_request, which was mimicked after the z_stream object in zlib, and contains buffer pointer and length pairs for input and output. The setup methods take an opaque parameter pointer and length pair. Parameters are supposed to be encoded using netlink attributes, whose meanings depend on the actual (name of the) (de)compression algorithm. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-03-04 15:05:33 +08:00
Huang Ying	0a2e821d62	crypto: chainiv - Use kcrypto_wq instead of keventd_wq keventd_wq has potential starvation problem, so use dedicated kcrypto_wq instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:44:02 +08:00
Huang Ying	254eff7714	crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq Original cryptd thread implementation has scalability issue, this patch solve the issue with a per-CPU thread implementation. struct cryptd_queue is defined to be a per-CPU queue, which holds one struct cryptd_cpu_queue for each CPU. In struct cryptd_cpu_queue, a struct crypto_queue holds all requests for the CPU, a struct work_struct is used to run all requests for the CPU. Testing based on dm-crypt on an Intel Core 2 E6400 (two cores) machine shows 19.2% performance gain. The testing script is as follow: -------------------- script begin --------------------------- #!/bin/sh dmc_create() { # Create a crypt device using dmsetup dmsetup create $2 --table "0 `blockdev --getsize $1` crypt cbc(aes-asm)?cryptd?plain:plain babebabebabebabebabebabebabebabe 0 $1 0" } dmsetup remove crypt0 dmsetup remove crypt1 dd if=/dev/zero of=/dev/ram0 bs=1M count=4 >& /dev/null dd if=/dev/zero of=/dev/ram1 bs=1M count=4 >& /dev/null dmc_create /dev/ram0 crypt0 dmc_create /dev/ram1 crypt1 cat >tr.sh <<EOF #!/bin/sh for n in \$(seq 10); do dd if=/dev/dm-0 of=/dev/null >& /dev/null & dd if=/dev/dm-1 of=/dev/null >& /dev/null & done wait EOF for n in $(seq 10); do /usr/bin/time sh tr.sh done rm tr.sh -------------------- script end --------------------------- The separator of dm-crypt parameter is changed from "-" to "?", because "-" is used in some cipher driver name too, and cryptds need to specify cipher driver name instead of cipher name. The test result on an Intel Core2 E6400 (two cores) is as follow: without patch: -----------------wo begin -------------------------- 0.04user 0.38system 0:00.39elapsed 107%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6566minor)pagefaults 0swaps 0.07user 0.35system 0:00.35elapsed 121%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6567minor)pagefaults 0swaps 0.06user 0.34system 0:00.30elapsed 135%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.37system 0:00.36elapsed 119%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6607minor)pagefaults 0swaps 0.06user 0.36system 0:00.35elapsed 120%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.37system 0:00.31elapsed 136%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6594minor)pagefaults 0swaps 0.04user 0.34system 0:00.30elapsed 126%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6597minor)pagefaults 0swaps 0.06user 0.32system 0:00.31elapsed 125%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6571minor)pagefaults 0swaps 0.06user 0.34system 0:00.31elapsed 134%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6581minor)pagefaults 0swaps 0.05user 0.38system 0:00.31elapsed 138%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6600minor)pagefaults 0swaps -----------------wo end -------------------------- with patch: ------------------w begin -------------------------- 0.02user 0.31system 0:00.24elapsed 141%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6554minor)pagefaults 0swaps 0.05user 0.34system 0:00.31elapsed 127%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6606minor)pagefaults 0swaps 0.07user 0.33system 0:00.26elapsed 155%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6559minor)pagefaults 0swaps 0.07user 0.32system 0:00.26elapsed 151%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.05user 0.34system 0:00.26elapsed 150%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6603minor)pagefaults 0swaps 0.03user 0.36system 0:00.31elapsed 124%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6586minor)pagefaults 0swaps 0.03user 0.37system 0:00.27elapsed 146%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6562minor)pagefaults 0swaps 0.04user 0.36system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6594minor)pagefaults 0swaps 0.04user 0.35system 0:00.26elapsed 154%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+6557minor)pagefaults 0swaps ------------------w end -------------------------- The middle value of elapsed time is: wo cryptwq: 0.31 w cryptwq: 0.26 The performance gain is about (0.31-0.26)/0.26 = 0.192. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:42:19 +08:00
Huang Ying	25c38d3fb9	crypto: api - Use dedicated workqueue for crypto subsystem Use dedicated workqueue for crypto subsystem A dedicated workqueue named kcrypto_wq is created to be used by crypto subsystem. The system shared keventd_wq is not suitable for encryption/decryption, because of potential starvation problem. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-19 14:33:40 +08:00
Huang Ying	54b6a1bd53	crypto: aes-ni - Add support to Intel AES-NI instructions for x86_64 platform Intel AES-NI is a new set of Single Instruction Multiple Data (SIMD) instructions that are going to be introduced in the next generation of Intel processor, as of 2009. These instructions enable fast and secure data encryption and decryption, using the Advanced Encryption Standard (AES), defined by FIPS Publication number 197. The architecture introduces six instructions that offer full hardware support for AES. Four of them support high performance data encryption and decryption, and the other two instructions support the AES key expansion procedure. The white paper can be downloaded from: http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf AES may be used in soft_irq context, but MMX/SSE context can not be touched safely in soft_irq context. So in_interrupt() is checked, if in IRQ or soft_irq context, the general x86_64 implementation are used instead. Signed-off-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2009-02-18 16:48:06 +08:00
Adrian-Ken Rueegsegger	bd9d20dba1	crypto: sha512 - Switch to shash This patch changes sha512 and sha384 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:27 +11:00
Adrian-Ken Rueegsegger	19e2bf1467	crypto: michael_mic - Switch to shash This patch changes michael_mic to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:24 +11:00
Adrian-Ken Rueegsegger	4946510baa	crypto: wp512 - Switch to shash This patch changes wp512, wp384 and wp256 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:22 +11:00
Adrian-Ken Rueegsegger	f63fbd3d50	crypto: tgr192 - Switch to shash This patch changes tgr192, tgr160 and tgr128 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:21 +11:00
Adrian-Ken Rueegsegger	50e109b5b9	crypto: sha256 - Switch to shash This patch changes sha256 and sha224 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:19 +11:00
Adrian-Ken Rueegsegger	14b75ba70d	crypto: md5 - Switch to shash This patch changes md5 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:18 +11:00
Adrian-Ken Rueegsegger	808a1763ce	crypto: md4 - Switch to shash This patch changes md4 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:16 +11:00
Adrian-Ken Rueegsegger	54ccb36776	crypto: sha1 - Switch to shash This patch changes sha1 to the new shash interface. Signed-off-by: Adrian-Ken Rueegsegger <ken@codelabs.ch> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:15 +11:00
Herbert Xu	3b8efb4c41	crypto: rmd320 - Switch to shash This patch changes rmd320 to the new shash interface. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2008-12-25 11:02:13 +11:00

1 2 3

133 Commits