kernel_optimize_test

Author	SHA1	Message	Date
Stephan Mueller	2d45a7e898	crypto: af_alg - get_page upon reassignment to TX SGL When a page is assigned to a TX SGL, call get_page to increment the reference counter. It is possible that one page is referenced in multiple SGLs: - in the global TX SGL in case a previous af_alg_pull_tsgl only reassigned parts of a page to a per-request TX SGL - in the per-request TX SGL as assigned by af_alg_pull_tsgl Note, multiple requests can be active at the same time whose TX SGLs all point to different parts of the same page. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 15:03:27 +08:00
Christophe Jaillet	32e67b9af0	crypto: cavium/nitrox - Fix an error handling path in 'nitrox_probe()' 'err' is known to be 0 at this point. If 'kzalloc()' fails, returns -ENOMEM instead of 0 which means success. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:55 +08:00
Christophe Jaillet	b7d65fe181	crypto: inside-secure - fix an error handling path in safexcel_probe() 'ret' is known to be 0 at this point. If 'safexcel_request_ring_irq()' fails, it returns an error code. Return this value instead of 0 which means success. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:55 +08:00
Zain Wang	5a7801f663	crypto: rockchip - Don't dequeue the request when device is busy The device can only process one request at a time. So if multiple requests came at the same time, we can enqueue them first, and dequeue them one by one when the device is idle. Signed-off-by: zain wang <wzz@rock-chips.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:54 +08:00
Corentin LABBE	baf5b752da	crypto: cavium - add release_firmware to all return case Two return case misses to call release_firmware() and so leak some memory. This patch create a fw_release label (and so a common error path) and use it on all return case. Detected by CoverityScan, CID#1416422 ("Resource Leak") Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:54 +08:00
Arvind Yadav	249cb06325	crypto: sahara - constify platform_device_id platform_device_id are not supposed to change at runtime. All functions working with platform_device_id provided by <linux/platform_device.h> work with const platform_device_id. So mark the non-const structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:53 +08:00
Lars Persson	f93ed02818	MAINTAINERS: Add ARTPEC crypto maintainer Assign the Axis kernel team as maintainer for crypto drivers under drivers/crypto/axis. Signed-off-by: Lars Persson <larper@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:53 +08:00
Lars Persson	a21eb94fc4	crypto: axis - add ARTPEC-6/7 crypto accelerator driver This is an asynchronous crypto API driver for the accelerator present in the ARTPEC-6 and -7 SoCs from Axis Communications AB. The driver supports AES in ECB/CTR/CBC/XTS/GCM modes and SHA1/2 hash standards. Signed-off-by: Lars Persson <larper@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:52 +08:00
Rabin Vincent	6f7473c524	crypto: hash - add crypto_(un)register_ahashes() There are already helpers to (un)register multiple normal and AEAD algos. Add one for ahashes too. Signed-off-by: Lars Persson <larper@axis.com> Signed-off-by: Rabin Vincent <rabinv@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:52 +08:00
Lars Persson	ac6b6f4531	dt-bindings: crypto: add ARTPEC crypto Document the device tree bindings for the ARTPEC crypto accelerator on ARTPEC-6 and ARTPEC-7 SoCs. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Lars Persson <larper@axis.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:51 +08:00
Stephan Mueller	75d11e7535	crypto: algif_aead - fix comment regarding memory layout Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:54:50 +08:00
Herbert Xu	e90c48efde	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Merge the crypto tree to resolve the conflict between the temporary and long-term fixes in algif_skcipher.	2017-08-22 14:53:32 +08:00
Stephan Mueller	445a582738	crypto: algif_skcipher - only call put_page on referenced and used pages For asynchronous operation, SGs are allocated without a page mapped to them or with a page that is not used (ref-counted). If the SGL is freed, the code must only call put_page for an SG if there was a page assigned and ref-counted in the first place. This fixes a kernel crash when using io_submit with more than one iocb using the sendmsg and sendpage (vmsplice/splice) interface. Cc: <stable@vger.kernel.org> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:45:48 +08:00
Ard Biesheuvel	549f64153c	crypto: testmgr - add chunked test cases for chacha20 We failed to catch a bug in the chacha20 code after porting it to the skcipher API. We would have caught it if any chunked tests had been defined, so define some now so we will catch future regressions. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:45:48 +08:00
Ard Biesheuvel	4de437265e	crypto: chacha20 - fix handling of chunked input Commit `9ae433bc79` ("crypto: chacha20 - convert generic and x86 versions to skcipher") ported the existing chacha20 code to use the new skcipher API, and introduced a bug along the way. Unfortunately, the tcrypt tests did not catch the error, and it was only found recently by Tobias. Stefan kindly diagnosed the error, and proposed a fix which is similar to the one below, with the exception that 'walk.stride' is used rather than the hardcoded block size. This does not actually matter in this case, but it's a better example of how to use the skcipher walk API. Fixes: `9ae433bc79` ("crypto: chacha20 - convert generic and x86 ...") Cc: <stable@vger.kernel.org> # v4.11+ Cc: Steffen Klassert <steffen.klassert@secunet.com> Reported-by: Tobias Brunner <tobias@strongswan.org> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:45:47 +08:00
Stephan Mueller	dea3eb8b45	lib/mpi: kunmap after finishing accessing buffer Using sg_miter_start and sg_miter_next, the buffer of an SG is kmap'ed to buff. The current code calls sg_miter_stop (and thus kunmap) on the SG entry before the last access of buff. The patch moves the sg_miter_stop call after the last access to buff to ensure that the memory pointed to by buff is still mapped. Fixes: `4816c94064` ("lib/mpi: Fix SG miter leak") Cc: <stable@vger.kernel.org> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-22 14:45:01 +08:00
Pan Bian	ef4064bb3f	crypto: ccp - use dma_mapping_error to check map error The return value of dma_map_single() should be checked by dma_mapping_error(). However, in function ccp_init_dm_workarea(), its return value is checked against NULL, which could result in failures. Signed-off-by: Pan Bian <bianpan2016@163.com> Acked-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-17 16:53:33 +08:00
Stefan Agner	dea632cadd	lib/mpi: fix build with clang Use just @ to denote comments which works with gcc and clang. Otherwise clang reports an escape sequence error: error: invalid % escape in inline assembly string Use %0-%3 as operand references, this avoids: error: invalid operand in inline asm: 'umull ${1:r}, ${0:r}, ${2:r}, ${3:r}' Also remove superfluous casts on output operands to avoid warnings such as: warning: invalid use of a cast in an inline asm context requiring an l-value Signed-off-by: Stefan Agner <stefan@agner.ch> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-17 16:53:31 +08:00
Mogens Lauridsen	67adc432f4	crypto: sahara - Remove leftover from previous used spinlock This driver previously used a spinlock. The spinlock is not used any more, but the spinlock variable was still there and also being initialized. Signed-off-by: Mogens Lauridsen <mlauridsen171@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-17 16:53:28 +08:00
Mogens Lauridsen	1e3204104c	crypto: sahara - Fix dma unmap direction The direction used in dma_unmap_sg in aes calc is wrong. This result in the cache not being invalidated correct when aes calculation is done and result has been dma'ed to memory. This is seen as sporadic wrong result from aes calc. Signed-off-by: Mogens Lauridsen <mlauridsen171@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-17 16:53:26 +08:00
Stephan Mueller	2d97591ef4	crypto: af_alg - consolidation of duplicate code Consolidate following data structures: skcipher_async_req, aead_async_req -> af_alg_async_req skcipher_rsgl, aead_rsql -> af_alg_rsgl skcipher_tsgl, aead_tsql -> af_alg_tsgl skcipher_ctx, aead_ctx -> af_alg_ctx Consolidate following functions: skcipher_sndbuf, aead_sndbuf -> af_alg_sndbuf skcipher_writable, aead_writable -> af_alg_writable skcipher_rcvbuf, aead_rcvbuf -> af_alg_rcvbuf skcipher_readable, aead_readable -> af_alg_readable aead_alloc_tsgl, skcipher_alloc_tsgl -> af_alg_alloc_tsgl aead_count_tsgl, skcipher_count_tsgl -> af_alg_count_tsgl aead_pull_tsgl, skcipher_pull_tsgl -> af_alg_pull_tsgl aead_free_areq_sgls, skcipher_free_areq_sgls -> af_alg_free_areq_sgls aead_wait_for_wmem, skcipher_wait_for_wmem -> af_alg_wait_for_wmem aead_wmem_wakeup, skcipher_wmem_wakeup -> af_alg_wmem_wakeup aead_wait_for_data, skcipher_wait_for_data -> af_alg_wait_for_data aead_data_wakeup, skcipher_data_wakeup -> af_alg_data_wakeup aead_sendmsg, skcipher_sendmsg -> af_alg_sendmsg aead_sendpage, skcipher_sendpage -> af_alg_sendpage aead_async_cb, skcipher_async_cb -> af_alg_async_cb aead_poll, skcipher_poll -> af_alg_poll Split out the following common code from recvmsg: af_alg_alloc_areq: allocation of the request data structure for the cipher operation af_alg_get_rsgl: creation of the RX SGL anchored in the request data structure The following changes to the implementation without affecting the functionality have been applied to synchronize slightly different code bases in algif_skcipher and algif_aead: The wakeup in af_alg_wait_for_data is triggered when either more data is received or the indicator that more data is to be expected is released. The first is triggered by user space, the second is triggered by the kernel upon finishing the processing of data (i.e. the kernel is ready for more). af_alg_sendmsg uses size_t in min_t calculation for obtaining len. Return code determination is consistent with algif_skcipher. The scope of the variable i is reduced to match algif_aead. The type of the variable i is switched from int to unsigned int to match algif_aead. af_alg_sendpage does not contain the superfluous err = 0 from aead_sendpage. af_alg_async_cb requires to store the number of output bytes in areq->outlen before the AIO callback is triggered. The POLLIN / POLLRDNORM is now set when either not more data is given or the kernel is supplied with data. This is consistent to the wakeup from sleep when the kernel waits for data. The request data structure is extended by the field last_rsgl which points to the last RX SGL list entry. This shall help recvmsg implementation to chain the RX SGL to other SG(L)s if needed. It is currently used by algif_aead which chains the tag SGL to the RX SGL during decryption. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:18:32 +08:00
Fabio Estevam	a92f7af385	crypto: caam - Remove unused dentry members Most of the dentry members from structure caam_drv_private are never used at all, so it is safe to remove them. Since debugfs_remove_recursive() is called, we don't need the file entries. Signed-off-by: Fabio Estevam <festevam@gmail.com> Acked-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:18:29 +08:00
Arnd Bergmann	ac360faf95	crypto: ccp - select CONFIG_CRYPTO_RSA Without the base RSA code, we run into a link error: ERROR: "rsa_parse_pub_key" [drivers/crypto/ccp/ccp-crypto.ko] undefined! ERROR: "rsa_parse_priv_key" [drivers/crypto/ccp/ccp-crypto.ko] undefined! Like the other drivers implementing RSA in hardware, this can be avoided by always enabling the base support when we build CCP. Fixes: `ceeec0afd6` ("crypto: ccp - Add support for RSA on the CCP") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:17:59 +08:00
Arnd Bergmann	d634baea62	crypto: ccp - avoid uninitialized variable warning The added support for version 5 CCPs introduced a false-positive warning in the RSA implementation: drivers/crypto/ccp/ccp-ops.c: In function 'ccp_run_rsa_cmd': drivers/crypto/ccp/ccp-ops.c:1856:3: error: 'sb_count' may be used uninitialized in this function [-Werror=maybe-uninitialized] This changes the code in a way that should make it easier for the compiler to track the state of the sb_count variable, and avoid the warning. Fixes: `6ba46c7d4d` ("crypto: ccp - Fix base RSA function for version 5 CCPs") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:17:58 +08:00
Arnd Bergmann	c871c10e4e	crypto: serpent - improve __serpent_setkey with UBSAN When UBSAN is enabled, we get a very large stack frame for __serpent_setkey, when the register allocator ends up using more registers than it has, and has to spill temporary values to the stack. The code was originally optimized for in-order x86-32 CPU implementations using older compilers, but it now runs into a highly suboptimal case on all CPU architectures, as seen by this warning: crypto/serpent_generic.c: In function '__serpent_setkey': crypto/serpent_generic.c:436:1: error: the frame size of 2720 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Disabling -fsanitize=alignment would avoid that warning, presumably the option turns off a optimization step that is required for getting the register allocation right, but there is no easy way to do that on gcc-7 (gcc-8 introduces a function attribute for this). I tried to figure out a way to modify the source code instead, and noticed that the two stages of the setkey() function (keyiter and sbox) each are fine by themselves, but not when combined into one function. Splitting out the entire sbox into a separate function also happens to work fine with all compilers I tried (arm, arm64 and x86). The setkey function uses a strange way to handle offsets into the key array, using both negative and positive index values, as well as adjusting the array pointer back and forth. I have checked that this actually makes no difference to modern compilers, but I left that untouched to make the patch easier to review and to keep the code closer to the reference implementation. Link: https://patchwork.kernel.org/patch/9189575/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:17:54 +08:00
Stephan Mueller	72548b093e	crypto: algif_aead - copy AAD from src to dst Use the NULL cipher to copy the AAD and PT/CT from the TX SGL to the RX SGL. This allows an in-place crypto operation on the RX SGL for encryption, because the TX data is always smaller or equal to the RX data (the RX data will hold the tag). For decryption, a per-request TX SGL is created which will only hold the tag value. As the RX SGL will have no space for the tag value and an in-place operation will not write the tag buffer, the TX SGL with the tag value is chained to the RX SGL. This now allows an in-place crypto operation. For example: * without the patch: kcapi -x 2 -e -c "gcm(aes)" -p 89154d0d4129d322e4487bafaa4f6b46 -k c0ece3e63198af382b5603331cc23fa8 -i 7e489b83622e7228314d878d -a afcd7202d621e06ca53b70c2bdff7fb2 -l 16 -u -s 00000000000000000000000000000000f4a3eacfbdadd3b1a17117b1d67ffc1f1e21efbbc6d83724a8c296e3bb8cda0c * with the patch: kcapi -x 2 -e -c "gcm(aes)" -p 89154d0d4129d322e4487bafaa4f6b46 -k c0ece3e63198af382b5603331cc23fa8 -i 7e489b83622e7228314d878d -a afcd7202d621e06ca53b70c2bdff7fb2 -l 16 -u -s afcd7202d621e06ca53b70c2bdff7fb2f4a3eacfbdadd3b1a17117b1d67ffc1f1e21efbbc6d83724a8c296e3bb8cda0c Tests covering this functionality have been added to libkcapi. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:17:52 +08:00
Stephan Mueller	5703c826b7	crypto: algif - return error code when no data was processed If no data has been processed during recvmsg, return the error code. This covers all errors received during non-AIO operations. If any error occurs during a synchronous operation in addition to -EIOCBQUEUED or -EBADMSG (like -ENOMEM), it should be relayed to the caller. Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:17:50 +08:00
megha.dey@linux.intel.com	8861249c74	crypto: x86/sha1 - Fix reads beyond the number of blocks passed It was reported that the sha1 AVX2 function(sha1_transform_avx2) is reading ahead beyond its intended data, and causing a crash if the next block is beyond page boundary: http://marc.info/?l=linux-crypto-vger&m=149373371023377 This patch makes sure that there is no overflow for any buffer length. It passes the tests written by Jan Stancek that revealed this problem: https://github.com/jstancek/sha1-avx2-crash I have re-enabled sha1-avx2 by reverting commit `b82ce24426` Cc: <stable@vger.kernel.org> Fixes: `b82ce24426` ("crypto: sha1-ssse3 - Disable avx2") Originally-by: Ilya Albrekht <ilya.albrekht@intel.com> Tested-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Megha Dey <megha.dey@linux.intel.com> Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-09 20:01:37 +08:00
Herbert Xu	28389575a8	crypto: ixp4xx - Fix error handling path in 'aead_perform()' In commit `0f987e25cb`, the source processing has been moved in front of the destination processing, but the error handling path has not been modified accordingly. Free resources in the correct order to avoid some leaks. Cc: <stable@vger.kernel.org> Fixes: `0f987e25cb` ("crypto: ixp4xx - Fix false lastlen uninitialised warning") Reported-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Reviewed-by: Arnd Bergmann <arnd@arndb.de>	2017-08-09 20:01:33 +08:00
Gary R Hook	5060ffc97b	crypto: ccp - Add XTS-AES-256 support for CCP version 5 Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:44 +08:00
Gary R Hook	7f7216cfaf	crypto: ccp - Rework the unit-size check for XTS-AES The CCP supports a limited set of unit-size values. Change the check for this parameter such that acceptable values match the enumeration. Then clarify the conditions under which we must use the fallback implementation. Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:43 +08:00
Gary R Hook	47f27f160b	crypto: ccp - Add a call to xts_check_key() Vet the key using the available standard function Signed-off-by: Gary R Hook <gary.hook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:42 +08:00
Gary R Hook	e652399edb	crypto: ccp - Fix XTS-AES-128 support on v5 CCPs Version 5 CCPs have some new requirements for XTS-AES: the type field must be specified, and the key requires 512 bits, with each part occupying 256 bits and padded with zeroes. cc: <stable@vger.kernel.org> # 4.9.x+ Signed-off-by: Gary R Hook <ghook@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:41 +08:00
Ard Biesheuvel	7c83d689c7	crypto: arm64/aes - avoid expanded lookup tables in the final round For the final round, avoid the expanded and padded lookup tables exported by the generic AES driver. Instead, for encryption, we can perform byte loads from the same table we used for the inner rounds, which will still be hot in the caches. For decryption, use the inverse AES Sbox directly, which is 4x smaller than the inverse lookup table exported by the generic driver. This should significantly reduce the Dcache footprint of our code, which makes the code more robust against timing attacks. It does not introduce any additional module dependencies, given that we already rely on the core AES module for the shared key expansion routines. It also frees up register x18, which is not available as a scratch register on all platforms, which and so avoiding it improves shareability of this code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:26 +08:00
Ard Biesheuvel	0d149ce67d	crypto: arm/aes - avoid expanded lookup tables in the final round For the final round, avoid the expanded and padded lookup tables exported by the generic AES driver. Instead, for encryption, we can perform byte loads from the same table we used for the inner rounds, which will still be hot in the caches. For decryption, use the inverse AES Sbox directly, which is 4x smaller than the inverse lookup table exported by the generic driver. This should significantly reduce the Dcache footprint of our code, which makes the code more robust against timing attacks. It does not introduce any additional module dependencies, given that we already rely on the core AES module for the shared key expansion routines. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:25 +08:00
Ard Biesheuvel	03c9a333fe	crypto: arm64/ghash - add NEON accelerated fallback for 64-bit PMULL Implement a NEON fallback for systems that do support NEON but have no support for the optional 64x64->128 polynomial multiplication instruction that is part of the ARMv8 Crypto Extensions. It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572), but has been reworked extensively for the AArch64 ISA. On a low-end core such as the Cortex-A53 found in the Raspberry Pi3, the NEON based implementation is 4x faster than the table based one, and is time invariant as well, making it less vulnerable to timing attacks. When combined with the bit-sliced NEON implementation of AES-CTR, the AES-GCM performance increases by 2x (from 58 to 29 cycles per byte). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:25 +08:00
Ard Biesheuvel	3759ee0572	crypto: arm/ghash - add NEON accelerated fallback for vmull.p64 Implement a NEON fallback for systems that do support NEON but have no support for the optional 64x64->128 polynomial multiplication instruction that is part of the ARMv8 Crypto Extensions. It is based on the paper "Fast Software Polynomial Multiplication on ARM Processors Using the NEON Engine" by Danilo Camara, Conrado Gouvea, Julio Lopez and Ricardo Dahab (https://hal.inria.fr/hal-01506572) On a 32-bit guest executing under KVM on a Cortex-A57, the new code is not only 4x faster than the generic table based GHASH driver, it is also time invariant. (Note that the existing vmull.p64 code is 16x faster on this core). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:24 +08:00
Ard Biesheuvel	537c1445ab	crypto: arm64/gcm - implement native driver using v8 Crypto Extensions Currently, the AES-GCM implementation for arm64 systems that support the ARMv8 Crypto Extensions is based on the generic GCM module, which combines the AES-CTR implementation using AES instructions with the PMULL based GHASH driver. This is suboptimal, given the fact that the input data needs to be loaded twice, once for the encryption and again for the MAC calculation. On Cortex-A57 (r1p2) and other recent cores that implement micro-op fusing for the AES instructions, AES executes at less than 1 cycle per byte, which means that any cycles wasted on loading the data twice hurt even more. So implement a new GCM driver that combines the AES and PMULL instructions at the block level. This improves performance on Cortex-A57 by ~37% (from 3.5 cpb to 2.6 cpb) Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:23 +08:00
Ard Biesheuvel	ec808bbef0	crypto: arm64/aes-bs - implement non-SIMD fallback for AES-CTR Of the various chaining modes implemented by the bit sliced AES driver, only CTR is exposed as a synchronous cipher, and requires a fallback in order to remain usable once we update the kernel mode NEON handling logic to disallow nested use. So wire up the existing CTR fallback C code. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:22 +08:00
Ard Biesheuvel	611d5324f4	crypto: arm64/chacha20 - take may_use_simd() into account To accommodate systems that disallow the use of kernel mode NEON in some circumstances, take the return value of may_use_simd into account when deciding whether to invoke the C fallback routine. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:22 +08:00
Ard Biesheuvel	e211506979	crypto: arm64/aes-blk - add a non-SIMD fallback for synchronous CTR To accommodate systems that may disallow use of the NEON in kernel mode in some circumstances, introduce a C fallback for synchronous AES in CTR mode, and use it if may_use_simd() returns false. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:21 +08:00
Ard Biesheuvel	5092fcf349	crypto: arm64/aes-ce-ccm: add non-SIMD generic fallback The arm64 kernel will shortly disallow nested kernel mode NEON. So honour this in the ARMv8 Crypto Extensions implementation of CCM-AES, and fall back to a scalar implementation using the generic crypto helpers for AES, XOR and incrementing the CTR counter. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:21 +08:00
Ard Biesheuvel	b8fb993a83	crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:20 +08:00
Ard Biesheuvel	f402e3115e	crypto: arm64/aes-ce-cipher - match round key endianness with generic code In order to be able to reuse the generic AES code as a fallback for situations where the NEON may not be used, update the key handling to match the byte order of the generic code: it stores round keys as sequences of 32-bit quantities rather than streams of bytes, and so our code needs to be updated to reflect that. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:19 +08:00
Ard Biesheuvel	da1793312f	crypto: arm64/sha2-ce - add non-SIMD scalar fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:19 +08:00
Ard Biesheuvel	0771f3234d	crypto: arm64/sha1-ce - add non-SIMD generic fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:18 +08:00
Ard Biesheuvel	15c7d8f8a2	crypto: arm64/crc32 - add non-SIMD scalar fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:17 +08:00
Ard Biesheuvel	2dde374e1f	crypto: arm64/crct10dif - add non-SIMD generic fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:16 +08:00
Ard Biesheuvel	6d6254d728	crypto: arm64/ghash-ce - add non-SIMD scalar fallback The arm64 kernel will shortly disallow nested kernel mode NEON, so add a fallback to scalar C code that can be invoked in that case. Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:16 +08:00
Ard Biesheuvel	45fe93dff2	crypto: algapi - make crypto_xor() take separate dst and src arguments There are quite a number of occurrences in the kernel of the pattern if (dst != src) memcpy(dst, src, walk.total % AES_BLOCK_SIZE); crypto_xor(dst, final, walk.total % AES_BLOCK_SIZE); or crypto_xor(keystream, src, nbytes); memcpy(dst, keystream, nbytes); where crypto_xor() is preceded or followed by a memcpy() invocation that is only there because crypto_xor() uses its output parameter as one of the inputs. To avoid having to add new instances of this pattern in the arm64 code, which will be refactored to implement non-SIMD fallbacks, add an alternative implementation called crypto_xor_cpy(), taking separate input and output arguments. This removes the need for the separate memcpy(). Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2017-08-04 09:27:15 +08:00

1 2 3 4 5 ...

691601 Commits