This patch adds a large AES CTR mode test vector. The test vector is
4100 bytes in size. It was generated using a C++ program that called
Crypto++.
Note that this patch increases considerably the size of "struct
cipher_testvec" and hence the size of tcrypt.ko.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently the number of entries in a cipher test vector template is
limited by TVMEMSIZE/sizeof(struct cipher_testvec). This patch
circumvents the problem by pointing cipher_tv to each entry in the
template, rather than the template itself.
Signed-off-by: Tan Swee Heng <thesweeheng@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
When the data spans across a page boundary, CTR may incorrectly process
a partial block in the middle because the blkcipher walking code may
supply partial blocks in the middle as long as the total length of the
supplied data is more than a block. CTR is supposed to return any unused
partial block in that case to the walker.
This patch fixes this by doing exactly that, returning partial blocks to
the walker unless we received less than a block-worth of data to start
with.
This also allows us to optimise the bulk of the processing since we no
longer have to worry about partial blocks until the very end.
Thanks to Tan Swee Heng for fixes and actually testing this :)
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add GCM/GMAC support to cryptoapi.
GCM (Galois/Counter Mode) is an AEAD mode of operations for any block cipher
with a block size of 16. The typical example is AES-GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add AEAD support to tcrypt, needed by GCM.
Signed-off-by: Mikko Herranen <mh1@iki.fi>
Reviewed-by: Mika Kukkonen <mika.kukkonen@nsn.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Analogously to camellia7 patch, move
"absorb kw2 to other subkeys" and "absorb kw4 to other subkeys"
code parts into camellia_setup_tail(). This further reduces
source and object code size at the cost of two brances
in key setup code.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move "key XOR is end of F-function" code part into
camellia_setup_tail(), it is sufficiently similar
between camellia_setup128 and camellia_setup256.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
unifies encrypt/decrypt routines for different key lengths.
This reduces module size by ~25%, with tiny (less than 1%)
speed impact.
Also collapses encrypt/decrypt into more readable
(visually shorter) form using macros.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Remove unused macro params.
Use (u8)(expr) instead of (expr) & 0xff,
helps gcc to realize how to use simpler commands.
Move CAMELLIA_FLS macro closer to encrypt/decrypt routines.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch replaces the custom xor in CBC with the generic crypto_xor.
It changes the operations for in-place encryption slightly to avoid
calling crypto_xor with tmpbuf since it is not necessarily aligned.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
All common block ciphers have a block size that's a power of 2. In fact,
all of our block ciphers obey this rule.
If we require this then CBC can be optimised to avoid an expensive divide
on in-place decryption.
I've also changed the saving of the first IV in the in-place decryption
case to the last IV because that lets us use walk->iv (which is already
aligned) for the xor operation where alignment is required.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
With the addition of more stream ciphers we need to curb the proliferation
of ad-hoc xor functions. This patch creates a generic pair of functions,
crypto_inc and crypto_xor which does big-endian increment and exclusive or,
respectively.
For optimum performance, they both use u32 operations so alignment must be
as that of u32 even though the arguments are of type u8 *.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Up until now we have ablkcipher algorithms have been identified as
type BLKCIPHER with the ASYNC bit set. This is suboptimal because
ablkcipher refers to two things. On the one hand it refers to the
top-level ablkcipher interface with requests. On the other hand it
refers to and algorithm type underneath.
As it is you cannot request a synchronous block cipher algorithm
with the ablkcipher interface on top. This is a problem because
we want to be able to eventually phase out the blkcipher top-level
interface.
This patch fixes this by making ABLKCIPHER its own type, just as
we have distinct types for HASH and DIGEST. The type it associated
with the algorithm implementation only.
Which top-level interface is used for synchronous block ciphers is
then determined by the mask that's used. If it's a specific mask
then the old blkcipher interface is given, otherwise we go with the
new ablkcipher interface.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch converts the crypto scatterwalk code to use the generic
scatterlist chaining rather the version specific to crypto.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Resubmitting this patch which extends sha256_generic.c to support SHA-224 as
described in FIPS 180-2 and RFC 3874. HMAC-SHA-224 as described in RFC4231
is then supported through the hmac interface.
Patch includes test vectors for SHA-224 and HMAC-SHA-224.
SHA-224 chould be chosen as a hash algorithm when 112 bits of security
strength is required.
Patch generated against the 2.6.24-rc1 kernel and tested against
2.6.24-rc1-git14 which includes fix for scatter gather implementation for HMAC.
Signed-off-by: Jonathan Lynch <jonathan.lynch@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
NO other block mode is M by default.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The setkey() function can be shared with the generic algorithm.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch exports four tables and the set_key() routine. This ressources
can be shared by other AES implementations (aes-x86_64 for instance).
The decryption key has been turned around (deckey[0] is the first piece
of the key instead of deckey[keylen+20]). The encrypt/decrypt functions
are looking now identical (except they are using different tables and
key).
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds countersize to CTR mode.
The template is now ctr(algo,noncesize,ivsize,countersize).
For example, ctr(aes,4,8,4) indicates the counterblock
will be composed of a salt/nonce that is 4 bytes, an iv
that is 8 bytes and the counter is 4 bytes.
When noncesize + ivsize < blocksize, CTR initializes the
last block - ivsize - noncesize portion of the block to
zero. Otherwise the counter block is composed of the IV
(and nonce if necessary).
If noncesize + ivsize == blocksize, then this indicates that
user is passing in entire counterblock. Thus countersize
indicates the amount of bytes in counterblock to use as
the counter for incrementing. CTR will increment counter
portion by 1, and begin encryption with that value.
Note that CTR assumes the counter portion of the block that
will be incremented is stored in big endian.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move huge unrolled pieces of code (3 screenfuls) at the end of
128/256 key setup routines into common camellia_setup_tail(),
convert it to loop there.
Loop is still unrolled six times, so performance hit is very small,
code size win is big.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Optimize GETU32 to use 4-byte memcpy (modern gcc will convert
such memcpy to single move instruction on i386).
Original GETU32 did four byte fetches, and shifted/XORed those.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Rename some macros to shorter names: CAMELLIA_RR8 -> ROR8,
making it easier to understand that it is just a right rotation,
nothing camellia-specific in it.
CAMELLIA_SUBKEY_L() -> SUBKEY_L() - just shorter.
Move be32 <-> cpu conversions out of en/decrypt128/256 and into
camellia_en/decrypt - no reason to have that code duplicated twice.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Move code blocks around so that related pieces are closer together:
e.g. CAMELLIA_ROUNDSM macro does not need to be separated
from the rest of the code by huge array of constants.
Remove unused macros (COPY4WORD, SWAP4WORD, XOR4WORD[2])
Drop SUBL(), SUBR() macros which only obscure things.
Same for CAMELLIA_SP1110() macro and KEY_TABLE_TYPE typedef.
Remove useless comments:
/* encryption */ -- well it's obvious enough already!
void camellia_encrypt128(...)
Combine swap with copying at the beginning/end of encrypt/decrypt.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Noriaki TAKAMIYA <takamiya@po.ntts.co.jp>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently twofish cipher key setup code
has unrolled loops - approximately 70-100
instructions are repeated 40 times.
As a result, twofish module is the biggest module
in crypto/*.
Unrolling produces x2.5 more code (+18k on i386), and speeds up key
setup by 7%:
unrolled: twofish_setkey/sec: 41128
loop: twofish_setkey/sec: 38148
CALC_K256: ~100 insns each
CALC_K192: ~90 insns
CALC_K: ~70 insns
Attached patch removes this unrolling.
$ size */twofish_common.o
text data bss dec hex filename
37920 0 0 37920 9420 crypto.org/twofish_common.o
13209 0 0 13209 3399 crypto/twofish_common.o
Run tested (modprobe tcrypt reports ok). Please apply.
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This three defines are used in all AES related hardware.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HIFN driver update to use DES weak key checks (exported in this patch).
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch creates include/crypto/des.h for common macros shared between
DES implementations.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch implements CTR mode for IPsec.
It is based off of RFC 3686.
Please note:
1. CTR turns a block cipher into a stream cipher.
Encryption is done in blocks, however the last block
may be a partial block.
A "counter block" is encrypted, creating a keystream
that is xor'ed with the plaintext. The counter portion
of the counter block is incremented after each block
of plaintext is encrypted.
Decryption is performed in same manner.
2. The CTR counterblock is composed of,
nonce + IV + counter
The size of the counterblock is equivalent to the
blocksize of the cipher.
sizeof(nonce) + sizeof(IV) + sizeof(counter) = blocksize
The CTR template requires the name of the cipher
algorithm, the sizeof the nonce, and the sizeof the iv.
ctr(cipher,sizeof_nonce,sizeof_iv)
So for example,
ctr(aes,4,8)
specifies the counterblock will be composed of 4 bytes
from a nonce, 8 bytes from the iv, and 4 bytes for counter
since aes has a blocksize of 16 bytes.
3. The counter portion of the counter block is stored
in big endian for conformance to rfc 3686.
Signed-off-by: Joy Latten <latten@austin.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
As it is crypto_remove_spawn may try to unregister an instance which is
yet to be registered. This patch fixes this by checking whether the
instance has been registered before attempting to remove it.
It also removes a bogus cra_destroy check in crypto_register_instance as
1) it's outside the mutex;
2) we have a check in __crypto_register_alg already.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
It seems that newer versions of gcc have regressed in their abilities to
analyse initialisations. This patch moves the initialisations up to avoid
the warnings.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Not architecture specific code should not #include <asm/scatterlist.h>.
This patch therefore either replaces them with
#include <linux/scatterlist.h> or simply removes them if they were
unused.
Signed-off-by: Adrian Bunk <bunk@kernel.org>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
This patch moves the sg_init_table out of the timing loops for hash
algorithms so that it doesn't impact on the speed test results.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
hmac_setkey(), hmac_init(), and hmac_final() have
a singular on-stack scatterlist. Initialit is
using sg_init_one() instead of using sg_set_buf().
Signed-off-by: David S. Miller <davem@davemloft.net>
Crypto now uses SG helper functions. Fix hmac_digest to use those
functions correctly and fix the oops associated with it.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most drivers need to set length and offset as well, so may as well fold
those three lines into one.
Add sg_assign_page() for those two locations that only needed to set
the page, where the offset/length is set outside of the function context.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Convert the subdirectory "crypto" to UTF-8. The files changed are
<crypto/fcrypt.c> and <crypto/api.c>.
Signed-off-by: John Anthony Kazos Jr. <jakj@j-a-k-j.com>
Signed-off-by: Adrian Bunk <bunk@kernel.org>
There are currently several SHA implementations that all define their own
initialization vectors and size values. Since this values are idential
move them to a header file under include/crypto.
Signed-off-by: Jan Glauber <jang@de.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Also remove the probe for sha1 in padlock's init code.
Quote from Herbert:
The probe is actually pointless since we can always probe when
the algorithm is actually used which does not lead to dead-locks
like this.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Additionally it ensures that the generic implementation as well as the
HW driver (if available) is loaded in case the HW driver needs the
generic version as fallback in corner cases.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Loading the crypto algorithm by the alias instead of by module directly
has the advantage that all possible implementations of this algorithm
are loaded automatically and the crypto API can choose the best one
depending on its priority.
Signed-off-by: Sebastian Siewior <sebastian@breakpoint.cc>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds the helper blkcipher_walk_virt_block which is similar to
blkcipher_walk_virt but uses a supplied block size instead of the block
size of the block cipher. This is useful for CTR where the block size is
1 but we still want to walk by the block size of the underlying cipher.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Now that the block size is no longer a multiple of the alignment, we need to
increase the kmalloc amount in blkcipher_next_slow to use the aligned block
size.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>