kernel_optimize_test

Rasmus Villemoes 94a58c360a slab.h: sprinkle __assume_aligned attributes	2015-11-20 16:17:32 -08:00

The various allocators return aligned memory. Telling the compiler that allows it to generate better code in many cases, for example when the return value is immediately passed to memset().

Some code does become larger, but at least we win twice as much as we lose:

    $ scripts/bloat-o-meter /tmp/vmlinux vmlinux
    add/remove: 0/0 grow/shrink: 13/52 up/down: 995/-2140 (-1145)

An example of the different (and smaller) code can be seen in mm_alloc(). Before:

    : 48 8d 78 08             lea    0x8(%rax),%rdi
    : 48 89 c1                mov    %rax,%rcx
    : 48 89 c2                mov    %rax,%rdx
    : 48 c7 00 00 00 00 00    movq   $0x0,(%rax)
    : 48 c7 80 48 03 00 00    movq   $0x0,0x348(%rax)
    : 00 00 00 00
    : 31 c0                   xor    %eax,%eax
    : 48 83 e7 f8             and    $0xfffffffffffffff8,%rdi
    : 48 29 f9                sub    %rdi,%rcx
    : 81 c1 50 03 00 00       add    $0x350,%ecx
    : c1 e9 03                shr    $0x3,%ecx
    : f3 48 ab                rep stos %rax,%es:(%rdi)

After:

    : 48 89 c2                mov    %rax,%rdx
    : b9 6a 00 00 00          mov    $0x6a,%ecx
    : 31 c0                   xor    %eax,%eax
    : 48 89 d7                mov    %rdx,%rdi
    : f3 48 ab                rep stos %rax,%es:(%rdi)

So gcc's strategy is to do two possibly (but not really, of course) unaligned stores to the first and last word, then do an aligned rep stos covering the middle part with a little overlap. Maybe arches which do not allow unaligned stores gain even more.

I don't know if gcc can actually make use of alignments greater than 8 for anything, so one could probably drop the __assume_xyz_alignment macros and just use __assume_aligned(8).

The increases in code size are mostly caused by gcc deciding to opencode strlen() using the check-four-bytes-at-a-time trick when it knows the buffer is sufficiently aligned (one function grew by 200 bytes). Now it turns out that many of these strlen() calls showing up were in fact redundant, and they're gone from -next. Applying the two patches to next-20151001 bloat-o-meter instead says

    add/remove: 0/0 grow/shrink: 6/52 up/down: 244/-2140 (-1896)

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
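The alignment hint that the commit message describes can be tried in user space. The following is only a minimal sketch, not the kernel's code: `my_assume_aligned`, `demo_alloc`, `demo_obj_alloc` and `struct demo_obj` are hypothetical names chosen for illustration, and the sketch assumes a compiler that supports the GCC `assume_aligned` function attribute (GCC 4.9 or later, or Clang) and a `malloc()` that returns memory aligned to at least 8 bytes, as mainstream 64-bit platforms do. The object size of 0x350 bytes mirrors the mm_alloc() disassembly, where the "After" code loads a count of 0x6a eight-byte words.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for the kernel's __assume_aligned().
 * On compilers without the attribute it expands to nothing. */
#if defined(__GNUC__)
#define my_assume_aligned(a) __attribute__((__assume_aligned__(a)))
#else
#define my_assume_aligned(a)
#endif

/* Hypothetical object: 0x6a (106) eight-byte words = 0x350 bytes. */
struct demo_obj {
    uint64_t words[0x6a];
};

/* The attribute goes on the declaration. It is a promise to the compiler
 * that every returned pointer is 8-byte aligned; breaking that promise
 * is undefined behaviour. */
void *demo_alloc(size_t size) my_assume_aligned(8);

void *demo_alloc(size_t size)
{
    void *p = malloc(size);

    /* Check the promise at run time in debug builds (NULL passes too). */
    assert(((uintptr_t)p % 8) == 0);
    return p;
}

/* Allocation immediately followed by memset(): the pattern for which the
 * commit message reports smaller code once the alignment is known. */
struct demo_obj *demo_obj_alloc(void)
{
    struct demo_obj *obj = demo_alloc(sizeof(*obj));

    if (obj)
        memset(obj, 0, sizeof(*obj));
    return obj;
}
```

Compiling this with optimization enabled and comparing the disassembly of `demo_obj_alloc()` with and without the attribute is one way to look for the kind of difference shown above. Whether the generated code actually changes depends on the compiler version, target and flags.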
acpi
asm-generic	h8300 update for v4.4	2015-11-12 15:26:39 -08:00
clocksource
crypto
drm	drm/atomic: add a drm_atomic_clean_old_fb helper.	2015-11-17 13:02:14 +02:00
dt-bindings	ARM: DT updates for v4.4	2015-11-10 15:06:26 -08:00
keys
kvm
linux	slab.h: sprinkle __assume_aligned attributes	2015-11-20 16:17:32 -08:00
math-emu
media
memory
misc
net	net: switchdev: fix return code of fdb_dump stub	2015-11-16 15:24:37 -05:00
pcmcia
ras
rdma
rxrpc
scsi	scsi: use host wide tags by default	2015-11-09 17:11:57 -08:00
soc	ARM: SoC driver updates for v4.4	2015-11-10 15:00:03 -08:00
sound
target	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending	2015-11-13 20:04:17 -08:00
trace	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux	2015-11-11 09:03:01 -08:00
uapi	VFIO updates for v4.4-rc1	2015-11-13 17:05:32 -08:00
video
xen
Kbuild