kernel_optimize_test

Michael Wang 62470419e9 sched: Implement smarter wake-affine logic

The wake-affine scheduler feature is currently always trying to pull the wakee close to the waker. In theory this should be beneficial if the waker's CPU caches hot data for the wakee, and it's also beneficial in the extreme ping-pong high context switch rate case.

Testing shows it can benefit hackbench up to 15%.

However, the feature is somewhat blind, from which some workloads such as pgbench suffer. It's also time-consuming algorithmically.

Testing shows it can damage pgbench up to 50% - far more than the benefit it brings in the best case.

So wake-affine should be smarter and it should realize when to stop its thankless effort at trying to find a suitable CPU to wake on.

This patch introduces 'wakee_flips', which will be increased each time the task flips (switches) its wakee target.

So a high 'wakee_flips' value means the task has more than one wakee, and the bigger the number, the higher the wakeup frequency.

Now when making the decision on whether to pull or not, pay attention to the wakee with a high 'wakee_flips', pulling such a task may benefit the wakee. Also imply that the waker will face cruel competition later, it could be very cruel or very fast depends on the story behind 'wakee_flips', waker therefore suffers.

Furthermore, if waker also has a high 'wakee_flips', that implies that multiple tasks rely on it, then waker's higher latency will damage all of them, so pulling wakee seems to be a bad deal.

Thus, when 'waker->wakee_flips / wakee->wakee_flips' becomes higher and higher, the cost of pulling seems to be worse and worse.

The patch therefore helps the wake-affine feature to stop its pulling work when:

    wakee->wakee_flips > factor &&
    waker->wakee_flips > (factor * wakee->wakee_flips)

The 'factor' here is the number of CPUs in the current CPU's NUMA node, so a bigger node will lead to more pulling since the trial becomes more severe.

After applying the patch, pgbench shows up to 40% improvements and no regressions.

Tested with 12 cpu x86 server and tip 3.10.0-rc7.

The percentages in the final column highlight the areas with the biggest wins, all other areas improved as well:

    pgbench                base        smart

    | db_size | clients |  tps  |   |  tps  |
    +---------+---------+-------+   +-------+
    | 22 MB   |       1 | 10598 |   | 10796 |
    | 22 MB   |       2 | 21257 |   | 21336 |
    | 22 MB   |       4 | 41386 |   | 41622 |
    | 22 MB   |       8 | 51253 |   | 57932 |
    | 22 MB   |      12 | 48570 |   | 54000 |
    | 22 MB   |      16 | 46748 |   | 55982 | +19.75%
    | 22 MB   |      24 | 44346 |   | 55847 | +25.93%
    | 22 MB   |      32 | 43460 |   | 54614 | +25.66%
    | 7484 MB |       1 |  8951 |   |  9193 |
    | 7484 MB |       2 | 19233 |   | 19240 |
    | 7484 MB |       4 | 37239 |   | 37302 |
    | 7484 MB |       8 | 46087 |   | 50018 |
    | 7484 MB |      12 | 42054 |   | 48763 |
    | 7484 MB |      16 | 40765 |   | 51633 | +26.66%
    | 7484 MB |      24 | 37651 |   | 52377 | +39.11%
    | 7484 MB |      32 | 37056 |   | 51108 | +37.92%
    | 15 GB   |       1 |  8845 |   |  9104 |
    | 15 GB   |       2 | 19094 |   | 19162 |
    | 15 GB   |       4 | 36979 |   | 36983 |
    | 15 GB   |       8 | 46087 |   | 49977 |
    | 15 GB   |      12 | 41901 |   | 48591 |
    | 15 GB   |      16 | 40147 |   | 50651 | +26.16%
    | 15 GB   |      24 | 37250 |   | 52365 | +40.58%
    | 15 GB   |      32 | 36470 |   | 50015 | +37.14%

Signed-off-by: Michael Wang <wangyun@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/51D50057.9000809@linux.vnet.ibm.com
[ Improved the changelog. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>

2013-07-23 12:18:41 +02:00
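
The stop-pulling rule described in the commit message above can be modelled in a few lines of user-space C. This is only a sketch of the stated condition, not the kernel code itself: `struct task`, `record_wakeup()` and `should_skip_pull()` are hypothetical names chosen for illustration, and `factor` is passed in as a plain number rather than derived from the NUMA topology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the per-task scheduler state. */
struct task {
    const struct task *last_wakee; /* task most recently woken by this one */
    unsigned long wakee_flips;     /* how often the wakee target changed */
};

/* Count a flip each time the waker wakes a different task than last time. */
void record_wakeup(struct task *waker, const struct task *wakee)
{
    assert(waker != NULL && wakee != NULL);
    if (waker->last_wakee != wakee) {
        waker->last_wakee = wakee;
        waker->wakee_flips++;
    }
}

/*
 * Return true when the affine pull should be skipped, following the
 * condition quoted in the commit message:
 *   wakee->wakee_flips > factor &&
 *   waker->wakee_flips > (factor * wakee->wakee_flips)
 * 'factor' is assumed here to be supplied by the caller (the message
 * describes it as the number of CPUs in the current CPU's NUMA node).
 */
bool should_skip_pull(const struct task *waker, const struct task *wakee,
                      unsigned long factor)
{
    assert(waker != NULL && wakee != NULL);
    return wakee->wakee_flips > factor &&
           waker->wakee_flips > factor * wakee->wakee_flips;
}
```

As a worked case under these assumptions: with `factor` 6, a waker at 100 flips and a wakee at 7 flips satisfy both tests (7 > 6 and 100 > 42), so the pull is skipped. Raising `factor` to 12 makes the first test fail, so the pull goes ahead, which matches the message's remark that a bigger node leads to more pulling.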
..
cpu	idle: Enable interrupts in the weak arch_cpu_idle() implementation	2013-06-14 23:01:05 +02:00
debug	kgdb/sysrq: fix inconstistent help message of sysrq key	2013-04-30 17:04:10 -07:00
events	perf: Update perf_event_type documentation	2013-07-23 12:17:08 +02:00
gcov	kernel/gcov: remove depends on CONFIG_EXPERIMENTAL	2013-01-11 11:39:33 -08:00
irq	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2013-07-13 15:37:30 -07:00
power	Merge branch 'akpm' (updates from Andrew Morton)	2013-07-03 17:12:13 -07:00
sched	sched: Implement smarter wake-affine logic	2013-07-23 12:18:41 +02:00
time	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
trace	The majority of the changes here are cleanups for the large changes that	2013-07-11 09:02:09 -07:00
.gitignore	kernel/hz.bc: ignore.	2013-04-22 07:09:06 -07:00
acct.c	fs: Fix hang with BSD accounting on frozen filesystem	2013-05-04 14:57:58 -04:00
async.c	async: rename and redefine async_func_ptr	2013-03-12 13:59:14 -07:00
audit_tree.c	kernel/audit_tree.c:audit_add_tree_rule(): protect `rule' from kill_rules()	2013-06-12 16:29:46 -07:00
audit_watch.c	audit: catch possible NULL audit buffers	2013-01-11 14:54:55 -08:00
audit.c	audit: wait_for_auditd() should use TASK_UNINTERRUPTIBLE	2013-06-12 16:29:45 -07:00
audit.h	audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record	2013-07-09 10:33:19 -07:00
auditfilter.c	audit: Fix decimal constant description	2013-07-09 10:33:19 -07:00
auditsc.c	audit: fix mq_open and mq_unlink to add the MQ root as a hidden parent audit_names record	2013-07-09 10:33:19 -07:00
backtracetest.c
bounds.c
capability.c	Add file_ns_capable() helper function for open-time capability checking	2013-04-14 10:06:31 -07:00
cgroup_freezer.c	cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free()	2012-11-19 08:13:38 -08:00
cgroup.c	cgroup: we can use simple_lookup() now	2013-07-14 17:50:23 +04:00
compat.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal	2013-05-01 07:21:43 -07:00
configs.c	proc: Supply PDE attribute setting accessor functions	2013-05-01 17:29:18 -04:00
context_tracking.c	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2013-06-20 08:18:35 -10:00
cpu_pm.c
cpu.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
cpuset.c	Merge branch 'for-3.11-cpuset' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup	2013-07-02 20:04:25 -07:00
crash_dump.c
cred.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2012-12-18 10:55:28 -08:00
delayacct.c	cputime: Use accessors to read task cputime stats	2013-01-27 19:23:31 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c	ptrace: revert "Prepare to fix racy accesses on task breakpoints"	2013-07-09 10:33:26 -07:00
extable.c	extable: Flip the sorting message	2013-04-15 13:25:16 +02:00
fork.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
freezer.c	freezer: skip waking up tasks with PF_FREEZER_SKIP set	2013-05-12 14:16:22 +02:00
futex_compat.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal	2013-02-23 18:50:11 -08:00
futex.c	futex: Use freezable blocking call	2013-06-25 23:11:19 +02:00
groups.c
hrtimer.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
hung_task.c
irq_work.c	Merge branch 'nohz/printk-v8' into irq/core	2013-02-05 00:48:46 +01:00
itimer.c
jump_label.c
kallsyms.c	kernel: kallsyms: memory override issue, need check destination buffer length	2013-04-15 15:17:26 +09:30
kcmp.c	kcmp: include linux/ptrace.h	2012-12-20 17:40:19 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks	locking: Fix copy/paste errors of "ARCH_INLINE_*_UNLOCK_BH"	2013-05-28 08:50:00 +02:00
Kconfig.preempt
kexec.c	kexec: Use min() and min_t() to simplify logic	2013-04-30 17:04:07 -07:00
kmod.c	usermodehelper: kill the sub_info->path[0] check	2013-07-03 16:08:02 -07:00
kprobes.c	kprobes/x86: Call out into INT3 handler directly instead of using notifier	2013-07-23 10:12:57 +02:00
ksysfs.c	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2012-12-11 18:10:49 -08:00
kthread.c	kthread: implement probe_kthread_data()	2013-04-30 17:04:02 -07:00
latencytop.c
lglock.c
lockdep_internals.h
lockdep_proc.c	lockdep: Use KSYM_NAME_LEN'ed buffer for __get_key_name()	2012-10-24 12:39:09 +02:00
lockdep_states.h
lockdep.c	lockdep: remove task argument from debug_check_no_locks_held	2013-05-12 14:16:21 +02:00
Makefile	reboot: move shutdown/reboot related functions to kernel/reboot.c	2013-07-09 10:33:29 -07:00
modsign_certificate.S	CONFIG_SYMBOL_PREFIX: cleanup.	2013-03-15 15:09:43 +10:30
modsign_pubkey.c	keys: use keyring_alloc() to create module signing keyring	2012-12-20 17:40:21 -08:00
module_signing.c	MODSIGN: Don't use enum-type bitfields in module signature info block	2012-12-05 11:27:24 +10:30
module-internal.h	MODSIGN: Move the magic string to the end of a module and eliminate the search	2012-10-19 17:30:40 -07:00
module.c	Nothing interesting. Except the most embarrassing bugfix ever. But let's	2013-07-10 14:51:41 -07:00
mutex-debug.c
mutex-debug.h
mutex.c	mutex: Move ww_mutex definitions to ww_mutex.h	2013-07-12 12:07:46 +02:00
mutex.h
notifier.c
nsproxy.c	proc: Split the namespace stuff out into linux/proc_ns.h	2013-05-01 17:29:39 -04:00
padata.c	padata: use __this_cpu_read per-cpu helper	2012-12-06 17:16:23 +08:00
panic.c	The majority of the changes here are cleanups for the large changes that	2013-07-11 09:02:09 -07:00
params.c	There is no /sys/parameters	2013-07-02 15:38:19 +09:30
pid_namespace.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-05-01 17:51:54 -07:00
pid.c	kernel/pid.c: move statement	2013-07-03 16:08:05 -07:00
posix-cpu-timers.c	posix_timers: fix racy timer delta caching on task exit	2013-07-03 16:54:42 +02:00
posix-timers.c	posix-timers: Remove unused variable	2013-04-18 12:51:19 +02:00
printk.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
profile.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
ptrace.c	ptrace: PTRACE_DETACH should do flush_ptrace_hw_breakpoint(child)	2013-07-09 10:33:26 -07:00
range.c	range: Do not add new blank slot with add_range_with_merge	2013-06-18 11:32:10 -05:00
rcu.h	rcu: Provide RCU CPU stall warnings for tiny RCU	2013-01-28 22:06:21 -08:00
rcupdate.c	Merge branches 'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', 'srcu.2013.06.10a' and 'tiny.2013.06.10a' into HEAD	2013-06-10 13:46:44 -07:00
rcutiny_plugin.h	rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs	2013-06-10 13:45:53 -07:00
rcutiny.c	rcu: Shrink TINY_RCU by reworking CPU-stall ifdefs	2013-06-10 13:45:53 -07:00
rcutorture.c	rcu: delete __cpuinit usage from all rcu files	2013-07-14 19:36:58 -04:00
rcutree_plugin.h	rcu: delete __cpuinit usage from all rcu files	2013-07-14 19:36:58 -04:00
rcutree_trace.c	rcutrace: single_open() leaks	2013-05-05 00:16:35 -04:00
rcutree.c	rcu: delete __cpuinit usage from all rcu files	2013-07-14 19:36:58 -04:00
rcutree.h	rcu: delete __cpuinit usage from all rcu files	2013-07-14 19:36:58 -04:00
reboot.c	reboot: move arch/x86 reboot= handling to generic kernel	2013-07-09 10:33:29 -07:00
relay.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
res_counter.c	res_counter: return amount of charges after res_counter_uncharge()	2012-12-18 15:02:12 -08:00
resource.c	kernel/resource.c: remove the unneeded assignment in function __find_resource	2013-07-03 16:08:06 -07:00
rtmutex_common.h
rtmutex-debug.c	sched/rt: Move rt specific bits into new header file	2013-02-07 20:51:08 +01:00
rtmutex-debug.h
rtmutex-tester.c	locking/rtmutex/tester: Set correct permissions on sysfs files	2013-04-10 14:48:37 +02:00
rtmutex.c	rtmutex: Document rt_mutex_adjust_prio_chain()	2013-05-28 09:23:52 +02:00
rtmutex.h
rwsem.c	Revert "rw_semaphore: remove up/down_read_non_owner"	2013-03-23 15:53:52 -07:00
seccomp.c	seccomp: allow BPF_XOR based ALU instructions.	2013-03-26 11:07:19 +11:00
semaphore.c	semaphore: use `bool' type for semaphore_waiter's up	2013-04-30 17:04:08 -07:00
signal.c	sigtimedwait: use freezable blocking call	2013-05-12 14:16:23 +02:00
smp.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
smpboot.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
smpboot.h
softirq.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
spinlock.c
srcu.c	srcu: use ACCESS_ONCE() to access sp->completed in srcu_read_lock()	2013-02-07 15:19:36 -08:00
stacktrace.c
stop_machine.c	stop_machine: Mark per cpu stopper enabled early	2013-02-26 22:25:17 +01:00
sys_ni.c	unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE	2013-05-09 13:46:38 -04:00
sys.c	reboot: move shutdown/reboot related functions to kernel/reboot.c	2013-07-09 10:33:29 -07:00
sysctl_binary.c	kernel: remove unnecessary head file	2013-06-26 18:01:46 +09:00
sysctl.c	Merge branch 'linus' into timers/urgent	2013-07-12 12:34:42 +02:00
task_work.c
taskstats.c	taskstats: cgroupstats_user_cmd() may leak on error	2012-10-06 03:05:31 +09:00
test_kprobes.c	kernel/: rename random32() to prandom_u32()	2013-04-29 18:28:42 -07:00
time.c	sched: Rename sched.c as sched/core.c in comments and Documentation	2013-06-19 12:58:42 +02:00
timeconst.bc	kernel: Replace timeconst.pl with a bc script	2013-02-16 23:17:25 +01:00
timer.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
tracepoint.c	Tracing updates for Linux 3.10	2013-04-29 13:55:38 -07:00
tsacct.c	cputime: Use accessors to read task cputime stats	2013-01-27 19:23:31 +01:00
uid16.c	make SYSCALL_DEFINE<n>-generated wrappers do asmlinkage_protect	2013-03-03 22:58:33 -05:00
up.c
user_namespace.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-05-01 17:51:54 -07:00
user-return-notifier.c	hlist: drop the node parameter from iterators	2013-02-27 19:10:24 -08:00
user.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2013-05-01 17:51:54 -07:00
utsname_sysctl.c	kernel/utsname_sysctl.c: put get/get_uts() into CONFIG_PROC_SYSCTL code block	2013-02-27 19:10:22 -08:00
utsname.c	proc: Split the namespace stuff out into linux/proc_ns.h	2013-05-01 17:29:39 -04:00
wait.c	Add wait_on_atomic_t() and wake_up_atomic_t()	2013-05-15 13:50:38 +01:00
watchdog.c	watchdog: Boot-disable by default on full dynticks	2013-06-20 15:46:32 +02:00
workqueue_internal.h	sched: Rename sched.c as sched/core.c in comments and Documentation	2013-06-19 12:58:42 +02:00
workqueue.c	kernel: delete __cpuinit usage from all core kernel files	2013-07-14 19:36:59 -04:00
