kernel_optimize_test/kernel
David Howells de09a9771a CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.

What happens is that get_task_cred() can race with commit_creds():

	TASK_1			TASK_2			RCU_CLEANER
	-->get_task_cred(TASK_2)
	rcu_read_lock()
	__cred = __task_cred(TASK_2)
				-->commit_creds()
				old_cred = TASK_2->real_cred
				TASK_2->real_cred = ...
				put_cred(old_cred)
				  call_rcu(old_cred)
		[__cred->usage == 0]
	get_cred(__cred)
		[__cred->usage == 1]
	rcu_read_unlock()
							-->put_cred_rcu()
							[__cred->usage == 1]
							panic()

However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.

If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.

We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.

Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:

kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[<ffffffff81069881>]  [<ffffffff81069881>] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8  EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS:  00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
 ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
<0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
<0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
 [<ffffffff810698cd>] put_cred+0x13/0x15
 [<ffffffff81069b45>] commit_creds+0x16b/0x175
 [<ffffffff8106aace>] set_current_groups+0x47/0x4e
 [<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
 [<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP  [<ffffffff81069881>] __put_cred+0xc/0x45
 RSP <ffff88019e7e9eb8>
---[ end trace df391256a100ebdd ]---

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-07-29 15:16:17 -07:00
..
debug sysrq,kdb: Use __handle_sysrq() for kdb's sysrq function 2010-07-21 19:27:07 -05:00
gcov
irq genirq: Deal with desc->set_type() changing desc->chip 2010-06-09 17:05:08 +02:00
power suspend: Move NVS save/restore code to generic suspend functionality 2010-06-10 11:02:34 -04:00
time Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-07-02 09:52:58 -07:00
trace perf/tracing: Fix regression of perf losing kprobe events 2010-06-10 20:56:54 -04:00
.gitignore
acct.c
async.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c
cgroup.c cgroups: alloc_css_id() increments hierarchy depth 2010-06-04 15:21:45 -07:00
compat.c
configs.c
cpu.c fix cpu_chain section mismatch... 2010-06-01 09:22:50 -07:00
cpuset.c cpusets: new round-robin rotor for SLAB allocations 2010-05-27 09:12:44 -07:00
cred.c CRED: Fix get_task_cred() and task_state() to not resurrect dead credentials 2010-07-29 15:16:17 -07:00
delayacct.c
dma.c
early_res.c kmemleak: Add support for NO_BOOTMEM configurations 2010-07-19 11:54:15 +01:00
elfcore.c
exec_domain.c sys_personality: change sys_personality() to accept "unsigned int" instead of u_long 2010-06-04 15:21:45 -07:00
exit.c proc: turn signal_struct->count into "int nr_threads" 2010-05-27 09:12:47 -07:00
extable.c
fork.c Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()" 2010-05-30 09:00:03 -07:00
freezer.c
futex_compat.c
futex.c futex: futex_find_get_task remove credentails check 2010-06-30 15:43:44 -07:00
groups.c
hrtimer.c hrtimer: Avoid double seqlock 2010-05-26 16:15:37 +02:00
hung_task.c
hw_breakpoint.c
itimer.c
kallsyms.c kdb: core for kgdb back end (2 of 2) 2010-05-20 21:04:21 -05:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c kexec: fix Oops in crash_shrink_memory() 2010-06-29 15:29:31 -07:00
kfifo.c
kmod.c call_usermodehelper: UMH_WAIT_EXEC ignores kernel_thread() failure 2010-05-27 09:12:45 -07:00
kprobes.c
ksysfs.c sysfs: add struct file* to bin_attr callbacks 2010-05-21 09:37:31 -07:00
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c lockdep: Add novalidate class for dev->mutex conversion 2010-05-21 09:37:30 -07:00
Makefile Move kernel/kgdb.c to kernel/debug/debug_core.c 2010-05-20 21:04:18 -05:00
module.c dynamic debug: move ddebug_remove_module() down into free_module() 2010-07-27 14:32:06 -07:00
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
padata.c kernel/: convert cpu notifier to return encapsulate errno value 2010-05-27 09:12:48 -07:00
panic.c panic: call console_verbose() in panic 2010-05-27 09:12:53 -07:00
params.c
perf_event.c perf: Fix signed comparison in perf_adjust_period() 2010-06-08 18:43:00 +02:00
pid_namespace.c
pid.c pids: increase pid_max based on num_possible_cpus 2010-05-27 09:12:51 -07:00
pm_qos_params.c
posix-cpu-timers.c posix-cpu-timers: avoid "task->signal != NULL" checks 2010-05-27 09:12:46 -07:00
posix-timers.c posix_timer: Fix error path in timer_create 2010-05-27 22:38:15 +02:00
printk.c printk,kdb: capture printk() when in kdb shell 2010-05-20 21:04:27 -05:00
profile.c numa: in-kernel profiling: use cpu_to_mem() for per cpu allocations 2010-05-27 09:12:57 -07:00
ptrace.c ptrace: PTRACE_GETFDPIC: fix the unsafe usage of child->mm 2010-05-27 09:12:44 -07:00
range.c
rcupdate.c
rcutiny_plugin.h
rcutiny.c
rcutorture.c
rcutree_plugin.h
rcutree_trace.c
rcutree.c
rcutree.h
relay.c kernel/: convert cpu notifier to return encapsulate errno value 2010-05-27 09:12:48 -07:00
res_counter.c
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c proc_sched_show_task(): use get_nr_threads() 2010-05-27 09:12:47 -07:00
sched_fair.c rcu: apply RCU protection to wake_affine() 2010-06-23 06:50:44 -07:00
sched_features.h
sched_idletask.c
sched_rt.c
sched_stats.h
sched.c Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-07-02 09:52:58 -07:00
seccomp.c
semaphore.c
signal.c exit: change zap_other_threads() to count sub-threads 2010-05-27 09:12:46 -07:00
slow-work-debugfs.c
slow-work.c
slow-work.h
smp.c kernel/: convert cpu notifier to return encapsulate errno value 2010-05-27 09:12:48 -07:00
softirq.c kernel/: fix BUG_ON checks for cpu notifier callbacks direct call 2010-06-04 15:21:45 -07:00
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c sched: Make sure timers have migrated before killing the migration_thread 2010-05-31 08:37:44 +02:00
sys_ni.c
sys.c kmod: add init function to usermodehelper 2010-05-27 09:12:44 -07:00
sysctl_binary.c sysctl: don't use own implementation of hex_to_bin() 2010-05-25 08:07:05 -07:00
sysctl_check.c
sysctl.c pipe: change /proc/sys/fs/pipe-max-pages to byte sized interface 2010-06-03 14:54:39 +02:00
taskstats.c
test_kprobes.c
time.c timekeeping: Fix timezone update 2010-05-24 11:50:38 +02:00
timeconst.pl
timer.c kernel/: fix BUG_ON checks for cpu notifier callbacks direct call 2010-06-04 15:21:45 -07:00
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c kref: remove kref_set 2010-05-21 09:37:29 -07:00
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
wait.c
workqueue.c kernel/: convert cpu notifier to return encapsulate errno value 2010-05-27 09:12:48 -07:00