tmp_suning_uos_patched/kernel
Al Viro 8f7b0ba1c8 Fix inotify watch removal/umount races
Inotify watch removals suck violently.

To kick the watch out we need (in this order) inode->inotify_mutex and
ih->mutex.  That's fine if we have a hold on inode; however, for all
other cases we need to make damn sure we don't race with umount.  We can
*NOT* just grab a reference to a watch - inotify_unmount_inodes() will
happily sail past it and we'll end with reference to inode potentially
outliving its superblock.

Ideally we just want to grab an active reference to superblock if we
can; that will make sure we won't go into inotify_umount_inodes() until
we are done.  Cleanup is just deactivate_super().

However, that leaves a messy case - what if we *are* racing with
umount() and active references to superblock can't be acquired anymore?
We can bump ->s_count, grab ->s_umount, which will almost certainly wait
until the superblock is shut down and the watch in question is pining
for fjords.  That's fine, but there is a problem - we might have hit the
window between ->s_active getting to 0 / ->s_count - below S_BIAS (i.e.
the moment when superblock is past the point of no return and is heading
for shutdown) and the moment when deactivate_super() acquires
->s_umount.

We could just do drop_super() yield() and retry, but that's rather
antisocial and this stuff is luser-triggerable.  OTOH, having grabbed
->s_umount and having found that we'd got there first (i.e.  that
->s_root is non-NULL) we know that we won't race with
inotify_umount_inodes().

So we could grab a reference to watch and do the rest as above, just
with drop_super() instead of deactivate_super(), right? Wrong.  We had
to drop ih->mutex before we could grab ->s_umount.  So the watch
could've been gone already.

That still can be dealt with - we need to save watch->wd, do idr_find()
and compare its result with our pointer.  If they match, we either have
the damn thing still alive or we'd lost not one but two races at once,
the watch had been killed and a new one got created with the same ->wd
at the same address.  That couldn't have happened in inotify_destroy(),
but inotify_rm_wd() could run into that.  Still, "new one got created"
is not a problem - we have every right to kill it or leave it alone,
whatever's more convenient.

So we can use idr_find(...) == watch && watch->inode->i_sb == sb as
"grab it and kill it" check.  If it's been our original watch, we are
fine, if it's a newcomer - nevermind, just pretend that we'd won the
race and kill the fscker anyway; we are safe since we know that its
superblock won't be going away.

And yes, this is far beyond mere "not very pretty"; so's the entire
concept of inotify to start with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Greg KH <greg@kroah.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-11-15 12:26:44 -08:00
..
irq
power PM_TEST_SUSPEND should depend on RTC_CLASS, not RTC_LIB 2008-11-01 12:40:38 -07:00
time nohz: disable tick_nohz_kick_tick() for now 2008-11-10 22:39:27 +01:00
trace Merge branch 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/urgent 2008-11-11 09:16:20 +01:00
.gitignore
acct.c
audit_tree.c Fix inotify watch removal/umount races 2008-11-15 12:26:44 -08:00
audit.c
audit.h
auditfilter.c Fix inotify watch removal/umount races 2008-11-15 12:26:44 -08:00
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_debug.c
cgroup_freezer.c freezer_cg: disable writing freezer.state of root cgroup 2008-11-12 17:17:16 -08:00
cgroup.c cgroups: fix invalid cgrp->dentry before cgroup has been completely removed 2008-11-06 15:41:19 -08:00
compat.c
configs.c
cpu.c cpumask: introduce new API, without changing anything 2008-11-06 09:05:33 +01:00
cpuset.c
delayacct.c
dma-coherent.c
dma.c
exec_domain.c
exit.c Move "exit_robust_list" into mm_release() 2008-11-15 10:20:36 -08:00
extable.c
fork.c Move "exit_robust_list" into mm_release() 2008-11-15 10:20:36 -08:00
freezer.c freezer_cg: use thaw_process() in unfreeze_cgroup() 2008-10-30 11:38:45 -07:00
futex_compat.c
futex.c
hrtimer.c hrtimer: clean up unused callback modes 2008-11-12 09:54:40 +01:00
itimer.c
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.preempt
kexec.c
kfifo.c
kgdb.c
kmod.c
kprobes.c kernel/kprobes.c: don't pad kretprobe_table_locks[] on uniprocessor builds 2008-11-12 17:17:17 -08:00
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep.c lockdep: fix irqs on/off ip tracing 2008-10-28 11:19:07 +01:00
Makefile
marker.c
module.c
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
panic.c
params.c Fix compile warning in kernel/params.c 2008-10-23 12:09:00 -07:00
pid_namespace.c
pid.c
pm_qos_params.c
posix-cpu-timers.c
posix-timers.c
printk.c printk: remove unused code from kernel/printk.c 2008-10-23 21:54:29 +02:00
profile.c kernel/profile: fix profile_init() section mismatch 2008-10-30 11:38:46 -07:00
ptrace.c
rcuclassic.c
rcupdate.c
rcupreempt_trace.c
rcupreempt.c
rcutorture.c
relay.c
res_counter.c
resource.c reserve_region_with_split: Fix GFP_KERNEL usage under spinlock 2008-11-01 09:53:58 -07:00
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c sched: clean up debug info 2008-11-10 10:51:51 +01:00
sched_fair.c sched: release buddies on yield 2008-11-11 11:57:22 +01:00
sched_features.h sched: backward looking buddy 2008-11-05 10:30:14 +01:00
sched_idletask.c
sched_rt.c Merge commit 'v2.6.28-rc1' into sched/urgent 2008-10-24 12:48:46 +02:00
sched_stats.h
sched.c sched: fix init_idle()'s use of sched_clock() 2008-11-12 20:05:50 +01:00
seccomp.c
semaphore.c
signal.c 'kill sig -1' must only apply to caller's namespace 2008-10-30 11:38:46 -07:00
smp.c generic-ipi: fix the smp_mb() placement 2008-11-06 08:41:56 +01:00
softirq.c irq: call __irq_enter() before calling the tick_idle_check 2008-11-10 22:36:39 +01:00
softlockup.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c Revert "Call init_workqueues before pre smp initcalls." 2008-10-25 19:53:38 -07:00
sys_ni.c
sys.c
sysctl_check.c
sysctl.c Merge commit 'v2.6.28-rc2' into tracing/urgent 2008-10-27 10:50:54 +01:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c Add round_jiffies_up and related routines 2008-11-06 08:42:48 +01:00
tracepoint.c tracepoint: check if the probe has been registered 2008-10-27 16:45:46 +01:00
tsacct.c
uid16.c
user_namespace.c
user.c
utsname_sysctl.c
utsname.c
wait.c
workqueue.c cpumask: introduce new API, without changing anything 2008-11-06 09:05:33 +01:00