kernel_optimize_test

Al Viro 8aef188452 VFS: Fix vfsmount overput on simultaneous automount

[Kudos to dhowells for tracking that crap down]

If two processes attempt to cause automounting on the same mountpoint at the same time, the vfsmount holding the mountpoint will be left with one too few references on it, causing a BUG when the kernel tries to clean up.

The problem is that lock_mount() drops the caller's reference to the mountpoint's vfsmount in the case where it finds something already mounted on the mountpoint as it transits to the mounted filesystem and replaces path->mnt with the new mountpoint vfsmount.

During a pathwalk, however, we don't take a reference on the vfsmount if it is the same as the one in the nameidata struct, but do_add_mount() doesn't know this.

The fix is to make sure we have a ref on the vfsmount of the mountpoint before calling do_add_mount(). However, if lock_mount() doesn't transit, we're then left with an extra ref on the mountpoint vfsmount which needs releasing. We can handle that in follow_managed() by not making assumptions about what we can and what we cannot get from lookup_mnt() as the current code does.

The callers of follow_managed() expect that reference to path->mnt will be grabbed iff path->mnt has been changed. follow_managed() and follow_automount() keep track of whether such reference has been grabbed and assume that it'll happen in those and only those cases that'll have us return with changed path->mnt. That assumption is almost correct - it breaks in case of racing automounts and in even harder to hit race between following a mountpoint and a couple of mount --move. The thing is, we don't need to make that assumption at all - after the end of loop in follow_manage() we can check if path->mnt has ended up unchanged and do mntput() if needed.

The BUG can be reproduced with the following test program:

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <unistd.h>
	#include <sys/wait.h>
	int main(int argc, char **argv)
	{
		int pid, ws;
		struct stat buf;
		pid = fork();
		stat(argv[1], &buf);
		if (pid > 0)
			wait(&ws);
		return 0;
	}

and the following procedure:

 (1) Mount an NFS volume that on the server has something else mounted on a subdirectory. For instance, I can mount / from my server:

	mount warthog:/ /mnt -t nfs4 -r

     On the server /data has another filesystem mounted on it, so NFS will see a change in FSID as it walks down the path, and will mark /mnt/data as being a mountpoint. This will cause the automount code to be triggered.

     !!! Do not look inside the mounted fs at this point !!!

 (2) Run the above program on a file within the submount to generate two simultaneous automount requests:

	/tmp/forkstat /mnt/data/testfile

 (3) Unmount the automounted submount:

	umount /mnt/data

 (4) Unmount the original mount:

	umount /mnt

At this point the kernel should throw a BUG with something like the following:

	BUG: Dentry ffff880032e3c5c0{i=2,n=} still in use (1) [unmount of nfs4 0:12]

Note that the bug appears on the root dentry of the original mount, not the mountpoint and not the submount because sys_umount() hasn't got to its final mntput_no_expire() yet, but this isn't so obvious from the call trace:

	[<ffffffff8117cd82>] shrink_dcache_for_umount+0x69/0x82
	[<ffffffff8116160e>] generic_shutdown_super+0x37/0x15b
	[<ffffffffa00fae56>] ? nfs_super_return_all_delegations+0x2e/0x1b1 [nfs]
	[<ffffffff811617f3>] kill_anon_super+0x1d/0x7e
	[<ffffffffa00d0be1>] nfs4_kill_super+0x60/0xb6 [nfs]
	[<ffffffff81161c17>] deactivate_locked_super+0x34/0x83
	[<ffffffff811629ff>] deactivate_super+0x6f/0x7b
	[<ffffffff81186261>] mntput_no_expire+0x18d/0x199
	[<ffffffff811862a8>] mntput+0x3b/0x44
	[<ffffffff81186d87>] release_mounts+0xa2/0xbf
	[<ffffffff811876af>] sys_umount+0x47a/0x4ba
	[<ffffffff8109e1ca>] ? trace_hardirqs_on_caller+0x1fd/0x22f
	[<ffffffff816ea86b>] system_call_fastpath+0x16/0x1b

as do_umount() is inlined. However, you can see release_mounts() in there.

Note also that it may be necessary to have multiple CPU cores to be able to trigger this bug.

Tested-by: Jeff Layton <jlayton@redhat.com>
Tested-by: Ian Kent <raven@themaw.net>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

2011-06-16 11:28:16 -04:00
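
The reference-count rule described in the commit message above (callers get a new reference on path->mnt iff path->mnt changed, and any extra reference taken along the way is dropped if the mount ended up unchanged) can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not the kernel code or the actual patch: struct toy_mount, struct toy_path, toy_mntget(), toy_mntput(), toy_follow() and toy_selfcheck() are hypothetical names invented here, the counters are plain ints with no locking, and the "transit" step simply assumes that looking up the mounted filesystem hands back an already-referenced mount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct vfsmount: nothing but a reference count. */
struct toy_mount { int refs; };
/* Toy stand-in for struct path: only the mount pointer matters here. */
struct toy_path { struct toy_mount *mnt; };

static inline void toy_mntget(struct toy_mount *m) { m->refs++; }
static inline void toy_mntput(struct toy_mount *m) { m->refs--; }

/*
 * Hypothetical model of the bookkeeping the commit message describes.
 * pin_first: take a reference on the starting mount up front (models
 *            holding a ref before the mount-adding step runs).
 * mounted:   mount to transit to, or NULL if the walk stays where it is.
 */
void toy_follow(struct toy_path *path, struct toy_mount *mounted, bool pin_first)
{
	struct toy_mount *start = path->mnt; /* borrowed from the caller */
	bool need_mntput = false;

	if (pin_first) {
		toy_mntget(path->mnt);
		need_mntput = true;
	}
	if (mounted) {
		toy_mntget(mounted);            /* assumed: lookup returns a referenced mount */
		if (need_mntput)
			toy_mntput(path->mnt);  /* drop our ref on the mount we leave */
		path->mnt = mounted;
		need_mntput = true;
	}
	/* End-of-loop check: if we still point at the starting mount, undo our ref. */
	if (need_mntput && path->mnt == start)
		toy_mntput(path->mnt);
}

/* Exercises the "no transit" and "transit" cases with a pinned start mount. */
void toy_selfcheck(void)
{
	struct toy_mount a = { 1 }, b = { 1 };
	struct toy_path p = { &a };

	toy_follow(&p, NULL, true);   /* pinned, no transit: extra ref is released */
	assert(p.mnt == &a && a.refs == 1);

	toy_follow(&p, &b, true);     /* pinned, transit: only the new mount gains a ref */
	assert(p.mnt == &b && a.refs == 1 && b.refs == 2);
}
```

In both cases the starting mount's count comes back to where it began, which is the property the final check is meant to guarantee without having to predict in advance whether a transit will happen.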
9p	9p: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:53 -04:00
adfs	Fix common misspellings	2011-03-31 11:26:23 -03:00
affs	affs: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:53 -04:00
afs	afs: fix sget() races, close leak on umount	2011-06-12 17:45:36 -04:00
autofs4	autofs4: bogus dentry_unhash() added in ->unlink()	2011-05-30 01:50:53 -04:00
befs	Fix common misspellings	2011-03-31 11:26:23 -03:00
bfs	bfs: remove unnecessary dentry_unhash on dir rename	2011-05-28 01:02:50 -04:00
btrfs	Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6	2011-06-07 18:36:59 -07:00
cachefiles	Fix common misspellings	2011-03-31 11:26:23 -03:00
ceph	ceph: remove unnecessary dentry_unhash calls	2011-05-26 07:26:53 -04:00
cifs	cifs: trivial: add space in fsc error message	2011-06-08 16:03:29 +00:00
coda	coda: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:53 -04:00
configfs	configfs: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:54 -04:00
cramfs	cramfs: generate unique inode number for better inode cache usage	2011-01-13 08:03:23 -08:00
debugfs	debugfs: move to new strtobool	2011-05-19 16:55:28 +09:30
devpts	fs/devpts/inode.c: correctly check d_alloc_name() return code in devpts_pty_new()	2011-03-22 17:44:17 -07:00
dlm	Merge branch 'trivial' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6	2011-05-26 13:19:00 -07:00
ecryptfs	eCryptfs: Remove ecryptfs_header_cache_2	2011-05-29 14:24:25 -05:00
efs	block: remove per-queue plugging	2011-03-10 08:52:07 +01:00
exofs	exofs: remove unnecessary dentry_unhash on rmdir/rename_dir	2011-05-26 07:26:57 -04:00
exportfs	vfs: Add open by file handle support	2011-03-15 02:21:44 -04:00
ext2	ext2: remove unnecessary dentry_unhash on rmdir/rename_dir	2011-05-26 07:26:56 -04:00
ext3	fs: pass exact type of data dirties to ->dirty_inode	2011-05-27 07:04:40 -04:00
ext4	fs: pass exact type of data dirties to ->dirty_inode	2011-05-27 07:04:40 -04:00
fat	fat: Fix corrupt inode flags when remove ATTR_SYS flag	2011-05-31 19:42:24 +09:00
freevxfs	treewide: fix a few typos in comments	2011-05-10 10:16:21 +02:00
fscache	fscache: remove dead code under CONFIG_WORKQUEUE_DEBUGFS	2011-05-25 08:39:44 -07:00
fuse	more conservative S_NOSEC handling	2011-06-03 18:24:58 -04:00
gfs2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes	2011-06-07 18:44:10 -07:00
hfs	hfs: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:52 -04:00
hfsplus	hfsplus: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:52 -04:00
hostfs	hostfs: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:52 -04:00
hpfs	hpfs: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:54 -04:00
hppfs	fs: icache RCU free inodes	2011-01-07 17:50:26 +11:00
hugetlbfs	mm: don't access vm_flags as 'int'	2011-05-26 09:20:31 -07:00
isofs	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block	2011-03-24 10:16:26 -07:00
jbd	jbd: Fix comment to match the code in journal_start()	2011-05-24 00:27:53 +02:00
jbd2	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4	2011-05-26 09:53:20 -07:00
jffs2	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6	2011-05-28 13:03:41 -07:00
jfs	lmLogOpen() broken failure exit	2011-06-07 08:50:59 -04:00
lockd	NLM: Fix "kernel BUG at fs/lockd/host.c:417!" or ".../host.c:283!"	2011-01-25 15:24:47 -05:00
logfs	logfs: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:51 -04:00
minix	minix: remove unnecessary dentry_unhash on rmdir, dir rename	2011-05-28 01:02:54 -04:00
ncpfs	ncpfs: fix rename over directory with dangling references	2011-05-28 01:02:53 -04:00
nfs	Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd	2011-05-29 14:10:13 -07:00
nfs_common	Fix common misspellings	2011-03-31 11:26:23 -03:00
nfsd	Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux	2011-05-29 11:21:12 -07:00
nilfs2	nilfs2: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:51 -04:00
nls
notify	Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6	2011-04-07 11:14:49 -07:00
ntfs	Fix common misspellings	2011-03-31 11:26:23 -03:00
ocfs2	more conservative S_NOSEC handling	2011-06-03 18:24:58 -04:00
omfs	omfs: remove unnecessary dentry_unhash on rmdir, dir rneame	2011-05-28 01:02:52 -04:00
openpromfs	fs: icache RCU free inodes	2011-01-07 17:50:26 +11:00
partitions	Revert "block: Remove extra discard_alignment from hd_struct."	2011-05-30 07:42:51 +02:00
proc	fix leak in proc_set_super()	2011-06-12 17:45:28 -04:00
pstore	pstore: fix pstore filesystem mount/remount issue	2011-05-16 11:05:00 -07:00
qnx4	block: remove per-queue plugging	2011-03-10 08:52:07 +01:00
quota	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
ramfs	ramfs: fix memleak on no-mmu arch	2011-04-14 16:06:56 -07:00
reiserfs	reiserfs: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:51 -04:00
romfs	fs: icache RCU free inodes	2011-01-07 17:50:26 +11:00
squashfs	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-linus	2011-05-29 11:19:45 -07:00
sysfs	Delay struct net freeing while there's a sysfs instance refering to it	2011-06-12 17:45:41 -04:00
sysv	sysv: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:50 -04:00
ubifs	ubifs: fix sget races	2011-06-12 17:45:34 -04:00
udf	udf: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:52 -04:00
ufs	ufs: remove unnecessary dentry_unhash from rmdir, dir rename	2011-05-28 01:02:51 -04:00
xfs	fs: pass exact type of data dirties to ->dirty_inode	2011-05-27 07:04:40 -04:00
aio.c	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block	2011-03-24 10:16:26 -07:00
anon_inodes.c	sanitize vfsmount refcounting changes	2011-01-16 13:47:07 -05:00
attr.c	Cache xattr security drop check for write v2	2011-05-28 12:02:09 -04:00
bad_inode.c	fs: provide rcu-walk aware permission i_ops	2011-01-07 17:50:29 +11:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c	brk: COMPAT_BRK: fix detection of randomized brk	2011-04-14 16:06:55 -07:00
binfmt_em86.c
binfmt_flat.c	CRED: Fix load_flat_shared_library() to initialise bprm correctly	2011-05-03 10:10:51 +10:00
binfmt_misc.c
binfmt_script.c
binfmt_som.c
bio-integrity.c	block: Require subsystems to explicitly allocate bio_set integrity mempool	2011-03-17 11:11:05 +01:00
bio.c	block: improve the bio_add_page() and bio_add_pc_page() descriptions	2011-05-28 14:44:46 +02:00
block_dev.c	block: blkdev_get() should access ->bd_disk only after success	2011-06-01 08:28:47 +02:00
buffer.c	fs: block_page_mkwrite should wait for writeback to finish	2011-05-28 01:03:21 -04:00
char_dev.c	Merge branch 'for-2.6.38/core' of git://git.kernel.dk/linux-2.6-block	2011-01-13 10:45:01 -08:00
compat_binfmt_elf.c
compat_ioctl.c	Merge branch 'tty-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6	2011-01-07 14:39:20 -08:00
compat.c	exec: unify do_execve/compat_do_execve code	2011-04-09 15:53:56 +02:00
dcache.c	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
dcookies.c
direct-io.c	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block	2011-03-24 10:16:26 -07:00
drop_caches.c	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
eventfd.c	Docbook: add fs/eventfd.c and fix typos in it	2011-02-21 15:07:04 -08:00
eventpoll.c	Fix common misspellings	2011-03-31 11:26:23 -03:00
exec.c	exec: delay address limit change until point of no return	2011-06-09 12:50:05 -07:00
fcntl.c	userns: rename is_owner_or_cap to inode_owner_or_capable	2011-03-23 19:47:13 -07:00
fhandle.c	fs/fhandle.c: add <linux/personality.h> for ia64	2011-04-14 16:06:56 -07:00
fifo.c	Filesystem: fifo: Fixed coding style issue.	2011-03-21 00:16:09 -04:00
file_table.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6	2011-03-16 13:26:17 -07:00
file.c	vfs: avoid large kmalloc()s for the fdtable	2011-04-28 11:28:20 -07:00
filesystems.c	fs: synchronize_rcu when unregister_filesystem success not failure	2011-04-17 10:42:01 -07:00
fs_struct.c	sanitize vfsmount refcounting changes	2011-01-16 13:47:07 -05:00
fs-writeback.c	fs: pass exact type of data dirties to ->dirty_inode	2011-05-27 07:04:40 -04:00
generic_acl.c	userns: rename is_owner_or_cap to inode_owner_or_capable	2011-03-23 19:47:13 -07:00
inode.c	fs: cosmetic inode.c cleanups	2011-05-27 09:43:00 -04:00
internal.h	fs: move i_wb_list out from under inode_lock	2011-03-24 21:17:51 -04:00
ioctl.c	vfs: cleanup do_vfs_ioctl()	2011-03-21 00:16:08 -04:00
ioprio.c
Kconfig	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6	2011-05-26 09:52:14 -07:00
Kconfig.binfmt
libfs.c	libfs: drop unneeded dentry_unhash	2011-05-26 07:26:50 -04:00
locks.c	Merge branch 'for-2.6.39' of git://linux-nfs.org/~bfields/linux	2011-03-24 08:20:39 -07:00
Makefile	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux-2.6	2011-03-16 19:01:29 -07:00
mbcache.c	vmscan: change shrinker API by passing shrink_control struct	2011-05-25 08:39:26 -07:00
mpage.c	mm/fs: add hooks to support cleancache	2011-05-26 10:01:43 -06:00
namei.c	VFS: Fix vfsmount overput on simultaneous automount	2011-06-16 11:28:16 -04:00
namespace.c	fs/namespace.c: bound mount propagation fix	2011-05-26 07:26:44 -04:00
nfsctl.c	open-style analog of vfs_path_lookup()	2011-03-14 09:15:28 -04:00
no-block.c
open.c	fs: Use BUG_ON(!mnt) at dentry_open().	2011-03-21 01:10:41 -04:00
pipe.c	Fix broken "pipe: use event aware wakeups" optimization	2011-01-20 16:21:59 -08:00
pnode.c	fs: scale mntget/mntput	2011-01-07 17:50:33 +11:00
pnode.h
posix_acl.c	NFS: Prevent memory allocation failure in nfsacl_encode()	2011-01-25 15:24:47 -05:00
read_write.c	fix signedness mess in rw_verify_area() on 64bit architectures	2011-01-12 20:06:58 -05:00
read_write.h
readdir.c
select.c	select: remove unused MAX_SELECT_SECONDS	2011-03-21 00:16:08 -04:00
seq_file.c
signalfd.c
splice.c	splice: add wakeup_pipe_readers()	2011-05-23 19:58:53 +02:00
stack.c
stat.c	readlinkat(), fchownat() and fstatat() with empty relative pathnames	2011-03-15 02:21:45 -04:00
statfs.c	clean statfs-like syscalls up	2011-03-14 09:15:28 -04:00
super.c	more conservative S_NOSEC handling	2011-06-03 18:24:58 -04:00
sync.c	Merge branch 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block	2011-03-24 10:16:26 -07:00
timerfd.c	timerfd: Manage cancelable timers in timerfd	2011-05-23 13:59:53 +02:00
utimes.c	userns: rename is_owner_or_cap to inode_owner_or_capable	2011-03-23 19:47:13 -07:00
xattr_acl.c
xattr.c	Cache xattr security drop check for write v2	2011-05-28 12:02:09 -04:00