kernel_optimize_test/fs/ocfs2
Tao Ma 6b791bcc8b ocfs2: Adjust rightmost path in ocfs2_add_branch.
In ocfs2_add_branch, we use the rightmost rec of the leaf extent block
to generate the e_cpos for the newly added branch. In most cases this
is fine, but if the parent extent block's rightmost rec covers more clusters
than the leaf does, inserting clusters into that gap causes a kernel panic.
The message looks something like:
(7445,1):ocfs2_insert_at_leaf:3775 ERROR: bug expression:
le16_to_cpu(el->l_next_free_rec) >= le16_to_cpu(el->l_count)
(7445,1):ocfs2_insert_at_leaf:3775 ERROR: inode 66053, depth 0, count 28,
next free 28, rec.cpos 270, rec.clusters 1, insert.cpos 275, insert.clusters 1
 [<fa7ad565>] ? ocfs2_do_insert_extent+0xb58/0xda0 [ocfs2]
 [<fa7b08f2>] ? ocfs2_insert_extent+0x5bd/0x6ba [ocfs2]
 [<fa7b1b8b>] ? ocfs2_add_clusters_in_btree+0x37f/0x564 [ocfs2]
...

The panic can be easily reproduced with the following small test case
(with bs=512, cs=4K; all error handling is removed to keep it easy to
read).

#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	int fd, i;
	char buf[5] = "test";

	/* O_CREAT requires a mode argument. */
	fd = open(argv[1], O_RDWR|O_CREAT, 0644);

	for (i = 0; i < 30; i++) {
		lseek(fd, 40960 * i, SEEK_SET);
		write(fd, buf, 5);
	}

	ftruncate(fd, 1146880);

	lseek(fd, 1126400, SEEK_SET);
	write(fd, buf, 5);

	close(fd);

	return 0;
}

The reason for the panic is as follows.
The 30 writes and the ftruncate make the file's extent list look like:

	Tree Depth: 1   Count: 19   Next Free Rec: 1
	## Offset        Clusters       Block#
	0  0             280            86183
	SubAlloc Bit: 7   SubAlloc Slot: 0
	Blknum: 86183   Next Leaf: 0
	CRC32: 00000000   ECC: 0000
	Tree Depth: 0   Count: 28   Next Free Rec: 28
	## Offset        Clusters       Block#          Flags
	0  0             1              143368          0x0
	1  10            1              143376          0x0
	...
	26 260           1              143576          0x0
	27 270           1              143584          0x0
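
The cluster numbers in this dump follow from the byte offsets in the test
case and the 4K cluster size. A minimal sketch of that arithmetic, where the
helper names are hypothetical and not part of ocfs2:

```c
#include <assert.h>
#include <stdint.h>

/* Cluster size used in the reproduction above (cs=4K). */
#define CLUSTER_SIZE 4096u

/* Hypothetical helper: index of the cluster that contains a byte offset. */
static uint32_t byte_to_cluster(uint64_t offset)
{
	return (uint32_t)(offset / CLUSTER_SIZE);
}

/* Hypothetical helper: number of clusters needed to hold `size` bytes. */
static uint32_t clusters_for_size(uint64_t size)
{
	return (uint32_t)((size + CLUSTER_SIZE - 1) / CLUSTER_SIZE);
}

/* Checks the figures quoted in the commit message. */
static void check_reproducer_arithmetic(void)
{
	/* The write with i = 27 lands in cluster 270, the last leaf record. */
	assert(byte_to_cluster(40960ull * 27) == 270);
	/* ftruncate(fd, 1146880) leaves the file 280 clusters long. */
	assert(clusters_for_size(1146880) == 280);
	/* The final write at 1126400 falls in cluster 275. */
	assert(byte_to_cluster(1126400) == 275);
}
```

The writes with i = 28 and i = 29 start at clusters 280 and 290, so the
ftruncate to 280 clusters drops them and leaves exactly 28 leaf records.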

Now another write at 1126400 (cluster 275), which lands in the gap
between 271 and 280, will trigger ocfs2_add_branch, but the result after
the function looks like:
	Tree Depth: 1   Count: 19   Next Free Rec: 2
	## Offset        Clusters       Block#
	0  0             280            86183
	1  271           0             143592
So the two extent records overlap, and the subsequent insert hits the bug
expression shown above.
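
The overlap can be checked by comparing where one record ends with where the
next one starts. The sketch below uses a simplified stand-in struct and a
hypothetical helper; it is not the on-disk struct ocfs2_extent_rec or any
function from alloc.c:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for an interior extent record (an assumption). */
struct sketch_rec {
	uint32_t cpos;     /* first cluster offset covered by the record */
	uint32_t clusters; /* number of clusters covered by the record */
};

/* Hypothetical helper: true when `next` starts before `prev` ends. */
static bool recs_overlap(const struct sketch_rec *prev,
			 const struct sketch_rec *next)
{
	return next->cpos < prev->cpos + prev->clusters;
}

/* Checks the root records from the dump above. */
static void check_root_overlap(void)
{
	struct sketch_rec rec0 = { 0, 280 };  /* covers clusters [0, 280) */
	struct sketch_rec rec1 = { 271, 0 };  /* new branch starts at 271 */

	assert(recs_overlap(&rec0, &rec1));
}
```

Record 0 ends at cluster 280 while record 1 starts at 271, so clusters 271
through 279 are claimed by both.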

This patch removes the gap before adding the new branch, so that the
rightmost rec of the root (branch) ends at the same position as the rightmost
rec of the leaf. In the above case, before the branch is added the tree is
changed to:
	Tree Depth: 1   Count: 19   Next Free Rec: 1
	## Offset        Clusters       Block#
	0  0             271            86183
	SubAlloc Bit: 7   SubAlloc Slot: 0
	Blknum: 86183   Next Leaf: 0
	CRC32: 00000000   ECC: 0000
	Tree Depth: 0   Count: 28   Next Free Rec: 28
	## Offset        Clusters       Block#          Flags
	0  0             1              143368          0x0
	1  10            1              143376          0x0
	...
	26 260           1              143576          0x0
	27 270           1              143584          0x0
And after the branch is added, the tree looks like:
	Tree Depth: 1   Count: 19   Next Free Rec: 2
	## Offset        Clusters       Block#
	0  0             271            86183
	1  271           0             143592
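
The adjustment can be sketched as trimming the parent's rightmost record so
that it ends where the leaf's rightmost record ends. The struct and both
helpers below are hypothetical simplifications, not the code this patch adds
to alloc.c. The child lookup rule (first record whose range contains the
target, otherwise the last record) is likewise an assumption made for
illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for an interior extent record (an assumption). */
struct branch_rec {
	uint32_t cpos;     /* first cluster offset covered by the record */
	uint32_t clusters; /* number of clusters covered by the record */
};

/* Shrink `parent` so it ends at the end of the leaf's rightmost record. */
static void trim_parent_to_leaf(struct branch_rec *parent,
				const struct branch_rec *leaf_last)
{
	uint32_t leaf_end = leaf_last->cpos + leaf_last->clusters;
	uint32_t parent_end = parent->cpos + parent->clusters;

	if (leaf_end < parent_end)
		parent->clusters = leaf_end - parent->cpos;
}

/* Assumed lookup: first record containing cpos, else the last record. */
static int pick_child(const struct branch_rec *recs, int n, uint32_t cpos)
{
	int i;

	for (i = 0; i < n; i++) {
		uint32_t end = recs[i].cpos + recs[i].clusters;

		if (cpos >= recs[i].cpos && cpos < end)
			return i;
	}
	return n - 1;
}

/* Replays the example: root rec {0, 280}, leaf rightmost rec {270, 1}. */
static void check_adjustment(void)
{
	struct branch_rec root[2] = { { 0, 280 }, { 271, 0 } };
	struct branch_rec leaf_last = { 270, 1 };

	/* Without the fix, cluster 275 resolves to the full leaf. */
	assert(pick_child(root, 2, 275) == 0);

	trim_parent_to_leaf(&root[0], &leaf_last);
	assert(root[0].clusters == 271);

	/* With the fix, cluster 275 resolves to the new, empty branch. */
	assert(pick_child(root, 2, 275) == 1);
}
```

Under these assumptions, the trimmed record no longer claims clusters 271
through 279, so the insert at cluster 275 goes to the new leaf instead of the
full one.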

Signed-off-by: Tao Ma <tao.ma@oracle.com>
Acked-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
2009-06-15 14:49:43 -07:00
cluster ocfs2: update comments in masklog.h 2009-05-05 14:48:11 -07:00
dlm ocfs2/dlm: Tweak mle_state output 2009-04-03 11:39:25 -07:00
acl.c New helper - current_umask() 2009-03-31 23:00:26 -04:00
acl.h ocfs2: add ocfs2_init_acl in mknod 2009-01-05 08:34:20 -08:00
alloc.c ocfs2: Adjust rightmost path in ocfs2_add_branch. 2009-06-15 14:49:43 -07:00
alloc.h ocfs2: Add a name indexed b-tree to directory inodes 2009-04-03 11:39:15 -07:00
aops.c ocfs2: Pagecache usage optimization on ocfs2 2009-04-03 11:39:26 -07:00
aops.h
blockcheck.c ocfs2: Add statistics for the checksum and ecc operations. 2009-06-03 19:15:36 -07:00
blockcheck.h ocfs2: Add statistics for the checksum and ecc operations. 2009-06-03 19:15:36 -07:00
buffer_head_io.c ocfs2: Use BH_JBDPrivateStart instead of BH_Unshadow 2009-01-05 08:40:24 -08:00
buffer_head_io.h ocfs2: Validate metadata only when it's read from disk. 2009-01-05 08:36:53 -08:00
dcache.c ocfs2: Add missing iput() during error handling in ocfs2_dentry_attach_lock() 2009-04-23 14:56:13 -07:00
dcache.h constify dentry_operations: OCFS2 2009-03-27 14:44:02 -04:00
dir.c ocfs2: Correct ordering of ip_alloc_sem and localloc locks for directories 2009-06-03 19:14:30 -07:00
dir.h ocfs2: Introduce dir free space list 2009-04-03 11:39:16 -07:00
dlmglue.c ocfs2: timer to queue scan of all orphan slots 2009-06-03 19:14:31 -07:00
dlmglue.h ocfs2: timer to queue scan of all orphan slots 2009-06-03 19:14:31 -07:00
export.c ocfs2: Fix some printk() warnings. 2009-04-21 16:31:20 -07:00
export.h
extent_map.c ocfs2: Wrap virtual block reads in ocfs2_read_virt_blocks() 2009-01-05 08:36:54 -08:00
extent_map.h ocfs2: Wrap virtual block reads in ocfs2_read_virt_blocks() 2009-01-05 08:36:54 -08:00
file.c ocfs2: fdatasync should skip unimportant metadata writeout 2009-06-09 10:45:47 -07:00
file.h ocfs2: Implementation of local and global quota file handling 2009-01-05 08:40:23 -08:00
heartbeat.c
heartbeat.h
inode.c ocfs2: fix rare stale inode errors when exporting via nfs 2009-04-03 11:39:25 -07:00
inode.h ocfs2: fix rare stale inode errors when exporting via nfs 2009-04-03 11:39:25 -07:00
ioctl.c
ioctl.h
journal.c ocfs2 patch to track delayed orphan scan timer statistics 2009-06-03 19:14:31 -07:00
journal.h ocfs2: timer to queue scan of all orphan slots 2009-06-03 19:14:31 -07:00
Kconfig fs/Kconfig: move ocfs2 out 2009-01-22 13:15:54 +03:00
localalloc.c ocfs2: Remove debugfs file local_alloc_stats 2009-04-03 11:39:15 -07:00
localalloc.h
locks.c
locks.h
Makefile ocfs2: Add the underlying blockcheck code. 2009-01-05 08:40:31 -08:00
mmap.c mm: page_mkwrite change prototype to match fault 2009-04-01 08:59:14 -07:00
mmap.h
namei.c ocfs2/trivial: Remove unused variable in ocfs2_rename. 2009-04-29 10:57:18 -07:00
namei.h
ocfs1_fs_compat.h
ocfs2_fs.h ocfs2: Enable indexed directories 2009-04-03 11:39:16 -07:00
ocfs2_lockid.h ocfs2: timer to queue scan of all orphan slots 2009-06-03 19:14:31 -07:00
ocfs2_lockingver.h
ocfs2.h ocfs2: Add statistics for the checksum and ecc operations. 2009-06-03 19:15:36 -07:00
quota_global.c ocfs2: Fix possible deadlock in ocfs2_global_read_dquot() 2009-06-03 19:14:28 -07:00
quota_local.c ocfs2: Fix possible deadlock in quota recovery 2009-06-03 19:14:30 -07:00
quota.h ocfs2: Fix ocfs2_read_quota_block() error handling. 2009-01-05 08:40:24 -08:00
resize.c ocfs2: Use metadata-specific ocfs2_journal_access_*() functions. 2009-01-05 08:40:32 -08:00
resize.h
slot_map.c ocfs2: Validate metadata only when it's read from disk. 2009-01-05 08:36:53 -08:00
slot_map.h
stack_o2cb.c
stack_user.c ocfs2: initialize stack_user lvbptr 2008-12-01 14:46:39 -08:00
stackglue.c
stackglue.h
suballoc.c ocfs2: Fix some printk() warnings. 2009-04-21 16:31:20 -07:00
suballoc.h ocfs2: fix rare stale inode errors when exporting via nfs 2009-04-03 11:39:25 -07:00
super.c ocfs2: Remove redundant gotos in ocfs2_mount_volume() 2009-06-03 19:20:15 -07:00
super.h
symlink.c ocfs2: Wrap inode block reads in a dedicated function. 2009-01-05 08:36:52 -08:00
symlink.h
sysfile.c
sysfile.h
uptodate.c
uptodate.h
ver.c
ver.h
xattr.c ocfs2: Don't printk the error when listing too many xattrs. 2009-05-05 14:43:24 -07:00
xattr.h ocfs2: Add a name indexed b-tree to directory inodes 2009-04-03 11:39:15 -07:00