forked from luck/tmp_suning_uos_patched
da4f4916da
commit 9449ad33be8480f538b11a593e2dda2fb33ca06d upstream. For punch holes in EOF blocks, fallocate used buffer write to zero the EOF blocks in last cluster. But since ->writepage will ignore EOF pages, those zeros will not be flushed. This "looks" ok as commit 6bba4471f0cc ("ocfs2: fix data corruption by fallocate") will zero the EOF blocks when extend the file size, but it isn't. The problem happened on those EOF pages, before writeback, those pages had DIRTY flag set and all buffer_head in them also had DIRTY flag set, when writeback run by write_cache_pages(), DIRTY flag on the page was cleared, but DIRTY flag on the buffer_head not. When next write happened to those EOF pages, since buffer_head already had DIRTY flag set, it would not mark page DIRTY again. That made writeback ignore them forever. That will cause data corruption. Even directio write can't work because it will fail when trying to drop pages caches before direct io, as it found the buffer_head for those pages still had DIRTY flag set, then it will fall back to buffer io mode. To make a summary of the issue, as writeback ingores EOF pages, once any EOF page is generated, any write to it will only go to the page cache, it will never be flushed to disk even file size extends and that page is not EOF page any more. The fix is to avoid zero EOF blocks with buffer write. The following code snippet from qemu-img could trigger the corruption. 656 open("6b3711ae-3306-4bdd-823c-cf1c0060a095.conv.2", O_RDWR|O_DIRECT|O_CLOEXEC) = 11 ... 660 fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2275868672, 327680 <unfinished ...> 660 fallocate(11, 0, 2275868672, 327680) = 0 658 pwrite64(11, " Link: https://lkml.kernel.org/r/20210722054923.24389-2-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Jun Piao <piaojun@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
---|---|---|
.. | ||
cluster | ||
dlm | ||
dlmfs | ||
acl.c | ||
acl.h | ||
alloc.c | ||
alloc.h | ||
aops.c | ||
aops.h | ||
blockcheck.c | ||
blockcheck.h | ||
buffer_head_io.c | ||
buffer_head_io.h | ||
dcache.c | ||
dcache.h | ||
dir.c | ||
dir.h | ||
dlmglue.c | ||
dlmglue.h | ||
export.c | ||
export.h | ||
extent_map.c | ||
extent_map.h | ||
file.c | ||
file.h | ||
filecheck.c | ||
filecheck.h | ||
heartbeat.c | ||
heartbeat.h | ||
inode.c | ||
inode.h | ||
ioctl.c | ||
ioctl.h | ||
journal.c | ||
journal.h | ||
Kconfig | ||
localalloc.c | ||
localalloc.h | ||
locks.c | ||
locks.h | ||
Makefile | ||
mmap.c | ||
mmap.h | ||
move_extents.c | ||
move_extents.h | ||
namei.c | ||
namei.h | ||
ocfs1_fs_compat.h | ||
ocfs2_fs.h | ||
ocfs2_ioctl.h | ||
ocfs2_lockid.h | ||
ocfs2_lockingver.h | ||
ocfs2_trace.h | ||
ocfs2.h | ||
quota_global.c | ||
quota_local.c | ||
quota.h | ||
refcounttree.c | ||
refcounttree.h | ||
reservations.c | ||
reservations.h | ||
resize.c | ||
resize.h | ||
slot_map.c | ||
slot_map.h | ||
stack_o2cb.c | ||
stack_user.c | ||
stackglue.c | ||
stackglue.h | ||
suballoc.c | ||
suballoc.h | ||
super.c | ||
super.h | ||
symlink.c | ||
symlink.h | ||
sysfile.c | ||
sysfile.h | ||
uptodate.c | ||
uptodate.h | ||
xattr.c | ||
xattr.h |