kernel_optimize_test/fs
Michal Hocko d1908f5255 fs: break out of iomap_file_buffered_write on fatal signals
Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion.  He has tracked
this down to the following path

	__alloc_pages_nodemask+0x436/0x4d0
	alloc_pages_current+0x97/0x1b0
	__page_cache_alloc+0x15d/0x1a0          mm/filemap.c:728
	pagecache_get_page+0x5a/0x2b0           mm/filemap.c:1331
	grab_cache_page_write_begin+0x23/0x40   mm/filemap.c:2773
	iomap_write_begin+0x50/0xd0             fs/iomap.c:118
	iomap_write_actor+0xb5/0x1a0            fs/iomap.c:190
	? iomap_write_end+0x80/0x80             fs/iomap.c:150
	iomap_apply+0xb3/0x130                  fs/iomap.c:79
	iomap_file_buffered_write+0x68/0xa0     fs/iomap.c:243
	? iomap_write_end+0x80/0x80
	xfs_file_buffered_aio_write+0x132/0x390 [xfs]
	? remove_wait_queue+0x59/0x60
	xfs_file_write_iter+0x90/0x130 [xfs]
	__vfs_write+0xe5/0x140
	vfs_write+0xc7/0x1f0
	? syscall_trace_enter+0x1d0/0x380
	SyS_write+0x58/0xc0
	do_syscall_64+0x6c/0x200
	entry_SYSCALL64_slow_path+0x25/0x25

the oom victim has access to all memory reserves to make a forward
progress to exit easier.  But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request.  We need to
check for fatal signals and back off with a short write instead.

As the iomap_apply delegates all the work down to the actor we have to
hook into those.  All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there.  dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly.  Other callers like iomap_page_mkwrite work on a
single page or iomap_fiemap_actor do not allocate memory based on the
given len.

Fixes: 68a9f5e700 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-03 14:13:19 -08:00
..
9p
adfs
affs
afs
autofs4
befs
bfs
btrfs Merge branch 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2017-01-27 12:41:46 -08:00
cachefiles
ceph
cifs
coda
configfs
cramfs
crypto
debugfs
devpts
dlm
ecryptfs
efivarfs
efs
exofs
exportfs
ext2 dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
ext4 dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
f2fs
fat
freevxfs
fscache fscache: Fix dead object requeue 2017-01-31 13:23:09 -05:00
fuse
gfs2
hfs
hfsplus
hostfs
hpfs
hugetlbfs
isofs
jbd2
jffs2
jfs
kernfs
lockd
minix
ncpfs
nfs pNFS: Fix a reference leak in _pnfs_return_layout 2017-01-26 15:50:41 -05:00
nfs_common
nfsd nfsd: special case truncates some more 2017-01-31 12:29:24 -05:00
nilfs2
nls
notify
ntfs
ocfs2
omfs
openpromfs
orangefs
overlayfs
proc proc: add a schedule point in proc_pid_readdir() 2017-01-24 16:26:14 -08:00
pstore
qnx4
qnx6
quota
ramfs
reiserfs
romfs romfs: use different way to generate fsid for BLOCK or MTD 2017-01-24 16:26:14 -08:00
squashfs
sysfs
sysv
tracefs
ubifs
udf
ufs
xfs xfs: prevent quotacheck from overloading inode lru 2017-01-27 09:32:30 -08:00
aio.c
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c block: fix use after free in __blkdev_direct_IO 2017-01-24 07:55:53 -07:00
buffer.c
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c
compat.c
coredump.c
dax.c fs: break out of iomap_file_buffered_write on fatal signals 2017-02-03 14:13:19 -08:00
dcache.c
dcookies.c
direct-io.c
drop_caches.c
eventfd.c
eventpoll.c
exec.c
fcntl.c
fhandle.c
file_table.c
file.c
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c
inode.c
internal.h
ioctl.c
iomap.c fs: break out of iomap_file_buffered_write on fatal signals 2017-02-03 14:13:19 -08:00
Kconfig dax: fix build warnings with FS_DAX and !FS_IOMAP 2017-01-24 16:26:14 -08:00
Kconfig.binfmt
libfs.c
locks.c
Makefile
mbcache.c
mount.h
mpage.c
namei.c
namespace.c
no-block.c
nsfs.c
open.c
pipe.c
pnode.c
pnode.h
posix_acl.c
proc_namespace.c
read_write.c
readdir.c
select.c
seq_file.c
signalfd.c
splice.c
stack.c
stat.c
statfs.c
super.c
sync.c
timerfd.c
userfaultfd.c userfaultfd: fix SIGBUS resulting from false rwsem wakeups 2017-01-24 16:26:14 -08:00
utimes.c
xattr.c