kernel_optimize_test/mm
Jan Kara ef5d437f71 mm: fix XFS oops due to dirty pages without buffers on s390
On s390 any write to a page (even from the kernel itself) sets the
architecture-specific page dirty bit.  Thus when a page is written to
via a buffered write, the HW dirty bit gets set, and when we later map
and unmap the page, page_remove_rmap() finds the dirty bit and calls
set_page_dirty().

Dirtying a page which shouldn't be dirty can cause all sorts of
problems for filesystems.  The bug we observed in practice is that
buffers from the page get freed, so when the page later gets marked
dirty and writeback writes it, XFS crashes on the assertion
BUG_ON(!PagePrivate(page)) in page_buffers(), called from
xfs_count_page_state().

A similar problem can also happen when the zero_user_segment() call
from xfs_vm_writepage() (or block_write_full_page() for that matter)
sets the hardware dirty bit during writeback, the buffers are later
freed, and the page is then unmapped.

Fix the issue by ignoring the s390 HW dirty bit for page cache pages of
mappings with mapping_cap_account_dirty().  This is safe because for
such mappings, when a page gets marked as writeable in the PTE it is
also marked dirty in do_wp_page() or do_page_fault().  When the dirty
bit is cleared by clear_page_dirty_for_io(), the page gets
write-protected in page_mkclean().  So a pagecache page is writeable if
and only if it is dirty.
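
The reasoning above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions, not the code in
mm/rmap.c: struct toy_page and every toy_*() helper are hypothetical
names invented for illustration, and the model tracks nothing beyond
the four flags the argument depends on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page cache page; not a kernel type. */
struct toy_page {
	bool mapping_accounts_dirty;	/* models mapping_cap_account_dirty() */
	bool hw_dirty;			/* models the s390 HW dirty bit */
	bool sw_dirty;			/* models the software page dirty flag */
	bool pte_writable;		/* models a writeable PTE for the page */
};

/* Models a user write fault: the PTE becomes writeable and the page is
 * marked dirty in the same step, as described for do_wp_page(). */
static void toy_write_fault(struct toy_page *page)
{
	assert(page != NULL);
	page->pte_writable = true;
	page->sw_dirty = true;
	page->hw_dirty = true;
}

/* Models clear_page_dirty_for_io(): the dirty flag is cleared and the
 * page is write-protected (page_mkclean()).  The HW bit is deliberately
 * left alone; this sketch does not model what happens to it here. */
static void toy_clear_dirty_for_io(struct toy_page *page)
{
	assert(page != NULL);
	page->sw_dirty = false;
	page->pte_writable = false;
}

/* Models a store done by the kernel itself (for example zeroing part of
 * the page during writeback): only the HW bit changes. */
static void toy_kernel_store(struct toy_page *page)
{
	assert(page != NULL);
	page->hw_dirty = true;
}

/* Models the decision taken when the page is unmapped.  Returns true if
 * the HW bit was transferred into the software dirty flag. */
static bool toy_unmap_transfer_dirty(struct toy_page *page)
{
	assert(page != NULL);
	page->pte_writable = false;
	if (page->mapping_accounts_dirty)
		return false;	/* HW bit ignored for such mappings */
	if (!page->hw_dirty)
		return false;
	page->hw_dirty = false;
	page->sw_dirty = true;
	return true;
}
```

In this model, a page of a dirty-accounting mapping that is written
back and then touched only by toy_kernel_store() stays clean across
toy_unmap_transfer_dirty(), while a page of a mapping without dirty
accounting still picks up the HW bit on unmap.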

Thanks to Hugh Dickins for pointing out that the mapping has to have
mapping_cap_account_dirty() for things to work and for proposing a
cleaned-up variant of the patch.

The patch has survived about two hours of running fsx-linux on tmpfs
while heavily swapping and several days of running on our build machines
where the original problem was triggered.

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <stable@vger.kernel.org>		[3.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-25 14:37:52 -07:00
backing-dev.c
bootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
bounce.c
cleancache.c
compaction.c mm: compaction: correct the nr_strict va isolated check for CMA 2012-10-19 14:07:47 -07:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
filemap.c readahead: fault retry breaks mmap file read random detection 2012-10-09 16:22:47 +09:00
fremap.c remap_file_pages: correctly handle the case of a NULL vm_ops pointer 2012-10-19 13:37:57 -07:00
frontswap.c
highmem.c
huge_memory.c mm: huge_memory: Fix build error. 2012-10-15 07:59:15 -07:00
hugetlb_cgroup.c
hugetlb.c mm: document PageHuge somewhat 2012-10-09 16:23:03 +09:00
hwpoison-inject.c
init-mm.c
internal.h mm, thp: fix mlock statistics 2012-10-09 16:23:03 +09:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
Kconfig mm: enable CONFIG_COMPACTION by default 2012-10-09 16:22:53 +09:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: use rbtree instead of prio tree 2012-10-09 16:22:39 +09:00
ksm.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
maccess.c
madvise.c mm: prepare VM_DONTDUMP for using in drivers 2012-10-09 16:22:18 +09:00
Makefile mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
memblock.c mm: avoid section mismatch warning for memblock_type_name 2012-10-09 16:23:01 +09:00
memcontrol.c memcg: move mem_cgroup_is_root upwards 2012-10-09 16:22:55 +09:00
memory_hotplug.c memory-hotplug: suppress "Trying to free nonexistent resource <XXXXXXXXXXXXXXXX-YYYYYYYYYYYYYYYY>" warning 2012-10-09 16:23:04 +09:00
memory-failure.c mm anon rmap: replace same_anon_vma linked list with an interval tree. 2012-10-09 16:22:41 +09:00
memory.c mm, thp: fix mapped pages avoiding unevictable list on mlock 2012-10-09 16:23:02 +09:00
mempolicy.c mm, mempolicy: fix printing stack contents in numa_maps 2012-10-16 18:00:50 -07:00
mempool.c
migrate.c
mincore.c
mlock.c mm, thp: fix mlock statistics 2012-10-09 16:23:03 +09:00
mm_init.c
mmap.c mm: avoid taking rmap locks in move_ptes() 2012-10-09 16:22:42 +09:00
mmu_context.c
mmu_notifier.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
mmzone.c
mprotect.c
mremap.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
msync.c
nobootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
nommu.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
oom_kill.c oom: remove deprecated oom_adj 2012-10-09 16:22:24 +09:00
page_alloc.c cma: decrease cc.nr_migratepages after reclaiming pagelist 2012-10-09 16:23:01 +09:00
page_cgroup.c
page_io.c
page_isolation.c mm/page_alloc: refactor out __alloc_contig_migrate_alloc() 2012-10-09 16:22:52 +09:00
page-writeback.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c sections: fix section conflicts in mm/percpu.c 2012-10-06 03:04:44 +09:00
pgtable-generic.c thp: introduce pmdp_invalidate() 2012-10-09 16:22:29 +09:00
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm: fix XFS oops due to dirty pages without buffers on s390 2012-10-25 14:37:52 -07:00
shmem.c tmpfs,ceph,gfs2,isofs,reiserfs,xfs: fix fh_len checking 2012-10-09 23:33:55 -04:00
slab_common.c mm, slab: release slab_mutex earlier in kmem_cache_destroy() 2012-10-10 09:25:08 +03:00
slab.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2012-10-07 07:53:13 +09:00
slab.h
slob.c Merge branch 'testing/driver-warnings' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc into fixes 2012-10-19 15:40:18 -07:00
slub.c Merge branch 'slab/common-for-cgroups' into slab/for-linus 2012-10-03 09:56:37 +03:00
sparse-vmemmap.c
sparse.c
swap_state.c
swap.c mm: remove vma arg from page_evictable 2012-10-09 16:22:55 +09:00
swapfile.c vfs: make path_openat take a struct filename pointer 2012-10-12 20:15:09 -04:00
truncate.c mm: use clear_page_mlock() in page_remove_rmap() 2012-10-09 16:22:56 +09:00
util.c
vmalloc.c mm: use %pK for /proc/vmallocinfo 2012-10-09 16:23:03 +09:00
vmscan.c CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
vmstat.c mm: remove unevictable_pgs_mlockfreed 2012-10-09 16:22:59 +09:00