kernel_optimize_test/mm
David Rientjes, commit b676b293fb: mm, thp: fix mapped pages avoiding unevictable list on mlock
When a transparent hugepage is mapped and it is included in an mlock()
range, follow_page() fails to set the page's mlock bit and move it to
the unevictable LRU.

This is evident if you try to mlock(), munlock(), and then mlock() a
range again.  Currently:

	#define MAP_SIZE	(4UL << 30)	/* 4GB */

	void *ptr = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	mlock(ptr, MAP_SIZE);

		$ grep -E "Unevictable|Inactive\(anon" /proc/meminfo
		Inactive(anon):     6304 kB
		Unevictable:     4213924 kB

	munlock(ptr, MAP_SIZE);

		Inactive(anon):  4186252 kB
		Unevictable:       19652 kB

	mlock(ptr, MAP_SIZE);

		Inactive(anon):  4198556 kB
		Unevictable:       21684 kB

Notice that less than 2MB was added to the unevictable list (21684 kB -
19652 kB = 2032 kB); only the pages in the range that are not transparent
hugepages were moved, because the 4GB range was allocated with mmap() and
has no specific alignment.  If posix_memalign() had been used instead,
unevictable would not have grown at all on the second mlock().
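
For reference, the snippet above can be wrapped into a standalone
program.  The version below is only a minimal sketch (the program name,
alignment constant, and pause helper are illustrative, not part of the
original test): it allocates with posix_memalign() so the range is
2MB-aligned, asks for THP backing with madvise(MADV_HUGEPAGE), and
pauses after each step so /proc/meminfo can be checked as above.

	/*
	 * Minimal standalone sketch of the reproducer (not the exact test
	 * used to produce the numbers above); assumes x86_64 with 2MB
	 * transparent hugepages enabled and enough RLIMIT_MEMLOCK (or root)
	 * to lock 4GB.  Build with: gcc -O2 -o thp-mlock thp-mlock.c
	 */
	#define _GNU_SOURCE
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>

	#define MAP_SIZE	(4UL << 30)	/* 4GB */
	#define THP_ALIGN	(2UL << 20)	/* 2MB, so the range can be THP-backed */

	static void pause_for_meminfo(const char *step)
	{
		/*
		 * Inspect the counters by hand at each step, e.g.:
		 *   grep -E "Unevictable|Inactive\(anon" /proc/meminfo
		 */
		printf("%s done, press Enter to continue...\n", step);
		getchar();
	}

	int main(void)
	{
		void *ptr;
		int err = posix_memalign(&ptr, THP_ALIGN, MAP_SIZE);

		if (err) {
			fprintf(stderr, "posix_memalign: %s\n", strerror(err));
			return 1;
		}
		/* Ask for THP backing; mlock() below faults the pages in. */
		madvise(ptr, MAP_SIZE, MADV_HUGEPAGE);

		if (mlock(ptr, MAP_SIZE))
			perror("mlock");
		pause_for_meminfo("first mlock()");

		if (munlock(ptr, MAP_SIZE))
			perror("munlock");
		pause_for_meminfo("munlock()");

		if (mlock(ptr, MAP_SIZE))
			perror("mlock");
		pause_for_meminfo("second mlock()");

		return 0;
	}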

The fix is to call mlock_vma_page() so that the mlock bit is set and the
page is added to the unevictable list; a sketch of where that call lands
follows the numbers below.  With this patch:

	mlock(ptr, MAP_SIZE);

		Inactive(anon):     4056 kB
		Unevictable:     4213940 kB

	munlock(ptr, MAP_SIZE);

		Inactive(anon):  4198268 kB
		Unevictable:       19636 kB

	mlock(ptr, MAP_SIZE);

		Inactive(anon):     4008 kB
		Unevictable:     4213940 kB
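
For illustration, the shape of the change in the transparent-hugepage
path of follow_page() (the update to huge_memory.c in the file list
below) is roughly the following.  This is a simplified sketch, not the
exact upstream diff; the local names (flags, vma, page) follow the
surrounding follow_page() code and are shown here without context:

	/*
	 * Sketch only: mirror what the normal-page path already does for
	 * FOLL_MLOCK so a mapped huge page gets the same treatment.
	 */
	if ((flags & FOLL_MLOCK) && (vma->vm_flags & VM_LOCKED)) {
		if (page->mapping && trylock_page(page)) {
			/* Push cached pages out of the pagevecs onto the LRU. */
			lru_add_drain();
			if (page->mapping)
				mlock_vma_page(page);	/* mlock bit + unevictable list */
			unlock_page(page);
		}
	}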

Signed-off-by: David Rientjes <rientjes@google.com>
Acked-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-09 16:23:02 +09:00
backing-dev.c
bootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
bounce.c
cleancache.c
compaction.c CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
filemap.c readahead: fault retry breaks mmap file read random detection 2012-10-09 16:22:47 +09:00
fremap.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
frontswap.c
highmem.c
huge_memory.c mm, thp: fix mapped pages avoiding unevictable list on mlock 2012-10-09 16:23:02 +09:00
hugetlb_cgroup.c
hugetlb.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
hwpoison-inject.c
init-mm.c
internal.h CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
interval_tree.c mm: add CONFIG_DEBUG_VM_RB build option 2012-10-09 16:22:42 +09:00
Kconfig mm: enable CONFIG_COMPACTION by default 2012-10-09 16:22:53 +09:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c kmemleak: use rbtree instead of prio tree 2012-10-09 16:22:39 +09:00
ksm.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
maccess.c
madvise.c mm: prepare VM_DONTDUMP for using in drivers 2012-10-09 16:22:18 +09:00
Makefile mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
memblock.c mm: avoid section mismatch warning for memblock_type_name 2012-10-09 16:23:01 +09:00
memcontrol.c memcg: move mem_cgroup_is_root upwards 2012-10-09 16:22:55 +09:00
memory_hotplug.c memory-hotplug: update memory block's state and notify userspace 2012-10-09 16:23:02 +09:00
memory-failure.c mm anon rmap: replace same_anon_vma linked list with an interval tree. 2012-10-09 16:22:41 +09:00
memory.c mm, thp: fix mapped pages avoiding unevictable list on mlock 2012-10-09 16:23:02 +09:00
mempolicy.c mm: revert 0def08e3 ("mm/mempolicy.c: check return code of check_range") 2012-10-09 16:22:58 +09:00
mempool.c
migrate.c
mincore.c
mlock.c mm: use clear_page_mlock() in page_remove_rmap() 2012-10-09 16:22:56 +09:00
mm_init.c
mmap.c mm: avoid taking rmap locks in move_ptes() 2012-10-09 16:22:42 +09:00
mmu_context.c
mmu_notifier.c mm: wrap calls to set_pte_at_notify with invalidate_range_start and invalidate_range_end 2012-10-09 16:22:58 +09:00
mmzone.c
mprotect.c
mremap.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
msync.c
nobootmem.c mm: fix-up zone present pages 2012-10-09 16:22:54 +09:00
nommu.c mm: replace vma prio_tree with an interval tree 2012-10-09 16:22:39 +09:00
oom_kill.c oom: remove deprecated oom_adj 2012-10-09 16:22:24 +09:00
page_alloc.c cma: decrease cc.nr_migratepages after reclaiming pagelist 2012-10-09 16:23:01 +09:00
page_cgroup.c
page_io.c
page_isolation.c mm/page_alloc: refactor out __alloc_contig_migrate_alloc() 2012-10-09 16:22:52 +09:00
page-writeback.c
pagewalk.c
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c thp: introduce pmdp_invalidate() 2012-10-09 16:22:29 +09:00
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm: move all mmu notifier invocations to be done outside the PT lock 2012-10-09 16:22:58 +09:00
shmem.c mm: kill vma flag VM_CAN_NONLINEAR 2012-10-09 16:22:17 +09:00
slab_common.c
slab.c
slab.h
slob.c
slub.c
sparse-vmemmap.c
sparse.c
swap_state.c
swap.c mm: remove vma arg from page_evictable 2012-10-09 16:22:55 +09:00
swapfile.c
truncate.c mm: use clear_page_mlock() in page_remove_rmap() 2012-10-09 16:22:56 +09:00
util.c
vmalloc.c mm: kill vma flag VM_RESERVED and mm->reserved_vm counter 2012-10-09 16:22:19 +09:00
vmscan.c CMA: migrate mlocked pages 2012-10-09 16:23:00 +09:00
vmstat.c mm: remove unevictable_pgs_mlockfreed 2012-10-09 16:22:59 +09:00