kernel_optimize_test

Andrea Arcangeli 1c2fb7a4c2 ksm: fix deadlock with munlock in exit_mmap

Rawhide users have reported hang at startup when cryptsetup is run: the same problem can be simply reproduced by running a program

	int main() { mlockall(MCL_CURRENT | MCL_FUTURE); return 0; }

The problem is that exit_mmap() applies munlock_vma_pages_all() to clean up VM_LOCKED areas, and its current implementation (stupidly) tries to fault in absent pages, for example where PROT_NONE prevented them being faulted in when mlocking. Whereas the "ksm: fix oom deadlock" patch, knowing there's a race by which KSM might try to fault in pages after exit_mmap() had finally zapped the range, backs out of such faults doing nothing when its ksm_test_exit() notices mm_users 0.

So revert that part of "ksm: fix oom deadlock" which moved the ksm_exit() call from before exit_mmap() to the middle of exit_mmap(); and remove those ksm_test_exit() checks from the page fault paths, so allowing the munlocking to proceed without interference.

ksm_exit, if there are rmap_items still chained on this mm slot, takes mmap_sem write side: so preventing KSM from working on an mm while exit_mmap runs. And KSM will bail out as soon as it notices that mm_users is already zero, thanks to its internal ksm_test_exit checks. So that when a task is killed by OOM killer or the user, KSM will not indefinitely prevent it from running exit_mmap to release its memory.

This does break a part of what "ksm: fix oom deadlock" was trying to achieve. When unmerging KSM (echo 2 >/sys/kernel/mm/ksm), and even when ksmd itself has to cancel a KSM page, it is possible that the first OOM-kill victim would be the KSM process being faulted: then its memory won't be freed until a second victim has been selected (freeing memory for the unmerging fault to complete).

But the OOM killer is already liable to kill a second victim once the intended victim's p->mm goes to NULL: so there's not much point in rejecting this KSM patch before fixing that OOM behaviour. It is very much more important to allow KSM users to boot up, than to haggle over an unlikely and poorly supported OOM case.

We also intend to fix munlocking to not fault pages: at which point this patch _could_ be reverted; though that would be controversial, so we hope to find a better solution.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Justin M. Forbes <jforbes@redhat.com>
Acked-for-now-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: Izik Eidus <ieidus@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2009-09-22 07:17:32 -07:00
allocpercpu.c
backing-dev.c	writeback: splice dirty inode entries to default bdi on bdi_destroy()	2009-09-16 15:18:52 +02:00
bootmem.c
bounce.c
debug-pagealloc.c
dmapool.c
fadvise.c
failslab.c
filemap_xip.c
filemap.c	mm: oom analysis: add shmem vmstat	2009-09-22 07:17:27 -07:00
fremap.c
highmem.c
hugetlb.c	hugetlb: restore interleaving of bootmem huge pages	2009-09-22 07:17:26 -07:00
init-mm.c
internal.h
Kconfig	ksm: the mm interface to ksm	2009-09-22 07:17:31 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c
ksm.c	ksm: fix deadlock with munlock in exit_mmap	2009-09-22 07:17:32 -07:00
maccess.c
madvise.c	ksm: the mm interface to ksm	2009-09-22 07:17:31 -07:00
Makefile	ksm: the mm interface to ksm	2009-09-22 07:17:31 -07:00
memcontrol.c
memory_hotplug.c	memory hotplug: update zone pcp at memory online	2009-09-22 07:17:25 -07:00
memory.c	ksm: fix deadlock with munlock in exit_mmap	2009-09-22 07:17:32 -07:00
mempolicy.c
mempool.c
migrate.c	mm: vmstat: add isolate pages	2009-09-22 07:17:29 -07:00
mincore.c
mlock.c
mm_init.c
mmap.c	ksm: fix deadlock with munlock in exit_mmap	2009-09-22 07:17:32 -07:00
mmu_notifier.c	ksm: add mmu_notifier set_pte_at_notify()	2009-09-22 07:17:31 -07:00
mmzone.c
mprotect.c	perf: Do the big rename: Performance Counters -> Performance Events	2009-09-21 14:28:04 +02:00
mremap.c	ksm: prevent mremap move poisoning	2009-09-22 07:17:31 -07:00
msync.c
nommu.c
oom_kill.c
page_alloc.c	mm: perform non-atomic test-clear of PG_mlocked on free	2009-09-22 07:17:30 -07:00
page_cgroup.c	memory hotplug: alloc page from other node in memory online	2009-09-22 07:17:26 -07:00
page_io.c
page_isolation.c
page-writeback.c	mm: count only reclaimable lru pages	2009-09-22 07:17:30 -07:00
pagewalk.c
percpu.c	Merge branch 'for-next' into for-linus	2009-09-15 09:57:19 +09:00
prio_tree.c
quicklist.c
readahead.c
rmap.c	ksm: no debug in page_dup_rmap()	2009-09-22 07:17:31 -07:00
shmem_acl.c
shmem.c	Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev	2009-09-15 09:50:49 -07:00
slab.c
slob.c
slub.c	slub: Fix build error in kmem_cache_open() with !CONFIG_SLUB_DEBUG	2009-09-15 22:32:10 +03:00
sparse-vmemmap.c	memory hotplug: alloc page from other node in memory online	2009-09-22 07:17:26 -07:00
sparse.c	memory hotplug: alloc page from other node in memory online	2009-09-22 07:17:26 -07:00
swap_state.c
swap.c
swapfile.c
thrash.c
truncate.c
util.c
vmalloc.c	vmalloc.c: fix double error checking	2009-09-22 07:17:30 -07:00
vmscan.c	vmscan: kill unnecessary prefetch	2009-09-22 07:17:30 -07:00
vmstat.c	mm: vmstat: add isolate pages	2009-09-22 07:17:29 -07:00