kernel_optimize_test/mm
Zhang Yi 13d60f4b6a futex: Take hugepages into account when generating futex_key
The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in an unique identifier for each futex.

Though this is not true when futexes are located in different subpages
of an hugepage. The reason is, that the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page->index.

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in an hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page->index.

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi <zhang.yi20@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Tested-by: Ma Chenggong <ma.chenggong@zte.com.cn>
Reviewed-by: 'Mel Gorman' <mgorman@suse.de>
Acked-by: 'Darren Hart' <dvhart@linux.intel.com>
Cc: 'Peter Zijlstra' <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2013-06-25 23:11:19 +02:00
..
backing-dev.c writeback: expose the bdi_wq workqueue 2013-04-01 19:08:06 -07:00
balloon_compaction.c
bootmem.c
bounce.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
cleancache.c mm: cleancache: clean up cleancache_enabled 2013-04-30 17:04:01 -07:00
compaction.c
debug-pagealloc.c
dmapool.c
fadvise.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
failslab.c
filemap_xip.c lift sb_start_write() out of ->write() 2013-04-09 14:12:56 -04:00
filemap.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2013-05-01 17:51:54 -07:00
fremap.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
frontswap.c frontswap: get rid of swap_lock dependency 2013-04-30 17:04:00 -07:00
highmem.c
huge_memory.c mm/THP: use pmd_populate() to update the pmd with pgtable_t pointer 2013-05-24 16:22:51 -07:00
hugetlb_cgroup.c
hugetlb.c futex: Take hugepages into account when generating futex_key 2013-06-25 23:11:19 +02:00
hwpoison-inject.c
init-mm.c
internal.h
interval_tree.c
Kconfig mmKconfig: add an option to disable bounce 2013-04-29 15:54:40 -07:00
Kconfig.debug
kmemcheck.c
kmemleak-test.c
kmemleak.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
ksm.c ksm: fix m68k build: only NUMA needs pfn_to_nid 2013-03-08 15:05:34 -08:00
maccess.c
madvise.c mm: madvise: complete input validation before taking lock 2013-04-29 15:54:37 -07:00
Makefile memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
memblock.c memblock: fix missing comment of memblock_insert_region() 2013-04-29 15:54:38 -07:00
memcontrol.c mm: memcg: remove incorrect VM_BUG_ON for swap cache pages in uncharge 2013-05-24 16:22:51 -07:00
memory_hotplug.c mm/memory_hotplug.c: fix printk format warnings 2013-05-24 16:22:52 -07:00
memory-failure.c HWPOISON: check dirty flag to match against clean page 2013-04-29 15:54:28 -07:00
memory.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-04-30 09:36:50 -07:00
mempolicy.c mm/mempolicy.c: fix sp_node_init() argument ordering 2013-03-08 15:05:34 -08:00
mempool.c
migrate.c mm compaction: fix of improper cache flush in migration code 2013-05-24 16:22:51 -07:00
mincore.c
mlock.c Revert "mm: introduce VM_POPULATE flag to better deal with racy userspace programs" 2013-03-28 17:45:51 -07:00
mm_init.c
mmap.c shm: fix null pointer deref when userspace specifies invalid hugepage size 2013-05-09 14:22:47 -07:00
mmu_context.c mm: remove old aio use_mm() comment 2013-05-07 18:38:27 -07:00
mmu_notifier.c mm: mmu_notifier: re-fix freed page still mapped in secondary MMU 2013-05-24 16:22:51 -07:00
mmzone.c
mprotect.c
mremap.c
msync.c
nobootmem.c mm, nobootmem: do memset() after memblock_reserve() 2013-04-29 15:54:39 -07:00
nommu.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-05-01 07:21:43 -07:00
oom_kill.c
page_alloc.c mm: Fix virt_to_page() warning 2013-05-22 08:05:16 -07:00
page_cgroup.c
page_io.c Merge branch 'for-3.10/core' of git://git.kernel.dk/linux-block 2013-05-08 10:13:35 -07:00
page_isolation.c
page-writeback.c mm: make snapshotting pages for stable writes a per-bio operation 2013-04-29 15:54:33 -07:00
pagewalk.c mm/pagewalk.c: walk_page_range should avoid VM_PFNMAP areas 2013-05-24 16:22:53 -07:00
percpu-km.c
percpu-vm.c
percpu.c
pgtable-generic.c
process_vm_access.c Fix: compat_rw_copy_check_uvector() misuse in aio, readv, writev, and security keys 2013-03-12 11:05:45 -07:00
quicklist.c
readahead.c teach SYSCALL_DEFINE<n> how to deal with long long/unsigned long long 2013-03-03 22:46:22 -05:00
rmap.c rmap: recompute pgoff for unmapping huge page 2013-04-29 15:54:28 -07:00
shmem.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
slab_common.c mm/slab: Fix crash during slab init 2013-05-08 15:02:33 -07:00
slab.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-05-07 08:42:20 -07:00
slab.h
slob.c
slub.c Merge branch 'slab/for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux 2013-05-07 08:42:20 -07:00
sparse-vmemmap.c sparse-vmemmap: specify vmemmap population range in bytes 2013-04-29 15:54:35 -07:00
sparse.c mm, hotplug: avoid compiling memory hotremove functions when disabled 2013-04-29 15:54:37 -07:00
swap_state.c mm: thp: add split tail pages to shrink page list in page reclaim 2013-04-29 15:54:38 -07:00
swap.c aio: don't include aio.h in sched.h 2013-05-07 20:16:25 -07:00
swapfile.c frontswap: get rid of swap_lock dependency 2013-04-30 17:04:00 -07:00
truncate.c
util.c
vmalloc.c mm/vmalloc.c: add vfree comment 2013-05-07 18:38:27 -07:00
vmpressure.c memcg: add memory.pressure_level events 2013-04-29 15:54:38 -07:00
vmscan.c mm: thp: add split tail pages to shrink page list in page reclaim 2013-04-29 15:54:38 -07:00
vmstat.c mm/vmstat: add note on safety of drain_zonestat 2013-04-29 15:54:38 -07:00