kernel_optimize_test/mm
Johannes Weiner def0fdae81 mm: memcontrol: fix NUMA round-robin reclaim at intermediate level
When a cgroup is reclaimed on behalf of a configured limit, reclaim
needs to round-robin through all NUMA nodes that hold pages of the memcg
in question.  However, when assembling the mask of candidate NUMA nodes,
the code only consults the *local* cgroup LRU counters, not the
recursive counters for the entire subtree.  Cgroup limits are frequently
configured against intermediate cgroups that do not have memory on their
own LRUs.  In this case, the node mask will always come up empty and
reclaim falls back to scanning only the current node.

If a cgroup subtree has some memory on one node but the processes are
bound to another node afterwards, the limit reclaim will never age or
reclaim that memory anymore.

To fix this, use the recursive LRU counts for a cgroup subtree to
determine which nodes hold memory of that cgroup.
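
The difference between the two ways of building the node mask can be sketched in a few lines of userspace C. This is only an illustration under stated assumptions: `struct toy_memcg`, `local_lru_pages()`, `recursive_lru_pages()` and `scan_nodes_mask()` are hypothetical names invented for this sketch, not the kernel's data structures or API. The sketch sums the subtree by walking children, which is a simplification and not a description of how the kernel obtains its recursive counts.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES    4   /* hypothetical NUMA node count for this sketch */
#define MAX_CHILDREN 4   /* hypothetical fan-out limit for this sketch */

/* Hypothetical stand-in for a memory cgroup; NOT struct mem_cgroup. */
struct toy_memcg {
	/* Pages sitting on this group's own LRU lists, per node. */
	unsigned long lru_pages[MAX_NODES];
	struct toy_memcg *children[MAX_CHILDREN];
	size_t nr_children;
};

/* Local count: only the pages on this group's own LRUs. */
static unsigned long local_lru_pages(const struct toy_memcg *cg, int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return cg->lru_pages[nid];
}

/* Recursive count: this group's pages plus those of its whole subtree. */
static unsigned long recursive_lru_pages(const struct toy_memcg *cg, int nid)
{
	unsigned long total = local_lru_pages(cg, nid);

	for (size_t i = 0; i < cg->nr_children; i++)
		total += recursive_lru_pages(cg->children[i], nid);
	return total;
}

/*
 * Build a bitmask of candidate nodes: bit n is set if node n holds pages.
 * With recursive == 0 only the local counters are consulted (the behaviour
 * described as broken); with recursive != 0 the subtree is counted.
 */
static unsigned int scan_nodes_mask(const struct toy_memcg *cg, int recursive)
{
	unsigned int mask = 0;

	for (int nid = 0; nid < MAX_NODES; nid++) {
		unsigned long pages = recursive ?
			recursive_lru_pages(cg, nid) : local_lru_pages(cg, nid);

		if (pages)
			mask |= 1u << nid;
	}
	return mask;
}
```

For an intermediate group with no pages of its own and a single child holding pages on node 1, the local mask is empty, while the recursive mask has bit 1 set. That is the case where limit reclaim previously fell back to the current node only.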

The code has been broken like this forever, so it doesn't seem to be a
problem in practice.  I just noticed it while reviewing the way the LRU
counters are used in general.

Link: http://lkml.kernel.org/r/20190412151507.2769-5-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 19:52:53 -07:00
kasan arm64 updates for 5.2 2019-05-06 17:54:22 -07:00
backing-dev.c
balloon_compaction.c
cleancache.c
cma_debug.c mm/cma_debug.c: fix the break condition in cma_maxchunk_get() 2019-05-14 09:47:45 -07:00
cma.c mm/cma.c: fix crash on CMA allocation if bitmap allocation fails 2019-05-14 09:47:47 -07:00
cma.h
compaction.c mm: move buddy list manipulations into helpers 2019-05-14 19:52:48 -07:00
debug_page_ref.c
debug.c mm: update references to page _refcount 2019-05-14 19:52:47 -07:00
dmapool.c
early_ioremap.c
fadvise.c
failslab.c
filemap.c mm: delete find_get_entries_tag 2019-05-14 09:47:51 -07:00
frame_vector.c
frontswap.c
gup_benchmark.c mm/gup: replace get_user_pages_longterm() with FOLL_LONGTERM 2019-05-14 09:47:45 -07:00
gup.c mm: introduce put_user_page*(), placeholder versions 2019-05-14 09:47:47 -07:00
highmem.c
hmm.c mm/mmu_notifier: convert user range->blockable to helper function 2019-05-14 09:47:49 -07:00
huge_memory.c mm/huge_memory.c: make __thp_get_unmapped_area static 2019-05-14 09:47:51 -07:00
hugetlb_cgroup.c
hugetlb.c hugetlbfs: always use address space in inode for resv_map pointer 2019-05-14 09:47:50 -07:00
hwpoison-inject.c
init-mm.c
internal.h
interval_tree.c
Kconfig mm/Kconfig: update "Memory Model" help text 2019-05-14 09:47:51 -07:00
Kconfig.debug mm: remove redundant 'default n' from Kconfig-s 2019-05-14 09:47:50 -07:00
khugepaged.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
kmemleak-test.c
kmemleak.c
ksm.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
list_lru.c
maccess.c
madvise.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
Makefile mm: shuffle initial free memory to improve memory-side-cache utilization 2019-05-14 19:52:48 -07:00
memblock.c mm: memblock: make keeping memblock memory opt-in rather than opt-out 2019-05-14 09:47:50 -07:00
memcontrol.c mm: memcontrol: fix NUMA round-robin reclaim at intermediate level 2019-05-14 19:52:53 -07:00
memfd.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
memory_hotplug.c mm: shuffle initial free memory to improve memory-side-cache utilization 2019-05-14 19:52:48 -07:00
memory-failure.c
memory.c mm: introduce new vm_map_pages() and vm_map_pages_zero() API 2019-05-14 09:47:50 -07:00
mempolicy.c
mempool.c
memtest.c
migrate.c mm/mmu_notifier: use correct mmu_notifier events for each invalidation 2019-05-14 09:47:49 -07:00
mincore.c mm/mincore.c: make mincore() more conservative 2019-05-14 19:52:48 -07:00
mlock.c
mm_init.c
mmap.c
mmu_context.c
mmu_gather.c
mmu_notifier.c mm/mmu_notifier: mmu_notifier_range_update_to_read_only() helper 2019-05-14 09:47:49 -07:00
mmzone.c
mprotect.c mm/mprotect.c: fix compilation warning because of unused 'mm' variable 2019-05-14 09:47:51 -07:00
mremap.c mm/mmu_notifier: contextual information for event triggering invalidation 2019-05-14 09:47:49 -07:00
msync.c
nommu.c mm: introduce new vm_map_pages() and vm_map_pages_zero() API 2019-05-14 09:47:50 -07:00
oom_kill.c mm/mmu_notifier: contextual information for event triggering invalidation 2019-05-14 09:47:49 -07:00
page_alloc.c mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
page_counter.c
page_ext.c
page_idle.c
page_io.c
page_isolation.c mm/page_isolation.c: remove redundant pfn_valid_within() in __first_valid_page() 2019-05-14 09:47:46 -07:00
page_owner.c
page_poison.c
page_vma_mapped.c
page-writeback.c mm/page-writeback: introduce tracepoint for wait_on_page_writeback() 2019-05-14 09:47:51 -07:00
pagewalk.c
percpu-internal.h
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2019-05-13 15:34:03 -07:00
pgtable-generic.c
process_vm_access.c
quicklist.c
readahead.c
rmap.c mm/rmap.c: use the pra.mapcount to do the check 2019-05-14 09:47:49 -07:00
rodata_test.c
shmem.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
shuffle.c mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
shuffle.h mm: maintain randomization of page free lists 2019-05-14 19:52:48 -07:00
slab_common.c
slab.c mm/slab.c: fix an infinite loop in leaks_show() 2019-05-14 09:47:45 -07:00
slab.h
slob.c slob: use slab_list instead of lru 2019-05-14 09:47:44 -07:00
slub.c mm/slub.c: update the comment about slab frozen 2019-05-14 09:47:45 -07:00
sparse-vmemmap.c
sparse.c mm/sparse.c: clean up obsolete code comment 2019-05-14 09:47:48 -07:00
swap_cgroup.c
swap_slots.c
swap_state.c mm: page cache: store only head pages in i_pages 2019-05-14 09:47:45 -07:00
swap.c mm/swap.c: __pagevec_lru_add_fn: typo fix 2019-05-14 09:47:48 -07:00
swapfile.c
truncate.c
usercopy.c
userfaultfd.c hugetlb: use same fault hash key for shared and private mappings 2019-05-14 09:47:48 -07:00
util.c mm: fix false-positive OVERCOMMIT_GUESS failures 2019-05-14 09:47:50 -07:00
vmacache.c
vmalloc.c mm/vmalloc.c: convert vmap_lazy_nr to atomic_long_t 2019-05-14 19:52:48 -07:00
vmpressure.c
vmscan.c mm: memcontrol: make cgroup stats and events query API explicitly local 2019-05-14 19:52:53 -07:00
vmstat.c
workingset.c mm: memcontrol: make cgroup stats and events query API explicitly local 2019-05-14 19:52:53 -07:00
z3fold.c mm/z3fold.c: support page migration 2019-05-14 09:47:50 -07:00
zbud.c
zpool.c
zsmalloc.c
zswap.c