Go to file
Mel Gorman 670105a256 mm: compaction: avoid 100% CPU usage during compaction when a task is killed
"howaboutsynergy" reported via kernel buzilla number 204165 that
compact_zone_order was consuming 100% CPU during a stress test for
prolonged periods of time.  Specifically the following command, which
should exit in 10 seconds, was taking an excessive time to finish while
the CPU was pegged at 100%.

  stress -m 220 --vm-bytes 1000000000 --timeout 10

Tracing indicated a pattern as follows

          stress-3923  [007]   519.106208: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106212: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106216: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106219: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106223: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106227: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106231: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106235: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106238: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0
          stress-3923  [007]   519.106242: mm_compaction_isolate_migratepages: range=(0x70bb80 ~ 0x70bb80) nr_scanned=0 nr_taken=0

Note that compaction is entered in rapid succession while scanning and
isolating nothing.  The problem is that when a task that is compacting
receives a fatal signal, it retries indefinitely instead of exiting
while making no progress as a fatal signal is pending.

It's not easy to trigger this condition although enabling zswap helps on
the basis that the timing is altered.  A very small window has to be hit
for the problem to occur (signal delivered while compacting and
isolating a PFN for migration that is not aligned to SWAP_CLUSTER_MAX).

This was reproduced locally -- 16G single socket system, 8G swap, 30%
zswap configured, vm-bytes 22000000000 using Colin Kings stress-ng
implementation from github running in a loop until the problem hits).
Tracing recorded the problem occurring almost 200K times in a short
window.  With this patch, the problem hit 4 times but the task existed
normally instead of consuming CPU.

This problem has existed for some time but it was made worse by commit
cf66f0700c ("mm, compaction: do not consider a need to reschedule as
contention").  Before that commit, if the same condition was hit then
locks would be quickly contended and compaction would exit that way.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204165
Link: http://lkml.kernel.org/r/20190718085708.GE24383@techsingularity.net
Fixes: cf66f0700c ("mm, compaction: do not consider a need to reschedule as contention")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>	[5.1+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-08-03 07:02:00 -07:00
arch add swiotlb support to arm 2019-08-02 08:44:33 -07:00
block for-linus-20190726-2 2019-07-26 19:20:34 -07:00
certs Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
crypto USB / PHY patches for 5.3-rc1 2019-07-11 15:40:06 -07:00
Documentation HMM patches for 5.3-rc 2019-07-30 12:54:44 -07:00
drivers Power management fix for 5.3-rc3 2019-08-02 08:55:28 -07:00
fs ocfs2: remove set but not used variable 'last_hash' 2019-08-03 07:02:00 -07:00
include A few fixes for code that came in during the merge window or 2019-08-02 08:47:28 -07:00
init Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
ipc Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
kernel kernel/signal.c: fix a kernel-doc markup 2019-08-03 07:02:00 -07:00
lib test_firmware: fix a memory leak bug 2019-07-25 14:39:52 +02:00
LICENSES LICENSES: Rename other to deprecated 2019-05-03 06:34:32 -06:00
mm mm: compaction: avoid 100% CPU usage during compaction when a task is killed 2019-08-03 07:02:00 -07:00
net tcp: be more careful in tcp_fragment() 2019-07-21 20:41:24 -07:00
samples treewide: remove SPDX "WITH Linux-syscall-note" from kernel-space headers again 2019-07-25 11:05:10 +02:00
scripts Kbuild fixes for v5.3 2019-07-28 10:35:04 -07:00
security meminit fix 2019-07-28 12:33:15 -07:00
sound sound fixes for 5.3-rc3 2019-08-02 08:53:34 -07:00
tools for-linus-20190730 2019-07-30 13:25:14 -07:00
usr kbuild: enable arch/s390/include/uapi/asm/zcrypt.h for uapi header test 2019-07-23 10:45:46 +02:00
virt Documentation: move Documentation/virtual to Documentation/virt 2019-07-24 10:52:11 +02:00
.clang-format Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-04-17 11:26:25 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes
.gitignore .gitignore: Add compilation database file 2019-07-27 12:18:19 +09:00
.mailmap MAINTAINERS: Update my email address 2019-07-22 14:57:50 +01:00
COPYING
CREDITS Remove references to dead website. 2019-07-19 12:22:04 -07:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig docs: kbuild: convert docs to ReST and rename to *.rst 2019-06-14 14:21:21 -06:00
MAINTAINERS Char/Misc driver fixes for 5.3-rc2 2019-07-28 10:26:10 -07:00
Makefile Linux 5.3-rc2 2019-07-28 12:47:02 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.