2a9d3b5118
[ Upstream commit d60c4d01a98bc1942dba6e3adc02031f5519f94b ]
When running the stress-ng clone benchmark with multiple testing threads,
it was found that there were significant spinlock contention in sget_fc().
The contended spinlock was the sb_lock. It is under heavy contention
because the following code in the critcal section of sget_fc():
hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) {
if (test(old, fc))
goto share_extant_sb;
}
After testing with added instrumentation code, it was found that the
benchmark could generate thousands of ipc namespaces with the
corresponding number of entries in the mqueue's fs_supers list where the
namespaces are the key for the search. This leads to excessive time in
scanning the list for a match.
Looking back at the mqueue calling sequence leading to sget_fc():
mq_init_ns()
=> mq_create_mount()
=> fc_mount()
=> vfs_get_tree()
=> mqueue_get_tree()
=> get_tree_keyed()
=> vfs_get_super()
=> sget_fc()
Currently, mq_init_ns() is the only mqueue function that will indirectly
call mqueue_get_tree() with a newly allocated ipc namespace as the key for
searching. As a result, there will never be a match with the exising ipc
namespaces stored in the mqueue's fs_supers list.
So using get_tree_keyed() to do an existing ipc namespace search is just a
waste of time. Instead, we could use get_tree_nodev() to eliminate the
useless search. By doing so, we can greatly reduce the sb_lock hold time
and avoid the spinlock contention problem in case a large number of ipc
namespaces are present.
Of course, if the code is modified in the future to allow
mqueue_get_tree() to be called with an existing ipc namespace instead of a
new one, we will have to use get_tree_keyed() in this case.
The following stress-ng clone benchmark command was run on a 2-socket
48-core Intel system:
./stress-ng --clone 32 --verbose --oomable --metrics-brief -t 20
The "bogo ops/s" increased from 5948.45 before patch to 9137.06 after
patch. This is an increase of 54% in performance.
Link: https://lkml.kernel.org/r/20220121172315.19652-1-longman@redhat.com
Fixes:
|
||
---|---|---|
arch | ||
block | ||
certs | ||
crypto | ||
Documentation | ||
drivers | ||
fs | ||
include | ||
init | ||
ipc | ||
kernel | ||
lib | ||
LICENSES | ||
mm | ||
net | ||
samples | ||
scripts | ||
security | ||
sound | ||
tools | ||
usr | ||
virt | ||
.clang-format | ||
.cocciconfig | ||
.get_maintainer.ignore | ||
.gitattributes | ||
.gitignore | ||
.mailmap | ||
COPYING | ||
CREDITS | ||
Kbuild | ||
Kconfig | ||
MAINTAINERS | ||
Makefile | ||
README |
Linux kernel ============ There are several guides for kernel developers and users. These guides can be rendered in a number of formats, like HTML and PDF. Please read Documentation/admin-guide/README.rst first. In order to build the documentation, use ``make htmldocs`` or ``make pdfdocs``. The formatted documentation can also be read online at: https://www.kernel.org/doc/html/latest/ There are various text files in the Documentation/ subdirectory, several of them using the Restructured Text markup notation. Please read the Documentation/process/changes.rst file, as it contains the requirements for building and running the kernel, and information about the problems which may result by upgrading your kernel.