John Garry 32bc15afed blk-mq: Facilitate a shared sbitmap per tagset
Some SCSI HBAs (such as HPSA, megaraid, mpt3sas, hisi_sas_v3 ..) support
multiple reply queues with single hostwide tags.

In addition, these drivers want to use interrupt assignment in
pci_alloc_irq_vectors(PCI_IRQ_AFFINITY). However, as discussed in [0],
CPU hotplug may cause in-flight IO completion to not be serviced when an
interrupt is shut down. That problem is solved in commit bf0beec060
("blk-mq: drain I/O when all CPUs in a hctx are offline").

However, to take advantage of that blk-mq feature, the HBA HW queues are
required to be mapped to the blk-mq hctxs; to do that, the HBA HW queues
need to be exposed to the upper layer.

In making that transition, the per-SCSI command request tags are no
longer unique per Scsi host - they are just unique per hctx. As such, the
HBA LLDD would have to generate this tag internally, which has a certain
performance overhead.

However, another problem is that blk-mq assumes the host may accept
(Scsi_host.can_queue * #hw queue) commands. In commit 6eb045e092 ("scsi:
core: avoid host-wide host_busy counter for scsi_mq"), the Scsi host busy
counter, which would stop the LLDD being sent more than .can_queue
commands, was removed; however, it should still be ensured that the block
layer does not issue more than .can_queue commands to the Scsi host.
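
The arithmetic behind this concern can be sketched in userspace C. This is
only an illustration: max_inflight() is a hypothetical helper, not a kernel
function, and it assumes every hctx is given a tag space of depth
.can_queue.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Upper bound on the commands the block layer could have outstanding
 * against one host. Hypothetical helper for illustration only.
 */
unsigned int max_inflight(unsigned int can_queue,
                          unsigned int nr_hw_queues,
                          bool tags_shared)
{
    assert(nr_hw_queues > 0);

    /* Per-hctx tags: each hctx owns a separate can_queue-deep tag space. */
    if (!tags_shared)
        return can_queue * nr_hw_queues;

    /* Hostwide tags: all hctxs draw from one can_queue-deep tag space. */
    return can_queue;
}
```

For example, with can_queue = 1024 and 16 HW queues, per-hctx tags would
allow up to 16384 outstanding commands, while hostwide tags cap it at 1024.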

To solve this problem, introduce a shared sbitmap per blk_mq_tag_set,
which may be requested at init time.

New flag BLK_MQ_F_TAG_HCTX_SHARED should be set when requesting the
tagset to indicate that the shared sbitmap should be used.
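
The effect of the flag can be pictured with a toy userspace model. This is a
sketch under stated assumptions, not the kernel implementation: all toy_*
names and limits are hypothetical, a plain bool array stands in for the
sbitmap, and the hctx_shared field stands in for BLK_MQ_F_TAG_HCTX_SHARED.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_MAX_DEPTH 64 /* illustrative limits, not kernel constants */
#define TOY_MAX_HCTX  8

struct toy_tag_map {
    bool used[TOY_MAX_DEPTH];
    unsigned int depth;
};

struct toy_tag_set {
    struct toy_tag_map shared;                 /* used when hctx_shared */
    struct toy_tag_map per_hctx[TOY_MAX_HCTX]; /* used otherwise */
    unsigned int nr_hw_queues;
    bool hctx_shared; /* stands in for BLK_MQ_F_TAG_HCTX_SHARED */
};

void toy_tag_set_init(struct toy_tag_set *set, unsigned int nr_hw_queues,
                      unsigned int depth, bool hctx_shared)
{
    assert(nr_hw_queues >= 1 && nr_hw_queues <= TOY_MAX_HCTX);
    assert(depth >= 1 && depth <= TOY_MAX_DEPTH);

    memset(set, 0, sizeof(*set));
    set->nr_hw_queues = nr_hw_queues;
    set->hctx_shared = hctx_shared;
    set->shared.depth = depth;
    for (unsigned int i = 0; i < nr_hw_queues; i++)
        set->per_hctx[i].depth = depth;
}

/* Pick the tag space that backs allocations for the given hctx. */
static struct toy_tag_map *toy_map(struct toy_tag_set *set, unsigned int hctx)
{
    assert(hctx < set->nr_hw_queues);
    return set->hctx_shared ? &set->shared : &set->per_hctx[hctx];
}

/* Returns the lowest free tag, or -1 if the tag space is exhausted. */
int toy_get_tag(struct toy_tag_set *set, unsigned int hctx)
{
    struct toy_tag_map *map = toy_map(set, hctx);

    for (unsigned int tag = 0; tag < map->depth; tag++) {
        if (!map->used[tag]) {
            map->used[tag] = true;
            return (int)tag;
        }
    }
    return -1;
}

void toy_put_tag(struct toy_tag_set *set, unsigned int hctx, int tag)
{
    struct toy_tag_map *map = toy_map(set, hctx);

    assert(tag >= 0 && (unsigned int)tag < map->depth);
    map->used[tag] = false;
}
```

In this model, with the flag set, a tag taken through one hctx is
unavailable to every other hctx, so tags are unique across the whole set and
the total outstanding never exceeds the depth. Without it, two hctxs may
hand out the same tag value at the same time.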

Even when BLK_MQ_F_TAG_HCTX_SHARED is set, a full set of tags and requests
is still allocated per hctx; the reason for this is that if tags and
requests were only allocated for a single hctx - like hctx0 - it may break
block drivers which expect a request to be associated with a specific hctx,
i.e. not always hctx0. This will introduce extra memory usage.

This change is based on work originally from Ming Lei in [1] and from
Bart's suggestion in [2].

[0] https://lore.kernel.org/linux-block/alpine.DEB.2.21.1904051331270.1802@nanos.tec.linutronix.de/
[1] https://lore.kernel.org/linux-block/20190531022801.10003-1-ming.lei@redhat.com/
[2] https://lore.kernel.org/linux-block/ff77beff-5fd9-9f05-12b6-826922bace1f@huawei.com/T/#m3db0a602f095cbcbff27e9c884d6b4ae826144be

Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Don Brace <don.brace@microsemi.com> #SCSI resv cmds patches used
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-03 15:20:47 -06:00
arch Three interrupt related fixes for X86: 2020-08-30 12:01:23 -07:00
block blk-mq: Facilitate a shared sbitmap per tagset 2020-09-03 15:20:47 -06:00
certs
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2020-08-30 15:53:44 -07:00
Documentation Documentation/filesystems/locking.rst: remove an incorrect sentence 2020-09-02 07:59:59 -06:00
drivers blk-mq: Rename BLK_MQ_F_TAG_SHARED as BLK_MQ_F_TAG_QUEUE_SHARED 2020-09-03 15:20:46 -06:00
fs block: remove revalidate_disk() 2020-09-02 08:00:26 -06:00
include blk-mq: Facilitate a shared sbitmap per tagset 2020-09-03 15:20:47 -06:00
init OpenRISC updates for 5.9 2020-08-14 14:04:53 -07:00
ipc treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
kernel Three interrupt related fixes for X86: 2020-08-30 12:01:23 -07:00
lib lib: Revert use of fallthrough pseudo-keyword in lib/ 2020-08-24 14:17:44 -07:00
LICENSES
mm powerpc fixes for 5.9 #4 2020-08-30 10:56:12 -07:00
net Fixes: 2020-08-25 18:01:36 -07:00
samples treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
scripts kconfig: qconf: replace deprecated QString::sprintf() with QTextStream 2020-08-21 10:23:38 +09:00
security treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
sound treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
tools blk-iocost: update iocost_monitor.py 2020-09-01 19:38:33 -06:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt * PAE and PKU bugfixes for x86 2020-08-22 10:03:05 -07:00
.clang-format
.cocciconfig
.get_maintainer.ignore
.gitattributes
.gitignore .gitignore: Add ZSTD-compressed files 2020-07-31 11:50:49 +02:00
.mailmap Merge branch 'akpm' (patches from Andrew) 2020-08-21 14:44:48 -07:00
COPYING
CREDITS CREDITS: Replace HTTP links with HTTPS ones 2020-07-23 14:53:58 -06:00
Kbuild
Kconfig
MAINTAINERS Three interrupt related fixes for X86: 2020-08-30 12:01:23 -07:00
Makefile Linux 5.9-rc3 2020-08-30 16:01:54 -07:00
README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the reStructuredText markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result from upgrading your kernel.