In the quest to bring io_kiocb down to 3 cachelines, this one does
the trick. Make the wait_queue_entry for the poll command come out
of kmalloc instead of embedding it in struct io_poll_iocb, as the
latter is the largest member of io_kiocb. Once we trim this down a
bit, we're back at a healthy 192 bytes for struct io_kiocb.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently we're using 40 bytes for the io_wq_work structure, and 16 of
those is the doubly link list node. We don't need doubly linked lists,
we always add to tail to keep things ordered, and any other use case
is list traversal with deletion. For the deletion case, we can easily
support any node deletion by keeping track of the previous entry.
This shrinks io_wq_work to 32 bytes, and subsequently io_kiock from
io_uring to 216 to 208 bytes.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are several things that can go wrong in the current code on NUMA
systems, especially if not all nodes are online all the time:
- If the identifiers of the online nodes do not form a single contiguous
block starting at zero, wq->wqes will be too small, and OOB memory
accesses will occur e.g. in the loop in io_wq_create().
- If a node comes online between the call to num_online_nodes() and the
for_each_node() loop in io_wq_create(), an OOB write will occur.
- If a node comes online between io_wq_create() and io_wq_enqueue(), a
lookup is performed for an element that doesn't exist, and an OOB read
will probably occur.
Fix it by:
- using nr_node_ids instead of num_online_nodes() for the allocation size;
nr_node_ids is calculated by setup_nr_node_ids() to be bigger than the
highest node ID that could possibly come online at some point, even if
those nodes' identifiers are not a contiguous block
- creating workers for all possible CPUs, not just all online ones
This is basically what the normal workqueue code also does, as far as I can
tell.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
These allocations are single-element allocations, so don't use the array
allocation wrapper for them.
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Clean io_import_fixed() call site and make it return proper type.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no point left in keeping struct sqe_submit. Inline it
into struct io_kiocb, so any req->submit.field is now just req->field
- moves initialisation of ring_file into io_get_req()
- removes duplicated req->sequence.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Timeouts' sequence offset (i.e. sqe->off) is stored in
req->submit.sequence under a false name. Keep it in timeout.data
instead. The unused space for sequence will be reclaimed in the
following patches.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Only io_uring uses (and added) these, and we want to disallow the
use of sendmsg/recvmsg for anything but regular data transfers.
Use the newly added prep helper to split the msghdr copy out from
the core function, to check for msg_control and msg_controllen
settings. If either is set, we return -EINVAL.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is in preparation for enabling the io_uring helpers for sendmsg
and recvmsg to first copy the header for validation before continuing
with the operation.
There should be no functional changes in this patch.
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This field contains a pointer to addrlen and checking to see if it's set
returns -EINVAL if the caller sets addr & addrlen pointers.
Fixes: 17f2fe35d0 ("io_uring: add support for IORING_OP_ACCEPT")
Signed-off-by: Hrvoje Zeba <zeba.hrvoje@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently pass in 4 arguments outside of the bounded size. In
preparation for adding one more argument, let's bundle them up in
a struct to make it more readable.
No functional changes in this patch.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Read/write requests to devices without implemented read/write_iter
using fixed buffers can cause general protection fault, which totally
hangs a machine.
io_import_fixed() initialises iov_iter with bvec, but loop_rw_iter()
accesses it as iovec, dereferencing random address.
kmap() page by page in this case
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This allows an application to call connect() in an async fashion. Like
other opcodes, we first try a non-blocking connect, then punt to async
context if we have to.
Note that we can still return -EINPROGRESS, and in that case the caller
should use IORING_OP_POLL_ADD to do an async wait for completion of the
connect request (just like for regular connect(2), except we can do it
async here too).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This is identical to __sys_connect(), except it takes a struct file
instead of an fd, and it also allows passing in extra file->f_flags
flags. The latter is done to support masking in O_NONBLOCK without
manipulating the original file flags.
No functional changes in this patch.
Cc: netdev@vger.kernel.org
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We return -EBUSY on submit when we have a CQ ring overflow backlog, but
that can be a bit problematic if the application is using pure userspace
poll of the CQ ring. For that case, if the ring briefly overflowed and
we have pending entries in the backlog, the submit flushes the backlog
successfully but still returns -EBUSY. If we're able to fully flush the
CQ ring backlog, let the submission proceed.
Reported-by: Dan Melnic <dmm@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass only non-null @nxt to io_issue_sqe() and handle it at the caller's
side. And propagate it.
- kiocb_done() is only called from io_read() and io_write(), which are
only called from io_issue_sqe(), so it's @nxt != NULL
- io_put_req_find_next() is called either with explicitly non-null local
nxt, or from one of the functions in io_issue_sqe() switch (or their
callees).
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
"if (nxt)" is always true, as it was checked in the while's condition.
io_wq_current_is_worker() is unnecessary, as non-async callers don't
pass nxt, so io_queue_async_work() will be called for them anyway.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Make io_req_find_next() and io_req_link_next() to accept only non-null
nxt, and handle it in callers.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is only one one-liner user of io_free_req_find_next(). Inline it.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The number of SQEs to submit is specified by a user, so io_get_sqring()
in most of the cases succeeds. Hint compilers about that.
Checking ASM genereted by gcc 9.2.0 for x64, there is one branch
misprediction.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
__io_submit_sqe() is issuing requests, so call it as
such. Moreover, it ends by calling io_iopoll_req_issued().
Rename it and make terminology clearer.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We don't have shadow requests anymore, so get rid of the shadow
argument. Add the user_data argument, as that's often useful to easily
match up requests, instead of having to look at request pointers.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There's an issue with the shadow drain logic in that we drop the
completion lock after deciding to defer a request, then re-grab it later
and assume that the state is still the same. In the mean time, someone
else completing a request could have found and issued it. This can cause
a stall in the queue, by having a shadow request inserted that nobody is
going to drain.
Additionally, if we fail allocating the shadow request, we simply ignore
the drain.
Instead of using a shadow request, defer the next request/link instead.
This also has the following advantages:
- removes semi-duplicated code
- doesn't allocate memory for shadows
- works better if only the head marked for drain
- doesn't need complex synchronisation
On the flip side, it removes the shadow->seq ==
last_drain_in_in_link->seq optimization. That shouldn't be a common
case, and can always be added back, if needed.
Fixes: 4fe2c96315 ("io_uring: add support for link with drain")
Cc: Jackie Liu <liuyun01@kylinos.cn>
Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
When we find new work to process within the work handler, we queue the
linked timeout before we have issued the new work. This can be
problematic for very short timeouts, as we have a window where the new
work isn't visible.
Allow the work handler to store a callback function for this in the work
item, and flag it with IO_WQ_WORK_CB if the caller has done so. If that
is set, then io-wq will call the callback when it has setup the new work
item.
Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently try and start the next link when we put the request, and
only if we were going to free it. This means that the optimization to
continue executing requests from the same context often fails, as we're
not putting the final reference.
Add REQ_F_LINK_NEXT to keep track of this, and allow io_uring to find the
next request more efficiently.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently rely on the ring destroy on cleaning things up in case of
failure, but io_allocate_scq_urings() can leave things half initialized
if only parts of it fails.
Be nice and return with either everything setup in success, or return an
error with things nicely cleaned up.
Reported-by: syzbot+0d818c0d39399188f393@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Always mark requests with allocated sqe and deallocate it in
__io_free_req(). It's easier to follow and doesn't add edge cases.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently clear the linked timeout field if we cancel such a timeout,
but we should only attempt to cancel if it's the first one we see.
Others should simply be freed like other requests, as they haven't
been started yet.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
let have a dependant link: REQ -> LINK_TIMEOUT -> LINK_TIMEOUT
1. submission stage: submission references for REQ and LINK_TIMEOUT
are dropped. So, references respectively (1,1,2)
2. io_put(REQ) + FAIL_LINKS stage: calls io_fail_links(), which for all
linked timeouts will call cancel_timeout() and drop 1 reference.
So, references after: (0,0,1). That's a leak.
Make it treat only the first linked timeout as such, and pass others
through __io_double_put_req().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass any IORING_OP_LINK_TIMEOUT request further, where it will
eventually fail in io_issue_sqe().
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If io_req_defer() failed, it needs to cancel a dependant link.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently have a race where if setup is really slow, we can be
calling io_wq_destroy() before we're done setting up. This will cause
the caller to get stuck waiting for the manager to set things up, but
the manager already exited.
Fix this by doing a sync setup of the manager. This also fixes the case
where if we failed creating workers, we'd also get stuck.
In practice this race window was really small, as we already wait for
the manager to start. Hence someone would have to call io_wq_destroy()
after the task has started, but before it started the first loop. The
reported test case forked tons of these, which is why it became an
issue.
Reported-by: syzbot+0f1cc17f85154f400465@syzkaller.appspotmail.com
Fixes: 771b53d033 ("io-wq: small threadpool implementation for io_uring")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently don't explicitly break links if a request is cancelled, but
we should. Add explicitly link breakage for all types of request
cancellations that we support.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Currently a poll request fills a completion entry of 0, even if it got
cancelled. This is odd, and it makes it harder to support with chains.
Ensure that it returns -ECANCELED in the completions events if it got
cancelled, and furthermore ensure that the linked timeout that triggered
it completes with -ETIME if we did indeed trigger the completions
through a timeout.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
With the conversion to io-wq, we no longer use that flag. Kill it.
Fixes: 561fb04a6a ("io_uring: replace workqueue usage with io-wq")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We have an issue with timeout links that are deeper in the submit chain,
because we only handle it upfront, not from later submissions. Move the
prep + issue of the timeout link to the async work prep handler, and do
it normally for non-async queue. If we validate and prepare the timeout
links upfront when we first see them, there's nothing stopping us from
supporting any sort of nesting.
Fixes: 2665abfd75 ("io_uring: add support for linked SQE timeouts")
Reported-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are a few reasons for this:
- As a prep to improving the linked timeout logic
- io_timeout is the biggest member in the io_kiocb opcode union
This also enables a few cleanups, like unifying the timer setup between
IORING_OP_TIMEOUT and IORING_OP_LINK_TIMEOUT, and not needing multiple
arguments to the link/prep helpers.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we don't use the normal completion path, we may skip killing links
that should be errored and freed. Add __io_double_put_req() for use
within the completion path itself, other calls should just use
io_double_put_req().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
__io_queue_sqe(), io_queue_sqe(), io_queue_link_head() all return 0/err,
but the caller doesn't care since the errors are handled inline. Clean
these up and just make them void.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If we have a linked request, this enables us to pass it back directly
without having to go through async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
* Part one of substantial EDAC core + ghes_edac driver cleanup (Robert Richter)
* Print additional useful logging information in skx_* (Tony Luck)
* Improve amd64_edac hw detection + cleanups (Yazen Ghannam)
* Misc cleanups, fixes and code improvements
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl3bf/MACgkQEsHwGGHe
VUqRwBAAhyg0jxp6pKJQe1cUvX+Qv00/x2DBAy66nRQAwC8YG1pzJWtGr3YPWX0t
+sBDEjE2Pqnh0EZOvUaGhRDCPOWBCMPrx53+KlNMw++hFy4rfrIFMUOVFRKabSEC
zYwVbzC7tgLkyeqNccPQ2qmsG7L9H44zAR8HkH4c4IrWWgDtIlp5BSoGpiPWqszv
LENuKvp+B2ll35oa9yfusqnudQzjdDn5wESigaEpjkPD7I+bHrWhUaE0xWGlIxbd
hZUhh1tL31gBfQoumEADNOSZEHJBa9xsJrX9dJR07UIrgp5yn1PVhm+gGti5Np9v
f/+3vtwTQGWJlYxJnvitvm4EGavgB/ihHpXiDwfGXWiwU/PqG3uEGIS3Lucu3PRn
ZVv/4jCf2rxo+ECFOd7nj8WID2Mu1ZEKPQkIlrCx0A7v1FBaCXB9itUUL7GhX37W
HaqBPoHa/eJDOlyYWgVXfdQQJ/gqMhfHCedMTL1WLNkHwbQU05TQoFDBd8r/J2Fb
VkwkTS4fYTRyl+pbHX6Kk1bymyorTKp9jyxgoBdzlwLegqksmctxADZx/RJfM5WT
5CbNVsK9XVWYDiGRleMpbNveSgWGod+ul32BLp1rOLPOVXjZFDNawDCygtUjtHdH
pXvoP1wIRupq1o745uoXQ17WvjYFR9cjVRGcQLZjW3EUng2fVrw=
=ue24
-----END PGP SIGNATURE-----
Merge tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC updates from Borislav Petkov:
"A lot of changes this time around, details below.
From the next cycle onwards, we'll switch the EDAC tree to topic
branches (instead of a single edac-for-next branch) which should make
the changes handling more flexible, hopefully. We'll see.
Summary:
- Rework error logging functions to accept a count of errors
parameter (Hanna Hawa)
- Part one of substantial EDAC core + ghes_edac driver cleanup
(Robert Richter)
- Print additional useful logging information in skx_* (Tony Luck)
- Improve amd64_edac hw detection + cleanups (Yazen Ghannam)
- Misc cleanups, fixes and code improvements"
* tag 'edac_for_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (35 commits)
EDAC/altera: Use the Altera System Manager driver
EDAC/altera: Cleanup the ECC Manager
EDAC/altera: Use fast register IO for S10 IRQs
EDAC/ghes: Do not warn when incrementing refcount on 0
EDAC/Documentation: Describe CPER module definition and DIMM ranks
EDAC: Unify the mc_event tracepoint call
EDAC/ghes: Remove intermediate buffer pvt->detail_location
EDAC/ghes: Fix grain calculation
EDAC/ghes: Use standard kernel macros for page calculations
EDAC: Remove misleading comment in struct edac_raw_error_desc
EDAC/mc: Reduce indentation level in edac_mc_handle_error()
EDAC/mc: Remove needless zero string termination
EDAC/mc: Do not BUG_ON() in edac_mc_alloc()
EDAC: Introduce an mci_for_each_dimm() iterator
EDAC: Remove EDAC_DIMM_OFF() macro
EDAC: Replace EDAC_DIMM_PTR() macro with edac_get_dimm() function
EDAC/amd64: Get rid of the ECC disabled long message
EDAC/ghes: Fix locking and memory barrier issues
EDAC/amd64: Check for memory before fully initializing an instance
EDAC/amd64: Use cached data when checking for ECC
...
- Data abort report and injection
- Steal time support
- GICv4 performance improvements
- vgic ITS emulation fixes
- Simplify FWB handling
- Enable halt polling counters
- Make the emulated timer PREEMPT_RT compliant
s390:
- Small fixes and cleanups
- selftest improvements
- yield improvements
PPC:
- Add capability to tell userspace whether we can single-step the guest.
- Improve the allocation of XIVE virtual processor IDs
- Rewrite interrupt synthesis code to deliver interrupts in virtual
mode when appropriate.
- Minor cleanups and improvements.
x86:
- XSAVES support for AMD
- more accurate report of nested guest TSC to the nested hypervisor
- retpoline optimizations
- support for nested 5-level page tables
- PMU virtualization optimizations, and improved support for nested
PMU virtualization
- correct latching of INITs for nested virtualization
- IOAPIC optimization
- TSX_CTRL virtualization for more TAA happiness
- improved allocation and flushing of SEV ASIDs
- many bugfixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJd27PMAAoJEL/70l94x66DspsH+gPc6YWtKJFJH58Zj8NrNh6y
t0FwDFcvUa51+m4jaY4L5Y8+zqu1dZFnPPhFGqNWpxrjCEvE/glQJv3BiUX06Seh
aYUHNymGoYCTJOHaaGhV+NlgQaDuZOCOkIsOLAPehyFd1KojwB+FRC0xmO6aROPw
9yQgYrKuK1UUn5HwxBNrMS4+Xv+2iKv/9sTnq1G4W2qX2NZQg84LVPg1zIdkCh3D
3GOvoCBEk3ivQqjmdE7rP/InPr0XvW0b6TFhchIk8J6jEIQFHsmOUefiTvTxsIHV
OKAZwvyeYPrYHA/aDZpaBmY2aR0ydfKDUQcviNIJoF1vOktGs0hvl3VbsmG8QCg=
=OSI1
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- data abort report and injection
- steal time support
- GICv4 performance improvements
- vgic ITS emulation fixes
- simplify FWB handling
- enable halt polling counters
- make the emulated timer PREEMPT_RT compliant
s390:
- small fixes and cleanups
- selftest improvements
- yield improvements
PPC:
- add capability to tell userspace whether we can single-step the
guest
- improve the allocation of XIVE virtual processor IDs
- rewrite interrupt synthesis code to deliver interrupts in virtual
mode when appropriate.
- minor cleanups and improvements.
x86:
- XSAVES support for AMD
- more accurate report of nested guest TSC to the nested hypervisor
- retpoline optimizations
- support for nested 5-level page tables
- PMU virtualization optimizations, and improved support for nested
PMU virtualization
- correct latching of INITs for nested virtualization
- IOAPIC optimization
- TSX_CTRL virtualization for more TAA happiness
- improved allocation and flushing of SEV ASIDs
- many bugfixes and cleanups"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
kvm: nVMX: Relax guest IA32_FEATURE_CONTROL constraints
KVM: x86: Grab KVM's srcu lock when setting nested state
KVM: x86: Open code shared_msr_update() in its only caller
KVM: Fix jump label out_free_* in kvm_init()
KVM: x86: Remove a spurious export of a static function
KVM: x86: create mmu/ subdirectory
KVM: nVMX: Remove unnecessary TLB flushes on L1<->L2 switches when L1 use apic-access-page
KVM: x86: remove set but not used variable 'called'
KVM: nVMX: Do not mark vmcs02->apic_access_page as dirty when unpinning
KVM: vmx: use MSR_IA32_TSX_CTRL to hard-disable TSX on guest that lack it
KVM: vmx: implement MSR_IA32_TSX_CTRL disable RTM functionality
KVM: x86: implement MSR_IA32_TSX_CTRL effect on CPUID
KVM: x86: do not modify masked bits of shared MSRs
KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES
KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path
KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
KVM: nVMX: Assume TLB entries of L1 and L2 are tagged differently if L0 use EPT
KVM: x86: Unexport kvm_vcpu_reload_apic_access_page()
KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1
KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization
...
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCXdtnVQAKCRCAXGG7T9hj
vg1hAQDqG1DKZvR6BtlvETFMz7ZlrXVkpm6C74Wy4bLiO5KSSAEAneFbrDwFVa0c
d05Z6wemjlyEd7u3gkVQBKfHkbWBRQQ=
=aDIL
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- a small series to remove the build constraint of Xen x86 MCE handling
to 64-bit only
- a bunch of minor cleanups
* tag 'for-linus-5.5a-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: Fix Kconfig indentation
xen/mcelog: also allow building for 32-bit kernels
xen/mcelog: add PPIN to record when available
xen/mcelog: drop __MC_MSR_MCGCAP
xen/gntdev: Use select for DMA_SHARED_BUFFER
xen: mm: make xen_mm_init static
xen: mm: include <xen/xen-ops.h> for missing declarations
- Atomics-related code sees some rework & cleanup, most notably allowing
Loongson LL/SC errata workarounds to be more bulletproof & their
correctness to be checked at build time.
- Command line setup code is simplified somewhat, resolving various
corner cases.
- MIPS kernels can now be built with kcov code coverage support.
- We can now build with CONFIG_FORTIFY_SOURCE=y.
- Miscellaneous cleanups.
And some platform specific changes:
- We now disable some broken TLB functionality on certain Ingenic
systems, and JZ4780 systems gain some devicetree nodes to support
more devices.
- Loongson support sees a number of cleanups, and we gain initial
support for Loongson 3A R4 systems.
- We gain support for MediaTek MT7688-based GARDENA Smart Gateway
systems.
- SGI IP27 (Origin 2*) see a number of fixes, cleanups &
simplifications.
- SGI IP30 (Octane) systems are now supported.
-----BEGIN PGP SIGNATURE-----
iIwEABYIADQWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCXdwj2RYccGF1bGJ1cnRv
bkBrZXJuZWwub3JnAAoJED6nn6y1dQDd54QA/2CrWLcWCcWVwN8XwLTh3gWf8/k7
d19Ttd0bYrsBUnaHAP9s9kc9RFOAhB3p1G0dsMZqI0YX7emLGPgW3ejJ7CXNCg==
=/Hsa
-----END PGP SIGNATURE-----
Merge tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Paul Burton:
"The main MIPS changes for 5.5:
- Atomics-related code sees some rework & cleanup, most notably
allowing Loongson LL/SC errata workarounds to be more bulletproof &
their correctness to be checked at build time.
- Command line setup code is simplified somewhat, resolving various
corner cases.
- MIPS kernels can now be built with kcov code coverage support.
- We can now build with CONFIG_FORTIFY_SOURCE=y.
- Miscellaneous cleanups.
And some platform specific changes:
- We now disable some broken TLB functionality on certain Ingenic
systems, and JZ4780 systems gain some devicetree nodes to support
more devices.
- Loongson support sees a number of cleanups, and we gain initial
support for Loongson 3A R4 systems.
- We gain support for MediaTek MT7688-based GARDENA Smart Gateway
systems.
- SGI IP27 (Origin 2*) see a number of fixes, cleanups &
simplifications.
- SGI IP30 (Octane) systems are now supported"
* tag 'mips_5.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (107 commits)
MIPS: SGI-IP27: Enable ethernet phy on second Origin 200 module
MIPS: PCI: Fix fake subdevice ID for IOC3
MIPS: Ingenic: Disable abandoned HPTLB function.
MIPS: PCI: remember nasid changed by set interrupt affinity
MIPS: SGI-IP27: Fix crash, when CPUs are disabled via nr_cpus parameter
mips: add support for folded p4d page tables
mips: drop __pXd_offset() macros that duplicate pXd_index() ones
mips: fix build when "48 bits virtual memory" is enabled
MIPS: math-emu: Reuse name array in debugfs_fpuemu()
MIPS: allow building with kcov coverage
MIPS: Loongson64: Drop setup_pcimap
MIPS: Loongson2ef: Convert to early_printk_8250
MIPS: Drop CPU_SUPPORTS_UNCACHED_ACCELERATED
MIPS: Loongson{2ef, 32, 64} convert to generic fw cmdline
MIPS: Drop pmon.h
MIPS: Loongson: Unify LOONGSON3/LOONGSON64 Kconfig usage
MIPS: Loongson: Rename LOONGSON1 to LOONGSON32
MIPS: Loongson: Fix return value of loongson_hwmon_init
MIPS: add support for SGI Octane (IP30)
MIPS: PCI: make phys_to_dma/dma_to_phys for pci-xtalk-bridge common
...
- Atari Falcon IDE platform driver conversion for module autoload,
- Defconfig updates (incl. enablement of Amiga ICY I2C),
- Small fixes and cleanups.
-----BEGIN PGP SIGNATURE-----
iIsEABYIADMWIQQ9qaHoIs/1I4cXmEiKwlD9ZEnxcAUCXdufABUcZ2VlcnRAbGlu
dXgtbTY4ay5vcmcACgkQisJQ/WRJ8XAhxAEAp1MN6cEMFzE2NmulbSLXSb6ceLnB
4eRvmVi5HJACqXoBAMexfc0YUl5BB6LMi+7HRetd46QCSJ7TWko0XE9NiBsP
=CRVE
-----END PGP SIGNATURE-----
Merge tag 'm68k-for-v5.5-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- Atari Falcon IDE platform driver conversion for module autoload
- defconfig updates (including enablement of Amiga ICY I2C)
- small fixes and cleanups
* tag 'm68k-for-v5.5-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k/atari: Convert Falcon IDE drivers to platform drivers
m68k: defconfig: Enable ICY I2C and LTC2990 on Amiga
m68k: defconfig: Update defconfigs for v5.4-rc1
m68k: q40: Fix info-leak in rtc_ioctl
nubus: Remove cast to void pointer
Pull RAS updates from Borislav Petkov:
- Fully reworked thermal throttling notifications, there should be no
more spamming of dmesg (Srinivas Pandruvada and Benjamin Berg)
- More enablement for the Intel-compatible CPUs Zhaoxin (Tony W
Wang-oc)
- PPIN support for Icelake (Tony Luck)
* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce/therm_throt: Optimize notifications of thermal throttle
x86/mce: Add Xeon Icelake to list of CPUs that support PPIN
x86/mce: Lower throttling MCE messages' priority to warning
x86/mce: Add Zhaoxin LMCE support
x86/mce: Add Zhaoxin CMCI support
x86/mce: Add Zhaoxin MCE support
x86/mce/amd: Make disable_err_thresholding() static
Pull x86 microcode updates from Borislav Petkov:
"This converts the late loading method to load the microcode in
parallel (vs sequentially currently). The patch remained in linux-next
for the maximum amount of time so that any potential and hard to debug
fallout be minimized.
Now cloud folks have their milliseconds back but all the normal people
should use early loading anyway :-)"
* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/intel: Issue the revision updated message only on the BSP
x86/microcode: Update late microcode in parallel
x86/microcode/amd: Fix two -Wunused-but-set-variable warnings
- Adjust PMU device drivers registration to avoid WARN_ON and few other
perf improvements.
- Enhance tracing in vfio-ccw.
- Few stack unwinder fixes and improvements, convert get_wchan custom
stack unwinding to generic api usage.
- Fixes for mm helpers issues uncovered with tests validating architecture
page table helpers.
- Fix noexec bit handling when hardware doesn't support it.
- Fix memleak and unsigned value compared with zero bugs in crypto
code. Minor code simplification.
- Fix crash during kdump with kasan enabled kernel.
- Switch bug and alternatives from asm to asm_inline to improve inlining
decisions.
- Use 'depends on cc-option' for MARCH and TUNE options in Kconfig,
add z13s and z14 ZR1 to TUNE descriptions.
- Minor head64.S simplification.
- Fix physical to logical CPU map for SMT.
- Several cleanups in qdio code.
- Other minor cleanups and fixes all over the code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl3ahGYACgkQjYWKoQLX
FBguwAgAig+FNos8zkd7Sr2wg4DPL2IlYERVP40fOLXfGuVUOnMLg8OTO6yDWDpH
5+cKAQS1wWgyvlfjWRUJ6anXLBAsgKRD1nyFIZTpn/wArGk/duCbnl/VFriDgrST
8KTQDJpZ9w9nXtQ7lA2QWaw5U2WG8I2T2JuQJCdLXze7RXi0bDVe8e6131NMaJ42
LLxqOqm8d8XDnd8oDVP04LT5IfhuI2cILoGBP/GyI2fqQk9Ems6M2gxuISq1COmy
WORDLfwWyCLeF7gWKKjxf8Vo1HYcyoFvdXnxWiHb0TDZesQZJr/LLELTP03fbCW9
U4jbXncnnPA7kT4tlC95jT5M69yK5w==
=+FxG
-----END PGP SIGNATURE-----
Merge tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Vasily Gorbik:
- Adjust PMU device drivers registration to avoid WARN_ON and few other
perf improvements.
- Enhance tracing in vfio-ccw.
- Few stack unwinder fixes and improvements, convert get_wchan custom
stack unwinding to generic api usage.
- Fixes for mm helpers issues uncovered with tests validating
architecture page table helpers.
- Fix noexec bit handling when hardware doesn't support it.
- Fix memleak and unsigned value compared with zero bugs in crypto
code. Minor code simplification.
- Fix crash during kdump with kasan enabled kernel.
- Switch bug and alternatives from asm to asm_inline to improve
inlining decisions.
- Use 'depends on cc-option' for MARCH and TUNE options in Kconfig, add
z13s and z14 ZR1 to TUNE descriptions.
- Minor head64.S simplification.
- Fix physical to logical CPU map for SMT.
- Several cleanups in qdio code.
- Other minor cleanups and fixes all over the code.
* tag 's390-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (41 commits)
s390/cpumf: Adjust registration of s390 PMU device drivers
s390/smp: fix physical to logical CPU map for SMT
s390/early: move access registers setup in C code
s390/head64: remove unnecessary vdso_per_cpu_data setup
s390/early: move control registers setup in C code
s390/kasan: support memcpy_real with TRACE_IRQFLAGS
s390/crypto: Fix unsigned variable compared with zero
s390/pkey: use memdup_user() to simplify code
s390/pkey: fix memory leak within _copy_apqns_from_user()
s390/disassembler: don't hide instruction addresses
s390/cpum_sf: Assign error value to err variable
s390/cpum_sf: Replace function name in debug statements
s390/cpum_sf: Use consistant debug print format for sampling
s390/unwind: drop unnecessary code around calling ftrace_graph_ret_addr()
s390: add error handling to perf_callchain_kernel
s390: always inline current_stack_pointer()
s390/mm: add mm_pxd_folded() checks to pxd_free()
s390/mm: properly clear _PAGE_NOEXEC bit when it is not supported
s390/mm: simplify page table helpers for large entries
s390/mm: make pmd/pud_bad() report large entries as bad
...