Commit Graph

643269 Commits

Author SHA1 Message Date
Petr Mladek
34aaff40b4 kdb: call vkdb_printf() from vprintk_default() only when wanted
kdb_trap_printk allows to pass normal printk() messages to kdb via
vkdb_printk().  For example, it is used to get backtrace using the
classic show_stack(), see kdb_show_stack().

vkdb_printf() tries to avoid a potential infinite loop by disabling the
trap.  But this approach is racy, for example:

CPU1					CPU2

vkdb_printf()
  // assume that kdb_trap_printk == 0
  saved_trap_printk = kdb_trap_printk;
  kdb_trap_printk = 0;

					kdb_show_stack()
					  kdb_trap_printk++;

Problem1: Now, a nested printk() on CPU0 calls vkdb_printf()
	  even when it should have been disabled. It will not
	  cause a deadlock but...

   // using the outdated saved value: 0
   kdb_trap_printk = saved_trap_printk;

					  kdb_trap_printk--;

Problem2: Now, kdb_trap_printk == -1 and will stay like this.
   It means that all messages will get passed to kdb from
   now on.

This patch removes the racy saved_trap_printk handling.  Instead, the
recursion is prevented by a check for the locked CPU.

The solution is still kind of racy.  A non-related printk(), from
another process, might get trapped by vkdb_printf().  And the wanted
printk() might not get trapped because kdb_printf_cpu is assigned.  But
this problem existed even with the original code.

A proper solution would be to get_cpu() before setting kdb_trap_printk
and trap messages only from this CPU.  I am not sure if it is worth the
effort, though.

In fact, the race is very theoretical.  When kdb is running any of the
commands that use kdb_trap_printk there is a single active CPU and the
other CPUs should be in a holding pen inside kgdb_cpu_enter().

The only time this is violated is when there is a timeout waiting for
the other CPUs to report to the holding pen.

Finally, note that the situation is a bit schizophrenic.  vkdb_printf()
explicitly allows recursion but only from KDB code that calls
kdb_printf() directly.  On the other hand, the generic printk()
recursion is not allowed because it might cause an infinite loop.  This
is why we could not hide the decision inside vkdb_printf() easily.

Link: http://lkml.kernel.org/r/1480412276-16690-4-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Petr Mladek
d5d8d3d0d4 kdb: properly synchronize vkdb_printf() calls with other CPUs
kdb_printf_lock does not prevent other CPUs from entering the critical
section because it is ignored when KDB_STATE_PRINTF_LOCK is set.

The problematic situation might look like:

CPU0					CPU1

vkdb_printf()
  if (!KDB_STATE(PRINTF_LOCK))
    KDB_STATE_SET(PRINTF_LOCK);
    spin_lock_irqsave(&kdb_printf_lock, flags);

					vkdb_printf()
					  if (!KDB_STATE(PRINTF_LOCK))

BANG: The PRINTF_LOCK state is set and CPU1 is entering the critical
section without spinning on the lock.

The problem is that the code tries to implement locking using two state
variables that are not handled atomically.  Well, we need a custom
locking because we want to allow reentering the critical section on the
very same CPU.

Let's use solution from Petr Zijlstra that was proposed for a similar
scenario, see
https://lkml.kernel.org/r/20161018171513.734367391@infradead.org

This patch uses the same trick with cmpxchg().  The only difference is
that we want to handle only recursion from the same context and
therefore we disable interrupts.

In addition, KDB_STATE_PRINTF_LOCK is removed.  In fact, we are not able
to set it a non-racy way.

Link: http://lkml.kernel.org/r/1480412276-16690-3-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Petr Mladek
d1bd8ead12 kdb: remove unused kdb_event handling
kdb_event state variable is only set but never checked in the kernel
code.

http://www.spinics.net/lists/kdb/msg01733.html suggests that this
variable affected WARN_CONSOLE_UNLOCKED() in the original
implementation.  But this check never went upstream.

The semantic is unclear and racy.  The value is updated after the
kdb_printf_lock is acquired and after it is released.  It should be
symmetric at minimum.  The value should be manipulated either inside or
outside the locked area.

Fortunately, it seems that the original function is gone and we could
simply remove the state variable.

Link: http://lkml.kernel.org/r/1480412276-16690-2-git-send-email-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Suggested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Douglas Anderson
2d13bb6494 kernel/debug/debug_core.c: more properly delay for secondary CPUs
We've got a delay loop waiting for secondary CPUs.  That loop uses
loops_per_jiffy.  However, loops_per_jiffy doesn't actually mean how
many tight loops make up a jiffy on all architectures.  It is quite
common to see things like this in the boot log:

  Calibrating delay loop (skipped), value calculated using timer
  frequency.. 48.00 BogoMIPS (lpj=24000)

In my case I was seeing lots of cases where other CPUs timed out
entering the debugger only to print their stack crawls shortly after the
kdb> prompt was written.

Elsewhere in kgdb we already use udelay(), so that should be safe enough
to use to implement our timeout.  We'll delay 1 ms for 1000 times, which
should give us a full second of delay (just like the old code wanted)
but allow us to notice that we're done every 1 ms.

[akpm@linux-foundation.org: simplifications, per Daniel]
Link: http://lkml.kernel.org/r/1477091361-2039-1-git-send-email-dianders@chromium.org
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Brian Norris <briannorris@chromium.org>
Cc: <stable@vger.kernel.org>	[4.0+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Kefeng Wang
db862358a4 kcov: add more missing includes
It is fragile that some definitions acquired via transitive
dependencies, as shown in below:

atomic_*        (<linux/atomic.h>)
ENOMEM/EN*      (<linux/errno.h>)
EXPORT_SYMBOL   (<linux/export.h>)
device_initcall (<linux/init.h>)
preempt_*       (<linux/preempt.h>)

Include them to prevent possible issues.

Link: http://lkml.kernel.org/r/1481163221-40170-1-git-send-email-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Andreas Platschek
0462554707 Kconfig: lib/Kconfig.ubsan fix reference to ubsan documentation
Documenation/ubsan.txt was moved to Documentation/dev-tools/ubsan.rst,
this fixes the reference.

Link: http://lkml.kernel.org/r/1476698152-29340-3-git-send-email-andreas.platschek@opentech.at
Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Andreas Platschek
700199b0c1 Kconfig: lib/Kconfig.debug: fix references to Documenation
Documentation on development tools was moved to Documentation/devl-tools
and sphinxified (renamed from .txt to .rst).

References in lib/Kconfig.debug need to be updated to the new location.

Link: http://lkml.kernel.org/r/1476698152-29340-2-git-send-email-andreas.platschek@opentech.at
Signed-off-by: Andreas Platschek <andreas.platschek@opentech.at>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Dan Carpenter
9a29d0fbc2 relay: check array offset before using it
Smatch complains that we started using the array offset before we
checked that it was valid.

Fixes: 017c59c042 ('relay: Use per CPU constructs for the relay channel buffer pointers')
Link: http://lkml.kernel.org/r/20161013084947.GC16198@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
bd4171a5d4 igb: update code to better handle incrementing page count
Update the driver code so that we do bulk updates of the page reference
count instead of just incrementing it by one reference at a time.  The
advantage to doing this is that we cut down on atomic operations and
this in turn should give us a slight improvement in cycles per packet.
In addition if we eventually move this over to using build_skb the gains
will be more noticeable.

Link: http://lkml.kernel.org/r/20161110113616.76501.17072.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
5be5955425 igb: update driver to make use of DMA_ATTR_SKIP_CPU_SYNC
The ARM architecture provides a mechanism for deferring cache line
invalidation in the case of map/unmap.  This patch makes use of this
mechanism to avoid unnecessary synchronization.

A secondary effect of this change is that the portion of the page that
has been synchronized for use by the CPU should be writable and could be
passed up the stack (at least on ARM).

The last bit that occurred to me is that on architectures where the
sync_for_cpu call invalidates cache lines we were prefetching and then
invalidating the first 128 bytes of the packet.  To avoid that I have
moved the sync up to before we perform the prefetch and allocate the
skbuff so that we can actually make use of it.

Link: http://lkml.kernel.org/r/20161110113611.76501.98897.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
44fdffd705 mm: add support for releasing multiple instances of a page
Add a function that allows us to batch free a page that has multiple
references outstanding.  Specifically this function can be used to drop
a page being used in the page frag alloc cache.  With this drivers can
make use of functionality similar to the page frag alloc cache without
having to do any workarounds for the fact that there is no function that
frees multiple references.

Link: http://lkml.kernel.org/r/20161110113606.76501.70752.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Helge Deller <deller@gmx.de>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Tobias Klauser <tklauser@distanz.ch>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
0495c3d367 dma: add calls for dma_map_page_attrs and dma_unmap_page_attrs
Add support for mapping and unmapping a page with attributes.

The primary use for this is currently to allow for us to pass the
DMA_ATTR_SKIP_CPU_SYNC attribute when mapping and unmapping a page.  On
some architectures such as ARM the synchronization has significant
overhead and if we are already taking care of the sync_for_cpu and
sync_for_device from the driver there isn't much need to handle this in
the map/unmap calls as well.

Link: http://lkml.kernel.org/r/20161110113601.76501.46095.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
4bfa135abe arch/xtensa: add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113555.76501.52536.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
33c77e53d8 arch/tile: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113550.76501.73060.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:08 -08:00
Alexander Duyck
68bbc28f61 arch/sparc: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113544.76501.40008.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
a08120017d arch/sh: add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113539.76501.6539.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
6f77480961 arch/powerpc: add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113534.76501.86492.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
f50a2bd298 arch/parisc: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113529.76501.44762.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
043b42bcbb arch/openrisc: add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113524.76501.87966.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Jonas Bonn <jonas@southpole.se>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
abdf4799da arch/nios2: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113518.76501.52225.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Cc: Ley Foon Tan <lftan@altera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
9f318d470e arch/mips: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113513.76501.32321.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Keguang Zhang <keguang.zhang@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
98ac2fc274 arch/microblaze: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113508.76501.77583.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
38bdbdc7e3 arch/metag: add option to skip DMA sync as a part of map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113503.76501.80809.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
5140d2344f arch/m68k: add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113457.76501.77603.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
b8a346dd47 arch/hexagon: Add option to skip DMA sync as a part of mapping
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113452.76501.45864.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
34f8be79a7 arch/frv: add option to skip sync on DMA map
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the
DMA APIs in the arch/arm folder.  This change is meant to correct that
so that we get consistent behavior.

Link: http://lkml.kernel.org/r/20161110113447.76501.93160.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
64c596b59c arch/c6x: add option to skip sync on DMA map and unmap
This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113442.76501.7673.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Mark Salter <msalter@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
8c16a2e209 arch/blackfin: add option to skip sync on DMA map
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the
DMA APIs in the arch/arm folder.  This change is meant to correct that
so that we get consistent behavior.

Link: http://lkml.kernel.org/r/20161110113436.76501.13386.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Steven Miao <realmz6@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
e8b4762c22 arch/avr32: add option to skip sync on DMA map
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the
DMA APIs in the arch/arm folder.  This change is meant to correct that
so that we get consistent behavior.

Link: http://lkml.kernel.org/r/20161110113430.76501.79737.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
fc1b138de7 arch/arm: add option to skip sync on DMA map and unmap
The use of DMA_ATTR_SKIP_CPU_SYNC was not consistent across all of the
DMA APIs in the arch/arm folder.  This change is meant to correct that
so that we get consistent behavior.

Link: http://lkml.kernel.org/r/20161110113424.76501.2715.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Cc: Russell King <linux@armlinux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexander Duyck
8a3385d2d4 arch/arc: add option to skip sync on DMA mapping
Patch series "Add support for DMA writable pages being writable by the
network stack", v3.

The first 19 patches in the set add support for the DMA attribute
DMA_ATTR_SKIP_CPU_SYNC on multiple platforms/architectures.  This is
needed so that we can flag the calls to dma_map/unmap_page so that we do
not invalidate cache lines that do not currently belong to the device.
Instead we have to take care of this in the driver via a call to
sync_single_range_for_cpu prior to freeing the Rx page.

Patch 20 adds support for dma_map_page_attrs and dma_unmap_page_attrs so
that we can unmap and map a page using the DMA_ATTR_SKIP_CPU_SYNC
attribute.

Patch 21 adds support for freeing a page that has multiple references
being held by a single caller.  This way we can free page fragments that
were allocated by a given driver.

The last 2 patches use these updates in the igb driver, and lay the
groundwork to allow for us to reimplement the use of build_skb.

This patch (of 23):

This change allows us to pass DMA_ATTR_SKIP_CPU_SYNC which allows us to
avoid invoking cache line invalidation if the driver will just handle it
later via a sync_for_cpu or sync_for_device call.

Link: http://lkml.kernel.org/r/20161110113419.76501.38491.stgit@ahduyck-blue-test.jf.intel.com
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Tetsuo Handa
7560ef39dc sysctl: add KERN_CONT to deprecated_sysctl_warning()
Do not break lines while printk()ing values.

  kernel: warning: process `tomoyo_file_tes' used the deprecated sysctl system call with
  kernel: 3.
  kernel: 5.
  kernel: 56.
  kernel:

Link: http://lkml.kernel.org/r/1480814833-4976-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
zhong jiang
8e53c073a4 kexec: add cond_resched into kimage_alloc_crash_control_pages
A soft lookup will occur when I run trinity in syscall kexec_load.  the
corresponding stack information is as follows.

  BUG: soft lockup - CPU#6 stuck for 22s! [trinity-c6:13859]
  Kernel panic - not syncing: softlockup: hung tasks
  CPU: 6 PID: 13859 Comm: trinity-c6 Tainted: G           O L ----V-------   3.10.0-327.28.3.35.zhongjiang.x86_64 #1
  Hardware name: Huawei Technologies Co., Ltd. Tecal BH622 V2/BC01SRSA0, BIOS RMIBV386 06/30/2014
  Call Trace:
   <IRQ>  dump_stack+0x19/0x1b
   panic+0xd8/0x214
   watchdog_timer_fn+0x1cc/0x1e0
   __hrtimer_run_queues+0xd2/0x260
   hrtimer_interrupt+0xb0/0x1e0
   ? call_softirq+0x1c/0x30
   local_apic_timer_interrupt+0x37/0x60
   smp_apic_timer_interrupt+0x3f/0x60
   apic_timer_interrupt+0x6d/0x80
   <EOI>  ? kimage_alloc_control_pages+0x80/0x270
   ? kmem_cache_alloc_trace+0x1ce/0x1f0
   ? do_kimage_alloc_init+0x1f/0x90
   kimage_alloc_init+0x12a/0x180
   SyS_kexec_load+0x20a/0x260
   system_call_fastpath+0x16/0x1b

the first time allocation of control pages may take too much time
because crash_res.end can be set to a higher value.  we need to add
cond_resched to avoid the issue.

The patch have been tested and above issue is not appear.

Link: http://lkml.kernel.org/r/1481164674-42775-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Xunlei Pang <xpang@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Baoquan He
401721ecd1 kexec: export the value of phys_base instead of symbol address
Currently in x86_64, the symbol address of phys_base is exported to
vmcoreinfo.  Dave Anderson complained this is really useless for his
Crash implementation.  Because in user-space utility Crash and
Makedumpfile which exported vmcore information is mainly used for, value
of phys_base is needed to covert virtual address of exported kernel
symbol to physical address.  Especially init_level4_pgt, if we want to
access and go over the page table to look up a PA corresponding to VA,
firstly we need calculate

  page_dir = SYMBOL(init_level4_pgt) - __START_KERNEL_map + phys_base;

Now in Crash and Makedumpfile, we have to analyze the vmcore elf program
header to get value of phys_base.  As Dave said, it would be preferable
if it were readily availabl in vmcoreinfo rather than depending upon the
PT_LOAD semantics.

Hence in this patch change to export the value of phys_base instead of
its virtual address.

And people also complained that KERNEL_IMAGE_SIZE exporting is x86_64
only, should be moved into arch dependent function
arch_crash_save_vmcoreinfo.  Do the moving in this patch.

Link: http://lkml.kernel.org/r/1478568596-30060-2-git-send-email-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Xunlei Pang <xlpang@redhat.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Eugene Surovegin <surovegin@google.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Baoquan He
69f5838479 Revert "kdump, vmcoreinfo: report memory sections virtual addresses"
This reverts commit 0549a3c02e ("kdump, vmcoreinfo: report memory
sections virtual addresses").

Commit 0549a3c02e tells the userspace utility makedumpfile the
randomized base address of these memmory sections when mm kaslr is
enabled.  However the following patch "kexec: export the value of
phys_base instead of symbol address" makes makedumpfile not need these
addresses any more.

Besides we should use VMCOREINFO_NUMBER to export the value of the
variable so that we can use the existing number_table mechanism of
Makedumpfile to fetch it.  So revert it now.  If needed we can add it
later.

http://lists.infradead.org/pipermail/kexec/2016-October/017540.html
Link: http://lkml.kernel.org/r/1478568596-30060-1-git-send-email-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Cc: Thomas Garnier <thgarnie@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Xunlei Pang <xlpang@redhat.com>
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Eugene Surovegin <surovegin@google.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
Cc: Atsushi Kumagai <ats-kumagai@wm.jp.nec.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Pratyush Anand <panand@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Alexey Dobriyan
760c6a9139 coredump: clarify "unsafe core_pattern" warning
I was amused to find "unsafe core_pattern" warning having these lines in
/etc/sysctl.conf:

	fs.suid_dumpable=2
	kernel.core_pattern=/core/core-%e-%p-%E
	kernel.core_uses_pid=0

Turns out kernel is formally right.  Default core_pattern is just "core",
which doesn't qualify for secure path while setting suid.dumpable.

Hint admins about solution, clarify sysctl names, delete unnecessary '\'
characters (string literals are concatenated regardless) and reformat for
easier grepping.

Link: http://lkml.kernel.org/r/20161029152124.GA1258@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Waiman Long
c7be96af89 signals: avoid unnecessary taking of sighand->siglock
When running certain database workload on a high-end system with many
CPUs, it was found that spinlock contention in the sigprocmask syscalls
became a significant portion of the overall CPU cycles as shown below.

  9.30%  9.30%  905387  dataserver  /proc/kcore 0x7fff8163f4d2
  [k] _raw_spin_lock_irq
            |
            ---_raw_spin_lock_irq
               |
               |--99.34%-- __set_current_blocked
               |          sigprocmask
               |          sys_rt_sigprocmask
               |          system_call_fastpath
               |          |
               |          |--50.63%-- __swapcontext
               |          |          |
               |          |          |--99.91%-- upsleepgeneric
               |          |
               |          |--49.36%-- __setcontext
               |          |          ktskRun

Looking further into the swapcontext function in glibc, it was found that
the function always call sigprocmask() without checking if there are
changes in the signal mask.

A check was added to the __set_current_blocked() function to avoid taking
the sighand->siglock spinlock if there is no change in the signal mask.
This will prevent unneeded spinlock contention when many threads are
trying to call sigprocmask().

With this patch applied, the spinlock contention in sigprocmask() was
gone.

Link: http://lkml.kernel.org/r/1474979209-11867-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stas Sergeev <stsp@list.ru>
Cc: Scott J Norton <scott.norton@hpe.com>
Cc: Douglas Hatch <doug.hatch@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Michal Hocko
73e64c51af mm, compaction: allow compaction for GFP_NOFS requests
compaction has been disabled for GFP_NOFS and GFP_NOIO requests since
the direct compaction was introduced by commit 56de7263fc ("mm:
compaction: direct compact when a high-order allocation fails").  The
main reason is that the migration of page cache pages might recurse back
to fs/io layer and we could potentially deadlock.  This is overly
conservative because all the anonymous memory is migrateable in the
GFP_NOFS context just fine.  This might be a large portion of the memory
in many/most workkloads.

Remove the GFP_NOFS restriction and make sure that we skip all fs pages
(those with a mapping) while isolating pages to be migrated.  We cannot
consider clean fs pages because they might need a metadata update so
only isolate pages without any mapping for nofs requests.

The effect of this patch will be probably very limited in many/most
workloads because higher order GFP_NOFS requests are quite rare,
although different configurations might lead to very different results.
David Chinner has mentioned a heavy metadata workload with 64kB block
which to quote him:

: Unfortunately, there was an era of cargo cult configuration tweaks in the
: Ceph community that has resulted in a large number of production machines
: with XFS filesystems configured this way.  And a lot of them store large
: numbers of small files and run under significant sustained memory
: pressure.
:
: I slowly working towards getting rid of these high order allocations and
: replacing them with the equivalent number of single page allocations, but
: I haven't got that (complex) change working yet.

We can do the following to simulate that workload:
$ mkfs.xfs -f -n size=64k <dev>
$ mount <dev> /mnt/scratch
$ time ./fs_mark  -D  10000  -S0  -n  100000  -s  0  -L  32 \
        -d  /mnt/scratch/0  -d  /mnt/scratch/1 \
        -d  /mnt/scratch/2  -d  /mnt/scratch/3 \
        -d  /mnt/scratch/4  -d  /mnt/scratch/5 \
        -d  /mnt/scratch/6  -d  /mnt/scratch/7 \
        -d  /mnt/scratch/8  -d  /mnt/scratch/9 \
        -d  /mnt/scratch/10  -d  /mnt/scratch/11 \
        -d  /mnt/scratch/12  -d  /mnt/scratch/13 \
        -d  /mnt/scratch/14  -d  /mnt/scratch/15

and indeed is hammers the system with many high order GFP_NOFS requests as
per a simle tracepoint during the load:
$ echo '!(gfp_flags & 0x80) && (gfp_flags &0x400000)' > $TRACE_MNT/events/kmem/mm_page_alloc/filter
I am getting
5287609 order=0
     37 order=1
1594905 order=2
3048439 order=3
6699207 order=4
  66645 order=5

My testing was done in a kvm guest so performance numbers should be
taken with a grain of salt but there seems to be a difference when the
patch is applied:

* Original kernel
FSUse%        Count         Size    Files/sec     App Overhead
     1      1600000            0       4300.1         20745838
     3      3200000            0       4239.9         23849857
     5      4800000            0       4243.4         25939543
     6      6400000            0       4248.4         19514050
     8      8000000            0       4262.1         20796169
     9      9600000            0       4257.6         21288675
    11     11200000            0       4259.7         19375120
    13     12800000            0       4220.7         22734141
    14     14400000            0       4238.5         31936458
    16     16000000            0       4231.5         23409901
    18     17600000            0       4045.3         23577700
    19     19200000            0       2783.4         58299526
    21     20800000            0       2678.2         40616302
    23     22400000            0       2693.5         83973996

and xfs complaining about memory allocation not making progress
[ 2304.372647] XFS: fs_mark(3289) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)
[ 2304.443323] XFS: fs_mark(3285) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)
[ 4796.772477] XFS: fs_mark(3424) possible memory allocation deadlock size 46936 in kmem_alloc (mode:0x2408240)
[ 4796.775329] XFS: fs_mark(3423) possible memory allocation deadlock size 51416 in kmem_alloc (mode:0x2408240)
[ 4797.388808] XFS: fs_mark(3424) possible memory allocation deadlock size 65728 in kmem_alloc (mode:0x2408240)

* Patched kernel
FSUse%        Count         Size    Files/sec     App Overhead
     1      1600000            0       4289.1         19243934
     3      3200000            0       4241.6         32828865
     5      4800000            0       4248.7         32884693
     6      6400000            0       4314.4         19608921
     8      8000000            0       4269.9         24953292
     9      9600000            0       4270.7         33235572
    11     11200000            0       4346.4         40817101
    13     12800000            0       4285.3         29972397
    14     14400000            0       4297.2         20539765
    16     16000000            0       4219.6         18596767
    18     17600000            0       4273.8         49611187
    19     19200000            0       4300.4         27944451
    21     20800000            0       4270.6         22324585
    22     22400000            0       4317.6         22650382
    24     24000000            0       4065.2         22297964

So the dropdown at Count 19200000 didn't happen and there was only a
single warning about allocation not making progress
[ 3063.815003] XFS: fs_mark(3272) possible memory allocation deadlock size 65624 in kmem_alloc (mode:0x2408240)

This suggests that the patch has helped even though there is not all that
much of anonymous memory as the workload mostly generates fs metadata.  I
assume the success rate would be higher with more anonymous memory which
should be the case in many workloads.

[akpm@linux-foundation.org: fix comment]
Link: http://lkml.kernel.org/r/20161012114721.31853-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Dave Chinner <david@fromorbit.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Konstantin Khlebnikov
4d1f0fb096 kernel/watchdog: use nmi registers snapshot in hardlockup handler
NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ.
Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP
pointing to the code interrupted by IRQ which was interrupted by NMI.
NULL isn't a problem: in this case watchdog calls dump_stack() and
prints full stack trace including NMI.  But if we're stuck in IRQ
handler then NMI watchlog will print stack trace without IRQ part at
all.

This patch uses registers snapshot passed into NMI handler as arguments:
these registers point exactly to the instruction interrupted by NMI.

Fixes: 55537871ef ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Aaron Tomlin <atomlin@redhat.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Petr Mladek
40f7828b36 btrfs: better handle btrfs_printk() defaults
Commit 262c5e86fe ("printk/btrfs: handle more message headers")
triggers:

    warning: `ratelimit' may be used uninitialized in this function

with gcc (4.1.2) and probably many other versions.  The code actually is
correct but a bit twisted.  Let's make it more straightforward and set
the default values at the beginning.

Link: http://lkml.kernel.org/r/20161213135246.GQ3506@pathway.suse.cz
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: David Sterba <dsterba@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:07 -08:00
Linus Torvalds
775a2e29c3 . various fixes and improvements to request-based DM and DM multipath
. some locking improvements in DM bufio
 
 . add Kconfig option to disable the DM block manager's extra locking
   which mainly serves as a developer tool
 
 . a few bug fixes to DM's persistent-data
 
 . a couple changes to prepare for multipage biovec support in the block
   layer
 
 . various improvements and cleanups in the DM core, DM cache, DM raid
   and DM crypt
 
 . add ability to have DM crypt use keys from the kernel key retention
   service
 
 . add a new "error_writes" feature to the DM flakey target, reads are
   left unchanged in this mode
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJYUW8zAAoJEMUj8QotnQNaAWEIAMRQ4aCXq5T7F9Hf4K/l6FwO
 FoBr2TPS3Lf0vm/A5Tr819I47hk7q0oroa61ARbpS90iuGt/Au/Sk35cn1BwT0YW
 llMvMGbh+w9ZBUJGkyexdXbyfm5ywPHuthMr4CK/UNASyjDl2QMAeBuUZ6FLSPn1
 RUL/RYv0mG/7EXOPz0PURPb5rpjO15cAU0NjfNS0862UVR8x8dNS6iImOmScsioe
 Flw90qPl3kMBxBHik8xSPJfhtW+lD7xSaOlWzHKtalnUZHRG2BNUtlAMKdiaynx2
 yl9MhSsi8wlgd4h9WmlmaOr0VqkU5UYY9D9TDuuJwXnHUXGenVSJ/aGOohr+bm4=
 =kOoK
 -----END PGP SIGNATURE-----

Merge tag 'dm-4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - various fixes and improvements to request-based DM and DM multipath

 - some locking improvements in DM bufio

 - add Kconfig option to disable the DM block manager's extra locking
   which mainly serves as a developer tool

 - a few bug fixes to DM's persistent-data

 - a couple changes to prepare for multipage biovec support in the block
   layer

 - various improvements and cleanups in the DM core, DM cache, DM raid
   and DM crypt

 - add ability to have DM crypt use keys from the kernel key retention
   service

 - add a new "error_writes" feature to the DM flakey target, reads are
   left unchanged in this mode

* tag 'dm-4.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (40 commits)
  dm flakey: introduce "error_writes" feature
  dm cache policy smq: use hash_32() instead of hash_32_generic()
  dm crypt: reject key strings containing whitespace chars
  dm space map: always set ev if sm_ll_mutate() succeeds
  dm space map metadata: skip useless memcpy in metadata_ll_init_index()
  dm space map metadata: fix 'struct sm_metadata' leak on failed create
  Documentation: dm raid: define data_offset status field
  dm raid: fix discard support regression
  dm raid: don't allow "write behind" with raid4/5/6
  dm mpath: use hw_handler_params if attached hw_handler is same as requested
  dm crypt: add ability to use keys from the kernel key retention service
  dm array: remove a dead assignment in populate_ablock_with_values()
  dm ioctl: use offsetof() instead of open-coding it
  dm rq: simplify use_blk_mq initialization
  dm: use blk_set_queue_dying() in __dm_destroy()
  dm bufio: drop the lock when doing GFP_NOIO allocation
  dm bufio: don't take the lock in dm_bufio_shrink_count
  dm bufio: avoid sleeping while holding the dm_bufio lock
  dm table: simplify dm_table_determine_type()
  dm table: an 'all_blk_mq' table must be loaded for a blk-mq DM device
  ...
2016-12-14 11:01:00 -08:00
Linus Torvalds
2a4c32edd3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull MD updates from Shaohua Li:

 - a raid5 writeback cache feature.

   The goal is to aggregate writes to make full stripe write and reduce
   read-modify-write. It's helpful for workload which does sequential
   write and follows fsync for example. This feature is experimental and
   off by default right now.

 - FAILFAST support.

   This fails IOs to broken raid disks quickly, so can improve latency.
   It's mainly for DASD storage, but some patches help normal raid array
   too.

 - support bad block for raid array with external metadata

 - AVX2 instruction support for raid6 parity calculation

 - normalize MD info output

 - add missing blktrace

 - other bug fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (66 commits)
  md: separate flags for superblock changes
  md: MD_RECOVERY_NEEDED is set for mddev->recovery
  md: takeover should clear unrelated bits
  md/r5cache: after recovery, increase journal seq by 10000
  md/raid5-cache: fix crc in rewrite_data_only_stripes()
  md/raid5-cache: no recovery is required when create super-block
  md: fix refcount problem on mddev when stopping array.
  md/r5cache: do r5c_update_log_state after log recovery
  md/raid5-cache: adjust the write position of the empty block if no data blocks
  md/r5cache: run_no_space_stripes() when R5C_LOG_CRITICAL == 0
  md/raid5: limit request size according to implementation limits
  md/raid5-cache: do not need to set STRIPE_PREREAD_ACTIVE repeatedly
  md/raid5-cache: remove the unnecessary next_cp_seq field from the r5l_log
  md/raid5-cache: release the stripe_head at the appropriate location
  md/raid5-cache: use ring add to prevent overflow
  md/raid5-cache: remove unnecessary function parameters
  raid5-cache: don't set STRIPE_R5C_PARTIAL_STRIPE flag while load stripe into cache
  raid5-cache: add another check conditon before replaying one stripe
  md/r5cache: enable IRQs on error path
  md/r5cache: handle alloc_page failure
  ...
2016-12-14 10:58:17 -08:00
Linus Torvalds
b9f98bd403 MMC core:
- Move files from the card directory to the core directory to enable
    future clean-ups of the generic mmc header files and interfaces.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYURIiAAoJEP4mhCVzWIwpYAUQAL2qmTde04Qs/MjVtTCih3L0
 +ZB0oK/ppNUwZTmAJlMhi1osKPmdRd1Kngv+7zEBnddrbVJDAYrM+T5Psd8WebM4
 xUK0e3LnwuzFGskkgQ1mtd9qZHEniEdUg8Dq+nXSnGiesvCST2/uAAtjvFQHYr0R
 QK3U7hOyJBd1+L5LpPIIF+e/Ix7yaHobfYLGcGxRGULw5uhYiy3vbkpN7omD0hlQ
 2EYdBFcHj1GHarDYFtD3BWUNGtT7GpC10iMueUYVuz+bzOjCAhEeZ0Lh6V7R6HCr
 ZVSeMEK45HegLHx81XMDGYqDfRUPgShjNKrAq8DxfzBXvQ+1iuq5Mh01Zrilcxfw
 nNbqNajc+OIXluxYRz5h3HHC02R4gGN6Uk/gx/H8a/eRRcM6a3irDLkLBpyvETpb
 pH1OkomaHg5B/dTH0cUFamK/fTDNr3Lv2HFcKJ9ALQxhgEAQwcsWaeAyvvUncFuU
 /uiyd6NEWo/19mHB905wF5R9oTv1WWrQGDmSHGJT51Br1BJf+W/2ZT9N9P12zweW
 d5ZvheGymUjQ6yFnv5ov6udTWdogy8qqAihGa4kwPbfpY+DwlcT+pN/bej1pYwqQ
 kUxCxBE84RciJzZbped9xDo10HSGVudlIZPMa8Jp0f/CbPYnwf+z3tP6Xt5wtwLh
 jAhBjdzkWvEdTAJfyYmw
 =U2+h
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull another MMC update from Ulf Hansson:
 "Here's a second pull request for MMC for v4.10.

  As a matter of fact it's only one change that moves some mmc files
  around. I thought it was a good idea to get this into v4.10, as it
  gives us a nice and fresh base for v4.11. Summary:

  MMC core:

   - Move files from the card directory to the core directory to enable
     future clean-ups of the generic mmc header files and interfaces"

* tag 'mmc-v4.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: block: Move files to core
2016-12-14 10:55:56 -08:00
Linus Torvalds
a829a8445f SCSI misc on 20161213
This update includes the usual round of major driver updates (ncr5380,
 lpfc, hisi_sas, megaraid_sas, ufs, ibmvscsis, mpt3sas).  There's also
 an assortment of minor fixes, mostly in error legs or other not very
 user visible stuff.  The major change is the pci_alloc_irq_vectors
 replacement for the old pci_msix_.. calls; this effectively makes IRQ
 mapping generic for the drivers and allows blk_mq to use the
 information.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJYUJOvAAoJEAVr7HOZEZN42B0P/1lj1W2N7y0LOAbR2MasyQvT
 fMD/SSip/v+R/zJiTv+5M/IDQT6ez62JnQGWyX3HZTob9VEfoqagbUuHH6y+fmib
 tHyqiYc9FC/NZSRX/0ib+rpnSVhC/YRSVV7RrAqilbpOKAzeU25FlN/vbz+Nv/XL
 ReVBl+2nGjJtHyWqUN45Zuf74c0zXOWPPUD0hRaNclK5CfZv5wDYupzHzTNSQTkj
 nWvwPYT0OgSMNe7mR+IDFyOe3UQ/OYyeJB0yBNqO63IiaUabT8/hgrWR9qJAvWh8
 LiH+iSQ69+sDUnvWvFjuls/GzcFuuTljwJbm+FyTsmNHONPVY8JRCLNq7CNDJ6Vx
 HwpNuJdTSJpne4lAVBGPwgjs+GhlMvUP/xYVLWAXdaBxU9XGePrwqQDcFu1Rbx3P
 yfMiVaY1+e45OEjLRCbDAwFnMPevC3kyymIvSsTySJxhTbYrOsyrrWt5kwWsvE3r
 SKANsub+xUnpCkyg57nXRQStJSCiSfGIDsydKmMX+pf1SR4k6gCUQZlcchUX0uOa
 dcY6re0c7EJIQQiT7qeGP5TRBblxARocCA/Igx6b5U5HmuQ48tDFlMCps7/TE84V
 JBsBnmkXcEi/ALShL/Tui+3YKA1DfOtEnwHtXx/9Ecx/nxP2Sjr9LJwCKiONv8NY
 RgLpGfccrix34lQumOk5
 =sPXh
 -----END PGP SIGNATURE-----

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI updates from James Bottomley:
 "This update includes the usual round of major driver updates (ncr5380,
  lpfc, hisi_sas, megaraid_sas, ufs, ibmvscsis, mpt3sas).

  There's also an assortment of minor fixes, mostly in error legs or
  other not very user visible stuff. The major change is the
  pci_alloc_irq_vectors replacement for the old pci_msix_.. calls; this
  effectively makes IRQ mapping generic for the drivers and allows
  blk_mq to use the information"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (256 commits)
  scsi: qla4xxx: switch to pci_alloc_irq_vectors
  scsi: hisi_sas: support deferred probe for v2 hw
  scsi: megaraid_sas: switch to pci_alloc_irq_vectors
  scsi: scsi_devinfo: remove synchronous ALUA for NETAPP devices
  scsi: be2iscsi: set errno on error path
  scsi: be2iscsi: set errno on error path
  scsi: hpsa: fallback to use legacy REPORT PHYS command
  scsi: scsi_dh_alua: Fix RCU annotations
  scsi: hpsa: use %phN for short hex dumps
  scsi: hisi_sas: fix free'ing in probe and remove
  scsi: isci: switch to pci_alloc_irq_vectors
  scsi: ipr: Fix runaway IRQs when falling back from MSI to LSI
  scsi: dpt_i2o: double free on error path
  scsi: cxlflash: Migrate scsi command pointer to AFU command
  scsi: cxlflash: Migrate IOARRIN specific routines to function pointers
  scsi: cxlflash: Cleanup queuecommand()
  scsi: cxlflash: Cleanup send_tmf()
  scsi: cxlflash: Remove AFU command lock
  scsi: cxlflash: Wait for active AFU commands to timeout upon tear down
  scsi: cxlflash: Remove private command pool
  ...
2016-12-14 10:49:33 -08:00
Linus Torvalds
84b6079134 Just one simple change from Andrzej to drop the pointless return value
from the ->drop_link method.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYUUZ8AAoJEA+eU2VSBFGDjHIP/RKCBerQNkQ5hdUPnwJkXOh1
 b4SWiPNsNgkzpKJsE/vuL6R2qq9KX5sxWLi8a3Wx5jyMst1Xj2m4SvI4fUew/hBC
 QZM86aMJLcWZ8BfqLk2tLys759z/MSZSrXcUU8EM+JJqb6E0+i+5pgX8gOk7Pxwo
 FUagFzuGXTGORwkWRf47ludBuDEGMCrLQZ6WaXEKQULTUwfPrnP+n9EhZcWzzsyu
 0YxW9SFD73LgRSPLgxc+rw875D8rb3WSClWj/2LLQUy8z8QEJ83Mgt9hcbBV0Ppa
 efP0kPZbpDnVx6TjpldRKW9GivkbFXNnChMmgTkBGYTRjn8IHDsyAb6ZABw/O37N
 oyvd42xVDCE+GSImaMgCPL/5MEsQ+v9xCfgkBcyhWVQYFFj89Nmz/8VKp6AtTW3j
 X7MQuzdzKGoWTVpAOgw/SvjrRx+fcciTg31AhhGjE5cmARVoBJuCDa6NM3WXFQf4
 Pq74zetWDB38sBZwQV/6Y1m3OJGquD4MxX9b5SbNzwuROrKyJCAe3CCw7CKvuuWj
 RPkwZkHiCawRijxNDCWU8zpMcuUCdt9yjTWbUW/WrvKR6BGF4IwUHf9k8oorv1Qc
 Bo7enURnYBmcb7cijOLO+onzCf5iWAwMNk8KKJ1zthaZbloIGw8jdp5kOe5spEvF
 Uju3n/GNA/OWe2MAuutE
 =nOOH
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-4.10' of git://git.infradead.org/users/hch/configfs

Pull configfs update from Christoph Hellwig:
 "Just one simple change from Andrzej to drop the pointless return value
  from the ->drop_link method"

* tag 'configfs-for-4.10' of git://git.infradead.org/users/hch/configfs:
  fs: configfs: don't return anything from drop_link
2016-12-14 10:31:25 -08:00
Linus Torvalds
5084fdf081 This merge request includes the dax-4.0-iomap-pmd branch which is
needed for both ext4 and xfs dax changes to use iomap for DAX.  It
 also includes the fscrypt branch which is needed for ubifs encryption
 work as well as ext4 encryption and fscrypt cleanups.
 
 Lots of cleanups and bug fixes, especially making sure ext4 is robust
 against maliciously corrupted file systems --- especially maliciously
 corrupted xattr blocks and a maliciously corrupted superblock.  Also
 fix ext4 support for 64k block sizes so it works well on ppcle.  Fixed
 mbcache so we don't miss some common xattr blocks that can be merged.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlhQQVEACgkQ8vlZVpUN
 gaN9TQgAoCD+V4kJjMCFhiV8u6QR3hqD6bOZbggo5wJf4CHglWkmrbAmc3jANOgH
 CKsXDRRjxuDjPXf1ukB1i4M7ArLYjkbbzKdsu7lismoJLS+w8uwUKSNdep+LYMjD
 alxUcf5DCzLlUmdOdW4yE22L+CwRfqfs8IpBvKmJb7DrAKiwJVA340ys6daBGuu1
 63xYx0QIyPzq0xjqLb6TVf88HUI4NiGVXmlm2wcrnYd5966hEZd/SztOZTVCVWOf
 Z0Z0fGQ1WJzmaBB9+YV3aBi+BObOx4m2PUprIa531+iEW02E+ot5Xd4vVQFoV/r4
 NX3XtoBrT1XlKagy2sJLMBoCavqrKw==
 =j4KP
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "This merge request includes the dax-4.0-iomap-pmd branch which is
  needed for both ext4 and xfs dax changes to use iomap for DAX. It also
  includes the fscrypt branch which is needed for ubifs encryption work
  as well as ext4 encryption and fscrypt cleanups.

  Lots of cleanups and bug fixes, especially making sure ext4 is robust
  against maliciously corrupted file systems --- especially maliciously
  corrupted xattr blocks and a maliciously corrupted superblock. Also
  fix ext4 support for 64k block sizes so it works well on ppcle. Fixed
  mbcache so we don't miss some common xattr blocks that can be merged"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (86 commits)
  dax: Fix sleep in atomic contex in grab_mapping_entry()
  fscrypt: Rename FS_WRITE_PATH_FL to FS_CTX_HAS_BOUNCE_BUFFER_FL
  fscrypt: Delay bounce page pool allocation until needed
  fscrypt: Cleanup page locking requirements for fscrypt_{decrypt,encrypt}_page()
  fscrypt: Cleanup fscrypt_{decrypt,encrypt}_page()
  fscrypt: Never allocate fscrypt_ctx on in-place encryption
  fscrypt: Use correct index in decrypt path.
  fscrypt: move the policy flags and encryption mode definitions to uapi header
  fscrypt: move non-public structures and constants to fscrypt_private.h
  fscrypt: unexport fscrypt_initialize()
  fscrypt: rename get_crypt_info() to fscrypt_get_crypt_info()
  fscrypto: move ioctl processing more fully into common code
  fscrypto: remove unneeded Kconfig dependencies
  MAINTAINERS: fscrypto: recommend linux-fsdevel for fscrypto patches
  ext4: do not perform data journaling when data is encrypted
  ext4: return -ENOMEM instead of success
  ext4: reject inodes with negative size
  ext4: remove another test in ext4_alloc_file_blocks()
  Documentation: fix description of ext4's block_validity mount option
  ext4: fix checks for data=ordered and journal_async_commit options
  ...
2016-12-14 09:17:42 -08:00
Linus Torvalds
09cb6464fe for-f2fs-4.10
This patch series contains several performance tuning patches regarding to the
 IO submission flow, in addition to supporting new features such as a ZBC-base
 drive and multiple devices.
 
 It also includes some major bug fixes such as:
  - checkpoint version control
  - fdatasync-related roll-forward recovery routine
  - memory boundary or null-pointer access in corner cases
  - missing error cases
 
 It has various minor clean-up patches as well.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYTx44AAoJEEAUqH6CSFDSnAQP/jeYJq5Zd0bweEF5g00Ec1Qg
 qNKQ57e9EHDRaDLBUmHHEaCEPRL0bw6SOUUWWqzGA07KcsIK+Yb/dGAyIcuV7WMl
 PjntVbYm4yARDYBHGupdOCzFSkzr8gDalb+98jJnoGUonsftljhES9jedQ1NjAms
 GFPHDNtirZM/r0bjKkYKjpqJ6FCxFxcGPfb/GtohDajIpohWfKZiemaXGTgtYR4d
 iBVek16h+Hprz90ycZBY69uz0TdAwu/gb+htMVBrAdExHWvlFzgp35OIywiAB/YX
 3QD/x4t2HqOBaNYiiOAY4ukVW/Yyqa/ZAzbm+m5B5CAcFYiWXMy+cMXUY9HJJ/K0
 wdvi//Avtvgpp2PVZFn2pASx14vgMFylBzuNgKpP6MPdtWTEL33jT7VYs9Nuz45E
 dgZ9IpiDt4DeTRuZ4mPO5iH7bVHPvAVV80bpXzirCCzDeNZ1EFFIQzXh/2UAmCxI
 twPXGBIYul0aIl9JkWAyhCZSd3XDSqedpfPudknjhzM9Xb1H5X0QJco7f/UwsWXH
 WxV6lHr1Q7UH96wJ7x/GAqj8ArOAASRV18+K51dqU+DWHnFPpBArJe39FVf8NGWs
 Fz1ZmlWBQ0ZgzvLkGa80llhjalXIEy/JabMrpy6VrzQGxHdmW4cVxe4dJ3710WxX
 VysJUcNMRKxMUTWOKsxp
 =Boum
 -----END PGP SIGNATURE-----

Merge tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "This patch series contains several performance tuning patches
  regarding to the IO submission flow, in addition to supporting new
  features such as a ZBC-base drive and multiple devices.

  It also includes some major bug fixes such as:
   - checkpoint version control
   - fdatasync-related roll-forward recovery routine
   - memory boundary or null-pointer access in corner cases
   - missing error cases

  It has various minor clean-up patches as well"

* tag 'for-f2fs-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits)
  f2fs: fix a missing size change in f2fs_setattr
  f2fs: fix to access nullified flush_cmd_control pointer
  f2fs: free meta pages if sanity check for ckpt is failed
  f2fs: detect wrong layout
  f2fs: call sync_fs when f2fs is idle
  Revert "f2fs: use percpu_counter for # of dirty pages in inode"
  f2fs: return AOP_WRITEPAGE_ACTIVATE for writepage
  f2fs: do not activate auto_recovery for fallocated i_size
  f2fs: fix to determine start_cp_addr by sbi->cur_cp_pack
  f2fs: fix 32-bit build
  f2fs: set ->owner for debugfs status file's file_operations
  f2fs: fix incorrect free inode count in ->statfs
  f2fs: drop duplicate header timer.h
  f2fs: fix wrong AUTO_RECOVER condition
  f2fs: do not recover i_size if it's valid
  f2fs: fix fdatasync
  f2fs: fix to account total free nid correctly
  f2fs: fix an infinite loop when flush nodes in cp
  f2fs: don't wait writeback for datas during checkpoint
  f2fs: fix wrong written_valid_blocks counting
  ...
2016-12-14 09:07:36 -08:00
Linus Torvalds
19d37ce2a7 dlm for 4.10
This set fixes error reporting for dlm sockets, removes the unbound
 property on the dlm callback workqueue to improve performance, and
 includes a couple trivial changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYTwosAAoJEDgbc8f8gGmqqHUQAKj+/z+kIrMp5MJEhzriMLpP
 wKIZa9bkmcm+BuLLf7EOwmaYx374HCq4oNNY7DJT0bE9rbFLwx9zgOvdoIjJFU3V
 mSvhyH8FeyueNyZHZdXmA1JZCGbuCeS36cxseaeS14+ANE/cQFlOHW5ihvLAmmnR
 fyV/38IjbDl33pTVf2YU5G232csicNMM8xR+1+ctrhd6CREdbY8Nf4TYVjNLHAsD
 r3FsuzScv1+p1LuczEhFP/Nl0YcVpH3EzSgOY67WRSQlSMyrfdnVvJkgwSIZkhpp
 XwW++ZBFq3B5Et1YgrFtTECrvMOb3hvoejtKTeTPq3tWoOvgweml1brtO8rVN85U
 brdTn3blKE7oyh+0ITdENLKXsWB5+qe1afNN51qO+MZyXKCR6uct+SjSI+zelet8
 jKqxP1bQCxbnvPfF/pWVGujDE4Cb6qoeCrFSoJ/VpC/JcKxxLB7p06yflY5Ztokr
 yWnPiBSEz7M7+lRF/HKmJ2PZKwdZwyrrRWtCyRXPPD29kg4pG46oxjqU9iEp3R9F
 hDCt/AiqQWWQuhU0RZ910h2ce1y9oSyQSAbVqfmqNYZMk6UeO+0X9+kxl5fSeIWT
 bjO+LsZqz8QQG33XYADs+5dSRK9Lmh5roR6j7QKlVJUsB+RbBhkDSMArh+jSCQap
 61L10OPKaN97m6TNXfVw
 =4ZWQ
 -----END PGP SIGNATURE-----

Merge tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm fixes from David Teigland:
 "This set fixes error reporting for dlm sockets, removes the unbound
  property on the dlm callback workqueue to improve performance, and
  includes a couple trivial changes"

* tag 'dlm-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: fix error return code in sctp_accept_from_sock()
  dlm: don't specify WQ_UNBOUND for the ast callback workqueue
  dlm: remove lock_sock to avoid scheduling while atomic
  dlm: don't save callbacks after accept
  dlm: audit and remove any unnecessary uses of module.h
  dlm: make genl_ops const
2016-12-14 08:31:37 -08:00
Linus Torvalds
3e5cecf268 The jfs piece of the current_time() series.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAlhOxdwACgkQNqiEXrVA
 jGSLyxAAjfq81k86sv9FD+I8OQzF0rv9t4oKwmkoky/8hk2cOJzeujbo0Xm2KWf5
 8nZvgQ47mJKj0qB4XZNuetRxnmlVlh9Fs162clKb9VhklFlupDfJMwGA93kmXwRu
 bmGEpmAVRCtjc6T89GLor4InxGz8sj5aY1phPhKvOXbONtAEfY5Z/TLroCh9gVjt
 J6jW14V8i24SDqgse65MJUtDAvNpVKtNwDlWbm7i91T6QN9IwrNoT6Yslk9xXCNH
 srkbEvtkcB4kIDwub45QIcbl381ZqyJuKLD2aIkzW5vyvux9eaQZRxMGls/OZZqw
 7Kxt4zIOM2A73+50hpCqlISQb5vTwl6HbwuzqYGLYuKYBXjxJEeUpPKQkHZikFn0
 /rg6Ui40VY84hEHK9R/tpZX2hYkShbpci3s95PvyhGjUzo64vc+9EU7Tbs3xRJAP
 5MwDT+t/7NL8VvpINJFByCpPDfk/qHOKItZ5dC7FK5VAZUHVv9BR6chlC/7kdOtw
 8nZEoZIHYFPc08hHjLNQ34/8pf2uG7rjbrK8Fk4kI1dZHHC7ZAQm0uSKEYM3x3LX
 EIClnW4ohDeVDrM+bzshn8BTnYcOCZHBh7AcjToAzzhtJH2DrW72Dl40F9BIWtHr
 lyUmAXtFoJZlnLnlFbrcw/OWz4+uPJ2NRYa/0Yg1cnWtC7WGc90=
 =99M0
 -----END PGP SIGNATURE-----

Merge tag 'jfs-4.10' of git://github.com/kleikamp/linux-shaggy

Pull jfs update from David Kleikamp:
 "The jfs piece of the current_time() series"

* tag 'jfs-4.10' of git://github.com/kleikamp/linux-shaggy:
  fs: jfs: Replace CURRENT_TIME_SEC by current_time()
2016-12-14 08:30:08 -08:00
Linus Torvalds
cdb98c2698 Revert "nvme: add support for the Write Zeroes command"
This reverts commit 6d31e3ba23.

This causes bootup problems for me both on my laptop and my desktop.
What they have in common is that they have NVMe disks with dm-crypt, but
it's not the same controller, so it's not controller-specific.

Jens does not see it on his machine (also NVMe), so it's presumably
something that triggers just on bootup.  Possibly related to dm-crypt
and the fact that I mark my luks volume with "allow-discards" in
/etc/crypttab.

It's 100% repeatable for me, which made it fairly straightforward to
bisect the problem to this commit. Small mercies.

So we don't know what the reason is yet, but the revert is needed to get
things going again.

Acked-by: Jens Axboe <axboe@fb.com>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@hgst.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-13 19:53:37 -08:00