Commit Graph

649804 Commits

Author SHA1 Message Date
Kees Cook
9e34404818 ARM: 8658/1: uaccess: fix zeroing of 64-bit get_user()
The 64-bit get_user() wasn't clearing the high word due to a typo in the
error handler. The exception handler entry was already correct, though.
Noticed during recent usercopy test additions in lib/test_user_copy.c.

Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-02-16 15:58:32 +00:00
Kees Cook
32b143637e ARM: 8657/1: uaccess: consistently check object sizes
In commit 76624175dc ("arm64: uaccess: consistently check object sizes"),
the object size checks are moved outside the access_ok() so that bad
destinations are detected before hitting the "memset(dest, 0, size)" in the
copy_from_user() failure path.

This makes the same change for arm, with attention given to possibly
extracting the uaccess routines into a common header file for all
architectures in the future.

Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-02-16 15:58:31 +00:00
Jens Axboe
5d7f5ce151 cfq-iosched: don't call wbt_disable_default() with IRQs disabled
wbt_disable_default() calls del_timer_sync() to wait for the wbt
timer to finish before disabling throttling. We can't do this with
IRQs disable. This fixes a lockdep splat on boot, if non-root
cgroups are used.

Reported-by: Gabriel C <nix.or.die@gmail.com>
Fixes: 87760e5eef ("block: hook up writeback throttling")
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-02-16 08:02:06 -07:00
Miklos Szeredi
84588a93d0 fuse: fix uninitialized flags in pipe_buffer
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: d82718e348 ("fuse_dev_splice_read(): switch to add_to_pipe()")
Cc: <stable@vger.kernel.org> # 4.9+
2017-02-16 15:08:20 +01:00
David S. Miller
bf3f14d634 rhashtable: Revert nested table changes.
This reverts commits:

6a25478077
9dbbfb0ab6
40137906c5

It's too risky to put in this late in the release
cycle.  We'll put these changes into the next merge
window instead.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 22:29:51 -05:00
Dave Airlie
b7a2699859 Merge tag 'drm-misc-fixes-2017-02-15' of git://anongit.freedesktop.org/git/drm-misc into drm-fixes
dp/mst oops fix for v4.10

* tag 'drm-misc-fixes-2017-02-15' of git://anongit.freedesktop.org/git/drm-misc:
  drm/dp/mst: fix kernel oops when turning off secondary monitor
2017-02-16 13:26:41 +10:00
Paul Mackerras
3f91a89d42 powerpc/64: Disable use of radix under a hypervisor
Currently, if the kernel is running on a POWER9 processor under a
hypervisor, it may try to use the radix MMU even though it doesn't have
the necessary code to do so (it doesn't negotiate use of radix, and it
doesn't do the H_REGISTER_PROC_TBL hcall).  If the hypervisor supports
both radix and HPT, then it will set up the guest to use HPT (since the
guest doesn't request radix in the CAS call), but if the radix feature
bit is set in the ibm,pa-features property (which is valid, since
ibm,pa-features is defined to represent the capabilities of the
processor) the guest will try to use radix, resulting in a crash when
it turns the MMU on.

This makes the minimal fix for the current code, which is to disable
radix unless we are running in hypervisor mode.

Fixes: 2bfd65e45e ("powerpc/mm/radix: Add radix callbacks for early init routines")
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-16 14:01:42 +11:00
Thomas Falcon
75224c93fa ibmvnic: Fix endian errors in error reporting output
Error reports received from firmware were not being converted from
big endian values, leading to bogus error codes reported on little
endian systems.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 14:48:31 -05:00
Thomas Falcon
28f4d16570 ibmvnic: Fix endian error when requesting device capabilities
When a vNIC client driver requests a faulty device setting, the
server returns an acceptable value for the client to request.
This 64 bit value was incorrectly being swapped as a 32 bit value,
resulting in loss of data. This patch corrects that by using
the 64 bit swap function.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 14:48:31 -05:00
Marcus Huewe
7627ae6030 net: neigh: Fix netevent NETEVENT_DELAY_PROBE_TIME_UPDATE notification
When setting a neigh related sysctl parameter, we always send a
NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
executing

	sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000

a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.

This is caused by commit 2a4501ae18 ("neigh: Send a
notification when DELAY_PROBE_TIME changes"). According to the
commit's description, it was intended to generate such an event
when setting the "delay_first_probe_time" sysctl parameter.

In order to fix this, only generate this event when actually
setting the "delay_first_probe_time" sysctl parameter. This fix
should not have any unintended side-effects, because all but one
registered netevent callbacks check for other netevent event
types (the registered callbacks were obtained by grepping for
"register_netevent_notifier"). The only callback that uses the
NETEVENT_DELAY_PROBE_TIME_UPDATE event is
mlxsw_sp_router_netevent_event() (in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
of this event, it only accesses the DELAY_PROBE_TIME of the
passed neigh_parms.

Fixes: 2a4501ae18 ("neigh: Send a notification when DELAY_PROBE_TIME changes")
Signed-off-by: Marcus Huewe <suse-tux@gmx.de>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 12:38:43 -05:00
Anssi Hannula
acf138f1b0 net: xilinx_emaclite: fix freezes due to unordered I/O
The xilinx_emaclite uses __raw_writel and __raw_readl for register
accesses. Those functions do not imply any kind of memory barriers and
they may be reordered.

The driver does not seem to take that into account, though, and the
driver does not satisfy the ordering requirements of the hardware.
For clear examples, see xemaclite_mdio_write() and xemaclite_mdio_read()
which try to set MDIO address before initiating the transaction.

I'm seeing system freezes with the driver with GCC 5.4 and current
Linux kernels on Zynq-7000 SoC immediately when trying to use the
interface.

In commit 123c1407af ("net: emaclite: Do not use microblaze and ppc
IO functions") the driver was switched from non-generic
in_be32/out_be32 (memory barriers, big endian) to
__raw_readl/__raw_writel (no memory barriers, native endian), so
apparently the device follows system endianness and the driver was
originally written with the assumption of memory barriers.

Rather than try to hunt for each case of missing barrier, just switch
the driver to use iowrite32/ioread32/iowrite32be/ioread32be depending
on endianness instead.

Tested on little-endian Zynq-7000 ARM SoC FPGA.

Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: 123c1407af ("net: emaclite: Do not use microblaze and ppc IO
functions")
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 12:18:34 -05:00
Anssi Hannula
cd22455364 net: xilinx_emaclite: fix receive buffer overflow
xilinx_emaclite looks at the received data to try to determine the
Ethernet packet length but does not properly clamp it if
proto_type == ETH_P_IP or 1500 < proto_type <= 1518, causing a buffer
overflow and a panic via skb_panic() as the length exceeds the allocated
skb size.

Fix those cases.

Also add an additional unconditional check with WARN_ON() at the end.

Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Fixes: bb81b2ddfa ("net: add Xilinx emac lite device driver")
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-15 12:18:33 -05:00
Yinghai Lu
afe3e4d11b PCI/PME: Restore pcie_pme_driver.remove
In addition to making PME non-modular, d7def20400 ("PCI/PME: Make
explicitly non-modular") removed the pcie_pme_driver .remove() method,
pcie_pme_remove().

pcie_pme_remove() freed the PME IRQ that was requested in pci_pme_probe().
The fact that we don't free the IRQ after d7def20400 causes the following
crash when removing a PCIe port device via /sys:

  ------------[ cut here ]------------
  kernel BUG at drivers/pci/msi.c:370!
  invalid opcode: 0000 [#1] SMP
  Modules linked in:
  CPU: 1 PID: 14509 Comm: sh Tainted: G    W  4.8.0-rc1-yh-00012-gd29438d
  RIP: 0010:[<ffffffff9758bbf5>]  free_msi_irqs+0x65/0x190
  ...
  Call Trace:
   [<ffffffff9758cda4>] pci_disable_msi+0x34/0x40
   [<ffffffff97583817>] cleanup_service_irqs+0x27/0x30
   [<ffffffff97583e9a>] pcie_port_device_remove+0x2a/0x40
   [<ffffffff97584250>] pcie_portdrv_remove+0x40/0x50
   [<ffffffff97576d7b>] pci_device_remove+0x4b/0xc0
   [<ffffffff9785ebe6>] __device_release_driver+0xb6/0x150
   [<ffffffff9785eca5>] device_release_driver+0x25/0x40
   [<ffffffff975702e4>] pci_stop_bus_device+0x74/0xa0
   [<ffffffff975704ea>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
   [<ffffffff97578810>] remove_store+0x50/0x70
   [<ffffffff9785a378>] dev_attr_store+0x18/0x30
   [<ffffffff97260b64>] sysfs_kf_write+0x44/0x60
   [<ffffffff9725feae>] kernfs_fop_write+0x10e/0x190
   [<ffffffff971e13f8>] __vfs_write+0x28/0x110
   [<ffffffff970b0fa4>] ? percpu_down_read+0x44/0x80
   [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
   [<ffffffff971e53a7>] ? __sb_start_write+0xa7/0xe0
   [<ffffffff971e1f04>] vfs_write+0xc4/0x180
   [<ffffffff971e3089>] SyS_write+0x49/0xa0
   [<ffffffff97001a46>] do_syscall_64+0xa6/0x1b0
   [<ffffffff9819201e>] entry_SYSCALL64_slow_path+0x25/0x25
  ...
   RIP  [<ffffffff9758bbf5>] free_msi_irqs+0x65/0x190
   RSP <ffff89ad3085bc48>
  ---[ end trace f4505e1dac5b95d3 ]---
  Segmentation fault

Restore pcie_pme_remove().

[bhelgaas: changelog]
Fixes: d7def20400 ("PCI/PME: Make explicitly non-modular")
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org	# v4.9+
2017-02-15 09:39:32 -06:00
Sergey Senozhatsky
f222449c9d timekeeping: Use deferred printk() in debug code
We cannot do printk() from tk_debug_account_sleep_time(), because
tk_debug_account_sleep_time() is called under tk_core seq lock.
The reason why printk() is unsafe there is that console_sem may
invoke scheduler (up()->wake_up_process()->activate_task()), which,
in turn, can return back to timekeeping code, for instance, via
get_time()->ktime_get(), deadlocking the system on tk_core seq lock.

[   48.950592] ======================================================
[   48.950622] [ INFO: possible circular locking dependency detected ]
[   48.950622] 4.10.0-rc7-next-20170213+ #101 Not tainted
[   48.950622] -------------------------------------------------------
[   48.950622] kworker/0:0/3 is trying to acquire lock:
[   48.950653]  (tk_core){----..}, at: [<c01cc624>] retrigger_next_event+0x4c/0x90
[   48.950683]
               but task is already holding lock:
[   48.950683]  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.950714]
               which lock already depends on the new lock.

[   48.950714]
               the existing dependency chain (in reverse order) is:
[   48.950714]
               -> #5 (hrtimer_bases.lock){-.-...}:
[   48.950744]        _raw_spin_lock_irqsave+0x50/0x64
[   48.950775]        lock_hrtimer_base+0x28/0x58
[   48.950775]        hrtimer_start_range_ns+0x20/0x5c8
[   48.950775]        __enqueue_rt_entity+0x320/0x360
[   48.950805]        enqueue_rt_entity+0x2c/0x44
[   48.950805]        enqueue_task_rt+0x24/0x94
[   48.950836]        ttwu_do_activate+0x54/0xc0
[   48.950836]        try_to_wake_up+0x248/0x5c8
[   48.950836]        __setup_irq+0x420/0x5f0
[   48.950836]        request_threaded_irq+0xdc/0x184
[   48.950866]        devm_request_threaded_irq+0x58/0xa4
[   48.950866]        omap_i2c_probe+0x530/0x6a0
[   48.950897]        platform_drv_probe+0x50/0xb0
[   48.950897]        driver_probe_device+0x1f8/0x2cc
[   48.950897]        __driver_attach+0xc0/0xc4
[   48.950927]        bus_for_each_dev+0x6c/0xa0
[   48.950927]        bus_add_driver+0x100/0x210
[   48.950927]        driver_register+0x78/0xf4
[   48.950958]        do_one_initcall+0x3c/0x16c
[   48.950958]        kernel_init_freeable+0x20c/0x2d8
[   48.950958]        kernel_init+0x8/0x110
[   48.950988]        ret_from_fork+0x14/0x24
[   48.950988]
               -> #4 (&rt_b->rt_runtime_lock){-.-...}:
[   48.951019]        _raw_spin_lock+0x40/0x50
[   48.951019]        rq_offline_rt+0x9c/0x2bc
[   48.951019]        set_rq_offline.part.2+0x2c/0x58
[   48.951049]        rq_attach_root+0x134/0x144
[   48.951049]        cpu_attach_domain+0x18c/0x6f4
[   48.951049]        build_sched_domains+0xba4/0xd80
[   48.951080]        sched_init_smp+0x68/0x10c
[   48.951080]        kernel_init_freeable+0x160/0x2d8
[   48.951080]        kernel_init+0x8/0x110
[   48.951080]        ret_from_fork+0x14/0x24
[   48.951110]
               -> #3 (&rq->lock){-.-.-.}:
[   48.951110]        _raw_spin_lock+0x40/0x50
[   48.951141]        task_fork_fair+0x30/0x124
[   48.951141]        sched_fork+0x194/0x2e0
[   48.951141]        copy_process.part.5+0x448/0x1a20
[   48.951171]        _do_fork+0x98/0x7e8
[   48.951171]        kernel_thread+0x2c/0x34
[   48.951171]        rest_init+0x1c/0x18c
[   48.951202]        start_kernel+0x35c/0x3d4
[   48.951202]        0x8000807c
[   48.951202]
               -> #2 (&p->pi_lock){-.-.-.}:
[   48.951232]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951232]        try_to_wake_up+0x30/0x5c8
[   48.951232]        up+0x4c/0x60
[   48.951263]        __up_console_sem+0x2c/0x58
[   48.951263]        console_unlock+0x3b4/0x650
[   48.951263]        vprintk_emit+0x270/0x474
[   48.951293]        vprintk_default+0x20/0x28
[   48.951293]        printk+0x20/0x30
[   48.951324]        kauditd_hold_skb+0x94/0xb8
[   48.951324]        kauditd_thread+0x1a4/0x56c
[   48.951324]        kthread+0x104/0x148
[   48.951354]        ret_from_fork+0x14/0x24
[   48.951354]
               -> #1 ((console_sem).lock){-.....}:
[   48.951385]        _raw_spin_lock_irqsave+0x50/0x64
[   48.951385]        down_trylock+0xc/0x2c
[   48.951385]        __down_trylock_console_sem+0x24/0x80
[   48.951385]        console_trylock+0x10/0x8c
[   48.951416]        vprintk_emit+0x264/0x474
[   48.951416]        vprintk_default+0x20/0x28
[   48.951416]        printk+0x20/0x30
[   48.951446]        tk_debug_account_sleep_time+0x5c/0x70
[   48.951446]        __timekeeping_inject_sleeptime.constprop.3+0x170/0x1a0
[   48.951446]        timekeeping_resume+0x218/0x23c
[   48.951477]        syscore_resume+0x94/0x42c
[   48.951477]        suspend_enter+0x554/0x9b4
[   48.951477]        suspend_devices_and_enter+0xd8/0x4b4
[   48.951507]        enter_state+0x934/0xbd4
[   48.951507]        pm_suspend+0x14/0x70
[   48.951507]        state_store+0x68/0xc8
[   48.951538]        kernfs_fop_write+0xf4/0x1f8
[   48.951538]        __vfs_write+0x1c/0x114
[   48.951538]        vfs_write+0xa0/0x168
[   48.951568]        SyS_write+0x3c/0x90
[   48.951568]        __sys_trace_return+0x0/0x10
[   48.951568]
               -> #0 (tk_core){----..}:
[   48.951599]        lock_acquire+0xe0/0x294
[   48.951599]        ktime_get_update_offsets_now+0x5c/0x1d4
[   48.951629]        retrigger_next_event+0x4c/0x90
[   48.951629]        on_each_cpu+0x40/0x7c
[   48.951629]        clock_was_set_work+0x14/0x20
[   48.951660]        process_one_work+0x2b4/0x808
[   48.951660]        worker_thread+0x3c/0x550
[   48.951660]        kthread+0x104/0x148
[   48.951690]        ret_from_fork+0x14/0x24
[   48.951690]
               other info that might help us debug this:

[   48.951690] Chain exists of:
                 tk_core --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock

[   48.951721]  Possible unsafe locking scenario:

[   48.951721]        CPU0                    CPU1
[   48.951721]        ----                    ----
[   48.951721]   lock(hrtimer_bases.lock);
[   48.951751]                                lock(&rt_b->rt_runtime_lock);
[   48.951751]                                lock(hrtimer_bases.lock);
[   48.951751]   lock(tk_core);
[   48.951782]
                *** DEADLOCK ***

[   48.951782] 3 locks held by kworker/0:0/3:
[   48.951782]  #0:  ("events"){.+.+.+}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951812]  #1:  (hrtimer_work){+.+...}, at: [<c0156590>] process_one_work+0x1f8/0x808
[   48.951843]  #2:  (hrtimer_bases.lock){-.-...}, at: [<c01cc610>] retrigger_next_event+0x38/0x90
[   48.951843]   stack backtrace:
[   48.951873] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.10.0-rc7-next-20170213+
[   48.951904] Workqueue: events clock_was_set_work
[   48.951904] [<c0110208>] (unwind_backtrace) from [<c010c224>] (show_stack+0x10/0x14)
[   48.951934] [<c010c224>] (show_stack) from [<c04ca6c0>] (dump_stack+0xac/0xe0)
[   48.951934] [<c04ca6c0>] (dump_stack) from [<c019b5cc>] (print_circular_bug+0x1d0/0x308)
[   48.951965] [<c019b5cc>] (print_circular_bug) from [<c019d2a8>] (validate_chain+0xf50/0x1324)
[   48.951965] [<c019d2a8>] (validate_chain) from [<c019ec18>] (__lock_acquire+0x468/0x7e8)
[   48.951995] [<c019ec18>] (__lock_acquire) from [<c019f634>] (lock_acquire+0xe0/0x294)
[   48.951995] [<c019f634>] (lock_acquire) from [<c01d0ea0>] (ktime_get_update_offsets_now+0x5c/0x1d4)
[   48.952026] [<c01d0ea0>] (ktime_get_update_offsets_now) from [<c01cc624>] (retrigger_next_event+0x4c/0x90)
[   48.952026] [<c01cc624>] (retrigger_next_event) from [<c01e4e24>] (on_each_cpu+0x40/0x7c)
[   48.952056] [<c01e4e24>] (on_each_cpu) from [<c01cafc4>] (clock_was_set_work+0x14/0x20)
[   48.952056] [<c01cafc4>] (clock_was_set_work) from [<c015664c>] (process_one_work+0x2b4/0x808)
[   48.952087] [<c015664c>] (process_one_work) from [<c0157774>] (worker_thread+0x3c/0x550)
[   48.952087] [<c0157774>] (worker_thread) from [<c015d644>] (kthread+0x104/0x148)
[   48.952087] [<c015d644>] (kthread) from [<c0107830>] (ret_from_fork+0x14/0x24)

Replace printk() with printk_deferred(), which does not call into
the scheduler.

Fixes: 0bf43f15db ("timekeeping: Prints the amounts of time spent during suspend")
Reported-and-tested-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J . Wysocki" <rjw@rjwysocki.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "[4.9+]" <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20170215044332.30449-1-sergey.senozhatsky@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-15 11:46:08 +01:00
Pierre-Louis Bossart
bb08c04dc8 drm/dp/mst: fix kernel oops when turning off secondary monitor
100% reproducible issue found on SKL SkullCanyon NUC with two external
DP daisy-chained monitors in DP/MST mode. When turning off or changing
the input of the second monitor the machine stops with a kernel
oops. This issue happened with 4.8.8 as well as drm/drm-intel-nightly.

This issue is traced to an inconsistent control flow in
drm_dp_update_payload_part1(): the 'port' pointer is set to NULL at the
same time as 'req_payload.num_slots' is set to zero, but the pointer is
dereferenced even when req_payload.num_slot is zero.

The problematic dereference was introduced in commit dfda0df34
("drm/mst: rework payload table allocation to conform better") and may
impact all versions since v3.18

The fix suggested by Chris Wilson removes the kernel oops and was found to
work well after 10mn of monkey-testing with the second monitor power and
input buttons

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98990
Fixes: dfda0df342 ("drm/mst: rework payload table allocation to conform better.")
Cc: Dave Airlie <airlied@redhat.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Nathan D Ciobanu <nathan.d.ciobanu@linux.intel.com>
Cc: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: <stable@vger.kernel.org> # v3.18+
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1487076561-2169-1-git-send-email-jani.nikula@intel.com
2017-02-15 11:41:19 +02:00
Sahitya Tummala
6ba4d2722d fuse: fix use after free issue in fuse_dev_do_read()
There is a potential race between fuse_dev_do_write()
and request_wait_answer() contexts as shown below:

TASK 1:
__fuse_request_send():
  |--spin_lock(&fiq->waitq.lock);
  |--queue_request();
  |--spin_unlock(&fiq->waitq.lock);
  |--request_wait_answer():
       |--if (test_bit(FR_SENT, &req->flags))
       <gets pre-empted after it is validated true>
                                   TASK 2:
                                   fuse_dev_do_write():
                                     |--clears bit FR_SENT,
                                     |--request_end():
                                        |--sets bit FR_FINISHED
                                        |--spin_lock(&fiq->waitq.lock);
                                        |--list_del_init(&req->intr_entry);
                                        |--spin_unlock(&fiq->waitq.lock);
                                        |--fuse_put_request();
       |--queue_interrupt();
       <request gets queued to interrupts list>
            |--wake_up_locked(&fiq->waitq);
       |--wait_event_freezable();
       <as FR_FINISHED is set, it returns and then
       the caller frees this request>

Now, the next fuse_dev_do_read(), see interrupts list is not empty
and then calls fuse_read_interrupt() which tries to access the request
which is already free'd and gets the below crash:

[11432.401266] Unable to handle kernel paging request at virtual address
6b6b6b6b6b6b6b6b
...
[11432.418518] Kernel BUG at ffffff80083720e0
[11432.456168] PC is at __list_del_entry+0x6c/0xc4
[11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
...
[11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
[11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
[11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
[11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
[11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
[11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94

As FR_FINISHED bit is set before deleting the intr_entry with input
queue lock in request completion path, do the testing of this flag and
queueing atomically with the same lock in queue_interrupt().

Signed-off-by: Sahitya Tummala <stummala@codeaurora.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: fd22d62ed0 ("fuse: no fc->lock for iqueue parts")
Cc: <stable@vger.kernel.org> # 4.2+
2017-02-15 10:28:24 +01:00
Stephen Rothwell
5463b3d043 bpf: kernel header files need to be copied into the tools directory
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 22:27:31 -05:00
Eric Dumazet
e70ac17165 tcp: tcp_probe: use spin_lock_bh()
tcp_rcv_established() can now run in process context.

We need to disable BH while acquiring tcp probe spinlock,
or risk a deadlock.

Fixes: 5413d1babe ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ricardo Nabinger Sanchez <rnsanchez@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 22:19:39 -05:00
Dmitry V. Levin
a725eb15db uapi: fix linux/if_pppol2tp.h userspace compilation errors
Because of <linux/libc-compat.h> interface limitations, <netinet/in.h>
provided by libc cannot be included after <linux/in.h>, therefore any
header that includes <netinet/in.h> cannot be included after <linux/in.h>.

Change uapi/linux/l2tp.h, the last uapi header that includes
<netinet/in.h>, to include <linux/in.h> and <linux/in6.h> instead of
<netinet/in.h> and use __SOCK_SIZE__ instead of sizeof(struct sockaddr)
the same way as uapi/linux/in.h does, to fix linux/if_pppol2tp.h userspace
compilation errors like this:

In file included from /usr/include/linux/l2tp.h:12:0,
                 from /usr/include/linux/if_pppol2tp.h:21,
/usr/include/netinet/in.h:31:8: error: redefinition of 'struct in_addr'

Fixes: 47c3e7783b ("net: l2tp: deprecate PPPOL2TP_MSG_* in favour of L2TP_MSG_*")
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 22:18:05 -05:00
Jarkko Nikula
12688dc21f Revert "i2c: designware: detect when dynamic tar update is possible"
This reverts commit 63d0f0a695.

It caused a regression on platforms where I2C controller is synthesized
with dynamic TAR update disabled. Detection code is testing is bit
DW_IC_CON_10BITADDR_MASTER in register DW_IC_CON read-only but fails to
restore original value in case bit is read-write.

Instead of fixing this we revert the commit since it was preparation for
the commit 0317e6c0f1 ("i2c: designware: do not disable adapter after
transfer") which was also reverted.

Reported-by: Shah Nehal-Bakulchandra <Nehal-bakulchandra.Shah@amd.com>
Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-By: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: <stable@vger.kernel.org> # v4.9+
Fixes: 63d0f0a695 ("i2c: designware: detect when dynamic tar update is possible")
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-02-14 22:10:26 +01:00
Mauro Carvalho Chehab
f9c85ee671 [media] siano: make it work again with CONFIG_VMAP_STACK
Reported as a Kaffeine bug:
	https://bugs.kde.org/show_bug.cgi?id=375811

The USB control messages require DMA to work. We cannot pass
a stack-allocated buffer, as it is not warranted that the
stack would be into a DMA enabled area.

On Kernel 4.9, the default is to not accept DMA on stack anymore
on x86 architecture. On other architectures, this has been a
requirement since Kernel 2.2. So, after this patch, this driver
should likely work fine on all archs.

Tested with USB ID 2040:5510: Hauppauge Windham

Cc: stable@vger.kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-14 18:13:49 -02:00
Eric Dumazet
d199fab63c packet: fix races in fanout_add()
Multiple threads can call fanout_add() at the same time.

We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.

Do the same in fanout_release(), for peace of mind, and to help us
finding lockdep issues earlier.

Fixes: dc99f60069 ("packet: Add fanout support.")
Fixes: 0648ab70af ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 15:05:12 -05:00
Thomas Falcon
f39f0d1e1e ibmvnic: Fix initial MTU settings
In the current driver, the MTU is set to the maximum value
capable for the backing device. This decision turned out to
be a mistake as it led to confusion among users. The expected
initial MTU value used for other IBM vNIC capable operating
systems is 1500, with the maximum value (9000) reserved for
when Jumbo frames are enabled. This patch sets the MTU to
the default value for a net device.

It also corrects a discrepancy between MTU values received from
firmware, which includes the ethernet header length, and net
device MTU values.

Finally, it removes redundant min/max MTU assignments after device
initialization.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 14:57:45 -05:00
Ivan Khoronzhuk
a60ced990e net: ethernet: ti: cpsw: fix cpsw assignment in resume
There is a copy-paste error, which hides breaking of resume
for CPSW driver: there was replaced netdev_priv() to ndev_to_cpsw(ndev)
in suspend, but left it unchanged in resume.

Fixes: 606f399395
(ti: cpsw: move platform data and slaves info to cpsw_common)

Reported-by: Alexey Starikovskiy <AStarikovskiy@topcon.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 14:54:19 -05:00
WANG Cong
cd27b96bc1 kcm: fix a null pointer dereference in kcm_sendmsg()
In commit 98e3862ca2 ("kcm: fix 0-length case for kcm_sendmsg()")
I tried to avoid skb allocation for 0-length case, but missed
a check for NULL pointer in the non EOR case.

Fixes: 98e3862ca2 ("kcm: fix 0-length case for kcm_sendmsg()")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Tom Herbert <tom@herbertland.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 13:06:37 -05:00
Rui Sousa
01f8902bcf net: fec: fix multicast filtering hardware setup
Fix hardware setup of multicast address hash:
- Never clear the hardware hash (to avoid packet loss)
- Construct the hash register values in software and then write once
to hardware

Signed-off-by: Rui Sousa <rui.sousa@nxp.com>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 12:15:34 -05:00
David S. Miller
144adc655f Merge branch 'ipv6-v4mapped'
Jonathan T. Leighton says:

====================
IPv4-mapped on wire, :: dst address issue

Under some circumstances IPv6 datagrams are sent with IPv4-mapped IPv6
addresses as the source. Given an IPv6 socket bound to an IPv4-mapped
IPv6 address, and an IPv6 destination address, both TCP and UDP will
will send packets using the IPv4-mapped IPv6 address as the source. Per
RFC 6890 (Table 20), IPv4-mapped IPv6 source addresses are not allowed
in an IP datagram. The problem can be observed by attempting to
connect() either a TCP or UDP socket, or by using sendmsg() with a UDP
socket. The patch is intended to correct this issue for all socket
types.

linux follows the BSD convention that an IPv6 destination address
specified as in6addr_any is converted to the loopback address.
Currently, neither TCP nor UDP consider the possibility that the source
address is an IPv4-mapped IPv6 address, and assume that the appropriate
loopback address is ::1. The patch adds a check on whether or not the
source address is an IPv4-mapped IPv6 address and then sets the
destination address to either ::ffff:127.0.0.1 or ::1, as appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 12:13:52 -05:00
Jonathan T. Leighton
052d2369d1 ipv6: Handle IPv4-mapped src to in6addr_any dst.
This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.

Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 12:13:51 -05:00
Jonathan T. Leighton
ec5e3b0a1d ipv6: Inhibit IPv4-mapped src address on the wire.
This patch adds a check for the problematic case of an IPv4-mapped IPv6
source address and a destination address that is neither an IPv4-mapped
IPv6 address nor in6addr_any, and returns an appropriate error. The
check in done before returning from looking up the route.

Signed-off-by: Jonathan T. Leighton <jtleight@udel.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 12:13:51 -05:00
Or Gerlitz
fed06ee89b net/mlx5e: Disable preemption when doing TC statistics upcall
When called by HW offloading drivers, the TC action (e.g
net/sched/act_mirred.c) code uses this_cpu logic, e.g

 _bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets)

per the kernel documention, preemption should be disabled, add that.

Before the fix, when running with CONFIG_PREEMPT set, we get a

BUG: using smp_processor_id() in preemptible [00000000] code: tc/3793

asserion from the TC action (mirred) stats_update callback.

Fixes: aad7e08d39 ('net/mlx5e: Hardware offloaded flower filter statistics support')
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-14 11:56:01 -05:00
Linus Torvalds
747ae0a96f media fixes for v4.10-rc8
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJYosRpAAoJEAhfPr2O5OEVAU4P/RB/A9v422J1aFixQ1fPp89P
 xRp6m6xj2ln/r/ydl5j3LgSzA6nCSQT4p1jRalFRdpk/FyS4v6wE4RhaIDGW/Q1P
 WDwRfcyrUdWIPZR6T0289m8eTHG0w4ewbEHPm5iG5UZQHZsObEpmNJD6mOSm00N4
 0JhAmJIccWxIjObIajZizQ9BEUUE+T1PQgDf7QiTHFb3d1UrNXYu/cka8Ys9lafu
 kR//BPRxLsCXWWHf45ZHgYH+V214haGcVhtlf1ehALlZQ3wNYZaC7XvH8G795ykR
 ayrc77qLj2NROhTG4rvljyOH4L0I1IH0k2bC633FrPpgzOCCqNLBMLwn5YhFN3Z+
 QNO2ChxDmZlWcqbepkOmK6IK2HmIulM4AOK0gQa+J4inYM3w2aZD3egaFFsx2I0+
 1y7KSKAULNjEZanx80t3rxGw+bnIh80eD3n2ODgAU0spqEsm5cb3C3JIN3qeS8KQ
 j6vVawB2+gTRCRIHWDBHnYUzkn2SPINp1bRajuxryH1u5TKgvs02NQB84i9mdNR+
 iRpH7Yn8BIuA2w0sGGjbyBYyb6IVFrdx9yS7xsr4CbKYR8tWvEsqRxlN7V16J30M
 crxj2OM+yAIXLTSB1bW892WL51JqJRGF3lFnUGS4R4PH9tMvR4YN/PT8yv0wZJC/
 SfwwJ38/yZBkZuRSNEpt
 =OTHY
 -----END PGP SIGNATURE-----

Merge tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "A colorspace regression fix in V4L2 core and a CEC core bug that makes
  it discard valid messages"

* tag 'media/v4.10-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] cec: initiator should be the same as the destination for, poll
  [media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
2017-02-14 06:29:21 -08:00
Anssi Hannula
3d4ef32975 mmc: core: fix multi-bit bus width without high-speed mode
Commit 577fb13199 ("mmc: rework selection of bus speed mode")
refactored bus width selection code to mmc_select_bus_width().

However, it also altered the behavior to not call the selection code in
non-high-speed modes anymore.

This causes 1-bit mode to always be used when the high-speed mode is not
enabled, even though 4-bit and 8-bit bus are valid bus widths in the
backwards-compatibility (legacy) mode as well (see e.g. 5.3.2 Bus Speed
Modes in JEDEC 84-B50). This results in a significant regression in
transfer speeds.

Fix the code to allow 4-bit and 8-bit widths even without high-speed
mode, as before.

Tested with a Zynq-7000 PicoZed 7020 board.

Fixes: 577fb13199 ("mmc: rework selection of bus speed mode")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: <stable@vger.kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2017-02-14 08:50:10 +01:00
David S. Miller
0c8ef291d9 Merge branch 'rhashtable-allocation-failure-during-insertion'
Herbert Xu says:

====================
rhashtable: Handle table allocation failure during insertion

v2 -

Added Ack to patch 2.
Fixed RCU annotation in code path executed by rehasher by using
rht_dereference_bucket.

v1 -

This series tackles the problem of table allocation failures during
insertion.  The issue is that we cannot vmalloc during insertion.
This series deals with this by introducing nested tables.

The first two patches removes manual hash table walks which cannot
work on a nested table.

The final patch introduces nested tables.

I've tested this with test_rhashtable and it appears to work.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 22:17:06 -05:00
Herbert Xu
40137906c5 rhashtable: Add nested tables
This patch adds code that handles GFP_ATOMIC kmalloc failure on
insertion.  As we cannot use vmalloc, we solve it by making our
hash table nested.  That is, we allocate single pages at each level
and reach our desired table size by nesting them.

When a nested table is created, only a single page is allocated
at the top-level.  Lower levels are allocated on demand during
insertion.  Therefore for each insertion to succeed, only two
(non-consecutive) pages are needed.

After a nested table is created, a rehash will be scheduled in
order to switch to a vmalloced table as soon as possible.  Also,
the rehash code will never rehash into a nested table.  If we
detect a nested table during a rehash, the rehash will be aborted
and a new rehash will be scheduled.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 22:17:05 -05:00
Herbert Xu
9dbbfb0ab6 tipc: Fix tipc_sk_reinit race conditions
There are two problems with the function tipc_sk_reinit.  Firstly
it's doing a manual walk over an rhashtable.  This is broken as
an rhashtable can be resized and if you manually walk over it
during a resize then you may miss entries.

Secondly it's missing memory barriers as previously the code used
spinlocks which provide the barriers implicitly.

This patch fixes both problems.

Fixes: 07f6c4bc04 ("tipc: convert tipc reference table to...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 22:17:05 -05:00
Herbert Xu
6a25478077 gfs2: Use rhashtable walk interface in glock_hash_walk
The function glock_hash_walk walks the rhashtable by hand.  This
is broken because if it catches the hash table in the middle of
a rehash, then it will miss entries.

This patch replaces the manual walk by using the rhashtable walk
interface.

Fixes: 88ffbf3e03 ("GFS2: Use resizable hash table for glocks")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 22:17:05 -05:00
Ralf Baechle
4872e57c81 NET: Fix /proc/net/arp for AX.25
When sending ARP requests over AX.25 links the hwaddress in the neighbour
cache are not getting initialized.  For such an incomplete arp entry
ax2asc2 will generate an empty string resulting in /proc/net/arp output
like the following:

$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3
172.20.1.99      0x3         0x0              *        bpq0

The missing field will confuse the procfs parsing of arp(8) resulting in
incorrect output for the device such as the following:

$ arp
Address                  HWtype  HWaddress           Flags Mask            Iface
gateway                  ether   52:54:00:00:5d:5f   C                     ens3
172.20.1.99                      (incomplete)                              ens3

This changes the content of /proc/net/arp to:

$ cat /proc/net/arp
IP address       HW type     Flags       HW address            Mask     Device
172.20.1.99      0x3         0x0         *                     *        bpq0
192.168.122.1    0x1         0x2         52:54:00:00:5d:5f     *        ens3

To do so it change ax2asc to put the string "*" in buf for a NULL address
argument.  Finally the HW address field is left aligned in a 17 character
field (the length of an ethernet HW address in the usual hex notation) for
readability.

Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 22:15:03 -05:00
Mart van Santen
ebf692f85f xen-netback: vif counters from int/long to u64
This patch fixes an issue where the type of counters in the queue(s)
and interface are not in sync (queue counters are int, interface
counters are long), causing incorrect reporting of tx/rx values
of the vif interface and unclear counter overflows.
This patch sets both counters to the u64 type.

Signed-off-by: Mart van Santen <mart@greenhost.nl>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 21:49:53 -05:00
Kirill A. Shutemov
3ba5b5ea7d x86/vm86: Fix unused variable warning if THP is disabled
GCC complains about unused variable 'vma' in mark_screen_rdonly() if THP is
disabled:

arch/x86/kernel/vm86_32.c: In function ‘mark_screen_rdonly’:
arch/x86/kernel/vm86_32.c:180:26: warning: unused variable ‘vma’
[-Wunused-variable]
   struct vm_area_struct *vma = find_vma(mm, 0xA0000);

That's silly. pmd_trans_huge() resolves to 0 when THP is disabled, so the
whole block should be eliminated.

Moving the variable declaration outside the if() block shuts GCC up.

Reported-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Carlos O'Donell <carlos@redhat.com>
Link: http://lkml.kernel.org/r/20170213125228.63645-1-kirill.shutemov@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-13 19:04:38 +01:00
Arnaldo Carvalho de Melo
0c59d28121 MAINTAINERS: Remove old e-mail address
The ghostprotocols.net domain is not working, remove it from CREDITS and
MAINTAINERS, and change the status to "Odd fixes", and since I haven't
been maintaining those, remove my address from there.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-13 12:24:56 -05:00
Hans Verkuil
42980da2eb [media] cec: initiator should be the same as the destination for, poll
Poll messages that are used to allocate a logical address should
use the same initiator as the destination. Instead, it expected that
the initiator was 0xf which is not according to the standard.

This also had consequences for the message checks in cec_transmit_msg_fh
that incorrectly rejected poll messages with the same initiator and
destination.

Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-13 14:34:11 -02:00
Hans Verkuil
35879ee476 [media] videodev2.h: go back to limited range Y'CbCr for SRGB and, ADOBERGB
This reverts 'commit 7e0739cd9c ("[media] videodev2.h: fix
sYCC/AdobeYCC default quantization range").

The problem is that many drivers can convert R'G'B' content (often
from sensors) to Y'CbCr, but they all produce limited range Y'CbCr.

To stay backwards compatible the default quantization range for
sRGB and AdobeRGB Y'CbCr encoding should be limited range, not full
range, even though the corresponding standards specify full range.

Update the V4L2_MAP_QUANTIZATION_DEFAULT define accordingly and
also update the documentation.

Fixes: 7e0739cd9c ("[media] videodev2.h: fix sYCC/AdobeYCC default quantization range")
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Cc: <stable@vger.kernel.org>      # for v4.9 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
2017-02-13 14:33:56 -02:00
Yang Yang
25f71d1c3e futex: Move futex_init() to core_initcall
The UEVENT user mode helper is enabled before the initcalls are executed
and is available when the root filesystem has been mounted.

The user mode helper is triggered by device init calls and the executable
might use the futex syscall.

futex_init() is marked __initcall which maps to device_initcall, but there
is no guarantee that futex_init() is invoked _before_ the first device init
call which triggers the UEVENT user mode helper.

If the user mode helper uses the futex syscall before futex_init() then the
syscall crashes with a NULL pointer dereference because the futex subsystem
has not been initialized yet.

Move futex_init() to core_initcall so futexes are initialized before the
root filesystem is mounted and the usermode helper becomes available.

[ tglx: Rewrote changelog ]

Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Cc: jiang.biao2@zte.com.cn
Cc: jiang.zhengxiong@zte.com.cn
Cc: zhong.weidong@zte.com.cn
Cc: deng.huali@zte.com.cn
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1483085875-6130-1-git-send-email-yang.yang29@zte.com.cn
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-13 16:12:22 +01:00
Mike Galbraith
202461e2f3 tick/broadcast: Prevent deadlock on tick_broadcast_lock
tick_broadcast_lock is taken from interrupt context, but the following call
chain takes the lock without disabling interrupts:

[   12.703736]  _raw_spin_lock+0x3b/0x50
[   12.703738]  tick_broadcast_control+0x5a/0x1a0
[   12.703742]  intel_idle_cpu_online+0x22/0x100
[   12.703744]  cpuhp_invoke_callback+0x245/0x9d0
[   12.703752]  cpuhp_thread_fun+0x52/0x110
[   12.703754]  smpboot_thread_fn+0x276/0x320

So the following deadlock can happen:

   lock(tick_broadcast_lock);
   <Interrupt>
      lock(tick_broadcast_lock);

intel_idle_cpu_online() is the only place which violates the calling
convention of tick_broadcast_control(). This was caused by the removal of
the smp function call in course of the cpu hotplug rework.

Instead of slapping local_irq_disable/enable() at the call site, we can
relax the calling convention and handle it in the core code, which makes
the whole machinery more robust.

Fixes: 29d7bbada9 ("intel_idle: Remove superfluous SMP fuction call")
Reported-by: Gabriel C <nix.or.die@gmail.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Ruslan Ruslichenko <rruslich@cisco.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: lwn@lwn.net
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: stable <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@gmx.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-02-13 09:49:31 +01:00
Eric Dumazet
8b74d439e1 net/llc: avoid BUG_ON() in skb_orphan()
It seems nobody used LLC since linux-3.12.

Fortunately fuzzers like syzkaller still know how to run this code,
otherwise it would be no fun.

Setting skb->sk without skb->destructor leads to all kinds of
bugs, we now prefer to be very strict about it.

Ideally here we would use skb_set_owner() but this helper does not exist yet,
only CAN seems to have a private helper for that.

Fixes: 376c7311bd ("net: add a temporary sanity check in skb_orphan()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-12 22:14:49 -05:00
Alexei Starovoitov
7f67763337 bpf: introduce BPF_F_ALLOW_OVERRIDE flag
If BPF_F_ALLOW_OVERRIDE flag is used in BPF_PROG_ATTACH command
to the given cgroup the descendent cgroup will be able to override
effective bpf program that was inherited from this cgroup.
By default it's not passed, therefore override is disallowed.

Examples:
1.
prog X attached to /A with default
prog Y fails to attach to /A/B and /A/B/C
Everything under /A runs prog X

2.
prog X attached to /A with allow_override.
prog Y fails to attach to /A/B with default (non-override)
prog M attached to /A/B with allow_override.
Everything under /A/B runs prog M only.

3.
prog X attached to /A with allow_override.
prog Y fails to attach to /A with default.
The user has to detach first to switch the mode.

In the future this behavior may be extended with a chain of
non-overridable programs.

Also fix the bug where detach from cgroup where nothing is attached
was not throwing error. Return ENOENT in such case.

Add several testcases and adjust libbpf.

Fixes: 3007098494 ("cgroup: add support for eBPF programs")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-12 21:52:19 -05:00
IHARA Hiroka
722c5ac708 Input: elan_i2c - add ELAN0605 to the ACPI table
ELAN0605 has been confirmed to be a variant of ELAN0600, which is
blacklisted in the hid-core to be managed by elan_i2c. This device can be
found in Lenovo ideapad 310s (80U4000).

Signed-off-by: Hiroka IHARA <ihara_h@live.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2017-02-12 18:36:49 -08:00
Linus Torvalds
7089db84e3 Linux 4.10-rc8 2017-02-12 13:03:20 -08:00
Nathan Fontenot
e722af6391 ibmvnic: Call napi_disable instead of napi_enable in failure path
The failure path in ibmvnic_open() mistakenly makes a second call
to napi_enable instead of calling napi_disable. This can result
in a BUG_ON for any queues that were enabled in the previous call
to napi_enable.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-11 21:24:15 -05:00
Nathan Fontenot
db5d0b597b ibmvnic: Initialize completion variables before starting work
Initialize condition variables prior to invoking any work that can
mark them complete. This resolves a race in the ibmvnic driver where
the driver faults trying to complete an uninitialized condition
variable.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-11 21:23:43 -05:00