Commit Graph

782925 Commits

Author SHA1 Message Date
YueHaibing
f0f25516e3 net: wiznet: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:15:15 -07:00
YueHaibing
28d304efb8 net: sgi: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:15:14 -07:00
YueHaibing
f3bf939f3d net: cirrus: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:15:14 -07:00
YueHaibing
72b462798c net: seeq: fix return type of ndo_start_xmit function
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t',
which is a typedef for an enum type, so make sure the implementation in
this driver has returns 'netdev_tx_t' value, and change the function
return type to netdev_tx_t.

Found by coccinelle.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:15:14 -07:00
Wei Yongjun
54be5b8ce3 PCI: hv: Fix return value check in hv_pci_assign_slots()
In case of error, the function pci_create_slot() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().

Fixes: a15f2c08c7 ("PCI: hv: support reporting serial number as slot information")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:13:22 -07:00
Jeff Barnhill
86f9bd1ff6 net/ipv6: Display all addresses in output of /proc/net/if_inet6
The backend handling for /proc/net/if_inet6 in addrconf.c doesn't properly
handle starting/stopping the iteration.  The problem is that at some point
during the iteration, an overflow is detected and the process is
subsequently stopped.  The item being shown via seq_printf() when the
overflow occurs is not actually shown, though.  When start() is
subsequently called to resume iterating, it returns the next item, and
thus the item that was being processed when the overflow occurred never
gets printed.

Alter the meaning of the private data member "offset".  Currently, when it
is not 0 (which only happens at the very beginning), "offset" represents
the next hlist item to be printed.  After this change, "offset" always
represents the current item.

This is also consistent with the private data member "bucket", which
represents the current bucket, and also the use of "pos" as defined in
seq_file.txt:
    The pos passed to start() will always be either zero, or the most
    recent pos used in the previous session.

Signed-off-by: Jeff Barnhill <0xeffeff@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 19:10:49 -07:00
Sean Tranchetti
f88b4c01b9 netlabel: check for IPV4MASK in addrinfo_get
netlbl_unlabel_addrinfo_get() assumes that if it finds the
NLBL_UNLABEL_A_IPV4ADDR attribute, it must also have the
NLBL_UNLABEL_A_IPV4MASK attribute as well. However, this is
not necessarily the case as the current checks in
netlbl_unlabel_staticadd() and friends are not sufficent to
enforce this.

If passed a netlink message with NLBL_UNLABEL_A_IPV4ADDR,
NLBL_UNLABEL_A_IPV6ADDR, and NLBL_UNLABEL_A_IPV6MASK attributes,
these functions will all call netlbl_unlabel_addrinfo_get() which
will then attempt dereference NULL when fetching the non-existent
NLBL_UNLABEL_A_IPV4MASK attribute:

Unable to handle kernel NULL pointer dereference at virtual address 0
Process unlab (pid: 31762, stack limit = 0xffffff80502d8000)
Call trace:
	netlbl_unlabel_addrinfo_get+0x44/0xd8
	netlbl_unlabel_staticremovedef+0x98/0xe0
	genl_rcv_msg+0x354/0x388
	netlink_rcv_skb+0xac/0x118
	genl_rcv+0x34/0x48
	netlink_unicast+0x158/0x1f0
	netlink_sendmsg+0x32c/0x338
	sock_sendmsg+0x44/0x60
	___sys_sendmsg+0x1d0/0x2a8
	__sys_sendmsg+0x64/0xb4
	SyS_sendmsg+0x34/0x4c
	el0_svc_naked+0x34/0x38
Code: 51001149 7100113f 540000a0 f9401508 (79400108)
---[ end trace f6438a488e737143 ]---
Kernel panic - not syncing: Fatal exception

Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 18:58:34 -07:00
Daniel Borkmann
fad0c40fab Merge branch 'bpf-sockmap-estab-fixes'
John Fastabend says:

====================
Eric noted that using the close callback is not sufficient
to catch all transitions from ESTABLISHED state to a LISTEN
state. So this series does two things. First, only allow
adding socks in ESTABLISH state and second use unhash callback
to catch tcp_disconnect() transitions.

v2: added check for ESTABLISH state in hash update sockmap as well
v3: Do not release lock from unhash in error path, no lock was
    used in the first place. And drop not so useful code comments
v4: convert,
	if (unhash()) return unhash(); return
     to if (unhash()) unhash(); return;

Thanks for reviewing Yonghong I carried your ACKs forward.
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-22 02:46:43 +02:00
John Fastabend
5028027844 bpf: test_maps, only support ESTABLISHED socks
Ensure that sockets added to a sock{map|hash} that is not in the
ESTABLISHED state is rejected.

Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-22 02:46:42 +02:00
John Fastabend
b05545e15e bpf: sockmap, fix transition through disconnect without close
It is possible (via shutdown()) for TCP socks to go trough TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call our bpf_tcp_close() callback. Because of this a user
could disconnect a socket then put it in a LISTEN state which would
break our assumptions about sockets always being ESTABLISHED state.

To resolve this rely on the unhash hook, which is called in the
disconnect case, to remove the sock from the sockmap.

Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-22 02:46:41 +02:00
John Fastabend
5607fff303 bpf: sockmap only allow ESTABLISHED sock state
After this patch we only allow socks that are in ESTABLISHED state or
are being added via a sock_ops event that is transitioning into an
ESTABLISHED state. By allowing sock_ops events we allow users to
manage sockmaps directly from sock ops programs. The two supported
sock_ops ops are BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB and
BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB.

Similar to TLS ULP this ensures sk_user_data is correct.

Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1b ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-22 02:46:41 +02:00
Greg Kroah-Hartman
10dc890d42 Pin control fixes for v4.19:
- Two fixes for the Intel pin controllers than cause
   problems on laptops.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbpSNBAAoJEEEQszewGV1zhIEQAJmn3wGuCSnycJghe0oCNtGz
 fk2YcWmg1TmzaXsq1zhtL7Nx3wKHrNrvVOw091Wm1nP+XO9REglu+03mXIxpmb8i
 wGSBDApsux+5Pit+SxKKbryDHlOeXmhMA3dBPQTdM7phaiLSkH36BucKYNjAesqu
 qYa2AXrxy4akb8ot6lLZQNWaBNWRsFZ07X6y9Rf9H4cBEaFf2XSzf4EmbCdMNjtN
 agJQURc8pTaJ7Z4ovtx0fs9E4jzvI+4Ln+sT6JlxCv/NL36pI71YaiRSaBI9LphV
 jL1nW5cjE6gMdSNZ0LVuR4bD5M+P7mAFx98rVQ/MlSADfn2uxpOLW8CiJHkXCCuI
 WG6lRmatJgWfw2be/01GGr9UA9J25QdLragMtz+lIp0SUrI+PTcnir2TksmuEJ0/
 xZWHMaTDeFsnmBVhEe22jimnJsTf8tbupDYIp3eYq6TSAxZbFY5YlhxksE8UGJ0t
 kgjrapBpQul6hC5KXbYXCOPqbibp6auyu/wThAS3KKU0XrfZOpkPjLvp9gW1306Q
 C+wlg7BViQNQzYt9B4/G6jcXB3TDV/AbCQYaMYSRFJ01OW0LA/49HHLt1bfM6h4E
 Ve/wwu5FHLoaR1uSK6n68wcONjcU7O2J5lVNxwzJ4KP147piB01NwzxRQqgyGtun
 yYHKoEx/hM8eEw2BNYW/
 =NdkN
 -----END PGP SIGNATURE-----

Merge tag 'pinctrl-v4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

Linus writes:
  "Pin control fixes for v4.19:
   - Two fixes for the Intel pin controllers than cause
     problems on laptops."

* tag 'pinctrl-v4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: intel: Do pin translation in other GPIO operations as well
  pinctrl: cannonlake: Fix gpio base for GPP-E
2018-09-21 20:01:16 +02:00
Johannes Thumshirn
f1f1fadaca scsi: sd: don't crash the host on invalid commands
When sd_init_command() get's a command with a unknown req_op() it crashes the
system via BUG().

This makes debugging the actual reason for the broken request cmd_flags pretty
hard as the system is down before it's able to write out debugging data on the
serial console or the trace buffer.

Change the BUG() to a WARN_ON() and return BLKPREP_KILL to fail gracefully and
return an I/O error to the producer of the request.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-21 12:42:57 -04:00
Wen Xiong
318ddb34b2 scsi: ipr: System hung while dlpar adding primary ipr adapter back
While dlpar adding primary ipr adapter back, driver goes through adapter
initialization then schedule ipr_worker_thread to start te disk scan by
dropping the host lock, calling scsi_add_device.  Then get the adapter reset
request again, so driver does scsi_block_requests, this will cause the
scsi_add_device get hung until we unblock. But we can't run ipr_worker_thread
to do the unblock because its stuck in scsi_add_device.

This patch fixes the issue.

[mkp: typo and whitespace fixes]

Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-21 12:35:39 -04:00
Vincent Pelletier
8c39e2699f scsi: target: iscsi: Use bin2hex instead of a re-implementation
Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-21 12:32:30 -04:00
Vincent Pelletier
1816494330 scsi: target: iscsi: Use hex2bin instead of a re-implementation
This change has the following effects, in order of descreasing importance:

1) Prevent a stack buffer overflow

2) Do not append an unnecessary NULL to an anyway binary buffer, which
   is writing one byte past client_digest when caller is:
   chap_string_to_hex(client_digest, chap_r, strlen(chap_r));

The latter was found by KASAN (see below) when input value hes expected size
(32 hex chars), and further analysis revealed a stack buffer overflow can
happen when network-received value is longer, allowing an unauthenticated
remote attacker to smash up to 17 bytes after destination buffer (16 bytes
attacker-controlled and one null).  As switching to hex2bin requires
specifying destination buffer length, and does not internally append any null,
it solves both issues.

This addresses CVE-2018-14633.

Beyond this:

- Validate received value length and check hex2bin accepted the input, to log
  this rejection reason instead of just failing authentication.

- Only log received CHAP_R and CHAP_C values once they passed sanity checks.

==================================================================
BUG: KASAN: stack-out-of-bounds in chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
Write of size 1 at addr ffff8801090ef7c8 by task kworker/0:0/1021

CPU: 0 PID: 1021 Comm: kworker/0:0 Tainted: G           O      4.17.8kasan.sess.connops+ #2
Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 05/19/2014
Workqueue: events iscsi_target_do_login_rx [iscsi_target_mod]
Call Trace:
 dump_stack+0x71/0xac
 print_address_description+0x65/0x22e
 ? chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
 kasan_report.cold.6+0x241/0x2fd
 chap_string_to_hex+0x32/0x60 [iscsi_target_mod]
 chap_server_compute_md5.isra.2+0x2cb/0x860 [iscsi_target_mod]
 ? chap_binaryhex_to_asciihex.constprop.5+0x50/0x50 [iscsi_target_mod]
 ? ftrace_caller_op_ptr+0xe/0xe
 ? __orc_find+0x6f/0xc0
 ? unwind_next_frame+0x231/0x850
 ? kthread+0x1a0/0x1c0
 ? ret_from_fork+0x35/0x40
 ? ret_from_fork+0x35/0x40
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? deref_stack_reg+0xd0/0xd0
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? is_module_text_address+0xa/0x11
 ? kernel_text_address+0x4c/0x110
 ? __save_stack_trace+0x82/0x100
 ? ret_from_fork+0x35/0x40
 ? save_stack+0x8c/0xb0
 ? 0xffffffffc1660000
 ? iscsi_target_do_login+0x155/0x8d0 [iscsi_target_mod]
 ? iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? process_one_work+0x35c/0x640
 ? worker_thread+0x66/0x5d0
 ? kthread+0x1a0/0x1c0
 ? ret_from_fork+0x35/0x40
 ? iscsi_update_param_value+0x80/0x80 [iscsi_target_mod]
 ? iscsit_release_cmd+0x170/0x170 [iscsi_target_mod]
 chap_main_loop+0x172/0x570 [iscsi_target_mod]
 ? chap_server_compute_md5.isra.2+0x860/0x860 [iscsi_target_mod]
 ? rx_data+0xd6/0x120 [iscsi_target_mod]
 ? iscsit_print_session_params+0xd0/0xd0 [iscsi_target_mod]
 ? cyc2ns_read_begin.part.2+0x90/0x90
 ? _raw_spin_lock_irqsave+0x25/0x50
 ? memcmp+0x45/0x70
 iscsi_target_do_login+0x875/0x8d0 [iscsi_target_mod]
 ? iscsi_target_check_first_request.isra.5+0x1a0/0x1a0 [iscsi_target_mod]
 ? del_timer+0xe0/0xe0
 ? memset+0x1f/0x40
 ? flush_sigqueue+0x29/0xd0
 iscsi_target_do_login_rx+0x3bc/0x4c0 [iscsi_target_mod]
 ? iscsi_target_nego_release+0x80/0x80 [iscsi_target_mod]
 ? iscsi_target_restore_sock_callbacks+0x130/0x130 [iscsi_target_mod]
 process_one_work+0x35c/0x640
 worker_thread+0x66/0x5d0
 ? flush_rcu_work+0x40/0x40
 kthread+0x1a0/0x1c0
 ? kthread_bind+0x30/0x30
 ret_from_fork+0x35/0x40

The buggy address belongs to the page:
page:ffffea0004243bc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x17fffc000000000()
raw: 017fffc000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: ffffea0004243c20 ffffea0004243ba0 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8801090ef680: f2 f2 f2 f2 f2 f2 f2 01 f2 f2 f2 f2 f2 f2 f2 00
 ffff8801090ef700: f2 f2 f2 f2 f2 f2 f2 00 02 f2 f2 f2 f2 f2 f2 00
>ffff8801090ef780: 00 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2 f2 00
                                              ^
 ffff8801090ef800: 00 f2 f2 f2 f2 f2 f2 00 00 00 00 02 f2 f2 f2 f2
 ffff8801090ef880: f2 f2 f2 00 00 00 00 00 00 00 00 f2 f2 f2 f2 00
==================================================================

Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-21 12:31:13 -04:00
Antoine Tenart
652ef42c13 net: mscc: fix the frame extraction into the skb
When extracting frames from the Ocelot switch, the frame check sequence
(FCS) is present at the end of the data extracted. The FCS was put into
the sk buffer which introduced some issues (as length related ones), as
the FCS shouldn't be part of an Rx sk buffer.

This patch fixes the Ocelot switch extraction behaviour by discarding
the FCS.

Fixes: a556c76adc ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-21 09:07:50 -07:00
Greg Kroah-Hartman
a27fb6d983 This pull request is slightly bigger than usual at this stage, but
I swear I would have sent it the same to Linus!  The main cause for
 this is that I was on vacation until two weeks ago and it took a while
 to sort all the pending patches between 4.19 and 4.20, test them and
 so on.
 
 It's mostly small bugfixes and cleanups, mostly around x86 nested
 virtualization.  One important change, not related to nested
 virtualization, is that the ability for the guest kernel to trap CPUID
 instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is now
 masked by default.  This is because the feature is detected through an
 MSR; a very bad idea that Intel seems to like more and more.  Some
 applications choke if the other fields of that MSR are not initialized
 as on real hardware, hence we have to disable the whole MSR by default,
 as was the case before Linux 4.12.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJbpPo1AAoJEL/70l94x66DdxgH/is0qe6ZBtzb6Qc0W+8mHHD7
 nxIkWAs2V5NsouJ750YwRQ+0Ym407+wlNt30acdBUEoXhrnA5/TvyGq999XvCL96
 upWEIxpIgbvTMX/e2nLhe4wQdhsboUK4r0/B9IFgVFYrdCt5uRXjB2G4ewxcqxL/
 GxxqrAKhaRsbQG9Xv0Fw5Vohh/Ls6fQDJcyuY1EBnbMpVenq2QDLI6cOAPXncyFb
 uLN6ov4GNCWIPckwxejri5XhZesUOsafrmn48sApShh4T6TrisrdtSYdzl+DGza+
 j5vhIEwdFO5kulZ3viuhqKJOnS2+F6wvfZ75IKT0tEKeU2bi+ifGDyGRefSF6Q0=
 =YXLw
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Paolo writes:
  "It's mostly small bugfixes and cleanups, mostly around x86 nested
   virtualization.  One important change, not related to nested
   virtualization, is that the ability for the guest kernel to trap
   CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is
   now masked by default.  This is because the feature is detected
   through an MSR; a very bad idea that Intel seems to like more and
   more.  Some applications choke if the other fields of that MSR are
   not initialized as on real hardware, hence we have to disable the
   whole MSR by default, as was the case before Linux 4.12."

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits)
  KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs
  kvm: selftests: Add platform_info_test
  KVM: x86: Control guest reads of MSR_PLATFORM_INFO
  KVM: x86: Turbo bits in MSR_PLATFORM_INFO
  nVMX x86: Check VPID value on vmentry of L2 guests
  nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2
  KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv
  KVM: VMX: check nested state and CR4.VMXE against SMM
  kvm: x86: make kvm_{load|put}_guest_fpu() static
  x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
  KVM: VMX: use preemption timer to force immediate VMExit
  KVM: VMX: modify preemption timer bit only when arming timer
  KVM: VMX: immediately mark preemption timer expired only for zero value
  KVM: SVM: Switch to bitmap_zalloc()
  KVM/MMU: Fix comment in walk_shadow_page_lockless_end()
  kvm: selftests: use -pthread instead of -lpthread
  KVM: x86: don't reset root in kvm_mmu_setup()
  kvm: mmu: Don't read PDPTEs when paging is not enabled
  x86/kvm/lapic: always disable MMIO interface in x2APIC mode
  KVM: s390: Make huge pages unavailable in ucontrol VMs
  ...
2018-09-21 16:21:42 +02:00
Greg Kroah-Hartman
0eba8697bc This pull request contains fixes for UBIFS:
- A wrong UBIFS assertion in mount code
 - Fix for a NULL pointer deref in mount code
 - Revert of a bad fix for xattrs
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAlukuf8WHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wRnyD/40jc4PA1pP3dJcXqdqwSeN+vJZ
 YWu307rE+giWWg6vRKa8zu3z614579UC7EE/I9miSrPNPYCi9VTLNzJGNMvn8aSf
 8seMrbhW0kz29T4YZ5fESr6x3hMsSdmIWZkPobruul8oBGmNcUm3Ag3MBM4XbQTm
 5AcWUdXVQUXPROK/Xm04hNa5pJ6wkSu5CxA+mf9YM2ZYYPPdCblxxq3NsDX99vDw
 gZup7JndSzpk+IpZMQIEKUP83vh5YgCU06YPjXi5YIfXl+sQnpH0fyrWm3+1C0Yz
 4T80rd3fm+7+obXdAI29mhcoZ09fGayozfqjU/fV3edW43+pUvRWced13vuLqz9Q
 JannCFeN+v/nB6OPlj4JGkgNunZ5M3r6ZZIQBbY9RJuZgG0/FMbpYb62RY1z/jg9
 YA7e3J/R02i7tHnPbcIz6ngKE0c42VKxTdATaybXVunx5ZPip7txj9OeJbfIYUrr
 CbLMdiqIJUAAheHMUrTiF3jDNwEiMhBZA4dAGDvLmpLuyfMlN8Lwje9BdRd5VNKp
 zwEGUQkDWLxniFYLtmbceFSa4IQT6z70XJn1pqdDvgjxfp8trEqhtp9GMxA2u6TA
 adcug5gUzRWMQ2QuyoKIv6tkjmnII5N2KzhZH9hwBReRJUlKY9WQ9jowcGrPXhbw
 F2prIwBxp8drneOeKg==
 =CGLq
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifs

Richard writes:
  "This pull request contains fixes for UBIFS:
   - A wrong UBIFS assertion in mount code
   - Fix for a NULL pointer deref in mount code
   - Revert of a bad fix for xattrs"

* tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifs:
  Revert "ubifs: xattr: Don't operate on deleted inodes"
  ubifs: drop false positive assertion
  ubifs: Check for name being NULL while mounting
2018-09-21 15:29:44 +02:00
Greg Kroah-Hartman
211b100a5c for-linus-20180920
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlukFZsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpi5MD/424bOq72KPsGNGU6io2c56ecIDItpZiDvq
 srDPKN/pdGQ5WZuP2E21Zz+Ry/LP4niUvjZmY88UC9SXx9nxkESyCccxVFGmlGXN
 Ttno1l47bu9nvAVPBdZ8brvsODCXk03RiJUDf9NrJsROMB3Xhbo1DtxyjwFwJ/Fr
 aokAD3KvFitIbrHhTozxyvzpT55icVu+ykNQypnwdmWSyvebuCvrnQTxRJDNyrgk
 wYUB9bhjGaCJzfBT4U7Pi5Is7t16cOSLmw0nv7uo/CUvk1xmSNajxRybCG1IEIFs
 eg7Xyo2AFd/+BMwp7cFdOncjLbLzNJW0tJn3bsoJXPVE+MAy2EYYgk760fgqePMV
 8afzG6l+obDCefCySj6SMVhXZi3YE640MHWeRBxGJVW5HmWq7Himlt2BUJBvcP72
 y6syVoU5gkOTtUun60uF4Lloc0If6MPqvHA5mfDQx6hmQEi/H2XBE0YJxyQoxsZg
 8FDq6KMBlmBlxdzfCtOMGNPPS9LIilvgFUpfsR3XzbY5nUhKB4ZZmMCWe9qm6hOl
 1yoT6RS1tNV5PL1wQu1cj+d/RHaM7ok0F9FeKhofh1w3NG2x1qajTLNHLb/no7u2
 eOJ8mmDxOvJOmU+9H7/Eu4jb2jhpaZPlrX//lxF3VzJcfGwU1wprjsbAjB9mxN1i
 4k8CjG0h6w==
 =z9pB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20180920' of git://git.kernel.dk/linux-block

Jens writes:
  "Storage fixes for 4.19-rc5

  - Fix for leaking kernel pointer in floppy ioctl (Andy Whitcroft)

  - NVMe pull request from Christoph, and a single ANA log page fix
    (Hannes)

  - Regression fix for libata qd32 support, where we trigger an illegal
    active command transition. This fixes a CD-ROM detection issue that
    was reported, but could also trigger premature completion of the
    internal tag (me)"

* tag 'for-linus-20180920' of git://git.kernel.dk/linux-block:
  floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl
  libata: mask swap internal and hardware tag
  nvme: count all ANA groups for ANA Log page
2018-09-21 09:41:05 +02:00
Greg Kroah-Hartman
a38fd7d808 amdgpu, vwmgfx, i915, sun4i, vgem, vc4, udl and core fixes.
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbpDR8AAoJEAx081l5xIa+9doQAJGNbihlBJx9TSYjejuAW5SD
 Rb0WQVCaHaHIzyaCTTcN+1GPryCUBPrEnF5DMic8WodWDIxxTGp0c4pjYRazVIqh
 V7l+IeZAmUXYMlv5jTGMgYaCXDuxaKPA1kCVAGg/q//XxBR6RCMRweZiGjl/+oqX
 Yt3N+nM2NWmzt6t96fYrDyHdPKsdAVB8ogkd4z/GBJ1xo1hU+MlDrJx4j+R/yGHk
 WZ34ThPmACMnpnyskamJ2J8gmhmHPe7Qt7/izbsusLH1CYXnvwKL+iMkQ3YGvUuS
 9cob6u9ar/8WgfaRevsU0L3t0e68UHiQiD0O4R3YGxl32Wumd7BPCY0uUW3Q+I64
 1ikA6AtzdCz64LtWhulxKtnLKN/lekfG5qbSuKxGt/ppHY3PY0qWWUiG2CEIjtPh
 OYFLg8/3lEhDAoyW0Qn9oUCQ6uWXUFAqaVV8L5d+zFZzKkJm4msyEVOf3MZXVTTw
 EqcogCfNpqm/T5gsyZ4YSWPZhp+OlVJGEXTIgA4TBciKvDjlcD86BH9cdQx5MtnO
 pLf1bCAJPVufU1zBa81CNbDzwGl475s0eY8NpFJztCn+43ews0bhTtpwSdn8F4BS
 4pU4hlZPbCuAekDUXiDvVzQveXCU45wI2wL1tVqpTZS4MglOFz7nnFriI7I3eh1B
 /Dh//KeQ0qDxu4P/XSlL
 =56tH
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drm

David writes:
  "drm fixes for 4.19-rc5:

   - core: fix debugfs for atomic, fix the check for atomic for
     non-modesetting drivers
   - amdgpu: adds a new PCI id, some kfd fixes and a sdma fix
   - i915: a bunch of GVT fixes.
   - vc4: scaling fix
   - vmwgfx: modesetting fixes and a old buffer eviction fix
   - udl: framebuffer destruction fix
   - sun4i: disable on R40 fix until next kernel
   - pl111: NULL termination on table fix"

* tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drm: (21 commits)
  drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
  drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
  drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7
  drm/vmwgfx: Fix buffer object eviction
  drm/vmwgfx: Don't impose STDU limits on framebuffer size
  drm/vmwgfx: limit mode size for all display unit to texture_max
  drm/vmwgfx: limit screen size to stdu_max during check_modeset
  drm/vmwgfx: don't check for old_crtc_state enable status
  drm/amdgpu: add new polaris pci id
  drm: sun4i: drop second PLL from A64 HDMI PHY
  drm: fix drm_drv_uses_atomic_modeset on non modesetting drivers.
  drm/i915/gvt: clear ggtt entries when destroy vgpu
  drm/i915/gvt: request srcu_read_lock before checking if one gfn is valid
  drm/i915/gvt: Add GEN9_CLKGATE_DIS_4 to default BXT mmio handler
  drm/i915/gvt: Init PHY related registers for BXT
  drm/atomic: Use drm_drv_uses_atomic_modeset() for debugfs creation
  drm/fb-helper: Remove set but not used variable 'connector_funcs'
  drm: udl: Destroy framebuffer only if it was initialized
  drm/sun4i: Remove R40 display pipeline compatibles
  drm/pl111: Make sure of_device_id tables are NULL terminated
  ...
2018-09-21 09:11:18 +02:00
Heiner Kallweit
10bc6a6042 r8169: fix autoneg issue on resume with RTL8168E
It was reported that chip version 33 (RTL8168E) ends up with
10MBit/Half on a 1GBit link after resuming from S3 (with different
link partners). For whatever reason the PHY on this chip doesn't
properly start a renegotiation when soft-reset.
Explicitly requesting a renegotiation fixes this.

Fixes: a2965f12fd ("r8169: remove rtl8169_set_speed_xmii")
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reported-by: Neil MacLeod <neil@nmacleod.com>
Tested-by: Neil MacLeod <neil@nmacleod.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20 19:58:47 -07:00
James Smart
9e21017826 scsi: lpfc: Synchronize access to remoteport via rport
The driver currently uses the ndlp to get the local rport which is then used
to get the nvme transport remoteport pointer. There can be cases where a stale
remoteport pointer is obtained as synchronization isn't done through the
different dereferences.

Correct by using locks to synchronize the dereferences.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-20 22:02:36 -04:00
Adrian Hunter
d87161bea4 scsi: ufs: Disable blk-mq for now
blk-mq does not support runtime pm, so disable blk-mq support for now.

Fixes: d5038a13ec ("scsi: core: switch to scsi-mq by default")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2018-09-20 21:58:42 -04:00
Dave Airlie
4fcb7f8be8 Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
A few fixes for 4.19:
- Add a new polaris pci id
- KFD fixes for raven and gfx7

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180920155850.5455-1-alexander.deucher@amd.com
2018-09-21 09:52:27 +10:00
Dave Airlie
618cc1514b Merge branch 'vmwgfx-fixes-4.19' of git://people.freedesktop.org/~thomash/linux into drm-fixes
A couple of modesetting fixes and a fix for a long-standing buffer-eviction
problem cc'd stable.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Hellstrom <thellstrom@vmware.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180920063935.35492-1-thellstrom@vmware.com
2018-09-21 09:51:09 +10:00
Feng Tang
05ab1d8a4b x86/mm: Expand static page table for fixmap space
We met a kernel panic when enabling earlycon, which is due to the fixmap
address of earlycon is not statically setup.

Currently the static fixmap setup in head_64.S only covers 2M virtual
address space, while it actually could be in 4M space with different
kernel configurations, e.g. when VSYSCALL emulation is disabled.

So increase the static space to 4M for now by defining FIXMAP_PMD_NUM to 2,
and add a build time check to ensure that the fixmap is covered by the
initial static page tables.

Fixes: 1ad83c858c ("x86_64,vsyscall: Make vsyscall emulation configurable")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: kernel test robot <rong.a.chen@intel.com>
Reviewed-by: Juergen Gross <jgross@suse.com> (Xen parts)
Cc: H Peter Anvin <hpa@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andy Lutomirsky <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180920025828.23699-1-feng.tang@intel.com
2018-09-20 23:17:22 +02:00
Junxiao Bi
234b69e3e0 ocfs2: fix ocfs2 read block panic
While reading block, it is possible that io error return due to underlying
storage issue, in this case, BH_NeedsValidate was left in the buffer head.
Then when reading the very block next time, if it was already linked into
journal, that will trigger the following panic.

[203748.702517] kernel BUG at fs/ocfs2/buffer_head_io.c:342!
[203748.702533] invalid opcode: 0000 [#1] SMP
[203748.702561] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc dm_switch dm_queue_length dm_multipath bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i iw_cxgb4 cxgb4 cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas ipmi_ssif i2c_core ipmi_si ipmi_msghandler acpi_pad pcspkr sb_edac edac_core lpc_ich mfd_core shpchp sg tg3 ptp pps_core ext4 jbd2 mbcache2 sr_mod cdrom sd_mod ahci libahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod
[203748.703024] CPU: 7 PID: 38369 Comm: touch Not tainted 4.1.12-124.18.6.el6uek.x86_64 #2
[203748.703045] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 2.5.2 01/28/2015
[203748.703067] task: ffff880768139c00 ti: ffff88006ff48000 task.ti: ffff88006ff48000
[203748.703088] RIP: 0010:[<ffffffffa05e9f09>]  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
[203748.703130] RSP: 0018:ffff88006ff4b818  EFLAGS: 00010206
[203748.703389] RAX: 0000000008620029 RBX: ffff88006ff4b910 RCX: 0000000000000000
[203748.703885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000023079fe
[203748.704382] RBP: ffff88006ff4b8d8 R08: 0000000000000000 R09: ffff8807578c25b0
[203748.704877] R10: 000000000f637376 R11: 000000003030322e R12: 0000000000000000
[203748.705373] R13: ffff88006ff4b910 R14: ffff880732fe38f0 R15: 0000000000000000
[203748.705871] FS:  00007f401992c700(0000) GS:ffff880bfebc0000(0000) knlGS:0000000000000000
[203748.706370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[203748.706627] CR2: 00007f4019252440 CR3: 00000000a621e000 CR4: 0000000000060670
[203748.707124] Stack:
[203748.707371]  ffff88006ff4b828 ffffffffa0609f52 ffff88006ff4b838 0000000000000001
[203748.707885]  0000000000000000 0000000000000000 ffff880bf67c3800 ffffffffa05eca00
[203748.708399]  00000000023079ff ffffffff81c58b80 0000000000000000 0000000000000000
[203748.708915] Call Trace:
[203748.709175]  [<ffffffffa0609f52>] ? ocfs2_inode_cache_io_unlock+0x12/0x20 [ocfs2]
[203748.709680]  [<ffffffffa05eca00>] ? ocfs2_empty_dir_filldir+0x80/0x80 [ocfs2]
[203748.710185]  [<ffffffffa05ec0cb>] ocfs2_read_dir_block_direct+0x3b/0x200 [ocfs2]
[203748.710691]  [<ffffffffa05f0fbf>] ocfs2_prepare_dx_dir_for_insert.isra.57+0x19f/0xf60 [ocfs2]
[203748.711204]  [<ffffffffa065660f>] ? ocfs2_metadata_cache_io_unlock+0x1f/0x30 [ocfs2]
[203748.711716]  [<ffffffffa05f4f3a>] ocfs2_prepare_dir_for_insert+0x13a/0x890 [ocfs2]
[203748.712227]  [<ffffffffa05f442e>] ? ocfs2_check_dir_for_entry+0x8e/0x140 [ocfs2]
[203748.712737]  [<ffffffffa061b2f2>] ocfs2_mknod+0x4b2/0x1370 [ocfs2]
[203748.713003]  [<ffffffffa061c385>] ocfs2_create+0x65/0x170 [ocfs2]
[203748.713263]  [<ffffffff8121714b>] vfs_create+0xdb/0x150
[203748.713518]  [<ffffffff8121b225>] do_last+0x815/0x1210
[203748.713772]  [<ffffffff812192e9>] ? path_init+0xb9/0x450
[203748.714123]  [<ffffffff8121bca0>] path_openat+0x80/0x600
[203748.714378]  [<ffffffff811bcd45>] ? handle_pte_fault+0xd15/0x1620
[203748.714634]  [<ffffffff8121d7ba>] do_filp_open+0x3a/0xb0
[203748.714888]  [<ffffffff8122a767>] ? __alloc_fd+0xa7/0x130
[203748.715143]  [<ffffffff81209ffc>] do_sys_open+0x12c/0x220
[203748.715403]  [<ffffffff81026ddb>] ? syscall_trace_enter_phase1+0x11b/0x180
[203748.715668]  [<ffffffff816f0c9f>] ? system_call_after_swapgs+0xe9/0x190
[203748.715928]  [<ffffffff8120a10e>] SyS_open+0x1e/0x20
[203748.716184]  [<ffffffff816f0d5e>] system_call_fastpath+0x18/0xd7
[203748.716440] Code: 00 00 48 8b 7b 08 48 83 c3 10 45 89 f8 44 89 e1 44 89 f2 4c 89 ee e8 07 06 11 e1 48 8b 03 48 85 c0 75 df 8b 5d c8 e9 4d fa ff ff <0f> 0b 48 8b 7d a0 e8 dc c6 06 00 48 b8 00 00 00 00 00 00 00 10
[203748.717505] RIP  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
[203748.717775]  RSP <ffff88006ff4b818>

Joesph ever reported a similar panic.
Link: https://oss.oracle.com/pipermail/ocfs2-devel/2013-May/008931.html

Link: http://lkml.kernel.org/r/20180912063207.29484-1-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <ge.changwei@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:12 +02:00
Roman Gushchin
172b06c32b mm: slowly shrink slabs with a relatively small number of objects
9092c71bb7 ("mm: use sc->priority for slab shrink targets") changed the
way that the target slab pressure is calculated and made it
priority-based:

    delta = freeable >> priority;
    delta *= 4;
    do_div(delta, shrinker->seeks);

The problem is that on a default priority (which is 12) no pressure is
applied at all, if the number of potentially reclaimable objects is less
than 4096 (1<<12).

This causes the last objects on slab caches of no longer used cgroups to
(almost) never get reclaimed.  It's obviously a waste of memory.

It can be especially painful, if these stale objects are holding a
reference to a dying cgroup.  Slab LRU lists are reparented on memcg
offlining, but corresponding objects are still holding a reference to the
dying cgroup.  If we don't scan these objects, the dying cgroup can't go
away.  Most likely, the parent cgroup hasn't any directly charged objects,
only remaining objects from dying children cgroups.  So it can easily hold
a reference to hundreds of dying cgroups.

If there are no big spikes in memory pressure, and new memory cgroups are
created and destroyed periodically, this causes the number of dying
cgroups grow steadily, causing a slow-ish and hard-to-detect memory
"leak".  It's not a real leak, as the memory can be eventually reclaimed,
but it could not happen in a real life at all.  I've seen hosts with a
steadily climbing number of dying cgroups, which doesn't show any signs of
a decline in months, despite the host is loaded with a production
workload.

It is an obvious waste of memory, and to prevent it, let's apply a minimal
pressure even on small shrinker lists.  E.g.  if there are freeable
objects, let's scan at least min(freeable, scan_batch) objects.

This fix significantly improves a chance of a dying cgroup to be
reclaimed, and together with some previous patches stops the steady growth
of the dying cgroups number on some of our hosts.

Link: http://lkml.kernel.org/r/20180905230759.12236-1-guro@fb.com
Fixes: 9092c71bb7 ("mm: use sc->priority for slab shrink targets")
Signed-off-by: Roman Gushchin <guro@fb.com>
Acked-by: Rik van Riel <riel@surriel.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
YueHaibing
3bf181bc5d kernel/sys.c: remove duplicated include
Link: http://lkml.kernel.org/r/20180821133424.18716-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
Joel Fernandes (Google)
b45d71fb89 mm: shmem.c: Correctly annotate new inodes for lockdep
Directories and inodes don't necessarily need to be in the same lockdep
class.  For ex, hugetlbfs splits them out too to prevent false positives
in lockdep.  Annotate correctly after new inode creation.  If its a
directory inode, it will be put into a different class.

This should fix a lockdep splat reported by syzbot:

> ======================================================
> WARNING: possible circular locking dependency detected
> 4.18.0-rc8-next-20180810+ #36 Not tainted
> ------------------------------------------------------
> syz-executor900/4483 is trying to acquire lock:
> 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at: inode_lock
> include/linux/fs.h:765 [inline]
> 00000000d2bfc8fe (&sb->s_type->i_mutex_key#9){++++}, at:
> shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
>
> but task is already holding lock:
> 0000000025208078 (ashmem_mutex){+.+.}, at: ashmem_shrink_scan+0xb4/0x630
> drivers/staging/android/ashmem.c:448
>
> which lock already depends on the new lock.
>
> -> #2 (ashmem_mutex){+.+.}:
>        __mutex_lock_common kernel/locking/mutex.c:925 [inline]
>        __mutex_lock+0x171/0x1700 kernel/locking/mutex.c:1073
>        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1088
>        ashmem_mmap+0x55/0x520 drivers/staging/android/ashmem.c:361
>        call_mmap include/linux/fs.h:1844 [inline]
>        mmap_region+0xf27/0x1c50 mm/mmap.c:1762
>        do_mmap+0xa10/0x1220 mm/mmap.c:1535
>        do_mmap_pgoff include/linux/mm.h:2298 [inline]
>        vm_mmap_pgoff+0x213/0x2c0 mm/util.c:357
>        ksys_mmap_pgoff+0x4da/0x660 mm/mmap.c:1585
>        __do_sys_mmap arch/x86/kernel/sys_x86_64.c:100 [inline]
>        __se_sys_mmap arch/x86/kernel/sys_x86_64.c:91 [inline]
>        __x64_sys_mmap+0xe9/0x1b0 arch/x86/kernel/sys_x86_64.c:91
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> -> #1 (&mm->mmap_sem){++++}:
>        __might_fault+0x155/0x1e0 mm/memory.c:4568
>        _copy_to_user+0x30/0x110 lib/usercopy.c:25
>        copy_to_user include/linux/uaccess.h:155 [inline]
>        filldir+0x1ea/0x3a0 fs/readdir.c:196
>        dir_emit_dot include/linux/fs.h:3464 [inline]
>        dir_emit_dots include/linux/fs.h:3475 [inline]
>        dcache_readdir+0x13a/0x620 fs/libfs.c:193
>        iterate_dir+0x48b/0x5d0 fs/readdir.c:51
>        __do_sys_getdents fs/readdir.c:231 [inline]
>        __se_sys_getdents fs/readdir.c:212 [inline]
>        __x64_sys_getdents+0x29f/0x510 fs/readdir.c:212
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> -> #0 (&sb->s_type->i_mutex_key#9){++++}:
>        lock_acquire+0x1e4/0x540 kernel/locking/lockdep.c:3924
>        down_write+0x8f/0x130 kernel/locking/rwsem.c:70
>        inode_lock include/linux/fs.h:765 [inline]
>        shmem_fallocate+0x18b/0x12e0 mm/shmem.c:2602
>        ashmem_shrink_scan+0x236/0x630 drivers/staging/android/ashmem.c:455
>        ashmem_ioctl+0x3ae/0x13a0 drivers/staging/android/ashmem.c:797
>        vfs_ioctl fs/ioctl.c:46 [inline]
>        file_ioctl fs/ioctl.c:501 [inline]
>        do_vfs_ioctl+0x1de/0x1720 fs/ioctl.c:685
>        ksys_ioctl+0xa9/0xd0 fs/ioctl.c:702
>        __do_sys_ioctl fs/ioctl.c:709 [inline]
>        __se_sys_ioctl fs/ioctl.c:707 [inline]
>        __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:707
>        do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
>        entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> other info that might help us debug this:
>
> Chain exists of:
>   &sb->s_type->i_mutex_key#9 --> &mm->mmap_sem --> ashmem_mutex
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(ashmem_mutex);
>                                lock(&mm->mmap_sem);
>                                lock(ashmem_mutex);
>   lock(&sb->s_type->i_mutex_key#9);
>
>  *** DEADLOCK ***
>
> 1 lock held by syz-executor900/4483:
>  #0: 0000000025208078 (ashmem_mutex){+.+.}, at:
> ashmem_shrink_scan+0xb4/0x630 drivers/staging/android/ashmem.c:448

Link: http://lkml.kernel.org/r/20180821231835.166639-1-joel@joelfernandes.org
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: NeilBrown <neilb@suse.com>
Suggested-by: NeilBrown <neilb@suse.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
Dominique Martinet
a1b3d2f217 fs/proc/kcore.c: fix invalid memory access in multi-page read optimization
The 'm' kcore_list item could point to kclist_head, and it is incorrect to
look at m->addr / m->size in this case.

There is no choice but to run through the list of entries for every
address if we did not find any entry in the previous iteration

Reset 'm' to NULL in that case at Omar Sandoval's suggestion.

[akpm@linux-foundation.org: add comment]
Link: http://lkml.kernel.org/r/1536100702-28706-1-git-send-email-asmadeus@codewreck.org
Fixes: bf991c2231 ("proc/kcore: optimize multiple page reads")
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Omar Sandoval <osandov@osandov.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: James Morse <james.morse@arm.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
Pasha Tatashin
889c695d41 mm: disable deferred struct page for 32-bit arches
Deferred struct page init is needed only on systems with large amount of
physical memory to improve boot performance.  32-bit systems do not
benefit from this feature.

Jiri reported a problem where deferred struct pages do not work well with
x86-32:

[    0.035162] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[    0.035725] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[    0.036269] Initializing CPU#0
[    0.036513] Initializing HighMem for node 0 (00036ffe:0007ffe0)
[    0.038459] page:f6780000 is uninitialized and poisoned
[    0.038460] raw: ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
[    0.039509] page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
[    0.040038] ------------[ cut here ]------------
[    0.040399] kernel BUG at include/linux/page-flags.h:293!
[    0.040823] invalid opcode: 0000 [#1] SMP PTI
[    0.041166] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc1_pt_jiri #9
[    0.041694] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[    0.042496] EIP: free_highmem_page+0x64/0x80
[    0.042839] Code: 13 46 d8 c1 e8 18 5d 83 e0 03 8d 04 c0 c1 e0 06 ff 80 ec 5f 44 d8 c3 8d b4 26 00 00 00 00 ba 08 65 28 d8 89 d8 e8 fc 71 02 00 <0f> 0b 8d 76 00 8d bc 27 00 00 00 00 ba d0 b1 26 d8 89 d8 e8 e4 71
[    0.044338] EAX: 0000003c EBX: f6780000 ECX: 00000000 EDX: d856cbe8
[    0.044868] ESI: 0007ffe0 EDI: d838df20 EBP: d838df00 ESP: d838defc
[    0.045372] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210086
[    0.045913] CR0: 80050033 CR2: 00000000 CR3: 18556000 CR4: 00040690
[    0.046413] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    0.046913] DR6: fffe0ff0 DR7: 00000400
[    0.047220] Call Trace:
[    0.047419]  add_highpages_with_active_regions+0xbd/0x10d
[    0.047854]  set_highmem_pages_init+0x5b/0x71
[    0.048202]  mem_init+0x2b/0x1e8
[    0.048460]  start_kernel+0x1d2/0x425
[    0.048757]  i386_start_kernel+0x93/0x97
[    0.049073]  startup_32_smp+0x164/0x168
[    0.049379] Modules linked in:
[    0.049626] ---[ end trace 337949378db0abbb ]---

We free highmem pages before their struct pages are initialized:

mem_init()
 set_highmem_pages_init()
  add_highpages_with_active_regions()
   free_highmem_page()
    .. Access uninitialized struct page here..

Because there is no reason to have this feature on 32-bit systems, just
disable it.

Link: http://lkml.kernel.org/r/20180831150506.31246-1-pavel.tatashin@microsoft.com
Fixes: 2e3ca40f03 ("mm: relax deferred struct page requirements")
Signed-off-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
KJ Tsanaktsidis
f83606f5eb fork: report pid exhaustion correctly
Make the clone and fork syscalls return EAGAIN when the limit on the
number of pids /proc/sys/kernel/pid_max is exceeded.

Currently, when the pid_max limit is exceeded, the kernel will return
ENOSPC from the fork and clone syscalls.  This is contrary to the
documented behaviour, which explicitly calls out the pid_max case as one
where EAGAIN should be returned.  It also leads to really confusing error
messages in userspace programs which will complain about a lack of disk
space when they fail to create processes/threads for this reason.

This error is being returned because alloc_pid() uses the idr api to find
a new pid; when there are none available, idr_alloc_cyclic() returns
-ENOSPC, and this is being propagated back to userspace.

This behaviour has been broken before, and was explicitly fixed in
commit 35f71bc0a0 ("fork: report pid reservation failure properly"),
so I think -EAGAIN is definitely the right thing to return in this case.
The current behaviour change dates from commit 95846ecf9d ("pid:
replace pid bitmap implementation with IDR AIP") and was I believe
unintentional.

This patch has no impact on the case where allocating a pid fails because
the child reaper for the namespace is dead; that case will still return
-ENOMEM.

Link: http://lkml.kernel.org/r/20180903111016.46461-1-ktsanaktsidis@zendesk.com
Fixes: 95846ecf9d ("pid: replace pid bitmap implementation with IDR AIP")
Signed-off-by: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Gargi Sharma <gs051095@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-09-20 22:01:11 +02:00
Thomas Gleixner
9068a427ee MAINTAINERS: Add X86 MM entry
Dave, Andy and Peter are de facto overseing the mm parts of X86. Add an
explicit maintainers entry.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
2018-09-20 21:48:08 +02:00
Fenghua Yu
a8b3bb338e x86/intel_rdt: Add Reinette as co-maintainer for RDT
Reinette Chatre is doing great job on enabling pseudo-locking and other
features in RDT. Add her as co-maintainer for RDT.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: "H Peter Anvin" <hpa@zytor.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/1537472228-221799-1-git-send-email-fenghua.yu@intel.com
2018-09-20 21:44:35 +02:00
Richard Weinberger
f061c1cc40 Revert "ubifs: xattr: Don't operate on deleted inodes"
This reverts commit 11a6fc3dc7.
UBIFS wants to assert that xattr operations are only issued on files
with positive link count. The said patch made this operations return
-ENOENT for unlinked files such that the asserts will no longer trigger.
This was wrong since xattr operations are perfectly fine on unlinked
files.
Instead the assertions need to be fixed/removed.

Cc: <stable@vger.kernel.org>
Fixes: 11a6fc3dc7 ("ubifs: xattr: Don't operate on deleted inodes")
Reported-by: Koen Vandeputte <koen.vandeputte@ncentric.com>
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Richard Weinberger <richard@nod.at>
2018-09-20 21:37:41 +02:00
Sascha Hauer
d3bdc016c5 ubifs: drop false positive assertion
The following sequence triggers

	ubifs_assert(c, c->lst.taken_empty_lebs > 0);

at the end of ubifs_remount_fs():

mount -t ubifs /dev/ubi0_0 /mnt
echo 1 > /sys/kernel/debug/ubifs/ubi0_0/ro_error
umount /mnt
mount -t ubifs -o ro /dev/ubix_y /mnt
mount -o remount,ro /mnt

The resulting

UBIFS assert failed in ubifs_remount_fs at 1878 (pid 161)

is a false positive. In the case above c->lst.taken_empty_lebs has
never been changed from its initial zero value. This will only happen
when the deferred recovery is done.

Fix this by doing the assertion only when recovery has been done
already.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2018-09-20 21:37:07 +02:00
Richard Weinberger
37f31b6ca4 ubifs: Check for name being NULL while mounting
The requested device name can be NULL or an empty string.
Check for that and refuse to continue. UBIFS has to do this manually
since we cannot use mount_bdev(), which checks for this condition.

Fixes: 1e51764a3c ("UBIFS: add new flash file system")
Reported-by: syzbot+38bd0f7865e5c6379280@syzkaller.appspotmail.com
Signed-off-by: Richard Weinberger <richard@nod.at>
2018-09-20 21:37:07 +02:00
Xin Long
d7ab5cdce5 sctp: update dst pmtu with the correct daddr
When processing pmtu update from an icmp packet, it calls .update_pmtu
with sk instead of skb in sctp_transport_update_pmtu.

However for sctp, the daddr in the transport might be different from
inet_sock->inet_daddr or sk->sk_v6_daddr, which is used to update or
create the route cache. The incorrect daddr will cause a different
route cache created for the path.

So before calling .update_pmtu, inet_sock->inet_daddr/sk->sk_v6_daddr
should be updated with the daddr in the transport, and update it back
after it's done.

The issue has existed since route exceptions introduction.

Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Reported-by: ian.periam@dialogic.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20 11:29:30 -07:00
Davide Caratti
8c6ec3613e bnxt_en: don't try to offload VLAN 'modify' action
bnxt offload code currently supports only 'push' and 'pop' operation: let
.ndo_setup_tc() return -EOPNOTSUPP if VLAN 'modify' action is configured.

Fixes: 2ae7408fed ("bnxt_en: bnxt: add TC flower filter offload support")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Sathya Perla <sathya.perla@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20 11:25:54 -07:00
Liran Alon
26b471c7e2 KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs
The handlers of IOCTLs in kvm_arch_vcpu_ioctl() are expected to set
their return value in "r" local var and break out of switch block
when they encounter some error.
This is because vcpu_load() is called before the switch block which
have a proper cleanup of vcpu_put() afterwards.

However, KVM_{GET,SET}_NESTED_STATE IOCTLs handlers just return
immediately on error without performing above mentioned cleanup.

Thus, change these handlers to behave as expected.

Fixes: 8fcc4b5923 ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE")

Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-09-20 18:54:08 +02:00
Yong Zhao
44d8cc6f1a drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
Because CRAT_CU_FLAGS_IOMMU_PRESENT was not set in some BIOS crat, we
need to workaround this.

For future compatibility, we also overwrite the bit in capability according
to the value of needs_iommu_device.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:23 -05:00
Yong Zhao
15426dbb65 drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
CWSR fails on Raven if the control stack is MTYPE_UC, which is used
for regular GART mappings. As a workaround we map it using MTYPE_NC.

The MEC firmware expects the control stack at one page offset from the
start of the MQD so it is part of the MQD allocation on GFXv9. AMDGPU
added a memory allocation flag just for this purpose.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:17 -05:00
Amber Lin
caaa4c8a6b drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7
A wrong register bit was examinated for checking SDMA status so it reports
false failures. This typo only appears on gfx_v7. gfx_v8 checks the correct
bit.

Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-09-20 10:25:01 -05:00
Mika Westerberg
96147db1e1 pinctrl: intel: Do pin translation in other GPIO operations as well
For some reason I thought GPIOLIB handles translation from GPIO ranges
to pinctrl pins but it turns out not to be the case. This means that
when GPIOs operations are performed for a pin controller having a custom
GPIO base such as Cannon Lake and Ice Lake incorrect pin number gets
used internally.

Fix this in the same way we did for lock/unlock IRQ operations and
translate the GPIO number to pin before using it.

Fixes: a60eac3239 ("pinctrl: intel: Allow custom GPIO base for pad groups")
Reported-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Rajat Jain <rajatja@google.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2018-09-20 08:21:52 -07:00
Jens Axboe
d611aaf336 Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-linus
Pull NVMe fix from Christoph.

* 'nvme-4.19' of git://git.infradead.org/nvme:
  nvme: count all ANA groups for ANA Log page
2018-09-20 09:10:38 -06:00
Andy Whitcroft
65eea8edc3 floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl
The final field of a floppy_struct is the field "name", which is a pointer
to a string in kernel memory.  The kernel pointer should not be copied to
user memory.  The FDGETPRM ioctl copies a floppy_struct to user memory,
including this "name" field.  This pointer cannot be used by the user
and it will leak a kernel address to user-space, which will reveal the
location of kernel code and data and undermine KASLR protection.

Model this code after the compat ioctl which copies the returned data
to a previously cleared temporary structure on the stack (excluding the
name pointer) and copy out to userspace from there.  As we already have
an inparam union with an appropriate member and that memory is already
cleared even for read only calls make use of that as a temporary store.

Based on an initial patch by Brian Belleville.

CVE-2018-7755
Signed-off-by: Andy Whitcroft <apw@canonical.com>

Broke up long line.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-09-20 09:09:48 -06:00
Johannes Berg
56ce3c5a50 smc: generic netlink family should be __ro_after_init
The generic netlink family is only initialized during module init,
so it should be __ro_after_init like all other generic netlink
families.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20 07:49:55 -07:00
Petr Machata
f9d5b1d508 mlxsw: spectrum: Bump required firmware version
MC-aware mode was introduced to mlxsw in commit 7b81953066 ("mlxsw: spectrum:
Configure MC-aware mode on mlxsw ports") and fixed up later in commit
3a3539cd36 ("mlxsw: spectrum_buffers: Set up a dedicated pool for BUM
traffic"). As the final piece of puzzle, a firmware issue whereby a wrong
priority was assigned to BUM traffic was corrected in FW version 13.1703.4.
Therefore require this FW version in the driver.

Fixes: 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-20 07:48:37 -07:00