platform/kernel/linux-rpi.git
5 years agodrm/amd/display: Increase size of audios array
Tai Man [Fri, 28 Jun 2019 15:40:38 +0000 (11:40 -0400)]
drm/amd/display: Increase size of audios array

[ Upstream commit 7352193a33dfc9b69ba3bf6a8caea925b96243b1 ]

[Why]
The audios array defined in "struct resource_pool" is only 6 (MAX_PIPES)
but the max number of audio devices (num_audio) is 7. In some projects,
it will run out of audios array.

[How]
Incraese the audios array size to 7.

Signed-off-by: Tai Man <taiman.wong@amd.com>
Reviewed-by: Joshua Aberback <Joshua.Aberback@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodrm/amd/display: Only enable audio if speaker allocation exists
Alvin Lee [Thu, 4 Jul 2019 19:17:42 +0000 (15:17 -0400)]
drm/amd/display: Only enable audio if speaker allocation exists

[ Upstream commit 6ac25e6d5b2fbf251e9fa2f4131d42c815b43867 ]

[Why]

In dm_helpers_parse_edid_caps, there is a corner case where no speakers
can be allocated even though the audio mode count is greater than 0.
Enabling audio when no speaker allocations exists can cause issues in
the video stream.

[How]

Add a check to not enable audio unless one or more speaker allocations
exist (since doing this can cause issues in the video stream).

Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodrm/amd/display: Fix dc_create failure handling and 666 color depths
Julian Parkin [Tue, 25 Jun 2019 18:55:53 +0000 (14:55 -0400)]
drm/amd/display: Fix dc_create failure handling and 666 color depths

[ Upstream commit 0905f32977268149f06e3ce6ea4bd6d374dd891f ]

[Why]
It is possible (but very unlikely) that constructing dc fails
before current_state is created.

We support 666 color depth in some scenarios, but this
isn't handled in get_norm_pix_clk. It uses exactly the
same pixel clock as the 888 case.

[How]
Check for non null current_state before destructing.

Add case for 666 color depth to get_norm_pix_clk to
avoid assertion.

Signed-off-by: Julian Parkin <julian.parkin@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodrm/amd/display: use encoder's engine id to find matched free audio device
Tai Man [Fri, 7 Jun 2019 21:32:27 +0000 (17:32 -0400)]
drm/amd/display: use encoder's engine id to find matched free audio device

[ Upstream commit 74eda776d7a4e69ec7aa1ce30a87636f14220fbb ]

[Why]
On some platforms, the encoder id 3 is not populated. So the encoders
are not stored in right order as index (id: 0, 1, 2, 4, 5) at pool. This
would cause encoders id 4 & id 5 to fail when finding corresponding
audio device, defaulting to the first available audio device. As result,
we cannot stream audio into two DP ports with encoders id 4 & id 5.

[How]
It need to create enough audio device objects (0 - 5) to perform matching.
Then use encoder engine id to find matched audio device.

Signed-off-by: Tai Man <taiman.wong@amd.com>
Reviewed-by: Charlene Liu <Charlene.Liu@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodrm/amd/display: Wait for backlight programming completion in set backlight level
SivapiriyanKumarasamy [Fri, 14 Jun 2019 19:04:00 +0000 (15:04 -0400)]
drm/amd/display: Wait for backlight programming completion in set backlight level

[ Upstream commit c7990daebe71d11a9e360b5c3b0ecd1846a3a4bb ]

[WHY]
Currently we don't wait for blacklight programming completion in DMCU
when setting backlight level. Some sequences such as PSR static screen
event trigger reprogramming requires it to be complete.

[How]
Add generic wait for dmcu command completion in set backlight level.

Signed-off-by: SivapiriyanKumarasamy <sivapiriyan.kumarasamy@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agoscripts/sphinx-pre-install: fix script for RHEL/CentOS
Mauro Carvalho Chehab [Sat, 13 Jul 2019 11:50:24 +0000 (08:50 -0300)]
scripts/sphinx-pre-install: fix script for RHEL/CentOS

[ Upstream commit b308467c916aa7acc5069802ab76a9f657434701 ]

There's a missing parenthesis at the script, with causes it to
fail to detect non-Fedora releases (e. g. RHEL/CentOS).

Tested with Centos 7.6.1810.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonetfilter: nft_hash: fix symhash with modulus one
Laura Garcia Liebana [Mon, 15 Jul 2019 11:23:37 +0000 (13:23 +0200)]
netfilter: nft_hash: fix symhash with modulus one

[ Upstream commit 28b1d6ef53e3303b90ca8924bb78f31fa527cafb ]

The rule below doesn't work as the kernel raises -ERANGE.

nft add rule netdev nftlb lb01 ip daddr set \
symhash mod 1 map { 0 : 192.168.0.10 } fwd to "eth0"

This patch allows to use the symhash modulus with one
element, in the same way that the other types of hashes and
algorithms that uses the modulus parameter.

Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonetfilter: conntrack: always store window size un-scaled
Florian Westphal [Thu, 11 Jul 2019 22:29:05 +0000 (00:29 +0200)]
netfilter: conntrack: always store window size un-scaled

[ Upstream commit 959b69ef57db00cb33e9c4777400ae7183ebddd3 ]

Jakub Jankowski reported following oddity:

After 3 way handshake completes, timeout of new connection is set to
max_retrans (300s) instead of established (5 days).

shortened excerpt from pcap provided:
25.070622 IP (flags [DF], proto TCP (6), length 52)
10.8.5.4.1025 > 10.8.1.2.80: Flags [S], seq 11, win 64240, [wscale 8]
26.070462 IP (flags [DF], proto TCP (6), length 48)
10.8.1.2.80 > 10.8.5.4.1025: Flags [S.], seq 82, ack 12, win 65535, [wscale 3]
27.070449 IP (flags [DF], proto TCP (6), length 40)
10.8.5.4.1025 > 10.8.1.2.80: Flags [.], ack 83, win 512, length 0

Turns out the last_win is of u16 type, but we store the scaled value:
512 << 8 (== 0x20000) becomes 0 window.

The Fixes tag is not correct, as the bug has existed forever, but
without that change all that this causes might cause is to mistake a
window update (to-nonzero-from-zero) for a retransmit.

Fixes: fbcd253d2448b8 ("netfilter: conntrack: lower timeout to RETRANS seconds if window is 0")
Reported-by: Jakub Jankowski <shasta@toxcorp.com>
Tested-by: Jakub Jankowski <shasta@toxcorp.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonetfilter: Fix rpfilter dropping vrf packets by mistake
Miaohe Lin [Tue, 2 Jul 2019 03:59:36 +0000 (03:59 +0000)]
netfilter: Fix rpfilter dropping vrf packets by mistake

[ Upstream commit b575b24b8eee37f10484e951b62ce2a31c579775 ]

When firewalld is enabled with ipv4/ipv6 rpfilter, vrf
ipv4/ipv6 packets will be dropped. Vrf device will pass
through netfilter hook twice. One with enslaved device
and another one with l3 master device. So in device may
dismatch witch out device because out device is always
enslaved device.So failed with the check of the rpfilter
and drop the packets by mistake.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agovfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn
Farhan Ali [Thu, 11 Jul 2019 14:28:53 +0000 (10:28 -0400)]
vfio-ccw: Set pa_nr to 0 if memory allocation fails for pa_iova_pfn

[ Upstream commit c1ab69268d124ebdbb3864580808188ccd3ea355 ]

So we don't call try to call vfio_unpin_pages() incorrectly.

Fixes: 0a19e61e6d4c ("vfio: ccw: introduce channel program interfaces")
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <33a89467ad6369196ae6edf820cbcb1e2d8d050c.1562854091.git.alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agonetfilter: nfnetlink: avoid deadlock due to synchronous request_module
Florian Westphal [Tue, 2 Jul 2019 19:41:40 +0000 (21:41 +0200)]
netfilter: nfnetlink: avoid deadlock due to synchronous request_module

[ Upstream commit 1b0890cd60829bd51455dc5ad689ed58c4408227 ]

Thomas and Juliana report a deadlock when running:

(rmmod nf_conntrack_netlink/xfrm_user)

  conntrack -e NEW -E &
  modprobe -v xfrm_user

They provided following analysis:

conntrack -e NEW -E
    netlink_bind()
        netlink_lock_table() -> increases "nl_table_users"
            nfnetlink_bind()
            # does not unlock the table as it's locked by netlink_bind()
                __request_module()
                    call_usermodehelper_exec()

This triggers "modprobe nf_conntrack_netlink" from kernel, netlink_bind()
won't return until modprobe process is done.

"modprobe xfrm_user":
    xfrm_user_init()
        register_pernet_subsys()
            -> grab pernet_ops_rwsem
                ..
                netlink_table_grab()
                    calls schedule() as "nl_table_users" is non-zero

so modprobe is blocked because netlink_bind() increased
nl_table_users while also holding pernet_ops_rwsem.

"modprobe nf_conntrack_netlink" runs and inits nf_conntrack_netlink:
    ctnetlink_init()
        register_pernet_subsys()
            -> blocks on "pernet_ops_rwsem" thanks to xfrm_user module

both modprobe processes wait on one another -- neither can make
progress.

Switch netlink_bind() to "nowait" modprobe -- this releases the netlink
table lock, which then allows both modprobe instances to complete.

Reported-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Reported-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agocan: peak_usb: fix potential double kfree_skb()
Stephane Grosjean [Fri, 5 Jul 2019 13:32:16 +0000 (15:32 +0200)]
can: peak_usb: fix potential double kfree_skb()

commit fee6a8923ae0d318a7f7950c6c6c28a96cea099b upstream.

When closing the CAN device while tx skbs are inflight, echo skb could
be released twice. By calling close_candev() before unlinking all
pending tx urbs, then the internal echo_skb[] array is fully and
correctly cleared before the USB write callback and, therefore,
can_get_echo_skb() are called, for each aborted URB.

Fixes: bb4785551f64 ("can: usb: PEAK-System Technik USB adapters driver core")
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocan: rcar_canfd: fix possible IRQ storm on high load
Nikita Yushchenko [Wed, 26 Jun 2019 13:08:48 +0000 (16:08 +0300)]
can: rcar_canfd: fix possible IRQ storm on high load

commit d4b890aec4bea7334ca2ca56fd3b12fb48a00cd1 upstream.

We have observed rcar_canfd driver entering IRQ storm under high load,
with following scenario:
- rcar_canfd_global_interrupt() in entered due to Rx available,
- napi_schedule_prep() is called, and sets NAPIF_STATE_SCHED in state
- Rx fifo interrupts are masked,
- rcar_canfd_global_interrupt() is entered again, this time due to
  error interrupt (e.g. due to overflow),
- since scheduled napi poller has not yet executed, condition for calling
  napi_schedule_prep() from rcar_canfd_global_interrupt() remains true,
  thus napi_schedule_prep() gets called and sets NAPIF_STATE_MISSED flag
  in state,
- later, napi poller function rcar_canfd_rx_poll() gets executed, and
  calls napi_complete_done(),
- due to NAPIF_STATE_MISSED flag in state, this call does not clear
  NAPIF_STATE_SCHED flag from state,
- on return from napi_complete_done(), rcar_canfd_rx_poll() unmasks Rx
  interrutps,
- Rx interrupt happens, rcar_canfd_global_interrupt() gets called
  and calls napi_schedule_prep(),
- since NAPIF_STATE_SCHED is set in state at this time, this call
  returns false,
- due to that false return, rcar_canfd_global_interrupt() returns
  without masking Rx interrupt
- and this results into IRQ storm: unmasked Rx interrupt happens again
  and again is misprocessed in the same way.

This patch fixes that scenario by unmasking Rx interrupts only when
napi_complete_done() returns true, which means it has cleared
NAPIF_STATE_SCHED in state.

Fixes: dd3bd23eb438 ("can: rcar_canfd: Add Renesas R-Car CAN FD driver")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: typec: tcpm: Ignore unsupported/unknown alternate mode requests
Guenter Roeck [Fri, 2 Aug 2019 16:03:42 +0000 (09:03 -0700)]
usb: typec: tcpm: Ignore unsupported/unknown alternate mode requests

commit 88d02c9ba2e83fc22d37ccb1f11c62ea6fc9ae50 upstream.

TCPM may receive PD messages associated with unknown or unsupported
alternate modes. If that happens, calls to typec_match_altmode()
will return NULL. The tcpm code does not currently take this into
account. This results in crashes.

Unable to handle kernel NULL pointer dereference at virtual address 000001f0
pgd = 41dad9a1
[000001f0] *pgd=00000000
Internal error: Oops: 5 [#1] THUMB2
Modules linked in: tcpci tcpm
CPU: 0 PID: 2338 Comm: kworker/u2:0 Not tainted 5.1.18-sama5-armv7-r2 #6
Hardware name: Atmel SAMA5
Workqueue: 2-0050 tcpm_pd_rx_handler [tcpm]
PC is at typec_altmode_attention+0x0/0x14
LR is at tcpm_pd_rx_handler+0xa3b/0xda0 [tcpm]
...
[<c03fbee8>] (typec_altmode_attention) from [<bf8030fb>]
(tcpm_pd_rx_handler+0xa3b/0xda0 [tcpm])
[<bf8030fb>] (tcpm_pd_rx_handler [tcpm]) from [<c012082b>]
(process_one_work+0x123/0x2a8)
[<c012082b>] (process_one_work) from [<c0120a6d>]
(worker_thread+0xbd/0x3b0)
[<c0120a6d>] (worker_thread) from [<c012431f>] (kthread+0xcf/0xf4)
[<c012431f>] (kthread) from [<c01010f9>] (ret_from_fork+0x11/0x38)

Ignore PD messages if the associated alternate mode is not supported.

Fixes: e9576fe8e605c ("usb: typec: tcpm: Support for Alternate Modes")
Cc: stable <stable@vger.kernel.org>
Reported-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Tested-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/1564761822-13984-1-git-send-email-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: typec: tcpm: Add NULL check before dereferencing config
Guenter Roeck [Wed, 24 Jul 2019 14:38:32 +0000 (07:38 -0700)]
usb: typec: tcpm: Add NULL check before dereferencing config

commit 1957de95d425d1c06560069dc7277a73a8b28683 upstream.

When instantiating tcpm on an NXP OM 13588 board with NXP PTN5110,
the following crash is seen when writing into the 'preferred_role'
sysfs attribute.

Unable to handle kernel NULL pointer dereference at virtual address 00000028
pgd = f69149ad
[00000028] *pgd=00000000
Internal error: Oops: 5 [#1] THUMB2
Modules linked in: tcpci tcpm
CPU: 0 PID: 1882 Comm: bash Not tainted 5.1.18-sama5-armv7-r2 #4
Hardware name: Atmel SAMA5
PC is at tcpm_try_role+0x3a/0x4c [tcpm]
LR is at tcpm_try_role+0x15/0x4c [tcpm]
pc : [<bf8000e2>]    lr : [<bf8000bd>]    psr: 60030033
sp : dc1a1e88  ip : c03fb47d  fp : 00000000
r10: dc216190  r9 : dc1a1f78  r8 : 00000001
r7 : df4ae044  r6 : dd032e90  r5 : dd1ce340  r4 : df4ae054
r3 : 00000000  r2 : 00000000  r1 : 00000000  r0 : df4ae044
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
Control: 50c53c7d  Table: 3efec059  DAC: 00000051
Process bash (pid: 1882, stack limit = 0x6a6d4aa5)
Stack: (0xdc1a1e88 to 0xdc1a2000)
1e80:                   dd05d808 dd1ce340 00000001 00000007 dd1ce340 c03fb4a7
1ea0: 00000007 00000007 dc216180 00000000 00000000 c01e1e03 00000000 00000000
1ec0: c0907008 dee98b40 c01e1d5d c06106c4 00000000 00000000 00000007 c0194e8b
1ee0: 0000000a 00000400 00000000 c01a97db dc22bf00 ffffe000 df4b6a00 df745900
1f00: 00000001 00000001 000000dd c01a9c2f 7aeab3be c0907008 00000000 dc22bf00
1f20: c0907008 00000000 00000000 00000000 00000000 7aeab3be 00000007 dee98b40
1f40: 005dc318 dc1a1f78 00000000 00000000 00000007 c01969f7 0000000a c01a20cb
1f60: dee98b40 c0907008 dee98b40 005dc318 00000000 c0196b9b 00000000 00000000
1f80: dee98b40 7aeab3be 00000074 005dc318 b6f3bdb0 00000004 c0101224 dc1a0000
1fa0: 00000004 c0101001 00000074 005dc318 00000001 005dc318 00000007 00000000
1fc0: 00000074 005dc318 b6f3bdb0 00000004 00000007 00000007 00000000 00000000
1fe0: 00000004 be800880 b6ed35b3 b6e5c746 60030030 00000001 00000000 00000000
[<bf8000e2>] (tcpm_try_role [tcpm]) from [<c03fb4a7>] (preferred_role_store+0x2b/0x5c)
[<c03fb4a7>] (preferred_role_store) from [<c01e1e03>] (kernfs_fop_write+0xa7/0x150)
[<c01e1e03>] (kernfs_fop_write) from [<c0194e8b>] (__vfs_write+0x1f/0x104)
[<c0194e8b>] (__vfs_write) from [<c01969f7>] (vfs_write+0x6b/0x104)
[<c01969f7>] (vfs_write) from [<c0196b9b>] (ksys_write+0x43/0x94)
[<c0196b9b>] (ksys_write) from [<c0101001>] (ret_fast_syscall+0x1/0x62)

Since commit 96232cbc6c994 ("usb: typec: tcpm: support get typec and pd
config from device properties"), the 'config' pointer in struct tcpc_dev
is optional when registering a Type-C port. Since it is optional, we have
to check if it is NULL before dereferencing it.

Reported-by: Douglas Gilbert <dgilbert@interlog.com>
Cc: Douglas Gilbert <dgilbert@interlog.com>
Fixes: 96232cbc6c994 ("usb: typec: tcpm: support get typec and pd config from device properties")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Jun Li <jun.li@nxp.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Link: https://lore.kernel.org/r/1563979112-22483-1-git-send-email-linux@roeck-us.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: typec: tcpm: remove tcpm dir if no children
Li Jun [Wed, 17 Jul 2019 08:06:46 +0000 (16:06 +0800)]
usb: typec: tcpm: remove tcpm dir if no children

commit 12ca7297b8855c0af1848503d37196159b24e6b9 upstream.

If config tcpm as module, module unload will not remove tcpm dir,
then the next module load will have problem: the rootdir is NULL
but tcpm dir is still there, so tcpm_debugfs_init() will create
tcpm dir again with failure, fix it by remove the tcpm dir if no
children.

Cc: stable@vger.kernel.org # v4.15+
Fixes: 4b4e02c83167 ("typec: tcpm: Move out of staging")
Signed-off-by: Li Jun <jun.li@nxp.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20190717080646.30421-2-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: typec: tcpm: free log buf memory when remove debug file
Li Jun [Wed, 17 Jul 2019 08:06:45 +0000 (16:06 +0800)]
usb: typec: tcpm: free log buf memory when remove debug file

commit fd5da3e2cc61b4a7c877172fdc9348c82cf6ccfc upstream.

The logbuffer memory should be freed when remove debug file.

Cc: stable@vger.kernel.org # v4.15+
Fixes: 4b4e02c83167 ("typec: tcpm: Move out of staging")
Signed-off-by: Li Jun <jun.li@nxp.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20190717080646.30421-1-jun.li@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: yurex: Fix use-after-free in yurex_delete
Suzuki K Poulose [Mon, 5 Aug 2019 11:15:28 +0000 (12:15 +0100)]
usb: yurex: Fix use-after-free in yurex_delete

commit fc05481b2fcabaaeccf63e32ac1baab54e5b6963 upstream.

syzbot reported the following crash [0]:

BUG: KASAN: use-after-free in usb_free_coherent+0x79/0x80
drivers/usb/core/usb.c:928
Read of size 8 at addr ffff8881b18599c8 by task syz-executor.4/16007

CPU: 0 PID: 16007 Comm: syz-executor.4 Not tainted 5.3.0-rc2+ #23
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xca/0x13e lib/dump_stack.c:113
  print_address_description+0x6a/0x32c mm/kasan/report.c:351
  __kasan_report.cold+0x1a/0x33 mm/kasan/report.c:482
  kasan_report+0xe/0x12 mm/kasan/common.c:612
  usb_free_coherent+0x79/0x80 drivers/usb/core/usb.c:928
  yurex_delete+0x138/0x330 drivers/usb/misc/yurex.c:100
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x413511
Code: 75 14 b8 03 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 1b 00 00 c3 48
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 03 00 00 00 0f 05 <48> 8b 3c 24 48
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007ffc424ea2e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 0000000000000007 RCX: 0000000000413511
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000006
RBP: 0000000000000001 R08: 0000000029a2fc22 R09: 0000000029a2fc26
R10: 00007ffc424ea3c0 R11: 0000000000000293 R12: 000000000075c9a0
R13: 000000000075c9a0 R14: 0000000000761938 R15: ffffffffffffffff

Allocated by task 2776:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_kmalloc mm/kasan/common.c:487 [inline]
  __kasan_kmalloc.constprop.0+0xbf/0xd0 mm/kasan/common.c:460
  kmalloc include/linux/slab.h:552 [inline]
  kzalloc include/linux/slab.h:748 [inline]
  usb_alloc_dev+0x51/0xf95 drivers/usb/core/usb.c:583
  hub_port_connect drivers/usb/core/hub.c:5004 [inline]
  hub_port_connect_change drivers/usb/core/hub.c:5213 [inline]
  port_event drivers/usb/core/hub.c:5359 [inline]
  hub_event+0x15c0/0x3640 drivers/usb/core/hub.c:5441
  process_one_work+0x92b/0x1530 kernel/workqueue.c:2269
  worker_thread+0x96/0xe20 kernel/workqueue.c:2415
  kthread+0x318/0x420 kernel/kthread.c:255
  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352

Freed by task 16007:
  save_stack+0x1b/0x80 mm/kasan/common.c:69
  set_track mm/kasan/common.c:77 [inline]
  __kasan_slab_free+0x130/0x180 mm/kasan/common.c:449
  slab_free_hook mm/slub.c:1423 [inline]
  slab_free_freelist_hook mm/slub.c:1470 [inline]
  slab_free mm/slub.c:3012 [inline]
  kfree+0xe4/0x2f0 mm/slub.c:3953
  device_release+0x71/0x200 drivers/base/core.c:1064
  kobject_cleanup lib/kobject.c:693 [inline]
  kobject_release lib/kobject.c:722 [inline]
  kref_put include/linux/kref.h:65 [inline]
  kobject_put+0x171/0x280 lib/kobject.c:739
  put_device+0x1b/0x30 drivers/base/core.c:2213
  usb_put_dev+0x1f/0x30 drivers/usb/core/usb.c:725
  yurex_delete+0x40/0x330 drivers/usb/misc/yurex.c:95
  kref_put include/linux/kref.h:65 [inline]
  yurex_release+0x66/0x90 drivers/usb/misc/yurex.c:392
  __fput+0x2d7/0x840 fs/file_table.c:280
  task_work_run+0x13f/0x1c0 kernel/task_work.c:113
  tracehook_notify_resume include/linux/tracehook.h:188 [inline]
  exit_to_usermode_loop+0x1d2/0x200 arch/x86/entry/common.c:163
  prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
  syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
  do_syscall_64+0x45f/0x580 arch/x86/entry/common.c:299
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8881b1859980
  which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 72 bytes inside of
  2048-byte region [ffff8881b1859980ffff8881b185a180)
The buggy address belongs to the page:
page:ffffea0006c61600 refcount:1 mapcount:0 mapping:ffff8881da00c000
index:0x0 compound_mapcount: 0
flags: 0x200000000010200(slab|head)
raw: 0200000000010200 0000000000000000 0000000100000001 ffff8881da00c000
raw: 0000000000000000 00000000000f000f 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff8881b1859880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
  ffff8881b1859900: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff8881b1859980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                               ^
  ffff8881b1859a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ffff8881b1859a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================

A quick look at the yurex_delete() shows that we drop the reference
to the usb_device before releasing any buffers associated with the
device. Delay the reference drop until we have finished the cleanup.

[0] https://lore.kernel.org/lkml/0000000000003f86d8058f0bd671@google.com/

Fixes: 6bc235a2e24a5e ("USB: add driver for Meywa-Denki & Kayac YUREX")
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Tomoki Sekiyama <tomoki.sekiyama@gmail.com>
Cc: Oliver Neukum <oneukum@suse.com>
Cc: andreyknvl@google.com
Cc: gregkh@linuxfoundation.org
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: syzkaller-bugs@googlegroups.com
Cc: dtor@chromium.org
Reported-by: syzbot+d1fedb1c1fdb07fca507@syzkaller.appspotmail.com
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190805111528.6758-1-suzuki.poulose@arm.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: host: xhci-rcar: Fix timeout in xhci_suspend()
Yoshihiro Shimoda [Fri, 2 Aug 2019 08:33:35 +0000 (17:33 +0900)]
usb: host: xhci-rcar: Fix timeout in xhci_suspend()

commit 783bda5e41acc71f98336e1a402c180f9748e5dc upstream.

When a USB device is connected to the host controller and
the system enters suspend, the following error happens
in xhci_suspend():

xhci-hcd ee000000.usb: WARN: xHC CMD_RUN timeout

Since the firmware/internal CPU control the USBSTS.STS_HALT
and the process speed is down when the roothub port enters U3,
long delay for the handshake of STS_HALT is neeed in xhci_suspend().
So, this patch adds to set the XHCI_SLOW_SUSPEND.

Fixes: 435cc1138ec9 ("usb: host: xhci-plat: set resume_quirk() for R-Car controllers")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/1564734815-17964-1-git-send-email-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogfs2: gfs2_walk_metadata fix
Andreas Gruenbacher [Mon, 5 Aug 2019 11:22:03 +0000 (12:22 +0100)]
gfs2: gfs2_walk_metadata fix

commit a27a0c9b6a208722016c8ec5ad31ec96082b91ec upstream.

It turns out that the current version of gfs2_metadata_walker suffers
from multiple problems that can cause gfs2_hole_size to report an
incorrect size.  This will confuse fiemap as well as lseek with the
SEEK_DATA flag.

Fix that by changing gfs2_hole_walker to compute the metapath to the
first data block after the hole (if any), and compute the hole size
based on that.

Fixes xfstest generic/490.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS
Nick Desaulniers [Wed, 7 Aug 2019 22:15:33 +0000 (15:15 -0700)]
x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS

commit b059f801a937d164e03b33c1848bb3dca67c0b04 upstream.

KBUILD_CFLAGS is very carefully built up in the top level Makefile,
particularly when cross compiling or using different build tools.
Resetting KBUILD_CFLAGS via := assignment is an antipattern.

The comment above the reset mentions that -pg is problematic.  Other
Makefiles use `CFLAGS_REMOVE_file.o = $(CC_FLAGS_FTRACE)` when
CONFIG_FUNCTION_TRACER is set. Prefer that pattern to wiping out all of
the important KBUILD_CFLAGS then manually having to re-add them. Seems
also that __stack_chk_fail references are generated when using
CONFIG_STACKPROTECTOR or CONFIG_STACKPROTECTOR_STRONG.

Fixes: 8fc5b4d4121c ("purgatory: core purgatory functionality")
Reported-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Vaibhav Rustagi <vaibhavrustagi@google.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190807221539.94583-2-ndesaulniers@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf record: Fix module size on s390
Thomas Richter [Wed, 24 Jul 2019 12:27:02 +0000 (14:27 +0200)]
perf record: Fix module size on s390

commit 12a6d2940b5f02b4b9f71ce098e3bb02bc24a9ea upstream.

On s390 the modules loaded in memory have the text segment located after
the GOT and Relocation table. This can be seen with this output:

  [root@m35lp76 perf]# fgrep qeth /proc/modules
  qeth 151552 1 qeth_l2, Live 0x000003ff800b2000
  ...
  [root@m35lp76 perf]# cat /sys/module/qeth/sections/.text
  0x000003ff800b3990
  [root@m35lp76 perf]#

There is an offset of 0x1990 bytes. The size of the qeth module is
151552 bytes (0x25000 in hex).

The location of the GOT/relocation table at the beginning of a module is
unique to s390.

commit 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map")
adjusts the start address of a module in the map structures, but does
not adjust the size of the modules. This leads to overlapping of module
maps as this example shows:

[root@m35lp76 perf] # ./perf report -D
     0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000)
          @ 0]:  x /lib/modules/.../qeth.ko.xz
     0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000)
          @ 0]:  x /lib/modules/.../ip6_tables.ko.xz

The module qeth.ko has an adjusted start address modified to b3990, but
its size is unchanged and the module ends at 0x3ff800d8990.  This end
address overlaps with the next modules start address of 0x3ff800d85a0.

When the size of the leading GOT/Relocation table stored in the
beginning of the text segment (0x1990 bytes) is subtracted from module
qeth end address, there are no overlaps anymore:

   0x3ff800d8990 - 0x1990 = 0x0x3ff800d7000

which is the same as

   0x3ff800b2000 + 0x25000 = 0x0x3ff800d7000.

To fix this issue, also adjust the modules size in function
arch__fix_module_text_start(). Add another function parameter named size
and reduce the size of the module when the text segment start address is
changed.

Output after:
     0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670)
          @ 0]:  x /lib/modules/.../qeth.ko.xz
     0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60)
          @ 0]:  x /lib/modules/.../ip6_tables.ko.xz

Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map")
Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf db-export: Fix thread__exec_comm()
Adrian Hunter [Thu, 8 Aug 2019 06:48:23 +0000 (09:48 +0300)]
perf db-export: Fix thread__exec_comm()

commit 3de7ae0b2a1d86dbb23d0cb135150534fdb2e836 upstream.

Threads synthesized from /proc have comms with a start time of zero, and
not marked as "exec". Currently, there can be 2 such comms. The first is
created by processing a synthesized fork event and is set to the
parent's comm string, and the second by processing a synthesized comm
event set to the thread's current comm string.

In the absence of an "exec" comm, thread__exec_comm() picks the last
(oldest) comm, which, in the case above, is the parent's comm string.
For a main thread, that is very probably wrong. Use the second-to-last
in that case.

This affects only db-export because it is the only user of
thread__exec_comm().

Example:

  $ sudo perf record -a -o pt-a-sleep-1 -e intel_pt//u -- sleep 1
  $ sudo chown ahunter pt-a-sleep-1

Before:

  $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1.db branches calls
  $ sqlite3 -header -column pt-a-sleep-1.db 'select * from comm_threads_view'
  comm_id     command     thread_id   pid         tid
  ----------  ----------  ----------  ----------  ----------
  1           swapper     1           0           0
  2           rcu_sched   2           10          10
  3           kthreadd    3           78          78
  5           sudo        4           15180       15180
  5           sudo        5           15180       15182
  7           kworker/4:  6           10335       10335
  8           kthreadd    7           55          55
  10          systemd     8           865         865
  10          systemd     9           865         875
  13          perf        10          15181       15181
  15          sleep       10          15181       15181
  16          kworker/3:  11          14179       14179
  17          kthreadd    12          29376       29376
  19          systemd     13          746         746
  21          systemd     14          401         401
  23          systemd     15          879         879
  23          systemd     16          879         945
  25          kthreadd    17          556         556
  27          kworker/u1  18          14136       14136
  28          kworker/u1  19          15021       15021
  29          kthreadd    20          509         509
  31          systemd     21          836         836
  31          systemd     22          836         967
  33          systemd     23          1148        1148
  33          systemd     24          1148        1163
  35          kworker/2:  25          17988       17988
  36          kworker/0:  26          13478       13478

After:

  $ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1b.db branches calls
  $ sqlite3 -header -column pt-a-sleep-1b.db 'select * from comm_threads_view'
  comm_id     command     thread_id   pid         tid
  ----------  ----------  ----------  ----------  ----------
  1           swapper     1           0           0
  2           rcu_sched   2           10          10
  3           kswapd0     3           78          78
  4           perf        4           15180       15180
  4           perf        5           15180       15182
  6           kworker/4:  6           10335       10335
  7           kcompactd0  7           55          55
  8           accounts-d  8           865         865
  8           accounts-d  9           865         875
  10          perf        10          15181       15181
  12          sleep       10          15181       15181
  13          kworker/3:  11          14179       14179
  14          kworker/1:  12          29376       29376
  15          haveged     13          746         746
  16          systemd-jo  14          401         401
  17          NetworkMan  15          879         879
  17          NetworkMan  16          879         945
  19          irq/131-iw  17          556         556
  20          kworker/u1  18          14136       14136
  21          kworker/u1  19          15021       15021
  22          kworker/u1  20          509         509
  23          thermald    21          836         836
  23          thermald    22          836         967
  25          unity-sett  23          1148        1148
  25          unity-sett  24          1148        1163
  27          kworker/2:  25          17988       17988
  28          kworker/0:  26          13478       13478

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 65de51f93ebf ("perf tools: Identify which comms are from exec")
Link: http://lkml.kernel.org/r/20190808064823.14846-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoperf annotate: Fix s390 gap between kernel end and module start
Thomas Richter [Wed, 24 Jul 2019 12:27:03 +0000 (14:27 +0200)]
perf annotate: Fix s390 gap between kernel end and module start

commit b9c0a64901d5bdec6eafd38d1dc8fa0e2974fccb upstream.

During execution of command 'perf top' the error message:

   Not enough memory for annotating '__irf_end' symbol!)

is emitted from this call sequence:
  __cmd_top
    perf_top__mmap_read
      perf_top__mmap_read_idx
        perf_event__process_sample
          hist_entry_iter__add
            hist_iter__top_callback
              perf_top__record_precise_ip
                hist_entry__inc_addr_samples
                  symbol__inc_addr_samples
                    symbol__get_annotation
                      symbol__alloc_hist

In this function the size of symbol __irf_end is calculated. The size of
a symbol is the difference between its start and end address.

When the symbol was read the first time, its start and end was set to:

   symbol__new: __irf_end 0xe954d0-0xe954d0

which is correct and maps with /proc/kallsyms:

   root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
   0000000000e954d0 t __irf_end
   root@s8360046:~/linux-4.15.0/tools/perf#

In function symbol__alloc_hist() the end of symbol __irf_end is

  symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff80045a8

which is identical with the first module entry in /proc/kallsyms

This results in a symbol size of __irf_req for histogram analyses of
70334140059072 bytes and a malloc() for this requested size fails.

The root cause of this is function
  __dso__load_kallsyms()
  +-> symbols__fixup_end()

Function symbols__fixup_end() enlarges the last symbol in the kallsyms
map:

   # fgrep __irf_end /proc/kallsyms
   0000000000e954d0 t __irf_end
   #

to the start address of the first module:
   # cat /proc/kallsyms | sort  | egrep ' [tT] '
   ....
   0000000000e952d0 T __security_initcall_end
   0000000000e954d0 T __initramfs_size
   0000000000e954d0 t __irf_end
   000003ff800045a8 T fc_get_event_number       [scsi_transport_fc]
   000003ff800045d0 t store_fc_vport_disable    [scsi_transport_fc]
   000003ff800046a8 T scsi_is_fc_rport  [scsi_transport_fc]
   000003ff800046d0 t fc_target_setup   [scsi_transport_fc]

On s390 the kernel is located around memory address 0x200, 0x10000 or
0x100000, depending on linux version. Modules however start some- where
around 0x3ff xxxx xxxx.

This is different than x86 and produces a large gap for which histogram
allocation fails.

Fix this by detecting the kernel's last symbol and do no adjustment for
it. Introduce a weak function and handle s390 specifics.

Reported-by: Klaus Theurich <klaus.theurich@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190724122703.3996-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()
Joerg Roedel [Fri, 19 Jul 2019 18:46:52 +0000 (20:46 +0200)]
mm/vmalloc: Sync unmappings in __purge_vmap_area_lazy()

commit 3f8fd02b1bf1d7ba964485a56f2f4b53ae88c167 upstream.

On x86-32 with PTI enabled, parts of the kernel page-tables are not shared
between processes. This can cause mappings in the vmalloc/ioremap area to
persist in some page-tables after the region is unmapped and released.

When the region is re-used the processes with the old mappings do not fault
in the new mappings but still access the old ones.

This causes undefined behavior, in reality often data corruption, kernel
oopses and panics and even spontaneous reboots.

Fix this problem by activly syncing unmaps in the vmalloc/ioremap area to
all page-tables in the system before the regions can be re-used.

References: https://bugzilla.suse.com/show_bug.cgi?id=1118689
Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-4-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/mm: Sync also unmappings in vmalloc_sync_all()
Joerg Roedel [Fri, 19 Jul 2019 18:46:51 +0000 (20:46 +0200)]
x86/mm: Sync also unmappings in vmalloc_sync_all()

commit 8e998fc24de47c55b47a887f6c95ab91acd4a720 upstream.

With huge-page ioremap areas the unmappings also need to be synced between
all page-tables. Otherwise it can cause data corruption when a region is
unmapped and later re-used.

Make the vmalloc_sync_one() function ready to sync unmappings and make sure
vmalloc_sync_all() iterates over all page-tables even when an unmapped PMD
is found.

Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-3-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/mm: Check for pfn instead of page in vmalloc_sync_one()
Joerg Roedel [Fri, 19 Jul 2019 18:46:50 +0000 (20:46 +0200)]
x86/mm: Check for pfn instead of page in vmalloc_sync_one()

commit 51b75b5b563a2637f9d8dc5bd02a31b2ff9e5ea0 upstream.

Do not require a struct page for the mapped memory location because it
might not exist. This can happen when an ioremapped region is mapped with
2MB pages.

Fixes: 5d72b4fba40ef ('x86, mm: support huge I/O mapping capability I/F')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20190719184652.11391-2-joro@8bytes.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoInput: synaptics - enable RMI mode for HP Spectre X360
Dmitry Torokhov [Fri, 12 Jul 2019 18:37:17 +0000 (11:37 -0700)]
Input: synaptics - enable RMI mode for HP Spectre X360

commit 25f8c834e2a6871920cc1ca113f02fb301d007c3 upstream.

The 2016 kabylake HP Spectre X360 (model number 13-w013dx) works much better
with psmouse.synaptics_intertouch=1 kernel parameter, so let's enable RMI4
mode automatically.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=204115
Reported-by: Nate Graham <pointedstick@zoho.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoInput: elantech - enable SMBus on new (2018+) systems
Kai-Heng Feng [Mon, 22 Jul 2019 07:56:55 +0000 (10:56 +0300)]
Input: elantech - enable SMBus on new (2018+) systems

commit 883a2a80f79ca5c0c105605fafabd1f3df99b34c upstream.

There are some new HP laptops with Elantech touchpad that don't support
multitouch.

Currently we use ETP_NEW_IC_SMBUS_HOST_NOTIFY() to check if SMBus is supported,
but in addition to firmware version, the bus type also informs us whether the IC
can support SMBus. To avoid breaking old ICs, we will only enable SMbus support
based the bus type on systems manufactured after 2018.

Lastly, let's consolidate all checks into elantech_use_host_notify() and use it
to determine whether to use PS/2 or SMBus.

Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoInput: usbtouchscreen - initialize PM mutex before using it
Oliver Neukum [Thu, 1 Aug 2019 16:40:26 +0000 (09:40 -0700)]
Input: usbtouchscreen - initialize PM mutex before using it

commit b55d996f057bf2e7ba9422a80b5e17e99860cb0b upstream.

Mutexes shall be initialized before they are used.

Fixes: 12e510dbc57b2 ("Input: usbtouchscreen - fix deadlock in autosuspend")
Reported-by: syzbot+199ea16c7f26418b4365@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoloop: set PF_MEMALLOC_NOIO for the worker thread
Mikulas Patocka [Thu, 8 Aug 2019 15:17:01 +0000 (11:17 -0400)]
loop: set PF_MEMALLOC_NOIO for the worker thread

commit d0a255e795ab976481565f6ac178314b34fbf891 upstream.

A deadlock with this stacktrace was observed.

The loop thread does a GFP_KERNEL allocation, it calls into dm-bufio
shrinker and the shrinker depends on I/O completion in the dm-bufio
subsystem.

In order to fix the deadlock (and other similar ones), we set the flag
PF_MEMALLOC_NOIO at loop thread entry.

PID: 474    TASK: ffff8813e11f4600  CPU: 10  COMMAND: "kswapd0"
   #0 [ffff8813dedfb938] __schedule at ffffffff8173f405
   #1 [ffff8813dedfb990] schedule at ffffffff8173fa27
   #2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec
   #3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186
   #4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f
   #5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8
   #6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81
   #7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio]
   #8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio]
   #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio]
  #10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce
  #11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778
  #12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f
  #13 [ffff8813dedfbec0] kthread at ffffffff810a8428
  #14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242

  PID: 14127  TASK: ffff881455749c00  CPU: 11  COMMAND: "loop1"
   #0 [ffff88272f5af228] __schedule at ffffffff8173f405
   #1 [ffff88272f5af280] schedule at ffffffff8173fa27
   #2 [ffff88272f5af2a0] schedule_preempt_disabled at ffffffff8173fd5e
   #3 [ffff88272f5af2b0] __mutex_lock_slowpath at ffffffff81741fb5
   #4 [ffff88272f5af330] mutex_lock at ffffffff81742133
   #5 [ffff88272f5af350] dm_bufio_shrink_count at ffffffffa03865f9 [dm_bufio]
   #6 [ffff88272f5af380] shrink_slab at ffffffff811a86bd
   #7 [ffff88272f5af470] shrink_zone at ffffffff811ad778
   #8 [ffff88272f5af500] do_try_to_free_pages at ffffffff811adb34
   #9 [ffff88272f5af590] try_to_free_pages at ffffffff811adef8
  #10 [ffff88272f5af610] __alloc_pages_nodemask at ffffffff811a09c3
  #11 [ffff88272f5af710] alloc_pages_current at ffffffff811e8b71
  #12 [ffff88272f5af760] new_slab at ffffffff811f4523
  #13 [ffff88272f5af7b0] __slab_alloc at ffffffff8173a1b5
  #14 [ffff88272f5af880] kmem_cache_alloc at ffffffff811f484b
  #15 [ffff88272f5af8d0] do_blockdev_direct_IO at ffffffff812535b3
  #16 [ffff88272f5afb00] __blockdev_direct_IO at ffffffff81255dc3
  #17 [ffff88272f5afb30] xfs_vm_direct_IO at ffffffffa01fe3fc [xfs]
  #18 [ffff88272f5afb90] generic_file_read_iter at ffffffff81198994
  #19 [ffff88272f5afc50] __dta_xfs_file_read_iter_2398 at ffffffffa020c970 [xfs]
  #20 [ffff88272f5afcc0] lo_rw_aio at ffffffffa0377042 [loop]
  #21 [ffff88272f5afd70] loop_queue_work at ffffffffa0377c3b [loop]
  #22 [ffff88272f5afe60] kthread_worker_fn at ffffffff810a8a0c
  #23 [ffff88272f5afec0] kthread at ffffffff810a8428
  #24 [ffff88272f5aff50] ret_from_fork at ffffffff81745242

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: cavium: Add the missing dma unmap when the dma has finished.
Kevin Hao [Fri, 26 Jul 2019 02:30:49 +0000 (10:30 +0800)]
mmc: cavium: Add the missing dma unmap when the dma has finished.

commit b803974a86039913d5280add083d730b2b9ed8ec upstream.

This fixes the below calltrace when the CONFIG_DMA_API_DEBUG is enabled.
  DMA-API: thunderx_mmc 0000:01:01.4: cpu touching an active dma mapped cacheline [cln=0x000000002fdf9800]
  WARNING: CPU: 21 PID: 1 at kernel/dma/debug.c:596 debug_dma_assert_idle+0x1f8/0x270
  Modules linked in:
  CPU: 21 PID: 1 Comm: init Not tainted 5.3.0-rc1-next-20190725-yocto-standard+ #64
  Hardware name: Marvell OcteonTX CN96XX board (DT)
  pstate: 80400009 (Nzcv daif +PAN -UAO)
  pc : debug_dma_assert_idle+0x1f8/0x270
  lr : debug_dma_assert_idle+0x1f8/0x270
  sp : ffff0000113cfc10
  x29: ffff0000113cfc10 x28: 0000ffff8c880000
  x27: ffff800bc72a0000 x26: ffff000010ff8000
  x25: ffff000010ff8940 x24: ffff000010ff8968
  x23: 0000000000000000 x22: ffff000010e83700
  x21: ffff000010ea2000 x20: ffff000010e835c8
  x19: ffff800bc2c73300 x18: ffffffffffffffff
  x17: 0000000000000000 x16: 0000000000000000
  x15: ffff000010e835c8 x14: 6d20616d64206576
  x13: 69746361206e6120 x12: 676e696863756f74
  x11: 20757063203a342e x10: 31303a31303a3030
  x9 : 303020636d6d5f78 x8 : 3230303030303030
  x7 : 00000000000002fd x6 : ffff000010fd57d0
  x5 : 0000000000000000 x4 : ffff0000106c5210
  x3 : 00000000ffffffff x2 : 0000800bee9c0000
  x1 : 57d5843f4aa62800 x0 : 0000000000000000
  Call trace:
   debug_dma_assert_idle+0x1f8/0x270
   wp_page_copy+0xb0/0x688
   do_wp_page+0xa8/0x5b8
   __handle_mm_fault+0x600/0xd00
   handle_mm_fault+0x118/0x1e8
   do_page_fault+0x200/0x500
   do_mem_abort+0x50/0xb0
   el0_da+0x20/0x24
  ---[ end trace a005534bd23e109f ]---
  DMA-API: Mapped at:
   debug_dma_map_sg+0x94/0x350
   cvm_mmc_request+0x3c4/0x988
   __mmc_start_request+0x9c/0x1f8
   mmc_start_request+0x7c/0xb0
   mmc_blk_mq_issue_rq+0x5c4/0x7b8

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Fixes: ba3869ff32e4 ("mmc: cavium: Add core MMC driver for Cavium SOCs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agommc: cavium: Set the correct dma max segment size for mmc_host
Kevin Hao [Fri, 26 Jul 2019 02:30:48 +0000 (10:30 +0800)]
mmc: cavium: Set the correct dma max segment size for mmc_host

commit fa25eba6993b3750f417baabba169afaba076178 upstream.

We have set the mmc_host.max_seg_size to 8M, but the dma max segment
size of PCI device is set to 64K by default in function pci_device_add().
The mmc_host.max_seg_size is used to set the max segment size of
the blk queue. Then this mismatch will trigger a calltrace like below
when a bigger than 64K segment request arrives at mmc dev. So we should
consider the limitation of the cvm_mmc_host when setting the
mmc_host.max_seg_size.
  DMA-API: thunderx_mmc 0000:01:01.4: mapping sg segment longer than device claims to support [len=131072] [max=65536]
  WARNING: CPU: 6 PID: 238 at kernel/dma/debug.c:1221 debug_dma_map_sg+0x2b8/0x350
  Modules linked in:
  CPU: 6 PID: 238 Comm: kworker/6:1H Not tainted 5.3.0-rc1-next-20190724-yocto-standard+ #62
  Hardware name: Marvell OcteonTX CN96XX board (DT)
  Workqueue: kblockd blk_mq_run_work_fn
  pstate: 80c00009 (Nzcv daif +PAN +UAO)
  pc : debug_dma_map_sg+0x2b8/0x350
  lr : debug_dma_map_sg+0x2b8/0x350
  sp : ffff00001770f9e0
  x29: ffff00001770f9e0 x28: ffffffff00000000
  x27: 00000000ffffffff x26: ffff800bc2c73180
  x25: ffff000010e83700 x24: 0000000000000002
  x23: 0000000000000001 x22: 0000000000000001
  x21: 0000000000000000 x20: ffff800bc48ba0b0
  x19: ffff800bc97e8c00 x18: ffffffffffffffff
  x17: 0000000000000000 x16: 0000000000000000
  x15: ffff000010e835c8 x14: 6874207265676e6f
  x13: 6c20746e656d6765 x12: 7320677320676e69
  x11: 7070616d203a342e x10: 31303a31303a3030
  x9 : 303020636d6d5f78 x8 : 35363d78616d5b20
  x7 : 00000000000002fd x6 : ffff000010fd57dc
  x5 : 0000000000000000 x4 : ffff0000106c61f0
  x3 : 00000000ffffffff x2 : 0000800bee060000
  x1 : 7010678df3041a00 x0 : 0000000000000000
  Call trace:
   debug_dma_map_sg+0x2b8/0x350
   cvm_mmc_request+0x3c4/0x988
   __mmc_start_request+0x9c/0x1f8
   mmc_start_request+0x7c/0xb0
   mmc_blk_mq_issue_rq+0x5c4/0x7b8
   mmc_mq_queue_rq+0x11c/0x278
   blk_mq_dispatch_rq_list+0xb0/0x568
   blk_mq_do_dispatch_sched+0x6c/0x108
   blk_mq_sched_dispatch_requests+0x110/0x1b8
   __blk_mq_run_hw_queue+0xb0/0x118
   blk_mq_run_work_fn+0x28/0x38
   process_one_work+0x210/0x490
   worker_thread+0x48/0x458
   kthread+0x130/0x138
   ret_from_fork+0x10/0x1c

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Fixes: ba3869ff32e4 ("mmc: cavium: Add core MMC driver for Cavium SOCs")
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agosound: fix a memory leak bug
Wenwen Wang [Thu, 8 Aug 2019 05:15:21 +0000 (00:15 -0500)]
sound: fix a memory leak bug

commit c7cd7c748a3250ca33509f9235efab9c803aca09 upstream.

In sound_insert_unit(), the controlling structure 's' is allocated through
kmalloc(). Then it is added to the sound driver list by invoking
__sound_insert_unit(). Later on, if __register_chrdev() fails, 's' is
removed from the list through __sound_remove_unit(). If 'index' is not less
than 0, -EBUSY is returned to indicate the error. However, 's' is not
deallocated on this execution path, leading to a memory leak bug.

To fix the above issue, free 's' before -EBUSY is returned.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: iowarrior: fix deadlock on disconnect
Oliver Neukum [Thu, 8 Aug 2019 09:27:28 +0000 (11:27 +0200)]
usb: iowarrior: fix deadlock on disconnect

commit c468a8aa790e0dfe0a7f8a39db282d39c2c00b46 upstream.

We have to drop the mutex before we close() upon disconnect()
as close() needs the lock. This is safe to do by dropping the
mutex as intfdata is already set to NULL, so open() will fail.

Fixes: 03f36e885fc26 ("USB: open disconnect race in iowarrior")
Reported-by: syzbot+a64a382964bf6c71a9c0@syzkaller.appspotmail.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20190808092728.23417-1-oneukum@suse.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agousb: usbfs: fix double-free of usb memory upon submiturb error
Gavin Li [Sun, 4 Aug 2019 23:50:44 +0000 (16:50 -0700)]
usb: usbfs: fix double-free of usb memory upon submiturb error

commit c43f28dfdc4654e738aa6d3fd08a105b2bee758d upstream.

Upon an error within proc_do_submiturb(), dec_usb_memory_use_count()
gets called once by the error handling tail and again by free_async().
Remove the first call.

Signed-off-by: Gavin Li <git@thegavinli.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190804235044.22327-1-gavinli@thegavinli.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocrypto: ccp - Ignore tag length when decrypting GCM ciphertext
Gary R Hook [Tue, 30 Jul 2019 16:05:26 +0000 (16:05 +0000)]
crypto: ccp - Ignore tag length when decrypting GCM ciphertext

commit e2664ecbb2f26225ac6646876f2899558ffb2604 upstream.

AES GCM input buffers for decryption contain AAD+CTEXT+TAG. Only
decrypt the ciphertext, and use the tag for comparison.

Fixes: 36cf515b9bbe2 ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocrypto: ccp - Add support for valid authsize values less than 16
Gary R Hook [Tue, 30 Jul 2019 16:05:24 +0000 (16:05 +0000)]
crypto: ccp - Add support for valid authsize values less than 16

commit 9f00baf74e4b6f79a3a3dfab44fb7bb2e797b551 upstream.

AES GCM encryption allows for authsize values of 4, 8, and 12-16 bytes.
Validate the requested authsize, and retain it to save in the request
context.

Fixes: 36cf515b9bbe2 ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocrypto: ccp - Fix oops by properly managing allocated structures
Gary R Hook [Tue, 30 Jul 2019 16:05:22 +0000 (16:05 +0000)]
crypto: ccp - Fix oops by properly managing allocated structures

commit 25e44338321af545ab34243a6081c3f0fc6107d0 upstream.

A plaintext or ciphertext length of 0 is allowed in AES, in which case
no encryption occurs. Ensure that we don't clean up data structures
that were never allocated.

Fixes: 36cf515b9bbe2 ("crypto: ccp - Enable support for AES GCM on v5 CCPs")
Cc: <stable@vger.kernel.org>
Signed-off-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: android: ion: Bail out upon SIGKILL when allocating memory.
Tetsuo Handa [Mon, 1 Jul 2019 10:55:19 +0000 (19:55 +0900)]
staging: android: ion: Bail out upon SIGKILL when allocating memory.

commit 8f9e86ee795971eabbf372e6d804d6b8578287a7 upstream.

syzbot found that a thread can stall for minutes inside
ion_system_heap_allocate() after that thread was killed by SIGKILL [1].
Let's check for SIGKILL before doing memory allocation.

[1] https://syzkaller.appspot.com/bug?id=a0e3436829698d5824231251fad9d8e998f94f5e

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: stable <stable@vger.kernel.org>
Reported-by: syzbot <syzbot+8ab2d0f39fb79fe6ca40@syzkaller.appspotmail.com>
Acked-by: Laura Abbott <labbott@redhat.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://lore.kernel.org/r/d088f188-5f32-d8fc-b9a0-0b404f7501cc@I-love.SAKURA.ne.jp
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agostaging: gasket: apex: fix copy-paste typo
Ivan Bornyakov [Wed, 10 Jul 2019 20:45:18 +0000 (23:45 +0300)]
staging: gasket: apex: fix copy-paste typo

commit 66665bb9979246729562a09fcdbb101c83127989 upstream.

In sysfs_show() case-branches ATTR_KERNEL_HIB_PAGE_TABLE_SIZE and
ATTR_KERNEL_HIB_SIMPLE_PAGE_TABLE_SIZE do the same. It looks like
copy-paste mistake.

Signed-off-by: Ivan Bornyakov <brnkv.i1@gmail.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20190710204518.16814-1-brnkv.i1@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoiio: adc: max9611: Fix misuse of GENMASK macro
Joe Perches [Wed, 10 Jul 2019 05:04:17 +0000 (22:04 -0700)]
iio: adc: max9611: Fix misuse of GENMASK macro

commit ae8cc91a7d85e018c0c267f580820b2bb558cd48 upstream.

Arguments are supposed to be ordered high then low.

Signed-off-by: Joe Perches <joe@perches.com>
Fixes: 69780a3bbc0b ("iio: adc: Add Maxim max9611 ADC driver")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoiio: cros_ec_accel_legacy: Fix incorrect channel setting
Gwendal Grignou [Fri, 28 Jun 2019 19:17:09 +0000 (12:17 -0700)]
iio: cros_ec_accel_legacy: Fix incorrect channel setting

commit 6cdff99c9f7d7d28b87cf05dd464f7c7736332ae upstream.

INFO_SCALE is set both for each channel and all channels.
iio is using all channel setting, so the error was not user visible.

Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoLinux 4.19.66 v4.19.66
Greg Kroah-Hartman [Fri, 9 Aug 2019 15:52:35 +0000 (17:52 +0200)]
Linux 4.19.66

5 years agospi: bcm2835: Fix 3-wire mode if DMA is enabled
Lukas Wunner [Wed, 3 Jul 2019 10:29:31 +0000 (12:29 +0200)]
spi: bcm2835: Fix 3-wire mode if DMA is enabled

commit 8d8bef50365847134b51c1ec46786bc2873e4e47 upstream.

Commit 6935224da248 ("spi: bcm2835: enable support of 3-wire mode")
added 3-wire support to the BCM2835 SPI driver by setting the REN bit
(Read Enable) in the CS register when receiving data.  The REN bit puts
the transmitter in high-impedance state.  The driver recognizes that
data is to be received by checking whether the rx_buf of a transfer is
non-NULL.

Commit 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers
meeting certain conditions") subsequently broke 3-wire support because
it set the SPI_MASTER_MUST_RX flag which causes spi_map_msg() to replace
rx_buf with a dummy buffer if it is NULL.  As a result, rx_buf is
*always* non-NULL if DMA is enabled.

Reinstate 3-wire support by not only checking whether rx_buf is non-NULL,
but also checking that it is not the dummy buffer.

Fixes: 3ecd37edaa2a ("spi: bcm2835: enable dma modes for transfers meeting certain conditions")
Reported-by: Nuno Sá <nuno.sa@analog.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.2+
Cc: Martin Sperl <kernel@martin.sperl.org>
Acked-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://lore.kernel.org/r/328318841455e505370ef8ecad97b646c033dc8a.1562148527.git.lukas@wunner.de
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocgroup: Fix css_task_iter_advance_css_set() cset skip condition
Tejun Heo [Mon, 10 Jun 2019 16:08:27 +0000 (09:08 -0700)]
cgroup: Fix css_task_iter_advance_css_set() cset skip condition

commit c596687a008b579c503afb7a64fcacc7270fae9e upstream.

While adding handling for dying task group leaders c03cd7738a83
("cgroup: Include dying leaders with live threads in PROCS
iterations") added an inverted cset skip condition to
css_task_iter_advance_css_set().  It should skip cset if it's
completely empty but was incorrectly testing for the inverse condition
for the dying_tasks list.  Fix it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations")
Reported-by: syzbot+d4bba5ccd4f9a2a68681@syzkaller.appspotmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocgroup: css_task_iter_skip()'d iterators must be advanced before accessed
Tejun Heo [Wed, 5 Jun 2019 16:54:34 +0000 (09:54 -0700)]
cgroup: css_task_iter_skip()'d iterators must be advanced before accessed

commit cee0c33c546a93957a52ae9ab6bebadbee765ec5 upstream.

b636fd38dc40 ("cgroup: Implement css_task_iter_skip()") introduced
css_task_iter_skip() which is used to fix task iterations skipping
dying threadgroup leaders with live threads.  Skipping is implemented
as a subportion of full advancing but css_task_iter_next() forgot to
fully advance a skipped iterator before determining the next task to
visit causing it to return invalid task pointers.

Fix it by making css_task_iter_next() fully advance the iterator if it
has been skipped since the previous iteration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: syzbot
Link: http://lkml.kernel.org/r/00000000000097025d058a7fd785@google.com
Fixes: b636fd38dc40 ("cgroup: Implement css_task_iter_skip()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocgroup: Include dying leaders with live threads in PROCS iterations
Tejun Heo [Fri, 31 May 2019 17:38:58 +0000 (10:38 -0700)]
cgroup: Include dying leaders with live threads in PROCS iterations

commit c03cd7738a83b13739f00546166969342c8ff014 upstream.

CSS_TASK_ITER_PROCS currently iterates live group leaders; however,
this means that a process with dying leader and live threads will be
skipped.  IOW, cgroup.procs might be empty while cgroup.threads isn't,
which is confusing to say the least.

Fix it by making cset track dying tasks and include dying leaders with
live threads in PROCS iteration.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-and-tested-by: Topi Miettinen <toiwoton@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocgroup: Implement css_task_iter_skip()
Tejun Heo [Fri, 31 May 2019 17:38:58 +0000 (10:38 -0700)]
cgroup: Implement css_task_iter_skip()

commit b636fd38dc40113f853337a7d2a6885ad23b8811 upstream.

When a task is moved out of a cset, task iterators pointing to the
task are advanced using the normal css_task_iter_advance() call.  This
is fine but we'll be tracking dying tasks on csets and thus moving
tasks from cset->tasks to (to be added) cset->dying_tasks.  When we
remove a task from cset->tasks, if we advance the iterators, they may
move over to the next cset before we had the chance to add the task
back on the dying list, which can allow the task to escape iteration.

This patch separates out skipping from advancing.  Skipping only moves
the affected iterators to the next pointer rather than fully advancing
it and the following advancing will recognize that the cursor has
already been moved forward and do the rest of advancing.  This ensures
that when a task moves from one list to another in its cset, as long
as it moves in the right direction, it's always visible to iteration.

This doesn't cause any visible behavior changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocgroup: Call cgroup_release() before __exit_signal()
Tejun Heo [Fri, 31 May 2019 17:38:57 +0000 (10:38 -0700)]
cgroup: Call cgroup_release() before __exit_signal()

commit 6b115bf58e6f013ca75e7115aabcbd56c20ff31d upstream.

cgroup_release() calls cgroup_subsys->release() which is used by the
pids controller to uncharge its pid.  We want to use it to manage
iteration of dying tasks which requires putting it before
__unhash_process().  Move cgroup_release() above __exit_signal().
While this makes it uncharge before the pid is freed, pid is RCU freed
anyway and the window is very narrow.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agocompat_ioctl: pppoe: fix PPPOEIOCSFWD handling
Arnd Bergmann [Tue, 30 Jul 2019 19:25:20 +0000 (21:25 +0200)]
compat_ioctl: pppoe: fix PPPOEIOCSFWD handling

[ Upstream commit 055d88242a6046a1ceac3167290f054c72571cd9 ]

Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
linux-2.5.69 along with hundreds of other commands, but was always broken
sincen only the structure is compatible, but the command number is not,
due to the size being sizeof(size_t), or at first sizeof(sizeof((struct
sockaddr_pppox)), which is different on 64-bit architectures.

Guillaume Nault adds:

  And the implementation was broken until 2016 (see 29e73269aa4d ("pppoe:
  fix reference counting in PPPoE proxy")), and nobody ever noticed. I
  should probably have removed this ioctl entirely instead of fixing it.
  Clearly, it has never been used.

Fix it by adding a compat_ioctl handler for all pppoe variants that
translates the command number and then calls the regular ioctl function.

All other ioctl commands handled by pppoe are compatible between 32-bit
and 64-bit, and require compat_ptr() conversion.

This should apply to all stable kernels.

Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agor8169: don't use MSI before RTL8168d
Heiner Kallweit [Sat, 27 Jul 2019 10:45:10 +0000 (12:45 +0200)]
r8169: don't use MSI before RTL8168d

[ Upstream commit 003bd5b4a7b4a94b501e3a1e2e7c9df6b2a94ed4 ]

It was reported that after resuming from suspend network fails with
error "do_IRQ: 3.38 No irq handler for vector", see [0]. Enabling WoL
can work around the issue, but the only actual fix is to disable MSI.
So let's mimic the behavior of the vendor driver and disable MSI on
all chip versions before RTL8168d.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=204079

Fixes: 6c6aa15fdea5 ("r8169: improve interrupt handling")
Reported-by: Dušan Dragić <dragic.dusan@gmail.com>
Tested-by: Dušan Dragić <dragic.dusan@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5e: Prevent encap flow counter update async to user query
Ariel Levkovich [Sat, 6 Jul 2019 15:06:15 +0000 (18:06 +0300)]
net/mlx5e: Prevent encap flow counter update async to user query

[ Upstream commit 90bb769291161cf25a818d69cf608c181654473e ]

This patch prevents a race between user invoked cached counters
query and a neighbor last usage updater.

The cached flow counter stats can be queried by calling
"mlx5_fc_query_cached" which provides the number of bytes and
packets that passed via this flow since the last time this counter
was queried.
It does so by reducting the last saved stats from the current, cached
stats and then updating the last saved stats with the cached stats.
It also provide the lastuse value for that flow.

Since "mlx5e_tc_update_neigh_used_value" needs to retrieve the
last usage time of encapsulation flows, it calls the flow counter
query method periodically and async to user queries of the flow counter
using cls_flower.
This call is causing the driver to update the last reported bytes and
packets from the cache and therefore, future user queries of the flow
stats will return lower than expected number for bytes and packets
since the last saved stats in the driver was updated async to the last
saved stats in cls_flower.

This causes wrong stats presentation of encapsulation flows to user.

Since the neighbor usage updater only needs the lastuse stats from the
cached counter, the fix is to use a dedicated lastuse query call that
returns the lastuse value without synching between the cached stats and
the last saved stats.

Fixes: f6dfb4c3f216 ("net/mlx5e: Update neighbour 'used' state using HW flow rules counters")
Signed-off-by: Ariel Levkovich <lariel@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5: Fix modify_cq_in alignment
Edward Srouji [Tue, 23 Jul 2019 07:12:55 +0000 (10:12 +0300)]
net/mlx5: Fix modify_cq_in alignment

[ Upstream commit 7a32f2962c56d9d8a836b4469855caeee8766bd4 ]

Fix modify_cq_in alignment to match the device specification.
After this fix the 'cq_umem_valid' field will be in the right offset.

Cc: <stable@vger.kernel.org> # 4.19
Fixes: bd37197554eb ("net/mlx5: Update mlx5_ifc with DEVX UID bits")
Signed-off-by: Edward Srouji <edwards@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotun: mark small packets as owned by the tap sock
Alexis Bauvin [Tue, 23 Jul 2019 14:23:01 +0000 (16:23 +0200)]
tun: mark small packets as owned by the tap sock

[ Upstream commit 4b663366246be1d1d4b1b8b01245b2e88ad9e706 ]

- v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size

Small packets going out of a tap device go through an optimized code
path that uses build_skb() rather than sock_alloc_send_pskb(). The
latter calls skb_set_owner_w(), but the small packet code path does not.

The net effect is that small packets are not owned by the userland
application's socket (e.g. QEMU), while large packets are.
This can be seen with a TCP session, where packets are not owned when
the window size is small enough (around PAGE_SIZE), while they are once
the window grows (note that this requires the host to support virtio
tso for the guest to offload segmentation).
All this leads to inconsistent behaviour in the kernel, especially on
netfilter modules that uses sk->socket (e.g. xt_owner).

Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")
Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agotipc: compat: allow tipc commands without arguments
Taras Kondratiuk [Mon, 29 Jul 2019 22:15:07 +0000 (22:15 +0000)]
tipc: compat: allow tipc commands without arguments

[ Upstream commit 4da5f0018eef4c0de31675b670c80e82e13e99d1 ]

Commit 2753ca5d9009 ("tipc: fix uninit-value in tipc_nl_compat_doit")
broke older tipc tools that use compat interface (e.g. tipc-config from
tipcutils package):

% tipc-config -p
operation not supported

The commit started to reject TIPC netlink compat messages that do not
have attributes. It is too restrictive because some of such messages are
valid (they don't need any arguments):

% grep 'tx none' include/uapi/linux/tipc_config.h
#define  TIPC_CMD_NOOP              0x0000    /* tx none, rx none */
#define  TIPC_CMD_GET_MEDIA_NAMES   0x0002    /* tx none, rx media_name(s) */
#define  TIPC_CMD_GET_BEARER_NAMES  0x0003    /* tx none, rx bearer_name(s) */
#define  TIPC_CMD_SHOW_PORTS        0x0006    /* tx none, rx ultra_string */
#define  TIPC_CMD_GET_REMOTE_MNG    0x4003    /* tx none, rx unsigned */
#define  TIPC_CMD_GET_MAX_PORTS     0x4004    /* tx none, rx unsigned */
#define  TIPC_CMD_GET_NETID         0x400B    /* tx none, rx unsigned */
#define  TIPC_CMD_NOT_NET_ADMIN     0xC001    /* tx none, rx none */

This patch relaxes the original fix and rejects messages without
arguments only if such arguments are expected by a command (reg_type is
non zero).

Fixes: 2753ca5d9009 ("tipc: fix uninit-value in tipc_nl_compat_doit")
Cc: stable@vger.kernel.org
Signed-off-by: Taras Kondratiuk <takondra@cisco.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoocelot: Cancel delayed work before wq destruction
Claudiu Manoil [Thu, 25 Jul 2019 13:33:18 +0000 (16:33 +0300)]
ocelot: Cancel delayed work before wq destruction

[ Upstream commit c5d139697d5d9ecf9c7cd92d7d7838a173508900 ]

Make sure the delayed work for stats update is not pending before
wq destruction.
This fixes the module unload path.
The issue is there since day 1.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoNFC: nfcmrvl: fix gpio-handling regression
Johan Hovold [Mon, 5 Aug 2019 10:00:55 +0000 (12:00 +0200)]
NFC: nfcmrvl: fix gpio-handling regression

[ Upstream commit c3953a3c2d3175d2f9f0304c9a1ba89e7743c5e4 ]

Fix two reset-gpio sanity checks which were never converted to use
gpio_is_valid(), and make sure to use -EINVAL to indicate a missing
reset line also for the UART-driver module parameter and for the USB
driver.

This specifically prevents the UART and USB drivers from incidentally
trying to request and use gpio 0, and also avoids triggering a WARN() in
gpio_to_desc() during probe when no valid reset line has been specified.

Fixes: e33a3f84f88f ("NFC: nfcmrvl: allow gpio 0 for reset signalling")
Reported-by: syzbot+cf35b76f35e068a1107f@syzkaller.appspotmail.com
Tested-by: syzbot+cf35b76f35e068a1107f@syzkaller.appspotmail.com
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/smc: do not schedule tx_work in SMC_CLOSED state
Ursula Braun [Fri, 2 Aug 2019 08:16:38 +0000 (10:16 +0200)]
net/smc: do not schedule tx_work in SMC_CLOSED state

[ Upstream commit f9cedf1a9b1cdcfb0c52edb391d01771e43994a4 ]

The setsockopts options TCP_NODELAY and TCP_CORK may schedule the
tx worker. Make sure the socket is not yet moved into SMC_CLOSED
state (for instance by a shutdown SHUT_RDWR call).

Reported-by: syzbot+92209502e7aab127c75f@syzkaller.appspotmail.com
Reported-by: syzbot+b972214bb803a343f4fe@syzkaller.appspotmail.com
Fixes: 01d2f7e2cdd31 ("net/smc: sockopts TCP_NODELAY and TCP_CORK")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: sched: use temporary variable for actions indexes
Dmytro Linkin [Thu, 1 Aug 2019 13:02:51 +0000 (13:02 +0000)]
net: sched: use temporary variable for actions indexes

[ Upstream commit 7be8ef2cdbfe41a2e524b7c6cc3f8e6cfaa906e4 ]

Currently init call of all actions (except ipt) init their 'parm'
structure as a direct pointer to nla data in skb. This leads to race
condition when some of the filter actions were initialized successfully
(and were assigned with idr action index that was written directly
into nla data), but then were deleted and retried (due to following
action module missing or classifier-initiated retry), in which case
action init code tries to insert action to idr with index that was
assigned on previous iteration. During retry the index can be reused
by another action that was inserted concurrently, which causes
unintended action sharing between filters.
To fix described race condition, save action idr index to temporary
stack-allocated variable instead on nla data.

Fixes: 0190c1d452a9 ("net: sched: atomically check-allocate action")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet sched: update vlan action for batched events operations
Roman Mashak [Fri, 2 Aug 2019 19:16:46 +0000 (15:16 -0400)]
net sched: update vlan action for batched events operations

[ Upstream commit b35475c5491a14c8ce7a5046ef7bcda8a860581a ]

Add get_fill_size() routine used to calculate the action size
when building a batch of events.

Fixes: c7e2b9689 ("sched: introduce vlan action")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: sched: Fix a possible null-pointer dereference in dequeue_func()
Jia-Ju Bai [Mon, 29 Jul 2019 08:24:33 +0000 (16:24 +0800)]
net: sched: Fix a possible null-pointer dereference in dequeue_func()

[ Upstream commit 051c7b39be4a91f6b7d8c4548444e4b850f1f56c ]

In dequeue_func(), there is an if statement on line 74 to check whether
skb is NULL:
    if (skb)

When skb is NULL, it is used on line 77:
    prefetch(&skb->end);

Thus, a possible null-pointer dereference may occur.

To fix this bug, skb->end is used when skb is not NULL.

This bug is found by a static analysis tool STCheck written by us.

Fixes: 76e3cc126bb2 ("codel: Controlled Delay AQM")
Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: qualcomm: rmnet: Fix incorrect UL checksum offload logic
Subash Abhinov Kasiviswanathan [Thu, 25 Jul 2019 18:07:12 +0000 (12:07 -0600)]
net: qualcomm: rmnet: Fix incorrect UL checksum offload logic

[ Upstream commit a7cf3d24ee6081930feb4c830a7f6f16ebe31c49 ]

The udp_ip4_ind bit is set only for IPv4 UDP non-fragmented packets
so that the hardware can flip the checksum to 0xFFFF if the computed
checksum is 0 per RFC768.

However, this bit had to be set for IPv6 UDP non fragmented packets
as well per hardware requirements. Otherwise, IPv6 UDP packets
with computed checksum as 0 were transmitted by hardware and were
dropped in the network.

In addition to setting this bit for IPv6 UDP, the field is also
appropriately renamed to udp_ind as part of this change.

Fixes: 5eb5f8608ef1 ("net: qualcomm: rmnet: Add support for TX checksum offload")
Cc: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: phylink: Fix flow control for fixed-link
René van Dorst [Sat, 27 Jul 2019 09:40:11 +0000 (11:40 +0200)]
net: phylink: Fix flow control for fixed-link

[ Upstream commit 8aace4f3eba2a3ceb431e18683ea0e1ecbade5cd ]

In phylink_parse_fixedlink() the pl->link_config.advertising bits are AND
with pl->supported, pl->supported is zeroed and only the speed/duplex
modes and MII bits are set.
So pl->link_config.advertising always loses the flow control/pause bits.

By setting Pause and Asym_Pause bits in pl->supported, the flow control
work again when devicetree "pause" is set in fixes-link node and the MAC
advertise that is supports pause.

Results with this patch.

Legend:
- DT = 'Pause' is set in the fixed-link in devicetree.
- validate() = ‘Yes’ means phylink_set(mask, Pause) is set in the
  validate().
- flow = results reported my link is Up line.

+-----+------------+-------+
| DT  | validate() | flow  |
+-----+------------+-------+
| Yes | Yes        | rx/tx |
| No  | Yes        | off   |
| Yes | No         | off   |
+-----+------------+-------+

Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
Signed-off-by: René van Dorst <opensource@vdorst.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5: Use reversed order when unregister devices
Mark Zhang [Tue, 9 Jul 2019 02:37:12 +0000 (05:37 +0300)]
net/mlx5: Use reversed order when unregister devices

[ Upstream commit 08aa5e7da6bce1a1963f63cf32c2e7ad434ad578 ]

When lag is active, which is controlled by the bonded mlx5e netdev, mlx5
interface unregestering must happen in the reverse order where rdma is
unregistered (unloaded) first, to guarantee all references to the lag
context in hardware is removed, then remove mlx5e netdev interface which
will cleanup the lag context from hardware.

Without this fix during destroy of LAG interface, we observed following
errors:
 * mlx5_cmd_check:752:(pid 12556): DESTROY_LAG(0x843) op_mod(0x0) failed,
   status bad parameter(0x3), syndrome (0xe4ac33)
 * mlx5_cmd_check:752:(pid 12556): DESTROY_LAG(0x843) op_mod(0x0) failed,
   status bad parameter(0x3), syndrome (0xa5aee8).

Fixes: a31208b1e11d ("net/mlx5_core: New init and exit flow for mlx5_core")
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet/mlx5e: always initialize frag->last_in_page
Qian Cai [Thu, 1 Aug 2019 13:52:54 +0000 (09:52 -0400)]
net/mlx5e: always initialize frag->last_in_page

[ Upstream commit 60d60c8fbd8d1acf25b041ecd72ae4fa16e9405b ]

The commit 069d11465a80 ("net/mlx5e: RX, Enhance legacy Receive Queue
memory scheme") introduced an undefined behaviour below due to
"frag->last_in_page" is only initialized in mlx5e_init_frags_partition()
when,

if (next_frag.offset + frag_info[f].frag_stride > PAGE_SIZE)

or after bailed out the loop,

for (i = 0; i < mlx5_wq_cyc_get_size(&rq->wqe.wq); i++)

As the result, there could be some "frag" have uninitialized
value of "last_in_page".

Later, get_frag() obtains those "frag" and check "frag->last_in_page" in
mlx5e_put_rx_frag() and triggers the error during boot. Fix it by always
initializing "frag->last_in_page" to "false" in
mlx5e_init_frags_partition().

UBSAN: Undefined behaviour in
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c:325:12
load of value 170 is not a valid value for type 'bool' (aka '_Bool')
Call trace:
 dump_backtrace+0x0/0x264
 show_stack+0x20/0x2c
 dump_stack+0xb0/0x104
 __ubsan_handle_load_invalid_value+0x104/0x128
 mlx5e_handle_rx_cqe+0x8e8/0x12cc [mlx5_core]
 mlx5e_poll_rx_cq+0xca8/0x1a94 [mlx5_core]
 mlx5e_napi_poll+0x17c/0xa30 [mlx5_core]
 net_rx_action+0x248/0x940
 __do_softirq+0x350/0x7b8
 irq_exit+0x200/0x26c
 __handle_domain_irq+0xc8/0x128
 gic_handle_irq+0x138/0x228
 el1_irq+0xb8/0x140
 arch_cpu_idle+0x1a4/0x348
 do_idle+0x114/0x1b0
 cpu_startup_entry+0x24/0x28
 rest_init+0x1ac/0x1dc
 arch_call_rest_init+0x10/0x18
 start_kernel+0x4d4/0x57c

Fixes: 069d11465a80 ("net/mlx5e: RX, Enhance legacy Receive Queue memory scheme")
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: fix ifindex collision during namespace removal
Jiri Pirko [Sun, 28 Jul 2019 12:56:36 +0000 (14:56 +0200)]
net: fix ifindex collision during namespace removal

[ Upstream commit 55b40dbf0e76b4bfb9d8b3a16a0208640a9a45df ]

Commit aca51397d014 ("netns: Fix arbitrary net_device-s corruptions
on net_ns stop.") introduced a possibility to hit a BUG in case device
is returning back to init_net and two following conditions are met:
1) dev->ifindex value is used in a name of another "dev%d"
   device in init_net.
2) dev->name is used by another device in init_net.

Under real life circumstances this is hard to get. Therefore this has
been present happily for over 10 years. To reproduce:

$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 86:89:3f:86:61:29 brd ff:ff:ff:ff:ff:ff
3: enp0s2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
$ ip netns add ns1
$ ip -n ns1 link add dummy1ns1 type dummy
$ ip -n ns1 link add dummy2ns1 type dummy
$ ip link set enp0s2 netns ns1
$ ip -n ns1 link set enp0s2 name dummy0
[  100.858894] virtio_net virtio0 dummy0: renamed from enp0s2
$ ip link add dev4 type dummy
$ ip -n ns1 a
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: dummy1ns1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 16:63:4c:38:3e:ff brd ff:ff:ff:ff:ff:ff
3: dummy2ns1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether aa:9e:86:dd:6b:5d brd ff:ff:ff:ff:ff:ff
4: dummy0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
$ ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 86:89:3f:86:61:29 brd ff:ff:ff:ff:ff:ff
4: dev4: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 5a:e1:4a:b6:ec:f8 brd ff:ff:ff:ff:ff:ff
$ ip netns del ns1
[  158.717795] default_device_exit: failed to move dummy0 to init_net: -17
[  158.719316] ------------[ cut here ]------------
[  158.720591] kernel BUG at net/core/dev.c:9824!
[  158.722260] invalid opcode: 0000 [#1] SMP KASAN PTI
[  158.723728] CPU: 0 PID: 56 Comm: kworker/u2:1 Not tainted 5.3.0-rc1+ #18
[  158.725422] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-2.fc30 04/01/2014
[  158.727508] Workqueue: netns cleanup_net
[  158.728915] RIP: 0010:default_device_exit.cold+0x1d/0x1f
[  158.730683] Code: 84 e8 18 c9 3e fe 0f 0b e9 70 90 ff ff e8 36 e4 52 fe 89 d9 4c 89 e2 48 c7 c6 80 d6 25 84 48 c7 c7 20 c0 25 84 e8 f4 c8 3e
[  158.736854] RSP: 0018:ffff8880347e7b90 EFLAGS: 00010282
[  158.738752] RAX: 000000000000003b RBX: 00000000ffffffef RCX: 0000000000000000
[  158.741369] RDX: 0000000000000000 RSI: ffffffff8128013d RDI: ffffed10068fcf64
[  158.743418] RBP: ffff888033550170 R08: 000000000000003b R09: fffffbfff0b94b9c
[  158.745626] R10: fffffbfff0b94b9b R11: ffffffff85ca5cdf R12: ffff888032f28000
[  158.748405] R13: dffffc0000000000 R14: ffff8880335501b8 R15: 1ffff110068fcf72
[  158.750638] FS:  0000000000000000(0000) GS:ffff888036000000(0000) knlGS:0000000000000000
[  158.752944] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.755245] CR2: 00007fe8b45d21d0 CR3: 00000000340b4005 CR4: 0000000000360ef0
[  158.757654] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  158.760012] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  158.762758] Call Trace:
[  158.763882]  ? dev_change_net_namespace+0xbb0/0xbb0
[  158.766148]  ? devlink_nl_cmd_set_doit+0x520/0x520
[  158.768034]  ? dev_change_net_namespace+0xbb0/0xbb0
[  158.769870]  ops_exit_list.isra.0+0xa8/0x150
[  158.771544]  cleanup_net+0x446/0x8f0
[  158.772945]  ? unregister_pernet_operations+0x4a0/0x4a0
[  158.775294]  process_one_work+0xa1a/0x1740
[  158.776896]  ? pwq_dec_nr_in_flight+0x310/0x310
[  158.779143]  ? do_raw_spin_lock+0x11b/0x280
[  158.780848]  worker_thread+0x9e/0x1060
[  158.782500]  ? process_one_work+0x1740/0x1740
[  158.784454]  kthread+0x31b/0x420
[  158.786082]  ? __kthread_create_on_node+0x3f0/0x3f0
[  158.788286]  ret_from_fork+0x3a/0x50
[  158.789871] ---[ end trace defd6c657c71f936 ]---
[  158.792273] RIP: 0010:default_device_exit.cold+0x1d/0x1f
[  158.795478] Code: 84 e8 18 c9 3e fe 0f 0b e9 70 90 ff ff e8 36 e4 52 fe 89 d9 4c 89 e2 48 c7 c6 80 d6 25 84 48 c7 c7 20 c0 25 84 e8 f4 c8 3e
[  158.804854] RSP: 0018:ffff8880347e7b90 EFLAGS: 00010282
[  158.807865] RAX: 000000000000003b RBX: 00000000ffffffef RCX: 0000000000000000
[  158.811794] RDX: 0000000000000000 RSI: ffffffff8128013d RDI: ffffed10068fcf64
[  158.816652] RBP: ffff888033550170 R08: 000000000000003b R09: fffffbfff0b94b9c
[  158.820930] R10: fffffbfff0b94b9b R11: ffffffff85ca5cdf R12: ffff888032f28000
[  158.825113] R13: dffffc0000000000 R14: ffff8880335501b8 R15: 1ffff110068fcf72
[  158.829899] FS:  0000000000000000(0000) GS:ffff888036000000(0000) knlGS:0000000000000000
[  158.834923] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  158.838164] CR2: 00007fe8b45d21d0 CR3: 00000000340b4005 CR4: 0000000000360ef0
[  158.841917] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  158.845149] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

Fix this by checking if a device with the same name exists in init_net
and fallback to original code - dev%d to allocate name - in case it does.

This was found using syzkaller.

Fixes: aca51397d014 ("netns: Fix arbitrary net_device-s corruptions on net_ns stop.")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: bridge: mcast: don't delete permanent entries when fast leave is enabled
Nikolay Aleksandrov [Tue, 30 Jul 2019 11:21:00 +0000 (14:21 +0300)]
net: bridge: mcast: don't delete permanent entries when fast leave is enabled

[ Upstream commit 5c725b6b65067909548ac9ca9bc777098ec9883d ]

When permanent entries were introduced by the commit below, they were
exempt from timing out and thus igmp leave wouldn't affect them unless
fast leave was enabled on the port which was added before permanent
entries existed. It shouldn't matter if fast leave is enabled or not
if the user added a permanent entry it shouldn't be deleted on igmp
leave.

Before:
$ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
$ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
$ bridge mdb show
dev br0 port eth4 grp 229.1.1.1 permanent

< join and leave 229.1.1.1 on eth4 >

$ bridge mdb show
$

After:
$ echo 1 > /sys/class/net/eth4/brport/multicast_fast_leave
$ bridge mdb add dev br0 port eth4 grp 229.1.1.1 permanent
$ bridge mdb show
dev br0 port eth4 grp 229.1.1.1 permanent

< join and leave 229.1.1.1 on eth4 >

$ bridge mdb show
dev br0 port eth4 grp 229.1.1.1 permanent

Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agonet: bridge: delete local fdb on device init failure
Nikolay Aleksandrov [Mon, 29 Jul 2019 09:28:41 +0000 (12:28 +0300)]
net: bridge: delete local fdb on device init failure

[ Upstream commit d7bae09fa008c6c9a489580db0a5a12063b97f97 ]

On initialization failure we have to delete the local fdb which was
inserted due to the default pvid creation. This problem has been present
since the inception of default_pvid. Note that currently there are 2 cases:
1) in br_dev_init() when br_multicast_init() fails
2) if register_netdevice() fails after calling ndo_init()

This patch takes care of both since br_vlan_flush() is called on both
occasions. Also the new fdb delete would be a no-op on normal bridge
device destruction since the local fdb would've been already flushed by
br_dev_delete(). This is not an issue for ports since nbp_vlan_init() is
called last when adding a port thus nothing can fail after it.

Reported-by: syzbot+88533dc8b582309bf3ee@syzkaller.appspotmail.com
Fixes: 5be5a2df40f0 ("bridge: Add filtering support for default_pvid")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomvpp2: refactor MTU change code
Matteo Croce [Sun, 28 Jul 2019 00:46:45 +0000 (02:46 +0200)]
mvpp2: refactor MTU change code

[ Upstream commit 230bd958c2c846ee292aa38bc6b006296c24ca01 ]

The MTU change code can call napi_disable() with the device already down,
leading to a deadlock. Also, lot of code is duplicated unnecessarily.

Rework mvpp2_change_mtu() to avoid the deadlock and remove duplicated code.

Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomvpp2: fix panic on module removal
Matteo Croce [Thu, 1 Aug 2019 12:13:30 +0000 (14:13 +0200)]
mvpp2: fix panic on module removal

[ Upstream commit 944a83a2669ae8aa2c7664e79376ca7468eb0a2b ]

mvpp2 uses a delayed workqueue to gather traffic statistics.
On module removal the workqueue can be destroyed before calling
cancel_delayed_work_sync() on its works.
Fix it by moving the destroy_workqueue() call after mvpp2_port_remove().
Also remove an unneeded call to flush_workqueue()

    # rmmod mvpp2
    [ 2743.311722] mvpp2 f4000000.ethernet eth1: phy link down 10gbase-kr/10Gbps/Full
    [ 2743.320063] mvpp2 f4000000.ethernet eth1: Link is Down
    [ 2743.572263] mvpp2 f4000000.ethernet eth2: phy link down sgmii/1Gbps/Full
    [ 2743.580076] mvpp2 f4000000.ethernet eth2: Link is Down
    [ 2744.102169] mvpp2 f2000000.ethernet eth0: phy link down 10gbase-kr/10Gbps/Full
    [ 2744.110441] mvpp2 f2000000.ethernet eth0: Link is Down
    [ 2744.115614] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
    [ 2744.115615] Mem abort info:
    [ 2744.115616]   ESR = 0x96000005
    [ 2744.115617]   Exception class = DABT (current EL), IL = 32 bits
    [ 2744.115618]   SET = 0, FnV = 0
    [ 2744.115619]   EA = 0, S1PTW = 0
    [ 2744.115620] Data abort info:
    [ 2744.115621]   ISV = 0, ISS = 0x00000005
    [ 2744.115622]   CM = 0, WnR = 0
    [ 2744.115624] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000422681000
    [ 2744.115626] [0000000000000000] pgd=0000000000000000, pud=0000000000000000
    [ 2744.115630] Internal error: Oops: 96000005 [#1] SMP
    [ 2744.115632] Modules linked in: mvpp2(-) algif_hash af_alg nls_iso8859_1 nls_cp437 vfat fat xhci_plat_hcd m25p80 spi_nor xhci_hcd mtd usbcore i2c_mv64xxx sfp usb_common marvell10g phy_generic spi_orion mdio_i2c i2c_core mvmdio phylink sbsa_gwdt ip_tables x_tables autofs4 [last unloaded: mvpp2]
    [ 2744.115654] CPU: 3 PID: 8357 Comm: kworker/3:2 Not tainted 5.3.0-rc2 #1
    [ 2744.115655] Hardware name: Marvell 8040 MACCHIATOBin Double-shot (DT)
    [ 2744.115665] Workqueue: events_power_efficient phylink_resolve [phylink]
    [ 2744.115669] pstate: a0000085 (NzCv daIf -PAN -UAO)
    [ 2744.115675] pc : __queue_work+0x9c/0x4d8
    [ 2744.115677] lr : __queue_work+0x170/0x4d8
    [ 2744.115678] sp : ffffff801001bd50
    [ 2744.115680] x29: ffffff801001bd50 x28: ffffffc422597600
    [ 2744.115684] x27: ffffff80109ae6f0 x26: ffffff80108e4018
    [ 2744.115688] x25: 0000000000000003 x24: 0000000000000004
    [ 2744.115691] x23: ffffff80109ae6e0 x22: 0000000000000017
    [ 2744.115694] x21: ffffffc42c030000 x20: ffffffc42209e8f8
    [ 2744.115697] x19: 0000000000000000 x18: 0000000000000000
    [ 2744.115699] x17: 0000000000000000 x16: 0000000000000000
    [ 2744.115701] x15: 0000000000000010 x14: ffffffffffffffff
    [ 2744.115702] x13: ffffff8090e2b95f x12: ffffff8010e2b967
    [ 2744.115704] x11: ffffff8010906000 x10: 0000000000000040
    [ 2744.115706] x9 : ffffff80109223b8 x8 : ffffff80109223b0
    [ 2744.115707] x7 : ffffffc42bc00068 x6 : 0000000000000000
    [ 2744.115709] x5 : ffffffc42bc00000 x4 : 0000000000000000
    [ 2744.115710] x3 : 0000000000000000 x2 : 0000000000000000
    [ 2744.115712] x1 : 0000000000000008 x0 : ffffffc42c030000
    [ 2744.115714] Call trace:
    [ 2744.115716]  __queue_work+0x9c/0x4d8
    [ 2744.115718]  delayed_work_timer_fn+0x28/0x38
    [ 2744.115722]  call_timer_fn+0x3c/0x180
    [ 2744.115723]  expire_timers+0x60/0x168
    [ 2744.115724]  run_timer_softirq+0xbc/0x1e8
    [ 2744.115727]  __do_softirq+0x128/0x320
    [ 2744.115731]  irq_exit+0xa4/0xc0
    [ 2744.115734]  __handle_domain_irq+0x70/0xc0
    [ 2744.115735]  gic_handle_irq+0x58/0xa8
    [ 2744.115737]  el1_irq+0xb8/0x140
    [ 2744.115738]  console_unlock+0x3a0/0x568
    [ 2744.115740]  vprintk_emit+0x200/0x2a0
    [ 2744.115744]  dev_vprintk_emit+0x1c8/0x1e4
    [ 2744.115747]  dev_printk_emit+0x6c/0x7c
    [ 2744.115751]  __netdev_printk+0x104/0x1d8
    [ 2744.115752]  netdev_printk+0x60/0x70
    [ 2744.115756]  phylink_resolve+0x38c/0x3c8 [phylink]
    [ 2744.115758]  process_one_work+0x1f8/0x448
    [ 2744.115760]  worker_thread+0x54/0x500
    [ 2744.115762]  kthread+0x12c/0x130
    [ 2744.115764]  ret_from_fork+0x10/0x1c
    [ 2744.115768] Code: aa1403e0 97fffbbe aa0003f5 b4000700 (f9400261)

Fixes: 118d6298f6f0 ("net: mvpp2: add ethtool GOP statistics")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agomlxsw: spectrum: Fix error path in mlxsw_sp_module_init()
Jiri Pirko [Wed, 31 Jul 2019 06:33:14 +0000 (09:33 +0300)]
mlxsw: spectrum: Fix error path in mlxsw_sp_module_init()

[ Upstream commit 28fe79000e9b0a6f99959869947f1ca305f14599 ]

In case of sp2 pci driver registration fail, fix the error path to
start with sp1 pci driver unregister.

Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoipip: validate header length in ipip_tunnel_xmit
Haishuang Yan [Thu, 25 Jul 2019 03:07:56 +0000 (11:07 +0800)]
ipip: validate header length in ipip_tunnel_xmit

[ Upstream commit 47d858d0bdcd47cc1c6c9eeca91b091dd9e55637 ]

We need the same checks introduced by commit cb9f1b783850
("ip: validate header length on virtual device xmit") for
ipip tunnel.

Fixes: cb9f1b783850b ("ip: validate header length on virtual device xmit")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoip6_tunnel: fix possible use-after-free on xmit
Haishuang Yan [Thu, 25 Jul 2019 16:40:17 +0000 (00:40 +0800)]
ip6_tunnel: fix possible use-after-free on xmit

[ Upstream commit 01f5bffad555f8e22a61f4b1261fe09cf1b96994 ]

ip4ip6/ip6ip6 tunnels run iptunnel_handle_offloads on xmit which
can cause a possible use-after-free accessing iph/ipv6h pointer
since the packet will be 'uncloned' running pskb_expand_head if
it is a cloned gso skb.

Fixes: 0e9a709560db ("ip6_tunnel, ip6_gre: fix setting of DSCP on encapsulated packets")
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6
Haishuang Yan [Wed, 24 Jul 2019 12:00:42 +0000 (20:00 +0800)]
ip6_gre: reload ipv6h in prepare_ip6gre_xmit_ipv6

[ Upstream commit 3bc817d665ac6d9de89f59df522ad86f5b5dfc03 ]

Since ip6_tnl_parse_tlv_enc_lim() can call pskb_may_pull()
which may change skb->data, so we need to re-load ipv6h at
the right place.

Fixes: 898b29798e36 ("ip6_gre: Refactor ip6gre xmit codes")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoife: error out when nla attributes are empty
Cong Wang [Tue, 23 Jul 2019 04:43:00 +0000 (21:43 -0700)]
ife: error out when nla attributes are empty

[ Upstream commit c8ec4632c6ac9cda0e8c3d51aa41eeab66585bd5 ]

act_ife at least requires TCA_IFE_PARMS, so we have to bail out
when there is no attribute passed in.

Reported-by: syzbot+fbb5b288c9cb6a2eeac4@syzkaller.appspotmail.com
Fixes: ef6980b6becb ("introduce IFE action")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agobnx2x: Disable multi-cos feature.
Sudarsana Reddy Kalluru [Wed, 24 Jul 2019 02:32:41 +0000 (19:32 -0700)]
bnx2x: Disable multi-cos feature.

[ Upstream commit d1f0b5dce8fda09a7f5f04c1878f181d548e42f5 ]

Commit 3968d38917eb ("bnx2x: Fix Multi-Cos.") which enabled multi-cos
feature after prolonged time in driver added some regression causing
numerous issues (sudden reboots, tx timeout etc.) reported by customers.
We plan to backout this commit and submit proper fix once we have root
cause of issues reported with this feature enabled.

Fixes: 3968d38917eb ("bnx2x: Fix Multi-Cos.")
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoatm: iphase: Fix Spectre v1 vulnerability
Gustavo A. R. Silva [Wed, 31 Jul 2019 03:21:41 +0000 (22:21 -0500)]
atm: iphase: Fix Spectre v1 vulnerability

[ Upstream commit ea443e5e98b5b74e317ef3d26bcaea54931ccdee ]

board is controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/atm/iphase.c:2765 ia_ioctl() warn: potential spectre issue 'ia_dev' [r] (local cap)
drivers/atm/iphase.c:2774 ia_ioctl() warn: possible spectre second half.  'iadev'
drivers/atm/iphase.c:2782 ia_ioctl() warn: possible spectre second half.  'iadev'
drivers/atm/iphase.c:2816 ia_ioctl() warn: possible spectre second half.  'iadev'
drivers/atm/iphase.c:2823 ia_ioctl() warn: possible spectre second half.  'iadev'
drivers/atm/iphase.c:2830 ia_ioctl() warn: potential spectre issue '_ia_dev' [r] (local cap)
drivers/atm/iphase.c:2845 ia_ioctl() warn: possible spectre second half.  'iadev'
drivers/atm/iphase.c:2856 ia_ioctl() warn: possible spectre second half.  'iadev'

Fix this by sanitizing board before using it to index ia_dev and _ia_dev

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://lore.kernel.org/lkml/20180423164740.GY17484@dhcp22.suse.cz/

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoIB: directly cast the sockaddr union to aockaddr
Greg Kroah-Hartman [Wed, 7 Aug 2019 16:44:12 +0000 (18:44 +0200)]
IB: directly cast the sockaddr union to aockaddr

Like commit 641114d2af31 ("RDMA: Directly cast the sockaddr union to
sockaddr") we need to quiet gcc 9 from warning about this crazy union.
That commit did not fix all of the warnings in 4.19 and older kernels
because the logic in roce_resolve_route_from_path() was rewritten
between 4.19 and 5.2 when that change happened.

Cc: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoHID: Add quirk for HP X1200 PIXART OEM mouse
Sebastian Parschauer [Wed, 24 Jul 2019 18:40:03 +0000 (20:40 +0200)]
HID: Add quirk for HP X1200 PIXART OEM mouse

commit 49869d2ea9eecc105a10724c1abf035151a3c4e2 upstream.

The PixArt OEM mice are known for disconnecting every minute in
runlevel 1 or 3 if they are not always polled. So add quirk
ALWAYS_POLL for this one as well.

Jonathan Teh (@jonathan-teh) reported and tested the quirk.
Reference: https://github.com/sriemer/fix-linux-mouse/issues/15

Signed-off-by: Sebastian Parschauer <s.parschauer@gmx.de>
CC: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoHID: wacom: fix bit shift for Cintiq Companion 2
Aaron Armstrong Skomra [Tue, 23 Jul 2019 18:09:15 +0000 (11:09 -0700)]
HID: wacom: fix bit shift for Cintiq Companion 2

commit 693c3dab4e50403f91bca4b52fc6d8562a3180f6 upstream.

The bit indicating BTN_6 on this device is overshifted
by 2 bits, resulting in the incorrect button being
reported.

Also fix copy-paste mistake in comments.

Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com>
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Link: https://github.com/linuxwacom/xf86-input-wacom/issues/71
Fixes: c7f0522a1ad1 ("HID: wacom: Slim down wacom_intuos_pad processing")
Cc: <stable@vger.kernel.org> # v4.5+
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agolibnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
Dan Williams [Tue, 6 Aug 2019 01:32:13 +0000 (18:32 -0700)]
libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock

commit ca6bf264f6d856f959c4239cda1047b587745c67 upstream.

A multithreaded namespace creation/destruction stress test currently
deadlocks with the following lockup signature:

    INFO: task ndctl:2924 blocked for more than 122 seconds.
          Tainted: G           OE     5.2.0-rc4+ #3382
    "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    ndctl           D    0  2924   1176 0x00000000
    Call Trace:
     ? __schedule+0x27e/0x780
     schedule+0x30/0xb0
     wait_nvdimm_bus_probe_idle+0x8a/0xd0 [libnvdimm]
     ? finish_wait+0x80/0x80
     uuid_store+0xe6/0x2e0 [libnvdimm]
     kernfs_fop_write+0xf0/0x1a0
     vfs_write+0xb7/0x1b0
     ksys_write+0x5c/0xd0
     do_syscall_64+0x60/0x240

     INFO: task ndctl:2923 blocked for more than 122 seconds.
           Tainted: G           OE     5.2.0-rc4+ #3382
     "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
     ndctl           D    0  2923   1175 0x00000000
     Call Trace:
      ? __schedule+0x27e/0x780
      ? __mutex_lock+0x489/0x910
      schedule+0x30/0xb0
      schedule_preempt_disabled+0x11/0x20
      __mutex_lock+0x48e/0x910
      ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
      ? __lock_acquire+0x23f/0x1710
      ? nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
      nvdimm_namespace_common_probe+0x95/0x4d0 [libnvdimm]
      __dax_pmem_probe+0x5e/0x210 [dax_pmem_core]
      ? nvdimm_bus_probe+0x1d0/0x2c0 [libnvdimm]
      dax_pmem_probe+0xc/0x20 [dax_pmem]
      nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]
      really_probe+0xef/0x390
      driver_probe_device+0xb4/0x100

In this sequence an 'nd_dax' device is being probed and trying to take
the lock on its backing namespace to validate that the 'nd_dax' device
indeed has exclusive access to the backing namespace. Meanwhile, another
thread is trying to update the uuid property of that same backing
namespace. So one thread is in the probe path trying to acquire the
lock, and the other thread has acquired the lock and tries to flush the
probe path.

Fix this deadlock by not holding the namespace device_lock over the
wait_nvdimm_bus_probe_idle() synchronization step. In turn this requires
the device_lock to be held on entry to wait_nvdimm_bus_probe_idle() and
subsequently dropped internally to wait_nvdimm_bus_probe_idle().

Cc: <stable@vger.kernel.org>
Fixes: bf9bccc14c05 ("libnvdimm: pmem label sets and namespace instantiation")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Jane Chu <jane.chu@oracle.com>
Link: https://lore.kernel.org/r/156341210094.292348.2384694131126767789.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agolibnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
Dan Williams [Tue, 6 Aug 2019 01:32:07 +0000 (18:32 -0700)]
libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant

commit 6de5d06e657acdbcf9637dac37916a4a5309e0f4 upstream.

In preparation for not holding a lock over the execution of nd_ioctl(),
update the implementation to allow multiple threads to be attempting
ioctls at the same time. The bus lock still prevents multiple in-flight
->ndctl() invocations from corrupting each other's state, but static
global staging buffers are moved to the heap.

Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Tested-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/156341208947.292348.10560140326807607481.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agolibnvdimm/region: Register badblocks before namespaces
Dan Williams [Tue, 6 Aug 2019 01:32:02 +0000 (18:32 -0700)]
libnvdimm/region: Register badblocks before namespaces

commit 700cd033a82d466ad8f9615f9985525e45f8960a upstream.

Namespace activation expects to be able to reference region badblocks.
The following warning sometimes triggers when asynchronous namespace
activation races in front of the completion of namespace probing. Move
all possible namespace probing after region badblocks initialization.

Otherwise, lockdep sometimes catches the uninitialized state of the
badblocks seqlock with stack trace signatures like:

    INFO: trying to register non-static key.
    pmem2: detected capacity change from 0 to 136365211648
    the code is fine but needs lockdep annotation.
    turning off the locking correctness validator.
    CPU: 9 PID: 358 Comm: kworker/u80:5 Tainted: G           OE     5.2.0-rc4+ #3382
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
    Workqueue: events_unbound async_run_entry_fn
    Call Trace:
     dump_stack+0x85/0xc0
    pmem1.12: detected capacity change from 0 to 8589934592
     register_lock_class+0x56a/0x570
     ? check_object+0x140/0x270
     __lock_acquire+0x80/0x1710
     ? __mutex_lock+0x39d/0x910
     lock_acquire+0x9e/0x180
     ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
     badblocks_check+0x93/0x1f0
     ? nd_pfn_validate+0x28f/0x440 [libnvdimm]
     nd_pfn_validate+0x28f/0x440 [libnvdimm]
     ? lockdep_hardirqs_on+0xf0/0x180
     nd_dax_probe+0x9a/0x120 [libnvdimm]
     nd_pmem_probe+0x6d/0x180 [nd_pmem]
     nvdimm_bus_probe+0x90/0x2c0 [libnvdimm]

Fixes: 48af2f7e52f4 ("libnvdimm, pfn: during init, clear errors...")
Cc: <stable@vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/156341208365.292348.1547528796026249120.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agolibnvdimm/bus: Prevent duplicate device_unregister() calls
Dan Williams [Tue, 6 Aug 2019 01:31:56 +0000 (18:31 -0700)]
libnvdimm/bus: Prevent duplicate device_unregister() calls

commit 8aac0e2338916e273ccbd438a2b7a1e8c61749f5 upstream.

A multithreaded namespace creation/destruction stress test currently
fails with signatures like the following:

    sysfs group 'power' not found for kobject 'dax1.1'
    RIP: 0010:sysfs_remove_group+0x76/0x80
    Call Trace:
     device_del+0x73/0x370
     device_unregister+0x16/0x50
     nd_async_device_unregister+0x1e/0x30 [libnvdimm]
     async_run_entry_fn+0x39/0x160
     process_one_work+0x23c/0x5e0
     worker_thread+0x3c/0x390

    BUG: kernel NULL pointer dereference, address: 0000000000000020
    RIP: 0010:klist_put+0x1b/0x6c
    Call Trace:
     klist_del+0xe/0x10
     device_del+0x8a/0x2c9
     ? __switch_to_asm+0x34/0x70
     ? __switch_to_asm+0x40/0x70
     device_unregister+0x44/0x4f
     nd_async_device_unregister+0x22/0x2d [libnvdimm]
     async_run_entry_fn+0x47/0x15a
     process_one_work+0x1a2/0x2eb
     worker_thread+0x1b8/0x26e

Use the kill_device() helper to atomically resolve the race of multiple
threads issuing kill, device_unregister(), requests.

Reported-by: Jane Chu <jane.chu@oracle.com>
Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...")
Cc: <stable@vger.kernel.org>
Link: https://github.com/pmem/ndctl/issues/96
Tested-by: Tested-by: Jane Chu <jane.chu@oracle.com>
Link: https://lore.kernel.org/r/156341207846.292348.10435719262819764054.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodrivers/base: Introduce kill_device()
Dan Williams [Tue, 6 Aug 2019 01:31:51 +0000 (18:31 -0700)]
drivers/base: Introduce kill_device()

commit 00289cd87676e14913d2d8492d1ce05c4baafdae upstream.

The libnvdimm subsystem arranges for devices to be destroyed as a result
of a sysfs operation. Since device_unregister() cannot be called from
an actively running sysfs attribute of the same device libnvdimm
arranges for device_unregister() to be performed in an out-of-line async
context.

The driver core maintains a 'dead' state for coordinating its own racing
async registration / de-registration requests. Rather than add local
'dead' state tracking infrastructure to libnvdimm device objects, export
the existing state tracking via a new kill_device() helper.

The kill_device() helper simply marks the device as dead, i.e. that it
is on its way to device_del(), or returns that the device was already
dead. This can be used in advance of calling device_unregister() for
subsystems like libnvdimm that might need to handle multiple user
threads racing to delete a device.

This refactoring does not change any behavior, but it is a pre-requisite
for follow-on fixes and therefore marked for -stable.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver...")
Cc: <stable@vger.kernel.org>
Tested-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/156341207332.292348.14959761496009347574.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agodriver core: Establish order of operations for device_add and device_del via bitflag
Alexander Duyck [Tue, 6 Aug 2019 01:31:45 +0000 (18:31 -0700)]
driver core: Establish order of operations for device_add and device_del via bitflag

commit 3451a495ef244a88ed6317a035299d835554d579 upstream.

Add an additional bit flag to the device_private struct named "dead".

This additional flag provides a guarantee that when a device_del is
executed on a given interface an async worker will not attempt to attach
the driver following the earlier device_del call. Previously this
guarantee was not present and could result in the device_del call
attempting to remove a driver from an interface only to have the async
worker attempt to probe the driver later when it finally completes the
asynchronous probe call.

One additional change added was that I pulled the check for dev->driver
out of the __device_attach_driver call and instead placed it in the
__device_attach_async_helper call. This was motivated by the fact that the
only other caller of this, __device_attach, had already taken the
device_lock() and checked for dev->driver. Instead of testing for this
twice in this path it makes more sense to just consolidate the dev->dead
and dev->driver checks together into one set of checks.

Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
5 years agogcc-9: don't warn about uninitialized variable
Linus Torvalds [Wed, 1 May 2019 18:07:40 +0000 (11:07 -0700)]
gcc-9: don't warn about uninitialized variable

commit cf676908846a06443fa5e6724ca3f5dd7460eca1 upstream.

I'm not sure what made gcc warn about this code now.  The 'ret' variable
does end up initialized in all cases, but it's definitely not obvious,
so the compiler is quite reasonable to warn about this.

So just add initialization to make it all much more obvious both to
compilers and to humans.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: fcoe: Embed fc_rport_priv in fcoe_rport structure
Hannes Reinecke [Wed, 24 Jul 2019 09:00:55 +0000 (11:00 +0200)]
scsi: fcoe: Embed fc_rport_priv in fcoe_rport structure

commit 023358b136d490ca91735ac6490db3741af5a8bd upstream.

Gcc-9 complains for a memset across pointer boundaries, which happens as
the code tries to allocate a flexible array on the stack.  Turns out we
cannot do this without relying on gcc-isms, so with this patch we'll embed
the fc_rport_priv structure into fcoe_rport, can use the normal
'container_of' outcast, and will only have to do a memset over one
structure.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoLinux 4.19.65 v4.19.65
Greg Kroah-Hartman [Tue, 6 Aug 2019 17:06:58 +0000 (19:06 +0200)]
Linux 4.19.65

5 years agoDocumentation: Add swapgs description to the Spectre v1 documentation
Josh Poimboeuf [Sat, 3 Aug 2019 19:21:54 +0000 (21:21 +0200)]
Documentation: Add swapgs description to the Spectre v1 documentation

commit 4c92057661a3412f547ede95715641d7ee16ddac upstream

Add documentation to the Spectre document about the new swapgs variant of
Spectre v1.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
Thomas Gleixner [Wed, 17 Jul 2019 19:18:59 +0000 (21:18 +0200)]
x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS

commit f36cf386e3fec258a341d446915862eded3e13d8 upstream

Intel provided the following information:

 On all current Atom processors, instructions that use a segment register
 value (e.g. a load or store) will not speculatively execute before the
 last writer of that segment retires. Thus they will not use a
 speculatively written segment value.

That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
entry paths can be excluded from the extra LFENCE if PTI is disabled.

Create a separate bug flag for the through SWAPGS speculation and mark all
out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
are excluded from the whole mitigation mess anyway.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/entry/64: Use JMP instead of JMPQ
Josh Poimboeuf [Mon, 15 Jul 2019 16:51:39 +0000 (11:51 -0500)]
x86/entry/64: Use JMP instead of JMPQ

commit 64dbc122b20f75183d8822618c24f85144a5a94d upstream

Somehow the swapgs mitigation entry code patch ended up with a JMPQ
instruction instead of JMP, where only the short jump is needed.  Some
assembler versions apparently fail to optimize JMPQ into a two-byte JMP
when possible, instead always using a 7-byte JMP with relocation.  For
some reason that makes the entry code explode with a #GP during boot.

Change it back to "JMP" as originally intended.

Fixes: 18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation: Enable Spectre v1 swapgs mitigations
Josh Poimboeuf [Mon, 8 Jul 2019 16:52:26 +0000 (11:52 -0500)]
x86/speculation: Enable Spectre v1 swapgs mitigations

commit a2059825986a1c8143fd6698774fa9d83733bb11 upstream

The previous commit added macro calls in the entry code which mitigate the
Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
enabled.  Enable those features where applicable.

The mitigations may be disabled with "nospectre_v1" or "mitigations=off".

There are different features which can affect the risk of attack:

- When FSGSBASE is enabled, unprivileged users are able to place any
  value in GS, using the wrgsbase instruction.  This means they can
  write a GS value which points to any value in kernel space, which can
  be useful with the following gadget in an interrupt/exception/NMI
  handler:

if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
// dependent load or store based on the value of %reg
// for example: mov %(reg1), %reg2

  If an interrupt is coming from user space, and the entry code
  speculatively skips the swapgs (due to user branch mistraining), it
  may speculatively execute the GS-based load and a subsequent dependent
  load or store, exposing the kernel data to an L1 side channel leak.

  Note that, on Intel, a similar attack exists in the above gadget when
  coming from kernel space, if the swapgs gets speculatively executed to
  switch back to the user GS.  On AMD, this variant isn't possible
  because swapgs is serializing with respect to future GS-based
  accesses.

  NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
doesn't exist quite yet.

- When FSGSBASE is disabled, the issue is mitigated somewhat because
  unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
  restricts GS values to user space addresses only.  That means the
  gadget would need an additional step, since the target kernel address
  needs to be read from user space first.  Something like:

if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
mov (%reg1), %reg2
// dependent load or store based on the value of %reg2
// for example: mov %(reg2), %reg3

  It's difficult to audit for this gadget in all the handlers, so while
  there are no known instances of it, it's entirely possible that it
  exists somewhere (or could be introduced in the future).  Without
  tooling to analyze all such code paths, consider it vulnerable.

  Effects of SMAP on the !FSGSBASE case:

  - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
    susceptible to Meltdown), the kernel is prevented from speculatively
    reading user space memory, even L1 cached values.  This effectively
    disables the !FSGSBASE attack vector.

  - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
    still prevents the kernel from speculatively reading user space
    memory.  But it does *not* prevent the kernel from reading the
    user value from L1, if it has already been cached.  This is probably
    only a small hurdle for an attacker to overcome.

Thanks to Dave Hansen for contributing the speculative_smap() function.

Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
is serializing on AMD.

[ tglx: Fixed the USER fence decision and polished the comment as suggested
   by Dave Hansen ]

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
Josh Poimboeuf [Mon, 8 Jul 2019 16:52:25 +0000 (11:52 -0500)]
x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations

commit 18ec54fdd6d18d92025af097cd042a75cf0ea24c upstream

Spectre v1 isn't only about array bounds checks.  It can affect any
conditional checks.  The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks.  Those may be problematic in
the context of Spectre v1, as kernel code can speculatively run with a user
GS.

For example:

if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg
mov (%reg), %reg1

When coming from user space, the CPU can speculatively skip the swapgs, and
then do a speculative percpu load using the user GS value.  So the user can
speculatively force a read of any kernel value.  If a gadget exists which
uses the percpu value as an address in another load/store, then the
contents of the kernel value may become visible via an L1 side channel
attack.

A similar attack exists when coming from kernel space.  The CPU can
speculatively do the swapgs, causing the user GS to get used for the rest
of the speculative window.

The mitigation is similar to a traditional Spectre v1 mitigation, except:

  a) index masking isn't possible; because the index (percpu offset)
     isn't user-controlled; and

  b) an lfence is needed in both the "from user" swapgs path and the
     "from kernel" non-swapgs path (because of the two attacks described
     above).

The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
lfences can be skipped in those cases.

On the other hand, the kernel entry swapgs paths don't depend on PTI.

To avoid unnecessary lfences for the user entry case, create two separate
features for alternative patching:

  X86_FEATURE_FENCE_SWAPGS_USER
  X86_FEATURE_FENCE_SWAPGS_KERNEL

Use these features in entry code to patch in lfences where needed.

The features aren't enabled yet, so there's no functional change.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/cpufeatures: Combine word 11 and 12 into a new scattered features word
Fenghua Yu [Wed, 19 Jun 2019 16:51:09 +0000 (18:51 +0200)]
x86/cpufeatures: Combine word 11 and 12 into a new scattered features word

commit acec0ce081de0c36459eea91647faf99296445a3 upstream

It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
whole feature bits words. To better utilize feature words, re-define
word 11 to host scattered features and move the four X86_FEATURE_CQM_*
features into Linux defined word 11. More scattered features can be
added in word 11 in the future.

Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
Linux-defined leaf.

Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
name in the next patch when CPUID.7.1:EAX occupies world 12.

Maximum number of RMID and cache occupancy scale are retrieved from
CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
code into a separate function.

KVM doesn't support resctrl now. So it's safe to move the
X86_FEATURE_CQM_* features to scattered features word 11 for KVM.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Aaron Lewis <aaronlewis@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Babu Moger <babu.moger@amd.com>
Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: kvm ML <kvm@vger.kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Nadav Amit <namit@vmware.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Peter Feiner <pfeiner@google.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
Cc: x86 <x86@kernel.org>
Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/cpufeatures: Carve out CQM features retrieval
Borislav Petkov [Wed, 19 Jun 2019 15:24:34 +0000 (17:24 +0200)]
x86/cpufeatures: Carve out CQM features retrieval

commit 45fc56e629caa451467e7664fbd4c797c434a6c4 upstream

... into a separate function for better readability. Split out from a
patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
sole code movement separate for easy review.

No functional changes.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: x86@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agoscsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA
Suganath Prabu [Tue, 30 Jul 2019 07:43:57 +0000 (03:43 -0400)]
scsi: mpt3sas: Use 63-bit DMA addressing on SAS35 HBA

commit df9a606184bfdb5ae3ca9d226184e9489f5c24f7 upstream.

Although SAS3 & SAS3.5 IT HBA controllers support 64-bit DMA addressing, as
per hardware design, if DMA-able range contains all 64-bits
set (0xFFFFFFFF-FFFFFFFF) then it results in a firmware fault.

E.g. SGE's start address is 0xFFFFFFFF-FFFF000 and data length is 0x1000
bytes. when HBA tries to DMA the data at 0xFFFFFFFF-FFFFFFFF location then
HBA will fault the firmware.

Driver will set 63-bit DMA mask to ensure the above address will not be
used.

Cc: <stable@vger.kernel.org> # 4.19.63
Signed-off-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agox86/vdso: Prevent segfaults due to hoisted vclock reads
Andy Lutomirski [Fri, 21 Jun 2019 15:43:04 +0000 (08:43 -0700)]
x86/vdso: Prevent segfaults due to hoisted vclock reads

commit ff17bbe0bb405ad8b36e55815d381841f9fdeebc upstream.

GCC 5.5.0 sometimes cleverly hoists reads of the pvclock and/or hvclock
pages before the vclock mode checks.  This creates a path through
vclock_gettime() in which no vclock is enabled at all (due to disabled
TSC on old CPUs, for example) but the pvclock or hvclock page
nevertheless read.  This will segfault on bare metal.

This fixes commit 459e3a21535a ("gcc-9: properly declare the
{pv,hv}clock_page storage") in the sense that, before that commit, GCC
didn't seem to generate the offending code.  There was nothing wrong
with that commit per se, and -stable maintainers should backport this to
all supported kernels regardless of whether the offending commit was
present, since the same crash could just as easily be triggered by the
phase of the moon.

On GCC 9.1.1, this doesn't seem to affect the generated code at all, so
I'm not too concerned about performance regressions from this fix.

Cc: stable@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Reported-by: Duncan Roe <duncan_roe@optusnet.com.au>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
5 years agogcc-9: properly declare the {pv,hv}clock_page storage
Linus Torvalds [Wed, 1 May 2019 18:20:53 +0000 (11:20 -0700)]
gcc-9: properly declare the {pv,hv}clock_page storage

commit 459e3a21535ae3c7a9a123650e54f5c882b8fcbf upstream.

The pvlock_page and hvclock_page variables are (as the name implies)
addresses to pages, created by the linker script.

But we declared them as just "extern u8" variables, which _works_, but
now that gcc does some more bounds checking, it causes warnings like

    warning: array subscript 1 is outside array bounds of ‘u8[1]’

when we then access more than one byte from those variables.

Fix this by simply making the declaration of the variables match
reality, which makes the compiler happy too.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>