Maciej S. Szmigiero [Sat, 24 Feb 2018 16:03:21 +0000 (17:03 +0100)]
crypto: ccp - return an actual key size from RSA max_size callback
commit
0a9eb80e643064266868bd2fb2cd608e669309b0 upstream.
rsa-pkcs1pad uses a value returned from a RSA implementation max_size
callback as a size of an input buffer passed to the RSA implementation for
encrypt and sign operations.
CCP RSA implementation uses a hardware input buffer which size depends only
on the current RSA key length, so it should return this key length in
the max_size callback, too.
This also matches what the kernel software RSA implementation does.
Previously, the value returned from this callback was always the maximum
RSA key size the CCP hardware supports.
This resulted in this huge buffer being passed by rsa-pkcs1pad to CCP even
for smaller key sizes and then in a buffer overflow when ccp_run_rsa_cmd()
tried to copy this large input buffer into a RSA key length-sized hardware
input buffer.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Fixes:
ceeec0afd684 ("crypto: ccp - Add support for RSA on the CCP")
Cc: stable@vger.kernel.org
Acked-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rui Miguel Silva [Thu, 22 Feb 2018 14:22:47 +0000 (14:22 +0000)]
crypto: caam - Fix null dereference at error path
commit
b85149f6f5d5a9279f29a73b2e95342f4d465e73 upstream.
caam_remove already removes the debugfs entry, so we need to remove the one
immediately before calling caam_remove.
This fix a NULL dereference at error paths is caam_probe fail.
Fixes:
67c2315def06 ("crypto: caam - add Queue Interface (QI) backend support")
Tested-by: Ryan Harkin <ryan.harkin@linaro.org>
Cc: "Horia Geantă" <horia.geanta@nxp.com>
Cc: Aymen Sghaier <aymen.sghaier@nxp.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Peng Fan <peng.fan@nxp.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Lukas Auer <lukas.auer@aisec.fraunhofer.de>
Cc: <stable@vger.kernel.org> # 4.12+
Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Herbert Xu [Mon, 26 Mar 2018 00:53:25 +0000 (08:53 +0800)]
crypto: ahash - Fix early termination in hash walk
commit
900a081f6912a8985dc15380ec912752cb66025a upstream.
When we have an unaligned SG list entry where there is no leftover
aligned data, the hash walk code will incorrectly return zero as if
the entire SG list has been processed.
This patch fixes it by moving onto the next page instead.
Reported-by: Eli Cooper <elicooper@gmx.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Conor McLoughlin [Tue, 13 Feb 2018 08:29:56 +0000 (08:29 +0000)]
crypto: testmgr - Fix incorrect values in PKCS#1 test vector
commit
333e18c5cc74438f8940c7f3a8b3573748a371f9 upstream.
The RSA private key for the first form should have
version, prime1, prime2, exponent1, exponent2, coefficient
values 0.
With non-zero values for prime1,2, exponent 1,2 and coefficient
the Intel QAT driver will assume that values are provided for the
private key second form. This will result in signature verification
failures for modules where QAT device is present and the modules
are signed with rsa,sha256.
Cc: <stable@vger.kernel.org>
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Signed-off-by: Conor McLoughlin <conor.mcloughlin@intel.com>
Reviewed-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gregory CLEMENT [Tue, 13 Mar 2018 16:48:40 +0000 (17:48 +0100)]
crypto: inside-secure - fix clock management
commit
f962eb46e7a9b98a58d2483f5eb216e738fec732 upstream.
In this driver the clock is got but never put when the driver is removed
or if there is an error in the probe.
Using the managed version of clk_get() allows to let the kernel take care
of it.
Fixes:
1b44c5a60c13 ("crypto: inside-secure - add SafeXcel EIP197 crypto
engine driver")
cc: stable@vger.kernel.org
Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Herbert Xu [Fri, 23 Mar 2018 00:14:44 +0000 (08:14 +0800)]
crypto: lrw - Free rctx->ext with kzfree
commit
8c9bdab21289c211ca1ca6a5f9b7537b4a600a02 upstream.
The buffer rctx->ext contains potentially sensitive data and should
be freed with kzfree.
Cc: <stable@vger.kernel.org>
Fixes:
700cb3f5fe75 ("crypto: lrw - Convert to skcipher")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Gerasiov [Sat, 3 Feb 2018 23:50:22 +0000 (02:50 +0300)]
parport_pc: Add support for WCH CH382L PCI-E single parallel port card.
commit
823f7923833c6cc2b16e601546d607dcfb368004 upstream.
WCH CH382L is a PCI-E adapter with 1 parallel port. It is similair to CH382
but serial ports are not soldered on board. Detected as
Serial controller: Device 1c00:3050 (rev 10) (prog-if 05 [16850])
Signed-off-by: Alexander Gerasiov <gq@redlab-i.ru>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Neukum [Mon, 8 Jan 2018 14:21:07 +0000 (09:21 -0500)]
media: usbtv: prevent double free in error case
commit
50e7044535537b2a54c7ab798cd34c7f6d900bd2 upstream.
Quoting the original report:
It looks like there is a double-free vulnerability in Linux usbtv driver
on an error path of usbtv_probe function. When audio registration fails,
usbtv_video_free function ends up freeing usbtv data structure, which
gets freed the second time under usbtv_video_fail label.
usbtv_audio_fail:
usbtv_video_free(usbtv); =>
v4l2_device_put(&usbtv->v4l2_dev);
=> v4l2_device_put
=> kref_put
=> v4l2_device_release
=> usbtv_release (CALLBACK)
=> kfree(usbtv) (1st time)
usbtv_video_fail:
usb_set_intfdata(intf, NULL);
usb_put_dev(usbtv->udev);
kfree(usbtv); (2nd time)
So, as we have refcounting, use it
Reported-by: Yavuz, Tuba <tuba@ece.ufl.edu>
Signed-off-by: Oliver Neukum <oneukum@suse.com>
CC: stable@vger.kernel.org
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Tue, 27 Mar 2018 21:06:14 +0000 (14:06 -0700)]
/dev/mem: Avoid overwriting "err" in read_mem()
commit
b5b38200ebe54879a7264cb6f33821f61c586a7e upstream.
Successes in probe_kernel_read() would mask failures in copy_to_user()
during read_mem().
Reported-by: Brad Spengler <spender@grsecurity.net>
Fixes:
22ec1a2aea73 ("/dev/mem: Add bounce buffer for copy-out")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Colin Ian King [Tue, 27 Feb 2018 16:21:05 +0000 (16:21 +0000)]
mei: remove dev_err message on an unsupported ioctl
commit
bb0829a741792b56c908d7745bc0b2b540293bcc upstream.
Currently the driver spams the kernel log on unsupported ioctls which is
unnecessary as the ioctl returns -ENOIOCTLCMD to indicate this anyway.
I suspect this was originally for debugging purposes but it really is not
required so remove it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joel Stanley [Mon, 5 Mar 2018 11:47:38 +0000 (22:17 +1030)]
serial: 8250: Add Nuvoton NPCM UART
commit
f597fbce38d230af95384f4a04e0a13a1d0ad45d upstream.
The Nuvoton UART is almost compatible with the 8250 driver when probed
via the 8250_of driver, however it requires some extra configuration
at startup.
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Tue, 6 Mar 2018 08:32:43 +0000 (09:32 +0100)]
USB: serial: cp210x: add ELDAT Easywave RX09 id
commit
1f1e82f74c0947e40144688c9e36abe4b3999f49 upstream.
Add device id for ELDAT Easywave RX09 tranceiver.
Reported-by: Jan Jansen <nattelip@hotmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clemens Werther [Fri, 16 Mar 2018 09:20:46 +0000 (10:20 +0100)]
USB: serial: ftdi_sio: add support for Harman FirmwareHubEmulator
commit
6555ad13a01952c16485c82a52ad1f3e07e34b3a upstream.
Add device id for Harman FirmwareHubEmulator to make the device
auto-detectable by the driver.
Signed-off-by: Clemens Werther <clemens.werther@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Major Hayden [Fri, 23 Feb 2018 20:29:54 +0000 (14:29 -0600)]
USB: serial: ftdi_sio: add RT Systems VX-8 cable
commit
9608e5c0f079390473b484ef92334dfd3431bb89 upstream.
This patch adds a device ID for the RT Systems cable used to
program Yaesu VX-8R/VX-8DR handheld radios. It uses the main
FTDI VID instead of the common RT Systems VID.
Signed-off-by: Major Hayden <major@mhtx.net>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Omar Sandoval [Mon, 2 Apr 2018 22:58:31 +0000 (15:58 -0700)]
bitmap: fix memset optimization on big-endian systems
commit
21035965f60b0502fc6537b232839389bb4ce664 upstream.
Commit
2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and
bitmap_clear into memset when possible") introduced an optimization to
bitmap_{set,clear}() which uses memset() when the start and length are
constants aligned to a byte.
This is wrong on big-endian systems; our bitmaps are arrays of unsigned
long, so bit n is not at byte n / 8 in memory. This was caught by the
Btrfs selftests, but the bitmap selftests also fail when run on a
big-endian machine.
We can still use memset if the start and length are aligned to an
unsigned long, so do that on big-endian. The same problem applies to
the memcmp in bitmap_equal(), so fix it there, too.
Fixes:
2a98dc028f91 ("include/linux/bitmap.h: turn bitmap_set and bitmap_clear into memset when possible")
Fixes:
2c6deb01525a ("bitmap: use memcmp optimisation in more situations")
Cc: stable@kernel.org
Reported-by: "Erhard F." <erhard_f@mailbox.org>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Stultz [Mon, 23 Oct 2017 21:32:48 +0000 (14:32 -0700)]
usb: dwc2: Improve gadget state disconnection handling
commit
d2471d4a24dfbff5e463d382e2c6fec7d7e25a09 upstream.
In the earlier commit
dad3f793f20f ("usb: dwc2: Make sure we
disconnect the gadget state"), I was trying to fix up the
fact that we somehow weren't disconnecting the gadget state,
so that when the OTG port was plugged in the second time we
would get warnings about the state tracking being wrong.
(This seems to be due to a quirk of the HiKey board where
we do not ever get any otg interrupts, particularly the session
end detected signal. Instead we only see status change
interrupt.)
The fix there was somewhat simple, as it just made sure to
call dwc2_hsotg_disconnect() before we connected things up
in OTG mode, ensuring the state handling didn't throw errors.
But in looking at a different issue I was seeing with UDC
state handling, I realized that it would be much better
to call dwc2_hsotg_disconnect when we get the state change
signal moving to host mode.
Thus, this patch removes the earlier disconnect call I added
and moves it (and the needed locking) to the host mode
transition.
Cc: Wei Xu <xuwei5@hisilicon.com>
Cc: Guodong Xu <guodong.xu@linaro.org>
Cc: Amit Pundir <amit.pundir@linaro.org>
Cc: YongQin Liu <yongqin.liu@linaro.org>
Cc: John Youn <johnyoun@synopsys.com>
Cc: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Cc: Douglas Anderson <dianders@chromium.org>
Cc: Chen Yu <chenyu56@huawei.com>
Cc: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-usb@vger.kernel.org
Acked-by: Minas Harutyunyan <hminas@synopsys.com>
Tested-by: Minas Harutyunyan <hminas@synopsys.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Szymon Janc [Mon, 26 Feb 2018 14:41:53 +0000 (15:41 +0100)]
Bluetooth: Fix missing encryption refresh on Security Request
commit
64e759f58f128730b97a3c3a26d283c075ad7c86 upstream.
If Security Request is received on connection that is already encrypted
with sufficient security master should perform encryption key refresh
procedure instead of just ignoring Slave Security Request
(Core Spec 5.0 Vol 3 Part H 2.4.6).
> ACL Data RX: Handle 3585 flags 0x02 dlen 6
SMP: Security Request (0x0b) len 1
Authentication requirement: Bonding, No MITM, SC, No Keypresses (0x09)
< HCI Command: LE Start Encryption (0x08|0x0019) plen 28
Handle: 3585
Random number: 0x0000000000000000
Encrypted diversifier: 0x0000
Long term key:
44264272a5c426a9e868f034cf0e69f3
> HCI Event: Command Status (0x0f) plen 4
LE Start Encryption (0x08|0x0019) ncmd 1
Status: Success (0x00)
> HCI Event: Encryption Key Refresh Complete (0x30) plen 3
Status: Success (0x00)
Handle: 3585
Signed-off-by: Szymon Janc <szymon.janc@codecoup.pl>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Wed, 10 Jan 2018 16:35:43 +0000 (17:35 +0100)]
phy: qcom-ufs: add MODULE_LICENSE tag
commit
59fba0869acae06ff594dd7e9808ed673f53538a upstream.
While the specific UFS PHY drivers (14nm and 20nm) have a module
license, the common base module does not, leading to a Kbuild
failure:
WARNING: modpost: missing MODULE_LICENSE() in drivers/phy/qualcomm/phy-qcom-ufs.o
FATAL: modpost: GPL-incompatible module phy-qcom-ufs.ko uses GPL-only symbol 'clk_enable'
This adds a module description and license tag to fix the build.
I added both Yaniv and Vivek as authors here, as Yaniv sent the initial
submission, while Vivek did most of the work since.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Westphal [Sat, 10 Mar 2018 00:15:45 +0000 (01:15 +0100)]
netfilter: x_tables: add and use xt_check_proc_name
commit
b1d0a5d0cba4597c0394997b2d5fced3e3841b4e upstream.
recent and hashlimit both create /proc files, but only check that
name is 0 terminated.
This can trigger WARN() from procfs when name is "" or "/".
Add helper for this and then use it for both.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Reported-by: <syzbot+0502b00edac2a0680b61@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paolo Abeni [Thu, 22 Mar 2018 10:08:50 +0000 (11:08 +0100)]
netfilter: drop template ct when conntrack is skipped.
commit
aebfa52a925d701114afd6af0def35bab16d4f47 upstream.
The ipv4 nf_ct code currently skips the nf_conntrak_in() call
for fragmented packets. As a results later matches/target can end
up manipulating template ct entry instead of 'real' ones.
Exploiting the above, syzbot found a way to trigger the following
splat:
WARNING: CPU: 1 PID: 4242 at net/netfilter/xt_cluster.c:55
xt_cluster_mt+0x6c1/0x840 net/netfilter/xt_cluster.c:127
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 4242 Comm: syzkaller027971 Not tainted 4.16.0-rc2+ #243
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x211/0x2d0 lib/bug.c:184
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x58/0x80 arch/x86/entry/entry_64.S:957
RIP: 0010:xt_cluster_hash net/netfilter/xt_cluster.c:55 [inline]
RIP: 0010:xt_cluster_mt+0x6c1/0x840 net/netfilter/xt_cluster.c:127
RSP: 0018:
ffff8801d2f6f2d0 EFLAGS:
00010293
RAX:
ffff8801af700540 RBX:
0000000000000000 RCX:
ffffffff84a2d1e1
RDX:
0000000000000000 RSI:
ffff8801d2f6f478 RDI:
ffff8801cafd336a
RBP:
ffff8801d2f6f2e8 R08:
0000000000000000 R09:
0000000000000001
R10:
0000000000000000 R11:
0000000000000000 R12:
ffff8801b03b3d18
R13:
ffff8801cafd3300 R14:
dffffc0000000000 R15:
ffff8801d2f6f478
ipt_do_table+0xa91/0x19b0 net/ipv4/netfilter/ip_tables.c:296
iptable_filter_hook+0x65/0x80 net/ipv4/netfilter/iptable_filter.c:41
nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483
nf_hook include/linux/netfilter.h:243 [inline]
NF_HOOK include/linux/netfilter.h:286 [inline]
raw_send_hdrinc.isra.17+0xf39/0x1880 net/ipv4/raw.c:432
raw_sendmsg+0x14cd/0x26b0 net/ipv4/raw.c:669
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
sock_sendmsg_nosec net/socket.c:629 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:639
SYSC_sendto+0x361/0x5c0 net/socket.c:1748
SyS_sendto+0x40/0x50 net/socket.c:1716
do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x441b49
RSP: 002b:
00007ffff5ca8b18 EFLAGS:
00000216 ORIG_RAX:
000000000000002c
RAX:
ffffffffffffffda RBX:
00000000004002c8 RCX:
0000000000441b49
RDX:
0000000000000030 RSI:
0000000020ff7000 RDI:
0000000000000003
RBP:
00000000006cc018 R08:
000000002066354c R09:
0000000000000010
R10:
0000000000000000 R11:
0000000000000216 R12:
0000000000403470
R13:
0000000000403500 R14:
0000000000000000 R15:
0000000000000000
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..
Instead of adding checks for template ct on every target/match
manipulating skb->_nfct, simply drop the template ct when skipping
nf_conntrack_in().
Fixes:
7b4fdf77a450ec ("netfilter: don't track fragmented packets")
Reported-and-tested-by: syzbot+0346441ae0545cfcea3a@syzkaller.appspotmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paolo Abeni [Mon, 12 Mar 2018 13:54:24 +0000 (14:54 +0100)]
l2tp: fix races with ipv4-mapped ipv6 addresses
commit
b954f94023dcc61388c8384f0f14eb8e42c863c5 upstream.
The l2tp_tunnel_create() function checks for v4mapped ipv6
sockets and cache that flag, so that l2tp core code can
reusing it at xmit time.
If the socket is provided by the userspace, the connection
status of the tunnel sockets can change between the tunnel
creation and the xmit call, so that syzbot is able to
trigger the following splat:
BUG: KASAN: use-after-free in ip6_dst_idev include/net/ip6_fib.h:192
[inline]
BUG: KASAN: use-after-free in ip6_xmit+0x1f76/0x2260
net/ipv6/ip6_output.c:264
Read of size 8 at addr
ffff8801bd949318 by task syz-executor4/23448
CPU: 0 PID: 23448 Comm: syz-executor4 Not tainted 4.16.0-rc4+ #65
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23c/0x360 mm/kasan/report.c:412
__asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
ip6_dst_idev include/net/ip6_fib.h:192 [inline]
ip6_xmit+0x1f76/0x2260 net/ipv6/ip6_output.c:264
inet6_csk_xmit+0x2fc/0x580 net/ipv6/inet6_connection_sock.c:139
l2tp_xmit_core net/l2tp/l2tp_core.c:1053 [inline]
l2tp_xmit_skb+0x105f/0x1410 net/l2tp/l2tp_core.c:1148
pppol2tp_sendmsg+0x470/0x670 net/l2tp/l2tp_ppp.c:341
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:640
___sys_sendmsg+0x767/0x8b0 net/socket.c:2046
__sys_sendmsg+0xe5/0x210 net/socket.c:2080
SYSC_sendmsg net/socket.c:2091 [inline]
SyS_sendmsg+0x2d/0x50 net/socket.c:2087
do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x453e69
RSP: 002b:
00007f819593cc68 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
00007f819593d6d4 RCX:
0000000000453e69
RDX:
0000000000000081 RSI:
000000002037ffc8 RDI:
0000000000000004
RBP:
000000000072bea0 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00000000ffffffff
R13:
00000000000004c3 R14:
00000000006f72e8 R15:
0000000000000000
This change addresses the issues:
* explicitly checking for TCP_ESTABLISHED for user space provided sockets
* dropping the v4mapped flag usage - it can become outdated - and
explicitly invoking ipv6_addr_v4mapped() instead
The issue is apparently there since ancient times.
v1 -> v2: (many thanks to Guillaume)
- with csum issue introduced in v1
- replace pr_err with pr_debug
- fix build issue with IPV6 disabled
- move l2tp_sk_is_v4mapped in l2tp_core.c
v2 -> v3:
- don't update inet_daddr for v4mapped address, unneeded
- drop rendundant check at creation time
Reported-and-tested-by: syzbot+92fa328176eb07e4ac1a@syzkaller.appspotmail.com
Fixes:
3557baabf280 ("[L2TP]: PPP over L2TP driver core")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Westphal [Fri, 9 Mar 2018 13:27:31 +0000 (14:27 +0100)]
netfilter: bridge: ebt_among: add more missing match size checks
commit
c8d70a700a5b486bfa8e5a7d33d805389f6e59f9 upstream.
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.
commit
c4585a2823edf ("bridge: ebt_among: add missing match size checks")
added validation for pool size, but missed fact that the macros
ebt_among_wh_src/dst can already return out-of-bound result because
they do not check value of wh_src/dst_ofs (an offset) vs. the size
of the match that userspace gave to us.
v2:
check that offset has correct alignment.
Paolo Abeni points out that we should also check that src/dst
wormhash arrays do not overlap, and src + length lines up with
start of dst (or vice versa).
v3: compact wormhash_sizes_valid() part
NB: Fixes tag is intentionally wrong, this bug exists from day
one when match was added for 2.6 kernel. Tag is there so stable
maintainers will notice this one too.
Tested with same rules from the earlier patch.
Fixes:
c4585a2823edf ("bridge: ebt_among: add missing match size checks")
Reported-by: <syzbot+bdabab6f1983a03fc009@syzkaller.appspotmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Hocko [Tue, 30 Jan 2018 19:30:11 +0000 (11:30 -0800)]
netfilter: x_tables: make allocation less aggressive
commit
0537250fdc6c876ed4cbbe874c739aebef493ee2 upstream.
syzbot has noticed that xt_alloc_table_info can allocate a lot of memory.
This is an admin only interface but an admin in a namespace is sufficient
as well.
eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc() in
xt_alloc_table_info()") has changed the opencoded kmalloc->vmalloc
fallback into kvmalloc. It has dropped __GFP_NORETRY on the way because
vmalloc has simply never fully supported __GFP_NORETRY semantic. This is
still the case because e.g. page tables backing the vmalloc area are
hardcoded GFP_KERNEL.
Revert back to __GFP_NORETRY as a poors man defence against excessively
large allocation request here. We will not rule out the OOM killer
completely but __GFP_NORETRY should at least stop the large request in
most cases.
[akpm@linux-foundation.org: coding-style fixes]
tableLink: http://lkml.kernel.org/r/20180130140104.GE21609@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dennis Zhou [Fri, 16 Feb 2018 18:07:19 +0000 (12:07 -0600)]
percpu: add __GFP_NORETRY semantics to the percpu balancing path
commit
47504ee04b9241548ae2c28be7d0b01cff3b7aa6 upstream.
Percpu memory using the vmalloc area based chunk allocator lazily
populates chunks by first requesting the full virtual address space
required for the chunk and subsequently adding pages as allocations come
through. To ensure atomic allocations can succeed, a workqueue item is
used to maintain a minimum number of empty pages. In certain scenarios,
such as reported in [1], it is possible that physical memory becomes
quite scarce which can result in either a rather long time spent trying
to find free pages or worse, a kernel panic.
This patch adds support for __GFP_NORETRY and __GFP_NOWARN passing them
through to the underlying allocators. This should prevent any
unnecessary panics potentially caused by the workqueue item. The passing
of gfp around is as additional flags rather than a full set of flags.
The next patch will change these to caller passed semantics.
V2:
Added const modifier to gfp flags in the balance path.
Removed an extra whitespace.
[1] https://lkml.org/lkml/2018/2/12/551
Signed-off-by: Dennis Zhou <dennisszhou@gmail.com>
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Reported-by: syzbot+adb03f3f0bb57ce3acda@syzkaller.appspotmail.com
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steffen Klassert [Thu, 1 Feb 2018 07:49:23 +0000 (08:49 +0100)]
xfrm: Refuse to insert 32 bit userspace socket policies on 64 bit systems
commit
19d7df69fdb2636856dc8919de72fc1bf8f79598 upstream.
We don't have a compat layer for xfrm, so userspace and kernel
structures have different sizes in this case. This results in
a broken configuration, so refuse to configure socket policies
when trying to insert from 32 bit userspace as we do it already
with policies inserted via netlink.
Reported-and-tested-by: syzbot+e1a1577ca8bcb47b769a@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Hackmann [Wed, 7 Mar 2018 22:42:53 +0000 (14:42 -0800)]
net: xfrm: use preempt-safe this_cpu_read() in ipcomp_alloc_tfms()
commit
0dcd7876029b58770f769cbb7b484e88e4a305e5 upstream.
f7c83bcbfaf5 ("net: xfrm: use __this_cpu_read per-cpu helper") added a
__this_cpu_read() call inside ipcomp_alloc_tfms().
At the time, __this_cpu_read() required the caller to either not care
about races or to handle preemption/interrupt issues. 3.15 tightened
the rules around some per-cpu operations, and now __this_cpu_read()
should never be used in a preemptible context. On 3.15 and later, we
need to use this_cpu_read() instead.
syzkaller reported this leading to the following kernel BUG while
fuzzing sendmsg:
BUG: using __this_cpu_read() in preemptible [
00000000] code: repro/3101
caller is ipcomp_init_state+0x185/0x990
CPU: 3 PID: 3101 Comm: repro Not tainted 4.16.0-rc4-00123-g86f84779d8e9 #154
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
dump_stack+0xb9/0x115
check_preemption_disabled+0x1cb/0x1f0
ipcomp_init_state+0x185/0x990
? __xfrm_init_state+0x876/0xc20
? lock_downgrade+0x5e0/0x5e0
ipcomp4_init_state+0xaa/0x7c0
__xfrm_init_state+0x3eb/0xc20
xfrm_init_state+0x19/0x60
pfkey_add+0x20df/0x36f0
? pfkey_broadcast+0x3dd/0x600
? pfkey_sock_destruct+0x340/0x340
? pfkey_seq_stop+0x80/0x80
? __skb_clone+0x236/0x750
? kmem_cache_alloc+0x1f6/0x260
? pfkey_sock_destruct+0x340/0x340
? pfkey_process+0x62a/0x6f0
pfkey_process+0x62a/0x6f0
? pfkey_send_new_mapping+0x11c0/0x11c0
? mutex_lock_io_nested+0x1390/0x1390
pfkey_sendmsg+0x383/0x750
? dump_sp+0x430/0x430
sock_sendmsg+0xc0/0x100
___sys_sendmsg+0x6c8/0x8b0
? copy_msghdr_from_user+0x3b0/0x3b0
? pagevec_lru_move_fn+0x144/0x1f0
? find_held_lock+0x32/0x1c0
? do_huge_pmd_anonymous_page+0xc43/0x11e0
? lock_downgrade+0x5e0/0x5e0
? get_kernel_page+0xb0/0xb0
? _raw_spin_unlock+0x29/0x40
? do_huge_pmd_anonymous_page+0x400/0x11e0
? __handle_mm_fault+0x553/0x2460
? __fget_light+0x163/0x1f0
? __sys_sendmsg+0xc7/0x170
__sys_sendmsg+0xc7/0x170
? SyS_shutdown+0x1a0/0x1a0
? __do_page_fault+0x5a0/0xca0
? lock_downgrade+0x5e0/0x5e0
SyS_sendmsg+0x27/0x40
? __sys_sendmsg+0x170/0x170
do_syscall_64+0x19f/0x640
entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x7f0ee73dfb79
RSP: 002b:
00007ffe14fc15a8 EFLAGS:
00000207 ORIG_RAX:
000000000000002e
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f0ee73dfb79
RDX:
0000000000000000 RSI:
00000000208befc8 RDI:
0000000000000004
RBP:
00007ffe14fc15b0 R08:
00007ffe14fc15c0 R09:
00007ffe14fc15c0
R10:
0000000000000000 R11:
0000000000000207 R12:
0000000000400440
R13:
00007ffe14fc16b0 R14:
0000000000000000 R15:
0000000000000000
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roland Dreier [Wed, 28 Mar 2018 18:27:22 +0000 (11:27 -0700)]
RDMA/ucma: Introduce safer rdma_addr_size() variants
commit
84652aefb347297aa08e91e283adf7b18f77c2d5 upstream.
There are several places in the ucma ABI where userspace can pass in a
sockaddr but set the address family to AF_IB. When that happens,
rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6,
and the ucma kernel code might end up copying past the end of a buffer
not sized for a struct sockaddr_ib.
Fix this by introducing new variants
int rdma_addr_size_in6(struct sockaddr_in6 *addr);
int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr);
that are type-safe for the types used in the ucma ABI and return 0 if the
size computed is bigger than the size of the type passed in. We can use
these new variants to check what size userspace has passed in before
copying any addresses.
Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Sun, 25 Mar 2018 08:39:05 +0000 (11:39 +0300)]
RDMA/ucma: Check that device exists prior to accessing it
commit
c8d3bcbfc5eab3f01cf373d039af725f3b488813 upstream.
Ensure that device exists prior to accessing its properties.
Reported-by: <syzbot+71655d44855ac3e76366@syzkaller.appspotmail.com>
Fixes:
75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Sun, 25 Mar 2018 08:23:55 +0000 (11:23 +0300)]
RDMA/ucma: Check that device is connected prior to access it
commit
4b658d1bbc16605330694bb3ef2570c465ef383d upstream.
Add missing check that device is connected prior to access it.
[ 55.358652] BUG: KASAN: null-ptr-deref in rdma_init_qp_attr+0x4a/0x2c0
[ 55.359389] Read of size 8 at addr
00000000000000b0 by task qp/618
[ 55.360255]
[ 55.360432] CPU: 1 PID: 618 Comm: qp Not tainted 4.16.0-rc1-00071-gcaf61b1b8b88 #91
[ 55.361693] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 55.363264] Call Trace:
[ 55.363833] dump_stack+0x5c/0x77
[ 55.364215] kasan_report+0x163/0x380
[ 55.364610] ? rdma_init_qp_attr+0x4a/0x2c0
[ 55.365238] rdma_init_qp_attr+0x4a/0x2c0
[ 55.366410] ucma_init_qp_attr+0x111/0x200
[ 55.366846] ? ucma_notify+0xf0/0xf0
[ 55.367405] ? _get_random_bytes+0xea/0x1b0
[ 55.367846] ? urandom_read+0x2f0/0x2f0
[ 55.368436] ? kmem_cache_alloc_trace+0xd2/0x1e0
[ 55.369104] ? refcount_inc_not_zero+0x9/0x60
[ 55.369583] ? refcount_inc+0x5/0x30
[ 55.370155] ? rdma_create_id+0x215/0x240
[ 55.370937] ? _copy_to_user+0x4f/0x60
[ 55.371620] ? mem_cgroup_commit_charge+0x1f5/0x290
[ 55.372127] ? _copy_from_user+0x5e/0x90
[ 55.372720] ucma_write+0x174/0x1f0
[ 55.373090] ? ucma_close_id+0x40/0x40
[ 55.373805] ? __lru_cache_add+0xa8/0xd0
[ 55.374403] __vfs_write+0xc4/0x350
[ 55.374774] ? kernel_read+0xa0/0xa0
[ 55.375173] ? fsnotify+0x899/0x8f0
[ 55.375544] ? fsnotify_unmount_inodes+0x170/0x170
[ 55.376689] ? __fsnotify_update_child_dentry_flags+0x30/0x30
[ 55.377522] ? handle_mm_fault+0x174/0x320
[ 55.378169] vfs_write+0xf7/0x280
[ 55.378864] SyS_write+0xa1/0x120
[ 55.379270] ? SyS_read+0x120/0x120
[ 55.379643] ? mm_fault_error+0x180/0x180
[ 55.380071] ? task_work_run+0x7d/0xd0
[ 55.380910] ? __task_pid_nr_ns+0x120/0x140
[ 55.381366] ? SyS_read+0x120/0x120
[ 55.381739] do_syscall_64+0xeb/0x250
[ 55.382143] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 55.382841] RIP: 0033:0x7fc2ef803e99
[ 55.383227] RSP: 002b:
00007fffcc5f3be8 EFLAGS:
00000217 ORIG_RAX:
0000000000000001
[ 55.384173] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007fc2ef803e99
[ 55.386145] RDX:
0000000000000057 RSI:
0000000020000080 RDI:
0000000000000003
[ 55.388418] RBP:
00007fffcc5f3c00 R08:
0000000000000000 R09:
0000000000000000
[ 55.390542] R10:
0000000000000000 R11:
0000000000000217 R12:
0000000000400480
[ 55.392916] R13:
00007fffcc5f3cf0 R14:
0000000000000000 R15:
0000000000000000
[ 55.521088] Code: e5 4d 1e ff 48 89 df 44 0f b6 b3 b8 01 00 00 e8 65 50 1e ff 4c 8b 2b 49
8d bd b0 00 00 00 e8 56 50 1e ff 41 0f b6 c6 48 c1 e0 04 <49> 03 85 b0 00 00 00 48 8d 78 08
48 89 04 24 e8 3a 4f 1e ff 48
[ 55.525980] RIP: rdma_init_qp_attr+0x52/0x2c0 RSP:
ffff8801e2c2f9d8
[ 55.532648] CR2:
00000000000000b0
[ 55.534396] ---[ end trace
70cee64090251c0b ]---
Fixes:
75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Fixes:
d541e45500bd ("IB/core: Convert ah_attr from OPA to IB when copying to user")
Reported-by: <syzbot+7b62c837c2516f8f38c8@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Gunthorpe [Thu, 22 Mar 2018 20:04:23 +0000 (14:04 -0600)]
RDMA/rdma_cm: Fix use after free race with process_one_req
commit
9137108cc3d64ade13e753108ec611a0daed16a0 upstream.
process_one_req() can race with rdma_addr_cancel():
CPU0 CPU1
==== ====
process_one_work()
debug_work_deactivate(work);
process_one_req()
rdma_addr_cancel()
mutex_lock(&lock);
set_timeout(&req->work,..);
__queue_work()
debug_work_activate(work);
mutex_unlock(&lock);
mutex_lock(&lock);
[..]
list_del(&req->list);
mutex_unlock(&lock);
[..]
// ODEBUG explodes since the work is still queued.
kfree(req);
Causing ODEBUG to detect the use after free:
ODEBUG: free active (active state 0) object type: work_struct hint: process_one_req+0x0/0x6c0 include/net/dst.h:165
WARNING: CPU: 0 PID: 79 at lib/debugobjects.c:291 debug_print_object+0x166/0x220 lib/debugobjects.c:288
kvm: emulating exchange as write
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 79 Comm: kworker/u4:3 Not tainted 4.16.0-rc6+ #361
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: ib_addr process_one_req
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
panic+0x1e4/0x41c kernel/panic.c:183
__warn+0x1dc/0x200 kernel/panic.c:547
report_bug+0x1f4/0x2b0 lib/bug.c:186
fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178
fixup_bug arch/x86/kernel/traps.c:247 [inline]
do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:debug_print_object+0x166/0x220 lib/debugobjects.c:288
RSP: 0000:
ffff8801d966f210 EFLAGS:
00010086
RAX:
dffffc0000000008 RBX:
0000000000000003 RCX:
ffffffff815acd6e
RDX:
0000000000000000 RSI:
1ffff1003b2cddf2 RDI:
0000000000000000
RBP:
ffff8801d966f250 R08:
0000000000000000 R09:
1ffff1003b2cddc8
R10:
ffffed003b2cde71 R11:
ffffffff86f39a98 R12:
0000000000000001
R13:
ffffffff86f15540 R14:
ffffffff86408700 R15:
ffffffff8147c0a0
__debug_check_no_obj_freed lib/debugobjects.c:745 [inline]
debug_check_no_obj_freed+0x662/0xf1f lib/debugobjects.c:774
kfree+0xc7/0x260 mm/slab.c:3799
process_one_req+0x2e7/0x6c0 drivers/infiniband/core/addr.c:592
process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
worker_thread+0x223/0x1990 kernel/workqueue.c:2247
kthread+0x33c/0x400 kernel/kthread.c:238
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Fixes:
5fff41e1f89d ("IB/core: Fix race condition in resolving IP to MAC")
Reported-by: <syzbot+3b4acab09b6463472d0a@syzkaller.appspotmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Tue, 20 Mar 2018 15:05:13 +0000 (17:05 +0200)]
RDMA/ucma: Ensure that CM_ID exists prior to access it
commit
e8980d67d6017c8eee8f9c35f782c4bd68e004c9 upstream.
Prior to access UCMA commands, the context should be initialized
and connected to CM_ID with ucma_create_id(). In case user skips
this step, he can provide non-valid ctx without CM_ID and cause
to multiple NULL dereferences.
Also there are situations where the create_id can be raced with
other user access, ensure that the context is only shared to
other threads once it is fully initialized to avoid the races.
[ 109.088108] BUG: unable to handle kernel NULL pointer dereference at
0000000000000020
[ 109.090315] IP: ucma_connect+0x138/0x1d0
[ 109.092595] PGD
80000001dc02d067 P4D
80000001dc02d067 PUD
1da9ef067 PMD 0
[ 109.095384] Oops: 0000 [#1] SMP KASAN PTI
[ 109.097834] CPU: 0 PID: 663 Comm: uclose Tainted: G B 4.16.0-rc1-00062-g2975d5de6428 #45
[ 109.100816] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 109.105943] RIP: 0010:ucma_connect+0x138/0x1d0
[ 109.108850] RSP: 0018:
ffff8801c8567a80 EFLAGS:
00010246
[ 109.111484] RAX:
0000000000000000 RBX:
1ffff100390acf50 RCX:
ffffffff9d7812e2
[ 109.114496] RDX:
1ffffffff3f507a5 RSI:
0000000000000297 RDI:
0000000000000297
[ 109.117490] RBP:
ffff8801daa15600 R08:
0000000000000000 R09:
ffffed00390aceeb
[ 109.120429] R10:
0000000000000001 R11:
ffffed00390aceea R12:
0000000000000000
[ 109.123318] R13:
0000000000000120 R14:
ffff8801de6459c0 R15:
0000000000000118
[ 109.126221] FS:
00007fabb68d6700(0000) GS:
ffff8801e5c00000(0000) knlGS:
0000000000000000
[ 109.129468] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 109.132523] CR2:
0000000000000020 CR3:
00000001d45d8003 CR4:
00000000003606b0
[ 109.135573] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 109.138716] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 109.142057] Call Trace:
[ 109.144160] ? ucma_listen+0x110/0x110
[ 109.146386] ? wake_up_q+0x59/0x90
[ 109.148853] ? futex_wake+0x10b/0x2a0
[ 109.151297] ? save_stack+0x89/0xb0
[ 109.153489] ? _copy_from_user+0x5e/0x90
[ 109.155500] ucma_write+0x174/0x1f0
[ 109.157933] ? ucma_resolve_route+0xf0/0xf0
[ 109.160389] ? __mod_node_page_state+0x1d/0x80
[ 109.162706] __vfs_write+0xc4/0x350
[ 109.164911] ? kernel_read+0xa0/0xa0
[ 109.167121] ? path_openat+0x1b10/0x1b10
[ 109.169355] ? fsnotify+0x899/0x8f0
[ 109.171567] ? fsnotify_unmount_inodes+0x170/0x170
[ 109.174145] ? __fget+0xa8/0xf0
[ 109.177110] vfs_write+0xf7/0x280
[ 109.179532] SyS_write+0xa1/0x120
[ 109.181885] ? SyS_read+0x120/0x120
[ 109.184482] ? compat_start_thread+0x60/0x60
[ 109.187124] ? SyS_read+0x120/0x120
[ 109.189548] do_syscall_64+0xeb/0x250
[ 109.192178] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 109.194725] RIP: 0033:0x7fabb61ebe99
[ 109.197040] RSP: 002b:
00007fabb68d5e98 EFLAGS:
00000202 ORIG_RAX:
0000000000000001
[ 109.200294] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007fabb61ebe99
[ 109.203399] RDX:
0000000000000120 RSI:
00000000200001c0 RDI:
0000000000000004
[ 109.206548] RBP:
00007fabb68d5ec0 R08:
0000000000000000 R09:
0000000000000000
[ 109.209902] R10:
0000000000000000 R11:
0000000000000202 R12:
00007fabb68d5fc0
[ 109.213327] R13:
0000000000000000 R14:
00007fff40ab2430 R15:
00007fabb68d69c0
[ 109.216613] Code: 88 44 24 2c 0f b6 84 24 6e 01 00 00 88 44 24 2d 0f
b6 84 24 69 01 00 00 88 44 24 2e 8b 44 24 60 89 44 24 30 e8 da f6 06 ff
31 c0 <66> 41 83 7c 24 20 1b 75 04 8b 44 24 64 48 8d 74 24 20 4c 89 e7
[ 109.223602] RIP: ucma_connect+0x138/0x1d0 RSP:
ffff8801c8567a80
[ 109.226256] CR2:
0000000000000020
Fixes:
75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <syzbot+36712f50b0552615bf59@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Mon, 19 Mar 2018 12:20:15 +0000 (14:20 +0200)]
RDMA/ucma: Fix use-after-free access in ucma_close
commit
ed65a4dc22083e73bac599ded6a262318cad7baf upstream.
The error in ucma_create_id() left ctx in the list of contexts belong
to ucma file descriptor. The attempt to close this file descriptor causes
to use-after-free accesses while iterating over such list.
Fixes:
75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Reported-by: <syzbot+dcfd344365a56fbebd0f@syzkaller.appspotmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Leon Romanovsky [Thu, 15 Mar 2018 13:33:02 +0000 (15:33 +0200)]
RDMA/ucma: Check AF family prior resolving address
commit
2975d5de6428ff6d9317e9948f0968f7d42e5d74 upstream.
Garbage supplied by user will cause to UCMA module provide zero
memory size for memcpy(), because it wasn't checked, it will
produce unpredictable results in rdma_resolve_addr().
[ 42.873814] BUG: KASAN: null-ptr-deref in rdma_resolve_addr+0xc8/0xfb0
[ 42.874816] Write of size 28 at addr
00000000000000a0 by task resaddr/1044
[ 42.876765]
[ 42.876960] CPU: 1 PID: 1044 Comm: resaddr Not tainted 4.16.0-rc1-00057-gaa56a5293d7e #34
[ 42.877840] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 42.879691] Call Trace:
[ 42.880236] dump_stack+0x5c/0x77
[ 42.880664] kasan_report+0x163/0x380
[ 42.881354] ? rdma_resolve_addr+0xc8/0xfb0
[ 42.881864] memcpy+0x34/0x50
[ 42.882692] rdma_resolve_addr+0xc8/0xfb0
[ 42.883366] ? deref_stack_reg+0x88/0xd0
[ 42.883856] ? vsnprintf+0x31a/0x770
[ 42.884686] ? rdma_bind_addr+0xc40/0xc40
[ 42.885327] ? num_to_str+0x130/0x130
[ 42.885773] ? deref_stack_reg+0x88/0xd0
[ 42.886217] ? __read_once_size_nocheck.constprop.6+0x10/0x10
[ 42.887698] ? unwind_get_return_address_ptr+0x50/0x50
[ 42.888302] ? replace_slot+0x147/0x170
[ 42.889176] ? delete_node+0x12c/0x340
[ 42.890223] ? __radix_tree_lookup+0xa9/0x160
[ 42.891196] ? ucma_resolve_ip+0xb7/0x110
[ 42.891917] ucma_resolve_ip+0xb7/0x110
[ 42.893003] ? ucma_resolve_addr+0x190/0x190
[ 42.893531] ? _copy_from_user+0x5e/0x90
[ 42.894204] ucma_write+0x174/0x1f0
[ 42.895162] ? ucma_resolve_route+0xf0/0xf0
[ 42.896309] ? dequeue_task_fair+0x67e/0xd90
[ 42.897192] ? put_prev_entity+0x7d/0x170
[ 42.897870] ? ring_buffer_record_is_on+0xd/0x20
[ 42.898439] ? tracing_record_taskinfo_skip+0x20/0x50
[ 42.899686] __vfs_write+0xc4/0x350
[ 42.900142] ? kernel_read+0xa0/0xa0
[ 42.900602] ? firmware_map_remove+0xdf/0xdf
[ 42.901135] ? do_task_dead+0x5d/0x60
[ 42.901598] ? do_exit+0xcc6/0x1220
[ 42.902789] ? __fget+0xa8/0xf0
[ 42.903190] vfs_write+0xf7/0x280
[ 42.903600] SyS_write+0xa1/0x120
[ 42.904206] ? SyS_read+0x120/0x120
[ 42.905710] ? compat_start_thread+0x60/0x60
[ 42.906423] ? SyS_read+0x120/0x120
[ 42.908716] do_syscall_64+0xeb/0x250
[ 42.910760] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 42.912735] RIP: 0033:0x7f138b0afe99
[ 42.914734] RSP: 002b:
00007f138b799e98 EFLAGS:
00000287 ORIG_RAX:
0000000000000001
[ 42.917134] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f138b0afe99
[ 42.919487] RDX:
000000000000002e RSI:
0000000020000c40 RDI:
0000000000000004
[ 42.922393] RBP:
00007f138b799ec0 R08:
00007f138b79a700 R09:
0000000000000000
[ 42.925266] R10:
00007f138b79a700 R11:
0000000000000287 R12:
00007f138b799fc0
[ 42.927570] R13:
0000000000000000 R14:
00007ffdbae757c0 R15:
00007f138b79a9c0
[ 42.930047]
[ 42.932681] Disabling lock debugging due to kernel taint
[ 42.934795] BUG: unable to handle kernel NULL pointer dereference at
00000000000000a0
[ 42.936939] IP: memcpy_erms+0x6/0x10
[ 42.938864] PGD
80000001bea92067 P4D
80000001bea92067 PUD
1bea96067 PMD 0
[ 42.941576] Oops: 0002 [#1] SMP KASAN PTI
[ 42.943952] CPU: 1 PID: 1044 Comm: resaddr Tainted: G B 4.16.0-rc1-00057-gaa56a5293d7e #34
[ 42.946964] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
[ 42.952336] RIP: 0010:memcpy_erms+0x6/0x10
[ 42.954707] RSP: 0018:
ffff8801c8b479c8 EFLAGS:
00010286
[ 42.957227] RAX:
00000000000000a0 RBX:
ffff8801c8b47ba0 RCX:
000000000000001c
[ 42.960543] RDX:
000000000000001c RSI:
ffff8801c8b47bbc RDI:
00000000000000a0
[ 42.963867] RBP:
ffff8801c8b47b60 R08:
0000000000000000 R09:
ffffed0039168ed1
[ 42.967303] R10:
0000000000000001 R11:
ffffed0039168ed0 R12:
ffff8801c8b47bbc
[ 42.970685] R13:
00000000000000a0 R14:
1ffff10039168f4a R15:
0000000000000000
[ 42.973631] FS:
00007f138b79a700(0000) GS:
ffff8801e5d00000(0000) knlGS:
0000000000000000
[ 42.976831] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 42.979239] CR2:
00000000000000a0 CR3:
00000001be908002 CR4:
00000000003606a0
[ 42.982060] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 42.984877] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 42.988033] Call Trace:
[ 42.990487] rdma_resolve_addr+0xc8/0xfb0
[ 42.993202] ? deref_stack_reg+0x88/0xd0
[ 42.996055] ? vsnprintf+0x31a/0x770
[ 42.998707] ? rdma_bind_addr+0xc40/0xc40
[ 43.000985] ? num_to_str+0x130/0x130
[ 43.003410] ? deref_stack_reg+0x88/0xd0
[ 43.006302] ? __read_once_size_nocheck.constprop.6+0x10/0x10
[ 43.008780] ? unwind_get_return_address_ptr+0x50/0x50
[ 43.011178] ? replace_slot+0x147/0x170
[ 43.013517] ? delete_node+0x12c/0x340
[ 43.016019] ? __radix_tree_lookup+0xa9/0x160
[ 43.018755] ? ucma_resolve_ip+0xb7/0x110
[ 43.021270] ucma_resolve_ip+0xb7/0x110
[ 43.023968] ? ucma_resolve_addr+0x190/0x190
[ 43.026312] ? _copy_from_user+0x5e/0x90
[ 43.029384] ucma_write+0x174/0x1f0
[ 43.031861] ? ucma_resolve_route+0xf0/0xf0
[ 43.034782] ? dequeue_task_fair+0x67e/0xd90
[ 43.037483] ? put_prev_entity+0x7d/0x170
[ 43.040215] ? ring_buffer_record_is_on+0xd/0x20
[ 43.042990] ? tracing_record_taskinfo_skip+0x20/0x50
[ 43.045595] __vfs_write+0xc4/0x350
[ 43.048624] ? kernel_read+0xa0/0xa0
[ 43.051604] ? firmware_map_remove+0xdf/0xdf
[ 43.055379] ? do_task_dead+0x5d/0x60
[ 43.058000] ? do_exit+0xcc6/0x1220
[ 43.060783] ? __fget+0xa8/0xf0
[ 43.063133] vfs_write+0xf7/0x280
[ 43.065677] SyS_write+0xa1/0x120
[ 43.068647] ? SyS_read+0x120/0x120
[ 43.071179] ? compat_start_thread+0x60/0x60
[ 43.074025] ? SyS_read+0x120/0x120
[ 43.076705] do_syscall_64+0xeb/0x250
[ 43.079006] entry_SYSCALL_64_after_hwframe+0x21/0x86
[ 43.081606] RIP: 0033:0x7f138b0afe99
[ 43.083679] RSP: 002b:
00007f138b799e98 EFLAGS:
00000287 ORIG_RAX:
0000000000000001
[ 43.086802] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f138b0afe99
[ 43.089989] RDX:
000000000000002e RSI:
0000000020000c40 RDI:
0000000000000004
[ 43.092866] RBP:
00007f138b799ec0 R08:
00007f138b79a700 R09:
0000000000000000
[ 43.096233] R10:
00007f138b79a700 R11:
0000000000000287 R12:
00007f138b799fc0
[ 43.098913] R13:
0000000000000000 R14:
00007ffdbae757c0 R15:
00007f138b79a9c0
[ 43.101809] Code: 90 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48
c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48
89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38
[ 43.107950] RIP: memcpy_erms+0x6/0x10 RSP:
ffff8801c8b479c8
Reported-by: <syzbot+1d8c43206853b369d00c@syzkaller.appspotmail.com>
Fixes:
75216638572f ("RDMA/cma: Export rdma cm interface to userspace")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Westphal [Mon, 12 Feb 2018 13:42:01 +0000 (14:42 +0100)]
xfrm_user: uncoditionally validate esn replay attribute struct
commit
d97ca5d714a5334aecadadf696875da40f1fbf3e upstream.
The sanity test added in
ecd7918745234 can be bypassed, validation
only occurs if XFRM_STATE_ESN flag is set, but rest of code doesn't care
and just checks if the attribute itself is present.
So always validate. Alternative is to reject if we have the attribute
without the flag but that would change abi.
Reported-by: syzbot+0ab777c27d2bb7588f73@syzkaller.appspotmail.com
Cc: Mathias Krause <minipli@googlemail.com>
Fixes:
ecd7918745234 ("xfrm_user: ensure user supplied esn replay window is valid")
Fixes:
d8647b79c3b7e ("xfrm: Add user interface for esn and big anti-replay windows")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Narron [Wed, 10 Jan 2018 16:12:16 +0000 (09:12 -0700)]
partitions/msdos: Unable to mount UFS 44bsd partitions
commit
5f15684bd5e5ef39d4337988864fec8012471dda upstream.
UFS partitions from newer versions of FreeBSD 10 and 11 use relative
addressing for their subpartitions. But older versions of FreeBSD still
use absolute addressing just like OpenBSD and NetBSD.
Instead of simply testing for a FreeBSD partition, the code needs to
also test if the starting offset of the C subpartition is zero.
https://bugzilla.kernel.org/show_bug.cgi?id=197733
Signed-off-by: Richard Narron <comet.berkeley@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicholas Piggin [Fri, 23 Mar 2018 05:53:38 +0000 (15:53 +1000)]
powerpc/64s: Fix i-side SLB miss bad address handler saving nonvolatile GPRs
commit
52396500f97c53860164debc7d4f759077853423 upstream.
The SLB bad address handler's trap number fixup does not preserve the
low bit that indicates nonvolatile GPRs have not been saved. This
leads save_nvgprs to skip saving them, and subsequent functions and
return from interrupt will think they are saved.
This causes kernel branch-to-garbage debugging to not have correct
registers, can also cause userspace to have its registers clobbered
after a segfault.
Fixes:
f0f558b131db ("powerpc/mm: Preserve CFAR value on SLB miss caused by access to bogus address")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicholas Piggin [Wed, 21 Mar 2018 02:22:28 +0000 (12:22 +1000)]
powerpc/64s: Fix lost pending interrupt due to race causing lost update to irq_happened
commit
ff6781fd1bb404d8a551c02c35c70cec1da17ff1 upstream.
force_external_irq_replay() can be called in the do_IRQ path with
interrupts hard enabled and soft disabled if may_hard_irq_enable() set
MSR[EE]=1. It updates local_paca->irq_happened with a load, modify,
store sequence. If a maskable interrupt hits during this sequence, it
will go to the masked handler to be marked pending in irq_happened.
This update will be lost when the interrupt returns and the store
instruction executes. This can result in unpredictable latencies,
timeouts, lockups, etc.
Fix this by ensuring hard interrupts are disabled before modifying
irq_happened.
This could cause any maskable asynchronous interrupt to get lost, but
it was noticed on P9 SMP system doing RDMA NVMe target over 100GbE,
so very high external interrupt rate and high IPI rate. The hang was
bisected down to enabling doorbell interrupts for IPIs. These provided
an interrupt type that could run at high rates in the do_IRQ path,
stressing the race.
Fixes:
1d607bb3bd60 ("powerpc/irq: Add mechanism to force a replay of interrupts")
Cc: stable@vger.kernel.org # v4.8+
Reported-by: Carol L. Soto <clsoto@us.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pierre-Yves MORDRET [Wed, 21 Mar 2018 16:48:40 +0000 (17:48 +0100)]
i2c: i2c-stm32f7: fix no check on returned setup
commit
771b7bf05339081019d22452ebcab6929372e13e upstream.
Before assigning returned setup structure check if not null
Fixes:
463a9215f3ca7600b5ff ("i2c: stm32f7: fix setup structure")
Signed-off-by: Pierre-Yves MORDRET <pierre-yves.mordret@st.com>
Acked-by: Alexandre TORGUE <alexandre.torgue@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mike Kravetz [Wed, 28 Mar 2018 23:01:01 +0000 (16:01 -0700)]
ipc/shm.c: add split function to shm_vm_ops
commit
3d942ee079b917b24e2a0c5f18d35ac8ec9fee48 upstream.
If System V shmget/shmat operations are used to create a hugetlbfs
backed mapping, it is possible to munmap part of the mapping and split
the underlying vma such that it is not huge page aligned. This will
untimately result in the following BUG:
kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310!
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt
CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E 4.15.0-10-generic #11-Ubuntu
NIP:
c00000000036e764 LR:
c00000000036ee48 CTR:
0000000000000009
REGS:
c000003fbcdcf810 TRAP: 0700 Tainted: G C E (4.15.0-10-generic)
MSR:
9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR:
24002222 XER:
20040000
CFAR:
c00000000036ee44 SOFTE: 1
NIP __unmap_hugepage_range+0xa4/0x760
LR __unmap_hugepage_range_final+0x28/0x50
Call Trace:
0x7115e4e00000 (unreliable)
__unmap_hugepage_range_final+0x28/0x50
unmap_single_vma+0x11c/0x190
unmap_vmas+0x94/0x140
exit_mmap+0x9c/0x1d0
mmput+0xa8/0x1d0
do_exit+0x360/0xc80
do_group_exit+0x60/0x100
SyS_exit_group+0x24/0x30
system_call+0x58/0x6c
---[ end trace
ee88f958a1c62605 ]---
This bug was introduced by commit
31383c6865a5 ("mm, hugetlbfs:
introduce ->split() to vm_operations_struct"). A split function was
added to vm_operations_struct to determine if a mapping can be split.
This was mostly for device-dax and hugetlbfs mappings which have
specific alignment constraints.
Mappings initiated via shmget/shmat have their original vm_ops
overwritten with shm_vm_ops. shm_vm_ops functions will call back to the
original vm_ops if needed. Add such a split function to shm_vm_ops.
Link: http://lkml.kernel.org/r/20180321161314.7711-1-mike.kravetz@oracle.com
Fixes:
31383c6865a5 ("mm, hugetlbfs: introduce ->split() to vm_operations_struct")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reported-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yan, Zheng [Fri, 16 Mar 2018 03:22:29 +0000 (11:22 +0800)]
ceph: only dirty ITER_IOVEC pages for direct read
commit
85784f9395987a422fa04263e7c0fb13da11eb5c upstream.
If a page is already locked, attempting to dirty it leads to a deadlock
in lock_page(). This is what currently happens to ITER_BVEC pages when
a dio-enabled loop device is backed by ceph:
$ losetup --direct-io /dev/loop0 /mnt/cephfs/img
$ xfs_io -c 'pread 0 4k' /dev/loop0
Follow other file systems and only dirty ITER_IOVEC pages.
Cc: stable@kernel.org
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Tue, 27 Mar 2018 01:39:07 +0000 (15:39 -1000)]
perf/hwbp: Simplify the perf-hwbp code, fix documentation
commit
f67b15037a7a50c57f72e69a6d59941ad90a0f0f upstream.
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the
modification of a breakpoint - simplify it and remove the pointless
local variables.
Also update the stale Docbook while at it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrew Banman [Tue, 27 Mar 2018 22:09:06 +0000 (17:09 -0500)]
x86/platform/uv/BAU: Add APIC idt entry
commit
151ad17fbe5e56afa59709f41980508672c777ce upstream.
BAU uses the old alloc_initr_gate90 method to setup its interrupt. This
fails silently as the BAU vector is in the range of APIC vectors that are
registered to the spurious interrupt handler. As a consequence BAU
broadcasts are not handled, and the broadcast source CPU hangs.
Update BAU to use new idt structure.
Fixes:
dc20b2d52653 ("x86/idt: Move interrupt gate initialization to IDT code")
Signed-off-by: Andrew Banman <abanman@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mike Travis <mike.travis@hpe.com>
Cc: Dimitri Sivanich <sivanich@hpe.com>
Cc: Russ Anderson <rja@hpe.com>
Cc: stable@vger.kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/1522188546-196177-1-git-send-email-abanman@hpe.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Tue, 27 Mar 2018 13:07:52 +0000 (16:07 +0300)]
ALSA: pcm: potential uninitialized return values
commit
5607dddbfca774fb38bffadcb077fe03aa4ac5c6 upstream.
Smatch complains that "tmp" can be uninitialized if we do a zero size
write.
Fixes:
02a5d6925cd3 ("ALSA: pcm: Avoid potential races between OSS ioctls and read/write")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Roese [Mon, 26 Mar 2018 14:10:21 +0000 (16:10 +0200)]
ALSA: pcm: Use dma_bytes as size parameter in dma_mmap_coherent()
commit
9066ae7ff5d89c0b5daa271e2d573540097a94fa upstream.
When trying to use the driver (e.g. aplay *.wav), the 4MiB DMA buffer
will get mmapp'ed in 16KiB chunks. But this fails with the 2nd 16KiB
area, as the page offset is outside of the VMA range (size), which is
currently used as size parameter in snd_pcm_lib_default_mmap(). By
using the DMA buffer size (dma_bytes) instead, the complete DMA buffer
can be mmapp'ed and the issue is fixed.
This issue was detected on an ARM platform (TI AM57xx) using the RME
HDSP MADI PCIe soundcard.
Fixes:
657b1989dacf ("ALSA: pcm - Use dma_mmap_coherent() if available")
Signed-off-by: Stefan Roese <sr@denx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nobutaka Okabe [Fri, 23 Mar 2018 10:49:44 +0000 (19:49 +0900)]
ALSA: usb-audio: Add native DSD support for TEAC UD-301
commit
b00214865d65100163574ba250008f182cf90869 upstream.
Add native DSD support quirk for TEAC UD-301 DAC,
by adding the PID/VID 0644:804a.
Signed-off-by: Nobutaka Okabe <nob77413@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Boris Brezillon [Tue, 27 Mar 2018 17:01:58 +0000 (19:01 +0200)]
mtd: nand: atmel: Fix get_sectorsize() function
commit
2b1b1b4ac716fd929a2d221bd4ade62263bed915 upstream.
get_sectorsize() was not using the appropriate macro to extract the
ECC sector size from the config cache, which led to buggy ECC when
using 1024 byte sectors.
Fixes:
f88fc122cc34 ("mtd: nand: Cleanup/rework the atmel_nand driver")
Cc: <stable@vger.kernel.org>
Reported-by: Olivier Schonken <olivier.schonken@gmail.com>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Reviewed-by: Richard Weinberger <richard@nod.at>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Tested-by: Olivier Schonken <olivier.schonken@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Walleij [Sat, 3 Mar 2018 22:29:03 +0000 (23:29 +0100)]
mtd: jedec_probe: Fix crash in jedec_read_mfr()
commit
87a73eb5b56fd6e07c8e499fe8608ef2d8912b82 upstream.
It turns out that the loop where we read manufacturer
jedec_read_mfd() can under some circumstances get a
CFI_MFR_CONTINUATION repeatedly, making the loop go
over all banks and eventually hit the end of the
map and crash because of an access violation:
Unable to handle kernel paging request at virtual address
c4980000
pgd = (ptrval)
[
c4980000] *pgd=
03808811, *pte=
00000000, *ppte=
00000000
Internal error: Oops: 7 [#1] PREEMPT ARM
CPU: 0 PID: 1 Comm: swapper Not tainted 4.16.0-rc1+ #150
Hardware name: Gemini (Device Tree)
PC is at jedec_probe_chip+0x6ec/0xcd0
LR is at 0x4
pc : [<
c03a2bf4>] lr : [<
00000004>] psr:
60000013
sp :
c382dd18 ip :
0000ffff fp :
00000000
r10:
c0626388 r9 :
00020000 r8 :
c0626340
r7 :
00000000 r6 :
00000001 r5 :
c3a71afc r4 :
c382dd70
r3 :
00000001 r2 :
c4900000 r1 :
00000002 r0 :
00080000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
Control:
0000397f Table:
00004000 DAC:
00000053
Process swapper (pid: 1, stack limit = 0x(ptrval))
Fix this by breaking the loop with a return 0 if
the offset exceeds the map size.
Fixes:
5c9c11e1c47c ("[MTD] [NOR] Add support for flash chips with ID in bank other than 0")
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Philipp Rossak [Wed, 14 Feb 2018 14:10:25 +0000 (15:10 +0100)]
ARM: dts: sun6i: a31s: bpi-m2: add missing regulators
commit
70b8d21496758dd7ff600ec9de0ee3812fac7a40 upstream.
This patch fixes a bootproblem with the Bananapi M2 board. Since there
are some regulators missing we add them right now. Those values come
from the schematic, below you can find a small overview:
* reg_aldo1: 3,3V, powers the wifi
* reg_aldo2: 2,5V, powers the IO of the RTL8211E
* reg_aldo3: 3,3V, powers the audio
* reg_dldo1: 3,0V, powers the RTL8211E
* reg_dldo2: 2,8V, powers the analog part of the csi
* reg_dldo3: 3,3V, powers misc
* reg_eldo1: 1,8V, powers the csi
* reg_ldo_io1:1,8V, powers the gpio
* reg_dc5ldo: needs to be always on
This patch updates also the vmmc-supply properties on the mmc0 and mmc2
node to use the allready existent regulators.
We can now remove the sunxi-common-regulators.dtsi include since we
don't need it anymore.
Fixes:
7daa21370075 ("ARM: dts: sunxi: Add regulators for Sinovoip BPI-M2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Philipp Rossak <embed3d@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Philipp Rossak [Wed, 14 Feb 2018 14:10:24 +0000 (15:10 +0100)]
ARM: dts: sun6i: a31s: bpi-m2: improve pmic properties
commit
b23af6ad8d2f708c4c3f92dd8f82c233247ba8bf upstream.
The eldoin is supplied from the dcdc1 regulator. The N_VBUSEN pin is
connected to an external power regulator (SY6280AAC).
With this commit we update the pmic binding properties to support
those features.
Fixes:
7daa21370075 ("ARM: dts: sunxi: Add regulators for Sinovoip BPI-M2")
Cc: <stable@vger.kernel.org>
Signed-off-by: Philipp Rossak <embed3d@gmail.com>
Signed-off-by: Maxime Ripard <maxime.ripard@bootlin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fabio Estevam [Mon, 22 Jan 2018 11:20:26 +0000 (12:20 +0100)]
ARM: 8746/1: vfp: Go back to clearing vfp_current_hw_state[]
commit
1328f02005bbbaed15b9d5b7f3ab5ec9d4d5268a upstream.
Commit
384b38b66947 ("ARM: 7873/1: vfp: clear vfp_current_hw_state
for dying cpu") fixed the cpu dying notifier by clearing
vfp_current_hw_state[]. However commit
e5b61bafe704 ("arm: Convert VFP
hotplug notifiers to state machine") incorrectly used the original
vfp_force_reload() function in the cpu dying notifier.
Fix it by going back to clearing vfp_current_hw_state[].
Fixes:
e5b61bafe704 ("arm: Convert VFP hotplug notifiers to state machine")
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: Kohji Okuno <okuno.kohji@jp.panasonic.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tony Lindgren [Wed, 21 Mar 2018 15:16:29 +0000 (08:16 -0700)]
ARM: OMAP: Fix SRAM W+X mapping
commit
eb85a355c3afd9379f5953cfe2df73632d14c884 upstream.
We are still using custom SRAM code for some SoCs and are not marking
the PM code mapped to SRAM as read-only and executable after we're
done. With CONFIG_DEBUG_WX=y, we will get "Found insecure W+X mapping
at address" warning.
Let's fix this issue the same way as commit
728bbe75c82f ("misc: sram:
Introduce support code for protect-exec sram type") is doing for
drivers/misc/sram-exec.c.
On omap3, we need to restore SRAM when returning from off mode after
idle, so init time configuration is not enough.
And as we no longer have users for omap_sram_push_address() we can
make it static while at it.
Note that eventually we should be using sram-exec.c for all SoCs.
Cc: stable@vger.kernel.org # v4.12+
Cc: Dave Gerlach <d-gerlach@ti.com>
Reported-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sat, 31 Mar 2018 16:10:43 +0000 (18:10 +0200)]
Linux 4.14.32
Julian Wiedmann [Tue, 20 Mar 2018 06:59:15 +0000 (07:59 +0100)]
s390/qeth: on channel error, reject further cmd requests
[ Upstream commit
a6c3d93963e4b333c764fde69802c3ea9eaa9d5c ]
When the IRQ handler determines that one of the cmd IO channels has
failed and schedules recovery, block any further cmd requests from
being submitted. The request would inevitably stall, and prevent the
recovery from making progress until the request times out.
This sort of error was observed after Live Guest Relocation, where
the pending IO on the READ channel intentionally gets terminated to
kick-start recovery. Simultaneously the guest executed SIOCETHTOOL,
triggering qeth to issue a QUERY CARD INFO command. The command
then stalled in the inoperabel WRITE channel.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Julian Wiedmann [Tue, 20 Mar 2018 06:59:14 +0000 (07:59 +0100)]
s390/qeth: lock read device while queueing next buffer
[ Upstream commit
17bf8c9b3d499d5168537c98b61eb7a1fcbca6c2 ]
For calling ccw_device_start(), issue_next_read() needs to hold the
device's ccwlock.
This is satisfied for the IRQ handler path (where qeth_irq() gets called
under the ccwlock), but we need explicit locking for the initial call by
the MPC initialization.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Julian Wiedmann [Tue, 20 Mar 2018 06:59:13 +0000 (07:59 +0100)]
s390/qeth: when thread completes, wake up all waiters
[ Upstream commit
1063e432bb45be209427ed3f1ca3908e4aa3c7d7 ]
qeth_wait_for_threads() is potentially called by multiple users, make
sure to notify all of them after qeth_clear_thread_running_bit()
adjusted the thread_running_mask. With no timeout, callers would
otherwise stall.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Julian Wiedmann [Tue, 20 Mar 2018 06:59:12 +0000 (07:59 +0100)]
s390/qeth: free netdevice when removing a card
[ Upstream commit
6be687395b3124f002a653c1a50b3260222b3cd7 ]
On removal, a qeth card's netdevice is currently not properly freed
because the call chain looks as follows:
qeth_core_remove_device(card)
lx_remove_device(card)
unregister_netdev(card->dev)
card->dev = NULL !!!
qeth_core_free_card(card)
if (card->dev) !!!
free_netdev(card->dev)
Fix it by free'ing the netdev straight after unregistering. This also
fixes the sysfs-driven layer switch case (qeth_dev_layer2_store()),
where the need to free the current netdevice was not considered at all.
Note that free_netdev() takes care of the netif_napi_del() for us too.
Fixes:
4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Reviewed-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Camelia Groza [Wed, 14 Mar 2018 13:37:32 +0000 (08:37 -0500)]
dpaa_eth: remove duplicate increment of the tx_errors counter
[ Upstream commit
82d141cd19d088ee41feafde4a6f86eeb40d93c5 ]
The tx_errors counter is incremented by the dpaa_xmit caller.
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Camelia Groza [Wed, 14 Mar 2018 13:37:31 +0000 (08:37 -0500)]
dpaa_eth: increment the RX dropped counter when needed
[ Upstream commit
e4d1b37c17d000a3da9368a3e260fb9ea4927c25 ]
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Camelia Groza [Wed, 14 Mar 2018 13:37:30 +0000 (08:37 -0500)]
dpaa_eth: remove duplicate initialization
[ Upstream commit
565186362b73226a288830abe595f05f0cec0bbc ]
The fd_format has already been initialized at this point.
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Madalin Bucur [Wed, 14 Mar 2018 13:37:29 +0000 (08:37 -0500)]
dpaa_eth: fix error in dpaa_remove()
[ Upstream commit
88075256ee817041d68c2387f29065b5cb2b342a ]
The recent changes that make the driver probing compatible with DSA
were not propagated in the dpa_remove() function, breaking the
module unload function. Using the proper device to address the issue.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Madalin Bucur [Wed, 14 Mar 2018 13:37:28 +0000 (08:37 -0500)]
soc/fsl/qbman: fix issue in qman_delete_cgr_safe()
[ Upstream commit
96f413f47677366e0ae03797409bfcc4151dbf9e ]
The wait_for_completion() call in qman_delete_cgr_safe()
was triggering a scheduling while atomic bug, replacing the
kthread with a smp_call_function_single() call to fix it.
Signed-off-by: Madalin Bucur <madalin.bucur@nxp.com>
Signed-off-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arkadi Sharshevsky [Thu, 8 Mar 2018 10:42:10 +0000 (12:42 +0200)]
team: Fix double free in error path
[ Upstream commit
cbcc607e18422555db569b593608aec26111cb0b ]
The __send_and_alloc_skb() receives a skb ptr as a parameter but in
case it fails the skb is not valid:
- Send failed and released the skb internally.
- Allocation failed.
The current code tries to release the skb in case of failure which
causes redundant freeing.
Fixes:
9b00cf2d1024 ("team: implement multipart netlink messages for options transfers")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Vinicius Costa Gomes [Wed, 14 Mar 2018 20:32:09 +0000 (13:32 -0700)]
skbuff: Fix not waking applications when errors are enqueued
[ Upstream commit
6e5d58fdc9bedd0255a8781b258f10bbdc63e975 ]
When errors are enqueued to the error queue via sock_queue_err_skb()
function, it is possible that the waiting application is not notified.
Calling 'sk->sk_data_ready()' would not notify applications that
selected only POLLERR events in poll() (for example).
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Randy E. Witt <randy.e.witt@intel.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Kalderon [Wed, 14 Mar 2018 12:56:53 +0000 (14:56 +0200)]
qede: Fix qedr link update
[ Upstream commit
4609adc27175839408359822523de7247d56c87f ]
Link updates were not reported to qedr correctly.
Leading to cases where a link could be down, but qedr
would see it as up.
In addition, once qede was loaded, link state would be up,
regardless of the actual link state.
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Tue, 13 Mar 2018 21:45:07 +0000 (14:45 -0700)]
net: systemport: Rewrite __bcm_sysport_tx_reclaim()
[ Upstream commit
484d802d0f2f29c335563fcac2a8facf174a1bbc ]
There is no need for complex checking between the last consumed index
and current consumed index, a simple subtraction will do.
This also eliminates the possibility of a permanent transmit queue stall
under the following conditions:
- one CPU bursts ring->size worth of traffic (up to 256 buffers), to the
point where we run out of free descriptors, so we stop the transmit
queue at the end of bcm_sysport_xmit()
- because of our locking, we have the transmit process disable
interrupts which means we can be blocking the TX reclamation process
- when TX reclamation finally runs, we will be computing the difference
between ring->c_index (last consumed index by SW) and what the HW
reports through its register
- this register is masked with (ring->size - 1) = 0xff, which will lead
to stripping the upper bits of the index (register is 16-bits wide)
- we will be computing last_tx_cn as 0, which means there is no work to
be done, and we never wake-up the transmit queue, leaving it
permanently disabled
A practical example is e.g: ring->c_index aka last_c_index = 12, we
pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12,
so last_tx_cn == 0, nothing happens.
Fixes:
80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Ahern [Fri, 16 Feb 2018 19:03:03 +0000 (11:03 -0800)]
net: Only honor ifindex in IP_PKTINFO if non-0
[ Upstream commit
2cbb4ea7de167b02ffa63e9cdfdb07a7e7094615 ]
Only allow ifindex from IP_PKTINFO to override SO_BINDTODEVICE settings
if the index is actually set in the message.
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicolas Dichtel [Wed, 14 Mar 2018 20:10:23 +0000 (21:10 +0100)]
netlink: avoid a double skb free in genlmsg_mcast()
[ Upstream commit
02a2385f37a7c6594c9d89b64c4a1451276f08eb ]
nlmsg_multicast() consumes always the skb, thus the original skb must be
freed only when this function is called with a clone.
Fixes:
cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()")
Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arvind Yadav [Tue, 13 Mar 2018 15:50:06 +0000 (16:50 +0100)]
net/iucv: Free memory obtained by kzalloc
[ Upstream commit
fa6a91e9b907231d2e38ea5ed89c537b3525df3d ]
Free memory by calling put_device(), if afiucv_iucv_init is not
successful.
Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Sun, 18 Mar 2018 19:49:51 +0000 (12:49 -0700)]
net: fec: Fix unbalanced PM runtime calls
[ Upstream commit
a069215cf5985f3aa1bba550264907d6bd05c5f7 ]
When unbinding/removing the driver, we will run into the following warnings:
[ 259.655198] fec
400d1000.ethernet:
400d1000.ethernet supply phy not found, using dummy regulator
[ 259.665065] fec
400d1000.ethernet: Unbalanced pm_runtime_enable!
[ 259.672770] fec
400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00
[ 259.683062] fec
400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1
[ 259.696239] libphy: fec_enet_mii_bus: probed
Avoid these warnings by balancing the runtime PM calls during fec_drv_remove().
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SZ Lin (林上智) [Thu, 15 Mar 2018 16:56:01 +0000 (00:56 +0800)]
net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY interface
[ Upstream commit
f9db50691db4a7d860fce985f080bb3fc23a7ede ]
According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and
DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only
available when PHY is configured in RGMII mode with 10Mbps speed. It will
cause some networking issues without RGMII mode, such as carrier sense
errors and low throughput. TI also mentioned this issue in their forum[4].
This patch adds the check mechanism for PHY interface with RGMII interface
type, the in-band mode can only be set in RGMII mode with 10Mbps speed.
References:
[1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf
[2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf
[3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf
[4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155
Suggested-by: Holsety Chen (陳憲輝) <Holsety.Chen@moxa.com>
Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
Signed-off-by: Schuyler Patton <spatton@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christophe JAILLET [Sun, 18 Mar 2018 22:59:36 +0000 (23:59 +0100)]
net: ethernet: arc: Fix a potential memory leak if an optional regulator is deferred
[ Upstream commit
00777fac28ba3e126b9e63e789a613e8bd2cab25 ]
If the optional regulator is deferred, we must release some resources.
They will be re-allocated when the probe function will be called again.
Fixes:
6eacf31139bf ("ethernet: arc: Add support for Rockchip SoC layer device tree bindings")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 6 Mar 2018 15:54:53 +0000 (07:54 -0800)]
l2tp: do not accept arbitrary sockets
[ Upstream commit
17cfe79a65f98abe535261856c5aef14f306dff7 ]
syzkaller found an issue caused by lack of sufficient checks
in l2tp_tunnel_create()
RAW sockets can not be considered as UDP ones for instance.
In another patch, we shall replace all pr_err() by less intrusive
pr_debug() so that syzkaller can find other bugs faster.
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Acked-by: James Chapman <jchapman@katalix.com>
==================================================================
BUG: KASAN: slab-out-of-bounds in setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
dst_release: dst:
00000000d53d0d0f refcnt:-1
Write of size 1 at addr
ffff8801d013b798 by task syz-executor3/6242
CPU: 1 PID: 6242 Comm: syz-executor3 Not tainted 4.16.0-rc2+ #253
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x24d lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23b/0x360 mm/kasan/report.c:412
__asan_report_store1_noabort+0x17/0x20 mm/kasan/report.c:435
setup_udp_tunnel_sock+0x3ee/0x5f0 net/ipv4/udp_tunnel.c:69
l2tp_tunnel_create+0x1354/0x17f0 net/l2tp/l2tp_core.c:1596
pppol2tp_connect+0x14b1/0x1dd0 net/l2tp/l2tp_ppp.c:707
SYSC_connect+0x213/0x4a0 net/socket.c:1640
SyS_connect+0x24/0x30 net/socket.c:1621
do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x42/0xb7
Fixes:
fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lorenzo Bianconi [Thu, 8 Mar 2018 16:00:02 +0000 (17:00 +0100)]
ipv6: fix access to non-linear packet in ndisc_fill_redirect_hdr_option()
[ Upstream commit
9f62c15f28b0d1d746734666d88a79f08ba1e43e ]
Fix the following slab-out-of-bounds kasan report in
ndisc_fill_redirect_hdr_option when the incoming ipv6 packet is not
linear and the accessed data are not in the linear data region of orig_skb.
[ 1503.122508] ==================================================================
[ 1503.122832] BUG: KASAN: slab-out-of-bounds in ndisc_send_redirect+0x94e/0x990
[ 1503.123036] Read of size 1184 at addr
ffff8800298ab6b0 by task netperf/1932
[ 1503.123220] CPU: 0 PID: 1932 Comm: netperf Not tainted 4.16.0-rc2+ #124
[ 1503.123347] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-2.fc27 04/01/2014
[ 1503.123527] Call Trace:
[ 1503.123579] <IRQ>
[ 1503.123638] print_address_description+0x6e/0x280
[ 1503.123849] kasan_report+0x233/0x350
[ 1503.123946] memcpy+0x1f/0x50
[ 1503.124037] ndisc_send_redirect+0x94e/0x990
[ 1503.125150] ip6_forward+0x1242/0x13b0
[...]
[ 1503.153890] Allocated by task 1932:
[ 1503.153982] kasan_kmalloc+0x9f/0xd0
[ 1503.154074] __kmalloc_track_caller+0xb5/0x160
[ 1503.154198] __kmalloc_reserve.isra.41+0x24/0x70
[ 1503.154324] __alloc_skb+0x130/0x3e0
[ 1503.154415] sctp_packet_transmit+0x21a/0x1810
[ 1503.154533] sctp_outq_flush+0xc14/0x1db0
[ 1503.154624] sctp_do_sm+0x34e/0x2740
[ 1503.154715] sctp_primitive_SEND+0x57/0x70
[ 1503.154807] sctp_sendmsg+0xaa6/0x1b10
[ 1503.154897] sock_sendmsg+0x68/0x80
[ 1503.154987] ___sys_sendmsg+0x431/0x4b0
[ 1503.155078] __sys_sendmsg+0xa4/0x130
[ 1503.155168] do_syscall_64+0x171/0x3f0
[ 1503.155259] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.155436] Freed by task 1932:
[ 1503.155527] __kasan_slab_free+0x134/0x180
[ 1503.155618] kfree+0xbc/0x180
[ 1503.155709] skb_release_data+0x27f/0x2c0
[ 1503.155800] consume_skb+0x94/0xe0
[ 1503.155889] sctp_chunk_put+0x1aa/0x1f0
[ 1503.155979] sctp_inq_pop+0x2f8/0x6e0
[ 1503.156070] sctp_assoc_bh_rcv+0x6a/0x230
[ 1503.156164] sctp_inq_push+0x117/0x150
[ 1503.156255] sctp_backlog_rcv+0xdf/0x4a0
[ 1503.156346] __release_sock+0x142/0x250
[ 1503.156436] release_sock+0x80/0x180
[ 1503.156526] sctp_sendmsg+0xbb0/0x1b10
[ 1503.156617] sock_sendmsg+0x68/0x80
[ 1503.156708] ___sys_sendmsg+0x431/0x4b0
[ 1503.156799] __sys_sendmsg+0xa4/0x130
[ 1503.156889] do_syscall_64+0x171/0x3f0
[ 1503.156980] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 1503.157158] The buggy address belongs to the object at
ffff8800298ab600
which belongs to the cache kmalloc-1024 of size 1024
[ 1503.157444] The buggy address is located 176 bytes inside of
1024-byte region [
ffff8800298ab600,
ffff8800298aba00)
[ 1503.157702] The buggy address belongs to the page:
[ 1503.157820] page:
ffffea0000a62a00 count:1 mapcount:0 mapping:
0000000000000000 index:0x0 compound_mapcount: 0
[ 1503.158053] flags: 0x4000000000008100(slab|head)
[ 1503.158171] raw:
4000000000008100 0000000000000000 0000000000000000 00000001800e000e
[ 1503.158350] raw:
dead000000000100 dead000000000200 ffff880036002600 0000000000000000
[ 1503.158523] page dumped because: kasan: bad access detected
[ 1503.158698] Memory state around the buggy address:
[ 1503.158816]
ffff8800298ab900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1503.158988]
ffff8800298ab980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1503.159165] >
ffff8800298aba00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 1503.159338] ^
[ 1503.159436]
ffff8800298aba80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1503.159610]
ffff8800298abb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 1503.159785] ==================================================================
[ 1503.159964] Disabling lock debugging due to kernel taint
The test scenario to trigger the issue consists of 4 devices:
- H0: data sender, connected to LAN0
- H1: data receiver, connected to LAN1
- GW0 and GW1: routers between LAN0 and LAN1. Both of them have an
ethernet connection on LAN0 and LAN1
On H{0,1} set GW0 as default gateway while on GW0 set GW1 as next hop for
data from LAN0 to LAN1.
Moreover create an ip6ip6 tunnel between H0 and H1 and send 3 concurrent
data streams (TCP/UDP/SCTP) from H0 to H1 through ip6ip6 tunnel (send
buffer size is set to 16K). While data streams are active flush the route
cache on HA multiple times.
I have not been able to identify a given commit that introduced the issue
since, using the reproducer described above, the kasan report has been
triggered from 4.14 and I have not gone back further.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexey Kodanev [Tue, 6 Mar 2018 19:57:01 +0000 (22:57 +0300)]
dccp: check sk for closed state in dccp_sendmsg()
[ Upstream commit
67f93df79aeefc3add4e4b31a752600f834236e2 ]
dccp_disconnect() sets 'dp->dccps_hc_tx_ccid' tx handler to NULL,
therefore if DCCP socket is disconnected and dccp_sendmsg() is
called after it, it will cause a NULL pointer dereference in
dccp_write_xmit().
This crash and the reproducer was reported by syzbot. Looks like
it is reproduced if commit
69c64866ce07 ("dccp: CVE-2017-8824:
use-after-free in DCCP code") is applied.
Reported-by: syzbot+f99ab3887ab65d70f816@syzkaller.appspotmail.com
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kirill Tkhai [Tue, 6 Mar 2018 15:46:39 +0000 (18:46 +0300)]
net: Fix hlist corruptions in inet_evict_bucket()
[ Upstream commit
a560002437d3646dafccecb1bf32d1685112ddda ]
inet_evict_bucket() iterates global list, and
several tasks may call it in parallel. All of
them hash the same fq->list_evictor to different
lists, which leads to list corruption.
This patch makes fq be hashed to expired list
only if this has not been made yet by another
task. Since inet_frag_alloc() allocates fq
using kmem_cache_zalloc(), we may rely on
list_evictor is initially unhashed.
The problem seems to exist before async
pernet_operations, as there was possible to have
exit method to be executed in parallel with
inet_frags::frags_work, so I add two Fixes tags.
This also may go to stable.
Fixes:
d1fe19444d82 "inet: frag: don't re-use chainlist for evictor"
Fixes:
f84c6821aa54 "net: Convert pernet_subsys, registered from inet_init()"
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Wed, 14 Mar 2018 16:04:16 +0000 (09:04 -0700)]
net: use skb_to_full_sk() in skb_update_prio()
[ Upstream commit
4dcb31d4649df36297296b819437709f5407059c ]
Andrei Vagin reported a KASAN: slab-out-of-bounds error in
skb_update_prio()
Since SYNACK might be attached to a request socket, we need to
get back to the listener socket.
Since this listener is manipulated without locks, add const
qualifiers to sock_cgroup_prioidx() so that the const can also
be used in skb_update_prio()
Also add the const qualifier to sock_cgroup_classid() for consistency.
Fixes:
ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 5 Mar 2018 16:51:03 +0000 (08:51 -0800)]
ieee802154: 6lowpan: fix possible NULL deref in lowpan_device_event()
[ Upstream commit
ca0edb131bdf1e6beaeb2b8289fd6b374b74147d ]
A tun device type can trivially be set to arbitrary value using
TUNSETLINK ioctl().
Therefore, lowpan_device_event() must really check that ieee802154_ptr
is not NULL.
Fixes:
2c88b5283f60d ("ieee802154: 6lowpan: remove check on null")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Alexander Aring <alex.aring@gmail.com>
Cc: Stefan Schmidt <stefan@osg.samsung.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Acked-by: Stefan Schmidt <stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexey Kodanev [Mon, 5 Mar 2018 17:52:54 +0000 (20:52 +0300)]
sch_netem: fix skb leak in netem_enqueue()
[ Upstream commit
35d889d10b649fda66121891ec05eca88150059d ]
When we exceed current packets limit and we have more than one
segment in the list returned by skb_gso_segment(), netem drops
only the first one, skipping the rest, hence kmemleak reports:
unreferenced object 0xffff880b5d23b600 (size 1024):
comm "softirq", pid 0, jiffies
4384527763 (age 2770.629s)
hex dump (first 32 bytes):
00 80 23 5d 0b 88 ff ff 00 00 00 00 00 00 00 00 ..#]............
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<
00000000d8a19b9d>] __alloc_skb+0xc9/0x520
[<
000000001709b32f>] skb_segment+0x8c8/0x3710
[<
00000000c7b9bb88>] tcp_gso_segment+0x331/0x1830
[<
00000000c921cba1>] inet_gso_segment+0x476/0x1370
[<
000000008b762dd4>] skb_mac_gso_segment+0x1f9/0x510
[<
000000002182660a>] __skb_gso_segment+0x1dd/0x620
[<
00000000412651b9>] netem_enqueue+0x1536/0x2590 [sch_netem]
[<
0000000005d3b2a9>] __dev_queue_xmit+0x1167/0x2120
[<
00000000fc5f7327>] ip_finish_output2+0x998/0xf00
[<
00000000d309e9d3>] ip_output+0x1aa/0x2c0
[<
000000007ecbd3a4>] tcp_transmit_skb+0x18db/0x3670
[<
0000000042d2a45f>] tcp_write_xmit+0x4d4/0x58c0
[<
0000000056a44199>] tcp_tasklet_func+0x3d9/0x540
[<
0000000013d06d02>] tasklet_action+0x1ca/0x250
[<
00000000fcde0b8b>] __do_softirq+0x1b4/0x5a3
[<
00000000e7ed027c>] irq_exit+0x1e2/0x210
Fix it by adding the rest of the segments, if any, to skb 'to_free'
list. Add new __qdisc_drop_all() and qdisc_drop_all() functions
because they can be useful in the future if we need to drop segmented
GSO packets in other places.
Fixes:
6071bd1aa13e ("netem: Segment GSO packets on enqueue")
Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tom Herbert [Tue, 13 Mar 2018 19:01:43 +0000 (12:01 -0700)]
kcm: lock lower socket in kcm_attach
[ Upstream commit
2cc683e88c0c993ac3721d9b702cb0630abe2879 ]
Need to lock lower socket in order to provide mutual exclusion
with kcm_unattach.
v2: Add Reported-by for syzbot
Fixes:
ab7ac4eb9832e32a09f4e804 ("kcm: Kernel Connection Multiplexor module")
Reported-by: syzbot+ea75c0ffcd353d32515f064aaebefc5279e6161e@syzkaller.appspotmail.com
Signed-off-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Blakey [Sun, 4 Mar 2018 15:29:48 +0000 (17:29 +0200)]
rhashtable: Fix rhlist duplicates insertion
[ Upstream commit
d3dcf8eb615537526bd42ff27a081d46d337816e ]
When inserting duplicate objects (those with the same key),
current rhlist implementation messes up the chain pointers by
updating the bucket pointer instead of prev next pointer to the
newly inserted node. This causes missing elements on removal and
travesal.
Fix that by properly updating pprev pointer to point to
the correct rhash_head next pointer.
Issue: 1241076
Change-Id: I86b2c140bcb4aeb10b70a72a267ff590bb2b17e7
Fixes:
ca26893f05e8 ('rhashtable: Add rhlist interface')
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Guillaume Nault [Tue, 20 Mar 2018 15:49:26 +0000 (16:49 +0100)]
ppp: avoid loop in xmit recursion detection code
[ Upstream commit
6d066734e9f09cdea4a3b9cb76136db3f29cfb02 ]
We already detect situations where a PPP channel sends packets back to
its upper PPP device. While this is enough to avoid deadlocking on xmit
locks, this doesn't prevent packets from looping between the channel
and the unit.
The problem is that ppp_start_xmit() enqueues packets in ppp->file.xq
before checking for xmit recursion. Therefore, __ppp_xmit_process()
might dequeue a packet from ppp->file.xq and send it on the channel
which, in turn, loops it back on the unit. Then ppp_start_xmit()
queues the packet back to ppp->file.xq and __ppp_xmit_process() picks
it up and sends it again through the channel. Therefore, the packet
will loop between __ppp_xmit_process() and ppp_start_xmit() until some
other part of the xmit path drops it.
For L2TP, we rapidly fill the skb's headroom and pppol2tp_xmit() drops
the packet after a few iterations. But PPTP reallocates the headroom
if necessary, letting the loop run and exhaust the machine resources
(as reported in https://bugzilla.kernel.org/show_bug.cgi?id=199109).
Fix this by letting __ppp_xmit_process() enqueue the skb to
ppp->file.xq, so that we can check for recursion before adding it to
the queue. Now ppp_xmit_process() can drop the packet when recursion is
detected.
__ppp_channel_push() is a bit special. It calls __ppp_xmit_process()
without having any actual packet to send. This is used by
ppp_output_wakeup() to re-enable transmission on the parent unit (for
implementations like ppp_async.c, where the .start_xmit() function
might not consume the skb, leaving it in ppp->xmit_pending and
disabling transmission).
Therefore, __ppp_xmit_process() needs to handle the case where skb is
NULL, dequeuing as many packets as possible from ppp->file.xq.
Reported-by: xu heng <xuheng333@zoho.com>
Fixes:
55454a565836 ("ppp: avoid dealock on recursive xmit")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Mashak [Mon, 12 Mar 2018 20:20:58 +0000 (16:20 -0400)]
net sched actions: return explicit error when tunnel_key mode is not specified
[ Upstream commit
51d4740f88affd85d49c04e3c9cd129c0e33bcb9 ]
If set/unset mode of the tunnel_key action is not provided, ->init() still
returns 0, and the caller proceeds with bogus 'struct tc_action *' object,
this results in crash:
% tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1
[ 35.805515] general protection fault: 0000 [#1] SMP PTI
[ 35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64
crypto_simd glue_helper cryptd serio_raw
[ 35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286
[ 35.808929] RIP: 0010:tcf_action_init+0x90/0x190
[ 35.809457] RSP: 0018:
ffffb8edc068b9a0 EFLAGS:
00010206
[ 35.810053] RAX:
1320c000000a0003 RBX:
0000000000000001 RCX:
0000000000000000
[ 35.810866] RDX:
0000000000000070 RSI:
0000000000007965 RDI:
ffffb8edc068b910
[ 35.811660] RBP:
ffffb8edc068b9d0 R08:
0000000000000000 R09:
ffffb8edc068b808
[ 35.812463] R10:
ffffffffc02bf040 R11:
0000000000000040 R12:
ffffb8edc068bb38
[ 35.813235] R13:
0000000000000000 R14:
0000000000000000 R15:
ffffb8edc068b910
[ 35.814006] FS:
00007f3d0d8556c0(0000) GS:
ffff91d1dbc40000(0000)
knlGS:
0000000000000000
[ 35.814881] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 35.815540] CR2:
000000000043f720 CR3:
0000000019248001 CR4:
00000000001606a0
[ 35.816457] Call Trace:
[ 35.817158] tc_ctl_action+0x11a/0x220
[ 35.817795] rtnetlink_rcv_msg+0x23d/0x2e0
[ 35.818457] ? __slab_alloc+0x1c/0x30
[ 35.819079] ? __kmalloc_node_track_caller+0xb1/0x2b0
[ 35.819544] ? rtnl_calcit.isra.30+0xe0/0xe0
[ 35.820231] netlink_rcv_skb+0xce/0x100
[ 35.820744] netlink_unicast+0x164/0x220
[ 35.821500] netlink_sendmsg+0x293/0x370
[ 35.822040] sock_sendmsg+0x30/0x40
[ 35.822508] ___sys_sendmsg+0x2c5/0x2e0
[ 35.823149] ? pagecache_get_page+0x27/0x220
[ 35.823714] ? filemap_fault+0xa2/0x640
[ 35.824423] ? page_add_file_rmap+0x108/0x200
[ 35.825065] ? alloc_set_pte+0x2aa/0x530
[ 35.825585] ? finish_fault+0x4e/0x70
[ 35.826140] ? __handle_mm_fault+0xbc1/0x10d0
[ 35.826723] ? __sys_sendmsg+0x41/0x70
[ 35.827230] __sys_sendmsg+0x41/0x70
[ 35.827710] do_syscall_64+0x68/0x120
[ 35.828195] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 35.828859] RIP: 0033:0x7f3d0ca4da67
[ 35.829331] RSP: 002b:
00007ffc9f284338 EFLAGS:
00000246 ORIG_RAX:
000000000000002e
[ 35.830304] RAX:
ffffffffffffffda RBX:
00007ffc9f284460 RCX:
00007f3d0ca4da67
[ 35.831247] RDX:
0000000000000000 RSI:
00007ffc9f2843b0 RDI:
0000000000000003
[ 35.832167] RBP:
000000005aa6a7a9 R08:
0000000000000001 R09:
0000000000000000
[ 35.833075] R10:
00000000000005f1 R11:
0000000000000246 R12:
0000000000000000
[ 35.833997] R13:
00007ffc9f2884c0 R14:
0000000000000001 R15:
0000000000674640
[ 35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2
fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48
8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c
[ 35.837442] RIP: tcf_action_init+0x90/0x190 RSP:
ffffb8edc068b9a0
[ 35.838291] ---[ end trace
a095c06ee4b97a26 ]---
Fixes:
d0f6dd8a914f ("net/sched: Introduce act_tunnel_key")
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Brad Mouring [Thu, 8 Mar 2018 22:23:03 +0000 (16:23 -0600)]
net: phy: Tell caller result of phy_change()
[ Upstream commit
a2c054a896b8ac794ddcfc7c92e2dc7ec4ed4ed5 ]
In
664fcf123a30e (net: phy: Threaded interrupts allow some simplification)
the phy_interrupt system was changed to use a traditional threaded
interrupt scheme instead of a workqueue approach.
With this change, the phy status check moved into phy_change, which
did not report back to the caller whether or not the interrupt was
handled. This means that, in the case of a shared phy interrupt,
only the first phydev's interrupt registers are checked (since
phy_interrupt() would always return IRQ_HANDLED). This leads to
interrupt storms when it is a secondary device that's actually the
interrupt source.
Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ido Schimmel [Thu, 15 Mar 2018 12:49:56 +0000 (14:49 +0200)]
mlxsw: spectrum_buffers: Set a minimum quota for CPU port traffic
[ Upstream commit
bcdd5de80a2275f7879dc278bfc747f1caf94442 ]
In commit
9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped
from any PG") I fixed a problem where packets could not be trapped to
the CPU due to exceeded shared buffer quotas. The mentioned commit
explains the problem in detail.
The problem was fixed by assigning a minimum quota for the CPU port and
the traffic class used for scheduling traffic to the CPU.
However, commit
117b0dad2d54 ("mlxsw: Create a different trap group list
for each device") assigned different traffic classes to different
packet types and rendered the fix useless.
Fix the problem by assigning a minimum quota for the CPU port and all
the traffic classes that are currently in use.
Fixes:
117b0dad2d54 ("mlxsw: Create a different trap group list for each device")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Eddie Shklaer <eddies@mellanox.com>
Tested-by: Eddie Shklaer <eddies@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Lebrun [Tue, 20 Mar 2018 14:44:55 +0000 (14:44 +0000)]
ipv6: sr: fix scheduling in RCU when creating seg6 lwtunnel state
[ Upstream commit
191f86ca8ef27f7a492fd1c03620498c6e94f0ac ]
The seg6_build_state() function is called with RCU read lock held,
so we cannot use GFP_KERNEL. This patch uses GFP_ATOMIC instead.
[ 92.770271] =============================
[ 92.770628] WARNING: suspicious RCU usage
[ 92.770921] 4.16.0-rc4+ #12 Not tainted
[ 92.771277] -----------------------------
[ 92.771585] ./include/linux/rcupdate.h:302 Illegal context switch in RCU read-side critical section!
[ 92.772279]
[ 92.772279] other info that might help us debug this:
[ 92.772279]
[ 92.773067]
[ 92.773067] rcu_scheduler_active = 2, debug_locks = 1
[ 92.773514] 2 locks held by ip/2413:
[ 92.773765] #0: (rtnl_mutex){+.+.}, at: [<
00000000e5461720>] rtnetlink_rcv_msg+0x441/0x4d0
[ 92.774377] #1: (rcu_read_lock){....}, at: [<
00000000df4f161e>] lwtunnel_build_state+0x59/0x210
[ 92.775065]
[ 92.775065] stack backtrace:
[ 92.775371] CPU: 0 PID: 2413 Comm: ip Not tainted 4.16.0-rc4+ #12
[ 92.775791] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014
[ 92.776608] Call Trace:
[ 92.776852] dump_stack+0x7d/0xbc
[ 92.777130] __schedule+0x133/0xf00
[ 92.777393] ? unwind_get_return_address_ptr+0x50/0x50
[ 92.777783] ? __sched_text_start+0x8/0x8
[ 92.778073] ? rcu_is_watching+0x19/0x30
[ 92.778383] ? kernel_text_address+0x49/0x60
[ 92.778800] ? __kernel_text_address+0x9/0x30
[ 92.779241] ? unwind_get_return_address+0x29/0x40
[ 92.779727] ? pcpu_alloc+0x102/0x8f0
[ 92.780101] _cond_resched+0x23/0x50
[ 92.780459] __mutex_lock+0xbd/0xad0
[ 92.780818] ? pcpu_alloc+0x102/0x8f0
[ 92.781194] ? seg6_build_state+0x11d/0x240
[ 92.781611] ? save_stack+0x9b/0xb0
[ 92.781965] ? __ww_mutex_wakeup_for_backoff+0xf0/0xf0
[ 92.782480] ? seg6_build_state+0x11d/0x240
[ 92.782925] ? lwtunnel_build_state+0x1bd/0x210
[ 92.783393] ? ip6_route_info_create+0x687/0x1640
[ 92.783846] ? ip6_route_add+0x74/0x110
[ 92.784236] ? inet6_rtm_newroute+0x8a/0xd0
Fixes:
6c8702c60b886 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Lebrun [Tue, 20 Mar 2018 14:44:56 +0000 (14:44 +0000)]
ipv6: sr: fix NULL pointer dereference when setting encap source address
[ Upstream commit
8936ef7604c11b5d701580d779e0f5684abc7b68 ]
When using seg6 in encap mode, we call ipv6_dev_get_saddr() to set the
source address of the outer IPv6 header, in case none was specified.
Using skb->dev can lead to BUG() when it is in an inconsistent state.
This patch uses the net_device attached to the skb's dst instead.
[940807.667429] BUG: unable to handle kernel NULL pointer dereference at
000000000000047c
[940807.762427] IP: ipv6_dev_get_saddr+0x8b/0x1d0
[940807.815725] PGD 0 P4D 0
[940807.847173] Oops: 0000 [#1] SMP PTI
[940807.890073] Modules linked in:
[940807.927765] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.16.0-rc1-seg6bpf+ #2
[940808.028988] Hardware name: HP ProLiant DL120 G6/ProLiant DL120 G6, BIOS O26 09/06/2010
[940808.128128] RIP: 0010:ipv6_dev_get_saddr+0x8b/0x1d0
[940808.187667] RSP: 0018:
ffff88043fd836b0 EFLAGS:
00010206
[940808.251366] RAX:
0000000000000005 RBX:
ffff88042cb1c860 RCX:
00000000000000fe
[940808.338025] RDX:
00000000000002c0 RSI:
ffff88042cb1c860 RDI:
0000000000004500
[940808.424683] RBP:
ffff88043fd83740 R08:
0000000000000000 R09:
ffffffffffffffff
[940808.511342] R10:
0000000000000040 R11:
0000000000000000 R12:
ffff88042cb1c850
[940808.598012] R13:
ffffffff8208e380 R14:
ffff88042ac8da00 R15:
0000000000000002
[940808.684675] FS:
0000000000000000(0000) GS:
ffff88043fd80000(0000) knlGS:
0000000000000000
[940808.783036] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[940808.852975] CR2:
000000000000047c CR3:
00000004255fe000 CR4:
00000000000006e0
[940808.939634] Call Trace:
[940808.970041] <IRQ>
[940808.995250] ? ip6t_do_table+0x265/0x640
[940809.043341] seg6_do_srh_encap+0x28f/0x300
[940809.093516] ? seg6_do_srh+0x1a0/0x210
[940809.139528] seg6_do_srh+0x1a0/0x210
[940809.183462] seg6_output+0x28/0x1e0
[940809.226358] lwtunnel_output+0x3f/0x70
[940809.272370] ip6_xmit+0x2b8/0x530
[940809.313185] ? ac6_proc_exit+0x20/0x20
[940809.359197] inet6_csk_xmit+0x7d/0xc0
[940809.404173] tcp_transmit_skb+0x548/0x9a0
[940809.453304] __tcp_retransmit_skb+0x1a8/0x7a0
[940809.506603] ? ip6_default_advmss+0x40/0x40
[940809.557824] ? tcp_current_mss+0x24/0x90
[940809.605925] tcp_retransmit_skb+0xd/0x80
[940809.654016] tcp_xmit_retransmit_queue.part.17+0xf9/0x210
[940809.719797] tcp_ack+0xa47/0x1110
[940809.760612] tcp_rcv_established+0x13c/0x570
[940809.812865] tcp_v6_do_rcv+0x151/0x3d0
[940809.858879] tcp_v6_rcv+0xa5c/0xb10
[940809.901770] ? seg6_output+0xdd/0x1e0
[940809.946745] ip6_input_finish+0xbb/0x460
[940809.994837] ip6_input+0x74/0x80
[940810.034612] ? ip6_rcv_finish+0xb0/0xb0
[940810.081663] ipv6_rcv+0x31c/0x4c0
...
Fixes:
6c8702c60b886 ("ipv6: sr: add support for SRH encapsulation and injection with lwtunnels")
Reported-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefano Brivio [Mon, 19 Mar 2018 10:24:58 +0000 (11:24 +0100)]
ipv6: old_dport should be a __be16 in __ip6_datagram_connect()
[ Upstream commit
5f2fb802eee1df0810b47ea251942fe3fd36589a ]
Fixes:
2f987a76a977 ("net: ipv6: keep sk status consistent after datagram connect failure")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paolo Abeni [Mon, 12 Mar 2018 13:54:23 +0000 (14:54 +0100)]
net: ipv6: keep sk status consistent after datagram connect failure
[ Upstream commit
2f987a76a97773beafbc615b9c4d8fe79129a7f4 ]
On unsuccesful ip6_datagram_connect(), if the failure is caused by
ip6_datagram_dst_update(), the sk peer information are cleared, but
the sk->sk_state is preserved.
If the socket was already in an established status, the overall sk
status is inconsistent and fouls later checks in datagram code.
Fix this saving the old peer information and restoring them in
case of failure. This also aligns ipv6 datagram connect() behavior
with ipv4.
v1 -> v2:
- added missing Fixes tag
Fixes:
85cb73ff9b74 ("net: ipv6: reset daddr and dport in sk if connect() fails")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shannon Nelson [Fri, 9 Mar 2018 00:17:23 +0000 (16:17 -0800)]
macvlan: filter out unsupported feature flags
[ Upstream commit
13fbcc8dc573482dd3f27568257fd7087f8935f4 ]
Adding a macvlan device on top of a lowerdev that supports
the xfrm offloads fails with a new regression:
# ip link add link ens1f0 mv0 type macvlan
RTNETLINK answers: Operation not permitted
Tracing down the failure shows that the macvlan device inherits
the NETIF_F_HW_ESP and NETIF_F_HW_ESP_TX_CSUM feature flags
from the lowerdev, but with no dev->xfrmdev_ops API filled
in, it doesn't actually support xfrm. When the request is
made to add the new macvlan device, the XFRM listener for
NETDEV_REGISTER calls xfrm_api_check() which fails the new
registration because dev->xfrmdev_ops is NULL.
The macvlan creation succeeds when we filter out the ESP
feature flags in macvlan_fix_features(), so let's filter them
out like we're already filtering out ~NETIF_F_NETNS_LOCAL.
When XFRM support is added in the future, we can add the flags
into MACVLAN_FEATURES.
This same problem could crop up in the future with any other
new feature flags, so let's filter out any flags that aren't
defined as supported in macvlan.
Fixes:
d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Reported-by: Alexey Kodanev <alexey.kodanev@oracle.com>
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arkadi Sharshevsky [Sun, 18 Mar 2018 15:37:22 +0000 (17:37 +0200)]
devlink: Remove redundant free on error path
[ Upstream commit
7fe4d6dcbcb43fe0282d4213fc52be178bb30e91 ]
The current code performs unneeded free. Remove the redundant skb freeing
during the error path.
Fixes:
1555d204e743 ("devlink: Support for pipeline debug (dpipe)")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Grygorii Strashko [Fri, 16 Mar 2018 22:08:35 +0000 (17:08 -0500)]
net: phy: relax error checking when creating sysfs link netdev->phydev
[ Upstream commit
4414b3ed74be0e205e04e12cd83542a727d88255 ]
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per
one netdevice, as result such drivers will produce warning during system
boot and fail to connect second phy to netdevice when PHYLIB framework
will try to create sysfs link netdev->phydev for second PHY
in phy_attach_direct(), because sysfs link with the same name has been
created already for the first PHY. As result, second CPSW external
port will became unusable.
Fix it by relaxing error checking when PHYLIB framework is creating sysfs
link netdev->phydev in phy_attach_direct(), suppressing warning by using
sysfs_create_link_nowarn() and adding error message instead.
After this change links (phy->netdev and netdev->phy) creation failure is not
fatal any more and system can continue working, which fixes TI CPSW issue.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes:
a3995460491d ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Grygorii Strashko [Fri, 16 Mar 2018 22:08:34 +0000 (17:08 -0500)]
sysfs: symlink: export sysfs_create_link_nowarn()
[ Upstream commit
2399ac42e762ab25c58420e25359b2921afdc55f ]
The sysfs_create_link_nowarn() is going to be used in phylib framework in
subsequent patch which can be built as module. Hence, export
sysfs_create_link_nowarn() to avoid build errors.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes:
a3995460491d ("net: phy: Relax error checking on sysfs_create_link()")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Kalderon [Wed, 14 Mar 2018 12:49:28 +0000 (14:49 +0200)]
qed: Fix non TCP packets should be dropped on iWARP ll2 connection
[ Upstream commit
16da09047d3fb991dc48af41f6d255fd578e8ca2 ]
FW workaround. The iWARP LL2 connection did not expect TCP packets
to arrive on it's connection. The fix drops any non-tcp packets
Fixes b5c29ca ("qed: iWARP CM - setup a ll2 connection for handling
SYN packets")
Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Soheil Hassas Yeganeh [Tue, 6 Mar 2018 22:15:12 +0000 (17:15 -0500)]
tcp: purge write queue upon aborting the connection
[ Upstream commit
e05836ac07c77dd90377f8c8140bce2a44af5fe7 ]
When the connection is aborted, there is no point in
keeping the packets on the write queue until the connection
is closed.
Similar to
a27fd7a8ed38 ('tcp: purge write queue upon RST'),
this is essential for a correct MSG_ZEROCOPY implementation,
because userspace cannot call close(fd) before receiving
zerocopy signals even when the connection is aborted.
Fixes:
f214f915e7db ("tcp: enable MSG_ZEROCOPY")
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Soheil Hassas Yeganeh [Thu, 15 Mar 2018 16:09:13 +0000 (12:09 -0400)]
tcp: reset sk_send_head in tcp_write_queue_purge
tcp_write_queue_purge clears all the SKBs in the write queue
but does not reset the sk_send_head. As a result, we can have
a NULL pointer dereference anywhere that we use tcp_send_head
instead of the tcp_write_queue_tail.
For example, after
a27fd7a8ed38 (tcp: purge write queue upon RST),
we can purge the write queue on RST. Prior to
75c119afe14f (tcp: implement rb-tree based retransmit queue),
tcp_push will only check tcp_send_head and then accesses
tcp_write_queue_tail to send the actual SKB. As a result, it will
dereference a NULL pointer.
This has been reported twice for 4.14 where we don't have
75c119afe14f:
By Timofey Titovets:
[ 422.081094] BUG: unable to handle kernel NULL pointer dereference
at
0000000000000038
[ 422.081254] IP: tcp_push+0x42/0x110
[ 422.081314] PGD 0 P4D 0
[ 422.081364] Oops: 0002 [#1] SMP PTI
By Yongjian Xu:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000038
IP: tcp_push+0x48/0x120
PGD
80000007ff77b067 P4D
80000007ff77b067 PUD
7fd989067 PMD 0
Oops: 0002 [#18] SMP PTI
Modules linked in: tcp_diag inet_diag tcp_bbr sch_fq iTCO_wdt
iTCO_vendor_support pcspkr ixgbe mdio i2c_i801 lpc_ich joydev input_leds shpchp
e1000e igb dca ptp pps_core hwmon mei_me mei ipmi_si ipmi_msghandler sg ses
scsi_transport_sas enclosure ext4 jbd2 mbcache sd_mod ahci libahci megaraid_sas
wmi ast ttm dm_mirror dm_region_hash dm_log dm_mod dax
CPU: 6 PID: 14156 Comm: [ET_NET 6] Tainted: G D 4.14.26-1.el6.x86_64 #1
Hardware name: LENOVO ThinkServer RD440 /ThinkServer RD440, BIOS A0TS80A
09/22/2014
task:
ffff8807d78d8140 task.stack:
ffffc9000e944000
RIP: 0010:tcp_push+0x48/0x120
RSP: 0018:
ffffc9000e947a88 EFLAGS:
00010246
RAX:
00000000000005b4 RBX:
ffff880f7cce9c00 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
0000000000000040 RDI:
ffff8807d00f5000
RBP:
ffffc9000e947aa8 R08:
0000000000001c84 R09:
0000000000000000
R10:
ffff8807d00f5158 R11:
0000000000000000 R12:
ffff8807d00f5000
R13:
0000000000000020 R14:
00000000000256d4 R15:
0000000000000000
FS:
00007f5916de9700(0000) GS:
ffff88107fd00000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000000038 CR3:
00000007f8226004 CR4:
00000000001606e0
Call Trace:
tcp_sendmsg_locked+0x33d/0xe50
tcp_sendmsg+0x37/0x60
inet_sendmsg+0x39/0xc0
sock_sendmsg+0x49/0x60
sock_write_iter+0xb6/0x100
do_iter_readv_writev+0xec/0x130
? rw_verify_area+0x49/0xb0
do_iter_write+0x97/0xd0
vfs_writev+0x7e/0xe0
? __wake_up_common_lock+0x80/0xa0
? __fget_light+0x2c/0x70
? __do_page_fault+0x1e7/0x530
do_writev+0x60/0xf0
? inet_shutdown+0xac/0x110
SyS_writev+0x10/0x20
do_syscall_64+0x6f/0x140
? prepare_exit_to_usermode+0x8b/0xa0
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x3135ce0c57
RSP: 002b:
00007f5916de4b00 EFLAGS:
00000293 ORIG_RAX:
0000000000000014
RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
0000003135ce0c57
RDX:
0000000000000002 RSI:
00007f5916de4b90 RDI:
000000000000606f
RBP:
0000000000000000 R08:
0000000000000000 R09:
00007f5916de8c38
R10:
0000000000000000 R11:
0000000000000293 R12:
00000000000464cc
R13:
00007f5916de8c30 R14:
00007f58d8bef080 R15:
0000000000000002
Code: 48 8b 97 60 01 00 00 4c 8d 97 58 01 00 00 41 b9 00 00 00 00 41 89 f3 4c 39
d2 49 0f 44 d1 41 81 e3 00 80 00 00 0f 85 b0 00 00 00 <80> 4a 38 08 44 8b 8f 74
06 00 00 44 89 8f 7c 06 00 00 83 e6 01
RIP: tcp_push+0x48/0x120 RSP:
ffffc9000e947a88
CR2:
0000000000000038
---[ end trace
8d545c2e93515549 ]---
Fixes:
a27fd7a8ed38 (tcp: purge write queue upon RST)
Reported-by: Timofey Titovets <nefelim4ag@gmail.com>
Reported-by: Yongjian Xu <yongjianchn@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Tested-by: Yongjian Xu <yongjianchn@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 28 Mar 2018 16:24:51 +0000 (18:24 +0200)]
Linux 4.14.31
Daniel Borkmann [Wed, 7 Mar 2018 21:10:01 +0000 (22:10 +0100)]
bpf, x64: increase number of passes
commit
6007b080d2e2adb7af22bf29165f0594ea12b34c upstream.
In Cilium some of the main programs we run today are hitting 9 passes
on x64's JIT compiler, and we've had cases already where we surpassed
the limit where the JIT then punts the program to the interpreter
instead, leading to insertion failures due to CONFIG_BPF_JIT_ALWAYS_ON
or insertion failures due to the prog array owner being JITed but the
program to insert not (both must have the same JITed/non-JITed property).
One concrete case the program image shrunk from 12,767 bytes down to
10,288 bytes where the image converged after 16 steps. I've measured
that this took 340us in the JIT until it converges on my i7-6600U. Thus,
increase the original limit we had from day one where the JIT covered
cBPF only back then before we run into the case (as similar with the
complexity limit) where we trip over this and hit program rejections.
Also add a cond_resched() into the compilation loop, the JIT process
runs without any locks and may sleep anyway.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chenbo Feng [Tue, 20 Mar 2018 00:57:27 +0000 (17:57 -0700)]
bpf: skip unnecessary capability check
commit
0fa4fe85f4724fff89b09741c437cbee9cf8b008 upstream.
The current check statement in BPF syscall will do a capability check
for CAP_SYS_ADMIN before checking sysctl_unprivileged_bpf_disabled. This
code path will trigger unnecessary security hooks on capability checking
and cause false alarms on unprivileged process trying to get CAP_SYS_ADMIN
access. This can be resolved by simply switch the order of the statement
and CAP_SYS_ADMIN is not required anyway if unprivileged bpf syscall is
allowed.
Signed-off-by: Chenbo Feng <fengc@google.com>
Acked-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 21 Mar 2018 00:18:24 +0000 (01:18 +0100)]
kbuild: disable clang's default use of -fmerge-all-constants
commit
87e0d4f0f37fb0c8c4aeeac46fff5e957738df79 upstream.
Prasad reported that he has seen crashes in BPF subsystem with netd
on Android with arm64 in the form of (note, the taint is unrelated):
[ 4134.721483] Unable to handle kernel paging request at virtual address
800000001
[ 4134.820925] Mem abort info:
[ 4134.901283] Exception class = DABT (current EL), IL = 32 bits
[ 4135.016736] SET = 0, FnV = 0
[ 4135.119820] EA = 0, S1PTW = 0
[ 4135.201431] Data abort info:
[ 4135.301388] ISV = 0, ISS = 0x00000021
[ 4135.359599] CM = 0, WnR = 0
[ 4135.470873] user pgtable: 4k pages, 39-bit VAs, pgd =
ffffffe39b946000
[ 4135.499757] [
0000000800000001] *pgd=
0000000000000000, *pud=
0000000000000000
[ 4135.660725] Internal error: Oops:
96000021 [#1] PREEMPT SMP
[ 4135.674610] Modules linked in:
[ 4135.682883] CPU: 5 PID: 1260 Comm: netd Tainted: G S W 4.14.19+ #1
[ 4135.716188] task:
ffffffe39f4aa380 task.stack:
ffffff801d4e0000
[ 4135.731599] PC is at bpf_prog_add+0x20/0x68
[ 4135.741746] LR is at bpf_prog_inc+0x20/0x2c
[ 4135.751788] pc : [<
ffffff94ab7ad584>] lr : [<
ffffff94ab7ad638>] pstate:
60400145
[ 4135.769062] sp :
ffffff801d4e3ce0
[...]
[ 4136.258315] Process netd (pid: 1260, stack limit = 0xffffff801d4e0000)
[ 4136.273746] Call trace:
[...]
[ 4136.442494] 3ca0:
ffffff94ab7ad584 0000000060400145 ffffffe3a01bf8f8 0000000000000006
[ 4136.460936] 3cc0:
0000008000000000 ffffff94ab844204 ffffff801d4e3cf0 ffffff94ab7ad584
[ 4136.479241] [<
ffffff94ab7ad584>] bpf_prog_add+0x20/0x68
[ 4136.491767] [<
ffffff94ab7ad638>] bpf_prog_inc+0x20/0x2c
[ 4136.504536] [<
ffffff94ab7b5d08>] bpf_obj_get_user+0x204/0x22c
[ 4136.518746] [<
ffffff94ab7ade68>] SyS_bpf+0x5a8/0x1a88
Android's netd was basically pinning the uid cookie BPF map in BPF
fs (/sys/fs/bpf/traffic_cookie_uid_map) and later on retrieving it
again resulting in above panic. Issue is that the map was wrongly
identified as a prog! Above kernel was compiled with clang 4.0,
and it turns out that clang decided to merge the bpf_prog_iops and
bpf_map_iops into a single memory location, such that the two i_ops
could then not be distinguished anymore.
Reason for this miscompilation is that clang has the more aggressive
-fmerge-all-constants enabled by default. In fact, clang source code
has a comment about it in lib/AST/ExprConstant.cpp on why it is okay
to do so:
Pointers with different bases cannot represent the same object.
(Note that clang defaults to -fmerge-all-constants, which can
lead to inconsistent results for comparisons involving the address
of a constant; this generally doesn't matter in practice.)
The issue never appeared with gcc however, since gcc does not enable
-fmerge-all-constants by default and even *explicitly* states in
it's option description that using this flag results in non-conforming
behavior, quote from man gcc:
Languages like C or C++ require each variable, including multiple
instances of the same variable in recursive calls, to have distinct
locations, so using this option results in non-conforming behavior.
There are also various clang bug reports open on that matter [1],
where clang developers acknowledge the non-conforming behavior,
and refer to disabling it with -fno-merge-all-constants. But even
if this gets fixed in clang today, there are already users out there
that triggered this. Thus, fix this issue by explicitly adding
-fno-merge-all-constants to the kernel's Makefile to generically
disable this optimization, since potentially other places in the
kernel could subtly break as well.
Note, there is also a flag called -fmerge-constants (not supported
by clang), which is more conservative and only applies to strings
and it's enabled in gcc's -O/-O2/-O3/-Os optimization levels. In
gcc's code, the two flags -fmerge-{all-,}constants share the same
variable internally, so when disabling it via -fno-merge-all-constants,
then we really don't merge any const data (e.g. strings), and text
size increases with gcc (14,927,214 -> 14,942,646 for vmlinux.o).
$ gcc -fverbose-asm -O2 foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants foo.c -S -o foo.S
-> foo.S doesn't list -fmerge-constants under options enabled
$ gcc -fverbose-asm -O2 -fno-merge-all-constants -fmerge-constants foo.c -S -o foo.S
-> foo.S lists -fmerge-constants under options enabled
Thus, as a workaround we need to set both -fno-merge-all-constants
*and* -fmerge-constants in the Makefile in order for text size to
stay as is.
[1] https://bugs.llvm.org/show_bug.cgi?id=18538
Reported-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Richard Smith <richard-llvm@metafoo.co.uk>
Cc: Chandler Carruth <chandlerc@gmail.com>
Cc: linux-kernel@vger.kernel.org
Tested-by: Prasad Sodagudi <psodagud@codeaurora.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Hansen [Sat, 11 Nov 2017 00:12:31 +0000 (16:12 -0800)]
x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
commit
91c49c2deb96ffc3c461eaae70219d89224076b7 upstream.
'si_pkey' is now #defined to be the name of the new siginfo field that
protection keys uses. Rename it not to conflict.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171111001231.DFFC8285@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>