Eric Dumazet [Tue, 19 Mar 2019 12:45:35 +0000 (05:45 -0700)]
tcp: do not use ipv6 header for ipv4 flow
[ Upstream commit
89e4130939a20304f4059ab72179da81f5347528 ]
When a dual stack tcp listener accepts an ipv4 flow,
it should not attempt to use an ipv6 header or tcp_v6_iif() helper.
Fixes: 1397ed35f22d ("ipv6: add flowinfo for tcp6 pkt_options for all cases")
Fixes: df3687ffc665 ("ipv6: add the IPV6_FL_F_REFLECT flag to IPV6_FL_A_GET")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maxime Chevallier [Sat, 16 Mar 2019 13:41:30 +0000 (14:41 +0100)]
packets: Always register packet sk in the same order
[ Upstream commit
a4dc6a49156b1f8d6e17251ffda17c9e6a5db78a ]
When using fanouts with AF_PACKET, the demux functions such as
fanout_demux_cpu will return an index in the fanout socket array, which
corresponds to the selected socket.
The ordering of this array depends on the order the sockets were added
to a given fanout group, so for FANOUT_CPU this means sockets are bound
to cpus in the order they are configured, which is OK.
However, when stopping then restarting the interface these sockets are
bound to, the sockets are reassigned to the fanout group in the reverse
order, due to the fact that they were inserted at the head of the
interface's AF_PACKET socket list.
This means that traffic that was directed to the first socket in the
fanout group is now directed to the last one after an interface restart.
In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
socket that used to receive traffic from the last CPU after an interface
restart.
This commit introduces a helper to add a socket at the tail of a list,
then uses it to register AF_PACKET sockets.
Note that this changes the order in which sockets are listed in /proc and
with sock_diag.
Fixes: dc99f600698d ("packet: Add fanout support")
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 15 Mar 2019 17:41:14 +0000 (10:41 -0700)]
net: rose: fix a possible stack overflow
[ Upstream commit
e5dcc0c3223c45c94100f05f28d8ef814db3d82c ]
rose_write_internal() uses a temp buffer of 100 bytes, but a manual
inspection showed that given arbitrary input, rose_create_facilities()
can fill up to 110 bytes.
Lets use a tailroom of 256 bytes for peace of mind, and remove
the bounce buffer : we can simply allocate a big enough skb
and adjust its length as needed.
syzbot report :
BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:352 [inline]
BUG: KASAN: stack-out-of-bounds in rose_create_facilities net/rose/rose_subr.c:521 [inline]
BUG: KASAN: stack-out-of-bounds in rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
Write of size 7 at addr
ffff88808b1ffbef by task syz-executor.0/24854
CPU: 0 PID: 24854 Comm: syz-executor.0 Not tainted 5.0.0+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
check_memory_region_inline mm/kasan/generic.c:185 [inline]
check_memory_region+0x123/0x190 mm/kasan/generic.c:191
memcpy+0x38/0x50 mm/kasan/common.c:131
memcpy include/linux/string.h:352 [inline]
rose_create_facilities net/rose/rose_subr.c:521 [inline]
rose_write_internal+0x597/0x15d0 net/rose/rose_subr.c:116
rose_connect+0x7cb/0x1510 net/rose/af_rose.c:826
__sys_connect+0x266/0x330 net/socket.c:1685
__do_sys_connect net/socket.c:1696 [inline]
__se_sys_connect net/socket.c:1693 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:1693
do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458079
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:
00007f47b8d9dc78 EFLAGS:
00000246 ORIG_RAX:
000000000000002a
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
0000000000458079
RDX:
000000000000001c RSI:
0000000020000040 RDI:
0000000000000004
RBP:
000000000073bf00 R08:
0000000000000000 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000246 R12:
00007f47b8d9e6d4
R13:
00000000004be4a4 R14:
00000000004ceca8 R15:
00000000ffffffff
The buggy address belongs to the page:
page:
ffffea00022c7fc0 count:0 mapcount:0 mapping:
0000000000000000 index:0x0
flags: 0x1fffc0000000000()
raw:
01fffc0000000000 0000000000000000 ffffffff022c0101 0000000000000000
raw:
0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88808b1ffa80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88808b1ffb00: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 03
>
ffff88808b1ffb80: f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 04 f3
^
ffff88808b1ffc00: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
ffff88808b1ffc80: 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 01 f2 01
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christoph Paasch [Tue, 19 Mar 2019 06:14:52 +0000 (23:14 -0700)]
net/packet: Set __GFP_NOWARN upon allocation in alloc_pg_vec
[ Upstream commit
398f0132c14754fcd03c1c4f8e7176d001ce8ea1 ]
Since commit
fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
one can now allocate packet ring buffers >= UINT_MAX. However, syzkaller
found that that triggers a warning:
[ 21.100000] WARNING: CPU: 2 PID: 2075 at mm/page_alloc.c:4584 __alloc_pages_nod0
[ 21.101490] Modules linked in:
[ 21.101921] CPU: 2 PID: 2075 Comm: syz-executor.0 Not tainted 5.0.0 #146
[ 21.102784] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.5.1 01/01/2011
[ 21.103887] RIP: 0010:__alloc_pages_nodemask+0x2a0/0x630
[ 21.104640] Code: fe ff ff 65 48 8b 04 25 c0 de 01 00 48 05 90 0f 00 00 41 bd 01 00 00 00 48 89 44 24 48 e9 9c fe 3
[ 21.107121] RSP: 0018:
ffff88805e1cf920 EFLAGS:
00010246
[ 21.107819] RAX:
0000000000000000 RBX:
ffffffff85a488a0 RCX:
0000000000000000
[ 21.108753] RDX:
0000000000000000 RSI:
dffffc0000000000 RDI:
0000000000000000
[ 21.109699] RBP:
1ffff1100bc39f28 R08:
ffffed100bcefb67 R09:
ffffed100bcefb67
[ 21.110646] R10:
0000000000000001 R11:
ffffed100bcefb66 R12:
000000000000000d
[ 21.111623] R13:
0000000000000000 R14:
ffff88805e77d888 R15:
000000000000000d
[ 21.112552] FS:
00007f7c7de05700(0000) GS:
ffff88806d100000(0000) knlGS:
0000000000000000
[ 21.113612] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 21.114405] CR2:
000000000065c000 CR3:
000000005e58e006 CR4:
00000000001606e0
[ 21.115367] Call Trace:
[ 21.115705] ? __alloc_pages_slowpath+0x21c0/0x21c0
[ 21.116362] alloc_pages_current+0xac/0x1e0
[ 21.116923] kmalloc_order+0x18/0x70
[ 21.117393] kmalloc_order_trace+0x18/0x110
[ 21.117949] packet_set_ring+0x9d5/0x1770
[ 21.118524] ? packet_rcv_spkt+0x440/0x440
[ 21.119094] ? lock_downgrade+0x620/0x620
[ 21.119646] ? __might_fault+0x177/0x1b0
[ 21.120177] packet_setsockopt+0x981/0x2940
[ 21.120753] ? __fget+0x2fb/0x4b0
[ 21.121209] ? packet_release+0xab0/0xab0
[ 21.121740] ? sock_has_perm+0x1cd/0x260
[ 21.122297] ? selinux_secmark_relabel_packet+0xd0/0xd0
[ 21.123013] ? __fget+0x324/0x4b0
[ 21.123451] ? selinux_netlbl_socket_setsockopt+0x101/0x320
[ 21.124186] ? selinux_netlbl_sock_rcv_skb+0x3a0/0x3a0
[ 21.124908] ? __lock_acquire+0x529/0x3200
[ 21.125453] ? selinux_socket_setsockopt+0x5d/0x70
[ 21.126075] ? __sys_setsockopt+0x131/0x210
[ 21.126533] ? packet_release+0xab0/0xab0
[ 21.127004] __sys_setsockopt+0x131/0x210
[ 21.127449] ? kernel_accept+0x2f0/0x2f0
[ 21.127911] ? ret_from_fork+0x8/0x50
[ 21.128313] ? do_raw_spin_lock+0x11b/0x280
[ 21.128800] __x64_sys_setsockopt+0xba/0x150
[ 21.129271] ? lockdep_hardirqs_on+0x37f/0x560
[ 21.129769] do_syscall_64+0x9f/0x450
[ 21.130182] entry_SYSCALL_64_after_hwframe+0x49/0xbe
We should allocate with __GFP_NOWARN to handle this.
Cc: Kal Conley <kal.conley@dectris.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Fixes: fc62814d690c ("net/packet: fix 4gb buffer limit due to overflow check")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjorn Helgaas [Mon, 18 Mar 2019 13:51:06 +0000 (08:51 -0500)]
mISDN: hfcpci: Test both vendor & device ID for Digium HFC4S
[ Upstream commit
fae846e2b7124d4b076ef17791c73addf3b26350 ]
The device ID alone does not uniquely identify a device. Test both the
vendor and device ID to make sure we don't mistakenly think some other
vendor's 0xB410 device is a Digium HFC4S. Also, instead of the bare hex
ID, use the same constant (PCI_DEVICE_ID_DIGIUM_HFC4S) used in the device
ID table.
No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 19 Mar 2019 12:46:18 +0000 (05:46 -0700)]
dccp: do not use ipv6 header for ipv4 flow
[ Upstream commit
e0aa67709f89d08c8d8e5bdd9e0b649df61d0090 ]
When a dual stack dccp listener accepts an ipv4 flow,
it should not attempt to use an ipv6 header or
inet6_iif() helper.
Fixes: 3df80d9320bc ("[DCCP]: Introduce DCCPv6")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bhadram Varka [Fri, 27 Oct 2017 02:52:02 +0000 (08:22 +0530)]
stmmac: copy unicast mac address to MAC registers
[ Upstream commit
a830405ee452ddc4101c3c9334e6fedd42c6b357 ]
Currently stmmac driver not copying the valid ethernet
MAC address to MAC registers. This patch takes care
of updating the MAC register with MAC address.
Signed-off-by: Bhadram Varka <vbhadram@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Berg [Mon, 9 Jan 2017 10:10:42 +0000 (11:10 +0100)]
cfg80211: size various nl80211 messages correctly
[ Upstream commit
4ef8c1c93f848e360754f10eb2e7134c872b6597 ]
Ilan reported that sometimes nl80211 messages weren't working if
the frames being transported got very large, which was really a
problem for userspace-to-kernel messages, but prompted me to look
at the code.
Upon review, I found various places where variable-length data is
transported in an nl80211 message but the message isn't allocated
taking that into account. This shouldn't cause any problems since
the frames aren't really that long, apart in one place where two
(possibly very long frames) might not fit.
Fix all the places (that I found) that get variable length data
from the driver and put it into a message to take the length of
the variable data into account. The 100 there is just a safe
constant for the remaining message overhead (it's usually around
50 for most messages.)
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christoffer Dall [Tue, 3 Jul 2018 15:43:09 +0000 (17:43 +0200)]
video: fbdev: Set pixclock = 0 in goldfishfb
[ Upstream commit
ace6033ec5c356615eaa3582fb1946e9eaff6662 ]
User space Android code identifies pixclock == 0 as a sign for emulation
and will set the frame rate to 60 fps when reading this value, which is
the desired outcome.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Roman Kiryanov <rkir@google.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcel Holtmann [Fri, 18 Jan 2019 12:43:19 +0000 (13:43 +0100)]
Bluetooth: Verify that l2cap_get_conf_opt provides large enough buffer
commit
7c9cbd0b5e38a1672fcd137894ace3b042dfbf69 upstream.
The function l2cap_get_conf_opt will return L2CAP_CONF_OPT_SIZE + opt->len
as length value. The opt->len however is in control over the remote user
and can be used by an attacker to gain access beyond the bounds of the
actual packet.
To prevent any potential leak of heap memory, it is enough to check that
the resulting len calculation after calling l2cap_get_conf_opt is not
below zero. A well formed packet will always return >= 0 here and will
end with the length value being zero after the last option has been
parsed. In case of malformed packets messing with the opt->len field the
length value will become negative. If that is the case, then just abort
and ignore the option.
In case an attacker uses a too short opt->len value, then garbage will
be parsed, but that is protected by the unknown option handling and also
the option parameter size checks.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcel Holtmann [Fri, 18 Jan 2019 11:56:20 +0000 (12:56 +0100)]
Bluetooth: Check L2CAP option sizes returned from l2cap_get_conf_opt
commit
af3d5d1c87664a4f150fcf3534c6567cb19909b0 upstream.
When doing option parsing for standard type values of 1, 2 or 4 octets,
the value is converted directly into a variable instead of a pointer. To
avoid being tricked into being a pointer, check that for these option
types that sizes actually match. In L2CAP every option is fixed size and
thus it is prudent anyway to ensure that the remote side sends us the
right option size along with option paramters.
If the option size is not matching the option type, then that option is
silently ignored. It is a protocol violation and instead of trying to
give the remote attacker any further hints just pretend that option is
not present and proceed with the default values. Implementation
following the specification and its qualification procedures will always
use the correct size and thus not being impacted here.
To keep the code readable and consistent accross all options, a few
cosmetic changes were also required.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 27 Mar 2019 05:13:05 +0000 (14:13 +0900)]
Linux 4.9.166
Arnd Bergmann [Wed, 28 Mar 2018 22:06:10 +0000 (00:06 +0200)]
ath10k: avoid possible string overflow
commit
6707ba0105a2d350710bc0a537a98f49eb4b895d upstream.
The way that 'strncat' is used here raised a warning in gcc-8:
drivers/net/wireless/ath/ath10k/wmi.c: In function 'ath10k_wmi_tpc_stats_final_disp_tables':
drivers/net/wireless/ath/ath10k/wmi.c:4649:4: error: 'strncat' output truncated before terminating nul copying as many bytes from a string as its length [-Werror=stringop-truncation]
Effectively, this is simply a strcat() but the use of strncat() suggests
some form of overflow check. Regardless of whether this might actually
overflow, using strlcat() instead of strncat() avoids the warning and
makes the code more robust.
Fixes: bc64d05220f3 ("ath10k: debugfs support to get final TPC stats for 10.4 variants")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Baolin Wang [Fri, 16 Nov 2018 11:01:10 +0000 (19:01 +0800)]
power: supply: charger-manager: Fix incorrect return value
commit
f25a646fbe2051527ad9721853e892d13a99199e upstream.
Fix incorrect return value.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Enric Balletbo i Serra [Wed, 28 Mar 2018 17:03:23 +0000 (19:03 +0200)]
pwm-backlight: Enable/disable the PWM before/after LCD enable toggle.
commit
5fb5caee92ba35a4a3baa61d45a78eb057e2c031 upstream.
Before this patch the enable signal was set before the PWM signal and
vice-versa on power off. This sequence is wrong, at least, it is on
the different panels datasheets that I checked, so I inverted the sequence
to follow the specs.
For reference the following panels have the mentioned sequence:
- N133HSE-EA1 (Innolux)
- N116BGE (Innolux)
- N156BGE-L21 (Innolux)
- B101EAN0 (Auo)
- B101AW03 (Auo)
- LTN101NT05 (Samsung)
- CLAA101WA01A (Chunghwa)
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Jingoo Han <jingoohan1@gmail.com>
Acked-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Baolin Wang [Mon, 25 Dec 2017 11:10:37 +0000 (19:10 +0800)]
rtc: Fix overflow when converting time64_t to rtc_time
commit
36d46cdb43efea74043e29e2a62b13e9aca31452 upstream.
If we convert one large time values to rtc_time, in the original formula
'days * 86400' can be overflowed in 'unsigned int' type to make the formula
get one incorrect remain seconds value. Thus we can use div_s64_rem()
function to avoid this situation.
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
kehuanlin [Wed, 6 Sep 2017 09:58:39 +0000 (17:58 +0800)]
scsi: ufs: fix wrong command type of UTRD for UFSHCI v2.1
commit
83dc7e3dea76b77b6bcc289eb86c5b5c145e8dff upstream.
Since the command type of UTRD in UFS 2.1 specification is the same with
UFS 2.0. And it assumes the future UFS specification will follow the
same definition.
Signed-off-by: kehuanlin <kehuanlin@pinecone.net>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrey Konovalov [Mon, 11 Dec 2017 21:48:41 +0000 (22:48 +0100)]
USB: core: only clean up what we allocated
commit
32fd87b3bbf5f7a045546401dfe2894dbbf4d8c3 upstream.
When cleaning up the configurations, make sure we only free the number
of configurations and interfaces that we could have allocated.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Zijlstra [Fri, 17 Nov 2017 23:28:04 +0000 (15:28 -0800)]
lib/int_sqrt: optimize small argument
commit
3f3295709edea6268ff1609855f498035286af73 upstream.
The current int_sqrt() computation is sub-optimal for the case of small
@x. Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).
In the case of small @x, the compute loop:
while (m != 0) {
b = y + m;
y >>= 1;
if (x >= b) {
x -= b;
y += m;
}
m >>= 2;
}
can be reduced to:
while (m > x)
m >>= 2;
Because y==0, b==m and until x>=m y will remain 0.
And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.
cycles: branches: branch-misses:
OLD:
hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593
cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305
PRE:
hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113
cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568
POST:
hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681
cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219
Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.
Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lanqing Liu [Tue, 18 Jul 2017 09:58:13 +0000 (17:58 +0800)]
serial: sprd: clear timeout interrupt only rather than all interrupts
commit
4350782570b919f254c1e083261a21c19fcaee90 upstream.
On Spreadtrum's serial device, nearly all of interrupts would be cleared
by hardware except timeout interrupt. This patch removed the operation
of clearing all interrupt in irq handler, instead added an if statement
to check if the timeout interrupt is supposed to be cleared.
Wrongly clearing timeout interrupt would lead to uart data stay in rx
fifo, that means the driver cannot read them out anymore.
Signed-off-by: Lanqing Liu <lanqing.liu@spreadtrum.com>
Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Qiao Zhou [Fri, 7 Jul 2017 09:29:34 +0000 (17:29 +0800)]
arm64: traps: disable irq in die()
commit
6f44a0bacb79a03972c83759711832b382b1b8ac upstream.
In current die(), the irq is disabled for __die() handle, not
including the possible panic() handling. Since the log in __die()
can take several hundreds ms, new irq might come and interrupt
current die().
If the process calling die() holds some critical resource, and some
other process scheduled later also needs it, then it would deadlock.
The first panic will not be executed.
So here disable irq for the whole flow of die().
Signed-off-by: Qiao Zhou <qiaozhou@asrmicro.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Al Viro [Sat, 3 Jun 2017 06:20:09 +0000 (07:20 +0100)]
Hang/soft lockup in d_invalidate with simultaneous calls
commit
81be24d263dbeddaba35827036d6f6787a59c2c3 upstream.
It's not hard to trigger a bunch of d_invalidate() on the same
dentry in parallel. They end up fighting each other - any
dentry picked for removal by one will be skipped by the rest
and we'll go for the next iteration through the entire
subtree, even if everything is being skipped. Morevoer, we
immediately go back to scanning the subtree. The only thing
we really need is to dissolve all mounts in the subtree and
as soon as we've nothing left to do, we can just unhash the
dentry and bugger off.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wei Qiao [Mon, 27 Mar 2017 06:06:42 +0000 (14:06 +0800)]
serial: sprd: adjust TIMEOUT to a big value
commit
e1dc9b08051a2c2e694edf48d1e704f07c7c143c upstream.
SPRD_TIMEOUT was 256, which is too small to wait until the status
switched to workable in a while loop, so that the earlycon could
not work correctly.
Signed-off-by: Wei Qiao <wei.qiao@spreadtrum.com>
Signed-off-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Wed, 26 Oct 2016 16:27:57 +0000 (09:27 -0700)]
tcp/dccp: drop SYN packets if accept queue is full
commit
5ea8ea2cb7f1d0db15762c9b0bb9e7330425a071 upstream.
Per listen(fd, backlog) rules, there is really no point accepting a SYN,
sending a SYNACK, and dropping the following ACK packet if accept queue
is full, because application is not draining accept queue fast enough.
This behavior is fooling TCP clients that believe they established a
flow, while there is nothing at server side. They might then send about
10 MSS (if using IW10) that will be dropped anyway while server is under
stress.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hui Wang [Tue, 19 Mar 2019 01:28:44 +0000 (09:28 +0800)]
ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec
commit
b5a236c175b0d984552a5f7c9d35141024c2b261 upstream.
Recently we found the audio jack detection stop working after suspend
on many machines with Realtek codec. Sometimes the audio selection
dialogue didn't show up after users plugged headhphone/headset into
the headset jack, sometimes after uses plugged headphone/headset, then
click the sound icon on the upper-right corner of gnome-desktop, it
also showed the speaker rather than the headphone.
The root cause is that before suspend, the codec already call the
runtime_suspend since this codec is not used by any apps, then in
resume, it will not call runtime_resume for this codec. But for some
realtek codec (so far, alc236, alc255 and alc891) with the specific
BIOS, if it doesn't run runtime_resume after suspend, all codec
functions including jack detection stop working anymore.
This problem existed for a long time, but it was not exposed, that is
because when problem happens, if users play sound or open
sound-setting to check audio device, this will trigger calling to
runtime_resume (via snd_hda_power_up), then the codec starts working
again before users notice this problem.
Since we don't know how many codec and BIOS combinations have this
problem, to fix it, let the driver call runtime_resume for all codecs
in pm_resume, maybe for some codecs, this is not needed, but it is
harmless. After a codec is runtime resumed, if it is not used by any
apps, it will be runtime suspended soon and furthermore we don't run
suspend frequently, this change will not add much power consumption.
Fixes: cc72da7d4d06 ("ALSA: hda - Use standard runtime PM for codec power-save control")
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Tue, 29 Jan 2019 13:03:33 +0000 (14:03 +0100)]
ALSA: hda - Record the current power state before suspend/resume calls
commit
98081ca62cbac31fb0f7efaf90b2e7384ce22257 upstream.
Currently we deal with single codec and suspend codec callbacks for
all S3, S4 and runtime PM handling. But it turned out that we want
distinguish the call patterns sometimes, e.g. for applying some init
sequence only at probing and restoring from hibernate.
This patch slightly modifies the common PM callbacks for HD-audio
codec and stores the currently processed PM event in power_state of
the codec's device.power field, which is currently unused. The codec
callback can take a look at this event value and judges which purpose
it's being called.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Waiman Long [Thu, 10 Jan 2019 04:03:25 +0000 (23:03 -0500)]
locking/lockdep: Add debug_locks check in __lock_downgrade()
commit
71492580571467fb7177aade19c18ce7486267f5 upstream.
Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
warning right after a previous lockdep warning. It is likely that the
previous warning turned off lock debugging causing the lockdep to have
inconsistency states leading to the lock downgrade warning.
Fix that by add a check for debug_locks at the beginning of
__lock_downgrade().
Debugged-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Myungho Jung [Sun, 3 Feb 2019 00:56:36 +0000 (16:56 -0800)]
Bluetooth: Fix decrementing reference count twice in releasing socket
commit
e20a2e9c42c9e4002d9e338d74e7819e88d77162 upstream.
When releasing socket, it is possible to enter hci_sock_release() and
hci_sock_dev_event(HCI_DEV_UNREG) at the same time in different thread.
The reference count of hdev should be decremented only once from one of
them but if storing hdev to local variable in hci_sock_release() before
detached from socket and setting to NULL in hci_sock_dev_event(),
hci_dev_put(hdev) is unexpectedly called twice. This is resolved by
referencing hdev from socket after bt_sock_unlink() in
hci_sock_release().
Reported-by: syzbot+fdc00003f4efff43bc5b@syzkaller.appspotmail.com
Signed-off-by: Myungho Jung <mhjungk@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hans Verkuil [Tue, 18 Dec 2018 13:37:08 +0000 (08:37 -0500)]
media: v4l2-ctrls.c/uvc: zero v4l2_event
commit
f45f3f753b0a3d739acda8e311b4f744d82dc52a upstream.
Control events can leak kernel memory since they do not fully zero the
event. The same code is present in both v4l2-ctrls.c and uvc_ctrl.c, so
fix both.
It appears that all other event code is properly zeroing the structure,
it's these two places.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reported-by: syzbot+4f021cf3697781dbd9fb@syzkaller.appspotmail.com
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zhangyi (F) [Sat, 23 Mar 2019 15:43:05 +0000 (11:43 -0400)]
ext4: brelse all indirect buffer in ext4_ind_remove_space()
commit
674a2b27234d1b7afcb0a9162e81b2e53aeef217 upstream.
All indirect buffers get by ext4_find_shared() should be released no
mater the branch should be freed or not. But now, we forget to release
the lower depth indirect buffers when removing space from the same
higher depth indirect block. It will lead to buffer leak and futher
more, it may lead to quota information corruption when using old quota,
consider the following case.
- Create and mount an empty ext4 filesystem without extent and quota
features,
- quotacheck and enable the user & group quota,
- Create some files and write some data to them, and then punch hole
to some files of them, it may trigger the buffer leak problem
mentioned above.
- Disable quota and run quotacheck again, it will create two new
aquota files and write the checked quota information to them, which
probably may reuse the freed indirect block(the buffer and page
cache was not freed) as data block.
- Enable quota again, it will invoke
vfs_load_quota_inode()->invalidate_bdev() to try to clean unused
buffers and pagecache. Unfortunately, because of the buffer of quota
data block is still referenced, quota code cannot read the up to date
quota info from the device and lead to quota information corruption.
This problem can be reproduced by xfstests generic/231 on ext3 file
system or ext4 file system without extent and quota features.
This patch fix this problem by releasing the missing indirect buffers,
in ext4_ind_remove_space().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lukas Czerner [Fri, 15 Mar 2019 03:20:25 +0000 (23:20 -0400)]
ext4: fix data corruption caused by unaligned direct AIO
commit
372a03e01853f860560eade508794dd274e9b390 upstream.
Ext4 needs to serialize unaligned direct AIO because the zeroing of
partial blocks of two competing unaligned AIOs can result in data
corruption.
However it decides not to serialize if the potentially unaligned aio is
past i_size with the rationale that no pending writes are possible past
i_size. Unfortunately if the i_size is not block aligned and the second
unaligned write lands past i_size, but still into the same block, it has
the potential of corrupting the previous unaligned write to the same
block.
This is (very simplified) reproducer from Frank
// 41472 = (10 * 4096) + 512
// 37376 = 41472 - 4096
ftruncate(fd, 41472);
io_prep_pwrite(iocbs[0], fd, buf[0], 4096, 37376);
io_prep_pwrite(iocbs[1], fd, buf[1], 4096, 41472);
io_submit(io_ctx, 1, &iocbs[1]);
io_submit(io_ctx, 1, &iocbs[2]);
io_getevents(io_ctx, 2, 2, events, NULL);
Without this patch the 512B range from 40960 up to the start of the
second unaligned write (41472) is going to be zeroed overwriting the data
written by the first write. This is a data corruption.
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
*
0000a000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
With this patch the data corruption is avoided because we will recognize
the unaligned_aio and wait for the unwritten extent conversion.
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
00009200 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30
*
0000a200 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31 31
*
0000b200
Reported-by: Frank Sorenson <fsorenso@redhat.com>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fixes: e9e3bcecf44c ("ext4: serialize unaligned asynchronous DIO")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jiufei Xue [Fri, 15 Mar 2019 03:19:22 +0000 (23:19 -0400)]
ext4: fix NULL pointer dereference while journal is aborted
commit
fa30dde38aa8628c73a6dded7cb0bba38c27b576 upstream.
We see the following NULL pointer dereference while running xfstests
generic/475:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
PGD
8000000c84bad067 P4D
8000000c84bad067 PUD
c84e62067 PMD 0
Oops: 0000 [#1] SMP PTI
CPU: 7 PID: 9886 Comm: fsstress Kdump: loaded Not tainted 5.0.0-rc8 #10
RIP: 0010:ext4_do_update_inode+0x4ec/0x760
...
Call Trace:
? jbd2_journal_get_write_access+0x42/0x50
? __ext4_journal_get_write_access+0x2c/0x70
? ext4_truncate+0x186/0x3f0
ext4_mark_iloc_dirty+0x61/0x80
ext4_mark_inode_dirty+0x62/0x1b0
ext4_truncate+0x186/0x3f0
? unmap_mapping_pages+0x56/0x100
ext4_setattr+0x817/0x8b0
notify_change+0x1df/0x430
do_truncate+0x5e/0x90
? generic_permission+0x12b/0x1a0
This is triggered because the NULL pointer handle->h_transaction was
dereferenced in function ext4_update_inode_fsync_trans().
I found that the h_transaction was set to NULL in jbd2__journal_restart
but failed to attached to a new transaction while the journal is aborted.
Fix this by checking the handle before updating the inode.
Fixes: b436b9bef84d ("ext4: Wait for proper transaction commit on fsync")
Signed-off-by: Jiufei Xue <jiufei.xue@linux.alibaba.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: stable@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Josh Poimboeuf [Tue, 19 Mar 2019 00:09:38 +0000 (19:09 -0500)]
objtool: Move objtool_file struct off the stack
commit
0c671812f152b628bd87c0af49da032cc2a2c319 upstream.
Objtool uses over 512k of stack, thanks to the hash table embedded in
the objtool_file struct. This causes an unnecessarily large stack
allocation and breaks users with low stack limits.
Move the struct off the stack.
Fixes: 042ba73fe7eb ("objtool: Add several performance improvements")
Reported-by: Vassili Karpov <moosotc@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/df92dcbc4b84b02ffa252f46876df125fb56e2d7.1552954176.git.jpoimboe@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chen Jie [Fri, 15 Mar 2019 03:44:38 +0000 (03:44 +0000)]
futex: Ensure that futex address is aligned in handle_futex_death()
commit
5a07168d8d89b00fe1760120714378175b3ef992 upstream.
The futex code requires that the user space addresses of futexes are 32bit
aligned. sys_futex() checks this in futex_get_keys() but the robust list
code has no alignment check in place.
As a consequence the kernel crashes on architectures with strict alignment
requirements in handle_futex_death() when trying to cmpxchg() on an
unaligned futex address which was retrieved from the robust list.
[ tglx: Rewrote changelog, proper sizeof() based alignement check and add
comment ]
Fixes: 0771dfefc9e5 ("[PATCH] lightweight robust futexes: core")
Signed-off-by: Chen Jie <chenjie6@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <dvhart@infradead.org>
Cc: <peterz@infradead.org>
Cc: <zengweilin@huawei.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1552621478-119787-1-git-send-email-chenjie6@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Archer Yan [Fri, 8 Mar 2019 03:29:19 +0000 (03:29 +0000)]
MIPS: Fix kernel crash for R6 in jump label branch function
commit
47c25036b60f27b86ab44b66a8861bcf81cde39b upstream.
Insert Branch instruction instead of NOP to make sure assembler don't
patch code in forbidden slot. In jump label function, it might
be possible to patch Control Transfer Instructions(CTIs) into
forbidden slot, which will generate Reserved Instruction exception
in MIPS release 6.
Signed-off-by: Archer Yan <ayan@wavecomp.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
[paul.burton@mips.com:
- Add MIPS prefix to subject.
- Mark for stable from v4.0, which introduced r6 support, onwards.]
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yasha Cherikovsky [Fri, 8 Mar 2019 12:58:51 +0000 (14:58 +0200)]
MIPS: Ensure ELF appended dtb is relocated
commit
3f0a53bc6482fb09770982a8447981260ea258dc upstream.
This fixes booting with the combination of CONFIG_RELOCATABLE=y
and CONFIG_MIPS_ELF_APPENDED_DTB=y.
Sections that appear after the relocation table are not relocated
on system boot (except .bss, which has special handling).
With CONFIG_MIPS_ELF_APPENDED_DTB, the dtb is part of the
vmlinux ELF, so it must be relocated together with everything else.
Fixes: 069fd766271d ("MIPS: Reserve space for relocation table")
Signed-off-by: Yasha Cherikovsky <yasha.che3@gmail.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Paul Burton <paul.burton@mips.com>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v4.7+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yifeng Li [Mon, 4 Mar 2019 22:00:22 +0000 (06:00 +0800)]
mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction.
commit
5f5f67da9781770df0403269bc57d7aae608fecd upstream.
Timekeeping IRQs from CS5536 MFGPT are routed to i8259, which then
triggers the "cascade" IRQ on MIPS CPU. Without IRQF_NO_SUSPEND in
cascade_irqaction, MFGPT interrupts will be masked in suspend mode,
and the machine would be unable to resume once suspended.
Previously, MIPS IRQs were not disabled properly, so the original
code appeared to work. Commit
a3e6c1eff5 ("MIPS: IRQ: Fix disable_irq on
CPU IRQs") uncovers the bug. To fix it, add IRQF_NO_SUSPEND to
cascade_irqaction.
This commit is functionally identical to
0add9c2f1cff ("MIPS:
Loongson-3: Add IRQF_NO_SUSPEND to Cascade irqaction"), but it forgot
to apply the same fix to Loongson2.
Signed-off-by: Yifeng Li <tomli@tomli.me>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Cc: linux-mips@vger.kernel.org
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Huacai Chen <chenhc@lemote.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.19+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Mon, 11 Mar 2019 14:04:18 +0000 (15:04 +0100)]
udf: Fix crash on IO error during truncate
commit
d3ca4651d05c0ff7259d087d8c949bcf3e14fb46 upstream.
When truncate(2) hits IO error when reading indirect extent block the
code just bugs with:
kernel BUG at linux-4.15.0/fs/udf/truncate.c:249!
...
Fix the problem by bailing out cleanly in case of IO error.
CC: stable@vger.kernel.org
Reported-by: jean-luc malet <jeanluc.malet@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Dryomov [Wed, 20 Mar 2019 08:46:58 +0000 (09:46 +0100)]
libceph: wait for latest osdmap in ceph_monc_blacklist_add()
commit
bb229bbb3bf63d23128e851a1f3b85c083178fa1 upstream.
Because map updates are distributed lazily, an OSD may not know about
the new blacklist for quite some time after "osd blacklist add" command
is completed. This makes it possible for a blacklisted but still alive
client to overwrite a post-blacklist update, resulting in data
corruption.
Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
the post-blacklist epoch for all post-blacklist requests ensures that
all such requests "wait" for the blacklist to come into force on their
respective OSDs.
Cc: stable@vger.kernel.org
Fixes: 6305a3b41515 ("libceph: support for blacklisting clients")
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Jason Dillaman <dillaman@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stanislaw Gruszka [Wed, 13 Mar 2019 09:03:17 +0000 (10:03 +0100)]
iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE
commit
4e50ce03976fbc8ae995a000c4b10c737467beaa upstream.
Take into account that sg->offset can be bigger than PAGE_SIZE when
setting segment sg->dma_address. Otherwise sg->dma_address will point
at diffrent page, what makes DMA not possible with erros like this:
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa70c0 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7040 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7080 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7100 flags=0x0020]
xhci_hcd 0000:38:00.3: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fdaa7000 flags=0x0020]
Additinally with wrong sg->dma_address unmap_sg will free wrong pages,
what what can cause crashes like this:
Feb 28 19:27:45 kernel: BUG: Bad page state in process cinnamon pfn:39e8b1
Feb 28 19:27:45 kernel: Disabling lock debugging due to kernel taint
Feb 28 19:27:45 kernel: flags: 0x2ffff0000000000()
Feb 28 19:27:45 kernel: raw:
02ffff0000000000 0000000000000000 ffffffff00000301 0000000000000000
Feb 28 19:27:45 kernel: raw:
0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
Feb 28 19:27:45 kernel: page dumped because: nonzero _refcount
Feb 28 19:27:45 kernel: Modules linked in: ccm fuse arc4 nct6775 hwmon_vid amdgpu nls_iso8859_1 nls_cp437 edac_mce_amd vfat fat kvm_amd ccp rng_core kvm mt76x0u mt76x0_common mt76x02_usb irqbypass mt76_usb mt76x02_lib mt76 crct10dif_pclmul crc32_pclmul chash mac80211 amd_iommu_v2 ghash_clmulni_intel gpu_sched i2c_algo_bit ttm wmi_bmof snd_hda_codec_realtek snd_hda_codec_generic drm_kms_helper snd_hda_codec_hdmi snd_hda_intel drm snd_hda_codec aesni_intel snd_hda_core snd_hwdep aes_x86_64 crypto_simd snd_pcm cfg80211 cryptd mousedev snd_timer glue_helper pcspkr r8169 input_leds realtek agpgart libphy rfkill snd syscopyarea sysfillrect sysimgblt fb_sys_fops soundcore sp5100_tco k10temp i2c_piix4 wmi evdev gpio_amdpt pinctrl_amd mac_hid pcc_cpufreq acpi_cpufreq sg ip_tables x_tables ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) hid_generic(E) usbhid(E) hid(E) dm_mod(E) serio_raw(E) atkbd(E) libps2(E) crc32c_intel(E) ahci(E) libahci(E) libata(E) xhci_pci(E) xhci_hcd(E)
Feb 28 19:27:45 kernel: scsi_mod(E) i8042(E) serio(E) bcache(E) crc64(E)
Feb 28 19:27:45 kernel: CPU: 2 PID: 896 Comm: cinnamon Tainted: G B W E 4.20.12-arch1-1-custom #1
Feb 28 19:27:45 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4, BIOS P1.20 06/26/2018
Feb 28 19:27:45 kernel: Call Trace:
Feb 28 19:27:45 kernel: dump_stack+0x5c/0x80
Feb 28 19:27:45 kernel: bad_page.cold.29+0x7f/0xb2
Feb 28 19:27:45 kernel: __free_pages_ok+0x2c0/0x2d0
Feb 28 19:27:45 kernel: skb_release_data+0x96/0x180
Feb 28 19:27:45 kernel: __kfree_skb+0xe/0x20
Feb 28 19:27:45 kernel: tcp_recvmsg+0x894/0xc60
Feb 28 19:27:45 kernel: ? reuse_swap_page+0x120/0x340
Feb 28 19:27:45 kernel: ? ptep_set_access_flags+0x23/0x30
Feb 28 19:27:45 kernel: inet_recvmsg+0x5b/0x100
Feb 28 19:27:45 kernel: __sys_recvfrom+0xc3/0x180
Feb 28 19:27:45 kernel: ? handle_mm_fault+0x10a/0x250
Feb 28 19:27:45 kernel: ? syscall_trace_enter+0x1d3/0x2d0
Feb 28 19:27:45 kernel: ? __audit_syscall_exit+0x22a/0x290
Feb 28 19:27:45 kernel: __x64_sys_recvfrom+0x24/0x30
Feb 28 19:27:45 kernel: do_syscall_64+0x5b/0x170
Feb 28 19:27:45 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Cc: stable@vger.kernel.org
Reported-and-tested-by: Jan Viktorin <jan.viktorin@gmail.com>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Fixes: 80187fd39dcb ('iommu/amd: Optimize map_sg and unmap_sg')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Zimmermann [Mon, 18 Mar 2019 14:47:58 +0000 (15:47 +0100)]
drm/vmwgfx: Don't double-free the mode stored in par->set_mode
commit
c2d311553855395764e2e5bf401d987ba65c2056 upstream.
When calling vmw_fb_set_par(), the mode stored in par->set_mode gets free'd
twice. The first free is in vmw_fb_kms_detach(), the second is near the
end of vmw_fb_set_par() under the name of 'old_mode'. The mode-setting code
only works correctly if the mode doesn't actually change. Removing
'old_mode' in favor of using par->set_mode directly fixes the problem.
Cc: <stable@vger.kernel.org>
Fixes: a278724aa23c ("drm/vmwgfx: Implement fbdev on kms v2")
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Deepak Rawat <drawat@vmware.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Thu, 7 Mar 2019 10:09:19 +0000 (11:09 +0100)]
mmc: pxamci: fix enum type confusion
commit
e60a582bcde01158a64ff948fb799f21f5d31a11 upstream.
clang points out several instances of mismatched types in this drivers,
all coming from a single declaration:
drivers/mmc/host/pxamci.c:193:15: error: implicit conversion from enumeration type 'enum dma_transfer_direction' to
different enumeration type 'enum dma_data_direction' [-Werror,-Wenum-conversion]
direction = DMA_DEV_TO_MEM;
~ ^~~~~~~~~~~~~~
drivers/mmc/host/pxamci.c:212:62: error: implicit conversion from enumeration type 'enum dma_data_direction' to
different enumeration type 'enum dma_transfer_direction' [-Werror,-Wenum-conversion]
tx = dmaengine_prep_slave_sg(chan, data->sg, host->dma_len, direction,
The behavior is correct, so this must be a simply typo from
dma_data_direction and dma_transfer_direction being similarly named
types with a similar purpose.
Fixes: 6464b7140951 ("mmc: pxamci: switch over to dmaengine use")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Sat, 23 Mar 2019 12:19:54 +0000 (13:19 +0100)]
Linux 4.9.165
Wanpeng Li [Thu, 10 Aug 2017 05:33:12 +0000 (22:33 -0700)]
KVM: X86: Fix residual mmio emulation request to userspace
commit
bbeac2830f4de270bb48141681cb730aadf8dce1 upstream.
Reported by syzkaller:
The kvm-intel.unrestricted_guest=0
WARNING: CPU: 5 PID: 1014 at /home/kernel/data/kvm/arch/x86/kvm//x86.c:7227 kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
CPU: 5 PID: 1014 Comm: warn_test Tainted: G W OE 4.13.0-rc3+ #8
RIP: 0010:kvm_arch_vcpu_ioctl_run+0x38b/0x1be0 [kvm]
Call Trace:
? put_pid+0x3a/0x50
? rcu_read_lock_sched_held+0x79/0x80
? kmem_cache_free+0x2f2/0x350
kvm_vcpu_ioctl+0x340/0x700 [kvm]
? kvm_vcpu_ioctl+0x340/0x700 [kvm]
? __fget+0xfc/0x210
do_vfs_ioctl+0xa4/0x6a0
? __fget+0x11d/0x210
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x23/0xc2
? __this_cpu_preempt_check+0x13/0x20
The syszkaller folks reported a residual mmio emulation request to userspace
due to vm86 fails to emulate inject real mode interrupt(fails to read CS) and
incurs a triple fault. The vCPU returns to userspace with vcpu->mmio_needed == true
and KVM_EXIT_SHUTDOWN exit reason. However, the syszkaller testcase constructs
several threads to launch the same vCPU, the thread which lauch this vCPU after
the thread whichs get the vcpu->mmio_needed == true and KVM_EXIT_SHUTDOWN will
trigger the warning.
#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <linux/kvm.h>
#include <stdio.h>
int kvmcpu;
struct kvm_run *run;
void* thr(void* arg)
{
int res;
res = ioctl(kvmcpu, KVM_RUN, 0);
printf("ret1=%d exit_reason=%d suberror=%d\n",
res, run->exit_reason, run->internal.suberror);
return 0;
}
void test()
{
int i, kvm, kvmvm;
pthread_t th[4];
kvm = open("/dev/kvm", O_RDWR);
kvmvm = ioctl(kvm, KVM_CREATE_VM, 0);
kvmcpu = ioctl(kvmvm, KVM_CREATE_VCPU, 0);
run = (struct kvm_run*)mmap(0, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, kvmcpu, 0);
srand(getpid());
for (i = 0; i < 4; i++) {
pthread_create(&th[i], 0, thr, 0);
usleep(rand() % 10000);
}
for (i = 0; i < 4; i++)
pthread_join(th[i], 0);
}
int main()
{
for (;;) {
int pid = fork();
if (pid < 0)
exit(1);
if (pid == 0) {
test();
exit(0);
}
int status;
while (waitpid(pid, &status, __WALL) != pid) {}
}
return 0;
}
This patch fixes it by resetting the vcpu->mmio_needed once we receive
the triple fault to avoid the residue.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Zubin Mithra <zsm@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sean Christopherson [Wed, 23 Jan 2019 22:39:25 +0000 (14:39 -0800)]
KVM: nVMX: Ignore limit checks on VMX instructions using flat segments
commit
34333cc6c2cb021662fd32e24e618d1b86de95bf upstream.
Regarding segments with a limit==0xffffffff, the SDM officially states:
When the effective limit is FFFFFFFFH (4 GBytes), these accesses may
or may not cause the indicated exceptions. Behavior is
implementation-specific and may vary from one execution to another.
In practice, all CPUs that support VMX ignore limit checks for "flat
segments", i.e. an expand-up data or code segment with base=0 and
limit=0xffffffff. This is subtly different than wrapping the effective
address calculation based on the address size, as the flat segment
behavior also applies to accesses that would wrap the 4g boundary, e.g.
a 4-byte access starting at 0xffffffff will access linear addresses
0xffffffff, 0x0, 0x1 and 0x2.
Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sean Christopherson [Wed, 23 Jan 2019 22:39:23 +0000 (14:39 -0800)]
KVM: nVMX: Sign extend displacements of VMX instr's mem operands
commit
946c522b603f281195af1df91837a1d4d1eb3bc9 upstream.
The VMCS.EXIT_QUALIFCATION field reports the displacements of memory
operands for various instructions, including VMX instructions, as a
naturally sized unsigned value, but masks the value by the addr size,
e.g. given a ModRM encoded as -0x28(%ebp), the -0x28 displacement is
reported as 0xffffffd8 for a 32-bit address size. Despite some weird
wording regarding sign extension, the SDM explicitly states that bits
beyond the instructions address size are undefined:
In all cases, bits of this field beyond the instruction’s address
size are undefined.
Failure to sign extend the displacement results in KVM incorrectly
treating a negative displacement as a large positive displacement when
the address size of the VMX instruction is smaller than KVM's native
size, e.g. a 32-bit address size on a 64-bit KVM.
The very original decoding, added by commit
064aea774768 ("KVM: nVMX:
Decoding memory operands of VMX instructions"), sort of modeled sign
extension by truncating the final virtual/linear address for a 32-bit
address size. I.e. it messed up the effective address but made it work
by adjusting the final address.
When segmentation checks were added, the truncation logic was kept
as-is and no sign extension logic was introduced. In other words, it
kept calculating the wrong effective address while mostly generating
the correct virtual/linear address. As the effective address is what's
used in the segment limit checks, this results in KVM incorreclty
injecting #GP/#SS faults due to non-existent segment violations when
a nested VMM uses negative displacements with an address size smaller
than KVM's native address size.
Using the -0x28(%ebp) example, an EBP value of 0x1000 will result in
KVM using 0x100000fd8 as the effective address when checking for a
segment limit violation. This causes a 100% failure rate when running
a 32-bit KVM build as L1 on top of a 64-bit KVM L0.
Fixes: f9eb4af67c9d ("KVM: nVMX: VMX instructions: add checks for #GP/#SS exceptions")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gustavo A. R. Silva [Fri, 15 Feb 2019 20:29:26 +0000 (14:29 -0600)]
drm/radeon/evergreen_cs: fix missing break in switch statement
commit
cc5034a5d293dd620484d1d836aa16c6764a1c8c upstream.
Add missing break statement in order to prevent the code from falling
through to case CB_TARGET_MASK.
This bug was found thanks to the ongoing efforts to enable
-Wimplicit-fallthrough.
Fixes: dd220a00e8bd ("drm/radeon/kms: add support for streamout v7")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sakari Ailus [Wed, 30 Jan 2019 10:09:41 +0000 (05:09 -0500)]
media: uvcvideo: Avoid NULL pointer dereference at the end of streaming
commit
9dd0627d8d62a7ddb001a75f63942d92b5336561 upstream.
The UVC video driver converts the timestamp from hardware specific unit
to one known by the kernel at the time when the buffer is dequeued. This
is fine in general, but the streamoff operation consists of the
following steps (among other things):
1. uvc_video_clock_cleanup --- the hardware clock sample array is
released and the pointer to the array is set to NULL,
2. buffers in active state are returned to the user and
3. buf_finish callback is called on buffers that are prepared.
buf_finish includes calling uvc_video_clock_update that accesses the
hardware clock sample array.
The above is serialised by a queue specific mutex. Address the problem
by skipping the clock conversion if the hardware clock sample array is
already released.
Fixes: 9c0863b1cc48 ("[media] vb2: call buf_finish from __queue_cancel")
Reported-by: Chiranjeevi Rapolu <chiranjeevi.rapolu@intel.com>
Tested-by: Chiranjeevi Rapolu <chiranjeevi.rapolu@intel.com>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zhang, Jun [Tue, 18 Dec 2018 14:55:01 +0000 (06:55 -0800)]
rcu: Do RCU GP kthread self-wakeup from softirq and interrupt
commit
1d1f898df6586c5ea9aeaf349f13089c6fa37903 upstream.
The rcu_gp_kthread_wake() function is invoked when it might be necessary
to wake the RCU grace-period kthread. Because self-wakeups are normally
a useless waste of CPU cycles, if rcu_gp_kthread_wake() is invoked from
this kthread, it naturally refuses to do the wakeup.
Unfortunately, natural though it might be, this heuristic fails when
rcu_gp_kthread_wake() is invoked from an interrupt or softirq handler
that interrupted the grace-period kthread just after the final check of
the wait-event condition but just before the schedule() call. In this
case, a wakeup is required, even though the call to rcu_gp_kthread_wake()
is within the RCU grace-period kthread's context. Failing to provide
this wakeup can result in grace periods failing to start, which in turn
results in out-of-memory conditions.
This race window is quite narrow, but it actually did happen during real
testing. It would of course need to be fixed even if it was strictly
theoretical in nature.
This patch does not Cc stable because it does not apply cleanly to
earlier kernel versions.
Fixes: 48a7639ce80c ("rcu: Make callers awaken grace-period kthread")
Reported-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "Zhang, Jun" <jun.zhang@intel.com>
Co-developed-by: "He, Bo" <bo.he@intel.com>
Co-developed-by: "xiao, jin" <jin.xiao@intel.com>
Co-developed-by: Bai, Jie A <jie.a.bai@intel.com>
Signed-off: "Zhang, Jun" <jun.zhang@intel.com>
Signed-off: "He, Bo" <bo.he@intel.com>
Signed-off: "xiao, jin" <jin.xiao@intel.com>
Signed-off: Bai, Jie A <jie.a.bai@intel.com>
Signed-off-by: "Zhang, Jun" <jun.zhang@intel.com>
[ paulmck: Switch from !in_softirq() to "!in_interrupt() &&
!in_serving_softirq() to avoid redundant wakeups and to also handle the
interrupt-handler scenario as well as the softirq-handler scenario that
actually occurred in testing. ]
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Link: https://lkml.kernel.org/r/CD6925E8781EFD4D8E11882D20FC406D52A11F61@SHSMSX104.ccr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aditya Pakki [Mon, 4 Mar 2019 22:48:54 +0000 (16:48 -0600)]
md: Fix failed allocation of md_register_thread
commit
e406f12dde1a8375d77ea02d91f313fb1a9c6aec upstream.
mddev->sync_thread can be set to NULL on kzalloc failure downstream.
The patch checks for such a scenario and frees allocated resources.
Committer node:
Added similar fix to raid5.c, as suggested by Guoqing.
Cc: stable@vger.kernel.org # v3.16+
Acked-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adrian Hunter [Fri, 1 Mar 2019 10:35:36 +0000 (12:35 +0200)]
perf intel-pt: Fix divide by zero when TSC is not available
commit
076333870c2f5bdd9b6d31e7ca1909cf0c84cbfa upstream.
When TSC is not available, "timeless" decoding is used but a divide by
zero occurs if perf_time_to_tsc() is called.
Ensure the divisor is not zero.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.9+
Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adrian Hunter [Wed, 6 Feb 2019 10:39:44 +0000 (12:39 +0200)]
perf intel-pt: Fix overlap calculation for padding
commit
5a99d99e3310a565b0cf63f785b347be9ee0da45 upstream.
Auxtrace records might have up to 7 bytes of padding appended. Adjust
the overlap accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190206103947.15750-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adrian Hunter [Wed, 6 Feb 2019 10:39:43 +0000 (12:39 +0200)]
perf auxtrace: Define auxtrace record alignment
commit
c3fcadf0bb765faf45d6d562246e1d08885466df upstream.
Define auxtrace record alignment so that it can be referenced elsewhere.
Note this is preparation for patch "perf intel-pt: Fix overlap calculation
for padding"
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190206103947.15750-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Adrian Hunter [Wed, 6 Feb 2019 10:39:45 +0000 (12:39 +0200)]
perf intel-pt: Fix CYC timestamp calculation after OVF
commit
03997612904866abe7cdcc992784ef65cb3a4b81 upstream.
CYC packet timestamp calculation depends upon CBR which was being
cleared upon overflow (OVF). That can cause errors due to failing to
synchronize with sideband events. Even if a CBR change has been lost,
the old CBR is still a better estimate than zero. So remove the clearing
of CBR.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190206103947.15750-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Axtens [Sat, 9 Feb 2019 04:52:53 +0000 (12:52 +0800)]
bcache: never writeback a discard operation
commit
9951379b0ca88c95876ad9778b9099e19a95d566 upstream.
Some users see panics like the following when performing fstrim on a
bcached volume:
[ 529.803060] BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
[ 530.183928] #PF error: [normal kernel read fault]
[ 530.412392] PGD
8000001f42163067 P4D
8000001f42163067 PUD
1f42168067 PMD 0
[ 530.750887] Oops: 0000 [#1] SMP PTI
[ 530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3
[ 531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[ 531.693137] RIP: 0010:blk_queue_split+0x148/0x620
[ 531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48
8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
[ 532.838634] RSP: 0018:
ffffb9b708df39b0 EFLAGS:
00010246
[ 533.093571] RAX:
00000000ffffffff RBX:
0000000000046000 RCX:
0000000000000000
[ 533.441865] RDX:
0000000000000200 RSI:
0000000000000000 RDI:
0000000000000000
[ 533.789922] RBP:
ffffb9b708df3a48 R08:
ffff940d3b3fdd20 R09:
0000000000000000
[ 534.137512] R10:
ffffb9b708df3958 R11:
0000000000000000 R12:
0000000000000000
[ 534.485329] R13:
0000000000000000 R14:
0000000000000000 R15:
ffff940d39212020
[ 534.833319] FS:
00007efec26e3840(0000) GS:
ffff940d1f480000(0000) knlGS:
0000000000000000
[ 535.224098] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 535.504318] CR2:
0000000000000008 CR3:
0000001f4e256004 CR4:
00000000001606e0
[ 535.851759] Call Trace:
[ 535.970308] ? mempool_alloc_slab+0x15/0x20
[ 536.174152] ? bch_data_insert+0x42/0xd0 [bcache]
[ 536.403399] blk_mq_make_request+0x97/0x4f0
[ 536.607036] generic_make_request+0x1e2/0x410
[ 536.819164] submit_bio+0x73/0x150
[ 536.980168] ? submit_bio+0x73/0x150
[ 537.149731] ? bio_associate_blkg_from_css+0x3b/0x60
[ 537.391595] ? _cond_resched+0x1a/0x50
[ 537.573774] submit_bio_wait+0x59/0x90
[ 537.756105] blkdev_issue_discard+0x80/0xd0
[ 537.959590] ext4_trim_fs+0x4a9/0x9e0
[ 538.137636] ? ext4_trim_fs+0x4a9/0x9e0
[ 538.324087] ext4_ioctl+0xea4/0x1530
[ 538.497712] ? _copy_to_user+0x2a/0x40
[ 538.679632] do_vfs_ioctl+0xa6/0x600
[ 538.853127] ? __do_sys_newfstat+0x44/0x70
[ 539.051951] ksys_ioctl+0x6d/0x80
[ 539.212785] __x64_sys_ioctl+0x1a/0x20
[ 539.394918] do_syscall_64+0x5a/0x110
[ 539.568674] entry_SYSCALL_64_after_hwframe+0x44/0xa9
We have observed it where both:
1) LVM/devmapper is involved (bcache backing device is LVM volume) and
2) writeback cache is involved (bcache cache_mode is writeback)
On one machine, we can reliably reproduce it with:
# echo writeback > /sys/block/bcache0/bcache/cache_mode
(not sure whether above line is required)
# mount /dev/bcache0 /test
# for i in {0..10}; do
file="$(mktemp /test/zero.XXX)"
dd if=/dev/zero of="$file" bs=1M count=256
sync
rm $file
done
# fstrim -v /test
Observing this with tracepoints on, we see the following writes:
fstrim-18019 [022] .... 91107.302026: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
4260112 + 196352 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302050: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
4456464 + 262144 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302075: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
4718608 + 81920 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302094: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
5324816 + 180224 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302121: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
5505040 + 262144 hit 0 bypass 1
fstrim-18019 [022] .... 91107.302145: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
5767184 + 81920 hit 0 bypass 1
fstrim-18019 [022] .... 91107.308777: bcache_write:
73f95583-561c-408f-a93a-
4cbd2498f5c8 inode 0 DS
6373392 + 180224 hit 1 bypass 0
<crash>
Note the final one has different hit/bypass flags.
This is because in should_writeback(), we were hitting a case where
the partial stripe condition was returning true and so
should_writeback() was returning true early.
If that hadn't been the case, it would have hit the would_skip test, and
as would_skip == s->iop.bypass == true, should_writeback() would have
returned false.
Looking at the git history from 'commit
72c270612bd3 ("bcache: Write out
full stripes")', it looks like the idea was to optimise for raid5/6:
* If a stripe is already dirty, force writes to that stripe to
writeback mode - to help build up full stripes of dirty data
To fix this issue, make sure that should_writeback() on a discard op
never returns true.
More details of debugging:
https://www.spinics.net/lists/linux-bcache/msg06996.html
Previous reports:
- https://bugzilla.kernel.org/show_bug.cgi?id=201051
- https://bugzilla.kernel.org/show_bug.cgi?id=196103
- https://www.spinics.net/lists/linux-bcache/msg06885.html
(Coly Li: minor modification to follow maximum 75 chars per line rule)
Cc: Kent Overstreet <koverstreet@google.com>
Cc: stable@vger.kernel.org
Fixes: 72c270612bd3 ("bcache: Write out full stripes")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Viresh Kumar [Fri, 8 Mar 2019 09:53:11 +0000 (15:23 +0530)]
PM / wakeup: Rework wakeup source timer cancellation
commit
1fad17fb1bbcd73159c2b992668a6957ecc5af8a upstream.
If wakeup_source_add() is called right after wakeup_source_remove()
for the same wakeup source, timer_setup() may be called for a
potentially scheduled timer which is incorrect.
To avoid that, move the wakeup source timer cancellation from
wakeup_source_drop() to wakeup_source_remove().
Moreover, make wakeup_source_remove() clear the timer function after
canceling the timer to let wakeup_source_not_registered() treat
unregistered wakeup sources in the same way as the ones that have
never been registered.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
[ rjw: Subject, changelog, merged two patches together ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yihao Wu [Wed, 6 Mar 2019 13:03:50 +0000 (21:03 +0800)]
nfsd: fix wrong check in write_v4_end_grace()
commit
dd838821f0a29781b185cd8fb8e48d5c177bd838 upstream.
Commit
62a063b8e7d1 "nfsd4: fix crash on writing v4_end_grace before
nfsd startup" is trying to fix a NULL dereference issue, but it
mistakenly checks if the nfsd server is started. So fix it.
Fixes: 62a063b8e7d1 "nfsd4: fix crash on writing v4_end_grace before nfsd startup"
Cc: stable@vger.kernel.org
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Yihao Wu <wuyihao@linux.alibaba.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Mon, 4 Mar 2019 03:08:22 +0000 (14:08 +1100)]
nfsd: fix memory corruption caused by readdir
commit
b602345da6cbb135ba68cf042df8ec9a73da7981 upstream.
If the result of an NFSv3 readdir{,plus} request results in the
"offset" on one entry having to be split across 2 pages, and is sized
so that the next directory entry doesn't fit in the requested size,
then memory corruption can happen.
When encode_entry() is called after encoding the last entry that fits,
it notices that ->offset and ->offset1 are set, and so stores the
offset value in the two pages as required. It clears ->offset1 but
*does not* clear ->offset.
Normally this omission doesn't matter as encode_entry_baggage() will
be called, and will set ->offset to a suitable value (not on a page
boundary).
But in the case where cd->buflen < elen and nfserr_toosmall is
returned, ->offset is not reset.
This means that nfsd3proc_readdirplus will see ->offset with a value 4
bytes before the end of a page, and ->offset1 set to NULL.
It will try to write 8bytes to ->offset.
If we are lucky, the next page will be read-only, and the system will
BUG: unable to handle kernel paging request at...
If we are unlucky, some innocent page will have the first 4 bytes
corrupted.
nfsd3proc_readdir() doesn't even check for ->offset1, it just blindly
writes 8 bytes to the offset wherever it is.
Fix this by clearing ->offset after it is used, and copying the
->offset handling code from nfsd3_proc_readdirplus into
nfsd3_proc_readdir.
(Note that the commit hash in the Fixes tag is from the 'history'
tree - this bug predates git).
Fixes: 0b1d57cf7654 ("[PATCH] kNFSd: Fix nfs3 dentry encoding")
Fixes-URL: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=
0b1d57cf7654
Cc: stable@vger.kernel.org (v2.6.12+)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trond Myklebust [Fri, 15 Feb 2019 21:08:25 +0000 (16:08 -0500)]
NFS: Don't recoalesce on error in nfs_pageio_complete_mirror()
commit
8127d82705998568b52ac724e28e00941538083d upstream.
If the I/O completion failed with a fatal error, then we should just
exit nfs_pageio_complete_mirror() rather than try to recoalesce.
Fixes: a7d42ddb3099 ("nfs: add mirroring support to pgio layer")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trond Myklebust [Fri, 15 Feb 2019 19:59:52 +0000 (14:59 -0500)]
NFS: Fix an I/O request leakage in nfs_do_recoalesce
commit
4d91969ed4dbcefd0e78f77494f0cb8fada9048a upstream.
Whether we need to exit early, or just reprocess the list, we
must not lost track of the request which failed to get recoalesced.
Fixes: 03d5eb65b538 ("NFS: Fix a memory leak in nfs_do_recoalesce")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Trond Myklebust [Wed, 13 Feb 2019 14:21:38 +0000 (09:21 -0500)]
NFS: Fix I/O request leakages
commit
f57dcf4c72113c745d83f1c65f7291299f65c14f upstream.
When we fail to add the request to the I/O queue, we currently leave it
to the caller to free the failed request. However since some of the
requests that fail are actually created by nfs_pageio_add_request()
itself, and are not passed back the caller, this leads to a leakage
issue, which can again cause page locks to leak.
This commit addresses the leakage by freeing the created requests on
error, using desc->pg_completion_ops->error_cleanup()
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fixes: a7d42ddb30997 ("nfs: add mirroring support to pgio layer")
Cc: stable@vger.kernel.org # v4.0: c18b96a1b862: nfs: clean up rest of reqs
Cc: stable@vger.kernel.org # v4.0: d600ad1f2bdb: NFS41: pop some layoutget
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Sun, 6 Jan 2019 10:06:25 +0000 (21:06 +1100)]
dm: fix to_sector() for 32bit
commit
0bdb50c531f7377a9da80d3ce2d61f389c84cb30 upstream.
A dm-raid array with devices larger than 4GB won't assemble on
a 32 bit host since _check_data_dev_sectors() was added in 4.16.
This is because to_sector() treats its argument as an "unsigned long"
which is 32bits (4GB) on a 32bit host. Using "unsigned long long"
is more correct.
Kernels as early as 4.2 can have other problems due to to_sector()
being used on the size of a device.
Fixes: 0cf4503174c1 ("dm raid: add support for the MD RAID0 personality")
cc: stable@vger.kernel.org (v4.2+)
Reported-and-tested-by: Guillaume Perréal <gperreal@free.fr>
Signed-off-by: NeilBrown <neil@brown.name>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gustavo A. R. Silva [Thu, 3 Jan 2019 20:14:08 +0000 (14:14 -0600)]
ARM: s3c24xx: Fix boolean expressions in osiris_dvs_notify
commit
e2477233145f2156434afb799583bccd878f3e9f upstream.
Fix boolean expressions by using logical AND operator '&&' instead of
bitwise operator '&'.
This issue was detected with the help of Coccinelle.
Fixes: 4fa084af28ca ("ARM: OSIRIS: DVS (Dynamic Voltage Scaling) supoort.")
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
[krzk: Fix -Wparentheses warning]
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Ellerman [Thu, 14 Feb 2019 00:08:29 +0000 (11:08 +1100)]
powerpc/ptrace: Simplify vr_get/set() to avoid GCC warning
commit
ca6d5149d2ad0a8d2f9c28cbe379802260a0a5e0 upstream.
GCC 8 warns about the logic in vr_get/set(), which with -Werror breaks
the build:
In function ‘user_regset_copyin’,
inlined from ‘vr_set’ at arch/powerpc/kernel/ptrace.c:628:9:
include/linux/regset.h:295:4: error: ‘memcpy’ offset [-527, -529] is
out of the bounds [0, 16] of object ‘vrsave’ with type ‘union
<anonymous>’ [-Werror=array-bounds]
arch/powerpc/kernel/ptrace.c: In function ‘vr_set’:
arch/powerpc/kernel/ptrace.c:623:5: note: ‘vrsave’ declared here
} vrsave;
This has been identified as a regression in GCC, see GCC bug 88273.
However we can avoid the warning and also simplify the logic and make
it more robust.
Currently we pass -1 as end_pos to user_regset_copyout(). This says
"copy up to the end of the regset".
The definition of the regset is:
[REGSET_VMX] = {
.core_note_type = NT_PPC_VMX, .n = 34,
.size = sizeof(vector128), .align = sizeof(vector128),
.active = vr_active, .get = vr_get, .set = vr_set
},
The end is calculated as (n * size), ie. 34 * sizeof(vector128).
In vr_get/set() we pass start_pos as 33 * sizeof(vector128), meaning
we can copy up to sizeof(vector128) into/out-of vrsave.
The on-stack vrsave is defined as:
union {
elf_vrreg_t reg;
u32 word;
} vrsave;
And elf_vrreg_t is:
typedef __vector128 elf_vrreg_t;
So there is no bug, but we rely on all those sizes lining up,
otherwise we would have a kernel stack exposure/overwrite on our
hands.
Rather than relying on that we can pass an explict end_pos based on
the sizeof(vrsave). The result should be exactly the same but it's
more obviously not over-reading/writing the stack and it avoids the
compiler warning.
Reported-by: Meelis Roos <mroos@linux.ee>
Reported-by: Mathieu Malaterre <malat@debian.org>
Cc: stable@vger.kernel.org
Tested-by: Mathieu Malaterre <malat@debian.org>
Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mark Cave-Ayland [Fri, 8 Feb 2019 14:33:19 +0000 (14:33 +0000)]
powerpc: Fix 32-bit KVM-PR lockup and host crash with MacOS guest
commit
fe1ef6bcdb4fca33434256a802a3ed6aacf0bd2f upstream.
Commit
8792468da5e1 "powerpc: Add the ability to save FPU without
giving it up" unexpectedly removed the MSR_FE0 and MSR_FE1 bits from
the bitmask used to update the MSR of the previous thread in
__giveup_fpu() causing a KVM-PR MacOS guest to lockup and panic the
host kernel.
Leaving FE0/1 enabled means unrelated processes might receive FPEs
when they're not expecting them and crash. In particular if this
happens to init the host will then panic.
eg (transcribed):
qemu-system-ppc[837]: unhandled signal 8 at
12cc9ce4 nip
12cc9ce4 lr
12cc9ca4 code 0
systemd[1]: unhandled signal 8 at
202f02e0 nip
202f02e0 lr
001003d4 code 0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Reinstate these bits to the MSR bitmask to enable MacOS guests to run
under 32-bit KVM-PR once again without issue.
Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up")
Cc: stable@vger.kernel.org # v4.6+
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christophe Leroy [Fri, 25 Jan 2019 12:03:55 +0000 (12:03 +0000)]
powerpc/83xx: Also save/restore SPRG4-7 during suspend
commit
36da5ff0bea2dc67298150ead8d8471575c54c7d upstream.
The 83xx has 8 SPRG registers and uses at least SPRG4
for DTLB handling LRU.
Fixes: 2319f1239592 ("powerpc/mm: e300c2/c3/c4 TLB errata workaround")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jordan Niethe [Wed, 27 Feb 2019 03:02:29 +0000 (14:02 +1100)]
powerpc/powernv: Make opal log only readable by root
commit
7b62f9bd2246b7d3d086e571397c14ba52645ef1 upstream.
Currently the opal log is globally readable. It is kernel policy to
limit the visibility of physical addresses / kernel pointers to root.
Given this and the fact the opal log may contain this information it
would be better to limit the readability to root.
Fixes: bfc36894a48b ("powerpc/powernv: Add OPAL message log interface")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
Reviewed-by: Stewart Smith <stewart@linux.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christophe Leroy [Thu, 21 Feb 2019 19:08:37 +0000 (19:08 +0000)]
powerpc/wii: properly disable use of BATs when requested.
commit
6d183ca8baec983dc4208ca45ece3c36763df912 upstream.
'nobats' kernel parameter or some options like CONFIG_DEBUG_PAGEALLOC
deny the use of BATS for mapping memory.
This patch makes sure that the specific wii RAM mapping function
takes it into account as well.
Fixes: de32400dd26e ("wii: use both mem1 and mem2 as ram")
Cc: stable@vger.kernel.org
Reviewed-by: Jonathan Neuschafer <j.neuschaefer@gmx.net>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christophe Leroy [Wed, 27 Feb 2019 11:45:30 +0000 (11:45 +0000)]
powerpc/32: Clear on-stack exception marker upon exception return
commit
9580b71b5a7863c24a9bd18bcd2ad759b86b1eff upstream.
Clear the on-stack STACK_FRAME_REGS_MARKER on exception exit in order
to avoid confusing stacktrace like the one below.
Call Trace:
[
c0e9dca0] [
c01c42a0] print_address_description+0x64/0x2bc (unreliable)
[
c0e9dcd0] [
c01c4684] kasan_report+0xfc/0x180
[
c0e9dd10] [
c0895130] memchr+0x24/0x74
[
c0e9dd30] [
c00a9e38] msg_print_text+0x124/0x574
[
c0e9dde0] [
c00ab710] console_unlock+0x114/0x4f8
[
c0e9de40] [
c00adc60] vprintk_emit+0x188/0x1c4
--- interrupt:
c0e9df00 at 0x400f330
LR = init_stack+0x1f00/0x2000
[
c0e9de80] [
c00ae3c4] printk+0xa8/0xcc (unreliable)
[
c0e9df20] [
c0c27e44] early_irq_init+0x38/0x108
[
c0e9df50] [
c0c15434] start_kernel+0x310/0x488
[
c0e9dff0] [
00003484] 0x3484
With this patch the trace becomes:
Call Trace:
[
c0e9dca0] [
c01c42c0] print_address_description+0x64/0x2bc (unreliable)
[
c0e9dcd0] [
c01c46a4] kasan_report+0xfc/0x180
[
c0e9dd10] [
c0895150] memchr+0x24/0x74
[
c0e9dd30] [
c00a9e58] msg_print_text+0x124/0x574
[
c0e9dde0] [
c00ab730] console_unlock+0x114/0x4f8
[
c0e9de40] [
c00adc80] vprintk_emit+0x188/0x1c4
[
c0e9de80] [
c00ae3e4] printk+0xa8/0xcc
[
c0e9df20] [
c0c27e44] early_irq_init+0x38/0x108
[
c0e9df50] [
c0c15434] start_kernel+0x310/0x488
[
c0e9dff0] [
00003484] 0x3484
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zhangyi (F) [Thu, 21 Feb 2019 16:24:09 +0000 (11:24 -0500)]
jbd2: fix compile warning when using JBUFFER_TRACE
commit
01215d3edb0f384ddeaa5e4a22c1ae5ff634149f upstream.
The jh pointer may be used uninitialized in the two cases below and the
compiler complain about it when enabling JBUFFER_TRACE macro, fix them.
In file included from fs/jbd2/transaction.c:19:0:
fs/jbd2/transaction.c: In function ‘jbd2_journal_get_undo_access’:
./include/linux/jbd2.h:1637:38: warning: ‘jh’ is used uninitialized in this function [-Wuninitialized]
#define JBUFFER_TRACE(jh, info) do { printk("%s: %d\n", __func__, jh->b_jcount);} while (0)
^
fs/jbd2/transaction.c:1219:23: note: ‘jh’ was declared here
struct journal_head *jh;
^
In file included from fs/jbd2/transaction.c:19:0:
fs/jbd2/transaction.c: In function ‘jbd2_journal_dirty_metadata’:
./include/linux/jbd2.h:1637:38: warning: ‘jh’ may be used uninitialized in this function [-Wmaybe-uninitialized]
#define JBUFFER_TRACE(jh, info) do { printk("%s: %d\n", __func__, jh->b_jcount);} while (0)
^
fs/jbd2/transaction.c:1332:23: note: ‘jh’ was declared here
struct journal_head *jh;
^
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zhangyi (F) [Mon, 11 Feb 2019 04:23:04 +0000 (23:23 -0500)]
jbd2: clear dirty flag when revoking a buffer from an older transaction
commit
904cdbd41d749a476863a0ca41f6f396774f26e4 upstream.
Now, we capture a data corruption problem on ext4 while we're truncating
an extent index block. Imaging that if we are revoking a buffer which
has been journaled by the committing transaction, the buffer's jbddirty
flag will not be cleared in jbd2_journal_forget(), so the commit code
will set the buffer dirty flag again after refile the buffer.
fsx kjournald2
jbd2_journal_commit_transaction
jbd2_journal_revoke commit phase 1~5...
jbd2_journal_forget
belongs to older transaction commit phase 6
jbddirty not clear __jbd2_journal_refile_buffer
__jbd2_journal_unfile_buffer
test_clear_buffer_jbddirty
mark_buffer_dirty
Finally, if the freed extent index block was allocated again as data
block by some other files, it may corrupt the file data after writing
cached pages later, such as during unmount time. (In general,
clean_bdev_aliases() related helpers should be invoked after
re-allocation to prevent the above corruption, but unfortunately we
missed it when zeroout the head of extra extent blocks in
ext4_ext_handle_unwritten_extents()).
This patch mark buffer as freed and set j_next_transaction to the new
transaction when it already belongs to the committing transaction in
jbd2_journal_forget(), so that commit code knows it should clear dirty
bits when it is done with the buffer.
This problem can be reproduced by xfstests generic/455 easily with
seeds (3246 3247 3248 3249).
Signed-off-by: zhangyi (F) <yi.zhang@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jay Dolan [Wed, 13 Feb 2019 05:43:12 +0000 (21:43 -0800)]
serial: 8250_pci: Have ACCES cards that use the four port Pericom PI7C9X7954 chip use the pci_pericom_setup()
commit
78d3820b9bd39028727c6aab7297b63c093db343 upstream.
The four port Pericom chips have the fourth port at the wrong address.
Make use of quirk to fix it.
Fixes: c8d192428f52 ("serial: 8250: added acces i/o products quad and octal serial cards")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Jay Dolan <jay.dolan@accesio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jay Dolan [Wed, 13 Feb 2019 05:43:11 +0000 (21:43 -0800)]
serial: 8250_pci: Fix number of ports for ACCES serial cards
commit
b896b03bc7fce43a07012cc6bf5e2ab2fddf3364 upstream.
Have the correct number of ports created for ACCES serial cards. Two port
cards show up as four ports, and four port cards show up as eight.
Fixes: c8d192428f52 ("serial: 8250: added acces i/o products quad and octal serial cards")
Signed-off-by: Jay Dolan <jay.dolan@accesio.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Angelo Butti [Mon, 7 Nov 2016 15:39:03 +0000 (16:39 +0100)]
8250: FIX Fourth port offset of Pericom PI7C9X7954 boards
commit
5c31ef91c06db7800ad573174bd92be4df34ecb2 upstream.
Hi,
below patch to fix Fourth port offset of Percom PI7C9X7954 boards.
I had a problem using Fourth port on a pci express serial board based on Pericom
PI7C9X7954. Reading datasheet I notice a "special" offset assign to this port
when used in I/O mode.
Offset 0x0 -> UART 0
Offset 0x8 -> UART 1
Offset 0x10 -> UART 2
Offset 0x38 -> UART 3 <<---- This don't follow a logical sequence
This patch add a different init to last port, to have right offset.
I check also Pericom 7952 and 7958 but that devices follow logical sequence,
so they are ok.
Regards,
Angelo
Signed-off-by: Angelo Butti <buttiangelo@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lubomir Rintel [Sun, 24 Feb 2019 12:00:53 +0000 (13:00 +0100)]
serial: 8250_of: assume reg-shift of 2 for mrvl,mmp-uart
commit
f4817843e39ce78aace0195a57d4e8500a65a898 upstream.
There are two other drivers that bind to mrvl,mmp-uart and both of them
assume register shift of 2 bits. There are device trees that lack the
property and rely on that assumption.
If this driver wins the race to bind to those devices, it should behave
the same as the older deprecated driver.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Anssi Hannula [Fri, 15 Feb 2019 16:45:08 +0000 (18:45 +0200)]
serial: uartps: Fix stuck ISR if RX disabled with non-empty FIFO
commit
7abab1605139bc41442864c18f9573440f7ca105 upstream.
If RX is disabled while there are still unprocessed bytes in RX FIFO,
cdns_uart_handle_rx() called from interrupt handler will get stuck in
the receive loop as read bytes will not get removed from the RX FIFO
and CDNS_UART_SR_RXEMPTY bit will never get set.
Avoid the stuck handler by checking first if RX is disabled. port->lock
protects against race with RX-disabling functions.
This HW behavior was mentioned by Nathan Rossi in
43e98facc4a3 ("tty:
xuartps: Fix RX hang, and TX corruption in termios call") which fixed a
similar issue in cdns_uart_set_termios().
The behavior can also be easily verified by e.g. setting
CDNS_UART_CR_RX_DIS at the beginning of cdns_uart_handle_rx() - the
following loop will then get stuck.
Resetting the FIFO using RXRST would not set RXEMPTY either so simply
issuing a reset after RX-disable would not work.
I observe this frequently on a ZynqMP board during heavy RX load at 1M
baudrate when the reader process exits and thus RX gets disabled.
Fixes: 61ec9016988f ("tty/serial: add support for Xilinx PS UART")
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tvrtko Ursulin [Tue, 5 Mar 2019 11:04:08 +0000 (11:04 +0000)]
drm/i915: Relax mmap VMA check
[ Upstream commit
ca22f32a6296cbfa29de56328c8505560a18cfa8 ]
Legacy behaviour was to allow non-page-aligned mmap requests, as does the
linux mmap(2) implementation by virtue of automatically rounding up for
the caller.
To avoid breaking legacy userspace relax the newly introduced fix.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 5c4604e757ba ("drm/i915: Prevent a race during I915_GEM_MMAP ioctl with WC set")
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Adam Zabrocki <adamza@microsoft.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.0+
Cc: Akash Goel <akash.goel@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305110409.28633-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit
a90e1948efb648f567444f87f3c19b2a0787affd)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sowjanya Komatineni [Tue, 12 Feb 2019 19:06:44 +0000 (11:06 -0800)]
i2c: tegra: fix maximum transfer size
commit
f4e3f4ae1d9c9330de355f432b69952e8cef650c upstream.
Tegra186 and prior supports maximum 4K bytes per packet transfer
including 12 bytes of packet header.
This patch fixes max write length limit to account packet header
size for transfers.
Cc: stable@vger.kernel.org # 4.4+
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
QiaoChong [Sat, 9 Feb 2019 20:59:07 +0000 (20:59 +0000)]
parport_pc: fix find_superio io compare code, should use equal test.
commit
21698fd57984cd28207d841dbdaa026d6061bceb upstream.
In the original code before
181bf1e815a2 the loop was continuing until
it finds the first matching superios[i].io and p->base.
But after
181bf1e815a2 the logic changed and the loop now returns the
pointer to the first mismatched array element which is then used in
get_superio_dma() and get_superio_irq() and thus returning the wrong
value.
Fix the condition so that it now returns the correct pointer.
Fixes: 181bf1e815a2 ("parport_pc: clean up the modified while loops using for")
Cc: Alan Cox <alan@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: QiaoChong <qiaochong@loongson.cn>
[rewrite the commit message]
Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Shishkin [Thu, 24 Jan 2019 13:11:53 +0000 (15:11 +0200)]
intel_th: Don't reference unassigned outputs
commit
9ed3f22223c33347ed963e7c7019cf2956dd4e37 upstream.
When an output port driver is removed, also remove references to it from
any masters. Failing to do this causes a NULL ptr dereference when
configuring another output port:
> BUG: unable to handle kernel NULL pointer dereference at
000000000000000d
> RIP: 0010:master_attr_store+0x9d/0x160 [intel_th_gth]
> Call Trace:
> dev_attr_store+0x1b/0x30
> sysfs_kf_write+0x3c/0x50
> kernfs_fop_write+0x125/0x1a0
> __vfs_write+0x3a/0x190
> ? __vfs_write+0x5/0x190
> ? _cond_resched+0x1a/0x50
> ? rcu_all_qs+0x5/0xb0
> ? __vfs_write+0x5/0x190
> vfs_write+0xb8/0x1b0
> ksys_write+0x55/0xc0
> __x64_sys_write+0x1a/0x20
> do_syscall_64+0x5a/0x140
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Fixes: b27a6a3f97b9 ("intel_th: Add Global Trace Hub driver")
CC: stable@vger.kernel.org # v4.4+
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Heikki Krogerus [Wed, 23 Jan 2019 14:44:16 +0000 (17:44 +0300)]
device property: Fix the length used in PROPERTY_ENTRY_STRING()
commit
2b6e492467c78183bb629bb0a100ea3509b615a5 upstream.
With string type property entries we need to use
sizeof(const char *) instead of the number of characters as
the length of the entry.
If the string was shorter then sizeof(const char *),
attempts to read it would have failed with -EOVERFLOW. The
problem has been hidden because all build-in string
properties have had a string longer then 8 characters until
now.
Fixes: a85f42047533 ("device property: helper macros for property entry creation")
Cc: 4.5+ <stable@vger.kernel.org> # 4.5+
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zev Weiss [Tue, 12 Mar 2019 06:28:02 +0000 (23:28 -0700)]
kernel/sysctl.c: add missing range check in do_proc_dointvec_minmax_conv
commit
8cf7630b29701d364f8df4a50e4f1f5e752b2778 upstream.
This bug has apparently existed since the introduction of this function
in the pre-git era (
4500e91754d3 in Thomas Gleixner's history.git,
"[NET]: Add proc_dointvec_userhz_jiffies, use it for proper handling of
neighbour sysctls.").
As a minimal fix we can simply duplicate the corresponding check in
do_proc_dointvec_conv().
Link: http://lkml.kernel.org/r/20190207123426.9202-3-zev@bewilderbeest.net
Signed-off-by: Zev Weiss <zev@bewilderbeest.net>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: <stable@vger.kernel.org> [2.6.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Penyaev [Tue, 5 Mar 2019 23:43:20 +0000 (15:43 -0800)]
mm/vmalloc: fix size check for remap_vmalloc_range_partial()
commit
401592d2e095947344e10ec0623adbcd58934dd4 upstream.
When VM_NO_GUARD is not set area->size includes adjacent guard page,
thus for correct size checking get_vm_area_size() should be used, but
not area->size.
This fixes possible kernel oops when userspace tries to mmap an area on
1 page bigger than was allocated by vmalloc_user() call: the size check
inside remap_vmalloc_range_partial() accounts non-existing guard page
also, so check successfully passes but vmalloc_to_page() returns NULL
(guard page does not physically exist).
The following code pattern example should trigger an oops:
static int oops_mmap(struct file *file, struct vm_area_struct *vma)
{
void *mem;
mem = vmalloc_user(4096);
BUG_ON(!mem);
/* Do not care about mem leak */
return remap_vmalloc_range(vma, mem, 0);
}
And userspace simply mmaps size + PAGE_SIZE:
mmap(NULL, 8192, PROT_WRITE|PROT_READ, MAP_PRIVATE, fd, 0);
Possible candidates for oops which do not have any explicit size
checks:
*** drivers/media/usb/stkwebcam/stk-webcam.c:
v4l_stk_mmap[789] ret = remap_vmalloc_range(vma, sbuf->buffer, 0);
Or the following one:
*** drivers/video/fbdev/core/fbmem.c
static int
fb_mmap(struct file *file, struct vm_area_struct * vma)
...
res = fb->fb_mmap(info, vma);
Where fb_mmap callback calls remap_vmalloc_range() directly without any
explicit checks:
*** drivers/video/fbdev/vfb.c
static int vfb_mmap(struct fb_info *info,
struct vm_area_struct *vma)
{
return remap_vmalloc_range(vma, (void *)info->fix.smem_start, vma->vm_pgoff);
}
Link: http://lkml.kernel.org/r/20190103145954.16942-2-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
zhongjiang [Tue, 5 Mar 2019 23:41:16 +0000 (15:41 -0800)]
mm: hwpoison: fix thp split handing in soft_offline_in_use_page()
commit
46612b751c4941c5c0472ddf04027e877ae5990f upstream.
When soft_offline_in_use_page() runs on a thp tail page after pmd is
split, we trigger the following VM_BUG_ON_PAGE():
Memory failure: 0x3755ff: non anonymous thp
__get_any_page: 0x3755ff: unknown zero refcount page type
2fffff80000000
Soft offlining pfn 0x34d805 at process virtual address 0x20fff000
page:
ffffea000d360140 count:0 mapcount:0 mapping:
0000000000000000 index:0x1
flags: 0x2fffff80000000()
raw:
002fffff80000000 ffffea000d360108 ffffea000d360188 0000000000000000
raw:
0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:519!
soft_offline_in_use_page() passed refcount and page lock from tail page
to head page, which is not needed because we can pass any subpage to
split_huge_page().
Naoya had fixed a similar issue in
c3901e722b29 ("mm: hwpoison: fix thp
split handling in memory_failure()"). But he missed fixing soft
offline.
Link: http://lkml.kernel.org/r/1551452476-24000-1-git-send-email-zhongjiang@huawei.com
Fixes: 61f5d698cc97 ("mm: re-enable THP")
Signed-off-by: zhongjiang <zhongjiang@huawei.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dexuan Cui [Wed, 30 Jan 2019 01:23:01 +0000 (01:23 +0000)]
nfit: acpi_nfit_ctl(): Check out_obj->type in the right place
commit
43f89877f26671c6309cd87d7364b1a3e66e71cf upstream.
In the case of ND_CMD_CALL, we should also check out_obj->type.
The patch uses out_obj->type, which is a short alias to
out_obj->package.type.
Fixes: 31eca76ba2fc ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Cercueil [Mon, 28 Jan 2019 02:09:21 +0000 (23:09 -0300)]
clk: ingenic: Fix doc of ingenic_cgu_div_info
commit
7ca4c922aad2e3c46767a12f80d01c6b25337b59 upstream.
The 'div' field does not represent a number of bits used to divide
(understand: right-shift) the divider, but a number itself used to
divide the divider.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Maarten ter Huurne <maarten@treewalker.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Cercueil [Mon, 28 Jan 2019 02:09:20 +0000 (23:09 -0300)]
clk: ingenic: Fix round_rate misbehaving with non-integer dividers
commit
bc5d922c93491878c44c9216e9d227c7eeb81d7f upstream.
Take a parent rate of 180 MHz, and a requested rate of 4.285715 MHz.
This results in a theorical divider of 41.999993 which is then rounded
up to 42. The .round_rate function would then return (180 MHz / 42) as
the clock, rounded down, so 4.285714 MHz.
Calling clk_set_rate on 4.285714 MHz would round the rate again, and
give a theorical divider of 42,
0000028, now rounded up to 43, and the
rate returned would be (180 MHz / 43) which is 4.186046 MHz, aka. not
what we requested.
Fix this by rounding up the divisions.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Tested-by: Maarten ter Huurne <maarten@treewalker.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tony Lindgren [Mon, 11 Feb 2019 22:59:07 +0000 (14:59 -0800)]
clk: clk-twl6040: Fix imprecise external abort for pdmclk
commit
5ae51d67aec95f6f9386aa8dd5db424964895575 upstream.
I noticed that modprobe clk-twl6040 can fail after a cold boot with:
abe_cm:clk:0010:0: failed to enable
...
Unhandled fault: imprecise external abort (0x1406) at 0xbe896b20
WARNING: CPU: 1 PID: 29 at drivers/clk/clk.c:828 clk_core_disable_lock+0x18/0x24
...
(clk_core_disable_lock) from [<
c0123534>] (_disable_clocks+0x18/0x90)
(_disable_clocks) from [<
c0124040>] (_idle+0x17c/0x244)
(_idle) from [<
c0125ad4>] (omap_hwmod_idle+0x24/0x44)
(omap_hwmod_idle) from [<
c053a038>] (sysc_runtime_suspend+0x48/0x108)
(sysc_runtime_suspend) from [<
c06084c4>] (__rpm_callback+0x144/0x1d8)
(__rpm_callback) from [<
c0608578>] (rpm_callback+0x20/0x80)
(rpm_callback) from [<
c0607034>] (rpm_suspend+0x120/0x694)
(rpm_suspend) from [<
c0607a78>] (__pm_runtime_idle+0x60/0x84)
(__pm_runtime_idle) from [<
c053aaf0>] (sysc_probe+0x874/0xf2c)
(sysc_probe) from [<
c05fecd4>] (platform_drv_probe+0x48/0x98)
After searching around for a similar issue, I came across an earlier fix
that never got merged upstream in the Android tree for glass-omap-xrr02.
There is patch "MFD: twl6040-codec: Implement PDMCLK cold temp errata"
by Misael Lopez Cruz <misael.lopez@ti.com>.
Based on my observations, this fix is also needed when cold booting
devices, and not just for deeper idle modes. Since we now have a clock
driver for pdmclk, let's fix the issue in twl6040_pdmclk_prepare().
Cc: Misael Lopez Cruz <misael.lopez@ti.com>
Cc: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Acked-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Tue, 29 Jan 2019 16:17:24 +0000 (17:17 +0100)]
ext2: Fix underflow in ext2_max_size()
commit
1c2d14212b15a60300a2d4f6364753e87394c521 upstream.
When ext2 filesystem is created with 64k block size, ext2_max_size()
will return value less than 0. Also, we cannot write any file in this fs
since the sb->maxbytes is less than 0. The core of the problem is that
the size of block index tree for such large block size is more than
i_blocks can carry. So fix the computation to count with this
possibility.
File size limits computed with the new function for the full range of
possible block sizes look like:
bits file_size
10
17247252480
11
275415851008
12
2196873666560
13
2197948973056
14
2198486220800
15
2198754754560
16
2198888906752
CC: stable@vger.kernel.org
Reported-by: yangerkun <yangerkun@huawei.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jan Kara [Mon, 11 Feb 2019 18:30:32 +0000 (13:30 -0500)]
ext4: fix crash during online resizing
commit
f96c3ac8dfc24b4e38fc4c2eba5fea2107b929d1 upstream.
When computing maximum size of filesystem possible with given number of
group descriptor blocks, we forget to include s_first_data_block into
the number of blocks. Thus for filesystems with non-zero
s_first_data_block it can happen that computed maximum filesystem size
is actually lower than current filesystem size which confuses the code
and eventually leads to a BUG_ON in ext4_alloc_group_tables() hitting on
flex_gd->count == 0. The problem can be reproduced like:
truncate -s 100g /tmp/image
mkfs.ext4 -b 1024 -E resize=262144 /tmp/image 32768
mount -t ext4 -o loop /tmp/image /mnt
resize2fs /dev/loop0 262145
resize2fs /dev/loop0 300000
Fix the problem by properly including s_first_data_block into the
computed number of filesystem blocks.
Fixes: 1c6bd7173d66 "ext4: convert file system to meta_bg if needed..."
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arnd Bergmann [Thu, 7 Mar 2019 10:22:41 +0000 (11:22 +0100)]
cpufreq: pxa2xx: remove incorrect __init annotation
commit
9505b98ccddc454008ca7efff90044e3e857c827 upstream.
pxa_cpufreq_init_voltages() is marked __init but usually inlined into
the non-__init pxa_cpufreq_init() function. When building with clang,
it can stay as a standalone function in a discarded section, and produce
this warning:
WARNING: vmlinux.o(.text+0x616a00): Section mismatch in reference from the function pxa_cpufreq_init() to the function .init.text:pxa_cpufreq_init_voltages()
The function pxa_cpufreq_init() references
the function __init pxa_cpufreq_init_voltages().
This is often because pxa_cpufreq_init lacks a __init
annotation or the annotation of pxa_cpufreq_init_voltages is wrong.
Fixes: 50e77fcd790e ("ARM: pxa: remove __init from cpufreq_driver->init()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free.fr>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yangtao Li [Mon, 4 Feb 2019 07:48:54 +0000 (02:48 -0500)]
cpufreq: tegra124: add missing of_node_put()
commit
446fae2bb5395f3028d8e3aae1508737e5a72ea1 upstream.
of_cpu_device_node_get() will increase the refcount of device_node,
it is necessary to call of_node_put() at the end to release the
refcount.
Fixes: 9eb15dbbfa1a2 ("cpufreq: Add cpufreq driver for Tegra124")
Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lubomir Rintel [Sun, 10 Feb 2019 19:47:49 +0000 (20:47 +0100)]
libertas_tf: don't set URB_ZERO_PACKET on IN USB transfer
commit
607076a904c435f2677fadaadd4af546279db68b upstream.
It doesn't make sense and the USB core warns on each submit of such
URB, easily flooding the message buffer with tracebacks.
Analogous issue was fixed in regular libertas driver in commit
6528d8804780
("libertas: don't set URB_ZERO_PACKET on IN USB transfer").
Cc: stable@vger.kernel.org
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Reviewed-by: Steve deRosier <derosier@cal-sierra.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Biggers [Fri, 4 Jan 2019 04:16:13 +0000 (20:16 -0800)]
crypto: pcbc - remove bogus memcpy()s with src == dest
commit
251b7aea34ba3c4d4fdfa9447695642eb8b8b098 upstream.
The memcpy()s in the PCBC implementation use walk->iv as both the source
and destination, which has undefined behavior. These memcpy()'s are
actually unneeded, because walk->iv is already used to hold the previous
plaintext block XOR'd with the previous ciphertext block. Thus,
walk->iv is already updated to its final value.
So remove the broken and unnecessary memcpy()s.
Fixes: 91652be5d1b9 ("[CRYPTO] pcbc: Add Propagated CBC template")
Cc: <stable@vger.kernel.org> # v2.6.21+
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Maxim Zhukov <mussitantesmortem@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Filipe Manana [Thu, 14 Feb 2019 15:17:20 +0000 (15:17 +0000)]
Btrfs: fix corruption reading shared and compressed extents after hole punching
commit
8e928218780e2f1cf2f5891c7575e8f0b284fcce upstream.
In the past we had data corruption when reading compressed extents that
are shared within the same file and they are consecutive, this got fixed
by commit
005efedf2c7d0 ("Btrfs: fix read corruption of compressed and
shared extents") and by commit
808f80b46790f ("Btrfs: update fix for read
corruption of compressed and shared extents"). However there was a case
that was missing in those fixes, which is when the shared and compressed
extents are referenced with a non-zero offset. The following shell script
creates a reproducer for this issue:
#!/bin/bash
mkfs.btrfs -f /dev/sdc &> /dev/null
mount -o compress /dev/sdc /mnt/sdc
# Create a file with 3 consecutive compressed extents, each has an
# uncompressed size of 128Kb and a compressed size of 4Kb.
for ((i = 1; i <= 3; i++)); do
head -c 4096 /dev/zero
for ((j = 1; j <= 31; j++)); do
head -c 4096 /dev/zero | tr '\0' "\377"
done
done > /mnt/sdc/foobar
sync
echo "Digest after file creation: $(md5sum /mnt/sdc/foobar)"
# Clone the first extent into offsets 128K and 256K.
xfs_io -c "reflink /mnt/sdc/foobar 0 128K 128K" /mnt/sdc/foobar
xfs_io -c "reflink /mnt/sdc/foobar 0 256K 128K" /mnt/sdc/foobar
sync
echo "Digest after cloning: $(md5sum /mnt/sdc/foobar)"
# Punch holes into the regions that are already full of zeroes.
xfs_io -c "fpunch 0 4K" /mnt/sdc/foobar
xfs_io -c "fpunch 128K 4K" /mnt/sdc/foobar
xfs_io -c "fpunch 256K 4K" /mnt/sdc/foobar
sync
echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
echo "Dropping page cache..."
sysctl -q vm.drop_caches=1
echo "Digest after hole punching: $(md5sum /mnt/sdc/foobar)"
umount /dev/sdc
When running the script we get the following output:
Digest after file creation:
5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
linked 131072/131072 bytes at offset 131072
128 KiB, 1 ops; 0.0033 sec (36.960 MiB/sec and 295.6830 ops/sec)
linked 131072/131072 bytes at offset 262144
128 KiB, 1 ops; 0.0015 sec (78.567 MiB/sec and 628.5355 ops/sec)
Digest after cloning:
5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
Digest after hole punching:
5a0888d80d7ab1fd31c229f83a3bbcc8 /mnt/sdc/foobar
Dropping page cache...
Digest after hole punching:
fba694ae8664ed0c2e9ff8937e7f1484 /mnt/sdc/foobar
This happens because after reading all the pages of the extent in the
range from 128K to 256K for example, we read the hole at offset 256K
and then when reading the page at offset 260K we don't submit the
existing bio, which is responsible for filling all the page in the
range 128K to 256K only, therefore adding the pages from range 260K
to 384K to the existing bio and submitting it after iterating over the
entire range. Once the bio completes, the uncompressed data fills only
the pages in the range 128K to 256K because there's no more data read
from disk, leaving the pages in the range 260K to 384K unfilled. It is
just a slightly different variant of what was solved by commit
005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared
extents").
Fix this by forcing a bio submit, during readpages(), whenever we find a
compressed extent map for a page that is different from the extent map
for the previous page or has a different starting offset (in case it's
the same compressed extent), instead of the extent map's original start
offset.
A test case for fstests follows soon.
Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Fixes: 808f80b46790f ("Btrfs: update fix for read corruption of compressed and shared extents")
Fixes: 005efedf2c7d0 ("Btrfs: fix read corruption of compressed and shared extents")
Cc: stable@vger.kernel.org # 4.3+
Tested-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Thumshirn [Mon, 18 Feb 2019 10:28:37 +0000 (11:28 +0100)]
btrfs: ensure that a DUP or RAID1 block group has exactly two stripes
commit
349ae63f40638a28c6fce52e8447c2d14b84cc0c upstream.
We recently had a customer issue with a corrupted filesystem. When
trying to mount this image btrfs panicked with a division by zero in
calc_stripe_length().
The corrupt chunk had a 'num_stripes' value of 1. calc_stripe_length()
takes this value and divides it by the number of copies the RAID profile
is expected to have to calculate the amount of data stripes. As a DUP
profile is expected to have 2 copies this division resulted in 1/2 = 0.
Later then the 'data_stripes' variable is used as a divisor in the
stripe length calculation which results in a division by 0 and thus a
kernel panic.
When encountering a filesystem with a DUP block group and a
'num_stripes' value unequal to 2, refuse mounting as the image is
corrupted and will lead to unexpected behaviour.
Code inspection showed a RAID1 block group has the same issues.
Fixes: e06cd3dd7cea ("Btrfs: add validadtion checks for chunk loading")
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Finn Thain [Wed, 16 Jan 2019 05:23:24 +0000 (16:23 +1100)]
m68k: Add -ffreestanding to CFLAGS
commit
28713169d879b67be2ef2f84dcf54905de238294 upstream.
This patch fixes a build failure when using GCC 8.1:
/usr/bin/ld: block/partitions/ldm.o: in function `ldm_parse_tocblock':
block/partitions/ldm.c:153: undefined reference to `strcmp'
This is caused by a new optimization which effectively replaces a
strncmp() call with a strcmp() call. This affects a number of strncmp()
call sites in the kernel.
The entire class of optimizations is avoided with -fno-builtin, which
gets enabled by -ffreestanding. This may avoid possible future build
failures in case new optimizations appear in future compilers.
I haven't done any performance measurements with this patch but I did
count the function calls in a defconfig build. For example, there are now
23 more sprintf() calls and 39 fewer strcpy() calls. The effect on the
other libc functions is smaller.
If this harms performance we can tackle that regression by optimizing
the call sites, ideally using semantic patches. That way, clang and ICC
builds might benfit too.
Cc: stable@vger.kernel.org
Reference: https://marc.info/?l=linux-m68k&m=
154514816222244&w=2
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jann Horn [Wed, 23 Jan 2019 14:19:17 +0000 (15:19 +0100)]
splice: don't merge into linked buffers
commit
a0ce2f0aa6ad97c3d4927bf2ca54bcebdf062d55 upstream.
Before this patch, it was possible for two pipes to affect each other after
data had been transferred between them with tee():
============
$ cat tee_test.c
int main(void) {
int pipe_a[2];
if (pipe(pipe_a)) err(1, "pipe");
int pipe_b[2];
if (pipe(pipe_b)) err(1, "pipe");
if (write(pipe_a[1], "abcd", 4) != 4) err(1, "write");
if (tee(pipe_a[0], pipe_b[1], 2, 0) != 2) err(1, "tee");
if (write(pipe_b[1], "xx", 2) != 2) err(1, "write");
char buf[5];
if (read(pipe_a[0], buf, 4) != 4) err(1, "read");
buf[4] = 0;
printf("got back: '%s'\n", buf);
}
$ gcc -o tee_test tee_test.c
$ ./tee_test
got back: 'abxx'
$
============
As suggested by Al Viro, fix it by creating a separate type for
non-mergeable pipe buffers, then changing the types of buffers in
splice_pipe_to_pipe() and link_pipe().
Cc: <stable@vger.kernel.org>
Fixes: 7c77f0b3f920 ("splice: implement pipe to pipe splicing")
Fixes: 70524490ee2e ("[PATCH] splice: add support for sys_tee()")
Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Varad Gautam [Thu, 24 Jan 2019 13:03:06 +0000 (14:03 +0100)]
fs/devpts: always delete dcache dentry-s in dput()
commit
73052b0daee0b750b39af18460dfec683e4f5887 upstream.
d_delete only unhashes an entry if it is reached with
dentry->d_lockref.count != 1. Prior to commit
8ead9dd54716 ("devpts:
more pty driver interface cleanups"), d_delete was called on a dentry
from devpts_pty_kill with two references held, which would trigger the
unhashing, and the subsequent dputs would release it.
Commit
8ead9dd54716 reworked devpts_pty_kill to stop acquiring the second
reference from d_find_alias, and the d_delete call left the dentries
still on the hashed list without actually ever being dropped from dcache
before explicit cleanup. This causes the number of negative dentries for
devpts to pile up, and an `ls /dev/pts` invocation can take seconds to
return.
Provide always_delete_dentry() from simple_dentry_operations
as .d_delete for devpts, to make the dentry be dropped from dcache.
Without this cleanup, the number of dentries in /dev/pts/ can be grown
arbitrarily as:
`python -c 'import pty; pty.spawn(["ls", "/dev/pts"])'`
A systemtap probe on dcache_readdir to count d_subdirs shows this count
to increase with each pty spawn invocation above:
probe kernel.function("dcache_readdir") {
subdirs = &@cast($file->f_path->dentry, "dentry")->d_subdirs;
p = subdirs;
p = @cast(p, "list_head")->next;
i = 0
while (p != subdirs) {
p = @cast(p, "list_head")->next;
i = i+1;
}
printf("number of dentries: %d\n", i);
}
Fixes: 8ead9dd54716 ("devpts: more pty driver interface cleanups")
Signed-off-by: Varad Gautam <vrd@amazon.de>
Reported-by: Zheng Wang <wanz@amazon.de>
Reported-by: Brandon Schwartz <bsschwar@amazon.de>
Root-caused-by: Maximilian Heyne <mheyne@amazon.de>
Root-caused-by: Nicolas Pernas Maradei <npernas@amazon.de>
CC: David Woodhouse <dwmw@amazon.co.uk>
CC: Maximilian Heyne <mheyne@amazon.de>
CC: Stefan Nuernberger <snu@amazon.de>
CC: Amit Shah <aams@amazon.de>
CC: Linus Torvalds <torvalds@linux-foundation.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Al Viro <viro@ZenIV.linux.org.uk>
CC: Christian Brauner <christian.brauner@ubuntu.com>
CC: Eric W. Biederman <ebiederm@xmission.com>
CC: Matthew Wilcox <willy@infradead.org>
CC: Eric Biggers <ebiggers@google.com>
CC: <stable@vger.kernel.org> # 4.9+
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bart Van Assche [Fri, 25 Jan 2019 18:34:56 +0000 (10:34 -0800)]
scsi: target/iscsi: Avoid iscsit_release_commands_from_conn() deadlock
commit
32e36bfbcf31452a854263e7c7f32fbefc4b44d8 upstream.
When using SCSI passthrough in combination with the iSCSI target driver
then cmd->t_state_lock may be obtained from interrupt context. Hence, all
code that obtains cmd->t_state_lock from thread context must disable
interrupts first. This patch avoids that lockdep reports the following:
WARNING: inconsistent lock state
4.18.0-dbg+ #1 Not tainted
--------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
iscsi_ttx/1800 [HC1[1]:SC0[2]:HE0:SE0] takes:
000000006e7b0ceb (&(&cmd->t_state_lock)->rlock){?...}, at: target_complete_cmd+0x47/0x2c0 [target_core_mod]
{HARDIRQ-ON-W} state was registered at:
lock_acquire+0xd2/0x260
_raw_spin_lock+0x32/0x50
iscsit_close_connection+0x97e/0x1020 [iscsi_target_mod]
iscsit_take_action_for_connection_exit+0x108/0x200 [iscsi_target_mod]
iscsi_target_rx_thread+0x180/0x190 [iscsi_target_mod]
kthread+0x1cf/0x1f0
ret_from_fork+0x24/0x30
irq event stamp: 1281
hardirqs last enabled at (1279): [<
ffffffff970ade79>] __local_bh_enable_ip+0xa9/0x160
hardirqs last disabled at (1281): [<
ffffffff97a008a5>] interrupt_entry+0xb5/0xd0
softirqs last enabled at (1278): [<
ffffffff977cd9a1>] lock_sock_nested+0x51/0xc0
softirqs last disabled at (1280): [<
ffffffffc07a6e04>] ip6_finish_output2+0x124/0xe40 [ipv6]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&cmd->t_state_lock)->rlock);
<Interrupt>
lock(&(&cmd->t_state_lock)->rlock);