Ka-Cheong Poon [Wed, 8 Apr 2020 10:21:02 +0000 (03:21 -0700)]
net/rds: Fix MR reference counting problem
In rds_free_mr(), it calls rds_destroy_mr(mr) directly. But this
defeats the purpose of reference counting and makes MR free handling
impossible. It means that holding a reference does not guarantee that
it is safe to access some fields. For example, In
rds_cmsg_rdma_dest(), it increases the ref count, unlocks and then
calls mr->r_trans->sync_mr(). But if rds_free_mr() (and
rds_destroy_mr()) is called in between (there is no lock preventing
this to happen), r_trans_private is set to NULL, causing a panic.
Similar issue is in rds_rdma_unuse().
Reported-by: zerons <sironhide0null@gmail.com>
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ka-Cheong Poon [Wed, 8 Apr 2020 10:21:01 +0000 (03:21 -0700)]
net/rds: Replace struct rds_mr's r_refcount with struct kref
And removed rds_mr_put().
Signed-off-by: Ka-Cheong Poon <ka-cheong.poon@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Thu, 9 Apr 2020 14:08:08 +0000 (14:08 +0000)]
net: macsec: fix using wrong structure in macsec_changelink()
In the macsec_changelink(), "struct macsec_tx_sa tx_sc" is used to
store "macsec_secy.tx_sc".
But, the struct type of tx_sc is macsec_tx_sc, not macsec_tx_sa.
So, the macsec_tx_sc should be used instead.
Test commands:
ip link add dummy0 type dummy
ip link add macsec0 link dummy0 type macsec
ip link set macsec0 type macsec encrypt off
Splat looks like:
[61119.963483][ T9335] ==================================================================
[61119.964709][ T9335] BUG: KASAN: slab-out-of-bounds in macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.965787][ T9335] Read of size 160 at addr
ffff888020d69c68 by task ip/9335
[61119.966699][ T9335]
[61119.966979][ T9335] CPU: 0 PID: 9335 Comm: ip Not tainted 5.6.0+ #503
[61119.967791][ T9335] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[61119.968914][ T9335] Call Trace:
[61119.969324][ T9335] dump_stack+0x96/0xdb
[61119.969809][ T9335] ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.970554][ T9335] print_address_description.constprop.5+0x1be/0x360
[61119.971294][ T9335] ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.971973][ T9335] ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.972703][ T9335] __kasan_report+0x12a/0x170
[61119.973323][ T9335] ? macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.973942][ T9335] kasan_report+0xe/0x20
[61119.974397][ T9335] check_memory_region+0x149/0x1a0
[61119.974866][ T9335] memcpy+0x1f/0x50
[61119.975209][ T9335] macsec_changelink.part.34+0xb6/0x200 [macsec]
[61119.975825][ T9335] ? macsec_get_stats64+0x3e0/0x3e0 [macsec]
[61119.976451][ T9335] ? kernel_text_address+0x111/0x120
[61119.976990][ T9335] ? pskb_expand_head+0x25f/0xe10
[61119.977503][ T9335] ? stack_trace_save+0x82/0xb0
[61119.977986][ T9335] ? memset+0x1f/0x40
[61119.978397][ T9335] ? __nla_validate_parse+0x98/0x1ab0
[61119.978936][ T9335] ? macsec_alloc_tfm+0x90/0x90 [macsec]
[61119.979511][ T9335] ? __kasan_slab_free+0x111/0x150
[61119.980021][ T9335] ? kfree+0xce/0x2f0
[61119.980700][ T9335] ? netlink_trim+0x196/0x1f0
[61119.981420][ T9335] ? nla_memcpy+0x90/0x90
[61119.982036][ T9335] ? register_lock_class+0x19e0/0x19e0
[61119.982776][ T9335] ? memcpy+0x34/0x50
[61119.983327][ T9335] __rtnl_newlink+0x922/0x1270
[ ... ]
Fixes:
3cf3227a21d1 ("net: macsec: hardware offloading infrastructure")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Thu, 9 Apr 2020 13:41:26 +0000 (14:41 +0100)]
net-sysfs: remove redundant assignment to variable ret
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Wenhu [Thu, 9 Apr 2020 02:53:53 +0000 (19:53 -0700)]
net: qrtr: send msgs from local of same id as broadcast
If the local node id(qrtr_local_nid) is not modified after its
initialization, it equals to the broadcast node id(QRTR_NODE_BCAST).
So the messages from local node should not be taken as broadcast
and keep the process going to send them out anyway.
The definitions are as follow:
static unsigned int qrtr_local_nid = NUMA_NO_NODE;
Fixes:
fdf5fd397566 ("net: qrtr: Broadcast messages only from control port")
Signed-off-by: Wang Wenhu <wenhu.wang@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 9 Apr 2020 17:03:26 +0000 (10:03 -0700)]
Merge tag 'mlx5-fixes-2020-04-08' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2020-04-08
This series introduces some fixes to mlx5 driver.
Please pull and let me know if there is any problem.
For -stable v5.3
('net/mlx5: Fix frequent ioread PCI access during recovery')
('net/mlx5e: Add missing release firmware call')
For -stable v5.4
('net/mlx5e: Fix nest_level for vlan pop action')
('net/mlx5e: Fix pfnum in devlink port attribute')
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Lothar Rubusch [Wed, 8 Apr 2020 22:09:31 +0000 (22:09 +0000)]
Documentation: devlink: fix broken link warning
At 'make htmldocs' the following warning is thrown:
Documentation/networking/devlink/devlink-trap.rst:302:
WARNING: undefined label: generic-packet-trap-groups
Fixes the warning by setting the label to the specified header,
within the same document.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Blakey [Fri, 27 Mar 2020 09:12:31 +0000 (12:12 +0300)]
net/mlx5e: CT: Use rhashtable's ct entries instead of a separate list
Fixes CT entries list corruption.
After allowing parallel insertion/removals in upper nf flow table
layer, unprotected ct entries list can be corrupted by parallel add/del
on the same flow table.
CT entries list is only used while freeing a ct zone flow table to
go over all the ct entries offloaded on that zone/table, and flush
the table.
As rhashtable already provides an api to go over all the inserted entries,
fix the race by using the rhashtable iteration instead, and remove the list.
Fixes:
7da182a998d6 ("netfilter: flowtable: Use work entry per offload command")
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parav Pandit [Fri, 3 Apr 2020 07:35:46 +0000 (02:35 -0500)]
net/mlx5e: Fix devlink port netdev unregistration sequence
In cited commit netdevice is registered after devlink port.
Unregistration flow should be mirror sequence of registration flow.
Hence, unregister netdevice before devlink port.
Fixes:
31e87b39ba9d ("net/mlx5e: Fix devlink port register sequence")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Parav Pandit [Fri, 3 Apr 2020 08:57:30 +0000 (03:57 -0500)]
net/mlx5e: Fix pfnum in devlink port attribute
Cited patch missed to extract PCI pf number accurately for PF and VF
port flavour. It considered PCI device + function number.
Due to this, device having non zero device number shown large pfnum.
Hence, use only PCI function number; to avoid similar errors, derive
pfnum one time for all port flavours.
Fixes:
f60f315d339e ("net/mlx5e: Register devlink ports for physical link, PCI PF, VFs")
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Roi Dayan [Sun, 29 Mar 2020 15:54:10 +0000 (18:54 +0300)]
net/mlx5e: Fix missing pedit action after ct clear action
With ct clear action we should not allocate the action in hw
and not release the mod_acts parsed in advance.
It will be done when handling the ct clear action.
Fixes:
1ef3018f5af3 ("net/mlx5e: CT: Support clear action")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Dmytro Linkin [Wed, 1 Apr 2020 11:41:27 +0000 (14:41 +0300)]
net/mlx5e: Fix nest_level for vlan pop action
Current value of nest_level, assigned from net_device lower_level value,
does not reflect the actual number of vlan headers, needed to pop.
For ex., if we have untagged ingress traffic sended over vlan devices,
instead of one pop action, driver will perform two pop actions.
To fix that, calculate nest_level as difference between vlan device and
parent device lower_levels.
Fixes:
f3b0a18bb6cb ("net: remove unnecessary variables and callback")
Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eran Ben Elisha [Tue, 24 Mar 2020 13:04:26 +0000 (15:04 +0200)]
net/mlx5e: Add missing release firmware call
Once driver finishes flashing the firmware image, it should release it.
Fixes:
9c8bca2637b8 ("mlx5: Move firmware flash implementation to devlink")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Eli Cohen [Sun, 29 Mar 2020 04:10:43 +0000 (07:10 +0300)]
net/mlx5: Fix condition for termination table cleanup
When we destroy rules from slow path we need to avoid destroying
termination tables since termination tables are never created in slow
path. By doing so we avoid destroying the termination table created for the
slow path.
Fixes:
d8a2034f152a ("net/mlx5: Don't use termination tables in slow path")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Moshe Shemesh [Mon, 30 Mar 2020 07:21:49 +0000 (10:21 +0300)]
net/mlx5: Fix frequent ioread PCI access during recovery
High frequency of PCI ioread calls during recovery flow may cause the
following trace on powerpc:
[ 248.670288] EEH: 2100000 reads ignored for recovering device at
location=Slot1 driver=mlx5_core pci addr=0000:01:00.1
[ 248.670331] EEH: Might be infinite loop in mlx5_core driver
[ 248.670361] CPU: 2 PID: 35247 Comm: kworker/u192:11 Kdump: loaded
Tainted: G OE ------------ 4.14.0-115.14.1.el7a.ppc64le #1
[ 248.670425] Workqueue: mlx5_health0000:01:00.1 health_recover_work
[mlx5_core]
[ 248.670471] Call Trace:
[ 248.670492] [
c00020391c11b960] [
c000000000c217ac] dump_stack+0xb0/0xf4
(unreliable)
[ 248.670548] [
c00020391c11b9a0] [
c000000000045818]
eeh_check_failure+0x5c8/0x630
[ 248.670631] [
c00020391c11ba50] [
c00000000068fce4]
ioread32be+0x114/0x1c0
[ 248.670692] [
c00020391c11bac0] [
c00800000dd8b400]
mlx5_error_sw_reset+0x160/0x510 [mlx5_core]
[ 248.670752] [
c00020391c11bb60] [
c00800000dd75824]
mlx5_disable_device+0x34/0x1d0 [mlx5_core]
[ 248.670822] [
c00020391c11bbe0] [
c00800000dd8affc]
health_recover_work+0x11c/0x3c0 [mlx5_core]
[ 248.670891] [
c00020391c11bc80] [
c000000000164fcc]
process_one_work+0x1bc/0x5f0
[ 248.670955] [
c00020391c11bd20] [
c000000000167f8c]
worker_thread+0xac/0x6b0
[ 248.671015] [
c00020391c11bdc0] [
c000000000171618] kthread+0x168/0x1b0
[ 248.671067] [
c00020391c11be30] [
c00000000000b65c]
ret_from_kernel_thread+0x5c/0x80
Reduce the PCI ioread frequency during recovery by using msleep()
instead of cond_resched()
Fixes:
3e5b72ac2f29 ("net/mlx5: Issue SW reset on FW assert")
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Arnd Bergmann [Wed, 8 Apr 2020 18:54:43 +0000 (20:54 +0200)]
net/tls: fix const assignment warning
Building with some experimental patches, I came across a warning
in the tls code:
include/linux/compiler.h:215:30: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
215 | *(volatile typeof(x) *)&(x) = (val); \
| ^
net/tls/tls_main.c:650:4: note: in expansion of macro 'smp_store_release'
650 | smp_store_release(&saved_tcpv4_prot, prot);
This appears to be a legitimate warning about assigning a const pointer
into the non-const 'saved_tcpv4_prot' global. Annotate both the ipv4 and
ipv6 pointers 'const' to make the code internally consistent.
Fixes:
5bb4c45d466c ("net/tls: Read sk_prot once when building tls proto ops")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Weiß [Tue, 7 Apr 2020 11:11:48 +0000 (13:11 +0200)]
l2tp: Allow management of tunnels and session in user namespace
Creation and management of L2TPv3 tunnels and session through netlink
requires CAP_NET_ADMIN. However, a process with CAP_NET_ADMIN in a
non-initial user namespace gets an EPERM due to the use of the
genetlink GENL_ADMIN_PERM flag. Thus, management of L2TP VPNs inside
an unprivileged container won't work.
We replaced the GENL_ADMIN_PERM by the GENL_UNS_ADMIN_PERM flag
similar to other network modules which also had this problem, e.g.,
openvswitch (commit
4a92602aa1cd "openvswitch: allow management from
inside user namespaces") and nl80211 (commit
5617c6cd6f844 "nl80211:
Allow privileged operations from user namespaces").
I tested this in the container runtime trustm3 (trustm3.github.io)
and was able to create l2tp tunnels and sessions in unpriviliged
(user namespaced) containers using a private network namespace.
For other runtimes such as docker or lxc this should work, too.
Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 Apr 2020 20:02:32 +0000 (13:02 -0700)]
Merge branch 'ionic-fw-upgrade-filter-fixes'
Shannon Nelson says:
====================
ionic: fw upgrade filter fixes
With further testing of the fw-upgrade operations we found a
couple of issues that needed to be cleaned up:
- the filters other than the base MAC address need to be
reinstated into the device
- we don't need to remove the station MAC filter if it
isn't changing from a previous MAC filter
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Wed, 8 Apr 2020 16:19:12 +0000 (09:19 -0700)]
ionic: set station addr only if needed
The code was working too hard and in some cases was trying to
delete filters that weren't there, generating a potentially
misleading error message.
IONIC_CMD_RX_FILTER_DEL (32) failed: IONIC_RC_ENOENT (-2)
Fixes:
2a654540be10 ("ionic: Add Rx filter and rx_mode ndo support")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Wed, 8 Apr 2020 16:19:11 +0000 (09:19 -0700)]
ionic: replay filters after fw upgrade
The NIC's filters are lost in the midst of the fw-upgrade
so we need to replay them into the FW. We also remove the
unused ionic_rx_filter_del() function.
Fixes:
c672412f6172 ("ionic: remove lifs on fw reset")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Mashak [Tue, 7 Apr 2020 17:13:25 +0000 (13:13 -0400)]
tc-testing: remove duplicate code in tdc.py
In set_operation_mode() function remove duplicated check for args.list
parameter, which is already done one line before.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Tue, 7 Apr 2020 13:23:21 +0000 (13:23 +0000)]
hsr: check protocol version in hsr_newlink()
In the current hsr code, only 0 and 1 protocol versions are valid.
But current hsr code doesn't check the version, which is received by
userspace.
Test commands:
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1 version 4
In the test commands, version 4 is invalid.
So, the command should be failed.
After this patch, following error will occur.
"Error: hsr: Only versions 0..1 are supported."
Fixes:
ee1c27977284 ("net/hsr: Added support for HSR v1")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lothar Rubusch [Mon, 6 Apr 2020 21:29:20 +0000 (21:29 +0000)]
Documentation: mdio_bus.c - fix warnings
Fix wrong parameter description and related warnings at 'make htmldocs'.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Blakey [Mon, 6 Apr 2020 15:36:56 +0000 (18:36 +0300)]
net: sched: Fix setting last executed chain on skb extension
After driver sets the missed chain on the tc skb extension it is
consumed (deleted) by tc_classify_ingress and tc jumps to that chain.
If tc now misses on this chain (either no match, or no goto action),
then last executed chain remains 0, and the skb extension is not re-added,
and the next datapath (ovs) will start from 0.
Fix that by setting last executed chain to the chain read from the skb
extension, so if there is a miss, we set it back.
Fixes:
af699626ee26 ("net: sched: Support specifying a starting chain via tc skb ext")
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Konstantin Khlebnikov [Mon, 6 Apr 2020 11:39:32 +0000 (14:39 +0300)]
net: revert default NAPI poll timeout to 2 jiffies
For HZ < 1000 timeout 2000us rounds up to 1 jiffy but expires randomly
because next timer interrupt could come shortly after starting softirq.
For commonly used CONFIG_HZ=1000 nothing changes.
Fixes:
7acf8a1e8a28 ("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning")
Reported-by: Dmitry Yakunin <zeil@yandex-team.ru>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
René van Dorst [Sun, 5 Apr 2020 21:42:54 +0000 (05:42 +0800)]
net: ethernet: mediatek: move mt7623 settings out off the mt7530
Moving mt7623 logic out off mt7530, is required to make hardware setting
consistent after we introduce phylink to mtk driver.
Fixes:
b8fc9f30821e ("net: ethernet: mediatek: Add basic PHYLINK support")
Reviewed-by: Sean Wang <sean.wang@mediatek.com>
Tested-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
René van Dorst [Sun, 5 Apr 2020 21:42:53 +0000 (05:42 +0800)]
net: dsa: mt7530: move mt7623 settings out off the mt7530
Moving mt7623 logic out off mt7530, is required to make hardware setting
consistent after we introduce phylink to mtk driver.
Fixes:
ca366d6c889b ("net: dsa: mt7530: Convert to PHYLINK API")
Reviewed-by: Sean Wang <sean.wang@mediatek.com>
Tested-by: Sean Wang <sean.wang@mediatek.com>
Signed-off-by: René van Dorst <opensource@vdorst.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Stallard [Fri, 3 Apr 2020 20:26:21 +0000 (21:26 +0100)]
net: ipv6: do not consider routes via gateways for anycast address check
The behaviour for what is considered an anycast address changed in
commit
45e4fd26683c ("ipv6: Only create RTF_CACHE routes after
encountering pmtu exception"). This now considers the first
address in a subnet where there is a route via a gateway
to be an anycast address.
This breaks path MTU discovery and traceroutes when a host in a
remote network uses the address at the start of a prefix
(eg 2600:: advertised as 2600::/48 in the DFZ) as ICMP errors
will not be sent to anycast addresses.
This patch excludes any routes with a gateway, or via point to
point links, like the behaviour previously from
rt6_is_gw_or_nonexthop in net/ipv6/route.c.
This can be tested with:
ip link add v1 type veth peer name v2
ip netns add test
ip netns exec test ip link set lo up
ip link set v2 netns test
ip link set v1 up
ip netns exec test ip link set v2 up
ip addr add 2001:db8::1/64 dev v1 nodad
ip addr add 2001:db8:100:: dev lo nodad
ip netns exec test ip addr add 2001:db8::2/64 dev v2 nodad
ip netns exec test ip route add unreachable 2001:db8:1::1
ip netns exec test ip route add 2001:db8:100::/64 via 2001:db8::1
ip netns exec test sysctl net.ipv6.conf.all.forwarding=1
ip route add 2001:db8:1::1 via 2001:db8::2
ping -I 2001:db8::1 2001:db8:1::1 -c1
ping -I 2001:db8:100:: 2001:db8:1::1 -c1
ip addr delete 2001:db8:100:: dev lo
ip netns delete test
Currently the first ping will get back a destination unreachable ICMP
error, but the second will never get a response, with "icmp6_send:
acast source" logged. After this patch, both get destination
unreachable ICMP replies.
Fixes:
45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Tim Stallard <code@timstallard.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tim Stallard [Fri, 3 Apr 2020 20:22:57 +0000 (21:22 +0100)]
net: icmp6: do not select saddr from iif when route has prefsrc set
Since commit
fac6fce9bdb5 ("net: icmp6: provide input address for
traceroute6") ICMPv6 errors have source addresses from the ingress
interface. However, this overrides when source address selection is
influenced by setting preferred source addresses on routes.
This can result in ICMP errors being lost to upstream BCP38 filters
when the wrong source addresses are used, breaking path MTU discovery
and traceroute.
This patch sets the modified source address selection to only take place
when the route used has no prefsrc set.
It can be tested with:
ip link add v1 type veth peer name v2
ip netns add test
ip netns exec test ip link set lo up
ip link set v2 netns test
ip link set v1 up
ip netns exec test ip link set v2 up
ip addr add 2001:db8::1/64 dev v1 nodad
ip addr add 2001:db8::3 dev v1 nodad
ip netns exec test ip addr add 2001:db8::2/64 dev v2 nodad
ip netns exec test ip route add unreachable 2001:db8:1::1
ip netns exec test ip addr add 2001:db8:100::1 dev lo
ip netns exec test ip route add 2001:db8::1 dev v2 src 2001:db8:100::1
ip route add 2001:db8:1000::1 via 2001:db8::2
traceroute6 -s 2001:db8::1 2001:db8:1000::1
traceroute6 -s 2001:db8::3 2001:db8:1000::1
ip netns delete test
Output before:
$ traceroute6 -s 2001:db8::1 2001:db8:1000::1
traceroute to 2001:db8:1000::1 (2001:db8:1000::1), 30 hops max, 80 byte packets
1 2001:db8::2 (2001:db8::2) 0.843 ms !N 0.396 ms !N 0.257 ms !N
$ traceroute6 -s 2001:db8::3 2001:db8:1000::1
traceroute to 2001:db8:1000::1 (2001:db8:1000::1), 30 hops max, 80 byte packets
1 2001:db8::2 (2001:db8::2) 0.772 ms !N 0.257 ms !N 0.357 ms !N
After:
$ traceroute6 -s 2001:db8::1 2001:db8:1000::1
traceroute to 2001:db8:1000::1 (2001:db8:1000::1), 30 hops max, 80 byte packets
1 2001:db8:100::1 (2001:db8:100::1) 8.885 ms !N 0.310 ms !N 0.174 ms !N
$ traceroute6 -s 2001:db8::3 2001:db8:1000::1
traceroute to 2001:db8:1000::1 (2001:db8:1000::1), 30 hops max, 80 byte packets
1 2001:db8::2 (2001:db8::2) 1.403 ms !N 0.205 ms !N 0.313 ms !N
Fixes:
fac6fce9bdb5 ("net: icmp6: provide input address for traceroute6")
Signed-off-by: Tim Stallard <code@timstallard.me.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 Apr 2020 01:23:38 +0000 (18:23 -0700)]
Merge branch 'fec-fix-wake-on-lan'
Martin Fuzzey says:
====================
Fix Wake on lan with FEC on i.MX6
This series fixes WoL support with the FEC on i.MX6
The support was already in mainline but seems to have bitrotted
somewhat.
Only tested with i.MX6DL
Changes V2->V3
Patch 1:
fix non initialized variable introduced in V2 causing
probe to sometimes fail.
Patch 2:
remove /delete-property/interrupts-extended in
arch/arm/boot/dts/imx6qp.dtsi.
Patches 3 and 4:
Add received Acked-by and RB tags.
Changes V1->V2
Move the register offset and bit number from the DT to driver code
Add SOB from Fugang Duan for the NXP code on which this is based
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Fuzzey [Thu, 2 Apr 2020 13:51:30 +0000 (15:51 +0200)]
ARM: dts: imx6: add fec gpr property.
This is required for wake on lan on i.MX6
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Fuzzey [Thu, 2 Apr 2020 13:51:29 +0000 (15:51 +0200)]
dt-bindings: fec: document the new gpr property.
This property allows the gpr register bit to be defined
for wake on lan support.
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Reviewed-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Fuzzey [Thu, 2 Apr 2020 13:51:28 +0000 (15:51 +0200)]
ARM: dts: imx6: Use gpc for FEC interrupt controller to fix wake on LAN.
In order to wake from suspend by ethernet magic packets the GPC
must be used as intc does not have wakeup functionality.
But the FEC DT node currently uses interrupt-extended,
specificying intc, thus breaking WoL.
This problem is probably fallout from the stacked domain conversion
as intc used to chain to GPC.
So replace "interrupts-extended" by "interrupts" to use the default
parent which is GPC.
Fixes:
b923ff6af0d5 ("ARM: imx6: convert GPC to stacked domains")
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Fuzzey [Thu, 2 Apr 2020 13:51:27 +0000 (15:51 +0200)]
net: fec: set GPR bit on suspend by DT configuration.
On some SoCs, such as the i.MX6, it is necessary to set a bit
in the SoC level GPR register before suspending for wake on lan
to work.
The fec platform callback sleep_mode_enable was intended to allow this
but the platform implementation was NAK'd back in 2015 [1]
This means that, currently, wake on lan is broken on mainline for
the i.MX6 at least.
So implement the required bit setting in the fec driver by itself
by adding a new optional DT property indicating the GPR register
and adding the offset and bit information to the driver.
[1] https://www.spinics.net/lists/netdev/msg310922.html
Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lothar Rubusch [Tue, 7 Apr 2020 22:55:25 +0000 (22:55 +0000)]
net: sock.h: fix skb_steal_sock() kernel-doc
Fix warnings related to kernel-doc notation, and wording in
function description.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 8 Apr 2020 01:08:06 +0000 (18:08 -0700)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net, they are:
1) Fix spurious overlap condition in the rbtree tree, from Stefano Brivio.
2) Fix possible uninitialized pointer dereference in nft_lookup.
3) IDLETIMER v1 target matches the Android layout, from
Maciej Zenczykowski.
4) Dangling pointer in nf_tables_set_alloc_name, from Eric Dumazet.
5) Fix RCU warning splat in ipset find_set_type(), from Amol Grover.
6) Report EOPNOTSUPP on unsupported set flags and object types in sets.
7) Add NFT_SET_CONCAT flag to provide consistent error reporting
when users defines set with ranges in concatenations in old kernels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 7 Apr 2020 21:11:54 +0000 (14:11 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- a lot more of MM, quite a bit more yet to come: (memcg, pagemap,
vmalloc, pagealloc, migration, thp, ksm, madvise, virtio,
userfaultfd, memory-hotplug, shmem, rmap, zswap, zsmalloc, cleanups)
- various other subsystems (procfs, misc, MAINTAINERS, bitops, lib,
checkpatch, epoll, binfmt, kallsyms, reiserfs, kmod, gcov, kconfig,
ubsan, fault-injection, ipc)
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (158 commits)
ipc/shm.c: make compat_ksys_shmctl() static
ipc/mqueue.c: fix a brace coding style issue
lib/Kconfig.debug: fix a typo "capabilitiy" -> "capability"
ubsan: include bug type in report header
kasan: unset panic_on_warn before calling panic()
ubsan: check panic_on_warn
drivers/misc/lkdtm/bugs.c: add arithmetic overflow and array bounds checks
ubsan: split "bounds" checker from other options
ubsan: add trap instrumentation option
init/Kconfig: clean up ANON_INODES and old IO schedulers options
kernel/gcov/fs.c: replace zero-length array with flexible-array member
gcov: gcc_3_4: replace zero-length array with flexible-array member
gcov: gcc_4_7: replace zero-length array with flexible-array member
kernel/kmod.c: fix a typo "assuems" -> "assumes"
reiserfs: clean up several indentation issues
kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()
samples/hw_breakpoint: drop use of kallsyms_lookup_name()
samples/hw_breakpoint: drop HW_BREAKPOINT_R when reporting writes
fs/binfmt_elf.c: don't free interpreter's ELF pheaders on common path
fs/binfmt_elf.c: allocate less for static executable
...
Linus Torvalds [Tue, 7 Apr 2020 20:51:39 +0000 (13:51 -0700)]
Merge tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fix a page leak in nfs_destroy_unlinked_subrequests()
- Fix use-after-free issues in nfs_pageio_add_request()
- Fix new mount code constant_table array definitions
- finish_automount() requires us to hold 2 refs to the mount record
Features:
- Improve the accuracy of telldir/seekdir by using 64-bit cookies
when possible.
- Allow one RDMA active connection and several zombie connections to
prevent blocking if the remote server is unresponsive.
- Limit the size of the NFS access cache by default
- Reduce the number of references to credentials that are taken by
NFS
- pNFS files and flexfiles drivers now support per-layout segment
COMMIT lists.
- Enable partial-file layout segments in the pNFS/flexfiles driver.
- Add support for CB_RECALL_ANY to the pNFS flexfiles layout type
- pNFS/flexfiles Report NFS4ERR_DELAY and NFS4ERR_GRACE errors from
the DS using the layouterror mechanism.
Bugfixes and cleanups:
- SUNRPC: Fix krb5p regressions
- Don't specify NFS version in "UDP not supported" error
- nfsroot: set tcp as the default transport protocol
- pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid()
- alloc_nfs_open_context() must use the file cred when available
- Fix locking when dereferencing the delegation cred
- Fix memory leaks in O_DIRECT when nfs_get_lock_context() fails
- Various clean ups of the NFS O_DIRECT commit code
- Clean up RDMA connect/disconnect
- Replace zero-length arrays with C99-style flexible arrays"
* tag 'nfs-for-5.7-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (86 commits)
NFS: Clean up process of marking inode stale.
SUNRPC: Don't start a timer on an already queued rpc task
NFS/pnfs: Reference the layout cred in pnfs_prepare_layoutreturn()
NFS/pnfs: Fix dereference of layout cred in pnfs_layoutcommit_inode()
NFS: Beware when dereferencing the delegation cred
NFS: Add a module parameter to set nfs_mountpoint_expiry_timeout
NFS: finish_automount() requires us to hold 2 refs to the mount record
NFS: Fix a few constant_table array definitions
NFS: Try to join page groups before an O_DIRECT retransmission
NFS: Refactor nfs_lock_and_join_requests()
NFS: Reverse the submission order of requests in __nfs_pageio_add_request()
NFS: Clean up nfs_lock_and_join_requests()
NFS: Remove the redundant function nfs_pgio_has_mirroring()
NFS: Fix memory leaks in nfs_pageio_stop_mirroring()
NFS: Fix a request reference leak in nfs_direct_write_clear_reqs()
NFS: Fix use-after-free issues in nfs_pageio_add_request()
NFS: Fix races nfs_page_group_destroy() vs nfs_destroy_unlinked_subrequests()
NFS: Fix a page leak in nfs_destroy_unlinked_subrequests()
NFS: Remove unused FLUSH_SYNC support in nfs_initiate_pgio()
pNFS/flexfiles: Specify the layout segment range in LAYOUTGET
...
Linus Torvalds [Tue, 7 Apr 2020 20:48:26 +0000 (13:48 -0700)]
Merge tag 'f2fs-for-5.7-rc1' of git://git./linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've mainly focused on fixing bugs and addressing
issues in recently introduced compression support.
Enhancement:
- add zstd support, and set LZ4 by default
- add ioctl() to show # of compressed blocks
- show mount time in debugfs
- replace rwsem with spinlock
- avoid lock contention in DIO reads
Some major bug fixes wrt compression:
- compressed block count
- memory access and leak
- remove obsolete fields
- flag controls
Other bug fixes and clean ups:
- fix overflow when handling .flags in inode_info
- fix SPO issue during resize FS flow
- fix compression with fsverity enabled
- potential deadlock when writing compressed pages
- show missing mount options"
* tag 'f2fs-for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (66 commits)
f2fs: keep inline_data when compression conversion
f2fs: fix to disable compression on directory
f2fs: add missing CONFIG_F2FS_FS_COMPRESSION
f2fs: switch discard_policy.timeout to bool type
f2fs: fix to verify tpage before releasing in f2fs_free_dic()
f2fs: show compression in statx
f2fs: clean up dic->tpages assignment
f2fs: compress: support zstd compress algorithm
f2fs: compress: add .{init,destroy}_decompress_ctx callback
f2fs: compress: fix to call missing destroy_compress_ctx()
f2fs: change default compression algorithm
f2fs: clean up {cic,dic}.ref handling
f2fs: fix to use f2fs_readpage_limit() in f2fs_read_multi_pages()
f2fs: xattr.h: Make stub helpers inline
f2fs: fix to avoid double unlock
f2fs: fix potential .flags overflow on 32bit architecture
f2fs: fix NULL pointer dereference in f2fs_verity_work()
f2fs: fix to clear PG_error if fsverity failed
f2fs: don't call fscrypt_get_encryption_info() explicitly in f2fs_tmpfile()
f2fs: don't trigger data flush in foreground operation
...
Linus Torvalds [Tue, 7 Apr 2020 19:40:56 +0000 (12:40 -0700)]
Merge tag 'for-linus-5.7-rc1' of git://git./linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
- Fix for memory leaks around UBIFS orphan handling
- Fix for memory leaks around UBI fastmap
- Remove zero-length array from ubi-media.h
- Fix for TNC lookup in UBIFS orphan code
* tag 'for-linus-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubi: ubi-media.h: Replace zero-length array with flexible-array member
ubifs: Fix out-of-bounds memory access caused by abnormal value of node_len
ubi: fastmap: Only produce the initial anchor PEB when fastmap is used
ubi: fastmap: Free unused fastmap anchor peb during detach
ubifs: ubifs_add_orphan: Fix a memory leak bug
ubifs: ubifs_jnl_write_inode: Fix a memory leak bug
ubifs: Fix ubifs_tnc_lookup() usage in do_kill_orphans()
Linus Torvalds [Tue, 7 Apr 2020 19:36:09 +0000 (12:36 -0700)]
Merge tag 'for-linus-5.7-rc1' of git://git./linux/kernel/git/rw/uml
Pull UML updates from Richard Weinberger:
- New mode for time travel, external via virtio
- Fixes for ubd to make sure no requests can get lost
- Fixes for vector networking
- Allow CONFIG_STATIC_LINK only when possible
- Minor cleanups and fixes
* tag 'for-linus-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
um: Remove some unnecessary NULL checks in vector_user.c
um: vector: Avoid NULL ptr deference if transport is unset
um: Make CONFIG_STATIC_LINK actually static
um: Implement cpu_relax() as ndelay(1) for time-travel
um: Implement ndelay/udelay in time-travel mode
um: Implement time-travel=ext
um: virtio: Implement VHOST_USER_PROTOCOL_F_INBAND_NOTIFICATIONS
um: time-travel: Rewrite as an event scheduler
um: Move timer-internal.h to non-shared
hostfs: Use kasprintf() instead of fixed buffer formatting
um: falloc.h needs to be directly included for older libc
um: ubd: Retry buffer read on any kind of error
um: ubd: Prevent buffer overrun on command completion
um: Fix overlapping ELF segments when statically linked
um: Delete never executed timer
um: Don't overwrite ethtool driver version
um: Fix len of file in create_pid_file
um: Don't use console_drivers directly
um: Cleanup CONFIG_IOSCHED_CFQ
Linus Torvalds [Tue, 7 Apr 2020 19:33:37 +0000 (12:33 -0700)]
Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:
"A few cleanups all over the place, things of note:
- Enable the clone3 syscall
- Remove CONFIG_CROSS_COMPILE from Krzysztof Kozlowski
- Update to use mmgrab from Julia Lawall"
* tag 'for-linus' of git://github.com/openrisc/linux:
openrisc: Remove obsolete show_trace_task function
openrisc: Cleanup copy_thread_tls docs and comments
openrisc: Enable the clone3 syscall
openrisc: Convert copy_thread to copy_thread_tls
openrisc: use mmgrab
openrisc: configs: Cleanup CONFIG_CROSS_COMPILE
Linus Torvalds [Tue, 7 Apr 2020 19:30:41 +0000 (12:30 -0700)]
Merge branch 'parisc-5.7-1' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"Some cleanups in arch_rw locking functions, improved interrupt
handling in arch spinlocks, coversions to request_irq() and syscall
table generation cleanups"
* 'parisc-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: remove nargs from __SYSCALL
parisc: Refactor alternative code to accept multiple conditions
parisc: Rework arch_rw locking functions
parisc: Improve interrupt handling in arch_spin_lock_flags()
parisc: Replace setup_irq() by request_irq()
Linus Torvalds [Tue, 7 Apr 2020 19:26:07 +0000 (12:26 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc update from David Miller:
"A per-device DMA ops conversion for sparc32 by Chrstioph Hellwig"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc32: use per-device dma_ops
Linus Torvalds [Tue, 7 Apr 2020 19:16:15 +0000 (12:16 -0700)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE update from David Miller:
"As usual, very quiet in this subsystem.
Just a list_for_each_entry_safe() conversion"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
drivers/ide: Fix build regression.
drivers/ide: convert to list_for_each_entry_safe()
Linus Torvalds [Tue, 7 Apr 2020 19:03:32 +0000 (12:03 -0700)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Slave bond and team devices should not be assigned ipv6 link local
addresses, from Jarod Wilson.
2) Fix clock sink config on some at803x PHY devices, from Oleksij
Rempel.
3) Uninitialized stack space transmitted in slcan frames, fix from
Richard Palethorpe.
4) Guard HW VLAN ops properly in stmmac driver, from Jose Abreu.
5) "=" --> "|=" fix in aquantia driver, from Colin Ian King.
6) Fix TCP fallback in mptcp, from Florian Westphal. (accessing a plain
tcp_sk as if it were an mptcp socket).
7) Fix cavium driver in some configurations wrt. PTP, from Yue Haibing.
8) Make ipv6 and ipv4 consistent in the lower bound allowed for
neighbour entry retrans_time, from Hangbin Liu.
9) Don't use private workqueue in pegasus usb driver, from Petko
Manolov.
10) Fix integer overflow in mlxsw, from Colin Ian King.
11) Missing refcnt init in cls_tcindex, from Cong Wang.
12) One too many loop iterations when processing cmpri entries in ipv6
rpl code, from Alexander Aring.
13) Disable SG and TSO by default in r8169, from Heiner Kallweit.
14) NULL deref in macsec, from Davide Caratti.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (42 commits)
macsec: fix NULL dereference in macsec_upd_offload()
skbuff.h: Improve the checksum related comments
net: dsa: bcm_sf2: Ensure correct sub-node is parsed
qed: remove redundant assignment to variable 'rc'
wimax: remove some redundant assignments to variable result
mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_VLAN_MANGLE
mlxsw: spectrum_flower: Do not stop at FLOW_ACTION_PRIORITY
r8169: change back SG and TSO to be disabled by default
net: dsa: bcm_sf2: Do not register slave MDIO bus with OF
ipv6: rpl: fix loop iteration
tun: Don't put_page() for all negative return values from XDP program
net: dsa: mt7530: fix null pointer dereferencing in port5 setup
mptcp: add some missing pr_fmt defines
net: phy: micrel: kszphy_resume(): add delay after genphy_resume() before accessing PHY registers
net_sched: fix a missing refcnt in tcindex_init()
net: stmmac: dwmac1000: fix out-of-bounds mac address reg setting
mlxsw: spectrum_trap: fix unintention integer overflow on left shift
pegasus: Remove pegasus' own workqueue
neigh: support smaller retrans_time settting
net: openvswitch: use hlist_for_each_entry_rcu instead of hlist_for_each_entry
...
Linus Torvalds [Tue, 7 Apr 2020 18:01:37 +0000 (11:01 -0700)]
Merge branch 'pcmcia-next' of git://git./linux/kernel/git/brodo/linux
Pull pcmcia updates from Dominik Brodowski:
"A few PCMCIA odd fixes: removing a few spaces and useless casts,
replacing snprintf() with scnprintf(), and replacing zero-length
arrays with a flexible-array member"
* 'pcmcia-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux:
pcmcia: remove some unused space characters
pcmcia: soc_common.h: Replace zero-length array with flexible-array member
pcmcia: cs_internal.h: Replace zero-length array with flexible-array member
pcmcia: Use scnprintf() for avoiding potential buffer overflow
pcmcia: omap: remove useless cast for driver.name
Jason Yan [Tue, 7 Apr 2020 03:12:56 +0000 (20:12 -0700)]
ipc/shm.c: make compat_ksys_shmctl() static
Fix the following sparse warning:
ipc/shm.c:1335:6: warning: symbol 'compat_ksys_shmctl' was not declared.
Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200403063933.24785-1-yanaijie@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Somala Swaraj [Tue, 7 Apr 2020 03:12:53 +0000 (20:12 -0700)]
ipc/mqueue.c: fix a brace coding style issue
Signed-off-by: somala swaraj <somalaswaraj@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200301135530.18340-1-somalaswaraj@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qiujun Huang [Tue, 7 Apr 2020 03:12:49 +0000 (20:12 -0700)]
lib/Kconfig.debug: fix a typo "capabilitiy" -> "capability"
s/capabilitiy/capability
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/1585818594-27373-1-git-send-email-hqjagain@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:45 +0000 (20:12 -0700)]
ubsan: include bug type in report header
When syzbot tries to figure out how to deduplicate bug reports, it prefers
seeing a hint about a specific bug type (we can do better than just
"UBSAN"). This lifts the handler reason into the UBSAN report line that
includes the file path that tripped a check. Unfortunately, UBSAN does
not provide function names.
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Link: http://lkml.kernel.org/r/20200227193516.32566-7-keescook@chromium.org
Link: https://lore.kernel.org/lkml/CACT4Y+bsLJ-wFx_TaXqax3JByUOWB3uk787LsyMVcfW6JzzGvg@mail.gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:42 +0000 (20:12 -0700)]
kasan: unset panic_on_warn before calling panic()
As done in the full WARN() handler, panic_on_warn needs to be cleared
before calling panic() to avoid recursive panics.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Link: http://lkml.kernel.org/r/20200227193516.32566-6-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:38 +0000 (20:12 -0700)]
ubsan: check panic_on_warn
Syzkaller expects kernel warnings to panic when the panic_on_warn sysctl
is set. More work is needed here to have UBSan reuse the WARN
infrastructure, but for now, just check the flag manually.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Link: https://lore.kernel.org/lkml/CACT4Y+bsLJ-wFx_TaXqax3JByUOWB3uk787LsyMVcfW6JzzGvg@mail.gmail.com
Link: http://lkml.kernel.org/r/20200227193516.32566-5-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:34 +0000 (20:12 -0700)]
drivers/misc/lkdtm/bugs.c: add arithmetic overflow and array bounds checks
Adds LKDTM tests for arithmetic overflow (both signed and unsigned), as
well as array bounds checking.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Elena Petrova <lenaptr@google.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Link: http://lkml.kernel.org/r/20200227193516.32566-4-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:31 +0000 (20:12 -0700)]
ubsan: split "bounds" checker from other options
In order to do kernel builds with the bounds checker individually
available, introduce CONFIG_UBSAN_BOUNDS, with the remaining options under
CONFIG_UBSAN_MISC.
For example, using this, we can start to expand the coverage syzkaller is
providing. Right now, all of UBSan is disabled for syzbot builds because
taken as a whole, it is too noisy. This will let us focus on one feature
at a time.
For the bounds checker specifically, this provides a mechanism to
eliminate an entire class of array overflows with close to zero
performance overhead (I cannot measure a difference). In my (mostly)
defconfig, enabling bounds checking adds ~4200 checks to the kernel.
Performance changes are in the noise, likely due to the branch predictors
optimizing for the non-fail path.
Some notes on the bounds checker:
- it does not instrument {mem,str}*()-family functions, it only
instruments direct indexed accesses (e.g. "foo[i]"). Dealing with
the {mem,str}*()-family functions is a work-in-progress around
CONFIG_FORTIFY_SOURCE[1].
- it ignores flexible array members, including the very old single
byte (e.g. "int foo[1];") declarations. (Note that GCC's
implementation appears to ignore _all_ trailing arrays, but Clang only
ignores empty, 0, and 1 byte arrays[2].)
[1] https://github.com/KSPP/linux/issues/6
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92589
Suggested-by: Elena Petrova <lenaptr@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Link: http://lkml.kernel.org/r/20200227193516.32566-3-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:12:27 +0000 (20:12 -0700)]
ubsan: add trap instrumentation option
Patch series "ubsan: Split out bounds checker", v5.
This splits out the bounds checker so it can be individually used. This
is enabled in Android and hopefully for syzbot. Includes LKDTM tests for
behavioral corner-cases (beyond just the bounds checker), and adjusts
ubsan and kasan slightly for correct panic handling.
This patch (of 6):
The Undefined Behavior Sanitizer can operate in two modes: warning
reporting mode via lib/ubsan.c handler calls, or trap mode, which uses
__builtin_trap() as the handler. Using lib/ubsan.c means the kernel image
is about 5% larger (due to all the debugging text and reporting structures
to capture details about the warning conditions). Using the trap mode,
the image size changes are much smaller, though at the loss of the
"warning only" mode.
In order to give greater flexibility to system builders that want minimal
changes to image size and are prepared to deal with kernel code being
aborted and potentially destabilizing the system, this introduces
CONFIG_UBSAN_TRAP. The resulting image sizes comparison:
text data bss dec hex filename
19533663 6183037
18554956 44271656 2a38828 vmlinux.stock
19991849 7618513
18874448 46484810 2c54d4a vmlinux.ubsan
19712181 6284181
18366540 44362902 2a4ec96 vmlinux.ubsan-trap
CONFIG_UBSAN=y: image +4.8% (text +2.3%, data +18.9%)
CONFIG_UBSAN_TRAP=y: image +0.2% (text +0.9%, data +1.6%)
Additionally adjusts the CONFIG_UBSAN Kconfig help for clarity and removes
the mention of non-existing boot param "ubsan_handle".
Suggested-by: Elena Petrova <lenaptr@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Gustavo A. R. Silva" <gustavo@embeddedor.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Link: http://lkml.kernel.org/r/20200227193516.32566-2-keescook@chromium.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Krzysztof Kozlowski [Tue, 7 Apr 2020 03:12:02 +0000 (20:12 -0700)]
init/Kconfig: clean up ANON_INODES and old IO schedulers options
CONFIG_ANON_INODES is gone since commit
5dd50aaeb185 ("Make anon_inodes
unconditional").
CONFIG_CFQ_GROUP_IOSCHED was replaced with CONFIG_BFQ_GROUP_IOSCHED in
commit
f382fb0bcef4 ("block: remove legacy IO schedulers").
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200130192419.3026-1-krzk@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:11:58 +0000 (20:11 -0700)]
kernel/gcov/fs.c: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200302224851.GA26467@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:11:55 +0000 (20:11 -0700)]
gcov: gcc_3_4: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Oberparleiter <oberpar@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200302224501.GA14175@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:11:52 +0000 (20:11 -0700)]
gcov: gcc_4_7: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by this
change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Link: http://lkml.kernel.org/r/20200213152241.GA877@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qiujun Huang [Tue, 7 Apr 2020 03:11:49 +0000 (20:11 -0700)]
kernel/kmod.c: fix a typo "assuems" -> "assumes"
There is a typo in comment. Fix it. s/assuems/assumes/
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: http://lkml.kernel.org/r/1585891029-6450-1-git-send-email-hqjagain@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Ian King [Tue, 7 Apr 2020 03:11:46 +0000 (20:11 -0700)]
reiserfs: clean up several indentation issues
There are several places where code is indented incorrectly. Fix these.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200325135018.113431-1-colin.king@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Tue, 7 Apr 2020 03:11:43 +0000 (20:11 -0700)]
kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()
kallsyms_lookup_name() and kallsyms_on_each_symbol() are exported to
modules despite having no in-tree users and being wide open to abuse by
out-of-tree modules that can use them as a method to invoke arbitrary
non-exported kernel functions.
Unexport kallsyms_lookup_name() and kallsyms_on_each_symbol().
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: http://lkml.kernel.org/r/20200221114404.14641-4-will@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Tue, 7 Apr 2020 03:11:39 +0000 (20:11 -0700)]
samples/hw_breakpoint: drop use of kallsyms_lookup_name()
The 'data_breakpoint' test code is the only modular user of
kallsyms_lookup_name(), which was exported as part of fixing the test in
f60d24d2ad04 ("hw-breakpoints: Fix broken hw-breakpoint sample module").
In preparation for un-exporting this symbol, switch the test over to using
__symbol_get(), which can be used to place breakpoints on exported
symbols.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Link: http://lkml.kernel.org/r/20200221114404.14641-3-will@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Will Deacon [Tue, 7 Apr 2020 03:11:36 +0000 (20:11 -0700)]
samples/hw_breakpoint: drop HW_BREAKPOINT_R when reporting writes
Patch series "Unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()".
Despite having just a single modular in-tree user that I could spot,
kallsyms_lookup_name() is exported to modules and provides a mechanism
for out-of-tree modules to access and invoke arbitrary, non-exported
kernel symbols when kallsyms is enabled.
This patch series fixes up that one user and unexports the symbol along
with kallsyms_on_each_symbol(), since that could also be abused in a
similar manner.
I would like to avoid out-of-tree modules being easily able to call
functions that are not exported. kallsyms_lookup_name() makes this
trivial to the point that there is very little incentive to rework these
modules to either use upstream interfaces correctly or propose
functionality which may be otherwise missing upstream. Both of these
latter solutions would be pre-requisites to upstreaming these modules, and
the current state of things actively discourages that approach.
The background here is that we are aiming for Android devices to be able
to use a generic binary kernel image closely following upstream, with any
vendor extensions coming in as kernel modules. In this case, we (Google)
end up maintaining the binary module ABI within the scope of a single LTS
kernel. Monitoring and managing the ABI surface is not feasible if it
effectively includes all data and functions via kallsyms_lookup_name().
Of course, we could just carry this patch in the Android kernel tree, but
we're aiming to carry as little as possible (ideally nothing) and I think
it's a sensible change in its own right. I'm surprised you object to it,
in all honesty.
Now, you could turn around and say "that's not upstream's problem", but it
still seems highly undesirable to me to have an upstream bypass for
exported symbols that isn't even used by upstream modules. It's ripe for
abuse and encourages people to work outside of the upstream tree. The
usual rule is that we don't export symbols without a user in the tree and
that seems especially relevant in this case.
Joe Lawrence said:
: FWIW, kallsyms was historically used by the out-of-tree kpatch support
: module to resolve external symbols as well as call set_memory_r{w,o}()
: API. All of that support code has been merged upstream, so modern kpatch
: modules* no longer leverage kallsyms by default.
:
: That said, there are still some users who still use the deprecated support
: module with newer kernels, but that is not officially supported by the
: project.
This patch (of 3):
Given the name of a kernel symbol, the 'data_breakpoint' test claims to
"report any write operations on the kernel symbol". However, it creates
the breakpoint using both HW_BREAKPOINT_W and HW_BREAKPOINT_R, which menas
it also fires for read access.
Drop HW_BREAKPOINT_R from the breakpoint attributes.
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Quentin Perret <qperret@google.com>
Cc: K.Prasad <prasad@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Joe Lawrence <joe.lawrence@redhat.com>
Link: http://lkml.kernel.org/r/20200221114404.14641-2-will@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 7 Apr 2020 03:11:32 +0000 (20:11 -0700)]
fs/binfmt_elf.c: don't free interpreter's ELF pheaders on common path
Static executables don't need to free NULL pointer.
It doesn't matter really because static executable is not common scenario
but do it anyway out of pedantry.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200219185330.GA4933@avx2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 7 Apr 2020 03:11:29 +0000 (20:11 -0700)]
fs/binfmt_elf.c: allocate less for static executable
PT_INTERP ELF header can be spared if executable is static.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200219185012.GB4871@avx2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexey Dobriyan [Tue, 7 Apr 2020 03:11:26 +0000 (20:11 -0700)]
fs/binfmt_elf.c: delete "loc" variable
"loc" variable became just a wrapper for PT_INTERP ELF header after main
ELF header was moved to "bprm->buf". Delete it.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200219184847.GA4871@avx2
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jason Baron [Tue, 7 Apr 2020 03:11:23 +0000 (20:11 -0700)]
fs/epoll: make nesting accounting safe for -rt kernel
Davidlohr Bueso pointed out that when CONFIG_DEBUG_LOCK_ALLOC is set
ep_poll_safewake() can take several non-raw spinlocks after disabling
interrupts. Since a spinlock can block in the -rt kernel, we can't take a
spinlock after disabling interrupts. So let's re-work how we determine
the nesting level such that it plays nicely with the -rt kernel.
Let's introduce a 'nests' field in struct eventpoll that records the
current nesting level during ep_poll_callback(). Then, if we nest again
we can find the previous struct eventpoll that we were called from and
increase our count by 1. The 'nests' field is protected by
ep->poll_wait.lock.
I've also moved the visited field to reduce the size of struct eventpoll
from 184 bytes to 176 bytes on x86_64 for !CONFIG_DEBUG_LOCK_ALLOC, which
is typical for a production config.
Reported-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Roman Penyaev <rpenyaev@suse.de>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Link: http://lkml.kernel.org/r/1582739816-13167-1-git-send-email-jbaron@akamai.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Penyaev [Tue, 7 Apr 2020 03:11:20 +0000 (20:11 -0700)]
kselftest: introduce new epoll test case
This testcase repeats epollbug.c from the bug:
https://bugzilla.kernel.org/show_bug.cgi?id=205933
What it tests? It tests the race between epoll_ctl() and epoll_wait().
New event mask passed to epoll_ctl() triggers wake up, which can be missed
because of the bug described in the link. Reproduction is 100%, so easy
to fix. Kudos, Max, for wonderful test case.
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Max Neunhoeffer <max@arangodb.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Christopher Kohlhoff <chris.kohlhoff@clearpool.io>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Jes Sorensen <jes.sorensen@gmail.com>
Link: http://lkml.kernel.org/r/20200214170211.561524-2-rpenyaev@suse.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 7 Apr 2020 03:11:17 +0000 (20:11 -0700)]
checkpatch: avoid warning about uninitialized_var()
WARNING: function definition argument 'flags' should also have an identifier name
#26: FILE: drivers/tty/serial/sh-sci.c:1348:
+ unsigned long uninitialized_var(flags);
Special-case uninitialized_var() to prevent this.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/7db7944761b0bd88c70eb17d4b7f40fe589e14ed.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lubomir Rintel [Tue, 7 Apr 2020 03:11:13 +0000 (20:11 -0700)]
checkpatch: check proper licensing of Devicetree bindings
According to Devicetree maintainers (see Link: below), the Devicetree
binding documents are preferrably licensed (GPL-2.0-only OR BSD-2-Clause).
Let's check that. The actual check is a bit more relaxed, to allow more
liberal but compatible licensing (e.g. GPL-2.0-or-later OR BSD-2-Clause).
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Neil Armstrong <narmstrong@baylibre.com>
Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>,
Cc: Jonas Karlman <jonas@kwiboo.se>,
Cc: Jernej Skrabec <jernej.skrabec@siol.net>,
Cc: Mark Rutland <mark.rutland@arm.com>,
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>,
Link: https://lore.kernel.org/lkml/20200108142132.GA4830@bogus/
Link: http://lkml.kernel.org/r/20200309215153.38824-1-lkundrak@v3.sk
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 7 Apr 2020 03:11:10 +0000 (20:11 -0700)]
checkpatch: improve Gerrit Change-Id: test
The Gerrit Change-Id: entry is sometimes placed after a Signed-off-by:
line. When this occurs, the Gerrit warning is not currently emitted as
the first Signed-off-by: signature sets a flag to stop looking.
Change the test to add a test for the --- patch separator and emit the
warning before any before the --- and also before any diff file name.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/2f6d5f8766fe7439a116c77ea8cc721a3f2d77a2.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Antonio Borneo [Tue, 7 Apr 2020 03:11:07 +0000 (20:11 -0700)]
checkpatch: add command-line option for TAB size
Linux kernel coding style requires a size of 8 characters for both TAB and
indentation, and such value is embedded as magic value allover the
checkpatch script.
This makes hard to reuse the script by other projects with different
requirements in their coding style (e.g. OpenOCD [1] requires TAB size of
4 characters [2]).
Replace the magic value 8 with a variable.
Add a command-line option "--tab-size" to let the user select a
TAB size value other than 8.
[1] http://openocd.org/
[2] http://openocd.org/doc/doxygen/html/stylec.html#styleformat
Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Erik Ahlén <erik.ahlen@avalonenterprise.com>
Signed-off-by: Spencer Oliver <spen@spen-soft.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200122163852.124417-3-borneo.antonio@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Antonio Borneo [Tue, 7 Apr 2020 03:11:04 +0000 (20:11 -0700)]
checkpatch: fix multiple const * types
Commit
1574a29f8e76 ("checkpatch: allow multiple const * types") claims to
support repetition of pattern "const *", but it actually allows only one
extra instance.
Check the following lines
int a(char const * const x[]);
int b(char const * const *x);
int c(char const * const * const x[]);
int d(char const * const * const *x);
with command
./scripts/checkpatch.pl --show-types -f filename
to find that only the first line passes the test, while a warning
is triggered by the other 3 lines:
WARNING:FUNCTION_ARGUMENTS: function definition argument
'char const * const' should also have an identifier name
The reason is that the pattern match halts at the second asterisk in the
line, thus the remaining text starting with asterisk fails to match a
valid name for a variable.
Fixed by replacing "?" (Match 1 or 0 times) with "{0,4}" (Match no more
than 4 times) in the regular expression. Fix also the similar test for
types in unusual order.
Fixes:
1574a29f8e76 ("checkpatch: allow multiple const * types")
Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200122163852.124417-1-borneo.antonio@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Antonio Borneo [Tue, 7 Apr 2020 03:11:01 +0000 (20:11 -0700)]
checkpatch: fix minor typo and mixed space+tab in indentation
Fix spelling of "concatenation".
Don't use tab after space in indentation.
Signed-off-by: Antonio Borneo <borneo.antonio@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Joe Perches <joe@perches.com>
Link: http://lkml.kernel.org/r/20200122163852.124417-2-borneo.antonio@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 7 Apr 2020 03:10:58 +0000 (20:10 -0700)]
checkpatch: prefer fallthrough; over fallthrough comments
commit
294f69e662d1 ("compiler_attributes.h: Add 'fallthrough' pseudo
keyword for switch/case use") added the pseudo keyword so add a test for
it to checkpatch.
Warn on a patch or use --strict for files.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/8b6c1b9031ab9f3cdebada06b8d46467f1492d68.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
John Hubbard [Tue, 7 Apr 2020 03:10:55 +0000 (20:10 -0700)]
checkpatch: support "base-commit:" format
In order to support the get-lore-mbox.py tool described in [1], I ran:
git format-patch --base=<commit> --cover-letter <revrange>
... which generated a "base-commit: <commit-hash>" tag at the end of the
cover letter. However, checkpatch.pl generated an error upon encounting
"base-commit:" in the cover letter:
"ERROR: Please use git commit description style..."
... because it found the "commit" keyword, and failed to recognize that
it was part of the "base-commit" phrase, and as such, should not be
subjected to the same commit description style rules.
Update checkpatch.pl to include a special case for "base-commit:" (at the
start of the line, possibly with some leading whitespace) so that that tag
no longer generates a checkpatch error.
[1] https://lwn.net/Articles/811528/ "Better tools for kernel
developers"
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Andy Whitcroft <apw@canonical.com>
Cc: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Link: http://lkml.kernel.org/r/20200213055004.69235-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lubomir Rintel [Tue, 7 Apr 2020 03:10:51 +0000 (20:10 -0700)]
checkpatch: check SPDX tags in YAML files
This adds a warning when a YAML file is lacking a SPDX header on first
line, or it uses incorrect commenting style.
Currently the only YAML files in the tree are Devicetree binding
documents.
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Joe Perches <joe@perches.com>
Cc: Rob Herring <robh@kernel.org>
Link: http://lkml.kernel.org/r/20200129123356.388669-1-lkundrak@v3.sk
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 7 Apr 2020 03:10:48 +0000 (20:10 -0700)]
checkpatch: remove email address comment from email address comparisons
About 2% of the last 100K commits have email addresses that include an
RFC2822 compliant comment like:
Peter Zijlstra (Intel) <peterz@infradead.org>
checkpatch currently does a comparison of the complete name and address to
the submitted author to determine if the author has signed-off and emits a
warning if the exact email names and addresses do not match.
Unfortunately, the author email address can be written without the comment
like:
Peter Zijlstra <peterz@infradead.org>
Add logic to compare the comment stripped email addresses to avoid this
warning.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ebaa2f7c8f94e25520981945cddcc1982e70e072.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nathan Chancellor [Tue, 7 Apr 2020 03:10:45 +0000 (20:10 -0700)]
lib/dynamic_debug.c: use address-of operator on section symbols
Clang warns:
../lib/dynamic_debug.c:1034:24: warning: array comparison always
evaluates to false [-Wtautological-compare]
if (__start___verbose == __stop___verbose) {
^
1 warning generated.
These are not true arrays, they are linker defined symbols, which are just
addresses. Using the address of operator silences the warning and does
not change the resulting assembly with either clang/ld.lld or gcc/ld
(tested with diff + objdump -Dr).
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Jason Baron <jbaron@akamai.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/894
Link: http://lkml.kernel.org/r/20200220051320.10739-1-natechancellor@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rikard Falkeborn [Tue, 7 Apr 2020 03:10:38 +0000 (20:10 -0700)]
linux/bits.h: add compile time sanity check of GENMASK inputs
GENMASK() and GENMASK_ULL() are supposed to be called with the high bit as
the first argument and the low bit as the second argument. Mixing them
will return a mask with zero bits set.
Recent commits show getting this wrong is not uncommon, see e.g. commit
aa4c0c9091b0 ("net: stmmac: Fix misuses of GENMASK macro") and commit
9bdd7bb3a844 ("clocksource/drivers/npcm: Fix misuse of GENMASK macro").
To prevent such mistakes from appearing again, add compile time sanity
checking to the arguments of GENMASK() and GENMASK_ULL(). If both
arguments are known at compile time, and the low bit is higher than the
high bit, break the build to detect the mistake immediately.
Since GENMASK() is used in declarations, BUILD_BUG_ON_ZERO() must be used
instead of BUILD_BUG_ON().
__builtin_constant_p does not evaluate is argument, it only checks if it
is a constant or not at compile time, and __builtin_choose_expr does not
evaluate the expression that is not chosen. Therefore, GENMASK(x++, 0)
does only evaluate x++ once.
Commit
95b980d62d52 ("linux/bits.h: make BIT(), GENMASK(), and friends
available in assembly") made the macros in linux/bits.h available in
assembly. Since BUILD_BUG_OR_ZERO() is not asm compatible, disable the
checks if the file is included in an asm file.
Due to bugs in GCC versions before 4.9 [0], disable the check if building
with a too old GCC compiler.
[0]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19449
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Haren Myneni <haren@us.ibm.com>
Cc: Joe Perches <joe@perches.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: lkml <linux-kernel@vger.kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20200308193954.2372399-1-rikard.falkeborn@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dan Carpenter [Tue, 7 Apr 2020 03:10:35 +0000 (20:10 -0700)]
lib/test_kmod.c: remove a NULL test
The "info" pointer has already been dereferenced so checking here is too
late. Fortunately, we never pass NULL pointers to the
test_kmod_put_module() function so the test can simply be removed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Link: http://lkml.kernel.org/r/20200228092452.vwkhthsn77nrxdy6@kili.mountain
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
chenqiwu [Tue, 7 Apr 2020 03:10:31 +0000 (20:10 -0700)]
lib/rbtree: fix coding style of assignments
Leave blank space between the right-hand and left-hand side of the
assignment to meet the kernel coding style better.
Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Michel Lespinasse <walken@google.com>
Link: http://lkml.kernel.org/r/1582621140-25850-1-git-send-email-qiwuchen55@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Shevchenko [Tue, 7 Apr 2020 03:10:28 +0000 (20:10 -0700)]
lib/test_bitmap.c: make use of EXP2_IN_BITS
Commit
30544ed5de43 ("lib/bitmap: introduce bitmap_replace() helper")
introduced some new test cases to the test_bitmap.c module. Among these
it also introduced an (unused) definition. Let's make use of
EXP2_IN_BITS.
Reported-by: Alex Shi <alex.shi@linux.alibaba.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Link: http://lkml.kernel.org/r/20200121151847.75223-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Tue, 7 Apr 2020 03:10:25 +0000 (20:10 -0700)]
percpu_counter: fix a data race at vm_committed_as
"vm_committed_as.count" could be accessed concurrently as reported by
KCSAN,
BUG: KCSAN: data-race in __vm_enough_memory / percpu_counter_add_batch
write to 0xffffffff9451c538 of 8 bytes by task 65879 on cpu 35:
percpu_counter_add_batch+0x83/0xd0
percpu_counter_add_batch at lib/percpu_counter.c:91
__vm_enough_memory+0xb9/0x260
dup_mm+0x3a4/0x8f0
copy_process+0x2458/0x3240
_do_fork+0xaa/0x9f0
__do_sys_clone+0x125/0x160
__x64_sys_clone+0x70/0x90
do_syscall_64+0x91/0xb05
entry_SYSCALL_64_after_hwframe+0x49/0xbe
read to 0xffffffff9451c538 of 8 bytes by task 66773 on cpu 19:
__vm_enough_memory+0x199/0x260
percpu_counter_read_positive at include/linux/percpu_counter.h:81
(inlined by) __vm_enough_memory at mm/util.c:839
mmap_region+0x1b2/0xa10
do_mmap+0x45c/0x700
vm_mmap_pgoff+0xc0/0x130
ksys_mmap_pgoff+0x6e/0x300
__x64_sys_mmap+0x33/0x40
do_syscall_64+0x91/0xb05
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The read is outside percpu_counter::lock critical section which results in
a data race. Fix it by adding a READ_ONCE() in
percpu_counter_read_positive() which could also service as the existing
compiler memory barrier.
Signed-off-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Marco Elver <elver@google.com>
Link: http://lkml.kernel.org/r/1582302724-2804-1-git-send-email-cai@lca.pw
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Tue, 7 Apr 2020 03:10:22 +0000 (20:10 -0700)]
kasan: stackdepot: move filter_irq_stacks() to stackdepot.c
filter_irq_stacks() can be used by other tools (e.g. KMSAN), so it needs
to be moved to a common location. lib/stackdepot.c seems a good place, as
filter_irq_stacks() is usually applied to the output of
stack_trace_save().
This patch has been previously mailed as part of KMSAN RFC patch series.
[glider@google.co: nds32: linker script: add SOFTIRQENTRY_TEXT\
Link: http://lkml.kernel.org/r/20200311121002.241430-1-glider@google.com
[glider@google.com: add IRQENTRY_TEXT and SOFTIRQENTRY_TEXT to linker script]
Link: http://lkml.kernel.org/r/20200311121124.243352-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Link: http://lkml.kernel.org/r/20200220141916.55455-3-glider@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Tue, 7 Apr 2020 03:10:19 +0000 (20:10 -0700)]
lib/stackdepot.c: build with -fno-builtin
Clang may replace stackdepot_memcmp() with a call to instrumented bcmp(),
which is exactly what we wanted to avoid creating stackdepot_memcmp().
Building the file with -fno-builtin prevents such optimizations.
This patch has been previously mailed as part of KMSAN RFC patch series.
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Link: http://lkml.kernel.org/r/20200220141916.55455-2-glider@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Tue, 7 Apr 2020 03:10:15 +0000 (20:10 -0700)]
lib/stackdepot.c: check depot_index before accessing the stack slab
Avoid crashes on corrupted stack ids. Despite stack ID corruption may
indicate other bugs in the program, we'd better fail gracefully on such
IDs instead of crashing the kernel.
This patch has been previously mailed as part of KMSAN RFC patch series.
Link: http://lkml.kernel.org/r/20200220141916.55455-1-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
From: Dan Carpenter <dan.carpenter@oracle.com>
Subject: lib/stackdepot.c: fix a condition in stack_depot_fetch()
We should check for a NULL pointer first before adding the offset.
Otherwise if the pointer is NULL and the offset is non-zero, it will lead
to an Oops.
Fixes:
d45048e65a59 ("lib/stackdepot.c: check depot_index before accessing the stack slab")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Alexander Potapenko <glider@google.com>
Link: http://lkml.kernel.org/r/20200312113006.GA20562@mwanda
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Kees Cook [Tue, 7 Apr 2020 03:10:12 +0000 (20:10 -0700)]
lib: test_stackinit.c: XFAIL switch variable init tests
The tests for initializing a variable defined between a switch statement's
test and its first "case" statement are currently not initialized in
Clang[1] nor the proposed auto-initialization feature in GCC.
We should retain the test (so that we can evaluate compiler fixes), but
mark it as an "expected fail". The rest of the kernel source will be
adjusted to avoid this corner case.
Also disable -Wswitch-unreachable for the test so that the intentionally
broken code won't trigger warnings for GCC (nor future Clang) when
initialization happens this unhandled place.
[1] https://bugs.llvm.org/show_bug.cgi?id=44916
Suggested-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Jann Horn <jannh@google.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Link: http://lkml.kernel.org/r/202002191358.2897A07C6@keescook
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Geert Uytterhoeven [Tue, 7 Apr 2020 03:10:09 +0000 (20:10 -0700)]
lib/scatterlist: fix sg_copy_buffer() kerneldoc
Add the missing closing parenthesis to the description for the to_buffer
parameter of sg_copy_buffer().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com
Link: http://lkml.kernel.org/r/20200212084241.8778-1-geert+renesas@glider.be
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:10:06 +0000 (20:10 -0700)]
lib/ts_kmp.c: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertenly introduced[3] to the codebase from now on.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200211205948.GA26459@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:10:03 +0000 (20:10 -0700)]
lib/ts_fsm.c: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertenly introduced[3] to the codebase from now on.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200211205813.GA25602@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:10:00 +0000 (20:10 -0700)]
lib/ts_bm.c: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertenly introduced[3] to the codebase from now on.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200211205620.GA24694@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Gustavo A. R. Silva [Tue, 7 Apr 2020 03:09:57 +0000 (20:09 -0700)]
lib/bch.c: replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language extension
to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning in
case the flexible array does not occur last in the structure, which will
help us prevent some kind of undefined behavior bugs from being
inadvertenly introduced[3] to the codebase from now on.
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit
76497732932f ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200211205119.GA21234@embeddedor
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Tue, 7 Apr 2020 03:09:54 +0000 (20:09 -0700)]
lib/test_lockup.c: add parameters for locking generic vfs locks
file_path=<path> defines file or directory to open
lock_inode=Y set lock_rwsem_ptr to inode->i_rwsem
lock_mapping=Y set lock_rwsem_ptr to mapping->i_mmap_rwsem
lock_sb_umount=Y set lock_rwsem_ptr to sb->s_umount
This gives safe and simple way to see how system reacts to contention of
common vfs locks and how syscalls depend on them directly or indirectly.
For example to block s_umount for 60 seconds:
# modprobe test_lockup file_path=. lock_sb_umount time_secs=60 state=S
This is useful for checking/testing scalability issues like this:
https://lore.kernel.org/lkml/
158497590858.7371.
9311902565121473436.stgit@buzz/
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/158498153964.5621.83061779039255681.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Colin Ian King [Tue, 7 Apr 2020 03:09:50 +0000 (20:09 -0700)]
lib/test_lockup.c: fix spelling mistake "iteraions" -> "iterations"
There is a spelling mistake in a pr_notice message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20200221155145.79522-1-colin.king@canonical.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Konstantin Khlebnikov [Tue, 7 Apr 2020 03:09:47 +0000 (20:09 -0700)]
lib/test_lockup: test module to generate lockups
CONFIG_TEST_LOCKUP=m adds module "test_lockup" that helps to make sure
that watchdogs and lockup detectors are working properly.
Depending on module parameters test_lockup could emulate soft or hard
lockup, "hung task", hold arbitrary lock, allocate bunch of pages.
Also it could generate series of lockups with cooling-down periods, in
this way it could be used as "ping" for locks or page allocator. Loop
checks signals between iteration thus could be stopped by ^C.
# modinfo test_lockup
...
parm: time_secs:lockup time in seconds, default 0 (uint)
parm: time_nsecs:nanoseconds part of lockup time, default 0 (uint)
parm: cooldown_secs:cooldown time between iterations in seconds, default 0 (uint)
parm: cooldown_nsecs:nanoseconds part of cooldown, default 0 (uint)
parm: iterations:lockup iterations, default 1 (uint)
parm: all_cpus:trigger lockup at all cpus at once (bool)
parm: state:wait in 'R' running (default), 'D' uninterruptible, 'K' killable, 'S' interruptible state (charp)
parm: use_hrtimer:use high-resolution timer for sleeping (bool)
parm: iowait:account sleep time as iowait (bool)
parm: lock_read:lock read-write locks for read (bool)
parm: lock_single:acquire locks only at one cpu (bool)
parm: reacquire_locks:release and reacquire locks/irq/preempt between iterations (bool)
parm: touch_softlockup:touch soft-lockup watchdog between iterations (bool)
parm: touch_hardlockup:touch hard-lockup watchdog between iterations (bool)
parm: call_cond_resched:call cond_resched() between iterations (bool)
parm: measure_lock_wait:measure lock wait time (bool)
parm: lock_wait_threshold:print lock wait time longer than this in nanoseconds, default off (ulong)
parm: disable_irq:disable interrupts: generate hard-lockups (bool)
parm: disable_softirq:disable bottom-half irq handlers (bool)
parm: disable_preempt:disable preemption: generate soft-lockups (bool)
parm: lock_rcu:grab rcu_read_lock: generate rcu stalls (bool)
parm: lock_mmap_sem:lock mm->mmap_sem: block procfs interfaces (bool)
parm: lock_rwsem_ptr:lock rw_semaphore at address (ulong)
parm: lock_mutex_ptr:lock mutex at address (ulong)
parm: lock_spinlock_ptr:lock spinlock at address (ulong)
parm: lock_rwlock_ptr:lock rwlock at address (ulong)
parm: alloc_pages_nr:allocate and free pages under locks (uint)
parm: alloc_pages_order:page order to allocate (uint)
parm: alloc_pages_gfp:allocate pages with this gfp_mask, default GFP_KERNEL (uint)
parm: alloc_pages_atomic:allocate pages with GFP_ATOMIC (bool)
parm: reallocate_pages:free and allocate pages between iterations (bool)
Parameters for locking by address are unsafe and taints kernel. With
CONFIG_DEBUG_SPINLOCK=y they at least check magics for embedded spinlocks.
Examples:
task hang in D-state:
modprobe test_lockup time_secs=1 iterations=60 state=D
task hang in io-wait D-state:
modprobe test_lockup time_secs=1 iterations=60 state=D iowait
softlockup:
modprobe test_lockup time_secs=1 iterations=60 state=R
hardlockup:
modprobe test_lockup time_secs=1 iterations=60 state=R disable_irq
system-wide hardlockup:
modprobe test_lockup time_secs=1 iterations=60 state=R \
disable_irq all_cpus
rcu stall:
modprobe test_lockup time_secs=1 iterations=60 state=R \
lock_rcu touch_softlockup
lock mmap_sem / block procfs interfaces:
modprobe test_lockup time_secs=1 iterations=60 state=S lock_mmap_sem
lock tasklist_lock for read / block forks:
TASKLIST_LOCK=$(awk '$3 == "tasklist_lock" {print "0x"$1}' /proc/kallsyms)
modprobe test_lockup time_secs=1 iterations=60 state=R \
disable_irq lock_read lock_rwlock_ptr=$TASKLIST_LOCK
lock namespace_sem / block vfs mount operations:
NAMESPACE_SEM=$(awk '$3 == "namespace_sem" {print "0x"$1}' /proc/kallsyms)
modprobe test_lockup time_secs=1 iterations=60 state=S \
lock_rwsem_ptr=$NAMESPACE_SEM
lock cgroup mutex / block cgroup operations:
CGROUP_MUTEX=$(awk '$3 == "cgroup_mutex" {print "0x"$1}' /proc/kallsyms)
modprobe test_lockup time_secs=1 iterations=60 state=S \
lock_mutex_ptr=$CGROUP_MUTEX
ping cgroup_mutex every second and measure maximum lock wait time:
modprobe test_lockup cooldown_secs=1 iterations=60 state=S \
lock_mutex_ptr=$CGROUP_MUTEX reacquire_locks measure_lock_wait
[linux@roeck-us.net: rename disable_irq to fix build error]
Link: http://lkml.kernel.org/r/20200317133614.23152-1-linux@roeck-us.net
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru
Cc: Colin Ian King <colin.king@canonical.com>
Cc: Guenter Roeck <linux@roeck-us.net>
Link: http://lkml.kernel.org/r/158132859146.2797.525923171323227836.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Josh Poimboeuf [Tue, 7 Apr 2020 03:09:43 +0000 (20:09 -0700)]
bitops: always inline sign extension helpers
With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
from the user_access_begin/end critical region (i.e, with SMAP disabled).
While it's probably harmless in this case, in general we like to avoid
extra function calls in SMAP-disabled regions because it can open up
inadvertent security holes.
Fix the warning by changing the sign extension helpers to __always_inline.
This convinces GCC to inline gen8_canonical_addr().
The sign extension functions are trivial anyway, so it makes sense to
always inline them. With my test optimize-for-size-based config, this
actually shrinks the text size of i915_gem_execbuffer.o by 45 bytes -- and
no change for vmlinux.
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://lkml.kernel.org/r/740179324b2b18b750b16295c48357f00b5fa9ed.1582982020.git.jpoimboe@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joe Perches [Tue, 7 Apr 2020 03:09:40 +0000 (20:09 -0700)]
MAINTAINERS: list the section entries in the preferred order
The MAINTAINERS file header has never shown a preferred order for the
section entries but scripts/parse-maintainers.pl added a preferred order
with commit
61f741645a35 ("parse-maintainers: Add section pattern
sorting")
Commit
5cdbec108fd2 ("parse-maintainers: Do not sort section content by
default") changed the preferred order to be a bit more sensible.
Update the MAINTAINERS section description block to use this preferred
section entry ordering.
Add a slightly better description for the N: entry too.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Link: http://lkml.kernel.org/r/5aa5aad6fb1678230c260337dc066cd449a2bf32.camel@perches.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>