Dominik Brodowski [Wed, 1 Jan 2020 19:05:03 +0000 (20:05 +0100)]
Revert "fs: remove ksys_dup()"
This reverts commit
8243186f0cc7 ("fs: remove ksys_dup()") and the
subsequent fix for it in commit
2d3145f8d280 ("early init: fix error
handling when opening /dev/console").
Trying to use filp_open() and f_dupfd() instead of pseudo-syscalls
caused more trouble than what is worth it: it requires accessing vfs
internals and it turns out there were other bugs in it too.
In particular, the file reference counting was wrong - because unlike
the original "open+2*dup" sequence it used "filp_open+3*f_dupfd" and
thus had an extra leaked file reference.
That in turn then caused odd problems with Androidx86 long after boot
becaue of how the extra reference to the console kept the session active
even after all file descriptors had been closed.
Reported-by: youling 257 <youling257@gmail.com>
Cc: Arvind Sankar <nivedita@alum.mit.edu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 31 Dec 2019 19:14:58 +0000 (11:14 -0800)]
Merge git://git./linux/kernel/git/netdev/net
Pull networking fixes from David Miller:
1) Fix big endian overflow in nf_flow_table, from Arnd Bergmann.
2) Fix port selection on big endian in nft_tproxy, from Phil Sutter.
3) Fix precision tracking for unbound scalars in bpf verifier, from
Daniel Borkmann.
4) Fix integer overflow in socket rcvbuf check in UDP, from Antonio
Messina.
5) Do not perform a neigh confirmation during a pmtu update over a
tunnel, from Hangbin Liu.
6) Fix DMA mapping leak in dpaa_eth driver, from Madalin Bucur.
7) Various PTP fixes for sja1105 dsa driver, from Vladimir Oltean.
8) Add missing to dummy definition of of_mdiobus_child_is_phy(), from
Geert Uytterhoeven
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename()
net/sched: add delete_empty() to filters and use it in cls_flower
tcp: Fix highest_sack and highest_sack_seq
ptp: fix the race between the release of ptp_clock and cdev
net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation
net: dsa: sja1105: Remove restriction of zero base-time for taprio offload
net: dsa: sja1105: Really make the PTP command read-write
net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
cxgb4/cxgb4vf: fix flow control display for auto negotiation
mlxsw: spectrum: Use dedicated policer for VRRP packets
mlxsw: spectrum_router: Skip loopback RIFs during MAC validation
net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device
net_sched: sch_fq: properly set sk->sk_pacing_status
bnx2x: Fix accounting of vlan resources among the PFs
bnx2x: Use appropriate define for vlan credit
of: mdio: Add missing inline to of_mdiobus_child_is_phy() dummy
net: phy: aquantia: add suspend / resume ops for AQR105
dpaa_eth: fix DMA mapping leak
...
Linus Torvalds [Tue, 31 Dec 2019 18:51:27 +0000 (10:51 -0800)]
Merge tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1
Pull tomoyo fixes from Tetsuo Handa:
"Two bug fixes:
- Suppress RCU warning at list_for_each_entry_rcu()
- Don't use fancy names on sockets"
* tag 'tomoyo-fixes-for-5.5' of git://git.osdn.net/gitroot/tomoyo/tomoyo-test1:
tomoyo: Suppress RCU warning at list_for_each_entry_rcu().
tomoyo: Don't use nifty names on sockets.
Taehee Yoo [Sat, 28 Dec 2019 16:28:09 +0000 (16:28 +0000)]
hsr: fix slab-out-of-bounds Read in hsr_debugfs_rename()
hsr slave interfaces don't have debugfs directory.
So, hsr_debugfs_rename() shouldn't be called when hsr slave interface name
is changed.
Test commands:
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
ip link set dummy0 name ap
Splat looks like:
[21071.899367][T22666] ap: renamed from dummy0
[21071.914005][T22666] ==================================================================
[21071.919008][T22666] BUG: KASAN: slab-out-of-bounds in hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.923640][T22666] Read of size 8 at addr
ffff88805febcd98 by task ip/22666
[21071.926941][T22666]
[21071.927750][T22666] CPU: 0 PID: 22666 Comm: ip Not tainted 5.5.0-rc2+ #240
[21071.929919][T22666] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[21071.935094][T22666] Call Trace:
[21071.935867][T22666] dump_stack+0x96/0xdb
[21071.936687][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.937774][T22666] print_address_description.constprop.5+0x1be/0x360
[21071.939019][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.940081][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.940949][T22666] __kasan_report+0x12a/0x16f
[21071.941758][T22666] ? hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.942674][T22666] kasan_report+0xe/0x20
[21071.943325][T22666] hsr_debugfs_rename+0xaa/0xb0 [hsr]
[21071.944187][T22666] hsr_netdev_notify+0x1fe/0x9b0 [hsr]
[21071.945052][T22666] ? __module_text_address+0x13/0x140
[21071.945897][T22666] notifier_call_chain+0x90/0x160
[21071.946743][T22666] dev_change_name+0x419/0x840
[21071.947496][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10
[21071.948600][T22666] ? netdev_adjacent_rename_links+0x280/0x280
[21071.949577][T22666] ? __read_once_size_nocheck.constprop.6+0x10/0x10
[21071.950672][T22666] ? lock_downgrade+0x6e0/0x6e0
[21071.951345][T22666] ? do_setlink+0x811/0x2ef0
[21071.951991][T22666] do_setlink+0x811/0x2ef0
[21071.952613][T22666] ? is_bpf_text_address+0x81/0xe0
[ ... ]
Reported-by: syzbot+9328206518f08318a5fd@syzkaller.appspotmail.com
Fixes:
4c2d5e33dcd3 ("hsr: rename debugfs file when interface name is changed")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Sat, 28 Dec 2019 15:36:58 +0000 (16:36 +0100)]
net/sched: add delete_empty() to filters and use it in cls_flower
Revert "net/sched: cls_u32: fix refcount leak in the error path of
u32_change()", and fix the u32 refcount leak in a more generic way that
preserves the semantic of rule dumping.
On tc filters that don't support lockless insertion/removal, there is no
need to guard against concurrent insertion when a removal is in progress.
Therefore, for most of them we can avoid a full walk() when deleting, and
just decrease the refcount, like it was done on older Linux kernels.
This fixes situations where walk() was wrongly detecting a non-empty
filter, like it happened with cls_u32 in the error path of change(), thus
leading to failures in the following tdc selftests:
6aa7: (filter, u32) Add/Replace u32 with source match and invalid indev
6658: (filter, u32) Add/Replace u32 with custom hash table and invalid handle
74c2: (filter, u32) Add/Replace u32 filter with invalid hash table id
On cls_flower, and on (future) lockless filters, this check is necessary:
move all the check_empty() logic in a callback so that each filter
can have its own implementation. For cls_flower, it's sufficient to check
if no IDRs have been allocated.
This reverts commit
275c44aa194b7159d1191817b20e076f55f0e620.
Changes since v1:
- document the need for delete_empty() when TCF_PROTO_OPS_DOIT_UNLOCKED
is used, thanks to Vlad Buslov
- implement delete_empty() without doing fl_walk(), thanks to Vlad Buslov
- squash revert and new fix in a single patch, to be nice with bisect
tests that run tdc on u32 filter, thanks to Dave Miller
Fixes:
275c44aa194b ("net/sched: cls_u32: fix refcount leak in the error path of u32_change()")
Fixes:
6676d5e416ee ("net: sched: set dedicated tcf_walker flag when tp is empty")
Suggested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Suggested-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cambda Zhu [Fri, 27 Dec 2019 08:52:37 +0000 (16:52 +0800)]
tcp: Fix highest_sack and highest_sack_seq
>From commit
50895b9de1d3 ("tcp: highest_sack fix"), the logic about
setting tp->highest_sack to the head of the send queue was removed.
Of course the logic is error prone, but it is logical. Before we
remove the pointer to the highest sack skb and use the seq instead,
we need to set tp->highest_sack to NULL when there is no skb after
the last sack, and then replace NULL with the real skb when new skb
inserted into the rtx queue, because the NULL means the highest sack
seq is tp->snd_nxt. If tp->highest_sack is NULL and new data sent,
the next ACK with sack option will increase tp->reordering unexpectedly.
This patch sets tp->highest_sack to the tail of the rtx queue if
it's NULL and new data is sent. The patch keeps the rule that the
highest_sack can only be maintained by sack processing, except for
this only case.
Fixes:
50895b9de1d3 ("tcp: highest_sack fix")
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladis Dronov [Fri, 27 Dec 2019 02:26:27 +0000 (03:26 +0100)]
ptp: fix the race between the release of ptp_clock and cdev
In a case when a ptp chardev (like /dev/ptp0) is open but an underlying
device is removed, closing this file leads to a race. This reproduces
easily in a kvm virtual machine:
ts# cat openptp0.c
int main() { ... fp = fopen("/dev/ptp0", "r"); ... sleep(10); }
ts# uname -r
5.5.0-rc3-
46cf053e
ts# cat /proc/cmdline
... slub_debug=FZP
ts# modprobe ptp_kvm
ts# ./openptp0 &
[1] 670
opened /dev/ptp0, sleeping 10s...
ts# rmmod ptp_kvm
ts# ls /dev/ptp*
ls: cannot access '/dev/ptp*': No such file or directory
ts# ...woken up
[ 48.010809] general protection fault: 0000 [#1] SMP
[ 48.012502] CPU: 6 PID: 658 Comm: openptp0 Not tainted 5.5.0-rc3-
46cf053e #25
[ 48.014624] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ...
[ 48.016270] RIP: 0010:module_put.part.0+0x7/0x80
[ 48.017939] RSP: 0018:
ffffb3850073be00 EFLAGS:
00010202
[ 48.018339] RAX:
000000006b6b6b6b RBX:
6b6b6b6b6b6b6b6b RCX:
ffff89a476c00ad0
[ 48.018936] RDX:
fffff65a08d3ea08 RSI:
0000000000000247 RDI:
6b6b6b6b6b6b6b6b
[ 48.019470] ... ^^^ a slub poison
[ 48.023854] Call Trace:
[ 48.024050] __fput+0x21f/0x240
[ 48.024288] task_work_run+0x79/0x90
[ 48.024555] do_exit+0x2af/0xab0
[ 48.024799] ? vfs_write+0x16a/0x190
[ 48.025082] do_group_exit+0x35/0x90
[ 48.025387] __x64_sys_exit_group+0xf/0x10
[ 48.025737] do_syscall_64+0x3d/0x130
[ 48.026056] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 48.026479] RIP: 0033:0x7f53b12082f6
[ 48.026792] ...
[ 48.030945] Modules linked in: ptp i6300esb watchdog [last unloaded: ptp_kvm]
[ 48.045001] Fixing recursive fault but reboot is needed!
This happens in:
static void __fput(struct file *file)
{ ...
if (file->f_op->release)
file->f_op->release(inode, file); <<< cdev is kfree'd here
if (unlikely(S_ISCHR(inode->i_mode) && inode->i_cdev != NULL &&
!(mode & FMODE_PATH))) {
cdev_put(inode->i_cdev); <<< cdev fields are accessed here
Namely:
__fput()
posix_clock_release()
kref_put(&clk->kref, delete_clock) <<< the last reference
delete_clock()
delete_ptp_clock()
kfree(ptp) <<< cdev is embedded in ptp
cdev_put
module_put(p->owner) <<< *p is kfree'd, bang!
Here cdev is embedded in posix_clock which is embedded in ptp_clock.
The race happens because ptp_clock's lifetime is controlled by two
refcounts: kref and cdev.kobj in posix_clock. This is wrong.
Make ptp_clock's sysfs device a parent of cdev with cdev_device_add()
created especially for such cases. This way the parent device with its
ptp_clock is not released until all references to the cdev are released.
This adds a requirement that an initialized but not exposed struct
device should be provided to posix_clock_register() by a caller instead
of a simple dev_t.
This approach was adopted from the commit
72139dfa2464 ("watchdog: Fix
the race between the release of watchdog_core_data and cdev"). See
details of the implementation in the commit
233ed09d7fda ("chardev: add
helper function to register char devs with a struct device").
Link: https://lore.kernel.org/linux-fsdevel/20191125125342.6189-1-vdronov@redhat.com/T/#u
Analyzed-by: Stephen Johnston <sjohnsto@redhat.com>
Analyzed-by: Vern Lovejoy <vlovejoy@redhat.com>
Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 27 Dec 2019 01:11:13 +0000 (03:11 +0200)]
net: dsa: sja1105: Reconcile the meaning of TPID and TPID2 for E/T and P/Q/R/S
For first-generation switches (SJA1105E and SJA1105T):
- TPID means C-Tag (typically 0x8100)
- TPID2 means S-Tag (typically 0x88A8)
While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R,
SJA1105S) it is the other way around:
- TPID means S-Tag (typically 0x88A8)
- TPID2 means C-Tag (typically 0x8100)
In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with
TPID2.
So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke
it for E/T.
We strive for a common code path for all switches in the family, so just
lie in the static config packing functions that TPID and TPID2 are at
swapped bit offsets than they actually are, for P/Q/R/S. This will make
both switches understand TPID to be ETH_P_8021Q and TPID2 to be
ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S
because E/T is actually the one with public documentation available
(UM10944.pdf).
Fixes:
f9a1a7646c0d ("net: dsa: sja1105: Reverse TPID and TPID2")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 27 Dec 2019 01:08:07 +0000 (03:08 +0200)]
Documentation: net: dsa: sja1105: Remove text about taprio base-time limitation
Since commit
86db36a347b4 ("net: dsa: sja1105: Implement state machine
for TAS with PTP clock source"), this paragraph is no longer true. So
remove it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 27 Dec 2019 01:03:54 +0000 (03:03 +0200)]
net: dsa: sja1105: Remove restriction of zero base-time for taprio offload
The check originates from the initial implementation which was not based
on PTP time but on a standalone clock source. In the meantime we can now
program the PTPSCHTM register at runtime with the dynamic base time
(actually with a value that is 200 ns smaller, to avoid writing DELTA=0
in the Schedule Entry Points Parameters Table). And we also have logic
for moving the actual base time in the future of the PHC's current time
base, so the check for zero serves no purpose, since even if the user
will specify zero, that's not what will end up in the static config
table where the limitation is.
Fixes:
86db36a347b4 ("net: dsa: sja1105: Implement state machine for TAS with PTP clock source")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 27 Dec 2019 01:01:50 +0000 (03:01 +0200)]
net: dsa: sja1105: Really make the PTP command read-write
When activating tc-taprio offload on the switch ports, the TAS state
machine will try to check whether it is running or not, but will find
both the STARTED and STOPPED bits as false in the
sja1105_tas_check_running function. So the function will return -EINVAL
(an abnormal situation) and the kernel will keep printing this from the
TAS FSM workqueue:
[ 37.691971] sja1105 spi0.1: An operation returned -22
The reason is that the underlying function that gets called,
sja1105_ptp_commit, does not actually do a SPI_READ, but a SPI_WRITE. So
the command buffer remains initialized with zeroes instead of retrieving
the hardware state. Fix that.
Fixes:
41603d78b362 ("net: dsa: sja1105: Make the PTP command read-write")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Fri, 27 Dec 2019 00:59:54 +0000 (02:59 +0200)]
net: dsa: sja1105: Take PTP egress timestamp by port, not mgmt slot
The PTP egress timestamp N must be captured from register PTPEGR_TS[n],
where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per
port. We are only using TSREG=0.
As opposed to the management slots, which are 4 in number
(SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which
includes PTP frames) can be sent to any non-CPU port through any
management slot. When the CPU port is not the last port (#4), there will
be a mismatch between the slot and the port number.
Luckily, the only mainline occurrence with this switch
(arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the
issue did not manifest itself thus far.
Fixes:
47ed985e97f5 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rahul Lakkireddy [Mon, 30 Dec 2019 12:44:08 +0000 (18:14 +0530)]
cxgb4/cxgb4vf: fix flow control display for auto negotiation
As per 802.3-2005, Section Two, Annex 28B, Table 28B-2 [1], when
_only_ Rx pause is enabled, both symmetric and asymmetric pause
towards local device must be enabled. Also, firmware returns the local
device's flow control pause params as part of advertised capabilities
and negotiated params as part of current link attributes. So, fix up
ethtool's flow control pause params fetch logic to read from acaps,
instead of linkattr.
[1] https://standards.ieee.org/standard/802_3-2005.html
Fixes:
c3168cabe1af ("cxgb4/cxgbvf: Handle 32-bit fw port capabilities")
Signed-off-by: Surendra Mobiya <surendra@chelsio.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 29 Dec 2019 23:29:16 +0000 (15:29 -0800)]
Linux 5.5-rc4
David S. Miller [Sun, 29 Dec 2019 20:29:14 +0000 (12:29 -0800)]
Merge branch 'mlxsw-fixes'
Ido Schimmel says:
====================
mlxsw: Couple of fixes
This patch set contains two fixes for mlxsw. Please consider both for
stable.
Patch #1 from Amit fixes a wrong check during MAC validation when
creating router interfaces (RIFs). Given a particular order of
configuration this can result in the driver refusing to create new RIFs.
Patch #2 fixes a wrong trap configuration in which VRRP packets and
routing exceptions were policed by the same policer towards the CPU. In
certain situations this can prevent VRRP packets from reaching the CPU.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Sun, 29 Dec 2019 11:40:23 +0000 (13:40 +0200)]
mlxsw: spectrum: Use dedicated policer for VRRP packets
Currently, VRRP packets and packets that hit exceptions during routing
(e.g., MTU error) are policed using the same policer towards the CPU.
This means, for example, that misconfiguration of the MTU on a routed
interface can prevent VRRP packets from reaching the CPU, which in turn
can cause the VRRP daemon to assume it is the Master router.
Fix this by using a dedicated policer for VRRP packets.
Fixes:
11566d34f895 ("mlxsw: spectrum: Add VRRP traps")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Sun, 29 Dec 2019 11:40:22 +0000 (13:40 +0200)]
mlxsw: spectrum_router: Skip loopback RIFs during MAC validation
When a router interface (RIF) is created the MAC address of the backing
netdev is verified to have the same MSBs as existing RIFs. This is
required in order to avoid changing existing RIF MAC addresses that all
share the same MSBs.
Loopback RIFs are special in this regard as they do not have a MAC
address, given they are only used to loop packets from the overlay to
the underlay.
Without this change, an error is returned when trying to create a RIF
after the creation of a GRE tunnel that is represented by a loopback
RIF. 'rif->dev->dev_addr' points to the GRE device's local IP, which
does not share the same MSBs as physical interfaces. Adding an IP
address to any physical interface results in:
Error: mlxsw_spectrum: All router interface MAC addresses must have the
same prefix.
Fix this by skipping loopback RIFs during MAC validation.
Fixes:
74bc99397438 ("mlxsw: spectrum_router: Veto unsupported RIF MAC addresses")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 29 Dec 2019 18:01:27 +0000 (10:01 -0800)]
Merge tag 'riscv/for-v5.5-rc4' of git://git./linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"One important fix for RISC-V:
- Redirect any incoming syscall with an ID less than -1 to
sys_ni_syscall, rather than allowing them to fall through into the
syscall handler.
and two minor build fixes:
- Export __asm_copy_{from,to}_user() from where they are defined.
This fixes a build error triggered by some randconfigs.
- Export flush_icache_all(). I'd resisted this before, since
historically we didn't want modules to be able to flush the I$
directly; but apparently everyone else is doing it now"
* tag 'riscv/for-v5.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: export flush_icache_all to modules
riscv: reject invalid syscalls below -1
riscv: fix compile failure with EXPORT_SYMBOL() & !MMU
Linus Torvalds [Sun, 29 Dec 2019 17:50:57 +0000 (09:50 -0800)]
Merge tag 'locks-v5.5-1' of git://git./linux/kernel/git/jlayton/linux
Pull /proc/locks formatting fix from Jeff Layton:
"This is a trivial fix for a _very_ long standing bug in /proc/locks
formatting. Ordinarily, I'd wait for the merge window for something
like this, but it is making it difficult to validate some overlayfs
fixes.
I've also gone ahead and marked this for stable"
* tag 'locks-v5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: print unsigned ino in /proc/locks
Linus Torvalds [Sun, 29 Dec 2019 17:48:47 +0000 (09:48 -0800)]
Merge tag '5.5-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"One performance fix for large directory searches, and one minor style
cleanup noticed by Clang"
* tag '5.5-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: Optimize readdir on reparse points
cifs: Adjust indentation in smb2_open_file
Amir Goldstein [Sun, 22 Dec 2019 18:45:28 +0000 (20:45 +0200)]
locks: print unsigned ino in /proc/locks
An ino is unsigned, so display it as such in /proc/locks.
Cc: stable@vger.kernel.org
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Olof Johansson [Tue, 17 Dec 2019 04:07:04 +0000 (20:07 -0800)]
riscv: export flush_icache_all to modules
This is needed by LKDTM (crash dump test module), it calls
flush_icache_range(), which on RISC-V turns into flush_icache_all(). On
other architectures, the actual implementation is exported, so follow
that precedence and export it here too.
Fixes build of CONFIG_LKDTM that fails with:
ERROR: "flush_icache_all" [drivers/misc/lkdtm/lkdtm.ko] undefined!
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
David Abdurachmanov [Wed, 18 Dec 2019 08:47:56 +0000 (10:47 +0200)]
riscv: reject invalid syscalls below -1
Running "stress-ng --enosys 4 -t 20 -v" showed a large number of kernel oops
with "Unable to handle kernel paging request at virtual address" message. This
happens when enosys stressor starts testing random non-valid syscalls.
I forgot to redirect any syscall below -1 to sys_ni_syscall.
With the patch kernel oops messages are gone while running stress-ng enosys
stressor.
Signed-off-by: David Abdurachmanov <david.abdurachmanov@sifive.com>
Fixes:
5340627e3fe0 ("riscv: add support for SECCOMP and SECCOMP_FILTER")
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Luc Van Oostenryck [Sun, 22 Dec 2019 09:26:04 +0000 (10:26 +0100)]
riscv: fix compile failure with EXPORT_SYMBOL() & !MMU
When support for !MMU was added, the declaration of
__asm_copy_to_user() & __asm_copy_from_user() were #ifdefed
out hence their EXPORT_SYMBOL() give an error message like:
.../riscv_ksyms.c:13:15: error: '__asm_copy_to_user' undeclared here
.../riscv_ksyms.c:14:15: error: '__asm_copy_from_user' undeclared here
Since these symbols are not defined with !MMU it's wrong to export them.
Same for __clear_user() (even though this one is also declared in
include/asm-generic/uaccess.h and thus doesn't give an error message).
Fix this by doing the EXPORT_SYMBOL() directly where these symbols
are defined: inside lib/uaccess.S itself.
Fixes:
6bd33e1ece52 ("riscv: fix compile failure with EXPORT_SYMBOL() & !MMU")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Linus Torvalds [Sat, 28 Dec 2019 01:28:41 +0000 (17:28 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four fixes and one spelling update, all in drivers: two in lpfc and
the rest in mp3sas, cxgbi and target"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: target/iblock: Fix protection error with blocks greater than 512B
scsi: libcxgbi: fix NULL pointer dereference in cxgbi_device_destroy()
scsi: lpfc: fix spelling mistakes of asynchronous
scsi: lpfc: fix build failure with DEBUGFS disabled
scsi: mpt3sas: Fix double free in attach error handling
Martin Blumenstingl [Thu, 26 Dec 2019 19:01:01 +0000 (20:01 +0100)]
net: stmmac: dwmac-meson8b: Fix the RGMII TX delay on Meson8b/8m2 SoCs
GXBB and newer SoCs use the fixed FCLK_DIV2 (1GHz) clock as input for
the m250_sel clock. Meson8b and Meson8m2 use MPLL2 instead, whose rate
can be adjusted at runtime.
So far we have been running MPLL2 with ~250MHz (and the internal
m250_div with value 1), which worked enough that we could transfer data
with an TX delay of 4ns. Unfortunately there is high packet loss with
an RGMII PHY when transferring data (receiving data works fine though).
Odroid-C1's u-boot is running with a TX delay of only 2ns as well as
the internal m250_div set to 2 - no lost (TX) packets can be observed
with that setting in u-boot.
Manual testing has shown that the TX packet loss goes away when using
the following settings in Linux (the vendor kernel uses the same
settings):
- MPLL2 clock set to ~500MHz
- m250_div set to 2
- TX delay set to 2ns on the MAC side
Update the m250_div divider settings to only accept dividers greater or
equal 2 to fix the TX delay generated by the MAC.
iperf3 results before the change:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 182 MBytes 153 Mbits/sec 514 sender
[ 5] 0.00-10.00 sec 182 MBytes 152 Mbits/sec receiver
iperf3 results after the change (including an updated TX delay of 2ns):
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-10.00 sec 927 MBytes 778 Mbits/sec 0 sender
[ 5] 0.00-10.01 sec 927 MBytes 777 Mbits/sec receiver
Fixes:
4f6a71b84e1afd ("net: stmmac: dwmac-meson8b: fix internal RGMII clock configuration")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shmulik Ladkani [Wed, 25 Dec 2019 08:51:01 +0000 (10:51 +0200)]
net/sched: act_mirred: Pull mac prior redir to non mac_header_xmit device
There's no skb_pull performed when a mirred action is set at egress of a
mac device, with a target device/action that expects skb->data to point
at the network header.
As a result, either the target device is errornously given an skb with
data pointing to the mac (egress case), or the net stack receives the
skb with data pointing to the mac (ingress case).
E.g:
# tc qdisc add dev eth9 root handle 1: prio
# tc filter add dev eth9 parent 1: prio 9 protocol ip handle 9 basic \
action mirred egress redirect dev tun0
(tun0 is a tun device. result: tun0 errornously gets the eth header
instead of the iph)
Revise the push/pull logic of tcf_mirred_act() to not rely on the
skb_at_tc_ingress() vs tcf_mirred_act_wants_ingress() comparison, as it
does not cover all "pull" cases.
Instead, calculate whether the required action on the target device
requires the data to point at the network header, and compare this to
whether skb->data points to network header - and make the push/pull
adjustments as necessary.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Shmulik Ladkani <sladkani@proofpoint.com>
Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 27 Dec 2019 21:21:06 +0000 (13:21 -0800)]
Merge tag 'drm-fixes-2019-12-28' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Post-xmas food coma recovery fixes. Only three fixes for i915 since I
expect most people are holidaying.
i915:
- power management rc6 fix
- framebuffer tracking fix
- display power management ratelimit fix"
* tag 'drm-fixes-2019-12-28' of git://anongit.freedesktop.org/drm/drm:
drm/i915: Hold reference to intel_frontbuffer as we track activity
drm/i915/gt: Ratelimit display power w/a
drm/i915/pmu: Ensure monotonic rc6
Linus Torvalds [Fri, 27 Dec 2019 19:30:26 +0000 (11:30 -0800)]
Merge tag 'linux-kselftest-5.5-rc4' of git://git./linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fixes from Shuah Khan:
- rseq build failures fixes related to glibc 2.30 compatibility from
Mathieu Desnoyers
- Kunit fixes and cleanups from SeongJae Park
- Fixes to filesystems/epoll, firmware, and livepatch build failures
and skip handling.
* tag 'linux-kselftest-5.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
rseq/selftests: Clarify rseq_prepare_unload() helper requirements
rseq/selftests: Fix: Namespace gettid() for compatibility with glibc 2.30
rseq/selftests: Turn off timeout setting
kunit/kunit_tool_test: Test '--build_dir' option run
kunit: Rename 'kunitconfig' to '.kunitconfig'
kunit: Place 'test.log' under the 'build_dir'
kunit: Create default config in '--build_dir'
kunit: Remove duplicated defconfig creation
docs/kunit/start: Use in-tree 'kunit_defconfig'
selftests: livepatch: Fix it to do root uid check and skip
selftests: firmware: Fix it to do root uid check and skip
selftests: filesystems/epoll: fix build error
Linus Torvalds [Fri, 27 Dec 2019 19:26:54 +0000 (11:26 -0800)]
Merge tag 'pm-5.5-rc4' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"Fix compile test of the Tegra devfreq driver (Arnd Bergmann) and
remove redundant Kconfig dependencies from multiple devfreq drivers
(Leonard Crestez)"
* tag 'pm-5.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / devfreq: tegra: Add COMMON_CLK dependency
PM / devfreq: Drop explicit selection of PM_OPP
Linus Torvalds [Fri, 27 Dec 2019 19:17:08 +0000 (11:17 -0800)]
Merge tag 'io_uring-5.5-
20191226' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
- Removal of now unused busy wqe list (Hillf)
- Add cond_resched() to io-wq work processing (Hillf)
- And then the series that I hinted at from last week, which removes
the sqe from the io_kiocb and keeps all sqe handling on the prep
side. This guarantees that an opcode can't do the wrong thing and
read the sqe more than once. This is unchanged from last week, no
issues have been observed with this in testing. Hence I really think
we should fold this into 5.5.
* tag 'io_uring-5.5-
20191226' of git://git.kernel.dk/linux-block:
io-wq: add cond_resched() to worker thread
io-wq: remove unused busy list from io_sqe
io_uring: pass in 'sqe' to the prep handlers
io_uring: standardize the prep methods
io_uring: read 'count' for IORING_OP_TIMEOUT in prep handler
io_uring: move all prep state for IORING_OP_{SEND,RECV}_MGS to prep handler
io_uring: move all prep state for IORING_OP_CONNECT to prep handler
io_uring: add and use struct io_rw for read/writes
io_uring: use u64_to_user_ptr() consistently
Linus Torvalds [Fri, 27 Dec 2019 19:13:18 +0000 (11:13 -0800)]
Merge tag 'libata-5.5-
20191226' of git://git.kernel.dk/linux-block
Pull libata fixes from Jens Axboe:
"Two things in here:
- First half of a series that fixes ahci_brcm, also marked for
stable. The other part of the series is going into 5.6 (Florian)
- sata_nv regression fix that is also marked for stable (Sascha)"
* tag 'libata-5.5-
20191226' of git://git.kernel.dk/linux-block:
ata: ahci_brcm: Add missing clock management during recovery
ata: ahci_brcm: BCM7425 AHCI requires AHCI_HFLAG_DELAY_ENGINE
ata: ahci_brcm: Fix AHCI resources management
ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys()
libata: Fix retrieving of active qcs
Linus Torvalds [Fri, 27 Dec 2019 19:09:04 +0000 (11:09 -0800)]
Merge tag 'block-5.5-
20191226' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Only thing here are the changes from Arnd from last week, which now
have the appropriate header include to ensure they actually compile if
COMPAT is enabled"
* tag 'block-5.5-
20191226' of git://git.kernel.dk/linux-block:
compat_ioctl: block: handle Persistent Reservations
compat_ioctl: block: handle add zone open, close and finish ioctl
compat_ioctl: block: handle BLKGETZONESZ/BLKGETNRZONES
compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE
pktcdvd: fix regression on 64-bit architectures
Linus Torvalds [Fri, 27 Dec 2019 19:02:48 +0000 (11:02 -0800)]
Merge tag 'gpio-v5.5-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"A set of fixes for the v5.5 series:
- Fix the build for the Xtensa driver.
- Make sure to set up the parent device for mpc8xxx.
- Clarify the look-up error message.
- Fix the usage of the line direction in the mockup device.
- Fix a type warning on the Aspeed driver.
- Remove the pointless __exit annotation on the xgs-iproc which is
causing a compilation problem.
- Fix up emultation of open drain outputs .get_direction()
- Fix the IRQ callbacks on the PCA953xx to use bitops and work
properly.
- Fix the Kconfig on the Tegra driver"
* tag 'gpio-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: tegra186: Allow building on Tegra194-only configurations
gpio: pca953x: Switch to bitops in IRQ callbacks
gpiolib: fix up emulated open drain outputs
MAINTAINERS: Append missed file to the database
gpio: xgs-iproc: remove __exit annotation for iproc_gpio_remove
gpio: aspeed: avoid return type warning
gpio: mockup: Fix usage of new GPIO_LINE_DIRECTION
gpio: Fix error message on out-of-range GPIO in lookup table
gpio: mpc8xxx: Add platform device to gpiochip->parent
gpio: xtensa: fix driver build
Dave Airlie [Fri, 27 Dec 2019 03:13:06 +0000 (13:13 +1000)]
Merge tag 'drm-intel-fixes-2019-12-23' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
i915 power and frontbuffer tracking fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87r20vdlrs.fsf@intel.com
Eric Dumazet [Mon, 23 Dec 2019 19:13:24 +0000 (11:13 -0800)]
net_sched: sch_fq: properly set sk->sk_pacing_status
If fq_classify() recycles a struct fq_flow because
a socket structure has been reallocated, we do not
set sk->sk_pacing_status immediately, but later if the
flow becomes detached.
This means that any flow requiring pacing (BBR, or SO_MAX_PACING_RATE)
might fallback to TCP internal pacing, which requires a per-socket
high resolution timer, and therefore more cpu cycles.
Fixes:
218af599fa63 ("tcp: internal implementation for pacing")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Dec 2019 23:27:15 +0000 (15:27 -0800)]
Merge branch 'bnx2x-Bug-fixes'
Manish Chopra says:
====================
bnx2x: Bug fixes
This series has changes in the area of vlan resources
management APIs to fix fw assert issue reported in max
vlan configuration testing over the PF.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 23 Dec 2019 18:23:09 +0000 (10:23 -0800)]
bnx2x: Fix accounting of vlan resources among the PFs
While testing max vlan configuration on the PF, firmware gets
assert as driver was configuring number of vlans more than what
is supported per port/engine, it was figured out that there is an
implicit vlan (hidden default vlan consuming hardware cam entry resource)
which is configured default for all the clients (PF/VFs) on client_init
ramrod by the adapter implicitly, so when allocating resources among the
PFs this implicit vlan should be considered or total vlan entries should
be reduced by one to accommodate that default/implicit vlan entry.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Manish Chopra [Mon, 23 Dec 2019 18:23:08 +0000 (10:23 -0800)]
bnx2x: Use appropriate define for vlan credit
Although it has same value as MAX_MAC_CREDIT_E2,
use MAX_VLAN_CREDIT_E2 appropriately.
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Dec 2019 23:25:04 +0000 (15:25 -0800)]
Merge git://git./pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2019-12-23
The following pull-request contains BPF updates for your *net* tree.
We've added 2 non-merge commits during the last 1 day(s) which contain
a total of 4 files changed, 34 insertions(+), 31 deletions(-).
The main changes are:
1) Fix libbpf build when building on a read-only filesystem with O=dir
option, from Namhyung Kim.
2) Fix a precision tracking bug for unknown scalars, from Daniel Borkmann.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Geert Uytterhoeven [Mon, 23 Dec 2019 10:03:21 +0000 (11:03 +0100)]
of: mdio: Add missing inline to of_mdiobus_child_is_phy() dummy
If CONFIG_OF_MDIO=n:
drivers/net/phy/mdio_bus.c:23:
include/linux/of_mdio.h:58:13: warning: ‘of_mdiobus_child_is_phy’ defined but not used [-Wunused-function]
static bool of_mdiobus_child_is_phy(struct device_node *child)
^~~~~~~~~~~~~~~~~~~~~~~
Fix this by adding the missing "inline" keyword.
Fixes:
0aa4d016c043d16a ("of: mdio: export of_mdiobus_child_is_phy")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 23 Dec 2019 08:06:10 +0000 (10:06 +0200)]
net: phy: aquantia: add suspend / resume ops for AQR105
The suspend/resume code for AQR107 works on AQR105 too.
This patch fixes issues with the partner not seeing the link down
when the interface using AQR105 is brought down.
Fixes:
bee8259dd31f ("net: phy: add driver for aquantia phy")
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Madalin Bucur [Mon, 23 Dec 2019 07:39:22 +0000 (09:39 +0200)]
dpaa_eth: fix DMA mapping leak
On the error path some fragments remain DMA mapped. Adding a fix
that unmaps all the fragments. Rework cleanup path to be simpler.
Fixes:
8151ee88bad5 ("dpaa_eth: use page backed rx buffers")
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 26 Dec 2019 21:11:40 +0000 (13:11 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix endianness issue in flowtable TCP flags dissector,
from Arnd Bergmann.
2) Extend flowtable test script with dnat rules, from Florian Westphal.
3) Reject padding in ebtables user entries and validate computed user
offset, reported by syzbot, from Florian Westphal.
4) Fix endianness in nft_tproxy, from Phil Sutter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladyslav Tarasiuk [Thu, 26 Dec 2019 08:41:56 +0000 (10:41 +0200)]
net/mlxfw: Fix out-of-memory error in mfa2 flash burning
The burning process requires to perform internal allocations of large
chunks of memory. This memory doesn't need to be contiguous and can be
safely allocated by vzalloc() instead of kzalloc(). This patch changes
such allocation to avoid possible out-of-memory failure.
Fixes:
410ed13cae39 ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 10 Dec 2019 18:53:47 +0000 (10:53 -0800)]
ata: ahci_brcm: Add missing clock management during recovery
The downstream implementation of ahci_brcm.c did contain clock
management recovery, but until recently, did that outside of the
libahci_platform helpers and this was unintentionally stripped out while
forward porting the patch upstream.
Add the missing clock management during recovery and sleep for 10
milliseconds per the design team recommendations to ensure the SATA PHY
controller and AFE have been fully quiesced.
Fixes:
eb73390ae241 ("ata: ahci_brcm: Recover from failures to identify devices")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Florian Fainelli [Tue, 10 Dec 2019 18:53:46 +0000 (10:53 -0800)]
ata: ahci_brcm: BCM7425 AHCI requires AHCI_HFLAG_DELAY_ENGINE
Set AHCI_HFLAG_DELAY_ENGINE for the BCM7425 AHCI controller thus making
it conforming to the 'strict' AHCI implementation which this controller
is based on.
This solves long link establishment with specific hard drives (e.g.:
Seagate ST1000VM002-9ZL1 SC12) that would otherwise have to complete the
error recovery handling before finally establishing a succesful SATA
link at the desired speed.
We re-order the hpriv->flags assignment to also remove the NONCQ quirk
since we can set the flag directly.
Fixes:
9586114cf1e9 ("ata: ahci_brcmstb: add support MIPS-based platforms")
Fixes:
423be77daabe ("ata: ahci_brcmstb: add quirk for broken ncq")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Florian Fainelli [Tue, 10 Dec 2019 18:53:45 +0000 (10:53 -0800)]
ata: ahci_brcm: Fix AHCI resources management
The AHCI resources management within ahci_brcm.c is a little
convoluted, largely because it historically had a dedicated clock that
was managed within this file in the downstream tree. Once brough
upstream though, the clock was left to be managed by libahci_platform.c
which is entirely appropriate.
This patch series ensures that the AHCI resources are fetched and
enabled before any register access is done, thus avoiding bus errors on
platforms which clock gate the controller by default.
As a result we need to re-arrange the suspend() and resume() functions
in order to avoid accessing registers after the clocks have been turned
off respectively before the clocks have been turned on. Finally, we can
refactor brcm_ahci_get_portmask() in order to fetch the number of ports
from hpriv->mmio which is now accessible without jumping through hoops
like we used to do.
The commit pointed in the Fixes tag is both old and new enough not to
require major headaches for backporting of this patch.
Fixes:
eba68f829794 ("ata: ahci_brcmstb: rename to support across Broadcom SoC's")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Florian Fainelli [Tue, 10 Dec 2019 18:53:44 +0000 (10:53 -0800)]
ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys()
This reverts commit
6bb86fefa086faba7b60bb452300b76a47cde1a5
("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()") we are
going to need ahci_platform_{enable,disable}_phys() in a subsequent
commit for ahci_brcm.c in order to properly control the PHY
initialization order.
Also make sure the function prototypes are declared in
include/linux/ahci_platform.h as a result.
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
David S. Miller [Thu, 26 Dec 2019 00:35:35 +0000 (16:35 -0800)]
Merge branch 'hsr-fix-several-bugs-in-hsr-module'
Taehee Yoo says:
====================
hsr: fix several bugs in hsr module
1. The first patch fixes debugfs warning when it's opened when hsr module
is being removed. debugfs file is opened, it tries to hold .owner module,
but it would print warning messages if it couldn't hold .owner module.
In order to avoid the warning message, this patch makes hsr module does
not set .owner. Unsetting .owner is safe because these are protected by
inode_lock().
2. The second patch fixes wrong error handling of hsr_dev_finalize()
a) hsr_dev_finalize() calls debugfs_create_{dir/file} to create debugfs.
it checks NULL pointer but debugfs don't return NULL so it's wrong code.
b) hsr_dev_finalize() calls register_netdevice(). so if it fails after
register_netdevice(), it should call unregister_netdevice().
But it doesn't.
c) debugfs doesn't affect any actual logic of hsr module.
So, the failure of creating of debugfs could be ignored.
3. The third patch adds hsr root debugfs directory.
When hsr interface is created, it creates debugfs directory in
/sys/kernel/debug/<interface name>.
It's a little bit faulty path because if an interface is the same with
another directory name in the same path, it will fail. If hsr root
directory is existing, the possibility of failure of creating debugfs
file will be reduced.
4. The fourth patch adds debugfs rename routine.
debugfs directory name is the same with hsr interface name.
So hsr interface name is changed, debugfs directory name should be
changed too.
5. The fifth patch fixes a race condition in node list add and del.
hsr nodes are protected by RCU and there is no write side lock.
But node insertions and deletions could be being operated concurrently.
So write side locking is needed.
6. The Sixth patch resets network header
Tap routine is enabled, below message will be printed.
[ 175.852292][ C3] protocol 88fb is buggy, dev veth0
hsr module doesn't set network header for supervision frame.
But tap routine validates network header.
If network header wasn't set, it resets and warns about it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:27:08 +0000 (11:27 +0000)]
hsr: reset network header when supervision frame is created
The supervision frame is L2 frame.
When supervision frame is created, hsr module doesn't set network header.
If tap routine is enabled, dev_queue_xmit_nit() is called and it checks
network_header. If network_header pointer wasn't set(or invalid),
it resets network_header and warns.
In order to avoid unnecessary warning message, resetting network_header
is needed.
Test commands:
ip netns add nst
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link set veth1 netns nst
ip link set veth3 netns nst
ip link set veth0 up
ip link set veth2 up
ip link add hsr0 type hsr slave1 veth0 slave2 veth2
ip a a 192.168.100.1/24 dev hsr0
ip link set hsr0 up
ip netns exec nst ip link set veth1 up
ip netns exec nst ip link set veth3 up
ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
ip netns exec nst ip link set hsr1 up
tcpdump -nei veth0
Splat looks like:
[ 175.852292][ C3] protocol 88fb is buggy, dev veth0
Fixes:
f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:26:54 +0000 (11:26 +0000)]
hsr: fix a race condition in node list insertion and deletion
hsr nodes are protected by RCU and there is no write side lock.
But node insertions and deletions could be being operated concurrently.
So write side locking is needed.
Test commands:
ip netns add nst
ip link add veth0 type veth peer name veth1
ip link add veth2 type veth peer name veth3
ip link set veth1 netns nst
ip link set veth3 netns nst
ip link set veth0 up
ip link set veth2 up
ip link add hsr0 type hsr slave1 veth0 slave2 veth2
ip a a 192.168.100.1/24 dev hsr0
ip link set hsr0 up
ip netns exec nst ip link set veth1 up
ip netns exec nst ip link set veth3 up
ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
ip netns exec nst ip link set hsr1 up
for i in {0..9}
do
for j in {0..9}
do
for k in {0..9}
do
for l in {0..9}
do
arping 192.168.100.2 -I hsr0 -s 00:01:3$i:4$j:5$k:6$l -c1 &
done
done
done
done
Splat looks like:
[ 236.066091][ T3286] list_add corruption. next->prev should be prev (
ffff8880a5940300), but was
ffff8880a5940d0.
[ 236.069617][ T3286] ------------[ cut here ]------------
[ 236.070545][ T3286] kernel BUG at lib/list_debug.c:25!
[ 236.071391][ T3286] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 236.072343][ T3286] CPU: 0 PID: 3286 Comm: arping Tainted: G W 5.5.0-rc1+ #209
[ 236.073463][ T3286] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 236.074695][ T3286] RIP: 0010:__list_add_valid+0x74/0xd0
[ 236.075499][ T3286] Code: 48 39 da 75 27 48 39 f5 74 36 48 39 dd 74 31 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 b
[ 236.078277][ T3286] RSP: 0018:
ffff8880aaa97648 EFLAGS:
00010286
[ 236.086991][ T3286] RAX:
0000000000000075 RBX:
ffff8880d4624c20 RCX:
0000000000000000
[ 236.088000][ T3286] RDX:
0000000000000075 RSI:
0000000000000008 RDI:
ffffed1015552ebf
[ 236.098897][ T3286] RBP:
ffff88809b53d200 R08:
ffffed101b3c04f9 R09:
ffffed101b3c04f9
[ 236.099960][ T3286] R10:
00000000308769a1 R11:
ffffed101b3c04f8 R12:
ffff8880d4624c28
[ 236.100974][ T3286] R13:
ffff8880d4624c20 R14:
0000000040310100 R15:
ffff8880ce17ee02
[ 236.138967][ T3286] FS:
00007f23479fa680(0000) GS:
ffff8880d9c00000(0000) knlGS:
0000000000000000
[ 236.144852][ T3286] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 236.145720][ T3286] CR2:
00007f4a14bab210 CR3:
00000000a61c6001 CR4:
00000000000606f0
[ 236.146776][ T3286] Call Trace:
[ 236.147222][ T3286] hsr_add_node+0x314/0x490 [hsr]
[ 236.153633][ T3286] hsr_forward_skb+0x2b6/0x1bc0 [hsr]
[ 236.154362][ T3286] ? rcu_read_lock_sched_held+0x90/0xc0
[ 236.155091][ T3286] ? rcu_read_lock_bh_held+0xa0/0xa0
[ 236.156607][ T3286] hsr_dev_xmit+0x70/0xd0 [hsr]
[ 236.157254][ T3286] dev_hard_start_xmit+0x160/0x740
[ 236.157941][ T3286] __dev_queue_xmit+0x1961/0x2e10
[ 236.158565][ T3286] ? netdev_core_pick_tx+0x2e0/0x2e0
[ ... ]
Reported-by: syzbot+3924327f9ad5f4d2b343@syzkaller.appspotmail.com
Fixes:
f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:26:39 +0000 (11:26 +0000)]
hsr: rename debugfs file when interface name is changed
hsr interface has own debugfs file, which name is same with interface name.
So, interface name is changed, debugfs file name should be changed too.
Fixes:
fc4ecaeebd26 ("net: hsr: add debugfs support for display node list")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:26:27 +0000 (11:26 +0000)]
hsr: add hsr root debugfs directory
In current hsr code, when hsr interface is created, it creates debugfs
directory /sys/kernel/debug/<interface name>.
If there is same directory or file name in there, it fails.
In order to reduce possibility of failure of creation of debugfs,
this patch adds root directory.
Test commands:
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
Before this patch:
/sys/kernel/debug/hsr0/node_table
After this patch:
/sys/kernel/debug/hsr/hsr0/node_table
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:26:15 +0000 (11:26 +0000)]
hsr: fix error handling routine in hsr_dev_finalize()
hsr_dev_finalize() is called to create new hsr interface.
There are some wrong error handling codes.
1. wrong checking return value of debugfs_create_{dir/file}.
These function doesn't return NULL. If error occurs in there,
it returns error pointer.
So, it should check error pointer instead of NULL.
2. It doesn't unregister interface if it fails to setup hsr interface.
If it fails to initialize hsr interface after register_netdevice(),
it should call unregister_netdevice().
3. Ignore failure of creation of debugfs
If creating of debugfs dir and file is failed, creating hsr interface
will be failed. But debugfs doesn't affect actual logic of hsr module.
So, ignoring this is more correct and this behavior is more general.
Fixes:
c5a759117210 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Taehee Yoo [Sun, 22 Dec 2019 11:25:27 +0000 (11:25 +0000)]
hsr: avoid debugfs warning message when module is remove
When hsr module is being removed, debugfs_remove() is called to remove
both debugfs directory and file.
When module is being removed, module state is changed to
MODULE_STATE_GOING then exit() is called.
At this moment, module couldn't be held so try_module_get()
will be failed.
debugfs's open() callback tries to hold the module if .owner is existing.
If it fails, warning message is printed.
CPU0 CPU1
delete_module()
try_stop_module()
hsr_exit() open() <-- WARNING
debugfs_remove()
In order to avoid the warning message, this patch makes hsr module does
not set .owner. Unsetting .owner is safe because these are protected by
inode_lock().
Test commands:
#SHELL1
ip link add dummy0 type dummy
ip link add dummy1 type dummy
while :
do
ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1
modprobe -rv hsr
done
#SHELL2
while :
do
cat /sys/kernel/debug/hsr0/node_table
done
Splat looks like:
[ 101.223783][ T1271] ------------[ cut here ]------------
[ 101.230309][ T1271] debugfs file owner did not clean up at exit: node_table
[ 101.230380][ T1271] WARNING: CPU: 3 PID: 1271 at fs/debugfs/file.c:309 full_proxy_open+0x10f/0x650
[ 101.233153][ T1271] Modules linked in: hsr(-) dummy veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_d]
[ 101.237112][ T1271] CPU: 3 PID: 1271 Comm: cat Tainted: G W 5.5.0-rc1+ #204
[ 101.238270][ T1271] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 101.240379][ T1271] RIP: 0010:full_proxy_open+0x10f/0x650
[ 101.241166][ T1271] Code: 48 c1 ea 03 80 3c 02 00 0f 85 c1 04 00 00 49 8b 3c 24 e8 04 86 7e ff 84 c0 75 2d 4c 8
[ 101.251985][ T1271] RSP: 0018:
ffff8880ca22fa38 EFLAGS:
00010286
[ 101.273355][ T1271] RAX:
dffffc0000000008 RBX:
ffff8880cc6e6200 RCX:
0000000000000000
[ 101.274466][ T1271] RDX:
0000000000000000 RSI:
0000000000000006 RDI:
ffff8880c4dd5c14
[ 101.275581][ T1271] RBP:
0000000000000000 R08:
fffffbfff2922f5d R09:
0000000000000000
[ 101.276733][ T1271] R10:
0000000000000001 R11:
0000000000000000 R12:
ffffffffc0551bc0
[ 101.277853][ T1271] R13:
ffff8880c4059a48 R14:
ffff8880be50a5e0 R15:
ffffffff941adaa0
[ 101.278956][ T1271] FS:
00007f8871cda540(0000) GS:
ffff8880da800000(0000) knlGS:
0000000000000000
[ 101.280216][ T1271] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 101.282832][ T1271] CR2:
00007f88717cfd10 CR3:
00000000b9440005 CR4:
00000000000606e0
[ 101.283974][ T1271] Call Trace:
[ 101.285328][ T1271] do_dentry_open+0x63c/0xf50
[ 101.286077][ T1271] ? open_proxy_open+0x270/0x270
[ 101.288271][ T1271] ? __x64_sys_fchdir+0x180/0x180
[ 101.288987][ T1271] ? inode_permission+0x65/0x390
[ 101.289682][ T1271] path_openat+0x701/0x2810
[ 101.290294][ T1271] ? path_lookupat+0x880/0x880
[ 101.290957][ T1271] ? check_chain_key+0x236/0x5d0
[ 101.291676][ T1271] ? __lock_acquire+0xdfe/0x3de0
[ 101.292358][ T1271] ? sched_clock+0x5/0x10
[ 101.292962][ T1271] ? sched_clock_cpu+0x18/0x170
[ 101.293644][ T1271] ? find_held_lock+0x39/0x1d0
[ 101.305616][ T1271] do_filp_open+0x17a/0x270
[ 101.306061][ T1271] ? may_open_dev+0xc0/0xc0
[ ... ]
Fixes:
fc4ecaeebd26 ("net: hsr: add debugfs support for display node list")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netanel Belgazal [Sun, 22 Dec 2019 09:47:59 +0000 (09:47 +0000)]
MAINTAINERS: Add additional maintainers to ENA Ethernet driver
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sascha Hauer [Fri, 13 Dec 2019 08:04:08 +0000 (09:04 +0100)]
libata: Fix retrieving of active qcs
ata_qc_complete_multiple() is called with a mask of the still active
tags.
mv_sata doesn't have this information directly and instead calculates
the still active tags from the started tags (ap->qc_active) and the
finished tags as (ap->qc_active ^ done_mask)
Since
28361c40368 the hw_tag and tag are no longer the same and the
equation is no longer valid. In ata_exec_internal_sg() ap->qc_active is
initialized as 1ULL << ATA_TAG_INTERNAL, but in hardware tag 0 is
started and this will be in done_mask on completion. ap->qc_active ^
done_mask becomes 0x100000000 ^ 0x1 = 0x100000001 and thus tag 0 used as
the internal tag will never be reported as completed.
This is fixed by introducing ata_qc_get_active() which returns the
active hardware tags and calling it where appropriate.
This is tested on mv_sata, but sata_fsl and sata_nv suffer from the same
problem. There is another case in sata_nv that most likely needs fixing
as well, but this looks a little different, so I wasn't confident enough
to change that.
Fixes:
28361c403683 ("libata: add extra internal command")
Cc: stable@vger.kernel.org
Tested-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Add missing export of ata_qc_get_active(), as per Pali.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rafael J. Wysocki [Wed, 25 Dec 2019 14:15:55 +0000 (15:15 +0100)]
Merge tag 'devfreq-fixes-for-5.5-rc4' of git://git./linux/kernel/git/chanwoo/linux
Pull devfreq fixes for 5.5-rc4 from Chanwoo Choi:
"1. Fix the build error of tegra*-devfreq.c when COMPILE_TEST is enabled.
2. Drop unneeded PM_OPP dependency from each driver in Kconfig."
* tag 'devfreq-fixes-for-5.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: tegra: Add COMMON_CLK dependency
PM / devfreq: Drop explicit selection of PM_OPP
David S. Miller [Wed, 25 Dec 2019 06:41:07 +0000 (22:41 -0800)]
Merge branch 's390-qeth-fixes'
Julian Wiedmann says:
====================
s390/qeth: fixes 2019-12-23
please apply the following patch series for qeth to your net tree.
This brings two fixes for errors during device initialization, deals with
several issues in the vnicc control code, and adds a missing lock.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Mon, 23 Dec 2019 14:03:26 +0000 (15:03 +0100)]
s390/qeth: fix initialization on old HW
I stumbled over an old OSA model that claims to support DIAG_ASSIST,
but then rejects the cmd to query its DIAG capabilities.
In the old code this was ok, as the returned raw error code was > 0.
Now that we translate the raw codes to errnos, the "rc < 0" causes us
to fail the initialization of the device.
The fix is trivial: don't bail out when the DIAG query fails. Such an
error is not critical, we can still use the device (with a slightly
reduced set of features).
Fixes:
742d4d40831d ("s390/qeth: convert remaining legacy cmd callbacks")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandra Winter [Mon, 23 Dec 2019 14:03:25 +0000 (15:03 +0100)]
s390/qeth: vnicc Fix init to default
During vnicc_init wanted_char should be compared to cur_char and not
to QETH_VNICC_DEFAULT. Without this patch there is no way to enforce
the default values as desired values.
Note, that it is expected, that a card comes online with default values.
This patch was tested with private card firmware.
Fixes:
caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandra Winter [Mon, 23 Dec 2019 14:03:24 +0000 (15:03 +0100)]
s390/qeth: Fix vnicc_is_in_use if rx_bcast not set
Symptom: After vnicc/rx_bcast has been manually set to 0,
bridge_* sysfs parameters can still be set or written.
Only occurs on HiperSockets, as OSA doesn't support changing rx_bcast.
Vnic characteristics and bridgeport settings are mutually exclusive.
rx_bcast defaults to 1, so manually setting it to 0 should disable
bridge_* parameters.
Instead it makes sense here to check the supported mask. If the card
does not support vnicc at all, bridge commands are always allowed.
Fixes:
caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexandra Winter [Mon, 23 Dec 2019 14:03:23 +0000 (15:03 +0100)]
s390/qeth: fix false reporting of VNIC CHAR config failure
Symptom: Error message "Configuring the VNIC characteristics failed"
in dmesg whenever an OSA interface on z15 is set online.
The VNIC characteristics get re-programmed when setting a L2 device
online. This follows the selected 'wanted' characteristics - with the
exception that the INVISIBLE characteristic unconditionally gets
switched off.
For devices that don't support INVISIBLE (ie. OSA), the resulting
IO failure raises a noisy error message
("Configuring the VNIC characteristics failed").
For IQD, INVISIBLE is off by default anyways.
So don't unnecessarily special-case the INVISIBLE characteristic, and
thereby suppress the misleading error message on OSA devices.
Fixes:
caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Mon, 23 Dec 2019 14:03:22 +0000 (15:03 +0100)]
s390/qeth: lock the card while changing its hsuid
qeth_l3_dev_hsuid_store() initially checks the card state, but doesn't
take the conf_mutex to ensure that the card stays in this state while
being reconfigured.
Rework the code to take this lock, and drop a redundant state check in a
helper function.
Fixes:
b333293058aa ("qeth: add support for af_iucv HiperSockets transport")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann [Mon, 23 Dec 2019 14:03:21 +0000 (15:03 +0100)]
s390/qeth: fix qdio teardown after early init error
qeth_l?_set_online() goes through a number of initialization steps, and
on any error uses qeth_l?_stop_card() to tear down the residual state.
The first initialization step is qeth_core_hardsetup_card(). When this
fails after having established a QDIO context on the device
(ie. somewhere after qeth_mpc_initialize()), qeth_l?_stop_card() doesn't
shut down this QDIO context again (since the card state hasn't
progressed from DOWN at this stage).
Even worse, we then call qdio_free() as final teardown step to free the
QDIO data structures - while some of them are still hooked into wider
QDIO infrastructure such as the IRQ list. This is inevitably followed by
use-after-frees and other nastyness.
Fix this by unconditionally calling qeth_qdio_clear_card() to shut down
the QDIO context, and also to halt/clear any pending activity on the
various IO channels.
Remove the naive attempt at handling the teardown in
qeth_mpc_initialize(), it clearly doesn't suffice and we're handling it
properly now in the wider teardown code.
Fixes:
4a71df50047f ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 25 Dec 2019 06:28:55 +0000 (22:28 -0800)]
Merge branch 'disable-neigh-update-for-tunnels-during-pmtu-update'
Hangbin Liu says:
====================
disable neigh update for tunnels during pmtu update
When we setup a pair of gretap, ping each other and create neighbour cache.
Then delete and recreate one side. We will never be able to ping6 to the new
created gretap.
The reason is when we ping6 remote via gretap, we will call like
gre_tap_xmit()
- ip_tunnel_xmit()
- tnl_update_pmtu()
- skb_dst_update_pmtu()
- ip6_rt_update_pmtu()
- __ip6_rt_update_pmtu()
- dst_confirm_neigh()
- ip6_confirm_neigh()
- __ipv6_confirm_neigh()
- n->confirmed = now
As the confirmed time updated, in neigh_timer_handler() the check for
NUD_DELAY confirm time will pass and the neigh state will back to
NUD_REACHABLE. So the old/wrong mac address will be used again.
If we do not update the confirmed time, the neigh state will go to
neigh->nud_state = NUD_PROBE; then go to NUD_FAILED and re-create the
neigh later, which is what IPv4 does.
We couldn't remove the ip6_confirm_neigh() directly as we still need it
for TCP flows. To fix it, we have to pass a bool parameter to
dst_ops.update_pmtu() and only disable neighbor update for tunnels.
v5: No code change, upate some commits description
v4: No code change, upate some commits description
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:16 +0000 (10:51 +0800)]
net/dst: do not confirm neighbor for vxlan and geneve pmtu update
When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
we should not call dst_confirm_neigh() as there is no two-way communication.
So disable the neigh confirm for vxlan and geneve pmtu update.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Fixes:
a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
Fixes:
52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Tested-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:15 +0000 (10:51 +0800)]
sit: do not confirm neighbor when do pmtu update
When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
we should not call dst_confirm_neigh() as there is no two-way communication.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:14 +0000 (10:51 +0800)]
vti: do not confirm neighbor when do pmtu update
When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
we should not call dst_confirm_neigh() as there is no two-way communication.
Although vti and vti6 are immune to this problem because they are IFF_NOARP
interfaces, as Guillaume pointed. There is still no sense to confirm neighbour
here.
v5: Update commit description.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:13 +0000 (10:51 +0800)]
tunnel: do not confirm neighbor when do pmtu update
When do tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
we should not call dst_confirm_neigh() as there is no two-way communication.
v5: No Change.
v4: Update commit description
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Fixes:
0dec879f636f ("net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP")
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Tested-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:12 +0000 (10:51 +0800)]
net/dst: add new function skb_dst_update_pmtu_no_confirm
Add a new function skb_dst_update_pmtu_no_confirm() for callers who need
update pmtu but should not do neighbor confirm.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:11 +0000 (10:51 +0800)]
gtp: do not confirm neighbor when do pmtu update
When do IPv6 tunnel PMTU update and calls __ip6_rt_update_pmtu() in the end,
we should not call dst_confirm_neigh() as there is no two-way communication.
Although GTP only support ipv4 right now, and __ip_rt_update_pmtu() does not
call dst_confirm_neigh(), we still set it to false to keep consistency with
IPv6 code.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:10 +0000 (10:51 +0800)]
ip6_gre: do not confirm neighbor when do pmtu update
When we do ipv6 gre pmtu update, we will also do neigh confirm currently.
This will cause the neigh cache be refreshed and set to REACHABLE before
xmit.
But if the remote mac address changed, e.g. device is deleted and recreated,
we will not able to notice this and still use the old mac address as the neigh
cache is REACHABLE.
Fix this by disable neigh confirm when do pmtu update
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Reported-by: Jianlin Shi <jishi@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Sun, 22 Dec 2019 02:51:09 +0000 (10:51 +0800)]
net: add bool confirm_neigh parameter for dst_ops.update_pmtu
The MTU update code is supposed to be invoked in response to real
networking events that update the PMTU. In IPv6 PMTU update function
__ip6_rt_update_pmtu() we called dst_confirm_neigh() to update neighbor
confirmed time.
But for tunnel code, it will call pmtu before xmit, like:
- tnl_update_pmtu()
- skb_dst_update_pmtu()
- ip6_rt_update_pmtu()
- __ip6_rt_update_pmtu()
- dst_confirm_neigh()
If the tunnel remote dst mac address changed and we still do the neigh
confirm, we will not be able to update neigh cache and ping6 remote
will failed.
So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
should not be invoking dst_confirm_neigh() as we have no evidence
of successful two-way communication at this point.
On the other hand it is also important to keep the neigh reachability fresh
for TCP flows, so we cannot remove this dst_confirm_neigh() call.
To fix the issue, we have to add a new bool parameter for dst_ops.update_pmtu
to choose whether we should do neigh update or not. I will add the parameter
in this patch and set all the callers to true to comply with the previous
way, and fix the tunnel code one by one on later patches.
v5: No change.
v4: No change.
v3: Do not remove dst_confirm_neigh, but add a new bool parameter in
dst_ops.update_pmtu to control whether we should do neighbor confirm.
Also split the big patch to small ones for each area.
v2: Remove dst_confirm_neigh in __ip6_rt_update_pmtu.
Suggested-by: David Miller <davem@davemloft.net>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 25 Dec 2019 00:12:47 +0000 (16:12 -0800)]
Merge tag 'rxrpc-fixes-
20191220' of git://git./linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fixes
Here are a couple of bugfixes plus a patch that makes one of the bugfixes
easier:
(1) Move the ping and mutex unlock on a new call from rxrpc_input_packet()
into rxrpc_new_incoming_call(), which it calls. This means the
lock-unlock section is entirely within the latter function. This
simplifies patch (2).
(2) Don't take the call->user_mutex at all in the softirq path. Mutexes
aren't allowed to be taken or released there and a patch was merged
that caused a warning to be emitted every time this happened. Looking
at the code again, it looks like that taking the mutex isn't actually
necessary, as the value of call->state will block access to the call.
(3) Fix the incoming call path to check incoming calls earlier to reject
calls to RPC services for which we don't have a security key of the
appropriate class. This avoids an assertion failure if YFS tries
making a secure call to the kafs cache manager RPC service.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 20 Dec 2019 19:24:21 +0000 (11:24 -0800)]
net: dsa: bcm_sf2: Fix IP fragment location and behavior
The IP fragment is specified through user-defined field as the first
bit of the first user-defined word. We were previously trying to extract
it from the user-defined mask which could not possibly work. The ip_frag
is also supposed to be a boolean, if we do not cast it as such, we risk
overwriting the next fields in CFP_DATA(6) which would render the rule
inoperative.
Fixes:
7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Fri, 20 Dec 2019 18:03:44 +0000 (15:03 -0300)]
sctp: fix err handling of stream initialization
The fix on
951c6db954a1 fixed the issued reported there but introduced
another. When the allocation fails within sctp_stream_init() it is
okay/necessary to free the genradix. But it is also called when adding
new streams, from sctp_send_add_streams() and
sctp_process_strreset_addstrm_in() and in those situations it cannot
just free the genradix because by then it is a fully operational
association.
The fix here then is to only free the genradix in sctp_stream_init()
and on those other call sites move on with what it already had and let
the subsequent error handling to handle it.
Tested with the reproducers from this report and the previous one,
with lksctp-tools and sctp-tests.
Reported-by: syzbot+9a1bc632e78a1a98488b@syzkaller.appspotmail.com
Fixes:
951c6db954a1 ("sctp: fix memleak on err handling of stream initialization")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antonio Messina [Thu, 19 Dec 2019 14:08:03 +0000 (15:08 +0100)]
udp: fix integer overflow while computing available space in sk_rcvbuf
When the size of the receive buffer for a socket is close to 2^31 when
computing if we have enough space in the buffer to copy a packet from
the queue to the buffer we might hit an integer overflow.
When an user set net.core.rmem_default to a value close to 2^31 UDP
packets are dropped because of this overflow. This can be visible, for
instance, with failure to resolve hostnames.
This can be fixed by casting sk_rcvbuf (which is an int) to unsigned
int, similarly to how it is done in TCP.
Signed-off-by: Antonio Messina <amessina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hillf Danton [Tue, 24 Dec 2019 16:14:29 +0000 (09:14 -0700)]
io-wq: add cond_resched() to worker thread
Reschedule the current IO worker to cut the risk that it is becoming
a cpu hog.
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Mathieu Desnoyers [Fri, 20 Dec 2019 20:33:18 +0000 (15:33 -0500)]
rseq/selftests: Clarify rseq_prepare_unload() helper requirements
The rseq.h UAPI now documents that the rseq_cs field must be cleared
before reclaiming memory that contains the targeted struct rseq_cs, but
also that the rseq_cs field must be cleared before reclaiming memory of
the code pointed to by the rseq_cs start_ip and post_commit_offset
fields.
While we can expect that use of dlclose(3) will typically unmap
both struct rseq_cs and its associated code at once, nothing would
theoretically prevent a JIT from reclaiming the code without
reclaiming the struct rseq_cs, which would erroneously allow the
kernel to consider new code which is not a rseq critical section
as a rseq critical section following a code reclaim.
Suggested-by: Florian Weimer <fw@deneb.enyo.de>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Paul Turner <pjt@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Mathieu Desnoyers [Wed, 11 Dec 2019 16:17:13 +0000 (11:17 -0500)]
rseq/selftests: Fix: Namespace gettid() for compatibility with glibc 2.30
glibc 2.30 introduces gettid() in public headers, which clashes with
the internal static definition within rseq selftests.
Rename gettid() to rseq_gettid() to eliminate this symbol name clash.
Reported-by: Tommi T. Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Tommi T. Rantala <tommi.t.rantala@nokia.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Paul Turner <pjt@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Mathieu Desnoyers [Wed, 11 Dec 2019 16:28:57 +0000 (11:28 -0500)]
rseq/selftests: Turn off timeout setting
As the rseq selftests can run for a long period of time, disable the
timeout that the general selftests have.
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.ibm.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Paul Turner <pjt@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:08 +0000 (05:14 +0000)]
kunit/kunit_tool_test: Test '--build_dir' option run
This commit adds kunit tool test for the '--build_dir' option.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:07 +0000 (05:14 +0000)]
kunit: Rename 'kunitconfig' to '.kunitconfig'
This commit renames 'kunitconfig' to '.kunitconfig' so that it can be
automatically ignored by git and do not disturb people who want to type
'kernel/' by pressing only the 'k' and then 'tab' key.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:06 +0000 (05:14 +0000)]
kunit: Place 'test.log' under the 'build_dir'
'kunit' writes the 'test.log' under the kernel source directory even
though a 'build_dir' option is given. As users who use the option might
expect the outputs to be placed under the specified directory, this
commit modifies the logic to write the log file under the 'build_dir'.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:05 +0000 (05:14 +0000)]
kunit: Create default config in '--build_dir'
If both '--build_dir' and '--defconfig' are given, the handling of
'--defconfig' ignores '--build_dir' option. This commit modifies the
behavior to respect '--build_dir' option.
Reported-by: Brendan Higgins <brendanhiggins@google.com>
Suggested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:04 +0000 (05:14 +0000)]
kunit: Remove duplicated defconfig creation
'--defconfig' option is handled by the 'main() of the 'kunit.py' but
again handled in following 'run_tests()'. This commit removes this
duplicated handling of the option in the 'run_tests()'.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
SeongJae Park [Fri, 20 Dec 2019 05:14:03 +0000 (05:14 +0000)]
docs/kunit/start: Use in-tree 'kunit_defconfig'
The kunit doc suggests users to get the default `kunitconfig` from an
external git tree. However, the file is already located under the
`arch/um/configs/` of the kernel tree. Because the local file is easier
to access and maintain, this commit updates the doc to use it.
Signed-off-by: SeongJae Park <sjpark@amazon.de>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Tested-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Shuah Khan [Mon, 16 Dec 2019 19:18:40 +0000 (12:18 -0700)]
selftests: livepatch: Fix it to do root uid check and skip
livepatch test configures the system and debug environment to run
tests. Some of these actions fail without root access and test
dumps several permission denied messages before it exits.
Fix test-state.sh to call setup_config instead of set_dynamic_debug
as suggested by Petr Mladek <pmladek@suse.com>
Fix it to check root uid and exit with skip code instead.
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Shuah Khan [Fri, 13 Dec 2019 01:56:06 +0000 (18:56 -0700)]
selftests: firmware: Fix it to do root uid check and skip
firmware attempts to load test modules that require root access
and fail. Fix it to check for root uid and exit with skip code
instead.
Before this fix:
selftests: firmware: fw_run_tests.sh
modprobe: ERROR: could not insert 'test_firmware': Operation not permitted
You must have the following enabled in your kernel:
CONFIG_TEST_FIRMWARE=y
CONFIG_FW_LOADER=y
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
not ok 1 selftests: firmware: fw_run_tests.sh # SKIP
With this fix:
selftests: firmware: fw_run_tests.sh
skip all tests: must be run as root
not ok 1 selftests: firmware: fw_run_tests.sh # SKIP
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Reviwed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Shuah Khan [Wed, 11 Dec 2019 00:12:33 +0000 (17:12 -0700)]
selftests: filesystems/epoll: fix build error
epoll build fails to find pthread lib. Fix Makefile to use LDLIBS
instead of LDFLAGS. LDLIBS is the right flag to use here with -l
option when invoking ld.
gcc -I../../../../../usr/include/ -lpthread epoll_wakeup_test.c -o .../tools/testing/selftests/filesystems/epoll/epoll_wakeup_test
/usr/bin/ld: /tmp/ccaZvJUl.o: in function `kill_timeout':
epoll_wakeup_test.c:(.text+0x4dd): undefined reference to `pthread_kill'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x4f2): undefined reference to `pthread_kill'
/usr/bin/ld: /tmp/ccaZvJUl.o: in function `epoll9':
epoll_wakeup_test.c:(.text+0x6382): undefined reference to `pthread_create'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x64d2): undefined reference to `pthread_create'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x6626): undefined reference to `pthread_join'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x684c): undefined reference to `pthread_tryjoin_np'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x6864): undefined reference to `pthread_kill'
/usr/bin/ld: epoll_wakeup_test.c:(.text+0x6878): undefined reference to `pthread_join'
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Hillf Danton [Sun, 22 Dec 2019 14:46:54 +0000 (22:46 +0800)]
io-wq: remove unused busy list from io_sqe
Commit
e61df66c69b1 ("io-wq: ensure free/busy list browsing see all
items") added a list for io workers in addition to the free and busy
lists, not only making worker walk cleaner, but leaving the busy list
unused. Let's remove it.
Signed-off-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Paulo Alcantara (SUSE) [Wed, 18 Dec 2019 21:11:37 +0000 (18:11 -0300)]
cifs: Optimize readdir on reparse points
When listing a directory with thounsands of files and most of them are
reparse points, we simply marked all those dentries for revalidation
and then sending additional (compounded) create/getinfo/close requests
for each of them.
Instead, upon receiving a response from an SMB2_QUERY_DIRECTORY
(FileIdFullDirectoryInformation) command, the directory entries that
have a file attribute of FILE_ATTRIBUTE_REPARSE_POINT will contain an
EaSize field with a reparse tag in it, so we parse it and mark the
dentry for revalidation only if it is a DFS or a symlink.
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Nathan Chancellor [Wed, 18 Dec 2019 03:04:51 +0000 (20:04 -0700)]
cifs: Adjust indentation in smb2_open_file
Clang warns:
../fs/cifs/smb2file.c:70:3: warning: misleading indentation; statement
is not part of the previous 'if' [-Wmisleading-indentation]
if (oparms->tcon->use_resilient) {
^
../fs/cifs/smb2file.c:66:2: note: previous statement is here
if (rc)
^
1 warning generated.
This warning occurs because there is a space after the tab on this line.
Remove it so that the indentation is consistent with the Linux kernel
coding style and clang no longer warns.
Fixes:
592fafe644bf ("Add resilienthandles mount parm")
Link: https://github.com/ClangBuiltLinux/linux/issues/826
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Namhyung Kim [Mon, 23 Dec 2019 06:13:26 +0000 (15:13 +0900)]
libbpf: Fix build on read-only filesystems
I got the following error when I tried to build perf on a read-only
filesystem with O=dir option.
$ cd /some/where/ro/linux/tools/perf
$ make O=$HOME/build/perf
...
CC /home/namhyung/build/perf/lib.o
/bin/sh: bpf_helper_defs.h: Read-only file system
make[3]: *** [Makefile:184: bpf_helper_defs.h] Error 1
make[2]: *** [Makefile.perf:778: /home/namhyung/build/perf/libbpf.a] Error 2
make[2]: *** Waiting for unfinished jobs....
LD /home/namhyung/build/perf/libperf-in.o
AR /home/namhyung/build/perf/libperf.a
PERF_VERSION = 5.4.0
make[1]: *** [Makefile.perf:225: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
It was becaused bpf_helper_defs.h was generated in current directory.
Move it to OUTPUT directory.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191223061326.843366-1-namhyung@kernel.org
Chris Wilson [Wed, 18 Dec 2019 10:40:43 +0000 (10:40 +0000)]
drm/i915: Hold reference to intel_frontbuffer as we track activity
Since obj->frontbuffer is no longer protected by the struct_mutex, as we
are processing the execbuf, it may be removed. Mark the
intel_frontbuffer as rcu protected, and so acquire a reference to
the struct as we track activity upon it.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/827
Fixes:
8e7cb1799b4f ("drm/i915: Extract intel_frontbuffer active tracking")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218104043.3539458-1-chris@chris-wilson.co.uk
(cherry picked from commit
da42104f589d979bbe402703fd836cec60befae1)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Chris Wilson [Wed, 18 Dec 2019 09:35:04 +0000 (09:35 +0000)]
drm/i915/gt: Ratelimit display power w/a
For very light workloads that frequently park, acquiring the display
power well (required to prevent the dmc from trashing the system) takes
longer than the execution. A good example is the igt_coherency selftest,
which is slowed down by an order of magnitude in the worst case with
powerwell cycling. To prevent frequent cycling, while keeping our fast
soft-rc6, use a timer to delay release of the display powerwell.
Fixes:
311770173fac ("drm/i915/gt: Schedule request retirement when timeline idles")
References: https://gitlab.freedesktop.org/drm/intel/issues/848
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218093504.3477048-1-chris@chris-wilson.co.uk
(cherry picked from commit
81ff52b705775433a955b2746d37b87bdc89a3d0)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tvrtko Ursulin [Tue, 17 Dec 2019 14:20:57 +0000 (14:20 +0000)]
drm/i915/pmu: Ensure monotonic rc6
Avoid rc6 counter going backward in close to 0% RC6 scenarios like:
15.
005477996 114,246,613 ns i915/rc6-residency/
16.
005876662 667,657 ns i915/rc6-residency/
17.
006131417 7,286 ns i915/rc6-residency/
18.
006615031 18,446,744,073,708,914,688 ns i915/rc6-residency/
19.
007158361 18,446,744,073,709,447,168 ns i915/rc6-residency/
20.
007806498 0 ns i915/rc6-residency/
21.
008227495 1,440,403 ns i915/rc6-residency/
There are two aspects to this fix.
First is not assuming rc6 value zero means GT is asleep since that can
also mean GPU is fully busy and we do not want to enter the estimation
path in that case.
Second is ensuring monotonicity on the estimation path itself. I suspect
what is happening is with extremely rapid park/unpark cycles we get no
updates on the real rc6 and therefore have to careful not to
unconditionally trust use last known real rc6 when creating a new
estimation.
v2:
* Simplify logic by not tracking the estimate but last reported value.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes:
16ffe73c186b ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217142057.1000-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit
df6a42053513846475ae1fbd224dfbdbcd0c7010)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Arnd Bergmann [Thu, 12 Dec 2019 01:56:31 +0000 (10:56 +0900)]
PM / devfreq: tegra: Add COMMON_CLK dependency
Compile-testing this driver fails if CONFIG_COMMON_CLK is not set:
drivers/devfreq/tegra30-devfreq.o: In function `tegra_devfreq_target':
tegra30-devfreq.c:(.text+0x164): undefined reference to `clk_set_min_rate'
Fixes:
35f8dbc72721 ("PM / devfreq: tegra: Enable COMPILE_TEST for the driver")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>