Linus Torvalds [Fri, 29 Jan 2021 21:54:40 +0000 (13:54 -0800)]
Merge tag 'for-5.11-rc5-tag' of git://git./linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A few more fixes for a late rc:
- fix lockdep complaint on 32bit arches and also remove an unsafe
memory use due to device vs filesystem lifetime
- two fixes for free space tree:
* race during log replay and cache rebuild, now more likely to
happen due to changes in this dev cycle
* possible free space tree corruption with online conversion
during initial tree population"
* tag 'for-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix log replay failure due to race with space cache rebuild
btrfs: fix lockdep warning due to seqcount_mutex on 32bit arch
btrfs: fix possible free space tree corruption with online conversion
Linus Torvalds [Fri, 29 Jan 2021 21:50:06 +0000 (13:50 -0800)]
Merge tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"All over the place fixes for this release:
- blk-cgroup iteration teardown resched fix (Baolin)
- NVMe pull request from Christoph:
- add another Write Zeroes quirk (Chaitanya Kulkarni)
- handle a no path available corner case (Daniel Wagner)
- use the proper RCU aware list_add helper (Chao Leng)
- bcache regression fix (Coly)
- bdev->bd_size_lock IRQ fix. This will be fixed in drivers for 5.12,
but for now, we'll make it IRQ safe (Damien)
- null_blk zoned init fix (Damien)
- add_partition() error handling fix (Dinghao)
- s390 dasd kobject fix (Jan)
- nbd fix for freezing queue while adding connections (Josef)
- tag queueing regression fix (Ming)
- revert of a patch that inadvertently meant that we regressed write
performance on raid (Maxim)"
* tag 'block-5.11-2021-01-29' of git://git.kernel.dk/linux-block:
null_blk: cleanup zoned mode initialization
nvme-core: use list_add_tail_rcu instead of list_add_tail for nvme_init_ns_head
nvme-multipath: Early exit if no path is available
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a SPCC device
bcache: only check feature sets when sb->version >= BCACHE_SB_VERSION_CDEV_WITH_FEATURES
block: fix bd_size_lock use
blk-cgroup: Use cond_resched() when destroy blkgs
Revert "block: simplify set_init_blocksize" to regain lost performance
nbd: freeze the queue while we're adding connections
s390/dasd: Fix inconsistent kobject removal
block: Fix an error handling in add_partition
blk-mq: test QUEUE_FLAG_HCTX_ACTIVE for sbitmap_shared in hctx_may_queue
Linus Torvalds [Fri, 29 Jan 2021 21:47:47 +0000 (13:47 -0800)]
Merge tag 'io_uring-5.11-2021-01-29' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"We got the cancelation story sorted now, so for all intents and
purposes, this should be it for 5.11 outside of any potential little
fixes that may come in. This contains:
- task_work task state fixes (Hao, Pavel)
- Cancelation fixes (me, Pavel)
- Fix for an inflight req patch in this release (Pavel)
- Fix for a lock deadlock issue (Pavel)"
* tag 'io_uring-5.11-2021-01-29' of git://git.kernel.dk/linux-block:
io_uring: reinforce cancel on flush during exit
io_uring: fix sqo ownership false positive warning
io_uring: fix list corruption for splice file_get
io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE
io_uring: fix wqe->lock/completion_lock deadlock
io_uring: fix cancellation taking mutex while TASK_UNINTERRUPTIBLE
io_uring: fix __io_uring_files_cancel() with TASK_UNINTERRUPTIBLE
io_uring: only call io_cqring_ev_posted() if events were posted
io_uring: if we see flush on exit, cancel related tasks
Linus Torvalds [Fri, 29 Jan 2021 21:32:05 +0000 (13:32 -0800)]
Merge tag 'iommu-fixes-v5.11-rc5' of git://git./linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
- AMD IOMMU fix to make sure features are detected before they are
queried.
- Intel IOMMU address alignment check fix for an IOLTB flushing
command.
- Performance fix for Intel IOMMU to make sure the code does not do
full IOTLB flushes all the time. Those flushes are very expensive
on emulated IOMMUs.
* tag 'iommu-fixes-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Do not use flush-queue when caching-mode is on
iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid()
iommu/amd: Use IVHD EFR for early initialization of IOMMU features
Linus Torvalds [Fri, 29 Jan 2021 21:30:09 +0000 (13:30 -0800)]
Merge tag 'pm-5.11-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a deadlock in the 'kexec jump' code and address a possible
hibernation image creation issue.
Specifics:
- Fix a deadlock caused by attempting to acquire the same mutex twice
in a row in the "kexec jump" code (Baoquan He)
- Modify the hibernation image saving code to flush the unwritten
data to the swap storage later so as to avoid failing to write the
image signature which is possible in some cases (Laurent Badel)"
* tag 'pm-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: hibernate: flush swap writer after marking
kernel: kexec: remove the lock operation of system_transition_mutex
Linus Torvalds [Fri, 29 Jan 2021 21:23:21 +0000 (13:23 -0800)]
Merge tag 'acpi-5.11-rc6' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the handling of notifications in the ACPI thermal driver and
address a device enumeration issue leading to the presence of multiple
'MODALIAS=' entries in one uevent file in sysfs in some cases.
Specifics:
- Modify the ACPI thermal driver to avoid evaluating _TMP directly in
its Notify () handler callback and running too many thermal checks
for one thermal zone at the same time so as to address a work item
accumulation issue observed on some systems that fail to shut down
as a result of it (Rafael Wysocki)
- Modify the ACPI uevent file creation code to avoid putting multiple
'MODALIAS=' entries in one uevent file in sysfs which breaks
systemd-udevd (Kai-Heng Feng)"
* tag 'acpi-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: thermal: Do not call acpi_thermal_check() directly
ACPI: sysfs: Prefer "compatible" modalias
Linus Torvalds [Fri, 29 Jan 2021 21:18:23 +0000 (13:18 -0800)]
Merge tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Weekly fixes for graphics, nothing too major, nouveau has a few
regression fixes for various fallout from header changes previously,
vc4 has two fixes, two amdgpu, and a smattering of i915 fixes.
All seems on course for a quieter rc7, fingers crossed.
nouveau:
- fix svm init conditions
- fix nv50 modesetting regression
- fix cursor plane modifiers
- fix > 64x64 cursor regression
vc4:
- Fix LBM size calculation
- Fix high resolutions for hvs5
i915:
- Fix ICL MG PHY vswing
- Fix subplatform handling
- Fix selftest memleak
- Clear CACHE_MODE prior to clearing residuals
- Always flush the active worker before returning from the wait
- Always try to reserve GGTT address 0x0
amdgpu:
- Fix a fan control regression on some boards
- Fix clang warning"
* tag 'drm-fixes-2021-01-29' of git://anongit.freedesktop.org/drm/drm:
drm/nouveau/kms/gk104-gp1xx: Fix > 64x64 cursors
drm/nouveau/kms/nv50-: Report max cursor size to userspace
drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes
drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices
drm/nouveau/dispnv50: Restore pushing of all data.
amdgpu: fix clang build warning
Revert "drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)"
drm/i915/gt: Always try to reserve GGTT address 0x0
drm/i915: Always flush the active worker before returning from the wait
drm/i915/selftest: Fix potential memory leak
drm/i915: Check for all subplatform bits
drm/i915: Fix ICL MG PHY vswing handling
drm/i915/gt: Clear CACHE_MODE prior to clearing residuals
drm/vc4: Correct POS1_SCL for hvs5
drm/vc4: Correct lbm size and calculation
drm/nouveau/nvif: fix method count when pushing an array
Linus Torvalds [Fri, 29 Jan 2021 20:28:20 +0000 (12:28 -0800)]
tty: avoid using vfs_iocb_iter_write() for redirected console writes
It turns out that the vfs_iocb_iter_{read,write}() functions are
entirely broken, and don't actually use the passed-in file pointer for
IO - only for the preparatory work (permission checking and for the
write_iter function lookup).
That worked fine for overlayfs, which always builds the new iocb with
the same file pointer that it passes in, but in the general case it ends
up doing nonsensical things (and could cause an iterator call that
doesn't even match the passed-in file pointer).
This subtly broke the tty conversion to write_iter in commit
9bb48c82aced ("tty: implement write_iter"), because the console
redirection didn't actually end up redirecting anything, since the
passed-in file pointer was basically ignored, and the actual write was
done with the original non-redirected console tty after all.
The main visible effect of this is that the console messages were no
longer logged to /var/log/boot.log during graphical boot.
Fix the issue by simply not using the vfs write "helper" function at
all, and just redirecting the write entirely internally to the tty
layer. Do the target writability permission checks when actually
registering the target tty with TIOCCONS instead of at write time.
Fixes: 9bb48c82aced ("tty: implement write_iter")
Reported-and-tested-by: Hans de Goede <hdegoede@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Fri, 29 Jan 2021 15:28:48 +0000 (16:28 +0100)]
Merge branch 'acpi-sysfs'
* acpi-sysfs:
ACPI: sysfs: Prefer "compatible" modalias
Damien Le Moal [Fri, 29 Jan 2021 14:47:25 +0000 (23:47 +0900)]
null_blk: cleanup zoned mode initialization
To avoid potential compilation problems, replaced the badly written
MB_TO_SECTS() macro (missing parenthesis around the argument use) with
the inline function mb_to_sects(). And while at it, simplify the
calculation of the total number of zones of the device using the
round_up() macro.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Fri, 29 Jan 2021 03:40:26 +0000 (19:40 -0800)]
Merge tag 'ecryptfs-5.11-rc6-setxattr-fix' of git://git./linux/kernel/git/tyhicks/ecryptfs
Pull ecryptfs fix from Tyler Hicks:
"Fix a regression that resulted in two rounds of UID translations when
setting v3 namespaced file capabilities in some configurations"
* tag 'ecryptfs-5.11-rc6-setxattr-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
ecryptfs: fix uid translation for setxattr on security.capability
Dave Airlie [Fri, 29 Jan 2021 01:36:38 +0000 (11:36 +1000)]
Merge tag 'amd-drm-fixes-5.11-2021-01-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-5.11-2021-01-28:
amdgpu:
- Fix a fan control regression on some boards
- Fix clang warning
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128191558.3821-1-alexander.deucher@amd.com
Dave Airlie [Fri, 29 Jan 2021 01:33:37 +0000 (11:33 +1000)]
Merge tag 'drm-intel-fixes-2021-01-28' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.11-rc6:
- Fix ICL MG PHY vswing
- Fix subplatform handling
- Fix selftest memleak
- Clear CACHE_MODE prior to clearing residuals
- Always flush the active worker before returning from the wait
- Always try to reserve GGTT address 0x0
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87y2gdi3mp.fsf@intel.com
Dave Airlie [Fri, 29 Jan 2021 01:32:30 +0000 (11:32 +1000)]
Merge tag 'drm-misc-fixes-2021-01-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Short summary of fixes pull (less than what git shortlog provides):
* drm/vc4: Fix LBM size calculation; Fix high resolutions for hvs5
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YBEco1Vxeny8U/ca@linux-uq9g
Dave Airlie [Fri, 29 Jan 2021 01:05:41 +0000 (11:05 +1000)]
Merge branch '04.01-ampere-lite' of git://github.com/skeggsb/linux into drm-fixes
Mostly a regression fixes here, a couple of which could lead to
display hanging, and have been affecting a number of users.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Ben Skeggs <skeggsb@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CACAvsv4Y0ZiAevSvgphLAOaZjFi75ECXqUD9ShBvRxZ6S-pb9Q@mail.gmail.com
Lyude Paul [Tue, 19 Jan 2021 01:54:14 +0000 (20:54 -0500)]
drm/nouveau/kms/gk104-gp1xx: Fix > 64x64 cursors
While we do handle the additional cursor sizes introduced in NVE4, it looks
like we accidentally broke this when converting over to use Nvidia's
display headers. Since we now use NVVAL in dispnv50/head907d.c in order to
format the value for the cursor layout and NVD9 only had one byte reserved
vs. the 2 bytes reserved in later generations, we end up accidentally
stripping the second bit in the cursor layout format parameter - causing us
to set the wrong cursor size.
This fixes that by adding our own curs_set hook for 917d which uses the
NV917D headers.
Cc: Martin Peres <martin.peres@free.fr>
Cc: Jeremy Cline <jcline@redhat.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: <stable@vger.kernel.org> # v5.9+
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: ed0b86a90bf9 ("drm/nouveau/kms/nv50-: use NVIDIA's headers for core head_curs_set()")
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Lyude Paul [Tue, 19 Jan 2021 01:54:13 +0000 (20:54 -0500)]
drm/nouveau/kms/nv50-: Report max cursor size to userspace
Cc: Martin Peres <martin.peres@free.fr>
Cc: Jeremy Cline <jcline@redhat.com>
Cc: Simon Ser <contact@emersion.fr>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Tested-by: Simon Ser <contact@emersion.fr>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Lyude Paul [Tue, 19 Jan 2021 01:54:12 +0000 (20:54 -0500)]
drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes
Nvidia hardware doesn't actually support using tiling formats with the
cursor plane, only linear is allowed. In the future, we should write a
testcase for this.
Fixes: c586f30bf74c ("drm/nouveau/kms: Add format mod prop to base/ovly/nvdisp")
Cc: James Jones <jajones@nvidia.com>
Cc: Martin Peres <martin.peres@free.fr>
Cc: Jeremy Cline <jcline@redhat.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: <stable@vger.kernel.org> # v5.8+
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: James Jones <jajones@nvidia.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Karol Herbst [Mon, 18 Jan 2021 17:16:06 +0000 (18:16 +0100)]
drm/nouveau/svm: fail NOUVEAU_SVM_INIT ioctl on unsupported devices
Fixes a crash when trying to create a channel on e.g. Turing GPUs when
NOUVEAU_SVM_INIT was called before.
Fixes: eeaf06ac1a558 ("drm/nouveau/svm: initial support for shared virtual memory")
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Bastian Beranek [Thu, 21 Jan 2021 14:27:36 +0000 (15:27 +0100)]
drm/nouveau/dispnv50: Restore pushing of all data.
Commit
f844eb485eb056ad3b67e49f95cbc6c685a73db4 introduced a regression for
NV50, which lead to visual artifacts, tearing and eventual crashes.
In the changes of
f844eb485eb056ad3b67e49f95cbc6c685a73db4 only the first line
was correctly translated to the new NVIDIA header macros:
- PUSH_NVSQ(push, NV827C, 0x0110, 0,
- 0x0114, 0);
+ PUSH_MTHD(push, NV827C, SET_PROCESSING,
+ NVDEF(NV827C, SET_PROCESSING, USE_GAIN_OFS, DISABLE));
The lower part ("0x0114, 0") was probably omitted by accident.
This patch restores the push of the missing data and fixes the regression.
Signed-off-by: Bastian Beranek <bastian.beischer@rwth-aachen.de>
Fixes: f844eb485eb05 ("drm/nouveau/kms/nv50-: use NVIDIA's headers for wndw image_set()")
Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/14
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Pavel Begunkov [Thu, 28 Jan 2021 23:23:42 +0000 (23:23 +0000)]
io_uring: reinforce cancel on flush during exit
What
84965ff8a84f0 ("io_uring: if we see flush on exit, cancel related tasks")
really wants is to cancel all relevant REQ_F_INFLIGHT requests reliably.
That can be achieved by io_uring_cancel_files(), but we'll miss it
calling io_uring_cancel_task_requests(files=NULL) from io_uring_flush(),
because it will go through __io_uring_cancel_task_requests().
Just always call io_uring_cancel_files() during cancel, it's good enough
for now.
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Linus Torvalds [Thu, 28 Jan 2021 23:24:43 +0000 (15:24 -0800)]
Merge tag 'net-5.11-rc6' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Networking fixes including fixes from can, xfrm, wireless,
wireless-drivers and netfilter trees. Nothing scary, Intel
WiFi-related fixes seemed most notable to the users.
Current release - regressions:
- dsa: microchip: ksz8795: fix KSZ8794 port map again to program the
CPU port correctly
Current release - new code bugs:
- iwlwifi: pcie: reschedule in long-running memory reads
Previous releases - regressions:
- iwlwifi: dbg: don't try to overwrite read-only FW data
- iwlwifi: provide gso_type to GSO packets
- octeontx2: make sure the buffer is 128 byte aligned
- tcp: make TCP_USER_TIMEOUT accurate for zero window probes
- xfrm: fix wraparound in xfrm_policy_addr_delta()
- xfrm: fix oops in xfrm_replay_advance_bmp due to a race between
CPUs in presence of packet reorder
- tcp: fix TLP timer not set when CA_STATE changes from DISORDER to
OPEN
- wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
Previous releases - always broken:
- igc: fix link speed advertising
- stmmac: configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA
addressing
- team: protect features update by RCU to avoid deadlock
- xfrm: fix disable_xfrm sysctl when used on xfrm interfaces
themselves
- fec: fix temporary RMII clock reset on link up
- can: dev: prevent potential information leak in can_fill_info()
Misc:
- mrp: fix bad packing of MRP test packet structures
- uapi: fix big endian definition of ipv6_rpl_sr_hdr
- add David Ahern to IPv4/IPv6 maintainers"
* tag 'net-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
rxrpc: Fix memory leak in rxrpc_lookup_local
mlxsw: spectrum_span: Do not overwrite policer configuration
selftests: forwarding: Specify interface when invoking mausezahn
stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family.
ibmvnic: Ensure that CRQ entry read are correctly ordered
MAINTAINERS: add missing header for bonding
net: decnet: fix netdev refcount leaking on error path
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
can: dev: prevent potential information leak in can_fill_info()
net: fec: Fix temporary RMII clock reset on link up
net: lapb: Add locking to the lapb module
team: protect features update by RCU to avoid deadlock
MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers
net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
net/mlx5e: Revert parameters on errors when changing trust state without reset
net/mlx5e: Correctly handle changing the number of queues when the interface is down
net/mlx5e: Fix CT rule + encap slow path offload and deletion
net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
...
Takeshi Misawa [Thu, 28 Jan 2021 10:48:36 +0000 (10:48 +0000)]
rxrpc: Fix memory leak in rxrpc_lookup_local
Commit
9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
Then release ref in __rxrpc_put_peer and rxrpc_put_peer_locked.
struct rxrpc_peer *rxrpc_alloc_peer(struct rxrpc_local *local, gfp_t gfp)
- peer->local = local;
+ peer->local = rxrpc_get_local(local);
rxrpc_discard_prealloc also need ref release in discarding.
syzbot report:
BUG: memory leak
unreferenced object 0xffff8881080ddc00 (size 256):
comm "syz-executor339", pid 8462, jiffies
4294942238 (age 12.350s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 0a 00 00 00 00 c0 00 08 81 88 ff ff ................
backtrace:
[<
000000002b6e495f>] kmalloc include/linux/slab.h:552 [inline]
[<
000000002b6e495f>] kzalloc include/linux/slab.h:682 [inline]
[<
000000002b6e495f>] rxrpc_alloc_local net/rxrpc/local_object.c:79 [inline]
[<
000000002b6e495f>] rxrpc_lookup_local+0x1c1/0x760 net/rxrpc/local_object.c:244
[<
000000006b43a77b>] rxrpc_bind+0x174/0x240 net/rxrpc/af_rxrpc.c:149
[<
00000000fd447a55>] afs_open_socket+0xdb/0x200 fs/afs/rxrpc.c:64
[<
000000007fd8867c>] afs_net_init+0x2b4/0x340 fs/afs/main.c:126
[<
0000000063d80ec1>] ops_init+0x4e/0x190 net/core/net_namespace.c:152
[<
00000000073c5efa>] setup_net+0xde/0x2d0 net/core/net_namespace.c:342
[<
00000000a6744d5b>] copy_net_ns+0x19f/0x3e0 net/core/net_namespace.c:483
[<
0000000017d3aec3>] create_new_namespaces+0x199/0x4f0 kernel/nsproxy.c:110
[<
00000000186271ef>] unshare_nsproxy_namespaces+0x9b/0x120 kernel/nsproxy.c:226
[<
000000002de7bac4>] ksys_unshare+0x2fe/0x5c0 kernel/fork.c:2957
[<
00000000349b12ba>] __do_sys_unshare kernel/fork.c:3025 [inline]
[<
00000000349b12ba>] __se_sys_unshare kernel/fork.c:3023 [inline]
[<
00000000349b12ba>] __x64_sys_unshare+0x12/0x20 kernel/fork.c:3023
[<
000000006d178ef7>] do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
[<
00000000637076d4>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record")
Signed-off-by: Takeshi Misawa <jeliantsurux@gmail.com>
Reported-and-tested-by: syzbot+305326672fed51b205f7@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/161183091692.3506637.3206605651502458810.stgit@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 28 Jan 2021 21:09:03 +0000 (13:09 -0800)]
Merge branch 'mlxsw-various-fixes'
Ido Schimmel says:
====================
mlxsw: Various fixes
Patch #1 fixes wrong invocation of mausezahn in a couple of selftests.
The tests started failing after Fedora updated their libnet package from
version 1.1.6 to 1.2.1. With the fix the tests pass regardless of libnet
version.
Patch #2 fixes an issue in the mirroring to CPU code that results in
policer configuration being overwritten.
====================
Link: https://lore.kernel.org/r/20210128144820.3280295-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ido Schimmel [Thu, 28 Jan 2021 14:48:20 +0000 (16:48 +0200)]
mlxsw: spectrum_span: Do not overwrite policer configuration
The purpose of the delayed work in the SPAN module is to potentially
update the destination port and various encapsulation parameters of SPAN
agents that point to a VLAN device or a GRE tap. The destination port
can change following the insertion of a new route, for example.
SPAN agents that point to a physical port or the CPU port are static and
never change throughout the lifetime of the SPAN agent. Therefore, skip
over them in the delayed work.
This fixes an issue where the delayed work overwrites the policer
that was set on a SPAN agent pointing to the CPU. Modifying the delayed
work to inherit the original policer configuration is error-prone, as
the same will be needed for any new parameter.
Fixes: 4039504e6a0c ("mlxsw: spectrum_span: Allow setting policer on a SPAN agent")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Danielle Ratson [Thu, 28 Jan 2021 14:48:19 +0000 (16:48 +0200)]
selftests: forwarding: Specify interface when invoking mausezahn
Specify the interface through which packets should be transmitted so
that the test will pass regardless of the libnet version against which
mausezahn is linked.
Fixes: cab14d1087d9 ("selftests: Add version of router_multipath.sh using nexthop objects")
Fixes: 3d578d879517 ("selftests: forwarding: Test IPv4 weighted nexthops")
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Voon Weifeng [Tue, 26 Jan 2021 10:08:44 +0000 (18:08 +0800)]
stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
Fix an issue where dump stack is printed and Reset Adapter occurs when
PSE0 GbE or/and PSE1 GbE is/are enabled. EHL PSE0 GbE and PSE1 GbE use
32 bits DMA addressing whereas EHL PCH GbE uses 64 bits DMA addressing.
[ 25.535095] ------------[ cut here ]------------
[ 25.540276] NETDEV WATCHDOG: enp0s29f2 (intel-eth-pci): transmit queue 2 timed out
[ 25.548749] WARNING: CPU: 2 PID: 0 at net/sched/sch_generic.c:443 dev_watchdog+0x259/0x260
[ 25.558004] Modules linked in: 8021q bnep bluetooth ecryptfs snd_hda_codec_hdmi intel_gpy marvell intel_ishtp_loader intel_ishtp_hid iTCO_wdt mei_hdcp iTCO_vendor_support x86_pkg_temp_thermal kvm_intel dwmac_intel stmmac kvm igb pcs_xpcs irqbypass phylink snd_hda_intel intel_rapl_msr pcspkr dca snd_hda_codec i915 i2c_i801 i2c_smbus libphy intel_ish_ipc snd_hda_core mei_me intel_ishtp mei spi_dw_pci 8250_lpss spi_dw thermal dw_dmac_core parport_pc tpm_crb tpm_tis parport tpm_tis_core tpm intel_pmc_core sch_fq_codel uhid fuse configfs snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_sof_xtensa_dsp snd_sof snd_soc_acpi_intel_match snd_soc_acpi snd_intel_dspcfg ledtrig_audio snd_soc_core snd_compress ac97_bus snd_pcm snd_timer snd soundcore
[ 25.633795] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 5.11.0-rc4-intel-lts-MISMAIL5+ #5
[ 25.644306] Hardware name: Intel Corporation Elkhart Lake Embedded Platform/ElkhartLake LPDDR4x T4 RVP1, BIOS EHLSFWI1.R00.2434.A00.
2010231402 10/23/2020
[ 25.659674] RIP: 0010:dev_watchdog+0x259/0x260
[ 25.664650] Code: e8 3b 6b 60 ff eb 98 4c 89 ef c6 05 ec e7 bf 00 01 e8 fb e5 fa ff 89 d9 4c 89 ee 48 c7 c7 78 31 d2 9e 48 89 c2 e8 79 1b 18 00 <0f> 0b e9 77 ff ff ff 0f 1f 44 00 00 48 c7 47 08 00 00 00 00 48 c7
[ 25.685647] RSP: 0018:
ffffb7ca80160eb8 EFLAGS:
00010286
[ 25.691498] RAX:
0000000000000000 RBX:
0000000000000002 RCX:
0000000000000103
[ 25.699483] RDX:
0000000080000103 RSI:
00000000000000f6 RDI:
00000000ffffffff
[ 25.707465] RBP:
ffff985709ce0440 R08:
0000000000000000 R09:
c0000000ffffefff
[ 25.715455] R10:
ffffb7ca80160cf0 R11:
ffffb7ca80160ce8 R12:
ffff985709ce039c
[ 25.723438] R13:
ffff985709ce0000 R14:
0000000000000008 R15:
ffff9857068af940
[ 25.731425] FS:
0000000000000000(0000) GS:
ffff985864300000(0000) knlGS:
0000000000000000
[ 25.740481] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 25.746913] CR2:
00005567f8bb76b8 CR3:
00000001f8e0a000 CR4:
0000000000350ee0
[ 25.754900] Call Trace:
[ 25.757631] <IRQ>
[ 25.759891] ? qdisc_put_unlocked+0x30/0x30
[ 25.764565] ? qdisc_put_unlocked+0x30/0x30
[ 25.769245] call_timer_fn+0x2e/0x140
[ 25.773346] run_timer_softirq+0x1f3/0x430
[ 25.777932] ? __hrtimer_run_queues+0x12c/0x2c0
[ 25.783005] ? ktime_get+0x3e/0xa0
[ 25.786812] __do_softirq+0xa6/0x2ef
[ 25.790816] asm_call_irq_on_stack+0xf/0x20
[ 25.795501] </IRQ>
[ 25.797852] do_softirq_own_stack+0x5d/0x80
[ 25.802538] irq_exit_rcu+0x94/0xb0
[ 25.806475] sysvec_apic_timer_interrupt+0x42/0xc0
[ 25.811836] asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 25.817586] RIP: 0010:cpuidle_enter_state+0xd9/0x370
[ 25.823142] Code: 85 c0 0f 8f 0a 02 00 00 31 ff e8 22 d5 7e ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 47 02 00 00 31 ff e8 7b a0 84 ff fb 45 85 f6 <0f> 88 ab 00 00 00 49 63 ce 48 2b 2c 24 48 89 c8 48 6b d1 68 48 c1
[ 25.844140] RSP: 0018:
ffffb7ca800f7e80 EFLAGS:
00000206
[ 25.849996] RAX:
ffff985864300000 RBX:
0000000000000003 RCX:
000000000000001f
[ 25.857975] RDX:
00000005f2028ea8 RSI:
ffffffff9ec5907f RDI:
ffffffff9ec62a5d
[ 25.865961] RBP:
00000005f2028ea8 R08:
0000000000000000 R09:
0000000000029d00
[ 25.873947] R10:
000000137b0e0508 R11:
ffff9858643294e4 R12:
ffff9858643336d0
[ 25.881935] R13:
ffffffff9ef74b00 R14:
0000000000000003 R15:
0000000000000000
[ 25.889918] cpuidle_enter+0x29/0x40
[ 25.893922] do_idle+0x24a/0x290
[ 25.897536] cpu_startup_entry+0x19/0x20
[ 25.901930] start_secondary+0x128/0x160
[ 25.906326] secondary_startup_64_no_verify+0xb0/0xbb
[ 25.911983] ---[ end trace
b4c0c8195d0ba61f ]---
[ 25.917193] intel-eth-pci 0000:00:1d.2 enp0s29f2: Reset adapter.
Fixes: 67c08ac4140a ("net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID")
Signed-off-by: Voon Weifeng <weifeng.voon@intel.com>
Co-developed-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Link: https://lore.kernel.org/r/20210126100844.30326-1-mohammad.athari.ismail@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Giacinto Cifelli [Tue, 26 Jan 2021 04:42:45 +0000 (05:42 +0100)]
net: usb: cdc_ether: added support for Thales Cinterion PLSx3 modem family.
lsusb -v for this device:
Bus 003 Device 007: ID 1e2d:0069
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 239 Miscellaneous Device
bDeviceSubClass 2 ?
bDeviceProtocol 1 Interface Association
bMaxPacketSize0 64
idVendor 0x1e2d
idProduct 0x0069
bcdDevice 0.00
iManufacturer 4 Cinterion Wireless Modules
iProduct 3 PLSx3
iSerial 5
fa3c1419
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 352
bNumInterfaces 10
bConfigurationValue 1
iConfiguration 2 Cinterion Configuration
bmAttributes 0xe0
Self Powered
Remote Wakeup
MaxPower 500mA
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 0
bInterfaceCount 2
bFunctionClass 2 Communications
bFunctionSubClass 2 Abstract (modem)
bFunctionProtocol 1 AT-commands (v.25ter)
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x02
line coding and serial state
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 1
CDC Union:
bMasterInterface 0
bSlaveInterface 1
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 1
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x82 EP 2 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 2
bInterfaceCount 2
bFunctionClass 2 Communications
bFunctionSubClass 2 Abstract (modem)
bFunctionProtocol 1 AT-commands (v.25ter)
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 2
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x02
line coding and serial state
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 3
CDC Union:
bMasterInterface 2
bSlaveInterface 3
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x83 EP 3 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 3
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x84 EP 4 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x02 EP 2 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 4
bInterfaceCount 2
bFunctionClass 2 Communications
bFunctionSubClass 2 Abstract (modem)
bFunctionProtocol 1 AT-commands (v.25ter)
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 4
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x02
line coding and serial state
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 5
CDC Union:
bMasterInterface 4
bSlaveInterface 5
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x85 EP 5 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 5
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x86 EP 6 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x03 EP 3 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 6
bInterfaceCount 2
bFunctionClass 2 Communications
bFunctionSubClass 2 Abstract (modem)
bFunctionProtocol 1 AT-commands (v.25ter)
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 6
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 2 Communications
bInterfaceSubClass 2 Abstract (modem)
bInterfaceProtocol 1 AT-commands (v.25ter)
iInterface 0
CDC Header:
bcdCDC 1.10
CDC ACM:
bmCapabilities 0x02
line coding and serial state
CDC Call Management:
bmCapabilities 0x03
call management
use DataInterface
bDataInterface 7
CDC Union:
bMasterInterface 6
bSlaveInterface 7
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x87 EP 7 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 7
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x88 EP 8 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x04 EP 4 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Interface Association:
bLength 8
bDescriptorType 11
bFirstInterface 8
bInterfaceCount 2
bFunctionClass 2 Communications
bFunctionSubClass 0
bFunctionProtocol 0
iFunction 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 8
bAlternateSetting 0
bNumEndpoints 1
bInterfaceClass 2 Communications
bInterfaceSubClass 6 Ethernet Networking
bInterfaceProtocol 0
iInterface 0
CDC Header:
bcdCDC 1.10
CDC Ethernet:
iMacAddress 1
00A0C6C14190
bmEthernetStatistics 0x00000000
wMaxSegmentSize 16384
wNumberMCFilters 0x0001
bNumberPowerFilters 0
CDC Union:
bMasterInterface 8
bSlaveInterface 9
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x89 EP 9 IN
bmAttributes 3
Transfer Type Interrupt
Synch Type None
Usage Type Data
wMaxPacketSize 0x0040 1x 64 bytes
bInterval 5
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 9
bAlternateSetting 0
bNumEndpoints 0
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 9
bAlternateSetting 1
bNumEndpoints 2
bInterfaceClass 10 CDC Data
bInterfaceSubClass 0 Unused
bInterfaceProtocol 0
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x8a EP 10 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x05 EP 5 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Device Qualifier (for other device speed):
bLength 10
bDescriptorType 6
bcdUSB 2.00
bDeviceClass 239 Miscellaneous Device
bDeviceSubClass 2 ?
bDeviceProtocol 1 Interface Association
bMaxPacketSize0 64
bNumConfigurations 1
Device Status: 0x0000
(Bus Powered)
Signed-off-by: Giacinto Cifelli <gciofono@gmail.com>
Link: https://lore.kernel.org/r/20210126044245.8455-1-gciofono@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 28 Jan 2021 19:18:43 +0000 (11:18 -0800)]
Merge tag 'locking-urgent-2021-01-28' of git://git./linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"A set of PI futex fixes:
- Address a longstanding issue where the user space part of the PI
futex is not writeable. The kernel returns with inconsistent state
which can in the worst case result in a UAF of a tasks kernel
stack.
The solution is to establish consistent kernel state which makes
future operations on the futex fail because user space and kernel
space state are inconsistent. Not a problem as PI futexes
fundamentaly require a functional RW mapping and if user space
pulls the rug under it, then it can keep the pieces it asked for.
- Address an issue where the return value is incorrect in case that
the futex was acquired after a timeout/signal made the waiter drop
out of the rtmutex wait.
In one of the corner cases the kernel returned an error code
despite having successfully acquired the futex"
* tag 'locking-urgent-2021-01-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
futex: Handle faults correctly for PI futexes
futex: Simplify fixup_pi_state_owner()
futex: Use pi_state_update_owner() in put_pi_state()
rtmutex: Remove unused argument from rt_mutex_proxy_unlock()
futex: Provide and use pi_state_update_owner()
futex: Replace pointless printk in fixup_owner()
futex: Ensure the correct return value from futex_lock_pi()
Jens Axboe [Thu, 28 Jan 2021 18:57:33 +0000 (11:57 -0700)]
Merge tag 'nvme-5.11-2021-01-28' of git://git.infradead.org/nvme into block-5.11
Pull NVMe fixes from Christoph:
"nvme fixes for 5.11:
- add another Write Zeroes quirk (Chaitanya Kulkarni)
- handle a no path available corner case (Daniel Wagner)
- use the proper RCU aware list_add helper (Chao Leng)"
* tag 'nvme-5.11-2021-01-28' of git://git.infradead.org/nvme:
nvme-core: use list_add_tail_rcu instead of list_add_tail for nvme_init_ns_head
nvme-multipath: Early exit if no path is available
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a SPCC device
Pavel Begunkov [Thu, 28 Jan 2021 18:39:25 +0000 (18:39 +0000)]
io_uring: fix sqo ownership false positive warning
WARNING: CPU: 0 PID: 21359 at fs/io_uring.c:9042
io_uring_cancel_task_requests+0xe55/0x10c0 fs/io_uring.c:9042
Call Trace:
io_uring_flush+0x47b/0x6e0 fs/io_uring.c:9227
filp_close+0xb4/0x170 fs/open.c:1295
close_files fs/file.c:403 [inline]
put_files_struct fs/file.c:418 [inline]
put_files_struct+0x1cc/0x350 fs/file.c:415
exit_files+0x7e/0xa0 fs/file.c:435
do_exit+0xc22/0x2ae0 kernel/exit.c:820
do_group_exit+0x125/0x310 kernel/exit.c:922
get_signal+0x427/0x20f0 kernel/signal.c:2773
arch_do_signal_or_restart+0x2a8/0x1eb0 arch/x86/kernel/signal.c:811
handle_signal_work kernel/entry/common.c:147 [inline]
exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
exit_to_user_mode_prepare+0x148/0x250 kernel/entry/common.c:201
__syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
syscall_exit_to_user_mode+0x19/0x50 kernel/entry/common.c:302
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Now io_uring_cancel_task_requests() can be called not through file
notes but directly, remove a WARN_ONCE() there that give us false
positives. That check is not very important and we catch it in other
places.
Fixes: 84965ff8a84f0 ("io_uring: if we see flush on exit, cancel related tasks")
Cc: stable@vger.kernel.org # 5.9+
Reported-by: syzbot+3e3d9bd0c6ce9efbc3ef@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pavel Begunkov [Thu, 28 Jan 2021 18:39:24 +0000 (18:39 +0000)]
io_uring: fix list corruption for splice file_get
kernel BUG at lib/list_debug.c:29!
Call Trace:
__list_add include/linux/list.h:67 [inline]
list_add include/linux/list.h:86 [inline]
io_file_get+0x8cc/0xdb0 fs/io_uring.c:6466
__io_splice_prep+0x1bc/0x530 fs/io_uring.c:3866
io_splice_prep fs/io_uring.c:3920 [inline]
io_req_prep+0x3546/0x4e80 fs/io_uring.c:6081
io_queue_sqe+0x609/0x10d0 fs/io_uring.c:6628
io_submit_sqe fs/io_uring.c:6705 [inline]
io_submit_sqes+0x1495/0x2720 fs/io_uring.c:6953
__do_sys_io_uring_enter+0x107d/0x1f30 fs/io_uring.c:9353
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
io_file_get() may be called from splice, and so REQ_F_INFLIGHT may
already be set.
Fixes: 02a13674fa0e8 ("io_uring: account io_uring internal files as REQ_F_INFLIGHT")
Cc: stable@vger.kernel.org # 5.9+
Reported-by: syzbot+6879187cf57845801267@syzkaller.appspotmail.com
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Arnd Bergmann [Mon, 25 Jan 2021 12:23:20 +0000 (13:23 +0100)]
amdgpu: fix clang build warning
clang warns about the -mhard-float command line arguments
on architectures that do not support this:
clang: error: argument unused during compilation: '-mhard-float' [-Werror,-Wunused-command-line-argument]
Move this into the gcc-specific arguments.
Fixes: e77165bf7b02 ("drm/amd/display: Add DCN3 blocks to Makefile")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 28 Jan 2021 18:28:59 +0000 (13:28 -0500)]
Revert "drm/amdgpu/swsmu: drop set_fan_speed_percent (v2)"
On some boards the rpm interface apparently does not work at all
leading to the fan not spinning or spinning at strange speeds.
Revert this for now to fix 5.10, 5.11. The follow on patch
fixes this properly for 5.12.
This reverts commit
8d6e65adc25e23fabbc5293b6cd320195c708dca.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1408
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Chao Leng [Thu, 28 Jan 2021 03:33:51 +0000 (11:33 +0800)]
nvme-core: use list_add_tail_rcu instead of list_add_tail for nvme_init_ns_head
The "list" of nvme_ns_head is used as rcu list, now in nvme_init_ns_head
list_add_tail is used to add ns->siblings to the rcu list. It is not safe.
Should use list_add_tail_rcu instead of list_add_tail.
Signed-off-by: Chao Leng <lengchao@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Daniel Wagner [Wed, 27 Jan 2021 10:30:33 +0000 (11:30 +0100)]
nvme-multipath: Early exit if no path is available
nvme_round_robin_path() should test if the return ns pointer is valid.
nvme_next_ns() will return a NULL pointer if there is no path left.
Fixes: 75c10e732724 ("nvme-multipath: round-robin I/O policy")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Chaitanya Kulkarni [Tue, 26 Jan 2021 05:19:16 +0000 (21:19 -0800)]
nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a SPCC device
This adds a quirk for SPCC 256GB NVMe 1.3 drive which fixes timeouts and
I/O errors due to the fact that the controller does not properly
handle the Write Zeroes command:
[ 2745.659527] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G E 5.10.6-BET #1
[ 2745.659528] Hardware name: System manufacturer System Product Name/PRIME X570-P, BIOS 3001 12/04/2020
[ 2776.138874] nvme nvme1: I/O 414 QID 3 timeout, aborting
[ 2776.138886] nvme nvme1: I/O 415 QID 3 timeout, aborting
[ 2776.138891] nvme nvme1: I/O 416 QID 3 timeout, aborting
[ 2776.138895] nvme nvme1: I/O 417 QID 3 timeout, aborting
[ 2776.138912] nvme nvme1: Abort status: 0x0
[ 2776.138921] nvme nvme1: I/O 428 QID 3 timeout, aborting
[ 2776.138922] nvme nvme1: Abort status: 0x0
[ 2776.138925] nvme nvme1: Abort status: 0x0
[ 2776.138974] nvme nvme1: Abort status: 0x0
[ 2776.138977] nvme nvme1: Abort status: 0x0
[ 2806.346792] nvme nvme1: I/O 414 QID 3 timeout, reset controller
[ 2806.363566] nvme nvme1: 15/0/0 default/read/poll queues
[ 2836.554298] nvme nvme1: I/O 415 QID 3 timeout, disable controller
[ 2836.672064] blk_update_request: I/O error, dev nvme1n1, sector 16350 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672072] blk_update_request: I/O error, dev nvme1n1, sector 16093 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672074] blk_update_request: I/O error, dev nvme1n1, sector 15836 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672076] blk_update_request: I/O error, dev nvme1n1, sector 15579 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672078] blk_update_request: I/O error, dev nvme1n1, sector 15322 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672080] blk_update_request: I/O error, dev nvme1n1, sector 15065 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672082] blk_update_request: I/O error, dev nvme1n1, sector 14808 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672083] blk_update_request: I/O error, dev nvme1n1, sector 14551 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672085] blk_update_request: I/O error, dev nvme1n1, sector 14294 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672087] blk_update_request: I/O error, dev nvme1n1, sector 14037 op 0x9:(WRITE_ZEROES) flags 0x0 phys_seg 0 prio class 0
[ 2836.672121] nvme nvme1: failed to mark controller live state
[ 2836.672123] nvme nvme1: Removing after probe failure status: -19
[ 2836.689016] Aborting journal on device dm-0-8.
[ 2836.689024] Buffer I/O error on dev dm-0, logical block
25198592, lost sync page write
[ 2836.689027] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
Reported-by: Bradley Chapman <chapman6235@comcast.net>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Tested-by: Bradley Chapman <chapman6235@comcast.net>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Linus Torvalds [Thu, 28 Jan 2021 18:08:08 +0000 (10:08 -0800)]
Merge tag 'for-linus-5.11-rc6-tag' of git://git./linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A fix for a regression introduced in 5.11 resulting in Xen dom0
having problems to correctly initialize Xenstore.
- A fix for avoiding WARN splats when booting as Xen dom0 with
CONFIG_AMD_MEM_ENCRYPT enabled due to a missing trap handler for the
#VC exception (even if the handler should never be called).
- A fix for the Xen bklfront driver adapting to the correct but
unexpected behavior of new qemu.
* tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled
xen: Fix XenStore initialisation for XS_LOCAL
xen-blkfront: allow discard-* nodes to be optional
Linus Torvalds [Thu, 28 Jan 2021 18:00:26 +0000 (10:00 -0800)]
Merge tag 'asm-generic-fixes-v5.11' of git://git./linux/kernel/git/arnd/asm-generic
Pull ia64 fixes from Arnd Bergmann:
"asm-generic/ia64 fixes, and mark as orphaned
Commit
2b49ddcef297 ("ia64: convert to legacy_timer_tick") from my
timer series I merged through the asm-generic tree caused a regression
on all ia64 machines, as bisected by Adrian Glaubitz.
Tony Luck is no longer really working on ia64, so instead of merging
the fix through his tree, we ended up deciding that I'd merge the fix
myself along a patch to mark the architecture as Orphaned and a
compile time warning fix I made while working on the regression"
[ HPE no longer accepts orders for new Itanium hardware, and Intel
stopped accepting orders a year ago. While intel is still officially
shipping chips until July 29, 2021, it's unlikely that any such orders
actually exist.
It's dead, Jim.
- Linus ]
* tag 'asm-generic-fixes-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
ia64: Mark architecture as orphaned
ia64: fix xchg() warning
ia64: fix timer cleanup regression
Linus Torvalds [Thu, 28 Jan 2021 17:57:33 +0000 (09:57 -0800)]
Merge tag 'arm-soc-fixes-v5.11-2' of git://git./linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"These are the current arm-soc bug fixes for linux-5.11. I already
merged a larger set that just came in during the past three days but
has not had much exposure in linux-next, but this is the subset I
merged last week.
Most of these are for the NXP i.MX platform (descriptions from their
pull request):
- Fix pcf2127 reset for imx7d-flex-concentrator board.
- Fix i.MX6 suspend with Thumb-2 kernel.
- Fix ethernet-phy address issue on imx6qdl-sr-som board.
- Fix GPIO3 `gpio-ranges` on i.MX8MP.
- Select SOC_BUS for IMX_SCU driver to fix build issue.
- Fix backlight pwm on imx6qdl-kontron-samx6i which is lost from
#pwm-cells conversion.
- Fix duplicated bus node name for i.MX8MN SoC.
- Fix reset register offset on LS1028A SoC.
- Rename MMC node aliases for imx6q-tbs2910 to keep the MMC device
index consistent with previous kernel version.
- Selecting ARM_GIC_V3 on non-CP15 processors to fix one build
failure with i.MX8M SoC driver.
- Fix typos with status property on imx6qdl-kontron-samx6i board.
- Fix duplicated regulator-name on imx6qdl-gw52xx board.
Aside from i.MX, the bugfixes are all over the place:
- Coccinelle found a refcount imbalance on integrator
- defconfig fix for TI K3
- A boot regression fix for ST ux500
- A code preemption fix for the optee driver
- USB DMA regression on Broadcom Stingray
- A bogus boot time warning fix for at91 code"
* tag 'arm-soc-fixes-v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: Include bcm2835 subsequents into search
arm64: dts: broadcom: Fix USB DMA address translation for Stingray
drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
tee: optee: replace might_sleep with cond_resched
firmware: imx: select SOC_BUS to fix firmware build
arm64: dts: imx8mp: Correct the gpio ranges of gpio3
ARM: dts: imx6qdl-sr-som: fix some cubox-i platforms
ARM: imx: build suspend-imx6.S with arm instruction set
ARM: dts: imx7d-flex-concentrator: fix pcf2127 reset
ARM: dts: ux500: Reserve memory carveouts
arm64: defconfig: Drop unused K3 SoC specific options
bus: arm-integrator-lm: Add of_node_put() before return statement
ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
ARM: dts: imx6qdl-kontron-samx6i: fix i2c_lcd/cam default status
ARM: imx: fix imx8m dependencies
ARM: dts: tbs2910: rename MMC node aliases
arm64: dts: ls1028a: fix the offset of the reset register
arm64: dts: imx8mn: Fix duplicate node name
ARM: dts: imx6qdl-kontron-samx6i: fix pwms for lcd-backlight
Linus Torvalds [Thu, 28 Jan 2021 17:27:26 +0000 (09:27 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"Several recent regressions and some bug fixes:
- Typo corrupting the max_recv_sge for cxgb4
- Regression from re-using kernel enums as a HW AbI in vmw_pvrdma
- Sleeping inside a spinlock in hns
- Revert the attempt to fix devlink deadlocks as the fix is more buggy
- Typo in sysfs_emit_at conversions
- Revert the removal of VLAN support in rxe"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
Revert "RDMA/rxe: Remove VLAN code leftovers from RXE"
RDMA/usnic: Fix misuse of sysfs_emit_at
Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"
RDMA/hns: Use mutex instead of spinlock for ida allocation
RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
RDMA/cxgb4: Fix the reported max_recv_sge value
Linus Torvalds [Thu, 28 Jan 2021 17:23:01 +0000 (09:23 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- NULL pointer dereference regression fix for Wacom driver (Jason
Gerecke)
- functional regression fix for pam handling on some Elan and Synaptics
touchpads (Kai-Heng Feng)
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: Correct NULL dereference on AES pen proximity
HID: multitouch: Apply MT_QUIRK_CONFIDENCE quirk for multi-input devices
Linus Torvalds [Thu, 28 Jan 2021 17:18:05 +0000 (09:18 -0800)]
Merge tag 'media/v5.11-2' of git://git./linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab:
- a V4L2 core regression at videobuf2 when checking for single-plane
dmabuf
- a change at uAPI header v4l2-subdev.h, fixing a breakage as BIT()
macro is not available in userspace
- fix some regressions at RC core due to the usage of microseconds
everywhere on it
- a fix for a race condition at RC core
- a rename on a newly-introduced kAPI symbol (v4l2_get_link_rate),
currently used only by a single driver
- Regression fixes for rcar-vin, cedrus, ite-cir, hantro, css, venus,
and cec drivers.
* tag 'media/v5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
media: hantro: Fix reset_raw_fmt initialization
media: cec: add stm32 driver
media: cedrus: Fix H264 decoding
media: v4l2-subdev.h: BIT() is not available in userspace
media: Revert "media: videobuf2: Fix length check for single plane dmabuf queueing"
media: rc: ite-cir: fix min_timeout calculation
media: venus: core: Fix platform driver shutdown
media: rc: fix timeout handling after switch to microsecond durations
media: v4l: common: Fix naming of v4l2_get_link_rate
media: rcar-vin: fix return, use ret instead of zero
media: ccs: Get static data version minor correctly
media: ccs-pll: Fix link frequency for C-PHY
media: rc: ensure that uevent can be read directly after rc device register
Linus Torvalds [Thu, 28 Jan 2021 17:14:58 +0000 (09:14 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A handful of clk driver fixes:
- Build fix for CONFIG_PM=n in the mmp2 driver
- Kconfig warning for unmet dependencies in the i.MX driver
- Make the camera AHB clk always be enabled on qcom sc7180
- Use rate round down semantics for qcom sm8250 SD clks"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: mmp2: fix build without CONFIG_PM
clk: qcom: gcc-sm250: Use floor ops for sdcc clks
clk: imx: fix Kconfig warning for i.MX SCU clk
clk: qcom: gcc-sc7180: Mark the camera abh clock always ON
Linus Torvalds [Thu, 28 Jan 2021 17:06:52 +0000 (09:06 -0800)]
Merge tag 'sound-5.11-rc6' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Although the incoming fixes haven't settled down yet, all changes here
are small and mostly device-specific fixes, so nothing look worrisome.
- Yet another USB-audio regression fixes
- HD-audio ID fix and device-specific quirks
- SOF Intel / SoundWire fixes including topology
- ASoC Qualcomm and Mediatek fixes"
* tag 'sound-5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
ALSA: hda/via: Apply the workaround generically for Clevo machines
ASoC: Intel: sof_sdw: set proper flags for Dell TGL-H SKU 0A5E
ASoC: qcom: lpass: Fix out-of-bounds DAI ID lookup
ASoC: mediatek: mt8192-mt6359: add format constraints for RT5682
ASoC: ak4458: correct reset polarity
ASoC: SOF: SND_INTEL_DSP_CONFIG dependency
ASoC: SOF: Intel: soundwire: fix select/depend unmet dependencies
ALSA: hda: intel-dsp-config: add PCI id for TGL-H
ALSA: usb-audio: workaround for iface reset issue
ALSA: pcm: One more dependency for hw constraints
ALSA: hda/realtek: Enable headset of ASUS B1400CEPE with ALC256
ASoC: Intel: Skylake: Zero snd_ctl_elem_value
ASoC: Intel: Skylake: skl-topology: Fix OOPs ib skl_tplg_complete
ASoC: qcom: Fix number of HDMI RDMA channels on sc7180
ASoC: mediatek: mt8183-da7219: ignore TDM DAI link by default
ASoC: mediatek: mt8183-mt6358: ignore TDM DAI link by default
ASoC: topology: Properly unregister DAI on removal
ASoC: topology: Fix memory corruption in soc_tplg_denum_create_values()
ASoC: qcom: lpass-ipq806x: fix bitwidth regmap field
ASoC: AMD Renoir - refine DMI entries for some Lenovo products
...
Wang Hai [Thu, 28 Jan 2021 11:32:50 +0000 (19:32 +0800)]
Revert "mm/slub: fix a memory leak in sysfs_slab_add()"
This reverts commit
dde3c6b72a16c2db826f54b2d49bdea26c3534a2.
syzbot report a double-free bug. The following case can cause this bug.
- mm/slab_common.c: create_cache(): if the __kmem_cache_create() fails,
it does:
out_free_cache:
kmem_cache_free(kmem_cache, s);
- but __kmem_cache_create() - at least for slub() - will have done
sysfs_slab_add(s)
-> sysfs_create_group() .. fails ..
-> kobject_del(&s->kobj); .. which frees s ...
We can't remove the kmem_cache_free() in create_cache(), because other
error cases of __kmem_cache_create() do not free this.
So, revert the commit
dde3c6b72a16 ("mm/slub: fix a memory leak in
sysfs_slab_add()") to fix this.
Reported-by: syzbot+d0bd96b4696c1ef67991@syzkaller.appspotmail.com
Fixes: dde3c6b72a16 ("mm/slub: fix a memory leak in sysfs_slab_add()")
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Coly Li [Thu, 28 Jan 2021 10:48:47 +0000 (18:48 +0800)]
bcache: only check feature sets when sb->version >= BCACHE_SB_VERSION_CDEV_WITH_FEATURES
For super block version < BCACHE_SB_VERSION_CDEV_WITH_FEATURES, it
doesn't make sense to check the feature sets. This patch checks
super block version in bch_has_feature_* routines, if the version
doesn't have feature sets yet, returns 0 (false) to the caller.
Fixes: 5342fd425502 ("bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET")
Fixes: ffa470327572 ("bcache: add bucket_size_hi into struct cache_sb_disk for large bucket")
Cc: stable@vger.kernel.org # 5.9+
Reported-and-tested-by: Bockholdt Arne <a.bockholdt@precitec-optronik.de>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Damien Le Moal [Thu, 28 Jan 2021 06:36:19 +0000 (15:36 +0900)]
block: fix bd_size_lock use
Some block device drivers, e.g. the skd driver, call set_capacity() with
IRQ disabled. This results in lockdep ito complain about inconsistent
lock states ("inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage")
because set_capacity takes a block device bd_size_lock using the
functions spin_lock() and spin_unlock(). Ensure a consistent locking
state by replacing these calls with spin_lock_irqsave() and
spin_lock_irqrestore(). The same applies to bdev_set_nr_sectors().
With this fix, all lockdep complaints are resolved.
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Baolin Wang [Thu, 28 Jan 2021 05:58:15 +0000 (13:58 +0800)]
blk-cgroup: Use cond_resched() when destroy blkgs
On !PREEMPT kernel, we can get below softlockup when doing stress
testing with creating and destroying block cgroup repeatly. The
reason is it may take a long time to acquire the queue's lock in
the loop of blkcg_destroy_blkgs(), or the system can accumulate a
huge number of blkgs in pathological cases. We can add a need_resched()
check on each loop and release locks and do cond_resched() if true
to avoid this issue, since the blkcg_destroy_blkgs() is not called
from atomic contexts.
[ 4757.010308] watchdog: BUG: soft lockup - CPU#11 stuck for 94s!
[ 4757.010698] Call trace:
[ 4757.010700] blkcg_destroy_blkgs+0x68/0x150
[ 4757.010701] cgwb_release_workfn+0x104/0x158
[ 4757.010702] process_one_work+0x1bc/0x3f0
[ 4757.010704] worker_thread+0x164/0x468
[ 4757.010705] kthread+0x108/0x138
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Nadav Amit [Wed, 27 Jan 2021 17:53:17 +0000 (09:53 -0800)]
iommu/vt-d: Do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit
78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")
This behavior changed after commit
13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Lu Baolu [Tue, 19 Jan 2021 04:35:00 +0000 (12:35 +0800)]
iommu/vt-d: Correctly check addr alignment in qi_flush_dev_iotlb_pasid()
An incorrect address mask is being used in the qi_flush_dev_iotlb_pasid()
to check the address alignment. This leads to a lot of spurious kernel
warnings:
[ 485.837093] DMAR: Invalidate non-aligned address
7f76f47f9000, order 0
[ 485.837098] DMAR: Invalidate non-aligned address
7f76f47f9000, order 0
[ 492.494145] qi_flush_dev_iotlb_pasid: 5734 callbacks suppressed
[ 492.494147] DMAR: Invalidate non-aligned address
7f7728800000, order 11
[ 492.508965] DMAR: Invalidate non-aligned address
7f7728800000, order 11
Fix it by checking the alignment in right way.
Fixes: 288d08e780088 ("iommu/vt-d: Handle non-page aligned address")
Reported-and-tested-by: Guo Kaijie <Kaijie.Guo@intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Liu Yi L <yi.l.liu@intel.com>
Link: https://lore.kernel.org/r/20210119043500.1539596-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Suravee Suthikulpanit [Wed, 20 Jan 2021 13:50:02 +0000 (07:50 -0600)]
iommu/amd: Use IVHD EFR for early initialization of IOMMU features
IOMMU Extended Feature Register (EFR) is used to communicate
the supported features for each IOMMU to the IOMMU driver.
This is normally read from the PCI MMIO register offset 0x30,
and used by the iommu_feature() helper function.
However, there are certain scenarios where the information is needed
prior to PCI initialization, and the iommu_feature() function is used
prematurely w/o warning. This has caused incorrect initialization of IOMMU.
This is the case for the commit
6d39bdee238f ("iommu/amd: Enforce 4k
mapping for certain IOMMU data structures")
Since, the EFR is also available in the IVHD header, and is available to
the driver prior to PCI initialization. Therefore, default to using
the IVHD EFR instead.
Fixes: 6d39bdee238f ("iommu/amd: Enforce 4k mapping for certain IOMMU data structures")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Brijesh Singh <brijesh.singh@amd.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://lore.kernel.org/r/20210120135002.2682-1-suravee.suthikulpanit@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Jakub Kicinski [Thu, 28 Jan 2021 03:18:37 +0000 (19:18 -0800)]
Merge tag 'mlx5-fixes-2021-01-26' of git://git./linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2021-01-26
This series introduces some fixes to mlx5 driver.
* tag 'mlx5-fixes-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
net/mlx5e: Revert parameters on errors when changing trust state without reset
net/mlx5e: Correctly handle changing the number of queues when the interface is down
net/mlx5e: Fix CT rule + encap slow path offload and deletion
net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
net/mlx5: Maintain separate page trees for ECPF and PF functions
net/mlx5e: Fix IPSEC stats
net/mlx5e: Reduce tc unsupported key print level
net/mlx5e: free page before return
net/mlx5e: E-switch, Fix rate calculation for overflow
net/mlx5: Fix memory leak on flow table creation error flow
====================
Link: https://lore.kernel.org/r/20210126234345.202096-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lijun Pan [Thu, 28 Jan 2021 01:34:42 +0000 (19:34 -0600)]
ibmvnic: Ensure that CRQ entry read are correctly ordered
Ensure that received Command-Response Queue (CRQ) entries are
properly read in order by the driver. dma_rmb barrier has
been added before accessing the CRQ descriptor to ensure
the entire descriptor is read before processing.
Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20210128013442.88319-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 28 Jan 2021 01:53:45 +0000 (17:53 -0800)]
Merge git://git./pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Honor stateful expressions defined in the set from the dynset
extension. The set definition provides a stateful expression
that must be used by the dynset expression in case it is specified.
2) Missing timeout extension in the set element in the dynset
extension leads to inconsistent ruleset listing, not allowing
the user to restore timeout and expiration on ruleset reload.
3) Do not dump the stateful expression from the dynset extension
if it coming from the set definition.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: nft_dynset: dump expressions when set definition contains no expressions
netfilter: nft_dynset: add timeout extension to template
netfilter: nft_dynset: honor stateful expressions in set definition
====================
Link: https://lore.kernel.org/r/20210127132512.5472-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 28 Jan 2021 01:51:45 +0000 (17:51 -0800)]
Merge tag 'linux-can-fixes-for-5.11-
20210127' of git://git./linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2021-01-27
The patch is by Dan Carpenter and fixes a potential information leak in
can_fill_info().
* tag 'linux-can-fixes-for-5.11-
20210127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: dev: prevent potential information leak in can_fill_info()
====================
Link: https://lore.kernel.org/r/20210127094028.2778793-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 27 Jan 2021 02:18:44 +0000 (18:18 -0800)]
MAINTAINERS: add missing header for bonding
include/net/bonding.h is missing from bonding entry.
Link: https://lore.kernel.org/r/20210127021844.4071706-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 28 Jan 2021 01:44:51 +0000 (17:44 -0800)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
Anthony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-01-26
This series contains updates to the ice, i40e, and igc driver.
Henry corrects setting an unspecified protocol to IPPROTO_NONE instead of
0 for IPv6 flexbytes filters for ice.
Nick fixes the IPv6 extension header being processed incorrectly and
updates the netdev->dev_addr if it exists in hardware as it may have been
modified outside the ice driver.
Brett ensures a user cannot request more channels than available LAN MSI-X
and fixes the minimum allocation logic as it was incorrectly trying to use
more MSI-X than allocated for ice.
Stefan Assmann minimizes the delay between getting and using the VSI
pointer to prevent a possible crash for i40e.
Corinna Vinschen fixes link speed advertising for igc.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: fix link speed advertising
i40e: acquire VSI pointer only after VF is initialized
ice: Fix MSI-X vector fallback logic
ice: Don't allow more channels than LAN MSI-X available
ice: update dev_addr in ice_set_mac_address even if HW filter exists
ice: Implement flow for IPv6 next header (extension header)
ice: fix FDir IPv6 flexbyte
====================
Link: https://lore.kernel.org/r/20210126221035.658124-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vadim Fedorenko [Tue, 26 Jan 2021 00:02:14 +0000 (03:02 +0300)]
net: decnet: fix netdev refcount leaking on error path
On building the route there is an assumption that the destination
could be local. In this case loopback_dev is used to get the address.
If the address is still cannot be retrieved dn_route_output_slow
returns EADDRNOTAVAIL with loopback_dev reference taken.
Cannot find hash for the fixes tag because this code was introduced
long time ago. I don't think that this bug has ever fired but the
patch is done just to have a consistent code base.
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/1611619334-20955-1-git-send-email-vfedorenko@novek.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rasmus Villemoes [Mon, 25 Jan 2021 12:41:16 +0000 (13:41 +0100)]
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
It's not true that switchdev_port_obj_notify() only inspects the
->handled field of "struct switchdev_notifier_port_obj_info" if
call_switchdev_blocking_notifiers() returns 0 - there's a WARN_ON()
triggering for a non-zero return combined with ->handled not being
true. But the real problem here is that -EOPNOTSUPP is not being
properly handled.
The wrapper functions switchdev_handle_port_obj_add() et al change a
return value of -EOPNOTSUPP to 0, and the treatment of ->handled in
switchdev_port_obj_notify() seems to be designed to change that back
to -EOPNOTSUPP in case nobody actually acted on the notifier (i.e.,
everybody returned -EOPNOTSUPP).
Currently, as soon as some device down the stack passes the check_cb()
check, ->handled gets set to true, which means that
switchdev_port_obj_notify() cannot actually ever return -EOPNOTSUPP.
This, for example, means that the detection of hardware offload
support in the MRP code is broken: switchdev_port_obj_add() used by
br_mrp_switchdev_send_ring_test() always returns 0, so since the MRP
code thinks the generation of MRP test frames has been offloaded, no
such frames are actually put on the wire. Similarly,
br_mrp_switchdev_set_ring_role() also always returns 0, causing
mrp->ring_role_offloaded to be set to 1.
To fix this, continue to set ->handled true if any callback returns
success or any error distinct from -EOPNOTSUPP. But if all the
callbacks return -EOPNOTSUPP, make sure that ->handled stays false, so
the logic in switchdev_port_obj_notify() can propagate that
information.
Fixes: 9a9f26e8f7ea ("bridge: mrp: Connect MRP API with the switchdev API")
Fixes: f30f0601eb93 ("switchdev: Add helpers to aid traversal through lower devices")
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Link: https://lore.kernel.org/r/20210125124116.102928-1-rasmus.villemoes@prevas.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 27 Jan 2021 19:06:15 +0000 (11:06 -0800)]
Merge branch 'parisc-5.11-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Two small fixes:
- Fix linking error with 64-bit kernel when modules are disabled,
reported by kernel test robot
- Remove leftover reference to power_tasklet, by Davidlohr Bueso"
* 'parisc-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES
parisc: Remove leftover reference to the power_tasklet
Hao Xu [Wed, 27 Jan 2021 07:14:09 +0000 (15:14 +0800)]
io_uring: fix flush cqring overflow list while TASK_INTERRUPTIBLE
Abaci reported the follow warning:
[ 27.073425] do not call blocking ops when !TASK_RUNNING; state=1 set at [] prepare_to_wait_exclusive+0x3a/0xc0
[ 27.075805] WARNING: CPU: 0 PID: 951 at kernel/sched/core.c:7853 __might_sleep+0x80/0xa0
[ 27.077604] Modules linked in:
[ 27.078379] CPU: 0 PID: 951 Comm: a.out Not tainted 5.11.0-rc3+ #1
[ 27.079637] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 27.080852] RIP: 0010:__might_sleep+0x80/0xa0
[ 27.081835] Code: 65 48 8b 04 25 80 71 01 00 48 8b 90 c0 15 00 00 48 8b 70 18 48 c7 c7 08 39 95 82 c6 05 f9 5f de 08 01 48 89 d1 e8 00 c6 fa ff 0b eb bf 41 0f b6 f5 48 c7 c7 40 23 c9 82 e8 f3 48 ec 00 eb a7
[ 27.084521] RSP: 0018:
ffffc90000fe3ce8 EFLAGS:
00010286
[ 27.085350] RAX:
0000000000000000 RBX:
ffffffff82956083 RCX:
0000000000000000
[ 27.086348] RDX:
ffff8881057a0000 RSI:
ffffffff8118cc9e RDI:
ffff88813bc28570
[ 27.087598] RBP:
00000000000003a7 R08:
0000000000000001 R09:
0000000000000001
[ 27.088819] R10:
ffffc90000fe3e00 R11:
00000000fffef9f0 R12:
0000000000000000
[ 27.089819] R13:
0000000000000000 R14:
ffff88810576eb80 R15:
ffff88810576e800
[ 27.091058] FS:
00007f7b144cf740(0000) GS:
ffff88813bc00000(0000) knlGS:
0000000000000000
[ 27.092775] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 27.093796] CR2:
00000000022da7b8 CR3:
000000010b928002 CR4:
00000000003706f0
[ 27.094778] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 27.095780] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 27.097011] Call Trace:
[ 27.097685] __mutex_lock+0x5d/0xa30
[ 27.098565] ? prepare_to_wait_exclusive+0x71/0xc0
[ 27.099412] ? io_cqring_overflow_flush.part.101+0x6d/0x70
[ 27.100441] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0
[ 27.101537] ? _raw_spin_unlock_irqrestore+0x2d/0x40
[ 27.102656] ? trace_hardirqs_on+0x46/0x110
[ 27.103459] ? io_cqring_overflow_flush.part.101+0x6d/0x70
[ 27.104317] io_cqring_overflow_flush.part.101+0x6d/0x70
[ 27.105113] io_cqring_wait+0x36e/0x4d0
[ 27.105770] ? find_held_lock+0x28/0xb0
[ 27.106370] ? io_uring_remove_task_files+0xa0/0xa0
[ 27.107076] __x64_sys_io_uring_enter+0x4fb/0x640
[ 27.107801] ? rcu_read_lock_sched_held+0x59/0xa0
[ 27.108562] ? lockdep_hardirqs_on_prepare+0xe9/0x1c0
[ 27.109684] ? syscall_enter_from_user_mode+0x26/0x70
[ 27.110731] do_syscall_64+0x2d/0x40
[ 27.111296] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 27.112056] RIP: 0033:0x7f7b13dc8239
[ 27.112663] Code: 01 00 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 3d 01 f0 ff ff 73 01 c3 48 8b 0d 27 ec 2c 00 f7 d8 64 89 01 48
[ 27.115113] RSP: 002b:
00007ffd6d7f5c88 EFLAGS:
00000286 ORIG_RAX:
00000000000001aa
[ 27.116562] RAX:
ffffffffffffffda RBX:
0000000000000000 RCX:
00007f7b13dc8239
[ 27.117961] RDX:
000000000000478e RSI:
0000000000000000 RDI:
0000000000000003
[ 27.118925] RBP:
00007ffd6d7f5cb0 R08:
0000000020000040 R09:
0000000000000008
[ 27.119773] R10:
0000000000000001 R11:
0000000000000286 R12:
0000000000400480
[ 27.120614] R13:
00007ffd6d7f5d90 R14:
0000000000000000 R15:
0000000000000000
[ 27.121490] irq event stamp: 5635
[ 27.121946] hardirqs last enabled at (5643): [] console_unlock+0x5c4/0x740
[ 27.123476] hardirqs last disabled at (5652): [] console_unlock+0x4e7/0x740
[ 27.125192] softirqs last enabled at (5272): [] __do_softirq+0x3c5/0x5aa
[ 27.126430] softirqs last disabled at (5267): [] asm_call_irq_on_stack+0xf/0x20
[ 27.127634] ---[ end trace
289d7e28fa60f928 ]---
This is caused by calling io_cqring_overflow_flush() which may sleep
after calling prepare_to_wait_exclusive() which set task state to
TASK_INTERRUPTIBLE
Reported-by: Abaci <abaci@linux.alibaba.com>
Fixes: 6c503150ae33 ("io_uring: patch up IOPOLL overflow_flush sync")
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Maxim Mikityanskiy [Tue, 26 Jan 2021 19:59:07 +0000 (21:59 +0200)]
Revert "block: simplify set_init_blocksize" to regain lost performance
The cited commit introduced a serious regression with SATA write speed,
as found by bisecting. This patch reverts this commit, which restores
write speed back to the values observed before this commit.
The performance tests were done on a Helios4 NAS (2nd batch) with 4 HDDs
(WD8003FFBX) using dd (bs=1M count=2000). "Direct" is a test with a
single HDD, the rest are different RAID levels built over the first
partitions of 4 HDDs. Test results are in MB/s, R is read, W is write.
| Direct | RAID0 | RAID10 f2 | RAID10 n2 | RAID6
----------------+--------+-------+-----------+-----------+--------
9011495c9466 | R:256 | R:313 | R:276 | R:313 | R:323
(before faulty) | W:254 | W:253 | W:195 | W:204 | W:117
----------------+--------+-------+-----------+-----------+--------
5ff9f19231a0 | R:257 | R:398 | R:312 | R:344 | R:391
(faulty commit) | W:154 | W:122 | W:67.7 | W:66.6 | W:67.2
----------------+--------+-------+-----------+-----------+--------
5.10.10 | R:256 | R:401 | R:312 | R:356 | R:375
unpatched | W:149 | W:123 | W:64 | W:64.1 | W:61.5
----------------+--------+-------+-----------+-----------+--------
5.10.10 | R:255 | R:396 | R:312 | R:340 | R:393
patched | W:247 | W:274 | W:220 | W:225 | W:121
Applying this patch doesn't hurt read performance, while improves the
write speed by 1.5x - 3.5x (more impact on RAID tests). The write speed
is restored back to the state before the faulty commit, and even a bit
higher in RAID tests (which aren't HDD-bound on this device) - that is
likely related to other optimizations done between the faulty commit and
5.10.10 which also improved the read speed.
Signed-off-by: Maxim Mikityanskiy <maxtram95@gmail.com>
Fixes: 5ff9f19231a0 ("block: simplify set_init_blocksize")
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Juergen Gross [Mon, 25 Jan 2021 13:42:07 +0000 (14:42 +0100)]
x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled
When booting a kernel which has been built with CONFIG_AMD_MEM_ENCRYPT
enabled as a Xen pv guest a warning is issued for each processor:
[ 5.964347] ------------[ cut here ]------------
[ 5.968314] WARNING: CPU: 0 PID: 1 at /home/gross/linux/head/arch/x86/xen/enlighten_pv.c:660 get_trap_addr+0x59/0x90
[ 5.972321] Modules linked in:
[ 5.976313] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W 5.11.0-rc5-default #75
[ 5.980313] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A05 12/05/2013
[ 5.984313] RIP: e030:get_trap_addr+0x59/0x90
[ 5.988313] Code: 42 10 83 f0 01 85 f6 74 04 84 c0 75 1d b8 01 00 00 00 c3 48 3d 00 80 83 82 72 08 48 3d 20 81 83 82 72 0c b8 01 00 00 00 eb db <0f> 0b 31 c0 c3 48 2d 00 80 83 82 48 ba 72 1c c7 71 1c c7 71 1c 48
[ 5.992313] RSP: e02b:
ffffc90040033d38 EFLAGS:
00010202
[ 5.996313] RAX:
0000000000000001 RBX:
ffffffff82a141d0 RCX:
ffffffff8222ec38
[ 6.000312] RDX:
ffffffff8222ec38 RSI:
0000000000000005 RDI:
ffffc90040033d40
[ 6.004313] RBP:
ffff8881003984a0 R08:
0000000000000007 R09:
ffff888100398000
[ 6.008312] R10:
0000000000000007 R11:
ffffc90040246000 R12:
ffff8884082182a8
[ 6.012313] R13:
0000000000000100 R14:
000000000000001d R15:
ffff8881003982d0
[ 6.016316] FS:
0000000000000000(0000) GS:
ffff888408200000(0000) knlGS:
0000000000000000
[ 6.020313] CS: e030 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 6.024313] CR2:
ffffc900020ef000 CR3:
000000000220a000 CR4:
0000000000050660
[ 6.028314] Call Trace:
[ 6.032313] cvt_gate_to_trap.part.7+0x3f/0x90
[ 6.036313] ? asm_exc_double_fault+0x30/0x30
[ 6.040313] xen_convert_trap_info+0x87/0xd0
[ 6.044313] xen_pv_cpu_up+0x17a/0x450
[ 6.048313] bringup_cpu+0x2b/0xc0
[ 6.052313] ? cpus_read_trylock+0x50/0x50
[ 6.056313] cpuhp_invoke_callback+0x80/0x4c0
[ 6.060313] _cpu_up+0xa7/0x140
[ 6.064313] cpu_up+0x98/0xd0
[ 6.068313] bringup_nonboot_cpus+0x4f/0x60
[ 6.072313] smp_init+0x26/0x79
[ 6.076313] kernel_init_freeable+0x103/0x258
[ 6.080313] ? rest_init+0xd0/0xd0
[ 6.084313] kernel_init+0xa/0x110
[ 6.088313] ret_from_fork+0x1f/0x30
[ 6.092313] ---[ end trace
be9ecf17dceeb4f3 ]---
Reason is that there is no Xen pv trap entry for X86_TRAP_VC.
Fix that by adding a generic trap handler for unknown traps and wire all
unknown bare metal handlers to this generic handler, which will just
crash the system in case such a trap will ever happen.
Fixes: 0786138c78e793 ("x86/sev-es: Add a Runtime #VC Exception Handler")
Cc: <stable@vger.kernel.org> # v5.10
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Dan Carpenter [Thu, 21 Jan 2021 06:08:05 +0000 (09:08 +0300)]
can: dev: prevent potential information leak in can_fill_info()
The "bec" struct isn't necessarily always initialized. For example, the
mcp251xfd_get_berr_counter() function doesn't initialize anything if the
interface is down.
Fixes: 52c793f24054 ("can: netlink support for bus-error reporting and counters")
Link: https://lore.kernel.org/r/YAkaRdRJncsJO8Ve@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David Woodhouse [Tue, 26 Jan 2021 17:01:49 +0000 (17:01 +0000)]
xen: Fix XenStore initialisation for XS_LOCAL
In commit
3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
I reworked the triggering of xenbus_probe().
I tried to simplify things by taking out the workqueue based startup
triggered from wake_waiting(); the somewhat poorly named xenbus IRQ
handler.
I missed the fact that in the XS_LOCAL case (Dom0 starting its own
xenstored or xenstore-stubdom, which happens after the kernel is booted
completely), that IRQ-based trigger is still actually needed.
So... put it back, except more cleanly. By just spawning a xenbus_probe
thread which waits on xb_waitq and runs the probe the first time it
gets woken, just as the workqueue-based hack did.
This is actually a nicer approach for *all* the back ends with different
interrupt methods, and we can switch them all over to that without the
complex conditions for when to trigger it. But not in -rc6. This is
the minimal fix for the regression, although it's a step in the right
direction instead of doing a partial revert and actually putting the
workqueue back. It's also simpler than the workqueue.
Fixes: 3499ba8198ca ("xen: Fix event channel callback via INTX/GSI")
Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/4c9af052a6e0f6485d1de43f2c38b1461996db99.camel@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
Pavel Begunkov [Tue, 26 Jan 2021 23:35:10 +0000 (23:35 +0000)]
io_uring: fix wqe->lock/completion_lock deadlock
Joseph reports following deadlock:
CPU0:
...
io_kill_linked_timeout // &ctx->completion_lock
io_commit_cqring
__io_queue_deferred
__io_queue_async_work
io_wq_enqueue
io_wqe_enqueue // &wqe->lock
CPU1:
...
__io_uring_files_cancel
io_wq_cancel_cb
io_wqe_cancel_pending_work // &wqe->lock
io_cancel_task_cb // &ctx->completion_lock
Only __io_queue_deferred() calls queue_async_work() while holding
ctx->completion_lock, enqueue drained requests via io_req_task_queue()
instead.
Cc: stable@vger.kernel.org # 5.9+
Reported-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jakub Kicinski [Wed, 27 Jan 2021 02:23:39 +0000 (18:23 -0800)]
Merge branch 'net-fec-fix-temporary-rmii-clock-reset-on-link-up'
Laurent Badel says:
====================
net: fec: Fix temporary RMII clock reset on link up
v2: fixed a compilation warning
The FEC drivers performs a "hardware reset" of the MAC module when the
link is reported to be up. This causes a short glitch in the RMII clock
due to the hardware reset clearing the receive control register which
controls the MII mode. It seems that some link partners do not tolerate
this glitch, and invalidate the link, which leads to a never-ending loop
of negotiation-link up-link down events.
This was observed with the iMX28 Soc and LAN8720/LAN8742 PHYs, with two
Intel adapters I218-LM and X722-DA2 as link partners, though a number of
other link partners do not seem to mind the clock glitch. Changing the
hardware reset to a software reset (clearing bit 1 of the ECR register)
cured the issue.
Attempts to optimize fec_restart() in order to minimize the duration of
the glitch were unsuccessful. Furthermore manually producing the glitch by
setting MII mode and then back to RMII in two consecutive instructions,
resulting in a clock glitch <10us in duration, was enough to cause the
partner to invalidate the link. This strongly suggests that the root cause
of the link being dropped is indeed the change in clock frequency.
In an effort to minimize changes to driver, the patch proposes to use
soft reset only for tested SoCs (iMX28) and only if the link is up. This
preserves hardware reset in other situations, which might be required for
proper setup of the MAC.
====================
Link: https://lore.kernel.org/r/20210125100745.5090-1-laurentbadel@eaton.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Laurent Badel [Mon, 25 Jan 2021 10:07:45 +0000 (11:07 +0100)]
net: fec: Fix temporary RMII clock reset on link up
fec_restart() does a hard reset of the MAC module when the link status
changes to up. This temporarily resets the R_CNTRL register which controls
the MII mode of the ENET_OUT clock. In the case of RMII, the clock
frequency momentarily drops from 50MHz to 25MHz until the register is
reconfigured. Some link partners do not tolerate this glitch and
invalidate the link causing failure to establish a stable link when using
PHY polling mode. Since as per IEEE802.3 the criteria for link validity
are PHY-specific, what the partner should tolerate cannot be assumed, so
avoid resetting the MII clock by using software reset instead of hardware
reset when the link is up. This is generally relevant only if the SoC
provides the clock to an external PHY and the PHY is configured for RMII.
Signed-off-by: Laurent Badel <laurentbadel@eaton.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Xie He [Tue, 26 Jan 2021 04:09:39 +0000 (20:09 -0800)]
net: lapb: Add locking to the lapb module
In the lapb module, the timers may run concurrently with other code in
this module, and there is currently no locking to prevent the code from
racing on "struct lapb_cb". This patch adds locking to prevent racing.
1. Add "spinlock_t lock" to "struct lapb_cb"; Add "spin_lock_bh" and
"spin_unlock_bh" to APIs, timer functions and notifier functions.
2. Add "bool t1timer_stop, t2timer_stop" to "struct lapb_cb" to make us
able to ask running timers to abort; Modify "lapb_stop_t1timer" and
"lapb_stop_t2timer" to make them able to abort running timers;
Modify "lapb_t2timer_expiry" and "lapb_t1timer_expiry" to make them
abort after they are stopped by "lapb_stop_t1timer", "lapb_stop_t2timer",
and "lapb_start_t1timer", "lapb_start_t2timer".
3. Let lapb_unregister wait for other API functions and running timers
to stop.
4. The lapb_device_event function calls lapb_disconnect_request. In
order to avoid trying to hold the lock twice, add a new function named
"__lapb_disconnect_request" which assumes the lock is held, and make
it called by lapb_disconnect_request and lapb_device_event.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: Xie He <xie.he.0141@gmail.com>
Link: https://lore.kernel.org/r/20210126040939.69995-1-xie.he.0141@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ivan Vecera [Mon, 25 Jan 2021 07:44:16 +0000 (08:44 +0100)]
team: protect features update by RCU to avoid deadlock
Function __team_compute_features() is protected by team->lock
mutex when it is called from team_compute_features() used when
features of an underlying device is changed. This causes
a deadlock when NETDEV_FEAT_CHANGE notifier for underlying device
is fired due to change propagated from team driver (e.g. MTU
change). It's because callbacks like team_change_mtu() or
team_vlan_rx_{add,del}_vid() protect their port list traversal
by team->lock mutex.
Example (r8169 case where this driver disables TSO for certain MTU
values):
...
[ 6391.348202] __mutex_lock.isra.6+0x2d0/0x4a0
[ 6391.358602] team_device_event+0x9d/0x160 [team]
[ 6391.363756] notifier_call_chain+0x47/0x70
[ 6391.368329] netdev_update_features+0x56/0x60
[ 6391.373207] rtl8169_change_mtu+0x14/0x50 [r8169]
[ 6391.378457] dev_set_mtu_ext+0xe1/0x1d0
[ 6391.387022] dev_set_mtu+0x52/0x90
[ 6391.390820] team_change_mtu+0x64/0xf0 [team]
[ 6391.395683] dev_set_mtu_ext+0xe1/0x1d0
[ 6391.399963] do_setlink+0x231/0xf50
...
In fact team_compute_features() called from team_device_event()
does not need to be protected by team->lock mutex and rcu_read_lock()
is sufficient there for port list traversal.
Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device")
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20210125074416.4056484-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Fri, 22 Jan 2021 17:32:20 +0000 (09:32 -0800)]
MAINTAINERS: add David Ahern to IPv4/IPv6 maintainers
David has been the de-facto maintainer for much of the IP code
for the last couple of years, let's make it official.
Link: https://lore.kernel.org/r/20210122173220.3579491-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paul Blakey [Mon, 25 Jan 2021 15:31:26 +0000 (17:31 +0200)]
net/mlx5: CT: Fix incorrect removal of tuple_nat_node from nat rhashtable
If a non nat tuple entry is inserted just to the regular tuples
rhashtable (ct_tuples_ht) and not to natted tuples rhashtable
(ct_nat_tuples_ht). Commit
bc562be9674b ("net/mlx5e: CT: Save ct entries
tuples in hashtables") mixed up the return labels and names sot that on
cleanup or failure we still try to remove for the natted tuples rhashtable.
Fix that by correctly checking if a natted tuples insertion
before removing it. While here make it more readable.
Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables")
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Fri, 11 Dec 2020 14:05:01 +0000 (16:05 +0200)]
net/mlx5e: Revert parameters on errors when changing MTU and LRO state without reset
Sometimes, channel params are changed without recreating the channels.
It happens in two basic cases: when the channels are closed, and when
the parameter being changed doesn't affect how channels are configured.
Such changes invoke a hardware command that might fail. The whole
operation should be reverted in such cases, but the code that restores
the parameters' values in the driver was missing. This commit adds this
handling.
Fixes: 2e20a151205b ("net/mlx5e: Fail safe mtu and lro setting")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Thu, 14 Jan 2021 10:34:01 +0000 (12:34 +0200)]
net/mlx5e: Revert parameters on errors when changing trust state without reset
Trust state may be changed without recreating the channels. It happens
when the channels are closed, and when channel parameters (min inline
mode) stay the same after changing the trust state. Changing the trust
state is a hardware command that may fail. The current code didn't
restore the channel parameters to their old values if an error happened
and the channels were closed. This commit adds handling for this case.
Fixes: 6e0504c69811 ("net/mlx5e: Change inline mode correctly when changing trust state")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Fri, 11 Dec 2020 10:56:56 +0000 (12:56 +0200)]
net/mlx5e: Correctly handle changing the number of queues when the interface is down
This commit addresses two issues related to changing the number of
queues when the channels are closed:
1. Missing call to mlx5e_num_channels_changed to update
real_num_tx_queues when the number of TCs is changed.
2. When mlx5e_num_channels_changed returns an error, the channel
parameters must be reverted.
Two Fixes: tags correspond to the first commits where these two issues
were introduced.
Fixes: 3909a12e7913 ("net/mlx5e: Fix configuration of XPS cpumasks and netdev queues in corner cases")
Fixes: fa3748775b92 ("net/mlx5e: Handle errors from netif_set_real_num_{tx,rx}_queues")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Paul Blakey [Thu, 21 Jan 2021 08:06:45 +0000 (10:06 +0200)]
net/mlx5e: Fix CT rule + encap slow path offload and deletion
Currently, if a neighbour isn't valid when offloading tunnel encap rules,
we offload the original match and replace the original action with
"goto slow path" action. For this we use a temporary flow attribute based
on the original flow attribute and then change the action. Flow flags,
which among those is the CT flag, are still shared for the slow path rule
offload, so we end up parsing this flow as a CT + goto slow path rule.
Besides being unnecessary, CT action offload saves extra information in
the passed flow attribute, such as created ct_flow and mod_hdr, which
is lost onces the temporary flow attribute is freed.
When a neigh is updated and is valid, we offload the original CT rule
with original CT action, which again creates a ct_flow and mod_hdr
and saves it in the flow's original attribute. Then we delete the slow
path rule with a temporary flow attribute based on original updated
flow attribute, and we free the relevant ct_flow and mod_hdr.
Then when tc deletes this flow, we try to free the ct_flow and mod_hdr
on the flow's attribute again.
To fix the issue, skip all furture proccesing (CT/Sample/Split rules)
in offload/unoffload of slow path rules.
Call trace:
[ 758.850525] Unable to handle kernel NULL pointer dereference at virtual address
0000000000000218
[ 758.952987] Internal error: Oops:
96000005 [#1] PREEMPT SMP
[ 758.964170] Modules linked in: act_csum(E) act_pedit(E) act_tunnel_key(E) act_ct(E) nf_flow_table(E) xt_nat(E) ip6table_filter(E) ip6table_nat(E) xt_comment(E) ip6_tables(E) xt_conntrack(E) xt_MASQUERADE(E) nf_conntrack_netlink(E) xt_addrtype(E) iptable_filter(E) iptable_nat(E) bpfilter(E) br_netfilter(E) bridge(E) stp(E) llc(E) xfrm_user(E) overlay(E) act_mirred(E) act_skbedit(E) rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) esp6_offload(E) esp6(E) esp4_offload(E) esp4(E) xfrm_algo(E) mlx5_ib(OE) ib_uverbs(OE) geneve(E) ip6_udp_tunnel(E) udp_tunnel(E) nfnetlink_cttimeout(E) nfnetlink(E) mlx5_core(OE) act_gact(E) cls_flower(E) sch_ingress(E) openvswitch(E) nsh(E) nf_conncount(E) nf_nat(E) mlxfw(OE) psample(E) nf_conntrack(E) nf_defrag_ipv4(E) vfio_mdev(E) mdev(E) ib_core(OE) mlx_compat(OE) crct10dif_ce(E) uio_pdrv_genirq(E) uio(E) i2c_mlx(E) mlxbf_pmc(E) sbsa_gwdt(E) mlxbf_gige(E) gpio_mlxbf2(E) mlxbf_pka(E) mlx_trio(E) mlx_bootctl(E) bluefield_edac(E) knem(O)
[ 758.964225] ip_tables(E) mlxbf_tmfifo(E) ipv6(E) crc_ccitt(E) nf_defrag_ipv6(E)
[ 759.154186] CPU: 5 PID: 122 Comm: kworker/u16:1 Tainted: G OE 5.4.60-mlnx.52.gde81e85 #1
[ 759.172870] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:
3.5.0-2-gc1b5d64 Jan 4 2021
[ 759.195466] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
[ 759.207344] pstate:
a0000005 (NzCv daif -PAN -UAO)
[ 759.217003] pc : mlx5_del_flow_rules+0x5c/0x160 [mlx5_core]
[ 759.228229] lr : mlx5_del_flow_rules+0x34/0x160 [mlx5_core]
[ 759.405858] Call trace:
[ 759.410804] mlx5_del_flow_rules+0x5c/0x160 [mlx5_core]
[ 759.421337] __mlx5_eswitch_del_rule.isra.43+0x5c/0x1c8 [mlx5_core]
[ 759.433963] mlx5_eswitch_del_offloaded_rule_ct+0x34/0x40 [mlx5_core]
[ 759.446942] mlx5_tc_rule_delete_ct+0x68/0x74 [mlx5_core]
[ 759.457821] mlx5_tc_ct_delete_flow+0x160/0x21c [mlx5_core]
[ 759.469051] mlx5e_tc_unoffload_fdb_rules+0x158/0x168 [mlx5_core]
[ 759.481325] mlx5e_tc_encap_flows_del+0x140/0x26c [mlx5_core]
[ 759.492901] mlx5e_rep_update_flows+0x11c/0x1ec [mlx5_core]
[ 759.504127] mlx5e_rep_neigh_update+0x160/0x200 [mlx5_core]
[ 759.515314] process_one_work+0x178/0x400
[ 759.523350] worker_thread+0x58/0x3e8
[ 759.530685] kthread+0x100/0x12c
[ 759.537152] ret_from_fork+0x10/0x18
[ 759.544320] Code:
97ffef55 51000673 3100067f 54ffff41 (
b9421ab3)
[ 759.556548] ---[ end trace
fab818bb1085832d ]---
Fixes: 4c3844d9e97e ("net/mlx5e: CT: Introduce connection tracking")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maor Dickman [Sun, 24 Jan 2021 15:21:25 +0000 (17:21 +0200)]
net/mlx5e: Disable hw-tc-offload when MLX5_CLS_ACT config is disabled
The cited commit introduce new CONFIG_MLX5_CLS_ACT kconfig variable
to control compilation of TC hardware offloads implementation.
When this configuration is disabled the driver is still wrongly
reports in ethtool that hw-tc-offload is supported.
Fixed by reporting hw-tc-offload is supported only when
CONFIG_MLX5_CLS_ACT is enabled.
Fixes: d956873f908c ("net/mlx5e: Introduce kconfig var for TC support")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Vlad Buslov <vladbu@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Daniel Jurgens [Fri, 22 Jan 2021 21:13:53 +0000 (23:13 +0200)]
net/mlx5: Maintain separate page trees for ECPF and PF functions
Pages for the host PF and ECPF were stored in the same tree, so the ECPF
pages were being freed along with the host PF's when the host driver
unloaded.
Combine the function ID and ECPF flag to use as an index into the
x-array containing the trees to get a different tree for the host PF and
ECPF.
Fixes: c6168161f693 ("net/mlx5: Add support for release all pages event")
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maxim Mikityanskiy [Wed, 25 Nov 2020 11:52:36 +0000 (13:52 +0200)]
net/mlx5e: Fix IPSEC stats
When IPSEC offload isn't active, the number of stats is not zero, but
the strings are not filled, leading to exposing stats with empty names.
Fix this by using the same condition for NUM_STATS and FILL_STRS.
Fixes: 0aab3e1b04ae ("net/mlx5e: IPSec, Expose IPsec HW stat only for supporting HW")
Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Raed Salem <raeds@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Maor Dickman [Tue, 19 Jan 2021 15:21:38 +0000 (17:21 +0200)]
net/mlx5e: Reduce tc unsupported key print level
"Unsupported key used:" appears in kernel log when flows with
unsupported key are used, arp fields for example.
OpenVSwitch was changed to match on arp fields by default that
caused this warning to appear in kernel log for every arp rule, which
can be a lot.
Fix by lowering print level from warning to debug.
Fixes: e3a2b7ed018e ("net/mlx5e: Support offload cls_flower with drop action")
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Pan Bian [Thu, 21 Jan 2021 04:58:30 +0000 (20:58 -0800)]
net/mlx5e: free page before return
Instead of directly return, goto the error handling label to free
allocated page.
Fixes: 5f29458b77d5 ("net/mlx5e: Support dump callback in TX reporter")
Signed-off-by: Pan Bian <bianpan2016@163.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Parav Pandit [Tue, 12 Jan 2021 14:13:22 +0000 (16:13 +0200)]
net/mlx5e: E-switch, Fix rate calculation for overflow
rate_bytes_ps is a 64-bit field. It passed as 32-bit field to
apply_police_params(). Due to this when police rate is higher
than 4Gbps, 32-bit calculation ignores the carry. This results
in incorrect rate configurationn the device.
Fix it by performing 64-bit calculation.
Fixes: fcb64c0f5640 ("net/mlx5: E-Switch, add ingress rate support")
Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Eli Cohen <elic@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Tue, 12 Jan 2021 12:04:29 +0000 (14:04 +0200)]
net/mlx5: Fix memory leak on flow table creation error flow
When we create the ft object we also init rhltable in ft->fgs_hash.
So in error flow before kfree of ft we need to destroy that rhltable.
Fixes: 693c6883bbc4 ("net/mlx5: Add hash table for flow groups in flow table")
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Jakub Kicinski [Tue, 26 Jan 2021 23:23:17 +0000 (15:23 -0800)]
Merge tag 'mac80211-for-net-2021-01-26' of git://git./linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A couple of fixes:
* fix 160 MHz channel switch in mac80211
* fix a staging driver to not deadlock due to some
recent cfg80211 changes
* fix NULL-ptr deref if cfg80211 returns -EINPROGRESS
to wext (syzbot)
* pause TX in mac80211 in type change to prevent crashes
(syzbot)
* tag 'mac80211-for-net-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
staging: rtl8723bs: fix wireless regulatory API misuse
mac80211: pause TX while changing interface type
wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
mac80211: 160MHz with extended NSS BW in CSA
====================
Link: https://lore.kernel.org/r/20210126130529.75225-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 26 Jan 2021 23:16:39 +0000 (15:16 -0800)]
Merge tag 'wireless-drivers-2021-01-26' of git://git./linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for v5.11
Second set of fixes for v5.11. Like in last time we again have more
fixes than usual Actually a bit too much for my liking in this state
of the cycle, but due to unrelated challenges I was only able to
submit them now.
We have few important crash fixes, iwlwifi modifying read-only data
being the most reported issue, and also smaller fixes to iwlwifi.
mt76
* fix a clang warning about enum usage
* fix rx buffer refcounting crash
mt7601u
* fix rx buffer refcounting crash
* fix crash when unbplugging the device
iwlwifi
* fix a crash where we were modifying read-only firmware data
* lots of smaller fixes all over the driver
* tag 'wireless-drivers-2021-01-26' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers: (24 commits)
mt7601u: fix kernel crash unplugging the device
iwlwifi: queue: bail out on invalid freeing
iwlwifi: mvm: guard against device removal in reprobe
iwlwifi: Fix IWL_SUBDEVICE_NO_160 macro to use the correct bit.
iwlwifi: mvm: clear IN_D3 after wowlan status cmd
iwlwifi: pcie: add rules to match Qu with Hr2
iwlwifi: mvm: invalidate IDs of internal stations at mvm start
iwlwifi: mvm: fix the return type for DSM functions 1 and 2
iwlwifi: pcie: reschedule in long-running memory reads
iwlwifi: pcie: use jiffies for memory read spin time limit
iwlwifi: pcie: fix context info memory leak
iwlwifi: pcie: add a NULL check in iwl_pcie_txq_unmap
iwlwifi: pcie: set LTR on more devices
iwlwifi: queue: don't crash if txq->entries is NULL
iwlwifi: fix the NMI flow for old devices
iwlwifi: pnvm: don't try to load after failures
iwlwifi: pnvm: don't skip everything when not reloading
iwlwifi: pcie: avoid potential PNVM leaks
iwlwifi: mvm: take mutex for calling iwl_mvm_get_sync_time()
iwlwifi: mvm: skip power command when unbinding vif during CSA
...
====================
Link: https://lore.kernel.org/r/20210126092202.6A367C433CA@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet [Mon, 25 Jan 2021 15:09:49 +0000 (07:09 -0800)]
iwlwifi: provide gso_type to GSO packets
net/core/tso.c got recent support for USO, and this broke iwlfifi
because the driver implemented a limited form of GSO.
Providing ->gso_type allows for skb_is_gso_tcp() to provide
a correct result.
Fixes: 3d5b459ba0e3 ("net: tso: add UDP segmentation support")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=209913
Link: https://lore.kernel.org/r/20210125150949.619309-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Corinna Vinschen [Tue, 17 Nov 2020 19:50:40 +0000 (20:50 +0100)]
igc: fix link speed advertising
Link speed advertising in igc has two problems:
- When setting the advertisement via ethtool, the link speed is converted
to the legacy 32 bit representation for the intel PHY code.
This inadvertently drops ETHTOOL_LINK_MODE_2500baseT_Full_BIT (being
beyond bit 31). As a result, any call to `ethtool -s ...' drops the
2500Mbit/s link speed from the PHY settings. Only reloading the driver
alleviates that problem.
Fix this by converting the ETHTOOL_LINK_MODE_2500baseT_Full_BIT to the
Intel PHY ADVERTISE_2500_FULL bit explicitly.
- Rather than checking the actual PHY setting, the .get_link_ksettings
function always fills link_modes.advertising with all link speeds
the device is capable of.
Fix this by checking the PHY autoneg_advertised settings and report
only the actually advertised speeds up to ethtool.
Fixes: 8c5ad0dae93c ("igc: Add ethtool support")
Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Ævar Arnfjörð Bjarmason [Tue, 26 Jan 2021 00:04:38 +0000 (01:04 +0100)]
mailmap: remove the "repo-abbrev" comment
Remove the magical "repo-abbrev" comment added when this file was
introduced in
e0ab1ec9fcd3 ([PATCH] add .mailmap for proper
git-shortlog output, 2007-02-14).
It's been an undocumented feature of git-shortlog(1), originally added
to git for Linus's use. Since then he's no longer using it[1], and
I've removed the feature in git.git's
4e168333a87 (shortlog: remove
unused(?) "repo-abbrev" feature, 2021-01-12). It's on the "master"
branch, but not yet in a release version.
Let's also remove it from linux.git, both as a heads-up to any
potential users of it in linux.git whose use would be broken sooner
than later by git itself, and because it'll eventually be entirely
redundant.
1. https://lore.kernel.org/git/CAHk-=wixHyBKZVUcxq+NCWMbkrX0xnppb7UCopRWw1+oExYpYw@mail.gmail.com/
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Helge Deller [Tue, 26 Jan 2021 19:16:21 +0000 (20:16 +0100)]
parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES
When building a kernel without module support, the CONFIG_MLONGCALL option
needs to be enabled in order to reach symbols which are outside of a 22-bit
branch.
This patch changes the autodetection in the Kconfig script to always enable
CONFIG_MLONGCALL when modules are disabled and uses a far call to
preempt_schedule_irq() in intr_do_preempt() to reach the symbol in all cases.
Signed-off-by: Helge Deller <deller@gmx.de>
Reported-by: kernel test robot <lkp@intel.com>
Cc: stable@vger.kernel.org # v5.6+
Linus Torvalds [Tue, 26 Jan 2021 19:10:14 +0000 (11:10 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- x86 bugfixes
- Documentation fixes
- Avoid performance regression due to SEV-ES patches
- ARM:
- Don't allow tagged pointers to point to memslots
- Filter out ARMv8.1+ PMU events on v8.0 hardware
- Hide PMU registers from userspace when no PMU is configured
- More PMU cleanups
- Don't try to handle broken PSCI firmware
- More sys_reg() to reg_to_encoding() conversions
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX
KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written"
KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest
KVM: nVMX: Sync unsync'd vmcs02 state to vmcs12 on migration
kvm: tracing: Fix unmatched kvm_entry and kvm_exit events
KVM: Documentation: Update description of KVM_{GET,CLEAR}_DIRTY_LOG
KVM: x86: get smi pending status correctly
KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[]
KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh()
KVM: x86: Add more protection against undefined behavior in rsvd_bits()
KVM: Documentation: Fix spec for KVM_CAP_ENABLE_CAP_VM
KVM: Forbid the use of tagged userspace addresses for memslots
KVM: arm64: Filter out v8.1+ events on v8.0 HW
KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag
KVM: arm64: Use the reg_to_encoding() macro instead of sys_reg()
KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return
KVM: arm64: Simplify handling of absent PMU system registers
KVM: arm64: Hide PMU registers from userspace when not available
Linus Torvalds [Tue, 26 Jan 2021 19:03:30 +0000 (11:03 -0800)]
Merge tag 'spi-fix-v5.11-rc5' of git://git./linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"One new device ID here, plus an error handling fix - nothing
remarkable in either"
* tag 'spi-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spidev: Add cisco device compatible
spi: altera: Fix memory leak on error path
Linus Torvalds [Tue, 26 Jan 2021 18:59:01 +0000 (10:59 -0800)]
Merge tag 'regulator-fix-v5.11-rc5' of git://git./linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"The main thing here is a change to make sure that we don't try to
double resolve the supply of a regulator if we have two probes going
on simultaneously, plus an incremental fix on top of that to resolve a
lockdep issue it introduced.
There's also a patch from Dmitry Osipenko adding stubs for some
functions to avoid build issues in consumers in some configurations"
* tag 'regulator-fix-v5.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: Fix lockdep warning resolving supplies
regulator: consumer: Add missing stubs to regulator/consumer.h
regulator: core: avoid regulator_resolve_supply() race condition
Davidlohr Bueso [Fri, 15 Jan 2021 00:14:48 +0000 (16:14 -0800)]
parisc: Remove leftover reference to the power_tasklet
This was removed long ago, back in:
6e16d9409e1 ([PARISC] Convert soft power switch driver to kthread)
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Stefan Assmann [Mon, 30 Nov 2020 13:12:57 +0000 (14:12 +0100)]
i40e: acquire VSI pointer only after VF is initialized
This change simplifies the VF initialization check and also minimizes
the delay between acquiring the VSI pointer and using it. As known by
the commit being fixed, there is a risk of the VSI pointer getting
changed. Therefore minimize the delay between getting and using the
pointer.
Fixes: 9889707b06ac ("i40e: Fix crash caused by stress setting of VF MAC addresses")
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Brett Creeley [Thu, 21 Jan 2021 18:38:06 +0000 (10:38 -0800)]
ice: Fix MSI-X vector fallback logic
The current MSI-X enablement logic tries to enable best-case MSI-X
vectors and if that fails we only support a bare-minimum set. This
includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X
for the OICR interrupt. Unfortunately, the driver fails to load when we
don't get as many MSI-X as requested for a couple reasons.
First, the code to allocate MSI-X in the driver tries to allocate
num_online_cpus() MSI-X for LAN traffic without caring about the number
of MSI-X actually enabled/requested from the kernel for LAN traffic.
So, when calling ice_get_res() for the PF VSI, it returns failure
because the number of available vectors is less than requested. Fix
this by not allowing the PF VSI to allocation more than
pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues.
Limiting the number of queues is done because we don't want more than
1 Tx/Rx queue per interrupt due to performance conerns.
Second, the driver assigns pf->num_lan_msix = 2, to account for LAN
traffic and the OICR. However, pf->num_lan_msix is only meant for LAN
MSI-X. This is causing a failure when the PF VSI tries to
allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has
already been reserved, so there may not be enough MSI-X vectors left.
Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the
ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for
the failure case.
Update the related defines used in ice_ena_msix_range() to align with
the above behavior and remove the unused RDMA defines because RDMA is
currently not supported. Also, remove the now incorrect comment.
Fixes: 152b978a1f90 ("ice: Rework ice_ena_msix_range")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Brett Creeley [Thu, 21 Jan 2021 18:38:05 +0000 (10:38 -0800)]
ice: Don't allow more channels than LAN MSI-X available
Currently users could create more channels than LAN MSI-X available.
This is happening because there is no check against pf->num_lan_msix
when checking the max allowed channels and will cause performance issues
if multiple Tx and Rx queues are tied to a single MSI-X. Fix this by not
allowing more channels than LAN MSI-X available in pf->num_lan_msix.
Fixes: 87324e747fde ("ice: Implement ethtool ops for channels")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Nick Nunley [Sat, 21 Nov 2020 00:38:33 +0000 (16:38 -0800)]
ice: update dev_addr in ice_set_mac_address even if HW filter exists
Fix the driver to copy the MAC address configured in ndo_set_mac_address
into dev_addr, even if the MAC filter already exists in HW. In some
situations (e.g. bonding) the netdev's dev_addr could have been modified
outside of the driver, with no change to the HW filter, so the driver
cannot assume that they match.
Fixes: 757976ab16be ("ice: Fix check for removing/adding mac filters")
Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Nick Nunley [Sat, 21 Nov 2020 00:38:31 +0000 (16:38 -0800)]
ice: Implement flow for IPv6 next header (extension header)
This patch is based on a similar change to i40e by Slawomir Laba:
"i40e: Implement flow for IPv6 next header (extension header)".
When a packet contains an IPv6 header with next header which is
an extension header and not a protocol one, the kernel function
skb_transport_header called with such sk_buff will return a
pointer to the extension header and not to the TCP one.
The above explained call caused a problem with packet processing
for skb with encapsulation for tunnel with ICE_TX_CTX_EIPT_IPV6.
The extension header was not skipped at all.
The ipv6_skip_exthdr function does check if next header of the IPV6
header is an extension header and doesn't modify the l4_proto pointer
if it points to a protocol header value so its safe to omit the
comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can
return value -1. This means that the skipping process failed
and there is something wrong with the packet so it will be dropped.
Fixes: a4e82a81f573 ("ice: Add support for tunnel offloads")
Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Henry Tieman [Sat, 21 Nov 2020 00:38:30 +0000 (16:38 -0800)]
ice: fix FDir IPv6 flexbyte
The packet classifier would occasionally misrecognize an IPv6 training
packet when the next protocol field was 0. The correct value for
unspecified protocol is IPPROTO_NONE.
Fixes: 165d80d6adab ("ice: Support IPv6 Flow Director filters")
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>