Peter Zijlstra [Mon, 21 Mar 2022 09:57:51 +0000 (10:57 +0100)]
Merge branch 'kvm/kvm-sls-fix'
Sync with the last minute SLS fix to extend it for IBT.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Nathan Chancellor [Fri, 18 Mar 2022 23:07:47 +0000 (16:07 -0700)]
x86/Kconfig: Only allow CONFIG_X86_KERNEL_IBT with ld.lld >= 14.0.0
With CONFIG_X86_KERNEL_IBT=y and a version of ld.lld prior to 14.0.0,
there are numerous objtool warnings along the lines of:
warning: objtool: .plt+0x6: indirect jump found in RETPOLINE build
This is a known issue that has been resolved in ld.lld 14.0.0. Prevent
CONFIG_X86_KERNEL_IBT from being selectable when using one of these
problematic ld.lld versions.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220318230747.3900772-3-nathan@kernel.org
Nathan Chancellor [Fri, 18 Mar 2022 23:07:46 +0000 (16:07 -0700)]
x86/Kconfig: Only enable CONFIG_CC_HAS_IBT for clang >= 14.0.0
Commit
156ff4a544ae ("x86/ibt: Base IBT bits") added a check for a crash
with 'clang -fcf-protection=branch -mfentry -pg', which intended to
exclude Clang versions older than 14.0.0 from selecting
CONFIG_X86_KERNEL_IBT.
clang-11 does not have the issue that the check is testing for, so
CONFIG_X86_KERNEL_IBT is selectable. Unfortunately, there is a different
crash in clang-11 that was fixed in clang-12. To make matters worse,
that crash does not appear to be entirely deterministic, as the same
input to the compiler will sometimes crash and other times not, which
makes dynamically checking for the crash like the '-pg' one unreliable.
To make everything work properly for all common versions of clang, use a
hard version check of 14.0.0, as that will be the first release upstream
that has both bugs properly fixed.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220318230747.3900772-2-nathan@kernel.org
Peter Zijlstra [Fri, 18 Mar 2022 11:19:27 +0000 (12:19 +0100)]
kbuild: Fixup the IBT kbuild changes
Masahiro-san deemed my kbuild changes to support whole module objtool
runs too terrible to live and gracefully provided an alternative.
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/CAK7LNAQ2mYMnOKMQheVi+6byUFE3KEkjm1zcndNUfe0tORGvug@mail.gmail.com
Borislav Petkov [Wed, 16 Mar 2022 21:05:52 +0000 (22:05 +0100)]
kvm/emulate: Fix SETcc emulation function offsets with SLS
The commit in Fixes started adding INT3 after RETs as a mitigation
against straight-line speculation.
The fastop SETcc implementation in kvm's insn emulator uses macro magic
to generate all possible SETcc functions and to jump to them when
emulating the respective instruction.
However, it hardcodes the size and alignment of those functions to 4: a
three-byte SETcc insn and a single-byte RET. BUT, with SLS, there's an
INT3 that gets slapped after the RET, which brings the whole scheme out
of alignment:
15: 0f 90 c0 seto %al
18: c3 ret
19: cc int3
1a: 0f 1f 00 nopl (%rax)
1d: 0f 91 c0 setno %al
20: c3 ret
21: cc int3
22: 0f 1f 00 nopl (%rax)
25: 0f 92 c0 setb %al
28: c3 ret
29: cc int3
and this explodes like this:
int3: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 2435 Comm: qemu-system-x86 Not tainted 5.17.0-rc8-sls #1
Hardware name: Dell Inc. Precision WorkStation T3400 /0TP412, BIOS A14 04/30/2012
RIP: 0010:setc+0x5/0x8 [kvm]
Code: 00 00 0f 1f 00 0f b6 05 43 24 06 00 c3 cc 0f 1f 80 00 00 00 00 0f 90 c0 c3 cc 0f \
1f 00 0f 91 c0 c3 cc 0f 1f 00 0f 92 c0 c3 cc <0f> 1f 00 0f 93 c0 c3 cc 0f 1f 00 \
0f 94 c0 c3 cc 0f 1f 00 0f 95 c0
Call Trace:
<TASK>
? x86_emulate_insn [kvm]
? x86_emulate_instruction [kvm]
? vmx_handle_exit [kvm_intel]
? kvm_arch_vcpu_ioctl_run [kvm]
? kvm_vcpu_ioctl [kvm]
? __x64_sys_ioctl
? do_syscall_64
? entry_SYSCALL_64_after_hwframe
</TASK>
Raise the alignment value when SLS is enabled and use a macro for that
instead of hard-coding naked numbers.
Fixes:
e463a09af2f0 ("x86: Add straight-line-speculation mitigation")
Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Jamie Heilman <jamie@audible.transient.net>
Link: https://lore.kernel.org/r/YjGzJwjrvxg5YZ0Z@audible.transient.net
[Add a comment and a bit of safety checking, since this is going to be changed
again for IBT support. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Fri, 18 Mar 2022 19:32:59 +0000 (12:32 -0700)]
Merge tag 'arm64-fixes' of git://git./linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
"Fix two compiler warnings introduced by recent commits: pointer
arithmetic and double initialisation of struct field"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: errata: avoid duplicate field initializer
arm64: fix clang warning about TRAMP_VALIAS
Linus Torvalds [Fri, 18 Mar 2022 19:22:15 +0000 (12:22 -0700)]
Merge tag '5.17-rc8-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fix from Steve French:
"Small fix for regression in multiuser mounts.
The additional improvements suggested by Ronnie to make the server and
session status handling code easier to read can wait for the 5.18
merge window."
* tag '5.17-rc8-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
smb3: fix incorrect session setup check for multiuser mounts
Linus Torvalds [Fri, 18 Mar 2022 19:15:56 +0000 (12:15 -0700)]
Merge tag 'block-5.17-2022-03-18' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
- Revert of a nvme target feature (Hannes)
- Fix a memory leak with rq-qos (Ming)
* tag 'block-5.17-2022-03-18' of git://git.kernel.dk/linux-block:
nvmet: revert "nvmet: make discovery NQN configurable"
block: release rq qos structures for queue without disk
Linus Torvalds [Fri, 18 Mar 2022 19:01:19 +0000 (12:01 -0700)]
Merge tag 'drm-fixes-2022-03-18' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"A few minor changes to finish things off, one mgag200 regression, imx
fix and couple of panel changes.
imx:
- Don't test bus flags in atomic check
mgag200:
- Fix PLL setup on some models
panel:
- Fix bpp settings on Innolux G070Y2-L01
- Fix DRM_PANEL_EDP Kconfig dependencies"
* tag 'drm-fixes-2022-03-18' of git://anongit.freedesktop.org/drm/drm:
drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS
drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings
drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()
drm/mgag200: Fix PLL setup for g200wb and g200ew
Arnd Bergmann [Wed, 16 Mar 2022 18:37:45 +0000 (19:37 +0100)]
arm64: errata: avoid duplicate field initializer
The '.type' field is initialized both in place and in the macro
as reported by this W=1 warning:
arch/arm64/include/asm/cpufeature.h:281:9: error: initialized field overwritten [-Werror=override-init]
281 | (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
| ^
arch/arm64/kernel/cpu_errata.c:136:17: note: in expansion of macro 'ARM64_CPUCAP_LOCAL_CPU_ERRATUM'
136 | .type = ARM64_CPUCAP_LOCAL_CPU_ERRATUM, \
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/arm64/kernel/cpu_errata.c:145:9: note: in expansion of macro 'ERRATA_MIDR_RANGE'
145 | ERRATA_MIDR_RANGE(m, var, r_min, var, r_max)
| ^~~~~~~~~~~~~~~~~
arch/arm64/kernel/cpu_errata.c:613:17: note: in expansion of macro 'ERRATA_MIDR_REV_RANGE'
613 | ERRATA_MIDR_REV_RANGE(MIDR_CORTEX_A510, 0, 0, 2),
| ^~~~~~~~~~~~~~~~~~~~~
arch/arm64/include/asm/cpufeature.h:281:9: note: (near initialization for 'arm64_errata[18].type')
281 | (ARM64_CPUCAP_SCOPE_LOCAL_CPU | ARM64_CPUCAP_OPTIONAL_FOR_LATE_CPU)
| ^
Remove the extranous initializer.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes:
1dd498e5e26a ("KVM: arm64: Workaround Cortex-A510's single-step and PAC trap errata")
Link: https://lore.kernel.org/r/20220316183800.1546731-1-arnd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Arnd Bergmann [Wed, 16 Mar 2022 18:38:18 +0000 (19:38 +0100)]
arm64: fix clang warning about TRAMP_VALIAS
The newly introduced TRAMP_VALIAS definition causes a build warning
with clang-14:
arch/arm64/include/asm/vectors.h:66:31: error: arithmetic on a null pointer treated as a cast from integer to pointer is a GNU extension [-Werror,-Wnull-pointer-arithmetic]
return (char *)TRAMP_VALIAS + SZ_2K * slot;
Change the addition to something clang does not complain about.
Fixes:
bd09128d16fa ("arm64: Add percpu vectors for EL1")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/r/20220316183833.1563139-1-arnd@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Dave Airlie [Fri, 18 Mar 2022 03:30:30 +0000 (13:30 +1000)]
Merge tag 'drm-misc-fixes-2022-03-17' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
* drm/imx: Don't test bus flags in atomic check
* drm/mgag200: Fix PLL setup on some models
* drm/panel: Fix bpp settings on Innolux G070Y2-L01; Fix DRM_PANEL_EDP
Kconfig dependencies
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/YjMNcqOuDFDoe+EN@linux-uq9g
Linus Torvalds [Thu, 17 Mar 2022 19:55:26 +0000 (12:55 -0700)]
Merge tag 'net-5.17-final' of git://git./linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, ipsec, and wireless.
A few last minute revert / disable and fix patches came down from our
sub-trees. We're not waiting for any fixes at this point.
Current release - regressions:
- Revert "netfilter: nat: force port remap to prevent shadowing
well-known ports", restore working conntrack on asymmetric paths
- Revert "ath10k: drop beacon and probe response which leak from
other channel", restore working AP and mesh mode on QCA9984
- eth: intel: fix hang during reboot/shutdown
Current release - new code bugs:
- netfilter: nf_tables: disable register tracking, it needs more work
to cover all corner cases
Previous releases - regressions:
- ipv6: fix skb_over_panic in __ip6_append_data when (admin-only)
extension headers get specified
- esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return
value more selectively
- bnx2x: fix driver load failure when FW not present in initrd
Previous releases - always broken:
- vsock: stop destroying unrelated sockets in nested virtualization
- packet: fix slab-out-of-bounds access in packet_recvmsg()
Misc:
- add Paolo Abeni to networking maintainers!"
* tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (26 commits)
iavf: Fix hang during reboot/shutdown
net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
net: bcmgenet: skip invalid partial checksums
bnx2x: fix built-in kernel driver load failure
net: phy: mscc: Add MODULE_FIRMWARE macros
net: dsa: Add missing of_node_put() in dsa_port_parse_of
net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
Revert "ath10k: drop beacon and probe response which leak from other channel"
hv_netvsc: Add check for kvmalloc_array
iavf: Fix double free in iavf_reset_task
ice: destroy flow director filter mutex after releasing VSIs
ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
Add Paolo Abeni to networking maintainers
atm: eni: Add check for dma_map_single
net/packet: fix slab-out-of-bounds access in packet_recvmsg()
net: mdio: mscc-miim: fix duplicate debugfs entry
net: phy: marvell: Fix invalid comparison in the resume and suspend functions
esp6: fix check on ipv6_skip_exthdr's return value
net: dsa: microchip: add spi_device_id tables
netfilter: nf_tables: disable register tracking
...
Linus Torvalds [Thu, 17 Mar 2022 19:40:59 +0000 (12:40 -0700)]
Merge tag 'acpi-5.17-rc9' of git://git./linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert recent commit that caused multiple systems to misbehave due to
firmware issues"
* tag 'acpi-5.17-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid"
Linus Torvalds [Thu, 17 Mar 2022 19:36:47 +0000 (12:36 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
"Four patches.
Subsystems affected by this patch series: mm/swap, kconfig, ocfs2, and
selftests"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
selftests: vm: fix clang build error multiple output files
ocfs2: fix crash when initialize filecheck kobj fails
configs/debug: restore DEBUG_INFO=y for overriding
mm: swap: get rid of livelock in swapin readahead
Yosry Ahmed [Wed, 16 Mar 2022 23:15:12 +0000 (16:15 -0700)]
selftests: vm: fix clang build error multiple output files
When building the vm selftests using clang, some errors are seen due to
having headers in the compilation command:
clang -Wall -I ../../../../usr/include -no-pie gup_test.c ../../../../mm/gup_test.h -lrt -lpthread -o .../tools/testing/selftests/vm/gup_test
clang: error: cannot specify -o when generating multiple output files
make[1]: *** [../lib.mk:146: .../tools/testing/selftests/vm/gup_test] Error 1
Rework to add the header files to LOCAL_HDRS before including ../lib.mk,
since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in
file lib.mk.
Link: https://lkml.kernel.org/r/20220304000645.1888133-1-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Wed, 16 Mar 2022 23:15:09 +0000 (16:15 -0700)]
ocfs2: fix crash when initialize filecheck kobj fails
Once s_root is set, genric_shutdown_super() will be called if
fill_super() fails. That means, we will call ocfs2_dismount_volume()
twice in such case, which can lead to kernel crash.
Fix this issue by initializing filecheck kobj before setting s_root.
Link: https://lkml.kernel.org/r/20220310081930.86305-1-joseph.qi@linux.alibaba.com
Fixes:
5f483c4abb50 ("ocfs2: add kobject for online file check")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Qian Cai [Wed, 16 Mar 2022 23:15:06 +0000 (16:15 -0700)]
configs/debug: restore DEBUG_INFO=y for overriding
Previously, I failed to realize that Kees' patch [1] has not been merged
into the mainline yet, and dropped DEBUG_INFO=y too eagerly from the
mainline. As the results, "make debug.config" won't be able to flip
DEBUG_INFO=n from the existing .config. This should close the gaps of a
few weeks before Kees' patch is there, and work regardless of their
merging status anyway.
Link: https://lore.kernel.org/all/20220125075126.891825-1-keescook@chromium.org/
Link: https://lkml.kernel.org/r/20220308153524.8618-1-quic_qiancai@quicinc.com
Signed-off-by: Qian Cai <quic_qiancai@quicinc.com>
Reported-by: Daniel Thompson <daniel.thompson@linaro.org>
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Guo Ziliang [Wed, 16 Mar 2022 23:15:03 +0000 (16:15 -0700)]
mm: swap: get rid of livelock in swapin readahead
In our testing, a livelock task was found. Through sysrq printing, same
stack was found every time, as follows:
__swap_duplicate+0x58/0x1a0
swapcache_prepare+0x24/0x30
__read_swap_cache_async+0xac/0x220
read_swap_cache_async+0x58/0xa0
swapin_readahead+0x24c/0x628
do_swap_page+0x374/0x8a0
__handle_mm_fault+0x598/0xd60
handle_mm_fault+0x114/0x200
do_page_fault+0x148/0x4d0
do_translation_fault+0xb0/0xd4
do_mem_abort+0x50/0xb0
The reason for the livelock is that swapcache_prepare() always returns
EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
cannot jump out of the loop. We suspect that the task that clears the
SWAP_HAS_CACHE flag never gets a chance to run. We try to lower the
priority of the task stuck in a livelock so that the task that clears
the SWAP_HAS_CACHE flag will run. The results show that the system
returns to normal after the priority is lowered.
In our testing, multiple real-time tasks are bound to the same core, and
the task in the livelock is the highest priority task of the core, so
the livelocked task cannot be preempted.
Although cond_resched() is used by __read_swap_cache_async, it is an
empty function in the preemptive system and cannot achieve the purpose
of releasing the CPU. A high-priority task cannot release the CPU
unless preempted by a higher-priority task. But when this task is
already the highest priority task on this core, other tasks will not be
able to be scheduled. So we think we should replace cond_resched() with
schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will
call set_current_state first to set the task state, so the task will be
removed from the running queue, so as to achieve the purpose of giving
up the CPU and prevent it from running in kernel mode for too long.
(akpm: ugly hack becomes uglier. But it fixes the issue in a
backportable-to-stable fashion while we hopefully work on something
better)
Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roger Quadros <rogerq@kernel.org>
Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ivan Vecera [Thu, 17 Mar 2022 10:45:24 +0000 (11:45 +0100)]
iavf: Fix hang during reboot/shutdown
Recent commit
974578017fc1 ("iavf: Add waiting so the port is
initialized in remove") adds a wait-loop at the beginning of
iavf_remove() to ensure that port initialization is finished
prior unregistering net device. This causes a regression
in reboot/shutdown scenario because in this case callback
iavf_shutdown() is called and this callback detaches the device,
makes it down if it is running and sets its state to __IAVF_REMOVE.
Later shutdown callback of associated PF driver (e.g. ice_shutdown)
is called. That callback calls among other things sriov_disable()
that calls indirectly iavf_remove() (see stack trace below).
As the adapter state is already __IAVF_REMOVE then the mentioned
loop is end-less and shutdown process hangs.
The patch fixes this by checking adapter's state at the beginning
of iavf_remove() and skips the rest of the function if the adapter
is already in remove state (shutdown is in progress).
Reproducer:
1. Create VF on PF driven by ice or i40e driver
2. Ensure that the VF is bound to iavf driver
3. Reboot
[52625.981294] sysrq: SysRq : Show Blocked State
[52625.988377] task:reboot state:D stack: 0 pid:17359 ppid: 1 f2
[52625.996732] Call Trace:
[52625.999187] __schedule+0x2d1/0x830
[52626.007400] schedule+0x35/0xa0
[52626.010545] schedule_hrtimeout_range_clock+0x83/0x100
[52626.020046] usleep_range+0x5b/0x80
[52626.023540] iavf_remove+0x63/0x5b0 [iavf]
[52626.027645] pci_device_remove+0x3b/0xc0
[52626.031572] device_release_driver_internal+0x103/0x1f0
[52626.036805] pci_stop_bus_device+0x72/0xa0
[52626.040904] pci_stop_and_remove_bus_device+0xe/0x20
[52626.045870] pci_iov_remove_virtfn+0xba/0x120
[52626.050232] sriov_disable+0x2f/0xe0
[52626.053813] ice_free_vfs+0x7c/0x340 [ice]
[52626.057946] ice_remove+0x220/0x240 [ice]
[52626.061967] ice_shutdown+0x16/0x50 [ice]
[52626.065987] pci_device_shutdown+0x34/0x60
[52626.070086] device_shutdown+0x165/0x1c5
[52626.074011] kernel_restart+0xe/0x30
[52626.077593] __do_sys_reboot+0x1d2/0x210
[52626.093815] do_syscall_64+0x5b/0x1a0
[52626.097483] entry_SYSCALL_64_after_hwframe+0x65/0xca
Fixes:
974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Wed, 16 Mar 2022 19:21:17 +0000 (21:21 +0200)]
net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since
the blamed commit, through a chain index whose number encodes a specific
PAG (Policy Action Group) and lookup number.
The chain number is translated through ocelot_chain_to_pag() into a PAG,
and through ocelot_chain_to_lookup() into a lookup number.
The problem with the blamed commit is that the above 2 functions don't
have special treatment for chain 0. So ocelot_chain_to_pag(0) returns
filter->pag = 224, which is in fact -32, but the "pag" field is an u8.
So we end up programming the hardware with VCAP IS2 entries having a PAG
of 224. But the way in which the PAG works is that it defines a subset
of VCAP IS2 filters which should match on a packet. The default PAG is
0, and previous VCAP IS1 rules (which we offload using 'goto') can
modify it. So basically, we are installing filters with a PAG on which
no packet will ever match. This is the hardware equivalent of adding
filters to a chain which has no 'goto' to it.
Restore the previous functionality by making ACL filters offloaded to
chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly
correct, but the choice of lookup number isn't "as before" (which was to
leave the lookup a "don't care"). However, lookup 0 should be fine,
since even though there are ACL actions (policers) which have a
requirement to be used in a specific lookup, that lookup is 0.
Fixes:
226e9cd82a96 ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Doug Berger [Thu, 17 Mar 2022 01:28:12 +0000 (18:28 -0700)]
net: bcmgenet: skip invalid partial checksums
The RXCHK block will return a partial checksum of 0 if it encounters
a problem while receiving a packet. Since a 1's complement sum can
only produce this result if no bits are set in the received data
stream it is fair to treat it as an invalid partial checksum and
not pass it up the stack.
Fixes:
810155397890 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Manish Chopra [Wed, 16 Mar 2022 21:46:13 +0000 (14:46 -0700)]
bnx2x: fix built-in kernel driver load failure
Commit
b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0")
added request_firmware() logic in probe() which caused
load failure when firmware file is not present in initrd (below),
as access to firmware file is not feasible during probe.
Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2
Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2
This patch fixes this issue by -
1. Removing request_firmware() logic from the probe()
such that .ndo_open() handle it as it used to handle
it earlier
2. Given request_firmware() is removed from probe(), so
driver has to relax FW version comparisons a bit against
the already loaded FW version (by some other PFs of same
adapter) to allow different compatible/close enough FWs with which
multiple PFs may run with (in different environments), as the
given PF who is in probe flow has no idea now with which firmware
file version it is going to initialize the device in ndo_open()
Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes:
b7a49f73059f ("bnx2x: Utilize firmware 7.13.21.0")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Juerg Haefliger [Wed, 16 Mar 2022 15:18:35 +0000 (16:18 +0100)]
net: phy: mscc: Add MODULE_FIRMWARE macros
The driver requires firmware so define MODULE_FIRMWARE so that modinfo
provides the details.
Fixes:
fa164e40c53b ("net: phy: mscc: split the driver into separate files")
Signed-off-by: Juerg Haefliger <juergh@canonical.com>
Link: https://lore.kernel.org/r/20220316151835.88765-1-juergh@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Miaoqian Lin [Wed, 16 Mar 2022 08:26:02 +0000 (08:26 +0000)]
net: dsa: Add missing of_node_put() in dsa_port_parse_of
The device_node pointer is returned by of_parse_phandle() with refcount
incremented. We should use of_node_put() on it when done.
Fixes:
6d4e5c570c2d ("net: dsa: get port type at parse time")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220316082602.10785-1-linmq006@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Thomas Zimmermann [Tue, 15 Mar 2022 08:45:59 +0000 (09:45 +0100)]
drm: Don't make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS
Fix a number of undefined references to drm_kms_helper.ko in
drm_dp_helper.ko:
arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_duplicate_state':
drm_dp_mst_topology.c:(.text+0x2df0): undefined reference to `__drm_atomic_helper_private_obj_duplicate_state'
arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_delayed_destroy_work':
drm_dp_mst_topology.c:(.text+0x370c): undefined reference to `drm_kms_helper_hotplug_event'
arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_up_req_work':
drm_dp_mst_topology.c:(.text+0x7938): undefined reference to `drm_kms_helper_hotplug_event'
arm-suse-linux-gnueabi-ld: drivers/gpu/drm/dp/drm_dp_mst_topology.o: in function `drm_dp_mst_link_probe_work':
drm_dp_mst_topology.c:(.text+0x82e0): undefined reference to `drm_kms_helper_hotplug_event'
This happens if panel-edp.ko has been configured with
DRM_PANEL_EDP=y
DRM_DP_HELPER=y
DRM_KMS_HELPER=m
which builds DP helpers into the kernel and KMS helpers sa a module.
Making DRM_PANEL_EDP select DRM_KMS_HELPER resolves this problem.
To avoid a resulting cyclic dependency with DRM_PANEL_BRIDGE, don't
make the latter depend on DRM_KMS_HELPER and fix the one DRM bridge
drivers that doesn't already select DRM_KMS_HELPER. As KMS helpers
cannot be selected directly by the user, config symbols should avoid
depending on it anyway.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes:
3755d35ee1d2 ("drm/panel: Select DRM_DP_HELPER for DRM_PANEL_EDP")
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Tested-by: Brian Masney <bmasney@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: Linux Kernel Functional Testing <lkft@linaro.org>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: dri-devel@lists.freedesktop.org
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/478296/
Thomas Zimmermann [Thu, 17 Mar 2022 10:03:28 +0000 (11:03 +0100)]
Merge drm/drm-fixes into drm-misc-fixes
Backmerging drm/drm-fixes for commit
3755d35ee1d2 ("drm/panel: Select
DRM_DP_HELPER for DRM_PANEL_EDP").
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Steve French [Thu, 17 Mar 2022 03:08:43 +0000 (22:08 -0500)]
smb3: fix incorrect session setup check for multiuser mounts
A recent change to how the SMB3 server (socket) and session status
is managed regressed multiuser mounts by changing the check
for whether session setup is needed to the socket (TCP_Server_info)
structure instead of the session struct (cifs_ses). Add additional
check in cifs_setup_sesion to fix this.
Fixes:
73f9bfbe3d81 ("cifs: maintain a state machine for tcp/smb/tcon sessions")
Reported-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Nicolas Dichtel [Tue, 15 Mar 2022 09:20:08 +0000 (10:20 +0100)]
net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
This kind of interface doesn't have a mac header. This patch fixes
bpf_redirect() to a PIM interface.
Fixes:
27b29f63058d ("bpf: add bpf_redirect() helper")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220315092008.31423-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Wed, 16 Mar 2022 18:57:46 +0000 (11:57 -0700)]
Merge tag 'efi-urgent-for-v5.17-3' of git://git./linux/kernel/git/efi/efi
Pull EFI fix from Ard Biesheuvel:
"Avoid spurious warnings about unknown boot parameters"
* tag 'efi-urgent-for-v5.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: fix return value of __setup handlers
Linus Torvalds [Wed, 16 Mar 2022 18:50:35 +0000 (11:50 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a bug where qcom-rng can return a buffer that is not
completely filled with random data"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: qcom-rng - ensure buffer for generate is completely filled
Jakub Kicinski [Wed, 16 Mar 2022 18:39:36 +0000 (11:39 -0700)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2022-03-16
1) Fix a kernel-info-leak in pfkey.
From Haimin Zhang.
2) Fix an incorrect check of the return value of ipv6_skip_exthdr.
From Sabrina Dubroca.
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
esp6: fix check on ipv6_skip_exthdr's return value
af_key: add __GFP_ZERO flag for compose_sadb_supported in function pfkey_register
====================
Link: https://lore.kernel.org/r/20220316121142.3142336-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 16 Mar 2022 18:08:09 +0000 (11:08 -0700)]
Merge tag 'wireless-2022-03-16' of git://git./linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v5.17
Third set of fixes for v5.17. We have only one revert to fix an ath10k
regression.
* tag 'wireless-2022-03-16' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
Revert "ath10k: drop beacon and probe response which leak from other channel"
====================
Link: https://lore.kernel.org/r/20220316130249.B5225C340EC@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Marek Vasut [Sun, 20 Feb 2022 04:07:18 +0000 (05:07 +0100)]
drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings
The Innolux G070Y2-L01 supports two modes of operation:
1) FRC=Low/NC ... MEDIA_BUS_FMT_RGB666_1X7X3_SPWG ... BPP=6
2) FRC=High ..... MEDIA_BUS_FMT_RGB888_1X7X4_SPWG ... BPP=8
Currently the panel description mixes both, BPP from 1) and bus
format from 2), which triggers a warning at panel-simple.c:615.
Pick the later, set bpp=8, fix the warning.
Fixes:
a5d2ade627dca ("drm/panel: simple: Add support for Innolux G070Y2-L01")
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Christoph Fritz <chf.fritz@googlemail.com>
Cc: Laurent Pinchart <Laurent.pinchart@ideasonboard.com>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220220040718.532866-1-marex@denx.de
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Christoph Niedermaier [Tue, 1 Feb 2022 11:36:43 +0000 (12:36 +0100)]
drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check()
If display timings were read from the devicetree using
of_get_display_timing() and pixelclk-active is defined
there, the flag DISPLAY_FLAGS_SYNC_POSEDGE/NEGEDGE is
automatically generated. Through the function
drm_bus_flags_from_videomode() e.g. called in the
panel-simple driver this flag got into the bus flags,
but then in imx_pd_bridge_atomic_check() the bus flag
check failed and will not initialize the display. The
original commit
fe141cedc433 does not explain why this
check was introduced. So remove the bus flags check,
because it stops the initialization of the display with
valid bus flags.
Fixes:
fe141cedc433 ("drm/imx: pd: Use bus format/flags provided by the bridge when available")
Signed-off-by: Christoph Niedermaier <cniedermaier@dh-electronics.com>
Cc: Marek Vasut <marex@denx.de>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Shawn Guo <shawnguo@kernel.org>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Pengutronix Kernel Team <kernel@pengutronix.de>
Cc: Fabio Estevam <festevam@gmail.com>
Cc: NXP Linux Team <linux-imx@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
To: dri-devel@lists.freedesktop.org
Tested-by: Max Krummenacher <max.krummenacher@toradex.com>
Acked-by: Boris Brezillon <boris.brezillon@collabora.com>
Signed-off-by: Marek Vasut <marex@denx.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220201113643.4638-1-cniedermaier@dh-electronics.com
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Jens Axboe [Wed, 16 Mar 2022 11:43:25 +0000 (05:43 -0600)]
Merge tag 'nvme-5.17-2022-03-16' of git://git.infradead.org/nvme into block-5.17
Pull NVMe fix from Christoph:
"nvme fix for Linux 5.17
- last minute revert of a nvmet feature added in Linux 5.16
(Hannes Reinecke)"
* tag 'nvme-5.17-2022-03-16' of git://git.infradead.org/nvme:
nvmet: revert "nvmet: make discovery NQN configurable"
Kalle Valo [Tue, 15 Mar 2022 15:54:55 +0000 (17:54 +0200)]
Revert "ath10k: drop beacon and probe response which leak from other channel"
This reverts commit
3bf2537ec2e33310b431b53fd84be8833736c256.
I was reported privately that this commit breaks AP and mesh mode on QCA9984
(firmware 10.4-3.9.0.2-00156). So revert the commit to fix the regression.
There was a conflict due to cfg80211 API changes but that was easy to fix.
Fixes:
3bf2537ec2e3 ("ath10k: drop beacon and probe response which leak from other channel")
Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
Link: https://lore.kernel.org/r/20220315155455.20446-1-kvalo@kernel.org
Rafael J. Wysocki [Wed, 16 Mar 2022 10:23:05 +0000 (11:23 +0100)]
Revert "ACPI: scan: Do not add device IDs from _CID if _HID is not valid"
Revert commit
e38f9ff63e6d ("ACPI: scan: Do not add device IDs from _CID
if _HID is not valid"), because it has introduced regressions on
multiple systems, even though it only has effect on clearly invalid
firmware.
Reported-by: Pierre-Louis Bossart <notifications@github.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
David S. Miller [Wed, 16 Mar 2022 10:07:43 +0000 (10:07 +0000)]
Merge branch '100GbE' of git://git./linux/kernel/git/tnguy/net-queue
====================
Intel Wired LAN Driver Updates 2022-03-15
This series contains updates to ice and iavf drivers.
Maciej adjusts null check logic on Tx ring to prevent possible NULL
pointer dereference for ice.
Sudheer moves destruction of Flow Director lock as it was being accessed
after destruction for ice.
Przemyslaw removes an excess mutex unlock as it was being double
unlocked for iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jiasheng Jiang [Mon, 14 Mar 2022 02:01:25 +0000 (10:01 +0800)]
hv_netvsc: Add check for kvmalloc_array
As the potential failure of the kvmalloc_array(),
it should be better to check and restore the 'data'
if fails in order to avoid the dereference of the
NULL pointer.
Fixes:
6ae746711263 ("hv_netvsc: Add per-cpu ethtool stats for netvsc")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/20220314020125.2365084-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Przemyslaw Patynowski [Wed, 9 Mar 2022 15:37:39 +0000 (16:37 +0100)]
iavf: Fix double free in iavf_reset_task
Fix double free possibility in iavf_disable_vf, as crit_lock is
freed in caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf.
Remove mutex_unlock in iavf_disable_vf.
Without this patch there is double free scenario, when calling
iavf_reset_task.
Fixes:
e85ff9c631e1 ("iavf: Fix deadlock in iavf_reset_task")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Sudheer Mogilappagari [Thu, 10 Mar 2022 18:46:52 +0000 (10:46 -0800)]
ice: destroy flow director filter mutex after releasing VSIs
Currently fdir_fltr_lock is accessed in ice_vsi_release_all() function
after it is destroyed. Instead destroy mutex after ice_vsi_release_all.
Fixes:
40319796b732 ("ice: Add flow director support for channel mode")
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Maciej Fijalkowski [Mon, 7 Mar 2022 17:47:39 +0000 (18:47 +0100)]
ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
It is possible to do NULL pointer dereference in routine that updates
Tx ring stats. Currently only stats and bytes are updated when ring
pointer is valid, but later on ring is accessed to propagate gathered Tx
stats onto VSI stats.
Change the existing logic to move to next ring when ring is NULL.
Fixes:
e72bba21355d ("ice: split ice_ring onto Tx/Rx separate structs")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jakub Kicinski [Mon, 14 Mar 2022 22:28:19 +0000 (15:28 -0700)]
Add Paolo Abeni to networking maintainers
Growing the network maintainers team from 2 to 3.
Signed-off-by: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/r/20220314222819.958428-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bartosz Golaszewski [Tue, 15 Mar 2022 16:52:05 +0000 (17:52 +0100)]
Revert "gpio: Revert regression in sysfs-gpio (gpiolib.c)"
This reverts commit
fc328a7d1fcce263db0b046917a66f3aa6e68719.
This commit - while attempting to fix a regression - has caused a number
of other problems. As the fallout from it is more significant than the
initial problem itself, revert it for now before we find a correct
solution.
Link: https://lore.kernel.org/all/20220314192522.GA3031157@roeck-us.net/
Link: https://lore.kernel.org/stable/20220314155509.552218-1-michael@walle.cc/
Link: https://lore.kernel.org/all/20211217153555.9413-1-marcelo.jimenez@gmail.com/
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Reported-and-bisected-by: Guenter Roeck <linux@roeck-us.net>
Reported-by: Michael Walle <michael@walle.cc>
Cc: Thorsten Leemhuis <linux@leemhuis.info>
Cc: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ingo Molnar [Tue, 15 Mar 2022 11:52:51 +0000 (12:52 +0100)]
Merge branch 'x86/cpu' into x86/core, to resolve conflicts
Conflicts:
arch/x86/include/asm/cpufeatures.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Ingo Molnar [Tue, 15 Mar 2022 11:50:49 +0000 (12:50 +0100)]
Merge branch 'x86/pasid' into x86/core, to resolve conflicts
Conflicts:
tools/objtool/arch/x86/decode.c
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jiasheng Jiang [Mon, 14 Mar 2022 01:34:48 +0000 (09:34 +0800)]
atm: eni: Add check for dma_map_single
As the potential failure of the dma_map_single(),
it should be better to check it and return error
if fails.
Fixes:
1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hannes Reinecke [Tue, 15 Mar 2022 09:14:36 +0000 (10:14 +0100)]
nvmet: revert "nvmet: make discovery NQN configurable"
Revert commit
626851e9225d ("nvmet: make discovery NQN configurable");
the interface was deemed incorrect and will be replaced with a different
one.
Fixes:
626851e9225d ("nvmet: make discovery NQN configurable")
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Nathan Chancellor [Mon, 14 Mar 2022 19:48:42 +0000 (12:48 -0700)]
x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy
There are two outstanding issues with CONFIG_X86_X32_ABI and
llvm-objcopy, with similar root causes:
1. llvm-objcopy does not properly convert .note.gnu.property when going
from x86_64 to x86_x32, resulting in a corrupted section when
linking:
https://github.com/ClangBuiltLinux/linux/issues/1141
2. llvm-objcopy produces corrupted compressed debug sections when going
from x86_64 to x86_x32, also resulting in an error when linking:
https://github.com/ClangBuiltLinux/linux/issues/514
After commit
41c5ef31ad71 ("x86/ibt: Base IBT bits"), the
.note.gnu.property section is always generated when
CONFIG_X86_KERNEL_IBT is enabled, which causes the first issue to become
visible with an allmodconfig build:
ld.lld: error: arch/x86/entry/vdso/vclock_gettime-x32.o:(.note.gnu.property+0x1c): program property is too short
To avoid this error, do not allow CONFIG_X86_X32_ABI to be selected when
using llvm-objcopy. If the two issues ever get fixed in llvm-objcopy,
this can be turned into a feature check.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220314194842.3452-3-nathan@kernel.org
Masahiro Yamada [Mon, 14 Mar 2022 19:48:41 +0000 (12:48 -0700)]
x86: Remove toolchain check for X32 ABI capability
Commit
0bf6276392e9 ("x32: Warn and disable rather than error if
binutils too old") added a small test in arch/x86/Makefile because
binutils 2.22 or newer is needed to properly support elf32-x86-64. This
check is no longer necessary, as the minimum supported version of
binutils is 2.23, which is enforced at configuration time with
scripts/min-tool-version.sh.
Remove this check and replace all uses of CONFIG_X86_X32 with
CONFIG_X86_X32_ABI, as two symbols are no longer necessary.
[nathan: Rebase, fix up a few places where CONFIG_X86_X32 was still
used, and simplify commit message to satisfy -tip requirements]
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220314194842.3452-2-nathan@kernel.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:56 +0000 (16:30 +0100)]
x86/alternative: Use .ibt_endbr_seal to seal indirect calls
Objtool's --ibt option generates .ibt_endbr_seal which lists
superfluous ENDBR instructions. That is those instructions for which
the function is never indirectly called.
Overwrite these ENDBR instructions with a NOP4 such that these
function can never be indirect called, reducing the number of viable
ENDBR targets in the kernel.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.822545231@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:55 +0000 (16:30 +0100)]
objtool: Find unused ENDBR instructions
Find all ENDBR instructions which are never referenced and stick them
in a section such that the kernel can poison them, sealing the
functions from ever being an indirect call target.
This removes about 1-in-4 ENDBR instructions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.763643193@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:54 +0000 (16:30 +0100)]
objtool: Validate IBT assumptions
Intel IBT requires that every indirect JMP/CALL targets an ENDBR
instructions, failing this #CP happens and we die. Similarly, all
exception entries should be ENDBR.
Find all code relocations and ensure they're either an ENDBR
instruction or ANNOTATE_NOENDBR. For the exceptions look for
UNWIND_HINT_IRET_REGS at sym+0 not being ENDBR.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.705110141@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:53 +0000 (16:30 +0100)]
objtool: Add IBT/ENDBR decoding
Intel IBT requires the target of any indirect CALL or JMP instruction
to be the ENDBR instruction; optionally it allows those two
instructions to have a NOTRACK prefix in order to avoid this
requirement.
The kernel will not enable the use of NOTRACK, as such any occurence
of it in compiler generated code should be flagged.
Teach objtool to Decode ENDBR instructions and WARN about NOTRACK
prefixes.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.645963517@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:52 +0000 (16:30 +0100)]
objtool: Read the NOENDBR annotation
Read the new NOENDBR annotation. While there, attempt to not bloat
struct instruction.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.586815435@infradead.org
Peter Zijlstra [Mon, 14 Mar 2022 17:07:30 +0000 (18:07 +0100)]
x86: Annotate idtentry_df()
Without CONFIG_X86_ESPFIX64 exc_double_fault() is noreturn and objtool
is clever enough to figure that out.
vmlinux.o: warning: objtool: asm_exc_double_fault()+0x22: unreachable instruction
0000000000001260 <asm_exc_double_fault>:
1260: f3 0f 1e fa endbr64
1264: 90 nop
1265: 90 nop
1266: 90 nop
1267: e8 84 03 00 00 call 15f0 <paranoid_entry>
126c: 48 89 e7 mov %rsp,%rdi
126f: 48 8b 74 24 78 mov 0x78(%rsp),%rsi
1274: 48 c7 44 24 78 ff ff ff ff movq $0xffffffffffffffff,0x78(%rsp)
127d: e8 00 00 00 00 call 1282 <asm_exc_double_fault+0x22> 127e: R_X86_64_PLT32 exc_double_fault-0x4
1282: e9 09 04 00 00 jmp 1690 <paranoid_exit>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
Peter Zijlstra [Mon, 14 Mar 2022 17:05:52 +0000 (18:05 +0100)]
x86,objtool: Move the ASM_REACHABLE annotation to objtool.h
Because we need a variant for .S files too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
Peter Zijlstra [Tue, 8 Mar 2022 15:30:50 +0000 (16:30 +0100)]
x86: Annotate call_on_stack()
vmlinux.o: warning: objtool: page_fault_oops()+0x13c: unreachable instruction
0000
000000000005b460 <page_fault_oops>:
...
0128 5b588: 49 89 23 mov %rsp,(%r11)
012b 5b58b: 4c 89 dc mov %r11,%rsp
012e 5b58e: 4c 89 f2 mov %r14,%rdx
0131 5b591: 48 89 ee mov %rbp,%rsi
0134 5b594: 4c 89 e7 mov %r12,%rdi
0137 5b597: e8 00 00 00 00 call 5b59c <page_fault_oops+0x13c> 5b598: R_X86_64_PLT32 handle_stack_overflow-0x4
013c 5b59c: 5c pop %rsp
vmlinux.o: warning: objtool: sysvec_reboot()+0x6d: unreachable instruction
0000
00000000000033f0 <sysvec_reboot>:
...
005d 344d: 4c 89 dc mov %r11,%rsp
0060 3450: e8 00 00 00 00 call 3455 <sysvec_reboot+0x65> 3451: R_X86_64_PLT32 irq_enter_rcu-0x4
0065 3455: 48 89 ef mov %rbp,%rdi
0068 3458: e8 00 00 00 00 call 345d <sysvec_reboot+0x6d> 3459: R_X86_64_PC32 .text+0x47d0c
006d 345d: e8 00 00 00 00 call 3462 <sysvec_reboot+0x72> 345e: R_X86_64_PLT32 irq_exit_rcu-0x4
0072 3462: 5c pop %rsp
Both cases are due to a call_on_stack() calling a __noreturn function.
Since that's an inline asm, GCC can't do anything about the
instructions after the CALL. Therefore put in an explicit
ASM_REACHABLE annotation to make sure objtool and gcc are consistently
confused about control flow.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.468805622@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:49 +0000 (16:30 +0100)]
objtool: Rework ASM_REACHABLE
Currently ASM_REACHABLE only works for UD2 instructions; reorder
things to also allow over-riding dead_end_function().
To that end:
- Mark INSN_BUG instructions in decode_instructions(), this saves
having to iterate all instructions yet again.
- Have add_call_destinations() set insn->dead_end for
dead_end_function() calls.
- Move add_dead_ends() *after* add_call_destinations() such that
ASM_REACHABLE can clear the ->dead_end mark.
- have validate_branch() only check ->dead_end.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.410010807@infradead.org
Peter Zijlstra [Mon, 14 Mar 2022 16:58:35 +0000 (17:58 +0100)]
x86: Mark __invalid_creds() __noreturn
vmlinux.o: warning: objtool: ksys_unshare()+0x36c: unreachable instruction
0000
0000000000067040 <ksys_unshare>:
...
0364 673a4: 4c 89 ef mov %r13,%rdi
0367 673a7: e8 00 00 00 00 call 673ac <ksys_unshare+0x36c> 673a8: R_X86_64_PLT32 __invalid_creds-0x4
036c 673ac: e9 28 ff ff ff jmp 672d9 <ksys_unshare+0x299>
0371 673b1: 41 bc f4 ff ff ff mov $0xfffffff4,%r12d
0377 673b7: e9 80 fd ff ff jmp 6713c <ksys_unshare+0xfc>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
Peter Zijlstra [Tue, 8 Mar 2022 15:30:48 +0000 (16:30 +0100)]
exit: Mark do_group_exit() __noreturn
vmlinux.o: warning: objtool: get_signal()+0x108: unreachable instruction
0000
000000000007f930 <get_signal>:
...
0103 7fa33: e8 00 00 00 00 call 7fa38 <get_signal+0x108> 7fa34: R_X86_64_PLT32 do_group_exit-0x4
0108 7fa38: 41 8b 45 74 mov 0x74(%r13),%eax
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.351270711@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:47 +0000 (16:30 +0100)]
x86: Mark stop_this_cpu() __noreturn
vmlinux.o: warning: objtool: smp_stop_nmi_callback()+0x2b: unreachable instruction
0000
0000000000047cf0 <smp_stop_nmi_callback>:
...
0026 47d16: e8 00 00 00 00 call 47d1b <smp_stop_nmi_callback+0x2b> 47d17: R_X86_64_PLT32 stop_this_cpu-0x4
002b 47d1b: b8 01 00 00 00 mov $0x1,%eax
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.290905453@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:46 +0000 (16:30 +0100)]
objtool: Ignore extra-symbol code
There's a fun implementation detail on linking STB_WEAK symbols. When
the linker combines two translation units, where one contains a weak
function and the other an override for it. It simply strips the
STB_WEAK symbol from the symbol table, but doesn't actually remove the
code.
The result is that when objtool is ran in a whole-archive kind of way,
it will encounter *heaps* of unused (and unreferenced) code. All
rudiments of weak functions.
Additionally, when a weak implementation is split into a .cold
subfunction that .cold symbol is left in place, even though completely
unused.
Teach objtool to ignore such rudiments by searching for symbol holes;
that is, code ranges that fall outside the given symbol bounds.
Specifically, ignore a sequence of unreachable instruction iff they
occupy a single hole, additionally ignore any .cold subfunctions
referenced.
Both ld.bfd and ld.lld behave like this. LTO builds otoh can (and do)
properly DCE weak functions.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.232019347@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:45 +0000 (16:30 +0100)]
objtool: Rename --duplicate to --lto
In order to prepare for LTO like objtool runs for modules, rename the
duplicate argument to lto.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.172584233@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:44 +0000 (16:30 +0100)]
x86/ibt: Ensure module init/exit points have references
Since the references to the module init/exit points only have external
references, a module LTO run will consider them 'unused' and seal
them, leading to an immediate fail on module load.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.113767246@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:43 +0000 (16:30 +0100)]
x86/ibt: Dont generate ENDBR in .discard.text
Having ENDBR in discarded sections can easily lead to relocations into
discarded sections which the linkers aren't really fond of. Objtool
also shouldn't generate them, but why tempt fate.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.054842742@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:42 +0000 (16:30 +0100)]
x86/ibt,sev: Annotations
No IBT on AMD so far.. probably correct, who knows.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.995109889@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:41 +0000 (16:30 +0100)]
x86/ibt,ftrace: Annotate ftrace code patching
These are code patching sites, not indirect targets.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.936599479@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:40 +0000 (16:30 +0100)]
x86/ibt: Annotate text references
Annotate away some of the generic code references. This is things
where we take the address of a symbol for exception handling or return
addresses (eg. context switch).
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.877758523@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:38 +0000 (16:30 +0100)]
x86/ibt: Disable IBT around firmware
Assume firmware isn't IBT clean and disable it across calls.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.759989383@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:37 +0000 (16:30 +0100)]
x86/alternative: Simplify int3_selftest_ip
Similar to ibt_selftest_ip, apply the same pattern.
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.700456643@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:36 +0000 (16:30 +0100)]
x86/ibt,kexec: Disable CET on kexec
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.641454603@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:35 +0000 (16:30 +0100)]
x86/ibt: Add IBT feature, MSR and #CP handling
The bits required to make the hardware go.. Of note is that, provided
the syscall entry points are covered with ENDBR, #CP doesn't need to
be an IST because we'll never hit the syscall gap.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.582331711@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:34 +0000 (16:30 +0100)]
x86/ibt,ftrace: Add ENDBR to samples/ftrace
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.523421433@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:33 +0000 (16:30 +0100)]
x86/ibt,bpf: Add ENDBR instructions to prologue and trampoline
With IBT enabled builds we need ENDBR instructions at indirect jump
target sites, since we start execution of the JIT'ed code through an
indirect jump, the very first instruction needs to be ENDBR.
Similarly, since eBPF tail-calls use indirect branches, their landing
site needs to be an ENDBR too.
The trampolines need similar adjustment.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Fixed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.464998838@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:32 +0000 (16:30 +0100)]
x86/ibt,kprobes: Cure sym+0 equals fentry woes
In order to allow kprobes to skip the ENDBR instructions at sym+0 for
X86_KERNEL_IBT builds, change _kprobe_addr() to take an architecture
callback to inspect the function at hand and modify the offset if
needed.
This streamlines the existing interface to cover more cases and
require less hooks. Once PowerPC gets fully converted there will only
be the one arch hook.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.405947704@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:31 +0000 (16:30 +0100)]
x86/ibt,ftrace: Make function-graph play nice
Return trampoline must not use indirect branch to return; while this
preserves the RSB, it is fundamentally incompatible with IBT. Instead
use a retpoline like ROP gadget that defeats IBT while not unbalancing
the RSB.
And since ftrace_stub is no longer a plain RET, don't use it to copy
from. Since RET is a trivial instruction, poke it directly.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.347296408@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:30 +0000 (16:30 +0100)]
x86/livepatch: Validate __fentry__ location
Currently livepatch assumes __fentry__ lives at func+0, which is most
likely untrue with IBT on. Instead make it use ftrace_location() by
default which both validates and finds the actual ip if there is any
in the same symbol.
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.285971256@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:29 +0000 (16:30 +0100)]
x86/ibt,ftrace: Search for __fentry__ location
Currently a lot of ftrace code assumes __fentry__ is at sym+0. However
with Intel IBT enabled the first instruction of a function will most
likely be ENDBR.
Change ftrace_location() to not only return the __fentry__ location
when called for the __fentry__ location, but also when called for the
sym+0 location.
Then audit/update all callsites of this function to consistently use
these new semantics.
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.227581603@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:28 +0000 (16:30 +0100)]
x86/ibt,kvm: Add ENDBR to fastops
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.168850084@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:27 +0000 (16:30 +0100)]
x86/ibt,crypto: Add ENDBR for the jump-table entries
The code does:
## branch into array
mov jump_table(,%rax,8), %bufp
JMP_NOSPEC bufp
resulting in needing to mark the jump-table entries with ENDBR.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.110500806@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:26 +0000 (16:30 +0100)]
x86/ibt,paravirt: Sprinkle ENDBR
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154318.051635891@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:25 +0000 (16:30 +0100)]
x86/linkage: Add ENDBR to SYM_FUNC_START*()
Ensure the ASM functions have ENDBR on for IBT builds, this follows
the ARM64 example. Unlike ARM64, we'll likely end up overwriting them
with poison.
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.992708941@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:24 +0000 (16:30 +0100)]
x86/ibt,entry: Sprinkle ENDBR dust
Kernel entry points should be having ENDBR on for IBT configs.
The SYSCALL entry points are found through taking their respective
address in order to program them in the MSRs, while the exception
entry points are found through UNWIND_HINT_IRET_REGS.
The rule is that any UNWIND_HINT_IRET_REGS at sym+0 should have an
ENDBR, see the later objtool ibt validation patch.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.933157479@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:23 +0000 (16:30 +0100)]
x86/ibt,xen: Sprinkle the ENDBR
Even though Xen currently doesn't advertise IBT, prepare for when it
will eventually do so and sprinkle the ENDBR dust accordingly.
Even though most of the entry points are IRET like, the CPL0
Hypervisor can set WAIT-FOR-ENDBR and demand ENDBR at these sites.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.873919996@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:22 +0000 (16:30 +0100)]
x86/entry,xen: Early rewrite of restore_regs_and_return_to_kernel()
By doing an early rewrite of 'jmp native_iret` in
restore_regs_and_return_to_kernel() we can get rid of the last
INTERRUPT_RETURN user and paravirt_iret.
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.815039833@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:21 +0000 (16:30 +0100)]
x86/entry: Cleanup PARAVIRT
Since commit
5c8f6a2e316e ("x86/xen: Add
xenpv_restore_regs_and_return_to_usermode()") Xen will no longer reach
this code and we can do away with the paravirt
SWAPGS/INTERRUPT_RETURN.
Suggested-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.756014488@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:20 +0000 (16:30 +0100)]
x86/ibt,paravirt: Use text_gen_insn() for paravirt_patch()
Less duplication is more better.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.697253958@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:19 +0000 (16:30 +0100)]
x86/text-patching: Make text_gen_insn() play nice with ANNOTATE_NOENDBR
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.638561109@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:18 +0000 (16:30 +0100)]
x86/ibt: Add ANNOTATE_NOENDBR
In order to have objtool warn about code references to !ENDBR
instruction, we need an annotation to allow this for non-control-flow
instances -- consider text range checks, text patching, or return
trampolines etc.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.578968224@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:17 +0000 (16:30 +0100)]
x86/ibt: Base IBT bits
Add Kconfig, Makefile and basic instruction support for x86 IBT.
(Ab)use __DISABLE_EXPORTS to disable IBT since it's already employed
to mark compressed and purgatory. Additionally mark realmode with it
as well to avoid inserting ENDBR instructions there. While ENDBR is
technically a NOP, inserting them was causing some grief due to code
growth. There's also a problem with using __noendbr in code compiled
without -fcf-protection=branch.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.519875203@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:16 +0000 (16:30 +0100)]
objtool: Have WARN_FUNC fall back to sym+off
Currently WARN_FUNC() either prints func+off and failing that prints
sec+off, add an intermediate sym+off. This is useful when playing
around with entry code.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.461283840@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:15 +0000 (16:30 +0100)]
objtool,efi: Update __efi64_thunk annotation
The current annotation relies on not running objtool on the file; this
won't work when running objtool on vmlinux.o. Instead explicitly mark
__efi64_thunk() to be ignored.
This preserves the status quo, which is somewhat unfortunate. Luckily
this code is hardly ever used.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.402118218@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:14 +0000 (16:30 +0100)]
objtool: Default ignore INT3 for unreachable
Ignore all INT3 instructions for unreachable code warnings, similar to NOP.
This allows using INT3 for various paddings instead of NOPs.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.343312938@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:13 +0000 (16:30 +0100)]
objtool: Add --dry-run
Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
Peter Zijlstra [Tue, 8 Mar 2022 15:30:12 +0000 (16:30 +0100)]
static_call: Avoid building empty .static_call_sites
Without CONFIG_HAVE_STATIC_CALL_INLINE there's no point in creating
the .static_call_sites section and it's related symbols.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.223798256@infradead.org
Peter Zijlstra [Tue, 15 Mar 2022 09:32:31 +0000 (10:32 +0100)]
Merge branch 'arm64/for-next/linkage'
Enjoy the cleanups and avoid conflicts vs linkage
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Fenghua Yu [Mon, 7 Feb 2022 23:02:53 +0000 (15:02 -0800)]
tools/objtool: Check for use of the ENQCMD instruction in the kernel
The ENQCMD instruction implicitly accesses the PASID_MSR to fill in the
pasid field of the descriptor being submitted to an accelerator. But
there is no precise (and stable across kernel changes) point at which
the PASID_MSR is updated from the value for one task to the next.
Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.
Check for use of the ENQCMD instruction in the kernel and warn on its
usage.
Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220207230254.3342514-11-fenghua.yu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Eric Dumazet [Sat, 12 Mar 2022 23:29:58 +0000 (15:29 -0800)]
net/packet: fix slab-out-of-bounds access in packet_recvmsg()
syzbot found that when an AF_PACKET socket is using PACKET_COPY_THRESH
and mmap operations, tpacket_rcv() is queueing skbs with
garbage in skb->cb[], triggering a too big copy [1]
Presumably, users of af_packet using mmap() already gets correct
metadata from the mapped buffer, we can simply make sure
to clear 12 bytes that might be copied to user space later.
BUG: KASAN: stack-out-of-bounds in memcpy include/linux/fortify-string.h:225 [inline]
BUG: KASAN: stack-out-of-bounds in packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
Write of size 165 at addr
ffffc9000385fb78 by task syz-executor233/3631
CPU: 0 PID: 3631 Comm: syz-executor233 Not tainted 5.17.0-rc7-syzkaller-02396-g0b3660695e80 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0xf/0x336 mm/kasan/report.c:255
__kasan_report mm/kasan/report.c:442 [inline]
kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
check_region_inline mm/kasan/generic.c:183 [inline]
kasan_check_range+0x13d/0x180 mm/kasan/generic.c:189
memcpy+0x39/0x60 mm/kasan/shadow.c:66
memcpy include/linux/fortify-string.h:225 [inline]
packet_recvmsg+0x56c/0x1150 net/packet/af_packet.c:3489
sock_recvmsg_nosec net/socket.c:948 [inline]
sock_recvmsg net/socket.c:966 [inline]
sock_recvmsg net/socket.c:962 [inline]
____sys_recvmsg+0x2c4/0x600 net/socket.c:2632
___sys_recvmsg+0x127/0x200 net/socket.c:2674
__sys_recvmsg+0xe2/0x1a0 net/socket.c:2704
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fdfd5954c29
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 41 15 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:
00007ffcf8e71e48 EFLAGS:
00000246 ORIG_RAX:
000000000000002f
RAX:
ffffffffffffffda RBX:
0000000000000003 RCX:
00007fdfd5954c29
RDX:
0000000000000000 RSI:
0000000020000500 RDI:
0000000000000005
RBP:
0000000000000000 R08:
000000000000000d R09:
000000000000000d
R10:
0000000000000000 R11:
0000000000000246 R12:
00007ffcf8e71e60
R13:
00000000000f4240 R14:
000000000000c1ff R15:
00007ffcf8e71e54
</TASK>
addr
ffffc9000385fb78 is located in stack of task syz-executor233/3631 at offset 32 in frame:
____sys_recvmsg+0x0/0x600 include/linux/uio.h:246
this frame has 1 object:
[32, 160) 'addr'
Memory state around the buggy address:
ffffc9000385fa80: 00 04 f3 f3 f3 f3 f3 00 00 00 00 00 00 00 00 00
ffffc9000385fb00: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
>
ffffc9000385fb80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f3
^
ffffc9000385fc00: f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 f1
ffffc9000385fc80: f1 f1 f1 00 f2 f2 f2 00 f2 f2 f2 00 00 00 00 00
==================================================================
Fixes:
0fb375fb9b93 ("[AF_PACKET]: Allow for > 8 byte hardware addresses.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Link: https://lore.kernel.org/r/20220312232958.3535620-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>