Mayuresh Janorkar <mayur@ti.com>
Michael Buesch <m@bues.ch>
Michel Dänzer <michel@tungstengraphics.com>
+Michel Lespinasse <michel@lespinasse.org>
+Michel Lespinasse <michel@lespinasse.org> <walken@google.com>
+Michel Lespinasse <michel@lespinasse.org> <walken@zoy.org>
Miguel Ojeda <ojeda@kernel.org> <miguel.ojeda.sandonis@gmail.com>
Mike Rapoport <rppt@kernel.org> <mike@compulab.co.il>
Mike Rapoport <rppt@kernel.org> <mike.rapoport@gmail.com>
=========================
Writing 1 to this entry will disable unprivileged calls to ``bpf()``;
-once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` will return
-``-EPERM``.
+once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` or ``CAP_BPF``
+will return ``-EPERM``. Once set to 1, this can't be cleared from the
+running kernel anymore.
-Once set, this can't be cleared.
+Writing 2 to this entry will also disable unprivileged calls to ``bpf()``,
+however, an admin can still change this setting later on, if needed, by
+writing 0 or 1 to this entry.
+If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this
+entry will default to 2 instead of 0.
+
+= =============================================================
+0 Unprivileged calls to ``bpf()`` are enabled
+1 Unprivileged calls to ``bpf()`` are disabled without recovery
+2 Unprivileged calls to ``bpf()`` are disabled
+= =============================================================
watchdog
========
description: |
I2C bus timeout in microseconds
+ fsl,i2c-erratum-a004447:
+ $ref: /schemas/types.yaml#/definitions/flag
+ description: |
+ Indicates the presence of QorIQ erratum A-004447, which
+ says that the standard i2c recovery scheme mechanism does
+ not work and an alternate implementation is needed.
+
required:
- compatible
- reg
- $ref: ethernet-controller.yaml#
maintainers:
- - Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
+ - Sergei Shtylyov <sergei.shtylyov@gmail.com>
properties:
compatible:
mux-controls = <&mux>;
- spi-flash@0 {
+ flash@0 {
compatible = "jedec,spi-nor";
reg = <0>;
spi-max-frequency = <40000000>;
In addition, some functions useful for creating debugging output are
defined in ``drivers/usb/common/debug.c``.
+.. _usb_header:
+
Host-Side Data Types and Macros
===============================
seccomp notification fd to receive a ``struct seccomp_notif``, which contains
five members: the input length of the structure, a unique-per-filter ``id``,
the ``pid`` of the task which triggered this request (which may be 0 if the
-task is in a pid ns not visible from the listener's pid namespace), a ``flags``
-member which for now only has ``SECCOMP_NOTIF_FLAG_SIGNALED``, representing
-whether or not the notification is a result of a non-fatal signal, and the
-``data`` passed to seccomp. Userspace can then make a decision based on this
-information about what to do, and ``ioctl(SECCOMP_IOCTL_NOTIF_SEND)`` a
-response, indicating what should be returned to userspace. The ``id`` member of
-``struct seccomp_notif_resp`` should be the same ``id`` as in ``struct
-seccomp_notif``.
+task is in a pid ns not visible from the listener's pid namespace). The
+notification also contains the ``data`` passed to seccomp, and a filters flag.
+The structure should be zeroed out prior to calling the ioctl.
+
+Userspace can then make a decision based on this information about what to do,
+and ``ioctl(SECCOMP_IOCTL_NOTIF_SEND)`` a response, indicating what should be
+returned to userspace. The ``id`` member of ``struct seccomp_notif_resp`` should
+be the same ``id`` as in ``struct seccomp_notif``.
It is worth noting that ``struct seccomp_data`` contains the values of register
arguments to the syscall, but does not contain pointers to memory. The task's
necessary to inform each VCPU to completely refresh the tables. This
request is used for that.
-KVM_REQ_PENDING_TIMER
+KVM_REQ_UNBLOCK
- This request may be made from a timer handler run on the host on behalf
- of a VCPU. It informs the VCPU thread to inject a timer interrupt.
+ This request informs the vCPU to exit kvm_vcpu_block. It is used for
+ example from timer handlers that run on the host on behalf of a vCPU,
+ or in order to update the interrupt routing and ensure that assigned
+ devices will wake up the vCPU.
KVM_REQ_UNHALT
S: Maintained
W: http://btrfs.wiki.kernel.org/
Q: http://patchwork.kernel.org/project/linux-btrfs/list/
+C: irc://irc.libera.chat/btrfs
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git
F: Documentation/filesystems/btrfs.rst
F: fs/btrfs/
F: Documentation/devicetree/bindings/arm/cavium-thunder2.txt
F: arch/arm64/boot/dts/cavium/thunder2-99xx*
+CBS/ETF/TAPRIO QDISCS
+M: Vinicius Costa Gomes <vinicius.gomes@intel.com>
+S: Maintained
+L: netdev@vger.kernel.org
+F: net/sched/sch_cbs.c
+F: net/sched/sch_etf.c
+F: net/sched/sch_taprio.c
+
CC2520 IEEE-802.15.4 RADIO DRIVER
M: Varka Bhadram <varkabhadram@gmail.com>
L: linux-wpan@vger.kernel.org
DPAA2 ETHERNET DRIVER
M: Ioana Ciornei <ioana.ciornei@nxp.com>
-M: Ioana Radulescu <ruxandra.radulescu@nxp.com>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/networking/device_drivers/ethernet/freescale/dpaa2/ethernet-driver.rst
FANOTIFY
M: Jan Kara <jack@suse.cz>
R: Amir Goldstein <amir73il@gmail.com>
+R: Matthew Bobrowski <repnop@google.com>
L: linux-fsdevel@vger.kernel.org
S: Maintained
F: fs/notify/fanotify/
F: include/linux/mfd/ntxec.h
NETRONOME ETHERNET DRIVERS
-M: Simon Horman <simon.horman@netronome.com>
+M: Simon Horman <simon.horman@corigine.com>
R: Jakub Kicinski <kuba@kernel.org>
-L: oss-drivers@netronome.com
+L: oss-drivers@corigine.com
S: Maintained
F: drivers/net/ethernet/netronome/
M: Jakub Kicinski <kuba@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
-W: http://www.linuxfoundation.org/en/Net
Q: https://patchwork.kernel.org/project/netdevbpf/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git
M: Jakub Kicinski <kuba@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
-W: http://www.linuxfoundation.org/en/Net
Q: https://patchwork.kernel.org/project/netdevbpf/list/
B: mailto:netdev@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
F: net/ipv4/nexthop.c
NFC SUBSYSTEM
+M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
+L: linux-nfc@lists.01.org (subscribers-only)
L: netdev@vger.kernel.org
-S: Orphan
+S: Maintained
F: Documentation/devicetree/bindings/net/nfc/
F: drivers/nfc/
F: include/linux/platform_data/nfcmrvl.h
NFC VIRTUAL NCI DEVICE DRIVER
M: Bongsu Jeon <bongsu.jeon@samsung.com>
L: netdev@vger.kernel.org
-L: linux-nfc@lists.01.org (moderated for non-subscribers)
+L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
F: drivers/nfc/virtual_ncidev.c
F: tools/testing/selftests/nci/
F: sound/soc/codecs/tfa9879*
NXP-NCI NFC DRIVER
-M: Clément Perrochaud <clement.perrochaud@effinnov.com>
R: Charles Gorand <charles.gorand@effinnov.com>
-L: linux-nfc@lists.01.org (moderated for non-subscribers)
+L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
F: drivers/nfc/nxp-nci
PCI ENDPOINT SUBSYSTEM
M: Kishon Vijay Abraham I <kishon@ti.com>
M: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
+R: Krzysztof Wilczyński <kw@linux.com>
L: linux-pci@vger.kernel.org
S: Supported
F: Documentation/PCI/endpoint/*
PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS
M: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
R: Rob Herring <robh@kernel.org>
+R: Krzysztof Wilczyński <kw@linux.com>
L: linux-pci@vger.kernel.org
S: Supported
Q: http://patchwork.ozlabs.org/project/linux-pci/list/
M: Dennis Zhou <dennis@kernel.org>
M: Tejun Heo <tj@kernel.org>
M: Christoph Lameter <cl@linux.com>
+L: linux-mm@kvack.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu.git
F: arch/*/include/asm/percpu.h
F: include/linux/percpu*.h
+F: lib/percpu*.c
F: mm/percpu*.c
PER-TASK DELAY ACCOUNTING
M: Julian Wiedmann <jwi@linux.ibm.com>
M: Karsten Graul <kgraul@linux.ibm.com>
L: linux-s390@vger.kernel.org
+L: netdev@vger.kernel.org
S: Supported
W: http://www.ibm.com/developerworks/linux/linux390/
F: drivers/s390/net/*iucv*
M: Julian Wiedmann <jwi@linux.ibm.com>
M: Karsten Graul <kgraul@linux.ibm.com>
L: linux-s390@vger.kernel.org
+L: netdev@vger.kernel.org
S: Supported
W: http://www.ibm.com/developerworks/linux/linux390/
F: drivers/s390/net/
SAMSUNG S3FWRN5 NFC DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
M: Krzysztof Opasiak <k.opasiak@samsung.com>
-L: linux-nfc@lists.01.org (moderated for non-subscribers)
+L: linux-nfc@lists.01.org (subscribers-only)
S: Maintained
F: Documentation/devicetree/bindings/net/nfc/samsung,s3fwrn5.yaml
F: drivers/nfc/s3fwrn5
S: Maintained
F: drivers/i2c/busses/i2c-stm32*
+ST STM32 SPI DRIVER
+M: Alain Volmat <alain.volmat@foss.st.com>
+L: linux-spi@vger.kernel.org
+S: Maintained
+F: drivers/spi/spi-stm32.c
+
ST STPDDC60 DRIVER
M: Daniel Nilsson <daniel.nilsson@flex.com>
L: linux-hwmon@vger.kernel.org
L: linux-i2c@vger.kernel.org
S: Maintained
F: drivers/i2c/busses/i2c-designware-*
-F: include/linux/platform_data/i2c-designware.h
SYNOPSYS DESIGNWARE MMC/SD/SDIO DRIVER
M: Jaehoon Chung <jh80.chung@samsung.com>
TI TRF7970A NFC DRIVER
M: Mark Greer <mgreer@animalcreek.com>
L: linux-wireless@vger.kernel.org
-L: linux-nfc@lists.01.org (moderated for non-subscribers)
+L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
F: Documentation/devicetree/bindings/net/nfc/trf7970a.txt
F: drivers/nfc/trf7970a.c
F: drivers/xen/*swiotlb*
XFS FILESYSTEM
+C: irc://irc.oftc.net/xfs
M: Darrick J. Wong <djwong@kernel.org>
M: linux-xfs@vger.kernel.org
L: linux-xfs@vger.kernel.org
VERSION = 5
PATCHLEVEL = 13
SUBLEVEL = 0
-EXTRAVERSION = -rc3
+EXTRAVERSION = -rc4
NAME = Frozen Wasteland
# *DOCUMENTATION*
# Limit inlining across translation units to reduce binary size
KBUILD_LDFLAGS += -mllvm -import-instr-limit=5
+
+# Check for frame size exceeding threshold during prolog/epilog insertion.
+ifneq ($(CONFIG_FRAME_WARN),0)
+KBUILD_LDFLAGS += -plugin-opt=-warn-stack-size=$(CONFIG_FRAME_WARN)
+endif
endif
ifdef CONFIG_LTO
# SPDX-License-Identifier: GPL-2.0-only
-obj-y += kernel/ mm/
-obj-$(CONFIG_NET) += net/
+obj-y += kernel/ mm/ net/
obj-$(CONFIG_KVM) += kvm/
obj-$(CONFIG_XEN) += xen/
obj-$(CONFIG_CRYPTO) += crypto/
* This insanity brought to you by speculative system register reads,
* out-of-order memory accesses, sequence locks and Thomas Gleixner.
*
- * http://lists.infradead.org/pipermail/linux-arm-kernel/2019-February/631195.html
+ * https://lore.kernel.org/r/alpine.DEB.2.21.1902081950260.1662@nanos.tec.linutronix.de/
*/
#define arch_counter_enforce_ordering(val) do { \
u64 tmp, _val = (val); \
#define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector 18
#define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize 19
#define __KVM_HOST_SMCCC_FUNC___pkvm_mark_hyp 20
+#define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc 21
#ifndef __ASSEMBLY__
extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
+extern void __kvm_adjust_pc(struct kvm_vcpu *vcpu);
+
extern u64 __vgic_v3_get_gic_config(void);
extern u64 __vgic_v3_read_vmcr(void);
extern void __vgic_v3_write_vmcr(u32 vmcr);
vcpu->arch.flags |= KVM_ARM64_INCREMENT_PC;
}
+static inline bool vcpu_has_feature(struct kvm_vcpu *vcpu, int feature)
+{
+ return test_bit(feature, vcpu->arch.features);
+}
+
#endif /* __ARM64_KVM_EMULATE_H__ */
return ret;
}
- if (run->immediate_exit)
- return -EINTR;
-
vcpu_load(vcpu);
+ if (run->immediate_exit) {
+ ret = -EINTR;
+ goto out;
+ }
+
kvm_sigset_activate(vcpu);
ret = 1;
kvm_sigset_deactivate(vcpu);
+out:
+ /*
+ * In the unlikely event that we are returning to userspace
+ * with pending exceptions or PC adjustment, commit these
+ * adjustments in order to give userspace a consistent view of
+ * the vcpu state. Note that this relies on __kvm_adjust_pc()
+ * being preempt-safe on VHE.
+ */
+ if (unlikely(vcpu->arch.flags & (KVM_ARM64_PENDING_EXCEPTION |
+ KVM_ARM64_INCREMENT_PC)))
+ kvm_call_hyp(__kvm_adjust_pc, vcpu);
+
vcpu_put(vcpu);
return ret;
}
*vcpu_pc(vcpu) = vect_offset;
}
-void kvm_inject_exception(struct kvm_vcpu *vcpu)
+static void kvm_inject_exception(struct kvm_vcpu *vcpu)
{
if (vcpu_el1_is_32bit(vcpu)) {
switch (vcpu->arch.flags & KVM_ARM64_EXCEPT_MASK) {
}
}
}
+
+/*
+ * Adjust the guest PC (and potentially exception state) depending on
+ * flags provided by the emulation code.
+ */
+void __kvm_adjust_pc(struct kvm_vcpu *vcpu)
+{
+ if (vcpu->arch.flags & KVM_ARM64_PENDING_EXCEPTION) {
+ kvm_inject_exception(vcpu);
+ vcpu->arch.flags &= ~(KVM_ARM64_PENDING_EXCEPTION |
+ KVM_ARM64_EXCEPT_MASK);
+ } else if (vcpu->arch.flags & KVM_ARM64_INCREMENT_PC) {
+ kvm_skip_instr(vcpu);
+ vcpu->arch.flags &= ~KVM_ARM64_INCREMENT_PC;
+ }
+}
#include <asm/kvm_emulate.h>
#include <asm/kvm_host.h>
-void kvm_inject_exception(struct kvm_vcpu *vcpu);
-
static inline void kvm_skip_instr(struct kvm_vcpu *vcpu)
{
if (vcpu_mode_is_32bit(vcpu)) {
}
/*
- * Adjust the guest PC on entry, depending on flags provided by EL1
- * for the purpose of emulation (MMIO, sysreg) or exception injection.
- */
-static inline void __adjust_pc(struct kvm_vcpu *vcpu)
-{
- if (vcpu->arch.flags & KVM_ARM64_PENDING_EXCEPTION) {
- kvm_inject_exception(vcpu);
- vcpu->arch.flags &= ~(KVM_ARM64_PENDING_EXCEPTION |
- KVM_ARM64_EXCEPT_MASK);
- } else if (vcpu->arch.flags & KVM_ARM64_INCREMENT_PC) {
- kvm_skip_instr(vcpu);
- vcpu->arch.flags &= ~KVM_ARM64_INCREMENT_PC;
- }
-}
-
-/*
* Skip an instruction while host sysregs are live.
* Assumes host is always 64-bit.
*/
cpu_reg(host_ctxt, 1) = __kvm_vcpu_run(kern_hyp_va(vcpu));
}
+static void handle___kvm_adjust_pc(struct kvm_cpu_context *host_ctxt)
+{
+ DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
+
+ __kvm_adjust_pc(kern_hyp_va(vcpu));
+}
+
static void handle___kvm_flush_vm_context(struct kvm_cpu_context *host_ctxt)
{
__kvm_flush_vm_context();
static const hcall_t host_hcall[] = {
HANDLE_FUNC(__kvm_vcpu_run),
+ HANDLE_FUNC(__kvm_adjust_pc),
HANDLE_FUNC(__kvm_flush_vm_context),
HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa),
HANDLE_FUNC(__kvm_tlb_flush_vmid),
extern unsigned long hyp_nr_cpus;
struct host_kvm host_kvm;
-struct hyp_pool host_s2_mem;
-struct hyp_pool host_s2_dev;
+static struct hyp_pool host_s2_mem;
+static struct hyp_pool host_s2_dev;
/*
* Copies of the host's CPU features registers holding sanitized values.
#include <nvhe/trap_handler.h>
struct hyp_pool hpool;
-struct kvm_pgtable_mm_ops pkvm_pgtable_mm_ops;
unsigned long hyp_nr_cpus;
#define hyp_percpu_size ((unsigned long)__per_cpu_end - \
static void *hyp_pgt_base;
static void *host_s2_mem_pgt_base;
static void *host_s2_dev_pgt_base;
+static struct kvm_pgtable_mm_ops pkvm_pgtable_mm_ops;
static int divide_memory_pool(void *virt, unsigned long size)
{
* Author: Marc Zyngier <marc.zyngier@arm.com>
*/
-#include <hyp/adjust_pc.h>
#include <hyp/switch.h>
#include <hyp/sysreg-sr.h>
*/
__debug_save_host_buffers_nvhe(vcpu);
- __adjust_pc(vcpu);
+ __kvm_adjust_pc(vcpu);
/*
* We must restore the 32-bit state before the sysregs, thanks
* Author: Marc Zyngier <marc.zyngier@arm.com>
*/
-#include <hyp/adjust_pc.h>
#include <hyp/switch.h>
#include <linux/arm-smccc.h>
__load_guest_stage2(vcpu->arch.hw_mmu);
__activate_traps(vcpu);
- __adjust_pc(vcpu);
+ __kvm_adjust_pc(vcpu);
sysreg_restore_guest_state_vhe(guest_ctxt);
__debug_switch_to_guest(vcpu);
bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
{
if (!kvm->arch.mmu.pgt)
- return 0;
+ return false;
__unmap_stage2_range(&kvm->arch.mmu, range->start << PAGE_SHIFT,
(range->end - range->start) << PAGE_SHIFT,
range->may_block);
- return 0;
+ return false;
}
bool kvm_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
kvm_pfn_t pfn = pte_pfn(range->pte);
if (!kvm->arch.mmu.pgt)
- return 0;
+ return false;
WARN_ON(range->end - range->start != 1);
PAGE_SIZE, __pfn_to_phys(pfn),
KVM_PGTABLE_PROT_R, NULL);
- return 0;
+ return false;
}
bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
pte_t pte;
if (!kvm->arch.mmu.pgt)
- return 0;
+ return false;
WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
{
if (!kvm->arch.mmu.pgt)
- return 0;
+ return false;
return kvm_pgtable_stage2_is_young(kvm->arch.mmu.pgt,
range->start << PAGE_SHIFT);
return 0;
}
+static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu)
+{
+ struct kvm_vcpu *tmp;
+ bool is32bit;
+ int i;
+
+ is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT);
+ if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1) && is32bit)
+ return false;
+
+ /* Check that the vcpus are either all 32bit or all 64bit */
+ kvm_for_each_vcpu(i, tmp, vcpu->kvm) {
+ if (vcpu_has_feature(tmp, KVM_ARM_VCPU_EL1_32BIT) != is32bit)
+ return false;
+ }
+
+ return true;
+}
+
/**
* kvm_reset_vcpu - sets core registers and sys_regs to reset value
* @vcpu: The VCPU pointer
}
}
+ if (!vcpu_allowed_register_width(vcpu)) {
+ ret = -EINVAL;
+ goto out;
+ }
+
switch (vcpu->arch.target) {
default:
if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) {
- if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1)) {
- ret = -EINVAL;
- goto out;
- }
pstate = VCPU_RESET_PSTATE_SVC;
} else {
pstate = VCPU_RESET_PSTATE_EL1;
struct sys_reg_params *p,
const struct sys_reg_desc *rd)
{
- u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
+ u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm];
if (p->is_write)
reg_to_dbg(vcpu, p, rd, dbg_reg);
else
dbg_to_reg(vcpu, p, rd, dbg_reg);
- trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg);
+ trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg);
return true;
}
static int set_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm];
if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static int get_bvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm];
if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static void reset_bvr(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *rd)
{
- vcpu->arch.vcpu_debug_state.dbg_bvr[rd->reg] = rd->val;
+ vcpu->arch.vcpu_debug_state.dbg_bvr[rd->CRm] = rd->val;
}
static bool trap_bcr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *rd)
{
- u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
+ u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm];
if (p->is_write)
reg_to_dbg(vcpu, p, rd, dbg_reg);
else
dbg_to_reg(vcpu, p, rd, dbg_reg);
- trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg);
+ trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg);
return true;
}
static int set_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm];
if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static int get_bcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm];
if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static void reset_bcr(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *rd)
{
- vcpu->arch.vcpu_debug_state.dbg_bcr[rd->reg] = rd->val;
+ vcpu->arch.vcpu_debug_state.dbg_bcr[rd->CRm] = rd->val;
}
static bool trap_wvr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *rd)
{
- u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg];
+ u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm];
if (p->is_write)
reg_to_dbg(vcpu, p, rd, dbg_reg);
else
dbg_to_reg(vcpu, p, rd, dbg_reg);
- trace_trap_reg(__func__, rd->reg, p->is_write,
- vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg]);
+ trace_trap_reg(__func__, rd->CRm, p->is_write,
+ vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm]);
return true;
}
static int set_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm];
if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static int get_wvr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm];
if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static void reset_wvr(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *rd)
{
- vcpu->arch.vcpu_debug_state.dbg_wvr[rd->reg] = rd->val;
+ vcpu->arch.vcpu_debug_state.dbg_wvr[rd->CRm] = rd->val;
}
static bool trap_wcr(struct kvm_vcpu *vcpu,
struct sys_reg_params *p,
const struct sys_reg_desc *rd)
{
- u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg];
+ u64 *dbg_reg = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm];
if (p->is_write)
reg_to_dbg(vcpu, p, rd, dbg_reg);
else
dbg_to_reg(vcpu, p, rd, dbg_reg);
- trace_trap_reg(__func__, rd->reg, p->is_write, *dbg_reg);
+ trace_trap_reg(__func__, rd->CRm, p->is_write, *dbg_reg);
return true;
}
static int set_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm];
if (copy_from_user(r, uaddr, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static int get_wcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *rd,
const struct kvm_one_reg *reg, void __user *uaddr)
{
- __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg];
+ __u64 *r = &vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm];
if (copy_to_user(uaddr, r, KVM_REG_SIZE(reg->id)) != 0)
return -EFAULT;
static void reset_wcr(struct kvm_vcpu *vcpu,
const struct sys_reg_desc *rd)
{
- vcpu->arch.vcpu_debug_state.dbg_wcr[rd->reg] = rd->val;
+ vcpu->arch.vcpu_debug_state.dbg_wcr[rd->CRm] = rd->val;
}
static void reset_amair_el1(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
*/
BUILD_BUG_ON(pgd_index(direct_map_end - 1) == pgd_index(direct_map_end));
- if (rodata_full || crash_mem_map || debug_pagealloc_enabled())
+ if (rodata_full || crash_mem_map || debug_pagealloc_enabled() ||
+ IS_ENABLED(CONFIG_KFENCE))
flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;
/*
#include <asm/reboot.h>
#include <asm/setup.h>
#include <asm/mach-au1x00/au1000.h>
+#include <asm/mach-au1x00/gpio-au1000.h>
#include <prom.h>
const char *get_system_type(void)
*
*/
+#ifndef _ASM_MIPS_BOARDS_LAUNCH_H
+#define _ASM_MIPS_BOARDS_LAUNCH_H
+
#ifndef _ASSEMBLER_
struct cpulaunch {
/* Polling period in count cycles for secondary CPU's */
#define LAUNCHPERIOD 10000
+
+#endif /* _ASM_MIPS_BOARDS_LAUNCH_H */
*/
notrace void arch_local_irq_disable(void)
{
- preempt_disable();
+ preempt_disable_notrace();
__asm__ __volatile__(
" .set push \n"
: /* no inputs */
: "memory");
- preempt_enable();
+ preempt_enable_notrace();
}
EXPORT_SYMBOL(arch_local_irq_disable);
{
unsigned long flags;
- preempt_disable();
+ preempt_disable_notrace();
__asm__ __volatile__(
" .set push \n"
: /* no inputs */
: "memory");
- preempt_enable();
+ preempt_enable_notrace();
return flags;
}
{
unsigned long __tmp1;
- preempt_disable();
+ preempt_disable_notrace();
__asm__ __volatile__(
" .set push \n"
: "0" (flags)
: "memory");
- preempt_enable();
+ preempt_enable_notrace();
}
EXPORT_SYMBOL(arch_local_irq_restore);
EXPORT_SYMBOL(_page_cachable_default);
#define PM(p) __pgprot(_page_cachable_default | (p))
-#define PVA(p) PM(_PAGE_VALID | _PAGE_ACCESSED | (p))
static inline void setup_protection_map(void)
{
protection_map[0] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
- protection_map[1] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC);
- protection_map[2] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
- protection_map[3] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC);
- protection_map[4] = PVA(_PAGE_PRESENT);
- protection_map[5] = PVA(_PAGE_PRESENT);
- protection_map[6] = PVA(_PAGE_PRESENT);
- protection_map[7] = PVA(_PAGE_PRESENT);
+ protection_map[1] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC);
+ protection_map[2] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
+ protection_map[3] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC);
+ protection_map[4] = PM(_PAGE_PRESENT);
+ protection_map[5] = PM(_PAGE_PRESENT);
+ protection_map[6] = PM(_PAGE_PRESENT);
+ protection_map[7] = PM(_PAGE_PRESENT);
protection_map[8] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_NO_READ);
- protection_map[9] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC);
- protection_map[10] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE |
+ protection_map[9] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC);
+ protection_map[10] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE |
_PAGE_NO_READ);
- protection_map[11] = PVA(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE);
- protection_map[12] = PVA(_PAGE_PRESENT);
- protection_map[13] = PVA(_PAGE_PRESENT);
- protection_map[14] = PVA(_PAGE_PRESENT);
- protection_map[15] = PVA(_PAGE_PRESENT);
+ protection_map[11] = PM(_PAGE_PRESENT | _PAGE_NO_EXEC | _PAGE_WRITE);
+ protection_map[12] = PM(_PAGE_PRESENT);
+ protection_map[13] = PM(_PAGE_PRESENT);
+ protection_map[14] = PM(_PAGE_PRESENT | _PAGE_WRITE);
+ protection_map[15] = PM(_PAGE_PRESENT | _PAGE_WRITE);
}
-#undef _PVA
#undef PM
void cpu_cache_init(void)
#include <linux/io.h>
#include <linux/clk.h>
+#include <linux/export.h>
#include <linux/init.h>
#include <linux/sizes.h>
#include <linux/of_fdt.h>
__iomem void *rt_sysc_membase;
__iomem void *rt_memc_membase;
+EXPORT_SYMBOL_GPL(rt_sysc_membase);
__iomem void *plat_of_remap_node(const char *node)
{
};
/include/ "pq3-i2c-0.dtsi"
+ i2c@3000 {
+ fsl,i2c-erratum-a004447;
+ };
+
/include/ "pq3-i2c-1.dtsi"
+ i2c@3100 {
+ fsl,i2c-erratum-a004447;
+ };
+
/include/ "pq3-duart-0.dtsi"
/include/ "pq3-espi-0.dtsi"
spi0: spi@7000 {
};
/include/ "qoriq-i2c-0.dtsi"
+ i2c@118000 {
+ fsl,i2c-erratum-a004447;
+ };
+
+ i2c@118100 {
+ fsl,i2c-erratum-a004447;
+ };
+
/include/ "qoriq-i2c-1.dtsi"
+ i2c@119000 {
+ fsl,i2c-erratum-a004447;
+ };
+
+ i2c@119100 {
+ fsl,i2c-erratum-a004447;
+ };
+
/include/ "qoriq-duart-0.dtsi"
/include/ "qoriq-duart-1.dtsi"
/include/ "qoriq-gpio-0.dtsi"
/* PPC-specific vcpu->requests bit members */
#define KVM_REQ_WATCHDOG KVM_ARCH_REQ(0)
#define KVM_REQ_EPR_EXIT KVM_ARCH_REQ(1)
+#define KVM_REQ_PENDING_TIMER KVM_ARCH_REQ(2)
#include <linux/mmu_notifier.h>
pgd_t *pgdir = init_mm.pgd;
return __find_linux_pte(pgdir, ea, NULL, hshift);
}
+
+/*
+ * Convert a kernel vmap virtual address (vmalloc or ioremap space) to a
+ * physical address, without taking locks. This can be used in real-mode.
+ */
+static inline phys_addr_t ppc_find_vmap_phys(unsigned long addr)
+{
+ pte_t *ptep;
+ phys_addr_t pa;
+ int hugepage_shift;
+
+ /*
+ * init_mm does not free page tables, and does not do THP. It may
+ * have huge pages from huge vmalloc / ioremap etc.
+ */
+ ptep = find_init_mm_pte(addr, &hugepage_shift);
+ if (WARN_ON(!ptep))
+ return 0;
+
+ pa = PFN_PHYS(pte_pfn(*ptep));
+
+ if (!hugepage_shift)
+ hugepage_shift = PAGE_SHIFT;
+
+ pa |= addr & ((1ul << hugepage_shift) - 1);
+
+ return pa;
+}
+
/*
* This is what we should always use. Any other lockless page table lookup needs
* careful audit against THP split.
*/
static inline unsigned long eeh_token_to_phys(unsigned long token)
{
- pte_t *ptep;
- unsigned long pa;
- int hugepage_shift;
-
- /*
- * We won't find hugepages here(this is iomem). Hence we are not
- * worried about _PAGE_SPLITTING/collapse. Also we will not hit
- * page table free, because of init_mm.
- */
- ptep = find_init_mm_pte(token, &hugepage_shift);
- if (!ptep)
- return token;
-
- pa = pte_pfn(*ptep);
-
- /* On radix we can do hugepage mappings for io, so handle that */
- if (!hugepage_shift)
- hugepage_shift = PAGE_SHIFT;
-
- pa <<= PAGE_SHIFT;
- pa |= token & ((1ul << hugepage_shift) - 1);
- return pa;
+ return ppc_find_vmap_phys(token);
}
/*
#ifdef CONFIG_PPC_INDIRECT_MMIO
struct iowa_bus *iowa_mem_find_bus(const PCI_IO_ADDR addr)
{
- unsigned hugepage_shift;
struct iowa_bus *bus;
int token;
bus = &iowa_busses[token - 1];
else {
unsigned long vaddr, paddr;
- pte_t *ptep;
vaddr = (unsigned long)PCI_FIX_ADDR(addr);
if (vaddr < PHB_IO_BASE || vaddr >= PHB_IO_END)
return NULL;
- /*
- * We won't find huge pages here (iomem). Also can't hit
- * a page table free due to init_mm
- */
- ptep = find_init_mm_pte(vaddr, &hugepage_shift);
- if (ptep == NULL)
- paddr = 0;
- else {
- WARN_ON(hugepage_shift);
- paddr = pte_pfn(*ptep) << PAGE_SHIFT;
- }
+
+ paddr = ppc_find_vmap_phys(vaddr);
+
bus = iowa_pci_find(vaddr, paddr);
if (bus == NULL)
unsigned int order;
unsigned int nio_pages, io_order;
struct page *page;
- size_t size_io = size;
size = PAGE_ALIGN(size);
order = get_order(size);
memset(ret, 0, size);
/* Set up tces to cover the allocated range */
- size_io = IOMMU_PAGE_ALIGN(size_io, tbl);
- nio_pages = size_io >> tbl->it_page_shift;
- io_order = get_iommu_order(size_io, tbl);
+ nio_pages = size >> tbl->it_page_shift;
+ io_order = get_iommu_order(size, tbl);
mapping = iommu_alloc(dev, tbl, ret, nio_pages, DMA_BIDIRECTIONAL,
mask >> tbl->it_page_shift, io_order, 0);
if (mapping == DMA_MAPPING_ERROR) {
void *vaddr, dma_addr_t dma_handle)
{
if (tbl) {
- size_t size_io = IOMMU_PAGE_ALIGN(size, tbl);
- unsigned int nio_pages = size_io >> tbl->it_page_shift;
+ unsigned int nio_pages;
+ size = PAGE_ALIGN(size);
+ nio_pages = size >> tbl->it_page_shift;
iommu_free(tbl, dma_handle, nio_pages);
size = PAGE_ALIGN(size);
free_pages((unsigned long)vaddr, get_order(size));
int ret = 0;
struct kprobe *prev;
struct ppc_inst insn = ppc_inst_read((struct ppc_inst *)p->addr);
- struct ppc_inst prefix = ppc_inst_read((struct ppc_inst *)(p->addr - 1));
if ((unsigned long)p->addr & 0x03) {
printk("Attempt to register kprobe at an unaligned address\n");
} else if (IS_MTMSRD(insn) || IS_RFID(insn) || IS_RFI(insn)) {
printk("Cannot register a kprobe on rfi/rfid or mtmsr[d]\n");
ret = -EINVAL;
- } else if (ppc_inst_prefixed(prefix)) {
+ } else if ((unsigned long)p->addr & ~PAGE_MASK &&
+ ppc_inst_prefixed(ppc_inst_read((struct ppc_inst *)(p->addr - 1)))) {
printk("Cannot register a kprobe on the second word of prefixed instruction\n");
ret = -EINVAL;
}
break;
}
cur = ktime_get();
- } while (single_task_running() && ktime_before(cur, stop));
+ } while (kvm_vcpu_can_poll(cur, stop));
spin_lock(&vc->lock);
vc->vcore_state = VCORE_INACTIVE;
mtspr(SPRN_EBBRR, ebb_regs[1]);
mtspr(SPRN_BESCR, ebb_regs[2]);
mtspr(SPRN_TAR, user_tar);
- mtspr(SPRN_FSCR, current->thread.fscr);
}
mtspr(SPRN_VRSAVE, user_vrsave);
#include <asm/pte-walk.h>
/* Translate address of a vmalloc'd thing to a linear map address */
-static void *real_vmalloc_addr(void *x)
+static void *real_vmalloc_addr(void *addr)
{
- unsigned long addr = (unsigned long) x;
- pte_t *p;
- /*
- * assume we don't have huge pages in vmalloc space...
- * So don't worry about THP collapse/split. Called
- * Only in realmode with MSR_EE = 0, hence won't need irq_save/restore.
- */
- p = find_init_mm_pte(addr, NULL);
- if (!p || !pte_present(*p))
- return NULL;
- addr = (pte_pfn(*p) << PAGE_SHIFT) | (addr & ~PAGE_MASK);
- return __va(addr);
+ return __va(ppc_find_vmap_phys((unsigned long)addr));
}
/* Return 1 if we need to do a global tlbie, 0 if we can use tlbiel */
#define STACK_SLOT_UAMOR (SFS-88)
#define STACK_SLOT_DAWR1 (SFS-96)
#define STACK_SLOT_DAWRX1 (SFS-104)
+#define STACK_SLOT_FSCR (SFS-112)
/* the following is used by the P9 short path */
#define STACK_SLOT_NVGPRS (SFS-152) /* 18 gprs */
std r6, STACK_SLOT_DAWR0(r1)
std r7, STACK_SLOT_DAWRX0(r1)
std r8, STACK_SLOT_IAMR(r1)
+ mfspr r5, SPRN_FSCR
+ std r5, STACK_SLOT_FSCR(r1)
END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
BEGIN_FTR_SECTION
mfspr r6, SPRN_DAWR1
ld r7, STACK_SLOT_HFSCR(r1)
mtspr SPRN_HFSCR, r7
ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_300)
+BEGIN_FTR_SECTION
+ ld r5, STACK_SLOT_FSCR(r1)
+ mtspr SPRN_FSCR, r5
+END_FTR_SECTION_IFSET(CPU_FTR_ARCH_207S)
/*
* Restore various registers to 0, where non-zero values
* set by the guest could disrupt the host.
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT
select ARCH_SUPPORTS_HUGETLBFS if MMU
+ select ARCH_USE_MEMTEST
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
KBUILD_LDFLAGS += -melf32lriscv
endif
+ifeq ($(CONFIG_LD_IS_LLD),y)
+ KBUILD_CFLAGS += -mno-relax
+ KBUILD_AFLAGS += -mno-relax
+ifneq ($(LLVM_IAS),1)
+ KBUILD_CFLAGS += -Wa,-mno-relax
+ KBUILD_AFLAGS += -Wa,-mno-relax
+endif
+endif
+
# ISA string setting
riscv-march-$(CONFIG_ARCH_RV32I) := rv32ima
riscv-march-$(CONFIG_ARCH_RV64I) := rv64ima
-obj-y += errata_cip_453.o
+obj-$(CONFIG_ERRATA_SIFIVE_CIP_453) += errata_cip_453.o
obj-y += errata.o
unsigned long fdt_addr;
};
-const extern unsigned char riscv_kexec_relocate[];
-const extern unsigned int riscv_kexec_relocate_size;
+extern const unsigned char riscv_kexec_relocate[];
+extern const unsigned int riscv_kexec_relocate_size;
typedef void (*riscv_kexec_method)(unsigned long first_ind_entry,
unsigned long jump_addr,
#include <asm/set_memory.h> /* For set_memory_x() */
#include <linux/compiler.h> /* For unreachable() */
#include <linux/cpu.h> /* For cpu_down() */
+#include <linux/reboot.h>
-/**
+/*
* kexec_image_info - Print received image details
*/
static void
}
}
-/**
+/*
* machine_kexec_prepare - Initialize kexec
*
* This function is called from do_kexec_load, when the user has
}
-/**
+/*
* machine_kexec_cleanup - Cleanup any leftovers from
* machine_kexec_prepare
*
#endif
}
-/**
+/*
* machine_crash_shutdown - Prepare to kexec after a kernel crash
*
* This function is called by crash_kexec just before machine_kexec
pr_info("Starting crashdump kernel...\n");
}
-/**
+/*
* machine_kexec - Jump to the loaded kimage
*
* This function is called by kernel_kexec which is called by the
return 0;
}
+#ifdef CONFIG_MMU
void *alloc_insn_page(void)
{
return __vmalloc_node_range(PAGE_SIZE, 1, VMALLOC_START, VMALLOC_END,
VM_FLUSH_RESET_PERMS, NUMA_NO_NODE,
__builtin_return_address(0));
}
+#endif
/* install breakpoint in text */
void __kprobes arch_arm_kprobe(struct kprobe *p)
/* Clean-up any unused pre-allocated resources */
mem_res_sz = (num_resources - res_idx + 1) * sizeof(*mem_res);
- memblock_free((phys_addr_t) mem_res, mem_res_sz);
+ memblock_free(__pa(mem_res), mem_res_sz);
return;
error:
/* Better an empty resource tree than an inconsistent one */
release_child_resources(&iomem_resource);
- memblock_free((phys_addr_t) mem_res, mem_res_sz);
+ memblock_free(__pa(mem_res), mem_res_sz);
}
fp = frame_pointer(regs);
sp = user_stack_pointer(regs);
pc = instruction_pointer(regs);
- } else if (task == NULL || task == current) {
- fp = (unsigned long)__builtin_frame_address(0);
- sp = sp_in_global;
- pc = (unsigned long)walk_stackframe;
+ } else if (task == current) {
+ fp = (unsigned long)__builtin_frame_address(1);
+ sp = (unsigned long)__builtin_frame_address(0);
+ pc = (unsigned long)__builtin_return_address(0);
} else {
/* task blocked in __switch_to */
fp = task->thread.s[0];
return true;
}
-void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
+noinline void dump_backtrace(struct pt_regs *regs, struct task_struct *task,
const char *loglvl)
{
- pr_cont("%sCall Trace:\n", loglvl);
walk_stackframe(task, regs, print_trace_address, (void *)loglvl);
}
void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
{
+ pr_cont("%sCall Trace:\n", loglvl);
dump_backtrace(NULL, task, loglvl);
}
#ifdef CONFIG_STACKTRACE
-void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
+noinline void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
struct task_struct *task, struct pt_regs *regs)
{
walk_stackframe(task, regs, consume_entry, cookie);
unsigned long init_data_start = (unsigned long)__init_data_begin;
unsigned long rodata_start = (unsigned long)__start_rodata;
unsigned long data_start = (unsigned long)_data;
- unsigned long max_low = (unsigned long)(__va(PFN_PHYS(max_low_pfn)));
+#if defined(CONFIG_64BIT) && defined(CONFIG_MMU)
+ unsigned long end_va = kernel_virt_addr + load_sz;
+#else
+ unsigned long end_va = (unsigned long)(__va(PFN_PHYS(max_low_pfn)));
+#endif
set_memory_ro(text_start, (init_text_start - text_start) >> PAGE_SHIFT);
set_memory_ro(init_text_start, (init_data_start - init_text_start) >> PAGE_SHIFT);
set_memory_nx(init_data_start, (rodata_start - init_data_start) >> PAGE_SHIFT);
/* rodata section is marked readonly in mark_rodata_ro */
set_memory_nx(rodata_start, (data_start - rodata_start) >> PAGE_SHIFT);
- set_memory_nx(data_start, (max_low - data_start) >> PAGE_SHIFT);
+ set_memory_nx(data_start, (end_va - data_start) >> PAGE_SHIFT);
}
void mark_rodata_ro(void)
extern int setup_APIC_eilvt(u8 lvt_off, u8 vector, u8 msg_type, u8 mask);
extern void lapic_assign_system_vectors(void);
extern void lapic_assign_legacy_vector(unsigned int isairq, bool replace);
+extern void lapic_update_legacy_vectors(void);
extern void lapic_online(void);
extern void lapic_offline(void);
extern bool apic_needs_pit(void);
# define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31))
#endif
-#ifdef CONFIG_IOMMU_SUPPORT
-# define DISABLE_ENQCMD 0
-#else
-# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
-#endif
+/* Force disable because it's broken beyond repair */
+#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
#ifdef CONFIG_X86_SGX
# define DISABLE_SGX 0
*/
#define PASID_DISABLED 0
-#ifdef CONFIG_IOMMU_SUPPORT
-/* Update current's PASID MSR/state by mm's PASID. */
-void update_pasid(void);
-#else
static inline void update_pasid(void) { }
-#endif
+
#endif /* _ASM_X86_FPU_API_H */
pkru_val = pk->pkru;
}
__write_pkru(pkru_val);
-
- /*
- * Expensive PASID MSR write will be avoided in update_pasid() because
- * TIF_NEED_FPU_LOAD was set. And the PASID state won't be updated
- * unless it's different from mm->pasid to reduce overhead.
- */
- update_pasid();
}
#endif /* _ASM_X86_FPU_INTERNAL_H */
KVM_X86_OP_NULL(vcpu_blocking)
KVM_X86_OP_NULL(vcpu_unblocking)
KVM_X86_OP_NULL(update_pi_irte)
+KVM_X86_OP_NULL(start_assignment)
KVM_X86_OP_NULL(apicv_post_state_restore)
KVM_X86_OP_NULL(dy_apicv_has_pending_interrupt)
KVM_X86_OP_NULL(set_hv_timer)
int (*update_pi_irte)(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
+ void (*start_assignment)(struct kvm *kvm);
void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu);
bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu);
#define _ASM_X86_THERMAL_H
#ifdef CONFIG_X86_THERMAL_VECTOR
+void therm_lvt_init(void);
void intel_init_thermal(struct cpuinfo_x86 *c);
bool x86_thermal_enabled(void);
void intel_thermal_interrupt(void);
#else
-static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
+static inline void therm_lvt_init(void) { }
+static inline void intel_init_thermal(struct cpuinfo_x86 *c) { }
#endif
#endif /* _ASM_X86_THERMAL_H */
}
/*
+ * optimize_nops_range() - Optimize a sequence of single byte NOPs (0x90)
+ *
+ * @instr: instruction byte stream
+ * @instrlen: length of the above
+ * @off: offset within @instr where the first NOP has been detected
+ *
+ * Return: number of NOPs found (and replaced).
+ */
+static __always_inline int optimize_nops_range(u8 *instr, u8 instrlen, int off)
+{
+ unsigned long flags;
+ int i = off, nnops;
+
+ while (i < instrlen) {
+ if (instr[i] != 0x90)
+ break;
+
+ i++;
+ }
+
+ nnops = i - off;
+
+ if (nnops <= 1)
+ return nnops;
+
+ local_irq_save(flags);
+ add_nops(instr + off, nnops);
+ local_irq_restore(flags);
+
+ DUMP_BYTES(instr, instrlen, "%px: [%d:%d) optimized NOPs: ", instr, off, i);
+
+ return nnops;
+}
+
+/*
* "noinline" to cause control flow change and thus invalidate I$ and
* cause refetch after modification.
*/
static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
{
- unsigned long flags;
struct insn insn;
- int nop, i = 0;
+ int i = 0;
/*
- * Jump over the non-NOP insns, the remaining bytes must be single-byte
- * NOPs, optimize them.
+ * Jump over the non-NOP insns and optimize single-byte NOPs into bigger
+ * ones.
*/
for (;;) {
if (insn_decode_kernel(&insn, &instr[i]))
return;
+ /*
+ * See if this and any potentially following NOPs can be
+ * optimized.
+ */
if (insn.length == 1 && insn.opcode.bytes[0] == 0x90)
- break;
-
- if ((i += insn.length) >= a->instrlen)
- return;
- }
+ i += optimize_nops_range(instr, a->instrlen, i);
+ else
+ i += insn.length;
- for (nop = i; i < a->instrlen; i++) {
- if (WARN_ONCE(instr[i] != 0x90, "Not a NOP at 0x%px\n", &instr[i]))
+ if (i >= a->instrlen)
return;
}
-
- local_irq_save(flags);
- add_nops(instr + nop, i - nop);
- local_irq_restore(flags);
-
- DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
- instr, nop, a->instrlen);
}
/*
end_local_APIC_setup();
irq_remap_enable_fault_handling();
setup_IO_APIC();
+ lapic_update_legacy_vectors();
}
#ifdef CONFIG_UP_LATE_INIT
irq_matrix_assign_system(vector_matrix, ISA_IRQ_VECTOR(irq), replace);
}
+void __init lapic_update_legacy_vectors(void)
+{
+ unsigned int i;
+
+ if (IS_ENABLED(CONFIG_X86_IO_APIC) && nr_ioapics > 0)
+ return;
+
+ /*
+ * If the IO/APIC is disabled via config, kernel command line or
+ * lack of enumeration then all legacy interrupts are routed
+ * through the PIC. Make sure that they are marked as legacy
+ * vectors. PIC_CASCADE_IRQ has already been marked in
+ * lapic_assign_system_vectors().
+ */
+ for (i = 0; i < nr_legacy_irqs(); i++) {
+ if (i != PIC_CASCADE_IR)
+ lapic_assign_legacy_vector(i, true);
+ }
+}
+
void __init lapic_assign_system_vectors(void)
{
unsigned int i, vector = 0;
return 0;
}
#endif /* CONFIG_PROC_PID_ARCH_STATUS */
-
-#ifdef CONFIG_IOMMU_SUPPORT
-void update_pasid(void)
-{
- u64 pasid_state;
- u32 pasid;
-
- if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
- return;
-
- if (!current->mm)
- return;
-
- pasid = READ_ONCE(current->mm->pasid);
- /* Set the valid bit in the PASID MSR/state only for valid pasid. */
- pasid_state = pasid == PASID_DISABLED ?
- pasid : pasid | MSR_IA32_PASID_VALID;
-
- /*
- * No need to hold fregs_lock() since the task's fpstate won't
- * be changed by others (e.g. ptrace) while the task is being
- * switched to or is in IPI.
- */
- if (!test_thread_flag(TIF_NEED_FPU_LOAD)) {
- /* The MSR is active and can be directly updated. */
- wrmsrl(MSR_IA32_PASID, pasid_state);
- } else {
- struct fpu *fpu = ¤t->thread.fpu;
- struct ia32_pasid_state *ppasid_state;
- struct xregs_state *xsave;
-
- /*
- * The CPU's xstate registers are not currently active. Just
- * update the PASID state in the memory buffer here. The
- * PASID MSR will be loaded when returning to user mode.
- */
- xsave = &fpu->state.xsave;
- xsave->header.xfeatures |= XFEATURE_MASK_PASID;
- ppasid_state = get_xsave_addr(xsave, XFEATURE_PASID);
- /*
- * Since XFEATURE_MASK_PASID is set in xfeatures, ppasid_state
- * won't be NULL and no need to check its value.
- *
- * Only update the task's PASID state when it's different
- * from the mm's pasid.
- */
- if (ppasid_state->pasid != pasid_state) {
- /*
- * Invalid fpregs so that state restoring will pick up
- * the PASID state.
- */
- __fpu_invalidate_fpregs_state(fpu);
- ppasid_state->pasid = pasid_state;
- }
- }
-}
-#endif /* CONFIG_IOMMU_SUPPORT */
#include <asm/pci-direct.h>
#include <asm/prom.h>
#include <asm/proto.h>
+#include <asm/thermal.h>
#include <asm/unwind.h>
#include <asm/vsyscall.h>
#include <linux/vmalloc.h>
* them from accessing certain memory ranges, namely anything below
* 1M and in the pages listed in bad_pages[] above.
*
- * To avoid these pages being ever accessed by SNB gfx devices
- * reserve all memory below the 1 MB mark and bad_pages that have
- * not already been reserved at boot time.
+ * To avoid these pages being ever accessed by SNB gfx devices reserve
+ * bad_pages that have not already been reserved at boot time.
+ * All memory below the 1 MB mark is anyway reserved later during
+ * setup_arch(), so there is no need to reserve it here.
*/
- memblock_reserve(0, 1<<20);
for (i = 0; i < ARRAY_SIZE(bad_pages); i++) {
if (memblock_reserve(bad_pages[i], PAGE_SIZE))
* The first 4Kb of memory is a BIOS owned area, but generally it is
* not listed as such in the E820 table.
*
- * Reserve the first memory page and typically some additional
- * memory (64KiB by default) since some BIOSes are known to corrupt
- * low memory. See the Kconfig help text for X86_RESERVE_LOW.
+ * Reserve the first 64K of memory since some BIOSes are known to
+ * corrupt low memory. After the real mode trampoline is allocated the
+ * rest of the memory below 640k is reserved.
*
* In addition, make sure page 0 is always reserved because on
* systems with L1TF its contents can be leaked to user processes.
*/
- memblock_reserve(0, ALIGN(reserve_low, PAGE_SIZE));
+ memblock_reserve(0, SZ_64K);
early_reserve_initrd();
reserve_ibft_region();
reserve_bios_regions();
+ trim_snb_memory();
}
/*
(max_pfn_mapped<<PAGE_SHIFT) - 1);
#endif
- reserve_real_mode();
-
/*
- * Reserving memory causing GPU hangs on Sandy Bridge integrated
- * graphics devices should be done after we allocated memory under
- * 1M for the real mode trampoline.
+ * Find free memory for the real mode trampoline and place it
+ * there.
+ * If there is not enough free memory under 1M, on EFI-enabled
+ * systems there will be additional attempt to reclaim the memory
+ * for the real mode trampoline at efi_free_boot_services().
+ *
+ * Unconditionally reserve the entire first 1M of RAM because
+ * BIOSes are know to corrupt low memory and several
+ * hundred kilobytes are not worth complex detection what memory gets
+ * clobbered. Moreover, on machines with SandyBridge graphics or in
+ * setups that use crashkernel the entire 1M is reserved anyway.
*/
- trim_snb_memory();
+ reserve_real_mode();
init_mem_mapping();
x86_init.timers.wallclock_init();
+ /*
+ * This needs to run before setup_local_APIC() which soft-disables the
+ * local APIC temporarily and that masks the thermal LVT interrupt,
+ * leading to softlockups on machines which have configured SMI
+ * interrupt delivery.
+ */
+ therm_lvt_init();
+
mcheck_init();
register_refined_jiffies(CLOCK_TICK_RATE);
return rc;
}
-int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len)
+int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int emulation_type)
{
int rc = X86EMUL_CONTINUE;
int mode = ctxt->mode;
ctxt->execute = opcode.u.execute;
- if (unlikely(ctxt->ud) && likely(!(ctxt->d & EmulateOnUD)))
+ if (unlikely(emulation_type & EMULTYPE_TRAP_UD) &&
+ likely(!(ctxt->d & EmulateOnUD)))
return EMULATION_FAILED;
if (unlikely(ctxt->d &
{
struct kvm_hv *hv = to_kvm_hv(kvm);
u64 gfn;
+ int idx;
if (hv->hv_tsc_page_status == HV_TSC_PAGE_BROKEN ||
hv->hv_tsc_page_status == HV_TSC_PAGE_UNSET ||
gfn = hv->hv_tsc_page >> HV_X64_MSR_TSC_REFERENCE_ADDRESS_SHIFT;
hv->tsc_ref.tsc_sequence = 0;
+
+ /*
+ * Take the srcu lock as memslots will be accessed to check the gfn
+ * cache generation against the memslots generation.
+ */
+ idx = srcu_read_lock(&kvm->srcu);
if (kvm_write_guest(kvm, gfn_to_gpa(gfn),
&hv->tsc_ref, sizeof(hv->tsc_ref.tsc_sequence)))
hv->hv_tsc_page_status = HV_TSC_PAGE_BROKEN;
+ srcu_read_unlock(&kvm->srcu, idx);
out_unlock:
mutex_unlock(&hv->hv_lock);
int interruptibility;
bool perm_ok; /* do not check permissions if true */
- bool ud; /* inject an #UD if host doesn't support insn */
bool tf; /* TF value before instruction (after for syscall/sysret) */
bool have_exception;
#define X86EMUL_MODE_HOST X86EMUL_MODE_PROT64
#endif
-int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len);
+int x86_decode_insn(struct x86_emulate_ctxt *ctxt, void *insn, int insn_len, int emulation_type);
bool x86_page_table_writing_insn(struct x86_emulate_ctxt *ctxt);
#define EMULATION_FAILED -1
#define EMULATION_OK 0
guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
apic->lapic_timer.advance_expire_delta = guest_tsc - tsc_deadline;
+ if (lapic_timer_advance_dynamic) {
+ adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta);
+ /*
+ * If the timer fired early, reread the TSC to account for the
+ * overhead of the above adjustment to avoid waiting longer
+ * than is necessary.
+ */
+ if (guest_tsc < tsc_deadline)
+ guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc());
+ }
+
if (guest_tsc < tsc_deadline)
__wait_lapic_expire(vcpu, tsc_deadline - guest_tsc);
-
- if (lapic_timer_advance_dynamic)
- adjust_lapic_timer_advance(vcpu, apic->lapic_timer.advance_expire_delta);
}
void kvm_wait_lapic_expire(struct kvm_vcpu *vcpu)
}
atomic_inc(&apic->lapic_timer.pending);
- kvm_make_request(KVM_REQ_PENDING_TIMER, vcpu);
+ kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
if (from_timer_fn)
kvm_vcpu_kick(vcpu);
}
}
/*
- * Remove write access from all the SPTEs mapping GFNs [start, end). If
- * skip_4k is set, SPTEs that map 4k pages, will not be write-protected.
- * Returns true if an SPTE has been changed and the TLBs need to be flushed.
+ * Remove write access from all SPTEs at or above min_level that map GFNs
+ * [start, end). Returns true if an SPTE has been changed and the TLBs need to
+ * be flushed.
*/
static bool wrprot_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root,
gfn_t start, gfn_t end, int min_level)
#include "svm.h"
/* enable / disable AVIC */
-int avic;
-#ifdef CONFIG_X86_LOCAL_APIC
-module_param(avic, int, S_IRUGO);
-#endif
+bool avic;
+module_param(avic, bool, S_IRUGO);
#define SVM_AVIC_DOORBELL 0xc001011b
}
if (avic) {
- if (!npt_enabled ||
- !boot_cpu_has(X86_FEATURE_AVIC) ||
- !IS_ENABLED(CONFIG_X86_LOCAL_APIC)) {
+ if (!npt_enabled || !boot_cpu_has(X86_FEATURE_AVIC)) {
avic = false;
} else {
pr_info("AVIC enabled\n");
#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
-extern int avic;
+extern bool avic;
static inline void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
{
static inline bool cpu_has_vmx_posted_intr(void)
{
- return IS_ENABLED(CONFIG_X86_LOCAL_APIC) &&
- vmcs_config.pin_based_exec_ctrl & PIN_BASED_POSTED_INTR;
+ return vmcs_config.pin_based_exec_ctrl & PIN_BASED_POSTED_INTR;
}
static inline bool cpu_has_load_ia32_efer(void)
/*
+ * Bail out of the block loop if the VM has an assigned
+ * device, but the blocking vCPU didn't reconfigure the
+ * PI.NV to the wakeup vector, i.e. the assigned device
+ * came along after the initial check in pi_pre_block().
+ */
+void vmx_pi_start_assignment(struct kvm *kvm)
+{
+ if (!irq_remapping_cap(IRQ_POSTING_CAP))
+ return;
+
+ kvm_make_all_cpus_request(kvm, KVM_REQ_UNBLOCK);
+}
+
+/*
* pi_update_irte - set IRTE for Posted-Interrupts
*
* @kvm: kvm
bool pi_has_pending_interrupt(struct kvm_vcpu *vcpu);
int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
bool set);
+void vmx_pi_start_assignment(struct kvm *kvm);
#endif /* __KVM_X86_VMX_POSTED_INTR_H */
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct kvm_run *kvm_run = vcpu->run;
u32 intr_info, ex_no, error_code;
- unsigned long cr2, rip, dr6;
+ unsigned long cr2, dr6;
u32 vect_info;
vect_info = vmx->idt_vectoring_info;
vmx->vcpu.arch.event_exit_inst_len =
vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
kvm_run->exit_reason = KVM_EXIT_DEBUG;
- rip = kvm_rip_read(vcpu);
- kvm_run->debug.arch.pc = vmcs_readl(GUEST_CS_BASE) + rip;
+ kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
kvm_run->debug.arch.exception = ex_no;
break;
case AC_VECTOR:
.nested_ops = &vmx_nested_ops,
.update_pi_irte = pi_update_irte,
+ .start_assignment = vmx_pi_start_assignment,
#ifdef CONFIG_X86_64
.set_hv_timer = vmx_set_hv_timer,
st->preempted & KVM_VCPU_FLUSH_TLB);
if (xchg(&st->preempted, 0) & KVM_VCPU_FLUSH_TLB)
kvm_vcpu_flush_tlb_guest(vcpu);
+ } else {
+ st->preempted = 0;
}
vcpu->arch.st.preempted = 0;
BUILD_BUG_ON(HF_SMM_MASK != X86EMUL_SMM_MASK);
BUILD_BUG_ON(HF_SMM_INSIDE_NMI_MASK != X86EMUL_SMM_INSIDE_NMI_MASK);
+ ctxt->interruptibility = 0;
+ ctxt->have_exception = false;
+ ctxt->exception.vector = -1;
+ ctxt->perm_ok = false;
+
init_decode_cache(ctxt);
vcpu->arch.emulate_regs_need_sync_from_vcpu = false;
}
kvm_vcpu_check_breakpoint(vcpu, &r))
return r;
- ctxt->interruptibility = 0;
- ctxt->have_exception = false;
- ctxt->exception.vector = -1;
- ctxt->perm_ok = false;
-
- ctxt->ud = emulation_type & EMULTYPE_TRAP_UD;
-
- r = x86_decode_insn(ctxt, insn, insn_len);
+ r = x86_decode_insn(ctxt, insn, insn_len, emulation_type);
trace_kvm_emulate_insn_start(vcpu);
++vcpu->stat.insn_emulation;
vcpu->stat.directed_yield_attempted++;
+ if (single_task_running())
+ goto no_yield;
+
rcu_read_lock();
map = rcu_dereference(vcpu->kvm->arch.apic_map);
if (r <= 0)
break;
- kvm_clear_request(KVM_REQ_PENDING_TIMER, vcpu);
+ kvm_clear_request(KVM_REQ_UNBLOCK, vcpu);
if (kvm_cpu_has_pending_timer(vcpu))
kvm_inject_pending_timer_irqs(vcpu);
kvm_update_dr7(vcpu);
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)
- vcpu->arch.singlestep_rip = kvm_rip_read(vcpu) +
- get_segment_base(vcpu, VCPU_SREG_CS);
+ vcpu->arch.singlestep_rip = kvm_get_linear_rip(vcpu);
/*
* Trigger an rflags update that will inject or remove the trace
void kvm_arch_start_assignment(struct kvm *kvm)
{
- atomic_inc(&kvm->arch.assigned_device_count);
+ if (atomic_inc_return(&kvm->arch.assigned_device_count) == 1)
+ static_call_cond(kvm_x86_start_assignment)(kvm);
}
EXPORT_SYMBOL_GPL(kvm_arch_start_assignment);
if (si_code == SEGV_PKUERR)
force_sig_pkuerr((void __user *)address, pkey);
-
- force_sig_fault(SIGSEGV, si_code, (void __user *)address);
+ else
+ force_sig_fault(SIGSEGV, si_code, (void __user *)address);
local_irq_disable();
}
#define AMD_SME_BIT BIT(0)
#define AMD_SEV_BIT BIT(1)
- /* Check the SEV MSR whether SEV or SME is enabled */
- sev_status = __rdmsr(MSR_AMD64_SEV);
- feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
-
/*
* Check for the SME/SEV feature:
* CPUID Fn8000_001F[EAX]
eax = 0x8000001f;
ecx = 0;
native_cpuid(&eax, &ebx, &ecx, &edx);
- if (!(eax & feature_mask))
+ /* Check whether SEV or SME is supported */
+ if (!(eax & (AMD_SEV_BIT | AMD_SME_BIT)))
return;
me_mask = 1UL << (ebx & 0x3f);
+ /* Check the SEV MSR whether SEV or SME is enabled */
+ sev_status = __rdmsr(MSR_AMD64_SEV);
+ feature_mask = (sev_status & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT;
+
/* Check if memory encryption is enabled */
if (feature_mask == AMD_SME_BIT) {
/*
size -= rm_size;
}
+ /*
+ * Don't free memory under 1M for two reasons:
+ * - BIOS might clobber it
+ * - Crash kernel needs it to be reserved
+ */
+ if (start + size < SZ_1M)
+ continue;
+ if (start < SZ_1M) {
+ size -= (SZ_1M - start);
+ start = SZ_1M;
+ }
+
memblock_free_late(start, size);
}
/* Has to be under 1M so we can execute real-mode AP code. */
mem = memblock_find_in_range(0, 1<<20, size, PAGE_SIZE);
- if (!mem) {
+ if (!mem)
pr_info("No sub-1M memory is available for the trampoline\n");
- return;
- }
+ else
+ set_real_mode_mem(mem);
- memblock_reserve(mem, size);
- set_real_mode_mem(mem);
- crash_reserve_low_1M();
+ /*
+ * Unconditionally reserve the entire fisrt 1M, see comment in
+ * setup_arch().
+ */
+ memblock_reserve(0, SZ_1M);
}
static void sme_sev_setup_real_mode(struct trampoline_header *th)
{ "AMDI0010", APD_ADDR(wt_i2c_desc) },
{ "AMD0020", APD_ADDR(cz_uart_desc) },
{ "AMDI0020", APD_ADDR(cz_uart_desc) },
+ { "AMDI0022", APD_ADDR(cz_uart_desc) },
{ "AMD0030", },
{ "AMD0040", APD_ADDR(fch_misc_desc)},
{ "HYGO0010", APD_ADDR(wt_i2c_desc) },
}
break;
+ case ACPI_TYPE_LOCAL_ADDRESS_HANDLER:
+
+ ACPI_DEBUG_PRINT((ACPI_DB_ALLOCATIONS,
+ "***** Address handler %p\n", object));
+
+ acpi_os_delete_mutex(object->address_space.context_mutex);
+ break;
+
default:
break;
void acpi_power_resources_list_free(struct list_head *list);
int acpi_extract_power_resources(union acpi_object *package, unsigned int start,
struct list_head *list);
-int acpi_add_power_resource(acpi_handle handle);
+struct acpi_device *acpi_add_power_resource(acpi_handle handle);
void acpi_power_add_remove_device(struct acpi_device *adev, bool add);
int acpi_power_wakeup_list_init(struct list_head *list, int *system_level);
int acpi_device_sleep_wake(struct acpi_device *dev,
int acpi_power_get_inferred_state(struct acpi_device *device, int *state);
int acpi_power_on_resources(struct acpi_device *device, int state);
int acpi_power_transition(struct acpi_device *device, int state);
-void acpi_turn_off_unused_power_resources(void);
+void acpi_turn_off_unused_power_resources(bool init);
/* --------------------------------------------------------------------------
Device Power Management
u32 system_level;
u32 order;
unsigned int ref_count;
+ unsigned int users;
bool wakeup_enabled;
struct mutex resource_lock;
struct list_head dependents;
for (i = start; i < package->package.count; i++) {
union acpi_object *element = &package->package.elements[i];
+ struct acpi_device *rdev;
acpi_handle rhandle;
if (element->type != ACPI_TYPE_LOCAL_REFERENCE) {
if (acpi_power_resource_is_dup(package, start, i))
continue;
- err = acpi_add_power_resource(rhandle);
- if (err)
+ rdev = acpi_add_power_resource(rhandle);
+ if (!rdev) {
+ err = -ENODEV;
break;
-
+ }
err = acpi_power_resources_list_add(rhandle, list);
if (err)
break;
+
+ to_power_resource(rdev)->users++;
}
if (err)
acpi_power_resources_list_free(list);
mutex_unlock(&power_resource_list_lock);
}
-int acpi_add_power_resource(acpi_handle handle)
+struct acpi_device *acpi_add_power_resource(acpi_handle handle)
{
struct acpi_power_resource *resource;
struct acpi_device *device = NULL;
acpi_bus_get_device(handle, &device);
if (device)
- return 0;
+ return device;
resource = kzalloc(sizeof(*resource), GFP_KERNEL);
if (!resource)
- return -ENOMEM;
+ return NULL;
device = &resource->device;
acpi_init_device_object(device, handle, ACPI_BUS_TYPE_POWER);
acpi_power_add_resource_to_list(resource);
acpi_device_add_finalize(device);
- return 0;
+ return device;
err:
acpi_release_power_resource(&device->dev);
- return result;
+ return NULL;
}
#ifdef CONFIG_ACPI_SLEEP
}
#endif
-void acpi_turn_off_unused_power_resources(void)
+static void acpi_power_turn_off_if_unused(struct acpi_power_resource *resource,
+ bool init)
+{
+ if (resource->ref_count > 0)
+ return;
+
+ if (init) {
+ if (resource->users > 0)
+ return;
+ } else {
+ int result, state;
+
+ result = acpi_power_get_state(resource->device.handle, &state);
+ if (result || state == ACPI_POWER_RESOURCE_STATE_OFF)
+ return;
+ }
+
+ dev_info(&resource->device.dev, "Turning OFF\n");
+ __acpi_power_off(resource);
+}
+
+/**
+ * acpi_turn_off_unused_power_resources - Turn off power resources not in use.
+ * @init: Control switch.
+ *
+ * If @ainit is set, unconditionally turn off all of the ACPI power resources
+ * without any users.
+ *
+ * Otherwise, turn off all ACPI power resources without active references (that
+ * is, the ones that should be "off" at the moment) that are "on".
+ */
+void acpi_turn_off_unused_power_resources(bool init)
{
struct acpi_power_resource *resource;
list_for_each_entry_reverse(resource, &acpi_power_resource_list, list_node) {
mutex_lock(&resource->resource_lock);
- if (!resource->ref_count) {
- dev_info(&resource->device.dev, "Turning OFF\n");
- __acpi_power_off(resource);
- }
+ acpi_power_turn_off_if_unused(resource, init);
mutex_unlock(&resource->resource_lock);
}
}
}
- acpi_turn_off_unused_power_resources();
+ acpi_turn_off_unused_power_resources(true);
acpi_scan_initialized = true;
*/
static void acpi_pm_end(void)
{
- acpi_turn_off_unused_power_resources();
+ acpi_turn_off_unused_power_resources(false);
acpi_scan_lock_release();
/*
* This is necessary in case acpi_pm_finish() is not called during a
{
return srcu_read_lock_held(&device_links_srcu);
}
+
+static void device_link_synchronize_removal(void)
+{
+ synchronize_srcu(&device_links_srcu);
+}
+
+static void device_link_remove_from_lists(struct device_link *link)
+{
+ list_del_rcu(&link->s_node);
+ list_del_rcu(&link->c_node);
+}
#else /* !CONFIG_SRCU */
static DECLARE_RWSEM(device_links_lock);
return lockdep_is_held(&device_links_lock);
}
#endif
+
+static inline void device_link_synchronize_removal(void)
+{
+}
+
+static void device_link_remove_from_lists(struct device_link *link)
+{
+ list_del(&link->s_node);
+ list_del(&link->c_node);
+}
#endif /* !CONFIG_SRCU */
static bool device_is_ancestor(struct device *dev, struct device *target)
};
ATTRIBUTE_GROUPS(devlink);
-static void device_link_free(struct device_link *link)
+static void device_link_release_fn(struct work_struct *work)
{
+ struct device_link *link = container_of(work, struct device_link, rm_work);
+
+ /* Ensure that all references to the link object have been dropped. */
+ device_link_synchronize_removal();
+
while (refcount_dec_not_one(&link->rpm_active))
pm_runtime_put(link->supplier);
kfree(link);
}
-#ifdef CONFIG_SRCU
-static void __device_link_free_srcu(struct rcu_head *rhead)
-{
- device_link_free(container_of(rhead, struct device_link, rcu_head));
-}
-
static void devlink_dev_release(struct device *dev)
{
struct device_link *link = to_devlink(dev);
- call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu);
-}
-#else
-static void devlink_dev_release(struct device *dev)
-{
- device_link_free(to_devlink(dev));
+ INIT_WORK(&link->rm_work, device_link_release_fn);
+ /*
+ * It may take a while to complete this work because of the SRCU
+ * synchronization in device_link_release_fn() and if the consumer or
+ * supplier devices get deleted when it runs, so put it into the "long"
+ * workqueue.
+ */
+ queue_work(system_long_wq, &link->rm_work);
}
-#endif
static struct class devlink_class = {
.name = "devlink",
}
EXPORT_SYMBOL_GPL(device_link_add);
-#ifdef CONFIG_SRCU
static void __device_link_del(struct kref *kref)
{
struct device_link *link = container_of(kref, struct device_link, kref);
pm_runtime_drop_link(link);
- list_del_rcu(&link->s_node);
- list_del_rcu(&link->c_node);
- device_unregister(&link->link_dev);
-}
-#else /* !CONFIG_SRCU */
-static void __device_link_del(struct kref *kref)
-{
- struct device_link *link = container_of(kref, struct device_link, kref);
-
- dev_info(link->consumer, "Dropping the link to %s\n",
- dev_name(link->supplier));
-
- pm_runtime_drop_link(link);
-
- list_del(&link->s_node);
- list_del(&link->c_node);
+ device_link_remove_from_lists(link);
device_unregister(&link->link_dev);
}
-#endif /* !CONFIG_SRCU */
static void device_link_put_kref(struct device_link *link)
{
struct zone *zone;
int ret;
- zone = page_zone(pfn_to_page(start_pfn));
-
/*
* Unaccount before offlining, such that unpopulated zone and kthreads
* can properly be torn down in offline_pages().
*/
- if (nr_vmemmap_pages)
+ if (nr_vmemmap_pages) {
+ zone = page_zone(pfn_to_page(start_pfn));
adjust_present_page_count(zone, -nr_vmemmap_pages);
+ }
ret = offline_pages(start_pfn + nr_vmemmap_pages,
nr_pages - nr_vmemmap_pages);
/* Realtek 8822CE Bluetooth devices */
{ USB_DEVICE(0x0bda, 0xb00c), .driver_info = BTUSB_REALTEK |
BTUSB_WIDEBAND_SPEECH },
+ { USB_DEVICE(0x0bda, 0xc822), .driver_info = BTUSB_REALTEK |
+ BTUSB_WIDEBAND_SPEECH },
/* Realtek 8852AE Bluetooth devices */
{ USB_DEVICE(0x0bda, 0xc852), .driver_info = BTUSB_REALTEK |
}
btusb_setup_intel_newgen_get_fw_name(ver, fwname, sizeof(fwname), "sfi");
- err = request_firmware(&fw, fwname, &hdev->dev);
+ err = firmware_request_nowarn(&fw, fwname, &hdev->dev);
if (err < 0) {
+ if (!test_bit(BTUSB_BOOTLOADER, &data->flags)) {
+ /* Firmware has already been loaded */
+ set_bit(BTUSB_FIRMWARE_LOADED, &data->flags);
+ return 0;
+ }
+
bt_dev_err(hdev, "Failed to load Intel firmware file %s (%d)",
fwname, err);
+
return err;
}
err = btusb_setup_intel_new_get_fw_name(ver, params, fwname,
sizeof(fwname), "sfi");
if (err < 0) {
+ if (!test_bit(BTUSB_BOOTLOADER, &data->flags)) {
+ /* Firmware has already been loaded */
+ set_bit(BTUSB_FIRMWARE_LOADED, &data->flags);
+ return 0;
+ }
+
bt_dev_err(hdev, "Unsupported Intel firmware naming");
return -EINVAL;
}
- err = request_firmware(&fw, fwname, &hdev->dev);
+ err = firmware_request_nowarn(&fw, fwname, &hdev->dev);
if (err < 0) {
+ if (!test_bit(BTUSB_BOOTLOADER, &data->flags)) {
+ /* Firmware has already been loaded */
+ set_bit(BTUSB_FIRMWARE_LOADED, &data->flags);
+ return 0;
+ }
+
bt_dev_err(hdev, "Failed to load Intel firmware file %s (%d)",
fwname, err);
return err;
* If the CPU does not support MOVDIR64B or ENQCMDS, there's no point in
* enumerating the device. We can not utilize it.
*/
- if (!boot_cpu_has(X86_FEATURE_MOVDIR64B)) {
+ if (!cpu_feature_enabled(X86_FEATURE_MOVDIR64B)) {
pr_warn("idxd driver failed to load without MOVDIR64B.\n");
return -ENODEV;
}
- if (!boot_cpu_has(X86_FEATURE_ENQCMD))
+ if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
pr_warn("Platform does not have ENQCMD(S) support.\n");
else
support_enqcmd = true;
if (!msg || !(mem->validation_bits & CPER_MEM_VALID_MODULE_HANDLE))
return 0;
- n = 0;
- len = CPER_REC_LEN - 1;
+ len = CPER_REC_LEN;
dmi_memdev_name(mem->mem_dev_handle, &bank, &device);
if (bank && device)
n = snprintf(msg, len, "DIMM location: %s %s ", bank, device);
"DIMM location: not present. DMI handle: 0x%.4x ",
mem->mem_dev_handle);
- msg[n] = '\0';
return n;
}
BUILD_BUG_ON(ARRAY_SIZE(target) != ARRAY_SIZE(name));
BUILD_BUG_ON(ARRAY_SIZE(target) != ARRAY_SIZE(dt_params[0].params));
+ if (!fdt)
+ return 0;
+
for (i = 0; i < ARRAY_SIZE(dt_params); i++) {
node = fdt_path_offset(fdt, dt_params[i].path);
if (node < 0)
return 0;
/* Skip any leading slashes */
- while (cmdline[i] == L'/' || cmdline[i] == L'\\')
+ while (i < cmdline_len && (cmdline[i] == L'/' || cmdline[i] == L'\\'))
i++;
while (--result_len > 0 && i < cmdline_len) {
return false;
}
- if (!(in->attribute & (EFI_MEMORY_RO | EFI_MEMORY_XP))) {
- pr_warn("Entry attributes invalid: RO and XP bits both cleared\n");
- return false;
- }
-
if (PAGE_SIZE > EFI_PAGE_SIZE &&
(!PAGE_ALIGNED(in->phys_addr) ||
!PAGE_ALIGNED(in->num_pages << EFI_PAGE_SHIFT))) {
mmSDMA0_RLC0_RB_CNTL) - mmSDMA0_RLC0_RB_CNTL;
break;
case 1:
- sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA1, 0,
+ sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA0, 0,
mmSDMA1_RLC0_RB_CNTL) - mmSDMA0_RLC0_RB_CNTL;
break;
case 2:
- sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA2, 0,
- mmSDMA2_RLC0_RB_CNTL) - mmSDMA2_RLC0_RB_CNTL;
+ sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA0, 0,
+ mmSDMA2_RLC0_RB_CNTL) - mmSDMA0_RLC0_RB_CNTL;
break;
case 3:
- sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA3, 0,
- mmSDMA3_RLC0_RB_CNTL) - mmSDMA2_RLC0_RB_CNTL;
+ sdma_engine_reg_base = SOC15_REG_OFFSET(SDMA0, 0,
+ mmSDMA3_RLC0_RB_CNTL) - mmSDMA0_RLC0_RB_CNTL;
break;
}
engine_id, queue_id);
uint32_t i = 0, reg;
#undef HQD_N_REGS
-#define HQD_N_REGS (19+6+7+10)
+#define HQD_N_REGS (19+6+7+12)
*dump = kmalloc(HQD_N_REGS*2*sizeof(uint32_t), GFP_KERNEL);
if (*dump == NULL)
{
struct amdgpu_ctx *ctx;
struct amdgpu_ctx_mgr *mgr;
- unsigned long ras_counter;
if (!fpriv)
return -EINVAL;
if (atomic_read(&ctx->guilty))
out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_GUILTY;
- /*query ue count*/
- ras_counter = amdgpu_ras_query_error_count(adev, false);
- /*ras counter is monotonic increasing*/
- if (ras_counter != ctx->ras_counter_ue) {
- out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_UE;
- ctx->ras_counter_ue = ras_counter;
- }
-
- /*query ce count*/
- ras_counter = amdgpu_ras_query_error_count(adev, true);
- if (ras_counter != ctx->ras_counter_ce) {
- out->state.flags |= AMDGPU_CTX_QUERY2_FLAGS_RAS_CE;
- ctx->ras_counter_ce = ras_counter;
- }
-
mutex_unlock(&mgr->lock);
return 0;
}
*/
bool amdgpu_device_has_dc_support(struct amdgpu_device *adev)
{
- if (amdgpu_sriov_vf(adev) || adev->enable_virtual_display)
+ if (amdgpu_sriov_vf(adev) ||
+ adev->enable_virtual_display ||
+ (adev->harvest_ip_mask & AMD_HARVEST_IP_DMU_MASK))
return false;
return amdgpu_device_asic_has_dc_support(adev->asic_type);
int amdgpu_fru_get_product_info(struct amdgpu_device *adev)
{
unsigned char buff[34];
- int addrptr = 0, size = 0;
+ int addrptr, size;
+ int len;
if (!is_fru_eeprom_supported(adev))
return 0;
/* If algo exists, it means that the i2c_adapter's initialized */
if (!adev->pm.smu_i2c.algo) {
DRM_WARN("Cannot access FRU, EEPROM accessor not initialized");
- return 0;
+ return -ENODEV;
}
/* There's a lot of repetition here. This is due to the FRU having
size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
if (size < 1) {
DRM_ERROR("Failed to read FRU Manufacturer, ret:%d", size);
- return size;
+ return -EINVAL;
}
/* Increment the addrptr by the size of the field, and 1 due to the
size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
if (size < 1) {
DRM_ERROR("Failed to read FRU product name, ret:%d", size);
- return size;
+ return -EINVAL;
}
+ len = size;
/* Product name should only be 32 characters. Any more,
* and something could be wrong. Cap it at 32 to be safe
*/
- if (size > 32) {
+ if (len >= sizeof(adev->product_name)) {
DRM_WARN("FRU Product Number is larger than 32 characters. This is likely a mistake");
- size = 32;
+ len = sizeof(adev->product_name) - 1;
}
/* Start at 2 due to buff using fields 0 and 1 for the address */
- memcpy(adev->product_name, &buff[2], size);
- adev->product_name[size] = '\0';
+ memcpy(adev->product_name, &buff[2], len);
+ adev->product_name[len] = '\0';
addrptr += size + 1;
size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
if (size < 1) {
DRM_ERROR("Failed to read FRU product number, ret:%d", size);
- return size;
+ return -EINVAL;
}
+ len = size;
/* Product number should only be 16 characters. Any more,
* and something could be wrong. Cap it at 16 to be safe
*/
- if (size > 16) {
+ if (len >= sizeof(adev->product_number)) {
DRM_WARN("FRU Product Number is larger than 16 characters. This is likely a mistake");
- size = 16;
+ len = sizeof(adev->product_number) - 1;
}
- memcpy(adev->product_number, &buff[2], size);
- adev->product_number[size] = '\0';
+ memcpy(adev->product_number, &buff[2], len);
+ adev->product_number[len] = '\0';
addrptr += size + 1;
size = amdgpu_fru_read_eeprom(adev, addrptr, buff);
if (size < 1) {
DRM_ERROR("Failed to read FRU product version, ret:%d", size);
- return size;
+ return -EINVAL;
}
addrptr += size + 1;
if (size < 1) {
DRM_ERROR("Failed to read FRU serial number, ret:%d", size);
- return size;
+ return -EINVAL;
}
+ len = size;
/* Serial number should only be 16 characters. Any more,
* and something could be wrong. Cap it at 16 to be safe
*/
- if (size > 16) {
+ if (len >= sizeof(adev->serial)) {
DRM_WARN("FRU Serial Number is larger than 16 characters. This is likely a mistake");
- size = 16;
+ len = sizeof(adev->serial) - 1;
}
- memcpy(adev->serial, &buff[2], size);
- adev->serial[size] = '\0';
+ memcpy(adev->serial, &buff[2], len);
+ adev->serial[len] = '\0';
return 0;
}
uint64_t ring_mem_mc_addr;
void *ring_mem_handle;
uint32_t ring_size;
+ uint32_t ring_wptr;
};
/* More registers may will be supported */
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
if (adev->jpeg.cur_state != AMD_PG_STATE_GATE &&
RREG32_SOC15(JPEG, 0, mmUVD_JRBC_STATUS))
jpeg_v2_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
static int jpeg_v2_5_hw_fini(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- struct amdgpu_ring *ring;
int i;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
for (i = 0; i < adev->jpeg.num_jpeg_inst; ++i) {
if (adev->jpeg.harvest_config & (1 << i))
continue;
- ring = &adev->jpeg.inst[i].ring_dec;
if (adev->jpeg.cur_state != AMD_PG_STATE_GATE &&
RREG32_SOC15(JPEG, i, mmUVD_JRBC_STATUS))
jpeg_v2_5_set_powergating_state(adev, AMD_PG_STATE_GATE);
static int jpeg_v3_0_hw_fini(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- struct amdgpu_ring *ring;
- ring = &adev->jpeg.inst->ring_dec;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
if (adev->jpeg.cur_state != AMD_PG_STATE_GATE &&
RREG32_SOC15(JPEG, 0, mmUVD_JRBC_STATUS))
jpeg_v3_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
struct amdgpu_device *adev = psp->adev;
if (amdgpu_sriov_vf(adev))
- data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102);
+ data = psp->km_ring.ring_wptr;
else
data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67);
if (amdgpu_sriov_vf(adev)) {
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102, value);
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_101, GFX_CTRL_CMD_ID_CONSUME_CMD);
+ psp->km_ring.ring_wptr = value;
} else
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67, value);
}
struct amdgpu_device *adev = psp->adev;
if (amdgpu_sriov_vf(adev))
- data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_102);
+ data = psp->km_ring.ring_wptr;
else
data = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67);
return data;
/* send interrupt to PSP for SRIOV ring write pointer update */
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_101,
GFX_CTRL_CMD_ID_CONSUME_CMD);
+ psp->km_ring.ring_wptr = value;
} else
WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_67, value);
}
error:
dma_fence_put(fence);
+ amdgpu_bo_unpin(bo);
amdgpu_bo_unreserve(bo);
amdgpu_bo_unref(&bo);
return r;
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
- RREG32_SOC15(VCN, 0, mmUVD_STATUS))
+ (adev->vcn.cur_state != AMD_PG_STATE_GATE &&
+ RREG32_SOC15(VCN, 0, mmUVD_STATUS))) {
vcn_v1_0_set_powergating_state(adev, AMD_PG_STATE_GATE);
+ }
return 0;
}
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
(adev->vcn.cur_state != AMD_PG_STATE_GATE &&
RREG32_SOC15(VCN, 0, mmUVD_STATUS)))
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
int i;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
if (adev->vcn.harvest_config & (1 << i))
continue;
static int vcn_v3_0_hw_fini(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- struct amdgpu_ring *ring;
int i;
+ cancel_delayed_work_sync(&adev->vcn.idle_work);
+
for (i = 0; i < adev->vcn.num_vcn_inst; ++i) {
if (adev->vcn.harvest_config & (1 << i))
continue;
- ring = &adev->vcn.inst[i].ring_dec;
-
if (!amdgpu_sriov_vf(adev)) {
if ((adev->pg_flags & AMD_PG_SUPPORT_VCN_DPG) ||
(adev->vcn.cur_state != AMD_PG_STATE_GATE &&
abm->dmcu_is_running = dmcu->funcs->is_dmcu_initialized(dmcu);
}
- adev->dm.dc->ctx->dmub_srv = dc_dmub_srv_create(adev->dm.dc, dmub_srv);
+ if (!adev->dm.dc->ctx->dmub_srv)
+ adev->dm.dc->ctx->dmub_srv = dc_dmub_srv_create(adev->dm.dc, dmub_srv);
if (!adev->dm.dc->ctx->dmub_srv) {
DRM_ERROR("Couldn't allocate DC DMUB server!\n");
return -ENOMEM;
amdgpu_dm_irq_suspend(adev);
-
dc_set_power_state(dm->dc, DC_ACPI_CM_POWER_STATE_D3);
return 0;
struct drm_display_mode saved_mode;
struct drm_display_mode *freesync_mode = NULL;
bool native_mode_found = false;
- bool recalculate_timing = dm_state ? (dm_state->scaling != RMX_OFF) : false;
+ bool recalculate_timing = false;
+ bool scale = dm_state ? (dm_state->scaling != RMX_OFF) : false;
int mode_refresh;
int preferred_refresh = 0;
#if defined(CONFIG_DRM_AMD_DC_DCN)
*/
DRM_DEBUG_DRIVER("No preferred mode found\n");
} else {
- recalculate_timing |= amdgpu_freesync_vid_mode &&
+ recalculate_timing = amdgpu_freesync_vid_mode &&
is_freesync_video_mode(&mode, aconnector);
if (recalculate_timing) {
freesync_mode = get_highest_refresh_rate_mode(aconnector, false);
mode = *freesync_mode;
} else {
decide_crtc_timing_for_drm_display_mode(
- &mode, preferred_mode,
- dm_state ? (dm_state->scaling != RMX_OFF) : false);
- }
+ &mode, preferred_mode, scale);
- preferred_refresh = drm_mode_vrefresh(preferred_mode);
+ preferred_refresh = drm_mode_vrefresh(preferred_mode);
+ }
}
if (recalculate_timing)
* If scaling is enabled and refresh rate didn't change
* we copy the vic and polarities of the old timings
*/
- if (!recalculate_timing || mode_refresh != preferred_refresh)
+ if (!scale || mode_refresh != preferred_refresh)
fill_stream_properties_from_drm_display_mode(
stream, &mode, &aconnector->base, con_state, NULL,
requested_bpc);
if (cursor_scale_w != primary_scale_w ||
cursor_scale_h != primary_scale_h) {
- DRM_DEBUG_ATOMIC("Cursor plane scaling doesn't match primary plane\n");
+ drm_dbg_atomic(crtc->dev, "Cursor plane scaling doesn't match primary plane\n");
return -EINVAL;
}
int i;
struct drm_plane *plane;
struct drm_plane_state *old_plane_state, *new_plane_state;
- struct drm_plane_state *primary_state, *overlay_state = NULL;
+ struct drm_plane_state *primary_state, *cursor_state, *overlay_state = NULL;
/* Check if primary plane is contained inside overlay */
for_each_oldnew_plane_in_state_reverse(state, plane, old_plane_state, new_plane_state, i) {
if (!primary_state->crtc)
return 0;
+ /* check if cursor plane is enabled */
+ cursor_state = drm_atomic_get_plane_state(state, overlay_state->crtc->cursor);
+ if (IS_ERR(cursor_state))
+ return PTR_ERR(cursor_state);
+
+ if (drm_atomic_plane_disabling(plane->state, cursor_state))
+ return 0;
+
/* Perform the bounds check to ensure the overlay plane covers the primary */
if (primary_state->crtc_x < overlay_state->crtc_x ||
primary_state->crtc_y < overlay_state->crtc_y ||
voltage_supported = dcn20_validate_bandwidth_internal(dc, context, false);
dummy_pstate_supported = context->bw_ctx.bw.dcn.clk.p_state_change_support;
- if (voltage_supported && dummy_pstate_supported) {
+ if (voltage_supported && (dummy_pstate_supported || !(context->stream_count))) {
context->bw_ctx.bw.dcn.clk.p_state_change_support = false;
goto restore_dml_state;
}
static int navi10_enable_mgpu_fan_boost(struct smu_context *smu)
{
+ struct smu_table_context *table_context = &smu->smu_table;
+ PPTable_t *smc_pptable = table_context->driver_pptable;
struct amdgpu_device *adev = smu->adev;
uint32_t param = 0;
if (adev->asic_type == CHIP_NAVI12)
return 0;
+ /*
+ * Skip the MGpuFanBoost setting for those ASICs
+ * which do not support it
+ */
+ if (!smc_pptable->MGpuFanBoostLimitRpm)
+ return 0;
+
/* Workaround for WS SKU */
if (adev->pdev->device == 0x7312 &&
adev->pdev->revision == 0)
static int sienna_cichlid_enable_mgpu_fan_boost(struct smu_context *smu)
{
+ struct smu_table_context *table_context = &smu->smu_table;
+ PPTable_t *smc_pptable = table_context->driver_pptable;
+
+ /*
+ * Skip the MGpuFanBoost setting for those ASICs
+ * which do not support it
+ */
+ if (!smc_pptable->MGpuFanBoostLimitRpm)
+ return 0;
+
return smu_cmn_send_smc_msg_with_param(smu,
SMU_MSG_SetMGpuFanBoostLimitRpm,
0,
select INPUT if ACPI
select ACPI_VIDEO if ACPI
select ACPI_BUTTON if ACPI
- select IO_MAPPING
select SYNC_FILE
select IOSF_MBI
select CRC32
return drm_dp_dpcd_write(&intel_dp->aux, DP_PHY_REPEATER_MODE, &val, 1) == 1;
}
-/**
- * intel_dp_init_lttpr_and_dprx_caps - detect LTTPR and DPRX caps, init the LTTPR link training mode
- * @intel_dp: Intel DP struct
- *
- * Read the LTTPR common and DPRX capabilities and switch to non-transparent
- * link training mode if any is detected and read the PHY capabilities for all
- * detected LTTPRs. In case of an LTTPR detection error or if the number of
- * LTTPRs is more than is supported (8), fall back to the no-LTTPR,
- * transparent mode link training mode.
- *
- * Returns:
- * >0 if LTTPRs were detected and the non-transparent LT mode was set. The
- * DPRX capabilities are read out.
- * 0 if no LTTPRs or more than 8 LTTPRs were detected or in case of a
- * detection failure and the transparent LT mode was set. The DPRX
- * capabilities are read out.
- * <0 Reading out the DPRX capabilities failed.
- */
-int intel_dp_init_lttpr_and_dprx_caps(struct intel_dp *intel_dp)
+static int intel_dp_init_lttpr(struct intel_dp *intel_dp)
{
int lttpr_count;
- bool ret;
int i;
- ret = intel_dp_read_lttpr_common_caps(intel_dp);
-
- /* The DPTX shall read the DPRX caps after LTTPR detection. */
- if (drm_dp_read_dpcd_caps(&intel_dp->aux, intel_dp->dpcd)) {
- intel_dp_reset_lttpr_common_caps(intel_dp);
- return -EIO;
- }
-
- if (!ret)
- return 0;
-
- /*
- * The 0xF0000-0xF02FF range is only valid if the DPCD revision is
- * at least 1.4.
- */
- if (intel_dp->dpcd[DP_DPCD_REV] < 0x14) {
- intel_dp_reset_lttpr_common_caps(intel_dp);
+ if (!intel_dp_read_lttpr_common_caps(intel_dp))
return 0;
- }
lttpr_count = drm_dp_lttpr_count(intel_dp->lttpr_common_caps);
/*
return lttpr_count;
}
+
+/**
+ * intel_dp_init_lttpr_and_dprx_caps - detect LTTPR and DPRX caps, init the LTTPR link training mode
+ * @intel_dp: Intel DP struct
+ *
+ * Read the LTTPR common and DPRX capabilities and switch to non-transparent
+ * link training mode if any is detected and read the PHY capabilities for all
+ * detected LTTPRs. In case of an LTTPR detection error or if the number of
+ * LTTPRs is more than is supported (8), fall back to the no-LTTPR,
+ * transparent mode link training mode.
+ *
+ * Returns:
+ * >0 if LTTPRs were detected and the non-transparent LT mode was set. The
+ * DPRX capabilities are read out.
+ * 0 if no LTTPRs or more than 8 LTTPRs were detected or in case of a
+ * detection failure and the transparent LT mode was set. The DPRX
+ * capabilities are read out.
+ * <0 Reading out the DPRX capabilities failed.
+ */
+int intel_dp_init_lttpr_and_dprx_caps(struct intel_dp *intel_dp)
+{
+ int lttpr_count = intel_dp_init_lttpr(intel_dp);
+
+ /* The DPTX shall read the DPRX caps after LTTPR detection. */
+ if (drm_dp_read_dpcd_caps(&intel_dp->aux, intel_dp->dpcd)) {
+ intel_dp_reset_lttpr_common_caps(intel_dp);
+ return -EIO;
+ }
+
+ return lttpr_count;
+}
EXPORT_SYMBOL(intel_dp_init_lttpr_and_dprx_caps);
static u8 dp_voltage_max(u8 preemph)
goto err_unpin;
/* Finally, remap it using the new GTT offset */
- ret = io_mapping_map_user(&ggtt->iomap, area, area->vm_start +
- (vma->ggtt_view.partial.offset << PAGE_SHIFT),
- (ggtt->gmadr.start + vma->node.start) >> PAGE_SHIFT,
- min_t(u64, vma->size, area->vm_end - area->vm_start));
+ ret = remap_io_mapping(area,
+ area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT),
+ (ggtt->gmadr.start + vma->node.start) >> PAGE_SHIFT,
+ min_t(u64, vma->size, area->vm_end - area->vm_start),
+ &ggtt->iomap);
if (ret)
goto err_fence;
struct drm_file *file);
/* i915_mm.c */
+int remap_io_mapping(struct vm_area_struct *vma,
+ unsigned long addr, unsigned long pfn, unsigned long size,
+ struct io_mapping *iomap);
int remap_io_sg(struct vm_area_struct *vma,
unsigned long addr, unsigned long size,
struct scatterlist *sgl, resource_size_t iobase);
resource_size_t iobase;
};
+static int remap_pfn(pte_t *pte, unsigned long addr, void *data)
+{
+ struct remap_pfn *r = data;
+
+ /* Special PTE are not associated with any struct page */
+ set_pte_at(r->mm, addr, pte, pte_mkspecial(pfn_pte(r->pfn, r->prot)));
+ r->pfn++;
+
+ return 0;
+}
+
#define use_dma(io) ((io) != -1)
static inline unsigned long sgt_pfn(const struct remap_pfn *r)
return 0;
}
+/**
+ * remap_io_mapping - remap an IO mapping to userspace
+ * @vma: user vma to map to
+ * @addr: target user address to start at
+ * @pfn: physical address of kernel memory
+ * @size: size of map area
+ * @iomap: the source io_mapping
+ *
+ * Note: this is only safe if the mm semaphore is held when called.
+ */
+int remap_io_mapping(struct vm_area_struct *vma,
+ unsigned long addr, unsigned long pfn, unsigned long size,
+ struct io_mapping *iomap)
+{
+ struct remap_pfn r;
+ int err;
+
#define EXPECTED_FLAGS (VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP)
+ GEM_BUG_ON((vma->vm_flags & EXPECTED_FLAGS) != EXPECTED_FLAGS);
+
+ /* We rely on prevalidation of the io-mapping to skip track_pfn(). */
+ r.mm = vma->vm_mm;
+ r.pfn = pfn;
+ r.prot = __pgprot((pgprot_val(iomap->prot) & _PAGE_CACHE_MASK) |
+ (pgprot_val(vma->vm_page_prot) & ~_PAGE_CACHE_MASK));
+
+ err = apply_to_page_range(r.mm, addr, size, remap_pfn, &r);
+ if (unlikely(err)) {
+ zap_vma_ptes(vma, addr, (r.pfn - pfn) << PAGE_SHIFT);
+ return err;
+ }
+
+ return 0;
+}
/**
* remap_io_sg - remap an IO mapping to userspace
for (n = 0; n < smoke[0].ncontexts; n++) {
smoke[0].contexts[n] = live_context(i915, file);
- if (!smoke[0].contexts[n]) {
- ret = -ENOMEM;
+ if (IS_ERR(smoke[0].contexts[n])) {
+ ret = PTR_ERR(smoke[0].contexts[n]);
goto out_contexts;
}
}
static void meson_drv_shutdown(struct platform_device *pdev)
{
struct meson_drm *priv = dev_get_drvdata(&pdev->dev);
- struct drm_device *drm = priv->drm;
- DRM_DEBUG_DRIVER("\n");
- drm_kms_helper_poll_fini(drm);
- drm_atomic_helper_shutdown(drm);
+ if (!priv)
+ return;
+
+ drm_kms_helper_poll_fini(priv->drm);
+ drm_atomic_helper_shutdown(priv->drm);
}
static int meson_drv_probe(struct platform_device *pdev)
#include "trace.h"
/* XXX move to include/uapi/drm/drm_fourcc.h? */
-#define DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT BIT(22)
+#define DRM_FORMAT_MOD_NVIDIA_SECTOR_LAYOUT BIT_ULL(22)
struct reset_control;
* dGPU sector layout.
*/
if (tegra_plane_state->tiling.sector_layout == TEGRA_BO_SECTOR_LAYOUT_GPU)
- base |= BIT(39);
+ base |= BIT_ULL(39);
#endif
tegra_plane_writel(p, tegra_plane_state->format, DC_WIN_COLOR_DEPTH);
if (err < 0) {
dev_err(sor->dev, "failed to acquire SOR reset: %d\n",
err);
- return err;
+ goto rpm_put;
}
err = reset_control_assert(sor->rst);
if (err < 0) {
dev_err(sor->dev, "failed to assert SOR reset: %d\n",
err);
- return err;
+ goto rpm_put;
}
}
err = clk_prepare_enable(sor->clk);
if (err < 0) {
dev_err(sor->dev, "failed to enable clock: %d\n", err);
- return err;
+ goto rpm_put;
}
usleep_range(1000, 3000);
dev_err(sor->dev, "failed to deassert SOR reset: %d\n",
err);
clk_disable_unprepare(sor->clk);
- return err;
+ goto rpm_put;
}
reset_control_release(sor->rst);
}
return 0;
+
+rpm_put:
+ if (sor->rst)
+ pm_runtime_put(sor->dev);
+
+ return err;
}
static int tegra_sor_exit(struct host1x_client *client)
if (!sor->aux)
return -EPROBE_DEFER;
- if (get_device(&sor->aux->ddc.dev)) {
- if (try_module_get(sor->aux->ddc.owner))
- sor->output.ddc = &sor->aux->ddc;
- else
- put_device(&sor->aux->ddc.dev);
- }
+ if (get_device(sor->aux->dev))
+ sor->output.ddc = &sor->aux->ddc;
}
if (!sor->aux) {
err = tegra_sor_parse_dt(sor);
if (err < 0)
- return err;
+ goto put_aux;
err = tegra_output_probe(&sor->output);
- if (err < 0)
- return dev_err_probe(&pdev->dev, err,
- "failed to probe output\n");
+ if (err < 0) {
+ dev_err_probe(&pdev->dev, err, "failed to probe output\n");
+ goto put_aux;
+ }
if (sor->ops && sor->ops->probe) {
err = sor->ops->probe(sor);
platform_set_drvdata(pdev, sor);
pm_runtime_enable(&pdev->dev);
- INIT_LIST_HEAD(&sor->client.list);
+ host1x_client_init(&sor->client);
sor->client.ops = &sor_client_ops;
sor->client.dev = &pdev->dev;
- err = host1x_client_register(&sor->client);
- if (err < 0) {
- dev_err(&pdev->dev, "failed to register host1x client: %d\n",
- err);
- goto rpm_disable;
- }
-
/*
* On Tegra210 and earlier, provide our own implementation for the
* pad output clock.
sor->index);
if (!name) {
err = -ENOMEM;
- goto unregister;
+ goto uninit;
}
err = host1x_client_resume(&sor->client);
if (err < 0) {
dev_err(sor->dev, "failed to resume: %d\n", err);
- goto unregister;
+ goto uninit;
}
sor->clk_pad = tegra_clk_sor_pad_register(sor, name);
err = PTR_ERR(sor->clk_pad);
dev_err(sor->dev, "failed to register SOR pad clock: %d\n",
err);
- goto unregister;
+ goto uninit;
+ }
+
+ err = __host1x_client_register(&sor->client);
+ if (err < 0) {
+ dev_err(&pdev->dev, "failed to register host1x client: %d\n",
+ err);
+ goto uninit;
}
return 0;
-unregister:
- host1x_client_unregister(&sor->client);
-rpm_disable:
+uninit:
+ host1x_client_exit(&sor->client);
pm_runtime_disable(&pdev->dev);
remove:
+ if (sor->aux)
+ sor->output.ddc = NULL;
+
tegra_output_remove(&sor->output);
+put_aux:
+ if (sor->aux)
+ put_device(sor->aux->dev);
+
return err;
}
pm_runtime_disable(&pdev->dev);
+ if (sor->aux) {
+ put_device(sor->aux->dev);
+ sor->output.ddc = NULL;
+ }
+
tegra_output_remove(&sor->output);
return 0;
list_for_each_entry(bo, &man->lru[j], lru) {
uint32_t num_pages;
- if (!bo->ttm ||
+ if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) ||
bo->ttm->page_flags & TTM_PAGE_FLAG_SG ||
bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)
continue;
EXPORT_SYMBOL(host1x_driver_unregister);
/**
+ * __host1x_client_init() - initialize a host1x client
+ * @client: host1x client
+ * @key: lock class key for the client-specific mutex
+ */
+void __host1x_client_init(struct host1x_client *client, struct lock_class_key *key)
+{
+ INIT_LIST_HEAD(&client->list);
+ __mutex_init(&client->lock, "host1x client lock", key);
+ client->usecount = 0;
+}
+EXPORT_SYMBOL(__host1x_client_init);
+
+/**
+ * host1x_client_exit() - uninitialize a host1x client
+ * @client: host1x client
+ */
+void host1x_client_exit(struct host1x_client *client)
+{
+ mutex_destroy(&client->lock);
+}
+EXPORT_SYMBOL(host1x_client_exit);
+
+/**
* __host1x_client_register() - register a host1x client
* @client: host1x client
* @key: lock class key for the client-specific mutex
* device and call host1x_device_init(), which will in turn call each client's
* &host1x_client_ops.init implementation.
*/
-int __host1x_client_register(struct host1x_client *client,
- struct lock_class_key *key)
+int __host1x_client_register(struct host1x_client *client)
{
struct host1x *host1x;
int err;
- INIT_LIST_HEAD(&client->list);
- __mutex_init(&client->lock, "host1x client lock", key);
- client->usecount = 0;
-
mutex_lock(&devices_lock);
list_for_each_entry(host1x, &devices, list) {
depends on HID
config HID_A4TECH
- tristate "A4 tech mice"
+ tristate "A4TECH mice"
depends on HID
default !EXPERT
help
- Support for A4 tech X5 and WOP-35 / Trust 450L mice.
+ Support for some A4TECH mice with two scroll wheels.
config HID_ACCUTOUCH
tristate "Accutouch touch device"
help
Support for Samsung InfraRed remote control or keyboards.
+config HID_SEMITEK
+ tristate "Semitek USB keyboards"
+ depends on HID
+ help
+ Support for Semitek USB keyboards that are not fully compliant
+ with the HID standard.
+
+ There are many variants, including:
+ - GK61, GK64, GK68, GK84, GK96, etc.
+ - SK61, SK64, SK68, SK84, SK96, etc.
+ - Dierya DK61/DK66
+ - Tronsmart TK09R
+ - Woo-dy
+ - X-Bows Nature/Knight
+
config HID_SONY
tristate "Sony PS2/3/4 accessories"
depends on USB_HID
obj-$(CONFIG_HID_RMI) += hid-rmi.o
obj-$(CONFIG_HID_SAITEK) += hid-saitek.o
obj-$(CONFIG_HID_SAMSUNG) += hid-samsung.o
+obj-$(CONFIG_HID_SEMITEK) += hid-semitek.o
obj-$(CONFIG_HID_SMARTJOYPLUS) += hid-sjoy.o
obj-$(CONFIG_HID_SONY) += hid-sony.o
obj-$(CONFIG_HID_SPEEDLINK) += hid-speedlink.o
sensor_index = req_node->sensor_idx;
report_id = req_node->report_id;
node_type = req_node->report_type;
+ kfree(req_node);
if (node_type == HID_FEATURE_REPORT) {
report_size = get_feature_report(sensor_index, report_id,
int rc, i;
dev = &privdata->pdev->dev;
- cl_data = kzalloc(sizeof(*cl_data), GFP_KERNEL);
+ cl_data = devm_kzalloc(dev, sizeof(*cl_data), GFP_KERNEL);
if (!cl_data)
return -ENOMEM;
rc = -EINVAL;
goto cleanup;
}
- cl_data->feature_report[i] = kzalloc(feature_report_size, GFP_KERNEL);
+ cl_data->feature_report[i] = devm_kzalloc(dev, feature_report_size, GFP_KERNEL);
if (!cl_data->feature_report[i]) {
rc = -ENOMEM;
goto cleanup;
}
- cl_data->input_report[i] = kzalloc(input_report_size, GFP_KERNEL);
+ cl_data->input_report[i] = devm_kzalloc(dev, input_report_size, GFP_KERNEL);
if (!cl_data->input_report[i]) {
rc = -ENOMEM;
goto cleanup;
info.sensor_idx = cl_idx;
info.dma_address = cl_data->sensor_dma_addr[i];
- cl_data->report_descr[i] = kzalloc(cl_data->report_descr_sz[i], GFP_KERNEL);
+ cl_data->report_descr[i] =
+ devm_kzalloc(dev, cl_data->report_descr_sz[i], GFP_KERNEL);
if (!cl_data->report_descr[i]) {
rc = -ENOMEM;
goto cleanup;
cl_data->sensor_virt_addr[i],
cl_data->sensor_dma_addr[i]);
}
- kfree(cl_data->feature_report[i]);
- kfree(cl_data->input_report[i]);
- kfree(cl_data->report_descr[i]);
+ devm_kfree(dev, cl_data->feature_report[i]);
+ devm_kfree(dev, cl_data->input_report[i]);
+ devm_kfree(dev, cl_data->report_descr[i]);
}
- kfree(cl_data);
+ devm_kfree(dev, cl_data);
return rc;
}
cl_data->sensor_dma_addr[i]);
}
}
- kfree(cl_data);
return 0;
}
int i;
for (i = 0; i < cli_data->num_hid_devices; ++i) {
- kfree(cli_data->feature_report[i]);
- kfree(cli_data->input_report[i]);
- kfree(cli_data->report_descr[i]);
if (cli_data->hid_sensor_hubs[i]) {
kfree(cli_data->hid_sensor_hubs[i]->driver_data);
hid_destroy_device(cli_data->hid_sensor_hubs[i]);
.driver_data = A4_2WHEEL_MOUSE_HACK_B8 },
{ HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_RP_649),
.driver_data = A4_2WHEEL_MOUSE_HACK_B8 },
+ { HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_NB_95),
+ .driver_data = A4_2WHEEL_MOUSE_HACK_B8 },
{ }
};
MODULE_DEVICE_TABLE(hid, a4_devices);
#define QUIRK_T100_KEYBOARD BIT(6)
#define QUIRK_T100CHI BIT(7)
#define QUIRK_G752_KEYBOARD BIT(8)
-#define QUIRK_T101HA_DOCK BIT(9)
-#define QUIRK_T90CHI BIT(10)
-#define QUIRK_MEDION_E1239T BIT(11)
-#define QUIRK_ROG_NKEY_KEYBOARD BIT(12)
+#define QUIRK_T90CHI BIT(9)
+#define QUIRK_MEDION_E1239T BIT(10)
+#define QUIRK_ROG_NKEY_KEYBOARD BIT(11)
#define I2C_KEYBOARD_QUIRKS (QUIRK_FIX_NOTEBOOK_REPORT | \
QUIRK_NO_INIT_REPORTS | \
if (drvdata->quirks & QUIRK_MEDION_E1239T)
return asus_e1239t_event(drvdata, data, size);
- if (drvdata->quirks & QUIRK_ROG_NKEY_KEYBOARD) {
+ if (drvdata->quirks & QUIRK_USE_KBD_BACKLIGHT) {
/*
* Skip these report ID, the device emits a continuous stream associated
* with the AURA mode it is in which looks like an 'echo'.
return -1;
}
}
+ if (drvdata->quirks & QUIRK_ROG_NKEY_KEYBOARD) {
+ /*
+ * G713 and G733 send these codes on some keypresses, depending on
+ * the key pressed it can trigger a shutdown event if not caught.
+ */
+ if(data[0] == 0x02 && data[1] == 0x30) {
+ return -1;
+ }
+ }
+
}
return 0;
return ret;
}
- /* use hid-multitouch for T101HA touchpad */
- if (id->driver_data & QUIRK_T101HA_DOCK &&
- hdev->collection->usage == HID_GD_MOUSE)
- return -ENODEV;
-
ret = hid_hw_start(hdev, HID_CONNECT_DEFAULT);
if (ret) {
hid_err(hdev, "Asus hw start failed: %d\n", ret);
{ HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
USB_DEVICE_ID_ASUSTEK_T100TAF_KEYBOARD),
QUIRK_T100_KEYBOARD | QUIRK_NO_CONSUMER_USAGES },
- { HID_USB_DEVICE(USB_VENDOR_ID_ASUSTEK,
- USB_DEVICE_ID_ASUSTEK_T101HA_KEYBOARD), QUIRK_T101HA_DOCK },
{ HID_USB_DEVICE(USB_VENDOR_ID_CHICONY, USB_DEVICE_ID_ASUS_AK1D) },
{ HID_USB_DEVICE(USB_VENDOR_ID_TURBOX, USB_DEVICE_ID_ASUS_MD_5110) },
{ HID_USB_DEVICE(USB_VENDOR_ID_JESS, USB_DEVICE_ID_ASUS_MD_5112) },
USB_DEVICE_ID_ASUSTEK_T100CHI_KEYBOARD), QUIRK_T100CHI },
{ HID_USB_DEVICE(USB_VENDOR_ID_ITE, USB_DEVICE_ID_ITE_MEDION_E1239T),
QUIRK_MEDION_E1239T },
+ /*
+ * Note bind to the HID_GROUP_GENERIC group, so that we only bind to the keyboard
+ * part, while letting hid-multitouch.c handle the touchpad.
+ */
+ { HID_DEVICE(BUS_USB, HID_GROUP_GENERIC,
+ USB_VENDOR_ID_ASUSTEK, USB_DEVICE_ID_ASUSTEK_T101HA_KEYBOARD) },
{ }
};
MODULE_DEVICE_TABLE(hid, asus_devices);
case BUS_I2C:
bus = "I2C";
break;
+ case BUS_VIRTUAL:
+ bus = "VIRTUAL";
+ break;
default:
bus = "<UNKNOWN>";
}
return 0;
}
-
EXPORT_SYMBOL_GPL(hid_check_keys_pressed);
static int __init hid_init(void)
[KEY_APPSELECT] = "AppSelect",
[KEY_SCREENSAVER] = "ScreenSaver",
[KEY_VOICECOMMAND] = "VoiceCommand",
+ [KEY_ASSISTANT] = "Assistant",
+ [KEY_KBD_LAYOUT_NEXT] = "KbdLayoutNext",
+ [KEY_EMOJI_PICKER] = "EmojiPicker",
[KEY_BRIGHTNESS_MIN] = "BrightnessMin",
[KEY_BRIGHTNESS_MAX] = "BrightnessMax",
[KEY_BRIGHTNESS_AUTO] = "BrightnessAuto",
u8 address; /* 7-bit I2C address */
u8 flag; /* I2C transaction condition */
u8 length; /* data payload length */
- u8 data[60]; /* data payload */
+ u8 data[FT260_WR_DATA_MAX]; /* data payload */
} __packed;
struct ft260_i2c_read_request_report {
ret = hid_hw_raw_request(hdev, report_id, buf, len, HID_FEATURE_REPORT,
HID_REQ_GET_REPORT);
- memcpy(data, buf, len);
+ if (likely(ret == len))
+ memcpy(data, buf, len);
+ else if (ret >= 0)
+ ret = -EIO;
kfree(buf);
return ret;
}
ret = ft260_hid_feature_report_get(hdev, FT260_I2C_STATUS,
(u8 *)&report, sizeof(report));
- if (ret < 0) {
+ if (unlikely(ret < 0)) {
hid_err(hdev, "failed to retrieve status: %d\n", ret);
return ret;
}
struct ft260_i2c_write_request_report *rep =
(struct ft260_i2c_write_request_report *)dev->write_buf;
+ if (data_len >= sizeof(rep->data))
+ return -EINVAL;
+
rep->address = addr;
rep->data[0] = cmd;
rep->length = data_len + 1;
ret = ft260_hid_feature_report_get(hdev, FT260_SYSTEM_SETTINGS,
(u8 *)cfg, len);
- if (ret != len) {
+ if (ret < 0) {
hid_err(hdev, "failed to retrieve system status\n");
- if (ret >= 0)
- return -EIO;
+ return ret;
}
return 0;
}
int ret;
ret = ft260_hid_feature_report_get(hdev, id, cfg, len);
- if (ret != len && ret >= 0)
- return -EIO;
+ if (ret < 0)
+ return ret;
return scnprintf(buf, PAGE_SIZE, "%hi\n", *field);
}
int ret;
ret = ft260_hid_feature_report_get(hdev, id, cfg, len);
- if (ret != len && ret >= 0)
- return -EIO;
+ if (ret < 0)
+ return ret;
return scnprintf(buf, PAGE_SIZE, "%hi\n", le16_to_cpu(*field));
}
ret = ft260_hid_feature_report_get(hdev, FT260_CHIP_VERSION,
(u8 *)&version, sizeof(version));
- if (ret != sizeof(version)) {
+ if (ret < 0) {
hid_err(hdev, "failed to retrieve chip version\n");
- if (ret >= 0)
- ret = -EIO;
goto err_hid_close;
}
{ HID_USB_DEVICE(USB_VENDOR_ID_MSI, USB_DEVICE_ID_MSI_GT683R_LED_PANEL) },
{ }
};
+MODULE_DEVICE_TABLE(hid, gt683r_led_id);
static void gt683r_brightness_set(struct led_classdev *led_cdev,
enum led_brightness brightness)
#define USB_DEVICE_ID_A4TECH_WCP32PU 0x0006
#define USB_DEVICE_ID_A4TECH_X5_005D 0x000a
#define USB_DEVICE_ID_A4TECH_RP_649 0x001a
+#define USB_DEVICE_ID_A4TECH_NB_95 0x022b
#define USB_VENDOR_ID_AASHIMA 0x06d6
#define USB_DEVICE_ID_AASHIMA_GAMEPAD 0x0025
#define USB_VENDOR_ID_CORSAIR 0x1b1c
#define USB_DEVICE_ID_CORSAIR_K90 0x1b02
-
-#define USB_VENDOR_ID_CORSAIR 0x1b1c
#define USB_DEVICE_ID_CORSAIR_K70R 0x1b09
#define USB_DEVICE_ID_CORSAIR_K95RGB 0x1b11
#define USB_DEVICE_ID_CORSAIR_M65RGB 0x1b12
#define USB_DEVICE_ID_LENOVO_X1_COVER 0x6085
#define USB_DEVICE_ID_LENOVO_X1_TAB 0x60a3
#define USB_DEVICE_ID_LENOVO_X1_TAB3 0x60b5
+#define USB_DEVICE_ID_LENOVO_OPTICAL_USB_MOUSE_600E 0x600e
#define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_608D 0x608d
#define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019 0x6019
#define USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_602E 0x602e
#define USB_DEVICE_ID_SAITEK_X52 0x075c
#define USB_DEVICE_ID_SAITEK_X52_2 0x0255
#define USB_DEVICE_ID_SAITEK_X52_PRO 0x0762
+#define USB_DEVICE_ID_SAITEK_X65 0x0b6a
#define USB_VENDOR_ID_SAMSUNG 0x0419
#define USB_DEVICE_ID_SAMSUNG_IR_REMOTE 0x0001
#define USB_DEVICE_ID_SEMICO_USB_KEYKOARD 0x0023
#define USB_DEVICE_ID_SEMICO_USB_KEYKOARD2 0x0027
+#define USB_VENDOR_ID_SEMITEK 0x1ea7
+#define USB_DEVICE_ID_SEMITEK_KEYBOARD 0x0907
+
#define USB_VENDOR_ID_SENNHEISER 0x1395
#define USB_DEVICE_ID_SENNHEISER_BTD500USB 0x002c
#define USB_DEVICE_ID_SYNAPTICS_DELL_K12A 0x2819
#define USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5_012 0x2968
#define USB_DEVICE_ID_SYNAPTICS_TP_V103 0x5710
+#define USB_DEVICE_ID_SYNAPTICS_DELL_K15A 0x6e21
#define USB_DEVICE_ID_SYNAPTICS_ACER_ONE_S1002 0x73f4
#define USB_DEVICE_ID_SYNAPTICS_ACER_ONE_S1003 0x73f5
#define USB_DEVICE_ID_SYNAPTICS_ACER_SWITCH5 0x81a7
case 0x0cd: map_key_clear(KEY_PLAYPAUSE); break;
case 0x0cf: map_key_clear(KEY_VOICECOMMAND); break;
+
+ case 0x0d9: map_key_clear(KEY_EMOJI_PICKER); break;
+
case 0x0e0: map_abs_clear(ABS_VOLUME); break;
case 0x0e2: map_key_clear(KEY_MUTE); break;
case 0x0e5: map_key_clear(KEY_BASSBOOST); break;
int status;
long flags = (long) data[2];
+ *level = POWER_SUPPLY_CAPACITY_LEVEL_UNKNOWN;
if (flags & 0x80)
switch (flags & 0x07) {
if (id->vendor == USB_VENDOR_ID_APPLE &&
id->product == USB_DEVICE_ID_APPLE_MAGICTRACKPAD2 &&
hdev->type != HID_TYPE_USBMOUSE)
- return 0;
+ return -ENODEV;
msc = devm_kzalloc(&hdev->dev, sizeof(*msc), GFP_KERNEL);
if (msc == NULL) {
static void magicmouse_remove(struct hid_device *hdev)
{
struct magicmouse_sc *msc = hid_get_drvdata(hdev);
- cancel_delayed_work_sync(&msc->work);
+
+ if (msc)
+ cancel_delayed_work_sync(&msc->work);
+
hid_hw_stop(hdev);
}
#define MT_QUIRK_WIN8_PTP_BUTTONS BIT(18)
#define MT_QUIRK_SEPARATE_APP_REPORT BIT(19)
#define MT_QUIRK_FORCE_MULTI_INPUT BIT(20)
+#define MT_QUIRK_DISABLE_WAKEUP BIT(21)
#define MT_INPUTMODE_TOUCHSCREEN 0x02
#define MT_INPUTMODE_TOUCHPAD 0x03
#define MT_CLS_EXPORT_ALL_INPUTS 0x0013
/* reserved 0x0014 */
#define MT_CLS_WIN_8_FORCE_MULTI_INPUT 0x0015
+#define MT_CLS_WIN_8_DISABLE_WAKEUP 0x0016
/* vendor specific classes */
#define MT_CLS_3M 0x0101
MT_QUIRK_WIN8_PTP_BUTTONS |
MT_QUIRK_FORCE_MULTI_INPUT,
.export_all_inputs = true },
+ { .name = MT_CLS_WIN_8_DISABLE_WAKEUP,
+ .quirks = MT_QUIRK_ALWAYS_VALID |
+ MT_QUIRK_IGNORE_DUPLICATES |
+ MT_QUIRK_HOVERING |
+ MT_QUIRK_CONTACT_CNT_ACCURATE |
+ MT_QUIRK_STICKY_FINGERS |
+ MT_QUIRK_WIN8_PTP_BUTTONS |
+ MT_QUIRK_DISABLE_WAKEUP,
+ .export_all_inputs = true },
/*
* vendor specific classes
if (!(HID_MAIN_ITEM_VARIABLE & field->flags))
continue;
- for (n = 0; n < field->report_count; n++) {
- if (field->usage[n].hid == HID_DG_CONTACTID)
- rdata->is_mt_collection = true;
+ if (field->logical == HID_DG_FINGER || td->hdev->group != HID_GROUP_MULTITOUCH_WIN_8) {
+ for (n = 0; n < field->report_count; n++) {
+ if (field->usage[n].hid == HID_DG_CONTACTID) {
+ rdata->is_mt_collection = true;
+ break;
+ }
+ }
}
}
return 1;
case HID_DG_CONFIDENCE:
if ((cls->name == MT_CLS_WIN_8 ||
- cls->name == MT_CLS_WIN_8_FORCE_MULTI_INPUT) &&
+ cls->name == MT_CLS_WIN_8_FORCE_MULTI_INPUT ||
+ cls->name == MT_CLS_WIN_8_DISABLE_WAKEUP) &&
(field->application == HID_DG_TOUCHPAD ||
field->application == HID_DG_TOUCHSCREEN))
app->quirks |= MT_QUIRK_CONFIDENCE;
/* we do not set suffix = "Touchscreen" */
hi->input->name = hdev->name;
break;
- case HID_DG_STYLUS:
- /* force BTN_STYLUS to allow tablet matching in udev */
- __set_bit(BTN_STYLUS, hi->input->keybit);
- break;
case HID_VD_ASUS_CUSTOM_MEDIA_KEYS:
suffix = "Custom Media Keys";
break;
+ case HID_DG_STYLUS:
+ /* force BTN_STYLUS to allow tablet matching in udev */
+ __set_bit(BTN_STYLUS, hi->input->keybit);
+ fallthrough;
case HID_DG_PEN:
suffix = "Stylus";
break;
#ifdef CONFIG_PM
static int mt_suspend(struct hid_device *hdev, pm_message_t state)
{
+ struct mt_device *td = hid_get_drvdata(hdev);
+
/* High latency is desirable for power savings during S3/S0ix */
- mt_set_modes(hdev, HID_LATENCY_HIGH, true, true);
+ if (td->mtclass.quirks & MT_QUIRK_DISABLE_WAKEUP)
+ mt_set_modes(hdev, HID_LATENCY_HIGH, false, false);
+ else
+ mt_set_modes(hdev, HID_LATENCY_HIGH, true, true);
+
return 0;
}
MT_USB_DEVICE(USB_VENDOR_ID_ANTON,
USB_DEVICE_ID_ANTON_TOUCH_PAD) },
+ /* Asus T101HA */
+ { .driver_data = MT_CLS_WIN_8_DISABLE_WAKEUP,
+ HID_DEVICE(BUS_USB, HID_GROUP_MULTITOUCH_WIN_8,
+ USB_VENDOR_ID_ASUSTEK,
+ USB_DEVICE_ID_ASUSTEK_T101HA_KEYBOARD) },
+
/* Asus T304UA */
{ .driver_data = MT_CLS_ASUS,
HID_DEVICE(BUS_USB, HID_GROUP_MULTITOUCH_WIN_8,
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_PENSKETCH_M912), HID_QUIRK_MULTI_INPUT },
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_KYE_EASYPEN_M406XE), HID_QUIRK_MULTI_INPUT },
{ HID_USB_DEVICE(USB_VENDOR_ID_KYE, USB_DEVICE_ID_PIXART_USB_OPTICAL_MOUSE_ID2), HID_QUIRK_ALWAYS_POLL },
+ { HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_OPTICAL_USB_MOUSE_600E), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_608D), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_6019), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_LENOVO, USB_DEVICE_ID_LENOVO_PIXART_USB_MOUSE_602E), HID_QUIRK_ALWAYS_POLL },
{ HID_USB_DEVICE(USB_VENDOR_ID_SAITEK, USB_DEVICE_ID_SAITEK_X52), HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE },
{ HID_USB_DEVICE(USB_VENDOR_ID_SAITEK, USB_DEVICE_ID_SAITEK_X52_2), HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE },
{ HID_USB_DEVICE(USB_VENDOR_ID_SAITEK, USB_DEVICE_ID_SAITEK_X52_PRO), HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE },
+ { HID_USB_DEVICE(USB_VENDOR_ID_SAITEK, USB_DEVICE_ID_SAITEK_X65), HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE },
{ HID_USB_DEVICE(USB_VENDOR_ID_SEMICO, USB_DEVICE_ID_SEMICO_USB_KEYKOARD2), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_SEMICO, USB_DEVICE_ID_SEMICO_USB_KEYKOARD), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_SENNHEISER, USB_DEVICE_ID_SENNHEISER_BTD500USB), HID_QUIRK_NOGET },
{ HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS, USB_DEVICE_ID_SYNAPTICS_QUAD_HD), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS, USB_DEVICE_ID_SYNAPTICS_TP_V103), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS, USB_DEVICE_ID_SYNAPTICS_DELL_K12A), HID_QUIRK_NO_INIT_REPORTS },
+ { HID_USB_DEVICE(USB_VENDOR_ID_SYNAPTICS, USB_DEVICE_ID_SYNAPTICS_DELL_K15A), HID_QUIRK_NO_INIT_REPORTS },
{ HID_USB_DEVICE(USB_VENDOR_ID_TOPMAX, USB_DEVICE_ID_TOPMAX_COBRAPAD), HID_QUIRK_BADPAD },
{ HID_USB_DEVICE(USB_VENDOR_ID_TOUCHPACK, USB_DEVICE_ID_TOUCHPACK_RTS), HID_QUIRK_MULTI_INPUT },
{ HID_USB_DEVICE(USB_VENDOR_ID_TPV, USB_DEVICE_ID_TPV_OPTICAL_TOUCHSCREEN_8882), HID_QUIRK_NOGET },
{ HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_WCP32PU) },
{ HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_X5_005D) },
{ HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_RP_649) },
+ { HID_USB_DEVICE(USB_VENDOR_ID_A4TECH, USB_DEVICE_ID_A4TECH_NB_95) },
#endif
#if IS_ENABLED(CONFIG_HID_ACCUTOUCH)
{ HID_USB_DEVICE(USB_VENDOR_ID_ELO, USB_DEVICE_ID_ELO_ACCUTOUCH_2216) },
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * HID driver for Semitek keyboards
+ *
+ * Copyright (c) 2021 Benjamin Moody
+ */
+
+#include <linux/device.h>
+#include <linux/hid.h>
+#include <linux/module.h>
+
+#include "hid-ids.h"
+
+static __u8 *semitek_report_fixup(struct hid_device *hdev, __u8 *rdesc,
+ unsigned int *rsize)
+{
+ /* In the report descriptor for interface 2, fix the incorrect
+ description of report ID 0x04 (the report contains a
+ bitmask, not an array of keycodes.) */
+ if (*rsize == 0xcb && rdesc[0x83] == 0x81 && rdesc[0x84] == 0x00) {
+ hid_info(hdev, "fixing up Semitek report descriptor\n");
+ rdesc[0x84] = 0x02;
+ }
+ return rdesc;
+}
+
+static const struct hid_device_id semitek_devices[] = {
+ { HID_USB_DEVICE(USB_VENDOR_ID_SEMITEK, USB_DEVICE_ID_SEMITEK_KEYBOARD) },
+ { }
+};
+MODULE_DEVICE_TABLE(hid, semitek_devices);
+
+static struct hid_driver semitek_driver = {
+ .name = "semitek",
+ .id_table = semitek_devices,
+ .report_fixup = semitek_report_fixup,
+};
+module_hid_driver(semitek_driver);
+
+MODULE_LICENSE("GPL");
struct hid_sensor_custom *sensor_inst = dev_get_drvdata(dev);
int index, field_index, usage;
char name[HID_CUSTOM_NAME_LENGTH];
- int value;
+ int value, ret;
if (sscanf(attr->attr.name, "feature-%x-%x-%s", &index, &usage,
name) == 3) {
report_id = sensor_inst->fields[field_index].attribute.
report_id;
- sensor_hub_set_feature(sensor_inst->hsdev, report_id,
- index, sizeof(value), &value);
+ ret = sensor_hub_set_feature(sensor_inst->hsdev, report_id,
+ index, sizeof(value), &value);
+ if (ret)
+ return ret;
} else
return -EINVAL;
buffer_size = buffer_size / sizeof(__s32);
if (buffer_size) {
for (i = 0; i < buffer_size; ++i) {
- hid_set_field(report->field[field_index], i,
- (__force __s32)cpu_to_le32(*buf32));
+ ret = hid_set_field(report->field[field_index], i,
+ (__force __s32)cpu_to_le32(*buf32));
+ if (ret)
+ goto done_proc;
+
++buf32;
}
}
if (remaining_bytes) {
value = 0;
memcpy(&value, (u8 *)buf32, remaining_bytes);
- hid_set_field(report->field[field_index], i,
- (__force __s32)cpu_to_le32(value));
+ ret = hid_set_field(report->field[field_index], i,
+ (__force __s32)cpu_to_le32(value));
+ if (ret)
+ goto done_proc;
}
hid_hw_request(hsdev->hdev, report, HID_REQ_SET_REPORT);
hid_hw_wait(hsdev->hdev);
}
tm_wheel->change_request = kzalloc(sizeof(struct usb_ctrlrequest), GFP_KERNEL);
- if (!tm_wheel->model_request) {
+ if (!tm_wheel->change_request) {
ret = -ENOMEM;
goto error5;
}
#define I2C_HID_QUIRK_BOGUS_IRQ BIT(4)
#define I2C_HID_QUIRK_RESET_ON_RESUME BIT(5)
#define I2C_HID_QUIRK_BAD_INPUT_SIZE BIT(6)
+#define I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET BIT(7)
/* flags */
I2C_HID_QUIRK_RESET_ON_RESUME },
{ USB_VENDOR_ID_ITE, I2C_DEVICE_ID_ITE_LENOVO_LEGION_Y720,
I2C_HID_QUIRK_BAD_INPUT_SIZE },
+ /*
+ * Sending the wakeup after reset actually break ELAN touchscreen controller
+ */
+ { USB_VENDOR_ID_ELAN, HID_ANY_ID,
+ I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET },
{ 0, 0 }
};
}
/* At least some SIS devices need this after reset */
- ret = i2c_hid_set_power(client, I2C_HID_PWR_ON);
+ if (!(ihid->quirks & I2C_HID_QUIRK_NO_WAKEUP_AFTER_RESET))
+ ret = i2c_hid_set_power(client, I2C_HID_PWR_ON);
out_unlock:
mutex_unlock(&ihid->reset_lock);
hid->vendor = le16_to_cpu(ihid->hdesc.wVendorID);
hid->product = le16_to_cpu(ihid->hdesc.wProductID);
- snprintf(hid->name, sizeof(hid->name), "%s %04hX:%04hX",
- client->name, hid->vendor, hid->product);
+ snprintf(hid->name, sizeof(hid->name), "%s %04X:%04X",
+ client->name, (u16)hid->vendor, (u16)hid->product);
strlcpy(hid->phys, dev_name(&client->dev), sizeof(hid->phys));
ihid->quirks = i2c_hid_lookup_quirk(hid->vendor, hid->product);
#define EHL_Ax_DEVICE_ID 0x4BB3
#define TGL_LP_DEVICE_ID 0xA0FC
#define TGL_H_DEVICE_ID 0x43FC
+#define ADL_S_DEVICE_ID 0x7AF8
+#define ADL_P_DEVICE_ID 0x51FC
#define REVISION_ID_CHT_A0 0x6
#define REVISION_ID_CHT_Ax_SI 0x0
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, EHL_Ax_DEVICE_ID)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, TGL_LP_DEVICE_ID)},
{PCI_DEVICE(PCI_VENDOR_ID_INTEL, TGL_H_DEVICE_ID)},
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, ADL_S_DEVICE_ID)},
+ {PCI_DEVICE(PCI_VENDOR_ID_INTEL, ADL_P_DEVICE_ID)},
{0, }
};
MODULE_DEVICE_TABLE(pci, ish_pci_tbl);
shid->hid->dev.parent = shid->dev;
shid->hid->bus = BUS_HOST;
- shid->hid->vendor = cpu_to_le16(shid->attrs.vendor);
- shid->hid->product = cpu_to_le16(shid->attrs.product);
- shid->hid->version = cpu_to_le16(shid->hid_desc.hid_version);
+ shid->hid->vendor = get_unaligned_le16(&shid->attrs.vendor);
+ shid->hid->product = get_unaligned_le16(&shid->attrs.product);
+ shid->hid->version = get_unaligned_le16(&shid->hid_desc.hid_version);
shid->hid->country = shid->hid_desc.country_code;
snprintf(shid->hid->name, sizeof(shid->hid->name), "Microsoft Surface %04X:%04X",
raw_report = usbhid->ctrl[usbhid->ctrltail].raw_report;
dir = usbhid->ctrl[usbhid->ctrltail].dir;
- len = ((report->size - 1) >> 3) + 1 + (report->id > 0);
+ len = hid_report_len(report);
if (dir == USB_DIR_OUT) {
usbhid->urbctrl->pipe = usb_sndctrlpipe(hid_to_usb_dev(hid), 0);
usbhid->urbctrl->transfer_buffer_length = len;
if (pidff->pool[PID_DEVICE_MANAGED_POOL].value &&
pidff->pool[PID_DEVICE_MANAGED_POOL].value[0] == 0) {
+ error = -EPERM;
hid_notice(hid,
"device does not support device managed pool\n");
goto fail;
static umode_t i8k_is_visible(struct kobject *kobj, struct attribute *attr,
int index)
{
- if (disallow_fan_support && index >= 8)
+ if (disallow_fan_support && index >= 20)
return 0;
if (disallow_fan_type_call &&
- (index == 9 || index == 12 || index == 15))
+ (index == 21 || index == 25 || index == 28))
return 0;
if (index >= 0 && index <= 1 &&
!(i8k_hwmon_flags & I8K_HWMON_HAVE_TEMP1))
struct pmbus_driver_info info;
int chip;
int page;
+
+ bool vout_linear_11;
};
#define to_fsp3y_data(x) container_of(x, struct fsp3y_data, info)
int rv;
/*
- * YH5151-E outputs vout in linear11. The conversion is done when
- * reading. Here, we have to inject pmbus_core with the correct
- * exponent (it is -6).
+ * Inject an exponent for non-compliant YH5151-E.
*/
- if (data->chip == yh5151e && reg == PMBUS_VOUT_MODE)
+ if (data->vout_linear_11 && reg == PMBUS_VOUT_MODE)
return 0x1A;
rv = set_page(client, page);
return rv;
/*
- * YH-5151E is non-compliant and outputs output voltages in linear11
- * instead of linear16.
+ * Handle YH-5151E non-compliant linear11 vout voltage.
*/
- if (data->chip == yh5151e && reg == PMBUS_READ_VOUT)
+ if (data->vout_linear_11 && reg == PMBUS_READ_VOUT)
rv = sign_extend32(rv, 10) & 0xffff;
return rv;
data->info = fsp3y_info[data->chip];
+ /*
+ * YH-5151E sometimes reports vout in linear11 and sometimes in
+ * linear16. This depends on the exact individual piece of hardware. One
+ * YH-5151E can use linear16 and another might use linear11 instead.
+ *
+ * The format can be recognized by reading VOUT_MODE - if it doesn't
+ * report a valid exponent, then vout uses linear11. Otherwise, the
+ * device is compliant and uses linear16.
+ */
+ data->vout_linear_11 = false;
+ if (data->chip == yh5151e) {
+ rv = i2c_smbus_read_byte_data(client, PMBUS_VOUT_MODE);
+ if (rv < 0)
+ return rv;
+
+ if (rv == 0xFF)
+ data->vout_linear_11 = true;
+ }
+
return pmbus_do_probe(client, &data->info);
}
info->read_word_data = raa_dmpvr2_read_word_data;
break;
case raa_dmpvr2_2rail_nontc:
- info->func[0] &= ~PMBUS_HAVE_TEMP;
- info->func[1] &= ~PMBUS_HAVE_TEMP;
+ info->func[0] &= ~PMBUS_HAVE_TEMP3;
+ info->func[1] &= ~PMBUS_HAVE_TEMP3;
fallthrough;
case raa_dmpvr2_2rail:
info->pages = 2;
dev_err(&client->dev, "Failed to read Manufacturer ID\n");
return ret;
}
- if (ret != 5 || strncmp(buf, "DELTA", 5)) {
+ if (ret != 6 || strncmp(buf, "DELTA", 5)) {
buf[ret] = '\0';
dev_err(dev, "Unsupported Manufacturer ID '%s'\n", buf);
return -ENODEV;
config I2C_HISI
tristate "HiSilicon I2C controller"
- depends on ARM64 || COMPILE_TEST
+ depends on (ARM64 && ACPI) || COMPILE_TEST
help
Say Y here if you want to have Hisilicon I2C controller support
available on the Kunpeng Server.
// SPDX-License-Identifier: GPL-2.0-only
-/**
+/*
* i2c-ali1563.c - i2c driver for the ALi 1563 Southbridge
*
* Copyright (C) 2004 Patrick Mochel
#define ALTR_I2C_XFER_TIMEOUT (msecs_to_jiffies(250))
/**
- * altr_i2c_dev - I2C device context
+ * struct altr_i2c_dev - I2C device context
* @base: pointer to register struct
* @msg: pointer to current message
* @msg_len: number of bytes transferred in msg
altr_i2c_int_enable(idev, ALTR_I2C_ALL_IRQ, false);
}
-/**
+/*
* altr_i2c_transfer - On the last byte to be transmitted, send
* a Stop bit on the last byte.
*/
writel(data, idev->base + ALTR_I2C_TFR_CMD);
}
-/**
+/*
* altr_i2c_empty_rx_fifo - Fetch data from RX FIFO until end of
* transfer. Send a Stop bit on the last byte.
*/
}
}
-/**
+/*
* altr_i2c_fill_tx_fifo - Fill TX FIFO from current message buffer.
- * @return: Number of bytes left to transfer.
*/
static int altr_i2c_fill_tx_fifo(struct altr_i2c_dev *idev)
{
};
/**
- * enum cdns_i2c_slave_mode - Slave state when I2C is operating in slave mode
+ * enum cdns_i2c_slave_state - Slave state when I2C is operating in slave mode
*
* @CDNS_I2C_SLAVE_STATE_IDLE: I2C slave idle
* @CDNS_I2C_SLAVE_STATE_SEND: I2C slave sending data to master
}
/**
- * i2c_dw_init() - Initialize the designware I2C master hardware
+ * i2c_dw_init_master() - Initialize the designware I2C master hardware
* @dev: device private data
*
* This functions configures and enables the I2C master.
/**
* struct adapter_info - This structure holds the adapter information for the
- PCH i2c controller
+ * PCH i2c controller
* @pch_data: stores a list of i2c_algo_pch_data
* @pch_i2c_suspended: specifies whether the system is suspended or not
* perhaps with more lines and words.
/**
* pch_i2c_writebytes() - write data to I2C bus in normal mode
* @i2c_adap: Pointer to the struct i2c_adapter.
+ * @msgs: Pointer to the i2c message structure.
* @last: specifies whether last message or not.
* In the case of compound mode it will be 1 for last message,
* otherwise 0.
dev_err(&priv->pci_dev->dev, "Transaction timeout\n");
/* try to stop the current command */
dev_dbg(&priv->pci_dev->dev, "Terminating the current operation\n");
- outb_p(inb_p(SMBHSTCNT(priv)) | SMBHSTCNT_KILL,
- SMBHSTCNT(priv));
+ outb_p(SMBHSTCNT_KILL, SMBHSTCNT(priv));
usleep_range(1000, 2000);
- outb_p(inb_p(SMBHSTCNT(priv)) & (~SMBHSTCNT_KILL),
- SMBHSTCNT(priv));
+ outb_p(0, SMBHSTCNT(priv));
/* Check if it worked */
status = inb_p(SMBHSTSTS(priv));
{
struct icy_i2c *i2c;
struct i2c_algo_pcf_data *algo_data;
- struct fwnode_handle *new_fwnode;
struct i2c_board_info ltc2990_info = {
.type = "ltc2990",
.swnode = &icy_ltc2990_node,
#include <linux/clk.h>
#include <linux/io.h>
+#include <linux/iopoll.h>
#include <linux/fsl_devices.h>
#include <linux/i2c.h>
#include <linux/interrupt.h>
#define CCR_MTX 0x10
#define CCR_TXAK 0x08
#define CCR_RSTA 0x04
+#define CCR_RSVD 0x02
#define CSR_MCF 0x80
#define CSR_MAAS 0x40
u32 block;
int rc;
int expect_rxack;
-
+ bool has_errata_A004447;
};
struct mpc_i2c_divider {
}
}
+static int i2c_mpc_wait_sr(struct mpc_i2c *i2c, int mask)
+{
+ void __iomem *addr = i2c->base + MPC_I2C_SR;
+ u8 val;
+
+ return readb_poll_timeout(addr, val, val & mask, 0, 100);
+}
+
+/*
+ * Workaround for Erratum A004447. From the P2040CE Rev Q
+ *
+ * 1. Set up the frequency divider and sampling rate.
+ * 2. I2CCR - a0h
+ * 3. Poll for I2CSR[MBB] to get set.
+ * 4. If I2CSR[MAL] is set (an indication that SDA is stuck low), then go to
+ * step 5. If MAL is not set, then go to step 13.
+ * 5. I2CCR - 00h
+ * 6. I2CCR - 22h
+ * 7. I2CCR - a2h
+ * 8. Poll for I2CSR[MBB] to get set.
+ * 9. Issue read to I2CDR.
+ * 10. Poll for I2CSR[MIF] to be set.
+ * 11. I2CCR - 82h
+ * 12. Workaround complete. Skip the next steps.
+ * 13. Issue read to I2CDR.
+ * 14. Poll for I2CSR[MIF] to be set.
+ * 15. I2CCR - 80h
+ */
+static void mpc_i2c_fixup_A004447(struct mpc_i2c *i2c)
+{
+ int ret;
+ u32 val;
+
+ writeccr(i2c, CCR_MEN | CCR_MSTA);
+ ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+ if (ret) {
+ dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+ return;
+ }
+
+ val = readb(i2c->base + MPC_I2C_SR);
+
+ if (val & CSR_MAL) {
+ writeccr(i2c, 0x00);
+ writeccr(i2c, CCR_MSTA | CCR_RSVD);
+ writeccr(i2c, CCR_MEN | CCR_MSTA | CCR_RSVD);
+ ret = i2c_mpc_wait_sr(i2c, CSR_MBB);
+ if (ret) {
+ dev_err(i2c->dev, "timeout waiting for CSR_MBB\n");
+ return;
+ }
+ val = readb(i2c->base + MPC_I2C_DR);
+ ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+ if (ret) {
+ dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+ return;
+ }
+ writeccr(i2c, CCR_MEN | CCR_RSVD);
+ } else {
+ val = readb(i2c->base + MPC_I2C_DR);
+ ret = i2c_mpc_wait_sr(i2c, CSR_MIF);
+ if (ret) {
+ dev_err(i2c->dev, "timeout waiting for CSR_MIF\n");
+ return;
+ }
+ writeccr(i2c, CCR_MEN);
+ }
+}
+
#if defined(CONFIG_PPC_MPC52xx) || defined(CONFIG_PPC_MPC512x)
static const struct mpc_i2c_divider mpc_i2c_dividers_52xx[] = {
{20, 0x20}, {22, 0x21}, {24, 0x22}, {26, 0x23},
{
struct mpc_i2c *i2c = i2c_get_adapdata(adap);
- mpc_i2c_fixup(i2c);
+ if (i2c->has_errata_A004447)
+ mpc_i2c_fixup_A004447(i2c);
+ else
+ mpc_i2c_fixup(i2c);
return 0;
}
}
dev_info(i2c->dev, "timeout %u us\n", mpc_ops.timeout * 1000000 / HZ);
+ if (of_property_read_bool(op->dev.of_node, "fsl,i2c-erratum-a004447"))
+ i2c->has_errata_A004447 = true;
+
i2c->adap = mpc_ops;
scnprintf(i2c->adap.name, sizeof(i2c->adap.name),
"MPC adapter (%s)", of_node_full_name(op->dev.of_node));
static void mtk_i2c_init_hw(struct mtk_i2c *i2c)
{
u16 control_reg;
+ u16 intr_stat_reg;
+
+ mtk_i2c_writew(i2c, I2C_CHN_CLR_FLAG, OFFSET_START);
+ intr_stat_reg = mtk_i2c_readw(i2c, OFFSET_INTR_STAT);
+ mtk_i2c_writew(i2c, intr_stat_reg, OFFSET_INTR_STAT);
if (i2c->dev_comp->apdma_sync) {
writel(I2C_DMA_WARM_RST, i2c->pdmabase + OFFSET_RST);
* @clk_freq: clock frequency for the operation mode
* @tft: Tx FIFO Threshold in bytes
* @rft: Rx FIFO Threshold in bytes
- * @timeout Slave response timeout (ms)
+ * @timeout: Slave response timeout (ms)
* @sm: speed mode
* @stop: stop condition.
* @xfer_complete: acknowledge completion for a I2C message.
}
/**
- * Process timeout event
+ * ocores_process_timeout() - Process timeout event
* @i2c: ocores I2C device instance
*/
static void ocores_process_timeout(struct ocores_i2c *i2c)
}
/**
- * Wait until something change in a given register
+ * ocores_wait() - Wait until something change in a given register
* @i2c: ocores I2C device instance
* @reg: register to query
* @mask: bitmask to apply on register value
}
/**
- * Wait until is possible to process some data
+ * ocores_poll_wait() - Wait until is possible to process some data
* @i2c: ocores I2C device instance
*
* Used when the device is in polling mode (interrupts disabled).
}
/**
- * It handles an IRQ-less transfer
+ * ocores_process_polling() - It handles an IRQ-less transfer
* @i2c: ocores I2C device instance
*
* Even if IRQ are disabled, the I2C OpenCore IP behavior is exactly the same
/**
* i2c_pnx_start - start a device
* @slave_addr: slave address
- * @adap: pointer to adapter structure
+ * @alg_data: pointer to local driver data structure
*
* Generate a START signal in the desired mode.
*/
/**
* i2c_pnx_stop - stop a device
- * @adap: pointer to I2C adapter structure
+ * @alg_data: pointer to local driver data structure
*
* Generate a STOP signal to terminate the master transaction.
*/
/**
* i2c_pnx_master_xmit - transmit data to slave
- * @adap: pointer to I2C adapter structure
+ * @alg_data: pointer to local driver data structure
*
* Sends one byte of data to the slave
*/
/**
* i2c_pnx_master_rcv - receive data from slave
- * @adap: pointer to I2C adapter structure
+ * @alg_data: pointer to local driver data structure
*
* Reads one byte data from the slave
*/
[GP_IRQ0] = {-EIO, "Unknown I2C err GP_IRQ0"},
[NACK] = {-ENXIO, "NACK: slv unresponsive, check its power/reset-ln"},
[GP_IRQ2] = {-EIO, "Unknown I2C err GP IRQ2"},
- [BUS_PROTO] = {-EPROTO, "Bus proto err, noisy/unepxected start/stop"},
+ [BUS_PROTO] = {-EPROTO, "Bus proto err, noisy/unexpected start/stop"},
[ARB_LOST] = {-EAGAIN, "Bus arbitration lost, clock line undriveable"},
[GP_IRQ5] = {-EIO, "Unknown I2C err GP IRQ5"},
[GENI_OVERRUN] = {-EIO, "Cmd overrun, check GENI cmd-state machine"},
return 0;
}
+static void geni_i2c_shutdown(struct platform_device *pdev)
+{
+ struct geni_i2c_dev *gi2c = platform_get_drvdata(pdev);
+
+ /* Make client i2c transfers start failing */
+ i2c_mark_adapter_suspended(&gi2c->adap);
+}
+
static int __maybe_unused geni_i2c_runtime_suspend(struct device *dev)
{
int ret;
{
struct geni_i2c_dev *gi2c = dev_get_drvdata(dev);
+ i2c_mark_adapter_suspended(&gi2c->adap);
+
if (!gi2c->suspended) {
geni_i2c_runtime_suspend(dev);
pm_runtime_disable(dev);
return 0;
}
+static int __maybe_unused geni_i2c_resume_noirq(struct device *dev)
+{
+ struct geni_i2c_dev *gi2c = dev_get_drvdata(dev);
+
+ i2c_mark_adapter_resumed(&gi2c->adap);
+ return 0;
+}
+
static const struct dev_pm_ops geni_i2c_pm_ops = {
- SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(geni_i2c_suspend_noirq, NULL)
+ SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(geni_i2c_suspend_noirq, geni_i2c_resume_noirq)
SET_RUNTIME_PM_OPS(geni_i2c_runtime_suspend, geni_i2c_runtime_resume,
NULL)
};
static struct platform_driver geni_i2c_driver = {
.probe = geni_i2c_probe,
.remove = geni_i2c_remove,
+ .shutdown = geni_i2c_shutdown,
.driver = {
.name = "geni_i2c",
.pm = &geni_i2c_pm_ops,
* forces us to send a new START
* when we change direction
*/
+ dev_dbg(i2c->dev,
+ "missing START before write->read\n");
s3c24xx_i2c_stop(i2c, -EINVAL);
+ break;
}
goto retry_write;
static const struct of_device_id sh_mobile_i2c_dt_ids[] = {
{ .compatible = "renesas,iic-r8a73a4", .data = &fast_clock_dt_config },
{ .compatible = "renesas,iic-r8a7740", .data = &r8a7740_dt_config },
- { .compatible = "renesas,iic-r8a774c0", .data = &fast_clock_dt_config },
+ { .compatible = "renesas,iic-r8a774c0", .data = &v2_freq_calc_dt_config },
{ .compatible = "renesas,iic-r8a7790", .data = &v2_freq_calc_dt_config },
{ .compatible = "renesas,iic-r8a7791", .data = &v2_freq_calc_dt_config },
{ .compatible = "renesas,iic-r8a7792", .data = &v2_freq_calc_dt_config },
}
/**
- * st_i2c_handle_write() - Handle FIFO enmpty interrupt in case of read
+ * st_i2c_handle_read() - Handle FIFO empty interrupt in case of read
* @i2c_dev: Controller's private data
*/
static void st_i2c_handle_read(struct st_i2c_dev *i2c_dev)
}
/**
- * st_i2c_isr() - Interrupt routine
+ * st_i2c_isr_thread() - Interrupt routine
* @irq: interrupt number
* @data: Controller's private data
*/
}
/**
- * stm32f4_i2c_write_ byte() - Write a byte in the data register
+ * stm32f4_i2c_write_byte() - Write a byte in the data register
* @i2c_dev: Controller's private data
* @byte: Data to write in the register
*/
*out |= SERIALI2C_RECV_LEN;
}
-/**
+/*
* The serialized I2C format is simply the following:
* [addr little-endian][flags little-endian][len little-endian][data if write]
* [addr little-endian][flags little-endian][len little-endian][data if write]
request->xfer.data_size = pos;
}
-/**
+/*
* The data in the BPMP -> CPU direction is composed of sequential blocks for
* those messages that have I2C_M_RD. So, for example, if you have:
*
};
-/**
+/*
* i2c_arbitrator_select - claim the I2C bus
*
* Use the GPIO-based signalling protocol; return -EBUSY if we fail.
return -EBUSY;
}
-/**
+/*
* i2c_arbitrator_deselect - release the I2C bus
*
* Release the I2C bus using the GPIO-based signalling protocol.
if (ret)
goto err;
+ if (channel >= indio_dev->num_channels) {
+ dev_err(indio_dev->dev.parent,
+ "Channel index >= number of channels\n");
+ ret = -EINVAL;
+ goto err;
+ }
+
ret = of_property_read_u32_array(child, "diff-channels",
ain, 2);
if (ret)
return ret;
}
+static void ad7124_reg_disable(void *r)
+{
+ regulator_disable(r);
+}
+
static int ad7124_probe(struct spi_device *spi)
{
const struct ad7124_chip_info *info;
ret = regulator_enable(st->vref[i]);
if (ret)
return ret;
+
+ ret = devm_add_action_or_reset(&spi->dev, ad7124_reg_disable,
+ st->vref[i]);
+ if (ret)
+ return ret;
}
st->mclk = devm_clk_get(&spi->dev, "mclk");
- if (IS_ERR(st->mclk)) {
- ret = PTR_ERR(st->mclk);
- goto error_regulator_disable;
- }
+ if (IS_ERR(st->mclk))
+ return PTR_ERR(st->mclk);
ret = clk_prepare_enable(st->mclk);
if (ret < 0)
- goto error_regulator_disable;
+ return ret;
ret = ad7124_soft_reset(st);
if (ret < 0)
ad_sd_cleanup_buffer_and_trigger(indio_dev);
error_clk_disable_unprepare:
clk_disable_unprepare(st->mclk);
-error_regulator_disable:
- for (i = ARRAY_SIZE(st->vref) - 1; i >= 0; i--) {
- if (!IS_ERR_OR_NULL(st->vref[i]))
- regulator_disable(st->vref[i]);
- }
return ret;
}
{
struct iio_dev *indio_dev = spi_get_drvdata(spi);
struct ad7124_state *st = iio_priv(indio_dev);
- int i;
iio_device_unregister(indio_dev);
ad_sd_cleanup_buffer_and_trigger(indio_dev);
clk_disable_unprepare(st->mclk);
- for (i = ARRAY_SIZE(st->vref) - 1; i >= 0; i--) {
- if (!IS_ERR_OR_NULL(st->vref[i]))
- regulator_disable(st->vref[i]);
- }
-
return 0;
}
{
struct ad7192_state *st;
struct iio_dev *indio_dev;
- int ret, voltage_uv = 0;
+ int ret;
if (!spi->irq) {
dev_err(&spi->dev, "no IRQ?\n");
goto error_disable_avdd;
}
- voltage_uv = regulator_get_voltage(st->avdd);
-
- if (voltage_uv > 0) {
- st->int_vref_mv = voltage_uv / 1000;
- } else {
- ret = voltage_uv;
+ ret = regulator_get_voltage(st->avdd);
+ if (ret < 0) {
dev_err(&spi->dev, "Device tree error, reference voltage undefined\n");
goto error_disable_avdd;
}
+ st->int_vref_mv = ret / 1000;
spi_set_drvdata(spi, indio_dev);
st->chip_info = of_device_get_match_data(&spi->dev);
return 0;
error_disable_clk:
- clk_disable_unprepare(st->mclk);
+ if (st->clock_sel == AD7192_CLK_EXT_MCLK1_2 ||
+ st->clock_sel == AD7192_CLK_EXT_MCLK2)
+ clk_disable_unprepare(st->mclk);
error_remove_trigger:
ad_sd_cleanup_buffer_and_trigger(indio_dev);
error_disable_dvdd:
struct ad7192_state *st = iio_priv(indio_dev);
iio_device_unregister(indio_dev);
- clk_disable_unprepare(st->mclk);
+ if (st->clock_sel == AD7192_CLK_EXT_MCLK1_2 ||
+ st->clock_sel == AD7192_CLK_EXT_MCLK2)
+ clk_disable_unprepare(st->mclk);
ad_sd_cleanup_buffer_and_trigger(indio_dev);
regulator_disable(st->dvdd);
* transfer buffers to live in their own cache lines.
*/
union {
+ struct {
+ __be32 chan;
+ s64 timestamp;
+ } scan;
__be32 d32;
u8 d8[2];
} data ____cacheline_aligned;
mutex_lock(&st->lock);
- ret = spi_read(st->spi, &st->data.d32, 3);
+ ret = spi_read(st->spi, &st->data.scan.chan, 3);
if (ret < 0)
goto err_unlock;
- iio_push_to_buffers_with_timestamp(indio_dev, &st->data.d32,
+ iio_push_to_buffers_with_timestamp(indio_dev, &st->data.scan,
iio_get_time_ns(indio_dev));
iio_trigger_notify_done(indio_dev->trig);
id &= AD7793_ID_MASK;
if (id != st->chip_info->id) {
+ ret = -ENODEV;
dev_err(&st->sd.spi->dev, "device ID query failed\n");
goto out;
}
/*
* DMA (thus cache coherency maintenance) requires the
* transfer buffers to live in their own cache lines.
+ * Ensure rx_buf can be directly used in iio_push_to_buffers_with_timetamp
+ * Length = 8 channels + 4 extra for 8 byte timestamp
*/
- __be16 rx_buf[4] ____cacheline_aligned;
+ __be16 rx_buf[12] ____cacheline_aligned;
__be16 tx_buf[4];
};
device_for_each_child_node(&st->spi->dev, child) {
ret = fwnode_property_read_u32(child, "num", &num);
if (ret)
- return ret;
- if (num >= AD5770R_MAX_CHANNELS)
- return -EINVAL;
+ goto err_child_out;
+ if (num >= AD5770R_MAX_CHANNELS) {
+ ret = -EINVAL;
+ goto err_child_out;
+ }
ret = fwnode_property_read_u32_array(child,
"adi,range-microamp",
tmp, 2);
if (ret)
- return ret;
+ goto err_child_out;
min = tmp[0] / 1000;
max = tmp[1] / 1000;
ret = ad5770r_store_output_range(st, min, max, num);
if (ret)
- return ret;
+ goto err_child_out;
}
+ return 0;
+
+err_child_out:
+ fwnode_handle_put(child);
return ret;
}
ret = regmap_field_read(data->regmap_fields[F_TEMP], &temp);
if (ret < 0) {
dev_err(dev, "failed to read temp: %d\n", ret);
+ fxas21002c_pm_put(data);
goto data_unlock;
}
&axis_be, sizeof(axis_be));
if (ret < 0) {
dev_err(dev, "failed to read axis: %d: %d\n", index, ret);
+ fxas21002c_pm_put(data);
goto data_unlock;
}
ent->xlt = (1 << ent->order) * sizeof(struct mlx5_mtt) /
MLX5_IB_UMR_OCTOWORD;
ent->access_mode = MLX5_MKC_ACCESS_MODE_MTT;
- if ((dev->mdev->profile->mask & MLX5_PROF_MASK_MR_CACHE) &&
+ if ((dev->mdev->profile.mask & MLX5_PROF_MASK_MR_CACHE) &&
!dev->is_rep && mlx5_core_is_pf(dev->mdev) &&
mlx5_ib_can_load_pas_with_umr(dev, 0))
- ent->limit = dev->mdev->profile->mr_cache[i].limit;
+ ent->limit = dev->mdev->profile.mr_cache[i].limit;
else
ent->limit = 0;
spin_lock_irq(&ent->lock);
// SPDX-License-Identifier: GPL-2.0
/*
- * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ * Copyright (c) 2020-2021, The Linux Foundation. All rights reserved.
*/
#include <asm/div64.h>
}
mutex_unlock(&bcm_voter_lock);
+ of_node_put(node);
return voter;
}
EXPORT_SYMBOL_GPL(of_bcm_voter_get);
{ .compatible = "qcom,bcm-voter" },
{ }
};
+MODULE_DEVICE_TABLE(of, bcm_voter_of_match);
static struct platform_driver qcom_icc_bcm_voter_driver = {
.probe = qcom_icc_bcm_voter_probe,
* The msb-bit must be clear on the address. Just set all the
* lower bits.
*/
- address |= 1ull << (msb_diff - 1);
+ address |= (1ull << msb_diff) - 1;
}
/* Clear bits 11:0 */
domain = iommu_get_domain_for_dev(dev);
if (domain->type == IOMMU_DOMAIN_DMA)
iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0);
+ else
+ set_dma_ops(dev, NULL);
}
static void amd_iommu_release_device(struct device *dev)
err = iommu_device_register(&iommu->iommu, &intel_iommu_ops, NULL);
if (err)
- goto err_unmap;
+ goto err_sysfs;
}
drhd->iommu = iommu;
return 0;
+err_sysfs:
+ iommu_device_sysfs_remove(&iommu->iommu);
err_unmap:
unmap_iommu(iommu);
error_free_seq_id:
struct device *dev,
u32 pasid)
{
- int flags = PASID_FLAG_SUPERVISOR_MODE;
struct dma_pte *pgd = domain->pgd;
int agaw, level;
+ int flags = 0;
/*
* Skip top levels of page tables for iommu which has
if (level != 4 && level != 5)
return -EINVAL;
- flags |= (level == 5) ? PASID_FLAG_FL5LP : 0;
+ if (pasid != PASID_RID2PASID)
+ flags |= PASID_FLAG_SUPERVISOR_MODE;
+ if (level == 5)
+ flags |= PASID_FLAG_FL5LP;
if (domain->domain.type == IOMMU_DOMAIN_UNMANAGED)
flags |= PASID_FLAG_PAGE_SNOOP;
if (!sinfo) {
sinfo = kzalloc(sizeof(*sinfo), GFP_ATOMIC);
+ if (!sinfo)
+ return -ENOMEM;
sinfo->domain = domain;
sinfo->pdev = dev;
list_add(&sinfo->link_phys, &info->subdevices);
* Since it is a second level only translation setup, we should
* set SRE bit as well (addresses are expected to be GPAs).
*/
- pasid_set_sre(pte);
+ if (pasid != PASID_RID2PASID)
+ pasid_set_sre(pte);
pasid_set_present(pte);
pasid_flush_caches(iommu, pte, pasid, did);
{ VIRTIO_ID_IOMMU, VIRTIO_DEV_ANY_ID },
{ 0 },
};
+MODULE_DEVICE_TABLE(virtio, id_table);
static struct virtio_driver virtio_iommu_drv = {
.driver.name = KBUILD_MODNAME,
card->typ = NETJET_S_TJ300;
card->base = pci_resource_start(pdev, 0);
- card->irq = pdev->irq;
pci_set_drvdata(pdev, card);
err = setup_instance(card);
if (err)
if (o)
list_for_each_entry(snap, &o->snapshots, list)
- chunk_size = min(chunk_size, snap->store->chunk_size);
+ chunk_size = min_not_zero(chunk_size,
+ snap->store->chunk_size);
return (uint32_t) chunk_size;
}
#define DM_VERITY_VERIFY_ERR(s) DM_VERITY_ROOT_HASH_VERIFICATION " " s
static bool require_signatures;
-module_param(require_signatures, bool, false);
+module_param(require_signatures, bool, 0444);
MODULE_PARM_DESC(require_signatures,
"Verify the roothash of dm-verity hash tree");
unsigned int chunk_sectors;
unsigned int bio_sectors = bio_sectors(bio);
- WARN_ON_ONCE(bio->bi_bdev->bd_partno);
-
chunk_sectors = min(conf->chunk_sectors, conf->prev_chunk_sectors);
return chunk_sectors >=
((sector & (chunk_sectors - 1)) + bio_sectors);
printk(KERN_INFO a); \
} while (0)
#define v2printk(a...) do { \
- if (verbose > 1) \
+ if (verbose > 1) { \
printk(KERN_INFO a); \
+ } \
touch_nmi_watchdog(); \
} while (0)
#define eprintk(a...) do { \
return ret;
}
+ pm_runtime_mark_last_busy(dev->dev);
+ pm_request_autosuspend(dev->dev);
+
list_move_tail(&cb->list, &cl->rd_pending);
return 0;
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/mtd/rawnand.h>
#include <linux/mtd/partitions.h>
#include <linux/iopoll.h>
return 0;
}
+static int cs553x_ecc_correct(struct nand_chip *chip,
+ unsigned char *buf,
+ unsigned char *read_ecc,
+ unsigned char *calc_ecc)
+{
+ return ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
+}
+
static struct cs553x_nand_controller *controllers[4];
static int cs553x_attach_chip(struct nand_chip *chip)
chip->ecc.bytes = 3;
chip->ecc.hwctl = cs_enable_hwecc;
chip->ecc.calculate = cs_calculate_ecc;
- chip->ecc.correct = rawnand_sw_hamming_correct;
+ chip->ecc.correct = cs553x_ecc_correct;
chip->ecc.strength = 1;
return 0;
#include <linux/sched.h>
#include <linux/types.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/mtd/rawnand.h>
#include <linux/platform_device.h>
#include <linux/of.h>
return 0;
}
+static int fsmc_correct_ecc1(struct nand_chip *chip,
+ unsigned char *buf,
+ unsigned char *read_ecc,
+ unsigned char *calc_ecc)
+{
+ return ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
+}
+
/* Count the number of 0's in buff upto a max of max_bits */
static int count_written_bits(u8 *buff, int size, int max_bits)
{
case NAND_ECC_ENGINE_TYPE_ON_HOST:
dev_info(host->dev, "Using 1-bit HW ECC scheme\n");
nand->ecc.calculate = fsmc_read_hwecc_ecc1;
- nand->ecc.correct = rawnand_sw_hamming_correct;
+ nand->ecc.correct = fsmc_correct_ecc1;
nand->ecc.hwctl = fsmc_enable_hwecc;
nand->ecc.bytes = 3;
nand->ecc.strength = 1;
#include <linux/of.h>
#include <linux/of_gpio.h>
#include <linux/mtd/lpc32xx_slc.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#define LPC32XX_MODNAME "lpc32xx-nand"
}
/*
+ * Corrects the data
+ */
+static int lpc32xx_nand_ecc_correct(struct nand_chip *chip,
+ unsigned char *buf,
+ unsigned char *read_ecc,
+ unsigned char *calc_ecc)
+{
+ return ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
+}
+
+/*
* Read a single byte from NAND device
*/
static uint8_t lpc32xx_nand_read_byte(struct nand_chip *chip)
chip->ecc.write_oob = lpc32xx_nand_write_oob_syndrome;
chip->ecc.read_oob = lpc32xx_nand_read_oob_syndrome;
chip->ecc.calculate = lpc32xx_nand_ecc_calculate;
- chip->ecc.correct = rawnand_sw_hamming_correct;
+ chip->ecc.correct = lpc32xx_nand_ecc_correct;
chip->ecc.hwctl = lpc32xx_nand_ecc_enable;
/*
#include <linux/mtd/ndfc.h>
#include <linux/slab.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/of_address.h>
#include <linux/of_platform.h>
#include <asm/io.h>
return 0;
}
+static int ndfc_correct_ecc(struct nand_chip *chip,
+ unsigned char *buf,
+ unsigned char *read_ecc,
+ unsigned char *calc_ecc)
+{
+ return ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
+}
+
/*
* Speedups for buffer read/write/verify
*
chip->controller = &ndfc->ndfc_control;
chip->legacy.read_buf = ndfc_read_buf;
chip->legacy.write_buf = ndfc_write_buf;
- chip->ecc.correct = rawnand_sw_hamming_correct;
+ chip->ecc.correct = ndfc_correct_ecc;
chip->ecc.hwctl = ndfc_enable_hwecc;
chip->ecc.calculate = ndfc_calculate_ecc;
chip->ecc.engine_type = NAND_ECC_ENGINE_TYPE_ON_HOST;
#include <linux/module.h>
#include <linux/delay.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/mtd/rawnand.h>
#include <linux/mtd/partitions.h>
#include <linux/mtd/sharpsl.h>
return readb(sharpsl->io + ECCCNTR) != 0;
}
+static int sharpsl_nand_correct_ecc(struct nand_chip *chip,
+ unsigned char *buf,
+ unsigned char *read_ecc,
+ unsigned char *calc_ecc)
+{
+ return ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
+}
+
static int sharpsl_attach_chip(struct nand_chip *chip)
{
if (chip->ecc.engine_type != NAND_ECC_ENGINE_TYPE_ON_HOST)
chip->ecc.strength = 1;
chip->ecc.hwctl = sharpsl_nand_enable_hwecc;
chip->ecc.calculate = sharpsl_nand_calculate_ecc;
- chip->ecc.correct = rawnand_sw_hamming_correct;
+ chip->ecc.correct = sharpsl_nand_correct_ecc;
return 0;
}
#include <linux/interrupt.h>
#include <linux/ioport.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/mtd/rawnand.h>
#include <linux/mtd/partitions.h>
#include <linux/slab.h>
int r0, r1;
/* assume ecc.size = 512 and ecc.bytes = 6 */
- r0 = rawnand_sw_hamming_correct(chip, buf, read_ecc, calc_ecc);
+ r0 = ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
if (r0 < 0)
return r0;
- r1 = rawnand_sw_hamming_correct(chip, buf + 256, read_ecc + 3,
- calc_ecc + 3);
+ r1 = ecc_sw_hamming_correct(buf + 256, read_ecc + 3, calc_ecc + 3,
+ chip->ecc.size, false);
if (r1 < 0)
return r1;
return r0 + r1;
#include <linux/platform_device.h>
#include <linux/delay.h>
#include <linux/mtd/mtd.h>
+#include <linux/mtd/nand-ecc-sw-hamming.h>
#include <linux/mtd/rawnand.h>
#include <linux/mtd/partitions.h>
#include <linux/io.h>
int stat;
for (eccsize = chip->ecc.size; eccsize > 0; eccsize -= 256) {
- stat = rawnand_sw_hamming_correct(chip, buf, read_ecc,
- calc_ecc);
+ stat = ecc_sw_hamming_correct(buf, read_ecc, calc_ecc,
+ chip->ecc.size, false);
if (stat < 0)
return stat;
corrected += stat;
if (!mtd_node)
return 0;
- ofpart_node = of_get_child_by_name(mtd_node, "partitions");
- if (!ofpart_node && !master->parent) {
- /*
- * We might get here even when ofpart isn't used at all (e.g.,
- * when using another parser), so don't be louder than
- * KERN_DEBUG
- */
- pr_debug("%s: 'partitions' subnode not found on %pOF. Trying to parse direct subnodes as partitions.\n",
- master->name, mtd_node);
+ if (!master->parent) { /* Master */
+ ofpart_node = of_get_child_by_name(mtd_node, "partitions");
+ if (!ofpart_node) {
+ /*
+ * We might get here even when ofpart isn't used at all (e.g.,
+ * when using another parser), so don't be louder than
+ * KERN_DEBUG
+ */
+ pr_debug("%s: 'partitions' subnode not found on %pOF. Trying to parse direct subnodes as partitions.\n",
+ master->name, mtd_node);
+ ofpart_node = mtd_node;
+ dedicated = false;
+ }
+ } else { /* Partition */
ofpart_node = mtd_node;
- dedicated = false;
}
- if (!ofpart_node)
- return 0;
of_id = of_match_node(parse_ofpart_match_table, ofpart_node);
if (dedicated && !of_id) {
break;
}
+ dev->base_addr = ioaddr;
+
/* Reserve any actual interrupt. */
if (dev->irq) {
retval = request_irq(dev->irq, cops_interrupt, 0, dev->name, dev);
goto err_out;
}
- dev->base_addr = ioaddr;
-
lp = netdev_priv(dev);
spin_lock_init(&lp->lock);
slave->bond = bond;
slave->dev = slave_dev;
+ INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work);
if (bond_kobj_init(slave))
return NULL;
return NULL;
}
}
- INIT_DELAYED_WORK(&slave->notify_work, bond_netdev_notify_work);
return slave;
}
bcm_sf2_sw_mac_link_set(ds, port, interface, true);
if (port != core_readl(priv, CORE_IMP0_PRT_ID)) {
- u32 reg_rgmii_ctrl;
+ u32 reg_rgmii_ctrl = 0;
u32 reg, offset;
- reg_rgmii_ctrl = bcm_sf2_reg_rgmii_cntrl(priv, port);
-
if (priv->type == BCM4908_DEVICE_ID ||
priv->type == BCM7445_DEVICE_ID)
offset = CORE_STS_OVERRIDE_GMIIP_PORT(port);
interface == PHY_INTERFACE_MODE_RGMII_TXID ||
interface == PHY_INTERFACE_MODE_MII ||
interface == PHY_INTERFACE_MODE_REVMII) {
+ reg_rgmii_ctrl = bcm_sf2_reg_rgmii_cntrl(priv, port);
reg = reg_readl(priv, reg_rgmii_ctrl);
reg &= ~(RX_PAUSE_EN | TX_PAUSE_EN);
.num_statics = 16,
.cpu_ports = 0x7F, /* can be configured as cpu port */
.port_cnt = 7, /* total physical port count */
+ .phy_errata_9477 = true,
},
};
{
struct mt7530_priv *priv = ds->priv;
- /* The real fabric path would be decided on the membership in the
- * entry of VLAN table. PCR_MATRIX set up here with ALL_MEMBERS
- * means potential VLAN can be consisting of certain subset of all
- * ports.
- */
- mt7530_rmw(priv, MT7530_PCR_P(port),
- PCR_MATRIX_MASK, PCR_MATRIX(MT7530_ALL_MEMBERS));
-
/* Trapped into security mode allows packet forwarding through VLAN
* table lookup. CPU port is set to fallback mode to let untagged
* frames pass through.
if (taprio->num_entries > VSC9959_TAS_GCL_ENTRY_MAX)
return -ERANGE;
- /* Set port num and disable ALWAYS_GUARD_BAND_SCH_Q, which means set
- * guard band to be implemented for nonschedule queues to schedule
- * queues transition.
+ /* Enable guard band. The switch will schedule frames without taking
+ * their length into account. Thus we'll always need to enable the
+ * guard band which reserves the time of a maximum sized frame at the
+ * end of the time window.
+ *
+ * Although the ALWAYS_GUARD_BAND_SCH_Q bit is global for all ports, we
+ * need to set PORT_NUM, because subsequent writes to PARAM_CFG_REG_n
+ * operate on the port number.
*/
- ocelot_rmw(ocelot,
- QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM(port),
+ ocelot_rmw(ocelot, QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM(port) |
+ QSYS_TAS_PARAM_CFG_CTRL_ALWAYS_GUARD_BAND_SCH_Q,
QSYS_TAS_PARAM_CFG_CTRL_PORT_NUM_M |
QSYS_TAS_PARAM_CFG_CTRL_ALWAYS_GUARD_BAND_SCH_Q,
QSYS_TAS_PARAM_CFG_CTRL);
SJA1105_HOSTCMD_INVALIDATE = 4,
};
+/* Command and entry overlap */
static void
-sja1105_vl_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
- enum packing_op op)
+sja1105et_vl_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+ enum packing_op op)
{
const int size = SJA1105_SIZE_DYN_CMD;
sja1105_packing(buf, &cmd->index, 9, 0, size, op);
}
+/* Command and entry are separate */
+static void
+sja1105pqrs_vl_lookup_cmd_packing(void *buf, struct sja1105_dyn_cmd *cmd,
+ enum packing_op op)
+{
+ u8 *p = buf + SJA1105_SIZE_VL_LOOKUP_ENTRY;
+ const int size = SJA1105_SIZE_DYN_CMD;
+
+ sja1105_packing(p, &cmd->valid, 31, 31, size, op);
+ sja1105_packing(p, &cmd->errors, 30, 30, size, op);
+ sja1105_packing(p, &cmd->rdwrset, 29, 29, size, op);
+ sja1105_packing(p, &cmd->index, 9, 0, size, op);
+}
+
static size_t sja1105et_vl_lookup_entry_packing(void *buf, void *entry_ptr,
enum packing_op op)
{
const struct sja1105_dynamic_table_ops sja1105et_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_VL_LOOKUP] = {
.entry_packing = sja1105et_vl_lookup_entry_packing,
- .cmd_packing = sja1105_vl_lookup_cmd_packing,
+ .cmd_packing = sja1105et_vl_lookup_cmd_packing,
.access = OP_WRITE,
.max_entry_count = SJA1105_MAX_VL_LOOKUP_COUNT,
.packed_size = SJA1105ET_SIZE_VL_LOOKUP_DYN_CMD,
const struct sja1105_dynamic_table_ops sja1105pqrs_dyn_ops[BLK_IDX_MAX_DYN] = {
[BLK_IDX_VL_LOOKUP] = {
.entry_packing = sja1105_vl_lookup_entry_packing,
- .cmd_packing = sja1105_vl_lookup_cmd_packing,
+ .cmd_packing = sja1105pqrs_vl_lookup_cmd_packing,
.access = (OP_READ | OP_WRITE),
.max_entry_count = SJA1105_MAX_VL_LOOKUP_COUNT,
.packed_size = SJA1105PQRS_SIZE_VL_LOOKUP_DYN_CMD,
#include "sja1105_tas.h"
#define SJA1105_UNKNOWN_MULTICAST 0x010000000000ull
+#define SJA1105_DEFAULT_VLAN (VLAN_N_VID - 1)
static const struct dsa_switch_ops sja1105_switch_ops;
default:
dev_err(dev, "Unsupported PHY mode %s!\n",
phy_modes(ports[i].phy_mode));
+ return -EINVAL;
}
/* Even though the SerDes port is able to drive SGMII autoneg
return 0;
}
+/* Set up a default VLAN for untagged traffic injected from the CPU
+ * using management routes (e.g. STP, PTP) as opposed to tag_8021q.
+ * All DT-defined ports are members of this VLAN, and there are no
+ * restrictions on forwarding (since the CPU selects the destination).
+ * Frames from this VLAN will always be transmitted as untagged, and
+ * neither the bridge nor the 8021q module cannot create this VLAN ID.
+ */
static int sja1105_init_static_vlan(struct sja1105_private *priv)
{
struct sja1105_table *table;
.vmemb_port = 0,
.vlan_bc = 0,
.tag_port = 0,
- .vlanid = 1,
+ .vlanid = SJA1105_DEFAULT_VLAN,
};
struct dsa_switch *ds = priv->ds;
int port;
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
- /* The static VLAN table will only contain the initial pvid of 1.
- * All other VLANs are to be configured through dynamic entries,
- * and kept in the static configuration table as backing memory.
- */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
table->entry_count = 1;
- /* VLAN 1: all DT-defined ports are members; no restrictions on
- * forwarding; always transmit as untagged.
- */
for (port = 0; port < ds->num_ports; port++) {
struct sja1105_bridge_vlan *v;
pvid.vlan_bc |= BIT(port);
pvid.tag_port &= ~BIT(port);
- /* Let traffic that don't need dsa_8021q (e.g. STP, PTP) be
- * transmitted as untagged.
- */
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v)
return -ENOMEM;
v->port = port;
- v->vid = 1;
+ v->vid = SJA1105_DEFAULT_VLAN;
v->untagged = true;
if (dsa_is_cpu_port(ds, port))
v->pvid = true;
bool pvid = flags & BRIDGE_VLAN_INFO_PVID;
struct sja1105_bridge_vlan *v;
- list_for_each_entry(v, vlan_list, list)
- if (v->port == port && v->vid == vid &&
- v->untagged == untagged && v->pvid == pvid)
+ list_for_each_entry(v, vlan_list, list) {
+ if (v->port == port && v->vid == vid) {
/* Already added */
- return 0;
+ if (v->untagged == untagged && v->pvid == pvid)
+ /* Nothing changed */
+ return 0;
+
+ /* It's the same VLAN, but some of the flags changed
+ * and the user did not bother to delete it first.
+ * Update it and trigger sja1105_build_vlan_table.
+ */
+ v->untagged = untagged;
+ v->pvid = pvid;
+ return 1;
+ }
+ }
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v) {
rc = sja1105_static_config_load(priv, ports);
if (rc < 0) {
dev_err(ds->dev, "Failed to load static config: %d\n", rc);
- return rc;
+ goto out_ptp_clock_unregister;
}
/* Configure the CGU (PHY link modes and speeds) */
rc = sja1105_clocking_setup(priv);
if (rc < 0) {
dev_err(ds->dev, "Failed to configure MII clocking: %d\n", rc);
- return rc;
+ goto out_static_config_free;
}
/* On SJA1105, VLAN filtering per se is always enabled in hardware.
* The only thing we can do to disable it is lie about what the 802.1Q
rc = sja1105_devlink_setup(ds);
if (rc < 0)
- return rc;
+ goto out_static_config_free;
/* The DSA/switchdev model brings up switch ports in standalone mode by
* default, and that means vlan_filtering is 0 since they're not under
rtnl_lock();
rc = sja1105_setup_8021q_tagging(ds, true);
rtnl_unlock();
+ if (rc)
+ goto out_devlink_teardown;
+
+ return 0;
+
+out_devlink_teardown:
+ sja1105_devlink_teardown(ds);
+out_ptp_clock_unregister:
+ sja1105_ptp_clock_unregister(ds);
+out_static_config_free:
+ sja1105_static_config_free(&priv->static_config);
return rc;
}
priv->cbs = devm_kcalloc(dev, priv->info->num_cbs_shapers,
sizeof(struct sja1105_cbs_entry),
GFP_KERNEL);
- if (!priv->cbs)
- return -ENOMEM;
+ if (!priv->cbs) {
+ rc = -ENOMEM;
+ goto out_unregister_switch;
+ }
}
/* Connections between dsa_port and sja1105_port */
dev_err(ds->dev,
"failed to create deferred xmit thread: %d\n",
rc);
- goto out;
+ goto out_destroy_workers;
}
skb_queue_head_init(&sp->xmit_queue);
sp->xmit_tpid = ETH_P_SJA1105;
}
return 0;
-out:
+
+out_destroy_workers:
while (port-- > 0) {
struct sja1105_port *sp = &priv->ports[port];
kthread_destroy_worker(sp->xmit_worker);
}
+
+out_unregister_switch:
+ dsa_unregister_switch(ds);
+
return rc;
}
BNX2_WR(bp, PCI_COMMAND, reg);
} else if ((BNX2_CHIP_ID(bp) == BNX2_CHIP_ID_5706_A1) &&
!(bp->flags & BNX2_FLAG_PCIX)) {
-
dev_err(&pdev->dev,
"5706 A1 can only be used in a PCIX bus, aborting\n");
+ rc = -EPERM;
goto err_out_unmap;
}
goto failed;
/* SR-IOV capability was enabled but there are no VFs*/
- if (iov->total == 0)
+ if (iov->total == 0) {
+ err = -EINVAL;
goto failed;
+ }
iov->nr_virtfn = min_t(u16, iov->total, num_vfs_param);
{
return (idx == NETXTREME_C_VF || idx == NETXTREME_E_VF ||
idx == NETXTREME_S_VF || idx == NETXTREME_C_VF_HV ||
- idx == NETXTREME_E_VF_HV || idx == NETXTREME_E_P5_VF);
+ idx == NETXTREME_E_VF_HV || idx == NETXTREME_E_P5_VF ||
+ idx == NETXTREME_E_P5_VF_HV);
}
#define DB_CP_REARM_FLAGS (DB_KEY_CP | DB_IDX_VALID)
static void bnxt_hwrm_set_pg_attr(struct bnxt_ring_mem_info *rmem, u8 *pg_attr,
__le64 *pg_dir)
{
- u8 pg_size = 0;
-
if (!rmem->nr_pages)
return;
- if (BNXT_PAGE_SHIFT == 13)
- pg_size = 1 << 4;
- else if (BNXT_PAGE_SIZE == 16)
- pg_size = 2 << 4;
-
- *pg_attr = pg_size;
+ BNXT_SET_CTX_PAGE_ATTR(*pg_attr);
if (rmem->depth >= 1) {
if (rmem->depth == 2)
*pg_attr |= 2;
return rc;
}
+static bool bnxt_exthdr_check(struct bnxt *bp, struct sk_buff *skb, int nw_off,
+ u8 **nextp)
+{
+ struct ipv6hdr *ip6h = (struct ipv6hdr *)(skb->data + nw_off);
+ int hdr_count = 0;
+ u8 *nexthdr;
+ int start;
+
+ /* Check that there are at most 2 IPv6 extension headers, no
+ * fragment header, and each is <= 64 bytes.
+ */
+ start = nw_off + sizeof(*ip6h);
+ nexthdr = &ip6h->nexthdr;
+ while (ipv6_ext_hdr(*nexthdr)) {
+ struct ipv6_opt_hdr *hp;
+ int hdrlen;
+
+ if (hdr_count >= 3 || *nexthdr == NEXTHDR_NONE ||
+ *nexthdr == NEXTHDR_FRAGMENT)
+ return false;
+ hp = __skb_header_pointer(NULL, start, sizeof(*hp), skb->data,
+ skb_headlen(skb), NULL);
+ if (!hp)
+ return false;
+ if (*nexthdr == NEXTHDR_AUTH)
+ hdrlen = ipv6_authlen(hp);
+ else
+ hdrlen = ipv6_optlen(hp);
+
+ if (hdrlen > 64)
+ return false;
+ nexthdr = &hp->nexthdr;
+ start += hdrlen;
+ hdr_count++;
+ }
+ if (nextp) {
+ /* Caller will check inner protocol */
+ if (skb->encapsulation) {
+ *nextp = nexthdr;
+ return true;
+ }
+ *nextp = NULL;
+ }
+ /* Only support TCP/UDP for non-tunneled ipv6 and inner ipv6 */
+ return *nexthdr == IPPROTO_TCP || *nexthdr == IPPROTO_UDP;
+}
+
+/* For UDP, we can only handle 1 Vxlan port and 1 Geneve port. */
+static bool bnxt_udp_tunl_check(struct bnxt *bp, struct sk_buff *skb)
+{
+ struct udphdr *uh = udp_hdr(skb);
+ __be16 udp_port = uh->dest;
+
+ if (udp_port != bp->vxlan_port && udp_port != bp->nge_port)
+ return false;
+ if (skb->inner_protocol_type == ENCAP_TYPE_ETHER) {
+ struct ethhdr *eh = inner_eth_hdr(skb);
+
+ switch (eh->h_proto) {
+ case htons(ETH_P_IP):
+ return true;
+ case htons(ETH_P_IPV6):
+ return bnxt_exthdr_check(bp, skb,
+ skb_inner_network_offset(skb),
+ NULL);
+ }
+ }
+ return false;
+}
+
+static bool bnxt_tunl_check(struct bnxt *bp, struct sk_buff *skb, u8 l4_proto)
+{
+ switch (l4_proto) {
+ case IPPROTO_UDP:
+ return bnxt_udp_tunl_check(bp, skb);
+ case IPPROTO_IPIP:
+ return true;
+ case IPPROTO_GRE: {
+ switch (skb->inner_protocol) {
+ default:
+ return false;
+ case htons(ETH_P_IP):
+ return true;
+ case htons(ETH_P_IPV6):
+ fallthrough;
+ }
+ }
+ case IPPROTO_IPV6:
+ /* Check ext headers of inner ipv6 */
+ return bnxt_exthdr_check(bp, skb, skb_inner_network_offset(skb),
+ NULL);
+ }
+ return false;
+}
+
static netdev_features_t bnxt_features_check(struct sk_buff *skb,
struct net_device *dev,
netdev_features_t features)
{
- struct bnxt *bp;
- __be16 udp_port;
- u8 l4_proto = 0;
+ struct bnxt *bp = netdev_priv(dev);
+ u8 *l4_proto;
features = vlan_features_check(skb, features);
- if (!skb->encapsulation)
- return features;
-
switch (vlan_get_protocol(skb)) {
case htons(ETH_P_IP):
- l4_proto = ip_hdr(skb)->protocol;
+ if (!skb->encapsulation)
+ return features;
+ l4_proto = &ip_hdr(skb)->protocol;
+ if (bnxt_tunl_check(bp, skb, *l4_proto))
+ return features;
break;
case htons(ETH_P_IPV6):
- l4_proto = ipv6_hdr(skb)->nexthdr;
+ if (!bnxt_exthdr_check(bp, skb, skb_network_offset(skb),
+ &l4_proto))
+ break;
+ if (!l4_proto || bnxt_tunl_check(bp, skb, *l4_proto))
+ return features;
break;
- default:
- return features;
}
-
- if (l4_proto != IPPROTO_UDP)
- return features;
-
- bp = netdev_priv(dev);
- /* For UDP, we can only handle 1 Vxlan port and 1 Geneve port. */
- udp_port = udp_hdr(skb)->dest;
- if (udp_port == bp->vxlan_port || udp_port == bp->nge_port)
- return features;
return features & ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
}
#define BNXT_BACKING_STORE_CFG_LEGACY_LEN 256
+#define BNXT_SET_CTX_PAGE_ATTR(attr) \
+do { \
+ if (BNXT_PAGE_SIZE == 0x2000) \
+ attr = FUNC_BACKING_STORE_CFG_REQ_SRQ_PG_SIZE_PG_8K; \
+ else if (BNXT_PAGE_SIZE == 0x10000) \
+ attr = FUNC_BACKING_STORE_CFG_REQ_QPC_PG_SIZE_PG_64K; \
+ else \
+ attr = FUNC_BACKING_STORE_CFG_REQ_QPC_PG_SIZE_PG_4K; \
+} while (0)
+
struct bnxt_ctx_mem_info {
u32 qp_max_entries;
u16 qp_min_qp1_entries;
struct gem_stats *hwstat = &bp->hw_stats.gem;
struct net_device_stats *nstat = &bp->dev->stats;
+ if (!netif_running(bp->dev))
+ return nstat;
+
gem_update_stats(bp);
nstat->rx_errors = (hwstat->rx_frame_check_sequence_errors +
bool persistent, u8 *smt_idx);
int cxgb4_get_msix_idx_from_bmap(struct adapter *adap);
void cxgb4_free_msix_idx_in_bmap(struct adapter *adap, u32 msix_idx);
-int cxgb_open(struct net_device *dev);
-int cxgb_close(struct net_device *dev);
void cxgb4_enable_rx(struct adapter *adap, struct sge_rspq *q);
void cxgb4_quiesce_rx(struct sge_rspq *q);
int cxgb4_port_mirror_alloc(struct net_device *dev);
cxgb4_del_filter(dev, f->tid, &f->fs);
}
- sb = t4_read_reg(adapter, LE_DB_SRVR_START_INDEX_A);
+ sb = adapter->tids.stid_base;
for (i = 0; i < sb; i++) {
f = (struct filter_entry *)adapter->tids.tid_tab[i];
/*
* net_device operations
*/
-int cxgb_open(struct net_device *dev)
+static int cxgb_open(struct net_device *dev)
{
struct port_info *pi = netdev_priv(dev);
struct adapter *adapter = pi->adapter;
return err;
}
-int cxgb_close(struct net_device *dev)
+static int cxgb_close(struct net_device *dev)
{
struct port_info *pi = netdev_priv(dev);
struct adapter *adapter = pi->adapter;
adap->uld[CXGB4_ULD_KTLS].tlsdev_ops->tls_dev_del(netdev, tls_ctx,
direction);
- cxgb4_set_ktls_feature(adap, FW_PARAMS_PARAM_DEV_KTLS_HW_DISABLE);
out_unlock:
+ cxgb4_set_ktls_feature(adap, FW_PARAMS_PARAM_DEV_KTLS_HW_DISABLE);
mutex_unlock(&uld_mutex);
}
if (!ch_flower)
return -ENOENT;
+ rhashtable_remove_fast(&adap->flower_tbl, &ch_flower->node,
+ adap->flower_ht_params);
+
ret = cxgb4_flow_rule_destroy(dev, ch_flower->fs.tc_prio,
&ch_flower->fs, ch_flower->filter_id);
if (ret)
- goto err;
+ netdev_err(dev, "Flow rule destroy failed for tid: %u, ret: %d",
+ ch_flower->filter_id, ret);
- ret = rhashtable_remove_fast(&adap->flower_tbl, &ch_flower->node,
- adap->flower_ht_params);
- if (ret) {
- netdev_err(dev, "Flow remove from rhashtable failed");
- goto err;
- }
kfree_rcu(ch_flower, rcu);
-
-err:
return ret;
}
* down before configuring tc params.
*/
if (netif_running(dev)) {
- cxgb_close(dev);
+ netif_tx_stop_all_queues(dev);
+ netif_carrier_off(dev);
needs_bring_up = true;
}
}
out:
- if (needs_bring_up)
- cxgb_open(dev);
+ if (needs_bring_up) {
+ netif_tx_start_all_queues(dev);
+ netif_carrier_on(dev);
+ }
mutex_unlock(&adap->tc_mqprio->mqprio_mutex);
return ret;
if (!eosw_txq)
return -ENOMEM;
+ if (!(adap->flags & CXGB4_FW_OK)) {
+ /* Don't stall caller when access to FW is lost */
+ complete(&eosw_txq->completion);
+ return -EIO;
+ }
+
skb = alloc_skb(len, GFP_KERNEL);
if (!skb)
return -ENOMEM;
}
static int chcr_init_tcb_fields(struct chcr_ktls_info *tx_info);
+static void clear_conn_resources(struct chcr_ktls_info *tx_info);
/*
* chcr_ktls_save_keys: calculate and save crypto keys.
* @tx_info - driver specific tls info.
chcr_get_ktls_tx_context(tls_ctx);
struct chcr_ktls_info *tx_info = tx_ctx->chcr_info;
struct ch_ktls_port_stats_debug *port_stats;
+ struct chcr_ktls_uld_ctx *u_ctx;
if (!tx_info)
return;
+ u_ctx = tx_info->adap->uld[CXGB4_ULD_KTLS].handle;
+ if (u_ctx && u_ctx->detach)
+ return;
/* clear l2t entry */
if (tx_info->l2te)
cxgb4_l2t_release(tx_info->l2te);
if (tx_info->tid != -1) {
cxgb4_remove_tid(&tx_info->adap->tids, tx_info->tx_chan,
tx_info->tid, tx_info->ip_family);
+
+ xa_erase(&u_ctx->tid_list, tx_info->tid);
}
port_stats = &tx_info->adap->ch_ktls_stats.ktls_port[tx_info->port_id];
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct ch_ktls_port_stats_debug *port_stats;
struct chcr_ktls_ofld_ctx_tx *tx_ctx;
+ struct chcr_ktls_uld_ctx *u_ctx;
struct chcr_ktls_info *tx_info;
struct dst_entry *dst;
struct adapter *adap;
adap = pi->adapter;
port_stats = &adap->ch_ktls_stats.ktls_port[pi->port_id];
atomic64_inc(&port_stats->ktls_tx_connection_open);
+ u_ctx = adap->uld[CXGB4_ULD_KTLS].handle;
if (direction == TLS_OFFLOAD_CTX_DIR_RX) {
pr_err("not expecting for RX direction\n");
if (tx_ctx->chcr_info)
goto out;
+ if (u_ctx && u_ctx->detach)
+ goto out;
+
tx_info = kvzalloc(sizeof(*tx_info), GFP_KERNEL);
if (!tx_info)
goto out;
cxgb4_remove_tid(&tx_info->adap->tids, tx_info->tx_chan,
tx_info->tid, tx_info->ip_family);
+ xa_erase(&u_ctx->tid_list, tx_info->tid);
+
put_module:
/* release module refcount */
module_put(THIS_MODULE);
{
const struct cpl_act_open_rpl *p = (void *)input;
struct chcr_ktls_info *tx_info = NULL;
+ struct chcr_ktls_ofld_ctx_tx *tx_ctx;
+ struct chcr_ktls_uld_ctx *u_ctx;
unsigned int atid, tid, status;
+ struct tls_context *tls_ctx;
struct tid_info *t;
+ int ret = 0;
tid = GET_TID(p);
status = AOPEN_STATUS_G(ntohl(p->atid_status));
if (!status) {
tx_info->tid = tid;
cxgb4_insert_tid(t, tx_info, tx_info->tid, tx_info->ip_family);
+ /* Adding tid */
+ tls_ctx = tls_get_ctx(tx_info->sk);
+ tx_ctx = chcr_get_ktls_tx_context(tls_ctx);
+ u_ctx = adap->uld[CXGB4_ULD_KTLS].handle;
+ if (u_ctx) {
+ ret = xa_insert_bh(&u_ctx->tid_list, tid, tx_ctx,
+ GFP_NOWAIT);
+ if (ret < 0) {
+ pr_err("%s: Failed to allocate tid XA entry = %d\n",
+ __func__, tx_info->tid);
+ tx_info->open_state = CH_KTLS_OPEN_FAILURE;
+ goto out;
+ }
+ }
tx_info->open_state = CH_KTLS_OPEN_SUCCESS;
} else {
tx_info->open_state = CH_KTLS_OPEN_FAILURE;
}
+out:
spin_unlock(&tx_info->lock);
complete(&tx_info->completion);
- return 0;
+ return ret;
}
/*
goto out;
}
u_ctx->lldi = *lldi;
+ u_ctx->detach = false;
+ xa_init_flags(&u_ctx->tid_list, XA_FLAGS_LOCK_BH);
out:
return u_ctx;
}
return 0;
}
+static void clear_conn_resources(struct chcr_ktls_info *tx_info)
+{
+ /* clear l2t entry */
+ if (tx_info->l2te)
+ cxgb4_l2t_release(tx_info->l2te);
+
+#if IS_ENABLED(CONFIG_IPV6)
+ /* clear clip entry */
+ if (tx_info->ip_family == AF_INET6)
+ cxgb4_clip_release(tx_info->netdev, (const u32 *)
+ &tx_info->sk->sk_v6_rcv_saddr,
+ 1);
+#endif
+
+ /* clear tid */
+ if (tx_info->tid != -1)
+ cxgb4_remove_tid(&tx_info->adap->tids, tx_info->tx_chan,
+ tx_info->tid, tx_info->ip_family);
+}
+
+static void ch_ktls_reset_all_conn(struct chcr_ktls_uld_ctx *u_ctx)
+{
+ struct ch_ktls_port_stats_debug *port_stats;
+ struct chcr_ktls_ofld_ctx_tx *tx_ctx;
+ struct chcr_ktls_info *tx_info;
+ unsigned long index;
+
+ xa_for_each(&u_ctx->tid_list, index, tx_ctx) {
+ tx_info = tx_ctx->chcr_info;
+ clear_conn_resources(tx_info);
+ port_stats = &tx_info->adap->ch_ktls_stats.ktls_port[tx_info->port_id];
+ atomic64_inc(&port_stats->ktls_tx_connection_close);
+ kvfree(tx_info);
+ tx_ctx->chcr_info = NULL;
+ /* release module refcount */
+ module_put(THIS_MODULE);
+ }
+}
+
static int chcr_ktls_uld_state_change(void *handle, enum cxgb4_state new_state)
{
struct chcr_ktls_uld_ctx *u_ctx = handle;
case CXGB4_STATE_DETACH:
pr_info("%s: Down\n", pci_name(u_ctx->lldi.pdev));
mutex_lock(&dev_mutex);
+ u_ctx->detach = true;
list_del(&u_ctx->entry);
+ ch_ktls_reset_all_conn(u_ctx);
+ xa_destroy(&u_ctx->tid_list);
mutex_unlock(&dev_mutex);
break;
default:
adap = pci_get_drvdata(u_ctx->lldi.pdev);
memset(&adap->ch_ktls_stats, 0, sizeof(adap->ch_ktls_stats));
list_del(&u_ctx->entry);
+ xa_destroy(&u_ctx->tid_list);
kfree(u_ctx);
}
mutex_unlock(&dev_mutex);
struct chcr_ktls_uld_ctx {
struct list_head entry;
struct cxgb4_lld_info lldi;
+ struct xarray tid_list;
+ bool detach;
};
static inline struct chcr_ktls_ofld_ctx_tx *
cerr = put_cmsg(msg, SOL_TLS, TLS_GET_RECORD_TYPE,
sizeof(thdr->type), &thdr->type);
- if (cerr && thdr->type != TLS_RECORD_TYPE_DATA)
- return -EIO;
+ if (cerr && thdr->type != TLS_RECORD_TYPE_DATA) {
+ copied = -EIO;
+ break;
+ }
/* don't send tls header, skip copy */
goto skip_copy;
}
}
/* ------------------------------------------------------------------------- */
-static void fec_get_mac(struct net_device *ndev)
+static int fec_get_mac(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
unsigned char *iap, tmpaddr[ETH_ALEN];
ret = of_get_mac_address(np, tmpaddr);
if (!ret)
iap = tmpaddr;
+ else if (ret == -EPROBE_DEFER)
+ return ret;
}
}
eth_hw_addr_random(ndev);
dev_info(&fep->pdev->dev, "Using random MAC address: %pM\n",
ndev->dev_addr);
- return;
+ return 0;
}
memcpy(ndev->dev_addr, iap, ETH_ALEN);
/* Adjust MAC if using macaddr */
if (iap == macaddr)
ndev->dev_addr[ETH_ALEN-1] = macaddr[ETH_ALEN-1] + fep->dev_id;
+
+ return 0;
}
/* ------------------------------------------------------------------------- */
return ret;
}
- fec_enet_alloc_queue(ndev);
+ ret = fec_enet_alloc_queue(ndev);
+ if (ret)
+ return ret;
bd_size = (fep->total_tx_ring_size + fep->total_rx_ring_size) * dsize;
cbd_base = dmam_alloc_coherent(&fep->pdev->dev, bd_size, &bd_dma,
GFP_KERNEL);
if (!cbd_base) {
- return -ENOMEM;
+ ret = -ENOMEM;
+ goto free_queue_mem;
}
/* Get the Ethernet address */
- fec_get_mac(ndev);
+ ret = fec_get_mac(ndev);
+ if (ret)
+ goto free_queue_mem;
+
/* make sure MAC we just acquired is programmed into the hw */
fec_set_mac_address(ndev, NULL);
fec_enet_update_ethtool_stats(ndev);
return 0;
+
+free_queue_mem:
+ fec_enet_free_queue(ndev);
+ return ret;
}
#ifdef CONFIG_OF
/* Double check we have no extra work.
* Ensure unmask synchronizes with checking for work.
*/
- dma_rmb();
+ mb();
if (block->tx)
reschedule |= gve_tx_poll(block, -1);
if (block->rx)
int vecs_left = new_num_ntfy_blks % 2;
priv->num_ntfy_blks = new_num_ntfy_blks;
+ priv->mgmt_msix_idx = priv->num_ntfy_blks;
priv->tx_cfg.max_queues = min_t(int, priv->tx_cfg.max_queues,
vecs_per_type);
priv->rx_cfg.max_queues = min_t(int, priv->rx_cfg.max_queues,
{
int i;
- /* Free the irqs */
- for (i = 0; i < priv->num_ntfy_blks; i++) {
- struct gve_notify_block *block = &priv->ntfy_blocks[i];
- int msix_idx = i;
+ if (priv->msix_vectors) {
+ /* Free the irqs */
+ for (i = 0; i < priv->num_ntfy_blks; i++) {
+ struct gve_notify_block *block = &priv->ntfy_blocks[i];
+ int msix_idx = i;
- irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
- NULL);
- free_irq(priv->msix_vectors[msix_idx].vector, block);
+ irq_set_affinity_hint(priv->msix_vectors[msix_idx].vector,
+ NULL);
+ free_irq(priv->msix_vectors[msix_idx].vector, block);
+ }
+ free_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector, priv);
}
dma_free_coherent(&priv->pdev->dev,
priv->num_ntfy_blks * sizeof(*priv->ntfy_blocks),
priv->ntfy_blocks, priv->ntfy_block_bus);
priv->ntfy_blocks = NULL;
- free_irq(priv->msix_vectors[priv->mgmt_msix_idx].vector, priv);
pci_disable_msix(priv->pdev);
kvfree(priv->msix_vectors);
priv->msix_vectors = NULL;
tx->dev = &priv->pdev->dev;
if (!tx->raw_addressing) {
tx->tx_fifo.qpl = gve_assign_tx_qpl(priv);
-
+ if (!tx->tx_fifo.qpl)
+ goto abort_with_desc;
/* map Tx FIFO */
if (gve_tx_fifo_init(priv, &tx->tx_fifo))
- goto abort_with_desc;
+ goto abort_with_qpl;
}
tx->q_resources =
abort_with_fifo:
if (!tx->raw_addressing)
gve_tx_fifo_release(priv, &tx->tx_fifo);
+abort_with_qpl:
+ if (!tx->raw_addressing)
+ gve_unassign_qpl(priv, tx->tx_fifo.qpl->id);
abort_with_desc:
dma_free_coherent(hdev, bytes, tx->desc, tx->bus);
tx->desc = NULL;
struct gve_tx_ring *tx;
int nsegs;
- WARN(skb_get_queue_mapping(skb) > priv->tx_cfg.num_queues,
+ WARN(skb_get_queue_mapping(skb) >= priv->tx_cfg.num_queues,
"skb queue index out of range");
tx = &priv->tx[skb_get_queue_mapping(skb)];
if (unlikely(gve_maybe_stop_tx(tx, skb))) {
}
/**
- *hns_nic_set_link_settings - implement ethtool set link ksettings
+ *hns_nic_set_link_ksettings - implement ethtool set link ksettings
*@net_dev: net_device
*@cmd: ethtool_link_ksettings
*retuen 0 - success , negative --fail
}
/**
- * get_ethtool_stats - get detail statistics.
+ * hns_get_ethtool_stats - get detail statistics.
* @netdev: net device
* @stats: statistics info.
* @data: statistics data.
}
/**
- * get_strings: Return a set of strings that describe the requested objects
+ * hns_get_strings: Return a set of strings that describe the requested objects
* @netdev: net device
* @stringset: string set ID.
* @data: objects data.
struct hnae3_ae_dev *ae_dev = pci_get_drvdata(priv->ae_handle->pdev);
struct hns3_enet_coalesce *tx_coal = &tqp_vector->tx_group.coal;
struct hns3_enet_coalesce *rx_coal = &tqp_vector->rx_group.coal;
+ struct hns3_enet_coalesce *ptx_coal = &priv->tx_coal;
+ struct hns3_enet_coalesce *prx_coal = &priv->rx_coal;
- /* initialize the configuration for interrupt coalescing.
- * 1. GL (Interrupt Gap Limiter)
- * 2. RL (Interrupt Rate Limiter)
- * 3. QL (Interrupt Quantity Limiter)
- *
- * Default: enable interrupt coalescing self-adaptive and GL
- */
- tx_coal->adapt_enable = 1;
- rx_coal->adapt_enable = 1;
+ tx_coal->adapt_enable = ptx_coal->adapt_enable;
+ rx_coal->adapt_enable = prx_coal->adapt_enable;
- tx_coal->int_gl = HNS3_INT_GL_50K;
- rx_coal->int_gl = HNS3_INT_GL_50K;
+ tx_coal->int_gl = ptx_coal->int_gl;
+ rx_coal->int_gl = prx_coal->int_gl;
- rx_coal->flow_level = HNS3_FLOW_LOW;
- tx_coal->flow_level = HNS3_FLOW_LOW;
+ rx_coal->flow_level = prx_coal->flow_level;
+ tx_coal->flow_level = ptx_coal->flow_level;
/* device version above V3(include V3), GL can configure 1us
* unit, so uses 1us unit.
rx_coal->ql_enable = 1;
tx_coal->int_ql_max = ae_dev->dev_specs.int_ql_max;
rx_coal->int_ql_max = ae_dev->dev_specs.int_ql_max;
- tx_coal->int_ql = HNS3_INT_QL_DEFAULT_CFG;
- rx_coal->int_ql = HNS3_INT_QL_DEFAULT_CFG;
+ tx_coal->int_ql = ptx_coal->int_ql;
+ rx_coal->int_ql = prx_coal->int_ql;
}
}
l4.udp->dest == htons(4790))))
return false;
- skb_checksum_help(skb);
-
return true;
}
/* the stack computes the IP header already,
* driver calculate l4 checksum when not TSO.
*/
- skb_checksum_help(skb);
- return 0;
+ return skb_checksum_help(skb);
}
hns3_set_outer_l2l3l4(skb, ol4_proto, ol_type_vlan_len_msec);
break;
case IPPROTO_UDP:
if (hns3_tunnel_csum_bug(skb))
- break;
+ return skb_checksum_help(skb);
hns3_set_field(*type_cs_vlan_tso, HNS3_TXD_L4CS_B, 1);
hns3_set_field(*type_cs_vlan_tso, HNS3_TXD_L4T_S,
/* the stack computes the IP header already,
* driver calculate l4 checksum when not TSO.
*/
- skb_checksum_help(skb);
- return 0;
+ return skb_checksum_help(skb);
}
return 0;
return ret;
}
+static void hns3_nic_init_coal_cfg(struct hns3_nic_priv *priv)
+{
+ struct hnae3_ae_dev *ae_dev = pci_get_drvdata(priv->ae_handle->pdev);
+ struct hns3_enet_coalesce *tx_coal = &priv->tx_coal;
+ struct hns3_enet_coalesce *rx_coal = &priv->rx_coal;
+
+ /* initialize the configuration for interrupt coalescing.
+ * 1. GL (Interrupt Gap Limiter)
+ * 2. RL (Interrupt Rate Limiter)
+ * 3. QL (Interrupt Quantity Limiter)
+ *
+ * Default: enable interrupt coalescing self-adaptive and GL
+ */
+ tx_coal->adapt_enable = 1;
+ rx_coal->adapt_enable = 1;
+
+ tx_coal->int_gl = HNS3_INT_GL_50K;
+ rx_coal->int_gl = HNS3_INT_GL_50K;
+
+ rx_coal->flow_level = HNS3_FLOW_LOW;
+ tx_coal->flow_level = HNS3_FLOW_LOW;
+
+ if (ae_dev->dev_specs.int_ql_max) {
+ tx_coal->int_ql = HNS3_INT_QL_DEFAULT_CFG;
+ rx_coal->int_ql = HNS3_INT_QL_DEFAULT_CFG;
+ }
+}
+
static int hns3_nic_alloc_vector_data(struct hns3_nic_priv *priv)
{
struct hnae3_handle *h = priv->ae_handle;
goto out_get_ring_cfg;
}
+ hns3_nic_init_coal_cfg(priv);
+
ret = hns3_nic_alloc_vector_data(priv);
if (ret) {
ret = -ENOMEM;
if (ret)
goto out_init_phy;
- ret = register_netdev(netdev);
- if (ret) {
- dev_err(priv->dev, "probe register netdev fail!\n");
- goto out_reg_netdev_fail;
- }
-
/* the device can work without cpu rmap, only aRFS needs it */
ret = hns3_set_rx_cpu_rmap(netdev);
if (ret)
if (ae_dev->dev_version >= HNAE3_DEVICE_VERSION_V3)
set_bit(HNAE3_PFLAG_LIMIT_PROMISC, &handle->supported_pflags);
+ ret = register_netdev(netdev);
+ if (ret) {
+ dev_err(priv->dev, "probe register netdev fail!\n");
+ goto out_reg_netdev_fail;
+ }
+
if (netif_msg_drv(handle))
hns3_info_show(priv);
return ret;
+out_reg_netdev_fail:
+ hns3_dbg_uninit(handle);
out_client_start:
hns3_free_rx_cpu_rmap(netdev);
hns3_nic_uninit_irq(priv);
out_init_irq_fail:
- unregister_netdev(netdev);
-out_reg_netdev_fail:
hns3_uninit_phy(netdev);
out_init_phy:
hns3_uninit_all_ring(priv);
return 0;
}
-static void hns3_store_coal(struct hns3_nic_priv *priv)
-{
- /* ethtool only support setting and querying one coal
- * configuration for now, so save the vector 0' coal
- * configuration here in order to restore it.
- */
- memcpy(&priv->tx_coal, &priv->tqp_vector[0].tx_group.coal,
- sizeof(struct hns3_enet_coalesce));
- memcpy(&priv->rx_coal, &priv->tqp_vector[0].rx_group.coal,
- sizeof(struct hns3_enet_coalesce));
-}
-
-static void hns3_restore_coal(struct hns3_nic_priv *priv)
-{
- u16 vector_num = priv->vector_num;
- int i;
-
- for (i = 0; i < vector_num; i++) {
- memcpy(&priv->tqp_vector[i].tx_group.coal, &priv->tx_coal,
- sizeof(struct hns3_enet_coalesce));
- memcpy(&priv->tqp_vector[i].rx_group.coal, &priv->rx_coal,
- sizeof(struct hns3_enet_coalesce));
- }
-}
-
static int hns3_reset_notify_down_enet(struct hnae3_handle *handle)
{
struct hnae3_knic_private_info *kinfo = &handle->kinfo;
if (ret)
goto err_put_ring;
- hns3_restore_coal(priv);
-
ret = hns3_nic_init_vector_data(priv);
if (ret)
goto err_dealloc_vector;
hns3_nic_uninit_vector_data(priv);
- hns3_store_coal(priv);
-
hns3_nic_dealloc_vector_data(priv);
hns3_uninit_all_ring(priv);
h->ae_algo->ops->get_channels(h, ch);
}
-static int hns3_get_coalesce_per_queue(struct net_device *netdev, u32 queue,
- struct ethtool_coalesce *cmd)
+static int hns3_get_coalesce(struct net_device *netdev,
+ struct ethtool_coalesce *cmd)
{
- struct hns3_enet_tqp_vector *tx_vector, *rx_vector;
struct hns3_nic_priv *priv = netdev_priv(netdev);
+ struct hns3_enet_coalesce *tx_coal = &priv->tx_coal;
+ struct hns3_enet_coalesce *rx_coal = &priv->rx_coal;
struct hnae3_handle *h = priv->ae_handle;
- u16 queue_num = h->kinfo.num_tqps;
if (hns3_nic_resetting(netdev))
return -EBUSY;
- if (queue >= queue_num) {
- netdev_err(netdev,
- "Invalid queue value %u! Queue max id=%u\n",
- queue, queue_num - 1);
- return -EINVAL;
- }
-
- tx_vector = priv->ring[queue].tqp_vector;
- rx_vector = priv->ring[queue_num + queue].tqp_vector;
+ cmd->use_adaptive_tx_coalesce = tx_coal->adapt_enable;
+ cmd->use_adaptive_rx_coalesce = rx_coal->adapt_enable;
- cmd->use_adaptive_tx_coalesce =
- tx_vector->tx_group.coal.adapt_enable;
- cmd->use_adaptive_rx_coalesce =
- rx_vector->rx_group.coal.adapt_enable;
-
- cmd->tx_coalesce_usecs = tx_vector->tx_group.coal.int_gl;
- cmd->rx_coalesce_usecs = rx_vector->rx_group.coal.int_gl;
+ cmd->tx_coalesce_usecs = tx_coal->int_gl;
+ cmd->rx_coalesce_usecs = rx_coal->int_gl;
cmd->tx_coalesce_usecs_high = h->kinfo.int_rl_setting;
cmd->rx_coalesce_usecs_high = h->kinfo.int_rl_setting;
- cmd->tx_max_coalesced_frames = tx_vector->tx_group.coal.int_ql;
- cmd->rx_max_coalesced_frames = rx_vector->rx_group.coal.int_ql;
+ cmd->tx_max_coalesced_frames = tx_coal->int_ql;
+ cmd->rx_max_coalesced_frames = rx_coal->int_ql;
return 0;
}
-static int hns3_get_coalesce(struct net_device *netdev,
- struct ethtool_coalesce *cmd)
-{
- return hns3_get_coalesce_per_queue(netdev, 0, cmd);
-}
-
static int hns3_check_gl_coalesce_para(struct net_device *netdev,
struct ethtool_coalesce *cmd)
{
return ret;
}
- ret = hns3_check_ql_coalesce_param(netdev, cmd);
- if (ret)
- return ret;
-
- if (cmd->use_adaptive_tx_coalesce == 1 ||
- cmd->use_adaptive_rx_coalesce == 1) {
- netdev_info(netdev,
- "adaptive-tx=%u and adaptive-rx=%u, tx_usecs or rx_usecs will changed dynamically.\n",
- cmd->use_adaptive_tx_coalesce,
- cmd->use_adaptive_rx_coalesce);
- }
-
- return 0;
+ return hns3_check_ql_coalesce_param(netdev, cmd);
}
static void hns3_set_coalesce_per_queue(struct net_device *netdev,
struct ethtool_coalesce *cmd)
{
struct hnae3_handle *h = hns3_get_handle(netdev);
+ struct hns3_nic_priv *priv = netdev_priv(netdev);
+ struct hns3_enet_coalesce *tx_coal = &priv->tx_coal;
+ struct hns3_enet_coalesce *rx_coal = &priv->rx_coal;
u16 queue_num = h->kinfo.num_tqps;
int ret;
int i;
h->kinfo.int_rl_setting =
hns3_rl_round_down(cmd->rx_coalesce_usecs_high);
+ tx_coal->adapt_enable = cmd->use_adaptive_tx_coalesce;
+ rx_coal->adapt_enable = cmd->use_adaptive_rx_coalesce;
+
+ tx_coal->int_gl = cmd->tx_coalesce_usecs;
+ rx_coal->int_gl = cmd->rx_coalesce_usecs;
+
+ tx_coal->int_ql = cmd->tx_max_coalesced_frames;
+ rx_coal->int_ql = cmd->rx_max_coalesced_frames;
+
for (i = 0; i < queue_num; i++)
hns3_set_coalesce_per_queue(netdev, cmd, i);
unsigned int flag;
int ret = 0;
- memset(&resp_msg, 0, sizeof(resp_msg));
/* handle all the mailbox requests in the queue */
while (!hclge_cmd_crq_empty(&hdev->hw)) {
if (test_bit(HCLGE_STATE_CMD_DISABLE, &hdev->state)) {
trace_hclge_pf_mbx_get(hdev, req);
+ /* clear the resp_msg before processing every mailbox message */
+ memset(&resp_msg, 0, sizeof(resp_msg));
+
switch (req->msg.code) {
case HCLGE_MBX_MAP_RING_TO_VECTOR:
ret = hclge_map_unmap_ring_to_vf_vector(vport, true,
case XDP_TX:
xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->queue_index];
result = i40e_xmit_xdp_tx_ring(xdp, xdp_ring);
+ if (result == I40E_XDP_CONSUMED)
+ goto out_failure;
break;
case XDP_REDIRECT:
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
- result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
+ result = I40E_XDP_REDIR;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough; /* handle aborts by dropping packet */
case XDP_DROP:
if (likely(act == XDP_REDIRECT)) {
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
- result = !err ? I40E_XDP_REDIR : I40E_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
rcu_read_unlock();
- return result;
+ return I40E_XDP_REDIR;
}
switch (act) {
case XDP_TX:
xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->queue_index];
result = i40e_xmit_xdp_tx_ring(xdp, xdp_ring);
+ if (result == I40E_XDP_CONSUMED)
+ goto out_failure;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough; /* handle aborts by dropping packet */
case XDP_DROP:
struct ice_tc_cfg tc_cfg;
struct bpf_prog *xdp_prog;
struct ice_ring **xdp_rings; /* XDP ring array */
+ unsigned long *af_xdp_zc_qps; /* tracks AF_XDP ZC enabled qps */
u16 num_xdp_txq; /* Used XDP queues */
u8 xdp_mapping_mode; /* ICE_MAP_MODE_[CONTIG|SCATTER] */
*/
static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_ring *ring)
{
+ struct ice_vsi *vsi = ring->vsi;
u16 qid = ring->q_index;
if (ice_ring_is_xdp(ring))
- qid -= ring->vsi->num_xdp_txq;
+ qid -= vsi->num_xdp_txq;
- if (!ice_is_xdp_ena_vsi(ring->vsi))
+ if (!ice_is_xdp_ena_vsi(vsi) || !test_bit(qid, vsi->af_xdp_zc_qps))
return NULL;
- return xsk_get_pool_from_qid(ring->vsi->netdev, qid);
+ return xsk_get_pool_from_qid(vsi->netdev, qid);
}
/**
ice_ethtool_advertise_link_mode(ICE_AQ_LINK_SPEED_100GB,
100000baseKR4_Full);
}
-
- /* Autoneg PHY types */
- if (phy_types_low & ICE_PHY_TYPE_LOW_100BASE_TX ||
- phy_types_low & ICE_PHY_TYPE_LOW_1000BASE_T ||
- phy_types_low & ICE_PHY_TYPE_LOW_1000BASE_KX ||
- phy_types_low & ICE_PHY_TYPE_LOW_2500BASE_T ||
- phy_types_low & ICE_PHY_TYPE_LOW_2500BASE_KX ||
- phy_types_low & ICE_PHY_TYPE_LOW_5GBASE_T ||
- phy_types_low & ICE_PHY_TYPE_LOW_5GBASE_KR ||
- phy_types_low & ICE_PHY_TYPE_LOW_10GBASE_T ||
- phy_types_low & ICE_PHY_TYPE_LOW_10GBASE_KR_CR1 ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_T ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR_S ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_CR1 ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR_S ||
- phy_types_low & ICE_PHY_TYPE_LOW_25GBASE_KR1 ||
- phy_types_low & ICE_PHY_TYPE_LOW_40GBASE_CR4 ||
- phy_types_low & ICE_PHY_TYPE_LOW_40GBASE_KR4) {
- ethtool_link_ksettings_add_link_mode(ks, supported,
- Autoneg);
- ethtool_link_ksettings_add_link_mode(ks, advertising,
- Autoneg);
- }
- if (phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_CR2 ||
- phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_KR2 ||
- phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_CP ||
- phy_types_low & ICE_PHY_TYPE_LOW_50GBASE_KR_PAM4) {
- ethtool_link_ksettings_add_link_mode(ks, supported,
- Autoneg);
- ethtool_link_ksettings_add_link_mode(ks, advertising,
- Autoneg);
- }
- if (phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_CR4 ||
- phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_KR4 ||
- phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_KR_PAM4 ||
- phy_types_low & ICE_PHY_TYPE_LOW_100GBASE_CP2) {
- ethtool_link_ksettings_add_link_mode(ks, supported,
- Autoneg);
- ethtool_link_ksettings_add_link_mode(ks, advertising,
- Autoneg);
- }
}
#define TEST_SET_BITS_TIMEOUT 50
ks->base.port = PORT_TP;
break;
case ICE_MEDIA_BACKPLANE:
- ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
ethtool_link_ksettings_add_link_mode(ks, supported, Backplane);
- ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
ethtool_link_ksettings_add_link_mode(ks, advertising,
Backplane);
ks->base.port = PORT_NONE;
if (caps->link_fec_options & ICE_AQC_PHY_FEC_25G_RS_CLAUSE91_EN)
ethtool_link_ksettings_add_link_mode(ks, supported, FEC_RS);
+ /* Set supported and advertised autoneg */
+ if (ice_is_phy_caps_an_enabled(caps)) {
+ ethtool_link_ksettings_add_link_mode(ks, supported, Autoneg);
+ ethtool_link_ksettings_add_link_mode(ks, advertising, Autoneg);
+ }
+
done:
kfree(caps);
return err;
#define PF_FW_ATQLEN_ATQOVFL_M BIT(29)
#define PF_FW_ATQLEN_ATQCRIT_M BIT(30)
#define VF_MBX_ARQLEN(_VF) (0x0022BC00 + ((_VF) * 4))
+#define VF_MBX_ATQLEN(_VF) (0x0022A800 + ((_VF) * 4))
#define PF_FW_ATQLEN_ATQENABLE_M BIT(31)
#define PF_FW_ATQT 0x00080400
#define PF_MBX_ARQBAH 0x0022E400
if (!vsi->q_vectors)
goto err_vectors;
+ vsi->af_xdp_zc_qps = bitmap_zalloc(max_t(int, vsi->alloc_txq, vsi->alloc_rxq), GFP_KERNEL);
+ if (!vsi->af_xdp_zc_qps)
+ goto err_zc_qps;
+
return 0;
+err_zc_qps:
+ devm_kfree(dev, vsi->q_vectors);
err_vectors:
devm_kfree(dev, vsi->rxq_map);
err_rxq_map:
break;
case ICE_VSI_VF:
vf = &pf->vf[vsi->vf_id];
+ if (vf->num_req_qs)
+ vf->num_vf_qs = vf->num_req_qs;
vsi->alloc_txq = vf->num_vf_qs;
vsi->alloc_rxq = vf->num_vf_qs;
/* pf->num_msix_per_vf includes (VF miscellaneous vector +
dev = ice_pf_to_dev(pf);
+ if (vsi->af_xdp_zc_qps) {
+ bitmap_free(vsi->af_xdp_zc_qps);
+ vsi->af_xdp_zc_qps = NULL;
+ }
/* free the ring and vector containers */
if (vsi->q_vectors) {
devm_kfree(dev, vsi->q_vectors);
struct bpf_prog *xdp_prog)
{
struct ice_ring *xdp_ring;
- int err;
+ int err, result;
u32 act;
act = bpf_prog_run_xdp(xdp_prog, xdp);
return ICE_XDP_PASS;
case XDP_TX:
xdp_ring = rx_ring->vsi->xdp_rings[smp_processor_id()];
- return ice_xmit_xdp_buff(xdp, xdp_ring);
+ result = ice_xmit_xdp_buff(xdp, xdp_ring);
+ if (result == ICE_XDP_CONSUMED)
+ goto out_failure;
+ return result;
case XDP_REDIRECT:
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
- return !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
+ return ICE_XDP_REDIR;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough;
case XDP_DROP:
struct ice_tx_offload_params offload = { 0 };
struct ice_vsi *vsi = tx_ring->vsi;
struct ice_tx_buf *first;
+ struct ethhdr *eth;
unsigned int count;
int tso, csum;
goto out_drop;
/* allow CONTROL frames egress from main VSI if FW LLDP disabled */
- if (unlikely(skb->priority == TC_PRIO_CONTROL &&
+ eth = (struct ethhdr *)skb_mac_header(skb);
+ if (unlikely((skb->priority == TC_PRIO_CONTROL ||
+ eth->h_proto == htons(ETH_P_LLDP)) &&
vsi->type == ICE_VSI_PF &&
vsi->port_info->qos_cfg.is_sw_lldp))
offload.cd_qw1 |= (u64)(ICE_TX_DESC_DTYPE_CTX |
*/
clear_bit(ICE_VF_STATE_INIT, vf->vf_states);
- /* VF_MBX_ARQLEN is cleared by PFR, so the driver needs to clear it
- * in the case of VFR. If this is done for PFR, it can mess up VF
- * resets because the VF driver may already have started cleanup
- * by the time we get here.
+ /* VF_MBX_ARQLEN and VF_MBX_ATQLEN are cleared by PFR, so the driver
+ * needs to clear them in the case of VFR/VFLR. If this is done for
+ * PFR, it can mess up VF resets because the VF driver may already
+ * have started cleanup by the time we get here.
*/
- if (!is_pfr)
+ if (!is_pfr) {
wr32(hw, VF_MBX_ARQLEN(vf->vf_id), 0);
+ wr32(hw, VF_MBX_ATQLEN(vf->vf_id), 0);
+ }
/* In the case of a VFLR, the HW has already reset the VF and we
* just need to clean up, so don't hit the VFRTRIG register.
ice_vf_ctrl_vsi_release(vf);
ice_vf_pre_vsi_rebuild(vf);
- ice_vf_rebuild_vsi_with_release(vf);
+
+ if (ice_vf_rebuild_vsi_with_release(vf)) {
+ dev_err(dev, "Failed to release and setup the VF%u's VSI\n", vf->vf_id);
+ return false;
+ }
+
ice_vf_post_vsi_rebuild(vf);
/* if the VF has been reset allow it to come up again */
if (!pool)
return -EINVAL;
+ clear_bit(qid, vsi->af_xdp_zc_qps);
xsk_pool_dma_unmap(pool, ICE_RX_DMA_ATTR);
return 0;
if (err)
return err;
+ set_bit(qid, vsi->af_xdp_zc_qps);
+
return 0;
}
if (likely(act == XDP_REDIRECT)) {
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
- result = !err ? ICE_XDP_REDIR : ICE_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
rcu_read_unlock();
- return result;
+ return ICE_XDP_REDIR;
}
switch (act) {
case XDP_TX:
xdp_ring = rx_ring->vsi->xdp_rings[rx_ring->q_index];
result = ice_xmit_xdp_buff(xdp, xdp_ring);
+ if (result == ICE_XDP_CONSUMED)
+ goto out_failure;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough;
case XDP_DROP:
void igb_ptp_tx_hang(struct igb_adapter *adapter);
void igb_ptp_rx_rgtstamp(struct igb_q_vector *q_vector, struct sk_buff *skb);
int igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va,
- struct sk_buff *skb);
+ ktime_t *timestamp);
int igb_ptp_set_ts_config(struct net_device *netdev, struct ifreq *ifr);
int igb_ptp_get_ts_config(struct net_device *netdev, struct ifreq *ifr);
void igb_set_flag_queue_pairs(struct igb_adapter *, const u32);
static struct sk_buff *igb_construct_skb(struct igb_ring *rx_ring,
struct igb_rx_buffer *rx_buffer,
struct xdp_buff *xdp,
- union e1000_adv_rx_desc *rx_desc)
+ ktime_t timestamp)
{
#if (PAGE_SIZE < 8192)
unsigned int truesize = igb_rx_pg_size(rx_ring) / 2;
if (unlikely(!skb))
return NULL;
- if (unlikely(igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP))) {
- if (!igb_ptp_rx_pktstamp(rx_ring->q_vector, xdp->data, skb)) {
- xdp->data += IGB_TS_HDR_LEN;
- size -= IGB_TS_HDR_LEN;
- }
- }
+ if (timestamp)
+ skb_hwtstamps(skb)->hwtstamp = timestamp;
/* Determine available headroom for copy */
headlen = size;
static struct sk_buff *igb_build_skb(struct igb_ring *rx_ring,
struct igb_rx_buffer *rx_buffer,
struct xdp_buff *xdp,
- union e1000_adv_rx_desc *rx_desc)
+ ktime_t timestamp)
{
#if (PAGE_SIZE < 8192)
unsigned int truesize = igb_rx_pg_size(rx_ring) / 2;
if (metasize)
skb_metadata_set(skb, metasize);
- /* pull timestamp out of packet data */
- if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) {
- if (!igb_ptp_rx_pktstamp(rx_ring->q_vector, skb->data, skb))
- __skb_pull(skb, IGB_TS_HDR_LEN);
- }
+ if (timestamp)
+ skb_hwtstamps(skb)->hwtstamp = timestamp;
/* update buffer offset */
#if (PAGE_SIZE < 8192)
break;
case XDP_TX:
result = igb_xdp_xmit_back(adapter, xdp);
+ if (result == IGB_XDP_CONSUMED)
+ goto out_failure;
break;
case XDP_REDIRECT:
err = xdp_do_redirect(adapter->netdev, xdp, xdp_prog);
- if (!err)
- result = IGB_XDP_REDIR;
- else
- result = IGB_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
+ result = IGB_XDP_REDIR;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough;
case XDP_DROP:
while (likely(total_packets < budget)) {
union e1000_adv_rx_desc *rx_desc;
struct igb_rx_buffer *rx_buffer;
+ ktime_t timestamp = 0;
+ int pkt_offset = 0;
unsigned int size;
+ void *pktbuf;
/* return some buffers to hardware, one at a time is too slow */
if (cleaned_count >= IGB_RX_BUFFER_WRITE) {
dma_rmb();
rx_buffer = igb_get_rx_buffer(rx_ring, size, &rx_buf_pgcnt);
+ pktbuf = page_address(rx_buffer->page) + rx_buffer->page_offset;
+
+ /* pull rx packet timestamp if available and valid */
+ if (igb_test_staterr(rx_desc, E1000_RXDADV_STAT_TSIP)) {
+ int ts_hdr_len;
+
+ ts_hdr_len = igb_ptp_rx_pktstamp(rx_ring->q_vector,
+ pktbuf, ×tamp);
+
+ pkt_offset += ts_hdr_len;
+ size -= ts_hdr_len;
+ }
/* retrieve a buffer from the ring */
if (!skb) {
- unsigned int offset = igb_rx_offset(rx_ring);
- unsigned char *hard_start;
+ unsigned char *hard_start = pktbuf - igb_rx_offset(rx_ring);
+ unsigned int offset = pkt_offset + igb_rx_offset(rx_ring);
- hard_start = page_address(rx_buffer->page) +
- rx_buffer->page_offset - offset;
xdp_prepare_buff(&xdp, hard_start, offset, size, true);
#if (PAGE_SIZE > 4096)
/* At larger PAGE_SIZE, frame_sz depend on len size */
} else if (skb)
igb_add_rx_frag(rx_ring, rx_buffer, skb, size);
else if (ring_uses_build_skb(rx_ring))
- skb = igb_build_skb(rx_ring, rx_buffer, &xdp, rx_desc);
+ skb = igb_build_skb(rx_ring, rx_buffer, &xdp,
+ timestamp);
else
skb = igb_construct_skb(rx_ring, rx_buffer,
- &xdp, rx_desc);
+ &xdp, timestamp);
/* exit if we failed to retrieve a buffer */
if (!skb) {
dev_kfree_skb_any(skb);
}
-#define IGB_RET_PTP_DISABLED 1
-#define IGB_RET_PTP_INVALID 2
-
/**
* igb_ptp_rx_pktstamp - retrieve Rx per packet timestamp
* @q_vector: Pointer to interrupt specific structure
* @va: Pointer to address containing Rx buffer
- * @skb: Buffer containing timestamp and packet
+ * @timestamp: Pointer where timestamp will be stored
*
* This function is meant to retrieve a timestamp from the first buffer of an
* incoming frame. The value is stored in little endian format starting on
* byte 8
*
- * Returns: 0 if success, nonzero if failure
+ * Returns: The timestamp header length or 0 if not available
**/
int igb_ptp_rx_pktstamp(struct igb_q_vector *q_vector, void *va,
- struct sk_buff *skb)
+ ktime_t *timestamp)
{
struct igb_adapter *adapter = q_vector->adapter;
+ struct skb_shared_hwtstamps ts;
__le64 *regval = (__le64 *)va;
int adjust = 0;
if (!(adapter->ptp_flags & IGB_PTP_ENABLED))
- return IGB_RET_PTP_DISABLED;
+ return 0;
/* The timestamp is recorded in little endian format.
* DWORD: 0 1 2 3
/* check reserved dwords are zero, be/le doesn't matter for zero */
if (regval[0])
- return IGB_RET_PTP_INVALID;
+ return 0;
- igb_ptp_systim_to_hwtstamp(adapter, skb_hwtstamps(skb),
- le64_to_cpu(regval[1]));
+ igb_ptp_systim_to_hwtstamp(adapter, &ts, le64_to_cpu(regval[1]));
/* adjust timestamp for the RX latency based on link speed */
if (adapter->hw.mac.type == e1000_i210) {
break;
}
}
- skb_hwtstamps(skb)->hwtstamp =
- ktime_sub_ns(skb_hwtstamps(skb)->hwtstamp, adjust);
- return 0;
+ *timestamp = ktime_sub_ns(ts.hwtstamp, adjust);
+
+ return IGB_TS_HDR_LEN;
}
/**
break;
case XDP_TX:
if (igc_xdp_xmit_back(adapter, xdp) < 0)
- res = IGC_XDP_CONSUMED;
- else
- res = IGC_XDP_TX;
+ goto out_failure;
+ res = IGC_XDP_TX;
break;
case XDP_REDIRECT:
if (xdp_do_redirect(adapter->netdev, xdp, prog) < 0)
- res = IGC_XDP_CONSUMED;
- else
- res = IGC_XDP_REDIRECT;
+ goto out_failure;
+ res = IGC_XDP_REDIRECT;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(adapter->netdev, prog, act);
fallthrough;
case XDP_DROP:
break;
case XDP_TX:
xdpf = xdp_convert_buff_to_frame(xdp);
- if (unlikely(!xdpf)) {
- result = IXGBE_XDP_CONSUMED;
- break;
- }
+ if (unlikely(!xdpf))
+ goto out_failure;
result = ixgbe_xmit_xdp_ring(adapter, xdpf);
+ if (result == IXGBE_XDP_CONSUMED)
+ goto out_failure;
break;
case XDP_REDIRECT:
err = xdp_do_redirect(adapter->netdev, xdp, xdp_prog);
- if (!err)
- result = IXGBE_XDP_REDIR;
- else
- result = IXGBE_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
+ result = IXGBE_XDP_REDIR;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough; /* handle aborts by dropping packet */
case XDP_DROP:
return err;
}
-static s32 ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 *msgbuf, u32 vf)
+static int ixgbe_set_vf_lpe(struct ixgbe_adapter *adapter, u32 max_frame, u32 vf)
{
struct ixgbe_hw *hw = &adapter->hw;
- int max_frame = msgbuf[1];
u32 max_frs;
+ if (max_frame < ETH_MIN_MTU || max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE) {
+ e_err(drv, "VF max_frame %d out of range\n", max_frame);
+ return -EINVAL;
+ }
+
/*
* For 82599EB we have to keep all PFs and VFs operating with
* the same max_frame value in order to avoid sending an oversize
}
}
- /* MTU < 68 is an error and causes problems on some kernels */
- if (max_frame > IXGBE_MAX_JUMBO_FRAME_SIZE) {
- e_err(drv, "VF max_frame %d out of range\n", max_frame);
- return -EINVAL;
- }
-
/* pull current max frame size from hardware */
max_frs = IXGBE_READ_REG(hw, IXGBE_MAXFRS);
max_frs &= IXGBE_MHADD_MFS_MASK;
retval = ixgbe_set_vf_vlan_msg(adapter, msgbuf, vf);
break;
case IXGBE_VF_SET_LPE:
- retval = ixgbe_set_vf_lpe(adapter, msgbuf, vf);
+ retval = ixgbe_set_vf_lpe(adapter, msgbuf[1], vf);
break;
case IXGBE_VF_SET_MACVLAN:
retval = ixgbe_set_vf_macvlan_msg(adapter, msgbuf, vf);
if (likely(act == XDP_REDIRECT)) {
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
- result = !err ? IXGBE_XDP_REDIR : IXGBE_XDP_CONSUMED;
+ if (err)
+ goto out_failure;
rcu_read_unlock();
- return result;
+ return IXGBE_XDP_REDIR;
}
switch (act) {
break;
case XDP_TX:
xdpf = xdp_convert_buff_to_frame(xdp);
- if (unlikely(!xdpf)) {
- result = IXGBE_XDP_CONSUMED;
- break;
- }
+ if (unlikely(!xdpf))
+ goto out_failure;
result = ixgbe_xmit_xdp_ring(adapter, xdpf);
+ if (result == IXGBE_XDP_CONSUMED)
+ goto out_failure;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough; /* handle aborts by dropping packet */
case XDP_DROP:
case XDP_TX:
xdp_ring = adapter->xdp_ring[rx_ring->queue_index];
result = ixgbevf_xmit_xdp_ring(xdp_ring, xdp);
+ if (result == IXGBEVF_XDP_CONSUMED)
+ goto out_failure;
break;
default:
bpf_warn_invalid_xdp_action(act);
fallthrough;
case XDP_ABORTED:
+out_failure:
trace_xdp_exception(rx_ring->netdev, xdp_prog, act);
fallthrough; /* handle aborts by dropping packet */
case XDP_DROP:
lp->tx_irq = platform_get_irq_byname(pdev, "tx");
p = devm_platform_ioremap_resource_byname(pdev, "emac");
- if (!p) {
+ if (IS_ERR(p)) {
printk(KERN_ERR DRV_NAME ": cannot remap registers\n");
- return -ENOMEM;
+ return PTR_ERR(p);
}
lp->eth_regs = p;
p = devm_platform_ioremap_resource_byname(pdev, "dma_rx");
- if (!p) {
+ if (IS_ERR(p)) {
printk(KERN_ERR DRV_NAME ": cannot remap Rx DMA registers\n");
- return -ENOMEM;
+ return PTR_ERR(p);
}
lp->rx_dma_regs = p;
p = devm_platform_ioremap_resource_byname(pdev, "dma_tx");
- if (!p) {
+ if (IS_ERR(p)) {
printk(KERN_ERR DRV_NAME ": cannot remap Tx DMA registers\n");
- return -ENOMEM;
+ return PTR_ERR(p);
}
lp->tx_dma_regs = p;
static int xrx200_alloc_skb(struct xrx200_chan *ch)
{
+ dma_addr_t mapping;
int ret = 0;
ch->skb[ch->dma.desc] = netdev_alloc_skb_ip_align(ch->priv->net_dev,
goto skip;
}
- ch->dma.desc_base[ch->dma.desc].addr = dma_map_single(ch->priv->dev,
- ch->skb[ch->dma.desc]->data, XRX200_DMA_DATA_LEN,
- DMA_FROM_DEVICE);
- if (unlikely(dma_mapping_error(ch->priv->dev,
- ch->dma.desc_base[ch->dma.desc].addr))) {
+ mapping = dma_map_single(ch->priv->dev, ch->skb[ch->dma.desc]->data,
+ XRX200_DMA_DATA_LEN, DMA_FROM_DEVICE);
+ if (unlikely(dma_mapping_error(ch->priv->dev, mapping))) {
dev_kfree_skb_any(ch->skb[ch->dma.desc]);
ret = -ENOMEM;
goto skip;
}
+ ch->dma.desc_base[ch->dma.desc].addr = mapping;
+ /* Make sure the address is written before we give it to HW */
+ wmb();
skip:
ch->dma.desc_base[ch->dma.desc].ctl =
LTQ_DMA_OWN | LTQ_DMA_RX_OFFSET(NET_IP_ALIGN) |
ch->dma.desc %= LTQ_DESC_NUM;
if (ret) {
+ ch->skb[ch->dma.desc] = skb;
+ net_dev->stats.rx_dropped++;
netdev_err(net_dev, "failed to allocate new rx buffer\n");
return ret;
}
#define MVPP2_DESC_DMA_MASK DMA_BIT_MASK(40)
+/* Buffer header info bits */
+#define MVPP2_B_HDR_INFO_MC_ID_MASK 0xfff
+#define MVPP2_B_HDR_INFO_MC_ID(info) ((info) & MVPP2_B_HDR_INFO_MC_ID_MASK)
+#define MVPP2_B_HDR_INFO_LAST_OFFS 12
+#define MVPP2_B_HDR_INFO_LAST_MASK BIT(12)
+#define MVPP2_B_HDR_INFO_IS_LAST(info) \
+ (((info) & MVPP2_B_HDR_INFO_LAST_MASK) >> MVPP2_B_HDR_INFO_LAST_OFFS)
+
struct mvpp2_tai;
/* Definitions */
u32 indir[MVPP22_RSS_TABLE_ENTRIES];
};
+struct mvpp2_buff_hdr {
+ __le32 next_phys_addr;
+ __le32 next_dma_addr;
+ __le16 byte_count;
+ __le16 info;
+ __le16 reserved1; /* bm_qset (for future use, BM) */
+ u8 next_phys_addr_high;
+ u8 next_dma_addr_high;
+ __le16 reserved2;
+ __le16 reserved3;
+ __le16 reserved4;
+ __le16 reserved5;
+};
+
/* Shared Packet Processor resources */
struct mvpp2 {
/* Shared registers' base addresses */
return ret;
}
+static void mvpp2_buff_hdr_pool_put(struct mvpp2_port *port, struct mvpp2_rx_desc *rx_desc,
+ int pool, u32 rx_status)
+{
+ phys_addr_t phys_addr, phys_addr_next;
+ dma_addr_t dma_addr, dma_addr_next;
+ struct mvpp2_buff_hdr *buff_hdr;
+
+ phys_addr = mvpp2_rxdesc_dma_addr_get(port, rx_desc);
+ dma_addr = mvpp2_rxdesc_cookie_get(port, rx_desc);
+
+ do {
+ buff_hdr = (struct mvpp2_buff_hdr *)phys_to_virt(phys_addr);
+
+ phys_addr_next = le32_to_cpu(buff_hdr->next_phys_addr);
+ dma_addr_next = le32_to_cpu(buff_hdr->next_dma_addr);
+
+ if (port->priv->hw_version >= MVPP22) {
+ phys_addr_next |= ((u64)buff_hdr->next_phys_addr_high << 32);
+ dma_addr_next |= ((u64)buff_hdr->next_dma_addr_high << 32);
+ }
+
+ mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
+
+ phys_addr = phys_addr_next;
+ dma_addr = dma_addr_next;
+
+ } while (!MVPP2_B_HDR_INFO_IS_LAST(le16_to_cpu(buff_hdr->info)));
+}
+
/* Main rx processing */
static int mvpp2_rx(struct mvpp2_port *port, struct napi_struct *napi,
int rx_todo, struct mvpp2_rx_queue *rxq)
MVPP2_RXD_BM_POOL_ID_OFFS;
bm_pool = &port->priv->bm_pools[pool];
- /* In case of an error, release the requested buffer pointer
- * to the Buffer Manager. This request process is controlled
- * by the hardware, and the information about the buffer is
- * comprised by the RX descriptor.
- */
- if (rx_status & MVPP2_RXD_ERR_SUMMARY)
- goto err_drop_frame;
-
if (port->priv->percpu_pools) {
pp = port->priv->page_pool[pool];
dma_dir = page_pool_get_dma_dir(pp);
rx_bytes + MVPP2_MH_SIZE,
dma_dir);
+ /* Buffer header not supported */
+ if (rx_status & MVPP2_RXD_BUF_HDR)
+ goto err_drop_frame;
+
+ /* In case of an error, release the requested buffer pointer
+ * to the Buffer Manager. This request process is controlled
+ * by the hardware, and the information about the buffer is
+ * comprised by the RX descriptor.
+ */
+ if (rx_status & MVPP2_RXD_ERR_SUMMARY)
+ goto err_drop_frame;
+
/* Prefetch header */
prefetch(data);
dev->stats.rx_errors++;
mvpp2_rx_error(port, rx_desc);
/* Return the buffer to the pool */
- mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
+ if (rx_status & MVPP2_RXD_BUF_HDR)
+ mvpp2_buff_hdr_pool_put(port, rx_desc, pool, rx_status);
+ else
+ mvpp2_bm_pool_put(port, pool, dma_addr, phys_addr);
}
rcu_read_unlock();
if (hfunc != ETH_RSS_HASH_NO_CHANGE && hfunc != ETH_RSS_HASH_TOP)
return -EOPNOTSUPP;
+ if (*rss_context != ETH_RXFH_CONTEXT_ALLOC &&
+ *rss_context >= MAX_RSS_GROUPS)
+ return -EINVAL;
+
rss = &pfvf->hw.rss_info;
if (!rss->enable) {
void mtk_stats_update_mac(struct mtk_mac *mac)
{
struct mtk_hw_stats *hw_stats = mac->hw_stats;
- unsigned int base = MTK_GDM1_TX_GBCNT;
- u64 stats;
-
- base += hw_stats->reg_offset;
+ struct mtk_eth *eth = mac->hw;
u64_stats_update_begin(&hw_stats->syncp);
- hw_stats->rx_bytes += mtk_r32(mac->hw, base);
- stats = mtk_r32(mac->hw, base + 0x04);
- if (stats)
- hw_stats->rx_bytes += (stats << 32);
- hw_stats->rx_packets += mtk_r32(mac->hw, base + 0x08);
- hw_stats->rx_overflow += mtk_r32(mac->hw, base + 0x10);
- hw_stats->rx_fcs_errors += mtk_r32(mac->hw, base + 0x14);
- hw_stats->rx_short_errors += mtk_r32(mac->hw, base + 0x18);
- hw_stats->rx_long_errors += mtk_r32(mac->hw, base + 0x1c);
- hw_stats->rx_checksum_errors += mtk_r32(mac->hw, base + 0x20);
- hw_stats->rx_flow_control_packets +=
- mtk_r32(mac->hw, base + 0x24);
- hw_stats->tx_skip += mtk_r32(mac->hw, base + 0x28);
- hw_stats->tx_collisions += mtk_r32(mac->hw, base + 0x2c);
- hw_stats->tx_bytes += mtk_r32(mac->hw, base + 0x30);
- stats = mtk_r32(mac->hw, base + 0x34);
- if (stats)
- hw_stats->tx_bytes += (stats << 32);
- hw_stats->tx_packets += mtk_r32(mac->hw, base + 0x38);
+ if (MTK_HAS_CAPS(eth->soc->caps, MTK_SOC_MT7628)) {
+ hw_stats->tx_packets += mtk_r32(mac->hw, MT7628_SDM_TPCNT);
+ hw_stats->tx_bytes += mtk_r32(mac->hw, MT7628_SDM_TBCNT);
+ hw_stats->rx_packets += mtk_r32(mac->hw, MT7628_SDM_RPCNT);
+ hw_stats->rx_bytes += mtk_r32(mac->hw, MT7628_SDM_RBCNT);
+ hw_stats->rx_checksum_errors +=
+ mtk_r32(mac->hw, MT7628_SDM_CS_ERR);
+ } else {
+ unsigned int offs = hw_stats->reg_offset;
+ u64 stats;
+
+ hw_stats->rx_bytes += mtk_r32(mac->hw,
+ MTK_GDM1_RX_GBCNT_L + offs);
+ stats = mtk_r32(mac->hw, MTK_GDM1_RX_GBCNT_H + offs);
+ if (stats)
+ hw_stats->rx_bytes += (stats << 32);
+ hw_stats->rx_packets +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_GPCNT + offs);
+ hw_stats->rx_overflow +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_OERCNT + offs);
+ hw_stats->rx_fcs_errors +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_FERCNT + offs);
+ hw_stats->rx_short_errors +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_SERCNT + offs);
+ hw_stats->rx_long_errors +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_LENCNT + offs);
+ hw_stats->rx_checksum_errors +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_CERCNT + offs);
+ hw_stats->rx_flow_control_packets +=
+ mtk_r32(mac->hw, MTK_GDM1_RX_FCCNT + offs);
+ hw_stats->tx_skip +=
+ mtk_r32(mac->hw, MTK_GDM1_TX_SKIPCNT + offs);
+ hw_stats->tx_collisions +=
+ mtk_r32(mac->hw, MTK_GDM1_TX_COLCNT + offs);
+ hw_stats->tx_bytes +=
+ mtk_r32(mac->hw, MTK_GDM1_TX_GBCNT_L + offs);
+ stats = mtk_r32(mac->hw, MTK_GDM1_TX_GBCNT_H + offs);
+ if (stats)
+ hw_stats->tx_bytes += (stats << 32);
+ hw_stats->tx_packets +=
+ mtk_r32(mac->hw, MTK_GDM1_TX_GPCNT + offs);
+ }
+
u64_stats_update_end(&hw_stats->syncp);
}
val |= cur << MTK_PDMA_DELAY_RX_PINT_SHIFT;
mtk_w32(eth, val, MTK_PDMA_DELAY_INT);
- mtk_w32(eth, val, MTK_QDMA_DELAY_INT);
+ if (MTK_HAS_CAPS(eth->soc->caps, MTK_QDMA))
+ mtk_w32(eth, val, MTK_QDMA_DELAY_INT);
spin_unlock_bh(ð->dim_lock);
val |= cur << MTK_PDMA_DELAY_TX_PINT_SHIFT;
mtk_w32(eth, val, MTK_PDMA_DELAY_INT);
- mtk_w32(eth, val, MTK_QDMA_DELAY_INT);
+ if (MTK_HAS_CAPS(eth->soc->caps, MTK_QDMA))
+ mtk_w32(eth, val, MTK_QDMA_DELAY_INT);
spin_unlock_bh(ð->dim_lock);
goto err_disable_pm;
}
+ /* set interrupt delays based on current Net DIM sample */
+ mtk_dim_rx(ð->rx_dim.work);
+ mtk_dim_tx(ð->tx_dim.work);
+
/* disable delay and normal interrupt */
mtk_tx_irq_disable(eth, ~0);
mtk_rx_irq_disable(eth, ~0);
/* QDMA FQ Free Page Buffer Length Register */
#define MTK_QDMA_FQ_BLEN 0x1B2C
-/* GMA1 Received Good Byte Count Register */
-#define MTK_GDM1_TX_GBCNT 0x2400
+/* GMA1 counter / statics register */
+#define MTK_GDM1_RX_GBCNT_L 0x2400
+#define MTK_GDM1_RX_GBCNT_H 0x2404
+#define MTK_GDM1_RX_GPCNT 0x2408
+#define MTK_GDM1_RX_OERCNT 0x2410
+#define MTK_GDM1_RX_FERCNT 0x2414
+#define MTK_GDM1_RX_SERCNT 0x2418
+#define MTK_GDM1_RX_LENCNT 0x241c
+#define MTK_GDM1_RX_CERCNT 0x2420
+#define MTK_GDM1_RX_FCCNT 0x2424
+#define MTK_GDM1_TX_SKIPCNT 0x2428
+#define MTK_GDM1_TX_COLCNT 0x242c
+#define MTK_GDM1_TX_GBCNT_L 0x2430
+#define MTK_GDM1_TX_GBCNT_H 0x2434
+#define MTK_GDM1_TX_GPCNT 0x2438
#define MTK_STAT_OFFSET 0x40
/* QDMA descriptor txd4 */
#define MT7628_SDM_MAC_ADRL (MT7628_SDM_OFFSET + 0x0c)
#define MT7628_SDM_MAC_ADRH (MT7628_SDM_OFFSET + 0x10)
+/* Counter / stat register */
+#define MT7628_SDM_TPCNT (MT7628_SDM_OFFSET + 0x100)
+#define MT7628_SDM_TBCNT (MT7628_SDM_OFFSET + 0x104)
+#define MT7628_SDM_RPCNT (MT7628_SDM_OFFSET + 0x108)
+#define MT7628_SDM_RBCNT (MT7628_SDM_OFFSET + 0x10c)
+#define MT7628_SDM_CS_ERR (MT7628_SDM_OFFSET + 0x110)
+
struct mtk_rx_dma {
unsigned int rxd1;
unsigned int rxd2;
return ret;
}
-#define MLX4_EEPROM_PAGE_LEN 256
-
static int mlx4_en_get_module_info(struct net_device *dev,
struct ethtool_modinfo *modinfo)
{
break;
case MLX4_MODULE_ID_SFP:
modinfo->type = ETH_MODULE_SFF_8472;
- modinfo->eeprom_len = MLX4_EEPROM_PAGE_LEN;
+ modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
break;
default:
return -EINVAL;
#define I2C_ADDR_LOW 0x50
#define I2C_ADDR_HIGH 0x51
#define I2C_PAGE_SIZE 256
+#define I2C_HIGH_PAGE_SIZE 128
/* Module Info Data */
struct mlx4_cable_info {
return "Unknown Error";
}
+static int mlx4_get_module_id(struct mlx4_dev *dev, u8 port, u8 *module_id)
+{
+ struct mlx4_cmd_mailbox *inbox, *outbox;
+ struct mlx4_mad_ifc *inmad, *outmad;
+ struct mlx4_cable_info *cable_info;
+ int ret;
+
+ inbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(inbox))
+ return PTR_ERR(inbox);
+
+ outbox = mlx4_alloc_cmd_mailbox(dev);
+ if (IS_ERR(outbox)) {
+ mlx4_free_cmd_mailbox(dev, inbox);
+ return PTR_ERR(outbox);
+ }
+
+ inmad = (struct mlx4_mad_ifc *)(inbox->buf);
+ outmad = (struct mlx4_mad_ifc *)(outbox->buf);
+
+ inmad->method = 0x1; /* Get */
+ inmad->class_version = 0x1;
+ inmad->mgmt_class = 0x1;
+ inmad->base_version = 0x1;
+ inmad->attr_id = cpu_to_be16(0xFF60); /* Module Info */
+
+ cable_info = (struct mlx4_cable_info *)inmad->data;
+ cable_info->dev_mem_address = 0;
+ cable_info->page_num = 0;
+ cable_info->i2c_addr = I2C_ADDR_LOW;
+ cable_info->size = cpu_to_be16(1);
+
+ ret = mlx4_cmd_box(dev, inbox->dma, outbox->dma, port, 3,
+ MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C,
+ MLX4_CMD_NATIVE);
+ if (ret)
+ goto out;
+
+ if (be16_to_cpu(outmad->status)) {
+ /* Mad returned with bad status */
+ ret = be16_to_cpu(outmad->status);
+ mlx4_warn(dev,
+ "MLX4_CMD_MAD_IFC Get Module ID attr(%x) port(%d) i2c_addr(%x) offset(%d) size(%d): Response Mad Status(%x) - %s\n",
+ 0xFF60, port, I2C_ADDR_LOW, 0, 1, ret,
+ cable_info_mad_err_str(ret));
+ ret = -ret;
+ goto out;
+ }
+ cable_info = (struct mlx4_cable_info *)outmad->data;
+ *module_id = cable_info->data[0];
+out:
+ mlx4_free_cmd_mailbox(dev, inbox);
+ mlx4_free_cmd_mailbox(dev, outbox);
+ return ret;
+}
+
+static void mlx4_sfp_eeprom_params_set(u8 *i2c_addr, u8 *page_num, u16 *offset)
+{
+ *i2c_addr = I2C_ADDR_LOW;
+ *page_num = 0;
+
+ if (*offset < I2C_PAGE_SIZE)
+ return;
+
+ *i2c_addr = I2C_ADDR_HIGH;
+ *offset -= I2C_PAGE_SIZE;
+}
+
+static void mlx4_qsfp_eeprom_params_set(u8 *i2c_addr, u8 *page_num, u16 *offset)
+{
+ /* Offsets 0-255 belong to page 0.
+ * Offsets 256-639 belong to pages 01, 02, 03.
+ * For example, offset 400 is page 02: 1 + (400 - 256) / 128 = 2
+ */
+ if (*offset < I2C_PAGE_SIZE)
+ *page_num = 0;
+ else
+ *page_num = 1 + (*offset - I2C_PAGE_SIZE) / I2C_HIGH_PAGE_SIZE;
+ *i2c_addr = I2C_ADDR_LOW;
+ *offset -= *page_num * I2C_HIGH_PAGE_SIZE;
+}
+
/**
* mlx4_get_module_info - Read cable module eeprom data
* @dev: mlx4_dev.
struct mlx4_cmd_mailbox *inbox, *outbox;
struct mlx4_mad_ifc *inmad, *outmad;
struct mlx4_cable_info *cable_info;
- u16 i2c_addr;
+ u8 module_id, i2c_addr, page_num;
int ret;
if (size > MODULE_INFO_MAX_READ)
size = MODULE_INFO_MAX_READ;
+ ret = mlx4_get_module_id(dev, port, &module_id);
+ if (ret)
+ return ret;
+
+ switch (module_id) {
+ case MLX4_MODULE_ID_SFP:
+ mlx4_sfp_eeprom_params_set(&i2c_addr, &page_num, &offset);
+ break;
+ case MLX4_MODULE_ID_QSFP:
+ case MLX4_MODULE_ID_QSFP_PLUS:
+ case MLX4_MODULE_ID_QSFP28:
+ mlx4_qsfp_eeprom_params_set(&i2c_addr, &page_num, &offset);
+ break;
+ default:
+ mlx4_err(dev, "Module ID not recognized: %#x\n", module_id);
+ return -EINVAL;
+ }
+
inbox = mlx4_alloc_cmd_mailbox(dev);
if (IS_ERR(inbox))
return PTR_ERR(inbox);
*/
size -= offset + size - I2C_PAGE_SIZE;
- i2c_addr = I2C_ADDR_LOW;
-
cable_info = (struct mlx4_cable_info *)inmad->data;
cable_info->dev_mem_address = cpu_to_be16(offset);
- cable_info->page_num = 0;
+ cable_info->page_num = page_num;
cable_info->i2c_addr = i2c_addr;
cable_info->size = cpu_to_be16(size);
rpriv = priv->ppriv;
fwd_vport_num = rpriv->rep->vport;
lag_dev = netdev_master_upper_dev_get(netdev);
+ if (!lag_dev)
+ return;
netdev_dbg(netdev, "lag_dev(%s)'s slave vport(%d) is txable(%d)\n",
lag_dev->name, fwd_vport_num, net_lag_port_dev_txable(netdev));
struct mlx5_eswitch *esw;
u32 zone_restore_id;
- tc_skb_ext = skb_ext_add(skb, TC_SKB_EXT);
+ tc_skb_ext = tc_skb_ext_alloc(skb);
if (!tc_skb_ext) {
WARN_ON(1);
return false;
fen_info = container_of(info, struct fib_entry_notifier_info, info);
fib_dev = fib_info_nh(fen_info->fi, 0)->fib_nh_dev;
- if (fib_dev->netdev_ops != &mlx5e_netdev_ops ||
+ if (!fib_dev || fib_dev->netdev_ops != &mlx5e_netdev_ops ||
fen_info->dst_len != 32)
return NULL;
{
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5_core_dev *mdev = priv->mdev;
+ unsigned long fec_bitmap;
u16 fec_policy = 0;
int mode;
int err;
- if (bitmap_weight((unsigned long *)&fecparam->fec,
- ETHTOOL_FEC_LLRS_BIT + 1) > 1)
+ bitmap_from_arr32(&fec_bitmap, &fecparam->fec, sizeof(fecparam->fec) * BITS_PER_BYTE);
+ if (bitmap_weight(&fec_bitmap, ETHTOOL_FEC_LLRS_BIT + 1) > 1)
return -EOPNOTSUPP;
for (mode = 0; mode < ARRAY_SIZE(pplm_fec_2_ethtool); mode++) {
if (curr_val == new_val)
return 0;
+ if (new_val && !priv->profile->rx_ptp_support &&
+ priv->tstamp.rx_filter != HWTSTAMP_FILTER_NONE) {
+ netdev_err(priv->netdev,
+ "Profile doesn't support enabling of CQE compression while hardware time-stamping is enabled.\n");
+ return -EINVAL;
+ }
+
new_params = priv->channels.params;
MLX5E_SET_PFLAG(&new_params, MLX5E_PFLAG_RX_CQE_COMPRESS, new_val);
if (priv->tstamp.rx_filter != HWTSTAMP_FILTER_NONE)
#include <linux/ipv6.h>
#include <linux/tcp.h>
#include <linux/mlx5/fs.h>
+#include <linux/mlx5/mpfs.h>
#include "en.h"
#include "en_rep.h"
#include "lib/mpfs.h"
void mlx5e_activate_rq(struct mlx5e_rq *rq)
{
set_bit(MLX5E_RQ_STATE_ENABLED, &rq->state);
- if (rq->icosq)
+ if (rq->icosq) {
mlx5e_trigger_irq(rq->icosq);
- else
+ } else {
+ local_bh_disable();
napi_schedule(rq->cq.napi);
+ local_bh_enable();
+ }
}
void mlx5e_deactivate_rq(struct mlx5e_rq *rq)
int err;
old_num_txqs = netdev->real_num_tx_queues;
- old_ntc = netdev->num_tc;
+ old_ntc = netdev->num_tc ? : 1;
nch = priv->channels.params.num_channels;
ntc = priv->channels.params.num_tc;
netdev_warn(netdev, "Disabling rxhash, not supported when CQE compress is active\n");
}
+ if (mlx5e_is_uplink_rep(priv)) {
+ features &= ~NETIF_F_HW_TLS_RX;
+ if (netdev->features & NETIF_F_HW_TLS_RX)
+ netdev_warn(netdev, "Disabling hw_tls_rx, not supported in switchdev mode\n");
+
+ features &= ~NETIF_F_HW_TLS_TX;
+ if (netdev->features & NETIF_F_HW_TLS_TX)
+ netdev_warn(netdev, "Disabling hw_tls_tx, not supported in switchdev mode\n");
+ }
+
mutex_unlock(&priv->state_lock);
return features;
return mlx5e_ptp_rx_manage_fs(priv, set);
}
-int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr)
+static int mlx5e_hwstamp_config_no_ptp_rx(struct mlx5e_priv *priv, bool rx_filter)
+{
+ bool rx_cqe_compress_def = priv->channels.params.rx_cqe_compress_def;
+ int err;
+
+ if (!rx_filter)
+ /* Reset CQE compression to Admin default */
+ return mlx5e_modify_rx_cqe_compression_locked(priv, rx_cqe_compress_def);
+
+ if (!MLX5E_GET_PFLAG(&priv->channels.params, MLX5E_PFLAG_RX_CQE_COMPRESS))
+ return 0;
+
+ /* Disable CQE compression */
+ netdev_warn(priv->netdev, "Disabling RX cqe compression\n");
+ err = mlx5e_modify_rx_cqe_compression_locked(priv, false);
+ if (err)
+ netdev_err(priv->netdev, "Failed disabling cqe compression err=%d\n", err);
+
+ return err;
+}
+
+static int mlx5e_hwstamp_config_ptp_rx(struct mlx5e_priv *priv, bool ptp_rx)
{
struct mlx5e_params new_params;
+
+ if (ptp_rx == priv->channels.params.ptp_rx)
+ return 0;
+
+ new_params = priv->channels.params;
+ new_params.ptp_rx = ptp_rx;
+ return mlx5e_safe_switch_params(priv, &new_params, mlx5e_ptp_rx_manage_fs_ctx,
+ &new_params.ptp_rx, true);
+}
+
+int mlx5e_hwstamp_set(struct mlx5e_priv *priv, struct ifreq *ifr)
+{
struct hwtstamp_config config;
bool rx_cqe_compress_def;
+ bool ptp_rx;
int err;
if (!MLX5_CAP_GEN(priv->mdev, device_frequency_khz) ||
}
mutex_lock(&priv->state_lock);
- new_params = priv->channels.params;
rx_cqe_compress_def = priv->channels.params.rx_cqe_compress_def;
/* RX HW timestamp */
switch (config.rx_filter) {
case HWTSTAMP_FILTER_NONE:
- new_params.ptp_rx = false;
+ ptp_rx = false;
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
case HWTSTAMP_FILTER_NTP_ALL:
- new_params.ptp_rx = rx_cqe_compress_def;
config.rx_filter = HWTSTAMP_FILTER_ALL;
+ /* ptp_rx is set if both HW TS is set and CQE
+ * compression is set
+ */
+ ptp_rx = rx_cqe_compress_def;
break;
default:
- mutex_unlock(&priv->state_lock);
- return -ERANGE;
+ err = -ERANGE;
+ goto err_unlock;
}
- if (new_params.ptp_rx == priv->channels.params.ptp_rx)
- goto out;
+ if (!priv->profile->rx_ptp_support)
+ err = mlx5e_hwstamp_config_no_ptp_rx(priv,
+ config.rx_filter != HWTSTAMP_FILTER_NONE);
+ else
+ err = mlx5e_hwstamp_config_ptp_rx(priv, ptp_rx);
+ if (err)
+ goto err_unlock;
- err = mlx5e_safe_switch_params(priv, &new_params, mlx5e_ptp_rx_manage_fs_ctx,
- &new_params.ptp_rx, true);
- if (err) {
- mutex_unlock(&priv->state_lock);
- return err;
- }
-out:
memcpy(&priv->tstamp, &config, sizeof(config));
mutex_unlock(&priv->state_lock);
return copy_to_user(ifr->ifr_data, &config,
sizeof(config)) ? -EFAULT : 0;
+err_unlock:
+ mutex_unlock(&priv->state_lock);
+ return err;
}
int mlx5e_hwstamp_get(struct mlx5e_priv *priv, struct ifreq *ifr)
rtnl_unlock();
}
+static void mlx5e_reset_channels(struct net_device *netdev)
+{
+ netdev_reset_tc(netdev);
+}
+
int mlx5e_attach_netdev(struct mlx5e_priv *priv)
{
const bool take_rtnl = priv->netdev->reg_state == NETREG_REGISTERED;
profile->cleanup_tx(priv);
out:
+ mlx5e_reset_channels(priv->netdev);
set_bit(MLX5E_STATE_DESTROYING, &priv->state);
cancel_work_sync(&priv->update_stats_work);
return err;
profile->cleanup_rx(priv);
profile->cleanup_tx(priv);
+ mlx5e_reset_channels(priv->netdev);
cancel_work_sync(&priv->update_stats_work);
}
struct netlink_ext_ack *extack)
{
struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
- struct net_device *out_dev, *encap_dev = NULL;
struct mlx5e_tc_flow_parse_attr *parse_attr;
struct mlx5_flow_attr *attr = flow->attr;
bool vf_tun = false, encap_valid = true;
+ struct net_device *encap_dev = NULL;
struct mlx5_esw_flow_attr *esw_attr;
struct mlx5_fc *counter = NULL;
struct mlx5e_rep_priv *rpriv;
esw_attr = attr->esw_attr;
for (out_index = 0; out_index < MLX5_MAX_FLOW_FWD_VPORTS; out_index++) {
+ struct net_device *out_dev;
int mirred_ifindex;
if (!(esw_attr->dests[out_index].flags & MLX5_ESW_DEST_ENCAP))
continue;
mirred_ifindex = parse_attr->mirred_ifindex[out_index];
- out_dev = __dev_get_by_index(dev_net(priv->netdev),
- mirred_ifindex);
+ out_dev = dev_get_by_index(dev_net(priv->netdev), mirred_ifindex);
+ if (!out_dev) {
+ NL_SET_ERR_MSG_MOD(extack, "Requested mirred device not found");
+ err = -ENODEV;
+ goto err_out;
+ }
err = mlx5e_attach_encap(priv, flow, out_dev, out_index,
extack, &encap_dev, &encap_valid);
+ dev_put(out_dev);
if (err)
goto err_out;
esw_attr->dests[out_index].mdev = out_priv->mdev;
}
+ if (vf_tun && esw_attr->out_count > 1) {
+ NL_SET_ERR_MSG_MOD(extack, "VF tunnel encap with mirroring is not supported");
+ err = -EOPNOTSUPP;
+ goto err_out;
+ }
+
err = mlx5_eswitch_add_vlan_action(esw, attr);
if (err)
goto err_out;
misc_parameters_3);
struct flow_rule *rule = flow_cls_offload_flow_rule(f);
struct flow_dissector *dissector = rule->match.dissector;
+ enum fs_flow_table_type fs_type;
u16 addr_type = 0;
u8 ip_proto = 0;
u8 *match_level;
int err;
+ fs_type = mlx5e_is_eswitch_flow(flow) ? FS_FT_FDB : FS_FT_NIC_RX;
match_level = outer_match_level;
if (dissector->used_keys &
if (match.mask->vlan_id ||
match.mask->vlan_priority ||
match.mask->vlan_tpid) {
+ if (!MLX5_CAP_FLOWTABLE_TYPE(priv->mdev, ft_field_support.outer_second_vid,
+ fs_type)) {
+ NL_SET_ERR_MSG_MOD(extack,
+ "Matching on CVLAN is not supported");
+ return -EOPNOTSUPP;
+ }
+
if (match.key->vlan_tpid == htons(ETH_P_8021AD)) {
MLX5_SET(fte_match_set_misc, misc_c,
outer_second_svlan_tag, 1);
if (err)
return err;
- *out_dev = dev_get_by_index_rcu(dev_net(vlan_dev),
- dev_get_iflink(vlan_dev));
+ rcu_read_lock();
+ *out_dev = dev_get_by_index_rcu(dev_net(vlan_dev), dev_get_iflink(vlan_dev));
+ rcu_read_unlock();
+ if (!*out_dev)
+ return -ENODEV;
+
if (is_vlan_dev(*out_dev))
err = add_vlan_push_action(priv, attr, out_dev, action);
if (mapped_obj.type == MLX5_MAPPED_OBJ_CHAIN) {
chain = mapped_obj.chain;
- tc_skb_ext = skb_ext_add(skb, TC_SKB_EXT);
+ tc_skb_ext = tc_skb_ext_alloc(skb);
if (WARN_ON(!tc_skb_ext))
return false;
#include <linux/mlx5/mlx5_ifc.h>
#include <linux/mlx5/vport.h>
#include <linux/mlx5/fs.h>
+#include <linux/mlx5/mpfs.h>
#include "esw/acl/lgcy.h"
#include "esw/legacy.h"
#include "mlx5_core.h"
struct mlx5_fs_chains *chains,
int i)
{
- flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
+ if (mlx5_chains_ignore_flow_level_supported(chains))
+ flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
dest[i].type = MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE;
dest[i].ft = mlx5_chains_get_tc_end_ft(chains);
}
{
struct mlx5_flow_table_attr ft_attr = {};
struct mlx5_flow_namespace *root_ns;
- int err;
+ int err, err2;
root_ns = mlx5_get_flow_namespace(dev, MLX5_FLOW_NAMESPACE_FDB);
if (!root_ns) {
/* As this is the terminating action then the termination table is the
* same prio as the slow path
*/
- ft_attr.flags = MLX5_FLOW_TABLE_TERMINATION |
+ ft_attr.flags = MLX5_FLOW_TABLE_TERMINATION | MLX5_FLOW_TABLE_UNMANAGED |
MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT;
- ft_attr.prio = FDB_SLOW_PATH;
+ ft_attr.prio = FDB_TC_OFFLOAD;
ft_attr.max_fte = 1;
+ ft_attr.level = 1;
ft_attr.autogroup.max_num_groups = 1;
tt->termtbl = mlx5_create_auto_grouped_flow_table(root_ns, &ft_attr);
if (IS_ERR(tt->termtbl)) {
- esw_warn(dev, "Failed to create termination table (error %d)\n",
- IS_ERR(tt->termtbl));
- return -EOPNOTSUPP;
+ err = PTR_ERR(tt->termtbl);
+ esw_warn(dev, "Failed to create termination table, err %pe\n", tt->termtbl);
+ return err;
}
tt->rule = mlx5_add_flow_rules(tt->termtbl, NULL, flow_act,
&tt->dest, 1);
if (IS_ERR(tt->rule)) {
- esw_warn(dev, "Failed to create termination table rule (error %d)\n",
- IS_ERR(tt->rule));
+ err = PTR_ERR(tt->rule);
+ esw_warn(dev, "Failed to create termination table rule, err %pe\n", tt->rule);
goto add_flow_err;
}
return 0;
add_flow_err:
- err = mlx5_destroy_flow_table(tt->termtbl);
- if (err)
- esw_warn(dev, "Failed to destroy termination table\n");
+ err2 = mlx5_destroy_flow_table(tt->termtbl);
+ if (err2)
+ esw_warn(dev, "Failed to destroy termination table, err %d\n", err2);
- return -EOPNOTSUPP;
+ return err;
}
static struct mlx5_termtbl_handle *
}
}
-static bool mlx5_eswitch_termtbl_is_encap_reformat(struct mlx5_pkt_reformat *rt)
-{
- switch (rt->reformat_type) {
- case MLX5_REFORMAT_TYPE_L2_TO_VXLAN:
- case MLX5_REFORMAT_TYPE_L2_TO_NVGRE:
- case MLX5_REFORMAT_TYPE_L2_TO_L2_TUNNEL:
- case MLX5_REFORMAT_TYPE_L2_TO_L3_TUNNEL:
- return true;
- default:
- return false;
- }
-}
-
static void
mlx5_eswitch_termtbl_actions_move(struct mlx5_flow_act *src,
struct mlx5_flow_act *dst)
memset(&src->vlan[1], 0, sizeof(src->vlan[1]));
}
}
-
- if (src->action & MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT &&
- mlx5_eswitch_termtbl_is_encap_reformat(src->pkt_reformat)) {
- src->action &= ~MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
- dst->action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
- dst->pkt_reformat = src->pkt_reformat;
- src->pkt_reformat = NULL;
- }
}
static bool mlx5_eswitch_offload_is_uplink_port(const struct mlx5_eswitch *esw,
int i;
if (!MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, termination_table) ||
+ !MLX5_CAP_ESW_FLOWTABLE_FDB(esw->dev, ignore_flow_level) ||
attr->flags & MLX5_ESW_ATTR_FLAG_SLOW_PATH ||
!mlx5_eswitch_offload_is_uplink_port(esw, spec))
return false;
if (dest[i].type != MLX5_FLOW_DESTINATION_TYPE_VPORT)
continue;
+ if (attr->dests[num_vport_dests].flags & MLX5_ESW_DEST_ENCAP) {
+ term_tbl_act.action |= MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+ term_tbl_act.pkt_reformat = attr->dests[num_vport_dests].pkt_reformat;
+ } else {
+ term_tbl_act.action &= ~MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+ term_tbl_act.pkt_reformat = NULL;
+ }
+
/* get the terminating table for the action list */
tt = mlx5_eswitch_termtbl_get_create(esw, &term_tbl_act,
&dest[i], attr);
if (IS_ERR(tt)) {
- esw_warn(esw->dev, "Failed to get termination table (error %d)\n",
- IS_ERR(tt));
+ esw_warn(esw->dev, "Failed to get termination table, err %pe\n", tt);
goto revert_changes;
}
attr->dests[num_vport_dests].termtbl = tt;
goto revert_changes;
/* create the FTE */
+ flow_act->action &= ~MLX5_FLOW_CONTEXT_ACTION_PACKET_REFORMAT;
+ flow_act->pkt_reformat = NULL;
+ flow_act->flags |= FLOW_ACT_IGNORE_FLOW_LEVEL;
rule = mlx5_add_flow_rules(fdb, spec, flow_act, dest, num_dest);
if (IS_ERR(rule))
goto revert_changes;
reset_abort_work);
struct mlx5_core_dev *dev = fw_reset->dev;
+ if (!test_bit(MLX5_FW_RESET_FLAGS_RESET_REQUESTED, &fw_reset->reset_flags))
+ return;
+
mlx5_sync_reset_clear_reset_requested(dev, true);
mlx5_core_warn(dev, "PCI Sync FW Update Reset Aborted.\n");
}
struct lag_mp *mp = &ldev->lag_mp;
int err;
+ /* always clear mfi, as it might become stale when a route delete event
+ * has been missed
+ */
+ mp->mfi = NULL;
+
if (mp->fib_nb.notifier_call)
return 0;
unregister_fib_notifier(&init_net, &mp->fib_nb);
destroy_workqueue(mp->wq);
mp->fib_nb.notifier_call = NULL;
+ mp->mfi = NULL;
}
return chains->flags & MLX5_CHAINS_AND_PRIOS_SUPPORTED;
}
-static bool mlx5_chains_ignore_flow_level_supported(struct mlx5_fs_chains *chains)
+bool mlx5_chains_ignore_flow_level_supported(struct mlx5_fs_chains *chains)
{
return chains->flags & MLX5_CHAINS_IGNORE_FLOW_LEVEL_SUPPORTED;
}
bool
mlx5_chains_prios_supported(struct mlx5_fs_chains *chains);
+bool mlx5_chains_ignore_flow_level_supported(struct mlx5_fs_chains *chains);
bool
mlx5_chains_backwards_supported(struct mlx5_fs_chains *chains);
u32
#else /* CONFIG_MLX5_CLS_ACT */
+static inline bool
+mlx5_chains_ignore_flow_level_supported(struct mlx5_fs_chains *chains)
+{ return false; }
+
static inline struct mlx5_flow_table *
mlx5_chains_get_table(struct mlx5_fs_chains *chains, u32 chain, u32 prio,
u32 level) { return ERR_PTR(-EOPNOTSUPP); }
#include <linux/etherdevice.h>
#include <linux/mlx5/driver.h>
#include <linux/mlx5/mlx5_ifc.h>
+#include <linux/mlx5/mpfs.h>
#include <linux/mlx5/eswitch.h>
#include "mlx5_core.h"
#include "lib/mpfs.h"
mutex_unlock(&mpfs->lock);
return err;
}
+EXPORT_SYMBOL(mlx5_mpfs_add_mac);
int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac)
{
mutex_unlock(&mpfs->lock);
return err;
}
+EXPORT_SYMBOL(mlx5_mpfs_del_mac);
#ifdef CONFIG_MLX5_MPFS
int mlx5_mpfs_init(struct mlx5_core_dev *dev);
void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev);
-int mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac);
-int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac);
#else /* #ifndef CONFIG_MLX5_MPFS */
static inline int mlx5_mpfs_init(struct mlx5_core_dev *dev) { return 0; }
static inline void mlx5_mpfs_cleanup(struct mlx5_core_dev *dev) {}
-static inline int mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
-static inline int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
#endif
+
#endif
static int handle_hca_cap(struct mlx5_core_dev *dev, void *set_ctx)
{
- struct mlx5_profile *prof = dev->profile;
+ struct mlx5_profile *prof = &dev->profile;
void *set_hca_cap;
int err;
to_fw_pkey_sz(dev, 128));
/* Check log_max_qp from HCA caps to set in current profile */
- if (MLX5_CAP_GEN_MAX(dev, log_max_qp) < profile[prof_sel].log_max_qp) {
+ if (MLX5_CAP_GEN_MAX(dev, log_max_qp) < prof->log_max_qp) {
mlx5_core_warn(dev, "log_max_qp value in current profile is %d, changing it to HCA capability limit (%d)\n",
- profile[prof_sel].log_max_qp,
+ prof->log_max_qp,
MLX5_CAP_GEN_MAX(dev, log_max_qp));
- profile[prof_sel].log_max_qp = MLX5_CAP_GEN_MAX(dev, log_max_qp);
+ prof->log_max_qp = MLX5_CAP_GEN_MAX(dev, log_max_qp);
}
if (prof->mask & MLX5_PROF_MASK_QP_SIZE)
MLX5_SET(cmd_hca_cap, set_hca_cap, log_max_qp,
struct mlx5_priv *priv = &dev->priv;
int err;
- dev->profile = &profile[profile_idx];
-
+ memcpy(&dev->profile, &profile[profile_idx], sizeof(dev->profile));
INIT_LIST_HEAD(&priv->ctx_list);
spin_lock_init(&priv->ctx_lock);
mutex_init(&dev->intf_state_mutex);
int mlx5_set_msix_vec_count(struct mlx5_core_dev *dev, int function_id,
int msix_vec_count)
{
- int sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
+ int query_sz = MLX5_ST_SZ_BYTES(query_hca_cap_out);
+ int set_sz = MLX5_ST_SZ_BYTES(set_hca_cap_in);
+ void *hca_cap = NULL, *query_cap = NULL, *cap;
int num_vf_msix, min_msix, max_msix;
- void *hca_cap, *cap;
int ret;
num_vf_msix = MLX5_CAP_GEN_MAX(dev, num_total_dynamic_vf_msix);
if (msix_vec_count > max_msix)
return -EOVERFLOW;
- hca_cap = kzalloc(sz, GFP_KERNEL);
- if (!hca_cap)
- return -ENOMEM;
+ query_cap = kzalloc(query_sz, GFP_KERNEL);
+ hca_cap = kzalloc(set_sz, GFP_KERNEL);
+ if (!hca_cap || !query_cap) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ ret = mlx5_vport_get_other_func_cap(dev, function_id, query_cap);
+ if (ret)
+ goto out;
cap = MLX5_ADDR_OF(set_hca_cap_in, hca_cap, capability);
+ memcpy(cap, MLX5_ADDR_OF(query_hca_cap_out, query_cap, capability),
+ MLX5_UN_SZ_BYTES(hca_cap_union));
MLX5_SET(cmd_hca_cap, cap, dynamic_msix_table_size, msix_vec_count);
MLX5_SET(set_hca_cap_in, hca_cap, opcode, MLX5_CMD_OP_SET_HCA_CAP);
MLX5_SET(set_hca_cap_in, hca_cap, op_mod,
MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1);
ret = mlx5_cmd_exec_in(dev, set_hca_cap, hca_cap);
+out:
kfree(hca_cap);
+ kfree(query_cap);
return ret;
}
switch (hw_state) {
case MLX5_VHCA_STATE_ACTIVE:
case MLX5_VHCA_STATE_IN_USE:
- case MLX5_VHCA_STATE_TEARDOWN_REQUEST:
return DEVLINK_PORT_FN_STATE_ACTIVE;
case MLX5_VHCA_STATE_INVALID:
case MLX5_VHCA_STATE_ALLOCATED:
+ case MLX5_VHCA_STATE_TEARDOWN_REQUEST:
default:
return DEVLINK_PORT_FN_STATE_INACTIVE;
}
return err;
}
-static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf)
+static int mlx5_sf_activate(struct mlx5_core_dev *dev, struct mlx5_sf *sf,
+ struct netlink_ext_ack *extack)
{
int err;
if (mlx5_sf_is_active(sf))
return 0;
- if (sf->hw_state != MLX5_VHCA_STATE_ALLOCATED)
- return -EINVAL;
+ if (sf->hw_state != MLX5_VHCA_STATE_ALLOCATED) {
+ NL_SET_ERR_MSG_MOD(extack, "SF is inactivated but it is still attached");
+ return -EBUSY;
+ }
err = mlx5_cmd_sf_enable_hca(dev, sf->hw_fn_id);
if (err)
static int mlx5_sf_state_set(struct mlx5_core_dev *dev, struct mlx5_sf_table *table,
struct mlx5_sf *sf,
- enum devlink_port_fn_state state)
+ enum devlink_port_fn_state state,
+ struct netlink_ext_ack *extack)
{
int err = 0;
if (state == mlx5_sf_to_devlink_state(sf->hw_state))
goto out;
if (state == DEVLINK_PORT_FN_STATE_ACTIVE)
- err = mlx5_sf_activate(dev, sf);
+ err = mlx5_sf_activate(dev, sf, extack);
else if (state == DEVLINK_PORT_FN_STATE_INACTIVE)
err = mlx5_sf_deactivate(dev, sf);
else
goto out;
}
- err = mlx5_sf_state_set(dev, table, sf, state);
+ err = mlx5_sf_state_set(dev, table, sf, state, extack);
out:
mlx5_sf_table_put(table);
return err;
int ret;
ft_attr.table_type = MLX5_FLOW_TABLE_TYPE_FDB;
- ft_attr.level = dmn->info.caps.max_ft_level - 2;
+ ft_attr.level = min_t(int, dmn->info.caps.max_ft_level - 2,
+ MLX5_FT_MAX_MULTIPATH_LEVEL);
ft_attr.reformat_en = reformat_req;
ft_attr.decap_en = reformat_req;
// SPDX-License-Identifier: GPL-2.0-or-later
-/**
+/*
* Microchip ENCX24J600 ethernet driver
*
* Copyright (C) 2015 Gridpoint
/* SPDX-License-Identifier: GPL-2.0 */
-/**
+/*
* encx24j600_hw.h: Register definitions
*
*/
dev_err(&pdev->dev,
"invalid sram_size %dB or board span %ldB\n",
mgp->sram_size, mgp->board_span);
+ status = -EINVAL;
goto abort_with_ioremap;
}
memcpy_fromio(mgp->eeprom_strings,
config IONIC
tristate "Pensando Ethernet IONIC Support"
depends on 64BIT && PCI
+ depends on PTP_1588_CLOCK || !PTP_1588_CLOCK
select NET_DEVLINK
select DIMLIB
help
value = readl(&port_regs->CommonRegs.semaphoreReg);
if ((value & (sem_mask >> 16)) == sem_bits)
return 0;
- ssleep(1);
+ mdelay(1000);
} while (--seconds);
return -1;
}
efx->pci_dev->irq);
goto fail1;
}
+ efx->irqs_hooked = true;
return 0;
}
*/
static int stmmac_init_phy(struct net_device *dev)
{
- struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
struct stmmac_priv *priv = netdev_priv(dev);
struct device_node *node;
int ret;
ret = phylink_connect_phy(priv->phylink, phydev);
}
- phylink_ethtool_get_wol(priv->phylink, &wol);
- device_set_wakeup_capable(priv->device, !!wol.supported);
+ if (!priv->plat->pmt) {
+ struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
+
+ phylink_ethtool_get_wol(priv->phylink, &wol);
+ device_set_wakeup_capable(priv->device, !!wol.supported);
+ }
return ret;
}
priv->phylink_config.dev = &priv->dev->dev;
priv->phylink_config.type = PHYLINK_NETDEV;
priv->phylink_config.pcs_poll = true;
- priv->phylink_config.ovr_an_inband =
- priv->plat->mdio_bus_data->xpcs_an_inband;
+ if (priv->plat->mdio_bus_data)
+ priv->phylink_config.ovr_an_inband =
+ priv->plat->mdio_bus_data->xpcs_an_inband;
if (!fwnode)
fwnode = dev_fwnode(priv->device);
struct stmmac_priv *priv = netdev_priv(ndev);
int ret = 0;
+ ret = pm_runtime_get_sync(priv->device);
+ if (ret < 0) {
+ pm_runtime_put_noidle(priv->device);
+ return ret;
+ }
+
ret = eth_mac_addr(ndev, addr);
if (ret)
- return ret;
+ goto set_mac_error;
stmmac_set_umac_addr(priv, priv->hw, ndev->dev_addr, 0);
+set_mac_error:
+ pm_runtime_put(priv->device);
+
return ret;
}
bool is_double = false;
int ret;
- ret = pm_runtime_get_sync(priv->device);
- if (ret < 0) {
- pm_runtime_put_noidle(priv->device);
- return ret;
- }
-
if (be16_to_cpu(proto) == ETH_P_8021AD)
is_double = true;
bool is_double = false;
int ret;
+ ret = pm_runtime_get_sync(priv->device);
+ if (ret < 0) {
+ pm_runtime_put_noidle(priv->device);
+ return ret;
+ }
+
if (be16_to_cpu(proto) == ETH_P_8021AD)
is_double = true;
stmmac_napi_del(ndev);
error_hw_init:
destroy_workqueue(priv->wq);
- stmmac_bus_clks_config(priv, false);
bitmap_free(priv->af_xdp_zc_qps);
return ret;
tx_pipe->dma_queue = knav_queue_open(name, tx_pipe->dma_queue_id,
KNAV_QUEUE_SHARED);
if (IS_ERR(tx_pipe->dma_queue)) {
- dev_err(dev, "Could not open DMA queue for channel \"%s\": %d\n",
- name, ret);
+ dev_err(dev, "Could not open DMA queue for channel \"%s\": %pe\n",
+ name, tx_pipe->dma_queue);
ret = PTR_ERR(tx_pipe->dma_queue);
goto err;
}
#include <linux/spi/spi.h>
#include <linux/interrupt.h>
+#include <linux/mod_devicetable.h>
#include <linux/module.h>
-#include <linux/of.h>
#include <linux/regmap.h>
#include <linux/ieee802154.h>
#include <linux/irq.h>
static struct spi_driver mrf24j40_driver = {
.driver = {
- .of_match_table = of_match_ptr(mrf24j40_of_match),
+ .of_match_table = mrf24j40_of_match,
.name = "mrf24j40",
},
.id_table = mrf24j40_ids,
* @mem_virt: Virtual address of IPA-local memory space
* @mem_offset: Offset from @mem_virt used for access to IPA memory
* @mem_size: Total size (bytes) of memory at @mem_virt
+ * @mem_count: Number of entries in the mem array
* @mem: Array of IPA-local memory region descriptors
* @imem_iova: I/O virtual address of IPA region in IMEM
* @imem_size: Size of IMEM region
void *mem_virt;
u32 mem_offset;
u32 mem_size;
+ u32 mem_count;
const struct ipa_mem *mem;
unsigned long imem_iova;
* for the region, write "canary" values in the space prior to
* the region's base address.
*/
- for (mem_id = 0; mem_id < IPA_MEM_COUNT; mem_id++) {
+ for (mem_id = 0; mem_id < ipa->mem_count; mem_id++) {
const struct ipa_mem *mem = &ipa->mem[mem_id];
u16 canary_count;
__le32 *canary;
ipa->mem_size = resource_size(res);
/* The ipa->mem[] array is indexed by enum ipa_mem_id values */
+ ipa->mem_count = mem_data->local_count;
ipa->mem = mem_data->local;
ret = ipa_imem_init(ipa, mem_data->imem_addr, mem_data->imem_size);
return 0;
fail_register:
- mdiobus_free(bus->mii_bus);
smi_en.u64 = 0;
oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
return err;
bus = platform_get_drvdata(pdev);
mdiobus_unregister(bus->mii_bus);
- mdiobus_free(bus->mii_bus);
smi_en.u64 = 0;
oct_mdio_writeq(smi_en.u64, bus->register_base + SMI_EN);
return 0;
continue;
mdiobus_unregister(bus->mii_bus);
- mdiobus_free(bus->mii_bus);
oct_mdio_writeq(0, bus->register_base + SMI_EN);
}
pci_release_regions(pdev);
struct mdio_device *mdiodev;
int i;
- BUG_ON(bus->state != MDIOBUS_REGISTERED);
+ if (WARN_ON_ONCE(bus->state != MDIOBUS_REGISTERED))
+ return;
bus->state = MDIOBUS_UNREGISTERED;
for (i = 0; i < PHY_MAX_ADDR; i++) {
* for transport over USB using a simpler USB device model than the
* previous CDC "Ethernet Control Model" (ECM, or "CDC Ethernet").
*
- * For details, see www.usb.org/developers/devclass_docs/CDC_EEM10.pdf
+ * For details, see https://usb.org/sites/default/files/CDC_EEM10.pdf
*
* This version has been tested with GIGAntIC WuaoW SIM Smart Card on 2.6.24,
* 2.6.27 and 2.6.30rc2 kernel.
spin_unlock_irqrestore(&serial->serial_lock, flags);
return usb_control_msg(serial->parent->usb,
- usb_rcvctrlpipe(serial->parent->usb, 0), 0x22,
+ usb_sndctrlpipe(serial->parent->usb, 0), 0x22,
0x21, val, if_num, NULL, 0,
USB_CTRL_SET_TIMEOUT);
}
if (hso_dev->usb_gone)
rv = 0;
else
- rv = usb_control_msg(hso_dev->usb, usb_rcvctrlpipe(hso_dev->usb, 0),
+ rv = usb_control_msg(hso_dev->usb, usb_sndctrlpipe(hso_dev->usb, 0),
enabled ? 0x82 : 0x81, 0x40, 0, 0, NULL, 0,
USB_CTRL_SET_TIMEOUT);
mutex_unlock(&hso_dev->mutex);
num_urbs = 2;
serial->tiocmget = kzalloc(sizeof(struct hso_tiocmget),
GFP_KERNEL);
+ if (!serial->tiocmget)
+ goto exit;
serial->tiocmget->serial_state_notification
= kzalloc(sizeof(struct hso_serial_state_notification),
GFP_KERNEL);
- /* it isn't going to break our heart if serial->tiocmget
- * allocation fails don't bother checking this.
- */
- if (serial->tiocmget && serial->tiocmget->serial_state_notification) {
- tiocmget = serial->tiocmget;
- tiocmget->endp = hso_get_ep(interface,
- USB_ENDPOINT_XFER_INT,
- USB_DIR_IN);
- if (!tiocmget->endp) {
- dev_err(&interface->dev, "Failed to find INT IN ep\n");
- goto exit;
- }
-
- tiocmget->urb = usb_alloc_urb(0, GFP_KERNEL);
- if (tiocmget->urb) {
- mutex_init(&tiocmget->mutex);
- init_waitqueue_head(&tiocmget->waitq);
- } else
- hso_free_tiomget(serial);
+ if (!serial->tiocmget->serial_state_notification)
+ goto exit;
+ tiocmget = serial->tiocmget;
+ tiocmget->endp = hso_get_ep(interface,
+ USB_ENDPOINT_XFER_INT,
+ USB_DIR_IN);
+ if (!tiocmget->endp) {
+ dev_err(&interface->dev, "Failed to find INT IN ep\n");
+ goto exit;
}
- }
- else
+
+ tiocmget->urb = usb_alloc_urb(0, GFP_KERNEL);
+ if (!tiocmget->urb)
+ goto exit;
+
+ mutex_init(&tiocmget->mutex);
+ init_waitqueue_head(&tiocmget->waitq);
+ } else {
num_urbs = 1;
+ }
if (hso_serial_common_create(serial, num_urbs, BULK_URB_RX_SIZE,
BULK_URB_TX_SIZE))
.get_strings = lan78xx_get_strings,
.get_wol = lan78xx_get_wol,
.set_wol = lan78xx_set_wol,
+ .get_ts_info = ethtool_op_get_ts_info,
.get_eee = lan78xx_get_eee,
.set_eee = lan78xx_set_eee,
.get_pauseparam = lan78xx_get_pause,
tp->coalesce = 15000; /* 15 us */
}
+static bool rtl_check_vendor_ok(struct usb_interface *intf)
+{
+ struct usb_host_interface *alt = intf->cur_altsetting;
+ struct usb_endpoint_descriptor *in, *out, *intr;
+
+ if (usb_find_common_endpoints(alt, &in, &out, &intr, NULL) < 0) {
+ dev_err(&intf->dev, "Expected endpoints are not found\n");
+ return false;
+ }
+
+ /* Check Rx endpoint address */
+ if (usb_endpoint_num(in) != 1) {
+ dev_err(&intf->dev, "Invalid Rx endpoint address\n");
+ return false;
+ }
+
+ /* Check Tx endpoint address */
+ if (usb_endpoint_num(out) != 2) {
+ dev_err(&intf->dev, "Invalid Tx endpoint address\n");
+ return false;
+ }
+
+ /* Check interrupt endpoint address */
+ if (usb_endpoint_num(intr) != 3) {
+ dev_err(&intf->dev, "Invalid interrupt endpoint address\n");
+ return false;
+ }
+
+ return true;
+}
+
static bool rtl_vendor_mode(struct usb_interface *intf)
{
struct usb_host_interface *alt = intf->cur_altsetting;
int i, num_configs;
if (alt->desc.bInterfaceClass == USB_CLASS_VENDOR_SPEC)
- return true;
+ return rtl_check_vendor_ok(intf);
/* The vendor mode is not always config #1, so to find it out. */
udev = interface_to_usbdev(intf);
c = udev->config;
num_configs = udev->descriptor.bNumConfigurations;
+ if (num_configs < 2)
+ return false;
+
for (i = 0; i < num_configs; (i++, c++)) {
struct usb_interface_descriptor *desc = NULL;
}
}
- WARN_ON_ONCE(i == num_configs);
+ if (i == num_configs)
+ dev_err(&intf->dev, "Unexpected Device\n");
return false;
}
if (!rtl_vendor_mode(intf))
return -ENODEV;
- if (intf->cur_altsetting->desc.bNumEndpoints < 3)
- return -ENODEV;
-
usb_reset_device(udev);
netdev = alloc_etherdev(sizeof(struct r8152));
if (!netdev) {
ret = smsc75xx_wait_ready(dev, 0);
if (ret < 0) {
netdev_warn(dev->net, "device not ready in smsc75xx_bind\n");
- return ret;
+ goto err;
}
smsc75xx_init_mac_address(dev);
ret = smsc75xx_reset(dev);
if (ret < 0) {
netdev_warn(dev->net, "smsc75xx_reset error %d\n", ret);
- return ret;
+ goto err;
}
dev->net->netdev_ops = &smsc75xx_netdev_ops;
dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;
dev->net->max_mtu = MAX_SINGLE_PACKET_SIZE;
return 0;
+
+err:
+ kfree(pdata);
+ return ret;
}
static void smsc75xx_unbind(struct usbnet *dev, struct usb_interface *intf)
/* If headroom is not 0, there is an offset between the beginning of the
* data and the allocated space, otherwise the data and the allocated
* space are aligned.
+ *
+ * Buffers with headroom use PAGE_SIZE as alloc size, see
+ * add_recvbuf_mergeable() + get_mergeable_buf_len()
*/
- if (headroom) {
- /* Buffers with headroom use PAGE_SIZE as alloc size,
- * see add_recvbuf_mergeable() + get_mergeable_buf_len()
- */
- truesize = PAGE_SIZE;
- tailroom = truesize - len - offset;
- buf = page_address(page);
- } else {
- tailroom = truesize - len;
- buf = p;
- }
+ truesize = headroom ? PAGE_SIZE : truesize;
+ tailroom = truesize - len - headroom - (hdr_padded_len - hdr_len);
+ buf = p - headroom;
len -= hdr_len;
offset += hdr_padded_len;
put_page(page);
head_skb = page_to_skb(vi, rq, xdp_page, offset,
len, PAGE_SIZE, false,
- metasize, headroom);
+ metasize,
+ VIRTIO_XDP_HEADROOM);
return head_skb;
}
break;
-ccflags-y := -O3
-ccflags-y += -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt'
+ccflags-y := -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt'
ccflags-$(CONFIG_WIREGUARD_DEBUG) += -DDEBUG
wireguard-y := main.o
wireguard-y += noise.o
#include "allowedips.h"
#include "peer.h"
+static struct kmem_cache *node_cache;
+
static void swap_endian(u8 *dst, const u8 *src, u8 bits)
{
if (bits == 32) {
node->bitlen = bits;
memcpy(node->bits, src, bits / 8U);
}
-#define CHOOSE_NODE(parent, key) \
- parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1]
+
+static inline u8 choose(struct allowedips_node *node, const u8 *key)
+{
+ return (key[node->bit_at_a] >> node->bit_at_b) & 1;
+}
static void push_rcu(struct allowedips_node **stack,
struct allowedips_node __rcu *p, unsigned int *len)
}
}
+static void node_free_rcu(struct rcu_head *rcu)
+{
+ kmem_cache_free(node_cache, container_of(rcu, struct allowedips_node, rcu));
+}
+
static void root_free_rcu(struct rcu_head *rcu)
{
struct allowedips_node *node, *stack[128] = {
while (len > 0 && (node = stack[--len])) {
push_rcu(stack, node->bit[0], &len);
push_rcu(stack, node->bit[1], &len);
- kfree(node);
+ kmem_cache_free(node_cache, node);
}
}
}
}
-static void walk_remove_by_peer(struct allowedips_node __rcu **top,
- struct wg_peer *peer, struct mutex *lock)
-{
-#define REF(p) rcu_access_pointer(p)
-#define DEREF(p) rcu_dereference_protected(*(p), lockdep_is_held(lock))
-#define PUSH(p) ({ \
- WARN_ON(IS_ENABLED(DEBUG) && len >= 128); \
- stack[len++] = p; \
- })
-
- struct allowedips_node __rcu **stack[128], **nptr;
- struct allowedips_node *node, *prev;
- unsigned int len;
-
- if (unlikely(!peer || !REF(*top)))
- return;
-
- for (prev = NULL, len = 0, PUSH(top); len > 0; prev = node) {
- nptr = stack[len - 1];
- node = DEREF(nptr);
- if (!node) {
- --len;
- continue;
- }
- if (!prev || REF(prev->bit[0]) == node ||
- REF(prev->bit[1]) == node) {
- if (REF(node->bit[0]))
- PUSH(&node->bit[0]);
- else if (REF(node->bit[1]))
- PUSH(&node->bit[1]);
- } else if (REF(node->bit[0]) == prev) {
- if (REF(node->bit[1]))
- PUSH(&node->bit[1]);
- } else {
- if (rcu_dereference_protected(node->peer,
- lockdep_is_held(lock)) == peer) {
- RCU_INIT_POINTER(node->peer, NULL);
- list_del_init(&node->peer_list);
- if (!node->bit[0] || !node->bit[1]) {
- rcu_assign_pointer(*nptr, DEREF(
- &node->bit[!REF(node->bit[0])]));
- kfree_rcu(node, rcu);
- node = DEREF(nptr);
- }
- }
- --len;
- }
- }
-
-#undef REF
-#undef DEREF
-#undef PUSH
-}
-
static unsigned int fls128(u64 a, u64 b)
{
return a ? fls64(a) + 64U : fls64(b);
found = node;
if (node->cidr == bits)
break;
- node = rcu_dereference_bh(CHOOSE_NODE(node, key));
+ node = rcu_dereference_bh(node->bit[choose(node, key)]);
}
return found;
}
u8 cidr, u8 bits, struct allowedips_node **rnode,
struct mutex *lock)
{
- struct allowedips_node *node = rcu_dereference_protected(trie,
- lockdep_is_held(lock));
+ struct allowedips_node *node = rcu_dereference_protected(trie, lockdep_is_held(lock));
struct allowedips_node *parent = NULL;
bool exact = false;
exact = true;
break;
}
- node = rcu_dereference_protected(CHOOSE_NODE(parent, key),
- lockdep_is_held(lock));
+ node = rcu_dereference_protected(parent->bit[choose(parent, key)], lockdep_is_held(lock));
}
*rnode = parent;
return exact;
}
+static inline void connect_node(struct allowedips_node **parent, u8 bit, struct allowedips_node *node)
+{
+ node->parent_bit_packed = (unsigned long)parent | bit;
+ rcu_assign_pointer(*parent, node);
+}
+
+static inline void choose_and_connect_node(struct allowedips_node *parent, struct allowedips_node *node)
+{
+ u8 bit = choose(parent, node->bits);
+ connect_node(&parent->bit[bit], bit, node);
+}
+
static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key,
u8 cidr, struct wg_peer *peer, struct mutex *lock)
{
return -EINVAL;
if (!rcu_access_pointer(*trie)) {
- node = kzalloc(sizeof(*node), GFP_KERNEL);
+ node = kmem_cache_zalloc(node_cache, GFP_KERNEL);
if (unlikely(!node))
return -ENOMEM;
RCU_INIT_POINTER(node->peer, peer);
list_add_tail(&node->peer_list, &peer->allowedips_list);
copy_and_assign_cidr(node, key, cidr, bits);
- rcu_assign_pointer(*trie, node);
+ connect_node(trie, 2, node);
return 0;
}
if (node_placement(*trie, key, cidr, bits, &node, lock)) {
return 0;
}
- newnode = kzalloc(sizeof(*newnode), GFP_KERNEL);
+ newnode = kmem_cache_zalloc(node_cache, GFP_KERNEL);
if (unlikely(!newnode))
return -ENOMEM;
RCU_INIT_POINTER(newnode->peer, peer);
if (!node) {
down = rcu_dereference_protected(*trie, lockdep_is_held(lock));
} else {
- down = rcu_dereference_protected(CHOOSE_NODE(node, key),
- lockdep_is_held(lock));
+ const u8 bit = choose(node, key);
+ down = rcu_dereference_protected(node->bit[bit], lockdep_is_held(lock));
if (!down) {
- rcu_assign_pointer(CHOOSE_NODE(node, key), newnode);
+ connect_node(&node->bit[bit], bit, newnode);
return 0;
}
}
parent = node;
if (newnode->cidr == cidr) {
- rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down);
+ choose_and_connect_node(newnode, down);
if (!parent)
- rcu_assign_pointer(*trie, newnode);
+ connect_node(trie, 2, newnode);
else
- rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits),
- newnode);
- } else {
- node = kzalloc(sizeof(*node), GFP_KERNEL);
- if (unlikely(!node)) {
- list_del(&newnode->peer_list);
- kfree(newnode);
- return -ENOMEM;
- }
- INIT_LIST_HEAD(&node->peer_list);
- copy_and_assign_cidr(node, newnode->bits, cidr, bits);
+ choose_and_connect_node(parent, newnode);
+ return 0;
+ }
- rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down);
- rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode);
- if (!parent)
- rcu_assign_pointer(*trie, node);
- else
- rcu_assign_pointer(CHOOSE_NODE(parent, node->bits),
- node);
+ node = kmem_cache_zalloc(node_cache, GFP_KERNEL);
+ if (unlikely(!node)) {
+ list_del(&newnode->peer_list);
+ kmem_cache_free(node_cache, newnode);
+ return -ENOMEM;
}
+ INIT_LIST_HEAD(&node->peer_list);
+ copy_and_assign_cidr(node, newnode->bits, cidr, bits);
+
+ choose_and_connect_node(node, down);
+ choose_and_connect_node(node, newnode);
+ if (!parent)
+ connect_node(trie, 2, node);
+ else
+ choose_and_connect_node(parent, node);
return 0;
}
void wg_allowedips_remove_by_peer(struct allowedips *table,
struct wg_peer *peer, struct mutex *lock)
{
+ struct allowedips_node *node, *child, **parent_bit, *parent, *tmp;
+ bool free_parent;
+
+ if (list_empty(&peer->allowedips_list))
+ return;
++table->seq;
- walk_remove_by_peer(&table->root4, peer, lock);
- walk_remove_by_peer(&table->root6, peer, lock);
+ list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) {
+ list_del_init(&node->peer_list);
+ RCU_INIT_POINTER(node->peer, NULL);
+ if (node->bit[0] && node->bit[1])
+ continue;
+ child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])],
+ lockdep_is_held(lock));
+ if (child)
+ child->parent_bit_packed = node->parent_bit_packed;
+ parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL);
+ *parent_bit = child;
+ parent = (void *)parent_bit -
+ offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]);
+ free_parent = !rcu_access_pointer(node->bit[0]) &&
+ !rcu_access_pointer(node->bit[1]) &&
+ (node->parent_bit_packed & 3) <= 1 &&
+ !rcu_access_pointer(parent->peer);
+ if (free_parent)
+ child = rcu_dereference_protected(
+ parent->bit[!(node->parent_bit_packed & 1)],
+ lockdep_is_held(lock));
+ call_rcu(&node->rcu, node_free_rcu);
+ if (!free_parent)
+ continue;
+ if (child)
+ child->parent_bit_packed = parent->parent_bit_packed;
+ *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child;
+ call_rcu(&parent->rcu, node_free_rcu);
+ }
}
int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr)
return NULL;
}
+int __init wg_allowedips_slab_init(void)
+{
+ node_cache = KMEM_CACHE(allowedips_node, 0);
+ return node_cache ? 0 : -ENOMEM;
+}
+
+void wg_allowedips_slab_uninit(void)
+{
+ rcu_barrier();
+ kmem_cache_destroy(node_cache);
+}
+
#include "selftest/allowedips.c"
struct allowedips_node {
struct wg_peer __rcu *peer;
struct allowedips_node __rcu *bit[2];
- /* While it may seem scandalous that we waste space for v4,
- * we're alloc'ing to the nearest power of 2 anyway, so this
- * doesn't actually make a difference.
- */
- u8 bits[16] __aligned(__alignof(u64));
u8 cidr, bit_at_a, bit_at_b, bitlen;
+ u8 bits[16] __aligned(__alignof(u64));
- /* Keep rarely used list at bottom to be beyond cache line. */
+ /* Keep rarely used members at bottom to be beyond cache line. */
+ unsigned long parent_bit_packed;
union {
struct list_head peer_list;
struct rcu_head rcu;
struct allowedips_node __rcu *root4;
struct allowedips_node __rcu *root6;
u64 seq;
-};
+} __aligned(4); /* We pack the lower 2 bits of &root, but m68k only gives 16-bit alignment. */
void wg_allowedips_init(struct allowedips *table);
void wg_allowedips_free(struct allowedips *table, struct mutex *mutex);
bool wg_allowedips_selftest(void);
#endif
+int wg_allowedips_slab_init(void);
+void wg_allowedips_slab_uninit(void);
+
#endif /* _WG_ALLOWEDIPS_H */
{
int ret;
+ ret = wg_allowedips_slab_init();
+ if (ret < 0)
+ goto err_allowedips;
+
#ifdef DEBUG
+ ret = -ENOTRECOVERABLE;
if (!wg_allowedips_selftest() || !wg_packet_counter_selftest() ||
!wg_ratelimiter_selftest())
- return -ENOTRECOVERABLE;
+ goto err_peer;
#endif
wg_noise_init();
+ ret = wg_peer_init();
+ if (ret < 0)
+ goto err_peer;
+
ret = wg_device_init();
if (ret < 0)
goto err_device;
err_netlink:
wg_device_uninit();
err_device:
+ wg_peer_uninit();
+err_peer:
+ wg_allowedips_slab_uninit();
+err_allowedips:
return ret;
}
{
wg_genetlink_uninit();
wg_device_uninit();
+ wg_peer_uninit();
+ wg_allowedips_slab_uninit();
}
module_init(mod_init);
#include <linux/rcupdate.h>
#include <linux/list.h>
+static struct kmem_cache *peer_cache;
static atomic64_t peer_counter = ATOMIC64_INIT(0);
struct wg_peer *wg_peer_create(struct wg_device *wg,
if (wg->num_peers >= MAX_PEERS_PER_DEVICE)
return ERR_PTR(ret);
- peer = kzalloc(sizeof(*peer), GFP_KERNEL);
+ peer = kmem_cache_zalloc(peer_cache, GFP_KERNEL);
if (unlikely(!peer))
return ERR_PTR(ret);
- if (dst_cache_init(&peer->endpoint_cache, GFP_KERNEL))
+ if (unlikely(dst_cache_init(&peer->endpoint_cache, GFP_KERNEL)))
goto err;
peer->device = wg;
return peer;
err:
- kfree(peer);
+ kmem_cache_free(peer_cache, peer);
return ERR_PTR(ret);
}
/* Mark as dead, so that we don't allow jumping contexts after. */
WRITE_ONCE(peer->is_dead, true);
- /* The caller must now synchronize_rcu() for this to take effect. */
+ /* The caller must now synchronize_net() for this to take effect. */
}
static void peer_remove_after_dead(struct wg_peer *peer)
lockdep_assert_held(&peer->device->device_update_lock);
peer_make_dead(peer);
- synchronize_rcu();
+ synchronize_net();
peer_remove_after_dead(peer);
}
peer_make_dead(peer);
list_add_tail(&peer->peer_list, &dead_peers);
}
- synchronize_rcu();
+ synchronize_net();
list_for_each_entry_safe(peer, temp, &dead_peers, peer_list)
peer_remove_after_dead(peer);
}
/* The final zeroing takes care of clearing any remaining handshake key
* material and other potentially sensitive information.
*/
- kfree_sensitive(peer);
+ memzero_explicit(peer, sizeof(*peer));
+ kmem_cache_free(peer_cache, peer);
}
static void kref_release(struct kref *refcount)
return;
kref_put(&peer->refcount, kref_release);
}
+
+int __init wg_peer_init(void)
+{
+ peer_cache = KMEM_CACHE(wg_peer, 0);
+ return peer_cache ? 0 : -ENOMEM;
+}
+
+void wg_peer_uninit(void)
+{
+ kmem_cache_destroy(peer_cache);
+}
void wg_peer_remove(struct wg_peer *peer);
void wg_peer_remove_all(struct wg_device *wg);
+int wg_peer_init(void);
+void wg_peer_uninit(void);
+
#endif /* _WG_PEER_H */
#include <linux/siphash.h>
-static __init void swap_endian_and_apply_cidr(u8 *dst, const u8 *src, u8 bits,
- u8 cidr)
-{
- swap_endian(dst, src, bits);
- memset(dst + (cidr + 7) / 8, 0, bits / 8 - (cidr + 7) / 8);
- if (cidr)
- dst[(cidr + 7) / 8 - 1] &= ~0U << ((8 - (cidr % 8)) % 8);
-}
-
static __init void print_node(struct allowedips_node *node, u8 bits)
{
char *fmt_connection = KERN_DEBUG "\t\"%p/%d\" -> \"%p/%d\";\n";
- char *fmt_declaration = KERN_DEBUG
- "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n";
+ char *fmt_declaration = KERN_DEBUG "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n";
+ u8 ip1[16], ip2[16], cidr1, cidr2;
char *style = "dotted";
- u8 ip1[16], ip2[16];
u32 color = 0;
+ if (node == NULL)
+ return;
if (bits == 32) {
fmt_connection = KERN_DEBUG "\t\"%pI4/%d\" -> \"%pI4/%d\";\n";
- fmt_declaration = KERN_DEBUG
- "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n";
+ fmt_declaration = KERN_DEBUG "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n";
} else if (bits == 128) {
fmt_connection = KERN_DEBUG "\t\"%pI6/%d\" -> \"%pI6/%d\";\n";
- fmt_declaration = KERN_DEBUG
- "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n";
+ fmt_declaration = KERN_DEBUG "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n";
}
if (node->peer) {
hsiphash_key_t key = { { 0 } };
hsiphash_1u32(0xabad1dea, &key) % 200;
style = "bold";
}
- swap_endian_and_apply_cidr(ip1, node->bits, bits, node->cidr);
- printk(fmt_declaration, ip1, node->cidr, style, color);
+ wg_allowedips_read_node(node, ip1, &cidr1);
+ printk(fmt_declaration, ip1, cidr1, style, color);
if (node->bit[0]) {
- swap_endian_and_apply_cidr(ip2,
- rcu_dereference_raw(node->bit[0])->bits, bits,
- node->cidr);
- printk(fmt_connection, ip1, node->cidr, ip2,
- rcu_dereference_raw(node->bit[0])->cidr);
- print_node(rcu_dereference_raw(node->bit[0]), bits);
+ wg_allowedips_read_node(rcu_dereference_raw(node->bit[0]), ip2, &cidr2);
+ printk(fmt_connection, ip1, cidr1, ip2, cidr2);
}
if (node->bit[1]) {
- swap_endian_and_apply_cidr(ip2,
- rcu_dereference_raw(node->bit[1])->bits,
- bits, node->cidr);
- printk(fmt_connection, ip1, node->cidr, ip2,
- rcu_dereference_raw(node->bit[1])->cidr);
- print_node(rcu_dereference_raw(node->bit[1]), bits);
+ wg_allowedips_read_node(rcu_dereference_raw(node->bit[1]), ip2, &cidr2);
+ printk(fmt_connection, ip1, cidr1, ip2, cidr2);
}
+ if (node->bit[0])
+ print_node(rcu_dereference_raw(node->bit[0]), bits);
+ if (node->bit[1])
+ print_node(rcu_dereference_raw(node->bit[1]), bits);
}
static __init void print_tree(struct allowedips_node __rcu *top, u8 bits)
{
union nf_inet_addr mask;
- memset(&mask, 0x00, 128 / 8);
- memset(&mask, 0xff, cidr / 8);
+ memset(&mask, 0, sizeof(mask));
+ memset(&mask.all, 0xff, cidr / 8);
if (cidr % 32)
mask.all[cidr / 32] = (__force u32)htonl(
(0xFFFFFFFFUL << (32 - (cidr % 32))) & 0xFFFFFFFFUL);
}
static __init inline bool
-horrible_match_v4(const struct horrible_allowedips_node *node,
- struct in_addr *ip)
+horrible_match_v4(const struct horrible_allowedips_node *node, struct in_addr *ip)
{
return (ip->s_addr & node->mask.ip) == node->ip.ip;
}
static __init inline bool
-horrible_match_v6(const struct horrible_allowedips_node *node,
- struct in6_addr *ip)
+horrible_match_v6(const struct horrible_allowedips_node *node, struct in6_addr *ip)
{
- return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) ==
- node->ip.ip6[0] &&
- (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) ==
- node->ip.ip6[1] &&
- (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) ==
- node->ip.ip6[2] &&
+ return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == node->ip.ip6[0] &&
+ (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == node->ip.ip6[1] &&
+ (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == node->ip.ip6[2] &&
(ip->in6_u.u6_addr32[3] & node->mask.ip6[3]) == node->ip.ip6[3];
}
static __init void
-horrible_insert_ordered(struct horrible_allowedips *table,
- struct horrible_allowedips_node *node)
+horrible_insert_ordered(struct horrible_allowedips *table, struct horrible_allowedips_node *node)
{
struct horrible_allowedips_node *other = NULL, *where = NULL;
u8 my_cidr = horrible_mask_to_cidr(node->mask);
hlist_for_each_entry(other, &table->head, table) {
- if (!memcmp(&other->mask, &node->mask,
- sizeof(union nf_inet_addr)) &&
- !memcmp(&other->ip, &node->ip,
- sizeof(union nf_inet_addr)) &&
- other->ip_version == node->ip_version) {
+ if (other->ip_version == node->ip_version &&
+ !memcmp(&other->mask, &node->mask, sizeof(union nf_inet_addr)) &&
+ !memcmp(&other->ip, &node->ip, sizeof(union nf_inet_addr))) {
other->value = node->value;
kfree(node);
return;
}
+ }
+ hlist_for_each_entry(other, &table->head, table) {
where = other;
if (horrible_mask_to_cidr(other->mask) <= my_cidr)
break;
horrible_allowedips_insert_v4(struct horrible_allowedips *table,
struct in_addr *ip, u8 cidr, void *value)
{
- struct horrible_allowedips_node *node = kzalloc(sizeof(*node),
- GFP_KERNEL);
+ struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL);
if (unlikely(!node))
return -ENOMEM;
horrible_allowedips_insert_v6(struct horrible_allowedips *table,
struct in6_addr *ip, u8 cidr, void *value)
{
- struct horrible_allowedips_node *node = kzalloc(sizeof(*node),
- GFP_KERNEL);
+ struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL);
if (unlikely(!node))
return -ENOMEM;
}
static __init void *
-horrible_allowedips_lookup_v4(struct horrible_allowedips *table,
- struct in_addr *ip)
+horrible_allowedips_lookup_v4(struct horrible_allowedips *table, struct in_addr *ip)
{
struct horrible_allowedips_node *node;
- void *ret = NULL;
hlist_for_each_entry(node, &table->head, table) {
- if (node->ip_version != 4)
- continue;
- if (horrible_match_v4(node, ip)) {
- ret = node->value;
- break;
- }
+ if (node->ip_version == 4 && horrible_match_v4(node, ip))
+ return node->value;
}
- return ret;
+ return NULL;
}
static __init void *
-horrible_allowedips_lookup_v6(struct horrible_allowedips *table,
- struct in6_addr *ip)
+horrible_allowedips_lookup_v6(struct horrible_allowedips *table, struct in6_addr *ip)
{
struct horrible_allowedips_node *node;
- void *ret = NULL;
hlist_for_each_entry(node, &table->head, table) {
- if (node->ip_version != 6)
+ if (node->ip_version == 6 && horrible_match_v6(node, ip))
+ return node->value;
+ }
+ return NULL;
+}
+
+
+static __init void
+horrible_allowedips_remove_by_value(struct horrible_allowedips *table, void *value)
+{
+ struct horrible_allowedips_node *node;
+ struct hlist_node *h;
+
+ hlist_for_each_entry_safe(node, h, &table->head, table) {
+ if (node->value != value)
continue;
- if (horrible_match_v6(node, ip)) {
- ret = node->value;
- break;
- }
+ hlist_del(&node->table);
+ kfree(node);
}
- return ret;
+
}
static __init bool randomized_test(void)
goto free;
}
kref_init(&peers[i]->refcount);
+ INIT_LIST_HEAD(&peers[i]->allowedips_list);
}
mutex_lock(&mutex);
if (wg_allowedips_insert_v4(&t,
(struct in_addr *)mutated,
cidr, peer, &mutex) < 0) {
- pr_err("allowedips random malloc: FAIL\n");
+ pr_err("allowedips random self-test malloc: FAIL\n");
goto free_locked;
}
if (horrible_allowedips_insert_v4(&h,
print_tree(t.root6, 128);
}
- for (i = 0; i < NUM_QUERIES; ++i) {
- prandom_bytes(ip, 4);
- if (lookup(t.root4, 32, ip) !=
- horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) {
- pr_err("allowedips random self-test: FAIL\n");
- goto free;
+ for (j = 0;; ++j) {
+ for (i = 0; i < NUM_QUERIES; ++i) {
+ prandom_bytes(ip, 4);
+ if (lookup(t.root4, 32, ip) != horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) {
+ horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip);
+ pr_err("allowedips random v4 self-test: FAIL\n");
+ goto free;
+ }
+ prandom_bytes(ip, 16);
+ if (lookup(t.root6, 128, ip) != horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) {
+ pr_err("allowedips random v6 self-test: FAIL\n");
+ goto free;
+ }
}
+ if (j >= NUM_PEERS)
+ break;
+ mutex_lock(&mutex);
+ wg_allowedips_remove_by_peer(&t, peers[j], &mutex);
+ mutex_unlock(&mutex);
+ horrible_allowedips_remove_by_value(&h, peers[j]);
}
- for (i = 0; i < NUM_QUERIES; ++i) {
- prandom_bytes(ip, 16);
- if (lookup(t.root6, 128, ip) !=
- horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) {
- pr_err("allowedips random self-test: FAIL\n");
- goto free;
- }
+ if (t.root4 || t.root6) {
+ pr_err("allowedips random self-test removal: FAIL\n");
+ goto free;
}
+
ret = true;
free:
if (new4)
wg->incoming_port = ntohs(inet_sk(new4)->inet_sport);
mutex_unlock(&wg->socket_update_lock);
- synchronize_rcu();
+ synchronize_net();
sock_free(old4);
sock_free(old6);
}
#define ATH10K_HTT_TXRX_PEER_SECURITY_MAX 2
#define ATH10K_TXRX_NUM_EXT_TIDS 19
+#define ATH10K_TXRX_NON_QOS_TID 16
enum htt_security_flags {
#define HTT_SECURITY_TYPE_MASK 0x7F
msdu->ip_summed = ath10k_htt_rx_get_csum_state(msdu);
}
+static u64 ath10k_htt_rx_h_get_pn(struct ath10k *ar, struct sk_buff *skb,
+ u16 offset,
+ enum htt_rx_mpdu_encrypt_type enctype)
+{
+ struct ieee80211_hdr *hdr;
+ u64 pn = 0;
+ u8 *ehdr;
+
+ hdr = (struct ieee80211_hdr *)(skb->data + offset);
+ ehdr = skb->data + offset + ieee80211_hdrlen(hdr->frame_control);
+
+ if (enctype == HTT_RX_MPDU_ENCRYPT_AES_CCM_WPA2) {
+ pn = ehdr[0];
+ pn |= (u64)ehdr[1] << 8;
+ pn |= (u64)ehdr[4] << 16;
+ pn |= (u64)ehdr[5] << 24;
+ pn |= (u64)ehdr[6] << 32;
+ pn |= (u64)ehdr[7] << 40;
+ }
+ return pn;
+}
+
+static bool ath10k_htt_rx_h_frag_multicast_check(struct ath10k *ar,
+ struct sk_buff *skb,
+ u16 offset)
+{
+ struct ieee80211_hdr *hdr;
+
+ hdr = (struct ieee80211_hdr *)(skb->data + offset);
+ return !is_multicast_ether_addr(hdr->addr1);
+}
+
+static bool ath10k_htt_rx_h_frag_pn_check(struct ath10k *ar,
+ struct sk_buff *skb,
+ u16 peer_id,
+ u16 offset,
+ enum htt_rx_mpdu_encrypt_type enctype)
+{
+ struct ath10k_peer *peer;
+ union htt_rx_pn_t *last_pn, new_pn = {0};
+ struct ieee80211_hdr *hdr;
+ bool more_frags;
+ u8 tid, frag_number;
+ u32 seq;
+
+ peer = ath10k_peer_find_by_id(ar, peer_id);
+ if (!peer) {
+ ath10k_dbg(ar, ATH10K_DBG_HTT, "invalid peer for frag pn check\n");
+ return false;
+ }
+
+ hdr = (struct ieee80211_hdr *)(skb->data + offset);
+ if (ieee80211_is_data_qos(hdr->frame_control))
+ tid = ieee80211_get_tid(hdr);
+ else
+ tid = ATH10K_TXRX_NON_QOS_TID;
+
+ last_pn = &peer->frag_tids_last_pn[tid];
+ new_pn.pn48 = ath10k_htt_rx_h_get_pn(ar, skb, offset, enctype);
+ more_frags = ieee80211_has_morefrags(hdr->frame_control);
+ frag_number = le16_to_cpu(hdr->seq_ctrl) & IEEE80211_SCTL_FRAG;
+ seq = (__le16_to_cpu(hdr->seq_ctrl) & IEEE80211_SCTL_SEQ) >> 4;
+
+ if (frag_number == 0) {
+ last_pn->pn48 = new_pn.pn48;
+ peer->frag_tids_seq[tid] = seq;
+ } else {
+ if (seq != peer->frag_tids_seq[tid])
+ return false;
+
+ if (new_pn.pn48 != last_pn->pn48 + 1)
+ return false;
+
+ last_pn->pn48 = new_pn.pn48;
+ }
+
+ return true;
+}
+
static void ath10k_htt_rx_h_mpdu(struct ath10k *ar,
struct sk_buff_head *amsdu,
struct ieee80211_rx_status *status,
bool fill_crypt_header,
u8 *rx_hdr,
- enum ath10k_pkt_rx_err *err)
+ enum ath10k_pkt_rx_err *err,
+ u16 peer_id,
+ bool frag)
{
struct sk_buff *first;
struct sk_buff *last;
- struct sk_buff *msdu;
+ struct sk_buff *msdu, *temp;
struct htt_rx_desc *rxd;
struct ieee80211_hdr *hdr;
enum htt_rx_mpdu_encrypt_type enctype;
bool is_decrypted;
bool is_mgmt;
u32 attention;
+ bool frag_pn_check = true, multicast_check = true;
if (skb_queue_empty(amsdu))
return;
}
skb_queue_walk(amsdu, msdu) {
+ if (frag && !fill_crypt_header && is_decrypted &&
+ enctype == HTT_RX_MPDU_ENCRYPT_AES_CCM_WPA2)
+ frag_pn_check = ath10k_htt_rx_h_frag_pn_check(ar,
+ msdu,
+ peer_id,
+ 0,
+ enctype);
+
+ if (frag)
+ multicast_check = ath10k_htt_rx_h_frag_multicast_check(ar,
+ msdu,
+ 0);
+
+ if (!frag_pn_check || !multicast_check) {
+ /* Discard the fragment with invalid PN or multicast DA
+ */
+ temp = msdu->prev;
+ __skb_unlink(msdu, amsdu);
+ dev_kfree_skb_any(msdu);
+ msdu = temp;
+ frag_pn_check = true;
+ multicast_check = true;
+ continue;
+ }
+
ath10k_htt_rx_h_csum_offload(msdu);
+
+ if (frag && !fill_crypt_header &&
+ enctype == HTT_RX_MPDU_ENCRYPT_TKIP_WPA)
+ status->flag &= ~RX_FLAG_MMIC_STRIPPED;
+
ath10k_htt_rx_h_undecap(ar, msdu, status, first_hdr, enctype,
is_decrypted);
hdr = (void *)msdu->data;
hdr->frame_control &= ~__cpu_to_le16(IEEE80211_FCTL_PROTECTED);
+
+ if (frag && !fill_crypt_header &&
+ enctype == HTT_RX_MPDU_ENCRYPT_TKIP_WPA)
+ status->flag &= ~RX_FLAG_IV_STRIPPED &
+ ~RX_FLAG_MMIC_STRIPPED;
}
}
ath10k_unchain_msdu(amsdu, unchain_cnt);
}
+static bool ath10k_htt_rx_validate_amsdu(struct ath10k *ar,
+ struct sk_buff_head *amsdu)
+{
+ u8 *subframe_hdr;
+ struct sk_buff *first;
+ bool is_first, is_last;
+ struct htt_rx_desc *rxd;
+ struct ieee80211_hdr *hdr;
+ size_t hdr_len, crypto_len;
+ enum htt_rx_mpdu_encrypt_type enctype;
+ int bytes_aligned = ar->hw_params.decap_align_bytes;
+
+ first = skb_peek(amsdu);
+
+ rxd = (void *)first->data - sizeof(*rxd);
+ hdr = (void *)rxd->rx_hdr_status;
+
+ is_first = !!(rxd->msdu_end.common.info0 &
+ __cpu_to_le32(RX_MSDU_END_INFO0_FIRST_MSDU));
+ is_last = !!(rxd->msdu_end.common.info0 &
+ __cpu_to_le32(RX_MSDU_END_INFO0_LAST_MSDU));
+
+ /* Return in case of non-aggregated msdu */
+ if (is_first && is_last)
+ return true;
+
+ /* First msdu flag is not set for the first msdu of the list */
+ if (!is_first)
+ return false;
+
+ enctype = MS(__le32_to_cpu(rxd->mpdu_start.info0),
+ RX_MPDU_START_INFO0_ENCRYPT_TYPE);
+
+ hdr_len = ieee80211_hdrlen(hdr->frame_control);
+ crypto_len = ath10k_htt_rx_crypto_param_len(ar, enctype);
+
+ subframe_hdr = (u8 *)hdr + round_up(hdr_len, bytes_aligned) +
+ crypto_len;
+
+ /* Validate if the amsdu has a proper first subframe.
+ * There are chances a single msdu can be received as amsdu when
+ * the unauthenticated amsdu flag of a QoS header
+ * gets flipped in non-SPP AMSDU's, in such cases the first
+ * subframe has llc/snap header in place of a valid da.
+ * return false if the da matches rfc1042 pattern
+ */
+ if (ether_addr_equal(subframe_hdr, rfc1042_header))
+ return false;
+
+ return true;
+}
+
static bool ath10k_htt_rx_amsdu_allowed(struct ath10k *ar,
struct sk_buff_head *amsdu,
struct ieee80211_rx_status *rx_status)
{
- /* FIXME: It might be a good idea to do some fuzzy-testing to drop
- * invalid/dangerous frames.
- */
-
if (!rx_status->freq) {
ath10k_dbg(ar, ATH10K_DBG_HTT, "no channel configured; ignoring frame(s)!\n");
return false;
return false;
}
+ if (!ath10k_htt_rx_validate_amsdu(ar, amsdu)) {
+ ath10k_dbg(ar, ATH10K_DBG_HTT, "invalid amsdu received\n");
+ return false;
+ }
+
return true;
}
ath10k_htt_rx_h_unchain(ar, &amsdu, &drop_cnt, &unchain_cnt);
ath10k_htt_rx_h_filter(ar, &amsdu, rx_status, &drop_cnt_filter);
- ath10k_htt_rx_h_mpdu(ar, &amsdu, rx_status, true, first_hdr, &err);
+ ath10k_htt_rx_h_mpdu(ar, &amsdu, rx_status, true, first_hdr, &err, 0,
+ false);
msdus_to_queue = skb_queue_len(&amsdu);
ath10k_htt_rx_h_enqueue(ar, &amsdu, rx_status);
fw_desc = &rx->fw_desc;
rx_desc_len = fw_desc->len;
+ if (fw_desc->u.bits.discard) {
+ ath10k_dbg(ar, ATH10K_DBG_HTT, "htt discard mpdu\n");
+ goto err;
+ }
+
/* I have not yet seen any case where num_mpdu_ranges > 1.
* qcacld does not seem handle that case either, so we introduce the
* same limitiation here as well.
rx_desc = (struct htt_hl_rx_desc *)(skb->data + tot_hdr_len);
rx_desc_info = __le32_to_cpu(rx_desc->info);
+ hdr = (struct ieee80211_hdr *)((u8 *)rx_desc + rx_hl->fw_desc.len);
+
+ if (is_multicast_ether_addr(hdr->addr1)) {
+ /* Discard the fragment with multicast DA */
+ goto err;
+ }
+
if (!MS(rx_desc_info, HTT_RX_DESC_HL_INFO_ENCRYPTED)) {
spin_unlock_bh(&ar->data_lock);
return ath10k_htt_rx_proc_rx_ind_hl(htt, &resp->rx_ind_hl, skb,
HTT_RX_NON_TKIP_MIC);
}
- hdr = (struct ieee80211_hdr *)((u8 *)rx_desc + rx_hl->fw_desc.len);
-
if (ieee80211_has_retry(hdr->frame_control))
goto err;
ath10k_htt_rx_h_ppdu(ar, &amsdu, status, vdev_id);
ath10k_htt_rx_h_filter(ar, &amsdu, status, NULL);
ath10k_htt_rx_h_mpdu(ar, &amsdu, status, false, NULL,
- NULL);
+ NULL, peer_id, frag);
ath10k_htt_rx_h_enqueue(ar, &amsdu, status);
break;
case -EAGAIN:
#define FW_RX_DESC_UDP (1 << 6)
struct fw_rx_desc_hl {
- u8 info0;
+ union {
+ struct {
+ u8 discard:1,
+ forward:1,
+ any_err:1,
+ dup_err:1,
+ reserved:1,
+ inspect:1,
+ extension:2;
+ } bits;
+ u8 info0;
+ } u;
+
u8 version;
u8 len;
u8 flags;
ab->hw_params.hw_ops->rx_desc_set_msdu_len(desc, len);
}
+static bool ath11k_dp_rx_h_attn_is_mcbc(struct ath11k_base *ab,
+ struct hal_rx_desc *desc)
+{
+ struct rx_attention *attn = ath11k_dp_rx_get_attention(ab, desc);
+
+ return ath11k_dp_rx_h_msdu_end_first_msdu(ab, desc) &&
+ (!!FIELD_GET(RX_ATTENTION_INFO1_MCAST_BCAST,
+ __le32_to_cpu(attn->info1)));
+}
+
static void ath11k_dp_service_mon_ring(struct timer_list *t)
{
struct ath11k_base *ab = from_timer(ab, t, mon_reap_timer);
__skb_queue_purge(&rx_tid->rx_frags);
}
+void ath11k_peer_frags_flush(struct ath11k *ar, struct ath11k_peer *peer)
+{
+ struct dp_rx_tid *rx_tid;
+ int i;
+
+ lockdep_assert_held(&ar->ab->base_lock);
+
+ for (i = 0; i <= IEEE80211_NUM_TIDS; i++) {
+ rx_tid = &peer->rx_tid[i];
+
+ spin_unlock_bh(&ar->ab->base_lock);
+ del_timer_sync(&rx_tid->frag_timer);
+ spin_lock_bh(&ar->ab->base_lock);
+
+ ath11k_dp_rx_frags_cleanup(rx_tid, true);
+ }
+}
+
void ath11k_peer_rx_tid_cleanup(struct ath11k *ar, struct ath11k_peer *peer)
{
struct dp_rx_tid *rx_tid;
u8 tid;
int ret = 0;
bool more_frags;
+ bool is_mcbc;
rx_desc = (struct hal_rx_desc *)msdu->data;
peer_id = ath11k_dp_rx_h_mpdu_start_peer_id(ar->ab, rx_desc);
seqno = ath11k_dp_rx_h_mpdu_start_seq_no(ar->ab, rx_desc);
frag_no = ath11k_dp_rx_h_mpdu_start_frag_no(ar->ab, msdu);
more_frags = ath11k_dp_rx_h_mpdu_start_more_frags(ar->ab, msdu);
+ is_mcbc = ath11k_dp_rx_h_attn_is_mcbc(ar->ab, rx_desc);
+
+ /* Multicast/Broadcast fragments are not expected */
+ if (is_mcbc)
+ return -EINVAL;
if (!ath11k_dp_rx_h_mpdu_start_seq_ctrl_valid(ar->ab, rx_desc) ||
!ath11k_dp_rx_h_mpdu_start_fc_valid(ar->ab, rx_desc) ||
const u8 *peer_addr,
enum set_key_cmd key_cmd,
struct ieee80211_key_conf *key);
+void ath11k_peer_frags_flush(struct ath11k *ar, struct ath11k_peer *peer);
void ath11k_peer_rx_tid_cleanup(struct ath11k *ar, struct ath11k_peer *peer);
void ath11k_peer_rx_tid_delete(struct ath11k *ar,
struct ath11k_peer *peer, u8 tid);
*/
spin_lock_bh(&ab->base_lock);
peer = ath11k_peer_find(ab, arvif->vdev_id, peer_addr);
+
+ /* flush the fragments cache during key (re)install to
+ * ensure all frags in the new frag list belong to the same key.
+ */
+ if (peer && cmd == SET_KEY)
+ ath11k_peer_frags_flush(ar, peer);
spin_unlock_bh(&ab->base_lock);
if (!peer) {
static void mt76_rx_release_amsdu(struct mt76_phy *phy, enum mt76_rxq_id q)
{
struct sk_buff *skb = phy->rx_amsdu[q].head;
+ struct mt76_rx_status *status = (struct mt76_rx_status *)skb->cb;
struct mt76_dev *dev = phy->dev;
phy->rx_amsdu[q].head = NULL;
phy->rx_amsdu[q].tail = NULL;
+
+ /*
+ * Validate if the amsdu has a proper first subframe.
+ * A single MSDU can be parsed as A-MSDU when the unauthenticated A-MSDU
+ * flag of the QoS header gets flipped. In such cases, the first
+ * subframe has a LLC/SNAP header in the location of the destination
+ * address.
+ */
+ if (skb_shinfo(skb)->frag_list) {
+ int offset = 0;
+
+ if (!(status->flag & RX_FLAG_8023)) {
+ offset = ieee80211_get_hdrlen_from_skb(skb);
+
+ if ((status->flag &
+ (RX_FLAG_DECRYPTED | RX_FLAG_IV_STRIPPED)) ==
+ RX_FLAG_DECRYPTED)
+ offset += 8;
+ }
+
+ if (ether_addr_equal(skb->data + offset, rfc1042_header)) {
+ dev_kfree_skb(skb);
+ return;
+ }
+ }
__skb_queue_tail(&dev->rx_skb[q], skb);
}
mutex_init(&dev->pm.mutex);
init_waitqueue_head(&dev->pm.wait);
spin_lock_init(&dev->pm.txq_lock);
- set_bit(MT76_STATE_PM, &dev->mphy.state);
INIT_DELAYED_WORK(&dev->mphy.mac_work, mt7615_mac_work);
INIT_DELAYED_WORK(&dev->phy.scan_work, mt7615_scan_work);
INIT_DELAYED_WORK(&dev->coredump.work, mt7615_coredump_work);
napi_schedule(&dev->mt76.napi[i]);
mt76_connac_pm_dequeue_skbs(mphy, &dev->pm);
mt76_queue_tx_cleanup(dev, dev->mt76.q_mcu[MT_MCUQ_WM], false);
- ieee80211_queue_delayed_work(mphy->hw, &mphy->mac_work,
- MT7615_WATCHDOG_TIME);
+ if (test_bit(MT76_STATE_RUNNING, &mphy->state))
+ ieee80211_queue_delayed_work(mphy->hw, &mphy->mac_work,
+ MT7615_WATCHDOG_TIME);
}
ieee80211_wake_queues(mphy->hw);
return ret;
}
-static int mt7663s_mcu_drv_pmctrl(struct mt7615_dev *dev)
+static int __mt7663s_mcu_drv_pmctrl(struct mt7615_dev *dev)
{
struct sdio_func *func = dev->mt76.sdio.func;
struct mt76_phy *mphy = &dev->mt76.phy;
u32 status;
int ret;
- if (!test_and_clear_bit(MT76_STATE_PM, &mphy->state))
- goto out;
-
sdio_claim_host(func);
sdio_writel(func, WHLPCR_FW_OWN_REQ_CLR, MCR_WHLPCR, NULL);
}
sdio_release_host(func);
-
-out:
dev->pm.last_activity = jiffies;
return 0;
}
+static int mt7663s_mcu_drv_pmctrl(struct mt7615_dev *dev)
+{
+ struct mt76_phy *mphy = &dev->mt76.phy;
+
+ if (test_and_clear_bit(MT76_STATE_PM, &mphy->state))
+ return __mt7663s_mcu_drv_pmctrl(dev);
+
+ return 0;
+}
+
static int mt7663s_mcu_fw_pmctrl(struct mt7615_dev *dev)
{
struct sdio_func *func = dev->mt76.sdio.func;
struct mt7615_mcu_ops *mcu_ops;
int ret;
- ret = mt7663s_mcu_drv_pmctrl(dev);
+ ret = __mt7663s_mcu_drv_pmctrl(dev);
if (ret)
return ret;
dev->mt76.mcu_ops = &mt7663u_mcu_ops,
- /* usb does not support runtime-pm */
- clear_bit(MT76_STATE_PM, &dev->mphy.state);
mt76_set(dev, MT_UDMA_TX_QSEL, MT_FW_DL_EN);
-
if (test_and_clear_bit(MT76_STATE_POWER_OFF, &dev->mphy.state)) {
mt7615_mcu_restart(&dev->mt76);
if (!mt76_poll_msec(dev, MT_CONN_ON_MISC,
phy->phy_type = mt76_connac_get_phy_mode_v2(mphy, vif, band, sta);
phy->basic_rate = cpu_to_le16((u16)vif->bss_conf.basic_rates);
phy->rcpi = rcpi;
+ phy->ampdu = FIELD_PREP(IEEE80211_HT_AMPDU_PARM_FACTOR,
+ sta->ht_cap.ampdu_factor) |
+ FIELD_PREP(IEEE80211_HT_AMPDU_PARM_DENSITY,
+ sta->ht_cap.ampdu_density);
tlv = mt76_connac_mcu_add_tlv(skb, STA_REC_RA, sizeof(*ra_info));
ra_info = (struct sta_rec_ra_info *)tlv;
.reconfig_complete = mt76x02_reconfig_complete,
};
-static int mt76x0e_register_device(struct mt76x02_dev *dev)
+static int mt76x0e_init_hardware(struct mt76x02_dev *dev, bool resume)
{
int err;
if (err < 0)
return err;
- err = mt76x02_dma_init(dev);
- if (err < 0)
- return err;
+ if (!resume) {
+ err = mt76x02_dma_init(dev);
+ if (err < 0)
+ return err;
+ }
err = mt76x0_init_hardware(dev);
if (err < 0)
mt76_clear(dev, 0x110, BIT(9));
mt76_set(dev, MT_MAX_LEN_CFG, BIT(13));
+ return 0;
+}
+
+static int mt76x0e_register_device(struct mt76x02_dev *dev)
+{
+ int err;
+
+ err = mt76x0e_init_hardware(dev, false);
+ if (err < 0)
+ return err;
+
err = mt76x0_register_device(dev);
if (err < 0)
return err;
if (ret)
return ret;
+ mt76_pci_disable_aspm(pdev);
+
mdev = mt76_alloc_device(&pdev->dev, sizeof(*dev), &mt76x0e_ops,
&drv_ops);
if (!mdev)
mt76_free_device(mdev);
}
+#ifdef CONFIG_PM
+static int mt76x0e_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+ struct mt76_dev *mdev = pci_get_drvdata(pdev);
+ struct mt76x02_dev *dev = container_of(mdev, struct mt76x02_dev, mt76);
+ int i;
+
+ mt76_worker_disable(&mdev->tx_worker);
+ for (i = 0; i < ARRAY_SIZE(mdev->phy.q_tx); i++)
+ mt76_queue_tx_cleanup(dev, mdev->phy.q_tx[i], true);
+ for (i = 0; i < ARRAY_SIZE(mdev->q_mcu); i++)
+ mt76_queue_tx_cleanup(dev, mdev->q_mcu[i], true);
+ napi_disable(&mdev->tx_napi);
+
+ mt76_for_each_q_rx(mdev, i)
+ napi_disable(&mdev->napi[i]);
+
+ mt76x02_dma_disable(dev);
+ mt76x02_mcu_cleanup(dev);
+ mt76x0_chip_onoff(dev, false, false);
+
+ pci_enable_wake(pdev, pci_choose_state(pdev, state), true);
+ pci_save_state(pdev);
+
+ return pci_set_power_state(pdev, pci_choose_state(pdev, state));
+}
+
+static int mt76x0e_resume(struct pci_dev *pdev)
+{
+ struct mt76_dev *mdev = pci_get_drvdata(pdev);
+ struct mt76x02_dev *dev = container_of(mdev, struct mt76x02_dev, mt76);
+ int err, i;
+
+ err = pci_set_power_state(pdev, PCI_D0);
+ if (err)
+ return err;
+
+ pci_restore_state(pdev);
+
+ mt76_worker_enable(&mdev->tx_worker);
+
+ mt76_for_each_q_rx(mdev, i) {
+ mt76_queue_rx_reset(dev, i);
+ napi_enable(&mdev->napi[i]);
+ napi_schedule(&mdev->napi[i]);
+ }
+
+ napi_enable(&mdev->tx_napi);
+ napi_schedule(&mdev->tx_napi);
+
+ return mt76x0e_init_hardware(dev, true);
+}
+#endif /* CONFIG_PM */
+
static const struct pci_device_id mt76x0e_device_table[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_MEDIATEK, 0x7610) },
{ PCI_DEVICE(PCI_VENDOR_ID_MEDIATEK, 0x7630) },
.id_table = mt76x0e_device_table,
.probe = mt76x0e_probe,
.remove = mt76x0e_remove,
+#ifdef CONFIG_PM
+ .suspend = mt76x0e_suspend,
+ .resume = mt76x0e_resume,
+#endif /* CONFIG_PM */
};
module_pci_driver(mt76x0e_driver);
struct wiphy *wiphy = hw->wiphy;
hw->queues = 4;
- hw->max_rx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF;
- hw->max_tx_aggregation_subframes = IEEE80211_MAX_AMPDU_BUF;
+ hw->max_rx_aggregation_subframes = 64;
+ hw->max_tx_aggregation_subframes = 128;
hw->radiotap_timestamp.units_pos =
IEEE80211_RADIOTAP_TIMESTAMP_UNIT_US;
napi_schedule(&dev->mt76.napi[i]);
mt76_connac_pm_dequeue_skbs(mphy, &dev->pm);
mt7921_tx_cleanup(dev);
- ieee80211_queue_delayed_work(mphy->hw, &mphy->mac_work,
- MT7921_WATCHDOG_TIME);
+ if (test_bit(MT76_STATE_RUNNING, &mphy->state))
+ ieee80211_queue_delayed_work(mphy->hw, &mphy->mac_work,
+ MT7921_WATCHDOG_TIME);
}
ieee80211_wake_queues(mphy->hw);
IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_40MHZ_IN_2G;
else if (band == NL80211_BAND_5GHZ)
he_cap_elem->phy_cap_info[0] =
- IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_40MHZ_80MHZ_IN_5G |
- IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_80PLUS80_MHZ_IN_5G;
+ IEEE80211_HE_PHY_CAP0_CHANNEL_WIDTH_SET_40MHZ_80MHZ_IN_5G;
he_cap_elem->phy_cap_info[1] =
IEEE80211_HE_PHY_CAP1_LDPC_CODING_IN_PAYLOAD;
mt7921_mcu_tx_rate_report(struct mt7921_dev *dev, struct sk_buff *skb,
u16 wlan_idx)
{
- struct mt7921_mcu_wlan_info_event *wtbl_info =
- (struct mt7921_mcu_wlan_info_event *)(skb->data);
- struct rate_info rate = {};
- u8 curr_idx = wtbl_info->rate_info.rate_idx;
- u16 curr = le16_to_cpu(wtbl_info->rate_info.rate[curr_idx]);
- struct mt7921_mcu_peer_cap peer = wtbl_info->peer_cap;
+ struct mt7921_mcu_wlan_info_event *wtbl_info;
struct mt76_phy *mphy = &dev->mphy;
struct mt7921_sta_stats *stats;
+ struct rate_info rate = {};
struct mt7921_sta *msta;
struct mt76_wcid *wcid;
+ u8 idx;
if (wlan_idx >= MT76_N_WCIDS)
return;
+ wtbl_info = (struct mt7921_mcu_wlan_info_event *)skb->data;
+ idx = wtbl_info->rate_info.rate_idx;
+ if (idx >= ARRAY_SIZE(wtbl_info->rate_info.rate))
+ return;
+
rcu_read_lock();
wcid = rcu_dereference(dev->mt76.wcid[wlan_idx]);
stats = &msta->stats;
/* current rate */
- mt7921_mcu_tx_rate_parse(mphy, &peer, &rate, curr);
+ mt7921_mcu_tx_rate_parse(mphy, &wtbl_info->peer_cap, &rate,
+ le16_to_cpu(wtbl_info->rate_info.rate[idx]));
stats->tx_rate = rate;
out:
rcu_read_unlock();
-/**
+/*
* Marvell NFC driver: Firmware downloader
*
* Copyright (C) 2015, Marvell International Ltd.
-/**
+/*
* Marvell NFC-over-I2C driver: I2C interface related functions
*
* Copyright (C) 2015, Marvell International Ltd.
-/**
+/*
* Marvell NFC driver
*
* Copyright (C) 2014-2015, Marvell International Ltd.
-/**
+/*
* Marvell NFC-over-SPI driver: SPI interface related functions
*
* Copyright (C) 2015, Marvell International Ltd.
-/**
+/*
* Marvell NFC-over-UART driver
*
* Copyright (C) 2015, Marvell International Ltd.
-/**
+/*
* Marvell NFC-over-USB driver: USB interface related functions
*
* Copyright (C) 2014, Marvell International Ltd.
config NVME_TCP
tristate "NVM Express over Fabrics TCP host driver"
depends on INET
- depends on BLK_DEV_NVME
+ depends on BLOCK
+ select NVME_CORE
select NVME_FABRICS
select CRYPTO
select CRYPTO_CRC32C
cdev_init(cdev, fops);
cdev->owner = owner;
ret = cdev_device_add(cdev, cdev_device);
- if (ret)
+ if (ret) {
+ put_device(cdev_device);
ida_simple_remove(&nvme_ns_chr_minor_ida, minor);
+ }
return ret;
}
cmd->connect.recfmt);
break;
+ case NVME_SC_HOST_PATH_ERROR:
+ dev_err(ctrl->device,
+ "Connect command failed: host path error\n");
+ break;
+
default:
dev_err(ctrl->device,
"Connect command failed, error wo/DNR bit: %d\n",
if (ctrl->ctrl.icdoff) {
dev_err(ctrl->ctrl.device, "icdoff %d is not supported!\n",
ctrl->ctrl.icdoff);
+ ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
goto out_disconnect_admin_queue;
}
if (!(ctrl->ctrl.sgls & ((1 << 0) | (1 << 1)))) {
dev_err(ctrl->ctrl.device,
"Mandatory sgls are not supported!\n");
+ ret = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
goto out_disconnect_admin_queue;
}
if (ctrl->ctrl.state != NVME_CTRL_CONNECTING)
return;
- if (portptr->port_state == FC_OBJSTATE_ONLINE)
+ if (portptr->port_state == FC_OBJSTATE_ONLINE) {
dev_info(ctrl->ctrl.device,
"NVME-FC{%d}: reset: Reconnect attempt failed (%d)\n",
ctrl->cnum, status);
- else if (time_after_eq(jiffies, rport->dev_loss_end))
+ if (status > 0 && (status & NVME_SC_DNR))
+ recon = false;
+ } else if (time_after_eq(jiffies, rport->dev_loss_end))
recon = false;
if (recon && nvmf_should_reconnect(&ctrl->ctrl)) {
queue_delayed_work(nvme_wq, &ctrl->connect_work, recon_delay);
} else {
- if (portptr->port_state == FC_OBJSTATE_ONLINE)
- dev_warn(ctrl->ctrl.device,
- "NVME-FC{%d}: Max reconnect attempts (%d) "
- "reached.\n",
- ctrl->cnum, ctrl->ctrl.nr_reconnects);
- else
+ if (portptr->port_state == FC_OBJSTATE_ONLINE) {
+ if (status > 0 && (status & NVME_SC_DNR))
+ dev_warn(ctrl->ctrl.device,
+ "NVME-FC{%d}: reconnect failure\n",
+ ctrl->cnum);
+ else
+ dev_warn(ctrl->ctrl.device,
+ "NVME-FC{%d}: Max reconnect attempts "
+ "(%d) reached.\n",
+ ctrl->cnum, ctrl->ctrl.nr_reconnects);
+ } else
dev_warn(ctrl->ctrl.device,
"NVME-FC{%d}: dev_loss_tmo (%d) expired "
"while waiting for remoteport connectivity.\n",
int count)
{
struct nvme_sgl_desc *sg = &c->common.dptr.sgl;
- struct scatterlist *sgl = req->data_sgl.sg_table.sgl;
struct ib_sge *sge = &req->sge[1];
+ struct scatterlist *sgl;
u32 len = 0;
int i;
- for (i = 0; i < count; i++, sgl++, sge++) {
+ for_each_sg(req->data_sgl.sg_table.sgl, sgl, count, i) {
sge->addr = sg_dma_address(sgl);
sge->length = sg_dma_len(sgl);
sge->lkey = queue->device->pd->local_dma_lkey;
len += sge->length;
+ sge++;
}
sg->addr = cpu_to_le64(queue->ctrl->ctrl.icdoff);
{
struct nvmet_ctrl *ctrl = container_of(to_delayed_work(work),
struct nvmet_ctrl, ka_work);
- bool cmd_seen = ctrl->cmd_seen;
+ bool reset_tbkas = ctrl->reset_tbkas;
- ctrl->cmd_seen = false;
- if (cmd_seen) {
+ ctrl->reset_tbkas = false;
+ if (reset_tbkas) {
pr_debug("ctrl %d reschedule traffic based keep-alive timer\n",
ctrl->cntlid);
schedule_delayed_work(&ctrl->ka_work, ctrl->kato * HZ);
percpu_ref_exit(&sq->ref);
if (ctrl) {
+ /*
+ * The teardown flow may take some time, and the host may not
+ * send us keep-alive during this period, hence reset the
+ * traffic based keep-alive timer so we don't trigger a
+ * controller teardown as a result of a keep-alive expiration.
+ */
+ ctrl->reset_tbkas = true;
nvmet_ctrl_put(ctrl);
sq->ctrl = NULL; /* allows reusing the queue later */
}
}
if (sq->ctrl)
- sq->ctrl->cmd_seen = true;
+ sq->ctrl->reset_tbkas = true;
return true;
return req->transfer_len - req->metadata_len;
}
-static int nvmet_req_alloc_p2pmem_sgls(struct nvmet_req *req)
+static int nvmet_req_alloc_p2pmem_sgls(struct pci_dev *p2p_dev,
+ struct nvmet_req *req)
{
- req->sg = pci_p2pmem_alloc_sgl(req->p2p_dev, &req->sg_cnt,
+ req->sg = pci_p2pmem_alloc_sgl(p2p_dev, &req->sg_cnt,
nvmet_data_transfer_len(req));
if (!req->sg)
goto out_err;
if (req->metadata_len) {
- req->metadata_sg = pci_p2pmem_alloc_sgl(req->p2p_dev,
+ req->metadata_sg = pci_p2pmem_alloc_sgl(p2p_dev,
&req->metadata_sg_cnt, req->metadata_len);
if (!req->metadata_sg)
goto out_free_sg;
}
+
+ req->p2p_dev = p2p_dev;
+
return 0;
out_free_sg:
pci_p2pmem_free_sgl(req->p2p_dev, req->sg);
return -ENOMEM;
}
-static bool nvmet_req_find_p2p_dev(struct nvmet_req *req)
+static struct pci_dev *nvmet_req_find_p2p_dev(struct nvmet_req *req)
{
- if (!IS_ENABLED(CONFIG_PCI_P2PDMA))
- return false;
-
- if (req->sq->ctrl && req->sq->qid && req->ns) {
- req->p2p_dev = radix_tree_lookup(&req->sq->ctrl->p2p_ns_map,
- req->ns->nsid);
- if (req->p2p_dev)
- return true;
- }
-
- req->p2p_dev = NULL;
- return false;
+ if (!IS_ENABLED(CONFIG_PCI_P2PDMA) ||
+ !req->sq->ctrl || !req->sq->qid || !req->ns)
+ return NULL;
+ return radix_tree_lookup(&req->sq->ctrl->p2p_ns_map, req->ns->nsid);
}
int nvmet_req_alloc_sgls(struct nvmet_req *req)
{
- if (nvmet_req_find_p2p_dev(req) && !nvmet_req_alloc_p2pmem_sgls(req))
+ struct pci_dev *p2p_dev = nvmet_req_find_p2p_dev(req);
+
+ if (p2p_dev && !nvmet_req_alloc_p2pmem_sgls(p2p_dev, req))
return 0;
req->sg = sgl_alloc(nvmet_data_transfer_len(req), GFP_KERNEL,
pci_p2pmem_free_sgl(req->p2p_dev, req->sg);
if (req->metadata_sg)
pci_p2pmem_free_sgl(req->p2p_dev, req->metadata_sg);
+ req->p2p_dev = NULL;
} else {
sgl_free(req->sg);
if (req->metadata_sg)
static void nvme_loop_destroy_admin_queue(struct nvme_loop_ctrl *ctrl)
{
- clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
+ if (!test_and_clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags))
+ return;
nvmet_sq_destroy(&ctrl->queues[0].nvme_sq);
blk_cleanup_queue(ctrl->ctrl.admin_q);
blk_cleanup_queue(ctrl->ctrl.fabrics_q);
clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[i].flags);
nvmet_sq_destroy(&ctrl->queues[i].nvme_sq);
}
+ ctrl->ctrl.queue_count = 1;
}
static int nvme_loop_init_io_queues(struct nvme_loop_ctrl *ctrl)
return 0;
out_cleanup_queue:
+ clear_bit(NVME_LOOP_Q_LIVE, &ctrl->queues[0].flags);
blk_cleanup_queue(ctrl->ctrl.admin_q);
out_cleanup_fabrics_q:
blk_cleanup_queue(ctrl->ctrl.fabrics_q);
nvme_loop_shutdown_ctrl(ctrl);
if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) {
- /* state change failure should never happen */
- WARN_ON_ONCE(1);
+ if (ctrl->ctrl.state != NVME_CTRL_DELETING &&
+ ctrl->ctrl.state != NVME_CTRL_DELETING_NOIO)
+ /* state change failure for non-deleted ctrl? */
+ WARN_ON_ONCE(1);
return;
}
struct nvmet_subsys *subsys;
struct nvmet_sq **sqs;
- bool cmd_seen;
+ bool reset_tbkas;
struct mutex lock;
u64 cap;
* nvmet_req_init is completed.
*/
if (queue->rcv_state == NVMET_TCP_RECV_PDU &&
- len && len < cmd->req.port->inline_data_size &&
+ len && len <= cmd->req.port->inline_data_size &&
nvme_is_write(cmd->req.cmd))
return;
}
#endif
}
+bool pci_host_of_has_msi_map(struct device *dev)
+{
+ if (dev && dev->of_node)
+ return of_get_property(dev->of_node, "msi-map", NULL);
+ return false;
+}
+
static inline int __of_pci_pci_compare(struct device_node *node,
unsigned int data)
{
device_enable_async_suspend(bus->bridge);
pci_set_bus_of_node(bus);
pci_set_bus_msi_domain(bus);
- if (bridge->msi_domain && !dev_get_msi_domain(&bus->dev))
+ if (bridge->msi_domain && !dev_get_msi_domain(&bus->dev) &&
+ !pci_host_of_has_msi_map(parent))
bus->bus_flags |= PCI_BUS_FLAGS_NO_MSI;
if (!parent)
if (!bp->base) {
dev_err(&pdev->dev, "io_remap bar0\n");
err = -ENOMEM;
- goto out;
+ goto out_release_regions;
}
bp->reg = bp->base + OCP_REGISTER_OFFSET;
bp->tod = bp->base + TOD_REGISTER_OFFSET;
return 0;
out:
+ pci_iounmap(pdev, bp->base);
+out_release_regions:
pci_release_regions(pdev);
out_disable:
pci_disable_device(pdev);
blk_queue_segment_boundary(q, PAGE_SIZE - 1);
}
+static int dasd_diag_pe_handler(struct dasd_device *device,
+ __u8 tbvpm, __u8 fcsecpm)
+{
+ return dasd_generic_verify_path(device, tbvpm);
+}
+
static struct dasd_discipline dasd_diag_discipline = {
.owner = THIS_MODULE,
.name = "DIAG",
.ebcname = "DIAG",
.check_device = dasd_diag_check_device,
- .verify_path = dasd_generic_verify_path,
+ .pe_handler = dasd_diag_pe_handler,
.fill_geometry = dasd_diag_fill_geometry,
.setup_blk_queue = dasd_diag_setup_blk_queue,
.start_IO = dasd_start_diag,
blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
}
+static int dasd_fba_pe_handler(struct dasd_device *device,
+ __u8 tbvpm, __u8 fcsecpm)
+{
+ return dasd_generic_verify_path(device, tbvpm);
+}
+
static struct dasd_discipline dasd_fba_discipline = {
.owner = THIS_MODULE,
.name = "FBA ",
.ebcname = "FBA ",
.check_device = dasd_fba_check_characteristics,
.do_analysis = dasd_fba_do_analysis,
- .verify_path = dasd_generic_verify_path,
+ .pe_handler = dasd_fba_pe_handler,
.setup_blk_queue = dasd_fba_setup_blk_queue,
.fill_geometry = dasd_fba_fill_geometry,
.start_IO = dasd_start_IO,
* e.g. verify that new path is compatible with the current
* configuration.
*/
- int (*verify_path)(struct dasd_device *, __u8);
int (*pe_handler)(struct dasd_device *, __u8, __u8);
/*
static DEFINE_RATELIMIT_STATE(ratelimit_state, 5 * HZ, 1);
int ret;
+ /* this is an error in the caller */
+ if (cp->initialized)
+ return -EBUSY;
+
/*
* We only support prefetching the channel program. We assume all channel
* programs executed by supported guests likewise support prefetching.
struct vfio_ccw_private *private;
struct irb *irb;
bool is_final;
+ bool cp_is_finished = false;
private = container_of(work, struct vfio_ccw_private, io_work);
irb = &private->irb;
(SCSW_ACTL_DEVACT | SCSW_ACTL_SCHACT));
if (scsw_is_solicited(&irb->scsw)) {
cp_update_scsw(&private->cp, &irb->scsw);
- if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING)
+ if (is_final && private->state == VFIO_CCW_STATE_CP_PENDING) {
cp_free(&private->cp);
+ cp_is_finished = true;
+ }
}
mutex_lock(&private->io_mutex);
memcpy(private->io_region->irb_area, irb, sizeof(*irb));
mutex_unlock(&private->io_mutex);
- if (private->mdev && is_final)
+ /*
+ * Reset to IDLE only if processing of a channel program
+ * has finished. Do not overwrite a possible processing
+ * state if the final interrupt was for HSCH or CSCH.
+ */
+ if (private->mdev && cp_is_finished)
private->state = VFIO_CCW_STATE_IDLE;
if (private->io_trigger)
}
err_out:
+ private->state = VFIO_CCW_STATE_IDLE;
trace_vfio_ccw_fsm_io_request(scsw->cmd.fctl, schid,
io_region->ret_code, errstr);
}
}
vfio_ccw_fsm_event(private, VFIO_CCW_EVENT_IO_REQ);
- if (region->ret_code != 0)
- private->state = VFIO_CCW_STATE_IDLE;
ret = (region->ret_code != 0) ? region->ret_code : count;
out_unlock:
#include "aicasm_symbol.h"
#include "aicasm_insformat.h"
-int yylineno;
char *yyfilename;
char stock_prefix[] = "aic_";
char *prefix = stock_prefix;
regex_t arg_regex;
char *replacement_text;
};
-STAILQ_HEAD(macro_arg_list, macro_arg) args;
+STAILQ_HEAD(macro_arg_list, macro_arg);
struct macro_info {
struct macro_arg_list args;
* $FreeBSD: src/sys/cam/scsi/scsi_message.h,v 1.2 2000/05/01 20:21:29 peter Exp $
*/
+/* Messages (1 byte) */ /* I/T (M)andatory or (O)ptional */
+#define MSG_SAVEDATAPOINTER 0x02 /* O/O */
+#define MSG_RESTOREPOINTERS 0x03 /* O/O */
+#define MSG_DISCONNECT 0x04 /* O/O */
+#define MSG_MESSAGE_REJECT 0x07 /* M/M */
+#define MSG_NOOP 0x08 /* M/M */
+
+/* Messages (2 byte) */
+#define MSG_SIMPLE_Q_TAG 0x20 /* O/O */
+#define MSG_IGN_WIDE_RESIDUE 0x23 /* O/O */
+
/* Identify message */ /* M/M */
#define MSG_IDENTIFYFLAG 0x80
#define MSG_IDENTIFY_DISCFLAG 0x40
was a result from the ABTS request rather than the CLEANUP
request */
set_bit(BNX2FC_FLAG_IO_CLEANUP, &io_req->req_flags);
+ rc = FAILED;
goto done;
}
{
int i;
- free_irq(pci_irq_vector(pdev, 1), hisi_hba);
- free_irq(pci_irq_vector(pdev, 2), hisi_hba);
- free_irq(pci_irq_vector(pdev, 11), hisi_hba);
+ devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 1), hisi_hba);
+ devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 2), hisi_hba);
+ devm_free_irq(&pdev->dev, pci_irq_vector(pdev, 11), hisi_hba);
for (i = 0; i < hisi_hba->cq_nvecs; i++) {
struct hisi_sas_cq *cq = &hisi_hba->cq[i];
int nr = hisi_sas_intr_conv ? 16 : 16 + i;
- free_irq(pci_irq_vector(pdev, nr), cq);
+ devm_free_irq(&pdev->dev, pci_irq_vector(pdev, nr), cq);
}
pci_free_irq_vectors(pdev);
}
static void sas_resume_port(struct asd_sas_phy *phy)
{
- struct domain_device *dev;
+ struct domain_device *dev, *n;
struct asd_sas_port *port = phy->port;
struct sas_ha_struct *sas_ha = phy->ha;
struct sas_internal *si = to_sas_internal(sas_ha->core.shost->transportt);
* 1/ presume every device came back
* 2/ force the next revalidation to check all expander phys
*/
- list_for_each_entry(dev, &port->dev_list, dev_list_node) {
+ list_for_each_entry_safe(dev, n, &port->dev_list, dev_list_node) {
int i, rc;
rc = sas_notify_lldd_dev_found(dev);
return;
}
+ mutex_lock(&tgt->ha->optrom_mutex);
mutex_lock(&vha->vha_tgt.tgt_mutex);
tgt->tgt_stop = 0;
tgt->tgt_stopped = 1;
mutex_unlock(&vha->vha_tgt.tgt_mutex);
+ mutex_unlock(&tgt->ha->optrom_mutex);
ql_dbg(ql_dbg_tgt_mgt, vha, 0xf00c, "Stop of tgt %p finished\n",
tgt);
case BTSTAT_SUCCESS:
case BTSTAT_LINKED_COMMAND_COMPLETED:
case BTSTAT_LINKED_COMMAND_COMPLETED_WITH_FLAG:
- /* If everything went fine, let's move on.. */
+ /*
+ * Commands like INQUIRY may transfer less data than
+ * requested by the initiator via bufflen. Set residual
+ * count to make upper layer aware of the actual amount
+ * of data returned.
+ */
+ scsi_set_resid(cmd, scsi_bufflen(cmd) - e->dataLen);
cmd->result = (DID_OK << 16);
break;
ret = of_property_read_u8_array(np, "qcom,ports-block-pack-mode",
bp_mode, nports);
- if (ret)
- return ret;
+ if (ret) {
+ u32 version;
+
+ ctrl->reg_read(ctrl, SWRM_COMP_HW_VERSION, &version);
+
+ if (version <= 0x01030000)
+ memset(bp_mode, SWR_INVALID_PARAM, QCOM_SDW_MAX_PORTS);
+ else
+ return ret;
+ }
memset(hstart, SWR_INVALID_PARAM, QCOM_SDW_MAX_PORTS);
of_property_read_u8_array(np, "qcom,ports-hstart", hstart, nports);
This is the driver for the Altera SPI Controller.
config SPI_ALTERA_CORE
- tristate "Altera SPI Controller core code"
+ tristate "Altera SPI Controller core code" if COMPILE_TEST
select REGMAP
help
"The core code for the Altera SPI Controller"
ret = spi_register_controller(ctlr);
if (ret != 0) {
dev_err(&pdev->dev, "Problem registering DSPI ctlr\n");
- goto out_free_irq;
+ goto out_release_dma;
}
return ret;
+out_release_dma:
+ dspi_release_dma(dspi);
out_free_irq:
if (dspi->irq)
free_irq(dspi->irq, dspi);
static int sc18is602_check_transfer(struct spi_device *spi,
struct spi_transfer *t, int tlen)
{
- if (t && t->len + tlen > SC18IS602_BUFSIZ)
+ if (t && t->len + tlen > SC18IS602_BUFSIZ + 1)
return -EINVAL;
return 0;
return status;
}
+static size_t sc18is602_max_transfer_size(struct spi_device *spi)
+{
+ return SC18IS602_BUFSIZ;
+}
+
static int sc18is602_setup(struct spi_device *spi)
{
struct sc18is602 *hw = spi_master_get_devdata(spi->master);
master->bits_per_word_mask = SPI_BPW_MASK(8);
master->setup = sc18is602_setup;
master->transfer_one_message = sc18is602_transfer_one;
+ master->max_transfer_size = sc18is602_max_transfer_size;
+ master->max_message_size = sc18is602_max_transfer_size;
master->dev.of_node = np;
master->min_speed_hz = hw->freq / 128;
master->max_speed_hz = hw->freq / 4;
{ .compatible = "sprd,sc9860-spi", },
{ /* sentinel */ }
};
+MODULE_DEVICE_TABLE(of, sprd_spi_of_match);
static struct platform_driver sprd_spi_driver = {
.driver = {
}
/**
- * zynq_qspi_setup - Configure the QSPI controller
+ * zynq_qspi_setup_op - Configure the QSPI controller
* @spi: Pointer to the spi_device structure
*
* Sets the operational mode of QSPI controller for the next QSPI transfer, baud
struct zynq_qspi *xqspi = spi_controller_get_devdata(mem->spi->master);
int err = 0, i;
u8 *tmpbuf;
- u8 opcode = op->cmd.opcode;
dev_dbg(xqspi->dev, "cmd:%#x mode:%d.%d.%d.%d\n",
- opcode, op->cmd.buswidth, op->addr.buswidth,
+ op->cmd.opcode, op->cmd.buswidth, op->addr.buswidth,
op->dummy.buswidth, op->data.buswidth);
zynq_qspi_chipselect(mem->spi, true);
zynq_qspi_config_op(xqspi, mem->spi);
- if (op->cmd.nbytes) {
+ if (op->cmd.opcode) {
reinit_completion(&xqspi->data_completion);
- xqspi->txbuf = &opcode;
+ xqspi->txbuf = (u8 *)&op->cmd.opcode;
xqspi->rxbuf = NULL;
xqspi->tx_bytes = op->cmd.nbytes;
xqspi->rx_bytes = op->cmd.nbytes;
{
struct spi_device *spi = to_spi_device(dev);
- /* spi controllers may cleanup for released devices */
- if (spi->controller->cleanup)
- spi->controller->cleanup(spi);
-
spi_controller_put(spi->controller);
kfree(spi->driver_override);
kfree(spi);
return 0;
}
+static void spi_cleanup(struct spi_device *spi)
+{
+ if (spi->controller->cleanup)
+ spi->controller->cleanup(spi);
+}
+
/**
* spi_add_device - Add spi_device allocated with spi_alloc_device
* @spi: spi_device to register
/* Device may be bound to an active driver when this returns */
status = device_add(&spi->dev);
- if (status < 0)
+ if (status < 0) {
dev_err(dev, "can't add %s, status %d\n",
dev_name(&spi->dev), status);
- else
+ spi_cleanup(spi);
+ } else {
dev_dbg(dev, "registered child %s\n", dev_name(&spi->dev));
+ }
done:
mutex_unlock(&spi_add_lock);
if (ACPI_COMPANION(&spi->dev))
acpi_device_clear_enumerated(ACPI_COMPANION(&spi->dev));
device_remove_software_node(&spi->dev);
- device_unregister(&spi->dev);
+ device_del(&spi->dev);
+ spi_cleanup(spi);
+ put_device(&spi->dev);
}
EXPORT_SYMBOL_GPL(spi_unregister_device);
if (spi->cs_gpiod || gpio_is_valid(spi->cs_gpio)) {
if (!(spi->mode & SPI_NO_CS)) {
- if (spi->cs_gpiod)
- /* polarity handled by gpiolib */
- gpiod_set_value_cansleep(spi->cs_gpiod, activate);
- else
+ if (spi->cs_gpiod) {
+ /*
+ * Historically ACPI has no means of the GPIO polarity and
+ * thus the SPISerialBus() resource defines it on the per-chip
+ * basis. In order to avoid a chain of negations, the GPIO
+ * polarity is considered being Active High. Even for the cases
+ * when _DSD() is involved (in the updated versions of ACPI)
+ * the GPIO CS polarity must be defined Active High to avoid
+ * ambiguity. That's why we use enable, that takes SPI_CS_HIGH
+ * into account.
+ */
+ if (has_acpi_companion(&spi->dev))
+ gpiod_set_value_cansleep(spi->cs_gpiod, !enable);
+ else
+ /* Polarity handled by GPIO library */
+ gpiod_set_value_cansleep(spi->cs_gpiod, activate);
+ } else {
/*
* invert the enable line, as active low is
* default for SPI.
*/
gpio_set_value_cansleep(spi->cs_gpio, !enable);
+ }
}
/* Some SPI masters need both GPIO CS & slave_select */
if ((spi->controller->flags & SPI_MASTER_GPIO_SS) &&
if (spi->controller->set_cs_timing &&
!(spi->cs_gpiod || gpio_is_valid(spi->cs_gpio))) {
+ mutex_lock(&spi->controller->io_mutex);
+
if (spi->controller->auto_runtime_pm) {
status = pm_runtime_get_sync(parent);
if (status < 0) {
+ mutex_unlock(&spi->controller->io_mutex);
pm_runtime_put_noidle(parent);
dev_err(&spi->controller->dev, "Failed to power device: %d\n",
status);
hold, inactive);
pm_runtime_mark_last_busy(parent);
pm_runtime_put_autosuspend(parent);
- return status;
} else {
- return spi->controller->set_cs_timing(spi, setup, hold,
+ status = spi->controller->set_cs_timing(spi, setup, hold,
inactive);
}
+
+ mutex_unlock(&spi->controller->io_mutex);
+ return status;
}
if ((setup && setup->unit == SPI_DELAY_UNIT_SCK) ||
struct nbu2ss_ep *ep,
int status)
{
- struct nbu2ss_req *req;
+ struct nbu2ss_req *req, *n;
/* Endpoint Disable */
_nbu2ss_epn_exit(udc, ep);
return 0;
/* called with irqs blocked */
- list_for_each_entry(req, &ep->queue, queue) {
+ list_for_each_entry_safe(req, n, &ep->queue, queue) {
_nbu2ss_ep_done(ep, req, status);
}
indio_dev->num_channels = ARRAY_SIZE(ad7746_channels);
else
indio_dev->num_channels = ARRAY_SIZE(ad7746_channels) - 2;
- indio_dev->num_channels = ARRAY_SIZE(ad7746_channels);
indio_dev->modes = INDIO_DIRECT_MODE;
if (pdata) {
struct iblock_dev_plug *ib_dev_plug;
/*
- * Each se_device has a per cpu work this can be run from. Wwe
+ * Each se_device has a per cpu work this can be run from. We
* shouldn't have multiple threads on the same cpu calling this
* at the same time.
*/
- ib_dev_plug = &ib_dev->ibd_plug[smp_processor_id()];
+ ib_dev_plug = &ib_dev->ibd_plug[raw_smp_processor_id()];
if (test_and_set_bit(IBD_PLUGF_PLUGGED, &ib_dev_plug->flags))
return NULL;
cmd->orig_fe_lun = unpacked_lun;
if (!(cmd->se_cmd_flags & SCF_USE_CPUID))
- cmd->cpuid = smp_processor_id();
+ cmd->cpuid = raw_smp_processor_id();
cmd->state_active = false;
}
dpi = dbi * udev->data_pages_per_blk;
/* Count the number of already allocated pages */
xas_set(&xas, dpi);
+ rcu_read_lock();
for (cnt = 0; xas_next(&xas) && cnt < page_cnt;)
cnt++;
+ rcu_read_unlock();
for (i = cnt; i < page_cnt; i++) {
/* try to get new page from the mm */
struct scatterlist *sg, unsigned int sg_nents,
struct iovec **iov, size_t data_len)
{
- XA_STATE(xas, &udev->data_pages, 0);
/* start value of dbi + 1 must not be a valid dbi */
int dbi = -2;
size_t page_remaining, cp_len;
- int page_cnt, page_inx;
+ int page_cnt, page_inx, dpi;
struct sg_mapping_iter sg_iter;
unsigned int sg_flags;
struct page *page;
if (page_cnt > udev->data_pages_per_blk)
page_cnt = udev->data_pages_per_blk;
- xas_set(&xas, dbi * udev->data_pages_per_blk);
- for (page_inx = 0; page_inx < page_cnt && data_len; page_inx++) {
- page = xas_next(&xas);
+ dpi = dbi * udev->data_pages_per_blk;
+ for (page_inx = 0; page_inx < page_cnt && data_len;
+ page_inx++, dpi++) {
+ page = xa_load(&udev->data_pages, dpi);
if (direction == TCMU_DATA_AREA_TO_SG)
flush_dcache_page(page);
if (ACPI_FAILURE(status))
trip_cnt = 0;
else {
+ int i;
+
int34x_thermal_zone->aux_trips =
kcalloc(trip_cnt,
sizeof(*int34x_thermal_zone->aux_trips),
}
trip_mask = BIT(trip_cnt) - 1;
int34x_thermal_zone->aux_trip_nr = trip_cnt;
+ for (i = 0; i < trip_cnt; ++i)
+ int34x_thermal_zone->aux_trips[i] = THERMAL_TEMP_INVALID;
}
trip_cnt = int340x_thermal_read_trips(int34x_thermal_zone);
return atomic_read(&therm_throt_en);
}
+void __init therm_lvt_init(void)
+{
+ /*
+ * This function is only called on boot CPU. Save the init thermal
+ * LVT value on BSP and use that value to restore APs' thermal LVT
+ * entry BIOS programmed later
+ */
+ if (intel_thermal_supported(&boot_cpu_data))
+ lvtthmr_init = apic_read(APIC_LVTTHMR);
+}
+
void intel_init_thermal(struct cpuinfo_x86 *c)
{
unsigned int cpu = smp_processor_id();
if (!intel_thermal_supported(c))
return;
- /* On the BSP? */
- if (c == &boot_cpu_data)
- lvtthmr_init = apic_read(APIC_LVTTHMR);
-
/*
* First check if its enabled already, in which case there might
* be some SMM goo which handles it, so we can't even put a handler
if (thres_reg_value)
*temp = zonedev->tj_max - thres_reg_value * 1000;
else
- *temp = 0;
+ *temp = THERMAL_TEMP_INVALID;
pr_debug("sys_get_trip_temp %d\n", *temp);
return 0;
if (args.args_count != 1 || args.args[0] >= ADC5_MAX_CHANNEL) {
dev_err(dev, "%s: invalid ADC channel number %d\n", name, chan);
- return ret;
+ return -EINVAL;
}
channel->adc_channel = args.args[0];
}
/**
- * ti_bandgap_alert_init() - setup and initialize talert handling
+ * ti_bandgap_talert_init() - setup and initialize talert handling
* @bgp: pointer to struct ti_bandgap
* @pdev: pointer to device struct platform_device
*
void *buf, size_t size)
{
unsigned int retries = DMA_PORT_RETRIES;
- unsigned int offset;
-
- offset = address & 3;
- address = address & ~3;
do {
- u32 nbytes = min_t(u32, size, MAIL_DATA_DWORDS * 4);
+ unsigned int offset;
+ size_t nbytes;
int ret;
+ offset = address & 3;
+ nbytes = min_t(size_t, size + offset, MAIL_DATA_DWORDS * 4);
+
ret = dma_port_flash_read_block(dma, address, dma->buf,
ALIGN(nbytes, 4));
if (ret) {
return ret;
}
+ nbytes -= offset;
memcpy(buf, dma->buf + offset, nbytes);
size -= nbytes;
unsigned int retries = USB4_DATA_RETRIES;
unsigned int offset;
- offset = address & 3;
- address = address & ~3;
-
do {
- size_t nbytes = min_t(size_t, size, USB4_DATA_DWORDS * 4);
unsigned int dwaddress, dwords;
u8 data[USB4_DATA_DWORDS * 4];
+ size_t nbytes;
int ret;
+ offset = address & 3;
+ nbytes = min_t(size_t, size + offset, USB4_DATA_DWORDS * 4);
+
dwaddress = address / 4;
dwords = ALIGN(nbytes, 4) / 4;
return ret;
}
+ nbytes -= offset;
memcpy(buf, data + offset, nbytes);
size -= nbytes;
* Copyright (C) 2001 Russell King.
*/
+#include <linux/bits.h>
#include <linux/serial_8250.h>
#include <linux/serial_reg.h>
#include <linux/dmaengine.h>
unsigned int flags;
};
-#define UART_CAP_FIFO (1 << 8) /* UART has FIFO */
-#define UART_CAP_EFR (1 << 9) /* UART has EFR */
-#define UART_CAP_SLEEP (1 << 10) /* UART has IER sleep */
-#define UART_CAP_AFE (1 << 11) /* MCR-based hw flow control */
-#define UART_CAP_UUE (1 << 12) /* UART needs IER bit 6 set (Xscale) */
-#define UART_CAP_RTOIE (1 << 13) /* UART needs IER bit 4 set (Xscale, Tegra) */
-#define UART_CAP_HFIFO (1 << 14) /* UART has a "hidden" FIFO */
-#define UART_CAP_RPM (1 << 15) /* Runtime PM is active while idle */
-#define UART_CAP_IRDA (1 << 16) /* UART supports IrDA line discipline */
-#define UART_CAP_MINI (1 << 17) /* Mini UART on BCM283X family lacks:
+#define UART_CAP_FIFO BIT(8) /* UART has FIFO */
+#define UART_CAP_EFR BIT(9) /* UART has EFR */
+#define UART_CAP_SLEEP BIT(10) /* UART has IER sleep */
+#define UART_CAP_AFE BIT(11) /* MCR-based hw flow control */
+#define UART_CAP_UUE BIT(12) /* UART needs IER bit 6 set (Xscale) */
+#define UART_CAP_RTOIE BIT(13) /* UART needs IER bit 4 set (Xscale, Tegra) */
+#define UART_CAP_HFIFO BIT(14) /* UART has a "hidden" FIFO */
+#define UART_CAP_RPM BIT(15) /* Runtime PM is active while idle */
+#define UART_CAP_IRDA BIT(16) /* UART supports IrDA line discipline */
+#define UART_CAP_MINI BIT(17) /* Mini UART on BCM283X family lacks:
* STOP PARITY EPAR SPAR WLEN5 WLEN6
*/
-#define UART_BUG_QUOT (1 << 0) /* UART has buggy quot LSB */
-#define UART_BUG_TXEN (1 << 1) /* UART has buggy TX IIR status */
-#define UART_BUG_NOMSR (1 << 2) /* UART has buggy MSR status bits (Au1x00) */
-#define UART_BUG_THRE (1 << 3) /* UART has buggy THRE reassertion */
-#define UART_BUG_PARITY (1 << 4) /* UART mishandles parity if FIFO enabled */
+#define UART_BUG_QUOT BIT(0) /* UART has buggy quot LSB */
+#define UART_BUG_TXEN BIT(1) /* UART has buggy TX IIR status */
+#define UART_BUG_NOMSR BIT(2) /* UART has buggy MSR status bits (Au1x00) */
+#define UART_BUG_THRE BIT(3) /* UART has buggy THRE reassertion */
+#define UART_BUG_PARITY BIT(4) /* UART mishandles parity if FIFO enabled */
+#define UART_BUG_TXRACE BIT(5) /* UART Tx fails to set remote DR */
#ifdef CONFIG_SERIAL_8250_SHARE_IRQ
port.port.status = UPSTAT_SYNC_FIFO;
port.port.dev = &pdev->dev;
port.port.has_sysrq = IS_ENABLED(CONFIG_SERIAL_8250_CONSOLE);
+ port.bugs |= UART_BUG_TXRACE;
rc = sysfs_create_group(&vuart->dev->kobj, &aspeed_vuart_attr_group);
if (rc < 0)
{ "APMC0D08", 0},
{ "AMD0020", 0 },
{ "AMDI0020", 0 },
+ { "AMDI0022", 0 },
{ "BRCM2032", 0 },
{ "HISI0031", 0 },
{ },
int line[];
};
+#define PCI_DEVICE_ID_HPE_PCI_SERIAL 0x37e
+
static const struct pci_device_id pci_use_msi[] = {
{ PCI_DEVICE_SUB(PCI_VENDOR_ID_NETMOS, PCI_DEVICE_ID_NETMOS_9900,
0xA000, 0x1000) },
0xA000, 0x1000) },
{ PCI_DEVICE_SUB(PCI_VENDOR_ID_NETMOS, PCI_DEVICE_ID_NETMOS_9922,
0xA000, 0x1000) },
+ { PCI_DEVICE_SUB(PCI_VENDOR_ID_HP_3PAR, PCI_DEVICE_ID_HPE_PCI_SERIAL,
+ PCI_ANY_ID, PCI_ANY_ID) },
{ }
};
.setup = pci_hp_diva_setup,
},
/*
+ * HPE PCI serial device
+ */
+ {
+ .vendor = PCI_VENDOR_ID_HP_3PAR,
+ .device = PCI_DEVICE_ID_HPE_PCI_SERIAL,
+ .subvendor = PCI_ANY_ID,
+ .subdevice = PCI_ANY_ID,
+ .setup = pci_hp_diva_setup,
+ },
+ /*
* Intel
*/
{
uart.port.flags = UPF_SKIP_TEST | UPF_BOOT_AUTOCONF | UPF_SHARE_IRQ;
uart.port.uartclk = board->base_baud * 16;
- if (pci_match_id(pci_use_msi, dev)) {
- dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
- pci_set_master(dev);
- rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
+ if (board->flags & FL_NOIRQ) {
+ uart.port.irq = 0;
} else {
- dev_dbg(&dev->dev, "Using legacy interrupts\n");
- rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY);
- }
- if (rc < 0) {
- kfree(priv);
- priv = ERR_PTR(rc);
- goto err_deinit;
+ if (pci_match_id(pci_use_msi, dev)) {
+ dev_dbg(&dev->dev, "Using MSI(-X) interrupts\n");
+ pci_set_master(dev);
+ rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_ALL_TYPES);
+ } else {
+ dev_dbg(&dev->dev, "Using legacy interrupts\n");
+ rc = pci_alloc_irq_vectors(dev, 1, 1, PCI_IRQ_LEGACY);
+ }
+ if (rc < 0) {
+ kfree(priv);
+ priv = ERR_PTR(rc);
+ goto err_deinit;
+ }
+
+ uart.port.irq = pci_irq_vector(dev, 0);
}
- uart.port.irq = pci_irq_vector(dev, 0);
uart.port.dev = &dev->dev;
for (i = 0; i < nr_ports; i++) {
{ PCI_VENDOR_ID_HP, PCI_DEVICE_ID_HP_DIVA_AUX,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
pbn_b2_1_115200 },
+ /* HPE PCI serial device */
+ { PCI_VENDOR_ID_HP_3PAR, PCI_DEVICE_ID_HPE_PCI_SERIAL,
+ PCI_ANY_ID, PCI_ANY_ID, 0, 0,
+ pbn_b1_1_115200 },
{ PCI_VENDOR_ID_DCI, PCI_DEVICE_ID_DCI_PCCOM2,
PCI_ANY_ID, PCI_ANY_ID, 0, 0,
count = up->tx_loadsz;
do {
serial_out(up, UART_TX, xmit->buf[xmit->tail]);
+ if (up->bugs & UART_BUG_TXRACE) {
+ /*
+ * The Aspeed BMC virtual UARTs have a bug where data
+ * may get stuck in the BMC's Tx FIFO from bursts of
+ * writes on the APB interface.
+ *
+ * Delay back-to-back writes by a read cycle to avoid
+ * stalling the VUART. Read a register that won't have
+ * side-effects and discard the result.
+ */
+ serial_in(up, UART_SCR);
+ }
xmit->tail = (xmit->tail + 1) & (UART_XMIT_SIZE - 1);
port->icount.tx++;
if (uart_circ_empty(xmit))
void __iomem *bar0;
void __iomem *bar1;
spinlock_t card_lock;
- struct completion fw_loaded;
};
#define RP_ID(prod) PCI_VDEVICE(RP, (prod))
card->initialized_ports = 0;
}
-static void rp2_fw_cb(const struct firmware *fw, void *context)
+static int rp2_load_firmware(struct rp2_card *card, const struct firmware *fw)
{
- struct rp2_card *card = context;
resource_size_t phys_base;
- int i, rc = -ENOENT;
-
- if (!fw) {
- dev_err(&card->pdev->dev, "cannot find '%s' firmware image\n",
- RP2_FW_NAME);
- goto no_fw;
- }
+ int i, rc = 0;
phys_base = pci_resource_start(card->pdev, 1);
card->initialized_ports++;
}
- release_firmware(fw);
-no_fw:
- /*
- * rp2_fw_cb() is called from a workqueue long after rp2_probe()
- * has already returned success. So if something failed here,
- * we'll just leave the now-dormant device in place until somebody
- * unbinds it.
- */
- if (rc)
- dev_warn(&card->pdev->dev, "driver initialization failed\n");
-
- complete(&card->fw_loaded);
+ return rc;
}
static int rp2_probe(struct pci_dev *pdev,
const struct pci_device_id *id)
{
+ const struct firmware *fw;
struct rp2_card *card;
struct rp2_uart_port *ports;
void __iomem * const *bars;
return -ENOMEM;
pci_set_drvdata(pdev, card);
spin_lock_init(&card->card_lock);
- init_completion(&card->fw_loaded);
rc = pcim_enable_device(pdev);
if (rc)
return -ENOMEM;
card->ports = ports;
- rc = devm_request_irq(&pdev->dev, pdev->irq, rp2_uart_interrupt,
- IRQF_SHARED, DRV_NAME, card);
- if (rc)
+ rc = request_firmware(&fw, RP2_FW_NAME, &pdev->dev);
+ if (rc < 0) {
+ dev_err(&pdev->dev, "cannot find '%s' firmware image\n",
+ RP2_FW_NAME);
return rc;
+ }
- /*
- * Only catastrophic errors (e.g. ENOMEM) are reported here.
- * If the FW image is missing, we'll find out in rp2_fw_cb()
- * and print an error message.
- */
- rc = request_firmware_nowait(THIS_MODULE, 1, RP2_FW_NAME, &pdev->dev,
- GFP_KERNEL, card, rp2_fw_cb);
+ rc = rp2_load_firmware(card, fw);
+
+ release_firmware(fw);
+ if (rc < 0)
+ return rc;
+
+ rc = devm_request_irq(&pdev->dev, pdev->irq, rp2_uart_interrupt,
+ IRQF_SHARED, DRV_NAME, card);
if (rc)
return rc;
- dev_dbg(&pdev->dev, "waiting for firmware blob...\n");
return 0;
}
{
struct rp2_card *card = pci_get_drvdata(pdev);
- wait_for_completion(&card->fw_loaded);
rp2_remove_ports(card);
}
do {
lsr = tegra_uart_read(tup, UART_LSR);
- if ((lsr | UART_LSR_TEMT) && !(lsr & UART_LSR_DR))
+ if ((lsr & UART_LSR_TEMT) && !(lsr & UART_LSR_DR))
break;
udelay(1);
} while (--tmout);
goto check_and_exit;
}
- retval = security_locked_down(LOCKDOWN_TIOCSSERIAL);
- if (retval && (change_irq || change_port))
- goto exit;
+ if (change_irq || change_port) {
+ retval = security_locked_down(LOCKDOWN_TIOCSSERIAL);
+ if (retval)
+ goto exit;
+ }
/*
* Ask the low level driver to verify the settings.
{
unsigned int bits;
+ if (rx_trig >= port->fifosize)
+ rx_trig = port->fifosize - 1;
if (rx_trig < 1)
rx_trig = 1;
- if (rx_trig >= port->fifosize)
- rx_trig = port->fifosize;
/* HSCIF can be set to an arbitrary level. */
if (sci_getreg(port, HSRTRGR)->size) {
pm_runtime_get_sync(cdns->dev);
ret = cdns3_gadget_start(cdns);
- if (ret)
+ if (ret) {
+ pm_runtime_put_sync(cdns->dev);
return ret;
+ }
/*
* Because interrupt line can be shared with other components in
int cdnsp_ep_dequeue(struct cdnsp_ep *pep, struct cdnsp_request *preq)
{
struct cdnsp_device *pdev = pep->pdev;
- int ret;
+ int ret_stop = 0;
+ int ret_rem;
trace_cdnsp_request_dequeue(preq);
- if (GET_EP_CTX_STATE(pep->out_ctx) == EP_STATE_RUNNING) {
- ret = cdnsp_cmd_stop_ep(pdev, pep);
- if (ret)
- return ret;
- }
+ if (GET_EP_CTX_STATE(pep->out_ctx) == EP_STATE_RUNNING)
+ ret_stop = cdnsp_cmd_stop_ep(pdev, pep);
+
+ ret_rem = cdnsp_remove_request(pdev, preq, pep);
- return cdnsp_remove_request(pdev, preq, pep);
+ return ret_rem ? ret_rem : ret_stop;
}
static void cdnsp_zero_in_ctx(struct cdnsp_device *pdev)
ci->gadget.name = ci->platdata->name;
ci->gadget.otg_caps = otg_caps;
ci->gadget.sg_supported = 1;
+ ci->gadget.irq = ci->irq;
if (ci->platdata->flags & CI_HDRC_REQUIRES_ALIGNED_DMA)
ci->gadget.quirk_avoids_skb_reserve = 1;
ret = usbfs_increase_memory_usage(len1 + sizeof(struct urb));
if (ret)
return ret;
- tbuf = kmalloc(len1, GFP_KERNEL);
+
+ /*
+ * len1 can be almost arbitrarily large. Don't WARN if it's
+ * too big, just fail the request.
+ */
+ tbuf = kmalloc(len1, GFP_KERNEL | __GFP_NOWARN);
if (!tbuf) {
ret = -ENOMEM;
goto done;
if (num_sgs) {
as->urb->sg = kmalloc_array(num_sgs,
sizeof(struct scatterlist),
- GFP_KERNEL);
+ GFP_KERNEL | __GFP_NOWARN);
if (!as->urb->sg) {
ret = -ENOMEM;
goto error;
(uurb_start - as->usbm->vm_start);
} else {
as->urb->transfer_buffer = kmalloc(uurb->buffer_length,
- GFP_KERNEL);
+ GFP_KERNEL | __GFP_NOWARN);
if (!as->urb->transfer_buffer) {
ret = -ENOMEM;
goto error;
req->start_sg = sg_next(s);
req->num_queued_sgs++;
+ req->num_pending_sgs--;
/*
* The number of pending SG entries may not correspond to the
* don't include unused SG entries.
*/
if (length == 0) {
- req->num_pending_sgs -= req->request.num_mapped_sgs - req->num_queued_sgs;
+ req->num_pending_sgs = 0;
break;
}
struct dwc3_trb *trb = &dep->trb_pool[dep->trb_dequeue];
struct scatterlist *sg = req->sg;
struct scatterlist *s;
- unsigned int pending = req->num_pending_sgs;
+ unsigned int num_queued = req->num_queued_sgs;
unsigned int i;
int ret = 0;
- for_each_sg(sg, s, pending, i) {
+ for_each_sg(sg, s, num_queued, i) {
trb = &dep->trb_pool[dep->trb_dequeue];
req->sg = sg_next(s);
- req->num_pending_sgs--;
+ req->num_queued_sgs--;
ret = dwc3_gadget_ep_reclaim_completed_trb(dep, req,
trb, event, status, true);
static bool dwc3_gadget_ep_request_completed(struct dwc3_request *req)
{
- return req->num_pending_sgs == 0;
+ return req->num_pending_sgs == 0 && req->num_queued_sgs == 0;
}
static int dwc3_gadget_ep_cleanup_completed_request(struct dwc3_ep *dep,
{
int ret;
- if (req->num_pending_sgs)
+ if (req->request.num_mapped_sgs)
ret = dwc3_gadget_ep_reclaim_trb_sg(dep, req, event,
status);
else
struct renesas_usb3_request *usb3_req)
{
struct renesas_usb3 *usb3 = usb3_ep_to_usb3(usb3_ep);
- struct renesas_usb3_request *usb3_req_first = usb3_get_request(usb3_ep);
+ struct renesas_usb3_request *usb3_req_first;
unsigned long flags;
int ret = -EAGAIN;
u32 enable_bits = 0;
spin_lock_irqsave(&usb3->lock, flags);
if (usb3_ep->halt || usb3_ep->started)
goto out;
- if (usb3_req != usb3_req_first)
+ usb3_req_first = __usb3_get_request(usb3_ep);
+ if (!usb3_req_first || usb3_req != usb3_req_first)
goto out;
if (usb3_pn_change(usb3, usb3_ep->num) < 0)
list_for_each_entry_safe(td, tmp_td, &ep->cancelled_td_list,
cancelled_td_list) {
- /*
- * Doesn't matter what we pass for status, since the core will
- * just overwrite it (because the URB has been unlinked).
- */
ring = xhci_urb_to_transfer_ring(ep->xhci, td->urb);
if (td->cancel_status == TD_CLEARED)
- xhci_td_cleanup(ep->xhci, td, ring, 0);
+ xhci_td_cleanup(ep->xhci, td, ring, td->status);
if (ep->xhci->xhc_state & XHCI_STATE_DYING)
return;
continue;
}
/*
- * If ring stopped on the TD we need to cancel, then we have to
+ * If a ring stopped on the TD we need to cancel then we have to
* move the xHC endpoint ring dequeue pointer past this TD.
+ * Rings halted due to STALL may show hw_deq is past the stalled
+ * TD, but still require a set TR Deq command to flush xHC cache.
*/
hw_deq = xhci_get_hw_deq(xhci, ep->vdev, ep->ep_index,
td->urb->stream_id);
hw_deq &= ~0xf;
- if (trb_in_td(xhci, td->start_seg, td->first_trb,
+ if (td->cancel_status == TD_HALTED) {
+ cached_td = td;
+ } else if (trb_in_td(xhci, td->start_seg, td->first_trb,
td->last_trb, hw_deq, false)) {
switch (td->cancel_status) {
case TD_CLEARED: /* TD is already no-op */
/* Set speed */
retval = usb_control_msg(tv->udev, usb_sndctrlpipe(tv->udev, 0),
0x01, /* vendor request: set speed */
- USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_OTHER,
+ USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_OTHER,
tv->speed, /* speed value */
- 0, NULL, 0, USB_CTRL_GET_TIMEOUT);
+ 0, NULL, 0, USB_CTRL_SET_TIMEOUT);
if (retval) {
tv->speed = old;
dev_dbg(&tv->udev->dev, "retval = %d\n", retval);
parport_announce_port(pp);
usb_set_intfdata(intf, pp);
+ usb_put_dev(usbdev);
return 0;
probe_abort:
/* Sienna devices */
{ USB_DEVICE(FTDI_VID, FTDI_SIENNA_PID) },
{ USB_DEVICE(ECHELON_VID, ECHELON_U20_PID) },
+ /* IDS GmbH devices */
+ { USB_DEVICE(IDS_VID, IDS_SI31A_PID) },
+ { USB_DEVICE(IDS_VID, IDS_CM31A_PID) },
/* U-Blox devices */
{ USB_DEVICE(UBLOX_VID, UBLOX_C099F9P_ZED_PID) },
{ USB_DEVICE(UBLOX_VID, UBLOX_C099F9P_ODIN_PID) },
#define UNJO_ISODEBUG_V1_PID 0x150D
/*
+ * IDS GmbH
+ */
+#define IDS_VID 0x2CAF
+#define IDS_SI31A_PID 0x13A2
+#define IDS_CM31A_PID 0x13A3
+
+/*
* U-Blox products (http://www.u-blox.com).
*/
#define UBLOX_VID 0x1546
.driver_info = NCTRL(0) | RSVD(1) },
{ USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x1901, 0xff), /* Telit LN940 (MBIM) */
.driver_info = NCTRL(0) },
+ { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x7010, 0xff), /* Telit LE910-S1 (RNDIS) */
+ .driver_info = NCTRL(2) },
+ { USB_DEVICE_INTERFACE_CLASS(TELIT_VENDOR_ID, 0x7011, 0xff), /* Telit LE910-S1 (ECM) */
+ .driver_info = NCTRL(2) },
{ USB_DEVICE(TELIT_VENDOR_ID, 0x9010), /* Telit SBL FN980 flashing device */
.driver_info = NCTRL(0) | ZLP },
{ USB_DEVICE_AND_INTERFACE_INFO(ZTE_VENDOR_ID, ZTE_PRODUCT_MF622, 0xff, 0xff, 0xff) }, /* ZTE WCDMA products */
{ USB_DEVICE(SONY_VENDOR_ID, SONY_QN3USB_PRODUCT_ID) },
{ USB_DEVICE(SANWA_VENDOR_ID, SANWA_PRODUCT_ID) },
{ USB_DEVICE(ADLINK_VENDOR_ID, ADLINK_ND6530_PRODUCT_ID) },
+ { USB_DEVICE(ADLINK_VENDOR_ID, ADLINK_ND6530GC_PRODUCT_ID) },
{ USB_DEVICE(SMART_VENDOR_ID, SMART_PRODUCT_ID) },
{ USB_DEVICE(AT_VENDOR_ID, AT_VTKIT3_PRODUCT_ID) },
{ } /* Terminating entry */
/* ADLINK ND-6530 RS232,RS485 and RS422 adapter */
#define ADLINK_VENDOR_ID 0x0b63
#define ADLINK_ND6530_PRODUCT_ID 0x6530
+#define ADLINK_ND6530GC_PRODUCT_ID 0x653a
/* SMART USB Serial Adapter */
#define SMART_VENDOR_ID 0x0b8c
/* Vendor and product ids */
#define TI_VENDOR_ID 0x0451
#define IBM_VENDOR_ID 0x04b3
+#define STARTECH_VENDOR_ID 0x14b0
#define TI_3410_PRODUCT_ID 0x3410
#define IBM_4543_PRODUCT_ID 0x4543
#define IBM_454B_PRODUCT_ID 0x454b
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1131_PRODUCT_ID) },
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1150_PRODUCT_ID) },
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1151_PRODUCT_ID) },
+ { USB_DEVICE(STARTECH_VENDOR_ID, TI_3410_PRODUCT_ID) },
{ } /* terminator */
};
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1131_PRODUCT_ID) },
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1150_PRODUCT_ID) },
{ USB_DEVICE(MXU1_VENDOR_ID, MXU1_1151_PRODUCT_ID) },
+ { USB_DEVICE(STARTECH_VENDOR_ID, TI_3410_PRODUCT_ID) },
{ } /* terminator */
};
bool match;
int nval;
u16 *val;
+ int ret;
int i;
/*
if (!val)
return ERR_PTR(-ENOMEM);
- nval = fwnode_property_read_u16_array(fwnode, "svid", val, nval);
- if (nval < 0) {
+ ret = fwnode_property_read_u16_array(fwnode, "svid", val, nval);
+ if (ret < 0) {
kfree(val);
- return ERR_PTR(nval);
+ return ERR_PTR(ret);
}
for (i = 0; i < nval; i++) {
if (PD_VDO_SVDM_VER(p[0]) < svdm_version)
typec_partner_set_svdm_version(port->partner,
PD_VDO_SVDM_VER(p[0]));
+
+ tcpm_ams_start(port, DISCOVER_IDENTITY);
/* 6.4.4.3.1: Only respond as UFP (device) */
if (port->data_role == TYPEC_DEVICE &&
port->nr_snk_vdo) {
}
break;
case CMD_DISCOVER_SVID:
+ tcpm_ams_start(port, DISCOVER_SVIDS);
break;
case CMD_DISCOVER_MODES:
+ tcpm_ams_start(port, DISCOVER_MODES);
break;
case CMD_ENTER_MODE:
+ tcpm_ams_start(port, DFP_TO_UFP_ENTER_MODE);
break;
case CMD_EXIT_MODE:
+ tcpm_ams_start(port, DFP_TO_UFP_EXIT_MODE);
break;
case CMD_ATTENTION:
+ tcpm_ams_start(port, ATTENTION);
/* Attention command does not have response */
*adev_action = ADEV_ATTENTION;
return 0;
bool frs_enable;
int ret;
+ if (tcpm_vdm_ams(port) && type != PD_DATA_VENDOR_DEF) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
switch (type) {
case PD_DATA_SOURCE_CAP:
for (i = 0; i < cnt; i++)
NONE_AMS);
break;
case PD_DATA_VENDOR_DEF:
- tcpm_handle_vdm_request(port, msg->payload, cnt);
+ if (tcpm_vdm_ams(port) || port->nr_snk_vdo)
+ tcpm_handle_vdm_request(port, msg->payload, cnt);
+ else if (port->negotiated_rev > PD_REV20)
+ tcpm_pd_handle_msg(port, PD_MSG_CTRL_NOT_SUPP, NONE_AMS);
break;
case PD_DATA_BIST:
port->bist_request = le32_to_cpu(msg->payload[0]);
enum pd_ctrl_msg_type type = pd_header_type_le(msg->header);
enum tcpm_state next_state;
+ /*
+ * Stop VDM state machine if interrupted by other Messages while NOT_SUPP is allowed in
+ * VDM AMS if waiting for VDM responses and will be handled later.
+ */
+ if (tcpm_vdm_ams(port) && type != PD_CTRL_NOT_SUPP && type != PD_CTRL_GOOD_CRC) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
switch (type) {
case PD_CTRL_GOOD_CRC:
case PD_CTRL_PING:
enum pd_ext_msg_type type = pd_header_type_le(msg->header);
unsigned int data_size = pd_ext_header_data_size_le(msg->ext_msg.header);
- if (!(msg->ext_msg.header & PD_EXT_HDR_CHUNKED)) {
+ /* stopping VDM state machine if interrupted by other Messages */
+ if (tcpm_vdm_ams(port)) {
+ port->vdm_state = VDM_STATE_ERR_BUSY;
+ tcpm_ams_finish(port);
+ mod_vdm_delayed_work(port, 0);
+ }
+
+ if (!(le16_to_cpu(msg->ext_msg.header) & PD_EXT_HDR_CHUNKED)) {
tcpm_pd_handle_msg(port, PD_MSG_CTRL_NOT_SUPP, NONE_AMS);
tcpm_log(port, "Unchunked extended messages unsupported");
return;
"Data role mismatch, initiating error recovery");
tcpm_set_state(port, ERROR_RECOVERY, 0);
} else {
- if (msg->header & PD_HEADER_EXT_HDR)
+ if (le16_to_cpu(msg->header) & PD_HEADER_EXT_HDR)
tcpm_pd_ext_msg_request(port, msg);
else if (cnt)
tcpm_pd_data_request(port, msg);
ucsi_send_command(con->ucsi, command, NULL, 0);
/* 3. ACK connector change */
- clear_bit(EVENT_PENDING, &ucsi->flags);
ret = ucsi_acknowledge_connector_change(ucsi);
+ clear_bit(EVENT_PENDING, &ucsi->flags);
if (ret) {
dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
goto out_unlock;
#include <linux/mlx5/vport.h>
#include <linux/mlx5/fs.h>
#include <linux/mlx5/mlx5_ifc_vdpa.h>
+#include <linux/mlx5/mpfs.h>
#include "mlx5_vdpa.h"
MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
static void mlx5_vdpa_free(struct vdpa_device *vdev)
{
struct mlx5_vdpa_dev *mvdev = to_mvdev(vdev);
+ struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_net *ndev;
ndev = to_mlx5_vdpa_ndev(mvdev);
free_resources(ndev);
+ if (!is_zero_ether_addr(ndev->config.mac)) {
+ pfmdev = pci_get_drvdata(pci_physfn(mvdev->mdev->pdev));
+ mlx5_mpfs_del_mac(pfmdev, ndev->config.mac);
+ }
mlx5_vdpa_free_resources(&ndev->mvdev);
mutex_destroy(&ndev->reslock);
}
{
struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev);
struct virtio_net_config *config;
+ struct mlx5_core_dev *pfmdev;
struct mlx5_vdpa_dev *mvdev;
struct mlx5_vdpa_net *ndev;
struct mlx5_core_dev *mdev;
if (err)
goto err_mtu;
+ if (!is_zero_ether_addr(config->mac)) {
+ pfmdev = pci_get_drvdata(pci_physfn(mdev->pdev));
+ err = mlx5_mpfs_add_mac(pfmdev, config->mac);
+ if (err)
+ goto err_mtu;
+ }
+
mvdev->vdev.dma_dev = mdev->device;
err = mlx5_vdpa_alloc_resources(&ndev->mvdev);
if (err)
- goto err_mtu;
+ goto err_mpfs;
err = alloc_resources(ndev);
if (err)
free_resources(ndev);
err_res:
mlx5_vdpa_free_resources(&ndev->mvdev);
+err_mpfs:
+ if (!is_zero_ether_addr(config->mac))
+ mlx5_mpfs_del_mac(pfmdev, config->mac);
err_mtu:
mutex_destroy(&ndev->reslock);
put_device(&mvdev->vdev.dev);
config VFIO_PCI
tristate "VFIO support for PCI devices"
depends on VFIO && PCI && EVENTFD
+ depends on MMU
select VFIO_VIRQFD
select IRQ_BYPASS_MANAGER
help
if (len == 0xFF) {
len = vfio_ext_cap_len(vdev, ecap, epos);
if (len < 0)
- return ret;
+ return len;
}
}
vfio_platform_regions_cleanup(vdev);
err_reg:
mutex_unlock(&driver_lock);
- module_put(THIS_MODULE);
+ module_put(vdev->parent_module);
return ret;
}
return 0;
}
- size = sizeof(*cap_iovas) + (iovas * sizeof(*cap_iovas->iova_ranges));
+ size = struct_size(cap_iovas, iova_ranges, iovas);
cap_iovas = kzalloc(size, GFP_KERNEL);
if (!cap_iovas)
return VM_FAULT_SIGBUS;
get_page(page);
+
+ if (vmf->vma->vm_file)
+ page->mapping = vmf->vma->vm_file->f_mapping;
+ else
+ printk(KERN_ERR "no mapping available\n");
+
+ BUG_ON(!page->mapping);
page->index = vmf->pgoff;
vmf->page = page;
.page_mkwrite = fb_deferred_io_mkwrite,
};
+static int fb_deferred_io_set_page_dirty(struct page *page)
+{
+ if (!PageDirty(page))
+ SetPageDirty(page);
+ return 0;
+}
+
+static const struct address_space_operations fb_deferred_io_aops = {
+ .set_page_dirty = fb_deferred_io_set_page_dirty,
+};
+
int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma)
{
vma->vm_ops = &fb_deferred_io_vm_ops;
}
EXPORT_SYMBOL_GPL(fb_deferred_io_init);
+void fb_deferred_io_open(struct fb_info *info,
+ struct inode *inode,
+ struct file *file)
+{
+ file->f_mapping->a_ops = &fb_deferred_io_aops;
+}
+EXPORT_SYMBOL_GPL(fb_deferred_io_open);
+
void fb_deferred_io_cleanup(struct fb_info *info)
{
struct fb_deferred_io *fbdefio = info->fbdefio;
+ struct page *page;
+ int i;
BUG_ON(!fbdefio);
cancel_delayed_work_sync(&info->deferred_work);
+
+ /* clear out the mapping that we setup */
+ for (i = 0 ; i < info->fix.smem_len; i += PAGE_SIZE) {
+ page = fb_deferred_io_page(info, i);
+ page->mapping = NULL;
+ }
+
mutex_destroy(&fbdefio->lock);
}
EXPORT_SYMBOL_GPL(fb_deferred_io_cleanup);
if (res)
module_put(info->fbops->owner);
}
+#ifdef CONFIG_FB_DEFERRED_IO
+ if (info->fbdefio)
+ fb_deferred_io_open(info, inode, file);
+#endif
out:
unlock_fb_info(info);
if (res)
int ret;
ret = hga_card_detect();
- if (!ret)
+ if (ret)
return ret;
printk(KERN_INFO "hgafb: %s with %ldK of memory detected.\n",
return ret;
call->unmarshall++;
+ fallthrough;
+
case 5:
break;
}
r->node[loop] = ntohl(b[loop + 5]);
call->unmarshall++;
+ fallthrough;
case 2:
break;
r->node[loop] = ntohl(b[loop + 5]);
call->unmarshall++;
+ fallthrough;
case 2:
break;
afs_extract_to_tmp(call);
call->unmarshall++;
+ fallthrough;
case 3:
break;
new_inode = d_inode(new_dentry);
if (new_inode) {
spin_lock(&new_inode->i_lock);
- if (new_inode->i_nlink > 0)
+ if (S_ISDIR(new_inode->i_mode))
+ clear_nlink(new_inode);
+ else if (new_inode->i_nlink > 0)
drop_nlink(new_inode);
spin_unlock(&new_inode->i_lock);
}
req->file_size = vp->scb.status.size;
call->unmarshall++;
+ fallthrough;
case 5:
break;
_debug("motd '%s'", p);
call->unmarshall++;
+ fallthrough;
case 8:
break;
xdr_decode_AFSVolSync(&bp, &op->volsync);
call->unmarshall++;
+ fallthrough;
case 6:
break;
xdr_decode_AFSVolSync(&bp, &op->volsync);
call->unmarshall++;
+ fallthrough;
case 4:
break;
if (ret < 0)
return ret;
call->unmarshall = 6;
+ fallthrough;
case 6:
break;
bytes_left = compressed_len;
for (pg_index = 0; pg_index < cb->nr_pages; pg_index++) {
int submit = 0;
- int len;
+ int len = 0;
page = compressed_pages[pg_index];
page->mapping = inode->vfs_inode.i_mapping;
submit = btrfs_bio_fits_in_stripe(page, PAGE_SIZE, bio,
0);
- if (pg_index == 0 && use_append)
- len = bio_add_zone_append_page(bio, page, PAGE_SIZE, 0);
- else
- len = bio_add_page(bio, page, PAGE_SIZE, 0);
+ /*
+ * Page can only be added to bio if the current bio fits in
+ * stripe.
+ */
+ if (!submit) {
+ if (pg_index == 0 && use_append)
+ len = bio_add_zone_append_page(bio, page,
+ PAGE_SIZE, 0);
+ else
+ len = bio_add_page(bio, page, PAGE_SIZE, 0);
+ }
page->mapping = NULL;
if (submit || len < PAGE_SIZE) {
trace_run_delayed_ref_head(fs_info, head, 0);
btrfs_delayed_ref_unlock(head);
btrfs_put_delayed_ref_head(head);
- return 0;
+ return ret;
}
static struct btrfs_delayed_ref_head *btrfs_obtain_ref_head(
u64 end_byte = bytenr + len;
u64 csum_end;
struct extent_buffer *leaf;
- int ret;
+ int ret = 0;
const u32 csum_size = fs_info->csum_size;
u32 blocksize_bits = fs_info->sectorsize_bits;
ret = btrfs_search_slot(trans, root, &key, path, -1, 1);
if (ret > 0) {
+ ret = 0;
if (path->slots[0] == 0)
break;
path->slots[0]--;
ret = btrfs_del_items(trans, root, path,
path->slots[0], del_nr);
if (ret)
- goto out;
+ break;
if (key.offset == bytenr)
break;
} else if (key.offset < bytenr && csum_end > end_byte) {
ret = btrfs_split_item(trans, root, path, &key, offset);
if (ret && ret != -EAGAIN) {
btrfs_abort_transaction(trans, ret);
- goto out;
+ break;
}
+ ret = 0;
key.offset = end_byte - 1;
} else {
}
btrfs_release_path(path);
}
- ret = 0;
-out:
btrfs_free_path(path);
return ret;
}
+static int find_next_csum_offset(struct btrfs_root *root,
+ struct btrfs_path *path,
+ u64 *next_offset)
+{
+ const u32 nritems = btrfs_header_nritems(path->nodes[0]);
+ struct btrfs_key found_key;
+ int slot = path->slots[0] + 1;
+ int ret;
+
+ if (nritems == 0 || slot >= nritems) {
+ ret = btrfs_next_leaf(root, path);
+ if (ret < 0) {
+ return ret;
+ } else if (ret > 0) {
+ *next_offset = (u64)-1;
+ return 0;
+ }
+ slot = path->slots[0];
+ }
+
+ btrfs_item_key_to_cpu(path->nodes[0], &found_key, slot);
+
+ if (found_key.objectid != BTRFS_EXTENT_CSUM_OBJECTID ||
+ found_key.type != BTRFS_EXTENT_CSUM_KEY)
+ *next_offset = (u64)-1;
+ else
+ *next_offset = found_key.offset;
+
+ return 0;
+}
+
int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
struct btrfs_root *root,
struct btrfs_ordered_sum *sums)
u64 total_bytes = 0;
u64 csum_offset;
u64 bytenr;
- u32 nritems;
u32 ins_size;
int index = 0;
int found_next;
goto insert;
}
} else {
- int slot = path->slots[0] + 1;
- /* we didn't find a csum item, insert one */
- nritems = btrfs_header_nritems(path->nodes[0]);
- if (!nritems || (path->slots[0] >= nritems - 1)) {
- ret = btrfs_next_leaf(root, path);
- if (ret < 0) {
- goto out;
- } else if (ret > 0) {
- found_next = 1;
- goto insert;
- }
- slot = path->slots[0];
- }
- btrfs_item_key_to_cpu(path->nodes[0], &found_key, slot);
- if (found_key.objectid != BTRFS_EXTENT_CSUM_OBJECTID ||
- found_key.type != BTRFS_EXTENT_CSUM_KEY) {
- found_next = 1;
- goto insert;
- }
- next_offset = found_key.offset;
+ /* We didn't find a csum item, insert one. */
+ ret = find_next_csum_offset(root, path, &next_offset);
+ if (ret < 0)
+ goto out;
found_next = 1;
goto insert;
}
tmp = sums->len - total_bytes;
tmp >>= fs_info->sectorsize_bits;
WARN_ON(tmp < 1);
+ extend_nr = max_t(int, 1, tmp);
+
+ /*
+ * A log tree can already have checksum items with a subset of
+ * the checksums we are trying to log. This can happen after
+ * doing a sequence of partial writes into prealloc extents and
+ * fsyncs in between, with a full fsync logging a larger subrange
+ * of an extent for which a previous fast fsync logged a smaller
+ * subrange. And this happens in particular due to merging file
+ * extent items when we complete an ordered extent for a range
+ * covered by a prealloc extent - this is done at
+ * btrfs_mark_extent_written().
+ *
+ * So if we try to extend the previous checksum item, which has
+ * a range that ends at the start of the range we want to insert,
+ * make sure we don't extend beyond the start offset of the next
+ * checksum item. If we are at the last item in the leaf, then
+ * forget the optimization of extending and add a new checksum
+ * item - it is not worth the complexity of releasing the path,
+ * getting the first key for the next leaf, repeat the btree
+ * search, etc, because log trees are temporary anyway and it
+ * would only save a few bytes of leaf space.
+ */
+ if (root->root_key.objectid == BTRFS_TREE_LOG_OBJECTID) {
+ if (path->slots[0] + 1 >=
+ btrfs_header_nritems(path->nodes[0])) {
+ ret = find_next_csum_offset(root, path, &next_offset);
+ if (ret < 0)
+ goto out;
+ found_next = 1;
+ goto insert;
+ }
+
+ ret = find_next_csum_offset(root, path, &next_offset);
+ if (ret < 0)
+ goto out;
+
+ tmp = (next_offset - bytenr) >> fs_info->sectorsize_bits;
+ if (tmp <= INT_MAX)
+ extend_nr = min_t(int, extend_nr, tmp);
+ }
- extend_nr = max_t(int, 1, (int)tmp);
diff = (csum_offset + extend_nr) * csum_size;
diff = min(diff,
MAX_CSUM_ITEMS(fs_info, csum_size) * csum_size);
if (ret || truncated) {
u64 unwritten_start = start;
+ /*
+ * If we failed to finish this ordered extent for any reason we
+ * need to make sure BTRFS_ORDERED_IOERR is set on the ordered
+ * extent, and mark the inode with the error if it wasn't
+ * already set. Any error during writeback would have already
+ * set the mapping error, so we need to set it if we're the ones
+ * marking this ordered extent as failed.
+ */
+ if (ret && !test_and_set_bit(BTRFS_ORDERED_IOERR,
+ &ordered_extent->flags))
+ mapping_set_error(ordered_extent->inode->i_mapping, -EIO);
+
if (truncated)
unwritten_start += logical_len;
clear_extent_uptodate(io_tree, unwritten_start, end, NULL);
int ret2;
bool root_log_pinned = false;
bool dest_log_pinned = false;
+ bool need_abort = false;
/* we only allow rename subvolume link between subvolumes */
if (old_ino != BTRFS_FIRST_FREE_OBJECTID && root != dest)
old_idx);
if (ret)
goto out_fail;
+ need_abort = true;
}
/* And now for the dest. */
new_ino,
btrfs_ino(BTRFS_I(old_dir)),
new_idx);
- if (ret)
+ if (ret) {
+ if (need_abort)
+ btrfs_abort_transaction(trans, ret);
goto out_fail;
+ }
}
/* Update inode version and ctime/mtime. */
* inline extent's data to the page.
*/
ASSERT(key.offset > 0);
- ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
- inline_data, size, datal,
- comp_type);
- goto out;
+ goto copy_to_page;
}
} else if (i_size_read(dst) <= datal) {
struct btrfs_file_extent_item *ei;
BTRFS_FILE_EXTENT_INLINE)
goto copy_inline_extent;
- ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
- inline_data, size, datal, comp_type);
- goto out;
+ goto copy_to_page;
}
copy_inline_extent:
- ret = 0;
/*
* We have no extent items, or we have an extent at offset 0 which may
* or may not be inlined. All these cases are dealt the same way.
* clone. Deal with all these cases by copying the inline extent
* data into the respective page at the destination inode.
*/
- ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
- inline_data, size, datal, comp_type);
- goto out;
+ goto copy_to_page;
}
+ /*
+ * Release path before starting a new transaction so we don't hold locks
+ * that would confuse lockdep.
+ */
btrfs_release_path(path);
/*
* If we end up here it means were copy the inline extent into a leaf
out:
if (!ret && !trans) {
/*
- * Release path before starting a new transaction so we don't
- * hold locks that would confuse lockdep.
- */
- btrfs_release_path(path);
- /*
* No transaction here means we copied the inline extent into a
* page of the destination inode.
*
*trans_out = trans;
return ret;
+
+copy_to_page:
+ /*
+ * Release our path because we don't need it anymore and also because
+ * copy_inline_to_page() needs to reserve data and metadata, which may
+ * need to flush delalloc when we are low on available space and
+ * therefore cause a deadlock if writeback of an inline extent needs to
+ * write to the same leaf or an ordered extent completion needs to write
+ * to the same leaf.
+ */
+ btrfs_release_path(path);
+
+ ret = copy_inline_to_page(BTRFS_I(dst), new_key->offset,
+ inline_data, size, datal, comp_type);
+ goto out;
}
/**
if (ret)
goto out;
- btrfs_update_inode(trans, root, BTRFS_I(inode));
+ ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
+ if (ret)
+ goto out;
}
ref_ptr = (unsigned long)(ref_ptr + ref_struct_size) + namelen;
if (nlink != inode->i_nlink) {
set_nlink(inode, nlink);
- btrfs_update_inode(trans, root, BTRFS_I(inode));
+ ret = btrfs_update_inode(trans, root, BTRFS_I(inode));
+ if (ret)
+ goto out;
}
BTRFS_I(inode)->index_cnt = (u64)-1;
break;
if (ret == 1) {
+ ret = 0;
if (path->slots[0] == 0)
break;
path->slots[0]--;
ret = btrfs_del_item(trans, root, path);
if (ret)
- goto out;
+ break;
btrfs_release_path(path);
inode = read_one_inode(root, key.offset);
- if (!inode)
- return -EIO;
+ if (!inode) {
+ ret = -EIO;
+ break;
+ }
ret = fixup_inode_link_count(trans, root, inode);
iput(inode);
if (ret)
- goto out;
+ break;
/*
* fixup on a directory may create new entries,
*/
key.offset = (u64)-1;
}
- ret = 0;
-out:
btrfs_release_path(path);
return ret;
}
} __packed;
/*
- * Dump full key (32 byte encrypt/decrypt keys instead of 16 bytes)
- * is needed if GCM256 (stronger encryption) negotiated
+ * Dump variable-sized keys
*/
struct smb3_full_key_debug_info {
- __u64 Suid;
+ /* INPUT: size of userspace buffer */
+ __u32 in_size;
+
+ /*
+ * INPUT: 0 for current user, otherwise session to dump
+ * OUTPUT: session id that was dumped
+ */
+ __u64 session_id;
__u16 cipher_type;
- __u8 auth_key[16]; /* SMB2_NTLMV2_SESSKEY_SIZE */
- __u8 smb3encryptionkey[32]; /* SMB3_ENC_DEC_KEY_SIZE */
- __u8 smb3decryptionkey[32]; /* SMB3_ENC_DEC_KEY_SIZE */
+ __u8 session_key_length;
+ __u8 server_in_key_length;
+ __u8 server_out_key_length;
+ __u8 data[];
+ /*
+ * return this struct with the keys appended at the end:
+ * __u8 session_key[session_key_length];
+ * __u8 server_in_key[server_in_key_length];
+ * __u8 server_out_key[server_out_key_length];
+ */
} __packed;
struct smb3_notify {
#define SMB3_SIGN_KEY_SIZE (16)
/*
- * Size of the smb3 encryption/decryption keys
+ * Size of the smb3 encryption/decryption key storage.
+ * This size is big enough to store any cipher key types.
*/
#define SMB3_ENC_DEC_KEY_SIZE (32)
#include "cifsfs.h"
#include "cifs_ioctl.h"
#include "smb2proto.h"
+#include "smb2glob.h"
#include <linux/btrfs.h>
static long cifs_ioctl_query_info(unsigned int xid, struct file *filep,
return 0;
}
-static int cifs_dump_full_key(struct cifs_tcon *tcon, unsigned long arg)
+static int cifs_dump_full_key(struct cifs_tcon *tcon, struct smb3_full_key_debug_info __user *in)
{
- struct smb3_full_key_debug_info pfull_key_inf;
- __u64 suid;
- struct list_head *tmp;
+ struct smb3_full_key_debug_info out;
struct cifs_ses *ses;
+ int rc = 0;
bool found = false;
+ u8 __user *end;
- if (!smb3_encryption_required(tcon))
- return -EOPNOTSUPP;
+ if (!smb3_encryption_required(tcon)) {
+ rc = -EOPNOTSUPP;
+ goto out;
+ }
+
+ /* copy user input into our output buffer */
+ if (copy_from_user(&out, in, sizeof(out))) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ if (!out.session_id) {
+ /* if ses id is 0, use current user session */
+ ses = tcon->ses;
+ } else {
+ /* otherwise if a session id is given, look for it in all our sessions */
+ struct cifs_ses *ses_it = NULL;
+ struct TCP_Server_Info *server_it = NULL;
- ses = tcon->ses; /* default to user id for current user */
- if (get_user(suid, (__u64 __user *)arg))
- suid = 0;
- if (suid) {
- /* search to see if there is a session with a matching SMB UID */
spin_lock(&cifs_tcp_ses_lock);
- list_for_each(tmp, &tcon->ses->server->smb_ses_list) {
- ses = list_entry(tmp, struct cifs_ses, smb_ses_list);
- if (ses->Suid == suid) {
- found = true;
- break;
+ list_for_each_entry(server_it, &cifs_tcp_ses_list, tcp_ses_list) {
+ list_for_each_entry(ses_it, &server_it->smb_ses_list, smb_ses_list) {
+ if (ses_it->Suid == out.session_id) {
+ ses = ses_it;
+ /*
+ * since we are using the session outside the crit
+ * section, we need to make sure it won't be released
+ * so increment its refcount
+ */
+ ses->ses_count++;
+ found = true;
+ goto search_end;
+ }
}
}
+search_end:
spin_unlock(&cifs_tcp_ses_lock);
- if (found == false)
- return -EINVAL;
- } /* else uses default user's SMB UID (ie current user) */
-
- pfull_key_inf.cipher_type = le16_to_cpu(ses->server->cipher_type);
- pfull_key_inf.Suid = ses->Suid;
- memcpy(pfull_key_inf.auth_key, ses->auth_key.response,
- 16 /* SMB2_NTLMV2_SESSKEY_SIZE */);
- memcpy(pfull_key_inf.smb3decryptionkey, ses->smb3decryptionkey,
- 32 /* SMB3_ENC_DEC_KEY_SIZE */);
- memcpy(pfull_key_inf.smb3encryptionkey,
- ses->smb3encryptionkey, 32 /* SMB3_ENC_DEC_KEY_SIZE */);
- if (copy_to_user((void __user *)arg, &pfull_key_inf,
- sizeof(struct smb3_full_key_debug_info)))
- return -EFAULT;
+ if (!found) {
+ rc = -ENOENT;
+ goto out;
+ }
+ }
- return 0;
+ switch (ses->server->cipher_type) {
+ case SMB2_ENCRYPTION_AES128_CCM:
+ case SMB2_ENCRYPTION_AES128_GCM:
+ out.session_key_length = CIFS_SESS_KEY_SIZE;
+ out.server_in_key_length = out.server_out_key_length = SMB3_GCM128_CRYPTKEY_SIZE;
+ break;
+ case SMB2_ENCRYPTION_AES256_CCM:
+ case SMB2_ENCRYPTION_AES256_GCM:
+ out.session_key_length = CIFS_SESS_KEY_SIZE;
+ out.server_in_key_length = out.server_out_key_length = SMB3_GCM256_CRYPTKEY_SIZE;
+ break;
+ default:
+ rc = -EOPNOTSUPP;
+ goto out;
+ }
+
+ /* check if user buffer is big enough to store all the keys */
+ if (out.in_size < sizeof(out) + out.session_key_length + out.server_in_key_length
+ + out.server_out_key_length) {
+ rc = -ENOBUFS;
+ goto out;
+ }
+
+ out.session_id = ses->Suid;
+ out.cipher_type = le16_to_cpu(ses->server->cipher_type);
+
+ /* overwrite user input with our output */
+ if (copy_to_user(in, &out, sizeof(out))) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ /* append all the keys at the end of the user buffer */
+ end = in->data;
+ if (copy_to_user(end, ses->auth_key.response, out.session_key_length)) {
+ rc = -EINVAL;
+ goto out;
+ }
+ end += out.session_key_length;
+
+ if (copy_to_user(end, ses->smb3encryptionkey, out.server_in_key_length)) {
+ rc = -EINVAL;
+ goto out;
+ }
+ end += out.server_in_key_length;
+
+ if (copy_to_user(end, ses->smb3decryptionkey, out.server_out_key_length)) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+out:
+ if (found)
+ cifs_put_smb_ses(ses);
+ return rc;
}
long cifs_ioctl(struct file *filep, unsigned int command, unsigned long arg)
rc = -EOPNOTSUPP;
break;
case CIFS_DUMP_KEY:
+ /*
+ * Dump encryption keys. This is an old ioctl that only
+ * handles AES-128-{CCM,GCM}.
+ */
if (pSMBFile == NULL)
break;
if (!capable(CAP_SYS_ADMIN)) {
else
rc = 0;
break;
- /*
- * Dump full key (32 bytes instead of 16 bytes) is
- * needed if GCM256 (stronger encryption) negotiated
- */
case CIFS_DUMP_FULL_KEY:
+ /*
+ * Dump encryption keys (handles any key sizes)
+ */
if (pSMBFile == NULL)
break;
if (!capable(CAP_SYS_ADMIN)) {
break;
}
tcon = tlink_tcon(pSMBFile->tlink);
- rc = cifs_dump_full_key(tcon, arg);
-
+ rc = cifs_dump_full_key(tcon, (void __user *)arg);
break;
case CIFS_IOC_NOTIFY:
if (!S_ISDIR(inode->i_mode)) {
/* Internal types */
server->capabilities |= SMB2_NT_FIND | SMB2_LARGE_FILES;
+ /*
+ * SMB3.0 supports only 1 cipher and doesn't have a encryption neg context
+ * Set the cipher type manually.
+ */
+ if (server->dialect == SMB30_PROT_ID && (server->capabilities & SMB2_GLOBAL_CAP_ENCRYPTION))
+ server->cipher_type = SMB2_ENCRYPTION_AES128_CCM;
+
security_blob = smb2_get_data_area_len(&blob_offset, &blob_length,
(struct smb2_sync_hdr *)rsp);
/*
#include <linux/tracepoint.h>
+/*
+ * Please use this 3-part article as a reference for writing new tracepoints:
+ * https://lwn.net/Articles/379903/
+ */
+
/* For logging errors in read or write */
DECLARE_EVENT_CLASS(smb3_rw_err_class,
TP_PROTO(unsigned int xid,
TP_ARGS(xid, func_name, rc),
TP_STRUCT__entry(
__field(unsigned int, xid)
- __field(const char *, func_name)
+ __string(func_name, func_name)
__field(int, rc)
),
TP_fast_assign(
__entry->xid = xid;
- __entry->func_name = func_name;
+ __assign_str(func_name, func_name);
__entry->rc = rc;
),
TP_printk("\t%s: xid=%u rc=%d",
- __entry->func_name, __entry->xid, __entry->rc)
+ __get_str(func_name), __entry->xid, __entry->rc)
)
#define DEFINE_SMB3_EXIT_ERR_EVENT(name) \
TP_ARGS(xid, func_name),
TP_STRUCT__entry(
__field(unsigned int, xid)
- __field(const char *, func_name)
+ __string(func_name, func_name)
),
TP_fast_assign(
__entry->xid = xid;
- __entry->func_name = func_name;
+ __assign_str(func_name, func_name);
),
TP_printk("\t%s: xid=%u",
- __entry->func_name, __entry->xid)
+ __get_str(func_name), __entry->xid)
)
#define DEFINE_SMB3_ENTER_EXIT_EVENT(name) \
TP_STRUCT__entry(
__field(__u64, currmid)
__field(__u64, conn_id)
- __field(char *, hostname)
+ __string(hostname, hostname)
),
TP_fast_assign(
__entry->currmid = currmid;
__entry->conn_id = conn_id;
- __entry->hostname = hostname;
+ __assign_str(hostname, hostname);
),
TP_printk("conn_id=0x%llx server=%s current_mid=%llu",
__entry->conn_id,
- __entry->hostname,
+ __get_str(hostname),
__entry->currmid)
)
TP_STRUCT__entry(
__field(__u64, currmid)
__field(__u64, conn_id)
- __field(char *, hostname)
+ __string(hostname, hostname)
__field(int, credits)
__field(int, credits_to_add)
__field(int, in_flight)
TP_fast_assign(
__entry->currmid = currmid;
__entry->conn_id = conn_id;
- __entry->hostname = hostname;
+ __assign_str(hostname, hostname);
__entry->credits = credits;
__entry->credits_to_add = credits_to_add;
__entry->in_flight = in_flight;
TP_printk("conn_id=0x%llx server=%s current_mid=%llu "
"credits=%d credit_change=%d in_flight=%d",
__entry->conn_id,
- __entry->hostname,
+ __get_str(hostname),
__entry->currmid,
__entry->credits,
__entry->credits_to_add,
static int debugfs_setattr(struct user_namespace *mnt_userns,
struct dentry *dentry, struct iattr *ia)
{
- int ret = security_locked_down(LOCKDOWN_DEBUGFS);
+ int ret;
- if (ret && (ia->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)))
- return ret;
+ if (ia->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)) {
+ ret = security_locked_down(LOCKDOWN_DEBUGFS);
+ if (ret)
+ return ret;
+ }
return simple_setattr(&init_user_ns, dentry, ia);
}
current->backing_dev_info = inode_to_bdi(inode);
buffered = iomap_file_buffered_write(iocb, from, &gfs2_iomap_ops);
current->backing_dev_info = NULL;
- if (unlikely(buffered <= 0))
+ if (unlikely(buffered <= 0)) {
+ if (!ret)
+ ret = buffered;
goto out_unlock;
+ }
/*
* We need to ensure that the page cache pages are written to
spin_unlock(&gl->gl_lockref.lock);
}
+static bool is_system_glock(struct gfs2_glock *gl)
+{
+ struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+ struct gfs2_inode *m_ip = GFS2_I(sdp->sd_statfs_inode);
+
+ if (gl == m_ip->i_gl)
+ return true;
+ return false;
+}
+
/**
* do_xmote - Calls the DLM to change the state of a lock
* @gl: The lock state
* to see sd_log_error and withdraw, and in the meantime, requeue the
* work for later.
*
+ * We make a special exception for some system glocks, such as the
+ * system statfs inode glock, which needs to be granted before the
+ * gfs2_quotad daemon can exit, and that exit needs to finish before
+ * we can unmount the withdrawn file system.
+ *
* However, if we're just unlocking the lock (say, for unmount, when
* gfs2_gl_hash_clear calls clear_glock) and recovery is complete
* then it's okay to tell dlm to unlock it.
*/
if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
gfs2_withdraw_delayed(sdp);
- if (glock_blocked_by_withdraw(gl)) {
- if (target != LM_ST_UNLOCKED ||
- test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags)) {
+ if (glock_blocked_by_withdraw(gl) &&
+ (target != LM_ST_UNLOCKED ||
+ test_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags))) {
+ if (!is_system_glock(gl)) {
gfs2_glock_queue_work(gl, GL_GLOCK_DFT_HOLD);
goto out;
+ } else {
+ clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
}
}
glock_blocked_by_withdraw(gl) &&
gh->gh_gl != sdp->sd_jinode_gl) {
sdp->sd_glock_dqs_held++;
+ spin_unlock(&gl->gl_lockref.lock);
might_sleep();
wait_on_bit(&sdp->sd_flags, SDF_WITHDRAW_RECOVERY,
TASK_UNINTERRUPTIBLE);
+ spin_lock(&gl->gl_lockref.lock);
}
if (gh->gh_flags & GL_NOCACHE)
handle_callback(gl, LM_ST_UNLOCKED, 0, false);
while(!list_empty(list)) {
gl = list_first_entry(list, struct gfs2_glock, gl_lru);
list_del_init(&gl->gl_lru);
+ clear_bit(GLF_LRU, &gl->gl_flags);
if (!spin_trylock(&gl->gl_lockref.lock)) {
add_back_to_lru:
list_add(&gl->gl_lru, &lru_list);
if (!test_bit(GLF_LOCK, &gl->gl_flags)) {
list_move(&gl->gl_lru, &dispose);
atomic_dec(&lru_count);
- clear_bit(GLF_LRU, &gl->gl_flags);
freed++;
continue;
}
struct timespec64 atime;
u16 height, depth;
umode_t mode = be32_to_cpu(str->di_mode);
- bool is_new = ip->i_inode.i_flags & I_NEW;
+ bool is_new = ip->i_inode.i_state & I_NEW;
if (unlikely(ip->i_no_addr != be64_to_cpu(str->di_num.no_addr)))
goto corrupt;
}
/**
- * ail_drain - drain the ail lists after a withdraw
+ * gfs2_ail_drain - drain the ail lists after a withdraw
* @sdp: Pointer to GFS2 superblock
*/
-static void ail_drain(struct gfs2_sbd *sdp)
+void gfs2_ail_drain(struct gfs2_sbd *sdp)
{
struct gfs2_trans *tr;
list_del(&tr->tr_list);
gfs2_trans_free(sdp, tr);
}
+ gfs2_drain_revokes(sdp);
spin_unlock(&sdp->sd_ail_lock);
}
if (tr && list_empty(&tr->tr_list))
list_add(&tr->tr_list, &sdp->sd_ail1_list);
spin_unlock(&sdp->sd_ail_lock);
- ail_drain(sdp); /* frees all transactions */
tr = NULL;
goto out_end;
}
extern void gfs2_add_revoke(struct gfs2_sbd *sdp, struct gfs2_bufdata *bd);
extern void gfs2_glock_remove_revoke(struct gfs2_glock *gl);
extern void gfs2_flush_revokes(struct gfs2_sbd *sdp);
+extern void gfs2_ail_drain(struct gfs2_sbd *sdp);
#endif /* __LOG_DOT_H__ */
gfs2_log_write_page(sdp, page);
}
-static void revoke_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+void gfs2_drain_revokes(struct gfs2_sbd *sdp)
{
struct list_head *head = &sdp->sd_log_revokes;
struct gfs2_bufdata *bd;
}
}
+static void revoke_lo_after_commit(struct gfs2_sbd *sdp, struct gfs2_trans *tr)
+{
+ gfs2_drain_revokes(sdp);
+}
+
static void revoke_lo_before_scan(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, int pass)
{
extern void gfs2_pin(struct gfs2_sbd *sdp, struct buffer_head *bh);
extern int gfs2_find_jhead(struct gfs2_jdesc *jd,
struct gfs2_log_header_host *head, bool keep_cache);
+extern void gfs2_drain_revokes(struct gfs2_sbd *sdp);
static inline unsigned int buf_limit(struct gfs2_sbd *sdp)
{
return sdp->sd_ldptrs;
if (test_bit(SDF_NORECOVERY, &sdp->sd_flags) || !sdp->sd_jdesc)
return;
+ gfs2_ail_drain(sdp); /* frees all transactions */
inode = sdp->sd_jdesc->jd_inode;
ip = GFS2_I(inode);
i_gl = ip->i_gl;
return cwd->wqe->wq == data;
}
+void io_wq_exit_start(struct io_wq *wq)
+{
+ set_bit(IO_WQ_BIT_EXIT, &wq->state);
+}
+
static void io_wq_exit_workers(struct io_wq *wq)
{
struct callback_head *cb;
int node;
- set_bit(IO_WQ_BIT_EXIT, &wq->state);
-
if (!wq->task)
return;
struct io_wqe *wqe = wq->wqes[node];
io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
- spin_lock_irq(&wq->hash->wait.lock);
- list_del_init(&wq->wqes[node]->wait.entry);
- spin_unlock_irq(&wq->hash->wait.lock);
}
rcu_read_unlock();
io_worker_ref_put(wq);
wait_for_completion(&wq->worker_done);
+
+ for_each_node(node) {
+ spin_lock_irq(&wq->hash->wait.lock);
+ list_del_init(&wq->wqes[node]->wait.entry);
+ spin_unlock_irq(&wq->hash->wait.lock);
+ }
put_task_struct(wq->task);
wq->task = NULL;
}
cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
- io_wq_exit_workers(wq);
-
for_each_node(node) {
struct io_wqe *wqe = wq->wqes[node];
struct io_cb_cancel_data match = {
kfree(wq);
}
-void io_wq_put(struct io_wq *wq)
-{
- if (refcount_dec_and_test(&wq->refs))
- io_wq_destroy(wq);
-}
-
void io_wq_put_and_exit(struct io_wq *wq)
{
+ WARN_ON_ONCE(!test_bit(IO_WQ_BIT_EXIT, &wq->state));
+
io_wq_exit_workers(wq);
- io_wq_put(wq);
+ if (refcount_dec_and_test(&wq->refs))
+ io_wq_destroy(wq);
}
static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
};
struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data);
-void io_wq_put(struct io_wq *wq);
+void io_wq_exit_start(struct io_wq *wq);
void io_wq_put_and_exit(struct io_wq *wq);
void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work);
{
int i, ret;
+ imu->acct_pages = 0;
for (i = 0; i < nr_pages; i++) {
if (!PageCompound(pages[i])) {
imu->acct_pages++;
struct io_tctx_node *node;
unsigned long index;
- tctx->io_wq = NULL;
xa_for_each(&tctx->xa, index, node)
io_uring_del_task_file(index);
- if (wq)
+ if (wq) {
+ /*
+ * Must be after io_uring_del_task_file() (removes nodes under
+ * uring_lock) to avoid race with io_uring_try_cancel_iowq().
+ */
+ tctx->io_wq = NULL;
io_wq_put_and_exit(wq);
+ }
}
static s64 tctx_inflight(struct io_uring_task *tctx, bool tracked)
if (!current->io_uring)
return;
+ if (tctx->io_wq)
+ io_wq_exit_start(tctx->io_wq);
+
WARN_ON_ONCE(!sqd || sqd->thread != current);
atomic_inc(&tctx->in_idle);
DEFINE_WAIT(wait);
s64 inflight;
+ if (tctx->io_wq)
+ io_wq_exit_start(tctx->io_wq);
+
/* make sure overflow events are dropped */
atomic_inc(&tctx->in_idle);
do {
# SPDX-License-Identifier: GPL-2.0-only
config NETFS_SUPPORT
- tristate "Support for network filesystem high-level I/O"
+ tristate
help
This option enables support for network filesystems, including
helpers for high-level buffered I/O, abstracting out read
DEFINE_READAHEAD(ractl, file, NULL, mapping, index);
retry:
- page = grab_cache_page_write_begin(mapping, index, 0);
+ page = grab_cache_page_write_begin(mapping, index, flags);
if (!page)
return -ENOMEM;
if (unlikely(!p))
goto out_err;
fl->fh_array[i]->size = be32_to_cpup(p++);
- if (sizeof(struct nfs_fh) < fl->fh_array[i]->size) {
+ if (fl->fh_array[i]->size > NFS_MAXFHSIZE) {
printk(KERN_ERR "NFS: Too big fh %d received %d\n",
i, fl->fh_array[i]->size);
goto out_err;
.set = param_set_nfs_timeout,
.get = param_get_nfs_timeout,
};
-#define param_check_nfs_timeout(name, p) __param_check(name, p, int);
+#define param_check_nfs_timeout(name, p) __param_check(name, p, int)
module_param(nfs_mountpoint_expiry_timeout, nfs_timeout, 0644);
MODULE_PARM_DESC(nfs_mountpoint_expiry_timeout,
case SEEK_HOLE:
case SEEK_DATA:
ret = nfs42_proc_llseek(filep, offset, whence);
- if (ret != -ENOTSUPP)
+ if (ret != -EOPNOTSUPP)
return ret;
fallthrough;
default:
rcu_read_unlock();
trace_nfs4_open_stateid_update_wait(state->inode, stateid, 0);
- if (!signal_pending(current)) {
+ if (!fatal_signal_pending(current)) {
if (schedule_timeout(5*HZ) == 0)
status = -EAGAIN;
else
write_sequnlock(&state->seqlock);
trace_nfs4_close_stateid_update_wait(state->inode, dst, 0);
- if (signal_pending(current))
+ if (fatal_signal_pending(current))
status = -EINTR;
else
if (schedule_timeout(5*HZ) != 0)
struct nfs_page *prev = NULL;
unsigned int size;
- if (mirror->pg_count != 0) {
- prev = nfs_list_entry(mirror->pg_list.prev);
- } else {
+ if (list_empty(&mirror->pg_list)) {
if (desc->pg_ops->pg_init)
desc->pg_ops->pg_init(desc, req);
if (desc->pg_error < 0)
return 0;
mirror->pg_base = req->wb_pgbase;
- }
+ mirror->pg_count = 0;
+ mirror->pg_recoalesce = 0;
+ } else
+ prev = nfs_list_entry(mirror->pg_list.prev);
if (desc->pg_maxretrans && req->wb_nio > desc->pg_maxretrans) {
if (NFS_SERVER(desc->pg_inode)->flags & NFS_MOUNT_SOFTERR)
{
struct nfs_pgio_mirror *mirror = nfs_pgio_current_mirror(desc);
-
if (!list_empty(&mirror->pg_list)) {
int error = desc->pg_ops->pg_doio(desc);
if (error < 0)
desc->pg_error = error;
- else
+ if (list_empty(&mirror->pg_list))
mirror->pg_bytes_written += mirror->pg_count;
}
- if (list_empty(&mirror->pg_list)) {
- mirror->pg_count = 0;
- mirror->pg_base = 0;
- }
}
static void
do {
list_splice_init(&mirror->pg_list, &head);
- mirror->pg_bytes_written -= mirror->pg_count;
- mirror->pg_count = 0;
- mirror->pg_base = 0;
- mirror->pg_recoalesce = 0;
while (!list_empty(&head)) {
struct nfs_page *req;
{
struct pnfs_layout_hdr *lo = NULL;
struct nfs_inode *nfsi = NFS_I(ino);
+ struct pnfs_layout_range range = {
+ .iomode = IOMODE_ANY,
+ .offset = 0,
+ .length = NFS4_MAX_UINT64,
+ };
LIST_HEAD(tmp_list);
const struct cred *cred;
nfs4_stateid stateid;
}
valid_layout = pnfs_layout_is_valid(lo);
pnfs_clear_layoutcommit(ino, &tmp_list);
- pnfs_mark_matching_lsegs_return(lo, &tmp_list, NULL, 0);
+ pnfs_mark_matching_lsegs_return(lo, &tmp_list, &range, 0);
- if (NFS_SERVER(ino)->pnfs_curr_ld->return_range) {
- struct pnfs_layout_range range = {
- .iomode = IOMODE_ANY,
- .offset = 0,
- .length = NFS4_MAX_UINT64,
- };
+ if (NFS_SERVER(ino)->pnfs_curr_ld->return_range)
NFS_SERVER(ino)->pnfs_curr_ld->return_range(lo, &range);
- }
/* Don't send a LAYOUTRETURN if list was initially empty */
if (!test_bit(NFS_LAYOUT_RETURN_REQUESTED, &lo->plh_flags) ||
void
pnfs_generic_pg_init_read(struct nfs_pageio_descriptor *pgio, struct nfs_page *req)
{
- u64 rd_size = req->wb_bytes;
+ u64 rd_size;
pnfs_generic_pg_check_layout(pgio);
pnfs_generic_pg_check_range(pgio, req);
.set = param_set_portnr,
.get = param_get_uint,
};
-#define param_check_portnr(name, p) __param_check(name, p, unsigned int);
+#define param_check_portnr(name, p) __param_check(name, p, unsigned int)
module_param_named(callback_tcpport, nfs_callback_set_tcpport, portnr, 0644);
module_param_named(callback_nr_threads, nfs_callback_nr_threads, ushort, 0644);
* events generated by the listener process itself, without disclosing
* the pids of other processes.
*/
- if (!capable(CAP_SYS_ADMIN) &&
+ if (FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
task_tgid(current) != event->pid)
metadata.pid = 0;
- if (path && path->mnt && path->dentry) {
+ /*
+ * For now, fid mode is required for an unprivileged listener and
+ * fid mode does not report fd in events. Keep this check anyway
+ * for safety in case fid mode requirement is relaxed in the future
+ * to allow unprivileged listener to get events with no fd and no fid.
+ */
+ if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
+ path && path->mnt && path->dentry) {
fd = create_fd(group, path, &f);
if (fd < 0)
return fd;
int f_flags, fd;
unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
unsigned int class = flags & FANOTIFY_CLASS_BITS;
+ unsigned int internal_flags = 0;
pr_debug("%s: flags=%x event_f_flags=%x\n",
__func__, flags, event_f_flags);
*/
if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
return -EPERM;
+
+ /*
+ * Setting the internal flag FANOTIFY_UNPRIV on the group
+ * prevents setting mount/filesystem marks on this group and
+ * prevents reporting pid and open fd in events.
+ */
+ internal_flags |= FANOTIFY_UNPRIV;
}
#ifdef CONFIG_AUDITSYSCALL
goto out_destroy_group;
}
- group->fanotify_data.flags = flags;
+ group->fanotify_data.flags = flags | internal_flags;
group->memcg = get_mem_cgroup_from_mm(current->mm);
group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
group = f.file->private_data;
/*
- * An unprivileged user is not allowed to watch a mount point nor
- * a filesystem.
+ * An unprivileged user is not allowed to setup mount nor filesystem
+ * marks. This also includes setting up such marks by a group that
+ * was initialized by an unprivileged user.
*/
ret = -EPERM;
- if (!capable(CAP_SYS_ADMIN) &&
+ if ((!capable(CAP_SYS_ADMIN) ||
+ FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
mark_type != FAN_MARK_INODE)
goto fput_and_out;
max_marks = clamp(max_marks, FANOTIFY_OLD_DEFAULT_MAX_MARKS,
FANOTIFY_DEFAULT_MAX_USER_MARKS);
+ BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_GROUP_FLAGS);
BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 10);
BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 9);
struct fsnotify_group *group = f->private_data;
seq_printf(m, "fanotify flags:%x event-flags:%x\n",
- group->fanotify_data.flags,
+ group->fanotify_data.flags & FANOTIFY_INIT_FLAGS,
group->fanotify_data.f_flags);
show_fdinfo(m, f, fanotify_fdinfo);
}
/*
+ * zero out partial blocks of one cluster.
+ *
+ * start: file offset where zero starts, will be made upper block aligned.
+ * len: it will be trimmed to the end of current cluster if "start + len"
+ * is bigger than it.
+ */
+static int ocfs2_zeroout_partial_cluster(struct inode *inode,
+ u64 start, u64 len)
+{
+ int ret;
+ u64 start_block, end_block, nr_blocks;
+ u64 p_block, offset;
+ u32 cluster, p_cluster, nr_clusters;
+ struct super_block *sb = inode->i_sb;
+ u64 end = ocfs2_align_bytes_to_clusters(sb, start);
+
+ if (start + len < end)
+ end = start + len;
+
+ start_block = ocfs2_blocks_for_bytes(sb, start);
+ end_block = ocfs2_blocks_for_bytes(sb, end);
+ nr_blocks = end_block - start_block;
+ if (!nr_blocks)
+ return 0;
+
+ cluster = ocfs2_bytes_to_clusters(sb, start);
+ ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
+ &nr_clusters, NULL);
+ if (ret)
+ return ret;
+ if (!p_cluster)
+ return 0;
+
+ offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
+ p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
+ return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
+}
+
+/*
* Parts of this function taken from xfs_change_file_space()
*/
static int __ocfs2_change_file_space(struct file *file, struct inode *inode,
{
int ret;
s64 llen;
- loff_t size;
+ loff_t size, orig_isize;
struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
struct buffer_head *di_bh = NULL;
handle_t *handle;
goto out_inode_unlock;
}
+ orig_isize = i_size_read(inode);
switch (sr->l_whence) {
case 0: /*SEEK_SET*/
break;
sr->l_start += f_pos;
break;
case 2: /*SEEK_END*/
- sr->l_start += i_size_read(inode);
+ sr->l_start += orig_isize;
break;
default:
ret = -EINVAL;
default:
ret = -EINVAL;
}
+
+ /* zeroout eof blocks in the cluster. */
+ if (!ret && change_size && orig_isize < size) {
+ ret = ocfs2_zeroout_partial_cluster(inode, orig_isize,
+ size - orig_isize);
+ if (!ret)
+ i_size_write(inode, size);
+ }
up_write(&OCFS2_I(inode)->ip_alloc_sem);
if (ret) {
mlog_errno(ret);
goto out_inode_unlock;
}
- if (change_size && i_size_read(inode) < size)
- i_size_write(inode, size);
-
inode->i_ctime = inode->i_mtime = current_time(inode);
ret = ocfs2_mark_inode_dirty(handle, inode, di_bh);
if (ret < 0)
void *page;
int rv;
+ /* A task may only write when it was the opener. */
+ if (file->f_cred != current_real_cred())
+ return -EPERM;
+
rcu_read_lock();
task = pid_task(proc_pid(inode), PIDTYPE_PID);
if (!task) {
error2 = xfs_alloc_pagf_init(mp, tp, pag->pag_agno, 0);
if (error2)
return error2;
- ASSERT(xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
- xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved <=
- pag->pagf_freeblks + pag->pagf_flcount);
+
+ /*
+ * If there isn't enough space in the AG to satisfy the
+ * reservation, let the caller know that there wasn't enough
+ * space. Callers are responsible for deciding what to do
+ * next, since (in theory) we can stumble along with
+ * insufficient reservation if data blocks are being freed to
+ * replenish the AG's free space.
+ */
+ if (!error &&
+ xfs_perag_resv(pag, XFS_AG_RESV_METADATA)->ar_reserved +
+ xfs_perag_resv(pag, XFS_AG_RESV_RMAPBT)->ar_reserved >
+ pag->pagf_freeblks + pag->pagf_flcount)
+ error = -ENOSPC;
}
+
return error;
}
ASSERT(cur);
ASSERT(whichfork != XFS_COW_FORK);
- ASSERT(!xfs_need_iread_extents(ifp));
ASSERT(ifp->if_format == XFS_DINODE_FMT_BTREE);
ASSERT(be16_to_cpu(rblock->bb_level) == 1);
ASSERT(be16_to_cpu(rblock->bb_numrecs) == 1);
xfs_fsblock_t sum;
xfs_filblks_t len = *rlen; /* length to unmap in file */
xfs_fileoff_t max_len;
- xfs_agnumber_t prev_agno = NULLAGNUMBER, agno;
xfs_fileoff_t end;
struct xfs_iext_cursor icur;
bool done = false;
del = got;
wasdel = isnullstartblock(del.br_startblock);
- /*
- * Make sure we don't touch multiple AGF headers out of order
- * in a single transaction, as that could cause AB-BA deadlocks.
- */
- if (!wasdel && !isrt) {
- agno = XFS_FSB_TO_AGNO(mp, del.br_startblock);
- if (prev_agno != NULLAGNUMBER && prev_agno > agno)
- break;
- prev_agno = agno;
- }
if (got.br_startoff < start) {
del.br_startoff = start;
del.br_blockcount -= start - got.br_startoff;
/*
* Validate di_extsize hint.
*
- * The rules are documented at xfs_ioctl_setattr_check_extsize().
- * These functions must be kept in sync with each other.
+ * 1. Extent size hint is only valid for directories and regular files.
+ * 2. FS_XFLAG_EXTSIZE is only valid for regular files.
+ * 3. FS_XFLAG_EXTSZINHERIT is only valid for directories.
+ * 4. Hint cannot be larger than MAXTEXTLEN.
+ * 5. Can be changed on directories at any time.
+ * 6. Hint value of 0 turns off hints, clears inode flags.
+ * 7. Extent size must be a multiple of the appropriate block size.
+ * For realtime files, this is the rt extent size.
+ * 8. For non-realtime files, the extent size hint must be limited
+ * to half the AG size to avoid alignment extending the extent beyond the
+ * limits of the AG.
*/
xfs_failaddr_t
xfs_inode_validate_extsize(
inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
extsize_bytes = XFS_FSB_TO_B(mp, extsize);
+ /*
+ * This comment describes a historic gap in this verifier function.
+ *
+ * On older kernels, the extent size hint verifier doesn't check that
+ * the extent size hint is an integer multiple of the realtime extent
+ * size on a directory with both RTINHERIT and EXTSZINHERIT flags set.
+ * The verifier has always enforced the alignment rule for regular
+ * files with the REALTIME flag set.
+ *
+ * If a directory with a misaligned extent size hint is allowed to
+ * propagate that hint into a new regular realtime file, the result
+ * is that the inode cluster buffer verifier will trigger a corruption
+ * shutdown the next time it is run.
+ *
+ * Unfortunately, there could be filesystems with these misconfigured
+ * directories in the wild, so we cannot add a check to this verifier
+ * at this time because that will result a new source of directory
+ * corruption errors when reading an existing filesystem. Instead, we
+ * permit the misconfiguration to pass through the verifiers so that
+ * callers of this function can correct and mitigate externally.
+ */
+
if (rt_flag)
blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
else
/*
* Validate di_cowextsize hint.
*
- * The rules are documented at xfs_ioctl_setattr_check_cowextsize().
- * These functions must be kept in sync with each other.
+ * 1. CoW extent size hint can only be set if reflink is enabled on the fs.
+ * The inode does not have to have any shared blocks, but it must be a v3.
+ * 2. FS_XFLAG_COWEXTSIZE is only valid for directories and regular files;
+ * for a directory, the hint is propagated to new files.
+ * 3. Can be changed on files & directories at any time.
+ * 4. Hint value of 0 turns off hints, clears inode flags.
+ * 5. Extent size must be a multiple of the appropriate block size.
+ * 6. The extent size hint must be limited to half the AG size to avoid
+ * alignment extending the extent beyond the limits of the AG.
*/
xfs_failaddr_t
xfs_inode_validate_cowextsize(
}
/*
+ * Inode verifiers on older kernels don't check that the extent size
+ * hint is an integer multiple of the rt extent size on a directory
+ * with both rtinherit and extszinherit flags set. If we're logging a
+ * directory that is misconfigured in this way, clear the hint.
+ */
+ if ((ip->i_diflags & XFS_DIFLAG_RTINHERIT) &&
+ (ip->i_diflags & XFS_DIFLAG_EXTSZINHERIT) &&
+ (ip->i_extsize % ip->i_mount->m_sb.sb_rextsize) > 0) {
+ xfs_info_once(ip->i_mount,
+ "Correcting misaligned extent size hint in inode 0x%llx.", ip->i_ino);
+ ip->i_diflags &= ~(XFS_DIFLAG_EXTSIZE |
+ XFS_DIFLAG_EXTSZINHERIT);
+ ip->i_extsize = 0;
+ flags |= XFS_ILOG_CORE;
+ }
+
+ /*
* Record the specific change for fdatasync optimisation. This allows
* fdatasync to skip log forces for inodes that are only timestamp
* dirty.
const struct xfs_inode *pip)
{
unsigned int di_flags = 0;
+ xfs_failaddr_t failaddr;
umode_t mode = VFS_I(ip)->i_mode;
if (S_ISDIR(mode)) {
di_flags |= XFS_DIFLAG_FILESTREAM;
ip->i_diflags |= di_flags;
+
+ /*
+ * Inode verifiers on older kernels only check that the extent size
+ * hint is an integer multiple of the rt extent size on realtime files.
+ * They did not check the hint alignment on a directory with both
+ * rtinherit and extszinherit flags set. If the misaligned hint is
+ * propagated from a directory into a new realtime file, new file
+ * allocations will fail due to math errors in the rt allocator and/or
+ * trip the verifiers. Validate the hint settings in the new file so
+ * that we don't let broken hints propagate.
+ */
+ failaddr = xfs_inode_validate_extsize(ip->i_mount, ip->i_extsize,
+ VFS_I(ip)->i_mode, ip->i_diflags);
+ if (failaddr) {
+ ip->i_diflags &= ~(XFS_DIFLAG_EXTSIZE |
+ XFS_DIFLAG_EXTSZINHERIT);
+ ip->i_extsize = 0;
+ }
}
/* Propagate di_flags2 from a parent inode to a child inode. */
struct xfs_inode *ip,
const struct xfs_inode *pip)
{
+ xfs_failaddr_t failaddr;
+
if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
ip->i_cowextsize = pip->i_cowextsize;
}
if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
ip->i_diflags2 |= XFS_DIFLAG2_DAX;
+
+ /* Don't let invalid cowextsize hints propagate. */
+ failaddr = xfs_inode_validate_cowextsize(ip->i_mount, ip->i_cowextsize,
+ VFS_I(ip)->i_mode, ip->i_diflags, ip->i_diflags2);
+ if (failaddr) {
+ ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
+ ip->i_cowextsize = 0;
+ }
}
/*
}
/*
- * extent size hint validation is somewhat cumbersome. Rules are:
- *
- * 1. extent size hint is only valid for directories and regular files
- * 2. FS_XFLAG_EXTSIZE is only valid for regular files
- * 3. FS_XFLAG_EXTSZINHERIT is only valid for directories.
- * 4. can only be changed on regular files if no extents are allocated
- * 5. can be changed on directories at any time
- * 6. extsize hint of 0 turns off hints, clears inode flags.
- * 7. Extent size must be a multiple of the appropriate block size.
- * 8. for non-realtime files, the extent size hint must be limited
- * to half the AG size to avoid alignment extending the extent beyond the
- * limits of the AG.
- *
- * Please keep this function in sync with xfs_scrub_inode_extsize.
+ * Validate a proposed extent size hint. For regular files, the hint can only
+ * be changed if no extents are allocated.
*/
static int
xfs_ioctl_setattr_check_extsize(
struct fileattr *fa)
{
struct xfs_mount *mp = ip->i_mount;
- xfs_extlen_t size;
- xfs_fsblock_t extsize_fsb;
+ xfs_failaddr_t failaddr;
+ uint16_t new_diflags;
if (!fa->fsx_valid)
return 0;
if (S_ISREG(VFS_I(ip)->i_mode) && ip->i_df.if_nextents &&
- ((ip->i_extsize << mp->m_sb.sb_blocklog) != fa->fsx_extsize))
+ XFS_FSB_TO_B(mp, ip->i_extsize) != fa->fsx_extsize)
return -EINVAL;
- if (fa->fsx_extsize == 0)
- return 0;
-
- extsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_extsize);
- if (extsize_fsb > MAXEXTLEN)
+ if (fa->fsx_extsize & mp->m_blockmask)
return -EINVAL;
- if (XFS_IS_REALTIME_INODE(ip) ||
- (fa->fsx_xflags & FS_XFLAG_REALTIME)) {
- size = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
- } else {
- size = mp->m_sb.sb_blocksize;
- if (extsize_fsb > mp->m_sb.sb_agblocks / 2)
+ new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
+
+ /*
+ * Inode verifiers on older kernels don't check that the extent size
+ * hint is an integer multiple of the rt extent size on a directory
+ * with both rtinherit and extszinherit flags set. Don't let sysadmins
+ * misconfigure directories.
+ */
+ if ((new_diflags & XFS_DIFLAG_RTINHERIT) &&
+ (new_diflags & XFS_DIFLAG_EXTSZINHERIT)) {
+ unsigned int rtextsize_bytes;
+
+ rtextsize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
+ if (fa->fsx_extsize % rtextsize_bytes)
return -EINVAL;
}
- if (fa->fsx_extsize % size)
- return -EINVAL;
-
- return 0;
+ failaddr = xfs_inode_validate_extsize(ip->i_mount,
+ XFS_B_TO_FSB(mp, fa->fsx_extsize),
+ VFS_I(ip)->i_mode, new_diflags);
+ return failaddr != NULL ? -EINVAL : 0;
}
-/*
- * CoW extent size hint validation rules are:
- *
- * 1. CoW extent size hint can only be set if reflink is enabled on the fs.
- * The inode does not have to have any shared blocks, but it must be a v3.
- * 2. FS_XFLAG_COWEXTSIZE is only valid for directories and regular files;
- * for a directory, the hint is propagated to new files.
- * 3. Can be changed on files & directories at any time.
- * 4. CoW extsize hint of 0 turns off hints, clears inode flags.
- * 5. Extent size must be a multiple of the appropriate block size.
- * 6. The extent size hint must be limited to half the AG size to avoid
- * alignment extending the extent beyond the limits of the AG.
- *
- * Please keep this function in sync with xfs_scrub_inode_cowextsize.
- */
static int
xfs_ioctl_setattr_check_cowextsize(
struct xfs_inode *ip,
struct fileattr *fa)
{
struct xfs_mount *mp = ip->i_mount;
- xfs_extlen_t size;
- xfs_fsblock_t cowextsize_fsb;
+ xfs_failaddr_t failaddr;
+ uint64_t new_diflags2;
+ uint16_t new_diflags;
if (!fa->fsx_valid)
return 0;
- if (!(fa->fsx_xflags & FS_XFLAG_COWEXTSIZE))
- return 0;
-
- if (!xfs_sb_version_hasreflink(&ip->i_mount->m_sb))
+ if (fa->fsx_cowextsize & mp->m_blockmask)
return -EINVAL;
- if (fa->fsx_cowextsize == 0)
- return 0;
+ new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
+ new_diflags2 = xfs_flags2diflags2(ip, fa->fsx_xflags);
- cowextsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_cowextsize);
- if (cowextsize_fsb > MAXEXTLEN)
- return -EINVAL;
-
- size = mp->m_sb.sb_blocksize;
- if (cowextsize_fsb > mp->m_sb.sb_agblocks / 2)
- return -EINVAL;
-
- if (fa->fsx_cowextsize % size)
- return -EINVAL;
-
- return 0;
+ failaddr = xfs_inode_validate_cowextsize(ip->i_mount,
+ XFS_B_TO_FSB(mp, fa->fsx_cowextsize),
+ VFS_I(ip)->i_mode, new_diflags, new_diflags2);
+ return failaddr != NULL ? -EINVAL : 0;
}
static int
xfs_printk_once(xfs_warn, dev, fmt, ##__VA_ARGS__)
#define xfs_notice_once(dev, fmt, ...) \
xfs_printk_once(xfs_notice, dev, fmt, ##__VA_ARGS__)
+#define xfs_info_once(dev, fmt, ...) \
+ xfs_printk_once(xfs_info, dev, fmt, ##__VA_ARGS__)
void assfail(struct xfs_mount *mp, char *expr, char *f, int l);
void asswarn(struct xfs_mount *mp, char *expr, char *f, int l);
struct virtchnl_proto_hdrs {
u8 tunnel_level;
+ u8 pad[3];
/**
* specify where protocol header start from.
* 0 - from the outer layer
struct list_head task_iters;
/*
- * On the default hierarhcy, ->subsys[ssid] may point to a css
+ * On the default hierarchy, ->subsys[ssid] may point to a css
* attached to an ancestor instead of the cgroup this css_set is
* associated with. The following node is anchored at
* ->subsys[ssid]->cgroup->e_csets[ssid] and provides a way to
*/
bool threaded:1;
- /* the following two fields are initialized automtically during boot */
+ /* the following two fields are initialized automatically during boot */
int id;
const char *name;
* sock_cgroup_data overloads (prioidx, classid) and the cgroup pointer.
* On boot, sock_cgroup_data records the cgroup that the sock was created
* in so that cgroup2 matches can be made; however, once either net_prio or
- * net_cls starts being used, the area is overriden to carry prioidx and/or
+ * net_cls starts being used, the area is overridden to carry prioidx and/or
* classid. The two modes are distinguished by whether the lowest bit is
* set. Clear bit indicates cgroup pointer while set bit prioidx and
* classid.
#ifdef CONFIG_CGROUPS
/*
- * All weight knobs on the default hierarhcy should use the following min,
+ * All weight knobs on the default hierarchy should use the following min,
* default and max values. The default value is the logarithmic center of
* MIN and MAX and allows 100x to be expressed in both directions.
*/
* @flags: Link flags.
* @rpm_active: Whether or not the consumer device is runtime-PM-active.
* @kref: Count repeated addition of the same link.
- * @rcu_head: An RCU head to use for deferred execution of SRCU callbacks.
+ * @rm_work: Work structure used for removing the link.
* @supplier_preactivated: Supplier has been made active before consumer probe.
*/
struct device_link {
u32 flags;
refcount_t rpm_active;
struct kref kref;
-#ifdef CONFIG_SRCU
- struct rcu_head rcu_head;
-#endif
+ struct work_struct rm_work;
bool supplier_preactivated; /* Owned by consumer probe. */
};
#define FANOTIFY_INIT_FLAGS (FANOTIFY_ADMIN_INIT_FLAGS | \
FANOTIFY_USER_INIT_FLAGS)
+/* Internal group flags */
+#define FANOTIFY_UNPRIV 0x80000000
+#define FANOTIFY_INTERNAL_GROUP_FLAGS (FANOTIFY_UNPRIV)
+
#define FANOTIFY_MARK_TYPE_BITS (FAN_MARK_INODE | FAN_MARK_MOUNT | \
FAN_MARK_FILESYSTEM)
/* drivers/video/fb_defio.c */
int fb_deferred_io_mmap(struct fb_info *info, struct vm_area_struct *vma);
extern void fb_deferred_io_init(struct fb_info *info);
+extern void fb_deferred_io_open(struct fb_info *info,
+ struct inode *inode,
+ struct file *file);
extern void fb_deferred_io_cleanup(struct fb_info *info);
extern int fb_deferred_io_fsync(struct file *file, loff_t start,
loff_t end, int datasync);
*/
static inline u32 hid_report_len(struct hid_report *report)
{
- /* equivalent to DIV_ROUND_UP(report->size, 8) + !!(report->id > 0) */
- return ((report->size - 1) >> 3) + 1 + (report->id > 0);
+ return DIV_ROUND_UP(report->size, 8) + (report->id > 0);
}
int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,
int host1x_device_init(struct host1x_device *device);
int host1x_device_exit(struct host1x_device *device);
-int __host1x_client_register(struct host1x_client *client,
- struct lock_class_key *key);
-#define host1x_client_register(class) \
- ({ \
- static struct lock_class_key __key; \
- __host1x_client_register(class, &__key); \
+void __host1x_client_init(struct host1x_client *client, struct lock_class_key *key);
+void host1x_client_exit(struct host1x_client *client);
+
+#define host1x_client_init(client) \
+ ({ \
+ static struct lock_class_key __key; \
+ __host1x_client_init(client, &__key); \
+ })
+
+int __host1x_client_register(struct host1x_client *client);
+
+/*
+ * Note that this wrapper calls __host1x_client_init() for compatibility
+ * with existing callers. Callers that want to separately initialize and
+ * register a host1x client must first initialize using either of the
+ * __host1x_client_init() or host1x_client_init() functions and then use
+ * the low-level __host1x_client_register() function to avoid the client
+ * getting reinitialized.
+ */
+#define host1x_client_register(client) \
+ ({ \
+ static struct lock_class_key __key; \
+ __host1x_client_init(client, &__key); \
+ __host1x_client_register(client); \
})
int host1x_client_unregister(struct host1x_client *client);
asm(".section \"" __sec "\", \"a\" \n" \
__stringify(__name) ": \n" \
".long " __stringify(__stub) " - . \n" \
- ".previous \n");
+ ".previous \n"); \
+ static_assert(__same_type(initcall_t, &fn));
#else
#define ____define_initcall(fn, __unused, __name, __sec) \
static initcall_t __name __used \
#include <linux/spinlock.h>
#include <linux/signal.h>
#include <linux/sched.h>
+#include <linux/sched/stat.h>
#include <linux/bug.h>
#include <linux/minmax.h>
#include <linux/mm.h>
*/
#define KVM_REQ_TLB_FLUSH (0 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
#define KVM_REQ_MMU_RELOAD (1 | KVM_REQUEST_WAIT | KVM_REQUEST_NO_WAKEUP)
-#define KVM_REQ_PENDING_TIMER 2
+#define KVM_REQ_UNBLOCK 2
#define KVM_REQ_UNHALT 3
#define KVM_REQUEST_ARCH_BASE 8
return !!map->hva;
}
+static inline bool kvm_vcpu_can_poll(ktime_t cur, ktime_t stop)
+{
+ return single_task_running() && !need_resched() && ktime_before(cur, stop);
+}
+
/*
* Sometimes a large or cross-page mmio needs to be broken up into separate
* exits for userspace servicing.
#define MLX5_LOG_SW_ICM_BLOCK_SIZE(dev) (MLX5_CAP_DEV_MEM(dev, log_sw_icm_alloc_granularity))
#define MLX5_SW_ICM_BLOCK_SIZE(dev) (1 << MLX5_LOG_SW_ICM_BLOCK_SIZE(dev))
+enum {
+ MLX5_PROF_MASK_QP_SIZE = (u64)1 << 0,
+ MLX5_PROF_MASK_MR_CACHE = (u64)1 << 1,
+};
+
+enum {
+ MR_CACHE_LAST_STD_ENTRY = 20,
+ MLX5_IMR_MTT_CACHE_ENTRY,
+ MLX5_IMR_KSM_CACHE_ENTRY,
+ MAX_MR_CACHE_ENTRIES
+};
+
+struct mlx5_profile {
+ u64 mask;
+ u8 log_max_qp;
+ struct {
+ int size;
+ int limit;
+ } mr_cache[MAX_MR_CACHE_ENTRIES];
+};
+
struct mlx5_core_dev {
struct device *device;
enum mlx5_coredev_type coredev_type;
struct mutex intf_state_mutex;
unsigned long intf_state;
struct mlx5_priv priv;
- struct mlx5_profile *profile;
+ struct mlx5_profile profile;
u32 issi;
struct mlx5e_resources mlx5e_res;
struct mlx5_dm *dm;
return mkey & 0xff;
}
-enum {
- MLX5_PROF_MASK_QP_SIZE = (u64)1 << 0,
- MLX5_PROF_MASK_MR_CACHE = (u64)1 << 1,
-};
-
-enum {
- MR_CACHE_LAST_STD_ENTRY = 20,
- MLX5_IMR_MTT_CACHE_ENTRY,
- MLX5_IMR_KSM_CACHE_ENTRY,
- MAX_MR_CACHE_ENTRIES
-};
-
/* Async-atomic event notifier used by mlx5 core to forward FW
* evetns recived from event queue to mlx5 consumers.
* Optimise event queue dipatching.
struct ib_device *device,
struct rdma_netdev_alloc_params *params);
-struct mlx5_profile {
- u64 mask;
- u8 log_max_qp;
- struct {
- int size;
- int limit;
- } mr_cache[MAX_MR_CACHE_ENTRIES];
-};
-
enum {
MLX5_PCI_DEV_IS_VF = 1 << 0,
};
#define MLX5_FC_BULK_NUM_FCS(fc_enum) (MLX5_FC_BULK_SIZE_FACTOR * (fc_enum))
+#define MLX5_FT_MAX_MULTIPATH_LEVEL 63
+
enum {
MLX5_STEERING_FORMAT_CONNECTX_5 = 0,
MLX5_STEERING_FORMAT_CONNECTX_6DX = 1,
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+ * Copyright (c) 2021 Mellanox Technologies Ltd.
+ */
+
+#ifndef _MLX5_MPFS_
+#define _MLX5_MPFS_
+
+struct mlx5_core_dev;
+
+#ifdef CONFIG_MLX5_MPFS
+int mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac);
+int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac);
+#else /* #ifndef CONFIG_MLX5_MPFS */
+static inline int mlx5_mpfs_add_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
+static inline int mlx5_mpfs_del_mac(struct mlx5_core_dev *dev, u8 *mac) { return 0; }
+#endif
+
+#endif
struct device_node;
struct irq_domain;
struct irq_domain *pci_host_bridge_of_msi_domain(struct pci_bus *bus);
+bool pci_host_of_has_msi_map(struct device *dev);
/* Arch may override this (weak) */
struct device_node *pcibios_get_phb_of_node(struct pci_bus *bus);
#else /* CONFIG_OF */
static inline struct irq_domain *
pci_host_bridge_of_msi_domain(struct pci_bus *bus) { return NULL; }
+static inline bool pci_host_of_has_msi_map(struct device *dev) { return false; }
#endif /* CONFIG_OF */
static inline struct device_node *
* To be differentiate with macro pte_mkyoung, this macro is used on platforms
* where software maintains page access bit.
*/
+#ifndef pte_sw_mkyoung
+static inline pte_t pte_sw_mkyoung(pte_t pte)
+{
+ return pte;
+}
+#define pte_sw_mkyoung pte_sw_mkyoung
+#endif
+
#ifndef pte_savedwrite
#define pte_savedwrite pte_write
#endif
* @mac_managed_pm: Set true if MAC driver takes of suspending/resuming PHY
* @state: State of the PHY for management purposes
* @dev_flags: Device-specific flags used by the PHY driver.
+ * Bits [15:0] are free to use by the PHY driver to communicate
+ * driver specific behavior.
+ * Bits [23:16] are currently reserved for future use.
+ * Bits [31:24] are reserved for defining generic
+ * PHY driver behavior.
* @irq: IRQ number of the PHY's interrupt (-1 if none)
* @phy_timer: The timer for handling the state machine
* @phylink: Pointer to phylink instance for this PHY
int *cs_gpios;
struct gpio_desc **cs_gpiods;
bool use_gpio_descriptors;
- u8 unused_native_cs;
- u8 max_native_cs;
+ s8 unused_native_cs;
+ s8 max_native_cs;
/* statistics */
struct spi_statistics statistics;
unsigned int num_prealloc,
unsigned int max_req);
void xprt_free(struct rpc_xprt *);
+void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task);
+bool xprt_wake_up_backlog(struct rpc_xprt *xprt, struct rpc_rqst *req);
static inline int
xprt_enable_swap(struct rpc_xprt *xprt)
* The link_support layer is used to add any Link Layer specific
* framing.
*/
-void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev,
+int caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev,
struct cflayer *link_support, int head_room,
struct cflayer **layer, int (**rcv_func)(
struct sk_buff *, struct net_device *,
* @fcs: Specify if checksum is used in CAIF Framing Layer.
* @head_room: Head space needed by link specific protocol.
*/
-void
+int
cfcnfg_add_phy_layer(struct cfcnfg *cnfg,
struct net_device *dev, struct cflayer *phy_layer,
enum cfcnfg_phy_preference pref,
#include <net/caif/caif_layer.h>
struct cflayer *cfserl_create(int instance, bool use_stx);
+void cfserl_release(struct cflayer *layer);
#endif
*/
int ieee80211_data_to_8023_exthdr(struct sk_buff *skb, struct ethhdr *ehdr,
const u8 *addr, enum nl80211_iftype iftype,
- u8 data_offset);
+ u8 data_offset, bool is_amsdu);
/**
* ieee80211_data_to_8023 - convert an 802.11 data frame to 802.3
static inline int ieee80211_data_to_8023(struct sk_buff *skb, const u8 *addr,
enum nl80211_iftype iftype)
{
- return ieee80211_data_to_8023_exthdr(skb, NULL, addr, iftype, 0);
+ return ieee80211_data_to_8023_exthdr(skb, NULL, addr, iftype, 0, false);
}
/**
NF_FLOW_HW,
NF_FLOW_HW_DYING,
NF_FLOW_HW_DEAD,
- NF_FLOW_HW_REFRESH,
NF_FLOW_HW_PENDING,
};
struct nft_trans_table {
bool update;
- u8 state;
- u32 flags;
};
#define nft_trans_table_update(trans) \
(((struct nft_trans_table *)trans->data)->update)
-#define nft_trans_table_state(trans) \
- (((struct nft_trans_table *)trans->data)->state)
-#define nft_trans_table_flags(trans) \
- (((struct nft_trans_table *)trans->data)->flags)
struct nft_trans_elem {
struct nft_set *set;
struct sk_buff **resp);
struct nci_hci_dev *nci_hci_allocate(struct nci_dev *ndev);
+void nci_hci_deallocate(struct nci_dev *ndev);
int nci_hci_send_event(struct nci_dev *ndev, u8 gate, u8 event,
const u8 *param, size_t param_len);
int nci_hci_send_cmd(struct nci_dev *ndev, u8 gate,
cls_common->extack = extack;
}
+#if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
+static inline struct tc_skb_ext *tc_skb_ext_alloc(struct sk_buff *skb)
+{
+ struct tc_skb_ext *tc_skb_ext = skb_ext_add(skb, TC_SKB_EXT);
+
+ if (tc_skb_ext)
+ memset(tc_skb_ext, 0, sizeof(*tc_skb_ext));
+ return tc_skb_ext;
+}
+#endif
+
enum tc_matchall_command {
TC_CLSMATCHALL_REPLACE,
TC_CLSMATCHALL_DESTROY,
static inline void qdisc_run(struct Qdisc *q)
{
if (qdisc_run_begin(q)) {
- /* NOLOCK qdisc must check 'state' under the qdisc seqlock
- * to avoid racing with dev_qdisc_reset()
- */
- if (!(q->flags & TCQ_F_NOLOCK) ||
- likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state)))
- __qdisc_run(q);
+ __qdisc_run(q);
qdisc_run_end(q);
}
}
enum qdisc_state_t {
__QDISC_STATE_SCHED,
__QDISC_STATE_DEACTIVATED,
+ __QDISC_STATE_MISSED,
};
struct qdisc_size_table {
static inline bool qdisc_run_begin(struct Qdisc *qdisc)
{
if (qdisc->flags & TCQ_F_NOLOCK) {
+ if (spin_trylock(&qdisc->seqlock))
+ goto nolock_empty;
+
+ /* If the MISSED flag is set, it means other thread has
+ * set the MISSED flag before second spin_trylock(), so
+ * we can return false here to avoid multi cpus doing
+ * the set_bit() and second spin_trylock() concurrently.
+ */
+ if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
+ return false;
+
+ /* Set the MISSED flag before the second spin_trylock(),
+ * if the second spin_trylock() return false, it means
+ * other cpu holding the lock will do dequeuing for us
+ * or it will see the MISSED flag set after releasing
+ * lock and reschedule the net_tx_action() to do the
+ * dequeuing.
+ */
+ set_bit(__QDISC_STATE_MISSED, &qdisc->state);
+
+ /* Retry again in case other CPU may not see the new flag
+ * after it releases the lock at the end of qdisc_run_end().
+ */
if (!spin_trylock(&qdisc->seqlock))
return false;
+
+nolock_empty:
WRITE_ONCE(qdisc->empty, false);
} else if (qdisc_is_running(qdisc)) {
return false;
static inline void qdisc_run_end(struct Qdisc *qdisc)
{
write_seqcount_end(&qdisc->running);
- if (qdisc->flags & TCQ_F_NOLOCK)
+ if (qdisc->flags & TCQ_F_NOLOCK) {
spin_unlock(&qdisc->seqlock);
+
+ if (unlikely(test_bit(__QDISC_STATE_MISSED,
+ &qdisc->state))) {
+ clear_bit(__QDISC_STATE_MISSED, &qdisc->state);
+ __netif_schedule(qdisc);
+ }
+ }
}
static inline bool qdisc_may_bulk(const struct Qdisc *qdisc)
sk_mem_charge(sk, skb->truesize);
}
-static inline void skb_set_owner_sk_safe(struct sk_buff *skb, struct sock *sk)
+static inline __must_check bool skb_set_owner_sk_safe(struct sk_buff *skb, struct sock *sk)
{
if (sk && refcount_inc_not_zero(&sk->sk_refcnt)) {
skb_orphan(skb);
skb->destructor = sock_efree;
skb->sk = sk;
+ return true;
}
+ return false;
}
void sk_reset_timer(struct sock *sk, struct timer_list *timer,
(sizeof(struct tls_offload_context_tx) + TLS_DRIVER_STATE_SIZE_TX)
enum tls_context_flags {
- TLS_RX_SYNC_RUNNING = 0,
+ /* tls_device_down was called after the netdev went down, device state
+ * was released, and kTLS works in software, even though rx_conf is
+ * still TLS_HW (needed for transition).
+ */
+ TLS_RX_DEV_DEGRADED = 0,
/* Unlike RX where resync is driven entirely by the core in TX only
* the driver knows when things went out of sync, so we need the flag
* to be atomic.
/* cache cold stuff */
struct proto *sk_proto;
+ struct sock *sk;
void (*sk_destruct)(struct sock *sk);
struct sk_buff *
tls_validate_xmit_skb(struct sock *sk, struct net_device *dev,
struct sk_buff *skb);
+struct sk_buff *
+tls_validate_xmit_skb_sw(struct sock *sk, struct net_device *dev,
+ struct sk_buff *skb);
static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
{
#define SND_SOC_DAIFMT_CBP_CFP (1 << 12) /* codec clk provider & frame provider */
#define SND_SOC_DAIFMT_CBC_CFP (2 << 12) /* codec clk consumer & frame provider */
#define SND_SOC_DAIFMT_CBP_CFC (3 << 12) /* codec clk provider & frame consumer */
-#define SND_SOC_DAIFMT_CBC_CFC (4 << 12) /* codec clk consumer & frame follower */
+#define SND_SOC_DAIFMT_CBC_CFC (4 << 12) /* codec clk consumer & frame consumer */
/* previous definitions kept for backwards-compatibility, do not use in new contributions */
#define SND_SOC_DAIFMT_CBM_CFM SND_SOC_DAIFMT_CBP_CFP
#define KEY_VOICECOMMAND 0x246 /* Listening Voice Command */
#define KEY_ASSISTANT 0x247 /* AL Context-aware desktop assistant */
#define KEY_KBD_LAYOUT_NEXT 0x248 /* AC Next Keyboard Layout Select */
+#define KEY_EMOJI_PICKER 0x249 /* Show/hide emoji picker (HUTRR101) */
#define KEY_BRIGHTNESS_MIN 0x250 /* Set Brightness to Minimum */
#define KEY_BRIGHTNESS_MAX 0x251 /* Set Brightness to Maximum */
* Note: you must update KVM_API_VERSION if you change this interface.
*/
+#include <linux/const.h>
#include <linux/types.h>
#include <linux/compiler.h>
#include <linux/ioctl.h>
* conversion after harvesting an entry. Also, it must not skip any
* dirty bits, so that dirty bits are always harvested in sequence.
*/
-#define KVM_DIRTY_GFN_F_DIRTY BIT(0)
-#define KVM_DIRTY_GFN_F_RESET BIT(1)
+#define KVM_DIRTY_GFN_F_DIRTY _BITUL(0)
+#define KVM_DIRTY_GFN_F_RESET _BITUL(1)
#define KVM_DIRTY_GFN_F_MASK 0x3
/*
#define VIRTIO_ID_SOUND 25 /* virtio sound */
#define VIRTIO_ID_FS 26 /* virtio filesystem */
#define VIRTIO_ID_PMEM 27 /* virtio pmem */
-#define VIRTIO_ID_BT 28 /* virtio bluetooth */
#define VIRTIO_ID_MAC80211_HWSIM 29 /* virtio mac80211-hwsim */
+#define VIRTIO_ID_BT 40 /* virtio bluetooth */
#endif /* _LINUX_VIRTIO_IDS_H */
source "kernel/irq/Kconfig"
source "kernel/time/Kconfig"
+source "kernel/bpf/Kconfig"
source "kernel/Kconfig.preempt"
menu "CPU/Task time and stats accounting"
# syscall, maps, verifier
-config BPF_LSM
- bool "LSM Instrumentation with BPF"
- depends on BPF_EVENTS
- depends on BPF_SYSCALL
- depends on SECURITY
- depends on BPF_JIT
- help
- Enables instrumentation of the security hooks with eBPF programs for
- implementing dynamic MAC and Audit Policies.
-
- If you are unsure how to answer this question, answer N.
-
-config BPF_SYSCALL
- bool "Enable bpf() system call"
- select BPF
- select IRQ_WORK
- select TASKS_TRACE_RCU
- select BINARY_PRINTF
- select NET_SOCK_MSG if INET
- default n
- help
- Enable the bpf() system call that allows to manipulate eBPF
- programs and maps via file descriptors.
-
-config ARCH_WANT_DEFAULT_BPF_JIT
- bool
-
-config BPF_JIT_ALWAYS_ON
- bool "Permanently enable BPF JIT and remove BPF interpreter"
- depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
- help
- Enables BPF JIT and removes BPF interpreter to avoid
- speculative execution of BPF instructions by the interpreter
-
-config BPF_JIT_DEFAULT_ON
- def_bool ARCH_WANT_DEFAULT_BPF_JIT || BPF_JIT_ALWAYS_ON
- depends on HAVE_EBPF_JIT && BPF_JIT
-
-source "kernel/bpf/preload/Kconfig"
-
config USERFAULTFD
bool "Enable userfaultfd() system call"
depends on MMU
*/
set_mems_allowed(node_states[N_MEMORY]);
- cad_pid = task_pid(current);
+ cad_pid = get_pid(task_pid(current));
smp_prepare_cpus(setup_max_cpus);
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only
+
+# BPF interpreter that, for example, classic socket filters depend on.
+config BPF
+ bool
+
+# Used by archs to tell that they support BPF JIT compiler plus which
+# flavour. Only one of the two can be selected for a specific arch since
+# eBPF JIT supersedes the cBPF JIT.
+
+# Classic BPF JIT (cBPF)
+config HAVE_CBPF_JIT
+ bool
+
+# Extended BPF JIT (eBPF)
+config HAVE_EBPF_JIT
+ bool
+
+# Used by archs to tell that they want the BPF JIT compiler enabled by
+# default for kernels that were compiled with BPF JIT support.
+config ARCH_WANT_DEFAULT_BPF_JIT
+ bool
+
+menu "BPF subsystem"
+
+config BPF_SYSCALL
+ bool "Enable bpf() system call"
+ select BPF
+ select IRQ_WORK
+ select TASKS_TRACE_RCU
+ select BINARY_PRINTF
+ select NET_SOCK_MSG if INET
+ default n
+ help
+ Enable the bpf() system call that allows to manipulate BPF programs
+ and maps via file descriptors.
+
+config BPF_JIT
+ bool "Enable BPF Just In Time compiler"
+ depends on BPF
+ depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
+ depends on MODULES
+ help
+ BPF programs are normally handled by a BPF interpreter. This option
+ allows the kernel to generate native code when a program is loaded
+ into the kernel. This will significantly speed-up processing of BPF
+ programs.
+
+ Note, an admin should enable this feature changing:
+ /proc/sys/net/core/bpf_jit_enable
+ /proc/sys/net/core/bpf_jit_harden (optional)
+ /proc/sys/net/core/bpf_jit_kallsyms (optional)
+
+config BPF_JIT_ALWAYS_ON
+ bool "Permanently enable BPF JIT and remove BPF interpreter"
+ depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
+ help
+ Enables BPF JIT and removes BPF interpreter to avoid speculative
+ execution of BPF instructions by the interpreter.
+
+config BPF_JIT_DEFAULT_ON
+ def_bool ARCH_WANT_DEFAULT_BPF_JIT || BPF_JIT_ALWAYS_ON
+ depends on HAVE_EBPF_JIT && BPF_JIT
+
+config BPF_UNPRIV_DEFAULT_OFF
+ bool "Disable unprivileged BPF by default"
+ depends on BPF_SYSCALL
+ help
+ Disables unprivileged BPF by default by setting the corresponding
+ /proc/sys/kernel/unprivileged_bpf_disabled knob to 2. An admin can
+ still reenable it by setting it to 0 later on, or permanently
+ disable it by setting it to 1 (from which no other transition to
+ 0 is possible anymore).
+
+source "kernel/bpf/preload/Kconfig"
+
+config BPF_LSM
+ bool "Enable BPF LSM Instrumentation"
+ depends on BPF_EVENTS
+ depends on BPF_SYSCALL
+ depends on SECURITY
+ depends on BPF_JIT
+ help
+ Enables instrumentation of the security hooks with BPF programs for
+ implementing dynamic MAC and Audit Policies.
+
+ If you are unsure how to answer this question, answer N.
+
+endmenu # "BPF subsystem"
return &bpf_inode_storage_get_proto;
case BPF_FUNC_inode_storage_delete:
return &bpf_inode_storage_delete_proto;
+#ifdef CONFIG_NET
case BPF_FUNC_sk_storage_get:
return &bpf_sk_storage_get_proto;
case BPF_FUNC_sk_storage_delete:
return &bpf_sk_storage_delete_proto;
+#endif /* CONFIG_NET */
case BPF_FUNC_spin_lock:
return &bpf_spin_lock_proto;
case BPF_FUNC_spin_unlock:
m->ret_size = ret;
for (i = 0; i < nargs; i++) {
+ if (i == nargs - 1 && args[i].type == 0) {
+ bpf_log(log,
+ "The function %s with variable args is unsupported.\n",
+ tname);
+ return -EINVAL;
+ }
ret = __get_type_size(btf, args[i].type, &t);
if (ret < 0) {
bpf_log(log,
tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
return -EINVAL;
}
+ if (ret == 0) {
+ bpf_log(log,
+ "The function %s has malformed void argument.\n",
+ tname);
+ return -EINVAL;
+ }
m->arg_size[i] = ret;
}
m->nr_args = nargs;
#include <linux/jiffies.h>
#include <linux/pid_namespace.h>
#include <linux/proc_ns.h>
+#include <linux/security.h>
#include "../../lib/kstrtox.h"
return -EINVAL;
}
-/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+/* Per-cpu temp buffers used by printf-like helpers to store the bprintf binary
+ * arguments representation.
*/
-#define MAX_PRINTF_BUF_LEN 512
+#define MAX_BPRINTF_BUF_LEN 512
-struct bpf_printf_buf {
- char tmp_buf[MAX_PRINTF_BUF_LEN];
+/* Support executing three nested bprintf helper calls on a given CPU */
+#define MAX_BPRINTF_NEST_LEVEL 3
+struct bpf_bprintf_buffers {
+ char tmp_bufs[MAX_BPRINTF_NEST_LEVEL][MAX_BPRINTF_BUF_LEN];
};
-static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
-static DEFINE_PER_CPU(int, bpf_printf_buf_used);
+static DEFINE_PER_CPU(struct bpf_bprintf_buffers, bpf_bprintf_bufs);
+static DEFINE_PER_CPU(int, bpf_bprintf_nest_level);
static int try_get_fmt_tmp_buf(char **tmp_buf)
{
- struct bpf_printf_buf *bufs;
- int used;
+ struct bpf_bprintf_buffers *bufs;
+ int nest_level;
preempt_disable();
- used = this_cpu_inc_return(bpf_printf_buf_used);
- if (WARN_ON_ONCE(used > 1)) {
- this_cpu_dec(bpf_printf_buf_used);
+ nest_level = this_cpu_inc_return(bpf_bprintf_nest_level);
+ if (WARN_ON_ONCE(nest_level > MAX_BPRINTF_NEST_LEVEL)) {
+ this_cpu_dec(bpf_bprintf_nest_level);
preempt_enable();
return -EBUSY;
}
- bufs = this_cpu_ptr(&bpf_printf_buf);
- *tmp_buf = bufs->tmp_buf;
+ bufs = this_cpu_ptr(&bpf_bprintf_bufs);
+ *tmp_buf = bufs->tmp_bufs[nest_level - 1];
return 0;
}
void bpf_bprintf_cleanup(void)
{
- if (this_cpu_read(bpf_printf_buf_used)) {
- this_cpu_dec(bpf_printf_buf_used);
+ if (this_cpu_read(bpf_bprintf_nest_level)) {
+ this_cpu_dec(bpf_bprintf_nest_level);
preempt_enable();
}
}
if (num_args && try_get_fmt_tmp_buf(&tmp_buf))
return -EBUSY;
- tmp_buf_end = tmp_buf + MAX_PRINTF_BUF_LEN;
+ tmp_buf_end = tmp_buf + MAX_BPRINTF_BUF_LEN;
*bin_args = (u32 *)tmp_buf;
}
case BPF_FUNC_probe_read_user:
return &bpf_probe_read_user_proto;
case BPF_FUNC_probe_read_kernel:
- return &bpf_probe_read_kernel_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_kernel_proto;
case BPF_FUNC_probe_read_user_str:
return &bpf_probe_read_user_str_proto;
case BPF_FUNC_probe_read_kernel_str:
- return &bpf_probe_read_kernel_str_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_kernel_str_proto;
case BPF_FUNC_snprintf_btf:
return &bpf_snprintf_btf_proto;
case BPF_FUNC_snprintf:
return -ENOTSUPP;
}
-static size_t bpf_ringbuf_mmap_page_cnt(const struct bpf_ringbuf *rb)
-{
- size_t data_pages = (rb->mask + 1) >> PAGE_SHIFT;
-
- /* consumer page + producer page + 2 x data pages */
- return RINGBUF_POS_PAGES + 2 * data_pages;
-}
-
static int ringbuf_map_mmap(struct bpf_map *map, struct vm_area_struct *vma)
{
struct bpf_ringbuf_map *rb_map;
- size_t mmap_sz;
rb_map = container_of(map, struct bpf_ringbuf_map, map);
- mmap_sz = bpf_ringbuf_mmap_page_cnt(rb_map->rb) << PAGE_SHIFT;
-
- if (vma->vm_pgoff * PAGE_SIZE + (vma->vm_end - vma->vm_start) > mmap_sz)
- return -EINVAL;
+ if (vma->vm_flags & VM_WRITE) {
+ /* allow writable mapping for the consumer_pos only */
+ if (vma->vm_pgoff != 0 || vma->vm_end - vma->vm_start != PAGE_SIZE)
+ return -EPERM;
+ } else {
+ vma->vm_flags &= ~VM_MAYWRITE;
+ }
+ /* remap_vmalloc_range() checks size and offset constraints */
return remap_vmalloc_range(vma, rb_map->rb,
vma->vm_pgoff + RINGBUF_PGOFF);
}
return NULL;
len = round_up(size + BPF_RINGBUF_HDR_SZ, 8);
+ if (len > rb->mask + 1)
+ return NULL;
+
cons_pos = smp_load_acquire(&rb->consumer_pos);
if (in_nmi()) {
static DEFINE_IDR(link_idr);
static DEFINE_SPINLOCK(link_idr_lock);
-int sysctl_unprivileged_bpf_disabled __read_mostly;
+int sysctl_unprivileged_bpf_disabled __read_mostly =
+ IS_BUILTIN(CONFIG_BPF_UNPRIV_DEFAULT_OFF) ? 2 : 0;
static const struct bpf_map_ops * const bpf_map_types[] = {
#define BPF_PROG_TYPE(_id, _name, prog_ctx_type, kern_ctx_type)
};
static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg,
- const struct bpf_reg_state *off_reg,
- u32 *alu_limit, u8 opcode)
+ u32 *alu_limit, bool mask_to_left)
{
- bool off_is_neg = off_reg->smin_value < 0;
- bool mask_to_left = (opcode == BPF_ADD && off_is_neg) ||
- (opcode == BPF_SUB && !off_is_neg);
u32 max = 0, ptr_limit = 0;
- if (!tnum_is_const(off_reg->var_off) &&
- (off_reg->smin_value < 0) != (off_reg->smax_value < 0))
- return REASON_BOUNDS;
-
switch (ptr_reg->type) {
case PTR_TO_STACK:
/* Offset 0 is out-of-bounds, but acceptable start for the
return opcode == BPF_ADD || opcode == BPF_SUB;
}
+struct bpf_sanitize_info {
+ struct bpf_insn_aux_data aux;
+ bool mask_to_left;
+};
+
static int sanitize_ptr_alu(struct bpf_verifier_env *env,
struct bpf_insn *insn,
const struct bpf_reg_state *ptr_reg,
const struct bpf_reg_state *off_reg,
struct bpf_reg_state *dst_reg,
- struct bpf_insn_aux_data *tmp_aux,
+ struct bpf_sanitize_info *info,
const bool commit_window)
{
- struct bpf_insn_aux_data *aux = commit_window ? cur_aux(env) : tmp_aux;
+ struct bpf_insn_aux_data *aux = commit_window ? cur_aux(env) : &info->aux;
struct bpf_verifier_state *vstate = env->cur_state;
bool off_is_imm = tnum_is_const(off_reg->var_off);
bool off_is_neg = off_reg->smin_value < 0;
if (vstate->speculative)
goto do_sim;
- err = retrieve_ptr_limit(ptr_reg, off_reg, &alu_limit, opcode);
+ if (!commit_window) {
+ if (!tnum_is_const(off_reg->var_off) &&
+ (off_reg->smin_value < 0) != (off_reg->smax_value < 0))
+ return REASON_BOUNDS;
+
+ info->mask_to_left = (opcode == BPF_ADD && off_is_neg) ||
+ (opcode == BPF_SUB && !off_is_neg);
+ }
+
+ err = retrieve_ptr_limit(ptr_reg, &alu_limit, info->mask_to_left);
if (err < 0)
return err;
/* In commit phase we narrow the masking window based on
* the observed pointer move after the simulated operation.
*/
- alu_state = tmp_aux->alu_state;
- alu_limit = abs(tmp_aux->alu_limit - alu_limit);
+ alu_state = info->aux.alu_state;
+ alu_limit = abs(info->aux.alu_limit - alu_limit);
} else {
alu_state = off_is_neg ? BPF_ALU_NEG_VALUE : 0;
alu_state |= off_is_imm ? BPF_ALU_IMMEDIATE : 0;
/* If we're in commit phase, we're done here given we already
* pushed the truncated dst_reg into the speculative verification
* stack.
+ *
+ * Also, when register is a known constant, we rewrite register-based
+ * operation to immediate-based, and thus do not need masking (and as
+ * a consequence, do not need to simulate the zero-truncation either).
*/
- if (commit_window)
+ if (commit_window || off_is_imm)
return 0;
/* Simulate and find potential out-of-bounds access under
smin_ptr = ptr_reg->smin_value, smax_ptr = ptr_reg->smax_value;
u64 umin_val = off_reg->umin_value, umax_val = off_reg->umax_value,
umin_ptr = ptr_reg->umin_value, umax_ptr = ptr_reg->umax_value;
- struct bpf_insn_aux_data tmp_aux = {};
+ struct bpf_sanitize_info info = {};
u8 opcode = BPF_OP(insn->code);
u32 dst = insn->dst_reg;
int ret;
if (sanitize_needed(opcode)) {
ret = sanitize_ptr_alu(env, insn, ptr_reg, off_reg, dst_reg,
- &tmp_aux, false);
+ &info, false);
if (ret < 0)
return sanitize_err(env, insn, ret, off_reg, dst_reg);
}
return -EACCES;
if (sanitize_needed(opcode)) {
ret = sanitize_ptr_alu(env, insn, dst_reg, off_reg, dst_reg,
- &tmp_aux, true);
+ &info, true);
if (ret < 0)
return sanitize_err(env, insn, ret, off_reg, dst_reg);
}
s32 smin_val = src_reg->s32_min_value;
u32 umax_val = src_reg->u32_max_value;
- /* Assuming scalar64_min_max_and will be called so its safe
- * to skip updating register for known 32-bit case.
- */
- if (src_known && dst_known)
+ if (src_known && dst_known) {
+ __mark_reg32_known(dst_reg, var32_off.value);
return;
+ }
/* We get our minimum from the var_off, since that's inherently
* bitwise. Our maximum is the minimum of the operands' maxima.
dst_reg->s32_min_value = dst_reg->u32_min_value;
dst_reg->s32_max_value = dst_reg->u32_max_value;
}
-
}
static void scalar_min_max_and(struct bpf_reg_state *dst_reg,
s32 smin_val = src_reg->s32_min_value;
u32 umin_val = src_reg->u32_min_value;
- /* Assuming scalar64_min_max_or will be called so it is safe
- * to skip updating register for known case.
- */
- if (src_known && dst_known)
+ if (src_known && dst_known) {
+ __mark_reg32_known(dst_reg, var32_off.value);
return;
+ }
/* We get our maximum from the var_off, and our minimum is the
* maximum of the operands' minima
struct tnum var32_off = tnum_subreg(dst_reg->var_off);
s32 smin_val = src_reg->s32_min_value;
- /* Assuming scalar64_min_max_xor will be called so it is safe
- * to skip updating register for known case.
- */
- if (src_known && dst_known)
+ if (src_known && dst_known) {
+ __mark_reg32_known(dst_reg, var32_off.value);
return;
+ }
/* We get both minimum and maximum from the var32_off. */
dst_reg->u32_min_value = var32_off.value;
return 0;
}
+BTF_SET_START(btf_id_deny)
+BTF_ID_UNUSED
+#ifdef CONFIG_SMP
+BTF_ID(func, migrate_disable)
+BTF_ID(func, migrate_enable)
+#endif
+#if !defined CONFIG_PREEMPT_RCU && !defined CONFIG_TINY_RCU
+BTF_ID(func, rcu_read_unlock_strict)
+#endif
+BTF_SET_END(btf_id_deny)
+
static int check_attach_btf_id(struct bpf_verifier_env *env)
{
struct bpf_prog *prog = env->prog;
ret = bpf_lsm_verify_prog(&env->log, prog);
if (ret < 0)
return ret;
+ } else if (prog->type == BPF_PROG_TYPE_TRACING &&
+ btf_id_set_contains(&btf_id_deny, btf_id)) {
+ return -EINVAL;
}
key = bpf_trampoline_compute_key(tgt_prog, prog->aux->attach_btf, btf_id);
if (is_priv)
env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
- if (bpf_prog_is_dev_bound(env->prog->aux)) {
- ret = bpf_prog_offload_verifier_prep(env->prog);
- if (ret)
- goto skip_full_check;
- }
-
env->explored_states = kvcalloc(state_htab_size(env),
sizeof(struct bpf_verifier_state_list *),
GFP_USER);
if (ret < 0)
goto skip_full_check;
+ if (bpf_prog_is_dev_bound(env->prog->aux)) {
+ ret = bpf_prog_offload_verifier_prep(env->prog);
+ if (ret)
+ goto skip_full_check;
+ }
+
ret = check_cfg(env);
if (ret < 0)
goto skip_full_check;
ctx->subsys_mask &= enabled;
/*
- * In absense of 'none', 'name=' or subsystem name options,
+ * In absence of 'none', 'name=' and subsystem name options,
* let's default to 'all'.
*/
if (!ctx->subsys_mask && !ctx->none && !ctx->name)
* @cgrp: the cgroup of interest
* @ss: the subsystem of interest
*
- * Find and get @cgrp's css assocaited with @ss. If the css doesn't exist
+ * Find and get @cgrp's css associated with @ss. If the css doesn't exist
* or is offline, %NULL is returned.
*/
static struct cgroup_subsys_state *cgroup_tryget_css(struct cgroup *cgrp,
/**
* css_clear_dir - remove subsys files in a cgroup directory
- * @css: taget css
+ * @css: target css
*/
static void css_clear_dir(struct cgroup_subsys_state *css)
{
/*
* This is called when the refcnt of a css is confirmed to be killed.
* css_tryget_online() is now guaranteed to fail. Tell the subsystem to
- * initate destruction and put the css ref from kill_css().
+ * initiate destruction and put the css ref from kill_css().
*/
static void css_killed_work_fn(struct work_struct *work)
{
return 0;
}
-static u16 cgroup_disable_mask __initdata;
-
/**
* cgroup_init - cgroup initialization
*
* disabled flag and cftype registration needs kmalloc,
* both of which aren't available during early_init.
*/
- if (cgroup_disable_mask & (1 << ssid)) {
- static_branch_disable(cgroup_subsys_enabled_key[ssid]);
- printk(KERN_INFO "Disabling %s control group subsystem\n",
- ss->name);
+ if (!cgroup_ssid_enabled(ssid))
continue;
- }
if (cgroup1_ssid_disabled(ssid))
printk(KERN_INFO "Disabling %s control group subsystem in v1 mounts\n",
* @kargs: the arguments passed to create the child process
*
* This calls the cancel_fork() callbacks if a fork failed *after*
- * cgroup_can_fork() succeded and cleans up references we took to
+ * cgroup_can_fork() succeeded and cleans up references we took to
* prepare a new css_set for the child process in cgroup_can_fork().
*/
void cgroup_cancel_fork(struct task_struct *child,
if (strcmp(token, ss->name) &&
strcmp(token, ss->legacy_name))
continue;
- cgroup_disable_mask |= 1 << i;
+
+ static_branch_disable(cgroup_subsys_enabled_key[i]);
+ pr_info("Disabling %s control group subsystem\n",
+ ss->name);
}
}
return 1;
}
/**
- * cpuset_nodemask_valid_mems_allowed - check nodemask vs. curremt mems_allowed
+ * cpuset_nodemask_valid_mems_allowed - check nodemask vs. current mems_allowed
* @nodemask: the nodemask to be checked
*
* Are any of the nodes in the nodemask allowed in current->mems_allowed?
* This function follows charging resource in hierarchical way.
* It will fail if the charge would cause the new value to exceed the
* hierarchical limit.
- * Returns 0 if the charge succeded, otherwise -EAGAIN, -ENOMEM or -EINVAL.
+ * Returns 0 if the charge succeeded, otherwise -EAGAIN, -ENOMEM or -EINVAL.
* Returns pointer to rdmacg for this resource when charging is successful.
*
* Charger needs to account resources on two criteria.
* @root: root of the tree to traversal
* @cpu: target cpu
*
- * Walks the udpated rstat_cpu tree on @cpu from @root. %NULL @pos starts
+ * Walks the updated rstat_cpu tree on @cpu from @root. %NULL @pos starts
* the traversal and %NULL return indicates the end. During traversal,
* each returned cgroup is unlinked from the tree. Must be called with the
* matching cgroup_rstat_cpu_lock held.
up(&match->notif->request);
wake_up_poll(&match->wqh, EPOLLIN | EPOLLRDNORM);
- mutex_unlock(&match->notify_lock);
/*
* This is where we wait for a reply from userspace.
*/
-wait:
- err = wait_for_completion_interruptible(&n.ready);
- mutex_lock(&match->notify_lock);
- if (err == 0) {
- /* Check if we were woken up by a addfd message */
+ do {
+ mutex_unlock(&match->notify_lock);
+ err = wait_for_completion_interruptible(&n.ready);
+ mutex_lock(&match->notify_lock);
+ if (err != 0)
+ goto interrupted;
+
addfd = list_first_entry_or_null(&n.addfd,
struct seccomp_kaddfd, list);
- if (addfd && n.state != SECCOMP_NOTIFY_REPLIED) {
+ /* Check if we were woken up by a addfd message */
+ if (addfd)
seccomp_handle_addfd(addfd);
- mutex_unlock(&match->notify_lock);
- goto wait;
- }
- ret = n.val;
- err = n.error;
- flags = n.flags;
- }
+ } while (n.state != SECCOMP_NOTIFY_REPLIED);
+
+ ret = n.val;
+ err = n.error;
+ flags = n.flags;
+
+interrupted:
/* If there were any pending addfd calls, clear them out */
list_for_each_entry_safe(addfd, tmp, &n.addfd, list) {
/* The process went away before we got a chance to handle it */
mutex_unlock(&bpf_stats_enabled_mutex);
return ret;
}
-#endif
+
+static int bpf_unpriv_handler(struct ctl_table *table, int write,
+ void *buffer, size_t *lenp, loff_t *ppos)
+{
+ int ret, unpriv_enable = *(int *)table->data;
+ bool locked_state = unpriv_enable == 1;
+ struct ctl_table tmp = *table;
+
+ if (write && !capable(CAP_SYS_ADMIN))
+ return -EPERM;
+
+ tmp.data = &unpriv_enable;
+ ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
+ if (write && !ret) {
+ if (locked_state && unpriv_enable != 1)
+ return -EPERM;
+ *(int *)table->data = unpriv_enable;
+ }
+ return ret;
+}
+#endif /* CONFIG_BPF_SYSCALL && CONFIG_SYSCTL */
/*
* /proc/sys support
.data = &sysctl_unprivileged_bpf_disabled,
.maxlen = sizeof(sysctl_unprivileged_bpf_disabled),
.mode = 0644,
- /* only handle a transition from default "0" to "1" */
- .proc_handler = proc_dointvec_minmax,
- .extra1 = SYSCTL_ONE,
- .extra2 = SYSCTL_ONE,
+ .proc_handler = bpf_unpriv_handler,
+ .extra1 = SYSCTL_ZERO,
+ .extra2 = &two,
},
{
.procname = "bpf_stats_enabled",
static __always_inline int
bpf_probe_read_kernel_common(void *dst, u32 size, const void *unsafe_ptr)
{
- int ret = security_locked_down(LOCKDOWN_BPF_READ);
+ int ret;
- if (unlikely(ret < 0))
- goto fail;
ret = copy_from_kernel_nofault(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
- goto fail;
- return ret;
-fail:
- memset(dst, 0, size);
+ memset(dst, 0, size);
return ret;
}
static __always_inline int
bpf_probe_read_kernel_str_common(void *dst, u32 size, const void *unsafe_ptr)
{
- int ret = security_locked_down(LOCKDOWN_BPF_READ);
-
- if (unlikely(ret < 0))
- goto fail;
+ int ret;
/*
* The strncpy_from_kernel_nofault() call will likely not fill the
*/
ret = strncpy_from_kernel_nofault(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
- goto fail;
-
- return ret;
-fail:
- memset(dst, 0, size);
+ memset(dst, 0, size);
return ret;
}
case BPF_FUNC_probe_read_user:
return &bpf_probe_read_user_proto;
case BPF_FUNC_probe_read_kernel:
- return &bpf_probe_read_kernel_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_kernel_proto;
case BPF_FUNC_probe_read_user_str:
return &bpf_probe_read_user_str_proto;
case BPF_FUNC_probe_read_kernel_str:
- return &bpf_probe_read_kernel_str_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_kernel_str_proto;
#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
case BPF_FUNC_probe_read:
- return &bpf_probe_read_compat_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_compat_proto;
case BPF_FUNC_probe_read_str:
- return &bpf_probe_read_compat_str_proto;
+ return security_locked_down(LOCKDOWN_BPF_READ) < 0 ?
+ NULL : &bpf_probe_read_compat_str_proto;
#endif
#ifdef CONFIG_CGROUPS
case BPF_FUNC_get_current_cgroup_id:
#include <linux/uaccess.h>
#include <linux/sched/isolation.h>
#include <linux/nmi.h>
+#include <linux/kvm_para.h>
#include "workqueue_internal.h"
{
unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ;
bool lockup_detected = false;
+ unsigned long now = jiffies;
struct worker_pool *pool;
int pi;
if (list_empty(&pool->worklist))
continue;
+ /*
+ * If a virtual machine is stopped by the host it can look to
+ * the watchdog like a stall.
+ */
+ kvm_check_and_clear_guest_paused();
+
/* get the latest of pool and touched timestamps */
if (pool->cpu >= 0)
touched = READ_ONCE(per_cpu(wq_watchdog_touched_cpu, pool->cpu));
ts = touched;
/* did we stall? */
- if (time_after(jiffies, ts + thresh)) {
+ if (time_after(now, ts + thresh)) {
lockup_detected = true;
pr_emerg("BUG: workqueue lockup - pool");
pr_cont_pool_info(pool);
pr_cont(" stuck for %us!\n",
- jiffies_to_msecs(jiffies - pool_ts) / 1000);
+ jiffies_to_msecs(now - pool_ts) / 1000);
}
}
/**
* crc64_be - Calculate bitwise big-endian ECMA-182 CRC64
* @crc: seed value for computation. 0 or (u64)~0 for a new CRC calculation,
- or the previous crc64 value if computing incrementally.
+ * or the previous crc64 value if computing incrementally.
* @p: pointer to buffer over which CRC64 is run
* @len: length of buffer @p
*/
wait_event_lock_irq(percpu_ref_switch_waitq, !data->confirm_switch,
percpu_ref_switch_lock);
- if (data->force_atomic || (ref->percpu_count_ptr & __PERCPU_REF_DEAD))
+ if (data->force_atomic || percpu_ref_is_dying(ref))
__percpu_ref_switch_to_atomic(ref, confirm_switch);
else
__percpu_ref_switch_to_percpu(ref);
spin_lock_irqsave(&percpu_ref_switch_lock, flags);
- WARN_ONCE(ref->percpu_count_ptr & __PERCPU_REF_DEAD,
+ WARN_ONCE(percpu_ref_is_dying(ref),
"%s called more than once on %ps!", __func__,
ref->data->release);
spin_lock_irqsave(&percpu_ref_switch_lock, flags);
- WARN_ON_ONCE(!(ref->percpu_count_ptr & __PERCPU_REF_DEAD));
+ WARN_ON_ONCE(!percpu_ref_is_dying(ref));
WARN_ON_ONCE(__ref_is_percpu(ref, &percpu_count));
ref->percpu_count_ptr &= ~__PERCPU_REF_DEAD;
pr_debug("Validating PMD advanced\n");
/* Align the address wrt HPAGE_PMD_SIZE */
- vaddr = (vaddr & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE;
+ vaddr &= HPAGE_PMD_MASK;
pgtable_trans_huge_deposit(mm, pmdp, pgtable);
pr_debug("Validating PUD advanced\n");
/* Align the address wrt HPAGE_PUD_SIZE */
- vaddr = (vaddr & HPAGE_PUD_MASK) + HPAGE_PUD_SIZE;
+ vaddr &= HPAGE_PUD_MASK;
set_pud_at(mm, vaddr, pudp, pud);
pudp_set_wrprotect(mm, vaddr, pudp);
SetPageHWPoison(page);
ClearPageHWPoison(head);
}
- remove_hugetlb_page(h, page, false);
+ remove_hugetlb_page(h, head, false);
h->max_huge_pages--;
spin_unlock_irq(&hugetlb_lock);
update_and_free_page(h, head);
if (!page)
goto out;
} else if (!*pagep) {
- ret = -ENOMEM;
+ /* If a page already exists, then it's UFFDIO_COPY for
+ * a non-missing case. Return -EEXIST.
+ */
+ if (vm_shared &&
+ hugetlbfs_pagecache_present(h, dst_vma, dst_addr)) {
+ ret = -EEXIST;
+ goto out;
+ }
+
page = alloc_huge_page(dst_vma, dst_addr, 0);
- if (IS_ERR(page))
+ if (IS_ERR(page)) {
+ ret = -ENOMEM;
goto out;
+ }
ret = copy_huge_page_from_user(page,
(const void __user *) src_addr,
/**
* kasan_populate_early_shadow - populate shadow memory region with
* kasan_early_shadow_page
- * @shadow_start - start of the memory range to populate
- * @shadow_end - end of the memory range to populate
+ * @shadow_start: start of the memory range to populate
+ * @shadow_end: end of the memory range to populate
*/
int __ref kasan_populate_early_shadow(const void *shadow_start,
const void *shadow_end)
* During low activity with no allocations we might wait a
* while; let's avoid the hung task warning.
*/
- wait_event_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
- sysctl_hung_task_timeout_secs * HZ / 2);
+ wait_event_idle_timeout(allocation_wait, atomic_read(&kfence_allocation_gate),
+ sysctl_hung_task_timeout_secs * HZ / 2);
} else {
- wait_event(allocation_wait, atomic_read(&kfence_allocation_gate));
+ wait_event_idle(allocation_wait, atomic_read(&kfence_allocation_gate));
}
/* Disable static key and reset timer. */
}
flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
entry = mk_pte(new_page, vma->vm_page_prot);
+ entry = pte_sw_mkyoung(entry);
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
/*
__SetPageUptodate(page);
entry = mk_pte(page, vma->vm_page_prot);
+ entry = pte_sw_mkyoung(entry);
if (vma->vm_flags & VM_WRITE)
entry = pte_mkwrite(pte_mkdirty(entry));
if (prefault && arch_wants_old_prefaulted_pte())
entry = pte_mkold(entry);
+ else
+ entry = pte_sw_mkyoung(entry);
if (write)
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
del_page_from_free_list(page_head, zone, page_order);
break_down_buddy_pages(zone, page_head, page, 0,
page_order, migratetype);
+ if (!is_migrate_isolate(migratetype))
+ __mod_zone_freepage_state(zone, -1, migratetype);
ret = true;
break;
}
select DQL
default y
-config BPF_JIT
- bool "enable BPF Just In Time compiler"
- depends on HAVE_CBPF_JIT || HAVE_EBPF_JIT
- depends on MODULES
- help
- Berkeley Packet Filter filtering capabilities are normally handled
- by an interpreter. This option allows kernel to generate a native
- code when filter is loaded in memory. This should speedup
- packet sniffing (libpcap/tcpdump).
-
- Note, admin should enable this feature changing:
- /proc/sys/net/core/bpf_jit_enable
- /proc/sys/net/core/bpf_jit_harden (optional)
- /proc/sys/net/core/bpf_jit_kallsyms (optional)
-
config BPF_STREAM_PARSER
bool "enable BPF STREAM_PARSER"
depends on INET
e.g. notification messages.
endif # if NET
-
-# Used by archs to tell that they support BPF JIT compiler plus which flavour.
-# Only one of the two can be selected for a specific arch since eBPF JIT supersedes
-# the cBPF JIT.
-
-# Classic BPF JIT (cBPF)
-config HAVE_CBPF_JIT
- bool
-
-# Extended BPF JIT (eBPF)
-config HAVE_EBPF_JIT
- bool
} else {
/* Init failed, cleanup */
flush_work(&hdev->tx_work);
- flush_work(&hdev->cmd_work);
+
+ /* Since hci_rx_work() is possible to awake new cmd_work
+ * it should be flushed first to avoid unexpected call of
+ * hci_cmd_work()
+ */
flush_work(&hdev->rx_work);
+ flush_work(&hdev->cmd_work);
skb_queue_purge(&hdev->cmd_q);
skb_queue_purge(&hdev->rx_q);
/* Detach sockets from device */
read_lock(&hci_sk_list.lock);
sk_for_each(sk, &hci_sk_list.head) {
- bh_lock_sock_nested(sk);
+ lock_sock(sk);
if (hci_pi(sk)->hdev == hdev) {
hci_pi(sk)->hdev = NULL;
sk->sk_err = EPIPE;
hci_dev_put(hdev);
}
- bh_unlock_sock(sk);
+ release_sock(sk);
}
read_unlock(&hci_sk_list.lock);
}
caifd_put(caifd);
}
-void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev,
+int caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev,
struct cflayer *link_support, int head_room,
struct cflayer **layer,
int (**rcv_func)(struct sk_buff *, struct net_device *,
enum cfcnfg_phy_preference pref;
struct cfcnfg *cfg = get_cfcnfg(dev_net(dev));
struct caif_device_entry_list *caifdevs;
+ int res;
caifdevs = caif_device_list(dev_net(dev));
caifd = caif_device_alloc(dev);
if (!caifd)
- return;
+ return -ENOMEM;
*layer = &caifd->layer;
spin_lock_init(&caifd->flow_lock);
strlcpy(caifd->layer.name, dev->name,
sizeof(caifd->layer.name));
caifd->layer.transmit = transmit;
- cfcnfg_add_phy_layer(cfg,
+ res = cfcnfg_add_phy_layer(cfg,
dev,
&caifd->layer,
pref,
mutex_unlock(&caifdevs->lock);
if (rcv_func)
*rcv_func = receive;
+ return res;
}
EXPORT_SYMBOL(caif_enroll_dev);
struct cflayer *layer, *link_support;
int head_room = 0;
struct caif_device_entry_list *caifdevs;
+ int res;
cfg = get_cfcnfg(dev_net(dev));
caifdevs = caif_device_list(dev_net(dev));
break;
}
}
- caif_enroll_dev(dev, caifdev, link_support, head_room,
+ res = caif_enroll_dev(dev, caifdev, link_support, head_room,
&layer, NULL);
+ if (res)
+ cfserl_release(link_support);
caifdev->flowctrl = dev_flowctrl;
break;
return (struct cflayer *) this;
}
+static void cfusbl_release(struct cflayer *layer)
+{
+ kfree(layer);
+}
+
static struct packet_type caif_usb_type __read_mostly = {
.type = cpu_to_be16(ETH_P_802_EX1),
};
struct cflayer *layer, *link_support;
struct usbnet *usbnet;
struct usb_device *usbdev;
+ int res;
/* Check whether we have a NCM device, and find its VID/PID. */
if (!(dev->dev.parent && dev->dev.parent->driver &&
if (dev->num_tx_queues > 1)
pr_warn("USB device uses more than one tx queue\n");
- caif_enroll_dev(dev, &common, link_support, CFUSB_MAX_HEADLEN,
+ res = caif_enroll_dev(dev, &common, link_support, CFUSB_MAX_HEADLEN,
&layer, &caif_usb_type.func);
+ if (res)
+ goto err;
+
if (!pack_added)
dev_add_pack(&caif_usb_type);
pack_added = true;
strlcpy(layer->name, dev->name, sizeof(layer->name));
return 0;
+err:
+ cfusbl_release(link_support);
+ return res;
}
static struct notifier_block caif_device_notifier = {
rcu_read_unlock();
}
-void
+int
cfcnfg_add_phy_layer(struct cfcnfg *cnfg,
struct net_device *dev, struct cflayer *phy_layer,
enum cfcnfg_phy_preference pref,
{
struct cflayer *frml;
struct cfcnfg_phyinfo *phyinfo = NULL;
- int i;
+ int i, res = 0;
u8 phyid;
mutex_lock(&cnfg->lock);
goto got_phyid;
}
pr_warn("Too many CAIF Link Layers (max 6)\n");
+ res = -EEXIST;
goto out;
got_phyid:
phyinfo = kzalloc(sizeof(struct cfcnfg_phyinfo), GFP_ATOMIC);
- if (!phyinfo)
+ if (!phyinfo) {
+ res = -ENOMEM;
goto out_err;
+ }
phy_layer->id = phyid;
phyinfo->pref = pref;
frml = cffrml_create(phyid, fcs);
- if (!frml)
+ if (!frml) {
+ res = -ENOMEM;
goto out_err;
+ }
phyinfo->frm_layer = frml;
layer_set_up(frml, cnfg->mux);
list_add_rcu(&phyinfo->node, &cnfg->phys);
out:
mutex_unlock(&cnfg->lock);
- return;
+ return res;
out_err:
kfree(phyinfo);
mutex_unlock(&cnfg->lock);
+ return res;
}
EXPORT_SYMBOL(cfcnfg_add_phy_layer);
static void cfserl_ctrlcmd(struct cflayer *layr, enum caif_ctrlcmd ctrl,
int phyid);
+void cfserl_release(struct cflayer *layer)
+{
+ kfree(layer);
+}
+
struct cflayer *cfserl_create(int instance, bool use_stx)
{
struct cfserl *this = kzalloc(sizeof(struct cfserl), GFP_ATOMIC);
if (len < ISOTP_MIN_NAMELEN)
return -EINVAL;
+ if (addr->can_addr.tp.tx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG))
+ return -EADDRNOTAVAIL;
+
+ if (!addr->can_ifindex)
+ return -ENODEV;
+
+ lock_sock(sk);
+
/* do not register frame reception for functional addressing */
if (so->opt.flags & CAN_ISOTP_SF_BROADCAST)
do_rx_reg = 0;
/* do not validate rx address for functional addressing */
if (do_rx_reg) {
- if (addr->can_addr.tp.rx_id == addr->can_addr.tp.tx_id)
- return -EADDRNOTAVAIL;
+ if (addr->can_addr.tp.rx_id == addr->can_addr.tp.tx_id) {
+ err = -EADDRNOTAVAIL;
+ goto out;
+ }
- if (addr->can_addr.tp.rx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG))
- return -EADDRNOTAVAIL;
+ if (addr->can_addr.tp.rx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG)) {
+ err = -EADDRNOTAVAIL;
+ goto out;
+ }
}
- if (addr->can_addr.tp.tx_id & (CAN_ERR_FLAG | CAN_RTR_FLAG))
- return -EADDRNOTAVAIL;
-
- if (!addr->can_ifindex)
- return -ENODEV;
-
- lock_sock(sk);
-
if (so->bound && addr->can_ifindex == so->ifindex &&
addr->can_addr.tp.rx_id == so->rxid &&
addr->can_addr.tp.tx_id == so->txid)
return ISOTP_MIN_NAMELEN;
}
-static int isotp_setsockopt(struct socket *sock, int level, int optname,
+static int isotp_setsockopt_locked(struct socket *sock, int level, int optname,
sockptr_t optval, unsigned int optlen)
{
struct sock *sk = sock->sk;
struct isotp_sock *so = isotp_sk(sk);
int ret = 0;
- if (level != SOL_CAN_ISOTP)
- return -EINVAL;
-
if (so->bound)
return -EISCONN;
return ret;
}
+static int isotp_setsockopt(struct socket *sock, int level, int optname,
+ sockptr_t optval, unsigned int optlen)
+
+{
+ struct sock *sk = sock->sk;
+ int ret;
+
+ if (level != SOL_CAN_ISOTP)
+ return -EINVAL;
+
+ lock_sock(sk);
+ ret = isotp_setsockopt_locked(sock, level, optname, optval, optlen);
+ release_sock(sk);
+ return ret;
+}
+
static int isotp_getsockopt(struct socket *sock, int level, int optname,
char __user *optval, int __user *optlen)
{
if (kcmlen > stackbuf_size)
kcmsg_base = kcmsg = sock_kmalloc(sk, kcmlen, GFP_KERNEL);
if (kcmsg == NULL)
- return -ENOBUFS;
+ return -ENOMEM;
/* Now copy them over neatly. */
memset(kcmsg, 0, kcmlen);
if (q->flags & TCQ_F_NOLOCK) {
rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
- qdisc_run(q);
+ if (likely(!netif_xmit_frozen_or_stopped(txq)))
+ qdisc_run(q);
if (unlikely(to_free))
kfree_skb_list(to_free);
sd->output_queue_tailp = &sd->output_queue;
local_irq_enable();
+ rcu_read_lock();
+
while (head) {
struct Qdisc *q = head;
spinlock_t *root_lock = NULL;
head = head->next_sched;
- if (!(q->flags & TCQ_F_NOLOCK)) {
- root_lock = qdisc_lock(q);
- spin_lock(root_lock);
- }
/* We need to make sure head->next_sched is read
* before clearing __QDISC_STATE_SCHED
*/
smp_mb__before_atomic();
+
+ if (!(q->flags & TCQ_F_NOLOCK)) {
+ root_lock = qdisc_lock(q);
+ spin_lock(root_lock);
+ } else if (unlikely(test_bit(__QDISC_STATE_DEACTIVATED,
+ &q->state))) {
+ /* There is a synchronize_net() between
+ * STATE_DEACTIVATED flag being set and
+ * qdisc_reset()/some_qdisc_is_busy() in
+ * dev_deactivate(), so we can safely bail out
+ * early here to avoid data race between
+ * qdisc_deactivate() and some_qdisc_is_busy()
+ * for lockless qdisc.
+ */
+ clear_bit(__QDISC_STATE_SCHED, &q->state);
+ continue;
+ }
+
clear_bit(__QDISC_STATE_SCHED, &q->state);
qdisc_run(q);
if (root_lock)
spin_unlock(root_lock);
}
+
+ rcu_read_unlock();
}
xfrm_dev_backlog(sd);
case DEVLINK_PORT_FLAVOUR_PHYSICAL:
case DEVLINK_PORT_FLAVOUR_CPU:
case DEVLINK_PORT_FLAVOUR_DSA:
- case DEVLINK_PORT_FLAVOUR_VIRTUAL:
if (nla_put_u32(msg, DEVLINK_ATTR_PORT_NUMBER,
attrs->phys.port_number))
return -EMSGSIZE;
switch (attrs->flavour) {
case DEVLINK_PORT_FLAVOUR_PHYSICAL:
- case DEVLINK_PORT_FLAVOUR_VIRTUAL:
if (!attrs->split)
n = snprintf(name, len, "p%u", attrs->phys.port_number);
else
n = snprintf(name, len, "pf%usf%u", attrs->pci_sf.pf,
attrs->pci_sf.sf);
break;
+ case DEVLINK_PORT_FLAVOUR_VIRTUAL:
+ return -EOPNOTSUPP;
}
if (n >= len)
{
struct net *net;
struct sk_buff *skb;
- int err = -ENOBUFS;
+ int err = -ENOMEM;
net = ops->fro_net;
skb = nlmsg_new(fib_rule_nlmsg_size(ops, rule), GFP_KERNEL);
__skb_push(skb, head_room);
memset(skb->data, 0, head_room);
skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
}
return ret;
if (err < 0)
goto errout;
- if (!skb->len)
+ if (!skb->len) {
+ err = -EINVAL;
goto errout;
+ }
rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, GFP_ATOMIC);
return 0;
}
EXPORT_SYMBOL(sock_set_rcvbuf);
+static void __sock_set_mark(struct sock *sk, u32 val)
+{
+ if (val != sk->sk_mark) {
+ sk->sk_mark = val;
+ sk_dst_reset(sk);
+ }
+}
+
void sock_set_mark(struct sock *sk, u32 val)
{
lock_sock(sk);
- sk->sk_mark = val;
+ __sock_set_mark(sk, val);
release_sock(sk);
}
EXPORT_SYMBOL(sock_set_mark);
case SO_MARK:
if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
ret = -EPERM;
- } else if (val != sk->sk_mark) {
- sk->sk_mark = val;
- sk_dst_reset(sk);
+ break;
}
+
+ __sock_set_mark(sk, val);
break;
case SO_RXQ_OVFL:
if (skb_is_tcp_pure_ack(skb))
return;
- if (can_skb_orphan_partial(skb))
- skb_set_owner_sk_safe(skb, skb->sk);
- else
- skb_orphan(skb);
+ if (can_skb_orphan_partial(skb) && skb_set_owner_sk_safe(skb, skb->sk))
+ return;
+
+ skb_orphan(skb);
}
EXPORT_SYMBOL(skb_orphan_partial);
struct dsa_switch *ds = cpu_dp->ds;
int port = cpu_dp->index;
int len = ETH_GSTRING_LEN;
- int mcount = 0, count;
- unsigned int i;
+ int mcount = 0, count, i;
uint8_t pfx[4];
uint8_t *ndata;
*/
ds->ops->get_strings(ds, port, stringset, ndata);
count = ds->ops->get_sset_count(ds, port, stringset);
+ if (count < 0)
+ return;
for (i = 0; i < count; i++) {
memmove(ndata + (i * len + sizeof(pfx)),
ndata + i * len, len - sizeof(pfx));
struct dsa_switch *ds = dp->ds;
if (sset == ETH_SS_STATS) {
- int count;
+ int count = 0;
- count = 4;
- if (ds->ops->get_sset_count)
- count += ds->ops->get_sset_count(ds, dp->index, sset);
+ if (ds->ops->get_sset_count) {
+ count = ds->ops->get_sset_count(ds, dp->index, sset);
+ if (count < 0)
+ return count;
+ }
- return count;
+ return count + 4;
} else if (sset == ETH_SS_TEST) {
return net_selftest_get_count();
}
#define DSA_8021Q_SUBVLAN_HI_SHIFT 9
#define DSA_8021Q_SUBVLAN_HI_MASK GENMASK(9, 9)
#define DSA_8021Q_SUBVLAN_LO_SHIFT 4
-#define DSA_8021Q_SUBVLAN_LO_MASK GENMASK(4, 3)
+#define DSA_8021Q_SUBVLAN_LO_MASK GENMASK(5, 4)
#define DSA_8021Q_SUBVLAN_HI(x) (((x) & GENMASK(2, 2)) >> 2)
#define DSA_8021Q_SUBVLAN_LO(x) ((x) & GENMASK(1, 0))
#define DSA_8021Q_SUBVLAN(x) \
*/
memset(&data->phy_stats, 0xff, sizeof(data->phy_stats));
memset(&data->mac_stats, 0xff, sizeof(data->mac_stats));
- memset(&data->ctrl_stats, 0xff, sizeof(data->mac_stats));
+ memset(&data->ctrl_stats, 0xff, sizeof(data->ctrl_stats));
memset(&data->rmon_stats, 0xff, sizeof(data->rmon_stats));
if (test_bit(ETHTOOL_STATS_ETH_PHY, req_info->stat_mask) &&
if (master) {
skb->dev = master->dev;
skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
hsr_forward_skb(skb, master);
} else {
atomic_long_inc(&dev->tx_dropped);
goto out;
skb_reset_mac_header(skb);
+ skb_reset_mac_len(skb);
skb_reset_network_header(skb);
skb_reset_transport_header(skb);
}
}
-void hsr_fill_frame_info(__be16 proto, struct sk_buff *skb,
- struct hsr_frame_info *frame)
+int hsr_fill_frame_info(__be16 proto, struct sk_buff *skb,
+ struct hsr_frame_info *frame)
{
struct hsr_port *port = frame->port_rcv;
struct hsr_priv *hsr = port->hsr;
/* HSRv0 supervisory frames double as a tag so treat them as tagged. */
if ((!hsr->prot_version && proto == htons(ETH_P_PRP)) ||
proto == htons(ETH_P_HSR)) {
+ /* Check if skb contains hsr_ethhdr */
+ if (skb->mac_len < sizeof(struct hsr_ethhdr))
+ return -EINVAL;
+
/* HSR tagged frame :- Data or Supervision */
frame->skb_std = NULL;
frame->skb_prp = NULL;
frame->skb_hsr = skb;
frame->sequence_nr = hsr_get_skb_sequence_nr(skb);
- return;
+ return 0;
}
/* Standard frame or PRP from master port */
handle_std_frame(skb, frame);
+
+ return 0;
}
-void prp_fill_frame_info(__be16 proto, struct sk_buff *skb,
- struct hsr_frame_info *frame)
+int prp_fill_frame_info(__be16 proto, struct sk_buff *skb,
+ struct hsr_frame_info *frame)
{
/* Supervision frame */
struct prp_rct *rct = skb_get_PRP_rct(skb);
frame->skb_std = NULL;
frame->skb_prp = skb;
frame->sequence_nr = prp_get_skb_sequence_nr(rct);
- return;
+ return 0;
}
handle_std_frame(skb, frame);
+
+ return 0;
}
static int fill_frame_info(struct hsr_frame_info *frame,
struct hsr_vlan_ethhdr *vlan_hdr;
struct ethhdr *ethhdr;
__be16 proto;
+ int ret;
- /* Check if skb contains hsr_ethhdr */
- if (skb->mac_len < sizeof(struct hsr_ethhdr))
+ /* Check if skb contains ethhdr */
+ if (skb->mac_len < sizeof(struct ethhdr))
return -EINVAL;
memset(frame, 0, sizeof(*frame));
frame->is_from_san = false;
frame->port_rcv = port;
- hsr->proto_ops->fill_frame_info(proto, skb, frame);
+ ret = hsr->proto_ops->fill_frame_info(proto, skb, frame);
+ if (ret)
+ return ret;
+
check_local_dest(port->hsr, skb, frame);
return 0;
struct hsr_port *port);
bool prp_drop_frame(struct hsr_frame_info *frame, struct hsr_port *port);
bool hsr_drop_frame(struct hsr_frame_info *frame, struct hsr_port *port);
-void prp_fill_frame_info(__be16 proto, struct sk_buff *skb,
- struct hsr_frame_info *frame);
-void hsr_fill_frame_info(__be16 proto, struct sk_buff *skb,
- struct hsr_frame_info *frame);
+int prp_fill_frame_info(__be16 proto, struct sk_buff *skb,
+ struct hsr_frame_info *frame);
+int hsr_fill_frame_info(__be16 proto, struct sk_buff *skb,
+ struct hsr_frame_info *frame);
#endif /* __HSR_FORWARD_H */
struct hsr_port *port);
struct sk_buff * (*create_tagged_frame)(struct hsr_frame_info *frame,
struct hsr_port *port);
- void (*fill_frame_info)(__be16 proto, struct sk_buff *skb,
- struct hsr_frame_info *frame);
+ int (*fill_frame_info)(__be16 proto, struct sk_buff *skb,
+ struct hsr_frame_info *frame);
bool (*invalid_dan_ingress_frame)(__be16 protocol);
void (*update_san_info)(struct hsr_node *node, bool is_sup);
};
goto finish_pass;
skb_push(skb, ETH_HLEN);
-
- if (skb_mac_header(skb) != skb->data) {
- WARN_ONCE(1, "%s:%d: Malformed frame at source port %s)\n",
- __func__, __LINE__, port->dev->name);
- goto finish_consume;
- }
+ skb_reset_mac_header(skb);
+ if ((!hsr->prot_version && protocol == htons(ETH_P_PRP)) ||
+ protocol == htons(ETH_P_HSR))
+ skb_set_network_header(skb, ETH_HLEN + HSR_HLEN);
+ skb_reset_mac_len(skb);
hsr_forward_skb(skb, port);
nla_put_u8(msg, IEEE802154_ATTR_LLSEC_SECLEVEL, params.out_level) ||
nla_put_u32(msg, IEEE802154_ATTR_LLSEC_FRAME_COUNTER,
be32_to_cpu(params.frame_counter)) ||
- ieee802154_llsec_fill_key_id(msg, ¶ms.out_key))
+ ieee802154_llsec_fill_key_id(msg, ¶ms.out_key)) {
+ rc = -ENOBUFS;
goto out_free;
+ }
dev_put(dev);
{
struct ieee802154_llsec_device *dpos;
struct ieee802154_llsec_device_key *kpos;
- int rc = 0, idx = 0, idx2;
+ int idx = 0, idx2;
list_for_each_entry(dpos, &data->table->devices, list) {
if (idx++ < data->s_idx)
data->nlmsg_seq,
dpos->hwaddr, kpos,
data->dev)) {
- return rc = -EMSGSIZE;
+ return -EMSGSIZE;
}
data->s_idx2++;
data->s_idx++;
}
- return rc;
+ return 0;
}
int ieee802154_llsec_dump_devkeys(struct sk_buff *skb,
}
if (nla_put_string(msg, IEEE802154_ATTR_PHY_NAME, wpan_phy_name(phy)) ||
- nla_put_string(msg, IEEE802154_ATTR_DEV_NAME, dev->name))
+ nla_put_string(msg, IEEE802154_ATTR_DEV_NAME, dev->name)) {
+ rc = -EMSGSIZE;
goto nla_put_failure;
+ }
dev_put(dev);
wpan_phy_put(phy);
if (!nla || nla_parse_nested_deprecated(attrs, NL802154_DEV_ADDR_ATTR_MAX, nla, nl802154_dev_addr_policy, NULL))
return -EINVAL;
- if (!attrs[NL802154_DEV_ADDR_ATTR_PAN_ID] ||
- !attrs[NL802154_DEV_ADDR_ATTR_MODE] ||
- !(attrs[NL802154_DEV_ADDR_ATTR_SHORT] ||
- attrs[NL802154_DEV_ADDR_ATTR_EXTENDED]))
+ if (!attrs[NL802154_DEV_ADDR_ATTR_PAN_ID] || !attrs[NL802154_DEV_ADDR_ATTR_MODE])
return -EINVAL;
addr->pan_id = nla_get_le16(attrs[NL802154_DEV_ADDR_ATTR_PAN_ID]);
addr->mode = nla_get_u32(attrs[NL802154_DEV_ADDR_ATTR_MODE]);
switch (addr->mode) {
case NL802154_DEV_ADDR_SHORT:
+ if (!attrs[NL802154_DEV_ADDR_ATTR_SHORT])
+ return -EINVAL;
addr->short_addr = nla_get_le16(attrs[NL802154_DEV_ADDR_ATTR_SHORT]);
break;
case NL802154_DEV_ADDR_EXTENDED:
+ if (!attrs[NL802154_DEV_ADDR_ATTR_EXTENDED])
+ return -EINVAL;
addr->extended_addr = nla_get_le64(attrs[NL802154_DEV_ADDR_ATTR_EXTENDED]);
break;
default:
BTF_ID(func, tcp_reno_undo_cwnd)
BTF_ID(func, tcp_slow_start)
BTF_ID(func, tcp_cong_avoid_ai)
+#ifdef CONFIG_X86
#ifdef CONFIG_DYNAMIC_FTRACE
#if IS_BUILTIN(CONFIG_TCP_CONG_CUBIC)
BTF_ID(func, cubictcp_init)
BTF_ID(func, bbr_set_state)
#endif
#endif /* CONFIG_DYNAMIC_FTRACE */
+#endif /* CONFIG_X86 */
BTF_SET_END(bpf_tcp_ca_kfunc_ids)
static bool bpf_tcp_ca_check_kfunc_call(u32 kfunc_btf_id)
/*
- * Copy BOOTP-supplied string if not already set.
+ * Copy BOOTP-supplied string
*/
static int __init ic_bootp_string(char *dest, char *src, int len, int max)
{
}
break;
case 12: /* Host name */
- ic_bootp_string(utsname()->nodename, ext+1, *ext,
- __NEW_UTS_LEN);
- ic_host_name_set = 1;
+ if (!ic_host_name_set) {
+ ic_bootp_string(utsname()->nodename, ext+1, *ext,
+ __NEW_UTS_LEN);
+ ic_host_name_set = 1;
+ }
break;
case 15: /* Domain name (DNS) */
- ic_bootp_string(ic_domain, ext+1, *ext, sizeof(ic_domain));
+ if (!ic_domain[0])
+ ic_bootp_string(ic_domain, ext+1, *ext, sizeof(ic_domain));
break;
case 17: /* Root path */
if (!root_server_path[0])
IPV6_TLV_PADN, 0 };
/* we assume size > sizeof(ra) here */
- /* limit our allocations to order-0 page */
- size = min_t(int, size, SKB_MAX_ORDER(0, 0));
skb = sock_alloc_send_skb(sk, size, 1, &err);
-
if (!skb)
return NULL;
hdr = ipv6_hdr(skb);
fhdr = (struct frag_hdr *)skb_transport_header(skb);
- if (!(fhdr->frag_off & htons(0xFFF9))) {
+ if (!(fhdr->frag_off & htons(IP6_OFFSET | IP6_MF))) {
/* It is not a fragmented frame */
skb->transport_header += sizeof(struct frag_hdr);
__IP6_INC_STATS(net,
IP6CB(skb)->nhoff = (u8 *)fhdr - skb_network_header(skb);
IP6CB(skb)->flags |= IP6SKB_FRAGMENTED;
+ IP6CB(skb)->frag_max_size = ntohs(hdr->payload_len) +
+ sizeof(struct ipv6hdr);
return 1;
}
if (nh) {
if (rt->fib6_src.plen) {
NL_SET_ERR_MSG(extack, "Nexthops can not be used with source routing");
- goto out;
+ goto out_free;
}
if (!nexthop_get(nh)) {
NL_SET_ERR_MSG(extack, "Nexthop has been deleted");
- goto out;
+ goto out_free;
}
rt->nh = nh;
fib6_nh = nexthop_fib6_nh(rt->nh);
out:
fib6_info_release(rt);
return ERR_PTR(err);
+out_free:
+ ip_fib_metrics_put(rt->fib6_metrics);
+ kfree(rt);
+ return ERR_PTR(err);
}
int ip6_route_add(struct fib6_config *cfg, gfp_t gfp_flags,
if (ipip6_tunnel_create(dev) < 0)
goto failed_free;
+ if (!parms->name[0])
+ strcpy(parms->name, dev->name);
+
return nt;
failed_free:
goto partial_message;
}
+ if (skb_has_frag_list(head)) {
+ kfree_skb_list(skb_shinfo(head)->frag_list);
+ skb_shinfo(head)->frag_list = NULL;
+ }
+
if (head != kcm->seq_skb)
kfree_skb(head);
#define IEEE80211_ENCRYPT_HEADROOM 8
#define IEEE80211_ENCRYPT_TAILROOM 18
-/* IEEE 802.11 (Ch. 9.5 Defragmentation) requires support for concurrent
- * reception of at least three fragmented frames. This limit can be increased
- * by changing this define, at the cost of slower frame reassembly and
- * increased memory use (about 2 kB of RAM per entry). */
-#define IEEE80211_FRAGMENT_MAX 4
-
/* power level hasn't been configured (or set to automatic) */
#define IEEE80211_UNSET_POWER_LEVEL INT_MIN
#define IEEE80211_MAX_NAN_INSTANCE_ID 255
-struct ieee80211_fragment_entry {
- struct sk_buff_head skb_list;
- unsigned long first_frag_time;
- u16 seq;
- u16 extra_len;
- u16 last_frag;
- u8 rx_queue;
- bool check_sequential_pn; /* needed for CCMP/GCMP */
- u8 last_pn[6]; /* PN of the last fragment if CCMP was used */
-};
-
-
struct ieee80211_bss {
u32 device_ts_beacon, device_ts_presp;
*/
int security_idx;
- u32 tkip_iv32;
- u16 tkip_iv16;
+ union {
+ struct {
+ u32 iv32;
+ u16 iv16;
+ } tkip;
+ struct {
+ u8 pn[IEEE80211_CCMP_PN_LEN];
+ } ccm_gcm;
+ };
};
struct ieee80211_csa_settings {
char name[IFNAMSIZ];
- /* Fragment table for host-based reassembly */
- struct ieee80211_fragment_entry fragments[IEEE80211_FRAGMENT_MAX];
- unsigned int fragment_next;
+ struct ieee80211_fragment_cache frags;
/* TID bitmap for NoAck policy */
u16 noack_map;
#define debug_noinline
#endif
+void ieee80211_init_frag_cache(struct ieee80211_fragment_cache *cache);
+void ieee80211_destroy_frag_cache(struct ieee80211_fragment_cache *cache);
+
#endif /* IEEE80211_I_H */
* Copyright 2008, Johannes Berg <johannes@sipsolutions.net>
* Copyright 2013-2014 Intel Mobile Communications GmbH
* Copyright (c) 2016 Intel Deutschland GmbH
- * Copyright (C) 2018-2020 Intel Corporation
+ * Copyright (C) 2018-2021 Intel Corporation
*/
#include <linux/slab.h>
#include <linux/kernel.h>
*/
static void ieee80211_teardown_sdata(struct ieee80211_sub_if_data *sdata)
{
- int i;
-
/* free extra data */
ieee80211_free_keys(sdata, false);
ieee80211_debugfs_remove_netdev(sdata);
- for (i = 0; i < IEEE80211_FRAGMENT_MAX; i++)
- __skb_queue_purge(&sdata->fragments[i].skb_list);
- sdata->fragment_next = 0;
+ ieee80211_destroy_frag_cache(&sdata->frags);
if (ieee80211_vif_is_mesh(&sdata->vif))
ieee80211_mesh_teardown_sdata(sdata);
sdata->wdev.wiphy = local->hw.wiphy;
sdata->local = local;
- for (i = 0; i < IEEE80211_FRAGMENT_MAX; i++)
- skb_queue_head_init(&sdata->fragments[i].skb_list);
+ ieee80211_init_frag_cache(&sdata->frags);
INIT_LIST_HEAD(&sdata->key_list);
struct ieee80211_sub_if_data *sdata,
struct sta_info *sta)
{
+ static atomic_t key_color = ATOMIC_INIT(0);
struct ieee80211_key *old_key;
int idx = key->conf.keyidx;
bool pairwise = key->conf.flags & IEEE80211_KEY_FLAG_PAIRWISE;
key->sdata = sdata;
key->sta = sta;
+ /*
+ * Assign a unique ID to every key so we can easily prevent mixed
+ * key and fragment cache attacks.
+ */
+ key->color = atomic_inc_return(&key_color);
+
increment_tailroom_need_count(sdata);
ret = ieee80211_key_replace(sdata, sta, pairwise, old_key, key);
} debugfs;
#endif
+ unsigned int color;
+
/*
* key config, must be last because it contains key
* material as variable length member
* Copyright 2007-2010 Johannes Berg <johannes@sipsolutions.net>
* Copyright 2013-2014 Intel Mobile Communications GmbH
* Copyright(c) 2015 - 2017 Intel Deutschland GmbH
- * Copyright (C) 2018-2020 Intel Corporation
+ * Copyright (C) 2018-2021 Intel Corporation
*/
#include <linux/jiffies.h>
return result;
}
+void ieee80211_init_frag_cache(struct ieee80211_fragment_cache *cache)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(cache->entries); i++)
+ skb_queue_head_init(&cache->entries[i].skb_list);
+}
+
+void ieee80211_destroy_frag_cache(struct ieee80211_fragment_cache *cache)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(cache->entries); i++)
+ __skb_queue_purge(&cache->entries[i].skb_list);
+}
+
static inline struct ieee80211_fragment_entry *
-ieee80211_reassemble_add(struct ieee80211_sub_if_data *sdata,
+ieee80211_reassemble_add(struct ieee80211_fragment_cache *cache,
unsigned int frag, unsigned int seq, int rx_queue,
struct sk_buff **skb)
{
struct ieee80211_fragment_entry *entry;
- entry = &sdata->fragments[sdata->fragment_next++];
- if (sdata->fragment_next >= IEEE80211_FRAGMENT_MAX)
- sdata->fragment_next = 0;
+ entry = &cache->entries[cache->next++];
+ if (cache->next >= IEEE80211_FRAGMENT_MAX)
+ cache->next = 0;
- if (!skb_queue_empty(&entry->skb_list))
- __skb_queue_purge(&entry->skb_list);
+ __skb_queue_purge(&entry->skb_list);
__skb_queue_tail(&entry->skb_list, *skb); /* no need for locking */
*skb = NULL;
}
static inline struct ieee80211_fragment_entry *
-ieee80211_reassemble_find(struct ieee80211_sub_if_data *sdata,
+ieee80211_reassemble_find(struct ieee80211_fragment_cache *cache,
unsigned int frag, unsigned int seq,
int rx_queue, struct ieee80211_hdr *hdr)
{
struct ieee80211_fragment_entry *entry;
int i, idx;
- idx = sdata->fragment_next;
+ idx = cache->next;
for (i = 0; i < IEEE80211_FRAGMENT_MAX; i++) {
struct ieee80211_hdr *f_hdr;
struct sk_buff *f_skb;
if (idx < 0)
idx = IEEE80211_FRAGMENT_MAX - 1;
- entry = &sdata->fragments[idx];
+ entry = &cache->entries[idx];
if (skb_queue_empty(&entry->skb_list) || entry->seq != seq ||
entry->rx_queue != rx_queue ||
entry->last_frag + 1 != frag)
return NULL;
}
+static bool requires_sequential_pn(struct ieee80211_rx_data *rx, __le16 fc)
+{
+ return rx->key &&
+ (rx->key->conf.cipher == WLAN_CIPHER_SUITE_CCMP ||
+ rx->key->conf.cipher == WLAN_CIPHER_SUITE_CCMP_256 ||
+ rx->key->conf.cipher == WLAN_CIPHER_SUITE_GCMP ||
+ rx->key->conf.cipher == WLAN_CIPHER_SUITE_GCMP_256) &&
+ ieee80211_has_protected(fc);
+}
+
static ieee80211_rx_result debug_noinline
ieee80211_rx_h_defragment(struct ieee80211_rx_data *rx)
{
+ struct ieee80211_fragment_cache *cache = &rx->sdata->frags;
struct ieee80211_hdr *hdr;
u16 sc;
__le16 fc;
unsigned int frag, seq;
struct ieee80211_fragment_entry *entry;
struct sk_buff *skb;
+ struct ieee80211_rx_status *status = IEEE80211_SKB_RXCB(rx->skb);
hdr = (struct ieee80211_hdr *)rx->skb->data;
fc = hdr->frame_control;
goto out_no_led;
}
+ if (rx->sta)
+ cache = &rx->sta->frags;
+
if (likely(!ieee80211_has_morefrags(fc) && frag == 0))
goto out;
if (frag == 0) {
/* This is the first fragment of a new frame. */
- entry = ieee80211_reassemble_add(rx->sdata, frag, seq,
+ entry = ieee80211_reassemble_add(cache, frag, seq,
rx->seqno_idx, &(rx->skb));
- if (rx->key &&
- (rx->key->conf.cipher == WLAN_CIPHER_SUITE_CCMP ||
- rx->key->conf.cipher == WLAN_CIPHER_SUITE_CCMP_256 ||
- rx->key->conf.cipher == WLAN_CIPHER_SUITE_GCMP ||
- rx->key->conf.cipher == WLAN_CIPHER_SUITE_GCMP_256) &&
- ieee80211_has_protected(fc)) {
+ if (requires_sequential_pn(rx, fc)) {
int queue = rx->security_idx;
/* Store CCMP/GCMP PN so that we can verify that the
* next fragment has a sequential PN value.
*/
entry->check_sequential_pn = true;
+ entry->is_protected = true;
+ entry->key_color = rx->key->color;
memcpy(entry->last_pn,
rx->key->u.ccmp.rx_pn[queue],
IEEE80211_CCMP_PN_LEN);
sizeof(rx->key->u.gcmp.rx_pn[queue]));
BUILD_BUG_ON(IEEE80211_CCMP_PN_LEN !=
IEEE80211_GCMP_PN_LEN);
+ } else if (rx->key &&
+ (ieee80211_has_protected(fc) ||
+ (status->flag & RX_FLAG_DECRYPTED))) {
+ entry->is_protected = true;
+ entry->key_color = rx->key->color;
}
return RX_QUEUED;
}
/* This is a fragment for a frame that should already be pending in
* fragment cache. Add this fragment to the end of the pending entry.
*/
- entry = ieee80211_reassemble_find(rx->sdata, frag, seq,
+ entry = ieee80211_reassemble_find(cache, frag, seq,
rx->seqno_idx, hdr);
if (!entry) {
I802_DEBUG_INC(rx->local->rx_handlers_drop_defrag);
if (entry->check_sequential_pn) {
int i;
u8 pn[IEEE80211_CCMP_PN_LEN], *rpn;
- int queue;
- if (!rx->key ||
- (rx->key->conf.cipher != WLAN_CIPHER_SUITE_CCMP &&
- rx->key->conf.cipher != WLAN_CIPHER_SUITE_CCMP_256 &&
- rx->key->conf.cipher != WLAN_CIPHER_SUITE_GCMP &&
- rx->key->conf.cipher != WLAN_CIPHER_SUITE_GCMP_256))
+ if (!requires_sequential_pn(rx, fc))
+ return RX_DROP_UNUSABLE;
+
+ /* Prevent mixed key and fragment cache attacks */
+ if (entry->key_color != rx->key->color)
return RX_DROP_UNUSABLE;
+
memcpy(pn, entry->last_pn, IEEE80211_CCMP_PN_LEN);
for (i = IEEE80211_CCMP_PN_LEN - 1; i >= 0; i--) {
pn[i]++;
if (pn[i])
break;
}
- queue = rx->security_idx;
- rpn = rx->key->u.ccmp.rx_pn[queue];
+
+ rpn = rx->ccm_gcm.pn;
if (memcmp(pn, rpn, IEEE80211_CCMP_PN_LEN))
return RX_DROP_UNUSABLE;
memcpy(entry->last_pn, pn, IEEE80211_CCMP_PN_LEN);
+ } else if (entry->is_protected &&
+ (!rx->key ||
+ (!ieee80211_has_protected(fc) &&
+ !(status->flag & RX_FLAG_DECRYPTED)) ||
+ rx->key->color != entry->key_color)) {
+ /* Drop this as a mixed key or fragment cache attack, even
+ * if for TKIP Michael MIC should protect us, and WEP is a
+ * lost cause anyway.
+ */
+ return RX_DROP_UNUSABLE;
+ } else if (entry->is_protected && rx->key &&
+ entry->key_color != rx->key->color &&
+ (status->flag & RX_FLAG_DECRYPTED)) {
+ return RX_DROP_UNUSABLE;
}
skb_pull(rx->skb, ieee80211_hdrlen(fc));
struct ethhdr *ehdr = (struct ethhdr *) rx->skb->data;
/*
- * Allow EAPOL frames to us/the PAE group address regardless
- * of whether the frame was encrypted or not.
+ * Allow EAPOL frames to us/the PAE group address regardless of
+ * whether the frame was encrypted or not, and always disallow
+ * all other destination addresses for them.
*/
- if (ehdr->h_proto == rx->sdata->control_port_protocol &&
- (ether_addr_equal(ehdr->h_dest, rx->sdata->vif.addr) ||
- ether_addr_equal(ehdr->h_dest, pae_group_addr)))
- return true;
+ if (unlikely(ehdr->h_proto == rx->sdata->control_port_protocol))
+ return ether_addr_equal(ehdr->h_dest, rx->sdata->vif.addr) ||
+ ether_addr_equal(ehdr->h_dest, pae_group_addr);
if (ieee80211_802_1x_port_control(rx) ||
ieee80211_drop_unencrypted(rx, fc))
cfg80211_rx_control_port(dev, skb, noencrypt);
dev_kfree_skb(skb);
} else {
+ struct ethhdr *ehdr = (void *)skb_mac_header(skb);
+
memset(skb->cb, 0, sizeof(skb->cb));
+ /*
+ * 802.1X over 802.11 requires that the authenticator address
+ * be used for EAPOL frames. However, 802.1X allows the use of
+ * the PAE group address instead. If the interface is part of
+ * a bridge and we pass the frame with the PAE group address,
+ * then the bridge will forward it to the network (even if the
+ * client was not associated yet), which isn't supposed to
+ * happen.
+ * To avoid that, rewrite the destination address to our own
+ * address, so that the authenticator (e.g. hostapd) will see
+ * the frame, but bridge won't forward it anywhere else. Note
+ * that due to earlier filtering, the only other address can
+ * be the PAE group address.
+ */
+ if (unlikely(skb->protocol == sdata->control_port_protocol &&
+ !ether_addr_equal(ehdr->h_dest, sdata->vif.addr)))
+ ether_addr_copy(ehdr->h_dest, sdata->vif.addr);
+
/* deliver to local stack */
if (rx->list)
list_add_tail(&skb->list, rx->list);
if ((sdata->vif.type == NL80211_IFTYPE_AP ||
sdata->vif.type == NL80211_IFTYPE_AP_VLAN) &&
!(sdata->flags & IEEE80211_SDATA_DONT_BRIDGE_PACKETS) &&
+ ehdr->h_proto != rx->sdata->control_port_protocol &&
(sdata->vif.type != NL80211_IFTYPE_AP_VLAN || !sdata->u.vlan.sta)) {
if (is_multicast_ether_addr(ehdr->h_dest) &&
ieee80211_vif_get_num_mcast_if(sdata) != 0) {
if (ieee80211_data_to_8023_exthdr(skb, ðhdr,
rx->sdata->vif.addr,
rx->sdata->vif.type,
- data_offset))
+ data_offset, true))
return RX_DROP_UNUSABLE;
ieee80211_amsdu_to_8023s(skb, &frame_list, dev->dev_addr,
if (is_multicast_ether_addr(hdr->addr1))
return RX_DROP_UNUSABLE;
+ if (rx->key) {
+ /*
+ * We should not receive A-MSDUs on pre-HT connections,
+ * and HT connections cannot use old ciphers. Thus drop
+ * them, as in those cases we couldn't even have SPP
+ * A-MSDUs or such.
+ */
+ switch (rx->key->conf.cipher) {
+ case WLAN_CIPHER_SUITE_WEP40:
+ case WLAN_CIPHER_SUITE_WEP104:
+ case WLAN_CIPHER_SUITE_TKIP:
+ return RX_DROP_UNUSABLE;
+ default:
+ break;
+ }
+ }
+
return __ieee80211_rx_h_amsdu(rx, 0);
}
* Copyright 2006-2007 Jiri Benc <jbenc@suse.cz>
* Copyright 2013-2014 Intel Mobile Communications GmbH
* Copyright (C) 2015 - 2017 Intel Deutschland GmbH
- * Copyright (C) 2018-2020 Intel Corporation
+ * Copyright (C) 2018-2021 Intel Corporation
*/
#include <linux/module.h>
u64_stats_init(&sta->rx_stats.syncp);
+ ieee80211_init_frag_cache(&sta->frags);
+
sta->sta_state = IEEE80211_STA_NONE;
/* Mark TID as unreserved */
ieee80211_sta_debugfs_remove(sta);
+ ieee80211_destroy_frag_cache(&sta->frags);
+
cleanup_single_sta(sta);
}
* Copyright 2002-2005, Devicescape Software, Inc.
* Copyright 2013-2014 Intel Mobile Communications GmbH
* Copyright(c) 2015-2017 Intel Deutschland GmbH
- * Copyright(c) 2020 Intel Corporation
+ * Copyright(c) 2020-2021 Intel Corporation
*/
#ifndef STA_INFO_H
};
/*
+ * IEEE 802.11-2016 (10.6 "Defragmentation") recommends support for "concurrent
+ * reception of at least one MSDU per access category per associated STA"
+ * on APs, or "at least one MSDU per access category" on other interface types.
+ *
+ * This limit can be increased by changing this define, at the cost of slower
+ * frame reassembly and increased memory use while fragments are pending.
+ */
+#define IEEE80211_FRAGMENT_MAX 4
+
+struct ieee80211_fragment_entry {
+ struct sk_buff_head skb_list;
+ unsigned long first_frag_time;
+ u16 seq;
+ u16 extra_len;
+ u16 last_frag;
+ u8 rx_queue;
+ u8 check_sequential_pn:1, /* needed for CCMP/GCMP */
+ is_protected:1;
+ u8 last_pn[6]; /* PN of the last fragment if CCMP was used */
+ unsigned int key_color;
+};
+
+struct ieee80211_fragment_cache {
+ struct ieee80211_fragment_entry entries[IEEE80211_FRAGMENT_MAX];
+ unsigned int next;
+};
+
+/*
* The bandwidth threshold below which the per-station CoDel parameters will be
* scaled to be more lenient (to prevent starvation of slow stations). This
* value will be scaled by the number of active stations when it is being
* @status_stats.last_ack_signal: last ACK signal
* @status_stats.ack_signal_filled: last ACK signal validity
* @status_stats.avg_ack_signal: average ACK signal
+ * @frags: fragment cache
*/
struct sta_info {
/* General information, mostly static */
struct cfg80211_chan_def tdls_chandef;
+ struct ieee80211_fragment_cache frags;
+
/* keep last! */
struct ieee80211_sta sta;
};
* Copyright 2002-2004, Instant802 Networks, Inc.
* Copyright 2008, Jouni Malinen <j@w1.fi>
* Copyright (C) 2016-2017 Intel Deutschland GmbH
+ * Copyright (C) 2020-2021 Intel Corporation
*/
#include <linux/netdevice.h>
update_iv:
/* update IV in key information to be able to detect replays */
- rx->key->u.tkip.rx[rx->security_idx].iv32 = rx->tkip_iv32;
- rx->key->u.tkip.rx[rx->security_idx].iv16 = rx->tkip_iv16;
+ rx->key->u.tkip.rx[rx->security_idx].iv32 = rx->tkip.iv32;
+ rx->key->u.tkip.rx[rx->security_idx].iv16 = rx->tkip.iv16;
return RX_CONTINUE;
key, skb->data + hdrlen,
skb->len - hdrlen, rx->sta->sta.addr,
hdr->addr1, hwaccel, rx->security_idx,
- &rx->tkip_iv32,
- &rx->tkip_iv16);
+ &rx->tkip.iv32,
+ &rx->tkip.iv16);
if (res != TKIP_DECRYPT_OK)
return RX_DROP_UNUSABLE;
}
memcpy(key->u.ccmp.rx_pn[queue], pn, IEEE80211_CCMP_PN_LEN);
+ if (unlikely(ieee80211_is_frag(hdr)))
+ memcpy(rx->ccm_gcm.pn, pn, IEEE80211_CCMP_PN_LEN);
}
/* Remove CCMP header and MIC */
}
memcpy(key->u.gcmp.rx_pn[queue], pn, IEEE80211_GCMP_PN_LEN);
+ if (unlikely(ieee80211_is_frag(hdr)))
+ memcpy(rx->ccm_gcm.pn, pn, IEEE80211_CCMP_PN_LEN);
}
/* Remove GCMP header and MIC */
memcpy(mp_opt->hmac, ptr, MPTCPOPT_HMAC_LEN);
pr_debug("MP_JOIN hmac");
} else {
- pr_warn("MP_JOIN bad option size");
mp_opt->mp_join = 0;
}
break;
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ADDADDR);
} else {
mptcp_pm_add_addr_echoed(msk, &mp_opt.addr);
- mptcp_pm_del_add_timer(msk, &mp_opt.addr);
+ mptcp_pm_del_add_timer(msk, &mp_opt.addr, true);
MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_ECHOADD);
}
struct mptcp_pm_add_entry *
mptcp_pm_del_add_timer(struct mptcp_sock *msk,
- struct mptcp_addr_info *addr)
+ struct mptcp_addr_info *addr, bool check_id)
{
struct mptcp_pm_add_entry *entry;
struct sock *sk = (struct sock *)msk;
spin_lock_bh(&msk->pm.lock);
entry = mptcp_lookup_anno_list_by_saddr(msk, addr);
- if (entry)
+ if (entry && (!check_id || entry->addr.id == addr->id))
entry->retrans_times = ADD_ADDR_RETRANS_MAX;
spin_unlock_bh(&msk->pm.lock);
- if (entry)
+ if (entry && (!check_id || entry->addr.id == addr->id))
sk_stop_timer_sync(sk, &entry->add_timer);
return entry;
{
struct mptcp_pm_add_entry *entry;
- entry = mptcp_pm_del_add_timer(msk, addr);
+ entry = mptcp_pm_del_add_timer(msk, addr, false);
if (entry) {
list_del(&entry->list);
kfree(entry);
!mpext->frozen;
}
+/* we can append data to the given data frag if:
+ * - there is space available in the backing page_frag
+ * - the data frag tail matches the current page_frag free offset
+ * - the data frag end sequence number matches the current write seq
+ */
static bool mptcp_frag_can_collapse_to(const struct mptcp_sock *msk,
const struct page_frag *pfrag,
const struct mptcp_data_frag *df)
{
return df && pfrag->page == df->page &&
pfrag->size - pfrag->offset > 0 &&
+ pfrag->offset == (df->offset + df->data_len) &&
df->data_seq + df->data_len == msk->write_seq;
}
{
struct mptcp_sock *msk = mptcp_sk(sk);
+#ifdef CONFIG_LOCKDEP
+ WARN_ON_ONCE(!lockdep_is_held(&sk->sk_lock.slock));
+#endif
+
if (!msk->wmem_reserved)
return;
static void __mptcp_clean_una_wakeup(struct sock *sk)
{
+#ifdef CONFIG_LOCKDEP
+ WARN_ON_ONCE(!lockdep_is_held(&sk->sk_lock.slock));
+#endif
__mptcp_clean_una(sk);
mptcp_write_space(sk);
}
+static void mptcp_clean_una_wakeup(struct sock *sk)
+{
+ mptcp_data_lock(sk);
+ __mptcp_clean_una_wakeup(sk);
+ mptcp_data_unlock(sk);
+}
+
static void mptcp_enter_memory_pressure(struct sock *sk)
{
struct mptcp_subflow_context *subflow;
struct sock *ssk;
int ret;
- __mptcp_clean_una_wakeup(sk);
+ mptcp_clean_una_wakeup(sk);
dfrag = mptcp_rtx_head(sk);
if (!dfrag) {
if (mptcp_data_fin_enabled(msk)) {
timer_setup(&msk->sk.icsk_retransmit_timer, mptcp_retransmit_timer, 0);
timer_setup(&sk->sk_timer, mptcp_timeout_timer, 0);
- tcp_assign_congestion_control(sk);
-
return 0;
}
static int mptcp_init_sock(struct sock *sk)
{
+ struct inet_connection_sock *icsk = inet_csk(sk);
struct net *net = sock_net(sk);
int ret;
if (ret)
return ret;
+ /* fetch the ca name; do it outside __mptcp_init_sock(), so that clone will
+ * propagate the correct value
+ */
+ tcp_assign_congestion_control(sk);
+ strcpy(mptcp_sk(sk)->ca_name, icsk->icsk_ca_ops->name);
+
+ /* no need to keep a reference to the ops, the name will suffice */
+ tcp_cleanup_congestion_control(sk);
+ icsk->icsk_ca_ops = NULL;
+
sk_sockets_allocated_inc(sk);
sk->sk_rcvbuf = sock_net(sk)->ipv4.sysctl_tcp_rmem[1];
sk->sk_sndbuf = sock_net(sk)->ipv4.sysctl_tcp_wmem[1];
sk_stream_kill_queues(sk);
xfrm_sk_free_policy(sk);
- tcp_cleanup_congestion_control(sk);
sk_refcnt_debug_release(sk);
mptcp_dispose_initial_subflow(msk);
sock_put(sk);
} rcvq_space;
u32 setsockopt_seq;
+ char ca_name[TCP_CA_NAME_MAX];
};
#define mptcp_lock_sock(___sk, cb) do { \
bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk);
struct mptcp_pm_add_entry *
mptcp_pm_del_add_timer(struct mptcp_sock *msk,
- struct mptcp_addr_info *addr);
+ struct mptcp_addr_info *addr, bool check_id);
struct mptcp_pm_add_entry *
mptcp_lookup_anno_list_by_saddr(struct mptcp_sock *msk,
struct mptcp_addr_info *addr);
}
if (ret == 0)
- tcp_set_congestion_control(sk, name, false, cap_net_admin);
+ strcpy(msk->ca_name, name);
release_sock(sk);
return ret;
sock_valbool_flag(ssk, SOCK_DBG, sock_flag(sk, SOCK_DBG));
if (inet_csk(sk)->icsk_ca_ops != inet_csk(ssk)->icsk_ca_ops)
- tcp_set_congestion_control(ssk, inet_csk(sk)->icsk_ca_ops->name, false, true);
+ tcp_set_congestion_control(ssk, msk->ca_name, false, true);
}
static void __mptcp_sockopt_sync(struct mptcp_sock *msk, struct sock *ssk)
/* if the sk is MP_CAPABLE, we try to fetch the client key */
if (subflow_req->mp_capable) {
- if (TCP_SKB_CB(skb)->seq != subflow_req->ssn_offset + 1) {
- /* here we can receive and accept an in-window,
- * out-of-order pkt, which will not carry the MP_CAPABLE
- * opt even on mptcp enabled paths
- */
- goto create_msk;
- }
-
+ /* we can receive and accept an in-window, out-of-order pkt,
+ * which may not carry the MP_CAPABLE opt even on mptcp enabled
+ * paths: always try to extract the peer key, and fallback
+ * for packets missing it.
+ * Even OoO DSS packets coming legitly after dropped or
+ * reordered MPC will cause fallback, but we don't have other
+ * options.
+ */
mptcp_get_options(skb, &mp_opt);
if (!mp_opt.mp_capable) {
fallback = true;
goto create_child;
}
-create_msk:
new_msk = mptcp_sk_clone(listener->conn, &mp_opt, req);
if (!new_msk)
fallback = true;
data_len = mpext->data_len;
if (data_len == 0) {
- pr_err("Infinite mapping not handled");
MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_INFINITEMAPRX);
return MAPPING_INVALID;
}
status = get_mapping_status(ssk, msk);
trace_subflow_check_data_avail(status, skb_peek(&ssk->sk_receive_queue));
- if (status == MAPPING_INVALID) {
- ssk->sk_err = EBADMSG;
- goto fatal;
- }
- if (status == MAPPING_DUMMY) {
- __mptcp_do_fallback(msk);
- skb = skb_peek(&ssk->sk_receive_queue);
- subflow->map_valid = 1;
- subflow->map_seq = READ_ONCE(msk->ack_seq);
- subflow->map_data_len = skb->len;
- subflow->map_subflow_seq = tcp_sk(ssk)->copied_seq -
- subflow->ssn_offset;
- subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL;
- return true;
- }
+ if (unlikely(status == MAPPING_INVALID))
+ goto fallback;
+
+ if (unlikely(status == MAPPING_DUMMY))
+ goto fallback;
if (status != MAPPING_OK)
goto no_data;
* MP_CAPABLE-based mapping
*/
if (unlikely(!READ_ONCE(msk->can_ack))) {
- if (!subflow->mpc_map) {
- ssk->sk_err = EBADMSG;
- goto fatal;
- }
+ if (!subflow->mpc_map)
+ goto fallback;
WRITE_ONCE(msk->remote_key, subflow->remote_key);
WRITE_ONCE(msk->ack_seq, subflow->map_seq);
WRITE_ONCE(msk->can_ack, true);
no_data:
subflow_sched_work_if_closed(msk, ssk);
return false;
-fatal:
- /* fatal protocol error, close the socket */
- /* This barrier is coupled with smp_rmb() in tcp_poll() */
- smp_wmb();
- ssk->sk_error_report(ssk);
- tcp_set_state(ssk, TCP_CLOSE);
- subflow->reset_transient = 0;
- subflow->reset_reason = MPTCP_RST_EMPTCP;
- tcp_send_active_reset(ssk, GFP_ATOMIC);
- subflow->data_avail = 0;
- return false;
+
+fallback:
+ /* RFC 8684 section 3.7. */
+ if (subflow->mp_join || subflow->fully_established) {
+ /* fatal protocol error, close the socket.
+ * subflow_error_report() will introduce the appropriate barriers
+ */
+ ssk->sk_err = EBADMSG;
+ ssk->sk_error_report(ssk);
+ tcp_set_state(ssk, TCP_CLOSE);
+ subflow->reset_transient = 0;
+ subflow->reset_reason = MPTCP_RST_EMPTCP;
+ tcp_send_active_reset(ssk, GFP_ATOMIC);
+ subflow->data_avail = 0;
+ return false;
+ }
+
+ __mptcp_do_fallback(msk);
+ skb = skb_peek(&ssk->sk_receive_queue);
+ subflow->map_valid = 1;
+ subflow->map_seq = READ_ONCE(msk->ack_seq);
+ subflow->map_data_len = skb->len;
+ subflow->map_subflow_seq = tcp_sk(ssk)->copied_seq - subflow->ssn_offset;
+ subflow->data_avail = MPTCP_SUBFLOW_DATA_AVAIL;
+ return true;
}
bool mptcp_subflow_data_available(struct sock *sk)
ip_vs_addr_copy(svc->af, &svc->addr, &u->addr);
svc->port = u->port;
svc->fwmark = u->fwmark;
- svc->flags = u->flags;
+ svc->flags = u->flags & ~IP_VS_SVC_F_HASHED;
svc->timeout = u->timeout * HZ;
svc->netmask = u->netmask;
svc->ipvs = ipvs;
#if IS_ENABLED(CONFIG_IPV6)
cleanup_sockopt:
- nf_unregister_sockopt(&so_getorigdst6);
+ nf_unregister_sockopt(&so_getorigdst);
#endif
return ret;
}
{
flow->timeout = nf_flowtable_time_stamp + NF_FLOW_TIMEOUT;
- if (likely(!nf_flowtable_hw_offload(flow_table) ||
- !test_and_clear_bit(NF_FLOW_HW_REFRESH, &flow->flags)))
+ if (likely(!nf_flowtable_hw_offload(flow_table)))
return;
nf_flow_offload_add(flow_table, flow);
err = flow_offload_rule_add(offload, flow_rule);
if (err < 0)
- set_bit(NF_FLOW_HW_REFRESH, &offload->flow->flags);
- else
- set_bit(IPS_HW_OFFLOAD_BIT, &offload->flow->ct->status);
+ goto out;
+
+ set_bit(IPS_HW_OFFLOAD_BIT, &offload->flow->ct->status);
+out:
nf_flow_offload_destroy(flow_rule);
}
goto nla_put_failure;
if (nla_put_string(skb, NFTA_TABLE_NAME, table->name) ||
- nla_put_be32(skb, NFTA_TABLE_FLAGS, htonl(table->flags)) ||
+ nla_put_be32(skb, NFTA_TABLE_FLAGS,
+ htonl(table->flags & NFT_TABLE_F_MASK)) ||
nla_put_be32(skb, NFTA_TABLE_USE, htonl(table->use)) ||
nla_put_be64(skb, NFTA_TABLE_HANDLE, cpu_to_be64(table->handle),
NFTA_TABLE_PAD))
static void nf_tables_table_disable(struct net *net, struct nft_table *table)
{
+ table->flags &= ~NFT_TABLE_F_DORMANT;
nft_table_disable(net, table, 0);
+ table->flags |= NFT_TABLE_F_DORMANT;
}
-enum {
- NFT_TABLE_STATE_UNCHANGED = 0,
- NFT_TABLE_STATE_DORMANT,
- NFT_TABLE_STATE_WAKEUP
-};
+#define __NFT_TABLE_F_INTERNAL (NFT_TABLE_F_MASK + 1)
+#define __NFT_TABLE_F_WAS_DORMANT (__NFT_TABLE_F_INTERNAL << 0)
+#define __NFT_TABLE_F_WAS_AWAKEN (__NFT_TABLE_F_INTERNAL << 1)
+#define __NFT_TABLE_F_UPDATE (__NFT_TABLE_F_WAS_DORMANT | \
+ __NFT_TABLE_F_WAS_AWAKEN)
static int nf_tables_updtable(struct nft_ctx *ctx)
{
struct nft_trans *trans;
u32 flags;
- int ret = 0;
+ int ret;
if (!ctx->nla[NFTA_TABLE_FLAGS])
return 0;
if ((flags & NFT_TABLE_F_DORMANT) &&
!(ctx->table->flags & NFT_TABLE_F_DORMANT)) {
- nft_trans_table_state(trans) = NFT_TABLE_STATE_DORMANT;
+ ctx->table->flags |= NFT_TABLE_F_DORMANT;
+ if (!(ctx->table->flags & __NFT_TABLE_F_UPDATE))
+ ctx->table->flags |= __NFT_TABLE_F_WAS_AWAKEN;
} else if (!(flags & NFT_TABLE_F_DORMANT) &&
ctx->table->flags & NFT_TABLE_F_DORMANT) {
- ret = nf_tables_table_enable(ctx->net, ctx->table);
- if (ret >= 0)
- nft_trans_table_state(trans) = NFT_TABLE_STATE_WAKEUP;
+ ctx->table->flags &= ~NFT_TABLE_F_DORMANT;
+ if (!(ctx->table->flags & __NFT_TABLE_F_UPDATE)) {
+ ret = nf_tables_table_enable(ctx->net, ctx->table);
+ if (ret < 0)
+ goto err_register_hooks;
+
+ ctx->table->flags |= __NFT_TABLE_F_WAS_DORMANT;
+ }
}
- if (ret < 0)
- goto err;
- nft_trans_table_flags(trans) = flags;
nft_trans_table_update(trans) = true;
nft_trans_commit_list_add_tail(ctx->net, trans);
+
return 0;
-err:
+
+err_register_hooks:
nft_trans_destroy(trans);
return ret;
}
static int nft_chain_parse_hook(struct net *net,
const struct nlattr * const nla[],
struct nft_chain_hook *hook, u8 family,
- bool autoload)
+ struct netlink_ext_ack *extack, bool autoload)
{
struct nftables_pernet *nft_net = nft_pernet(net);
struct nlattr *ha[NFTA_HOOK_MAX + 1];
if (nla[NFTA_CHAIN_TYPE]) {
type = nf_tables_chain_type_lookup(net, nla[NFTA_CHAIN_TYPE],
family, autoload);
- if (IS_ERR(type))
+ if (IS_ERR(type)) {
+ NL_SET_BAD_ATTR(extack, nla[NFTA_CHAIN_TYPE]);
return PTR_ERR(type);
+ }
}
if (hook->num >= NFT_MAX_HOOKS || !(type->hook_mask & (1 << hook->num)))
return -EOPNOTSUPP;
hook->priority <= NF_IP_PRI_CONNTRACK)
return -EOPNOTSUPP;
- if (!try_module_get(type->owner))
+ if (!try_module_get(type->owner)) {
+ if (nla[NFTA_CHAIN_TYPE])
+ NL_SET_BAD_ATTR(extack, nla[NFTA_CHAIN_TYPE]);
return -ENOENT;
+ }
hook->type = type;
static u64 chain_id;
static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
- u8 policy, u32 flags)
+ u8 policy, u32 flags,
+ struct netlink_ext_ack *extack)
{
const struct nlattr * const *nla = ctx->nla;
struct nft_table *table = ctx->table;
if (flags & NFT_CHAIN_BINDING)
return -EOPNOTSUPP;
- err = nft_chain_parse_hook(net, nla, &hook, family, true);
+ err = nft_chain_parse_hook(net, nla, &hook, family, extack,
+ true);
if (err < 0)
return err;
return -EEXIST;
}
err = nft_chain_parse_hook(ctx->net, nla, &hook, ctx->family,
- false);
+ extack, false);
if (err < 0)
return err;
extack);
}
- return nf_tables_addchain(&ctx, family, genmask, policy, flags);
+ return nf_tables_addchain(&ctx, family, genmask, policy, flags, extack);
}
static int nf_tables_delchain(struct sk_buff *skb, const struct nfnl_info *info,
if (n == NFT_RULE_MAXEXPRS)
goto err1;
err = nf_tables_expr_parse(&ctx, tmp, &expr_info[n]);
- if (err < 0)
+ if (err < 0) {
+ NL_SET_BAD_ATTR(extack, tmp);
goto err1;
+ }
size += expr_info[n].ops->size;
n++;
}
switch (trans->msg_type) {
case NFT_MSG_NEWTABLE:
if (nft_trans_table_update(trans)) {
- if (nft_trans_table_state(trans) == NFT_TABLE_STATE_DORMANT)
+ if (!(trans->ctx.table->flags & __NFT_TABLE_F_UPDATE)) {
+ nft_trans_destroy(trans);
+ break;
+ }
+ if (trans->ctx.table->flags & NFT_TABLE_F_DORMANT)
nf_tables_table_disable(net, trans->ctx.table);
- trans->ctx.table->flags = nft_trans_table_flags(trans);
+ trans->ctx.table->flags &= ~__NFT_TABLE_F_UPDATE;
} else {
nft_clear(net, trans->ctx.table);
}
switch (trans->msg_type) {
case NFT_MSG_NEWTABLE:
if (nft_trans_table_update(trans)) {
- if (nft_trans_table_state(trans) == NFT_TABLE_STATE_WAKEUP)
+ if (!(trans->ctx.table->flags & __NFT_TABLE_F_UPDATE)) {
+ nft_trans_destroy(trans);
+ break;
+ }
+ if (trans->ctx.table->flags & __NFT_TABLE_F_WAS_DORMANT) {
nf_tables_table_disable(net, trans->ctx.table);
-
+ trans->ctx.table->flags |= NFT_TABLE_F_DORMANT;
+ } else if (trans->ctx.table->flags & __NFT_TABLE_F_WAS_AWAKEN) {
+ trans->ctx.table->flags &= ~NFT_TABLE_F_DORMANT;
+ }
+ trans->ctx.table->flags &= ~__NFT_TABLE_F_UPDATE;
nft_trans_destroy(trans);
} else {
list_del_rcu(&trans->ctx.table->list);
nfnl_cthelper_update(const struct nlattr * const tb[],
struct nf_conntrack_helper *helper)
{
+ u32 size;
int ret;
- if (tb[NFCTH_PRIV_DATA_LEN])
- return -EBUSY;
+ if (tb[NFCTH_PRIV_DATA_LEN]) {
+ size = ntohl(nla_get_be32(tb[NFCTH_PRIV_DATA_LEN]));
+ if (size != helper->data_len)
+ return -EBUSY;
+ }
if (tb[NFCTH_POLICY]) {
ret = nfnl_cthelper_update_policy(helper, tb[NFCTH_POLICY]);
struct nf_conn *ct;
ct = nf_ct_get(pkt->skb, &ctinfo);
- if (!ct || ctinfo == IP_CT_UNTRACKED) {
+ if (!ct || nf_ct_is_confirmed(ct) || nf_ct_is_template(ct)) {
regs->verdict.code = NFT_BREAK;
return;
}
*
* Return: true on match, false otherwise.
*/
-static bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
- const u32 *key, const struct nft_set_ext **ext)
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+ const u32 *key, const struct nft_set_ext **ext)
{
struct nft_pipapo *priv = nft_set_priv(set);
unsigned long *res_map, *fill_map;
int pipapo_refill(unsigned long *map, int len, int rules, unsigned long *dst,
union nft_pipapo_map_bucket *mt, bool match_only);
+bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
+ const u32 *key, const struct nft_set_ext **ext);
/**
* pipapo_and_field_buckets_4bit() - Intersect 4-bit buckets
bool map_index;
int i, ret = 0;
+ if (unlikely(!irq_fpu_usable()))
+ return nft_pipapo_lookup(net, set, key, ext);
+
m = rcu_dereference(priv->match);
/* This also protects access to all data related to scratch maps */
static inline void
netlink_lock_table(void)
{
+ unsigned long flags;
+
/* read_lock() synchronizes us to netlink_table_grab */
- read_lock(&nl_table_lock);
+ read_lock_irqsave(&nl_table_lock, flags);
atomic_inc(&nl_table_users);
- read_unlock(&nl_table_lock);
+ read_unlock_irqrestore(&nl_table_lock, flags);
}
static inline void
if (!llcp_sock->service_name) {
nfc_llcp_local_put(llcp_sock->local);
llcp_sock->local = NULL;
+ llcp_sock->dev = NULL;
ret = -ENOMEM;
goto put_dev;
}
llcp_sock->local = NULL;
kfree(llcp_sock->service_name);
llcp_sock->service_name = NULL;
+ llcp_sock->dev = NULL;
ret = -EADDRINUSE;
goto put_dev;
}
void nci_free_device(struct nci_dev *ndev)
{
nfc_free_device(ndev->nfc_dev);
+ nci_hci_deallocate(ndev);
kfree(ndev);
}
EXPORT_SYMBOL(nci_free_device);
return hdev;
}
+
+void nci_hci_deallocate(struct nci_dev *ndev)
+{
+ kfree(ndev->hci_dev);
+}
return -ESOCKTNOSUPPORT;
if (sock->type == SOCK_RAW) {
- if (!capable(CAP_NET_RAW))
+ if (!ns_capable(net->user_ns, CAP_NET_RAW))
return -EPERM;
sock->ops = &rawsock_raw_ops;
} else {
spin_lock(&meter->lock);
long_delta_ms = (now_ms - meter->used); /* ms */
+ if (long_delta_ms < 0) {
+ /* This condition means that we have several threads fighting
+ * for a meter lock, and the one who received the packets a
+ * bit later wins. Assuming that all racing threads received
+ * packets at the same time to avoid overflow.
+ */
+ long_delta_ms = 0;
+ }
/* Make sure delta_ms will not be too large, so that bucket will not
* wrap around below.
ktime_to_timespec64_cond(shhwtstamps->hwtstamp, ts))
return TP_STATUS_TS_RAW_HARDWARE;
- if (ktime_to_timespec64_cond(skb->tstamp, ts))
+ if ((flags & SOF_TIMESTAMPING_SOFTWARE) &&
+ ktime_to_timespec64_cond(skb->tstamp, ts))
return TP_STATUS_TS_SOFTWARE;
return 0;
skb_copy_bits(skb, 0, h.raw + macoff, snaplen);
- if (!(ts_status = tpacket_get_timestamp(skb, &ts, po->tp_tstamp)))
+ /* Always timestamp; prefer an existing software timestamp taken
+ * closer to the time of capture.
+ */
+ ts_status = tpacket_get_timestamp(skb, &ts,
+ po->tp_tstamp | SOF_TIMESTAMPING_SOFTWARE);
+ if (!ts_status)
ktime_get_real_ts64(&ts);
status |= ts_status;
if (loop_trans) {
rds_trans_put(loop_trans);
conn->c_loopback = 1;
- if (is_outgoing && trans->t_prefer_loopback) {
- /* "outgoing" connection - and the transport
- * says it wants the connection handled by the
- * loopback transport. This is what TCP does.
- */
- trans = &rds_loop_transport;
+ if (trans->t_prefer_loopback) {
+ if (likely(is_outgoing)) {
+ /* "outgoing" connection to local address.
+ * Protocol says it wants the connection
+ * handled by the loopback transport.
+ * This is what TCP does.
+ */
+ trans = &rds_loop_transport;
+ } else {
+ /* No transport currently in use
+ * should end up here, but if it
+ * does, reset/destroy the connection.
+ */
+ kmem_cache_free(rds_conn_slab, conn);
+ conn = ERR_PTR(-EOPNOTSUPP);
+ goto out;
+ }
}
}
}
#endif
-static int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr,
- __u32 scope_id)
+int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr,
+ __u32 scope_id)
{
struct net_device *dev = NULL;
#if IS_ENABLED(CONFIG_IPV6)
u64 rds_tcp_map_seq(struct rds_tcp_connection *tc, u32 seq);
extern struct rds_transport rds_tcp_transport;
void rds_tcp_accept_work(struct sock *sk);
-
+int rds_tcp_laddr_check(struct net *net, const struct in6_addr *addr,
+ __u32 scope_id);
/* tcp_connect.c */
int rds_tcp_conn_path_connect(struct rds_conn_path *cp);
void rds_tcp_conn_path_shutdown(struct rds_conn_path *conn);
}
#endif
+ if (!rds_tcp_laddr_check(sock_net(sock->sk), peer_addr, dev_if)) {
+ /* local address connection is only allowed via loopback */
+ ret = -EOPNOTSUPP;
+ goto out;
+ }
+
conn = rds_conn_create(sock_net(sock->sk),
my_addr, peer_addr,
&rds_tcp_transport, 0, GFP_KERNEL, dev_if);
*/
cached = tcf_ct_skb_nfct_cached(net, skb, p->zone, force);
if (!cached) {
- if (!commit && tcf_ct_flow_table_lookup(p, skb, family)) {
+ if (tcf_ct_flow_table_lookup(p, skb, family)) {
skip_add = true;
goto do_nat;
}
* even if the connection is already confirmed.
*/
nf_conntrack_confirm(skb);
- } else if (!skip_add) {
- tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
}
+ if (!skip_add)
+ tcf_ct_flow_table_process_conn(p->ct_ft, ct, ctinfo);
+
out_push:
skb_push_rcsum(skb, nh_ofs);
sizeof(p->zone));
}
- if (p->zone == NF_CT_DEFAULT_ZONE_ID)
- return 0;
-
nf_ct_zone_init(&zone, p->zone, NF_CT_DEFAULT_ZONE_DIR, 0);
tmpl = nf_ct_tmpl_alloc(net, &zone, GFP_KERNEL);
if (!tmpl) {
/* If we missed on some chain */
if (ret == TC_ACT_UNSPEC && last_executed_chain) {
- ext = skb_ext_add(skb, TC_SKB_EXT);
+ ext = tc_skb_ext_alloc(skb);
if (WARN_ON_ONCE(!ext))
return TC_ACT_SHOT;
ext->chain = last_executed_chain;
struct dsmark_qdisc_data *p = qdisc_priv(sch);
pr_debug("%s(sch %p,[qdisc %p])\n", __func__, sch, p);
- qdisc_reset(p->q);
+ if (p->q)
+ qdisc_reset(p->q);
sch->qstats.backlog = 0;
sch->q.qlen = 0;
}
/* Classifies packet into corresponding flow */
idx = fq_pie_classify(skb, sch, &ret);
- sel_flow = &q->flows[idx];
+ if (idx == 0) {
+ if (ret & __NET_XMIT_BYPASS)
+ qdisc_qstats_drop(sch);
+ __qdisc_drop(skb, to_free);
+ return ret;
+ }
+ idx--;
+ sel_flow = &q->flows[idx];
/* Checks whether adding a new packet would exceed memory limit */
get_pie_cb(skb)->mem_usage = skb->truesize;
memory_limited = q->memory_usage > q->memory_limit + skb->truesize;
goto flow_error;
}
q->flows_cnt = nla_get_u32(tb[TCA_FQ_PIE_FLOWS]);
- if (!q->flows_cnt || q->flows_cnt >= 65536) {
+ if (!q->flows_cnt || q->flows_cnt > 65536) {
NL_SET_ERR_MSG_MOD(extack,
- "Number of flows must range in [1..65535]");
+ "Number of flows must range in [1..65536]");
goto flow_error;
}
}
struct fq_pie_sched_data *q = from_timer(q, t, adapt_timer);
struct Qdisc *sch = q->sch;
spinlock_t *root_lock; /* to lock qdisc for probability calculations */
- u16 idx;
+ u32 idx;
root_lock = qdisc_lock(qdisc_root_sleeping(sch));
spin_lock(root_lock);
{
struct fq_pie_sched_data *q = qdisc_priv(sch);
int err;
- u16 idx;
+ u32 idx;
pie_params_init(&q->p_params);
sch->limit = 10 * 1024;
static void fq_pie_reset(struct Qdisc *sch)
{
struct fq_pie_sched_data *q = qdisc_priv(sch);
- u16 idx;
+ u32 idx;
INIT_LIST_HEAD(&q->new_flows);
INIT_LIST_HEAD(&q->old_flows);
const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
EXPORT_SYMBOL(default_qdisc_ops);
+static void qdisc_maybe_clear_missed(struct Qdisc *q,
+ const struct netdev_queue *txq)
+{
+ clear_bit(__QDISC_STATE_MISSED, &q->state);
+
+ /* Make sure the below netif_xmit_frozen_or_stopped()
+ * checking happens after clearing STATE_MISSED.
+ */
+ smp_mb__after_atomic();
+
+ /* Checking netif_xmit_frozen_or_stopped() again to
+ * make sure STATE_MISSED is set if the STATE_MISSED
+ * set by netif_tx_wake_queue()'s rescheduling of
+ * net_tx_action() is cleared by the above clear_bit().
+ */
+ if (!netif_xmit_frozen_or_stopped(txq))
+ set_bit(__QDISC_STATE_MISSED, &q->state);
+}
+
/* Main transmission queue. */
/* Modifications to data participating in scheduling must be protected with
}
} else {
skb = SKB_XOFF_MAGIC;
+ qdisc_maybe_clear_missed(q, txq);
}
}
}
} else {
skb = NULL;
+ qdisc_maybe_clear_missed(q, txq);
}
if (lock)
spin_unlock(lock);
*validate = true;
if ((q->flags & TCQ_F_ONETXQUEUE) &&
- netif_xmit_frozen_or_stopped(txq))
+ netif_xmit_frozen_or_stopped(txq)) {
+ qdisc_maybe_clear_missed(q, txq);
return skb;
+ }
skb = qdisc_dequeue_skb_bad_txq(q);
if (unlikely(skb)) {
HARD_TX_LOCK(dev, txq, smp_processor_id());
if (!netif_xmit_frozen_or_stopped(txq))
skb = dev_hard_start_xmit(skb, dev, txq, &ret);
+ else
+ qdisc_maybe_clear_missed(q, txq);
HARD_TX_UNLOCK(dev, txq);
} else {
{
struct pfifo_fast_priv *priv = qdisc_priv(qdisc);
struct sk_buff *skb = NULL;
+ bool need_retry = true;
int band;
+retry:
for (band = 0; band < PFIFO_FAST_BANDS && !skb; band++) {
struct skb_array *q = band2list(priv, band);
}
if (likely(skb)) {
qdisc_update_stats_at_dequeue(qdisc, skb);
+ } else if (need_retry &&
+ test_bit(__QDISC_STATE_MISSED, &qdisc->state)) {
+ /* Delay clearing the STATE_MISSED here to reduce
+ * the overhead of the second spin_trylock() in
+ * qdisc_run_begin() and __netif_schedule() calling
+ * in qdisc_run_end().
+ */
+ clear_bit(__QDISC_STATE_MISSED, &qdisc->state);
+
+ /* Make sure dequeuing happens after clearing
+ * STATE_MISSED.
+ */
+ smp_mb__after_atomic();
+
+ need_retry = false;
+
+ goto retry;
} else {
WRITE_ONCE(qdisc->empty, true);
}
qdisc_reset(qdisc);
spin_unlock_bh(qdisc_lock(qdisc));
- if (nolock)
+ if (nolock) {
+ clear_bit(__QDISC_STATE_MISSED, &qdisc->state);
spin_unlock_bh(&qdisc->seqlock);
+ }
}
static bool some_qdisc_is_busy(struct net_device *dev)
struct Qdisc *old_q;
/* One ref for cl->leaf.q, the other for dev_queue->qdisc. */
- qdisc_refcount_inc(new_q);
+ if (new_q)
+ qdisc_refcount_inc(new_q);
old_q = htb_graft_helper(dev_queue, new_q);
WARN_ON(!(old_q->flags & TCQ_F_BUILTIN));
}
cl->parent->common.classid,
NULL);
if (q->offload) {
- if (new_q) {
+ if (new_q)
htb_set_lockdep_class_child(new_q);
- htb_parent_to_leaf_offload(sch, dev_queue, new_q);
- }
+ htb_parent_to_leaf_offload(sch, dev_queue, new_q);
}
}
transports)
t->encap_port = encap_port;
+ asoc->encap_port = encap_port;
return 0;
}
.data = &init_net.sctp.encap_port,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_dointvec,
+ .proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
.extra2 = &udp_port_max,
},
int smcd_register_dev(struct smcd_dev *smcd)
{
+ int rc;
+
mutex_lock(&smcd_dev_list.mutex);
if (list_empty(&smcd_dev_list.list)) {
u8 *system_eid = NULL;
dev_name(&smcd->dev), smcd->pnetid,
smcd->pnetid_by_user ? " (user defined)" : "");
- return device_add(&smcd->dev);
+ rc = device_add(&smcd->dev);
+ if (rc) {
+ mutex_lock(&smcd_dev_list.mutex);
+ list_del(&smcd->list);
+ mutex_unlock(&smcd_dev_list.mutex);
+ }
+
+ return rc;
}
EXPORT_SYMBOL_GPL(smcd_register_dev);
return;
}
- /*
- * Even though there was an error, we may have acquired
- * a request slot somehow. Make sure not to leak it.
- */
- if (task->tk_rqstp)
- xprt_release(task);
-
switch (status) {
case -ENOMEM:
rpc_delay(task, HZ >> 2);
static void xprt_init(struct rpc_xprt *xprt, struct net *net);
static __be32 xprt_alloc_xid(struct rpc_xprt *xprt);
static void xprt_destroy(struct rpc_xprt *xprt);
+static void xprt_request_init(struct rpc_task *task);
static DEFINE_SPINLOCK(xprt_list_lock);
static LIST_HEAD(xprt_list);
spin_unlock(&xprt->queue_lock);
}
-static void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
+static void xprt_complete_request_init(struct rpc_task *task)
+{
+ if (task->tk_rqstp)
+ xprt_request_init(task);
+}
+
+void xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task)
{
set_bit(XPRT_CONGESTED, &xprt->state);
- rpc_sleep_on(&xprt->backlog, task, NULL);
+ rpc_sleep_on(&xprt->backlog, task, xprt_complete_request_init);
+}
+EXPORT_SYMBOL_GPL(xprt_add_backlog);
+
+static bool __xprt_set_rq(struct rpc_task *task, void *data)
+{
+ struct rpc_rqst *req = data;
+
+ if (task->tk_rqstp == NULL) {
+ memset(req, 0, sizeof(*req)); /* mark unused */
+ task->tk_rqstp = req;
+ return true;
+ }
+ return false;
}
-static void xprt_wake_up_backlog(struct rpc_xprt *xprt)
+bool xprt_wake_up_backlog(struct rpc_xprt *xprt, struct rpc_rqst *req)
{
- if (rpc_wake_up_next(&xprt->backlog) == NULL)
+ if (rpc_wake_up_first(&xprt->backlog, __xprt_set_rq, req) == NULL) {
clear_bit(XPRT_CONGESTED, &xprt->state);
+ return false;
+ }
+ return true;
}
+EXPORT_SYMBOL_GPL(xprt_wake_up_backlog);
static bool xprt_throttle_congested(struct rpc_xprt *xprt, struct rpc_task *task)
{
goto out;
spin_lock(&xprt->reserve_lock);
if (test_bit(XPRT_CONGESTED, &xprt->state)) {
- rpc_sleep_on(&xprt->backlog, task, NULL);
+ xprt_add_backlog(xprt, task);
ret = true;
}
spin_unlock(&xprt->reserve_lock);
void xprt_free_slot(struct rpc_xprt *xprt, struct rpc_rqst *req)
{
spin_lock(&xprt->reserve_lock);
- if (!xprt_dynamic_free_slot(xprt, req)) {
+ if (!xprt_wake_up_backlog(xprt, req) &&
+ !xprt_dynamic_free_slot(xprt, req)) {
memset(req, 0, sizeof(*req)); /* mark unused */
list_add(&req->rq_list, &xprt->free);
}
- xprt_wake_up_backlog(xprt);
spin_unlock(&xprt->reserve_lock);
}
EXPORT_SYMBOL_GPL(xprt_free_slot);
xdr_free_bvec(&req->rq_snd_buf);
if (req->rq_cred != NULL)
put_rpccred(req->rq_cred);
- task->tk_rqstp = NULL;
if (req->rq_release_snd_buf)
req->rq_release_snd_buf(req);
+ task->tk_rqstp = NULL;
if (likely(!bc_prealloc(req)))
xprt->ops->free_slot(xprt, req);
else
return false;
}
-/* The tail iovec might not reside in the same page as the
- * head iovec.
+/* The tail iovec may include an XDR pad for the page list,
+ * as well as additional content, and may not reside in the
+ * same page as the head iovec.
*/
static bool rpcrdma_prepare_tail_iov(struct rpcrdma_req *req,
struct xdr_buf *xdr,
struct rpcrdma_req *req,
struct xdr_buf *xdr)
{
- struct kvec *tail = &xdr->tail[0];
-
if (!rpcrdma_prepare_head_iov(r_xprt, req, xdr->head[0].iov_len))
return false;
- /* If there is a Read chunk, the page list is handled
+ /* If there is a Read chunk, the page list is being handled
* via explicit RDMA, and thus is skipped here.
*/
- if (tail->iov_len) {
- if (!rpcrdma_prepare_tail_iov(req, xdr,
- offset_in_page(tail->iov_base),
- tail->iov_len))
+ /* Do not include the tail if it is only an XDR pad */
+ if (xdr->tail[0].iov_len > 3) {
+ unsigned int page_base, len;
+
+ /* If the content in the page list is an odd length,
+ * xdr_write_pages() adds a pad at the beginning of
+ * the tail iovec. Force the tail's non-pad content to
+ * land at the next XDR position in the Send message.
+ */
+ page_base = offset_in_page(xdr->tail[0].iov_base);
+ len = xdr->tail[0].iov_len;
+ page_base += len & 3;
+ len -= len & 3;
+ if (!rpcrdma_prepare_tail_iov(req, xdr, page_base, len))
return false;
kref_get(&req->rl_kref);
}
return;
out_sleep:
- set_bit(XPRT_CONGESTED, &xprt->state);
- rpc_sleep_on(&xprt->backlog, task, NULL);
task->tk_status = -EAGAIN;
+ xprt_add_backlog(xprt, task);
}
/**
struct rpcrdma_xprt *r_xprt =
container_of(xprt, struct rpcrdma_xprt, rx_xprt);
- memset(rqst, 0, sizeof(*rqst));
- rpcrdma_buffer_put(&r_xprt->rx_buf, rpcr_to_rdmar(rqst));
- if (unlikely(!rpc_wake_up_next(&xprt->backlog)))
- clear_bit(XPRT_CONGESTED, &xprt->state);
+ rpcrdma_reply_put(&r_xprt->rx_buf, rpcr_to_rdmar(rqst));
+ if (!xprt_wake_up_backlog(xprt, rqst)) {
+ memset(rqst, 0, sizeof(*rqst));
+ rpcrdma_buffer_put(&r_xprt->rx_buf, rpcr_to_rdmar(rqst));
+ }
}
static bool rpcrdma_check_regbuf(struct rpcrdma_xprt *r_xprt,
}
/**
+ * rpcrdma_reply_put - Put reply buffers back into pool
+ * @buffers: buffer pool
+ * @req: object to return
+ *
+ */
+void rpcrdma_reply_put(struct rpcrdma_buffer *buffers, struct rpcrdma_req *req)
+{
+ if (req->rl_reply) {
+ rpcrdma_rep_put(buffers, req->rl_reply);
+ req->rl_reply = NULL;
+ }
+}
+
+/**
* rpcrdma_buffer_get - Get a request buffer
* @buffers: Buffer pool from which to obtain a buffer
*
*/
void rpcrdma_buffer_put(struct rpcrdma_buffer *buffers, struct rpcrdma_req *req)
{
- if (req->rl_reply)
- rpcrdma_rep_put(buffers, req->rl_reply);
- req->rl_reply = NULL;
+ rpcrdma_reply_put(buffers, req);
spin_lock(&buffers->rb_lock);
list_add(&req->rl_list, &buffers->rb_send_bufs);
void rpcrdma_buffer_put(struct rpcrdma_buffer *buffers,
struct rpcrdma_req *req);
void rpcrdma_rep_put(struct rpcrdma_buffer *buf, struct rpcrdma_rep *rep);
+void rpcrdma_reply_put(struct rpcrdma_buffer *buffers, struct rpcrdma_req *req);
bool rpcrdma_regbuf_realloc(struct rpcrdma_regbuf *rb, size_t size,
gfp_t flags);
kernel_sock_shutdown(transport->sock, SHUT_RDWR);
return -ENOTCONN;
}
+ if (!transport->inet)
+ return -ENOTCONN;
xs_pktdump("packet data:",
req->rq_svec->iov_base,
tn->trial_addr = 0;
tn->addr_trial_end = 0;
tn->capabilities = TIPC_NODE_CAPABILITIES;
- INIT_WORK(&tn->final_work.work, tipc_net_finalize_work);
+ INIT_WORK(&tn->work, tipc_net_finalize_work);
memset(tn->node_id, 0, sizeof(tn->node_id));
memset(tn->node_id_string, 0, sizeof(tn->node_id_string));
tn->mon_threshold = TIPC_DEF_MON_THRESHOLD;
tipc_detach_loopback(net);
/* Make sure the tipc_net_finalize_work() finished */
- cancel_work_sync(&tn->final_work.work);
+ cancel_work_sync(&tn->work);
tipc_net_stop(net);
tipc_bcast_stop(net);
#ifdef CONFIG_TIPC_CRYPTO
tipc_crypto_stop(&tipc_net(net)->crypto_tx);
#endif
+ while (atomic_read(&tn->wq_count))
+ cond_resched();
}
static void __net_exit tipc_pernet_pre_exit(struct net *net)
extern int sysctl_tipc_rmem[3] __read_mostly;
extern int sysctl_tipc_named_timeout __read_mostly;
-struct tipc_net_work {
- struct work_struct work;
- struct net *net;
- u32 addr;
-};
-
struct tipc_net {
u8 node_id[NODE_ID_LEN];
u32 node_addr;
struct tipc_crypto *crypto_tx;
#endif
/* Work item for net finalize */
- struct tipc_net_work final_work;
+ struct work_struct work;
+ /* The numbers of work queues in schedule */
+ atomic_t wq_count;
};
static inline struct tipc_net *tipc_net(struct net *net)
/* Apply trial address if we just left trial period */
if (!trial && !self) {
- tipc_sched_net_finalize(net, tn->trial_addr);
+ schedule_work(&tn->work);
msg_set_prevnode(buf_msg(d->skb), tn->trial_addr);
msg_set_type(buf_msg(d->skb), DSC_REQ_MSG);
}
if (!time_before(jiffies, tn->addr_trial_end) && !tipc_own_addr(net)) {
mod_timer(&d->timer, jiffies + TIPC_DISC_INIT);
spin_unlock_bh(&d->lock);
- tipc_sched_net_finalize(net, tn->trial_addr);
+ schedule_work(&tn->work);
return;
}
return l->net_plane;
}
+struct net *tipc_link_net(struct tipc_link *l)
+{
+ return l->net;
+}
+
void tipc_link_update_caps(struct tipc_link *l, u16 capabilities)
{
l->peer_caps = capabilities;
int tipc_link_bc_nack_rcv(struct tipc_link *l, struct sk_buff *skb,
struct sk_buff_head *xmitq);
bool tipc_link_too_silent(struct tipc_link *l);
+struct net *tipc_link_net(struct tipc_link *l);
#endif
if (unlikely(head))
goto err;
*buf = NULL;
+ if (skb_has_frag_list(frag) && __skb_linearize(frag))
+ goto err;
frag = skb_unshare(frag, GFP_ATOMIC);
if (unlikely(!frag))
goto err;
head = *headbuf = frag;
TIPC_SKB_CB(head)->tail = NULL;
- if (skb_is_nonlinear(head)) {
- skb_walk_frags(head, tail) {
- TIPC_SKB_CB(head)->tail = tail;
- }
- } else {
- skb_frag_list_init(head);
- }
return 0;
}
#include "socket.h"
#include "node.h"
#include "bcast.h"
+#include "link.h"
#include "netlink.h"
#include "monitor.h"
void tipc_net_finalize_work(struct work_struct *work)
{
- struct tipc_net_work *fwork;
+ struct tipc_net *tn = container_of(work, struct tipc_net, work);
- fwork = container_of(work, struct tipc_net_work, work);
- tipc_net_finalize(fwork->net, fwork->addr);
-}
-
-void tipc_sched_net_finalize(struct net *net, u32 addr)
-{
- struct tipc_net *tn = tipc_net(net);
-
- tn->final_work.net = net;
- tn->final_work.addr = addr;
- schedule_work(&tn->final_work.work);
+ tipc_net_finalize(tipc_link_net(tn->bcl), tn->trial_addr);
}
void tipc_net_stop(struct net *net)
write_unlock_bh(&n->lock);
if (flags & TIPC_NOTIFY_NODE_DOWN)
- tipc_publ_notify(net, publ_list, n->addr, n->capabilities);
+ tipc_publ_notify(net, publ_list, sk.node, n->capabilities);
if (flags & TIPC_NOTIFY_NODE_UP)
- tipc_named_node_up(net, n->addr, n->capabilities);
+ tipc_named_node_up(net, sk.node, n->capabilities);
if (flags & TIPC_NOTIFY_LINK_UP) {
- tipc_mon_peer_up(net, n->addr, bearer_id);
- tipc_nametbl_publish(net, &ua, &sk, n->link_id);
+ tipc_mon_peer_up(net, sk.node, bearer_id);
+ tipc_nametbl_publish(net, &ua, &sk, sk.ref);
}
if (flags & TIPC_NOTIFY_LINK_DOWN) {
- tipc_mon_peer_down(net, n->addr, bearer_id);
- tipc_nametbl_withdraw(net, &ua, &sk, n->link_id);
+ tipc_mon_peer_down(net, sk.node, bearer_id);
+ tipc_nametbl_withdraw(net, &ua, &sk, sk.ref);
}
}
spin_lock_bh(&inputq->lock);
if (skb_peek(arrvq) == skb) {
skb_queue_splice_tail_init(&tmpq, inputq);
- __skb_dequeue(arrvq);
+ /* Decrease the skb's refcnt as increasing in the
+ * function tipc_skb_peek
+ */
+ kfree_skb(__skb_dequeue(arrvq));
}
spin_unlock_bh(&inputq->lock);
__skb_queue_purge(&tmpq);
kfree_rcu(rcast, rcu);
}
+ atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count);
dst_cache_destroy(&ub->rcast.dst_cache);
udp_tunnel_sock_release(ub->ubsock);
synchronize_net();
RCU_INIT_POINTER(ub->bearer, NULL);
/* sock_release need to be done outside of rtnl lock */
+ atomic_inc(&tipc_net(sock_net(ub->ubsock->sk))->wq_count);
INIT_WORK(&ub->work, cleanup_bearer);
schedule_work(&ub->work);
}
static DECLARE_WORK(tls_device_gc_work, tls_device_gc_task);
static LIST_HEAD(tls_device_gc_list);
static LIST_HEAD(tls_device_list);
+static LIST_HEAD(tls_device_down_list);
static DEFINE_SPINLOCK(tls_device_lock);
static void tls_device_free_ctx(struct tls_context *ctx)
struct tls_offload_context_rx *rx_ctx = tls_offload_ctx_rx(tls_ctx);
struct net_device *netdev;
- if (WARN_ON(test_and_set_bit(TLS_RX_SYNC_RUNNING, &tls_ctx->flags)))
- return;
-
trace_tls_device_rx_resync_send(sk, seq, rcd_sn, rx_ctx->resync_type);
+ rcu_read_lock();
netdev = READ_ONCE(tls_ctx->netdev);
if (netdev)
netdev->tlsdev_ops->tls_dev_resync(netdev, sk, seq, rcd_sn,
TLS_OFFLOAD_CTX_DIR_RX);
- clear_bit_unlock(TLS_RX_SYNC_RUNNING, &tls_ctx->flags);
+ rcu_read_unlock();
TLS_INC_STATS(sock_net(sk), LINUX_MIB_TLSRXDEVICERESYNC);
}
if (tls_ctx->rx_conf != TLS_HW)
return;
+ if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags)))
+ return;
prot = &tls_ctx->prot_info;
rx_ctx = tls_offload_ctx_rx(tls_ctx);
ctx->sw.decrypted |= is_decrypted;
+ if (unlikely(test_bit(TLS_RX_DEV_DEGRADED, &tls_ctx->flags))) {
+ if (likely(is_encrypted || is_decrypted))
+ return 0;
+
+ /* After tls_device_down disables the offload, the next SKB will
+ * likely have initial fragments decrypted, and final ones not
+ * decrypted. We need to reencrypt that single SKB.
+ */
+ return tls_device_reencrypt(sk, skb);
+ }
+
/* Return immediately if the record is either entirely plaintext or
* entirely ciphertext. Otherwise handle reencrypt partially decrypted
* record.
spin_unlock_irqrestore(&tls_device_lock, flags);
list_for_each_entry_safe(ctx, tmp, &list, list) {
+ /* Stop offloaded TX and switch to the fallback.
+ * tls_is_sk_tx_device_offloaded will return false.
+ */
+ WRITE_ONCE(ctx->sk->sk_validate_xmit_skb, tls_validate_xmit_skb_sw);
+
+ /* Stop the RX and TX resync.
+ * tls_dev_resync must not be called after tls_dev_del.
+ */
+ WRITE_ONCE(ctx->netdev, NULL);
+
+ /* Start skipping the RX resync logic completely. */
+ set_bit(TLS_RX_DEV_DEGRADED, &ctx->flags);
+
+ /* Sync with inflight packets. After this point:
+ * TX: no non-encrypted packets will be passed to the driver.
+ * RX: resync requests from the driver will be ignored.
+ */
+ synchronize_net();
+
+ /* Release the offload context on the driver side. */
if (ctx->tx_conf == TLS_HW)
netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
TLS_OFFLOAD_CTX_DIR_TX);
!test_bit(TLS_RX_DEV_CLOSED, &ctx->flags))
netdev->tlsdev_ops->tls_dev_del(netdev, ctx,
TLS_OFFLOAD_CTX_DIR_RX);
- WRITE_ONCE(ctx->netdev, NULL);
- smp_mb__before_atomic(); /* pairs with test_and_set_bit() */
- while (test_bit(TLS_RX_SYNC_RUNNING, &ctx->flags))
- usleep_range(10, 200);
+
dev_put(netdev);
- list_del_init(&ctx->list);
- if (refcount_dec_and_test(&ctx->refcount))
- tls_device_free_ctx(ctx);
+ /* Move the context to a separate list for two reasons:
+ * 1. When the context is deallocated, list_del is called.
+ * 2. It's no longer an offloaded context, so we don't want to
+ * run offload-specific code on this context.
+ */
+ spin_lock_irqsave(&tls_device_lock, flags);
+ list_move_tail(&ctx->list, &tls_device_down_list);
+ spin_unlock_irqrestore(&tls_device_lock, flags);
+
+ /* Device contexts for RX and TX will be freed in on sk_destruct
+ * by tls_device_free_ctx. rx_conf and tx_conf stay in TLS_HW.
+ */
}
up_write(&device_offload_lock);
}
EXPORT_SYMBOL_GPL(tls_validate_xmit_skb);
+struct sk_buff *tls_validate_xmit_skb_sw(struct sock *sk,
+ struct net_device *dev,
+ struct sk_buff *skb)
+{
+ return tls_sw_fallback(sk, skb);
+}
+
struct sk_buff *tls_encrypt_skb(struct sk_buff *skb)
{
return tls_sw_fallback(skb->sk, skb);
mutex_init(&ctx->tx_lock);
rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
ctx->sk_proto = READ_ONCE(sk->sk_prot);
+ ctx->sk = sk;
return ctx;
}
#include <linux/sched/signal.h>
#include <linux/module.h>
+#include <linux/splice.h>
#include <crypto/aead.h>
#include <net/strparser.h>
}
static struct sk_buff *tls_wait_data(struct sock *sk, struct sk_psock *psock,
- int flags, long timeo, int *err)
+ bool nonblock, long timeo, int *err)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);
if (sock_flag(sk, SOCK_DONE))
return NULL;
- if ((flags & MSG_DONTWAIT) || !timeo) {
+ if (nonblock || !timeo) {
*err = -EAGAIN;
return NULL;
}
bool async_capable;
bool async = false;
- skb = tls_wait_data(sk, psock, flags, timeo, &err);
+ skb = tls_wait_data(sk, psock, flags & MSG_DONTWAIT, timeo, &err);
if (!skb) {
if (psock) {
int ret = sk_msg_recvmsg(sk, psock, msg, len,
lock_sock(sk);
- timeo = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+ timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK);
- skb = tls_wait_data(sk, NULL, flags, timeo, &err);
+ skb = tls_wait_data(sk, NULL, flags & SPLICE_F_NONBLOCK, timeo, &err);
if (!skb)
goto splice_read_end;
int ieee80211_data_to_8023_exthdr(struct sk_buff *skb, struct ethhdr *ehdr,
const u8 *addr, enum nl80211_iftype iftype,
- u8 data_offset)
+ u8 data_offset, bool is_amsdu)
{
struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
struct {
skb_copy_bits(skb, hdrlen, &payload, sizeof(payload));
tmp.h_proto = payload.proto;
- if (likely((ether_addr_equal(payload.hdr, rfc1042_header) &&
+ if (likely((!is_amsdu && ether_addr_equal(payload.hdr, rfc1042_header) &&
tmp.h_proto != htons(ETH_P_AARP) &&
tmp.h_proto != htons(ETH_P_IPX)) ||
ether_addr_equal(payload.hdr, bridge_tunnel_header)))
remaining = skb->len - offset;
if (subframe_len > remaining)
goto purge;
+ /* mitigate A-MSDU aggregation injection attacks */
+ if (ether_addr_equal(eth.h_dest, rfc1042_header))
+ goto purge;
offset += sizeof(struct ethhdr);
last = remaining <= subframe_len + padding;
if (protocol)
goto out;
- rc = -ENOBUFS;
+ rc = -ENOMEM;
if ((sk = x25_alloc_socket(net, kern)) == NULL)
goto out;
for (i = 0; i < batch_size; i++) {
struct xdp_desc *tx_desc = xsk_ring_prod__tx_desc(&xsk->tx,
idx + i);
- tx_desc->addr = (*frame_nb + i) << XSK_UMEM__DEFAULT_FRAME_SHIFT;
+ tx_desc->addr = (*frame_nb + i) * opt_xsk_frame_size;
tx_desc->len = PKT_SIZE;
}
if (format != DRM_FORMAT_XRGB8888) {
pci_err(pdev, "format mismatch (0x%x != 0x%x)\n",
format, DRM_FORMAT_XRGB8888);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_release_regions;
}
if (width < 100 || width > 10000) {
pci_err(pdev, "width (%d) out of range\n", width);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_release_regions;
}
if (height < 100 || height > 10000) {
pci_err(pdev, "height (%d) out of range\n", height);
- return -EINVAL;
+ ret = -EINVAL;
+ goto err_release_regions;
}
pci_info(pdev, "mdpy found: %dx%d framebuffer\n",
width, height);
info = framebuffer_alloc(sizeof(struct mdpy_fb_par), &pdev->dev);
- if (!info)
+ if (!info) {
+ ret = -ENOMEM;
goto err_release_regions;
+ }
pci_set_drvdata(pdev, info);
par = info->par;
quiet_cmd_btf_ko = BTF [M] $@
cmd_btf_ko = \
if [ -f vmlinux ]; then \
- LLVM_OBJCOPY=$(OBJCOPY) $(PAHOLE) -J --btf_base vmlinux $@; \
+ LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J --btf_base vmlinux $@; \
else \
printf "Skipping BTF generation for %s due to unavailability of vmlinux\n" $@ 1>&2; \
fi;
fi
info "BTF" ${2}
- LLVM_OBJCOPY=${OBJCOPY} ${PAHOLE} -J ${extra_paholeopt} ${1}
+ LLVM_OBJCOPY="${OBJCOPY}" ${PAHOLE} -J ${extra_paholeopt} ${1}
# Create ${2} which contains just .BTF section but no symbols. Add
# SHF_ALLOC because .BTF will be part of the vmlinux image. --strip-all
#define MAX_LED (((SNDRV_CTL_ELEM_ACCESS_MIC_LED - SNDRV_CTL_ELEM_ACCESS_SPK_LED) \
>> SNDRV_CTL_ELEM_ACCESS_LED_SHIFT) + 1)
+#define to_led_card_dev(_dev) \
+ container_of(_dev, struct snd_ctl_led_card, dev)
+
enum snd_ctl_led_mode {
MODE_FOLLOW_MUTE = 0,
MODE_FOLLOW_ROUTE,
snd_ctl_led_refresh();
}
+static void snd_ctl_led_card_release(struct device *dev)
+{
+ struct snd_ctl_led_card *led_card = to_led_card_dev(dev);
+
+ kfree(led_card);
+}
+
+static void snd_ctl_led_release(struct device *dev)
+{
+}
+
+static void snd_ctl_led_dev_release(struct device *dev)
+{
+}
+
/*
* sysfs
*/
led_card->number = card->number;
led_card->led = led;
device_initialize(&led_card->dev);
+ led_card->dev.release = snd_ctl_led_card_release;
if (dev_set_name(&led_card->dev, "card%d", card->number) < 0)
goto cerr;
led_card->dev.parent = &led->dev;
put_device(&led_card->dev);
cerr2:
printk(KERN_ERR "snd_ctl_led: unable to add card%d", card->number);
- kfree(led_card);
}
}
snprintf(link_name, sizeof(link_name), "led-%s", led->name);
sysfs_remove_link(&card->ctl_dev.kobj, link_name);
sysfs_remove_link(&led_card->dev.kobj, "card");
- device_del(&led_card->dev);
- kfree(led_card);
+ device_unregister(&led_card->dev);
led->cards[card->number] = NULL;
}
}
device_initialize(&snd_ctl_led_dev);
snd_ctl_led_dev.class = sound_class;
+ snd_ctl_led_dev.release = snd_ctl_led_dev_release;
dev_set_name(&snd_ctl_led_dev, "ctl-led");
if (device_add(&snd_ctl_led_dev)) {
put_device(&snd_ctl_led_dev);
INIT_LIST_HEAD(&led->controls);
device_initialize(&led->dev);
led->dev.parent = &snd_ctl_led_dev;
+ led->dev.release = snd_ctl_led_release;
led->dev.groups = snd_ctl_led_dev_attr_groups;
dev_set_name(&led->dev, led->name);
if (device_add(&led->dev)) {
put_device(&led->dev);
for (; group > 0; group--) {
led = &snd_ctl_leds[group - 1];
- device_del(&led->dev);
+ device_unregister(&led->dev);
}
- device_del(&snd_ctl_led_dev);
+ device_unregister(&snd_ctl_led_dev);
return -ENOMEM;
}
}
}
for (group = 0; group < MAX_LED; group++) {
led = &snd_ctl_leds[group];
- device_del(&led->dev);
+ device_unregister(&led->dev);
}
- device_del(&snd_ctl_led_dev);
+ device_unregister(&snd_ctl_led_dev);
snd_ctl_led_clean(NULL);
}
return;
if (timer->hw.flags & SNDRV_TIMER_HW_SLAVE)
return;
+ event += 10; /* convert to SNDRV_TIMER_EVENT_MXXX */
list_for_each_entry(ts, &ti->slave_active_head, active_list)
if (ts->ccallback)
- ts->ccallback(ts, event + 100, &tstamp, resolution);
+ ts->ccallback(ts, event, &tstamp, resolution);
}
/* start/continue a master timer */
.flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE,
.device = 0x51c8,
},
+ {
+ .flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE,
+ .device = 0x51cc,
+ },
#endif
};
#ifdef CONFIG_PM_SLEEP
static int hda_codec_pm_prepare(struct device *dev)
{
+ dev->power.power_state = PMSG_SUSPEND;
return pm_runtime_suspended(dev);
}
{
struct hda_codec *codec = dev_to_hda_codec(dev);
+ /* If no other pm-functions are called between prepare() and complete() */
+ if (dev->power.power_state.event == PM_EVENT_SUSPEND)
+ dev->power.power_state = PMSG_RESUME;
+
if (pm_runtime_suspended(dev) && (codec->jackpoll_interval ||
hda_codec_need_resume(codec) || codec->forced_resume))
pm_request_resume(dev);
static const struct snd_kcontrol_new cap_sw_temp = {
.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
.name = "Capture Switch",
+ .access = SNDRV_CTL_ELEM_ACCESS_READWRITE,
.info = cap_sw_info,
.get = cap_sw_get,
.put = cap_sw_put,
/* Alderlake-P */
{ PCI_DEVICE(0x8086, 0x51c8),
.driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
+ /* Alderlake-M */
+ { PCI_DEVICE(0x8086, 0x51cc),
+ .driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
/* Elkhart Lake */
{ PCI_DEVICE(0x8086, 0x4b55),
.driver_data = AZX_DRIVER_SKL | AZX_DCAPS_INTEL_SKYLAKE},
break;
case HDA_FIXUP_ACT_PROBE:
- /* Set initial volume on Bullseye to -26 dB */
- if (codec->fixup_id == CS8409_BULLSEYE)
- snd_hda_codec_amp_init_stereo(codec, CS8409_CS42L42_DMIC_ADC_PIN_NID,
- HDA_INPUT, 0, 0xff, 0x19);
+ /* Set initial DMIC volume to -26 dB */
+ snd_hda_codec_amp_init_stereo(codec, CS8409_CS42L42_DMIC_ADC_PIN_NID,
+ HDA_INPUT, 0, 0xff, 0x19);
snd_hda_gen_add_kctl(&spec->gen,
NULL, &cs8409_cs42l42_hp_volume_mixer);
snd_hda_gen_add_kctl(&spec->gen,
{}
};
+static const struct snd_hda_pin_quirk alc882_pin_fixup_tbl[] = {
+ SND_HDA_PIN_QUIRK(0x10ec1220, 0x1043, "ASUS", ALC1220_FIXUP_CLEVO_P950,
+ {0x14, 0x01014010},
+ {0x15, 0x01011012},
+ {0x16, 0x01016011},
+ {0x18, 0x01a19040},
+ {0x19, 0x02a19050},
+ {0x1a, 0x0181304f},
+ {0x1b, 0x0221401f},
+ {0x1e, 0x01456130}),
+ SND_HDA_PIN_QUIRK(0x10ec1220, 0x1462, "MS-7C35", ALC1220_FIXUP_CLEVO_P950,
+ {0x14, 0x01015010},
+ {0x15, 0x01011012},
+ {0x16, 0x01011011},
+ {0x18, 0x01a11040},
+ {0x19, 0x02a19050},
+ {0x1a, 0x0181104f},
+ {0x1b, 0x0221401f},
+ {0x1e, 0x01451130}),
+ {}
+};
+
/*
* BIOS auto configuration
*/
snd_hda_pick_fixup(codec, alc882_fixup_models, alc882_fixup_tbl,
alc882_fixups);
+ snd_hda_pick_pin_fixup(codec, alc882_pin_fixup_tbl, alc882_fixups, true);
snd_hda_apply_fixup(codec, HDA_FIXUP_ACT_PRE_PROBE);
alc_auto_parse_customize_define(codec);
ALC295_FIXUP_ASUS_DACS,
ALC295_FIXUP_HP_OMEN,
ALC285_FIXUP_HP_SPECTRE_X360,
+ ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP,
+ ALC623_FIXUP_LENOVO_THINKSTATION_P340,
};
static const struct hda_fixup alc269_fixups[] = {
.chained = true,
.chain_id = ALC285_FIXUP_SPEAKER2_TO_DAC1,
},
+ [ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc285_fixup_ideapad_s740_coef,
+ .chained = true,
+ .chain_id = ALC285_FIXUP_THINKPAD_HEADSET_JACK,
+ },
+ [ALC623_FIXUP_LENOVO_THINKSTATION_P340] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc_fixup_no_shutup,
+ .chained = true,
+ .chain_id = ALC283_FIXUP_HEADSET_MIC,
+ },
};
static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x103c, 0x82bf, "HP G3 mini", ALC221_FIXUP_HP_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x82c0, "HP G3 mini premium", ALC221_FIXUP_HP_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x103c, 0x83b9, "HP Spectre x360", ALC269_FIXUP_HP_MUTE_LED_MIC3),
+ SND_PCI_QUIRK(0x103c, 0x841c, "HP Pavilion 15-CK0xx", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x8497, "HP Envy x360", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x84da, "HP OMEN dc0019-ur", ALC295_FIXUP_HP_OMEN),
SND_PCI_QUIRK(0x103c, 0x84e7, "HP Pavilion 15", ALC269_FIXUP_HP_MUTE_LED_MIC3),
SND_PCI_QUIRK(0x103c, 0x87f7, "HP Spectre x360 14", ALC245_FIXUP_HP_X360_AMP),
SND_PCI_QUIRK(0x103c, 0x8846, "HP EliteBook 850 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
SND_PCI_QUIRK(0x103c, 0x884c, "HP EliteBook 840 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
+ SND_PCI_QUIRK(0x103c, 0x886d, "HP ZBook Fury 17.3 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8870, "HP ZBook Fury 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8873, "HP ZBook Studio 15.6 Inch G8 Mobile Workstation PC", ALC285_FIXUP_HP_GPIO_AMP_INIT),
+ SND_PCI_QUIRK(0x103c, 0x8896, "HP EliteBook 855 G8 Notebook PC", ALC285_FIXUP_HP_MUTE_LED),
SND_PCI_QUIRK(0x103c, 0x8898, "HP EliteBook 845 G8 Notebook PC", ALC285_FIXUP_HP_LIMIT_INT_MIC_BOOST),
SND_PCI_QUIRK(0x1043, 0x103e, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC),
SND_PCI_QUIRK(0x1043, 0x103f, "ASUS TX300", ALC282_FIXUP_ASUS_TX300),
SND_PCI_QUIRK(0x1558, 0xc019, "Clevo NH77D[BE]Q", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xc022, "Clevo NH77[DC][QW]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x17aa, 0x1036, "Lenovo P520", ALC233_FIXUP_LENOVO_MULTI_CODECS),
- SND_PCI_QUIRK(0x17aa, 0x1048, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
+ SND_PCI_QUIRK(0x17aa, 0x1048, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340),
SND_PCI_QUIRK(0x17aa, 0x20f2, "Thinkpad SL410/510", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x215e, "Thinkpad L512", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x21b8, "Thinkpad Edge 14", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940", ALC298_FIXUP_LENOVO_SPK_VOLUME),
SND_PCI_QUIRK(0x17aa, 0x3827, "Ideapad S740", ALC285_FIXUP_IDEAPAD_S740_COEF),
+ SND_PCI_QUIRK(0x17aa, 0x3843, "Yoga 9i", ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP),
SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC),
SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
{.id = ALC245_FIXUP_HP_X360_AMP, .name = "alc245-hp-x360-amp"},
{.id = ALC295_FIXUP_HP_OMEN, .name = "alc295-hp-omen"},
{.id = ALC285_FIXUP_HP_SPECTRE_X360, .name = "alc285-hp-spectre-x360"},
+ {.id = ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP, .name = "alc287-ideapad-bass-spk-amp"},
+ {.id = ALC623_FIXUP_LENOVO_THINKSTATION_P340, .name = "alc623-lenovo-thinkstation-p340"},
{}
};
#define ALC225_STANDARD_PINS \
return ret;
}
- if (!adata->play_stream && !adata->capture_stream &&
- !adata->i2ssp_play_stream && !adata->i2ssp_capture_stream)
- rv_writel(1, adata->acp3x_base + mmACP_EXTERNAL_INTR_ENB);
-
i2s_data->acp3x_base = adata->acp3x_base;
runtime->private_data = i2s_data;
return ret;
}
}
- /* Disable ACP irq, when the current stream is being closed and
- * another stream is also not active.
- */
- if (!adata->play_stream && !adata->capture_stream &&
- !adata->i2ssp_play_stream && !adata->i2ssp_capture_stream)
- rv_writel(0, adata->acp3x_base + mmACP_EXTERNAL_INTR_ENB);
return 0;
}
#define ACP_POWER_OFF_IN_PROGRESS 0x03
#define ACP3x_ITER_IRER_SAMP_LEN_MASK 0x38
+#define ACP_EXT_INTR_STAT_CLEAR_MASK 0xFFFFFFFF
struct acp3x_platform_info {
u16 play_i2s_instance;
return -ETIMEDOUT;
}
+static void acp3x_enable_interrupts(void __iomem *acp_base)
+{
+ rv_writel(0x01, acp_base + mmACP_EXTERNAL_INTR_ENB);
+}
+
+static void acp3x_disable_interrupts(void __iomem *acp_base)
+{
+ rv_writel(ACP_EXT_INTR_STAT_CLEAR_MASK, acp_base +
+ mmACP_EXTERNAL_INTR_STAT);
+ rv_writel(0x00, acp_base + mmACP_EXTERNAL_INTR_CNTL);
+ rv_writel(0x00, acp_base + mmACP_EXTERNAL_INTR_ENB);
+}
+
static int acp3x_init(struct acp3x_dev_data *adata)
{
void __iomem *acp3x_base = adata->acp3x_base;
pr_err("ACP3x reset failed\n");
return ret;
}
+ acp3x_enable_interrupts(acp3x_base);
return 0;
}
{
int ret;
+ acp3x_disable_interrupts(acp3x_base);
/* Reset */
ret = acp3x_reset(acp3x_base);
if (ret) {
};
static struct snd_soc_dai_driver ak5552_dai = {
- .name = "ak5558-aif",
+ .name = "ak5552-aif",
.capture = {
.stream_name = "Capture",
.channels_min = 1,
.readable_reg = cs35l32_readable_register,
.precious_reg = cs35l32_precious_register,
.cache_type = REGCACHE_RBTREE,
+
+ .use_single_read = true,
+ .use_single_write = true,
};
static int cs35l32_handle_of_data(struct i2c_client *i2c_client,
dev_err(&i2c_client->dev,
"CS35L33 Device ID (%X). Expected ID %X\n",
devid, CS35L33_CHIP_ID);
+ ret = -EINVAL;
goto err_enable;
}
.readable_reg = cs35l34_readable_register,
.precious_reg = cs35l34_precious_register,
.cache_type = REGCACHE_RBTREE,
+
+ .use_single_read = true,
+ .use_single_write = true,
};
static int cs35l34_handle_of_data(struct i2c_client *i2c_client,
.reg_defaults = cs42l42_reg_defaults,
.num_reg_defaults = ARRAY_SIZE(cs42l42_reg_defaults),
.cache_type = REGCACHE_RBTREE,
+
+ .use_single_read = true,
+ .use_single_write = true,
};
static DECLARE_TLV_DB_SCALE(adc_tlv, -9600, 100, false);
struct cs42l56_platform_data *pdata =
dev_get_platdata(&i2c_client->dev);
int ret, i;
- unsigned int devid = 0;
+ unsigned int devid;
unsigned int alpha_rev, metal_rev;
unsigned int reg;
}
ret = regmap_read(cs42l56->regmap, CS42L56_CHIP_ID_1, ®);
+ if (ret) {
+ dev_err(&i2c_client->dev, "Failed to read chip ID: %d\n", ret);
+ return ret;
+ }
+
devid = reg & CS42L56_CHIP_ID_MASK;
if (devid != CS42L56_DEVID) {
dev_err(&i2c_client->dev,
.volatile_reg = cs42l73_volatile_register,
.readable_reg = cs42l73_readable_register,
.cache_type = REGCACHE_RBTREE,
+
+ .use_single_read = true,
+ .use_single_write = true,
};
static int cs42l73_i2c_probe(struct i2c_client *i2c_client,
.writeable_reg = cs53l30_writeable_register,
.readable_reg = cs53l30_readable_register,
.cache_type = REGCACHE_RBTREE,
+
+ .use_single_read = true,
+ .use_single_write = true,
};
static int cs53l30_i2c_probe(struct i2c_client *client,
ret);
goto err;
}
-
- da7219->dai_clks[i] = devm_clk_hw_get_clk(dev, dai_clk_hw, NULL);
- if (IS_ERR(da7219->dai_clks[i]))
- return PTR_ERR(da7219->dai_clks[i]);
+ da7219->dai_clks[i] = dai_clk_hw->clk;
/* For DT setup onecell data, otherwise create lookup */
if (np) {
{ .compatible = "qcom,sm8250-lpass-rx-macro" },
{ }
};
+MODULE_DEVICE_TABLE(of, rx_macro_dt_match);
static struct platform_driver rx_macro_driver = {
.driver = {
{ .compatible = "qcom,sm8250-lpass-tx-macro" },
{ }
};
+MODULE_DEVICE_TABLE(of, tx_macro_dt_match);
static struct platform_driver tx_macro_driver = {
.driver = {
.name = "tx_macro",
enum max98088_type devtype;
struct max98088_pdata *pdata;
struct clk *mclk;
+ unsigned char mclk_prescaler;
unsigned int sysclk;
struct max98088_cdata dai[2];
int eq_textcnt;
/* Configure NI when operating as master */
if (snd_soc_component_read(component, M98088_REG_14_DAI1_FORMAT)
& M98088_DAI_MAS) {
+ unsigned long pclk;
+
if (max98088->sysclk == 0) {
dev_err(component->dev, "Invalid system clock frequency\n");
return -EINVAL;
}
ni = 65536ULL * (rate < 50000 ? 96ULL : 48ULL)
* (unsigned long long int)rate;
- do_div(ni, (unsigned long long int)max98088->sysclk);
+ pclk = DIV_ROUND_CLOSEST(max98088->sysclk, max98088->mclk_prescaler);
+ ni = DIV_ROUND_CLOSEST_ULL(ni, pclk);
snd_soc_component_write(component, M98088_REG_12_DAI1_CLKCFG_HI,
(ni >> 8) & 0x7F);
snd_soc_component_write(component, M98088_REG_13_DAI1_CLKCFG_LO,
/* Configure NI when operating as master */
if (snd_soc_component_read(component, M98088_REG_1C_DAI2_FORMAT)
& M98088_DAI_MAS) {
+ unsigned long pclk;
+
if (max98088->sysclk == 0) {
dev_err(component->dev, "Invalid system clock frequency\n");
return -EINVAL;
}
ni = 65536ULL * (rate < 50000 ? 96ULL : 48ULL)
* (unsigned long long int)rate;
- do_div(ni, (unsigned long long int)max98088->sysclk);
+ pclk = DIV_ROUND_CLOSEST(max98088->sysclk, max98088->mclk_prescaler);
+ ni = DIV_ROUND_CLOSEST_ULL(ni, pclk);
snd_soc_component_write(component, M98088_REG_1A_DAI2_CLKCFG_HI,
(ni >> 8) & 0x7F);
snd_soc_component_write(component, M98088_REG_1B_DAI2_CLKCFG_LO,
*/
if ((freq >= 10000000) && (freq < 20000000)) {
snd_soc_component_write(component, M98088_REG_10_SYS_CLK, 0x10);
+ max98088->mclk_prescaler = 1;
} else if ((freq >= 20000000) && (freq < 30000000)) {
snd_soc_component_write(component, M98088_REG_10_SYS_CLK, 0x20);
+ max98088->mclk_prescaler = 2;
} else {
dev_err(component->dev, "Invalid master clock frequency\n");
return -EINVAL;
ch_r = (rt711->fu1e_dapm_mute || rt711->fu1e_mixer_r_mute) ? 0x01 : 0x00;
err = regmap_write(rt711->regmap,
- SDW_SDCA_CTL(FUNC_NUM_JACK_CODEC, RT711_SDCA_ENT_USER_FU1E,
+ SDW_SDCA_CTL(FUNC_NUM_MIC_ARRAY, RT711_SDCA_ENT_USER_FU1E,
RT711_SDCA_CTL_FU_MUTE, CH_L), ch_l);
if (err < 0)
return err;
err = regmap_write(rt711->regmap,
- SDW_SDCA_CTL(FUNC_NUM_JACK_CODEC, RT711_SDCA_ENT_USER_FU1E,
+ SDW_SDCA_CTL(FUNC_NUM_MIC_ARRAY, RT711_SDCA_ENT_USER_FU1E,
RT711_SDCA_CTL_FU_MUTE, CH_R), ch_r);
if (err < 0)
return err;
},
{},
};
+MODULE_DEVICE_TABLE(of, sti_sas_dev_match);
static int sti_sas_driver_probe(struct platform_device *pdev)
{
tristate "NXP Audio Base On RPMSG support"
depends on COMMON_CLK
depends on RPMSG
+ depends on SND_IMX_SOC || SND_IMX_SOC = n
select SND_SOC_IMX_RPMSG if SND_IMX_SOC != n
help
Say Y if you want to add rpmsg audio support for the Freescale CPUs.
static int graph_parse_node(struct asoc_simple_priv *priv,
struct device_node *ep,
struct link_info *li,
- int is_cpu)
+ int *cpu)
{
struct device *dev = simple_priv_to_dev(priv);
struct device_node *top = dev->of_node;
struct simple_dai_props *dai_props = simple_priv_to_props(priv, li->link);
struct snd_soc_dai_link_component *dlc;
struct asoc_simple_dai *dai;
- int ret, single = 0;
+ int ret;
- if (is_cpu) {
+ if (cpu) {
dlc = asoc_link_to_cpu(dai_link, 0);
dai = simple_props_to_dai_cpu(dai_props, 0);
} else {
graph_parse_mclk_fs(top, ep, dai_props);
- ret = asoc_simple_parse_dai(ep, dlc, &single);
+ ret = asoc_simple_parse_dai(ep, dlc, cpu);
if (ret < 0)
return ret;
if (ret < 0)
return ret;
- if (is_cpu)
- asoc_simple_canonicalize_cpu(dlc, single);
-
return 0;
}
struct link_info *li)
{
struct device *dev = simple_priv_to_dev(priv);
- struct snd_soc_card *card = simple_priv_to_card(priv);
struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link);
struct simple_dai_props *dai_props = simple_priv_to_props(priv, li->link);
struct device_node *top = dev->of_node;
struct device_node *ep = li->cpu ? cpu_ep : codec_ep;
- struct device_node *port;
- struct device_node *ports;
- struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
- struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
char dai_name[64];
int ret;
- port = of_get_parent(ep);
- ports = of_get_parent(port);
-
dev_dbg(dev, "link_of DPCM (%pOF)\n", ep);
if (li->cpu) {
+ struct snd_soc_card *card = simple_priv_to_card(priv);
+ struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
+ int is_single_links = 0;
+
/* Codec is dummy */
/* FE settings */
dai_link->dynamic = 1;
dai_link->dpcm_merged_format = 1;
- ret = graph_parse_node(priv, cpu_ep, li, 1);
+ ret = graph_parse_node(priv, cpu_ep, li, &is_single_links);
if (ret)
- goto out_put_node;
+ return ret;
snprintf(dai_name, sizeof(dai_name),
"fe.%pOFP.%s", cpus->of_node, cpus->dai_name);
*/
if (card->component_chaining && !soc_component_is_pcm(cpus))
dai_link->no_pcm = 1;
+
+ asoc_simple_canonicalize_cpu(cpus, is_single_links);
} else {
- struct snd_soc_codec_conf *cconf;
+ struct snd_soc_codec_conf *cconf = simple_props_to_codec_conf(dai_props, 0);
+ struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
+ struct device_node *port;
+ struct device_node *ports;
/* CPU is dummy */
dai_link->no_pcm = 1;
dai_link->be_hw_params_fixup = asoc_simple_be_hw_params_fixup;
- cconf = simple_props_to_codec_conf(dai_props, 0);
-
- ret = graph_parse_node(priv, codec_ep, li, 0);
+ ret = graph_parse_node(priv, codec_ep, li, NULL);
if (ret < 0)
- goto out_put_node;
+ return ret;
snprintf(dai_name, sizeof(dai_name),
"be.%pOFP.%s", codecs->of_node, codecs->dai_name);
/* check "prefix" from top node */
+ port = of_get_parent(ep);
+ ports = of_get_parent(port);
snd_soc_of_parse_node_prefix(top, cconf, codecs->of_node,
"prefix");
if (of_node_name_eq(ports, "ports"))
snd_soc_of_parse_node_prefix(ports, cconf, codecs->of_node, "prefix");
snd_soc_of_parse_node_prefix(port, cconf, codecs->of_node,
"prefix");
+
+ of_node_put(ports);
+ of_node_put(port);
}
graph_parse_convert(dev, ep, &dai_props->adata);
ret = graph_link_init(priv, cpu_ep, codec_ep, li, dai_name);
-out_put_node:
li->link++;
- of_node_put(ports);
- of_node_put(port);
return ret;
}
struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
char dai_name[64];
- int ret;
+ int ret, is_single_links = 0;
dev_dbg(dev, "link_of (%pOF)\n", cpu_ep);
- ret = graph_parse_node(priv, cpu_ep, li, 1);
+ ret = graph_parse_node(priv, cpu_ep, li, &is_single_links);
if (ret < 0)
return ret;
- ret = graph_parse_node(priv, codec_ep, li, 0);
+ ret = graph_parse_node(priv, codec_ep, li, NULL);
if (ret < 0)
return ret;
snprintf(dai_name, sizeof(dai_name),
"%s-%s", cpus->dai_name, codecs->dai_name);
+
+ asoc_simple_canonicalize_cpu(cpus, is_single_links);
+
ret = graph_link_init(priv, cpu_ep, codec_ep, li, dai_name);
if (ret < 0)
return ret;
}
static void simple_parse_mclk_fs(struct device_node *top,
- struct device_node *cpu,
- struct device_node *codec,
+ struct device_node *np,
struct simple_dai_props *props,
char *prefix)
{
- struct device_node *node = of_get_parent(cpu);
+ struct device_node *node = of_get_parent(np);
char prop[128];
snprintf(prop, sizeof(prop), "%smclk-fs", PREFIX);
snprintf(prop, sizeof(prop), "%smclk-fs", prefix);
of_property_read_u32(node, prop, &props->mclk_fs);
- of_property_read_u32(cpu, prop, &props->mclk_fs);
- of_property_read_u32(codec, prop, &props->mclk_fs);
+ of_property_read_u32(np, prop, &props->mclk_fs);
of_node_put(node);
}
+static int simple_parse_node(struct asoc_simple_priv *priv,
+ struct device_node *np,
+ struct link_info *li,
+ char *prefix,
+ int *cpu)
+{
+ struct device *dev = simple_priv_to_dev(priv);
+ struct device_node *top = dev->of_node;
+ struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link);
+ struct simple_dai_props *dai_props = simple_priv_to_props(priv, li->link);
+ struct snd_soc_dai_link_component *dlc;
+ struct asoc_simple_dai *dai;
+ int ret;
+
+ if (cpu) {
+ dlc = asoc_link_to_cpu(dai_link, 0);
+ dai = simple_props_to_dai_cpu(dai_props, 0);
+ } else {
+ dlc = asoc_link_to_codec(dai_link, 0);
+ dai = simple_props_to_dai_codec(dai_props, 0);
+ }
+
+ simple_parse_mclk_fs(top, np, dai_props, prefix);
+
+ ret = asoc_simple_parse_dai(np, dlc, cpu);
+ if (ret)
+ return ret;
+
+ ret = asoc_simple_parse_clk(dev, np, dai, dlc);
+ if (ret)
+ return ret;
+
+ ret = asoc_simple_parse_tdm(np, dai);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static int simple_link_init(struct asoc_simple_priv *priv,
+ struct device_node *node,
+ struct device_node *codec,
+ struct link_info *li,
+ char *prefix, char *name)
+{
+ struct device *dev = simple_priv_to_dev(priv);
+ struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link);
+ int ret;
+
+ ret = asoc_simple_parse_daifmt(dev, node, codec,
+ prefix, &dai_link->dai_fmt);
+ if (ret < 0)
+ return 0;
+
+ dai_link->init = asoc_simple_dai_init;
+ dai_link->ops = &simple_ops;
+
+ return asoc_simple_set_dailink_name(dev, dai_link, name);
+}
+
static int simple_dai_link_of_dpcm(struct asoc_simple_priv *priv,
struct device_node *np,
struct device_node *codec,
struct device *dev = simple_priv_to_dev(priv);
struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link);
struct simple_dai_props *dai_props = simple_priv_to_props(priv, li->link);
- struct asoc_simple_dai *dai;
- struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
- struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
- struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0);
struct device_node *top = dev->of_node;
struct device_node *node = of_get_parent(np);
char *prefix = "";
+ char dai_name[64];
int ret;
dev_dbg(dev, "link_of DPCM (%pOF)\n", np);
- li->link++;
-
/* For single DAI link & old style of DT node */
if (is_top)
prefix = PREFIX;
if (li->cpu) {
+ struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
+ struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0);
int is_single_links = 0;
/* Codec is dummy */
dai_link->dynamic = 1;
dai_link->dpcm_merged_format = 1;
- dai = simple_props_to_dai_cpu(dai_props, 0);
-
- ret = asoc_simple_parse_dai(np, cpus, &is_single_links);
- if (ret)
- goto out_put_node;
-
- ret = asoc_simple_parse_clk(dev, np, dai, cpus);
+ ret = simple_parse_node(priv, np, li, prefix, &is_single_links);
if (ret < 0)
goto out_put_node;
- ret = asoc_simple_set_dailink_name(dev, dai_link,
- "fe.%s",
- cpus->dai_name);
- if (ret < 0)
- goto out_put_node;
+ snprintf(dai_name, sizeof(dai_name), "fe.%s", cpus->dai_name);
asoc_simple_canonicalize_cpu(cpus, is_single_links);
asoc_simple_canonicalize_platform(platforms, cpus);
} else {
+ struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
struct snd_soc_codec_conf *cconf;
/* CPU is dummy */
dai_link->no_pcm = 1;
dai_link->be_hw_params_fixup = asoc_simple_be_hw_params_fixup;
- dai = simple_props_to_dai_codec(dai_props, 0);
cconf = simple_props_to_codec_conf(dai_props, 0);
- ret = asoc_simple_parse_dai(np, codecs, NULL);
+ ret = simple_parse_node(priv, np, li, prefix, NULL);
if (ret < 0)
goto out_put_node;
- ret = asoc_simple_parse_clk(dev, np, dai, codecs);
- if (ret < 0)
- goto out_put_node;
-
- ret = asoc_simple_set_dailink_name(dev, dai_link,
- "be.%s",
- codecs->dai_name);
- if (ret < 0)
- goto out_put_node;
+ snprintf(dai_name, sizeof(dai_name), "be.%s", codecs->dai_name);
/* check "prefix" from top node */
snd_soc_of_parse_node_prefix(top, cconf, codecs->of_node,
}
simple_parse_convert(dev, np, &dai_props->adata);
- simple_parse_mclk_fs(top, np, codec, dai_props, prefix);
-
- ret = asoc_simple_parse_tdm(np, dai);
- if (ret)
- goto out_put_node;
-
- ret = asoc_simple_parse_daifmt(dev, node, codec,
- prefix, &dai_link->dai_fmt);
- if (ret < 0)
- goto out_put_node;
snd_soc_dai_link_set_capabilities(dai_link);
- dai_link->ops = &simple_ops;
- dai_link->init = asoc_simple_dai_init;
+ ret = simple_link_init(priv, node, codec, li, prefix, dai_name);
out_put_node:
+ li->link++;
+
of_node_put(node);
return ret;
}
{
struct device *dev = simple_priv_to_dev(priv);
struct snd_soc_dai_link *dai_link = simple_priv_to_link(priv, li->link);
- struct simple_dai_props *dai_props = simple_priv_to_props(priv, li->link);
- struct asoc_simple_dai *cpu_dai = simple_props_to_dai_cpu(dai_props, 0);
- struct asoc_simple_dai *codec_dai = simple_props_to_dai_codec(dai_props, 0);
struct snd_soc_dai_link_component *cpus = asoc_link_to_cpu(dai_link, 0);
struct snd_soc_dai_link_component *codecs = asoc_link_to_codec(dai_link, 0);
struct snd_soc_dai_link_component *platforms = asoc_link_to_platform(dai_link, 0);
- struct device_node *top = dev->of_node;
struct device_node *cpu = NULL;
struct device_node *node = NULL;
struct device_node *plat = NULL;
+ char dai_name[64];
char prop[128];
char *prefix = "";
int ret, single_cpu = 0;
cpu = np;
node = of_get_parent(np);
- li->link++;
dev_dbg(dev, "link_of (%pOF)\n", node);
snprintf(prop, sizeof(prop), "%splat", prefix);
plat = of_get_child_by_name(node, prop);
- ret = asoc_simple_parse_daifmt(dev, node, codec,
- prefix, &dai_link->dai_fmt);
- if (ret < 0)
- goto dai_link_of_err;
-
- simple_parse_mclk_fs(top, cpu, codec, dai_props, prefix);
-
- ret = asoc_simple_parse_dai(cpu, cpus, &single_cpu);
+ ret = simple_parse_node(priv, cpu, li, prefix, &single_cpu);
if (ret < 0)
goto dai_link_of_err;
- ret = asoc_simple_parse_dai(codec, codecs, NULL);
+ ret = simple_parse_node(priv, codec, li, prefix, NULL);
if (ret < 0)
goto dai_link_of_err;
if (ret < 0)
goto dai_link_of_err;
- ret = asoc_simple_parse_tdm(cpu, cpu_dai);
- if (ret < 0)
- goto dai_link_of_err;
-
- ret = asoc_simple_parse_tdm(codec, codec_dai);
- if (ret < 0)
- goto dai_link_of_err;
-
- ret = asoc_simple_parse_clk(dev, cpu, cpu_dai, cpus);
- if (ret < 0)
- goto dai_link_of_err;
-
- ret = asoc_simple_parse_clk(dev, codec, codec_dai, codecs);
- if (ret < 0)
- goto dai_link_of_err;
-
- ret = asoc_simple_set_dailink_name(dev, dai_link,
- "%s-%s",
- cpus->dai_name,
- codecs->dai_name);
- if (ret < 0)
- goto dai_link_of_err;
-
- dai_link->ops = &simple_ops;
- dai_link->init = asoc_simple_dai_init;
+ snprintf(dai_name, sizeof(dai_name),
+ "%s-%s", cpus->dai_name, codecs->dai_name);
asoc_simple_canonicalize_cpu(cpus, single_cpu);
asoc_simple_canonicalize_platform(platforms, cpus);
+ ret = simple_link_init(priv, node, codec, li, prefix, dai_name);
+
dai_link_of_err:
of_node_put(plat);
of_node_put(node);
+ li->link++;
+
return ret;
}
BYT_RT5640_SSP0_AIF1 |
BYT_RT5640_MCLK_EN),
},
+ { /* Glavey TM800A550L */
+ .matches = {
+ DMI_MATCH(DMI_BOARD_VENDOR, "AMI Corporation"),
+ DMI_MATCH(DMI_BOARD_NAME, "Aptio CRB"),
+ /* Above strings are too generic, also match on BIOS version */
+ DMI_MATCH(DMI_BIOS_VERSION, "ZY-8-BI-PX4S70VTR400-X423B-005-D"),
+ },
+ .driver_data = (void *)(BYTCR_INPUT_DEFAULTS |
+ BYT_RT5640_SSP0_AIF1 |
+ BYT_RT5640_MCLK_EN),
+ },
{
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Hewlett-Packard"),
BYT_RT5640_MONO_SPEAKER |
BYT_RT5640_MCLK_EN),
},
+ { /* Lenovo Miix 3-830 */
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "Lenovo MIIX 3-830"),
+ },
+ .driver_data = (void *)(BYT_RT5640_IN1_MAP |
+ BYT_RT5640_JD_SRC_JD2_IN4N |
+ BYT_RT5640_OVCD_TH_2000UA |
+ BYT_RT5640_OVCD_SF_0P75 |
+ BYT_RT5640_MONO_SPEAKER |
+ BYT_RT5640_DIFF_MIC |
+ BYT_RT5640_SSP0_AIF1 |
+ BYT_RT5640_MCLK_EN),
+ },
{ /* Linx Linx7 tablet */
.matches = {
DMI_EXACT_MATCH(DMI_SYS_VENDOR, "LINX"),
if (dai_id == LPASS_DP_RX)
continue;
- drvdata->mi2s_osr_clk[dai_id] = devm_clk_get(dev,
+ drvdata->mi2s_osr_clk[dai_id] = devm_clk_get_optional(dev,
variant->dai_osr_clk_names[i]);
- if (IS_ERR(drvdata->mi2s_osr_clk[dai_id])) {
- dev_warn(dev,
- "%s() error getting optional %s: %ld\n",
- __func__,
- variant->dai_osr_clk_names[i],
- PTR_ERR(drvdata->mi2s_osr_clk[dai_id]));
-
- drvdata->mi2s_osr_clk[dai_id] = NULL;
- }
-
drvdata->mi2s_bit_clk[dai_id] = devm_clk_get(dev,
variant->dai_bit_clk_names[i]);
if (IS_ERR(drvdata->mi2s_bit_clk[dai_id])) {
struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream);
struct snd_soc_component *component = snd_soc_rtdcom_lookup(rtd, SOF_AUDIO_PCM_DRV_NAME);
struct snd_sof_dev *sdev = snd_soc_component_get_drvdata(component);
+ struct sof_ipc_fw_version *v = &sdev->fw_ready.version;
struct sof_ipc_dai_config *config;
struct snd_sof_dai *sof_dai;
struct sof_ipc_reply reply;
int ret;
+ /* DAI_CONFIG IPC during hw_params is not supported in older firmware */
+ if (v->abi_version < SOF_ABI_VER(3, 18, 0))
+ return 0;
+
list_for_each_entry(sof_dai, &sdev->dai_list, list) {
if (!sof_dai->cpu_dai_name || !sof_dai->dai_config)
continue;
dev_err(dev, "mclk register returned %d\n", ret);
return ret;
}
-
- sai->sai_mclk = devm_clk_hw_get_clk(dev, hw, NULL);
- if (IS_ERR(sai->sai_mclk))
- return PTR_ERR(sai->sai_mclk);
+ sai->sai_mclk = hw->clk;
/* register mclk provider */
return devm_of_clk_add_hw_provider(dev, of_clk_hw_simple_get, hw);
if (snd_BUG_ON(altsetting >= 64 - 8))
return false;
- err = snd_usb_ctl_msg(dev, usb_sndctrlpipe(dev, 0), UAC2_CS_CUR,
+ err = snd_usb_ctl_msg(dev, usb_rcvctrlpipe(dev, 0), UAC2_CS_CUR,
USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_IN,
UAC2_AS_VAL_ALT_SETTINGS << 8,
iface, &raw_data, sizeof(raw_data));
case USB_ID(0x1235, 0x8203): /* Focusrite Scarlett 6i6 2nd Gen */
case USB_ID(0x1235, 0x8204): /* Focusrite Scarlett 18i8 2nd Gen */
case USB_ID(0x1235, 0x8201): /* Focusrite Scarlett 18i20 2nd Gen */
- err = snd_scarlett_gen2_controls_create(mixer);
+ err = snd_scarlett_gen2_init(mixer);
break;
case USB_ID(0x041e, 0x323b): /* Creative Sound Blaster E1 */
/* send a second message to get the response */
err = snd_usb_ctl_msg(mixer->chip->dev,
- usb_sndctrlpipe(mixer->chip->dev, 0),
+ usb_rcvctrlpipe(mixer->chip->dev, 0),
SCARLETT2_USB_VENDOR_SPECIFIC_CMD_RESP,
USB_RECIP_INTERFACE | USB_TYPE_CLASS | USB_DIR_IN,
0,
return usb_submit_urb(mixer->urb, GFP_KERNEL);
}
-/* Entry point */
-int snd_scarlett_gen2_controls_create(struct usb_mixer_interface *mixer)
+static int snd_scarlett_gen2_controls_create(struct usb_mixer_interface *mixer,
+ const struct scarlett2_device_info *info)
{
- const struct scarlett2_device_info *info;
int err;
- /* only use UAC_VERSION_2 */
- if (!mixer->protocol)
- return 0;
-
- switch (mixer->chip->usb_id) {
- case USB_ID(0x1235, 0x8203):
- info = &s6i6_gen2_info;
- break;
- case USB_ID(0x1235, 0x8204):
- info = &s18i8_gen2_info;
- break;
- case USB_ID(0x1235, 0x8201):
- info = &s18i20_gen2_info;
- break;
- default: /* device not (yet) supported */
- return -EINVAL;
- }
-
- if (!(mixer->chip->setup & SCARLETT2_ENABLE)) {
- usb_audio_err(mixer->chip,
- "Focusrite Scarlett Gen 2 Mixer Driver disabled; "
- "use options snd_usb_audio device_setup=1 "
- "to enable and report any issues to g@b4.vu");
- return 0;
- }
-
/* Initialise private data, routing, sequence number */
err = scarlett2_init_private(mixer, info);
if (err < 0)
return 0;
}
+
+int snd_scarlett_gen2_init(struct usb_mixer_interface *mixer)
+{
+ struct snd_usb_audio *chip = mixer->chip;
+ const struct scarlett2_device_info *info;
+ int err;
+
+ /* only use UAC_VERSION_2 */
+ if (!mixer->protocol)
+ return 0;
+
+ switch (chip->usb_id) {
+ case USB_ID(0x1235, 0x8203):
+ info = &s6i6_gen2_info;
+ break;
+ case USB_ID(0x1235, 0x8204):
+ info = &s18i8_gen2_info;
+ break;
+ case USB_ID(0x1235, 0x8201):
+ info = &s18i20_gen2_info;
+ break;
+ default: /* device not (yet) supported */
+ return -EINVAL;
+ }
+
+ if (!(chip->setup & SCARLETT2_ENABLE)) {
+ usb_audio_info(chip,
+ "Focusrite Scarlett Gen 2 Mixer Driver disabled; "
+ "use options snd_usb_audio vid=0x%04x pid=0x%04x "
+ "device_setup=1 to enable and report any issues "
+ "to g@b4.vu",
+ USB_ID_VENDOR(chip->usb_id),
+ USB_ID_PRODUCT(chip->usb_id));
+ return 0;
+ }
+
+ usb_audio_info(chip,
+ "Focusrite Scarlett Gen 2 Mixer Driver enabled pid=0x%04x",
+ USB_ID_PRODUCT(chip->usb_id));
+
+ err = snd_scarlett_gen2_controls_create(mixer, info);
+ if (err < 0)
+ usb_audio_err(mixer->chip,
+ "Error initialising Scarlett Mixer Driver: %d",
+ err);
+
+ return err;
+}
#ifndef __USB_MIXER_SCARLETT_GEN2_H
#define __USB_MIXER_SCARLETT_GEN2_H
-int snd_scarlett_gen2_controls_create(struct usb_mixer_interface *mixer);
+int snd_scarlett_gen2_init(struct usb_mixer_interface *mixer);
#endif /* __USB_MIXER_SCARLETT_GEN2_H */
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_MIPS_PERF_REGS_H
+#define _ASM_MIPS_PERF_REGS_H
+
+enum perf_event_mips_regs {
+ PERF_REG_MIPS_PC,
+ PERF_REG_MIPS_R1,
+ PERF_REG_MIPS_R2,
+ PERF_REG_MIPS_R3,
+ PERF_REG_MIPS_R4,
+ PERF_REG_MIPS_R5,
+ PERF_REG_MIPS_R6,
+ PERF_REG_MIPS_R7,
+ PERF_REG_MIPS_R8,
+ PERF_REG_MIPS_R9,
+ PERF_REG_MIPS_R10,
+ PERF_REG_MIPS_R11,
+ PERF_REG_MIPS_R12,
+ PERF_REG_MIPS_R13,
+ PERF_REG_MIPS_R14,
+ PERF_REG_MIPS_R15,
+ PERF_REG_MIPS_R16,
+ PERF_REG_MIPS_R17,
+ PERF_REG_MIPS_R18,
+ PERF_REG_MIPS_R19,
+ PERF_REG_MIPS_R20,
+ PERF_REG_MIPS_R21,
+ PERF_REG_MIPS_R22,
+ PERF_REG_MIPS_R23,
+ PERF_REG_MIPS_R24,
+ PERF_REG_MIPS_R25,
+ PERF_REG_MIPS_R26,
+ PERF_REG_MIPS_R27,
+ PERF_REG_MIPS_R28,
+ PERF_REG_MIPS_R29,
+ PERF_REG_MIPS_R30,
+ PERF_REG_MIPS_R31,
+ PERF_REG_MIPS_MAX = PERF_REG_MIPS_R31 + 1,
+};
+#endif /* _ASM_MIPS_PERF_REGS_H */
__u16 flags;
} smm;
+ __u16 pad;
+
__u32 flags;
__u64 preemption_timer_deadline;
};
| *ATTACH_TYPE* := { **ingress** | **egress** | **sock_create** | **sock_ops** | **device** |
| **bind4** | **bind6** | **post_bind4** | **post_bind6** | **connect4** | **connect6** |
| **getpeername4** | **getpeername6** | **getsockname4** | **getsockname6** | **sendmsg4** |
-| **sendmsg6** | **recvmsg4** | **recvmsg6** | **sysctl** | **getsockopt** | **setsockopt** }
+| **sendmsg6** | **recvmsg4** | **recvmsg6** | **sysctl** | **getsockopt** | **setsockopt** |
+| **sock_release** }
| *ATTACH_FLAGS* := { **multi** | **override** }
DESCRIPTION
**getpeername6** call to getpeername(2) for an inet6 socket (since 5.8);
**getsockname4** call to getsockname(2) for an inet4 socket (since 5.8);
**getsockname6** call to getsockname(2) for an inet6 socket (since 5.8).
+ **sock_release** closing an userspace inet socket (since 5.9).
**bpftool cgroup detach** *CGROUP* *ATTACH_TYPE* *PROG*
Detach *PROG* from the cgroup *CGROUP* and attach type
| **cgroup/connect4** | **cgroup/connect6** | **cgroup/getpeername4** | **cgroup/getpeername6** |
| **cgroup/getsockname4** | **cgroup/getsockname6** | **cgroup/sendmsg4** | **cgroup/sendmsg6** |
| **cgroup/recvmsg4** | **cgroup/recvmsg6** | **cgroup/sysctl** |
-| **cgroup/getsockopt** | **cgroup/setsockopt** |
+| **cgroup/getsockopt** | **cgroup/setsockopt** | **cgroup/sock_release** |
| **struct_ops** | **fentry** | **fexit** | **freplace** | **sk_lookup**
| }
| *ATTACH_TYPE* := {
cgroup/recvmsg4 cgroup/recvmsg6 \
cgroup/post_bind4 cgroup/post_bind6 \
cgroup/sysctl cgroup/getsockopt \
- cgroup/setsockopt struct_ops \
+ cgroup/setsockopt cgroup/sock_release struct_ops \
fentry fexit freplace sk_lookup" -- \
"$cur" ) )
return 0
device bind4 bind6 post_bind4 post_bind6 connect4 connect6 \
getpeername4 getpeername6 getsockname4 getsockname6 \
sendmsg4 sendmsg6 recvmsg4 recvmsg6 sysctl getsockopt \
- setsockopt'
+ setsockopt sock_release'
local ATTACH_FLAGS='multi override'
local PROG_TYPE='id pinned tag name'
case $prev in
ingress|egress|sock_create|sock_ops|device|bind4|bind6|\
post_bind4|post_bind6|connect4|connect6|getpeername4|\
getpeername6|getsockname4|getsockname6|sendmsg4|sendmsg6|\
- recvmsg4|recvmsg6|sysctl|getsockopt|setsockopt)
+ recvmsg4|recvmsg6|sysctl|getsockopt|setsockopt|sock_release)
COMPREPLY=( $( compgen -W "$PROG_TYPE" -- \
"$cur" ) )
return 0
" connect6 | getpeername4 | getpeername6 |\n" \
" getsockname4 | getsockname6 | sendmsg4 |\n" \
" sendmsg6 | recvmsg4 | recvmsg6 |\n" \
- " sysctl | getsockopt | setsockopt }"
+ " sysctl | getsockopt | setsockopt |\n" \
+ " sock_release }"
static unsigned int query_flags;
" cgroup/getpeername4 | cgroup/getpeername6 |\n"
" cgroup/getsockname4 | cgroup/getsockname6 | cgroup/sendmsg4 |\n"
" cgroup/sendmsg6 | cgroup/recvmsg4 | cgroup/recvmsg6 |\n"
- " cgroup/getsockopt | cgroup/setsockopt |\n"
+ " cgroup/getsockopt | cgroup/setsockopt | cgroup/sock_release |\n"
" struct_ops | fentry | fexit | freplace | sk_lookup }\n"
" ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n"
" flow_dissector }\n"
#define BLKROTATIONAL _IO(0x12,126)
#define BLKZEROOUT _IO(0x12,127)
/*
- * A jump here: 130-131 are reserved for zoned block devices
+ * A jump here: 130-136 are reserved for zoned block devices
* (see uapi/linux/blkzoned.h)
*/
* Note: you must update KVM_API_VERSION if you change this interface.
*/
+#include <linux/const.h>
#include <linux/types.h>
#include <linux/compiler.h>
#include <linux/ioctl.h>
* conversion after harvesting an entry. Also, it must not skip any
* dirty bits, so that dirty bits are always harvested in sequence.
*/
-#define KVM_DIRTY_GFN_F_DIRTY BIT(0)
-#define KVM_DIRTY_GFN_F_RESET BIT(1)
+#define KVM_DIRTY_GFN_F_DIRTY _BITUL(0)
+#define KVM_DIRTY_GFN_F_RESET _BITUL(1)
#define KVM_DIRTY_GFN_F_MASK 0x3
/*
/*
* User provided data if sigtrap=1, passed back to user via
- * siginfo_t::si_perf, e.g. to permit user to identify the event.
+ * siginfo_t::si_perf_data, e.g. to permit user to identify the event.
*/
__u64 sig_data;
};
const struct btf_var_secinfo *vs;
const struct btf_type *sec;
+ if (!btf)
+ return 0;
+
sec_btf_id = btf__find_by_name_kind(btf, KSYMS_SEC,
BTF_KIND_DATASEC);
if (sec_btf_id < 0)
#define ELF_C_READ_MMAP ELF_C_READ
#endif
+/* Older libelf all end up in this expression, for both 32 and 64 bit */
+#ifndef GELF_ST_VISIBILITY
+#define GELF_ST_VISIBILITY(o) ((o) & 0x03)
+#endif
+
#define BTF_INFO_ENC(kind, kind_flag, vlen) \
((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
#define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
perf script --itrace=ibxwpe -F+flags
-The flags are "bcrosyiABEx" which stand for branch, call, return, conditional,
-system, asynchronous, interrupt, transaction abort, trace begin, trace end, and
-in transaction, respectively.
+The flags are "bcrosyiABExgh" which stand for branch, call, return, conditional,
+system, asynchronous, interrupt, transaction abort, trace begin, trace end,
+in transaction, VM-entry, and VM-exit respectively.
perf script also supports higher level ways to dump instruction traces:
At this point usage is displayed, and perf-script exits.
The flags field is synthesized and may have a value when Instruction
- Trace decoding. The flags are "bcrosyiABEx" which stand for branch,
+ Trace decoding. The flags are "bcrosyiABExgh" which stand for branch,
call, return, conditional, system, asynchronous, interrupt,
- transaction abort, trace begin, trace end, and in transaction,
+ transaction abort, trace begin, trace end, in transaction, VM-Entry, and VM-Exit
respectively. Known combinations of flags are printed more nicely e.g.
"call" for "bc", "return" for "br", "jcc" for "bo", "jmp" for "b",
"int" for "bci", "iret" for "bri", "syscall" for "bcs", "sysret" for "brs",
"async" for "by", "hw int" for "bcyi", "tx abrt" for "bA", "tr strt" for "bB",
- "tr end" for "bE". However the "x" flag will be display separately in those
+ "tr end" for "bE", "vmentry" for "bcg", "vmexit" for "bch".
+ However the "x" flag will be displayed separately in those
cases e.g. "jcc (x)" for a condition branch within a transaction.
The callindent field is synthesized and may have a value when
ifeq ($(ARCH),mips)
NO_PERF_REGS := 0
CFLAGS += -I$(OUTPUT)arch/mips/include/generated
- CFLAGS += -I../../arch/mips/include/uapi -I../../arch/mips/include/generated/uapi
LIBUNWIND_LIBS = -lunwind -lunwind-mips
endif
440 n64 process_madvise sys_process_madvise
441 n64 epoll_pwait2 sys_epoll_pwait2
442 n64 mount_setattr sys_mount_setattr
-443 n64 quotactl_path sys_quotactl_path
+# 443 reserved for quotactl_path
444 n64 landlock_create_ruleset sys_landlock_create_ruleset
445 n64 landlock_add_rule sys_landlock_add_rule
446 n64 landlock_restrict_self sys_landlock_restrict_self
440 common process_madvise sys_process_madvise
441 common epoll_pwait2 sys_epoll_pwait2 compat_sys_epoll_pwait2
442 common mount_setattr sys_mount_setattr
-443 common quotactl_path sys_quotactl_path
+# 443 reserved for quotactl_path
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
440 common process_madvise sys_process_madvise sys_process_madvise
441 common epoll_pwait2 sys_epoll_pwait2 compat_sys_epoll_pwait2
442 common mount_setattr sys_mount_setattr sys_mount_setattr
-443 common quotactl_path sys_quotactl_path sys_quotactl_path
+# 443 reserved for quotactl_path
444 common landlock_create_ruleset sys_landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self sys_landlock_restrict_self
440 common process_madvise sys_process_madvise
441 common epoll_pwait2 sys_epoll_pwait2
442 common mount_setattr sys_mount_setattr
-443 common quotactl_path sys_quotactl_path
+# 443 reserved for quotactl_path
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
if (!perf_header__has_feat(&session->header, HEADER_BUILD_ID))
with_hits = true;
+ if (zstd_init(&(session->zstd_data), 0) < 0)
+ pr_warning("Decompression initialization failed. Reported data may be incomplete.\n");
+
/*
* in pipe-mode, the only way to get the buildids is to parse
* the record stream. Buildids are stored as RECORD_HEADER_BUILD_ID
rec->no_buildid = true;
}
+ if (rec->opts.record_cgroup && !perf_can_record_cgroup()) {
+ pr_err("Kernel has no cgroup sampling support.\n");
+ err = -EINVAL;
+ goto out_opts;
+ }
+
if (rec->opts.kcore)
rec->data.is_dir = true;
* - we have initial delay configured
*/
if (!target__none(&target) || stat_config.initial_delay) {
- evlist__enable(evsel_list);
+ if (!all_counters_use_bpf)
+ evlist__enable(evsel_list);
if (stat_config.initial_delay > 0)
pr_info(EVLIST_ENABLED_MSG);
}
static void disable_counters(void)
{
+ struct evsel *counter;
+
/*
* If we don't have tracee (attaching to task or cpu), counters may
* still be running. To get accurate group ratios, we must stop groups
* from counting before reading their constituent counters.
*/
- if (!target__none(&target))
- evlist__disable(evsel_list);
+ if (!target__none(&target)) {
+ evlist__for_each_entry(evsel_list, counter)
+ bpf_counter__disable(counter);
+ if (!all_counters_use_bpf)
+ evlist__disable(evsel_list);
+ }
}
static volatile int workload_exec_errno;
arch/x86/tools/gen-insn-attr-x86.awk
arch/arm/include/uapi/asm/perf_regs.h
arch/arm64/include/uapi/asm/perf_regs.h
+arch/mips/include/uapi/asm/perf_regs.h
arch/powerpc/include/uapi/asm/perf_regs.h
arch/s390/include/uapi/asm/perf_regs.h
arch/x86/include/uapi/asm/perf_regs.h
const char *cmd;
char sbuf[STRERR_BUFSIZE];
+ perf_debug_setup();
+
/* libsubcmd init */
exec_cmd_init("perf", PREFIX, PERF_EXEC_PATH, EXEC_PATH_ENVIRONMENT);
pager_init(PERF_PAGER_ENVIRONMENT);
*/
pthread__block_sigwinch();
- perf_debug_setup();
-
while (1) {
static int done_help;
[
{
- "EventCode": "1003C",
+ "EventCode": "0x1003C",
"EventName": "PM_EXEC_STALL_DMISS_L2L3",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from either the local L2 or local L3."
},
{
- "EventCode": "34056",
+ "EventCode": "0x1E054",
+ "EventName": "PM_EXEC_STALL_DMISS_L21_L31",
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from another core's L2 or L3 on the same chip."
+ },
+ {
+ "EventCode": "0x34054",
+ "EventName": "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT",
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, without a dispatch conflict."
+ },
+ {
+ "EventCode": "0x34056",
"EventName": "PM_EXEC_STALL_LOAD_FINISH",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the NTF instruction merged with another load in the LMQ."
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was finishing a load after its data was reloaded from a data source beyond the local L1; cycles in which the LSU was processing an L1-hit; cycles in which the NTF instruction merged with another load in the LMQ; cycles in which the NTF instruction is waiting for a data reload for a load miss, but the data comes back with a non-NTF instruction."
},
{
- "EventCode": "3006C",
+ "EventCode": "0x3006C",
"EventName": "PM_RUN_CYC_SMT2_MODE",
"BriefDescription": "Cycles when this thread's run latch is set and the core is in SMT2 mode."
},
{
- "EventCode": "300F4",
+ "EventCode": "0x300F4",
"EventName": "PM_RUN_INST_CMPL_CONC",
"BriefDescription": "PowerPC instructions completed by this thread when all threads in the core had the run-latch set."
},
{
- "EventCode": "4C016",
+ "EventCode": "0x4C016",
"EventName": "PM_EXEC_STALL_DMISS_L2L3_CONFLICT",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, with a dispatch conflict."
},
{
- "EventCode": "4D014",
+ "EventCode": "0x4D014",
"EventName": "PM_EXEC_STALL_LOAD",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a load instruction executing in the Load Store Unit."
},
{
- "EventCode": "4D016",
+ "EventCode": "0x4D016",
"EventName": "PM_EXEC_STALL_PTESYNC",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a PTESYNC instruction executing in the Load Store Unit."
},
{
- "EventCode": "401EA",
+ "EventCode": "0x401EA",
"EventName": "PM_THRESH_EXC_128",
"BriefDescription": "Threshold counter exceeded a value of 128."
},
{
- "EventCode": "400F6",
+ "EventCode": "0x400F6",
"EventName": "PM_BR_MPRED_CMPL",
"BriefDescription": "A mispredicted branch completed. Includes direction and target."
}
[
{
- "EventCode": "4016E",
+ "EventCode": "0x4016E",
"EventName": "PM_THRESH_NOT_MET",
"BriefDescription": "Threshold counter did not meet threshold."
}
[
{
- "EventCode": "10004",
+ "EventCode": "0x10004",
"EventName": "PM_EXEC_STALL_TRANSLATION",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline suffered a TLB miss or ERAT miss and waited for it to resolve."
},
{
- "EventCode": "10010",
+ "EventCode": "0x10006",
+ "EventName": "PM_DISP_STALL_HELD_OTHER_CYC",
+ "BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any other reason."
+ },
+ {
+ "EventCode": "0x10010",
"EventName": "PM_PMC4_OVERFLOW",
"BriefDescription": "The event selected for PMC4 caused the event counter to overflow."
},
{
- "EventCode": "10020",
+ "EventCode": "0x10020",
"EventName": "PM_PMC4_REWIND",
"BriefDescription": "The speculative event selected for PMC4 rewinds and the counter for PMC4 is not charged."
},
{
- "EventCode": "10038",
+ "EventCode": "0x10038",
"EventName": "PM_DISP_STALL_TRANSLATION",
"BriefDescription": "Cycles when dispatch was stalled for this thread because the MMU was handling a translation miss."
},
{
- "EventCode": "1003A",
+ "EventCode": "0x1003A",
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L2",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L2 after suffering a branch mispredict."
},
{
- "EventCode": "1E050",
+ "EventCode": "0x1D05E",
+ "EventName": "PM_DISP_STALL_HELD_HALT_CYC",
+ "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of power management."
+ },
+ {
+ "EventCode": "0x1E050",
"EventName": "PM_DISP_STALL_HELD_STF_MAPPER_CYC",
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the STF mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR."
},
{
- "EventCode": "1F054",
+ "EventCode": "0x1F054",
"EventName": "PM_DTLB_HIT",
"BriefDescription": "The PTE required by the instruction was resident in the TLB (data TLB access). When MMCR1[16]=0 this event counts only demand hits. When MMCR1[16]=1 this event includes demand and prefetch. Applies to both HPT and RPT."
},
{
- "EventCode": "101E8",
+ "EventCode": "0x10064",
+ "EventName": "PM_DISP_STALL_IC_L2",
+ "BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L2."
+ },
+ {
+ "EventCode": "0x101E8",
"EventName": "PM_THRESH_EXC_256",
"BriefDescription": "Threshold counter exceeded a count of 256."
},
{
- "EventCode": "101EC",
+ "EventCode": "0x101EC",
"EventName": "PM_THRESH_MET",
"BriefDescription": "Threshold exceeded."
},
{
- "EventCode": "100F2",
+ "EventCode": "0x100F2",
"EventName": "PM_1PLUS_PPC_CMPL",
"BriefDescription": "Cycles in which at least one instruction is completed by this thread."
},
{
- "EventCode": "100F6",
+ "EventCode": "0x100F6",
"EventName": "PM_IERAT_MISS",
"BriefDescription": "IERAT Reloaded to satisfy an IERAT miss. All page sizes are counted by this event."
},
{
- "EventCode": "100F8",
+ "EventCode": "0x100F8",
"EventName": "PM_DISP_STALL_CYC",
"BriefDescription": "Cycles the ICT has no itags assigned to this thread (no instructions were dispatched during these cycles)."
},
{
- "EventCode": "20114",
+ "EventCode": "0x20006",
+ "EventName": "PM_DISP_STALL_HELD_ISSQ_FULL_CYC",
+ "BriefDescription": "Cycles in which the NTC instruction is held at dispatch due to Issue queue full. Includes issue queue and branch queue."
+ },
+ {
+ "EventCode": "0x20114",
"EventName": "PM_MRK_L2_RC_DISP",
"BriefDescription": "Marked instruction RC dispatched in L2."
},
{
- "EventCode": "2C010",
+ "EventCode": "0x2C010",
"EventName": "PM_EXEC_STALL_LSU",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the Load Store Unit. This does not include simple fixed point instructions."
},
{
- "EventCode": "2C016",
+ "EventCode": "0x2C016",
"EventName": "PM_DISP_STALL_IERAT_ONLY_MISS",
"BriefDescription": "Cycles when dispatch was stalled while waiting to resolve an instruction ERAT miss."
},
{
- "EventCode": "2C01E",
+ "EventCode": "0x2C01E",
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L3",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L3 after suffering a branch mispredict."
},
{
- "EventCode": "2D01A",
+ "EventCode": "0x2D01A",
"EventName": "PM_DISP_STALL_IC_MISS",
"BriefDescription": "Cycles when dispatch was stalled for this thread due to an Icache Miss."
},
{
- "EventCode": "2D01C",
- "EventName": "PM_CMPL_STALL_STCX",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was a stcx waiting for resolution from the nest before completing."
- },
- {
- "EventCode": "2E018",
+ "EventCode": "0x2E018",
"EventName": "PM_DISP_STALL_FETCH",
"BriefDescription": "Cycles when dispatch was stalled for this thread because Fetch was being held."
},
{
- "EventCode": "2E01A",
+ "EventCode": "0x2E01A",
"EventName": "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC",
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the XVFC mapper/SRB was full."
},
{
- "EventCode": "2C142",
+ "EventCode": "0x2C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC2",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "24050",
+ "EventCode": "0x24050",
"EventName": "PM_IOPS_DISP",
"BriefDescription": "Internal Operations dispatched. PM_IOPS_DISP / PM_INST_DISP will show the average number of internal operations per PowerPC instruction."
},
{
- "EventCode": "2405E",
+ "EventCode": "0x2405E",
"EventName": "PM_ISSUE_CANCEL",
"BriefDescription": "An instruction issued and the issue was later cancelled. Only one cancel per PowerPC instruction."
},
{
- "EventCode": "200FA",
+ "EventCode": "0x200FA",
"EventName": "PM_BR_TAKEN_CMPL",
"BriefDescription": "Branch Taken instruction completed."
},
{
- "EventCode": "30012",
+ "EventCode": "0x30004",
+ "EventName": "PM_DISP_STALL_FLUSH",
+ "BriefDescription": "Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet NTC. PM_EXEC_STALL_NTC_FLUSH only includes instructions that were flushed after becoming NTC."
+ },
+ {
+ "EventCode": "0x3000A",
+ "EventName": "PM_DISP_STALL_ITLB_MISS",
+ "BriefDescription": "Cycles when dispatch was stalled while waiting to resolve an instruction TLB miss."
+ },
+ {
+ "EventCode": "0x30012",
"EventName": "PM_FLUSH_COMPLETION",
"BriefDescription": "The instruction that was next to complete (oldest in the pipeline) did not complete because it suffered a flush."
},
{
- "EventCode": "30014",
+ "EventCode": "0x30014",
"EventName": "PM_EXEC_STALL_STORE",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a store instruction executing in the Load Store Unit."
},
{
- "EventCode": "30018",
+ "EventCode": "0x30018",
"EventName": "PM_DISP_STALL_HELD_SCOREBOARD_CYC",
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch while waiting on the Scoreboard. This event combines VSCR and FPSCR together."
},
{
- "EventCode": "30026",
+ "EventCode": "0x30026",
"EventName": "PM_EXEC_STALL_STORE_MISS",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a store whose cache line was not resident in the L1 and was waiting for allocation of the missing line into the L1."
},
{
- "EventCode": "3012A",
+ "EventCode": "0x3012A",
"EventName": "PM_MRK_L2_RC_DONE",
"BriefDescription": "L2 RC machine completed the transaction for the marked instruction."
},
{
- "EventCode": "3F046",
+ "EventCode": "0x3F046",
"EventName": "PM_ITLB_HIT_1G",
"BriefDescription": "Instruction TLB hit (IERAT reload) page size 1G, which implies Radix Page Table translation is in use. When MMCR1[17]=0 this event counts only for demand misses. When MMCR1[17]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "34058",
+ "EventCode": "0x34058",
"EventName": "PM_DISP_STALL_BR_MPRED_ICMISS",
"BriefDescription": "Cycles when dispatch was stalled after a mispredicted branch resulted in an instruction cache miss."
},
{
- "EventCode": "3D05C",
+ "EventCode": "0x3D05C",
"EventName": "PM_DISP_STALL_HELD_RENAME_CYC",
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch because the mapper/SRB was full. Includes GPR (count, link, tar), VSR, VMR, FPR and XVFC."
},
{
- "EventCode": "3E052",
+ "EventCode": "0x3E052",
"EventName": "PM_DISP_STALL_IC_L3",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L3."
},
{
- "EventCode": "3E054",
+ "EventCode": "0x3E054",
"EventName": "PM_LD_MISS_L1",
"BriefDescription": "Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count. i.e. if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load."
},
{
- "EventCode": "301EA",
+ "EventCode": "0x301EA",
"EventName": "PM_THRESH_EXC_1024",
"BriefDescription": "Threshold counter exceeded a value of 1024."
},
{
- "EventCode": "300FA",
+ "EventCode": "0x300FA",
"EventName": "PM_INST_FROM_L3MISS",
"BriefDescription": "The processor's instruction cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
},
{
- "EventCode": "40006",
+ "EventCode": "0x40006",
"EventName": "PM_ISSUE_KILL",
"BriefDescription": "Cycles in which an instruction or group of instructions were cancelled after being issued. This event increments once per occurrence, regardless of how many instructions are included in the issue group."
},
{
- "EventCode": "40116",
+ "EventCode": "0x40116",
"EventName": "PM_MRK_LARX_FIN",
"BriefDescription": "Marked load and reserve instruction (LARX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "4C010",
+ "EventCode": "0x4C010",
"EventName": "PM_DISP_STALL_BR_MPRED_IC_L3MISS",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from sources beyond the local L3 after suffering a mispredicted branch."
},
{
- "EventCode": "4D01E",
+ "EventCode": "0x4D01E",
"EventName": "PM_DISP_STALL_BR_MPRED",
"BriefDescription": "Cycles when dispatch was stalled for this thread due to a mispredicted branch."
},
{
- "EventCode": "4E010",
+ "EventCode": "0x4E010",
"EventName": "PM_DISP_STALL_IC_L3MISS",
"BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from any source beyond the local L3."
},
{
- "EventCode": "4E01A",
+ "EventCode": "0x4E01A",
"EventName": "PM_DISP_STALL_HELD_CYC",
"BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any reason."
},
{
- "EventCode": "44056",
+ "EventCode": "0x4003C",
+ "EventName": "PM_DISP_STALL_HELD_SYNC_CYC",
+ "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch."
+ },
+ {
+ "EventCode": "0x44056",
"EventName": "PM_VECTOR_ST_CMPL",
"BriefDescription": "Vector store instructions completed."
}
[
{
- "EventCode": "1E058",
+ "EventCode": "0x1E058",
"EventName": "PM_STCX_FAIL_FIN",
"BriefDescription": "Conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "4E050",
+ "EventCode": "0x4E050",
"EventName": "PM_STCX_PASS_FIN",
"BriefDescription": "Conditional store instruction (STCX) passed. LARX and STCX are instructions used to acquire a lock."
}
[
{
- "EventCode": "1002C",
+ "EventCode": "0x1002C",
"EventName": "PM_LD_PREFETCH_CACHE_LINE_MISS",
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a prefetch request."
},
{
- "EventCode": "10132",
+ "EventCode": "0x10132",
"EventName": "PM_MRK_INST_ISSUED",
"BriefDescription": "Marked instruction issued. Note that stores always get issued twice, the address gets issued to the LSU and the data gets issued to the VSU. Also, issues can sometimes get killed/cancelled and cause multiple sequential issues for the same instruction."
},
{
- "EventCode": "101E0",
+ "EventCode": "0x101E0",
"EventName": "PM_MRK_INST_DISP",
"BriefDescription": "The thread has dispatched a randomly sampled marked instruction."
},
{
- "EventCode": "101E2",
+ "EventCode": "0x101E2",
"EventName": "PM_MRK_BR_TAKEN_CMPL",
"BriefDescription": "Marked Branch Taken instruction completed."
},
{
- "EventCode": "20112",
+ "EventCode": "0x20112",
"EventName": "PM_MRK_NTF_FIN",
"BriefDescription": "The marked instruction became the oldest in the pipeline before it finished. It excludes instructions that finish at dispatch."
},
{
- "EventCode": "2C01C",
+ "EventCode": "0x2C01C",
"EventName": "PM_EXEC_STALL_DMISS_OFF_CHIP",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a remote chip."
},
{
- "EventCode": "20138",
+ "EventCode": "0x20138",
"EventName": "PM_MRK_ST_NEST",
"BriefDescription": "A store has been sampled/marked and is at the point of execution where it has completed in the core and can no longer be flushed. At this point the store is sent to the L2."
},
{
- "EventCode": "2013A",
+ "EventCode": "0x2013A",
"EventName": "PM_MRK_BRU_FIN",
"BriefDescription": "Marked Branch instruction finished."
},
{
- "EventCode": "2C144",
+ "EventCode": "0x2C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC2",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[15:27]."
},
{
- "EventCode": "24156",
+ "EventCode": "0x24156",
"EventName": "PM_MRK_STCX_FIN",
"BriefDescription": "Marked conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "24158",
+ "EventCode": "0x24158",
"EventName": "PM_MRK_INST",
"BriefDescription": "An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens."
},
{
- "EventCode": "2415C",
+ "EventCode": "0x2415C",
"EventName": "PM_MRK_BR_CMPL",
"BriefDescription": "A marked branch completed. All branches are included."
},
{
- "EventCode": "200FD",
+ "EventCode": "0x200FD",
"EventName": "PM_L1_ICACHE_MISS",
"BriefDescription": "Demand iCache Miss."
},
{
- "EventCode": "30130",
+ "EventCode": "0x30130",
"EventName": "PM_MRK_INST_FIN",
"BriefDescription": "marked instruction finished. Excludes instructions that finish at dispatch. Note that stores always finish twice since the address gets issued to the LSU and the data gets issued to the VSU."
},
{
- "EventCode": "34146",
+ "EventCode": "0x34146",
"EventName": "PM_MRK_LD_CMPL",
"BriefDescription": "Marked loads completed."
},
{
- "EventCode": "3E158",
+ "EventCode": "0x3E158",
"EventName": "PM_MRK_STCX_FAIL",
"BriefDescription": "Marked conditional store instruction (STCX) failed. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "3E15A",
+ "EventCode": "0x3E15A",
"EventName": "PM_MRK_ST_FIN",
"BriefDescription": "The marked instruction was a store of any kind."
},
{
- "EventCode": "30068",
+ "EventCode": "0x30068",
"EventName": "PM_L1_ICACHE_RELOADED_PREF",
"BriefDescription": "Counts all Icache prefetch reloads ( includes demand turned into prefetch)."
},
{
- "EventCode": "301E4",
+ "EventCode": "0x301E4",
"EventName": "PM_MRK_BR_MPRED_CMPL",
"BriefDescription": "Marked Branch Mispredicted. Includes direction and target."
},
{
- "EventCode": "300F6",
+ "EventCode": "0x300F6",
"EventName": "PM_LD_DEMAND_MISS_L1",
"BriefDescription": "The L1 cache was reloaded with a line that fulfills a demand miss request. Counted at reload time, before finish."
},
{
- "EventCode": "300FE",
+ "EventCode": "0x300FE",
"EventName": "PM_DATA_FROM_L3MISS",
"BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss."
},
{
- "EventCode": "40012",
+ "EventCode": "0x40012",
"EventName": "PM_L1_ICACHE_RELOADED_ALL",
"BriefDescription": "Counts all Icache reloads includes demand, prefetch, prefetch turned into demand and demand turned into prefetch."
},
{
- "EventCode": "40134",
+ "EventCode": "0x40134",
"EventName": "PM_MRK_INST_TIMEO",
"BriefDescription": "Marked instruction finish timeout (instruction was lost)."
},
{
- "EventCode": "4003C",
- "EventName": "PM_DISP_STALL_HELD_SYNC_CYC",
- "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch."
- },
- {
- "EventCode": "4505A",
+ "EventCode": "0x4505A",
"EventName": "PM_SP_FLOP_CMPL",
"BriefDescription": "Single Precision floating point instructions completed."
},
{
- "EventCode": "4D058",
+ "EventCode": "0x4D058",
"EventName": "PM_VECTOR_FLOP_CMPL",
"BriefDescription": "Vector floating point instructions completed."
},
{
- "EventCode": "4D05A",
+ "EventCode": "0x4D05A",
"EventName": "PM_NON_MATH_FLOP_CMPL",
"BriefDescription": "Non Math instructions completed."
},
{
- "EventCode": "401E0",
+ "EventCode": "0x401E0",
"EventName": "PM_MRK_INST_CMPL",
"BriefDescription": "marked instruction completed."
},
{
- "EventCode": "400FE",
+ "EventCode": "0x400FE",
"EventName": "PM_DATA_FROM_MEMORY",
"BriefDescription": "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss."
}
[
{
- "EventCode": "1000A",
+ "EventCode": "0x1000A",
"EventName": "PM_PMC3_REWIND",
"BriefDescription": "The speculative event selected for PMC3 rewinds and the counter for PMC3 is not charged."
},
{
- "EventCode": "1C040",
+ "EventCode": "0x1C040",
"EventName": "PM_XFER_FROM_SRC_PMC1",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "1C142",
+ "EventCode": "0x1C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC1",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[0:12]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "1C144",
+ "EventCode": "0x1C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC1",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[0:12]."
},
{
- "EventCode": "1C056",
+ "EventCode": "0x1C056",
"EventName": "PM_DERAT_MISS_4K",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 4K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "1C058",
+ "EventCode": "0x1C058",
"EventName": "PM_DTLB_MISS_16G",
"BriefDescription": "Data TLB reload (after a miss) page size 16G. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "1C05C",
+ "EventCode": "0x1C05C",
"EventName": "PM_DTLB_MISS_2M",
"BriefDescription": "Data TLB reload (after a miss) page size 2M. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "1E056",
+ "EventCode": "0x1E056",
"EventName": "PM_EXEC_STALL_STORE_PIPE",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the store unit. This does not include cycles spent handling store misses, PTESYNC instructions or TLBIE instructions."
},
{
- "EventCode": "1F150",
+ "EventCode": "0x1F150",
"EventName": "PM_MRK_ST_L2_CYC",
"BriefDescription": "Cycles from L2 RC dispatch to L2 RC completion."
},
{
- "EventCode": "10062",
+ "EventCode": "0x10062",
"EventName": "PM_LD_L3MISS_PEND_CYC",
"BriefDescription": "Cycles L3 miss was pending for this thread."
},
{
- "EventCode": "20010",
+ "EventCode": "0x20010",
"EventName": "PM_PMC1_OVERFLOW",
"BriefDescription": "The event selected for PMC1 caused the event counter to overflow."
},
{
- "EventCode": "2001A",
+ "EventCode": "0x2001A",
"EventName": "PM_ITLB_HIT",
"BriefDescription": "The PTE required to translate the instruction address was resident in the TLB (instruction TLB access/IERAT reload). Applies to both HPT and RPT. When MMCR1[17]=0 this event counts only for demand misses. When MMCR1[17]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "2003E",
+ "EventCode": "0x2003E",
"EventName": "PM_PTESYNC_FIN",
"BriefDescription": "Ptesync instruction finished in the store unit. Only one ptesync can finish at a time."
},
{
- "EventCode": "2C040",
+ "EventCode": "0x2C040",
"EventName": "PM_XFER_FROM_SRC_PMC2",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[15:27]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "2C054",
+ "EventCode": "0x2C054",
"EventName": "PM_DERAT_MISS_64K",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 64K. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "2C056",
+ "EventCode": "0x2C056",
"EventName": "PM_DTLB_MISS_4K",
"BriefDescription": "Data TLB reload (after a miss) page size 4K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "2D154",
+ "EventCode": "0x2D154",
"EventName": "PM_MRK_DERAT_MISS_64K",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 64K for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "200F6",
+ "EventCode": "0x200F6",
"EventName": "PM_DERAT_MISS",
"BriefDescription": "DERAT Reloaded to satisfy a DERAT miss. All page sizes are counted by this event. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "3000A",
- "EventName": "PM_DISP_STALL_ITLB_MISS",
- "BriefDescription": "Cycles when dispatch was stalled while waiting to resolve an instruction TLB miss."
- },
- {
- "EventCode": "30016",
+ "EventCode": "0x30016",
"EventName": "PM_EXEC_STALL_DERAT_DTLB_MISS",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline suffered a TLB miss and waited for it resolve."
},
{
- "EventCode": "3C040",
+ "EventCode": "0x3C040",
"EventName": "PM_XFER_FROM_SRC_PMC3",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "3C142",
+ "EventCode": "0x3C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC3",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[30:42]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "3C144",
+ "EventCode": "0x3C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC3",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[30:42]."
},
{
- "EventCode": "3C054",
+ "EventCode": "0x3C054",
"EventName": "PM_DERAT_MISS_16M",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 16M. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "3C056",
+ "EventCode": "0x3C056",
"EventName": "PM_DTLB_MISS_64K",
"BriefDescription": "Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "3C058",
+ "EventCode": "0x3C058",
"EventName": "PM_LARX_FIN",
"BriefDescription": "Load and reserve instruction (LARX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "301E2",
+ "EventCode": "0x301E2",
"EventName": "PM_MRK_ST_CMPL",
"BriefDescription": "Marked store completed and sent to nest. Note that this count excludes cache-inhibited stores."
},
{
- "EventCode": "300FC",
+ "EventCode": "0x300FC",
"EventName": "PM_DTLB_MISS",
"BriefDescription": "The DPTEG required for the load/store instruction in execution was missing from the TLB. It includes pages of all sizes for demand and prefetch activity."
},
{
- "EventCode": "4D02C",
+ "EventCode": "0x4D02C",
"EventName": "PM_PMC1_REWIND",
"BriefDescription": "The speculative event selected for PMC1 rewinds and the counter for PMC1 is not charged."
},
{
- "EventCode": "4003E",
+ "EventCode": "0x4003E",
"EventName": "PM_LD_CMPL",
"BriefDescription": "Loads completed."
},
{
- "EventCode": "4C040",
+ "EventCode": "0x4C040",
"EventName": "PM_XFER_FROM_SRC_PMC4",
"BriefDescription": "The processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "4C142",
+ "EventCode": "0x4C142",
"EventName": "PM_MRK_XFER_FROM_SRC_PMC4",
"BriefDescription": "For a marked data transfer instruction, the processor's L1 data cache was reloaded from the source specified in MMCR3[45:57]. If MMCR1[16|17] is 0 (default), this count includes only lines that were reloaded to satisfy a demand miss. If MMCR1[16|17] is 1, this count includes both demand misses and prefetch reloads."
},
{
- "EventCode": "4C144",
+ "EventCode": "0x4C144",
"EventName": "PM_MRK_XFER_FROM_SRC_CYC_PMC4",
"BriefDescription": "Cycles taken for a marked demand miss to reload a line from the source specified in MMCR3[45:57]."
},
{
- "EventCode": "4C056",
+ "EventCode": "0x4C056",
"EventName": "PM_DTLB_MISS_16M",
"BriefDescription": "Data TLB reload (after a miss) page size 16M. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "4C05A",
+ "EventCode": "0x4C05A",
"EventName": "PM_DTLB_MISS_1G",
"BriefDescription": "Data TLB reload (after a miss) page size 1G. Implies radix translation was used. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "4C15E",
+ "EventCode": "0x4C15E",
"EventName": "PM_MRK_DTLB_MISS_64K",
"BriefDescription": "Marked Data TLB reload (after a miss) page size 64K. When MMCR1[16]=0 this event counts only for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "4D056",
+ "EventCode": "0x4D056",
"EventName": "PM_NON_FMA_FLOP_CMPL",
"BriefDescription": "Non FMA instruction completed."
},
{
- "EventCode": "40164",
+ "EventCode": "0x40164",
"EventName": "PM_MRK_DERAT_MISS_2M",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 2M for a marked instruction. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
}
[
{
- "EventCode": "10016",
+ "EventCode": "0x10016",
"EventName": "PM_VSU0_ISSUE",
"BriefDescription": "VSU instructions issued to VSU pipe 0."
},
{
- "EventCode": "1001C",
+ "EventCode": "0x1001C",
"EventName": "PM_ULTRAVISOR_INST_CMPL",
"BriefDescription": "PowerPC instructions that completed while the thread was in ultravisor state."
},
{
- "EventCode": "100F0",
+ "EventCode": "0x100F0",
"EventName": "PM_CYC",
"BriefDescription": "Processor cycles."
},
{
- "EventCode": "10134",
+ "EventCode": "0x10134",
"EventName": "PM_MRK_ST_DONE_L2",
"BriefDescription": "Marked stores completed in L2 (RC machine done)."
},
{
- "EventCode": "1505E",
+ "EventCode": "0x1505E",
"EventName": "PM_LD_HIT_L1",
"BriefDescription": "Loads that finished without experiencing an L1 miss."
},
{
- "EventCode": "1D05E",
- "EventName": "PM_DISP_STALL_HELD_HALT_CYC",
- "BriefDescription": "Cycles in which the NTC instruction is held at dispatch because of power management."
- },
- {
- "EventCode": "1E054",
- "EventName": "PM_EXEC_STALL_DMISS_L21_L31",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from another core's L2 or L3 on the same chip."
- },
- {
- "EventCode": "1E05A",
- "EventName": "PM_CMPL_STALL_LWSYNC",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was a lwsync waiting to complete."
- },
- {
- "EventCode": "1F056",
+ "EventCode": "0x1F056",
"EventName": "PM_DISP_SS0_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 1 or 2 instructions."
},
{
- "EventCode": "1F15C",
+ "EventCode": "0x1F15C",
"EventName": "PM_MRK_STCX_L2_CYC",
"BriefDescription": "Cycles spent in the nest portion of a marked Stcx instruction. It starts counting when the operation starts to drain to the L2 and it stops counting when the instruction retires from the Instruction Completion Table (ICT) in the Instruction Sequencing Unit (ISU)."
},
{
- "EventCode": "10066",
+ "EventCode": "0x10066",
"EventName": "PM_ADJUNCT_CYC",
"BriefDescription": "Cycles in which the thread is in Adjunct state. MSR[S HV PR] bits = 011."
},
{
- "EventCode": "101E4",
+ "EventCode": "0x101E4",
"EventName": "PM_MRK_L1_ICACHE_MISS",
"BriefDescription": "Marked Instruction suffered an icache Miss."
},
{
- "EventCode": "101EA",
+ "EventCode": "0x101EA",
"EventName": "PM_MRK_L1_RELOAD_VALID",
"BriefDescription": "Marked demand reload."
},
{
- "EventCode": "100F4",
+ "EventCode": "0x100F4",
"EventName": "PM_FLOP_CMPL",
"BriefDescription": "Floating Point Operations Completed. Includes any type. It counts once for each 1, 2, 4 or 8 flop instruction. Use PM_1|2|4|8_FLOP_CMPL events to count flops."
},
{
- "EventCode": "100FA",
+ "EventCode": "0x100FA",
"EventName": "PM_RUN_LATCH_ANY_THREAD_CYC",
"BriefDescription": "Cycles when at least one thread has the run latch set."
},
{
- "EventCode": "100FC",
+ "EventCode": "0x100FC",
"EventName": "PM_LD_REF_L1",
"BriefDescription": "All L1 D cache load references counted at finish, gated by reject. In P9 and earlier this event counted only cacheable loads but in P10 both cacheable and non-cacheable loads are included."
},
{
- "EventCode": "20006",
- "EventName": "PM_DISP_STALL_HELD_ISSQ_FULL_CYC",
- "BriefDescription": "Cycles in which the NTC instruction is held at dispatch due to Issue queue full. Includes issue queue and branch queue."
- },
- {
- "EventCode": "2000C",
+ "EventCode": "0x2000C",
"EventName": "PM_RUN_LATCH_ALL_THREADS_CYC",
"BriefDescription": "Cycles when the run latch is set for all threads."
},
{
- "EventCode": "2E010",
+ "EventCode": "0x2E010",
"EventName": "PM_ADJUNCT_INST_CMPL",
"BriefDescription": "PowerPC instructions that completed while the thread is in Adjunct state."
},
{
- "EventCode": "2E014",
+ "EventCode": "0x2E014",
"EventName": "PM_STCX_FIN",
"BriefDescription": "Conditional store instruction (STCX) finished. LARX and STCX are instructions used to acquire a lock."
},
{
- "EventCode": "20130",
+ "EventCode": "0x20130",
"EventName": "PM_MRK_INST_DECODED",
"BriefDescription": "An instruction was marked at decode time. Random Instruction Sampling (RIS) only."
},
{
- "EventCode": "20132",
+ "EventCode": "0x20132",
"EventName": "PM_MRK_DFU_ISSUE",
"BriefDescription": "The marked instruction was a decimal floating point operation issued to the VSU. Measured at issue time."
},
{
- "EventCode": "20134",
+ "EventCode": "0x20134",
"EventName": "PM_MRK_FXU_ISSUE",
"BriefDescription": "The marked instruction was a fixed point operation issued to the VSU. Measured at issue time."
},
{
- "EventCode": "2505C",
+ "EventCode": "0x2505C",
"EventName": "PM_VSU_ISSUE",
"BriefDescription": "At least one VSU instruction was issued to one of the VSU pipes. Up to 4 per cycle. Includes fixed point operations."
},
{
- "EventCode": "2F054",
+ "EventCode": "0x2F054",
"EventName": "PM_DISP_SS1_2_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 1 or 2 instructions."
},
{
- "EventCode": "2F056",
+ "EventCode": "0x2F056",
"EventName": "PM_DISP_SS1_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 1 dispatches either 3 or 4 instructions."
},
{
- "EventCode": "2006C",
+ "EventCode": "0x2006C",
"EventName": "PM_RUN_CYC_SMT4_MODE",
"BriefDescription": "Cycles when this thread's run latch is set and the core is in SMT4 mode."
},
{
- "EventCode": "201E0",
+ "EventCode": "0x201E0",
"EventName": "PM_MRK_DATA_FROM_MEMORY",
"BriefDescription": "The processor's data cache was reloaded from local, remote, or distant memory due to a demand miss for a marked load."
},
{
- "EventCode": "201E4",
+ "EventCode": "0x201E4",
"EventName": "PM_MRK_DATA_FROM_L3MISS",
"BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss for a marked load."
},
{
- "EventCode": "201E8",
+ "EventCode": "0x201E8",
"EventName": "PM_THRESH_EXC_512",
"BriefDescription": "Threshold counter exceeded a value of 512."
},
{
- "EventCode": "200F2",
+ "EventCode": "0x200F2",
"EventName": "PM_INST_DISP",
"BriefDescription": "PowerPC instructions dispatched."
},
{
- "EventCode": "30132",
+ "EventCode": "0x30132",
"EventName": "PM_MRK_VSU_FIN",
"BriefDescription": "VSU marked instructions finished. Excludes simple FX instructions issued to the Store Unit."
},
{
- "EventCode": "30038",
+ "EventCode": "0x30038",
"EventName": "PM_EXEC_STALL_DMISS_LMEM",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local memory, local OpenCapp cache, or local OpenCapp memory."
},
{
- "EventCode": "3F04A",
+ "EventCode": "0x3F04A",
"EventName": "PM_LSU_ST5_FIN",
"BriefDescription": "LSU Finished an internal operation in ST2 port."
},
{
- "EventCode": "34054",
- "EventName": "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from the local L2 or local L3, without a dispatch conflict."
- },
- {
- "EventCode": "3405A",
+ "EventCode": "0x3405A",
"EventName": "PM_PRIVILEGED_INST_CMPL",
"BriefDescription": "PowerPC Instructions that completed while the thread is in Privileged state."
},
{
- "EventCode": "3F150",
+ "EventCode": "0x3F150",
"EventName": "PM_MRK_ST_DRAIN_CYC",
"BriefDescription": "cycles to drain st from core to L2."
},
{
- "EventCode": "3F054",
+ "EventCode": "0x3F054",
"EventName": "PM_DISP_SS0_4_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 3 or 4 instructions."
},
{
- "EventCode": "3F056",
+ "EventCode": "0x3F056",
"EventName": "PM_DISP_SS0_8_INSTR_CYC",
"BriefDescription": "Cycles in which Superslice 0 dispatches either 5, 6, 7 or 8 instructions."
},
{
- "EventCode": "30162",
+ "EventCode": "0x30162",
"EventName": "PM_MRK_ISSUE_DEPENDENT_LOAD",
"BriefDescription": "The marked instruction was dependent on a load. It is eligible for issue kill."
},
{
- "EventCode": "40114",
+ "EventCode": "0x40114",
"EventName": "PM_MRK_START_PROBE_NOP_DISP",
"BriefDescription": "Marked Start probe nop dispatched. Instruction AND R0,R0,R0."
},
{
- "EventCode": "4001C",
+ "EventCode": "0x4001C",
"EventName": "PM_VSU_FIN",
"BriefDescription": "VSU instructions finished."
},
{
- "EventCode": "4C01A",
+ "EventCode": "0x4C01A",
"EventName": "PM_EXEC_STALL_DMISS_OFF_NODE",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a distant chip."
},
{
- "EventCode": "4D012",
+ "EventCode": "0x4D012",
"EventName": "PM_PMC3_SAVED",
"BriefDescription": "The conditions for the speculative event selected for PMC3 are met and PMC3 is charged."
},
{
- "EventCode": "4D022",
+ "EventCode": "0x4D022",
"EventName": "PM_HYPERVISOR_INST_CMPL",
"BriefDescription": "PowerPC instructions that completed while the thread is in hypervisor state."
},
{
- "EventCode": "4D026",
+ "EventCode": "0x4D026",
"EventName": "PM_ULTRAVISOR_CYC",
"BriefDescription": "Cycles when the thread is in Ultravisor state. MSR[S HV PR]=110."
},
{
- "EventCode": "4D028",
+ "EventCode": "0x4D028",
"EventName": "PM_PRIVILEGED_CYC",
"BriefDescription": "Cycles when the thread is in Privileged state. MSR[S HV PR]=x00."
},
{
- "EventCode": "40030",
+ "EventCode": "0x40030",
"EventName": "PM_INST_FIN",
"BriefDescription": "Instructions finished."
},
{
- "EventCode": "44146",
+ "EventCode": "0x44146",
"EventName": "PM_MRK_STCX_CORE_CYC",
"BriefDescription": "Cycles spent in the core portion of a marked Stcx instruction. It starts counting when the instruction is decoded and stops counting when it drains into the L2."
},
{
- "EventCode": "44054",
+ "EventCode": "0x44054",
"EventName": "PM_VECTOR_LD_CMPL",
"BriefDescription": "Vector load instructions completed."
},
{
- "EventCode": "45054",
+ "EventCode": "0x45054",
"EventName": "PM_FMA_CMPL",
"BriefDescription": "Two floating point instructions completed (FMA class of instructions: fmadd, fnmadd, fmsub, fnmsub). Scalar instructions only."
},
{
- "EventCode": "45056",
+ "EventCode": "0x45056",
"EventName": "PM_SCALAR_FLOP_CMPL",
"BriefDescription": "Scalar floating point instructions completed."
},
{
- "EventCode": "4505C",
+ "EventCode": "0x4505C",
"EventName": "PM_MATH_FLOP_CMPL",
"BriefDescription": "Math floating point instructions completed."
},
{
- "EventCode": "4D05E",
+ "EventCode": "0x4D05E",
"EventName": "PM_BR_CMPL",
"BriefDescription": "A branch completed. All branches are included."
},
{
- "EventCode": "4E15E",
+ "EventCode": "0x4E15E",
"EventName": "PM_MRK_INST_FLUSHED",
"BriefDescription": "The marked instruction was flushed."
},
{
- "EventCode": "401E6",
+ "EventCode": "0x401E6",
"EventName": "PM_MRK_INST_FROM_L3MISS",
"BriefDescription": "The processor's instruction cache was reloaded from a source other than the local core's L1, L2, or L3 due to a demand miss for a marked instruction."
},
{
- "EventCode": "401E8",
+ "EventCode": "0x401E8",
"EventName": "PM_MRK_DATA_FROM_L2MISS",
"BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1 or L2 due to a demand miss for a marked load."
},
{
- "EventCode": "400F0",
+ "EventCode": "0x400F0",
"EventName": "PM_LD_DEMAND_MISS_L1_FIN",
"BriefDescription": "Load Missed L1, counted at finish time."
},
{
- "EventCode": "400FA",
+ "EventCode": "0x400FA",
"EventName": "PM_RUN_INST_CMPL",
"BriefDescription": "Completed PowerPC instructions gated by the run latch."
}
[
{
- "EventCode": "100FE",
+ "EventCode": "0x100FE",
"EventName": "PM_INST_CMPL",
"BriefDescription": "PowerPC instructions completed."
},
{
- "EventCode": "10006",
- "EventName": "PM_DISP_STALL_HELD_OTHER_CYC",
- "BriefDescription": "Cycles in which the NTC instruction is held at dispatch for any other reason."
- },
- {
- "EventCode": "1000C",
+ "EventCode": "0x1000C",
"EventName": "PM_LSU_LD0_FIN",
"BriefDescription": "LSU Finished an internal operation in LD0 port."
},
{
- "EventCode": "1000E",
+ "EventCode": "0x1000E",
"EventName": "PM_MMA_ISSUED",
"BriefDescription": "MMA instructions issued."
},
{
- "EventCode": "10012",
+ "EventCode": "0x10012",
"EventName": "PM_LSU_ST0_FIN",
"BriefDescription": "LSU Finished an internal operation in ST0 port."
},
{
- "EventCode": "10014",
+ "EventCode": "0x10014",
"EventName": "PM_LSU_ST4_FIN",
"BriefDescription": "LSU Finished an internal operation in ST4 port."
},
{
- "EventCode": "10018",
+ "EventCode": "0x10018",
"EventName": "PM_IC_DEMAND_CYC",
"BriefDescription": "Cycles in which an instruction reload is pending to satisfy a demand miss."
},
{
- "EventCode": "10022",
+ "EventCode": "0x10022",
"EventName": "PM_PMC2_SAVED",
"BriefDescription": "The conditions for the speculative event selected for PMC2 are met and PMC2 is charged."
},
{
- "EventCode": "10024",
+ "EventCode": "0x10024",
"EventName": "PM_PMC5_OVERFLOW",
"BriefDescription": "The event selected for PMC5 caused the event counter to overflow."
},
{
- "EventCode": "10058",
+ "EventCode": "0x10058",
"EventName": "PM_EXEC_STALL_FIN_AT_DISP",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline finished at dispatch and did not require execution in the LSU, BRU or VSU."
},
{
- "EventCode": "1005A",
+ "EventCode": "0x1005A",
"EventName": "PM_FLUSH_MPRED",
"BriefDescription": "A flush occurred due to a mispredicted branch. Includes target and direction."
},
{
- "EventCode": "1C05A",
+ "EventCode": "0x1C05A",
"EventName": "PM_DERAT_MISS_2M",
"BriefDescription": "Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation. When MMCR1[16]=0 this event counts only DERAT reloads for demand misses. When MMCR1[16]=1 this event includes demand misses and prefetches."
},
{
- "EventCode": "10064",
- "EventName": "PM_DISP_STALL_IC_L2",
- "BriefDescription": "Cycles when dispatch was stalled while the instruction was fetched from the local L2."
+ "EventCode": "0x1E05A",
+ "EventName": "PM_CMPL_STALL_LWSYNC",
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was a lwsync waiting to complete."
},
{
- "EventCode": "10068",
+ "EventCode": "0x10068",
"EventName": "PM_BR_FIN",
"BriefDescription": "A branch instruction finished. Includes predicted/mispredicted/unconditional."
},
{
- "EventCode": "1006A",
+ "EventCode": "0x1006A",
"EventName": "PM_FX_LSU_FIN",
"BriefDescription": "Simple fixed point instruction issued to the store unit. Measured at finish time."
},
{
- "EventCode": "1006C",
+ "EventCode": "0x1006C",
"EventName": "PM_RUN_CYC_ST_MODE",
"BriefDescription": "Cycles when the run latch is set and the core is in ST mode."
},
{
- "EventCode": "20004",
+ "EventCode": "0x20004",
"EventName": "PM_ISSUE_STALL",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was dispatched but not issued yet."
},
{
- "EventCode": "2000A",
+ "EventCode": "0x2000A",
"EventName": "PM_HYPERVISOR_CYC",
"BriefDescription": "Cycles when the thread is in Hypervisor state. MSR[S HV PR]=010."
},
{
- "EventCode": "2000E",
+ "EventCode": "0x2000E",
"EventName": "PM_LSU_LD1_FIN",
"BriefDescription": "LSU Finished an internal operation in LD1 port."
},
{
- "EventCode": "2C014",
+ "EventCode": "0x2C014",
"EventName": "PM_CMPL_STALL_SPECIAL",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline required special handling before completing."
},
{
- "EventCode": "2C018",
+ "EventCode": "0x2C018",
"EventName": "PM_EXEC_STALL_DMISS_L3MISS",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for a load miss to resolve from a source beyond the local L2 or local L3."
},
{
- "EventCode": "2D010",
+ "EventCode": "0x2D010",
"EventName": "PM_LSU_ST1_FIN",
"BriefDescription": "LSU Finished an internal operation in ST1 port."
},
{
- "EventCode": "2D012",
+ "EventCode": "0x2D012",
"EventName": "PM_VSU1_ISSUE",
"BriefDescription": "VSU instructions issued to VSU pipe 1."
},
{
- "EventCode": "2D018",
+ "EventCode": "0x2D018",
"EventName": "PM_EXEC_STALL_VSU",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the VSU (includes FXU, VSU, CRU)."
},
{
- "EventCode": "2E01E",
+ "EventCode": "0x2D01C",
+ "EventName": "PM_CMPL_STALL_STCX",
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was a stcx waiting for resolution from the nest before completing."
+ },
+ {
+ "EventCode": "0x2E01E",
"EventName": "PM_EXEC_STALL_NTC_FLUSH",
- "BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed. Note that if the flush of the oldest instruction happens after finish, the cycles from dispatch to issue will be included in PM_DISP_STALL and the cycles from issue to finish will be included in PM_EXEC_STALL and its corresponding children."
+ "BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in any unit before it was flushed. Note that if the flush of the oldest instruction happens after finish, the cycles from dispatch to issue will be included in PM_DISP_STALL and the cycles from issue to finish will be included in PM_EXEC_STALL and its corresponding children. This event will also count cycles when the previous NTF instruction is still completing and the new NTF instruction is stalled at dispatch."
},
{
- "EventCode": "2013C",
+ "EventCode": "0x2013C",
"EventName": "PM_MRK_FX_LSU_FIN",
"BriefDescription": "The marked instruction was simple fixed point that was issued to the store unit. Measured at finish time."
},
{
- "EventCode": "2405A",
+ "EventCode": "0x2405A",
"EventName": "PM_NTC_FIN",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline (NTC) finishes. Note that instructions can finish out of order, therefore not all the instructions that finish have a Next-to-complete status."
},
{
- "EventCode": "201E2",
+ "EventCode": "0x201E2",
"EventName": "PM_MRK_LD_MISS_L1",
"BriefDescription": "Marked DL1 Demand Miss counted at finish time."
},
{
- "EventCode": "200F4",
+ "EventCode": "0x200F4",
"EventName": "PM_RUN_CYC",
"BriefDescription": "Processor cycles gated by the run latch."
},
{
- "EventCode": "30004",
- "EventName": "PM_DISP_STALL_FLUSH",
- "BriefDescription": "Cycles when dispatch was stalled because of a flush that happened to an instruction(s) that was not yet NTC. PM_EXEC_STALL_NTC_FLUSH only includes instructions that were flushed after becoming NTC."
- },
- {
- "EventCode": "30008",
+ "EventCode": "0x30008",
"EventName": "PM_EXEC_STALL",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting to finish in one of the execution units (BRU, LSU, VSU). Only cycles between issue and finish are counted in this category."
},
{
- "EventCode": "3001A",
+ "EventCode": "0x3001A",
"EventName": "PM_LSU_ST2_FIN",
"BriefDescription": "LSU Finished an internal operation in ST2 port."
},
{
- "EventCode": "30020",
+ "EventCode": "0x30020",
"EventName": "PM_PMC2_REWIND",
"BriefDescription": "The speculative event selected for PMC2 rewinds and the counter for PMC2 is not charged."
},
{
- "EventCode": "30022",
+ "EventCode": "0x30022",
"EventName": "PM_PMC4_SAVED",
"BriefDescription": "The conditions for the speculative event selected for PMC4 are met and PMC4 is charged."
},
{
- "EventCode": "30024",
+ "EventCode": "0x30024",
"EventName": "PM_PMC6_OVERFLOW",
"BriefDescription": "The event selected for PMC6 caused the event counter to overflow."
},
{
- "EventCode": "30028",
+ "EventCode": "0x30028",
"EventName": "PM_CMPL_STALL_MEM_ECC",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC."
},
{
- "EventCode": "30036",
+ "EventCode": "0x30036",
"EventName": "PM_EXEC_STALL_SIMPLE_FX",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a simple fixed point instruction executing in the Load Store Unit."
},
{
- "EventCode": "3003A",
+ "EventCode": "0x3003A",
"EventName": "PM_CMPL_STALL_EXCEPTION",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete."
},
{
- "EventCode": "3F044",
+ "EventCode": "0x3F044",
"EventName": "PM_VSU2_ISSUE",
"BriefDescription": "VSU instructions issued to VSU pipe 2."
},
{
- "EventCode": "30058",
+ "EventCode": "0x30058",
"EventName": "PM_TLBIE_FIN",
"BriefDescription": "TLBIE instructions finished in the LSU. Two TLBIEs can finish each cycle. All will be counted."
},
{
- "EventCode": "3D058",
+ "EventCode": "0x3D058",
"EventName": "PM_SCALAR_FSQRT_FDIV_ISSUE",
"BriefDescription": "Scalar versions of four floating point operations: fdiv,fsqrt (xvdivdp, xvdivsp, xvsqrtdp, xvsqrtsp)."
},
{
- "EventCode": "30066",
+ "EventCode": "0x30066",
"EventName": "PM_LSU_FIN",
"BriefDescription": "LSU Finished an internal operation (up to 4 per cycle)."
},
{
- "EventCode": "40004",
+ "EventCode": "0x40004",
"EventName": "PM_FXU_ISSUE",
"BriefDescription": "A fixed point instruction was issued to the VSU."
},
{
- "EventCode": "40008",
+ "EventCode": "0x40008",
"EventName": "PM_NTC_ALL_FIN",
"BriefDescription": "Cycles in which both instructions in the ICT entry pair show as finished. These are the cycles between finish and completion for the oldest pair of instructions in the pipeline."
},
{
- "EventCode": "40010",
+ "EventCode": "0x40010",
"EventName": "PM_PMC3_OVERFLOW",
"BriefDescription": "The event selected for PMC3 caused the event counter to overflow."
},
{
- "EventCode": "4C012",
+ "EventCode": "0x4C012",
"EventName": "PM_EXEC_STALL_DERAT_ONLY_MISS",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline suffered an ERAT miss and waited for it resolve."
},
{
- "EventCode": "4C018",
+ "EventCode": "0x4C018",
"EventName": "PM_CMPL_STALL",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline cannot complete because the thread was blocked for any reason."
},
{
- "EventCode": "4C01E",
+ "EventCode": "0x4C01E",
"EventName": "PM_LSU_ST3_FIN",
"BriefDescription": "LSU Finished an internal operation in ST3 port."
},
{
- "EventCode": "4D018",
+ "EventCode": "0x4D018",
"EventName": "PM_EXEC_STALL_BRU",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was executing in the Branch unit."
},
{
- "EventCode": "4D01A",
+ "EventCode": "0x4D01A",
"EventName": "PM_CMPL_STALL_HWSYNC",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a hwsync waiting for response from L2 before completing."
},
{
- "EventCode": "4D01C",
+ "EventCode": "0x4D01C",
"EventName": "PM_EXEC_STALL_TLBIEL",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a TLBIEL instruction executing in the Load Store Unit. TLBIEL instructions have lower overhead than TLBIE instructions because they don't get set to the nest."
},
{
- "EventCode": "4E012",
+ "EventCode": "0x4E012",
"EventName": "PM_EXEC_STALL_UNKNOWN",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline completed without an ntf_type pulse. The ntf_pulse was missed by the ISU because the NTF finishes and completions came too close together."
},
{
- "EventCode": "4D020",
+ "EventCode": "0x4D020",
"EventName": "PM_VSU3_ISSUE",
"BriefDescription": "VSU instruction was issued to VSU pipe 3."
},
{
- "EventCode": "40132",
+ "EventCode": "0x40132",
"EventName": "PM_MRK_LSU_FIN",
"BriefDescription": "LSU marked instruction finish."
},
{
- "EventCode": "45058",
+ "EventCode": "0x45058",
"EventName": "PM_IC_MISS_CMPL",
"BriefDescription": "Non-speculative icache miss, counted at completion."
},
{
- "EventCode": "4D050",
+ "EventCode": "0x4D050",
"EventName": "PM_VSU_NON_FLOP_CMPL",
"BriefDescription": "Non-floating point VSU instructions completed."
},
{
- "EventCode": "4D052",
+ "EventCode": "0x4D052",
"EventName": "PM_2FLOP_CMPL",
"BriefDescription": "Double Precision vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg completed."
},
{
- "EventCode": "400F2",
+ "EventCode": "0x400F2",
"EventName": "PM_1PLUS_PPC_DISP",
"BriefDescription": "Cycles at least one Instr Dispatched."
},
{
- "EventCode": "400F8",
+ "EventCode": "0x400F8",
"EventName": "PM_FLUSH",
"BriefDescription": "Flush (any type)."
}
[
{
- "EventCode": "301E8",
+ "EventCode": "0x301E8",
"EventName": "PM_THRESH_EXC_64",
"BriefDescription": "Threshold counter exceeded a value of 64."
},
{
- "EventCode": "45050",
+ "EventCode": "0x45050",
"EventName": "PM_1FLOP_CMPL",
"BriefDescription": "One floating point instruction completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)."
},
{
- "EventCode": "45052",
+ "EventCode": "0x45052",
"EventName": "PM_4FLOP_CMPL",
"BriefDescription": "Four floating point instructions completed (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg)."
},
{
- "EventCode": "4D054",
+ "EventCode": "0x4D054",
"EventName": "PM_8FLOP_CMPL",
"BriefDescription": "Four Double Precision vector instructions completed."
}
[
{
- "EventCode": "1F15E",
+ "EventCode": "0x1F15E",
"EventName": "PM_MRK_START_PROBE_NOP_CMPL",
"BriefDescription": "Marked Start probe nop (AND R0,R0,R0) completed."
},
{
- "EventCode": "20016",
+ "EventCode": "0x20016",
"EventName": "PM_ST_FIN",
"BriefDescription": "Store finish count. Includes speculative activity."
},
{
- "EventCode": "20018",
+ "EventCode": "0x20018",
"EventName": "PM_ST_FWD",
"BriefDescription": "Store forwards that finished."
},
{
- "EventCode": "2011C",
+ "EventCode": "0x2011C",
"EventName": "PM_MRK_NTF_CYC",
"BriefDescription": "Cycles during which the marked instruction is the oldest in the pipeline (NTF or NTC)."
},
{
- "EventCode": "2E01C",
+ "EventCode": "0x2E01C",
"EventName": "PM_EXEC_STALL_TLBIE",
"BriefDescription": "Cycles in which the oldest instruction in the pipeline was a TLBIE instruction executing in the Load Store Unit."
},
{
- "EventCode": "201E6",
+ "EventCode": "0x201E6",
"EventName": "PM_THRESH_EXC_32",
"BriefDescription": "Threshold counter exceeded a value of 32."
},
{
- "EventCode": "200F0",
+ "EventCode": "0x200F0",
"EventName": "PM_ST_CMPL",
"BriefDescription": "Stores completed from S2Q (2nd-level store queue). This event includes regular stores, stcx and cache inhibited stores. The following operations are excluded (pteupdate, snoop tlbie complete, store atomics, miso, load atomic payloads, tlbie, tlbsync, slbieg, isync, msgsnd, slbiag, cpabort, copy, tcheck, tend, stsync, dcbst, icbi, dcbf, hwsync, lwsync, ptesync, eieio, msgsync)."
},
{
- "EventCode": "200FE",
+ "EventCode": "0x200FE",
"EventName": "PM_DATA_FROM_L2MISS",
"BriefDescription": "The processor's data cache was reloaded from a source other than the local core's L1 or L2 due to a demand miss."
},
{
- "EventCode": "30010",
+ "EventCode": "0x30010",
"EventName": "PM_PMC2_OVERFLOW",
"BriefDescription": "The event selected for PMC2 caused the event counter to overflow."
},
{
- "EventCode": "4D010",
+ "EventCode": "0x4D010",
"EventName": "PM_PMC1_SAVED",
"BriefDescription": "The conditions for the speculative event selected for PMC1 are met and PMC1 is charged."
},
{
- "EventCode": "4D05C",
+ "EventCode": "0x4D05C",
"EventName": "PM_DPP_FLOP_CMPL",
"BriefDescription": "Double-Precision or Quad-Precision instructions completed."
}
struct rlimit rlim;
if (getrlimit(RLIMIT_NOFILE, &rlim) == 0)
- return min((int)rlim.rlim_max / 2, 512);
+ return min(rlim.rlim_max / 2, (rlim_t)512);
return 512;
}
from __future__ import print_function
import sys
+# Only change warnings if the python -W option was not used
+if not sys.warnoptions:
+ import warnings
+ # PySide2 causes deprecation warnings, ignore them.
+ warnings.filterwarnings("ignore", category=DeprecationWarning)
import argparse
import weakref
import threading
from PySide.QtGui import *
from PySide.QtSql import *
-from decimal import *
-from ctypes import *
+from decimal import Decimal, ROUND_HALF_UP
+from ctypes import CDLL, Structure, create_string_buffer, addressof, sizeof, \
+ c_void_p, c_bool, c_byte, c_char, c_int, c_uint, c_longlong, c_ulonglong
from multiprocessing import Process, Array, Value, Event
# xrange is range in Python3
if with_hdr:
model = indexes[0].model()
for col in range(min_col, max_col + 1):
- val = model.headerData(col, Qt.Horizontal)
+ val = model.headerData(col, Qt.Horizontal, Qt.DisplayRole)
if as_csv:
text += sep + ToCSValue(val)
sep = ","
exclusive=0
exclude_user=0
exclude_kernel=0|1
-exclude_hv=0
+exclude_hv=0|1
exclude_idle=0
mmap=1
comm=1
},
{
.events = "{},{instructions}",
- .nr_events = 0,
- .nr_groups = 0,
+ .nr_events = 1,
+ .nr_groups = 1,
},
{
.events = "{instructions},{instructions}",
goto out;
}
- err = -1;
link = bpf_program__attach(skel->progs.on_switch);
- if (!link) {
+ if (IS_ERR(link)) {
pr_err("Failed to attach leader program\n");
+ err = PTR_ERR(link);
goto out;
}
evsel->bperf_leader_link_fd = bpf_link_get_fd_by_id(entry.link_id);
if (evsel->bperf_leader_link_fd < 0 &&
- bperf_reload_leader_program(evsel, attr_map_fd, &entry))
+ bperf_reload_leader_program(evsel, attr_map_fd, &entry)) {
+ err = -1;
goto out;
-
+ }
/*
* The bpf_link holds reference to the leader program, and the
* leader program holds reference to the maps. Therefore, if
/* Step 2: load the follower skeleton */
evsel->follower_skel = bperf_follower_bpf__open();
if (!evsel->follower_skel) {
+ err = -1;
pr_err("Failed to open follower skeleton\n");
goto out;
}
if ((tag == DW_TAG_formal_parameter ||
tag == DW_TAG_variable) &&
die_compare_name(die_mem, fvp->name) &&
- /* Does the DIE have location information or external instance? */
+ /*
+ * Does the DIE have location information or const value
+ * or external instance?
+ */
(dwarf_attr(die_mem, DW_AT_external, &attr) ||
- dwarf_attr(die_mem, DW_AT_location, &attr)))
+ dwarf_attr(die_mem, DW_AT_location, &attr) ||
+ dwarf_attr(die_mem, DW_AT_const_value, &attr)))
return DIE_FIND_CB_END;
if (dwarf_haspc(die_mem, fvp->addr))
return DIE_FIND_CB_CONTINUE;
node = rb_entry(next, struct bpf_prog_info_node, rb_node);
next = rb_next(&node->rb_node);
rb_erase(&node->rb_node, root);
+ free(node->info_linear);
free(node);
}
PERF_IP_FLAG_VMEXIT = 1ULL << 12,
};
-#define PERF_IP_FLAG_CHARS "bcrosyiABEx"
+#define PERF_IP_FLAG_CHARS "bcrosyiABExgh"
#define PERF_BRANCH_MASK (\
PERF_IP_FLAG_BRANCH |\
if (affinity__setup(&affinity) < 0)
return;
- evlist__for_each_entry(evlist, pos)
- bpf_counter__disable(pos);
-
/* Disable 'immediate' events last */
for (imm = 0; imm <= 1; imm++) {
evlist__for_each_cpu(evlist, i, cpu) {
evsel->auto_merge_stats = orig->auto_merge_stats;
evsel->collect_stat = orig->collect_stat;
evsel->weak_group = orig->weak_group;
+ evsel->use_config_name = orig->use_config_name;
if (evsel__copy_config_terms(evsel, orig) < 0)
goto out_err;
bool collect_stat;
bool weak_group;
bool bpf_counter;
+ bool use_config_name;
int bpf_fd;
struct bpf_object *bpf_obj;
+ struct list_head config_terms;
};
/*
bool merged_stat;
bool reset_group;
bool errored;
- bool use_config_name;
struct hashmap *per_pkg_mask;
struct evsel *leader;
- struct list_head config_terms;
int err;
int cpu_iter;
struct {
decoder->set_fup_tx_flags = false;
decoder->tx_flags = decoder->fup_tx_flags;
decoder->state.type = INTEL_PT_TRANSACTION;
+ if (decoder->fup_tx_flags & INTEL_PT_ABORT_TX)
+ decoder->state.type |= INTEL_PT_BRANCH;
decoder->state.from_ip = decoder->ip;
decoder->state.to_ip = 0;
decoder->state.flags = decoder->fup_tx_flags;
return 0;
if (err == -EAGAIN ||
intel_pt_fup_with_nlip(decoder, &intel_pt_insn, ip, err)) {
+ bool no_tip = decoder->pkt_state != INTEL_PT_STATE_FUP;
+
decoder->pkt_state = INTEL_PT_STATE_IN_SYNC;
- if (intel_pt_fup_event(decoder))
+ if (intel_pt_fup_event(decoder) && no_tip)
return 0;
return -EAGAIN;
}
*ip += intel_pt_insn->length;
- if (to_ip && *ip == to_ip)
+ if (to_ip && *ip == to_ip) {
+ intel_pt_insn->length = 0;
goto out_no_cache;
+ }
if (*ip >= al.map->end)
break;
static void intel_pt_sample_flags(struct intel_pt_queue *ptq)
{
+ ptq->insn_len = 0;
if (ptq->state->flags & INTEL_PT_ABORT_TX) {
ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_TX_ABORT;
} else if (ptq->state->flags & INTEL_PT_ASYNC) {
ptq->flags = PERF_IP_FLAG_BRANCH | PERF_IP_FLAG_CALL |
PERF_IP_FLAG_ASYNC |
PERF_IP_FLAG_INTERRUPT;
- ptq->insn_len = 0;
} else {
if (ptq->state->from_ip)
ptq->flags = intel_pt_insn_type(ptq->state->insn_op);
.symbol = "bpf-output",
.alias = "",
},
+ [PERF_COUNT_SW_CGROUP_SWITCHES] = {
+ .symbol = "cgroup-switches",
+ .alias = "",
+ },
};
#define __PERF_EVENT_FIELD(config, name) \
}
for (i = 0; i < max; i++, syms++) {
+ /*
+ * New attr.config still not supported here, the latest
+ * example was PERF_COUNT_SW_CGROUP_SWITCHES
+ */
+ if (syms->symbol == NULL)
+ continue;
- if (event_glob != NULL && syms->symbol != NULL &&
- !(strglobmatch(syms->symbol, event_glob) ||
+ if (event_glob != NULL && !(strglobmatch(syms->symbol, event_glob) ||
(syms->alias && strglobmatch(syms->alias, event_glob))))
continue;
dummy { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_DUMMY); }
duration_time { return tool(yyscanner, PERF_TOOL_DURATION_TIME); }
bpf-output { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_BPF_OUTPUT); }
+cgroup-switches { return sym(yyscanner, PERF_TYPE_SOFTWARE, PERF_COUNT_SW_CGROUP_SWITCHES); }
/*
* We have to handle the kernel PMU event cycles-ct/cycles-t/mem-loads/mem-stores separately.
evsel->core.attr.build_id = 1;
}
+static void perf_probe_cgroup(struct evsel *evsel)
+{
+ evsel->core.attr.cgroup = 1;
+}
+
bool perf_can_sample_identifier(void)
{
return perf_probe_api(perf_probe_sample_identifier);
{
return perf_probe_api(perf_probe_build_id);
}
+
+bool perf_can_record_cgroup(void)
+{
+ return perf_probe_api(perf_probe_cgroup);
+}
bool perf_can_record_text_poke_events(void);
bool perf_can_sample_identifier(void);
bool perf_can_record_build_id(void);
+bool perf_can_record_cgroup(void);
#endif // __PERF_API_PROBE_H
}
/* no event */
- if (*q == '\0')
+ if (*q == '\0') {
+ if (*sep == '}') {
+ if (grp_evt < 0) {
+ ui__error("cannot close a non-existing event group\n");
+ goto error;
+ }
+ grp_evt--;
+ }
continue;
+ }
memset(&attr, 0, sizeof(attr));
event_attr_init(&attr);
grp_evt = -1;
}
}
+ free(p_orig);
return 0;
error:
free(p_orig);
immediate_value_is_supported()) {
Dwarf_Sword snum;
+ if (!tvar)
+ return 0;
+
dwarf_formsdata(&attr, &snum);
ret = asprintf(&tvar->value, "\\%ld", (long)snum);
char *config;
int ret = 0;
- if (counter->uniquified_name ||
+ if (counter->uniquified_name || counter->use_config_name ||
!counter->pmu_name || !strncmp(counter->name, counter->pmu_name,
strlen(counter->pmu_name)))
return;
}
} else {
if (perf_pmu__has_hybrid()) {
- if (!counter->use_config_name) {
- ret = asprintf(&new_name, "%s/%s/",
- counter->pmu_name, counter->name);
- }
+ ret = asprintf(&new_name, "%s/%s/",
+ counter->pmu_name, counter->name);
} else {
ret = asprintf(&new_name, "%s [%s]",
counter->name, counter->pmu_name);
list_for_each_entry_safe(pos, tmp, sdt_notes, note_list) {
list_del_init(&pos->note_list);
+ zfree(&pos->args);
zfree(&pos->name);
zfree(&pos->provider);
free(pos);
.tcp.doff = 5,
};
-static int settimeo(int fd, int timeout_ms)
+int settimeo(int fd, int timeout_ms)
{
struct timeval timeout = { .tv_sec = 3 };
} __packed;
extern struct ipv6_packet pkt_v6;
+int settimeo(int fd, int timeout_ms);
int start_server(int family, int type, const char *addr, __u16 port,
int timeout_ms);
int connect_to_fd(int server_fd, int timeout_ms);
const size_t rec_sz = BPF_RINGBUF_HDR_SZ + sizeof(struct sample);
pthread_t thread;
long bg_ret = -1;
- int err, cnt;
+ int err, cnt, rb_fd;
int page_size = getpagesize();
+ void *mmap_ptr, *tmp_ptr;
skel = test_ringbuf__open();
if (CHECK(!skel, "skel_open", "skeleton open failed\n"))
if (CHECK(err != 0, "skel_load", "skeleton load failed\n"))
goto cleanup;
+ rb_fd = bpf_map__fd(skel->maps.ringbuf);
+ /* good read/write cons_pos */
+ mmap_ptr = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, rb_fd, 0);
+ ASSERT_OK_PTR(mmap_ptr, "rw_cons_pos");
+ tmp_ptr = mremap(mmap_ptr, page_size, 2 * page_size, MREMAP_MAYMOVE);
+ if (!ASSERT_ERR_PTR(tmp_ptr, "rw_extend"))
+ goto cleanup;
+ ASSERT_ERR(mprotect(mmap_ptr, page_size, PROT_EXEC), "exec_cons_pos_protect");
+ ASSERT_OK(munmap(mmap_ptr, page_size), "unmap_rw");
+
+ /* bad writeable prod_pos */
+ mmap_ptr = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, rb_fd, page_size);
+ err = -errno;
+ ASSERT_ERR_PTR(mmap_ptr, "wr_prod_pos");
+ ASSERT_EQ(err, -EPERM, "wr_prod_pos_err");
+
+ /* bad writeable data pages */
+ mmap_ptr = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, rb_fd, 2 * page_size);
+ err = -errno;
+ ASSERT_ERR_PTR(mmap_ptr, "wr_data_page_one");
+ ASSERT_EQ(err, -EPERM, "wr_data_page_one_err");
+ mmap_ptr = mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, rb_fd, 3 * page_size);
+ ASSERT_ERR_PTR(mmap_ptr, "wr_data_page_two");
+ mmap_ptr = mmap(NULL, 2 * page_size, PROT_WRITE, MAP_SHARED, rb_fd, 2 * page_size);
+ ASSERT_ERR_PTR(mmap_ptr, "wr_data_page_all");
+
+ /* good read-only pages */
+ mmap_ptr = mmap(NULL, 4 * page_size, PROT_READ, MAP_SHARED, rb_fd, 0);
+ if (!ASSERT_OK_PTR(mmap_ptr, "ro_prod_pos"))
+ goto cleanup;
+
+ ASSERT_ERR(mprotect(mmap_ptr, 4 * page_size, PROT_WRITE), "write_protect");
+ ASSERT_ERR(mprotect(mmap_ptr, 4 * page_size, PROT_EXEC), "exec_protect");
+ ASSERT_ERR_PTR(mremap(mmap_ptr, 0, 4 * page_size, MREMAP_MAYMOVE), "ro_remap");
+ ASSERT_OK(munmap(mmap_ptr, 4 * page_size), "unmap_ro");
+
+ /* good read-only pages with initial offset */
+ mmap_ptr = mmap(NULL, page_size, PROT_READ, MAP_SHARED, rb_fd, page_size);
+ if (!ASSERT_OK_PTR(mmap_ptr, "ro_prod_pos"))
+ goto cleanup;
+
+ ASSERT_ERR(mprotect(mmap_ptr, page_size, PROT_WRITE), "write_protect");
+ ASSERT_ERR(mprotect(mmap_ptr, page_size, PROT_EXEC), "exec_protect");
+ ASSERT_ERR_PTR(mremap(mmap_ptr, 0, 3 * page_size, MREMAP_MAYMOVE), "ro_remap");
+ ASSERT_OK(munmap(mmap_ptr, page_size), "unmap_ro");
+
/* only trigger BPF program for current process */
skel->bss->pid = getpid();
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+
+/*
+ * This test sets up 3 netns (src <-> fwd <-> dst). There is no direct veth link
+ * between src and dst. The netns fwd has veth links to each src and dst. The
+ * client is in src and server in dst. The test installs a TC BPF program to each
+ * host facing veth in fwd which calls into i) bpf_redirect_neigh() to perform the
+ * neigh addr population and redirect or ii) bpf_redirect_peer() for namespace
+ * switch from ingress side; it also installs a checker prog on the egress side
+ * to drop unexpected traffic.
+ */
+
+#define _GNU_SOURCE
+
+#include <arpa/inet.h>
+#include <linux/limits.h>
+#include <linux/sysctl.h>
+#include <linux/if_tun.h>
+#include <linux/if.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <sys/stat.h>
+#include <sys/mount.h>
+
+#include "test_progs.h"
+#include "network_helpers.h"
+#include "test_tc_neigh_fib.skel.h"
+#include "test_tc_neigh.skel.h"
+#include "test_tc_peer.skel.h"
+
+#define NS_SRC "ns_src"
+#define NS_FWD "ns_fwd"
+#define NS_DST "ns_dst"
+
+#define IP4_SRC "172.16.1.100"
+#define IP4_DST "172.16.2.100"
+#define IP4_TUN_SRC "172.17.1.100"
+#define IP4_TUN_FWD "172.17.1.200"
+#define IP4_PORT 9004
+
+#define IP6_SRC "0::1:dead:beef:cafe"
+#define IP6_DST "0::2:dead:beef:cafe"
+#define IP6_TUN_SRC "1::1:dead:beef:cafe"
+#define IP6_TUN_FWD "1::2:dead:beef:cafe"
+#define IP6_PORT 9006
+
+#define IP4_SLL "169.254.0.1"
+#define IP4_DLL "169.254.0.2"
+#define IP4_NET "169.254.0.0"
+
+#define MAC_DST_FWD "00:11:22:33:44:55"
+#define MAC_DST "00:22:33:44:55:66"
+
+#define IFADDR_STR_LEN 18
+#define PING_ARGS "-i 0.2 -c 3 -w 10 -q"
+
+#define SRC_PROG_PIN_FILE "/sys/fs/bpf/test_tc_src"
+#define DST_PROG_PIN_FILE "/sys/fs/bpf/test_tc_dst"
+#define CHK_PROG_PIN_FILE "/sys/fs/bpf/test_tc_chk"
+
+#define TIMEOUT_MILLIS 10000
+
+#define log_err(MSG, ...) \
+ fprintf(stderr, "(%s:%d: errno: %s) " MSG "\n", \
+ __FILE__, __LINE__, strerror(errno), ##__VA_ARGS__)
+
+static const char * const namespaces[] = {NS_SRC, NS_FWD, NS_DST, NULL};
+
+static int write_file(const char *path, const char *newval)
+{
+ FILE *f;
+
+ f = fopen(path, "r+");
+ if (!f)
+ return -1;
+ if (fwrite(newval, strlen(newval), 1, f) != 1) {
+ log_err("writing to %s failed", path);
+ fclose(f);
+ return -1;
+ }
+ fclose(f);
+ return 0;
+}
+
+struct nstoken {
+ int orig_netns_fd;
+};
+
+static int setns_by_fd(int nsfd)
+{
+ int err;
+
+ err = setns(nsfd, CLONE_NEWNET);
+ close(nsfd);
+
+ if (!ASSERT_OK(err, "setns"))
+ return err;
+
+ /* Switch /sys to the new namespace so that e.g. /sys/class/net
+ * reflects the devices in the new namespace.
+ */
+ err = unshare(CLONE_NEWNS);
+ if (!ASSERT_OK(err, "unshare"))
+ return err;
+
+ err = umount2("/sys", MNT_DETACH);
+ if (!ASSERT_OK(err, "umount2 /sys"))
+ return err;
+
+ err = mount("sysfs", "/sys", "sysfs", 0, NULL);
+ if (!ASSERT_OK(err, "mount /sys"))
+ return err;
+
+ err = mount("bpffs", "/sys/fs/bpf", "bpf", 0, NULL);
+ if (!ASSERT_OK(err, "mount /sys/fs/bpf"))
+ return err;
+
+ return 0;
+}
+
+/**
+ * open_netns() - Switch to specified network namespace by name.
+ *
+ * Returns token with which to restore the original namespace
+ * using close_netns().
+ */
+static struct nstoken *open_netns(const char *name)
+{
+ int nsfd;
+ char nspath[PATH_MAX];
+ int err;
+ struct nstoken *token;
+
+ token = malloc(sizeof(struct nstoken));
+ if (!ASSERT_OK_PTR(token, "malloc token"))
+ return NULL;
+
+ token->orig_netns_fd = open("/proc/self/ns/net", O_RDONLY);
+ if (!ASSERT_GE(token->orig_netns_fd, 0, "open /proc/self/ns/net"))
+ goto fail;
+
+ snprintf(nspath, sizeof(nspath), "%s/%s", "/var/run/netns", name);
+ nsfd = open(nspath, O_RDONLY | O_CLOEXEC);
+ if (!ASSERT_GE(nsfd, 0, "open netns fd"))
+ goto fail;
+
+ err = setns_by_fd(nsfd);
+ if (!ASSERT_OK(err, "setns_by_fd"))
+ goto fail;
+
+ return token;
+fail:
+ free(token);
+ return NULL;
+}
+
+static void close_netns(struct nstoken *token)
+{
+ ASSERT_OK(setns_by_fd(token->orig_netns_fd), "setns_by_fd");
+ free(token);
+}
+
+static int netns_setup_namespaces(const char *verb)
+{
+ const char * const *ns = namespaces;
+ char cmd[128];
+
+ while (*ns) {
+ snprintf(cmd, sizeof(cmd), "ip netns %s %s", verb, *ns);
+ if (!ASSERT_OK(system(cmd), cmd))
+ return -1;
+ ns++;
+ }
+ return 0;
+}
+
+struct netns_setup_result {
+ int ifindex_veth_src_fwd;
+ int ifindex_veth_dst_fwd;
+};
+
+static int get_ifaddr(const char *name, char *ifaddr)
+{
+ char path[PATH_MAX];
+ FILE *f;
+ int ret;
+
+ snprintf(path, PATH_MAX, "/sys/class/net/%s/address", name);
+ f = fopen(path, "r");
+ if (!ASSERT_OK_PTR(f, path))
+ return -1;
+
+ ret = fread(ifaddr, 1, IFADDR_STR_LEN, f);
+ if (!ASSERT_EQ(ret, IFADDR_STR_LEN, "fread ifaddr")) {
+ fclose(f);
+ return -1;
+ }
+ fclose(f);
+ return 0;
+}
+
+static int get_ifindex(const char *name)
+{
+ char path[PATH_MAX];
+ char buf[32];
+ FILE *f;
+ int ret;
+
+ snprintf(path, PATH_MAX, "/sys/class/net/%s/ifindex", name);
+ f = fopen(path, "r");
+ if (!ASSERT_OK_PTR(f, path))
+ return -1;
+
+ ret = fread(buf, 1, sizeof(buf), f);
+ if (!ASSERT_GT(ret, 0, "fread ifindex")) {
+ fclose(f);
+ return -1;
+ }
+ fclose(f);
+ return atoi(buf);
+}
+
+#define SYS(fmt, ...) \
+ ({ \
+ char cmd[1024]; \
+ snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__); \
+ if (!ASSERT_OK(system(cmd), cmd)) \
+ goto fail; \
+ })
+
+static int netns_setup_links_and_routes(struct netns_setup_result *result)
+{
+ struct nstoken *nstoken = NULL;
+ char veth_src_fwd_addr[IFADDR_STR_LEN+1] = {};
+
+ SYS("ip link add veth_src type veth peer name veth_src_fwd");
+ SYS("ip link add veth_dst type veth peer name veth_dst_fwd");
+
+ SYS("ip link set veth_dst_fwd address " MAC_DST_FWD);
+ SYS("ip link set veth_dst address " MAC_DST);
+
+ if (get_ifaddr("veth_src_fwd", veth_src_fwd_addr))
+ goto fail;
+
+ result->ifindex_veth_src_fwd = get_ifindex("veth_src_fwd");
+ if (result->ifindex_veth_src_fwd < 0)
+ goto fail;
+ result->ifindex_veth_dst_fwd = get_ifindex("veth_dst_fwd");
+ if (result->ifindex_veth_dst_fwd < 0)
+ goto fail;
+
+ SYS("ip link set veth_src netns " NS_SRC);
+ SYS("ip link set veth_src_fwd netns " NS_FWD);
+ SYS("ip link set veth_dst_fwd netns " NS_FWD);
+ SYS("ip link set veth_dst netns " NS_DST);
+
+ /** setup in 'src' namespace */
+ nstoken = open_netns(NS_SRC);
+ if (!ASSERT_OK_PTR(nstoken, "setns src"))
+ goto fail;
+
+ SYS("ip addr add " IP4_SRC "/32 dev veth_src");
+ SYS("ip addr add " IP6_SRC "/128 dev veth_src nodad");
+ SYS("ip link set dev veth_src up");
+
+ SYS("ip route add " IP4_DST "/32 dev veth_src scope global");
+ SYS("ip route add " IP4_NET "/16 dev veth_src scope global");
+ SYS("ip route add " IP6_DST "/128 dev veth_src scope global");
+
+ SYS("ip neigh add " IP4_DST " dev veth_src lladdr %s",
+ veth_src_fwd_addr);
+ SYS("ip neigh add " IP6_DST " dev veth_src lladdr %s",
+ veth_src_fwd_addr);
+
+ close_netns(nstoken);
+
+ /** setup in 'fwd' namespace */
+ nstoken = open_netns(NS_FWD);
+ if (!ASSERT_OK_PTR(nstoken, "setns fwd"))
+ goto fail;
+
+ /* The fwd netns automatically gets a v6 LL address / routes, but also
+ * needs v4 one in order to start ARP probing. IP4_NET route is added
+ * to the endpoints so that the ARP processing will reply.
+ */
+ SYS("ip addr add " IP4_SLL "/32 dev veth_src_fwd");
+ SYS("ip addr add " IP4_DLL "/32 dev veth_dst_fwd");
+ SYS("ip link set dev veth_src_fwd up");
+ SYS("ip link set dev veth_dst_fwd up");
+
+ SYS("ip route add " IP4_SRC "/32 dev veth_src_fwd scope global");
+ SYS("ip route add " IP6_SRC "/128 dev veth_src_fwd scope global");
+ SYS("ip route add " IP4_DST "/32 dev veth_dst_fwd scope global");
+ SYS("ip route add " IP6_DST "/128 dev veth_dst_fwd scope global");
+
+ close_netns(nstoken);
+
+ /** setup in 'dst' namespace */
+ nstoken = open_netns(NS_DST);
+ if (!ASSERT_OK_PTR(nstoken, "setns dst"))
+ goto fail;
+
+ SYS("ip addr add " IP4_DST "/32 dev veth_dst");
+ SYS("ip addr add " IP6_DST "/128 dev veth_dst nodad");
+ SYS("ip link set dev veth_dst up");
+
+ SYS("ip route add " IP4_SRC "/32 dev veth_dst scope global");
+ SYS("ip route add " IP4_NET "/16 dev veth_dst scope global");
+ SYS("ip route add " IP6_SRC "/128 dev veth_dst scope global");
+
+ SYS("ip neigh add " IP4_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+ SYS("ip neigh add " IP6_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+
+ close_netns(nstoken);
+
+ return 0;
+fail:
+ if (nstoken)
+ close_netns(nstoken);
+ return -1;
+}
+
+static int netns_load_bpf(void)
+{
+ SYS("tc qdisc add dev veth_src_fwd clsact");
+ SYS("tc filter add dev veth_src_fwd ingress bpf da object-pinned "
+ SRC_PROG_PIN_FILE);
+ SYS("tc filter add dev veth_src_fwd egress bpf da object-pinned "
+ CHK_PROG_PIN_FILE);
+
+ SYS("tc qdisc add dev veth_dst_fwd clsact");
+ SYS("tc filter add dev veth_dst_fwd ingress bpf da object-pinned "
+ DST_PROG_PIN_FILE);
+ SYS("tc filter add dev veth_dst_fwd egress bpf da object-pinned "
+ CHK_PROG_PIN_FILE);
+
+ return 0;
+fail:
+ return -1;
+}
+
+static void test_tcp(int family, const char *addr, __u16 port)
+{
+ int listen_fd = -1, accept_fd = -1, client_fd = -1;
+ char buf[] = "testing testing";
+ int n;
+ struct nstoken *nstoken;
+
+ nstoken = open_netns(NS_DST);
+ if (!ASSERT_OK_PTR(nstoken, "setns dst"))
+ return;
+
+ listen_fd = start_server(family, SOCK_STREAM, addr, port, 0);
+ if (!ASSERT_GE(listen_fd, 0, "listen"))
+ goto done;
+
+ close_netns(nstoken);
+ nstoken = open_netns(NS_SRC);
+ if (!ASSERT_OK_PTR(nstoken, "setns src"))
+ goto done;
+
+ client_fd = connect_to_fd(listen_fd, TIMEOUT_MILLIS);
+ if (!ASSERT_GE(client_fd, 0, "connect_to_fd"))
+ goto done;
+
+ accept_fd = accept(listen_fd, NULL, NULL);
+ if (!ASSERT_GE(accept_fd, 0, "accept"))
+ goto done;
+
+ if (!ASSERT_OK(settimeo(accept_fd, TIMEOUT_MILLIS), "settimeo"))
+ goto done;
+
+ n = write(client_fd, buf, sizeof(buf));
+ if (!ASSERT_EQ(n, sizeof(buf), "send to server"))
+ goto done;
+
+ n = read(accept_fd, buf, sizeof(buf));
+ ASSERT_EQ(n, sizeof(buf), "recv from server");
+
+done:
+ if (nstoken)
+ close_netns(nstoken);
+ if (listen_fd >= 0)
+ close(listen_fd);
+ if (accept_fd >= 0)
+ close(accept_fd);
+ if (client_fd >= 0)
+ close(client_fd);
+}
+
+static int test_ping(int family, const char *addr)
+{
+ const char *ping = family == AF_INET6 ? "ping6" : "ping";
+
+ SYS("ip netns exec " NS_SRC " %s " PING_ARGS " %s > /dev/null", ping, addr);
+ return 0;
+fail:
+ return -1;
+}
+
+static void test_connectivity(void)
+{
+ test_tcp(AF_INET, IP4_DST, IP4_PORT);
+ test_ping(AF_INET, IP4_DST);
+ test_tcp(AF_INET6, IP6_DST, IP6_PORT);
+ test_ping(AF_INET6, IP6_DST);
+}
+
+static int set_forwarding(bool enable)
+{
+ int err;
+
+ err = write_file("/proc/sys/net/ipv4/ip_forward", enable ? "1" : "0");
+ if (!ASSERT_OK(err, "set ipv4.ip_forward=0"))
+ return err;
+
+ err = write_file("/proc/sys/net/ipv6/conf/all/forwarding", enable ? "1" : "0");
+ if (!ASSERT_OK(err, "set ipv6.forwarding=0"))
+ return err;
+
+ return 0;
+}
+
+static void test_tc_redirect_neigh_fib(struct netns_setup_result *setup_result)
+{
+ struct nstoken *nstoken = NULL;
+ struct test_tc_neigh_fib *skel = NULL;
+ int err;
+
+ nstoken = open_netns(NS_FWD);
+ if (!ASSERT_OK_PTR(nstoken, "setns fwd"))
+ return;
+
+ skel = test_tc_neigh_fib__open();
+ if (!ASSERT_OK_PTR(skel, "test_tc_neigh_fib__open"))
+ goto done;
+
+ if (!ASSERT_OK(test_tc_neigh_fib__load(skel), "test_tc_neigh_fib__load"))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_src, SRC_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " SRC_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_chk, CHK_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " CHK_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_dst, DST_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " DST_PROG_PIN_FILE))
+ goto done;
+
+ if (netns_load_bpf())
+ goto done;
+
+ /* bpf_fib_lookup() checks if forwarding is enabled */
+ if (!ASSERT_OK(set_forwarding(true), "enable forwarding"))
+ goto done;
+
+ test_connectivity();
+
+done:
+ if (skel)
+ test_tc_neigh_fib__destroy(skel);
+ close_netns(nstoken);
+}
+
+static void test_tc_redirect_neigh(struct netns_setup_result *setup_result)
+{
+ struct nstoken *nstoken = NULL;
+ struct test_tc_neigh *skel = NULL;
+ int err;
+
+ nstoken = open_netns(NS_FWD);
+ if (!ASSERT_OK_PTR(nstoken, "setns fwd"))
+ return;
+
+ skel = test_tc_neigh__open();
+ if (!ASSERT_OK_PTR(skel, "test_tc_neigh__open"))
+ goto done;
+
+ skel->rodata->IFINDEX_SRC = setup_result->ifindex_veth_src_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+
+ err = test_tc_neigh__load(skel);
+ if (!ASSERT_OK(err, "test_tc_neigh__load"))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_src, SRC_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " SRC_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_chk, CHK_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " CHK_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_dst, DST_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " DST_PROG_PIN_FILE))
+ goto done;
+
+ if (netns_load_bpf())
+ goto done;
+
+ if (!ASSERT_OK(set_forwarding(false), "disable forwarding"))
+ goto done;
+
+ test_connectivity();
+
+done:
+ if (skel)
+ test_tc_neigh__destroy(skel);
+ close_netns(nstoken);
+}
+
+static void test_tc_redirect_peer(struct netns_setup_result *setup_result)
+{
+ struct nstoken *nstoken;
+ struct test_tc_peer *skel;
+ int err;
+
+ nstoken = open_netns(NS_FWD);
+ if (!ASSERT_OK_PTR(nstoken, "setns fwd"))
+ return;
+
+ skel = test_tc_peer__open();
+ if (!ASSERT_OK_PTR(skel, "test_tc_peer__open"))
+ goto done;
+
+ skel->rodata->IFINDEX_SRC = setup_result->ifindex_veth_src_fwd;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+
+ err = test_tc_peer__load(skel);
+ if (!ASSERT_OK(err, "test_tc_peer__load"))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_src, SRC_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " SRC_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_chk, CHK_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " CHK_PROG_PIN_FILE))
+ goto done;
+
+ err = bpf_program__pin(skel->progs.tc_dst, DST_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " DST_PROG_PIN_FILE))
+ goto done;
+
+ if (netns_load_bpf())
+ goto done;
+
+ if (!ASSERT_OK(set_forwarding(false), "disable forwarding"))
+ goto done;
+
+ test_connectivity();
+
+done:
+ if (skel)
+ test_tc_peer__destroy(skel);
+ close_netns(nstoken);
+}
+
+static int tun_open(char *name)
+{
+ struct ifreq ifr;
+ int fd, err;
+
+ fd = open("/dev/net/tun", O_RDWR);
+ if (!ASSERT_GE(fd, 0, "open /dev/net/tun"))
+ return -1;
+
+ memset(&ifr, 0, sizeof(ifr));
+
+ ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
+ if (*name)
+ strncpy(ifr.ifr_name, name, IFNAMSIZ);
+
+ err = ioctl(fd, TUNSETIFF, &ifr);
+ if (!ASSERT_OK(err, "ioctl TUNSETIFF"))
+ goto fail;
+
+ SYS("ip link set dev %s up", name);
+
+ return fd;
+fail:
+ close(fd);
+ return -1;
+}
+
+#define MAX(a, b) ((a) > (b) ? (a) : (b))
+enum {
+ SRC_TO_TARGET = 0,
+ TARGET_TO_SRC = 1,
+};
+
+static int tun_relay_loop(int src_fd, int target_fd)
+{
+ fd_set rfds, wfds;
+
+ FD_ZERO(&rfds);
+ FD_ZERO(&wfds);
+
+ for (;;) {
+ char buf[1500];
+ int direction, nread, nwrite;
+
+ FD_SET(src_fd, &rfds);
+ FD_SET(target_fd, &rfds);
+
+ if (select(1 + MAX(src_fd, target_fd), &rfds, NULL, NULL, NULL) < 0) {
+ log_err("select failed");
+ return 1;
+ }
+
+ direction = FD_ISSET(src_fd, &rfds) ? SRC_TO_TARGET : TARGET_TO_SRC;
+
+ nread = read(direction == SRC_TO_TARGET ? src_fd : target_fd, buf, sizeof(buf));
+ if (nread < 0) {
+ log_err("read failed");
+ return 1;
+ }
+
+ nwrite = write(direction == SRC_TO_TARGET ? target_fd : src_fd, buf, nread);
+ if (nwrite != nread) {
+ log_err("write failed");
+ return 1;
+ }
+ }
+}
+
+static void test_tc_redirect_peer_l3(struct netns_setup_result *setup_result)
+{
+ struct test_tc_peer *skel = NULL;
+ struct nstoken *nstoken = NULL;
+ int err;
+ int tunnel_pid = -1;
+ int src_fd, target_fd;
+ int ifindex;
+
+ /* Start a L3 TUN/TAP tunnel between the src and dst namespaces.
+ * This test is using TUN/TAP instead of e.g. IPIP or GRE tunnel as those
+ * expose the L2 headers encapsulating the IP packet to BPF and hence
+ * don't have skb in suitable state for this test. Alternative to TUN/TAP
+ * would be e.g. Wireguard which would appear as a pure L3 device to BPF,
+ * but that requires much more complicated setup.
+ */
+ nstoken = open_netns(NS_SRC);
+ if (!ASSERT_OK_PTR(nstoken, "setns " NS_SRC))
+ return;
+
+ src_fd = tun_open("tun_src");
+ if (!ASSERT_GE(src_fd, 0, "tun_open tun_src"))
+ goto fail;
+
+ close_netns(nstoken);
+
+ nstoken = open_netns(NS_FWD);
+ if (!ASSERT_OK_PTR(nstoken, "setns " NS_FWD))
+ goto fail;
+
+ target_fd = tun_open("tun_fwd");
+ if (!ASSERT_GE(target_fd, 0, "tun_open tun_fwd"))
+ goto fail;
+
+ tunnel_pid = fork();
+ if (!ASSERT_GE(tunnel_pid, 0, "fork tun_relay_loop"))
+ goto fail;
+
+ if (tunnel_pid == 0)
+ exit(tun_relay_loop(src_fd, target_fd));
+
+ skel = test_tc_peer__open();
+ if (!ASSERT_OK_PTR(skel, "test_tc_peer__open"))
+ goto fail;
+
+ ifindex = get_ifindex("tun_fwd");
+ if (!ASSERT_GE(ifindex, 0, "get_ifindex tun_fwd"))
+ goto fail;
+
+ skel->rodata->IFINDEX_SRC = ifindex;
+ skel->rodata->IFINDEX_DST = setup_result->ifindex_veth_dst_fwd;
+
+ err = test_tc_peer__load(skel);
+ if (!ASSERT_OK(err, "test_tc_peer__load"))
+ goto fail;
+
+ err = bpf_program__pin(skel->progs.tc_src_l3, SRC_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " SRC_PROG_PIN_FILE))
+ goto fail;
+
+ err = bpf_program__pin(skel->progs.tc_dst_l3, DST_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " DST_PROG_PIN_FILE))
+ goto fail;
+
+ err = bpf_program__pin(skel->progs.tc_chk, CHK_PROG_PIN_FILE);
+ if (!ASSERT_OK(err, "pin " CHK_PROG_PIN_FILE))
+ goto fail;
+
+ /* Load "tc_src_l3" to the tun_fwd interface to redirect packets
+ * towards dst, and "tc_dst" to redirect packets
+ * and "tc_chk" on veth_dst_fwd to drop non-redirected packets.
+ */
+ SYS("tc qdisc add dev tun_fwd clsact");
+ SYS("tc filter add dev tun_fwd ingress bpf da object-pinned "
+ SRC_PROG_PIN_FILE);
+
+ SYS("tc qdisc add dev veth_dst_fwd clsact");
+ SYS("tc filter add dev veth_dst_fwd ingress bpf da object-pinned "
+ DST_PROG_PIN_FILE);
+ SYS("tc filter add dev veth_dst_fwd egress bpf da object-pinned "
+ CHK_PROG_PIN_FILE);
+
+ /* Setup route and neigh tables */
+ SYS("ip -netns " NS_SRC " addr add dev tun_src " IP4_TUN_SRC "/24");
+ SYS("ip -netns " NS_FWD " addr add dev tun_fwd " IP4_TUN_FWD "/24");
+
+ SYS("ip -netns " NS_SRC " addr add dev tun_src " IP6_TUN_SRC "/64 nodad");
+ SYS("ip -netns " NS_FWD " addr add dev tun_fwd " IP6_TUN_FWD "/64 nodad");
+
+ SYS("ip -netns " NS_SRC " route del " IP4_DST "/32 dev veth_src scope global");
+ SYS("ip -netns " NS_SRC " route add " IP4_DST "/32 via " IP4_TUN_FWD
+ " dev tun_src scope global");
+ SYS("ip -netns " NS_DST " route add " IP4_TUN_SRC "/32 dev veth_dst scope global");
+ SYS("ip -netns " NS_SRC " route del " IP6_DST "/128 dev veth_src scope global");
+ SYS("ip -netns " NS_SRC " route add " IP6_DST "/128 via " IP6_TUN_FWD
+ " dev tun_src scope global");
+ SYS("ip -netns " NS_DST " route add " IP6_TUN_SRC "/128 dev veth_dst scope global");
+
+ SYS("ip -netns " NS_DST " neigh add " IP4_TUN_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+ SYS("ip -netns " NS_DST " neigh add " IP6_TUN_SRC " dev veth_dst lladdr " MAC_DST_FWD);
+
+ if (!ASSERT_OK(set_forwarding(false), "disable forwarding"))
+ goto fail;
+
+ test_connectivity();
+
+fail:
+ if (tunnel_pid > 0) {
+ kill(tunnel_pid, SIGTERM);
+ waitpid(tunnel_pid, NULL, 0);
+ }
+ if (src_fd >= 0)
+ close(src_fd);
+ if (target_fd >= 0)
+ close(target_fd);
+ if (skel)
+ test_tc_peer__destroy(skel);
+ if (nstoken)
+ close_netns(nstoken);
+}
+
+#define RUN_TEST(name) \
+ ({ \
+ struct netns_setup_result setup_result; \
+ if (test__start_subtest(#name)) \
+ if (ASSERT_OK(netns_setup_namespaces("add"), "setup namespaces")) { \
+ if (ASSERT_OK(netns_setup_links_and_routes(&setup_result), \
+ "setup links and routes")) \
+ test_ ## name(&setup_result); \
+ netns_setup_namespaces("delete"); \
+ } \
+ })
+
+static void *test_tc_redirect_run_tests(void *arg)
+{
+ RUN_TEST(tc_redirect_peer);
+ RUN_TEST(tc_redirect_peer_l3);
+ RUN_TEST(tc_redirect_neigh);
+ RUN_TEST(tc_redirect_neigh_fib);
+ return NULL;
+}
+
+void test_tc_redirect(void)
+{
+ pthread_t test_thread;
+ int err;
+
+ /* Run the tests in their own thread to isolate the namespace changes
+ * so they do not affect the environment of other tests.
+ * (specifically needed because of unshare(CLONE_NEWNS) in open_netns())
+ */
+ err = pthread_create(&test_thread, NULL, &test_tc_redirect_run_tests, NULL);
+ if (ASSERT_OK(err, "pthread_create"))
+ ASSERT_OK(pthread_join(test_thread, NULL), "pthread_join");
+}
a.s6_addr32[3] == b.s6_addr32[3])
#endif
-enum {
- dev_src,
- dev_dst,
-};
-
-struct bpf_map_def SEC("maps") ifindex_map = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(int),
- .max_entries = 2,
-};
+volatile const __u32 IFINDEX_SRC;
+volatile const __u32 IFINDEX_DST;
static __always_inline bool is_remote_ep_v4(struct __sk_buff *skb,
__be32 addr)
return v6_equal(ip6h->daddr, addr);
}
-static __always_inline int get_dev_ifindex(int which)
-{
- int *ifindex = bpf_map_lookup_elem(&ifindex_map, &which);
-
- return ifindex ? *ifindex : 0;
-}
-
-SEC("chk_egress") int tc_chk(struct __sk_buff *skb)
+SEC("classifier/chk_egress")
+int tc_chk(struct __sk_buff *skb)
{
void *data_end = ctx_ptr(skb->data_end);
void *data = ctx_ptr(skb->data);
return !raw[0] && !raw[1] && !raw[2] ? TC_ACT_SHOT : TC_ACT_OK;
}
-SEC("dst_ingress") int tc_dst(struct __sk_buff *skb)
+SEC("classifier/dst_ingress")
+int tc_dst(struct __sk_buff *skb)
{
__u8 zero[ETH_ALEN * 2];
bool redirect = false;
if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0)
return TC_ACT_SHOT;
- return bpf_redirect_neigh(get_dev_ifindex(dev_src), NULL, 0, 0);
+ return bpf_redirect_neigh(IFINDEX_SRC, NULL, 0, 0);
}
-SEC("src_ingress") int tc_src(struct __sk_buff *skb)
+SEC("classifier/src_ingress")
+int tc_src(struct __sk_buff *skb)
{
__u8 zero[ETH_ALEN * 2];
bool redirect = false;
if (bpf_skb_store_bytes(skb, 0, &zero, sizeof(zero), 0) < 0)
return TC_ACT_SHOT;
- return bpf_redirect_neigh(get_dev_ifindex(dev_dst), NULL, 0, 0);
+ return bpf_redirect_neigh(IFINDEX_DST, NULL, 0, 0);
}
char __license[] SEC("license") = "GPL";
return 0;
}
-SEC("chk_egress") int tc_chk(struct __sk_buff *skb)
+SEC("classifier/chk_egress")
+int tc_chk(struct __sk_buff *skb)
{
void *data_end = ctx_ptr(skb->data_end);
void *data = ctx_ptr(skb->data);
/* these are identical, but keep them separate for compatibility with the
* section names expected by test_tc_redirect.sh
*/
-SEC("dst_ingress") int tc_dst(struct __sk_buff *skb)
+SEC("classifier/dst_ingress")
+int tc_dst(struct __sk_buff *skb)
{
return tc_redir(skb);
}
-SEC("src_ingress") int tc_src(struct __sk_buff *skb)
+SEC("classifier/src_ingress")
+int tc_src(struct __sk_buff *skb)
{
return tc_redir(skb);
}
#include <linux/bpf.h>
#include <linux/stddef.h>
#include <linux/pkt_cls.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
#include <bpf/bpf_helpers.h>
-enum {
- dev_src,
- dev_dst,
-};
+volatile const __u32 IFINDEX_SRC;
+volatile const __u32 IFINDEX_DST;
-struct bpf_map_def SEC("maps") ifindex_map = {
- .type = BPF_MAP_TYPE_ARRAY,
- .key_size = sizeof(int),
- .value_size = sizeof(int),
- .max_entries = 2,
-};
+static const __u8 src_mac[] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55};
+static const __u8 dst_mac[] = {0x00, 0x22, 0x33, 0x44, 0x55, 0x66};
-static __always_inline int get_dev_ifindex(int which)
+SEC("classifier/chk_egress")
+int tc_chk(struct __sk_buff *skb)
{
- int *ifindex = bpf_map_lookup_elem(&ifindex_map, &which);
+ return TC_ACT_SHOT;
+}
- return ifindex ? *ifindex : 0;
+SEC("classifier/dst_ingress")
+int tc_dst(struct __sk_buff *skb)
+{
+ return bpf_redirect_peer(IFINDEX_SRC, 0);
}
-SEC("chk_egress") int tc_chk(struct __sk_buff *skb)
+SEC("classifier/src_ingress")
+int tc_src(struct __sk_buff *skb)
{
- return TC_ACT_SHOT;
+ return bpf_redirect_peer(IFINDEX_DST, 0);
}
-SEC("dst_ingress") int tc_dst(struct __sk_buff *skb)
+SEC("classifier/dst_ingress_l3")
+int tc_dst_l3(struct __sk_buff *skb)
{
- return bpf_redirect_peer(get_dev_ifindex(dev_src), 0);
+ return bpf_redirect(IFINDEX_SRC, 0);
}
-SEC("src_ingress") int tc_src(struct __sk_buff *skb)
+SEC("classifier/src_ingress_l3")
+int tc_src_l3(struct __sk_buff *skb)
{
- return bpf_redirect_peer(get_dev_ifindex(dev_dst), 0);
+ __u16 proto = skb->protocol;
+
+ if (bpf_skb_change_head(skb, ETH_HLEN, 0) != 0)
+ return TC_ACT_SHOT;
+
+ if (bpf_skb_store_bytes(skb, 0, &src_mac, ETH_ALEN, 0) != 0)
+ return TC_ACT_SHOT;
+
+ if (bpf_skb_store_bytes(skb, ETH_ALEN, &dst_mac, ETH_ALEN, 0) != 0)
+ return TC_ACT_SHOT;
+
+ if (bpf_skb_store_bytes(skb, ETH_ALEN + ETH_ALEN, &proto, sizeof(__u16), 0) != 0)
+ return TC_ACT_SHOT;
+
+ return bpf_redirect_peer(IFINDEX_DST, 0);
}
char __license[] SEC("license") = "GPL";
+++ /dev/null
-#!/bin/bash
-# SPDX-License-Identifier: GPL-2.0
-#
-# This test sets up 3 netns (src <-> fwd <-> dst). There is no direct veth link
-# between src and dst. The netns fwd has veth links to each src and dst. The
-# client is in src and server in dst. The test installs a TC BPF program to each
-# host facing veth in fwd which calls into i) bpf_redirect_neigh() to perform the
-# neigh addr population and redirect or ii) bpf_redirect_peer() for namespace
-# switch from ingress side; it also installs a checker prog on the egress side
-# to drop unexpected traffic.
-
-if [[ $EUID -ne 0 ]]; then
- echo "This script must be run as root"
- echo "FAIL"
- exit 1
-fi
-
-# check that needed tools are present
-command -v nc >/dev/null 2>&1 || \
- { echo >&2 "nc is not available"; exit 1; }
-command -v dd >/dev/null 2>&1 || \
- { echo >&2 "dd is not available"; exit 1; }
-command -v timeout >/dev/null 2>&1 || \
- { echo >&2 "timeout is not available"; exit 1; }
-command -v ping >/dev/null 2>&1 || \
- { echo >&2 "ping is not available"; exit 1; }
-if command -v ping6 >/dev/null 2>&1; then PING6=ping6; else PING6=ping; fi
-command -v perl >/dev/null 2>&1 || \
- { echo >&2 "perl is not available"; exit 1; }
-command -v jq >/dev/null 2>&1 || \
- { echo >&2 "jq is not available"; exit 1; }
-command -v bpftool >/dev/null 2>&1 || \
- { echo >&2 "bpftool is not available"; exit 1; }
-
-readonly GREEN='\033[0;92m'
-readonly RED='\033[0;31m'
-readonly NC='\033[0m' # No Color
-
-readonly PING_ARG="-c 3 -w 10 -q"
-
-readonly TIMEOUT=10
-
-readonly NS_SRC="ns-src-$(mktemp -u XXXXXX)"
-readonly NS_FWD="ns-fwd-$(mktemp -u XXXXXX)"
-readonly NS_DST="ns-dst-$(mktemp -u XXXXXX)"
-
-readonly IP4_SRC="172.16.1.100"
-readonly IP4_DST="172.16.2.100"
-
-readonly IP6_SRC="::1:dead:beef:cafe"
-readonly IP6_DST="::2:dead:beef:cafe"
-
-readonly IP4_SLL="169.254.0.1"
-readonly IP4_DLL="169.254.0.2"
-readonly IP4_NET="169.254.0.0"
-
-netns_cleanup()
-{
- ip netns del ${NS_SRC}
- ip netns del ${NS_FWD}
- ip netns del ${NS_DST}
-}
-
-netns_setup()
-{
- ip netns add "${NS_SRC}"
- ip netns add "${NS_FWD}"
- ip netns add "${NS_DST}"
-
- ip link add veth_src type veth peer name veth_src_fwd
- ip link add veth_dst type veth peer name veth_dst_fwd
-
- ip link set veth_src netns ${NS_SRC}
- ip link set veth_src_fwd netns ${NS_FWD}
-
- ip link set veth_dst netns ${NS_DST}
- ip link set veth_dst_fwd netns ${NS_FWD}
-
- ip -netns ${NS_SRC} addr add ${IP4_SRC}/32 dev veth_src
- ip -netns ${NS_DST} addr add ${IP4_DST}/32 dev veth_dst
-
- # The fwd netns automatically get a v6 LL address / routes, but also
- # needs v4 one in order to start ARP probing. IP4_NET route is added
- # to the endpoints so that the ARP processing will reply.
-
- ip -netns ${NS_FWD} addr add ${IP4_SLL}/32 dev veth_src_fwd
- ip -netns ${NS_FWD} addr add ${IP4_DLL}/32 dev veth_dst_fwd
-
- ip -netns ${NS_SRC} addr add ${IP6_SRC}/128 dev veth_src nodad
- ip -netns ${NS_DST} addr add ${IP6_DST}/128 dev veth_dst nodad
-
- ip -netns ${NS_SRC} link set dev veth_src up
- ip -netns ${NS_FWD} link set dev veth_src_fwd up
-
- ip -netns ${NS_DST} link set dev veth_dst up
- ip -netns ${NS_FWD} link set dev veth_dst_fwd up
-
- ip -netns ${NS_SRC} route add ${IP4_DST}/32 dev veth_src scope global
- ip -netns ${NS_SRC} route add ${IP4_NET}/16 dev veth_src scope global
- ip -netns ${NS_FWD} route add ${IP4_SRC}/32 dev veth_src_fwd scope global
-
- ip -netns ${NS_SRC} route add ${IP6_DST}/128 dev veth_src scope global
- ip -netns ${NS_FWD} route add ${IP6_SRC}/128 dev veth_src_fwd scope global
-
- ip -netns ${NS_DST} route add ${IP4_SRC}/32 dev veth_dst scope global
- ip -netns ${NS_DST} route add ${IP4_NET}/16 dev veth_dst scope global
- ip -netns ${NS_FWD} route add ${IP4_DST}/32 dev veth_dst_fwd scope global
-
- ip -netns ${NS_DST} route add ${IP6_SRC}/128 dev veth_dst scope global
- ip -netns ${NS_FWD} route add ${IP6_DST}/128 dev veth_dst_fwd scope global
-
- fmac_src=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_src_fwd/address)
- fmac_dst=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_dst_fwd/address)
-
- ip -netns ${NS_SRC} neigh add ${IP4_DST} dev veth_src lladdr $fmac_src
- ip -netns ${NS_DST} neigh add ${IP4_SRC} dev veth_dst lladdr $fmac_dst
-
- ip -netns ${NS_SRC} neigh add ${IP6_DST} dev veth_src lladdr $fmac_src
- ip -netns ${NS_DST} neigh add ${IP6_SRC} dev veth_dst lladdr $fmac_dst
-}
-
-netns_test_connectivity()
-{
- set +e
-
- ip netns exec ${NS_DST} bash -c "nc -4 -l -p 9004 &"
- ip netns exec ${NS_DST} bash -c "nc -6 -l -p 9006 &"
-
- TEST="TCPv4 connectivity test"
- ip netns exec ${NS_SRC} bash -c "timeout ${TIMEOUT} dd if=/dev/zero bs=1000 count=100 > /dev/tcp/${IP4_DST}/9004"
- if [ $? -ne 0 ]; then
- echo -e "${TEST}: ${RED}FAIL${NC}"
- exit 1
- fi
- echo -e "${TEST}: ${GREEN}PASS${NC}"
-
- TEST="TCPv6 connectivity test"
- ip netns exec ${NS_SRC} bash -c "timeout ${TIMEOUT} dd if=/dev/zero bs=1000 count=100 > /dev/tcp/${IP6_DST}/9006"
- if [ $? -ne 0 ]; then
- echo -e "${TEST}: ${RED}FAIL${NC}"
- exit 1
- fi
- echo -e "${TEST}: ${GREEN}PASS${NC}"
-
- TEST="ICMPv4 connectivity test"
- ip netns exec ${NS_SRC} ping $PING_ARG ${IP4_DST}
- if [ $? -ne 0 ]; then
- echo -e "${TEST}: ${RED}FAIL${NC}"
- exit 1
- fi
- echo -e "${TEST}: ${GREEN}PASS${NC}"
-
- TEST="ICMPv6 connectivity test"
- ip netns exec ${NS_SRC} $PING6 $PING_ARG ${IP6_DST}
- if [ $? -ne 0 ]; then
- echo -e "${TEST}: ${RED}FAIL${NC}"
- exit 1
- fi
- echo -e "${TEST}: ${GREEN}PASS${NC}"
-
- set -e
-}
-
-hex_mem_str()
-{
- perl -e 'print join(" ", unpack("(H2)8", pack("L", @ARGV)))' $1
-}
-
-netns_setup_bpf()
-{
- local obj=$1
- local use_forwarding=${2:-0}
-
- ip netns exec ${NS_FWD} tc qdisc add dev veth_src_fwd clsact
- ip netns exec ${NS_FWD} tc filter add dev veth_src_fwd ingress bpf da obj $obj sec src_ingress
- ip netns exec ${NS_FWD} tc filter add dev veth_src_fwd egress bpf da obj $obj sec chk_egress
-
- ip netns exec ${NS_FWD} tc qdisc add dev veth_dst_fwd clsact
- ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd ingress bpf da obj $obj sec dst_ingress
- ip netns exec ${NS_FWD} tc filter add dev veth_dst_fwd egress bpf da obj $obj sec chk_egress
-
- if [ "$use_forwarding" -eq "1" ]; then
- # bpf_fib_lookup() checks if forwarding is enabled
- ip netns exec ${NS_FWD} sysctl -w net.ipv4.ip_forward=1
- ip netns exec ${NS_FWD} sysctl -w net.ipv6.conf.veth_dst_fwd.forwarding=1
- ip netns exec ${NS_FWD} sysctl -w net.ipv6.conf.veth_src_fwd.forwarding=1
- return 0
- fi
-
- veth_src=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_src_fwd/ifindex)
- veth_dst=$(ip netns exec ${NS_FWD} cat /sys/class/net/veth_dst_fwd/ifindex)
-
- progs=$(ip netns exec ${NS_FWD} bpftool net --json | jq -r '.[] | .tc | map(.id) | .[]')
- for prog in $progs; do
- map=$(bpftool prog show id $prog --json | jq -r '.map_ids | .? | .[]')
- if [ ! -z "$map" ]; then
- bpftool map update id $map key hex $(hex_mem_str 0) value hex $(hex_mem_str $veth_src)
- bpftool map update id $map key hex $(hex_mem_str 1) value hex $(hex_mem_str $veth_dst)
- fi
- done
-}
-
-trap netns_cleanup EXIT
-set -e
-
-netns_setup
-netns_setup_bpf test_tc_neigh.o
-netns_test_connectivity
-netns_cleanup
-netns_setup
-netns_setup_bpf test_tc_neigh_fib.o 1
-netns_test_connectivity
-netns_cleanup
-netns_setup
-netns_setup_bpf test_tc_peer.o
-netns_test_connectivity
BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, 0),
BPF_EXIT_INSN(),
},
- .result_unpriv = REJECT,
- .errstr_unpriv = "invalid write to stack R1 off=0 size=1",
.result = ACCEPT,
.retval = 42,
},
},
.fixup_map_array_48b = { 3 },
.result = ACCEPT,
- .result_unpriv = REJECT,
- .errstr_unpriv = "R0 pointer arithmetic of map value goes out of range",
.retval = 1,
},
{
},
.fixup_map_array_48b = { 3 },
.result = ACCEPT,
- .result_unpriv = REJECT,
- .errstr_unpriv = "R0 pointer arithmetic of map value goes out of range",
.retval = 1,
},
{
},
.fixup_map_array_48b = { 3 },
.result = ACCEPT,
- .result_unpriv = REJECT,
- .errstr_unpriv = "R0 pointer arithmetic of map value goes out of range",
.retval = 1,
},
{
},
.fixup_map_array_48b = { 3 },
.result = ACCEPT,
- .result_unpriv = REJECT,
- .errstr_unpriv = "R0 pointer arithmetic of map value goes out of range",
.retval = 1,
},
{
/kvm_create_max_vcpus
/kvm_page_table_test
/memslot_modification_stress_test
+/memslot_perf_test
/set_memory_region_test
/steal_time
UNAME_M := s390x
endif
-LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
+LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
LIBKVM_x86_64 = lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c
LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
TEST_GEN_PROGS_x86_64 += kvm_create_max_vcpus
TEST_GEN_PROGS_x86_64 += kvm_page_table_test
TEST_GEN_PROGS_x86_64 += memslot_modification_stress_test
+TEST_GEN_PROGS_x86_64 += memslot_perf_test
TEST_GEN_PROGS_x86_64 += set_memory_region_test
TEST_GEN_PROGS_x86_64 += steal_time
#define _GNU_SOURCE /* for pipe2 */
+#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
static int nr_vcpus = 1;
static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
+static size_t demand_paging_size;
static char *guest_data_prototype;
static void *vcpu_worker(void *data)
return NULL;
}
-static int handle_uffd_page_request(int uffd, uint64_t addr)
+static int handle_uffd_page_request(int uffd_mode, int uffd, uint64_t addr)
{
- pid_t tid;
+ pid_t tid = syscall(__NR_gettid);
struct timespec start;
struct timespec ts_diff;
- struct uffdio_copy copy;
int r;
- tid = syscall(__NR_gettid);
+ clock_gettime(CLOCK_MONOTONIC, &start);
- copy.src = (uint64_t)guest_data_prototype;
- copy.dst = addr;
- copy.len = perf_test_args.host_page_size;
- copy.mode = 0;
+ if (uffd_mode == UFFDIO_REGISTER_MODE_MISSING) {
+ struct uffdio_copy copy;
- clock_gettime(CLOCK_MONOTONIC, &start);
+ copy.src = (uint64_t)guest_data_prototype;
+ copy.dst = addr;
+ copy.len = demand_paging_size;
+ copy.mode = 0;
- r = ioctl(uffd, UFFDIO_COPY, ©);
- if (r == -1) {
- pr_info("Failed Paged in 0x%lx from thread %d with errno: %d\n",
- addr, tid, errno);
- return r;
+ r = ioctl(uffd, UFFDIO_COPY, ©);
+ if (r == -1) {
+ pr_info("Failed UFFDIO_COPY in 0x%lx from thread %d with errno: %d\n",
+ addr, tid, errno);
+ return r;
+ }
+ } else if (uffd_mode == UFFDIO_REGISTER_MODE_MINOR) {
+ struct uffdio_continue cont = {0};
+
+ cont.range.start = addr;
+ cont.range.len = demand_paging_size;
+
+ r = ioctl(uffd, UFFDIO_CONTINUE, &cont);
+ if (r == -1) {
+ pr_info("Failed UFFDIO_CONTINUE in 0x%lx from thread %d with errno: %d\n",
+ addr, tid, errno);
+ return r;
+ }
+ } else {
+ TEST_FAIL("Invalid uffd mode %d", uffd_mode);
}
ts_diff = timespec_elapsed(start);
- PER_PAGE_DEBUG("UFFDIO_COPY %d \t%ld ns\n", tid,
+ PER_PAGE_DEBUG("UFFD page-in %d \t%ld ns\n", tid,
timespec_to_ns(ts_diff));
PER_PAGE_DEBUG("Paged in %ld bytes at 0x%lx from thread %d\n",
- perf_test_args.host_page_size, addr, tid);
+ demand_paging_size, addr, tid);
return 0;
}
bool quit_uffd_thread;
struct uffd_handler_args {
+ int uffd_mode;
int uffd;
int pipefd;
useconds_t delay;
if (r == -1) {
if (errno == EAGAIN)
continue;
- pr_info("Read of uffd gor errno %d", errno);
+ pr_info("Read of uffd got errno %d\n", errno);
return NULL;
}
if (delay)
usleep(delay);
addr = msg.arg.pagefault.address;
- r = handle_uffd_page_request(uffd, addr);
+ r = handle_uffd_page_request(uffd_args->uffd_mode, uffd, addr);
if (r < 0)
return NULL;
pages++;
return NULL;
}
-static int setup_demand_paging(struct kvm_vm *vm,
- pthread_t *uffd_handler_thread, int pipefd,
- useconds_t uffd_delay,
- struct uffd_handler_args *uffd_args,
- void *hva, uint64_t len)
+static void setup_demand_paging(struct kvm_vm *vm,
+ pthread_t *uffd_handler_thread, int pipefd,
+ int uffd_mode, useconds_t uffd_delay,
+ struct uffd_handler_args *uffd_args,
+ void *hva, void *alias, uint64_t len)
{
+ bool is_minor = (uffd_mode == UFFDIO_REGISTER_MODE_MINOR);
int uffd;
struct uffdio_api uffdio_api;
struct uffdio_register uffdio_register;
+ uint64_t expected_ioctls = ((uint64_t) 1) << _UFFDIO_COPY;
- uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
- if (uffd == -1) {
- pr_info("uffd creation failed\n");
- return -1;
+ PER_PAGE_DEBUG("Userfaultfd %s mode, faults resolved with %s\n",
+ is_minor ? "MINOR" : "MISSING",
+ is_minor ? "UFFDIO_CONINUE" : "UFFDIO_COPY");
+
+ /* In order to get minor faults, prefault via the alias. */
+ if (is_minor) {
+ size_t p;
+
+ expected_ioctls = ((uint64_t) 1) << _UFFDIO_CONTINUE;
+
+ TEST_ASSERT(alias != NULL, "Alias required for minor faults");
+ for (p = 0; p < (len / demand_paging_size); ++p) {
+ memcpy(alias + (p * demand_paging_size),
+ guest_data_prototype, demand_paging_size);
+ }
}
+ uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+ TEST_ASSERT(uffd >= 0, "uffd creation failed, errno: %d", errno);
+
uffdio_api.api = UFFD_API;
uffdio_api.features = 0;
- if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1) {
- pr_info("ioctl uffdio_api failed\n");
- return -1;
- }
+ TEST_ASSERT(ioctl(uffd, UFFDIO_API, &uffdio_api) != -1,
+ "ioctl UFFDIO_API failed: %" PRIu64,
+ (uint64_t)uffdio_api.api);
uffdio_register.range.start = (uint64_t)hva;
uffdio_register.range.len = len;
- uffdio_register.mode = UFFDIO_REGISTER_MODE_MISSING;
- if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1) {
- pr_info("ioctl uffdio_register failed\n");
- return -1;
- }
-
- if ((uffdio_register.ioctls & UFFD_API_RANGE_IOCTLS) !=
- UFFD_API_RANGE_IOCTLS) {
- pr_info("unexpected userfaultfd ioctl set\n");
- return -1;
- }
+ uffdio_register.mode = uffd_mode;
+ TEST_ASSERT(ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) != -1,
+ "ioctl UFFDIO_REGISTER failed");
+ TEST_ASSERT((uffdio_register.ioctls & expected_ioctls) ==
+ expected_ioctls, "missing userfaultfd ioctls");
+ uffd_args->uffd_mode = uffd_mode;
uffd_args->uffd = uffd;
uffd_args->pipefd = pipefd;
uffd_args->delay = uffd_delay;
PER_VCPU_DEBUG("Created uffd thread for HVA range [%p, %p)\n",
hva, hva + len);
-
- return 0;
}
struct test_params {
- bool use_uffd;
+ int uffd_mode;
useconds_t uffd_delay;
+ enum vm_mem_backing_src_type src_type;
bool partition_vcpu_memory_access;
};
int r;
vm = perf_test_create_vm(mode, nr_vcpus, guest_percpu_mem_size,
- VM_MEM_SRC_ANONYMOUS);
+ p->src_type);
perf_test_args.wr_fract = 1;
- guest_data_prototype = malloc(perf_test_args.host_page_size);
+ demand_paging_size = get_backing_src_pagesz(p->src_type);
+
+ guest_data_prototype = malloc(demand_paging_size);
TEST_ASSERT(guest_data_prototype,
"Failed to allocate buffer for guest data pattern");
- memset(guest_data_prototype, 0xAB, perf_test_args.host_page_size);
+ memset(guest_data_prototype, 0xAB, demand_paging_size);
vcpu_threads = malloc(nr_vcpus * sizeof(*vcpu_threads));
TEST_ASSERT(vcpu_threads, "Memory allocation failed");
perf_test_setup_vcpus(vm, nr_vcpus, guest_percpu_mem_size,
p->partition_vcpu_memory_access);
- if (p->use_uffd) {
+ if (p->uffd_mode) {
uffd_handler_threads =
malloc(nr_vcpus * sizeof(*uffd_handler_threads));
TEST_ASSERT(uffd_handler_threads, "Memory allocation failed");
for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
vm_paddr_t vcpu_gpa;
void *vcpu_hva;
+ void *vcpu_alias;
uint64_t vcpu_mem_size;
PER_VCPU_DEBUG("Added VCPU %d with test mem gpa [%lx, %lx)\n",
vcpu_id, vcpu_gpa, vcpu_gpa + vcpu_mem_size);
- /* Cache the HVA pointer of the region */
+ /* Cache the host addresses of the region */
vcpu_hva = addr_gpa2hva(vm, vcpu_gpa);
+ vcpu_alias = addr_gpa2alias(vm, vcpu_gpa);
/*
* Set up user fault fd to handle demand paging
O_CLOEXEC | O_NONBLOCK);
TEST_ASSERT(!r, "Failed to set up pipefd");
- r = setup_demand_paging(vm,
- &uffd_handler_threads[vcpu_id],
- pipefds[vcpu_id * 2],
- p->uffd_delay, &uffd_args[vcpu_id],
- vcpu_hva, vcpu_mem_size);
- if (r < 0)
- exit(-r);
+ setup_demand_paging(vm, &uffd_handler_threads[vcpu_id],
+ pipefds[vcpu_id * 2], p->uffd_mode,
+ p->uffd_delay, &uffd_args[vcpu_id],
+ vcpu_hva, vcpu_alias,
+ vcpu_mem_size);
}
}
pr_info("All vCPU threads joined\n");
- if (p->use_uffd) {
+ if (p->uffd_mode) {
char c;
/* Tell the user fault fd handler threads to quit */
free(guest_data_prototype);
free(vcpu_threads);
- if (p->use_uffd) {
+ if (p->uffd_mode) {
free(uffd_handler_threads);
free(uffd_args);
free(pipefds);
static void help(char *name)
{
puts("");
- printf("usage: %s [-h] [-m mode] [-u] [-d uffd_delay_usec]\n"
- " [-b memory] [-v vcpus] [-o]\n", name);
+ printf("usage: %s [-h] [-m vm_mode] [-u uffd_mode] [-d uffd_delay_usec]\n"
+ " [-b memory] [-t type] [-v vcpus] [-o]\n", name);
guest_modes_help();
- printf(" -u: use User Fault FD to handle vCPU page\n"
- " faults.\n");
+ printf(" -u: use userfaultfd to handle vCPU page faults. Mode is a\n"
+ " UFFD registration mode: 'MISSING' or 'MINOR'.\n");
printf(" -d: add a delay in usec to the User Fault\n"
" FD handler to simulate demand paging\n"
" overheads. Ignored without -u.\n");
printf(" -b: specify the size of the memory region which should be\n"
" demand paged by each vCPU. e.g. 10M or 3G.\n"
" Default: 1G\n");
+ printf(" -t: The type of backing memory to use. Default: anonymous\n");
+ backing_src_help();
printf(" -v: specify the number of vCPUs to run.\n");
printf(" -o: Overlap guest memory accesses instead of partitioning\n"
" them into a separate region of memory for each vCPU.\n");
{
int max_vcpus = kvm_check_cap(KVM_CAP_MAX_VCPUS);
struct test_params p = {
+ .src_type = VM_MEM_SRC_ANONYMOUS,
.partition_vcpu_memory_access = true,
};
int opt;
guest_modes_append_default();
- while ((opt = getopt(argc, argv, "hm:ud:b:v:o")) != -1) {
+ while ((opt = getopt(argc, argv, "hm:u:d:b:t:v:o")) != -1) {
switch (opt) {
case 'm':
guest_modes_cmdline(optarg);
break;
case 'u':
- p.use_uffd = true;
+ if (!strcmp("MISSING", optarg))
+ p.uffd_mode = UFFDIO_REGISTER_MODE_MISSING;
+ else if (!strcmp("MINOR", optarg))
+ p.uffd_mode = UFFDIO_REGISTER_MODE_MINOR;
+ TEST_ASSERT(p.uffd_mode, "UFFD mode must be 'MISSING' or 'MINOR'.");
break;
case 'd':
p.uffd_delay = strtoul(optarg, NULL, 0);
case 'b':
guest_percpu_mem_size = parse_size(optarg);
break;
+ case 't':
+ p.src_type = parse_backing_src_type(optarg);
+ break;
case 'v':
nr_vcpus = atoi(optarg);
TEST_ASSERT(nr_vcpus > 0 && nr_vcpus <= max_vcpus,
}
}
+ if (p.uffd_mode == UFFDIO_REGISTER_MODE_MINOR &&
+ !backing_src_is_shared(p.src_type)) {
+ TEST_FAIL("userfaultfd MINOR mode requires shared memory; pick a different -t");
+ }
+
for_each_guest_mode(run_test, &p);
return 0;
TEST_ASSERT(false, "%s: [%d] child escaped the ninja\n", __func__, run);
}
+void wait_for_child_setup(pid_t pid)
+{
+ /*
+ * Wait for the child to post to the semaphore, but wake up periodically
+ * to check if the child exited prematurely.
+ */
+ for (;;) {
+ const struct timespec wait_period = { .tv_sec = 1 };
+ int status;
+
+ if (!sem_timedwait(sem, &wait_period))
+ return;
+
+ /* Child is still running, keep waiting. */
+ if (pid != waitpid(pid, &status, WNOHANG))
+ continue;
+
+ /*
+ * Child is no longer running, which is not expected.
+ *
+ * If it exited with a non-zero status, we explicitly forward
+ * the child's status in case it exited with KSFT_SKIP.
+ */
+ if (WIFEXITED(status))
+ exit(WEXITSTATUS(status));
+ else
+ TEST_ASSERT(false, "Child exited unexpectedly");
+ }
+}
+
int main(int argc, char **argv)
{
uint32_t i;
run_test(i); /* This function always exits */
pr_debug("%s: [%d] waiting semaphore\n", __func__, i);
- sem_wait(sem);
+ wait_for_child_setup(pid);
r = (rand() % DELAY_US_MAX) + 1;
pr_debug("%s: [%d] waiting %dus\n", __func__, i, r);
usleep(r);
};
extern const struct vm_guest_mode_params vm_guest_mode_params[];
+int open_kvm_dev_path_or_exit(void);
int kvm_check_cap(long cap);
int vm_enable_cap(struct kvm_vm *vm, struct kvm_enable_cap *cap);
int vcpu_enable_cap(struct kvm_vm *vm, uint32_t vcpu_id,
void *addr_gpa2hva(struct kvm_vm *vm, vm_paddr_t gpa);
void *addr_gva2hva(struct kvm_vm *vm, vm_vaddr_t gva);
vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva);
+void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa);
/*
* Address Guest Virtual to Guest Physical
unsigned int vm_get_page_size(struct kvm_vm *vm);
unsigned int vm_get_page_shift(struct kvm_vm *vm);
-unsigned int vm_get_max_gfn(struct kvm_vm *vm);
+uint64_t vm_get_max_gfn(struct kvm_vm *vm);
int vm_get_fd(struct kvm_vm *vm);
unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size);
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
+#include <sys/mman.h>
#include "kselftest.h"
static inline int _no_printf(const char *format, ...) { return 0; }
VM_MEM_SRC_ANONYMOUS_HUGETLB_1GB,
VM_MEM_SRC_ANONYMOUS_HUGETLB_2GB,
VM_MEM_SRC_ANONYMOUS_HUGETLB_16GB,
+ VM_MEM_SRC_SHMEM,
+ VM_MEM_SRC_SHARED_HUGETLB,
NUM_SRC_TYPES,
};
void backing_src_help(void);
enum vm_mem_backing_src_type parse_backing_src_type(const char *type_name);
+/*
+ * Whether or not the given source type is shared memory (as opposed to
+ * anonymous).
+ */
+static inline bool backing_src_is_shared(enum vm_mem_backing_src_type t)
+{
+ return vm_mem_backing_src_alias(t)->flag & MAP_SHARED;
+}
+
#endif /* SELFTEST_KVM_TEST_UTIL_H */
}
/*
+ * Open KVM_DEV_PATH if available, otherwise exit the entire program.
+ *
+ * Input Args:
+ * flags - The flags to pass when opening KVM_DEV_PATH.
+ *
+ * Return:
+ * The opened file descriptor of /dev/kvm.
+ */
+static int _open_kvm_dev_path_or_exit(int flags)
+{
+ int fd;
+
+ fd = open(KVM_DEV_PATH, flags);
+ if (fd < 0) {
+ print_skip("%s not available, is KVM loaded? (errno: %d)",
+ KVM_DEV_PATH, errno);
+ exit(KSFT_SKIP);
+ }
+
+ return fd;
+}
+
+int open_kvm_dev_path_or_exit(void)
+{
+ return _open_kvm_dev_path_or_exit(O_RDONLY);
+}
+
+/*
* Capability
*
* Input Args:
int ret;
int kvm_fd;
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
-
+ kvm_fd = open_kvm_dev_path_or_exit();
ret = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);
TEST_ASSERT(ret != -1, "KVM_CHECK_EXTENSION IOCTL failed,\n"
" rc: %i errno: %i", ret, errno);
static void vm_open(struct kvm_vm *vm, int perm)
{
- vm->kvm_fd = open(KVM_DEV_PATH, perm);
- if (vm->kvm_fd < 0)
- exit(KSFT_SKIP);
+ vm->kvm_fd = _open_kvm_dev_path_or_exit(perm);
if (!kvm_check_cap(KVM_CAP_IMMEDIATE_EXIT)) {
print_skip("immediate_exit not available");
TEST_ASSERT(vm != NULL, "Insufficient Memory");
INIT_LIST_HEAD(&vm->vcpus);
- INIT_LIST_HEAD(&vm->userspace_mem_regions);
+ vm->regions.gpa_tree = RB_ROOT;
+ vm->regions.hva_tree = RB_ROOT;
+ hash_init(vm->regions.slot_hash);
vm->mode = mode;
vm->type = 0;
*/
uint64_t vcpu_pages = (DEFAULT_STACK_PGS + num_percpu_pages) * nr_vcpus;
uint64_t extra_pg_pages = (extra_mem_pages + vcpu_pages) / PTES_PER_MIN_PAGE * 2;
- uint64_t pages = DEFAULT_GUEST_PHY_PAGES + vcpu_pages + extra_pg_pages;
+ uint64_t pages = DEFAULT_GUEST_PHY_PAGES + extra_mem_pages + vcpu_pages + extra_pg_pages;
struct kvm_vm *vm;
int i;
*/
void kvm_vm_restart(struct kvm_vm *vmp, int perm)
{
+ int ctr;
struct userspace_mem_region *region;
vm_open(vmp, perm);
if (vmp->has_irqchip)
vm_create_irqchip(vmp);
- list_for_each_entry(region, &vmp->userspace_mem_regions, list) {
+ hash_for_each(vmp->regions.slot_hash, ctr, region, slot_node) {
int ret = ioctl(vmp->fd, KVM_SET_USER_MEMORY_REGION, ®ion->region);
TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION IOCTL failed,\n"
" rc: %i errno: %i\n"
static struct userspace_mem_region *
userspace_mem_region_find(struct kvm_vm *vm, uint64_t start, uint64_t end)
{
- struct userspace_mem_region *region;
+ struct rb_node *node;
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
+ for (node = vm->regions.gpa_tree.rb_node; node; ) {
+ struct userspace_mem_region *region =
+ container_of(node, struct userspace_mem_region, gpa_node);
uint64_t existing_start = region->region.guest_phys_addr;
uint64_t existing_end = region->region.guest_phys_addr
+ region->region.memory_size - 1;
if (start <= existing_end && end >= existing_start)
return region;
+
+ if (start < existing_start)
+ node = node->rb_left;
+ else
+ node = node->rb_right;
}
return NULL;
}
static void __vm_mem_region_delete(struct kvm_vm *vm,
- struct userspace_mem_region *region)
+ struct userspace_mem_region *region,
+ bool unlink)
{
int ret;
- list_del(®ion->list);
+ if (unlink) {
+ rb_erase(®ion->gpa_node, &vm->regions.gpa_tree);
+ rb_erase(®ion->hva_node, &vm->regions.hva_tree);
+ hash_del(®ion->slot_node);
+ }
region->region.memory_size = 0;
ret = ioctl(vm->fd, KVM_SET_USER_MEMORY_REGION, ®ion->region);
*/
void kvm_vm_free(struct kvm_vm *vmp)
{
- struct userspace_mem_region *region, *tmp;
+ int ctr;
+ struct hlist_node *node;
+ struct userspace_mem_region *region;
if (vmp == NULL)
return;
/* Free userspace_mem_regions. */
- list_for_each_entry_safe(region, tmp, &vmp->userspace_mem_regions, list)
- __vm_mem_region_delete(vmp, region);
+ hash_for_each_safe(vmp->regions.slot_hash, ctr, node, region, slot_node)
+ __vm_mem_region_delete(vmp, region, false);
/* Free sparsebit arrays. */
sparsebit_free(&vmp->vpages_valid);
return 0;
}
+static void vm_userspace_mem_region_gpa_insert(struct rb_root *gpa_tree,
+ struct userspace_mem_region *region)
+{
+ struct rb_node **cur, *parent;
+
+ for (cur = &gpa_tree->rb_node, parent = NULL; *cur; ) {
+ struct userspace_mem_region *cregion;
+
+ cregion = container_of(*cur, typeof(*cregion), gpa_node);
+ parent = *cur;
+ if (region->region.guest_phys_addr <
+ cregion->region.guest_phys_addr)
+ cur = &(*cur)->rb_left;
+ else {
+ TEST_ASSERT(region->region.guest_phys_addr !=
+ cregion->region.guest_phys_addr,
+ "Duplicate GPA in region tree");
+
+ cur = &(*cur)->rb_right;
+ }
+ }
+
+ rb_link_node(®ion->gpa_node, parent, cur);
+ rb_insert_color(®ion->gpa_node, gpa_tree);
+}
+
+static void vm_userspace_mem_region_hva_insert(struct rb_root *hva_tree,
+ struct userspace_mem_region *region)
+{
+ struct rb_node **cur, *parent;
+
+ for (cur = &hva_tree->rb_node, parent = NULL; *cur; ) {
+ struct userspace_mem_region *cregion;
+
+ cregion = container_of(*cur, typeof(*cregion), hva_node);
+ parent = *cur;
+ if (region->host_mem < cregion->host_mem)
+ cur = &(*cur)->rb_left;
+ else {
+ TEST_ASSERT(region->host_mem !=
+ cregion->host_mem,
+ "Duplicate HVA in region tree");
+
+ cur = &(*cur)->rb_right;
+ }
+ }
+
+ rb_link_node(®ion->hva_node, parent, cur);
+ rb_insert_color(®ion->hva_node, hva_tree);
+}
+
/*
* VM Userspace Memory Region Add
*
* Input Args:
* vm - Virtual Machine
- * backing_src - Storage source for this region.
- * NULL to use anonymous memory.
+ * src_type - Storage source for this region.
+ * NULL to use anonymous memory.
* guest_paddr - Starting guest physical address
* slot - KVM region slot
* npages - Number of physical pages
(uint64_t) region->region.memory_size);
/* Confirm no region with the requested slot already exists. */
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
+ hash_for_each_possible(vm->regions.slot_hash, region, slot_node,
+ slot) {
if (region->region.slot != slot)
continue;
if (alignment > 1)
region->mmap_size += alignment;
+ region->fd = -1;
+ if (backing_src_is_shared(src_type)) {
+ int memfd_flags = MFD_CLOEXEC;
+
+ if (src_type == VM_MEM_SRC_SHARED_HUGETLB)
+ memfd_flags |= MFD_HUGETLB;
+
+ region->fd = memfd_create("kvm_selftest", memfd_flags);
+ TEST_ASSERT(region->fd != -1,
+ "memfd_create failed, errno: %i", errno);
+
+ ret = ftruncate(region->fd, region->mmap_size);
+ TEST_ASSERT(ret == 0, "ftruncate failed, errno: %i", errno);
+
+ ret = fallocate(region->fd,
+ FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE, 0,
+ region->mmap_size);
+ TEST_ASSERT(ret == 0, "fallocate failed, errno: %i", errno);
+ }
+
region->mmap_start = mmap(NULL, region->mmap_size,
PROT_READ | PROT_WRITE,
- MAP_PRIVATE | MAP_ANONYMOUS
- | vm_mem_backing_src_alias(src_type)->flag,
- -1, 0);
+ vm_mem_backing_src_alias(src_type)->flag,
+ region->fd, 0);
TEST_ASSERT(region->mmap_start != MAP_FAILED,
"test_malloc failed, mmap_start: %p errno: %i",
region->mmap_start, errno);
ret, errno, slot, flags,
guest_paddr, (uint64_t) region->region.memory_size);
- /* Add to linked-list of memory regions. */
- list_add(®ion->list, &vm->userspace_mem_regions);
+ /* Add to quick lookup data structures */
+ vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region);
+ vm_userspace_mem_region_hva_insert(&vm->regions.hva_tree, region);
+ hash_add(vm->regions.slot_hash, ®ion->slot_node, slot);
+
+ /* If shared memory, create an alias. */
+ if (region->fd >= 0) {
+ region->mmap_alias = mmap(NULL, region->mmap_size,
+ PROT_READ | PROT_WRITE,
+ vm_mem_backing_src_alias(src_type)->flag,
+ region->fd, 0);
+ TEST_ASSERT(region->mmap_alias != MAP_FAILED,
+ "mmap of alias failed, errno: %i", errno);
+
+ /* Align host alias address */
+ region->host_alias = align(region->mmap_alias, alignment);
+ }
}
/*
{
struct userspace_mem_region *region;
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
+ hash_for_each_possible(vm->regions.slot_hash, region, slot_node,
+ memslot)
if (region->region.slot == memslot)
return region;
- }
fprintf(stderr, "No mem region with the requested slot found,\n"
" requested slot: %u\n", memslot);
*/
void vm_mem_region_delete(struct kvm_vm *vm, uint32_t slot)
{
- __vm_mem_region_delete(vm, memslot2region(vm, slot));
+ __vm_mem_region_delete(vm, memslot2region(vm, slot), true);
}
/*
{
int dev_fd, ret;
- dev_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (dev_fd < 0)
- exit(KSFT_SKIP);
+ dev_fd = open_kvm_dev_path_or_exit();
ret = ioctl(dev_fd, KVM_GET_VCPU_MMAP_SIZE, NULL);
TEST_ASSERT(ret >= sizeof(struct kvm_run),
uint64_t pages = (sz >> vm->page_shift) + ((sz % vm->page_size) != 0);
virt_pgd_alloc(vm, pgd_memslot);
+ vm_paddr_t paddr = vm_phy_pages_alloc(vm, pages,
+ KVM_UTIL_MIN_PFN * vm->page_size,
+ data_memslot);
/*
* Find an unused range of virtual page addresses of at least
/* Map the virtual pages. */
for (vm_vaddr_t vaddr = vaddr_start; pages > 0;
- pages--, vaddr += vm->page_size) {
- vm_paddr_t paddr;
-
- paddr = vm_phy_page_alloc(vm,
- KVM_UTIL_MIN_PFN * vm->page_size, data_memslot);
+ pages--, vaddr += vm->page_size, paddr += vm->page_size) {
virt_pg_map(vm, vaddr, paddr, pgd_memslot);
{
struct userspace_mem_region *region;
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
- if ((gpa >= region->region.guest_phys_addr)
- && (gpa <= (region->region.guest_phys_addr
- + region->region.memory_size - 1)))
- return (void *) ((uintptr_t) region->host_mem
- + (gpa - region->region.guest_phys_addr));
+ region = userspace_mem_region_find(vm, gpa, gpa);
+ if (!region) {
+ TEST_FAIL("No vm physical memory at 0x%lx", gpa);
+ return NULL;
}
- TEST_FAIL("No vm physical memory at 0x%lx", gpa);
- return NULL;
+ return (void *)((uintptr_t)region->host_mem
+ + (gpa - region->region.guest_phys_addr));
}
/*
*/
vm_paddr_t addr_hva2gpa(struct kvm_vm *vm, void *hva)
{
- struct userspace_mem_region *region;
+ struct rb_node *node;
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
- if ((hva >= region->host_mem)
- && (hva <= (region->host_mem
- + region->region.memory_size - 1)))
- return (vm_paddr_t) ((uintptr_t)
- region->region.guest_phys_addr
- + (hva - (uintptr_t) region->host_mem));
+ for (node = vm->regions.hva_tree.rb_node; node; ) {
+ struct userspace_mem_region *region =
+ container_of(node, struct userspace_mem_region, hva_node);
+
+ if (hva >= region->host_mem) {
+ if (hva <= (region->host_mem
+ + region->region.memory_size - 1))
+ return (vm_paddr_t)((uintptr_t)
+ region->region.guest_phys_addr
+ + (hva - (uintptr_t)region->host_mem));
+
+ node = node->rb_right;
+ } else
+ node = node->rb_left;
}
TEST_FAIL("No mapping to a guest physical address, hva: %p", hva);
}
/*
+ * Address VM physical to Host Virtual *alias*.
+ *
+ * Input Args:
+ * vm - Virtual Machine
+ * gpa - VM physical address
+ *
+ * Output Args: None
+ *
+ * Return:
+ * Equivalent address within the host virtual *alias* area, or NULL
+ * (without failing the test) if the guest memory is not shared (so
+ * no alias exists).
+ *
+ * When vm_create() and related functions are called with a shared memory
+ * src_type, we also create a writable, shared alias mapping of the
+ * underlying guest memory. This allows the host to manipulate guest memory
+ * without mapping that memory in the guest's address space. And, for
+ * userfaultfd-based demand paging, we can do so without triggering userfaults.
+ */
+void *addr_gpa2alias(struct kvm_vm *vm, vm_paddr_t gpa)
+{
+ struct userspace_mem_region *region;
+ uintptr_t offset;
+
+ region = userspace_mem_region_find(vm, gpa, gpa);
+ if (!region)
+ return NULL;
+
+ if (!region->host_alias)
+ return NULL;
+
+ offset = gpa - region->region.guest_phys_addr;
+ return (void *) ((uintptr_t) region->host_alias + offset);
+}
+
+/*
* VM Create IRQ Chip
*
* Input Args:
*/
void vm_dump(FILE *stream, struct kvm_vm *vm, uint8_t indent)
{
+ int ctr;
struct userspace_mem_region *region;
struct vcpu *vcpu;
fprintf(stream, "%*sfd: %i\n", indent, "", vm->fd);
fprintf(stream, "%*spage_size: 0x%x\n", indent, "", vm->page_size);
fprintf(stream, "%*sMem Regions:\n", indent, "");
- list_for_each_entry(region, &vm->userspace_mem_regions, list) {
+ hash_for_each(vm->regions.slot_hash, ctr, region, slot_node) {
fprintf(stream, "%*sguest_phys: 0x%lx size: 0x%lx "
"host_virt: %p\n", indent + 2, "",
(uint64_t) region->region.guest_phys_addr,
if (vm == NULL) {
/* Ensure that the KVM vendor-specific module is loaded. */
- f = fopen(KVM_DEV_PATH, "r");
- TEST_ASSERT(f != NULL, "Error in opening KVM dev file: %d",
- errno);
- fclose(f);
+ close(open_kvm_dev_path_or_exit());
}
f = fopen("/sys/module/kvm_intel/parameters/unrestricted_guest", "r");
return vm->page_shift;
}
-unsigned int vm_get_max_gfn(struct kvm_vm *vm)
+uint64_t vm_get_max_gfn(struct kvm_vm *vm)
{
return vm->max_gfn;
}
#ifndef SELFTEST_KVM_UTIL_INTERNAL_H
#define SELFTEST_KVM_UTIL_INTERNAL_H
+#include "linux/hashtable.h"
+#include "linux/rbtree.h"
+
#include "sparsebit.h"
struct userspace_mem_region {
int fd;
off_t offset;
void *host_mem;
+ void *host_alias;
void *mmap_start;
+ void *mmap_alias;
size_t mmap_size;
- struct list_head list;
+ struct rb_node gpa_node;
+ struct rb_node hva_node;
+ struct hlist_node slot_node;
};
struct vcpu {
uint32_t dirty_gfns_count;
};
+struct userspace_mem_regions {
+ struct rb_root gpa_tree;
+ struct rb_root hva_tree;
+ DECLARE_HASHTABLE(slot_hash, 9);
+};
+
struct kvm_vm {
int mode;
unsigned long type;
unsigned int va_bits;
uint64_t max_gfn;
struct list_head vcpus;
- struct list_head userspace_mem_regions;
+ struct userspace_mem_regions regions;
struct sparsebit *vpages_valid;
struct sparsebit *vpages_mapped;
bool has_irqchip;
/*
* Copyright (C) 2020, Google LLC.
*/
+#include <inttypes.h>
#include "kvm_util.h"
#include "perf_test_util.h"
*/
TEST_ASSERT(guest_num_pages < vm_get_max_gfn(vm),
"Requested more guest memory than address space allows.\n"
- " guest pages: %lx max gfn: %x vcpus: %d wss: %lx]\n",
+ " guest pages: %" PRIx64 " max gfn: %" PRIx64
+ " vcpus: %d wss: %" PRIx64 "]\n",
guest_num_pages, vm_get_max_gfn(vm), vcpus,
vcpu_memory_bytes);
--- /dev/null
+#include "../../../../lib/rbtree.c"
const struct vm_mem_backing_src_alias *vm_mem_backing_src_alias(uint32_t i)
{
+ static const int anon_flags = MAP_PRIVATE | MAP_ANONYMOUS;
+ static const int anon_huge_flags = anon_flags | MAP_HUGETLB;
+
static const struct vm_mem_backing_src_alias aliases[] = {
[VM_MEM_SRC_ANONYMOUS] = {
.name = "anonymous",
- .flag = 0,
+ .flag = anon_flags,
},
[VM_MEM_SRC_ANONYMOUS_THP] = {
.name = "anonymous_thp",
- .flag = 0,
+ .flag = anon_flags,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB] = {
.name = "anonymous_hugetlb",
- .flag = MAP_HUGETLB,
+ .flag = anon_huge_flags,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_16KB] = {
.name = "anonymous_hugetlb_16kb",
- .flag = MAP_HUGETLB | MAP_HUGE_16KB,
+ .flag = anon_huge_flags | MAP_HUGE_16KB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_64KB] = {
.name = "anonymous_hugetlb_64kb",
- .flag = MAP_HUGETLB | MAP_HUGE_64KB,
+ .flag = anon_huge_flags | MAP_HUGE_64KB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_512KB] = {
.name = "anonymous_hugetlb_512kb",
- .flag = MAP_HUGETLB | MAP_HUGE_512KB,
+ .flag = anon_huge_flags | MAP_HUGE_512KB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_1MB] = {
.name = "anonymous_hugetlb_1mb",
- .flag = MAP_HUGETLB | MAP_HUGE_1MB,
+ .flag = anon_huge_flags | MAP_HUGE_1MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_2MB] = {
.name = "anonymous_hugetlb_2mb",
- .flag = MAP_HUGETLB | MAP_HUGE_2MB,
+ .flag = anon_huge_flags | MAP_HUGE_2MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_8MB] = {
.name = "anonymous_hugetlb_8mb",
- .flag = MAP_HUGETLB | MAP_HUGE_8MB,
+ .flag = anon_huge_flags | MAP_HUGE_8MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_16MB] = {
.name = "anonymous_hugetlb_16mb",
- .flag = MAP_HUGETLB | MAP_HUGE_16MB,
+ .flag = anon_huge_flags | MAP_HUGE_16MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_32MB] = {
.name = "anonymous_hugetlb_32mb",
- .flag = MAP_HUGETLB | MAP_HUGE_32MB,
+ .flag = anon_huge_flags | MAP_HUGE_32MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_256MB] = {
.name = "anonymous_hugetlb_256mb",
- .flag = MAP_HUGETLB | MAP_HUGE_256MB,
+ .flag = anon_huge_flags | MAP_HUGE_256MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_512MB] = {
.name = "anonymous_hugetlb_512mb",
- .flag = MAP_HUGETLB | MAP_HUGE_512MB,
+ .flag = anon_huge_flags | MAP_HUGE_512MB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_1GB] = {
.name = "anonymous_hugetlb_1gb",
- .flag = MAP_HUGETLB | MAP_HUGE_1GB,
+ .flag = anon_huge_flags | MAP_HUGE_1GB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_2GB] = {
.name = "anonymous_hugetlb_2gb",
- .flag = MAP_HUGETLB | MAP_HUGE_2GB,
+ .flag = anon_huge_flags | MAP_HUGE_2GB,
},
[VM_MEM_SRC_ANONYMOUS_HUGETLB_16GB] = {
.name = "anonymous_hugetlb_16gb",
- .flag = MAP_HUGETLB | MAP_HUGE_16GB,
+ .flag = anon_huge_flags | MAP_HUGE_16GB,
+ },
+ [VM_MEM_SRC_SHMEM] = {
+ .name = "shmem",
+ .flag = MAP_SHARED,
+ },
+ [VM_MEM_SRC_SHARED_HUGETLB] = {
+ .name = "shared_hugetlb",
+ /*
+ * No MAP_HUGETLB, we use MFD_HUGETLB instead. Since
+ * we're using "file backed" memory, we need to specify
+ * this when the FD is created, not when the area is
+ * mapped.
+ */
+ .flag = MAP_SHARED,
},
};
_Static_assert(ARRAY_SIZE(aliases) == NUM_SRC_TYPES,
switch (i) {
case VM_MEM_SRC_ANONYMOUS:
+ case VM_MEM_SRC_SHMEM:
return getpagesize();
case VM_MEM_SRC_ANONYMOUS_THP:
return get_trans_hugepagesz();
case VM_MEM_SRC_ANONYMOUS_HUGETLB:
+ case VM_MEM_SRC_SHARED_HUGETLB:
return get_def_hugetlb_pagesz();
default:
return MAP_HUGE_PAGE_SIZE(flag);
return cpuid;
cpuid = allocate_kvm_cpuid2();
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
ret = ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, cpuid);
TEST_ASSERT(ret == 0, "KVM_GET_SUPPORTED_CPUID failed %d %d\n",
buffer.header.nmsrs = 1;
buffer.entry.index = msr_index;
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
r = ioctl(kvm_fd, KVM_GET_MSRS, &buffer.header);
TEST_ASSERT(r == 1, "KVM_GET_MSRS IOCTL failed,\n"
struct kvm_msr_list *list;
int nmsrs, r, kvm_fd;
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
nmsrs = kvm_get_num_msrs_fd(kvm_fd);
list = malloc(sizeof(*list) + nmsrs * sizeof(list->indices[0]));
return cpuid;
cpuid = allocate_kvm_cpuid2();
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
ret = ioctl(kvm_fd, KVM_GET_SUPPORTED_HV_CPUID, cpuid);
TEST_ASSERT(ret == 0, "KVM_GET_SUPPORTED_HV_CPUID failed %d %d\n",
};
static void add_remove_memslot(struct kvm_vm *vm, useconds_t delay,
- uint64_t nr_modifications, uint64_t gpa)
+ uint64_t nr_modifications)
{
+ const uint64_t pages = 1;
+ uint64_t gpa;
int i;
+ /*
+ * Add the dummy memslot just below the perf_test_util memslot, which is
+ * at the top of the guest physical address space.
+ */
+ gpa = guest_test_phys_mem - pages * vm_get_page_size(vm);
+
for (i = 0; i < nr_modifications; i++) {
usleep(delay);
vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, gpa,
- DUMMY_MEMSLOT_INDEX, 1, 0);
+ DUMMY_MEMSLOT_INDEX, pages, 0);
vm_mem_region_delete(vm, DUMMY_MEMSLOT_INDEX);
}
pr_info("Started all vCPUs\n");
add_remove_memslot(vm, p->memslot_modification_delay,
- p->nr_memslot_modifications,
- guest_test_phys_mem +
- (guest_percpu_mem_size * nr_vcpus) +
- perf_test_args.host_page_size +
- perf_test_args.guest_page_size);
+ p->nr_memslot_modifications);
run_vcpus = false;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * A memslot-related performance benchmark.
+ *
+ * Copyright (C) 2021 Oracle and/or its affiliates.
+ *
+ * Basic guest setup / host vCPU thread code lifted from set_memory_region_test.
+ */
+#include <pthread.h>
+#include <sched.h>
+#include <semaphore.h>
+#include <stdatomic.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <unistd.h>
+
+#include <linux/compiler.h>
+
+#include <test_util.h>
+#include <kvm_util.h>
+#include <processor.h>
+
+#define VCPU_ID 0
+
+#define MEM_SIZE ((512U << 20) + 4096)
+#define MEM_SIZE_PAGES (MEM_SIZE / 4096)
+#define MEM_GPA 0x10000000UL
+#define MEM_AUX_GPA MEM_GPA
+#define MEM_SYNC_GPA MEM_AUX_GPA
+#define MEM_TEST_GPA (MEM_AUX_GPA + 4096)
+#define MEM_TEST_SIZE (MEM_SIZE - 4096)
+static_assert(MEM_SIZE % 4096 == 0, "invalid mem size");
+static_assert(MEM_TEST_SIZE % 4096 == 0, "invalid mem test size");
+
+/*
+ * 32 MiB is max size that gets well over 100 iterations on 509 slots.
+ * Considering that each slot needs to have at least one page up to
+ * 8194 slots in use can then be tested (although with slightly
+ * limited resolution).
+ */
+#define MEM_SIZE_MAP ((32U << 20) + 4096)
+#define MEM_SIZE_MAP_PAGES (MEM_SIZE_MAP / 4096)
+#define MEM_TEST_MAP_SIZE (MEM_SIZE_MAP - 4096)
+#define MEM_TEST_MAP_SIZE_PAGES (MEM_TEST_MAP_SIZE / 4096)
+static_assert(MEM_SIZE_MAP % 4096 == 0, "invalid map test region size");
+static_assert(MEM_TEST_MAP_SIZE % 4096 == 0, "invalid map test region size");
+static_assert(MEM_TEST_MAP_SIZE_PAGES % 2 == 0, "invalid map test region size");
+static_assert(MEM_TEST_MAP_SIZE_PAGES > 2, "invalid map test region size");
+
+/*
+ * 128 MiB is min size that fills 32k slots with at least one page in each
+ * while at the same time gets 100+ iterations in such test
+ */
+#define MEM_TEST_UNMAP_SIZE (128U << 20)
+#define MEM_TEST_UNMAP_SIZE_PAGES (MEM_TEST_UNMAP_SIZE / 4096)
+/* 2 MiB chunk size like a typical huge page */
+#define MEM_TEST_UNMAP_CHUNK_PAGES (2U << (20 - 12))
+static_assert(MEM_TEST_UNMAP_SIZE <= MEM_TEST_SIZE,
+ "invalid unmap test region size");
+static_assert(MEM_TEST_UNMAP_SIZE % 4096 == 0,
+ "invalid unmap test region size");
+static_assert(MEM_TEST_UNMAP_SIZE_PAGES %
+ (2 * MEM_TEST_UNMAP_CHUNK_PAGES) == 0,
+ "invalid unmap test region size");
+
+/*
+ * For the move active test the middle of the test area is placed on
+ * a memslot boundary: half lies in the memslot being moved, half in
+ * other memslot(s).
+ *
+ * When running this test with 32k memslots (32764, really) each memslot
+ * contains 4 pages.
+ * The last one additionally contains the remaining 21 pages of memory,
+ * for the total size of 25 pages.
+ * Hence, the maximum size here is 50 pages.
+ */
+#define MEM_TEST_MOVE_SIZE_PAGES (50)
+#define MEM_TEST_MOVE_SIZE (MEM_TEST_MOVE_SIZE_PAGES * 4096)
+#define MEM_TEST_MOVE_GPA_DEST (MEM_GPA + MEM_SIZE)
+static_assert(MEM_TEST_MOVE_SIZE <= MEM_TEST_SIZE,
+ "invalid move test region size");
+
+#define MEM_TEST_VAL_1 0x1122334455667788
+#define MEM_TEST_VAL_2 0x99AABBCCDDEEFF00
+
+struct vm_data {
+ struct kvm_vm *vm;
+ pthread_t vcpu_thread;
+ uint32_t nslots;
+ uint64_t npages;
+ uint64_t pages_per_slot;
+ void **hva_slots;
+ bool mmio_ok;
+ uint64_t mmio_gpa_min;
+ uint64_t mmio_gpa_max;
+};
+
+struct sync_area {
+ atomic_bool start_flag;
+ atomic_bool exit_flag;
+ atomic_bool sync_flag;
+ void *move_area_ptr;
+};
+
+/*
+ * Technically, we need also for the atomic bool to be address-free, which
+ * is recommended, but not strictly required, by C11 for lockless
+ * implementations.
+ * However, in practice both GCC and Clang fulfill this requirement on
+ * all KVM-supported platforms.
+ */
+static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic bool is not lockless");
+
+static sem_t vcpu_ready;
+
+static bool map_unmap_verify;
+
+static bool verbose;
+#define pr_info_v(...) \
+ do { \
+ if (verbose) \
+ pr_info(__VA_ARGS__); \
+ } while (0)
+
+static void *vcpu_worker(void *data)
+{
+ struct vm_data *vm = data;
+ struct kvm_run *run;
+ struct ucall uc;
+ uint64_t cmd;
+
+ run = vcpu_state(vm->vm, VCPU_ID);
+ while (1) {
+ vcpu_run(vm->vm, VCPU_ID);
+
+ if (run->exit_reason == KVM_EXIT_IO) {
+ cmd = get_ucall(vm->vm, VCPU_ID, &uc);
+ if (cmd != UCALL_SYNC)
+ break;
+
+ sem_post(&vcpu_ready);
+ continue;
+ }
+
+ if (run->exit_reason != KVM_EXIT_MMIO)
+ break;
+
+ TEST_ASSERT(vm->mmio_ok, "Unexpected mmio exit");
+ TEST_ASSERT(run->mmio.is_write, "Unexpected mmio read");
+ TEST_ASSERT(run->mmio.len == 8,
+ "Unexpected exit mmio size = %u", run->mmio.len);
+ TEST_ASSERT(run->mmio.phys_addr >= vm->mmio_gpa_min &&
+ run->mmio.phys_addr <= vm->mmio_gpa_max,
+ "Unexpected exit mmio address = 0x%llx",
+ run->mmio.phys_addr);
+ }
+
+ if (run->exit_reason == KVM_EXIT_IO && cmd == UCALL_ABORT)
+ TEST_FAIL("%s at %s:%ld, val = %lu", (const char *)uc.args[0],
+ __FILE__, uc.args[1], uc.args[2]);
+
+ return NULL;
+}
+
+static void wait_for_vcpu(void)
+{
+ struct timespec ts;
+
+ TEST_ASSERT(!clock_gettime(CLOCK_REALTIME, &ts),
+ "clock_gettime() failed: %d\n", errno);
+
+ ts.tv_sec += 2;
+ TEST_ASSERT(!sem_timedwait(&vcpu_ready, &ts),
+ "sem_timedwait() failed: %d\n", errno);
+}
+
+static void *vm_gpa2hva(struct vm_data *data, uint64_t gpa, uint64_t *rempages)
+{
+ uint64_t gpage, pgoffs;
+ uint32_t slot, slotoffs;
+ void *base;
+
+ TEST_ASSERT(gpa >= MEM_GPA, "Too low gpa to translate");
+ TEST_ASSERT(gpa < MEM_GPA + data->npages * 4096,
+ "Too high gpa to translate");
+ gpa -= MEM_GPA;
+
+ gpage = gpa / 4096;
+ pgoffs = gpa % 4096;
+ slot = min(gpage / data->pages_per_slot, (uint64_t)data->nslots - 1);
+ slotoffs = gpage - (slot * data->pages_per_slot);
+
+ if (rempages) {
+ uint64_t slotpages;
+
+ if (slot == data->nslots - 1)
+ slotpages = data->npages - slot * data->pages_per_slot;
+ else
+ slotpages = data->pages_per_slot;
+
+ TEST_ASSERT(!pgoffs,
+ "Asking for remaining pages in slot but gpa not page aligned");
+ *rempages = slotpages - slotoffs;
+ }
+
+ base = data->hva_slots[slot];
+ return (uint8_t *)base + slotoffs * 4096 + pgoffs;
+}
+
+static uint64_t vm_slot2gpa(struct vm_data *data, uint32_t slot)
+{
+ TEST_ASSERT(slot < data->nslots, "Too high slot number");
+
+ return MEM_GPA + slot * data->pages_per_slot * 4096;
+}
+
+static struct vm_data *alloc_vm(void)
+{
+ struct vm_data *data;
+
+ data = malloc(sizeof(*data));
+ TEST_ASSERT(data, "malloc(vmdata) failed");
+
+ data->vm = NULL;
+ data->hva_slots = NULL;
+
+ return data;
+}
+
+static bool prepare_vm(struct vm_data *data, int nslots, uint64_t *maxslots,
+ void *guest_code, uint64_t mempages,
+ struct timespec *slot_runtime)
+{
+ uint32_t max_mem_slots;
+ uint64_t rempages;
+ uint64_t guest_addr;
+ uint32_t slot;
+ struct timespec tstart;
+ struct sync_area *sync;
+
+ max_mem_slots = kvm_check_cap(KVM_CAP_NR_MEMSLOTS);
+ TEST_ASSERT(max_mem_slots > 1,
+ "KVM_CAP_NR_MEMSLOTS should be greater than 1");
+ TEST_ASSERT(nslots > 1 || nslots == -1,
+ "Slot count cap should be greater than 1");
+ if (nslots != -1)
+ max_mem_slots = min(max_mem_slots, (uint32_t)nslots);
+ pr_info_v("Allowed number of memory slots: %"PRIu32"\n", max_mem_slots);
+
+ TEST_ASSERT(mempages > 1,
+ "Can't test without any memory");
+
+ data->npages = mempages;
+ data->nslots = max_mem_slots - 1;
+ data->pages_per_slot = mempages / data->nslots;
+ if (!data->pages_per_slot) {
+ *maxslots = mempages + 1;
+ return false;
+ }
+
+ rempages = mempages % data->nslots;
+ data->hva_slots = malloc(sizeof(*data->hva_slots) * data->nslots);
+ TEST_ASSERT(data->hva_slots, "malloc() fail");
+
+ data->vm = vm_create_default(VCPU_ID, 1024, guest_code);
+
+ pr_info_v("Adding slots 1..%i, each slot with %"PRIu64" pages + %"PRIu64" extra pages last\n",
+ max_mem_slots - 1, data->pages_per_slot, rempages);
+
+ clock_gettime(CLOCK_MONOTONIC, &tstart);
+ for (slot = 1, guest_addr = MEM_GPA; slot < max_mem_slots; slot++) {
+ uint64_t npages;
+
+ npages = data->pages_per_slot;
+ if (slot == max_mem_slots - 1)
+ npages += rempages;
+
+ vm_userspace_mem_region_add(data->vm, VM_MEM_SRC_ANONYMOUS,
+ guest_addr, slot, npages,
+ 0);
+ guest_addr += npages * 4096;
+ }
+ *slot_runtime = timespec_elapsed(tstart);
+
+ for (slot = 0, guest_addr = MEM_GPA; slot < max_mem_slots - 1; slot++) {
+ uint64_t npages;
+ uint64_t gpa;
+
+ npages = data->pages_per_slot;
+ if (slot == max_mem_slots - 2)
+ npages += rempages;
+
+ gpa = vm_phy_pages_alloc(data->vm, npages, guest_addr,
+ slot + 1);
+ TEST_ASSERT(gpa == guest_addr,
+ "vm_phy_pages_alloc() failed\n");
+
+ data->hva_slots[slot] = addr_gpa2hva(data->vm, guest_addr);
+ memset(data->hva_slots[slot], 0, npages * 4096);
+
+ guest_addr += npages * 4096;
+ }
+
+ virt_map(data->vm, MEM_GPA, MEM_GPA, mempages, 0);
+
+ sync = (typeof(sync))vm_gpa2hva(data, MEM_SYNC_GPA, NULL);
+ atomic_init(&sync->start_flag, false);
+ atomic_init(&sync->exit_flag, false);
+ atomic_init(&sync->sync_flag, false);
+
+ data->mmio_ok = false;
+
+ return true;
+}
+
+static void launch_vm(struct vm_data *data)
+{
+ pr_info_v("Launching the test VM\n");
+
+ pthread_create(&data->vcpu_thread, NULL, vcpu_worker, data);
+
+ /* Ensure the guest thread is spun up. */
+ wait_for_vcpu();
+}
+
+static void free_vm(struct vm_data *data)
+{
+ kvm_vm_free(data->vm);
+ free(data->hva_slots);
+ free(data);
+}
+
+static void wait_guest_exit(struct vm_data *data)
+{
+ pthread_join(data->vcpu_thread, NULL);
+}
+
+static void let_guest_run(struct sync_area *sync)
+{
+ atomic_store_explicit(&sync->start_flag, true, memory_order_release);
+}
+
+static void guest_spin_until_start(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+
+ while (!atomic_load_explicit(&sync->start_flag, memory_order_acquire))
+ ;
+}
+
+static void make_guest_exit(struct sync_area *sync)
+{
+ atomic_store_explicit(&sync->exit_flag, true, memory_order_release);
+}
+
+static bool _guest_should_exit(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+
+ return atomic_load_explicit(&sync->exit_flag, memory_order_acquire);
+}
+
+#define guest_should_exit() unlikely(_guest_should_exit())
+
+/*
+ * noinline so we can easily see how much time the host spends waiting
+ * for the guest.
+ * For the same reason use alarm() instead of polling clock_gettime()
+ * to implement a wait timeout.
+ */
+static noinline void host_perform_sync(struct sync_area *sync)
+{
+ alarm(2);
+
+ atomic_store_explicit(&sync->sync_flag, true, memory_order_release);
+ while (atomic_load_explicit(&sync->sync_flag, memory_order_acquire))
+ ;
+
+ alarm(0);
+}
+
+static bool guest_perform_sync(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+ bool expected;
+
+ do {
+ if (guest_should_exit())
+ return false;
+
+ expected = true;
+ } while (!atomic_compare_exchange_weak_explicit(&sync->sync_flag,
+ &expected, false,
+ memory_order_acq_rel,
+ memory_order_relaxed));
+
+ return true;
+}
+
+static void guest_code_test_memslot_move(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+ uintptr_t base = (typeof(base))READ_ONCE(sync->move_area_ptr);
+
+ GUEST_SYNC(0);
+
+ guest_spin_until_start();
+
+ while (!guest_should_exit()) {
+ uintptr_t ptr;
+
+ for (ptr = base; ptr < base + MEM_TEST_MOVE_SIZE;
+ ptr += 4096)
+ *(uint64_t *)ptr = MEM_TEST_VAL_1;
+
+ /*
+ * No host sync here since the MMIO exits are so expensive
+ * that the host would spend most of its time waiting for
+ * the guest and so instead of measuring memslot move
+ * performance we would measure the performance and
+ * likelihood of MMIO exits
+ */
+ }
+
+ GUEST_DONE();
+}
+
+static void guest_code_test_memslot_map(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+
+ GUEST_SYNC(0);
+
+ guest_spin_until_start();
+
+ while (1) {
+ uintptr_t ptr;
+
+ for (ptr = MEM_TEST_GPA;
+ ptr < MEM_TEST_GPA + MEM_TEST_MAP_SIZE / 2; ptr += 4096)
+ *(uint64_t *)ptr = MEM_TEST_VAL_1;
+
+ if (!guest_perform_sync())
+ break;
+
+ for (ptr = MEM_TEST_GPA + MEM_TEST_MAP_SIZE / 2;
+ ptr < MEM_TEST_GPA + MEM_TEST_MAP_SIZE; ptr += 4096)
+ *(uint64_t *)ptr = MEM_TEST_VAL_2;
+
+ if (!guest_perform_sync())
+ break;
+ }
+
+ GUEST_DONE();
+}
+
+static void guest_code_test_memslot_unmap(void)
+{
+ struct sync_area *sync = (typeof(sync))MEM_SYNC_GPA;
+
+ GUEST_SYNC(0);
+
+ guest_spin_until_start();
+
+ while (1) {
+ uintptr_t ptr = MEM_TEST_GPA;
+
+ /*
+ * We can afford to access (map) just a small number of pages
+ * per host sync as otherwise the host will spend
+ * a significant amount of its time waiting for the guest
+ * (instead of doing unmap operations), so this will
+ * effectively turn this test into a map performance test.
+ *
+ * Just access a single page to be on the safe side.
+ */
+ *(uint64_t *)ptr = MEM_TEST_VAL_1;
+
+ if (!guest_perform_sync())
+ break;
+
+ ptr += MEM_TEST_UNMAP_SIZE / 2;
+ *(uint64_t *)ptr = MEM_TEST_VAL_2;
+
+ if (!guest_perform_sync())
+ break;
+ }
+
+ GUEST_DONE();
+}
+
+static void guest_code_test_memslot_rw(void)
+{
+ GUEST_SYNC(0);
+
+ guest_spin_until_start();
+
+ while (1) {
+ uintptr_t ptr;
+
+ for (ptr = MEM_TEST_GPA;
+ ptr < MEM_TEST_GPA + MEM_TEST_SIZE; ptr += 4096)
+ *(uint64_t *)ptr = MEM_TEST_VAL_1;
+
+ if (!guest_perform_sync())
+ break;
+
+ for (ptr = MEM_TEST_GPA + 4096 / 2;
+ ptr < MEM_TEST_GPA + MEM_TEST_SIZE; ptr += 4096) {
+ uint64_t val = *(uint64_t *)ptr;
+
+ GUEST_ASSERT_1(val == MEM_TEST_VAL_2, val);
+ *(uint64_t *)ptr = 0;
+ }
+
+ if (!guest_perform_sync())
+ break;
+ }
+
+ GUEST_DONE();
+}
+
+static bool test_memslot_move_prepare(struct vm_data *data,
+ struct sync_area *sync,
+ uint64_t *maxslots, bool isactive)
+{
+ uint64_t movesrcgpa, movetestgpa;
+
+ movesrcgpa = vm_slot2gpa(data, data->nslots - 1);
+
+ if (isactive) {
+ uint64_t lastpages;
+
+ vm_gpa2hva(data, movesrcgpa, &lastpages);
+ if (lastpages < MEM_TEST_MOVE_SIZE_PAGES / 2) {
+ *maxslots = 0;
+ return false;
+ }
+ }
+
+ movetestgpa = movesrcgpa - (MEM_TEST_MOVE_SIZE / (isactive ? 2 : 1));
+ sync->move_area_ptr = (void *)movetestgpa;
+
+ if (isactive) {
+ data->mmio_ok = true;
+ data->mmio_gpa_min = movesrcgpa;
+ data->mmio_gpa_max = movesrcgpa + MEM_TEST_MOVE_SIZE / 2 - 1;
+ }
+
+ return true;
+}
+
+static bool test_memslot_move_prepare_active(struct vm_data *data,
+ struct sync_area *sync,
+ uint64_t *maxslots)
+{
+ return test_memslot_move_prepare(data, sync, maxslots, true);
+}
+
+static bool test_memslot_move_prepare_inactive(struct vm_data *data,
+ struct sync_area *sync,
+ uint64_t *maxslots)
+{
+ return test_memslot_move_prepare(data, sync, maxslots, false);
+}
+
+static void test_memslot_move_loop(struct vm_data *data, struct sync_area *sync)
+{
+ uint64_t movesrcgpa;
+
+ movesrcgpa = vm_slot2gpa(data, data->nslots - 1);
+ vm_mem_region_move(data->vm, data->nslots - 1 + 1,
+ MEM_TEST_MOVE_GPA_DEST);
+ vm_mem_region_move(data->vm, data->nslots - 1 + 1, movesrcgpa);
+}
+
+static void test_memslot_do_unmap(struct vm_data *data,
+ uint64_t offsp, uint64_t count)
+{
+ uint64_t gpa, ctr;
+
+ for (gpa = MEM_TEST_GPA + offsp * 4096, ctr = 0; ctr < count; ) {
+ uint64_t npages;
+ void *hva;
+ int ret;
+
+ hva = vm_gpa2hva(data, gpa, &npages);
+ TEST_ASSERT(npages, "Empty memory slot at gptr 0x%"PRIx64, gpa);
+ npages = min(npages, count - ctr);
+ ret = madvise(hva, npages * 4096, MADV_DONTNEED);
+ TEST_ASSERT(!ret,
+ "madvise(%p, MADV_DONTNEED) on VM memory should not fail for gptr 0x%"PRIx64,
+ hva, gpa);
+ ctr += npages;
+ gpa += npages * 4096;
+ }
+ TEST_ASSERT(ctr == count,
+ "madvise(MADV_DONTNEED) should exactly cover all of the requested area");
+}
+
+static void test_memslot_map_unmap_check(struct vm_data *data,
+ uint64_t offsp, uint64_t valexp)
+{
+ uint64_t gpa;
+ uint64_t *val;
+
+ if (!map_unmap_verify)
+ return;
+
+ gpa = MEM_TEST_GPA + offsp * 4096;
+ val = (typeof(val))vm_gpa2hva(data, gpa, NULL);
+ TEST_ASSERT(*val == valexp,
+ "Guest written values should read back correctly before unmap (%"PRIu64" vs %"PRIu64" @ %"PRIx64")",
+ *val, valexp, gpa);
+ *val = 0;
+}
+
+static void test_memslot_map_loop(struct vm_data *data, struct sync_area *sync)
+{
+ /*
+ * Unmap the second half of the test area while guest writes to (maps)
+ * the first half.
+ */
+ test_memslot_do_unmap(data, MEM_TEST_MAP_SIZE_PAGES / 2,
+ MEM_TEST_MAP_SIZE_PAGES / 2);
+
+ /*
+ * Wait for the guest to finish writing the first half of the test
+ * area, verify the written value on the first and the last page of
+ * this area and then unmap it.
+ * Meanwhile, the guest is writing to (mapping) the second half of
+ * the test area.
+ */
+ host_perform_sync(sync);
+ test_memslot_map_unmap_check(data, 0, MEM_TEST_VAL_1);
+ test_memslot_map_unmap_check(data,
+ MEM_TEST_MAP_SIZE_PAGES / 2 - 1,
+ MEM_TEST_VAL_1);
+ test_memslot_do_unmap(data, 0, MEM_TEST_MAP_SIZE_PAGES / 2);
+
+
+ /*
+ * Wait for the guest to finish writing the second half of the test
+ * area and verify the written value on the first and the last page
+ * of this area.
+ * The area will be unmapped at the beginning of the next loop
+ * iteration.
+ * Meanwhile, the guest is writing to (mapping) the first half of
+ * the test area.
+ */
+ host_perform_sync(sync);
+ test_memslot_map_unmap_check(data, MEM_TEST_MAP_SIZE_PAGES / 2,
+ MEM_TEST_VAL_2);
+ test_memslot_map_unmap_check(data, MEM_TEST_MAP_SIZE_PAGES - 1,
+ MEM_TEST_VAL_2);
+}
+
+static void test_memslot_unmap_loop_common(struct vm_data *data,
+ struct sync_area *sync,
+ uint64_t chunk)
+{
+ uint64_t ctr;
+
+ /*
+ * Wait for the guest to finish mapping page(s) in the first half
+ * of the test area, verify the written value and then perform unmap
+ * of this area.
+ * Meanwhile, the guest is writing to (mapping) page(s) in the second
+ * half of the test area.
+ */
+ host_perform_sync(sync);
+ test_memslot_map_unmap_check(data, 0, MEM_TEST_VAL_1);
+ for (ctr = 0; ctr < MEM_TEST_UNMAP_SIZE_PAGES / 2; ctr += chunk)
+ test_memslot_do_unmap(data, ctr, chunk);
+
+ /* Likewise, but for the opposite host / guest areas */
+ host_perform_sync(sync);
+ test_memslot_map_unmap_check(data, MEM_TEST_UNMAP_SIZE_PAGES / 2,
+ MEM_TEST_VAL_2);
+ for (ctr = MEM_TEST_UNMAP_SIZE_PAGES / 2;
+ ctr < MEM_TEST_UNMAP_SIZE_PAGES; ctr += chunk)
+ test_memslot_do_unmap(data, ctr, chunk);
+}
+
+static void test_memslot_unmap_loop(struct vm_data *data,
+ struct sync_area *sync)
+{
+ test_memslot_unmap_loop_common(data, sync, 1);
+}
+
+static void test_memslot_unmap_loop_chunked(struct vm_data *data,
+ struct sync_area *sync)
+{
+ test_memslot_unmap_loop_common(data, sync, MEM_TEST_UNMAP_CHUNK_PAGES);
+}
+
+static void test_memslot_rw_loop(struct vm_data *data, struct sync_area *sync)
+{
+ uint64_t gptr;
+
+ for (gptr = MEM_TEST_GPA + 4096 / 2;
+ gptr < MEM_TEST_GPA + MEM_TEST_SIZE; gptr += 4096)
+ *(uint64_t *)vm_gpa2hva(data, gptr, NULL) = MEM_TEST_VAL_2;
+
+ host_perform_sync(sync);
+
+ for (gptr = MEM_TEST_GPA;
+ gptr < MEM_TEST_GPA + MEM_TEST_SIZE; gptr += 4096) {
+ uint64_t *vptr = (typeof(vptr))vm_gpa2hva(data, gptr, NULL);
+ uint64_t val = *vptr;
+
+ TEST_ASSERT(val == MEM_TEST_VAL_1,
+ "Guest written values should read back correctly (is %"PRIu64" @ %"PRIx64")",
+ val, gptr);
+ *vptr = 0;
+ }
+
+ host_perform_sync(sync);
+}
+
+struct test_data {
+ const char *name;
+ uint64_t mem_size;
+ void (*guest_code)(void);
+ bool (*prepare)(struct vm_data *data, struct sync_area *sync,
+ uint64_t *maxslots);
+ void (*loop)(struct vm_data *data, struct sync_area *sync);
+};
+
+static bool test_execute(int nslots, uint64_t *maxslots,
+ unsigned int maxtime,
+ const struct test_data *tdata,
+ uint64_t *nloops,
+ struct timespec *slot_runtime,
+ struct timespec *guest_runtime)
+{
+ uint64_t mem_size = tdata->mem_size ? : MEM_SIZE_PAGES;
+ struct vm_data *data;
+ struct sync_area *sync;
+ struct timespec tstart;
+ bool ret = true;
+
+ data = alloc_vm();
+ if (!prepare_vm(data, nslots, maxslots, tdata->guest_code,
+ mem_size, slot_runtime)) {
+ ret = false;
+ goto exit_free;
+ }
+
+ sync = (typeof(sync))vm_gpa2hva(data, MEM_SYNC_GPA, NULL);
+
+ if (tdata->prepare &&
+ !tdata->prepare(data, sync, maxslots)) {
+ ret = false;
+ goto exit_free;
+ }
+
+ launch_vm(data);
+
+ clock_gettime(CLOCK_MONOTONIC, &tstart);
+ let_guest_run(sync);
+
+ while (1) {
+ *guest_runtime = timespec_elapsed(tstart);
+ if (guest_runtime->tv_sec >= maxtime)
+ break;
+
+ tdata->loop(data, sync);
+
+ (*nloops)++;
+ }
+
+ make_guest_exit(sync);
+ wait_guest_exit(data);
+
+exit_free:
+ free_vm(data);
+
+ return ret;
+}
+
+static const struct test_data tests[] = {
+ {
+ .name = "map",
+ .mem_size = MEM_SIZE_MAP_PAGES,
+ .guest_code = guest_code_test_memslot_map,
+ .loop = test_memslot_map_loop,
+ },
+ {
+ .name = "unmap",
+ .mem_size = MEM_TEST_UNMAP_SIZE_PAGES + 1,
+ .guest_code = guest_code_test_memslot_unmap,
+ .loop = test_memslot_unmap_loop,
+ },
+ {
+ .name = "unmap chunked",
+ .mem_size = MEM_TEST_UNMAP_SIZE_PAGES + 1,
+ .guest_code = guest_code_test_memslot_unmap,
+ .loop = test_memslot_unmap_loop_chunked,
+ },
+ {
+ .name = "move active area",
+ .guest_code = guest_code_test_memslot_move,
+ .prepare = test_memslot_move_prepare_active,
+ .loop = test_memslot_move_loop,
+ },
+ {
+ .name = "move inactive area",
+ .guest_code = guest_code_test_memslot_move,
+ .prepare = test_memslot_move_prepare_inactive,
+ .loop = test_memslot_move_loop,
+ },
+ {
+ .name = "RW",
+ .guest_code = guest_code_test_memslot_rw,
+ .loop = test_memslot_rw_loop
+ },
+};
+
+#define NTESTS ARRAY_SIZE(tests)
+
+struct test_args {
+ int tfirst;
+ int tlast;
+ int nslots;
+ int seconds;
+ int runs;
+};
+
+static void help(char *name, struct test_args *targs)
+{
+ int ctr;
+
+ pr_info("usage: %s [-h] [-v] [-d] [-s slots] [-f first_test] [-e last_test] [-l test_length] [-r run_count]\n",
+ name);
+ pr_info(" -h: print this help screen.\n");
+ pr_info(" -v: enable verbose mode (not for benchmarking).\n");
+ pr_info(" -d: enable extra debug checks.\n");
+ pr_info(" -s: specify memslot count cap (-1 means no cap; currently: %i)\n",
+ targs->nslots);
+ pr_info(" -f: specify the first test to run (currently: %i; max %zu)\n",
+ targs->tfirst, NTESTS - 1);
+ pr_info(" -e: specify the last test to run (currently: %i; max %zu)\n",
+ targs->tlast, NTESTS - 1);
+ pr_info(" -l: specify the test length in seconds (currently: %i)\n",
+ targs->seconds);
+ pr_info(" -r: specify the number of runs per test (currently: %i)\n",
+ targs->runs);
+
+ pr_info("\nAvailable tests:\n");
+ for (ctr = 0; ctr < NTESTS; ctr++)
+ pr_info("%d: %s\n", ctr, tests[ctr].name);
+}
+
+static bool parse_args(int argc, char *argv[],
+ struct test_args *targs)
+{
+ int opt;
+
+ while ((opt = getopt(argc, argv, "hvds:f:e:l:r:")) != -1) {
+ switch (opt) {
+ case 'h':
+ default:
+ help(argv[0], targs);
+ return false;
+ case 'v':
+ verbose = true;
+ break;
+ case 'd':
+ map_unmap_verify = true;
+ break;
+ case 's':
+ targs->nslots = atoi(optarg);
+ if (targs->nslots <= 0 && targs->nslots != -1) {
+ pr_info("Slot count cap has to be positive or -1 for no cap\n");
+ return false;
+ }
+ break;
+ case 'f':
+ targs->tfirst = atoi(optarg);
+ if (targs->tfirst < 0) {
+ pr_info("First test to run has to be non-negative\n");
+ return false;
+ }
+ break;
+ case 'e':
+ targs->tlast = atoi(optarg);
+ if (targs->tlast < 0 || targs->tlast >= NTESTS) {
+ pr_info("Last test to run has to be non-negative and less than %zu\n",
+ NTESTS);
+ return false;
+ }
+ break;
+ case 'l':
+ targs->seconds = atoi(optarg);
+ if (targs->seconds < 0) {
+ pr_info("Test length in seconds has to be non-negative\n");
+ return false;
+ }
+ break;
+ case 'r':
+ targs->runs = atoi(optarg);
+ if (targs->runs <= 0) {
+ pr_info("Runs per test has to be positive\n");
+ return false;
+ }
+ break;
+ }
+ }
+
+ if (optind < argc) {
+ help(argv[0], targs);
+ return false;
+ }
+
+ if (targs->tfirst > targs->tlast) {
+ pr_info("First test to run cannot be greater than the last test to run\n");
+ return false;
+ }
+
+ return true;
+}
+
+struct test_result {
+ struct timespec slot_runtime, guest_runtime, iter_runtime;
+ int64_t slottimens, runtimens;
+ uint64_t nloops;
+};
+
+static bool test_loop(const struct test_data *data,
+ const struct test_args *targs,
+ struct test_result *rbestslottime,
+ struct test_result *rbestruntime)
+{
+ uint64_t maxslots;
+ struct test_result result;
+
+ result.nloops = 0;
+ if (!test_execute(targs->nslots, &maxslots, targs->seconds, data,
+ &result.nloops,
+ &result.slot_runtime, &result.guest_runtime)) {
+ if (maxslots)
+ pr_info("Memslot count too high for this test, decrease the cap (max is %"PRIu64")\n",
+ maxslots);
+ else
+ pr_info("Memslot count may be too high for this test, try adjusting the cap\n");
+
+ return false;
+ }
+
+ pr_info("Test took %ld.%.9lds for slot setup + %ld.%.9lds all iterations\n",
+ result.slot_runtime.tv_sec, result.slot_runtime.tv_nsec,
+ result.guest_runtime.tv_sec, result.guest_runtime.tv_nsec);
+ if (!result.nloops) {
+ pr_info("No full loops done - too short test time or system too loaded?\n");
+ return true;
+ }
+
+ result.iter_runtime = timespec_div(result.guest_runtime,
+ result.nloops);
+ pr_info("Done %"PRIu64" iterations, avg %ld.%.9lds each\n",
+ result.nloops,
+ result.iter_runtime.tv_sec,
+ result.iter_runtime.tv_nsec);
+ result.slottimens = timespec_to_ns(result.slot_runtime);
+ result.runtimens = timespec_to_ns(result.iter_runtime);
+
+ /*
+ * Only rank the slot setup time for tests using the whole test memory
+ * area so they are comparable
+ */
+ if (!data->mem_size &&
+ (!rbestslottime->slottimens ||
+ result.slottimens < rbestslottime->slottimens))
+ *rbestslottime = result;
+ if (!rbestruntime->runtimens ||
+ result.runtimens < rbestruntime->runtimens)
+ *rbestruntime = result;
+
+ return true;
+}
+
+int main(int argc, char *argv[])
+{
+ struct test_args targs = {
+ .tfirst = 0,
+ .tlast = NTESTS - 1,
+ .nslots = -1,
+ .seconds = 5,
+ .runs = 1,
+ };
+ struct test_result rbestslottime;
+ int tctr;
+
+ /* Tell stdout not to buffer its content */
+ setbuf(stdout, NULL);
+
+ if (!parse_args(argc, argv, &targs))
+ return -1;
+
+ rbestslottime.slottimens = 0;
+ for (tctr = targs.tfirst; tctr <= targs.tlast; tctr++) {
+ const struct test_data *data = &tests[tctr];
+ unsigned int runctr;
+ struct test_result rbestruntime;
+
+ if (tctr > targs.tfirst)
+ pr_info("\n");
+
+ pr_info("Testing %s performance with %i runs, %d seconds each\n",
+ data->name, targs.runs, targs.seconds);
+
+ rbestruntime.runtimens = 0;
+ for (runctr = 0; runctr < targs.runs; runctr++)
+ if (!test_loop(data, &targs,
+ &rbestslottime, &rbestruntime))
+ break;
+
+ if (rbestruntime.runtimens)
+ pr_info("Best runtime result was %ld.%.9lds per iteration (with %"PRIu64" iterations)\n",
+ rbestruntime.iter_runtime.tv_sec,
+ rbestruntime.iter_runtime.tv_nsec,
+ rbestruntime.nloops);
+ }
+
+ if (rbestslottime.slottimens)
+ pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n",
+ rbestslottime.slot_runtime.tv_sec,
+ rbestslottime.slot_runtime.tv_nsec);
+
+ return 0;
+}
u32 function;
u32 index;
} mangled_cpuids[] = {
+ /*
+ * These entries depend on the vCPU's XCR0 register and IA32_XSS MSR,
+ * which are not controlled for by this test.
+ */
{.function = 0xd, .index = 0},
+ {.function = 0xd, .index = 1},
};
static void test_guest_cpuids(struct kvm_cpuid2 *guest_cpuid)
int old_res, res, kvm_fd, r;
struct kvm_msr_list *list;
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
old_res = kvm_num_index_msrs(kvm_fd, 0);
TEST_ASSERT(old_res != 0, "Expecting nmsrs to be > 0");
int res, old_res, i, kvm_fd;
struct kvm_msr_list *feature_list;
- kvm_fd = open(KVM_DEV_PATH, O_RDONLY);
- if (kvm_fd < 0)
- exit(KSFT_SKIP);
+ kvm_fd = open_kvm_dev_path_or_exit();
old_res = kvm_num_feature_msrs(kvm_fd, 0);
TEST_ASSERT(old_res != 0, "Expecting nmsrs to be > 0");
local stat_ackrx_now_l=$(get_mib_counter "${listener_ns}" "MPTcpExtMPCapableACKRX")
local stat_cookietx_now=$(get_mib_counter "${listener_ns}" "TcpExtSyncookiesSent")
local stat_cookierx_now=$(get_mib_counter "${listener_ns}" "TcpExtSyncookiesRecv")
+ local stat_ooo_now=$(get_mib_counter "${listener_ns}" "TcpExtTCPOFOQueue")
expect_synrx=$((stat_synrx_last_l))
expect_ackrx=$((stat_ackrx_last_l))
"${stat_synrx_now_l}" "${expect_synrx}" 1>&2
retc=1
fi
- if [ ${stat_ackrx_now_l} -lt ${expect_ackrx} ]; then
- printf "[ FAIL ] lower MPC ACK rx (%d) than expected (%d)\n" \
- "${stat_ackrx_now_l}" "${expect_ackrx}" 1>&2
- rets=1
+ if [ ${stat_ackrx_now_l} -lt ${expect_ackrx} -a ${stat_ooo_now} -eq 0 ]; then
+ if [ ${stat_ooo_now} -eq 0 ]; then
+ printf "[ FAIL ] lower MPC ACK rx (%d) than expected (%d)\n" \
+ "${stat_ackrx_now_l}" "${expect_ackrx}" 1>&2
+ rets=1
+ else
+ printf "[ Note ] fallback due to TCP OoO"
+ fi
fi
if [ $retc -eq 0 ] && [ $rets -eq 0 ]; then
/proc-self-map-files-002
/proc-self-syscall
/proc-self-wchan
+/proc-subset-pid
/proc-uptime-001
/proc-uptime-002
/read
"setup": [
"$IP link add dev $DUMMY type dummy || /bin/true"
],
- "cmdUnderTest": "$TC qdisc add dev $DUMMY root fq_pie flows 65536",
- "expExitCode": "2",
+ "cmdUnderTest": "$TC qdisc add dev $DUMMY handle 1: root fq_pie flows 65536",
+ "expExitCode": "0",
"verifyCmd": "$TC qdisc show dev $DUMMY",
- "matchPattern": "qdisc",
- "matchCount": "0",
+ "matchPattern": "qdisc fq_pie 1: root refcnt 2 limit 10240p flows 65536",
+ "matchCount": "1",
"teardown": [
"$IP link del dev $DUMMY"
]
ip1 -4 route add default dev wg0 table 51820
ip1 -4 rule add not fwmark 51820 table 51820
ip1 -4 rule add table main suppress_prefixlength 0
+n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/vethc/rp_filter'
# Flood the pings instead of sending just one, to trigger routing table reference counting bugs.
n1 ping -W 1 -c 100 -f 192.168.99.7
n1 ping -W 1 -c 100 -f abab::1111
CONFIG_NETFILTER_XT_NAT=y
CONFIG_NETFILTER_XT_MATCH_LENGTH=y
CONFIG_NETFILTER_XT_MARK=y
-CONFIG_NF_CONNTRACK_IPV4=y
CONFIG_NF_NAT_IPV4=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_FILTER=y
{
return kvm_make_all_cpus_request_except(kvm, req, NULL);
}
+EXPORT_SYMBOL_GPL(kvm_make_all_cpus_request);
#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
void kvm_flush_remote_tlbs(struct kvm *kvm)
goto out;
if (signal_pending(current))
goto out;
+ if (kvm_check_request(KVM_REQ_UNBLOCK, vcpu))
+ goto out;
ret = 0;
out:
goto out;
}
poll_end = cur = ktime_get();
- } while (single_task_running() && !need_resched() &&
- ktime_before(cur, stop));
+ } while (kvm_vcpu_can_poll(cur, stop));
}
prepare_to_rcuwait(&vcpu->wait);
if (prod->add_consumer)
ret = prod->add_consumer(prod, cons);
- if (ret)
- goto err_add_consumer;
-
- ret = cons->add_producer(cons, prod);
- if (ret)
- goto err_add_producer;
+ if (!ret) {
+ ret = cons->add_producer(cons, prod);
+ if (ret && prod->del_consumer)
+ prod->del_consumer(prod, cons);
+ }
if (cons->start)
cons->start(cons);
if (prod->start)
prod->start(prod);
-err_add_producer:
- if (prod->del_consumer)
- prod->del_consumer(prod, cons);
-err_add_consumer:
+
return ret;
}