Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
Javi Merino <javi.merino@kernel.org> <javi.merino@arm.com>
<javier@osg.samsung.com> <javier.martinez@collabora.co.uk>
+Jayachandran C <c.jayachandran@gmail.com> <jayachandranc@netlogicmicro.com>
+Jayachandran C <c.jayachandran@gmail.com> <jchandra@broadcom.com>
+Jayachandran C <c.jayachandran@gmail.com> <jchandra@digeo.com>
+Jayachandran C <c.jayachandran@gmail.com> <jnair@caviumnetworks.com>
Jean Tourrilhes <jt@hpl.hp.com>
<jean-philippe@linaro.org> <jean-philippe.brucker@arm.com>
Jeff Garzik <jgarzik@pretzel.yyz.us>
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
+ /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+ /sys/devices/system/cpu/vulnerabilities/itlb_multihit
Date: January 2018
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Information about CPU vulnerabilities
spectre
l1tf
mds
+ tsx_async_abort
+ multihit.rst
--- /dev/null
+iTLB multihit
+=============
+
+iTLB multihit is an erratum where some processors may incur a machine check
+error, possibly resulting in an unrecoverable CPU lockup, when an
+instruction fetch hits multiple entries in the instruction TLB. This can
+occur when the page size is changed along with either the physical address
+or cache type. A malicious guest running on a virtualized system can
+exploit this erratum to perform a denial of service attack.
+
+
+Affected processors
+-------------------
+
+Variations of this erratum are present on most Intel Core and Xeon processor
+models. The erratum is not present on:
+
+ - non-Intel processors
+
+ - Some Atoms (Airmont, Bonnell, Goldmont, GoldmontPlus, Saltwell, Silvermont)
+
+ - Intel processors that have the PSCHANGE_MC_NO bit set in the
+ IA32_ARCH_CAPABILITIES MSR.
+
+
+Related CVEs
+------------
+
+The following CVE entry is related to this issue:
+
+ ============== =================================================
+ CVE-2018-12207 Machine Check Error Avoidance on Page Size Change
+ ============== =================================================
+
+
+Problem
+-------
+
+Privileged software, including OS and virtual machine managers (VMM), are in
+charge of memory management. A key component in memory management is the control
+of the page tables. Modern processors use virtual memory, a technique that creates
+the illusion of a very large memory for processors. This virtual space is split
+into pages of a given size. Page tables translate virtual addresses to physical
+addresses.
+
+To reduce latency when performing a virtual to physical address translation,
+processors include a structure, called TLB, that caches recent translations.
+There are separate TLBs for instruction (iTLB) and data (dTLB).
+
+Under this errata, instructions are fetched from a linear address translated
+using a 4 KB translation cached in the iTLB. Privileged software modifies the
+paging structure so that the same linear address using large page size (2 MB, 4
+MB, 1 GB) with a different physical address or memory type. After the page
+structure modification but before the software invalidates any iTLB entries for
+the linear address, a code fetch that happens on the same linear address may
+cause a machine-check error which can result in a system hang or shutdown.
+
+
+Attack scenarios
+----------------
+
+Attacks against the iTLB multihit erratum can be mounted from malicious
+guests in a virtualized system.
+
+
+iTLB multihit system information
+--------------------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current iTLB
+multihit status of the system:whether the system is vulnerable and which
+mitigations are active. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/itlb_multihit
+
+The possible values in this file are:
+
+.. list-table::
+
+ * - Not affected
+ - The processor is not vulnerable.
+ * - KVM: Mitigation: Split huge pages
+ - Software changes mitigate this issue.
+ * - KVM: Vulnerable
+ - The processor is vulnerable, but no mitigation enabled
+
+
+Enumeration of the erratum
+--------------------------------
+
+A new bit has been allocated in the IA32_ARCH_CAPABILITIES (PSCHANGE_MC_NO) msr
+and will be set on CPU's which are mitigated against this issue.
+
+ ======================================= =========== ===============================
+ IA32_ARCH_CAPABILITIES MSR Not present Possibly vulnerable,check model
+ IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '0' Likely vulnerable,check model
+ IA32_ARCH_CAPABILITIES[PSCHANGE_MC_NO] '1' Not vulnerable
+ ======================================= =========== ===============================
+
+
+Mitigation mechanism
+-------------------------
+
+This erratum can be mitigated by restricting the use of large page sizes to
+non-executable pages. This forces all iTLB entries to be 4K, and removes
+the possibility of multiple hits.
+
+In order to mitigate the vulnerability, KVM initially marks all huge pages
+as non-executable. If the guest attempts to execute in one of those pages,
+the page is broken down into 4K pages, which are then marked executable.
+
+If EPT is disabled or not available on the host, KVM is in control of TLB
+flushes and the problematic situation cannot happen. However, the shadow
+EPT paging mechanism used by nested virtualization is vulnerable, because
+the nested guest can trigger multiple iTLB hits by modifying its own
+(non-nested) page tables. For simplicity, KVM will make large pages
+non-executable in all shadow paging modes.
+
+Mitigation control on the kernel command line and KVM - module parameter
+------------------------------------------------------------------------
+
+The KVM hypervisor mitigation mechanism for marking huge pages as
+non-executable can be controlled with a module parameter "nx_huge_pages=".
+The kernel command line allows to control the iTLB multihit mitigations at
+boot time with the option "kvm.nx_huge_pages=".
+
+The valid arguments for these options are:
+
+ ========== ================================================================
+ force Mitigation is enabled. In this case, the mitigation implements
+ non-executable huge pages in Linux kernel KVM module. All huge
+ pages in the EPT are marked as non-executable.
+ If a guest attempts to execute in one of those pages, the page is
+ broken down into 4K pages, which are then marked executable.
+
+ off Mitigation is disabled.
+
+ auto Enable mitigation only if the platform is affected and the kernel
+ was not booted with the "mitigations=off" command line parameter.
+ This is the default option.
+ ========== ================================================================
+
+
+Mitigation selection guide
+--------------------------
+
+1. No virtualization in use
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ The system is protected by the kernel unconditionally and no further
+ action is required.
+
+2. Virtualization with trusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ If the guest comes from a trusted source, you may assume that the guest will
+ not attempt to maliciously exploit these errata and no further action is
+ required.
+
+3. Virtualization with untrusted guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+ If the guest comes from an untrusted source, the guest host kernel will need
+ to apply iTLB multihit mitigation via the kernel command line or kvm
+ module parameter.
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+TAA - TSX Asynchronous Abort
+======================================
+
+TAA is a hardware vulnerability that allows unprivileged speculative access to
+data which is available in various CPU internal buffers by using asynchronous
+aborts within an Intel TSX transactional region.
+
+Affected processors
+-------------------
+
+This vulnerability only affects Intel processors that support Intel
+Transactional Synchronization Extensions (TSX) when the TAA_NO bit (bit 8)
+is 0 in the IA32_ARCH_CAPABILITIES MSR. On processors where the MDS_NO bit
+(bit 5) is 0 in the IA32_ARCH_CAPABILITIES MSR, the existing MDS mitigations
+also mitigate against TAA.
+
+Whether a processor is affected or not can be read out from the TAA
+vulnerability file in sysfs. See :ref:`tsx_async_abort_sys_info`.
+
+Related CVEs
+------------
+
+The following CVE entry is related to this TAA issue:
+
+ ============== ===== ===================================================
+ CVE-2019-11135 TAA TSX Asynchronous Abort (TAA) condition on some
+ microprocessors utilizing speculative execution may
+ allow an authenticated user to potentially enable
+ information disclosure via a side channel with
+ local access.
+ ============== ===== ===================================================
+
+Problem
+-------
+
+When performing store, load or L1 refill operations, processors write
+data into temporary microarchitectural structures (buffers). The data in
+those buffers can be forwarded to load operations as an optimization.
+
+Intel TSX is an extension to the x86 instruction set architecture that adds
+hardware transactional memory support to improve performance of multi-threaded
+software. TSX lets the processor expose and exploit concurrency hidden in an
+application due to dynamically avoiding unnecessary synchronization.
+
+TSX supports atomic memory transactions that are either committed (success) or
+aborted. During an abort, operations that happened within the transactional region
+are rolled back. An asynchronous abort takes place, among other options, when a
+different thread accesses a cache line that is also used within the transactional
+region when that access might lead to a data race.
+
+Immediately after an uncompleted asynchronous abort, certain speculatively
+executed loads may read data from those internal buffers and pass it to dependent
+operations. This can be then used to infer the value via a cache side channel
+attack.
+
+Because the buffers are potentially shared between Hyper-Threads cross
+Hyper-Thread attacks are possible.
+
+The victim of a malicious actor does not need to make use of TSX. Only the
+attacker needs to begin a TSX transaction and raise an asynchronous abort
+which in turn potenitally leaks data stored in the buffers.
+
+More detailed technical information is available in the TAA specific x86
+architecture section: :ref:`Documentation/x86/tsx_async_abort.rst <tsx_async_abort>`.
+
+
+Attack scenarios
+----------------
+
+Attacks against the TAA vulnerability can be implemented from unprivileged
+applications running on hosts or guests.
+
+As for MDS, the attacker has no control over the memory addresses that can
+be leaked. Only the victim is responsible for bringing data to the CPU. As
+a result, the malicious actor has to sample as much data as possible and
+then postprocess it to try to infer any useful information from it.
+
+A potential attacker only has read access to the data. Also, there is no direct
+privilege escalation by using this technique.
+
+
+.. _tsx_async_abort_sys_info:
+
+TAA system information
+-----------------------
+
+The Linux kernel provides a sysfs interface to enumerate the current TAA status
+of mitigated systems. The relevant sysfs file is:
+
+/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
+
+The possible values in this file are:
+
+.. list-table::
+
+ * - 'Vulnerable'
+ - The CPU is affected by this vulnerability and the microcode and kernel mitigation are not applied.
+ * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
+ - The system tries to clear the buffers but the microcode might not support the operation.
+ * - 'Mitigation: Clear CPU buffers'
+ - The microcode has been updated to clear the buffers. TSX is still enabled.
+ * - 'Mitigation: TSX disabled'
+ - TSX is disabled.
+ * - 'Not affected'
+ - The CPU is not affected by this issue.
+
+.. _ucode_needed:
+
+Best effort mitigation mode
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If the processor is vulnerable, but the availability of the microcode-based
+mitigation mechanism is not advertised via CPUID the kernel selects a best
+effort mitigation mode. This mode invokes the mitigation instructions
+without a guarantee that they clear the CPU buffers.
+
+This is done to address virtualization scenarios where the host has the
+microcode update applied, but the hypervisor is not yet updated to expose the
+CPUID to the guest. If the host has updated microcode the protection takes
+effect; otherwise a few CPU cycles are wasted pointlessly.
+
+The state in the tsx_async_abort sysfs file reflects this situation
+accordingly.
+
+
+Mitigation mechanism
+--------------------
+
+The kernel detects the affected CPUs and the presence of the microcode which is
+required. If a CPU is affected and the microcode is available, then the kernel
+enables the mitigation by default.
+
+
+The mitigation can be controlled at boot time via a kernel command line option.
+See :ref:`taa_mitigation_control_command_line`.
+
+.. _virt_mechanism:
+
+Virtualization mitigation
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Affected systems where the host has TAA microcode and TAA is mitigated by
+having disabled TSX previously, are not vulnerable regardless of the status
+of the VMs.
+
+In all other cases, if the host either does not have the TAA microcode or
+the kernel is not mitigated, the system might be vulnerable.
+
+
+.. _taa_mitigation_control_command_line:
+
+Mitigation control on the kernel command line
+---------------------------------------------
+
+The kernel command line allows to control the TAA mitigations at boot time with
+the option "tsx_async_abort=". The valid arguments for this option are:
+
+ ============ =============================================================
+ off This option disables the TAA mitigation on affected platforms.
+ If the system has TSX enabled (see next parameter) and the CPU
+ is affected, the system is vulnerable.
+
+ full TAA mitigation is enabled. If TSX is enabled, on an affected
+ system it will clear CPU buffers on ring transitions. On
+ systems which are MDS-affected and deploy MDS mitigation,
+ TAA is also mitigated. Specifying this option on those
+ systems will have no effect.
+
+ full,nosmt The same as tsx_async_abort=full, with SMT disabled on
+ vulnerable CPUs that have TSX enabled. This is the complete
+ mitigation. When TSX is disabled, SMT is not disabled because
+ CPU is not vulnerable to cross-thread TAA attacks.
+ ============ =============================================================
+
+Not specifying this option is equivalent to "tsx_async_abort=full".
+
+The kernel command line also allows to control the TSX feature using the
+parameter "tsx=" on CPUs which support TSX control. MSR_IA32_TSX_CTRL is used
+to control the TSX feature and the enumeration of the TSX feature bits (RTM
+and HLE) in CPUID.
+
+The valid options are:
+
+ ============ =============================================================
+ off Disables TSX on the system.
+
+ Note that this option takes effect only on newer CPUs which are
+ not vulnerable to MDS, i.e., have MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1
+ and which get the new IA32_TSX_CTRL MSR through a microcode
+ update. This new MSR allows for the reliable deactivation of
+ the TSX functionality.
+
+ on Enables TSX.
+
+ Although there are mitigations for all known security
+ vulnerabilities, TSX has been known to be an accelerator for
+ several previous speculation-related CVEs, and so there may be
+ unknown security risks associated with leaving it enabled.
+
+ auto Disables TSX if X86_BUG_TAA is present, otherwise enables TSX
+ on the system.
+ ============ =============================================================
+
+Not specifying this option is equivalent to "tsx=off".
+
+The following combinations of the "tsx_async_abort" and "tsx" are possible. For
+affected platforms tsx=auto is equivalent to tsx=off and the result will be:
+
+ ========= ========================== =========================================
+ tsx=on tsx_async_abort=full The system will use VERW to clear CPU
+ buffers. Cross-thread attacks are still
+ possible on SMT machines.
+ tsx=on tsx_async_abort=full,nosmt As above, cross-thread attacks on SMT
+ mitigated.
+ tsx=on tsx_async_abort=off The system is vulnerable.
+ tsx=off tsx_async_abort=full TSX might be disabled if microcode
+ provides a TSX control MSR. If so,
+ system is not vulnerable.
+ tsx=off tsx_async_abort=full,nosmt Ditto
+ tsx=off tsx_async_abort=off ditto
+ ========= ========================== =========================================
+
+
+For unaffected platforms "tsx=on" and "tsx_async_abort=full" does not clear CPU
+buffers. For platforms without TSX control (MSR_IA32_ARCH_CAPABILITIES.MDS_NO=0)
+"tsx" command line argument has no effect.
+
+For the affected platforms below table indicates the mitigation status for the
+combinations of CPUID bit MD_CLEAR and IA32_ARCH_CAPABILITIES MSR bits MDS_NO
+and TSX_CTRL_MSR.
+
+ ======= ========= ============= ========================================
+ MDS_NO MD_CLEAR TSX_CTRL_MSR Status
+ ======= ========= ============= ========================================
+ 0 0 0 Vulnerable (needs microcode)
+ 0 1 0 MDS and TAA mitigated via VERW
+ 1 1 0 MDS fixed, TAA vulnerable if TSX enabled
+ because MD_CLEAR has no meaning and
+ VERW is not guaranteed to clear buffers
+ 1 X 1 MDS fixed, TAA can be mitigated by
+ VERW or TSX_CTRL_MSR
+ ======= ========= ============= ========================================
+
+Mitigation selection guide
+--------------------------
+
+1. Trusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If all user space applications are from a trusted source and do not execute
+untrusted code which is supplied externally, then the mitigation can be
+disabled. The same applies to virtualized environments with trusted guests.
+
+
+2. Untrusted userspace and guests
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If there are untrusted applications or guests on the system, enabling TSX
+might allow a malicious actor to leak data from the host or from other
+processes running on the same physical core.
+
+If the microcode is available and the TSX is disabled on the host, attacks
+are prevented in a virtualized environment as well, even if the VMs do not
+explicitly enable the mitigation.
+
+
+.. _taa_default_mitigations:
+
+Default mitigations
+-------------------
+
+The kernel's default action for vulnerable processors is:
+
+ - Deploy TSX disable mitigation (tsx_async_abort=full tsx=off).
KVM MMU at runtime.
Default is 0 (off)
+ kvm.nx_huge_pages=
+ [KVM] Controls the software workaround for the
+ X86_BUG_ITLB_MULTIHIT bug.
+ force : Always deploy workaround.
+ off : Never deploy workaround.
+ auto : Deploy workaround based on the presence of
+ X86_BUG_ITLB_MULTIHIT.
+
+ Default is 'auto'.
+
+ If the software workaround is enabled for the host,
+ guests do need not to enable it for nested guests.
+
+ kvm.nx_huge_pages_recovery_ratio=
+ [KVM] Controls how many 4KiB pages are periodically zapped
+ back to huge pages. 0 disables the recovery, otherwise if
+ the value is N KVM will zap 1/Nth of the 4KiB pages every
+ minute. The default is 60.
+
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
Default is 1 (enabled)
ssbd=force-off [ARM64]
l1tf=off [X86]
mds=off [X86]
+ tsx_async_abort=off [X86]
+ kvm.nx_huge_pages=off [X86]
+
+ Exceptions:
+ This does not have any effect on
+ kvm.nx_huge_pages when
+ kvm.nx_huge_pages=force.
auto (default)
Mitigate all CPU vulnerabilities, but leave SMT
be fully mitigated, even if it means losing SMT.
Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86]
+ tsx_async_abort=full,nosmt [X86]
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
interruptions from clocksource watchdog are not
acceptable).
+ tsx= [X86] Control Transactional Synchronization
+ Extensions (TSX) feature in Intel processors that
+ support TSX control.
+
+ This parameter controls the TSX feature. The options are:
+
+ on - Enable TSX on the system. Although there are
+ mitigations for all known security vulnerabilities,
+ TSX has been known to be an accelerator for
+ several previous speculation-related CVEs, and
+ so there may be unknown security risks associated
+ with leaving it enabled.
+
+ off - Disable TSX on the system. (Note that this
+ option takes effect only on newer CPUs which are
+ not vulnerable to MDS, i.e., have
+ MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get
+ the new IA32_TSX_CTRL MSR through a microcode
+ update. This new MSR allows for the reliable
+ deactivation of the TSX functionality.)
+
+ auto - Disable TSX if X86_BUG_TAA is present,
+ otherwise enable TSX on the system.
+
+ Not specifying this option is equivalent to tsx=off.
+
+ See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+ for more details.
+
+ tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
+ Abort (TAA) vulnerability.
+
+ Similar to Micro-architectural Data Sampling (MDS)
+ certain CPUs that support Transactional
+ Synchronization Extensions (TSX) are vulnerable to an
+ exploit against CPU internal buffers which can forward
+ information to a disclosure gadget under certain
+ conditions.
+
+ In vulnerable processors, the speculatively forwarded
+ data can be used in a cache side channel attack, to
+ access data to which the attacker does not have direct
+ access.
+
+ This parameter controls the TAA mitigation. The
+ options are:
+
+ full - Enable TAA mitigation on vulnerable CPUs
+ if TSX is enabled.
+
+ full,nosmt - Enable TAA mitigation and disable SMT on
+ vulnerable CPUs. If TSX is disabled, SMT
+ is not disabled because CPU is not
+ vulnerable to cross-thread TAA attacks.
+ off - Unconditionally disable TAA mitigation
+
+ Not specifying this option is equivalent to
+ tsx_async_abort=full. On CPUs which are MDS affected
+ and deploy MDS mitigation, TAA mitigation is not
+ required and doesn't provide any additional
+ mitigation.
+
+ For details see:
+ Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
+
turbografx.map[2|3]= [HW,JOY]
TurboGraFX parallel port interface
Format:
encryption.
* ``tx_tls_ooo`` - number of TX packets which were part of a TLS stream
but did not arrive in the expected order.
+ * ``tx_tls_skip_no_sync_data`` - number of TX packets which were part of
+ a TLS stream and arrived out-of-order, but skipped the HW offload routine
+ and went to the regular transmit flow as they were retransmissions of the
+ connection handshake.
* ``tx_tls_drop_no_sync_data`` - number of TX packets which were part of
a TLS stream dropped, because they arrived out of order and associated
record could not be found.
mds
microcode
resctrl_ui
+ tsx_async_abort
usb-legacy-support
i386/index
x86_64/index
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+TSX Async Abort (TAA) mitigation
+================================
+
+.. _tsx_async_abort:
+
+Overview
+--------
+
+TSX Async Abort (TAA) is a side channel attack on internal buffers in some
+Intel processors similar to Microachitectural Data Sampling (MDS). In this
+case certain loads may speculatively pass invalid data to dependent operations
+when an asynchronous abort condition is pending in a Transactional
+Synchronization Extensions (TSX) transaction. This includes loads with no
+fault or assist condition. Such loads may speculatively expose stale data from
+the same uarch data structures as in MDS, with same scope of exposure i.e.
+same-thread and cross-thread. This issue affects all current processors that
+support TSX.
+
+Mitigation strategy
+-------------------
+
+a) TSX disable - one of the mitigations is to disable TSX. A new MSR
+IA32_TSX_CTRL will be available in future and current processors after
+microcode update which can be used to disable TSX. In addition, it
+controls the enumeration of the TSX feature bits (RTM and HLE) in CPUID.
+
+b) Clear CPU buffers - similar to MDS, clearing the CPU buffers mitigates this
+vulnerability. More details on this approach can be found in
+:ref:`Documentation/admin-guide/hw-vuln/mds.rst <mds>`.
+
+Kernel internal mitigation modes
+--------------------------------
+
+ ============= ============================================================
+ off Mitigation is disabled. Either the CPU is not affected or
+ tsx_async_abort=off is supplied on the kernel command line.
+
+ tsx disabled Mitigation is enabled. TSX feature is disabled by default at
+ bootup on processors that support TSX control.
+
+ verw Mitigation is enabled. CPU is affected and MD_CLEAR is
+ advertised in CPUID.
+
+ ucode needed Mitigation is enabled. CPU is affected and MD_CLEAR is not
+ advertised in CPUID. That is mainly for virtualization
+ scenarios where the host has the updated microcode but the
+ hypervisor does not expose MD_CLEAR in CPUID. It's a best
+ effort approach without guarantee.
+ ============= ============================================================
+
+If the CPU is affected and the "tsx_async_abort" kernel command line parameter is
+not provided then the kernel selects an appropriate mitigation depending on the
+status of RTM and MD_CLEAR CPUID bits.
+
+Below tables indicate the impact of tsx=on|off|auto cmdline options on state of
+TAA mitigation, VERW behavior and TSX feature for various combinations of
+MSR_IA32_ARCH_CAPABILITIES bits.
+
+1. "tsx=off"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=off
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Disabled Yes TSX disabled TSX disabled
+ 1 X 1 Disabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+2. "tsx=on"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=on
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Enabled Yes None Same as MDS
+ 1 X 1 Enabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+3. "tsx=auto"
+
+========= ========= ============ ============ ============== =================== ======================
+MSR_IA32_ARCH_CAPABILITIES bits Result with cmdline tsx=auto
+---------------------------------- -------------------------------------------------------------------------
+TAA_NO MDS_NO TSX_CTRL_MSR TSX state VERW can clear TAA mitigation TAA mitigation
+ after bootup CPU buffers tsx_async_abort=off tsx_async_abort=full
+========= ========= ============ ============ ============== =================== ======================
+ 0 0 0 HW default Yes Same as MDS Same as MDS
+ 0 0 1 Invalid case Invalid case Invalid case Invalid case
+ 0 1 0 HW default No Need ucode update Need ucode update
+ 0 1 1 Disabled Yes TSX disabled TSX disabled
+ 1 X 1 Enabled X None needed None needed
+========= ========= ============ ============ ============== =================== ======================
+
+In the tables, TSX_CTRL_MSR is a new bit in MSR_IA32_ARCH_CAPABILITIES that
+indicates whether MSR_IA32_TSX_CTRL is supported.
+
+There are two control bits in IA32_TSX_CTRL MSR:
+
+ Bit 0: When set it disables the Restricted Transactional Memory (RTM)
+ sub-feature of TSX (will force all transactions to abort on the
+ XBEGIN instruction).
+
+ Bit 1: When set it disables the enumeration of the RTM and HLE feature
+ (i.e. it will make CPUID(EAX=7).EBX{bit4} and
+ CPUID(EAX=7).EBX{bit11} read as 0).
R: Martin KaFai Lau <kafai@fb.com>
R: Song Liu <songliubraving@fb.com>
R: Yonghong Song <yhs@fb.com>
+R: Andrii Nakryiko <andriin@fb.com>
L: netdev@vger.kernel.org
L: bpf@vger.kernel.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
CAVIUM THUNDERX2 ARM64 SOC
M: Robert Richter <rrichter@cavium.com>
-M: Jayachandran C <jnair@caviumnetworks.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: arch/arm64/boot/dts/cavium/thunder2-99xx*
F: Documentation/core-api/boot-time-mm.rst
MEMORY MANAGEMENT
+M: Andrew Morton <akpm@linux-foundation.org>
L: linux-mm@kvack.org
W: http://www.linux-mm.org
+T: quilt https://ozlabs.org/~akpm/mmotm/
+T: quilt https://ozlabs.org/~akpm/mmots/
+T: git git://github.com/hnaz/linux-mm.git
S: Maintained
F: include/linux/mm.h
F: include/linux/gfp.h
ZSWAP COMPRESSED SWAP CACHING
M: Seth Jennings <sjenning@redhat.com>
M: Dan Streetman <ddstreet@ieee.org>
+M: Vitaly Wool <vitaly.wool@konsulko.com>
L: linux-mm@kvack.org
S: Maintained
F: mm/zswap.c
VERSION = 5
PATCHLEVEL = 4
SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION = -rc7
NAME = Kleptomaniac Octopus
# *DOCUMENTATION*
pinctrl-0 = <&pinctrl_pwm3>;
};
+&snvs_pwrkey {
+ status = "okay";
+};
+
&ssi2 {
status = "okay";
};
accelerometer@1c {
compatible = "fsl,mma8451";
reg = <0x1c>;
+ pinctrl-names = "default";
+ pinctrl-0 = <&pinctrl_mma8451_int>;
interrupt-parent = <&gpio6>;
interrupts = <31 IRQ_TYPE_LEVEL_LOW>;
};
>;
};
+ pinctrl_mma8451_int: mma8451intgrp {
+ fsl,pins = <
+ MX6QDL_PAD_EIM_BCLK__GPIO6_IO31 0xb0b1
+ >;
+ };
+
pinctrl_pwm3: pwm1grp {
fsl,pins = <
MX6QDL_PAD_SD4_DAT1__PWM3_OUT 0x1b0b1
ov5640: camera@3c {
compatible = "ovti,ov5640";
- pinctrl-names = "default";
- pinctrl-0 = <&ov5640_pins>;
reg = <0x3c>;
clocks = <&clk_ext_camera>;
clock-names = "xclk";
DOVDD-supply = <&v2v8>;
- powerdown-gpios = <&stmfx_pinctrl 18 GPIO_ACTIVE_HIGH>;
- reset-gpios = <&stmfx_pinctrl 19 GPIO_ACTIVE_LOW>;
+ powerdown-gpios = <&stmfx_pinctrl 18 (GPIO_ACTIVE_HIGH | GPIO_PUSH_PULL)>;
+ reset-gpios = <&stmfx_pinctrl 19 (GPIO_ACTIVE_LOW | GPIO_PUSH_PULL)>;
rotation = <180>;
status = "okay";
joystick_pins: joystick {
pins = "gpio0", "gpio1", "gpio2", "gpio3", "gpio4";
- drive-push-pull;
bias-pull-down;
};
-
- ov5640_pins: camera {
- pins = "agpio2", "agpio3"; /* stmfx pins 18 & 19 */
- drive-push-pull;
- output-low;
- };
};
};
};
interrupt-names = "int0", "int1";
clocks = <&rcc CK_HSE>, <&rcc FDCAN_K>;
clock-names = "hclk", "cclk";
- bosch,mram-cfg = <0x1400 0 0 32 0 0 2 2>;
+ bosch,mram-cfg = <0x0 0 0 32 0 0 2 2>;
status = "disabled";
};
interrupt-names = "int0", "int1";
clocks = <&rcc CK_HSE>, <&rcc FDCAN_K>;
clock-names = "hclk", "cclk";
- bosch,mram-cfg = <0x0 0 0 32 0 0 2 2>;
+ bosch,mram-cfg = <0x1400 0 0 32 0 0 2 2>;
status = "disabled";
};
vqmmc-supply = <®_dldo1>;
non-removable;
wakeup-source;
+ keep-power-in-suspend;
status = "okay";
brcmf: wifi@1 {
static int sunxi_cpu_powerdown(unsigned int cpu, unsigned int cluster)
{
u32 reg;
+ int gating_bit = cpu;
pr_debug("%s: cluster %u cpu %u\n", __func__, cluster, cpu);
if (cpu >= SUNXI_CPUS_PER_CLUSTER || cluster >= SUNXI_NR_CLUSTERS)
return -EINVAL;
+ if (is_a83t && cpu == 0)
+ gating_bit = 4;
+
/* gate processor power */
reg = readl(prcm_base + PRCM_PWROFF_GATING_REG(cluster));
- reg |= PRCM_PWROFF_GATING_REG_CORE(cpu);
+ reg |= PRCM_PWROFF_GATING_REG_CORE(gating_bit);
writel(reg, prcm_base + PRCM_PWROFF_GATING_REG(cluster));
udelay(20);
status = "okay";
i2c-mux@77 {
- compatible = "nxp,pca9847";
+ compatible = "nxp,pca9547";
reg = <0x77>;
#address-cells = <1>;
#size-cells = <0>;
};
sdma2: dma-controller@302c0000 {
- compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma";
reg = <0x302c0000 0x10000>;
interrupts = <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MM_CLK_SDMA2_ROOT>,
};
sdma3: dma-controller@302b0000 {
- compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma";
reg = <0x302b0000 0x10000>;
interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MM_CLK_SDMA3_ROOT>,
};
sdma1: dma-controller@30bd0000 {
- compatible = "fsl,imx8mm-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mm-sdma", "fsl,imx8mq-sdma";
reg = <0x30bd0000 0x10000>;
interrupts = <GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MM_CLK_SDMA1_ROOT>,
};
sdma3: dma-controller@302b0000 {
- compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma";
reg = <0x302b0000 0x10000>;
interrupts = <GIC_SPI 34 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MN_CLK_SDMA3_ROOT>,
};
sdma2: dma-controller@302c0000 {
- compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma";
reg = <0x302c0000 0x10000>;
interrupts = <GIC_SPI 103 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MN_CLK_SDMA2_ROOT>,
};
sdma1: dma-controller@30bd0000 {
- compatible = "fsl,imx8mn-sdma", "fsl,imx7d-sdma";
+ compatible = "fsl,imx8mn-sdma", "fsl,imx8mq-sdma";
reg = <0x30bd0000 0x10000>;
interrupts = <GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MN_CLK_SDMA1_ROOT>,
regulator-name = "0V9_ARM";
regulator-min-microvolt = <900000>;
regulator-max-microvolt = <1000000>;
- gpios = <&gpio3 19 GPIO_ACTIVE_HIGH>;
+ gpios = <&gpio3 16 GPIO_ACTIVE_HIGH>;
states = <1000000 0x1
900000 0x0>;
regulator-always-on;
set_pte(ptep, pte);
}
-#define __HAVE_ARCH_PTE_SAME
-static inline int pte_same(pte_t pte_a, pte_t pte_b)
-{
- pteval_t lhs, rhs;
-
- lhs = pte_val(pte_a);
- rhs = pte_val(pte_b);
-
- if (pte_present(pte_a))
- lhs &= ~PTE_RDONLY;
-
- if (pte_present(pte_b))
- rhs &= ~PTE_RDONLY;
-
- return (lhs == rhs);
-}
-
/*
* Huge pte definitions.
*/
#define __arch_get_clock_mode __arm64_get_clock_mode
static __always_inline
-int __arm64_use_vsyscall(struct vdso_data *vdata)
-{
- return !vdata[CS_HRES_COARSE].clock_mode;
-}
-#define __arch_use_vsyscall __arm64_use_vsyscall
-
-static __always_inline
void __arm64_update_vsyscall(struct vdso_data *vdata, struct timekeeper *tk)
{
vdata[CS_HRES_COARSE].mask = VDSO_PRECISION_MASK;
}
#define __arch_get_clock_mode __mips_get_clock_mode
-static __always_inline
-int __mips_use_vsyscall(struct vdso_data *vdata)
-{
- return (vdata[CS_HRES_COARSE].clock_mode != VDSO_CLOCK_NONE);
-}
-#define __arch_use_vsyscall __mips_use_vsyscall
-
/* The asm-generic header needs to be included after the definitions above */
#include <asm-generic/vdso/vsyscall.h>
}
/*
+ * If we have seen a tail call, we need a second pass.
+ * This is because bpf_jit_emit_common_epilogue() is called
+ * from bpf_jit_emit_tail_call() with a not yet stable ctx->seen.
+ */
+ if (cgctx.seen & SEEN_TAILCALL) {
+ cgctx.idx = 0;
+ if (bpf_jit_build_body(fp, 0, &cgctx, addrs, false)) {
+ fp = org_fp;
+ goto out_addrs;
+ }
+ }
+
+ /*
* Pretend to build prologue, given the features we've seen. This will
* update ctgtx.idx as it pretends to output instructions, then we can
* calculate total size from idx.
If unsure, say y.
+choice
+ prompt "TSX enable mode"
+ depends on CPU_SUP_INTEL
+ default X86_INTEL_TSX_MODE_OFF
+ help
+ Intel's TSX (Transactional Synchronization Extensions) feature
+ allows to optimize locking protocols through lock elision which
+ can lead to a noticeable performance boost.
+
+ On the other hand it has been shown that TSX can be exploited
+ to form side channel attacks (e.g. TAA) and chances are there
+ will be more of those attacks discovered in the future.
+
+ Therefore TSX is not enabled by default (aka tsx=off). An admin
+ might override this decision by tsx=on the command line parameter.
+ Even with TSX enabled, the kernel will attempt to enable the best
+ possible TAA mitigation setting depending on the microcode available
+ for the particular machine.
+
+ This option allows to set the default tsx mode between tsx=on, =off
+ and =auto. See Documentation/admin-guide/kernel-parameters.txt for more
+ details.
+
+ Say off if not sure, auto if TSX is in use but it should be used on safe
+ platforms or on if TSX is in use and the security aspect of tsx is not
+ relevant.
+
+config X86_INTEL_TSX_MODE_OFF
+ bool "off"
+ help
+ TSX is disabled if possible - equals to tsx=off command line parameter.
+
+config X86_INTEL_TSX_MODE_ON
+ bool "on"
+ help
+ TSX is always enabled on TSX capable HW - equals the tsx=on command
+ line parameter.
+
+config X86_INTEL_TSX_MODE_AUTO
+ bool "auto"
+ help
+ TSX is enabled on TSX capable HW that is believed to be safe against
+ side channel attacks- equals the tsx=auto command line parameter.
+endchoice
+
config EFI
bool "EFI runtime service support"
depends on ACPI
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
#define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */
+#define X86_BUG_TAA X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
+#define X86_BUG_ITLB_MULTIHIT X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
#endif /* _ASM_X86_CPUFEATURES_H */
struct kvm_mmu_page {
struct list_head link;
struct hlist_node hash_link;
+ struct list_head lpage_disallowed_link;
+
bool unsync;
u8 mmu_valid_gen;
bool mmio_cached;
+ bool lpage_disallowed; /* Can't be replaced by an equiv large page */
/*
* The following two entries are used to key the shadow page in the
*/
struct list_head active_mmu_pages;
struct list_head zapped_obsolete_pages;
+ struct list_head lpage_disallowed_mmu_pages;
struct kvm_page_track_notifier_node mmu_sp_tracker;
struct kvm_page_track_notifier_head track_notifier_head;
bool exception_payload_enabled;
struct kvm_pmu_event_filter *pmu_event_filter;
+ struct task_struct *nx_lpage_recovery_thread;
};
struct kvm_vm_stat {
ulong mmu_unsync;
ulong remote_tlb_flush;
ulong lpages;
+ ulong nx_lpage_splits;
ulong max_mmu_page_hash_collisions;
};
* Microarchitectural Data
* Sampling (MDS) vulnerabilities.
*/
+#define ARCH_CAP_PSCHANGE_MC_NO BIT(6) /*
+ * The processor is not susceptible to a
+ * machine check error due to modifying the
+ * code page size along with either the
+ * physical address or cache type
+ * without TLB invalidation.
+ */
+#define ARCH_CAP_TSX_CTRL_MSR BIT(7) /* MSR for TSX control is available. */
+#define ARCH_CAP_TAA_NO BIT(8) /*
+ * Not susceptible to
+ * TSX Async Abort (TAA) vulnerabilities.
+ */
#define MSR_IA32_FLUSH_CMD 0x0000010b
#define L1D_FLUSH BIT(0) /*
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
+#define MSR_IA32_TSX_CTRL 0x00000122
+#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */
+#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
+
#define MSR_IA32_SYSENTER_CS 0x00000174
#define MSR_IA32_SYSENTER_ESP 0x00000175
#define MSR_IA32_SYSENTER_EIP 0x00000176
#include <asm/segment.h>
/**
- * mds_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
*
* This uses the otherwise unused and obsolete VERW instruction in
* combination with microcode which triggers a CPU buffer flush when the
}
/**
- * mds_user_clear_cpu_buffers - Mitigation for MDS vulnerability
+ * mds_user_clear_cpu_buffers - Mitigation for MDS and TAA vulnerability
*
* Clear CPU buffers if the corresponding static key is enabled
*/
MDS_MITIGATION_VMWERV,
};
+enum taa_mitigations {
+ TAA_MITIGATION_OFF,
+ TAA_MITIGATION_UCODE_NEEDED,
+ TAA_MITIGATION_VERW,
+ TAA_MITIGATION_TSX_DISABLED,
+};
+
#endif /* _ASM_X86_PROCESSOR_H */
{
int cpu = smp_processor_id();
unsigned int value;
-#ifdef CONFIG_X86_32
- int logical_apicid, ldr_apicid;
-#endif
if (disable_apic) {
disable_ioapic_support();
apic->init_apic_ldr();
#ifdef CONFIG_X86_32
- /*
- * APIC LDR is initialized. If logical_apicid mapping was
- * initialized during get_smp_config(), make sure it matches the
- * actual value.
- */
- logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
- ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
- WARN_ON(logical_apicid != BAD_APICID && logical_apicid != ldr_apicid);
- /* always use the value from LDR */
- early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+ if (apic->dest_logical) {
+ int logical_apicid, ldr_apicid;
+
+ /*
+ * APIC LDR is initialized. If logical_apicid mapping was
+ * initialized during get_smp_config(), make sure it matches
+ * the actual value.
+ */
+ logical_apicid = early_per_cpu(x86_cpu_to_logical_apicid, cpu);
+ ldr_apicid = GET_APIC_LOGICAL_ID(apic_read(APIC_LDR));
+ if (logical_apicid != BAD_APICID)
+ WARN_ON(logical_apicid != ldr_apicid);
+ /* Always use the value from LDR. */
+ early_per_cpu(x86_cpu_to_logical_apicid, cpu) = ldr_apicid;
+ }
#endif
/*
obj-$(CONFIG_X86_FEATURE_NAMES) += capflags.o powerflags.o
ifdef CONFIG_CPU_SUP_INTEL
-obj-y += intel.o intel_pconfig.o
+obj-y += intel.o intel_pconfig.o tsx.o
obj-$(CONFIG_PM) += intel_epb.o
endif
obj-$(CONFIG_CPU_SUP_AMD) += amd.o
static void __init ssb_select_mitigation(void);
static void __init l1tf_select_mitigation(void);
static void __init mds_select_mitigation(void);
+static void __init taa_select_mitigation(void);
/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
u64 x86_spec_ctrl_base;
ssb_select_mitigation();
l1tf_select_mitigation();
mds_select_mitigation();
+ taa_select_mitigation();
arch_smt_update();
early_param("mds", mds_cmdline);
#undef pr_fmt
+#define pr_fmt(fmt) "TAA: " fmt
+
+/* Default mitigation for TAA-affected CPUs */
+static enum taa_mitigations taa_mitigation __ro_after_init = TAA_MITIGATION_VERW;
+static bool taa_nosmt __ro_after_init;
+
+static const char * const taa_strings[] = {
+ [TAA_MITIGATION_OFF] = "Vulnerable",
+ [TAA_MITIGATION_UCODE_NEEDED] = "Vulnerable: Clear CPU buffers attempted, no microcode",
+ [TAA_MITIGATION_VERW] = "Mitigation: Clear CPU buffers",
+ [TAA_MITIGATION_TSX_DISABLED] = "Mitigation: TSX disabled",
+};
+
+static void __init taa_select_mitigation(void)
+{
+ u64 ia32_cap;
+
+ if (!boot_cpu_has_bug(X86_BUG_TAA)) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ return;
+ }
+
+ /* TSX previously disabled by tsx=off */
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ taa_mitigation = TAA_MITIGATION_TSX_DISABLED;
+ goto out;
+ }
+
+ if (cpu_mitigations_off()) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ return;
+ }
+
+ /* TAA mitigation is turned off on the cmdline (tsx_async_abort=off) */
+ if (taa_mitigation == TAA_MITIGATION_OFF)
+ goto out;
+
+ if (boot_cpu_has(X86_FEATURE_MD_CLEAR))
+ taa_mitigation = TAA_MITIGATION_VERW;
+ else
+ taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+ /*
+ * VERW doesn't clear the CPU buffers when MD_CLEAR=1 and MDS_NO=1.
+ * A microcode update fixes this behavior to clear CPU buffers. It also
+ * adds support for MSR_IA32_TSX_CTRL which is enumerated by the
+ * ARCH_CAP_TSX_CTRL_MSR bit.
+ *
+ * On MDS_NO=1 CPUs if ARCH_CAP_TSX_CTRL_MSR is not set, microcode
+ * update is required.
+ */
+ ia32_cap = x86_read_arch_cap_msr();
+ if ( (ia32_cap & ARCH_CAP_MDS_NO) &&
+ !(ia32_cap & ARCH_CAP_TSX_CTRL_MSR))
+ taa_mitigation = TAA_MITIGATION_UCODE_NEEDED;
+
+ /*
+ * TSX is enabled, select alternate mitigation for TAA which is
+ * the same as MDS. Enable MDS static branch to clear CPU buffers.
+ *
+ * For guests that can't determine whether the correct microcode is
+ * present on host, enable the mitigation for UCODE_NEEDED as well.
+ */
+ static_branch_enable(&mds_user_clear);
+
+ if (taa_nosmt || cpu_mitigations_auto_nosmt())
+ cpu_smt_disable(false);
+
+out:
+ pr_info("%s\n", taa_strings[taa_mitigation]);
+}
+
+static int __init tsx_async_abort_parse_cmdline(char *str)
+{
+ if (!boot_cpu_has_bug(X86_BUG_TAA))
+ return 0;
+
+ if (!str)
+ return -EINVAL;
+
+ if (!strcmp(str, "off")) {
+ taa_mitigation = TAA_MITIGATION_OFF;
+ } else if (!strcmp(str, "full")) {
+ taa_mitigation = TAA_MITIGATION_VERW;
+ } else if (!strcmp(str, "full,nosmt")) {
+ taa_mitigation = TAA_MITIGATION_VERW;
+ taa_nosmt = true;
+ }
+
+ return 0;
+}
+early_param("tsx_async_abort", tsx_async_abort_parse_cmdline);
+
+#undef pr_fmt
#define pr_fmt(fmt) "Spectre V1 : " fmt
enum spectre_v1_mitigation {
}
#define MDS_MSG_SMT "MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/mds.html for more details.\n"
+#define TAA_MSG_SMT "TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.\n"
void cpu_bugs_smt_update(void)
{
- /* Enhanced IBRS implies STIBP. No update required. */
- if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
- return;
-
mutex_lock(&spec_ctrl_mutex);
switch (spectre_v2_user) {
break;
}
+ switch (taa_mitigation) {
+ case TAA_MITIGATION_VERW:
+ case TAA_MITIGATION_UCODE_NEEDED:
+ if (sched_smt_active())
+ pr_warn_once(TAA_MSG_SMT);
+ break;
+ case TAA_MITIGATION_TSX_DISABLED:
+ case TAA_MITIGATION_OFF:
+ break;
+ }
+
mutex_unlock(&spec_ctrl_mutex);
}
x86_amd_ssb_disable();
}
+bool itlb_multihit_kvm_mitigation;
+EXPORT_SYMBOL_GPL(itlb_multihit_kvm_mitigation);
+
#undef pr_fmt
#define pr_fmt(fmt) "L1TF: " fmt
l1tf_vmx_states[l1tf_vmx_mitigation],
sched_smt_active() ? "vulnerable" : "disabled");
}
+
+static ssize_t itlb_multihit_show_state(char *buf)
+{
+ if (itlb_multihit_kvm_mitigation)
+ return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
+ else
+ return sprintf(buf, "KVM: Vulnerable\n");
+}
#else
static ssize_t l1tf_show_state(char *buf)
{
return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG);
}
+
+static ssize_t itlb_multihit_show_state(char *buf)
+{
+ return sprintf(buf, "Processor vulnerable\n");
+}
#endif
static ssize_t mds_show_state(char *buf)
sched_smt_active() ? "vulnerable" : "disabled");
}
+static ssize_t tsx_async_abort_show_state(char *buf)
+{
+ if ((taa_mitigation == TAA_MITIGATION_TSX_DISABLED) ||
+ (taa_mitigation == TAA_MITIGATION_OFF))
+ return sprintf(buf, "%s\n", taa_strings[taa_mitigation]);
+
+ if (boot_cpu_has(X86_FEATURE_HYPERVISOR)) {
+ return sprintf(buf, "%s; SMT Host state unknown\n",
+ taa_strings[taa_mitigation]);
+ }
+
+ return sprintf(buf, "%s; SMT %s\n", taa_strings[taa_mitigation],
+ sched_smt_active() ? "vulnerable" : "disabled");
+}
+
static char *stibp_state(void)
{
if (spectre_v2_enabled == SPECTRE_V2_IBRS_ENHANCED)
case X86_BUG_MDS:
return mds_show_state(buf);
+ case X86_BUG_TAA:
+ return tsx_async_abort_show_state(buf);
+
+ case X86_BUG_ITLB_MULTIHIT:
+ return itlb_multihit_show_state(buf);
+
default:
break;
}
{
return cpu_show_common(dev, attr, buf, X86_BUG_MDS);
}
+
+ssize_t cpu_show_tsx_async_abort(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_TAA);
+}
+
+ssize_t cpu_show_itlb_multihit(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ return cpu_show_common(dev, attr, buf, X86_BUG_ITLB_MULTIHIT);
+}
#endif
#endif
}
-#define NO_SPECULATION BIT(0)
-#define NO_MELTDOWN BIT(1)
-#define NO_SSB BIT(2)
-#define NO_L1TF BIT(3)
-#define NO_MDS BIT(4)
-#define MSBDS_ONLY BIT(5)
-#define NO_SWAPGS BIT(6)
+#define NO_SPECULATION BIT(0)
+#define NO_MELTDOWN BIT(1)
+#define NO_SSB BIT(2)
+#define NO_L1TF BIT(3)
+#define NO_MDS BIT(4)
+#define MSBDS_ONLY BIT(5)
+#define NO_SWAPGS BIT(6)
+#define NO_ITLB_MULTIHIT BIT(7)
#define VULNWL(_vendor, _family, _model, _whitelist) \
{ X86_VENDOR_##_vendor, _family, _model, X86_FEATURE_ANY, _whitelist }
VULNWL(NSC, 5, X86_MODEL_ANY, NO_SPECULATION),
/* Intel Family 6 */
- VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION),
- VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION),
- VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION),
- VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION),
-
- VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_SILVERMONT_D, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_SALTWELL, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SALTWELL_TABLET, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SALTWELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_BONNELL, NO_SPECULATION | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_BONNELL_MID, NO_SPECULATION | NO_ITLB_MULTIHIT),
+
+ VULNWL_INTEL(ATOM_SILVERMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SILVERMONT_D, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_SILVERMONT_MID, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_AIRMONT, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(XEON_PHI_KNL, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(XEON_PHI_KNM, NO_SSB | NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
VULNWL_INTEL(CORE_YONAH, NO_SSB),
- VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS),
- VULNWL_INTEL(ATOM_AIRMONT_NP, NO_L1TF | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_AIRMONT_MID, NO_L1TF | MSBDS_ONLY | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_AIRMONT_NP, NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
- VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS),
- VULNWL_INTEL(ATOM_GOLDMONT_D, NO_MDS | NO_L1TF | NO_SWAPGS),
- VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS),
+ VULNWL_INTEL(ATOM_GOLDMONT, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_GOLDMONT_D, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_INTEL(ATOM_GOLDMONT_PLUS, NO_MDS | NO_L1TF | NO_SWAPGS | NO_ITLB_MULTIHIT),
/*
* Technically, swapgs isn't serializing on AMD (despite it previously
* good enough for our purposes.
*/
+ VULNWL_INTEL(ATOM_TREMONT_D, NO_ITLB_MULTIHIT),
+
/* AMD Family 0xf - 0x12 */
- VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(0x0f, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x10, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x11, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_AMD(0x12, NO_MELTDOWN | NO_SSB | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
/* FAMILY_ANY must be last, otherwise 0x0f - 0x12 matches won't work */
- VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS),
- VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS),
+ VULNWL_AMD(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
+ VULNWL_HYGON(X86_FAMILY_ANY, NO_MELTDOWN | NO_L1TF | NO_MDS | NO_SWAPGS | NO_ITLB_MULTIHIT),
{}
};
return m && !!(m->driver_data & which);
}
-static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+u64 x86_read_arch_cap_msr(void)
{
u64 ia32_cap = 0;
+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ return ia32_cap;
+}
+
+static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = x86_read_arch_cap_msr();
+
+ /* Set ITLB_MULTIHIT bug if cpu is not in the whitelist and not mitigated */
+ if (!cpu_matches(NO_ITLB_MULTIHIT) && !(ia32_cap & ARCH_CAP_PSCHANGE_MC_NO))
+ setup_force_cpu_bug(X86_BUG_ITLB_MULTIHIT);
+
if (cpu_matches(NO_SPECULATION))
return;
setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
- if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
- rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
-
if (!cpu_matches(NO_SSB) && !(ia32_cap & ARCH_CAP_SSB_NO) &&
!cpu_has(c, X86_FEATURE_AMD_SSB_NO))
setup_force_cpu_bug(X86_BUG_SPEC_STORE_BYPASS);
if (!cpu_matches(NO_SWAPGS))
setup_force_cpu_bug(X86_BUG_SWAPGS);
+ /*
+ * When the CPU is not mitigated for TAA (TAA_NO=0) set TAA bug when:
+ * - TSX is supported or
+ * - TSX_CTRL is present
+ *
+ * TSX_CTRL check is needed for cases when TSX could be disabled before
+ * the kernel boot e.g. kexec.
+ * TSX_CTRL check alone is not sufficient for cases when the microcode
+ * update is not present or running as guest that don't get TSX_CTRL.
+ */
+ if (!(ia32_cap & ARCH_CAP_TAA_NO) &&
+ (cpu_has(c, X86_FEATURE_RTM) ||
+ (ia32_cap & ARCH_CAP_TSX_CTRL_MSR)))
+ setup_force_cpu_bug(X86_BUG_TAA);
+
if (cpu_matches(NO_MELTDOWN))
return;
#endif
cpu_detect_tlb(&boot_cpu_data);
setup_cr_pinning();
+
+ tsx_init();
}
void identify_secondary_cpu(struct cpuinfo_x86 *c)
extern const struct cpu_dev *const __x86_cpu_dev_start[],
*const __x86_cpu_dev_end[];
+#ifdef CONFIG_CPU_SUP_INTEL
+enum tsx_ctrl_states {
+ TSX_CTRL_ENABLE,
+ TSX_CTRL_DISABLE,
+ TSX_CTRL_NOT_SUPPORTED,
+};
+
+extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state;
+
+extern void __init tsx_init(void);
+extern void tsx_enable(void);
+extern void tsx_disable(void);
+#else
+static inline void tsx_init(void) { }
+#endif /* CONFIG_CPU_SUP_INTEL */
+
extern void get_cpu_cap(struct cpuinfo_x86 *c);
extern void get_cpu_address_sizes(struct cpuinfo_x86 *c);
extern void cpu_detect_cache_sizes(struct cpuinfo_x86 *c);
extern void x86_spec_ctrl_setup_ap(void);
+extern u64 x86_read_arch_cap_msr(void);
+
#endif /* ARCH_X86_CPU_H */
detect_tme(c);
init_intel_misc_features(c);
+
+ if (tsx_ctrl_state == TSX_CTRL_ENABLE)
+ tsx_enable();
+ if (tsx_ctrl_state == TSX_CTRL_DISABLE)
+ tsx_disable();
}
#ifdef CONFIG_X86_32
int ret = 0;
rdtgrp = rdtgroup_kn_lock_live(of->kn);
+ if (!rdtgrp) {
+ ret = -ENOENT;
+ goto out;
+ }
md.priv = of->kn->priv;
resid = md.u.rid;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Intel Transactional Synchronization Extensions (TSX) control.
+ *
+ * Copyright (C) 2019 Intel Corporation
+ *
+ * Author:
+ * Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
+ */
+
+#include <linux/cpufeature.h>
+
+#include <asm/cmdline.h>
+
+#include "cpu.h"
+
+enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED;
+
+void tsx_disable(void)
+{
+ u64 tsx;
+
+ rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+ /* Force all transactions to immediately abort */
+ tsx |= TSX_CTRL_RTM_DISABLE;
+
+ /*
+ * Ensure TSX support is not enumerated in CPUID.
+ * This is visible to userspace and will ensure they
+ * do not waste resources trying TSX transactions that
+ * will always abort.
+ */
+ tsx |= TSX_CTRL_CPUID_CLEAR;
+
+ wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+void tsx_enable(void)
+{
+ u64 tsx;
+
+ rdmsrl(MSR_IA32_TSX_CTRL, tsx);
+
+ /* Enable the RTM feature in the cpu */
+ tsx &= ~TSX_CTRL_RTM_DISABLE;
+
+ /*
+ * Ensure TSX support is enumerated in CPUID.
+ * This is visible to userspace and will ensure they
+ * can enumerate and use the TSX feature.
+ */
+ tsx &= ~TSX_CTRL_CPUID_CLEAR;
+
+ wrmsrl(MSR_IA32_TSX_CTRL, tsx);
+}
+
+static bool __init tsx_ctrl_is_supported(void)
+{
+ u64 ia32_cap = x86_read_arch_cap_msr();
+
+ /*
+ * TSX is controlled via MSR_IA32_TSX_CTRL. However, support for this
+ * MSR is enumerated by ARCH_CAP_TSX_MSR bit in MSR_IA32_ARCH_CAPABILITIES.
+ *
+ * TSX control (aka MSR_IA32_TSX_CTRL) is only available after a
+ * microcode update on CPUs that have their MSR_IA32_ARCH_CAPABILITIES
+ * bit MDS_NO=1. CPUs with MDS_NO=0 are not planned to get
+ * MSR_IA32_TSX_CTRL support even after a microcode update. Thus,
+ * tsx= cmdline requests will do nothing on CPUs without
+ * MSR_IA32_TSX_CTRL support.
+ */
+ return !!(ia32_cap & ARCH_CAP_TSX_CTRL_MSR);
+}
+
+static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
+{
+ if (boot_cpu_has_bug(X86_BUG_TAA))
+ return TSX_CTRL_DISABLE;
+
+ return TSX_CTRL_ENABLE;
+}
+
+void __init tsx_init(void)
+{
+ char arg[5] = {};
+ int ret;
+
+ if (!tsx_ctrl_is_supported())
+ return;
+
+ ret = cmdline_find_option(boot_command_line, "tsx", arg, sizeof(arg));
+ if (ret >= 0) {
+ if (!strcmp(arg, "on")) {
+ tsx_ctrl_state = TSX_CTRL_ENABLE;
+ } else if (!strcmp(arg, "off")) {
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ } else if (!strcmp(arg, "auto")) {
+ tsx_ctrl_state = x86_get_tsx_auto_mode();
+ } else {
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ pr_err("tsx: invalid option, defaulting to off\n");
+ }
+ } else {
+ /* tsx= not provided */
+ if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_AUTO))
+ tsx_ctrl_state = x86_get_tsx_auto_mode();
+ else if (IS_ENABLED(CONFIG_X86_INTEL_TSX_MODE_OFF))
+ tsx_ctrl_state = TSX_CTRL_DISABLE;
+ else
+ tsx_ctrl_state = TSX_CTRL_ENABLE;
+ }
+
+ if (tsx_ctrl_state == TSX_CTRL_DISABLE) {
+ tsx_disable();
+
+ /*
+ * tsx_disable() will change the state of the
+ * RTM CPUID bit. Clear it here since it is now
+ * expected to be not set.
+ */
+ setup_clear_cpu_cap(X86_FEATURE_RTM);
+ } else if (tsx_ctrl_state == TSX_CTRL_ENABLE) {
+
+ /*
+ * HW defaults TSX to be enabled at bootup.
+ * We may still need the TSX enable support
+ * during init for special cases like
+ * kexec after TSX is disabled.
+ */
+ tsx_enable();
+
+ /*
+ * tsx_enable() will change the state of the
+ * RTM CPUID bit. Force it here since it is now
+ * expected to be set.
+ */
+ setup_force_cpu_cap(X86_FEATURE_RTM);
+ }
+}
BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
+ /*
+ * Handle the case where stack trace is collected _before_
+ * cea_exception_stacks had been initialized.
+ */
+ if (!begin)
+ return false;
+
end = begin + sizeof(struct cea_exception_stacks);
/* Bail if @stack is outside the exception stack area. */
if (stk < begin || stk >= end)
return;
}
+ if (tsc_clocksource_reliable || no_tsc_watchdog)
+ clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
+
clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
detect_art();
}
#include <linux/uaccess.h>
#include <linux/hash.h>
#include <linux/kern_levels.h>
+#include <linux/kthread.h>
#include <asm/page.h>
#include <asm/pat.h>
#include <asm/kvm_page_track.h>
#include "trace.h"
+extern bool itlb_multihit_kvm_mitigation;
+
+static int __read_mostly nx_huge_pages = -1;
+static uint __read_mostly nx_huge_pages_recovery_ratio = 60;
+
+static int set_nx_huge_pages(const char *val, const struct kernel_param *kp);
+static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp);
+
+static struct kernel_param_ops nx_huge_pages_ops = {
+ .set = set_nx_huge_pages,
+ .get = param_get_bool,
+};
+
+static struct kernel_param_ops nx_huge_pages_recovery_ratio_ops = {
+ .set = set_nx_huge_pages_recovery_ratio,
+ .get = param_get_uint,
+};
+
+module_param_cb(nx_huge_pages, &nx_huge_pages_ops, &nx_huge_pages, 0644);
+__MODULE_PARM_TYPE(nx_huge_pages, "bool");
+module_param_cb(nx_huge_pages_recovery_ratio, &nx_huge_pages_recovery_ratio_ops,
+ &nx_huge_pages_recovery_ratio, 0644);
+__MODULE_PARM_TYPE(nx_huge_pages_recovery_ratio, "uint");
+
/*
* When setting this variable to true it enables Two-Dimensional-Paging
* where the hardware walks 2 page tables:
return (spte & SPTE_SPECIAL_MASK) != SPTE_AD_ENABLED_MASK;
}
+static bool is_nx_huge_page_enabled(void)
+{
+ return READ_ONCE(nx_huge_pages);
+}
+
static inline u64 spte_shadow_accessed_mask(u64 spte)
{
MMU_WARN_ON(is_mmio_spte(spte));
kvm_mmu_gfn_disallow_lpage(slot, gfn);
}
+static void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+ if (sp->lpage_disallowed)
+ return;
+
+ ++kvm->stat.nx_lpage_splits;
+ list_add_tail(&sp->lpage_disallowed_link,
+ &kvm->arch.lpage_disallowed_mmu_pages);
+ sp->lpage_disallowed = true;
+}
+
static void unaccount_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp)
{
struct kvm_memslots *slots;
kvm_mmu_gfn_allow_lpage(slot, gfn);
}
+static void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp)
+{
+ --kvm->stat.nx_lpage_splits;
+ sp->lpage_disallowed = false;
+ list_del(&sp->lpage_disallowed_link);
+}
+
static bool __mmu_gfn_lpage_is_disallowed(gfn_t gfn, int level,
struct kvm_memory_slot *slot)
{
kvm_reload_remote_mmus(kvm);
}
+ if (sp->lpage_disallowed)
+ unaccount_huge_nx_page(kvm, sp);
+
sp->role.invalid = 1;
return list_unstable;
}
if (!speculative)
spte |= spte_shadow_accessed_mask(spte);
+ if (level > PT_PAGE_TABLE_LEVEL && (pte_access & ACC_EXEC_MASK) &&
+ is_nx_huge_page_enabled()) {
+ pte_access &= ~ACC_EXEC_MASK;
+ }
+
if (pte_access & ACC_EXEC_MASK)
spte |= shadow_x_mask;
else
__direct_pte_prefetch(vcpu, sp, sptep);
}
+static void disallowed_hugepage_adjust(struct kvm_shadow_walk_iterator it,
+ gfn_t gfn, kvm_pfn_t *pfnp, int *levelp)
+{
+ int level = *levelp;
+ u64 spte = *it.sptep;
+
+ if (it.level == level && level > PT_PAGE_TABLE_LEVEL &&
+ is_nx_huge_page_enabled() &&
+ is_shadow_present_pte(spte) &&
+ !is_large_pte(spte)) {
+ /*
+ * A small SPTE exists for this pfn, but FNAME(fetch)
+ * and __direct_map would like to create a large PTE
+ * instead: just force them to go down another level,
+ * patching back for them into pfn the next 9 bits of
+ * the address.
+ */
+ u64 page_mask = KVM_PAGES_PER_HPAGE(level) - KVM_PAGES_PER_HPAGE(level - 1);
+ *pfnp |= gfn & page_mask;
+ (*levelp)--;
+ }
+}
+
static int __direct_map(struct kvm_vcpu *vcpu, gpa_t gpa, int write,
int map_writable, int level, kvm_pfn_t pfn,
- bool prefault)
+ bool prefault, bool lpage_disallowed)
{
struct kvm_shadow_walk_iterator it;
struct kvm_mmu_page *sp;
trace_kvm_mmu_spte_requested(gpa, level, pfn);
for_each_shadow_entry(vcpu, gpa, it) {
+ /*
+ * We cannot overwrite existing page tables with an NX
+ * large page, as the leaf could be executable.
+ */
+ disallowed_hugepage_adjust(it, gfn, &pfn, &level);
+
base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == level)
break;
it.level - 1, true, ACC_ALL);
link_shadow_page(vcpu, it.sptep, sp);
+ if (lpage_disallowed)
+ account_huge_nx_page(vcpu->kvm, sp);
}
}
* here.
*/
if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn) &&
- level == PT_PAGE_TABLE_LEVEL &&
+ !kvm_is_zone_device_pfn(pfn) && level == PT_PAGE_TABLE_LEVEL &&
PageTransCompoundMap(pfn_to_page(pfn)) &&
!mmu_gfn_lpage_is_disallowed(vcpu, gfn, PT_DIRECTORY_LEVEL)) {
unsigned long mask;
{
int r;
int level;
- bool force_pt_level = false;
+ bool force_pt_level;
kvm_pfn_t pfn;
unsigned long mmu_seq;
bool map_writable, write = error_code & PFERR_WRITE_MASK;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();
+ force_pt_level = lpage_disallowed;
level = mapping_level(vcpu, gfn, &force_pt_level);
if (likely(!force_pt_level)) {
/*
goto out_unlock;
if (likely(!force_pt_level))
transparent_hugepage_adjust(vcpu, gfn, &pfn, &level);
- r = __direct_map(vcpu, v, write, map_writable, level, pfn, prefault);
+ r = __direct_map(vcpu, v, write, map_writable, level, pfn,
+ prefault, false);
out_unlock:
spin_unlock(&vcpu->kvm->mmu_lock);
kvm_release_pfn_clean(pfn);
unsigned long mmu_seq;
int write = error_code & PFERR_WRITE_MASK;
bool map_writable;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();
MMU_WARN_ON(!VALID_PAGE(vcpu->arch.mmu->root_hpa));
if (r)
return r;
- force_pt_level = !check_hugepage_cache_consistency(vcpu, gfn,
- PT_DIRECTORY_LEVEL);
+ force_pt_level =
+ lpage_disallowed ||
+ !check_hugepage_cache_consistency(vcpu, gfn, PT_DIRECTORY_LEVEL);
level = mapping_level(vcpu, gfn, &force_pt_level);
if (likely(!force_pt_level)) {
if (level > PT_DIRECTORY_LEVEL &&
goto out_unlock;
if (likely(!force_pt_level))
transparent_hugepage_adjust(vcpu, gfn, &pfn, &level);
- r = __direct_map(vcpu, gpa, write, map_writable, level, pfn, prefault);
+ r = __direct_map(vcpu, gpa, write, map_writable, level, pfn,
+ prefault, lpage_disallowed);
out_unlock:
spin_unlock(&vcpu->kvm->mmu_lock);
kvm_release_pfn_clean(pfn);
* the guest, and the guest page table is using 4K page size
* mapping if the indirect sp has level = 1.
*/
- if (sp->role.direct &&
- !kvm_is_reserved_pfn(pfn) &&
- PageTransCompoundMap(pfn_to_page(pfn))) {
+ if (sp->role.direct && !kvm_is_reserved_pfn(pfn) &&
+ !kvm_is_zone_device_pfn(pfn) &&
+ PageTransCompoundMap(pfn_to_page(pfn))) {
pte_list_remove(rmap_head, sptep);
if (kvm_available_flush_tlb_with_range())
kvm_mmu_set_mmio_spte_mask(mask, mask, ACC_WRITE_MASK | ACC_USER_MASK);
}
+static bool get_nx_auto_mode(void)
+{
+ /* Return true when CPU has the bug, and mitigations are ON */
+ return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off();
+}
+
+static void __set_nx_huge_pages(bool val)
+{
+ nx_huge_pages = itlb_multihit_kvm_mitigation = val;
+}
+
+static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
+{
+ bool old_val = nx_huge_pages;
+ bool new_val;
+
+ /* In "auto" mode deploy workaround only if CPU has the bug. */
+ if (sysfs_streq(val, "off"))
+ new_val = 0;
+ else if (sysfs_streq(val, "force"))
+ new_val = 1;
+ else if (sysfs_streq(val, "auto"))
+ new_val = get_nx_auto_mode();
+ else if (strtobool(val, &new_val) < 0)
+ return -EINVAL;
+
+ __set_nx_huge_pages(new_val);
+
+ if (new_val != old_val) {
+ struct kvm *kvm;
+ int idx;
+
+ mutex_lock(&kvm_lock);
+
+ list_for_each_entry(kvm, &vm_list, vm_list) {
+ idx = srcu_read_lock(&kvm->srcu);
+ kvm_mmu_zap_all_fast(kvm);
+ srcu_read_unlock(&kvm->srcu, idx);
+
+ wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+ }
+ mutex_unlock(&kvm_lock);
+ }
+
+ return 0;
+}
+
int kvm_mmu_module_init(void)
{
int ret = -ENOMEM;
+ if (nx_huge_pages == -1)
+ __set_nx_huge_pages(get_nx_auto_mode());
+
/*
* MMU roles use union aliasing which is, generally speaking, an
* undefined behavior. However, we supposedly know how compilers behave
unregister_shrinker(&mmu_shrinker);
mmu_audit_disable();
}
+
+static int set_nx_huge_pages_recovery_ratio(const char *val, const struct kernel_param *kp)
+{
+ unsigned int old_val;
+ int err;
+
+ old_val = nx_huge_pages_recovery_ratio;
+ err = param_set_uint(val, kp);
+ if (err)
+ return err;
+
+ if (READ_ONCE(nx_huge_pages) &&
+ !old_val && nx_huge_pages_recovery_ratio) {
+ struct kvm *kvm;
+
+ mutex_lock(&kvm_lock);
+
+ list_for_each_entry(kvm, &vm_list, vm_list)
+ wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+
+ mutex_unlock(&kvm_lock);
+ }
+
+ return err;
+}
+
+static void kvm_recover_nx_lpages(struct kvm *kvm)
+{
+ int rcu_idx;
+ struct kvm_mmu_page *sp;
+ unsigned int ratio;
+ LIST_HEAD(invalid_list);
+ ulong to_zap;
+
+ rcu_idx = srcu_read_lock(&kvm->srcu);
+ spin_lock(&kvm->mmu_lock);
+
+ ratio = READ_ONCE(nx_huge_pages_recovery_ratio);
+ to_zap = ratio ? DIV_ROUND_UP(kvm->stat.nx_lpage_splits, ratio) : 0;
+ while (to_zap && !list_empty(&kvm->arch.lpage_disallowed_mmu_pages)) {
+ /*
+ * We use a separate list instead of just using active_mmu_pages
+ * because the number of lpage_disallowed pages is expected to
+ * be relatively small compared to the total.
+ */
+ sp = list_first_entry(&kvm->arch.lpage_disallowed_mmu_pages,
+ struct kvm_mmu_page,
+ lpage_disallowed_link);
+ WARN_ON_ONCE(!sp->lpage_disallowed);
+ kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+ WARN_ON_ONCE(sp->lpage_disallowed);
+
+ if (!--to_zap || need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+ kvm_mmu_commit_zap_page(kvm, &invalid_list);
+ if (to_zap)
+ cond_resched_lock(&kvm->mmu_lock);
+ }
+ }
+
+ spin_unlock(&kvm->mmu_lock);
+ srcu_read_unlock(&kvm->srcu, rcu_idx);
+}
+
+static long get_nx_lpage_recovery_timeout(u64 start_time)
+{
+ return READ_ONCE(nx_huge_pages) && READ_ONCE(nx_huge_pages_recovery_ratio)
+ ? start_time + 60 * HZ - get_jiffies_64()
+ : MAX_SCHEDULE_TIMEOUT;
+}
+
+static int kvm_nx_lpage_recovery_worker(struct kvm *kvm, uintptr_t data)
+{
+ u64 start_time;
+ long remaining_time;
+
+ while (true) {
+ start_time = get_jiffies_64();
+ remaining_time = get_nx_lpage_recovery_timeout(start_time);
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ while (!kthread_should_stop() && remaining_time > 0) {
+ schedule_timeout(remaining_time);
+ remaining_time = get_nx_lpage_recovery_timeout(start_time);
+ set_current_state(TASK_INTERRUPTIBLE);
+ }
+
+ set_current_state(TASK_RUNNING);
+
+ if (kthread_should_stop())
+ return 0;
+
+ kvm_recover_nx_lpages(kvm);
+ }
+}
+
+int kvm_mmu_post_init_vm(struct kvm *kvm)
+{
+ int err;
+
+ err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0,
+ "kvm-nx-lpage-recovery",
+ &kvm->arch.nx_lpage_recovery_thread);
+ if (!err)
+ kthread_unpark(kvm->arch.nx_lpage_recovery_thread);
+
+ return err;
+}
+
+void kvm_mmu_pre_destroy_vm(struct kvm *kvm)
+{
+ if (kvm->arch.nx_lpage_recovery_thread)
+ kthread_stop(kvm->arch.nx_lpage_recovery_thread);
+}
bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm,
struct kvm_memory_slot *slot, u64 gfn);
int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
+
+int kvm_mmu_post_init_vm(struct kvm *kvm);
+void kvm_mmu_pre_destroy_vm(struct kvm *kvm);
+
#endif
static int FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
struct guest_walker *gw,
int write_fault, int hlevel,
- kvm_pfn_t pfn, bool map_writable, bool prefault)
+ kvm_pfn_t pfn, bool map_writable, bool prefault,
+ bool lpage_disallowed)
{
struct kvm_mmu_page *sp = NULL;
struct kvm_shadow_walk_iterator it;
unsigned direct_access, access = gw->pt_access;
int top_level, ret;
- gfn_t base_gfn;
+ gfn_t gfn, base_gfn;
direct_access = gw->pte_access;
link_shadow_page(vcpu, it.sptep, sp);
}
- base_gfn = gw->gfn;
+ /*
+ * FNAME(page_fault) might have clobbered the bottom bits of
+ * gw->gfn, restore them from the virtual address.
+ */
+ gfn = gw->gfn | ((addr & PT_LVL_OFFSET_MASK(gw->level)) >> PAGE_SHIFT);
+ base_gfn = gfn;
trace_kvm_mmu_spte_requested(addr, gw->level, pfn);
for (; shadow_walk_okay(&it); shadow_walk_next(&it)) {
clear_sp_write_flooding_count(it.sptep);
- base_gfn = gw->gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
+
+ /*
+ * We cannot overwrite existing page tables with an NX
+ * large page, as the leaf could be executable.
+ */
+ disallowed_hugepage_adjust(it, gfn, &pfn, &hlevel);
+
+ base_gfn = gfn & ~(KVM_PAGES_PER_HPAGE(it.level) - 1);
if (it.level == hlevel)
break;
sp = kvm_mmu_get_page(vcpu, base_gfn, addr,
it.level - 1, true, direct_access);
link_shadow_page(vcpu, it.sptep, sp);
+ if (lpage_disallowed)
+ account_huge_nx_page(vcpu->kvm, sp);
}
}
int r;
kvm_pfn_t pfn;
int level = PT_PAGE_TABLE_LEVEL;
- bool force_pt_level = false;
unsigned long mmu_seq;
bool map_writable, is_self_change_mapping;
+ bool lpage_disallowed = (error_code & PFERR_FETCH_MASK) &&
+ is_nx_huge_page_enabled();
+ bool force_pt_level = lpage_disallowed;
pgprintk("%s: addr %lx err %x\n", __func__, addr, error_code);
if (!force_pt_level)
transparent_hugepage_adjust(vcpu, walker.gfn, &pfn, &level);
r = FNAME(fetch)(vcpu, addr, &walker, write_fault,
- level, pfn, map_writable, prefault);
+ level, pfn, map_writable, prefault, lpage_disallowed);
kvm_mmu_audit(vcpu, AUDIT_POST_PAGE_FAULT);
out_unlock:
if (!pi_test_sn(pi_desc) && vcpu->cpu == cpu)
return;
+ /*
+ * If the 'nv' field is POSTED_INTR_WAKEUP_VECTOR, do not change
+ * PI.NDST: pi_post_block is the one expected to change PID.NDST and the
+ * wakeup handler expects the vCPU to be on the blocked_vcpu_list that
+ * matches PI.NDST. Otherwise, a vcpu may not be able to be woken up
+ * correctly.
+ */
+ if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR || vcpu->cpu == cpu) {
+ pi_clear_sn(pi_desc);
+ goto after_clear_sn;
+ }
+
/* The full case. */
do {
old.control = new.control = pi_desc->control;
} while (cmpxchg64(&pi_desc->control, old.control,
new.control) != old.control);
+after_clear_sn:
+
/*
* Clear SN before reading the bitmap. The VT-d firmware
* writes the bitmap and reads SN atomically (5.2.3 in the
*/
smp_mb__after_atomic();
- if (!bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS))
+ if (!pi_is_pir_empty(pi_desc))
pi_set_on(pi_desc);
}
if (pi_test_on(&vmx->pi_desc)) {
pi_clear_on(&vmx->pi_desc);
/*
- * IOMMU can write to PIR.ON, so the barrier matters even on UP.
+ * IOMMU can write to PID.ON, so the barrier matters even on UP.
* But on x86 this is just a compiler barrier anyway.
*/
smp_mb__after_atomic();
static bool vmx_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu)
{
- return pi_test_on(vcpu_to_pi_desc(vcpu));
+ struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu);
+
+ return pi_test_on(pi_desc) ||
+ (pi_test_sn(pi_desc) && !pi_is_pir_empty(pi_desc));
}
static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
}
+static inline bool pi_is_pir_empty(struct pi_desc *pi_desc)
+{
+ return bitmap_empty((unsigned long *)pi_desc->pir, NR_VECTORS);
+}
+
static inline void pi_set_sn(struct pi_desc *pi_desc)
{
set_bit(POSTED_INTR_SN,
(unsigned long *)&pi_desc->control);
}
+static inline void pi_clear_sn(struct pi_desc *pi_desc)
+{
+ clear_bit(POSTED_INTR_SN,
+ (unsigned long *)&pi_desc->control);
+}
+
static inline int pi_test_on(struct pi_desc *pi_desc)
{
return test_bit(POSTED_INTR_ON,
{ "mmu_unsync", VM_STAT(mmu_unsync) },
{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
{ "largepages", VM_STAT(lpages, .mode = 0444) },
+ { "nx_largepages_splitted", VM_STAT(nx_lpage_splits, .mode = 0444) },
{ "max_mmu_page_hash_collisions",
VM_STAT(max_mmu_page_hash_collisions) },
{ NULL }
* List of msr numbers which we expose to userspace through KVM_GET_MSRS
* and KVM_SET_MSRS, and KVM_GET_MSR_INDEX_LIST.
*
- * This list is modified at module load time to reflect the
+ * The three MSR lists(msrs_to_save, emulated_msrs, msr_based_features)
+ * extract the supported MSRs from the related const lists.
+ * msrs_to_save is selected from the msrs_to_save_all to reflect the
* capabilities of the host cpu. This capabilities test skips MSRs that are
- * kvm-specific. Those are put in emulated_msrs; filtering of emulated_msrs
+ * kvm-specific. Those are put in emulated_msrs_all; filtering of emulated_msrs
* may depend on host virtualization features rather than host cpu features.
*/
-static u32 msrs_to_save[] = {
+static const u32 msrs_to_save_all[] = {
MSR_IA32_SYSENTER_CS, MSR_IA32_SYSENTER_ESP, MSR_IA32_SYSENTER_EIP,
MSR_STAR,
#ifdef CONFIG_X86_64
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
};
+static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
static unsigned num_msrs_to_save;
-static u32 emulated_msrs[] = {
+static const u32 emulated_msrs_all[] = {
MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
* by arch/x86/kvm/vmx/nested.c based on CPUID or other MSRs.
* We always support the "true" VMX control MSRs, even if the host
* processor does not, so I am putting these registers here rather
- * than in msrs_to_save.
+ * than in msrs_to_save_all.
*/
MSR_IA32_VMX_BASIC,
MSR_IA32_VMX_TRUE_PINBASED_CTLS,
MSR_KVM_POLL_CONTROL,
};
+static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
static unsigned num_emulated_msrs;
/*
* List of msr numbers which are used to expose MSR-based features that
* can be used by a hypervisor to validate requested CPU features.
*/
-static u32 msr_based_features[] = {
+static const u32 msr_based_features_all[] = {
MSR_IA32_VMX_BASIC,
MSR_IA32_VMX_TRUE_PINBASED_CTLS,
MSR_IA32_VMX_PINBASED_CTLS,
MSR_IA32_ARCH_CAPABILITIES,
};
+static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
static unsigned int num_msr_based_features;
static u64 kvm_get_arch_capabilities(void)
rdmsrl(MSR_IA32_ARCH_CAPABILITIES, data);
/*
+ * If nx_huge_pages is enabled, KVM's shadow paging will ensure that
+ * the nested hypervisor runs with NX huge pages. If it is not,
+ * L1 is anyway vulnerable to ITLB_MULTIHIT explots from other
+ * L1 guests, so it need not worry about its own (L2) guests.
+ */
+ data |= ARCH_CAP_PSCHANGE_MC_NO;
+
+ /*
* If we're doing cache flushes (either "always" or "cond")
* we will do one whenever the guest does a vmlaunch/vmresume.
* If an outer hypervisor is doing the cache flush for us
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
+ /*
+ * On TAA affected systems, export MDS_NO=0 when:
+ * - TSX is enabled on the host, i.e. X86_FEATURE_RTM=1.
+ * - Updated microcode is present. This is detected by
+ * the presence of ARCH_CAP_TSX_CTRL_MSR and ensures
+ * that VERW clears CPU buffers.
+ *
+ * When MDS_NO=0 is exported, guests deploy clear CPU buffer
+ * mitigation and don't complain:
+ *
+ * "Vulnerable: Clear CPU buffers attempted, no microcode"
+ *
+ * If TSX is disabled on the system, guests are also mitigated against
+ * TAA and clear CPU buffer mitigation is not required for guests.
+ */
+ if (boot_cpu_has_bug(X86_BUG_TAA) && boot_cpu_has(X86_FEATURE_RTM) &&
+ (data & ARCH_CAP_TSX_CTRL_MSR))
+ data &= ~ARCH_CAP_MDS_NO;
+
return data;
}
{
struct x86_pmu_capability x86_pmu;
u32 dummy[2];
- unsigned i, j;
+ unsigned i;
BUILD_BUG_ON_MSG(INTEL_PMC_MAX_FIXED != 4,
- "Please update the fixed PMCs in msrs_to_save[]");
+ "Please update the fixed PMCs in msrs_to_saved_all[]");
perf_get_x86_pmu_capability(&x86_pmu);
- for (i = j = 0; i < ARRAY_SIZE(msrs_to_save); i++) {
- if (rdmsr_safe(msrs_to_save[i], &dummy[0], &dummy[1]) < 0)
+ for (i = 0; i < ARRAY_SIZE(msrs_to_save_all); i++) {
+ if (rdmsr_safe(msrs_to_save_all[i], &dummy[0], &dummy[1]) < 0)
continue;
/*
* Even MSRs that are valid in the host may not be exposed
* to the guests in some cases.
*/
- switch (msrs_to_save[i]) {
+ switch (msrs_to_save_all[i]) {
case MSR_IA32_BNDCFGS:
if (!kvm_mpx_supported())
continue;
break;
case MSR_IA32_RTIT_ADDR0_A ... MSR_IA32_RTIT_ADDR3_B: {
if (!kvm_x86_ops->pt_supported() ||
- msrs_to_save[i] - MSR_IA32_RTIT_ADDR0_A >=
+ msrs_to_save_all[i] - MSR_IA32_RTIT_ADDR0_A >=
intel_pt_validate_hw_cap(PT_CAP_num_address_ranges) * 2)
continue;
break;
case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17:
- if (msrs_to_save[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
+ if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
continue;
break;
case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17:
- if (msrs_to_save[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
+ if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
min(INTEL_PMC_MAX_GENERIC, x86_pmu.num_counters_gp))
continue;
}
break;
}
- if (j < i)
- msrs_to_save[j] = msrs_to_save[i];
- j++;
+ msrs_to_save[num_msrs_to_save++] = msrs_to_save_all[i];
}
- num_msrs_to_save = j;
- for (i = j = 0; i < ARRAY_SIZE(emulated_msrs); i++) {
- if (!kvm_x86_ops->has_emulated_msr(emulated_msrs[i]))
+ for (i = 0; i < ARRAY_SIZE(emulated_msrs_all); i++) {
+ if (!kvm_x86_ops->has_emulated_msr(emulated_msrs_all[i]))
continue;
- if (j < i)
- emulated_msrs[j] = emulated_msrs[i];
- j++;
+ emulated_msrs[num_emulated_msrs++] = emulated_msrs_all[i];
}
- num_emulated_msrs = j;
- for (i = j = 0; i < ARRAY_SIZE(msr_based_features); i++) {
+ for (i = 0; i < ARRAY_SIZE(msr_based_features_all); i++) {
struct kvm_msr_entry msr;
- msr.index = msr_based_features[i];
+ msr.index = msr_based_features_all[i];
if (kvm_get_msr_feature(&msr))
continue;
- if (j < i)
- msr_based_features[j] = msr_based_features[i];
- j++;
+ msr_based_features[num_msr_based_features++] = msr_based_features_all[i];
}
- num_msr_based_features = j;
}
static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
INIT_HLIST_HEAD(&kvm->arch.mask_notifier_list);
INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
+ INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);
atomic_set(&kvm->arch.noncoherent_dma_count, 0);
return kvm_x86_ops->vm_init(kvm);
}
+int kvm_arch_post_init_vm(struct kvm *kvm)
+{
+ return kvm_mmu_post_init_vm(kvm);
+}
+
static void kvm_unload_vcpu_mmu(struct kvm_vcpu *vcpu)
{
vcpu_load(vcpu);
}
EXPORT_SYMBOL_GPL(x86_set_memory_region);
+void kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+ kvm_mmu_pre_destroy_vm(kvm);
+}
+
void kvm_arch_destroy_vm(struct kvm *kvm)
{
if (current->mm == kvm->mm) {
int i;
bool has_stats = false;
+ spin_lock_irq(&blkg->q->queue_lock);
+
+ if (!blkg->online)
+ goto skip;
+
dname = blkg_dev_name(blkg);
if (!dname)
- continue;
+ goto skip;
/*
* Hooray string manipulation, count is the size written NOT
*/
off += scnprintf(buf+off, size-off, "%s ", dname);
- spin_lock_irq(&blkg->q->queue_lock);
-
blkg_rwstat_recursive_sum(blkg, NULL,
offsetof(struct blkcg_gq, stat_bytes), &rwstat);
rbytes = rwstat.cnt[BLKG_RWSTAT_READ];
wios = rwstat.cnt[BLKG_RWSTAT_WRITE];
dios = rwstat.cnt[BLKG_RWSTAT_DISCARD];
- spin_unlock_irq(&blkg->q->queue_lock);
-
if (rbytes || wbytes || rios || wios) {
has_stats = true;
off += scnprintf(buf+off, size-off,
seq_commit(sf, -1);
}
}
+ skip:
+ spin_unlock_irq(&blkg->q->queue_lock);
}
rcu_read_unlock();
return sprintf(buf, "Not affected\n");
}
+ssize_t __weak cpu_show_tsx_async_abort(struct device *dev,
+ struct device_attribute *attr,
+ char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
+ssize_t __weak cpu_show_itlb_multihit(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sprintf(buf, "Not affected\n");
+}
+
static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
static DEVICE_ATTR(spec_store_bypass, 0444, cpu_show_spec_store_bypass, NULL);
static DEVICE_ATTR(l1tf, 0444, cpu_show_l1tf, NULL);
static DEVICE_ATTR(mds, 0444, cpu_show_mds, NULL);
+static DEVICE_ATTR(tsx_async_abort, 0444, cpu_show_tsx_async_abort, NULL);
+static DEVICE_ATTR(itlb_multihit, 0444, cpu_show_itlb_multihit, NULL);
static struct attribute *cpu_root_vulnerabilities_attrs[] = {
&dev_attr_meltdown.attr,
&dev_attr_spec_store_bypass.attr,
&dev_attr_l1tf.attr,
&dev_attr_mds.attr,
+ &dev_attr_tsx_async_abort.attr,
+ &dev_attr_itlb_multihit.attr,
NULL
};
if (nc->tentative && connection->agreed_pro_version < 92) {
rcu_read_unlock();
- mutex_unlock(&sock->mutex);
drbd_err(connection, "--dry-run is not supported by peer");
return -EOPNOTSUPP;
}
regmap_read(regmap, AT91_CKGR_MCFR, &mcfr);
if (mcfr & AT91_PMC_MAINRDY)
return 0;
- usleep_range(MAINF_LOOP_MIN_WAIT, MAINF_LOOP_MAX_WAIT);
+ if (system_state < SYSTEM_RUNNING)
+ udelay(MAINF_LOOP_MIN_WAIT);
+ else
+ usleep_range(MAINF_LOOP_MIN_WAIT, MAINF_LOOP_MAX_WAIT);
} while (time_before(prep_time, timeout));
return -ETIMEDOUT;
};
static const struct clk_programmable_layout sam9x60_programmable_layout = {
+ .pres_mask = 0xff,
.pres_shift = 8,
.css_mask = 0x1f,
.have_slck_mck = 0,
writel(tmp | osc->bits->cr_osc32en, sckcr);
- usleep_range(osc->startup_usec, osc->startup_usec + 1);
+ if (system_state < SYSTEM_RUNNING)
+ udelay(osc->startup_usec);
+ else
+ usleep_range(osc->startup_usec, osc->startup_usec + 1);
return 0;
}
writel(readl(sckcr) | osc->bits->cr_rcen, sckcr);
- usleep_range(osc->startup_usec, osc->startup_usec + 1);
+ if (system_state < SYSTEM_RUNNING)
+ udelay(osc->startup_usec);
+ else
+ usleep_range(osc->startup_usec, osc->startup_usec + 1);
return 0;
}
writel(tmp, sckcr);
- usleep_range(SLOWCK_SW_TIME_USEC, SLOWCK_SW_TIME_USEC + 1);
+ if (system_state < SYSTEM_RUNNING)
+ udelay(SLOWCK_SW_TIME_USEC);
+ else
+ usleep_range(SLOWCK_SW_TIME_USEC, SLOWCK_SW_TIME_USEC + 1);
return 0;
}
return 0;
}
- usleep_range(osc->startup_usec, osc->startup_usec + 1);
+ if (system_state < SYSTEM_RUNNING)
+ udelay(osc->startup_usec);
+ else
+ usleep_range(osc->startup_usec, osc->startup_usec + 1);
osc->prepared = true;
return 0;
/* Enable clock */
if (gate->flags & CLK_GATE_SET_TO_DISABLE) {
- regmap_write(gate->map, get_clock_reg(gate), clk);
- } else {
- /* Use set to clear register */
+ /* Clock is clear to enable, so use set to clear register */
regmap_write(gate->map, get_clock_reg(gate) + 0x04, clk);
+ } else {
+ /* Clock is set to enable, so use write to set register */
+ regmap_write(gate->map, get_clock_reg(gate), clk);
}
if (gate->reset_idx >= 0) {
clks[IMX8MM_CLK_A53_DIV],
clks[IMX8MM_CLK_A53_SRC],
clks[IMX8MM_ARM_PLL_OUT],
- clks[IMX8MM_CLK_24M]);
+ clks[IMX8MM_SYS_PLL1_800M]);
imx_check_clocks(clks, ARRAY_SIZE(clks));
clks[IMX8MN_CLK_A53_DIV],
clks[IMX8MN_CLK_A53_SRC],
clks[IMX8MN_ARM_PLL_OUT],
- clks[IMX8MN_CLK_24M]);
+ clks[IMX8MN_SYS_PLL1_800M]);
imx_check_clocks(clks, ARRAY_SIZE(clks));
.offset = HHI_SYS_CPU_CLK_CNTL0,
.mask = 0x3,
.shift = 0,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpu_clk_dyn0_sel",
{ .hw = &g12a_fclk_div3.hw },
},
.num_parents = 3,
- /* This sub-tree is used a parking clock */
- .flags = CLK_SET_RATE_NO_REPARENT,
+ .flags = CLK_SET_RATE_PARENT,
},
};
.offset = HHI_SYS_CPU_CLK_CNTL0,
.mask = 0x1,
.shift = 2,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpu_clk_dyn0",
.offset = HHI_SYS_CPU_CLK_CNTL0,
.mask = 0x1,
.shift = 10,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpu_clk_dyn",
.offset = HHI_SYS_CPU_CLK_CNTL0,
.mask = 0x1,
.shift = 11,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpu_clk",
.offset = HHI_SYS_CPU_CLK_CNTL0,
.mask = 0x1,
.shift = 11,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpu_clk",
.offset = HHI_SYS_CPUB_CLK_CNTL,
.mask = 0x3,
.shift = 0,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpub_clk_dyn0_sel",
{ .hw = &g12a_fclk_div3.hw },
},
.num_parents = 3,
+ .flags = CLK_SET_RATE_PARENT,
},
};
.offset = HHI_SYS_CPUB_CLK_CNTL,
.mask = 0x1,
.shift = 2,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpub_clk_dyn0",
.offset = HHI_SYS_CPUB_CLK_CNTL,
.mask = 0x1,
.shift = 10,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpub_clk_dyn",
.offset = HHI_SYS_CPUB_CLK_CNTL,
.mask = 0x1,
.shift = 11,
+ .flags = CLK_MUX_ROUND_CLOSEST,
},
.hw.init = &(struct clk_init_data){
.name = "cpub_clk",
&gxbb_sar_adc_clk_sel.hw
},
.num_parents = 1,
+ .flags = CLK_SET_RATE_PARENT,
},
};
GATE_BUS_CPU,
GATE_SCLK_CPU,
CLKOUT_CMU_CPU,
+ CPLL_CON0,
+ DPLL_CON0,
EPLL_CON0,
EPLL_CON1,
EPLL_CON2,
RPLL_CON0,
RPLL_CON1,
RPLL_CON2,
+ IPLL_CON0,
+ SPLL_CON0,
+ VPLL_CON0,
+ MPLL_CON0,
SRC_TOP0,
SRC_TOP1,
SRC_TOP2,
GATE(CLK_SCLK_ISP_SENSOR2, "sclk_isp_sensor2", "dout_isp_sensor2",
GATE_TOP_SCLK_ISP, 12, CLK_SET_RATE_PARENT, 0),
- GATE(CLK_G3D, "g3d", "mout_user_aclk_g3d", GATE_IP_G3D, 9, 0, 0),
-
/* CDREX */
GATE(CLK_CLKM_PHY0, "clkm_phy0", "dout_sclk_cdrex",
GATE_BUS_CDREX0, 0, 0, 0),
{ DIV2_RATIO0, 0, 0x30 }, /* DIV dout_gscl_blk_300 */
};
+static const struct samsung_gate_clock exynos5x_g3d_gate_clks[] __initconst = {
+ GATE(CLK_G3D, "g3d", "mout_user_aclk_g3d", GATE_IP_G3D, 9, 0, 0),
+};
+
+static struct exynos5_subcmu_reg_dump exynos5x_g3d_suspend_regs[] = {
+ { GATE_IP_G3D, 0x3ff, 0x3ff }, /* G3D gates */
+ { SRC_TOP5, 0, BIT(16) }, /* MUX mout_user_aclk_g3d */
+};
+
static const struct samsung_div_clock exynos5x_mfc_div_clks[] __initconst = {
DIV(0, "dout_mfc_blk", "mout_user_aclk333", DIV4_RATIO, 0, 2),
};
.pd_name = "GSC",
};
+static const struct exynos5_subcmu_info exynos5x_g3d_subcmu = {
+ .gate_clks = exynos5x_g3d_gate_clks,
+ .nr_gate_clks = ARRAY_SIZE(exynos5x_g3d_gate_clks),
+ .suspend_regs = exynos5x_g3d_suspend_regs,
+ .nr_suspend_regs = ARRAY_SIZE(exynos5x_g3d_suspend_regs),
+ .pd_name = "G3D",
+};
+
static const struct exynos5_subcmu_info exynos5x_mfc_subcmu = {
.div_clks = exynos5x_mfc_div_clks,
.nr_div_clks = ARRAY_SIZE(exynos5x_mfc_div_clks),
static const struct exynos5_subcmu_info *exynos5x_subcmus[] = {
&exynos5x_disp_subcmu,
&exynos5x_gsc_subcmu,
+ &exynos5x_g3d_subcmu,
&exynos5x_mfc_subcmu,
&exynos5x_mscl_subcmu,
};
static const struct exynos5_subcmu_info *exynos5800_subcmus[] = {
&exynos5x_disp_subcmu,
&exynos5x_gsc_subcmu,
+ &exynos5x_g3d_subcmu,
&exynos5x_mfc_subcmu,
&exynos5x_mscl_subcmu,
&exynos5800_mau_subcmu,
#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
+#include <linux/slab.h>
#include <dt-bindings/clock/exynos5433.h>
data->clk_save = samsung_clk_alloc_reg_dump(info->clk_regs,
info->nr_clk_regs);
+ if (!data->clk_save)
+ return -ENOMEM;
data->nr_clk_save = info->nr_clk_regs;
data->clk_suspend = info->suspend_regs;
data->nr_clk_suspend = info->nr_suspend_regs;
if (data->nr_pclks > 0) {
data->pclks = devm_kcalloc(dev, sizeof(struct clk *),
data->nr_pclks, GFP_KERNEL);
-
+ if (!data->pclks) {
+ kfree(data->clk_save);
+ return -ENOMEM;
+ }
for (i = 0; i < data->nr_pclks; i++) {
struct clk *clk = of_clk_get(dev->of_node, i);
- if (IS_ERR(clk))
+ if (IS_ERR(clk)) {
+ kfree(data->clk_save);
+ while (--i >= 0)
+ clk_put(data->pclks[i]);
return PTR_ERR(clk);
+ }
data->pclks[i] = clk;
}
}
/* Enforce d1 = 0, d2 = 0 for Audio PLL */
val = readl(reg + SUN9I_A80_PLL_AUDIO_REG);
- val &= (BIT(16) & BIT(18));
+ val &= ~(BIT(16) | BIT(18));
writel(val, reg + SUN9I_A80_PLL_AUDIO_REG);
/* Enforce P = 1 for both CPU cluster PLLs */
rate_hw, rate_ops,
gate_hw, &clk_gate_ops,
clkflags |
- data->div[i].critical ?
- CLK_IS_CRITICAL : 0);
+ (data->div[i].critical ?
+ CLK_IS_CRITICAL : 0));
WARN_ON(IS_ERR(clk_data->clks[i]));
}
struct clk_init_data init = { NULL };
const char **parent_names = NULL;
struct clk *clk;
- int ret;
clk_hw = kzalloc(sizeof(*clk_hw), GFP_KERNEL);
if (!clk_hw) {
clk = ti_clk_register(NULL, &clk_hw->hw, node->name);
if (!IS_ERR(clk)) {
- ret = ti_clk_add_alias(NULL, clk, node->name);
- if (ret) {
- clk_unregister(clk);
- goto cleanup;
- }
of_clk_add_provider(node, of_clk_src_simple_get, clk);
kfree(parent_names);
return;
* can be from a timer that requires pm_runtime access, which
* will eventually bring us here with timekeeping_suspended,
* during both suspend entry and resume paths. This happens
- * at least on am43xx platform.
+ * at least on am43xx platform. Account for flakeyness
+ * with udelay() by multiplying the timeout value by 2.
*/
if (unlikely(_early_timeout || timekeeping_suspended)) {
if (time->cycles++ < timeout) {
- udelay(1);
+ udelay(1 * 2);
return false;
}
} else {
return 0;
}
+static const unsigned int sh_mtu2_channel_offsets[] = {
+ 0x300, 0x380, 0x000,
+};
+
static int sh_mtu2_setup_channel(struct sh_mtu2_channel *ch, unsigned int index,
struct sh_mtu2_device *mtu)
{
- static const unsigned int channel_offsets[] = {
- 0x300, 0x380, 0x000,
- };
char name[6];
int irq;
int ret;
return ret;
}
- ch->base = mtu->mapbase + channel_offsets[index];
+ ch->base = mtu->mapbase + sh_mtu2_channel_offsets[index];
ch->index = index;
return sh_mtu2_register(ch, dev_name(&mtu->pdev->dev));
}
/* Allocate and setup the channels. */
- mtu->num_channels = 3;
+ ret = platform_irq_count(pdev);
+ if (ret < 0)
+ goto err_unmap;
+
+ mtu->num_channels = min_t(unsigned int, ret,
+ ARRAY_SIZE(sh_mtu2_channel_offsets));
mtu->channels = kcalloc(mtu->num_channels, sizeof(*mtu->channels),
GFP_KERNEL);
ret = timer_of_init(node, &to);
if (ret)
- goto err;
+ return ret;
clockevents_config_and_register(&to.clkevt, timer_of_rate(&to),
TIMER_SYNC_TICKS, 0xffffffff);
return 0;
-err:
- timer_of_cleanup(&to);
- return ret;
}
static int __init mtk_gpt_init(struct device_node *node)
ret = timer_of_init(node, &to);
if (ret)
- goto err;
+ return ret;
/* Configure clock source */
mtk_gpt_setup(&to, TIMER_CLK_SRC, GPT_CTRL_OP_FREERUN);
mtk_gpt_enable_irq(&to, TIMER_CLK_EVT);
return 0;
-err:
- timer_of_cleanup(&to);
- return ret;
}
TIMER_OF_DECLARE(mtk_mt6577, "mediatek,mt6577-timer", mtk_gpt_init);
TIMER_OF_DECLARE(mtk_mt6765, "mediatek,mt6765-timer", mtk_syst_init);
value |= HWP_MAX_PERF(min_perf);
value |= HWP_MIN_PERF(min_perf);
- /* Set EPP/EPB to min */
+ /* Set EPP to min */
if (boot_cpu_has(X86_FEATURE_HWP_EPP))
value |= HWP_ENERGY_PERF_PREFERENCE(HWP_EPP_POWERSAVE);
- else
- intel_pstate_set_epb(cpu, HWP_EPP_BALANCE_POWERSAVE);
wrmsrl_on_cpu(cpu, MSR_HWP_REQUEST, value);
}
chained_irq_exit(irqchip, desc);
}
-static int mrfld_irq_init_hw(struct gpio_chip *chip)
+static void mrfld_irq_init_hw(struct mrfld_gpio *priv)
{
- struct mrfld_gpio *priv = gpiochip_get_data(chip);
void __iomem *reg;
unsigned int base;
reg = gpio_reg(&priv->chip, base, GFER);
writel(0, reg);
}
-
- return 0;
}
static const char *mrfld_gpio_get_pinctrl_dev_name(struct mrfld_gpio *priv)
{
const struct mrfld_gpio_pinrange *range;
const char *pinctrl_dev_name;
- struct gpio_irq_chip *girq;
struct mrfld_gpio *priv;
u32 gpio_base, irq_base;
void __iomem *base;
raw_spin_lock_init(&priv->lock);
- girq = &priv->chip.irq;
- girq->chip = &mrfld_irqchip;
- girq->init_hw = mrfld_irq_init_hw;
- girq->parent_handler = mrfld_irq_handler;
- girq->num_parents = 1;
- girq->parents = devm_kcalloc(&pdev->dev, girq->num_parents,
- sizeof(*girq->parents),
- GFP_KERNEL);
- if (!girq->parents)
- return -ENOMEM;
- girq->parents[0] = pdev->irq;
- girq->first = irq_base;
- girq->default_type = IRQ_TYPE_NONE;
- girq->handler = handle_bad_irq;
-
pci_set_drvdata(pdev, priv);
retval = devm_gpiochip_add_data(&pdev->dev, &priv->chip, priv);
if (retval) {
}
}
+ retval = gpiochip_irqchip_add(&priv->chip, &mrfld_irqchip, irq_base,
+ handle_bad_irq, IRQ_TYPE_NONE);
+ if (retval) {
+ dev_err(&pdev->dev, "could not connect irqchip to gpiochip\n");
+ return retval;
+ }
+
+ mrfld_irq_init_hw(priv);
+
+ gpiochip_set_chained_irqchip(&priv->chip, &mrfld_irqchip, pdev->irq,
+ mrfld_irq_handler);
+
return 0;
}
continue;
}
- for (i = 0; i < num_entities; i++)
+ for (i = 0; i < num_entities; i++) {
+ mutex_lock(&ctx->adev->lock_reset);
drm_sched_entity_fini(&ctx->entities[0][i].entity);
+ mutex_unlock(&ctx->adev->lock_reset);
+ }
}
}
DRM_INFO("amdgpu: acceleration disabled, skipping benchmarks\n");
}
+ /*
+ * Register gpu instance before amdgpu_device_enable_mgpu_fan_boost.
+ * Otherwise the mgpu fan boost feature will be skipped due to the
+ * gpu instance is counted less.
+ */
+ amdgpu_register_gpu_instance(adev);
+
/* enable clockgating, etc. after ib tests, etc. since some blocks require
* explicit gating rather than handling it automatically.
*/
{0x1002, 0x7340, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
{0x1002, 0x7341, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
{0x1002, 0x7347, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
+ {0x1002, 0x734F, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_NAVI14|AMD_EXP_HW_SUPPORT},
/* Renoir */
{0x1002, 0x1636, PCI_ANY_ID, PCI_ANY_ID, 0, 0, CHIP_RENOIR|AMD_IS_APU|AMD_EXP_HW_SUPPORT},
uint32_t mec2_feature_version;
bool mec_fw_write_wait;
bool me_fw_write_wait;
+ bool cp_fw_write_wait;
struct amdgpu_ring gfx_ring[AMDGPU_MAX_GFX_RINGS];
unsigned num_gfx_rings;
struct amdgpu_ring compute_ring[AMDGPU_MAX_COMPUTE_RINGS];
pm_runtime_put_autosuspend(dev->dev);
}
- amdgpu_register_gpu_instance(adev);
out:
if (r) {
/* balance pm_runtime_get_sync in amdgpu_driver_unload_kms */
kfree(adev->gfx.rlc.register_list_format);
}
+static void gfx_v10_0_check_fw_write_wait(struct amdgpu_device *adev)
+{
+ adev->gfx.cp_fw_write_wait = false;
+
+ switch (adev->asic_type) {
+ case CHIP_NAVI10:
+ case CHIP_NAVI12:
+ case CHIP_NAVI14:
+ if ((adev->gfx.me_fw_version >= 0x00000046) &&
+ (adev->gfx.me_feature_version >= 27) &&
+ (adev->gfx.pfp_fw_version >= 0x00000068) &&
+ (adev->gfx.pfp_feature_version >= 27) &&
+ (adev->gfx.mec_fw_version >= 0x0000005b) &&
+ (adev->gfx.mec_feature_version >= 27))
+ adev->gfx.cp_fw_write_wait = true;
+ break;
+ default:
+ break;
+ }
+
+ if (adev->gfx.cp_fw_write_wait == false)
+ DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+}
+
+
static void gfx_v10_0_init_rlc_ext_microcode(struct amdgpu_device *adev)
{
const struct rlc_firmware_header_v2_1 *rlc_hdr;
}
}
+ gfx_v10_0_check_fw_write_wait(adev);
out:
if (err) {
dev_err(adev->dev,
gfx_v10_0_wait_reg_mem(ring, 0, 0, 0, reg, 0, val, mask, 0x20);
}
+static void gfx_v10_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+ int usepfp = (ring->funcs->type == AMDGPU_RING_TYPE_GFX);
+ struct amdgpu_device *adev = ring->adev;
+ bool fw_version_ok = false;
+
+ fw_version_ok = adev->gfx.cp_fw_write_wait;
+
+ if (fw_version_ok)
+ gfx_v10_0_wait_reg_mem(ring, usepfp, 0, 1, reg0, reg1,
+ ref, mask, 0x20);
+ else
+ amdgpu_ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1,
+ ref, mask);
+}
+
static void
gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
uint32_t me, uint32_t pipe,
.emit_tmz = gfx_v10_0_ring_emit_tmz,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+ .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
};
static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
.pad_ib = amdgpu_ring_generic_pad_ib,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+ .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
};
static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_kiq = {
.emit_rreg = gfx_v10_0_ring_emit_rreg,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
+ .emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
};
static void gfx_v10_0_set_ring_funcs(struct amdgpu_device *adev)
adev->gfx.me_fw_write_wait = false;
adev->gfx.mec_fw_write_wait = false;
+ if ((adev->gfx.mec_fw_version < 0x000001a5) ||
+ (adev->gfx.mec_feature_version < 46) ||
+ (adev->gfx.pfp_fw_version < 0x000000b7) ||
+ (adev->gfx.pfp_feature_version < 46))
+ DRM_WARN_ONCE("Warning: check cp_fw_version and update it to realize \
+ GRBM requires 1-cycle delay in cp firmware\n");
+
switch (adev->asic_type) {
case CHIP_VEGA10:
if ((adev->gfx.me_fw_version >= 0x0000009c) &&
AMD_PG_SUPPORT_CP |
AMD_PG_SUPPORT_RLC_SMU_HS;
break;
+ case CHIP_RENOIR:
+ if (adev->pm.pp_feature & PP_GFXOFF_MASK)
+ adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
+ AMD_PG_SUPPORT_CP |
+ AMD_PG_SUPPORT_RLC_SMU_HS;
+ break;
default:
break;
}
amdgpu_ring_emit_wreg(ring, hub->ctx0_ptb_addr_hi32 + (2 * vmid),
upper_32_bits(pd_addr));
- amdgpu_ring_emit_wreg(ring, hub->vm_inv_eng0_req + eng, req);
-
- /* wait for the invalidate to complete */
- amdgpu_ring_emit_reg_wait(ring, hub->vm_inv_eng0_ack + eng,
- 1 << vmid, 1 << vmid);
+ amdgpu_ring_emit_reg_write_reg_wait(ring, hub->vm_inv_eng0_req + eng,
+ hub->vm_inv_eng0_ack + eng,
+ req, 1 << vmid);
return pd_addr;
}
hubid * MMHUB_INSTANCE_REGISTER_OFFSET, tmp);
tmp = mmVML2PF0_VM_L2_CNTL3_DEFAULT;
+ if (adev->gmc.translate_further) {
+ tmp = REG_SET_FIELD(tmp, VML2PF0_VM_L2_CNTL3, BANK_SELECT, 12);
+ tmp = REG_SET_FIELD(tmp, VML2PF0_VM_L2_CNTL3,
+ L2_CACHE_BIGK_FRAGMENT_SIZE, 9);
+ } else {
+ tmp = REG_SET_FIELD(tmp, VML2PF0_VM_L2_CNTL3, BANK_SELECT, 9);
+ tmp = REG_SET_FIELD(tmp, VML2PF0_VM_L2_CNTL3,
+ L2_CACHE_BIGK_FRAGMENT_SIZE, 6);
+ }
WREG32_SOC15_OFFSET(MMHUB, 0, mmVML2PF0_VM_L2_CNTL3,
hubid * MMHUB_INSTANCE_REGISTER_OFFSET, tmp);
SDMA_PKT_POLL_REGMEM_DW5_INTERVAL(10));
}
+static void sdma_v5_0_ring_emit_reg_write_reg_wait(struct amdgpu_ring *ring,
+ uint32_t reg0, uint32_t reg1,
+ uint32_t ref, uint32_t mask)
+{
+ amdgpu_ring_emit_wreg(ring, reg0, ref);
+ /* wait for a cycle to reset vm_inv_eng*_ack */
+ amdgpu_ring_emit_reg_wait(ring, reg0, 0, 0);
+ amdgpu_ring_emit_reg_wait(ring, reg1, mask, mask);
+}
+
static int sdma_v5_0_early_init(void *handle)
{
struct amdgpu_device *adev = (struct amdgpu_device *)handle;
6 + /* sdma_v5_0_ring_emit_pipeline_sync */
/* sdma_v5_0_ring_emit_vm_flush */
SOC15_FLUSH_GPU_TLB_NUM_WREG * 3 +
- SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 6 +
+ SOC15_FLUSH_GPU_TLB_NUM_REG_WAIT * 6 * 2 +
10 + 10 + 10, /* sdma_v5_0_ring_emit_fence x3 for user fence, vm fence */
.emit_ib_size = 7 + 6, /* sdma_v5_0_ring_emit_ib */
.emit_ib = sdma_v5_0_ring_emit_ib,
.pad_ib = sdma_v5_0_ring_pad_ib,
.emit_wreg = sdma_v5_0_ring_emit_wreg,
.emit_reg_wait = sdma_v5_0_ring_emit_reg_wait,
+ .emit_reg_write_reg_wait = sdma_v5_0_ring_emit_reg_write_reg_wait,
.init_cond_exec = sdma_v5_0_ring_init_cond_exec,
.patch_cond_exec = sdma_v5_0_ring_patch_cond_exec,
.preempt_ib = sdma_v5_0_ring_preempt_ib,
AMD_PG_SUPPORT_VCN |
AMD_PG_SUPPORT_VCN_DPG;
adev->external_rev_id = adev->rev_id + 0x91;
-
- if (adev->pm.pp_feature & PP_GFXOFF_MASK)
- adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
- AMD_PG_SUPPORT_CP |
- AMD_PG_SUPPORT_RLC_SMU_HS;
break;
default:
/* FIXME: not supported yet */
CONTROLLER_DP_TEST_PATTERN_VIDEOMODE,
COLOR_DEPTH_UNDEFINED);
- /* This second call is needed to reconfigure the DIG
- * as a workaround for the incorrect value being applied
- * from transmitter control.
- */
- if (!dc_is_virtual_signal(pipe_ctx->stream->signal))
- stream->link->link_enc->funcs->setup(
- stream->link->link_enc,
- pipe_ctx->stream->signal);
-
#ifdef CONFIG_DRM_AMD_DC_DSC_SUPPORT
if (pipe_ctx->stream->timing.flags.DSC) {
if (dc_is_dp_signal(pipe_ctx->stream->signal) ||
if (!enc1)
return NULL;
+ if (ASICREV_IS_NAVI14_M(ctx->asic_id.hw_internal_rev)) {
+ if (eng_id >= ENGINE_ID_DIGD)
+ eng_id++;
+ }
+
dcn20_stream_encoder_construct(enc1, ctx, ctx->dc_bios, eng_id,
&stream_enc_regs[eng_id],
&se_shift, &se_mask);
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, WORKLOAD_PPLIB_POWER_SAVING_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, WORKLOAD_PPLIB_VIDEO_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, WORKLOAD_PPLIB_VR_BIT),
- WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, WORKLOAD_PPLIB_CUSTOM_BIT),
+ WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, WORKLOAD_PPLIB_COMPUTE_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, WORKLOAD_PPLIB_CUSTOM_BIT),
};
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, WORKLOAD_PPLIB_POWER_SAVING_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, WORKLOAD_PPLIB_VIDEO_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, WORKLOAD_PPLIB_VR_BIT),
- WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, WORKLOAD_PPLIB_CUSTOM_BIT),
+ WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, WORKLOAD_PPLIB_COMPUTE_BIT),
WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, WORKLOAD_PPLIB_CUSTOM_BIT),
};
{
struct drm_device *dev = old_state->dev;
const struct drm_mode_config_helper_funcs *funcs;
+ struct drm_crtc_state *new_crtc_state;
+ struct drm_crtc *crtc;
ktime_t start;
s64 commit_time_ms;
+ unsigned int i, new_self_refresh_mask = 0;
funcs = dev->mode_config.helper_private;
drm_atomic_helper_wait_for_dependencies(old_state);
+ /*
+ * We cannot safely access new_crtc_state after
+ * drm_atomic_helper_commit_hw_done() so figure out which crtc's have
+ * self-refresh active beforehand:
+ */
+ for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i)
+ if (new_crtc_state->self_refresh_active)
+ new_self_refresh_mask |= BIT(i);
+
if (funcs && funcs->atomic_commit_tail)
funcs->atomic_commit_tail(old_state);
else
commit_time_ms = ktime_ms_delta(ktime_get(), start);
if (commit_time_ms > 0)
drm_self_refresh_helper_update_avg_times(old_state,
- (unsigned long)commit_time_ms);
+ (unsigned long)commit_time_ms,
+ new_self_refresh_mask);
drm_atomic_helper_commit_cleanup_done(old_state);
* drm_self_refresh_helper_update_avg_times - Updates a crtc's SR time averages
* @state: the state which has just been applied to hardware
* @commit_time_ms: the amount of time in ms that this commit took to complete
+ * @new_self_refresh_mask: bitmask of crtc's that have self_refresh_active in
+ * new state
*
* Called after &drm_mode_config_funcs.atomic_commit_tail, this function will
* update the average entry/exit self refresh times on self refresh transitions.
* These averages will be used when calculating how long to delay before
* entering self refresh mode after activity.
*/
-void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
- unsigned int commit_time_ms)
+void
+drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
+ unsigned int commit_time_ms,
+ unsigned int new_self_refresh_mask)
{
struct drm_crtc *crtc;
- struct drm_crtc_state *old_crtc_state, *new_crtc_state;
+ struct drm_crtc_state *old_crtc_state;
int i;
- for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state,
- new_crtc_state, i) {
+ for_each_old_crtc_in_state(state, crtc, old_crtc_state, i) {
+ bool new_self_refresh_active = new_self_refresh_mask & BIT(i);
struct drm_self_refresh_data *sr_data = crtc->self_refresh_data;
struct ewma_psr_time *time;
if (old_crtc_state->self_refresh_active ==
- new_crtc_state->self_refresh_active)
+ new_self_refresh_active)
continue;
- if (new_crtc_state->self_refresh_active)
+ if (new_self_refresh_active)
time = &sr_data->entry_avg_ms;
else
time = &sr_data->exit_avg_ms;
out:
intel_display_power_put(dev_priv, intel_encoder->power_domain, wakeref);
+
+ /*
+ * Make sure the refs for power wells enabled during detect are
+ * dropped to avoid a new detect cycle triggered by HPD polling.
+ */
+ intel_display_power_flush_work(dev_priv);
+
return status;
}
u32 unused)
{
struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
+ struct drm_i915_private *i915 =
+ to_i915(intel_dig_port->base.base.dev);
+ enum phy phy = intel_port_to_phy(i915, intel_dig_port->base.port);
u32 ret;
ret = DP_AUX_CH_CTL_SEND_BUSY |
DP_AUX_CH_CTL_FW_SYNC_PULSE_SKL(32) |
DP_AUX_CH_CTL_SYNC_PULSE_SKL(32);
- if (intel_dig_port->tc_mode == TC_PORT_TBT_ALT)
+ if (intel_phy_is_tc(i915, phy) &&
+ intel_dig_port->tc_mode == TC_PORT_TBT_ALT)
ret |= DP_AUX_CH_CTL_TBT_IO;
return ret;
if (status != connector_status_connected && !intel_dp->is_mst)
intel_dp_unset_edid(intel_dp);
+ /*
+ * Make sure the refs for power wells enabled during detect are
+ * dropped to avoid a new detect cycle triggered by HPD polling.
+ */
+ intel_display_power_flush_work(dev_priv);
+
return status;
}
if (status != connector_status_connected)
cec_notifier_phys_addr_invalidate(intel_hdmi->cec_notifier);
+ /*
+ * Make sure the refs for power wells enabled during detect are
+ * dropped to avoid a new detect cycle triggered by HPD polling.
+ */
+ intel_display_power_flush_work(dev_priv);
+
return status;
}
free_engines(rcu_access_pointer(ctx->engines));
mutex_destroy(&ctx->engines_mutex);
+ kfree(ctx->jump_whitelist);
+
if (ctx->timeline)
intel_timeline_put(ctx->timeline);
for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
+ ctx->jump_whitelist = NULL;
+ ctx->jump_whitelist_cmds = 0;
+
return ctx;
err_free:
* per vm, which may be one per context or shared with the global GTT)
*/
struct radix_tree_root handles_vma;
+
+ /** jump_whitelist: Bit array for tracking cmds during cmdparsing
+ * Guarded by struct_mutex
+ */
+ unsigned long *jump_whitelist;
+ /** jump_whitelist_cmds: No of cmd slots available */
+ u32 jump_whitelist_cmds;
};
#endif /* __I915_GEM_CONTEXT_TYPES_H__ */
static inline bool eb_use_cmdparser(const struct i915_execbuffer *eb)
{
- return intel_engine_needs_cmd_parser(eb->engine) && eb->batch_len;
+ return intel_engine_requires_cmd_parser(eb->engine) ||
+ (intel_engine_using_cmd_parser(eb->engine) &&
+ eb->args->batch_len);
}
static int eb_create(struct i915_execbuffer *eb)
return 0;
}
-static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master)
+static struct i915_vma *
+shadow_batch_pin(struct i915_execbuffer *eb, struct drm_i915_gem_object *obj)
+{
+ struct drm_i915_private *dev_priv = eb->i915;
+ struct i915_vma * const vma = *eb->vma;
+ struct i915_address_space *vm;
+ u64 flags;
+
+ /*
+ * PPGTT backed shadow buffers must be mapped RO, to prevent
+ * post-scan tampering
+ */
+ if (CMDPARSER_USES_GGTT(dev_priv)) {
+ flags = PIN_GLOBAL;
+ vm = &dev_priv->ggtt.vm;
+ } else if (vma->vm->has_read_only) {
+ flags = PIN_USER;
+ vm = vma->vm;
+ i915_gem_object_set_readonly(obj);
+ } else {
+ DRM_DEBUG("Cannot prevent post-scan tampering without RO capable vm\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ return i915_gem_object_pin(obj, vm, NULL, 0, 0, flags);
+}
+
+static struct i915_vma *eb_parse(struct i915_execbuffer *eb)
{
struct intel_engine_pool_node *pool;
struct i915_vma *vma;
+ u64 batch_start;
+ u64 shadow_batch_start;
int err;
pool = intel_engine_pool_get(&eb->engine->pool, eb->batch_len);
if (IS_ERR(pool))
return ERR_CAST(pool);
- err = intel_engine_cmd_parser(eb->engine,
+ vma = shadow_batch_pin(eb, pool->obj);
+ if (IS_ERR(vma))
+ goto err;
+
+ batch_start = gen8_canonical_addr(eb->batch->node.start) +
+ eb->batch_start_offset;
+
+ shadow_batch_start = gen8_canonical_addr(vma->node.start);
+
+ err = intel_engine_cmd_parser(eb->gem_context,
+ eb->engine,
eb->batch->obj,
- pool->obj,
+ batch_start,
eb->batch_start_offset,
eb->batch_len,
- is_master);
+ pool->obj,
+ shadow_batch_start);
+
if (err) {
- if (err == -EACCES) /* unhandled chained batch */
+ i915_vma_unpin(vma);
+
+ /*
+ * Unsafe GGTT-backed buffers can still be submitted safely
+ * as non-secure.
+ * For PPGTT backing however, we have no choice but to forcibly
+ * reject unsafe buffers
+ */
+ if (CMDPARSER_USES_GGTT(eb->i915) && (err == -EACCES))
+ /* Execute original buffer non-secure */
vma = NULL;
else
vma = ERR_PTR(err);
goto err;
}
- vma = i915_gem_object_ggtt_pin(pool->obj, NULL, 0, 0, 0);
- if (IS_ERR(vma))
- goto err;
-
eb->vma[eb->buffer_count] = i915_vma_get(vma);
eb->flags[eb->buffer_count] =
__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
vma->exec_flags = &eb->flags[eb->buffer_count];
eb->buffer_count++;
+ eb->batch_start_offset = 0;
+ eb->batch = vma;
+
+ if (CMDPARSER_USES_GGTT(eb->i915))
+ eb->batch_flags |= I915_DISPATCH_SECURE;
+
+ /* eb->batch_len unchanged */
+
vma->private = pool;
return vma;
struct drm_i915_gem_exec_object2 *exec,
struct drm_syncobj **fences)
{
+ struct drm_i915_private *i915 = to_i915(dev);
struct i915_execbuffer eb;
struct dma_fence *in_fence = NULL;
struct dma_fence *exec_fence = NULL;
BUILD_BUG_ON(__EXEC_OBJECT_INTERNAL_FLAGS &
~__EXEC_OBJECT_UNKNOWN_FLAGS);
- eb.i915 = to_i915(dev);
+ eb.i915 = i915;
eb.file = file;
eb.args = args;
if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
eb.batch_flags = 0;
if (args->flags & I915_EXEC_SECURE) {
+ if (INTEL_GEN(i915) >= 11)
+ return -ENODEV;
+
+ /* Return -EPERM to trigger fallback code on old binaries. */
+ if (!HAS_SECURE_BATCHES(i915))
+ return -EPERM;
+
if (!drm_is_current_master(file) || !capable(CAP_SYS_ADMIN))
- return -EPERM;
+ return -EPERM;
eb.batch_flags |= I915_DISPATCH_SECURE;
}
goto err_vma;
}
+ if (eb.batch_len == 0)
+ eb.batch_len = eb.batch->size - eb.batch_start_offset;
+
if (eb_use_cmdparser(&eb)) {
struct i915_vma *vma;
- vma = eb_parse(&eb, drm_is_current_master(file));
+ vma = eb_parse(&eb);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
goto err_vma;
}
-
- if (vma) {
- /*
- * Batch parsed and accepted:
- *
- * Set the DISPATCH_SECURE bit to remove the NON_SECURE
- * bit from MI_BATCH_BUFFER_START commands issued in
- * the dispatch_execbuffer implementations. We
- * specifically don't want that set on batches the
- * command parser has accepted.
- */
- eb.batch_flags |= I915_DISPATCH_SECURE;
- eb.batch_start_offset = 0;
- eb.batch = vma;
- }
}
- if (eb.batch_len == 0)
- eb.batch_len = eb.batch->size - eb.batch_start_offset;
-
/*
* snb/ivb/vlv conflate the "batch in ppgtt" bit with the "non-secure
* batch" bit. Hence we need to pin secure batches into the global gtt.
struct intel_engine_hangcheck hangcheck;
-#define I915_ENGINE_NEEDS_CMD_PARSER BIT(0)
+#define I915_ENGINE_USING_CMD_PARSER BIT(0)
#define I915_ENGINE_SUPPORTS_STATS BIT(1)
#define I915_ENGINE_HAS_PREEMPTION BIT(2)
#define I915_ENGINE_HAS_SEMAPHORES BIT(3)
#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
#define I915_ENGINE_IS_VIRTUAL BIT(5)
+#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
unsigned int flags;
/*
};
static inline bool
-intel_engine_needs_cmd_parser(const struct intel_engine_cs *engine)
+intel_engine_using_cmd_parser(const struct intel_engine_cs *engine)
{
- return engine->flags & I915_ENGINE_NEEDS_CMD_PARSER;
+ return engine->flags & I915_ENGINE_USING_CMD_PARSER;
+}
+
+static inline bool
+intel_engine_requires_cmd_parser(const struct intel_engine_cs *engine)
+{
+ return engine->flags & I915_ENGINE_REQUIRES_CMD_PARSER;
}
static inline bool
gt->awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
GEM_BUG_ON(!gt->awake);
+ if (NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ intel_uncore_forcewake_get(&i915->uncore, FORCEWAKE_ALL);
+
intel_enable_gt_powersave(i915);
i915_update_gfx_val(i915);
if (INTEL_GEN(i915) >= 6)
gen6_rps_idle(i915);
+ if (NEEDS_RC6_CTX_CORRUPTION_WA(i915)) {
+ i915_rc6_ctx_wa_check(i915);
+ intel_uncore_forcewake_put(&i915->uncore, FORCEWAKE_ALL);
+ }
+
/* Everything switched off, flush any residual interrupt just in case */
intel_synchronize_irq(i915);
* granting userspace undue privileges. There are three categories of privilege.
*
* First, commands which are explicitly defined as privileged or which should
- * only be used by the kernel driver. The parser generally rejects such
- * commands, though it may allow some from the drm master process.
+ * only be used by the kernel driver. The parser rejects such commands
*
* Second, commands which access registers. To support correct/enhanced
* userspace functionality, particularly certain OpenGL extensions, the parser
- * provides a whitelist of registers which userspace may safely access (for both
- * normal and drm master processes).
+ * provides a whitelist of registers which userspace may safely access
*
* Third, commands which access privileged memory (i.e. GGTT, HWS page, etc).
* The parser always rejects such commands.
* in the per-engine command tables.
*
* Other command table entries map fairly directly to high level categories
- * mentioned above: rejected, master-only, register whitelist. The parser
- * implements a number of checks, including the privileged memory checks, via a
- * general bitmasking mechanism.
+ * mentioned above: rejected, register whitelist. The parser implements a number
+ * of checks, including the privileged memory checks, via a general bitmasking
+ * mechanism.
*/
/*
* CMD_DESC_REJECT: The command is never allowed
* CMD_DESC_REGISTER: The command should be checked against the
* register whitelist for the appropriate ring
- * CMD_DESC_MASTER: The command is allowed if the submitting process
- * is the DRM master
*/
u32 flags;
#define CMD_DESC_FIXED (1<<0)
#define CMD_DESC_REJECT (1<<2)
#define CMD_DESC_REGISTER (1<<3)
#define CMD_DESC_BITMASK (1<<4)
-#define CMD_DESC_MASTER (1<<5)
/*
* The command's unique identification bits and the bitmask to get them.
#define CMD(op, opm, f, lm, fl, ...) \
{ \
.flags = (fl) | ((f) ? CMD_DESC_FIXED : 0), \
- .cmd = { (op), ~0u << (opm) }, \
+ .cmd = { (op & ~0u << (opm)), ~0u << (opm) }, \
.length = { (lm) }, \
__VA_ARGS__ \
}
#define R CMD_DESC_REJECT
#define W CMD_DESC_REGISTER
#define B CMD_DESC_BITMASK
-#define M CMD_DESC_MASTER
/* Command Mask Fixed Len Action
---------------------------------------------------------- */
-static const struct drm_i915_cmd_descriptor common_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_common_cmds[] = {
CMD( MI_NOOP, SMI, F, 1, S ),
CMD( MI_USER_INTERRUPT, SMI, F, 1, R ),
- CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, M ),
+ CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, R ),
CMD( MI_ARB_CHECK, SMI, F, 1, S ),
CMD( MI_REPORT_HEAD, SMI, F, 1, S ),
CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ),
CMD( MI_BATCH_BUFFER_START, SMI, !F, 0xFF, S ),
};
-static const struct drm_i915_cmd_descriptor render_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_render_cmds[] = {
CMD( MI_FLUSH, SMI, F, 1, S ),
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_PREDICATE, SMI, F, 1, S ),
CMD( MI_URB_ATOMIC_ALLOC, SMI, F, 1, S ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_RS_CONTEXT, SMI, F, 1, S ),
- CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W,
.reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ),
CMD( GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS, S3D, !F, 0x1FF, S ),
};
-static const struct drm_i915_cmd_descriptor video_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_video_cmds[] = {
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B,
CMD( MFX_WAIT, SMFX, F, 1, S ),
};
-static const struct drm_i915_cmd_descriptor vecs_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_vecs_cmds[] = {
CMD( MI_ARB_ON_OFF, SMI, F, 1, R ),
CMD( MI_SET_APPID, SMI, F, 1, S ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0xFF, B,
}}, ),
};
-static const struct drm_i915_cmd_descriptor blt_cmds[] = {
+static const struct drm_i915_cmd_descriptor gen7_blt_cmds[] = {
CMD( MI_DISPLAY_FLIP, SMI, !F, 0xFF, R ),
CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, B,
.bits = {{
};
static const struct drm_i915_cmd_descriptor hsw_blt_cmds[] = {
- CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, M ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, R ),
CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, R ),
};
+/*
+ * For Gen9 we can still rely on the h/w to enforce cmd security, and only
+ * need to re-enforce the register access checks. We therefore only need to
+ * teach the cmdparser how to find the end of each command, and identify
+ * register accesses. The table doesn't need to reject any commands, and so
+ * the only commands listed here are:
+ * 1) Those that touch registers
+ * 2) Those that do not have the default 8-bit length
+ *
+ * Note that the default MI length mask chosen for this table is 0xFF, not
+ * the 0x3F used on older devices. This is because the vast majority of MI
+ * cmds on Gen9 use a standard 8-bit Length field.
+ * All the Gen9 blitter instructions are standard 0xFF length mask, and
+ * none allow access to non-general registers, so in fact no BLT cmds are
+ * included in the table at all.
+ *
+ */
+static const struct drm_i915_cmd_descriptor gen9_blt_cmds[] = {
+ CMD( MI_NOOP, SMI, F, 1, S ),
+ CMD( MI_USER_INTERRUPT, SMI, F, 1, S ),
+ CMD( MI_WAIT_FOR_EVENT, SMI, F, 1, S ),
+ CMD( MI_FLUSH, SMI, F, 1, S ),
+ CMD( MI_ARB_CHECK, SMI, F, 1, S ),
+ CMD( MI_REPORT_HEAD, SMI, F, 1, S ),
+ CMD( MI_ARB_ON_OFF, SMI, F, 1, S ),
+ CMD( MI_SUSPEND_FLUSH, SMI, F, 1, S ),
+ CMD( MI_LOAD_SCAN_LINES_INCL, SMI, !F, 0x3F, S ),
+ CMD( MI_LOAD_SCAN_LINES_EXCL, SMI, !F, 0x3F, S ),
+ CMD( MI_STORE_DWORD_IMM, SMI, !F, 0x3FF, S ),
+ CMD( MI_LOAD_REGISTER_IMM(1), SMI, !F, 0xFF, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 2 } ),
+ CMD( MI_UPDATE_GTT, SMI, !F, 0x3FF, S ),
+ CMD( MI_STORE_REGISTER_MEM_GEN8, SMI, F, 4, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC } ),
+ CMD( MI_FLUSH_DW, SMI, !F, 0x3F, S ),
+ CMD( MI_LOAD_REGISTER_MEM_GEN8, SMI, F, 4, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC } ),
+ CMD( MI_LOAD_REGISTER_REG, SMI, !F, 0xFF, W,
+ .reg = { .offset = 1, .mask = 0x007FFFFC, .step = 1 } ),
+
+ /*
+ * We allow BB_START but apply further checks. We just sanitize the
+ * basic fields here.
+ */
+#define MI_BB_START_OPERAND_MASK GENMASK(SMI-1, 0)
+#define MI_BB_START_OPERAND_EXPECT (MI_BATCH_PPGTT_HSW | 1)
+ CMD( MI_BATCH_BUFFER_START_GEN8, SMI, !F, 0xFF, B,
+ .bits = {{
+ .offset = 0,
+ .mask = MI_BB_START_OPERAND_MASK,
+ .expected = MI_BB_START_OPERAND_EXPECT,
+ }}, ),
+};
+
static const struct drm_i915_cmd_descriptor noop_desc =
CMD(MI_NOOP, SMI, F, 1, S);
#undef R
#undef W
#undef B
-#undef M
-static const struct drm_i915_cmd_table gen7_render_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { render_cmds, ARRAY_SIZE(render_cmds) },
+static const struct drm_i915_cmd_table gen7_render_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) },
};
-static const struct drm_i915_cmd_table hsw_render_ring_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { render_cmds, ARRAY_SIZE(render_cmds) },
+static const struct drm_i915_cmd_table hsw_render_ring_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_render_cmds, ARRAY_SIZE(gen7_render_cmds) },
{ hsw_render_cmds, ARRAY_SIZE(hsw_render_cmds) },
};
-static const struct drm_i915_cmd_table gen7_video_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { video_cmds, ARRAY_SIZE(video_cmds) },
+static const struct drm_i915_cmd_table gen7_video_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_video_cmds, ARRAY_SIZE(gen7_video_cmds) },
};
-static const struct drm_i915_cmd_table hsw_vebox_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { vecs_cmds, ARRAY_SIZE(vecs_cmds) },
+static const struct drm_i915_cmd_table hsw_vebox_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_vecs_cmds, ARRAY_SIZE(gen7_vecs_cmds) },
};
-static const struct drm_i915_cmd_table gen7_blt_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { blt_cmds, ARRAY_SIZE(blt_cmds) },
+static const struct drm_i915_cmd_table gen7_blt_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) },
};
-static const struct drm_i915_cmd_table hsw_blt_ring_cmds[] = {
- { common_cmds, ARRAY_SIZE(common_cmds) },
- { blt_cmds, ARRAY_SIZE(blt_cmds) },
+static const struct drm_i915_cmd_table hsw_blt_ring_cmd_table[] = {
+ { gen7_common_cmds, ARRAY_SIZE(gen7_common_cmds) },
+ { gen7_blt_cmds, ARRAY_SIZE(gen7_blt_cmds) },
{ hsw_blt_cmds, ARRAY_SIZE(hsw_blt_cmds) },
};
+static const struct drm_i915_cmd_table gen9_blt_cmd_table[] = {
+ { gen9_blt_cmds, ARRAY_SIZE(gen9_blt_cmds) },
+};
+
+
/*
* Register whitelists, sorted by increasing register offset.
*/
REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
};
-static const struct drm_i915_reg_descriptor ivb_master_regs[] = {
- REG32(FORCEWAKE_MT),
- REG32(DERRMR),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_A)),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_B)),
- REG32(GEN7_PIPE_DE_LOAD_SL(PIPE_C)),
-};
-
-static const struct drm_i915_reg_descriptor hsw_master_regs[] = {
- REG32(FORCEWAKE_MT),
- REG32(DERRMR),
+static const struct drm_i915_reg_descriptor gen9_blt_regs[] = {
+ REG64_IDX(RING_TIMESTAMP, RENDER_RING_BASE),
+ REG64_IDX(RING_TIMESTAMP, BSD_RING_BASE),
+ REG32(BCS_SWCTRL),
+ REG64_IDX(RING_TIMESTAMP, BLT_RING_BASE),
+ REG64_IDX(BCS_GPR, 0),
+ REG64_IDX(BCS_GPR, 1),
+ REG64_IDX(BCS_GPR, 2),
+ REG64_IDX(BCS_GPR, 3),
+ REG64_IDX(BCS_GPR, 4),
+ REG64_IDX(BCS_GPR, 5),
+ REG64_IDX(BCS_GPR, 6),
+ REG64_IDX(BCS_GPR, 7),
+ REG64_IDX(BCS_GPR, 8),
+ REG64_IDX(BCS_GPR, 9),
+ REG64_IDX(BCS_GPR, 10),
+ REG64_IDX(BCS_GPR, 11),
+ REG64_IDX(BCS_GPR, 12),
+ REG64_IDX(BCS_GPR, 13),
+ REG64_IDX(BCS_GPR, 14),
+ REG64_IDX(BCS_GPR, 15),
};
#undef REG64
struct drm_i915_reg_table {
const struct drm_i915_reg_descriptor *regs;
int num_regs;
- bool master;
};
static const struct drm_i915_reg_table ivb_render_reg_tables[] = {
- { gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false },
- { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true },
+ { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) },
};
static const struct drm_i915_reg_table ivb_blt_reg_tables[] = {
- { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false },
- { ivb_master_regs, ARRAY_SIZE(ivb_master_regs), true },
+ { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) },
};
static const struct drm_i915_reg_table hsw_render_reg_tables[] = {
- { gen7_render_regs, ARRAY_SIZE(gen7_render_regs), false },
- { hsw_render_regs, ARRAY_SIZE(hsw_render_regs), false },
- { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true },
+ { gen7_render_regs, ARRAY_SIZE(gen7_render_regs) },
+ { hsw_render_regs, ARRAY_SIZE(hsw_render_regs) },
};
static const struct drm_i915_reg_table hsw_blt_reg_tables[] = {
- { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs), false },
- { hsw_master_regs, ARRAY_SIZE(hsw_master_regs), true },
+ { gen7_blt_regs, ARRAY_SIZE(gen7_blt_regs) },
+};
+
+static const struct drm_i915_reg_table gen9_blt_reg_tables[] = {
+ { gen9_blt_regs, ARRAY_SIZE(gen9_blt_regs) },
};
static u32 gen7_render_get_cmd_length_mask(u32 cmd_header)
return 0;
}
+static u32 gen9_blt_get_cmd_length_mask(u32 cmd_header)
+{
+ u32 client = cmd_header >> INSTR_CLIENT_SHIFT;
+
+ if (client == INSTR_MI_CLIENT || client == INSTR_BC_CLIENT)
+ return 0xFF;
+
+ DRM_DEBUG_DRIVER("CMD: Abnormal blt cmd length! 0x%08X\n", cmd_header);
+ return 0;
+}
+
static bool validate_cmds_sorted(const struct intel_engine_cs *engine,
const struct drm_i915_cmd_table *cmd_tables,
int cmd_table_count)
int cmd_table_count;
int ret;
- if (!IS_GEN(engine->i915, 7))
+ if (!IS_GEN(engine->i915, 7) && !(IS_GEN(engine->i915, 9) &&
+ engine->class == COPY_ENGINE_CLASS))
return;
switch (engine->class) {
case RENDER_CLASS:
if (IS_HASWELL(engine->i915)) {
- cmd_tables = hsw_render_ring_cmds;
+ cmd_tables = hsw_render_ring_cmd_table;
cmd_table_count =
- ARRAY_SIZE(hsw_render_ring_cmds);
+ ARRAY_SIZE(hsw_render_ring_cmd_table);
} else {
- cmd_tables = gen7_render_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_render_cmds);
+ cmd_tables = gen7_render_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_render_cmd_table);
}
if (IS_HASWELL(engine->i915)) {
engine->reg_tables = ivb_render_reg_tables;
engine->reg_table_count = ARRAY_SIZE(ivb_render_reg_tables);
}
-
engine->get_cmd_length_mask = gen7_render_get_cmd_length_mask;
break;
case VIDEO_DECODE_CLASS:
- cmd_tables = gen7_video_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_video_cmds);
+ cmd_tables = gen7_video_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_video_cmd_table);
engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
case COPY_ENGINE_CLASS:
- if (IS_HASWELL(engine->i915)) {
- cmd_tables = hsw_blt_ring_cmds;
- cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmds);
+ engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
+ if (IS_GEN(engine->i915, 9)) {
+ cmd_tables = gen9_blt_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen9_blt_cmd_table);
+ engine->get_cmd_length_mask =
+ gen9_blt_get_cmd_length_mask;
+
+ /* BCS Engine unsafe without parser */
+ engine->flags |= I915_ENGINE_REQUIRES_CMD_PARSER;
+ } else if (IS_HASWELL(engine->i915)) {
+ cmd_tables = hsw_blt_ring_cmd_table;
+ cmd_table_count = ARRAY_SIZE(hsw_blt_ring_cmd_table);
} else {
- cmd_tables = gen7_blt_cmds;
- cmd_table_count = ARRAY_SIZE(gen7_blt_cmds);
+ cmd_tables = gen7_blt_cmd_table;
+ cmd_table_count = ARRAY_SIZE(gen7_blt_cmd_table);
}
- if (IS_HASWELL(engine->i915)) {
+ if (IS_GEN(engine->i915, 9)) {
+ engine->reg_tables = gen9_blt_reg_tables;
+ engine->reg_table_count =
+ ARRAY_SIZE(gen9_blt_reg_tables);
+ } else if (IS_HASWELL(engine->i915)) {
engine->reg_tables = hsw_blt_reg_tables;
engine->reg_table_count = ARRAY_SIZE(hsw_blt_reg_tables);
} else {
engine->reg_tables = ivb_blt_reg_tables;
engine->reg_table_count = ARRAY_SIZE(ivb_blt_reg_tables);
}
-
- engine->get_cmd_length_mask = gen7_blt_get_cmd_length_mask;
break;
case VIDEO_ENHANCEMENT_CLASS:
- cmd_tables = hsw_vebox_cmds;
- cmd_table_count = ARRAY_SIZE(hsw_vebox_cmds);
+ cmd_tables = hsw_vebox_cmd_table;
+ cmd_table_count = ARRAY_SIZE(hsw_vebox_cmd_table);
/* VECS can use the same length_mask function as VCS */
engine->get_cmd_length_mask = gen7_bsd_get_cmd_length_mask;
break;
return;
}
- engine->flags |= I915_ENGINE_NEEDS_CMD_PARSER;
+ engine->flags |= I915_ENGINE_USING_CMD_PARSER;
}
/**
*/
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine)
{
- if (!intel_engine_needs_cmd_parser(engine))
+ if (!intel_engine_using_cmd_parser(engine))
return;
fini_hash_table(engine);
}
static const struct drm_i915_reg_descriptor *
-find_reg(const struct intel_engine_cs *engine, bool is_master, u32 addr)
+find_reg(const struct intel_engine_cs *engine, u32 addr)
{
const struct drm_i915_reg_table *table = engine->reg_tables;
+ const struct drm_i915_reg_descriptor *reg = NULL;
int count = engine->reg_table_count;
- for (; count > 0; ++table, --count) {
- if (!table->master || is_master) {
- const struct drm_i915_reg_descriptor *reg;
+ for (; !reg && (count > 0); ++table, --count)
+ reg = __find_reg(table->regs, table->num_regs, addr);
- reg = __find_reg(table->regs, table->num_regs, addr);
- if (reg != NULL)
- return reg;
- }
- }
-
- return NULL;
+ return reg;
}
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static bool check_cmd(const struct intel_engine_cs *engine,
const struct drm_i915_cmd_descriptor *desc,
- const u32 *cmd, u32 length,
- const bool is_master)
+ const u32 *cmd, u32 length)
{
if (desc->flags & CMD_DESC_SKIP)
return true;
return false;
}
- if ((desc->flags & CMD_DESC_MASTER) && !is_master) {
- DRM_DEBUG_DRIVER("CMD: Rejected master-only command: 0x%08X\n",
- *cmd);
- return false;
- }
-
if (desc->flags & CMD_DESC_REGISTER) {
/*
* Get the distance between individual register offset
offset += step) {
const u32 reg_addr = cmd[offset] & desc->reg.mask;
const struct drm_i915_reg_descriptor *reg =
- find_reg(engine, is_master, reg_addr);
+ find_reg(engine, reg_addr);
if (!reg) {
DRM_DEBUG_DRIVER("CMD: Rejected register 0x%08X in command: 0x%08X (%s)\n",
return true;
}
+static int check_bbstart(const struct i915_gem_context *ctx,
+ u32 *cmd, u32 offset, u32 length,
+ u32 batch_len,
+ u64 batch_start,
+ u64 shadow_batch_start)
+{
+ u64 jump_offset, jump_target;
+ u32 target_cmd_offset, target_cmd_index;
+
+ /* For igt compatibility on older platforms */
+ if (CMDPARSER_USES_GGTT(ctx->i915)) {
+ DRM_DEBUG("CMD: Rejecting BB_START for ggtt based submission\n");
+ return -EACCES;
+ }
+
+ if (length != 3) {
+ DRM_DEBUG("CMD: Recursive BB_START with bad length(%u)\n",
+ length);
+ return -EINVAL;
+ }
+
+ jump_target = *(u64*)(cmd+1);
+ jump_offset = jump_target - batch_start;
+
+ /*
+ * Any underflow of jump_target is guaranteed to be outside the range
+ * of a u32, so >= test catches both too large and too small
+ */
+ if (jump_offset >= batch_len) {
+ DRM_DEBUG("CMD: BB_START to 0x%llx jumps out of BB\n",
+ jump_target);
+ return -EINVAL;
+ }
+
+ /*
+ * This cannot overflow a u32 because we already checked jump_offset
+ * is within the BB, and the batch_len is a u32
+ */
+ target_cmd_offset = lower_32_bits(jump_offset);
+ target_cmd_index = target_cmd_offset / sizeof(u32);
+
+ *(u64*)(cmd + 1) = shadow_batch_start + target_cmd_offset;
+
+ if (target_cmd_index == offset)
+ return 0;
+
+ if (ctx->jump_whitelist_cmds <= target_cmd_index) {
+ DRM_DEBUG("CMD: Rejecting BB_START - truncated whitelist array\n");
+ return -EINVAL;
+ } else if (!test_bit(target_cmd_index, ctx->jump_whitelist)) {
+ DRM_DEBUG("CMD: BB_START to 0x%llx not a previously executed cmd\n",
+ jump_target);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void init_whitelist(struct i915_gem_context *ctx, u32 batch_len)
+{
+ const u32 batch_cmds = DIV_ROUND_UP(batch_len, sizeof(u32));
+ const u32 exact_size = BITS_TO_LONGS(batch_cmds);
+ u32 next_size = BITS_TO_LONGS(roundup_pow_of_two(batch_cmds));
+ unsigned long *next_whitelist;
+
+ if (CMDPARSER_USES_GGTT(ctx->i915))
+ return;
+
+ if (batch_cmds <= ctx->jump_whitelist_cmds) {
+ bitmap_zero(ctx->jump_whitelist, batch_cmds);
+ return;
+ }
+
+again:
+ next_whitelist = kcalloc(next_size, sizeof(long), GFP_KERNEL);
+ if (next_whitelist) {
+ kfree(ctx->jump_whitelist);
+ ctx->jump_whitelist = next_whitelist;
+ ctx->jump_whitelist_cmds =
+ next_size * BITS_PER_BYTE * sizeof(long);
+ return;
+ }
+
+ if (next_size > exact_size) {
+ next_size = exact_size;
+ goto again;
+ }
+
+ DRM_DEBUG("CMD: Failed to extend whitelist. BB_START may be disallowed\n");
+ bitmap_zero(ctx->jump_whitelist, ctx->jump_whitelist_cmds);
+
+ return;
+}
+
#define LENGTH_BIAS 2
/**
* i915_parse_cmds() - parse a submitted batch buffer for privilege violations
+ * @ctx: the context in which the batch is to execute
* @engine: the engine on which the batch is to execute
* @batch_obj: the batch buffer in question
- * @shadow_batch_obj: copy of the batch buffer in question
+ * @batch_start: Canonical base address of batch
* @batch_start_offset: byte offset in the batch at which execution starts
* @batch_len: length of the commands in batch_obj
- * @is_master: is the submitting process the drm master?
+ * @shadow_batch_obj: copy of the batch buffer in question
+ * @shadow_batch_start: Canonical base address of shadow_batch_obj
*
* Parses the specified batch buffer looking for privilege violations as
* described in the overview.
* Return: non-zero if the parser finds violations or otherwise fails; -EACCES
* if the batch appears legal but should use hardware parsing
*/
-int intel_engine_cmd_parser(struct intel_engine_cs *engine,
+
+int intel_engine_cmd_parser(struct i915_gem_context *ctx,
+ struct intel_engine_cs *engine,
struct drm_i915_gem_object *batch_obj,
- struct drm_i915_gem_object *shadow_batch_obj,
+ u64 batch_start,
u32 batch_start_offset,
u32 batch_len,
- bool is_master)
+ struct drm_i915_gem_object *shadow_batch_obj,
+ u64 shadow_batch_start)
{
- u32 *cmd, *batch_end;
+ u32 *cmd, *batch_end, offset = 0;
struct drm_i915_cmd_descriptor default_desc = noop_desc;
const struct drm_i915_cmd_descriptor *desc = &default_desc;
bool needs_clflush_after = false;
return PTR_ERR(cmd);
}
+ init_whitelist(ctx, batch_len);
+
/*
* We use the batch length as size because the shadow object is as
* large or larger and copy_batch() will write MI_NOPs to the extra
do {
u32 length;
- if (*cmd == MI_BATCH_BUFFER_END) {
- if (needs_clflush_after) {
- void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping);
- drm_clflush_virt_range(ptr,
- (void *)(cmd + 1) - ptr);
- }
+ if (*cmd == MI_BATCH_BUFFER_END)
break;
- }
desc = find_cmd(engine, *cmd, desc, &default_desc);
if (!desc) {
DRM_DEBUG_DRIVER("CMD: Unrecognized command: 0x%08X\n",
*cmd);
ret = -EINVAL;
- break;
- }
-
- /*
- * If the batch buffer contains a chained batch, return an
- * error that tells the caller to abort and dispatch the
- * workload as a non-secure batch.
- */
- if (desc->cmd.value == MI_BATCH_BUFFER_START) {
- ret = -EACCES;
- break;
+ goto err;
}
if (desc->flags & CMD_DESC_FIXED)
length,
batch_end - cmd);
ret = -EINVAL;
- break;
+ goto err;
}
- if (!check_cmd(engine, desc, cmd, length, is_master)) {
+ if (!check_cmd(engine, desc, cmd, length)) {
ret = -EACCES;
+ goto err;
+ }
+
+ if (desc->cmd.value == MI_BATCH_BUFFER_START) {
+ ret = check_bbstart(ctx, cmd, offset, length,
+ batch_len, batch_start,
+ shadow_batch_start);
+
+ if (ret)
+ goto err;
break;
}
+ if (ctx->jump_whitelist_cmds > offset)
+ set_bit(offset, ctx->jump_whitelist);
+
cmd += length;
+ offset += length;
if (cmd >= batch_end) {
DRM_DEBUG_DRIVER("CMD: Got to the end of the buffer w/o a BBE cmd!\n");
ret = -EINVAL;
- break;
+ goto err;
}
} while (1);
+ if (needs_clflush_after) {
+ void *ptr = page_mask_bits(shadow_batch_obj->mm.mapping);
+
+ drm_clflush_virt_range(ptr, (void *)(cmd + 1) - ptr);
+ }
+
+err:
i915_gem_object_unpin_map(shadow_batch_obj);
return ret;
}
/* If the command parser is not enabled, report 0 - unsupported */
for_each_uabi_engine(engine, dev_priv) {
- if (intel_engine_needs_cmd_parser(engine)) {
+ if (intel_engine_using_cmd_parser(engine)) {
active = true;
break;
}
* the parser enabled.
* 9. Don't whitelist or handle oacontrol specially, as ownership
* for oacontrol state is moving to i915-perf.
+ * 10. Support for Gen9 BCS Parsing
*/
- return 9;
+ return 10;
}
i915_gem_suspend_late(dev_priv);
+ i915_rc6_ctx_wa_suspend(dev_priv);
+
intel_uncore_suspend(&dev_priv->uncore);
intel_power_domains_suspend(dev_priv,
intel_power_domains_resume(dev_priv);
+ i915_rc6_ctx_wa_resume(dev_priv);
+
intel_gt_sanitize(&dev_priv->gt, true);
enable_rpm_wakeref_asserts(&dev_priv->runtime_pm);
struct intel_rc6 {
bool enabled;
+ bool ctx_corrupted;
+ intel_wakeref_t ctx_corrupted_wakeref;
u64 prev_hw_residency[4];
u64 cur_residency[4];
};
#define VEBOX_MASK(dev_priv) \
ENGINE_INSTANCES_MASK(dev_priv, VECS0, I915_MAX_VECS)
+/*
+ * The Gen7 cmdparser copies the scanned buffer to the ggtt for execution
+ * All later gens can run the final buffer from the ppgtt
+ */
+#define CMDPARSER_USES_GGTT(dev_priv) IS_GEN(dev_priv, 7)
+
#define HAS_LLC(dev_priv) (INTEL_INFO(dev_priv)->has_llc)
#define HAS_SNOOP(dev_priv) (INTEL_INFO(dev_priv)->has_snoop)
#define HAS_EDRAM(dev_priv) ((dev_priv)->edram_size_mb)
+#define HAS_SECURE_BATCHES(dev_priv) (INTEL_GEN(dev_priv) < 6)
#define HAS_WT(dev_priv) ((IS_HASWELL(dev_priv) || \
IS_BROADWELL(dev_priv)) && HAS_EDRAM(dev_priv))
/* Early gen2 have a totally busted CS tlb and require pinned batches. */
#define HAS_BROKEN_CS_TLB(dev_priv) (IS_I830(dev_priv) || IS_I845G(dev_priv))
+#define NEEDS_RC6_CTX_CORRUPTION_WA(dev_priv) \
+ (IS_BROADWELL(dev_priv) || IS_GEN(dev_priv, 9))
+
/* WaRsDisableCoarsePowerGating:skl,cnl */
#define NEEDS_WaRsDisableCoarsePowerGating(dev_priv) \
- (IS_CANNONLAKE(dev_priv) || \
- IS_SKL_GT3(dev_priv) || IS_SKL_GT4(dev_priv))
+ (IS_CANNONLAKE(dev_priv) || IS_GEN(dev_priv, 9))
#define HAS_GMBUS_IRQ(dev_priv) (INTEL_GEN(dev_priv) >= 4)
#define HAS_GMBUS_BURST_READ(dev_priv) (INTEL_GEN(dev_priv) >= 10 || \
unsigned long flags);
#define I915_GEM_OBJECT_UNBIND_ACTIVE BIT(0)
+struct i915_vma * __must_check
+i915_gem_object_pin(struct drm_i915_gem_object *obj,
+ struct i915_address_space *vm,
+ const struct i915_ggtt_view *view,
+ u64 size,
+ u64 alignment,
+ u64 flags);
+
void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv);
static inline int __must_check
int i915_cmd_parser_get_version(struct drm_i915_private *dev_priv);
void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
-int intel_engine_cmd_parser(struct intel_engine_cs *engine,
+int intel_engine_cmd_parser(struct i915_gem_context *cxt,
+ struct intel_engine_cs *engine,
struct drm_i915_gem_object *batch_obj,
- struct drm_i915_gem_object *shadow_batch_obj,
+ u64 user_batch_start,
u32 batch_start_offset,
u32 batch_len,
- bool is_master);
+ struct drm_i915_gem_object *shadow_batch_obj,
+ u64 shadow_batch_start);
/* intel_device_info.c */
static inline struct intel_device_info *
{
struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
struct i915_address_space *vm = &dev_priv->ggtt.vm;
+
+ return i915_gem_object_pin(obj, vm, view, size, alignment,
+ flags | PIN_GLOBAL);
+}
+
+struct i915_vma *
+i915_gem_object_pin(struct drm_i915_gem_object *obj,
+ struct i915_address_space *vm,
+ const struct i915_ggtt_view *view,
+ u64 size,
+ u64 alignment,
+ u64 flags)
+{
+ struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
struct i915_vma *vma;
int ret;
return ERR_PTR(ret);
}
- ret = i915_vma_pin(vma, size, alignment, flags | PIN_GLOBAL);
+ ret = i915_vma_pin(vma, size, alignment, flags);
if (ret)
return ERR_PTR(ret);
value = !!(i915->caps.scheduler & I915_SCHEDULER_CAP_SEMAPHORES);
break;
case I915_PARAM_HAS_SECURE_BATCHES:
- value = capable(CAP_SYS_ADMIN);
+ value = HAS_SECURE_BATCHES(i915) && capable(CAP_SYS_ADMIN);
break;
case I915_PARAM_CMD_PARSER_VERSION:
value = i915_cmd_parser_get_version(i915);
#define ECOCHK_PPGTT_WT_HSW (0x2 << 3)
#define ECOCHK_PPGTT_WB_HSW (0x3 << 3)
+#define GEN8_RC6_CTX_INFO _MMIO(0x8504)
+
#define GAC_ECO_BITS _MMIO(0x14090)
#define ECOBITS_SNB_BIT (1 << 13)
#define ECOBITS_PPGTT_CACHE64B (3 << 8)
*/
#define BCS_SWCTRL _MMIO(0x22200)
+/* There are 16 GPR registers */
+#define BCS_GPR(n) _MMIO(0x22600 + (n) * 8)
+#define BCS_GPR_UDW(n) _MMIO(0x22600 + (n) * 8 + 4)
+
#define GPGPU_THREADS_DISPATCHED _MMIO(0x2290)
#define GPGPU_THREADS_DISPATCHED_UDW _MMIO(0x2290 + 4)
#define HS_INVOCATION_COUNT _MMIO(0x2300)
#define TGL_DMC_DEBUG_DC5_COUNT _MMIO(0x101084)
#define TGL_DMC_DEBUG_DC6_COUNT _MMIO(0x101088)
+/* Display Internal Timeout Register */
+#define RM_TIMEOUT _MMIO(0x42060)
+#define MMIO_TIMEOUT_US(us) ((us) << 0)
+
/* interrupts */
#define DE_MASTER_IRQ_CONTROL (1 << 31)
#define DE_SPRITEB_FLIP_DONE (1 << 29)
*/
I915_WRITE(GEN9_CLKGATE_DIS_0, I915_READ(GEN9_CLKGATE_DIS_0) |
PWM1_GATING_DIS | PWM2_GATING_DIS);
+
+ /*
+ * Lower the display internal timeout.
+ * This is needed to avoid any hard hangs when DSI port PLL
+ * is off and a MMIO access is attempted by any privilege
+ * application, using batch buffers or any other means.
+ */
+ I915_WRITE(RM_TIMEOUT, MMIO_TIMEOUT_US(950));
}
static void glk_init_clock_gating(struct drm_i915_private *dev_priv)
dev_priv->ips.corr = (lcfuse & LCFUSE_HIV_MASK);
}
+static bool i915_rc6_ctx_corrupted(struct drm_i915_private *dev_priv)
+{
+ return !I915_READ(GEN8_RC6_CTX_INFO);
+}
+
+static void i915_rc6_ctx_wa_init(struct drm_i915_private *i915)
+{
+ if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ return;
+
+ if (i915_rc6_ctx_corrupted(i915)) {
+ DRM_INFO("RC6 context corrupted, disabling runtime power management\n");
+ i915->gt_pm.rc6.ctx_corrupted = true;
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get(&i915->runtime_pm);
+ }
+}
+
+static void i915_rc6_ctx_wa_cleanup(struct drm_i915_private *i915)
+{
+ if (i915->gt_pm.rc6.ctx_corrupted) {
+ intel_runtime_pm_put(&i915->runtime_pm,
+ i915->gt_pm.rc6.ctx_corrupted_wakeref);
+ i915->gt_pm.rc6.ctx_corrupted = false;
+ }
+}
+
+/**
+ * i915_rc6_ctx_wa_suspend - system suspend sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to clean up the RC6 CTX WA before system suspend.
+ */
+void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915)
+{
+ if (i915->gt_pm.rc6.ctx_corrupted)
+ intel_runtime_pm_put(&i915->runtime_pm,
+ i915->gt_pm.rc6.ctx_corrupted_wakeref);
+}
+
+/**
+ * i915_rc6_ctx_wa_resume - system resume sequence for the RC6 CTX WA
+ * @i915: i915 device
+ *
+ * Perform any steps needed to re-init the RC6 CTX WA after system resume.
+ */
+void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915)
+{
+ if (!i915->gt_pm.rc6.ctx_corrupted)
+ return;
+
+ if (i915_rc6_ctx_corrupted(i915)) {
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get(&i915->runtime_pm);
+ return;
+ }
+
+ DRM_INFO("RC6 context restored, re-enabling runtime power management\n");
+ i915->gt_pm.rc6.ctx_corrupted = false;
+}
+
+static void intel_disable_rc6(struct drm_i915_private *dev_priv);
+
+/**
+ * i915_rc6_ctx_wa_check - check for a new RC6 CTX corruption
+ * @i915: i915 device
+ *
+ * Check if an RC6 CTX corruption has happened since the last check and if so
+ * disable RC6 and runtime power management.
+ *
+ * Return false if no context corruption has happened since the last call of
+ * this function, true otherwise.
+*/
+bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915)
+{
+ if (!NEEDS_RC6_CTX_CORRUPTION_WA(i915))
+ return false;
+
+ if (i915->gt_pm.rc6.ctx_corrupted)
+ return false;
+
+ if (!i915_rc6_ctx_corrupted(i915))
+ return false;
+
+ DRM_NOTE("RC6 context corruption, disabling runtime power management\n");
+
+ intel_disable_rc6(i915);
+ i915->gt_pm.rc6.ctx_corrupted = true;
+ i915->gt_pm.rc6.ctx_corrupted_wakeref =
+ intel_runtime_pm_get_noresume(&i915->runtime_pm);
+
+ return true;
+}
+
void intel_init_gt_powersave(struct drm_i915_private *dev_priv)
{
struct intel_rps *rps = &dev_priv->gt_pm.rps;
pm_runtime_get(&dev_priv->drm.pdev->dev);
}
+ i915_rc6_ctx_wa_init(dev_priv);
+
/* Initialize RPS limits (for userspace) */
if (IS_CHERRYVIEW(dev_priv))
cherryview_init_gt_powersave(dev_priv);
if (IS_VALLEYVIEW(dev_priv))
valleyview_cleanup_gt_powersave(dev_priv);
+ i915_rc6_ctx_wa_cleanup(dev_priv);
+
if (!HAS_RC6(dev_priv))
pm_runtime_put(&dev_priv->drm.pdev->dev);
}
i915->gt_pm.llc_pstate.enabled = false;
}
-static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+static void __intel_disable_rc6(struct drm_i915_private *dev_priv)
{
lockdep_assert_held(&dev_priv->gt_pm.rps.lock);
dev_priv->gt_pm.rc6.enabled = false;
}
+static void intel_disable_rc6(struct drm_i915_private *dev_priv)
+{
+ struct intel_rps *rps = &dev_priv->gt_pm.rps;
+
+ mutex_lock(&rps->lock);
+ __intel_disable_rc6(dev_priv);
+ mutex_unlock(&rps->lock);
+}
+
static void intel_disable_rps(struct drm_i915_private *dev_priv)
{
lockdep_assert_held(&dev_priv->gt_pm.rps.lock);
{
mutex_lock(&dev_priv->gt_pm.rps.lock);
- intel_disable_rc6(dev_priv);
+ __intel_disable_rc6(dev_priv);
intel_disable_rps(dev_priv);
if (HAS_LLC(dev_priv))
intel_disable_llc_pstate(dev_priv);
if (dev_priv->gt_pm.rc6.enabled)
return;
+ if (dev_priv->gt_pm.rc6.ctx_corrupted)
+ return;
+
if (IS_CHERRYVIEW(dev_priv))
cherryview_enable_rc6(dev_priv);
else if (IS_VALLEYVIEW(dev_priv))
void intel_sanitize_gt_powersave(struct drm_i915_private *dev_priv);
void intel_enable_gt_powersave(struct drm_i915_private *dev_priv);
void intel_disable_gt_powersave(struct drm_i915_private *dev_priv);
+bool i915_rc6_ctx_wa_check(struct drm_i915_private *i915);
+void i915_rc6_ctx_wa_suspend(struct drm_i915_private *i915);
+void i915_rc6_ctx_wa_resume(struct drm_i915_private *i915);
void gen6_rps_busy(struct drm_i915_private *dev_priv);
void gen6_rps_idle(struct drm_i915_private *dev_priv);
void gen6_rps_boost(struct i915_request *rq);
case 0x682C:
si_pi->cac_weights = cac_weights_cape_verde_pro;
si_pi->dte_data = dte_data_sun_xt;
+ update_dte_from_pl2 = true;
break;
case 0x6825:
case 0x6827:
if (ret) {
dev_err(&client->dev, "failed to reset device.\n");
i2c_hid_set_power(client, I2C_HID_PWR_SLEEP);
+ goto out_unlock;
}
+ /* At least some SIS devices need this after reset */
+ ret = i2c_hid_set_power(client, I2C_HID_PWR_ON);
+
out_unlock:
mutex_unlock(&ihid->reset_lock);
return ret;
}
}
+/*
+ * Convert a signed 32-bit integer to an unsigned n-bit integer. Undoes
+ * the normally-helpful work of 'hid_snto32' for fields that use signed
+ * ranges for questionable reasons.
+ */
+static inline __u32 wacom_s32tou(s32 value, __u8 n)
+{
+ switch (n) {
+ case 8: return ((__u8)value);
+ case 16: return ((__u16)value);
+ case 32: return ((__u32)value);
+ }
+ return value & (1 << (n - 1)) ? value & (~(~0U << n)) : value;
+}
+
extern const struct hid_device_id wacom_ids[];
void wacom_wac_irq(struct wacom_wac *wacom_wac, size_t len);
case HID_DG_TOOLSERIALNUMBER:
if (value) {
wacom_wac->serial[0] = (wacom_wac->serial[0] & ~0xFFFFFFFFULL);
- wacom_wac->serial[0] |= (__u32)value;
+ wacom_wac->serial[0] |= wacom_s32tou(value, field->report_size);
}
return;
case HID_DG_TWIST:
return;
case WACOM_HID_WD_SERIALHI:
if (value) {
+ __u32 raw_value = wacom_s32tou(value, field->report_size);
+
wacom_wac->serial[0] = (wacom_wac->serial[0] & 0xFFFFFFFF);
- wacom_wac->serial[0] |= ((__u64)value) << 32;
+ wacom_wac->serial[0] |= ((__u64)raw_value) << 32;
/*
* Non-USI EMR devices may contain additional tool type
* information here. See WACOM_HID_WD_TOOLTYPE case for
* more details.
*/
if (value >> 20 == 1) {
- wacom_wac->id[0] |= value & 0xFFFFF;
+ wacom_wac->id[0] |= raw_value & 0xFFFFF;
}
}
return;
* bitwise OR so the complete value can be built
* up over time :(
*/
- wacom_wac->id[0] |= value;
+ wacom_wac->id[0] |= wacom_s32tou(value, field->report_size);
return;
case WACOM_HID_WD_OFFSETLEFT:
if (features->offset_left && value != features->offset_left)
if (!count)
dev_dbg(&thdev->dev, "timeout waiting for CTS Trigger\n");
+ /* De-assert the trigger */
+ iowrite32(0, gth->base + REG_CTS_CTL);
+
intel_th_gth_stop(gth, output, false);
intel_th_gth_start(gth, output);
}
};
static LIST_HEAD(msu_buffer_list);
-static struct mutex msu_buffer_mutex;
+static DEFINE_MUTEX(msu_buffer_mutex);
/**
* struct msu_buffer_entry - internal MSU buffer bookkeeping
struct msc_block_desc *bdesc = sg_virt(sg);
if (msc_block_wrapped(bdesc))
- return win->nr_blocks << PAGE_SHIFT;
+ return (size_t)win->nr_blocks << PAGE_SHIFT;
size += msc_total_sz(bdesc);
if (msc_block_last_written(bdesc))
len = cp - buf;
mode = kstrndup(buf, len, GFP_KERNEL);
+ if (!mode)
+ return -ENOMEM;
+
i = match_string(msc_mode, ARRAY_SIZE(msc_mode), mode);
- if (i >= 0)
+ if (i >= 0) {
+ kfree(mode);
goto found;
+ }
/* Buffer sinks only work with a usable IRQ */
if (!msc->do_irq) {
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
{
+ /* Comet Lake PCH */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x06a6),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
+ {
/* Ice Lake NNPI */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x45c5),
.driver_data = (kernel_ulong_t)&intel_th_2x,
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0xa0a6),
.driver_data = (kernel_ulong_t)&intel_th_2x,
},
+ {
+ /* Jasper Lake PCH */
+ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x4da6),
+ .driver_data = (kernel_ulong_t)&intel_th_2x,
+ },
{ 0 },
};
cookie = dmaengine_submit(desc);
ret = dma_submit_error(cookie);
if (ret) {
- dmaengine_terminate_all(adc->dma_chan);
+ dmaengine_terminate_sync(adc->dma_chan);
return ret;
}
stm32_adc_conv_irq_disable(adc);
if (adc->dma_chan)
- dmaengine_terminate_all(adc->dma_chan);
+ dmaengine_terminate_sync(adc->dma_chan);
if (stm32_adc_set_trig(indio_dev, NULL))
dev_err(&indio_dev->dev, "Can't clear trigger\n");
struct adis16480 *st = iio_priv(indio_dev);
unsigned int t, reg;
+ if (val < 0 || val2 < 0)
+ return -EINVAL;
+
t = val * 1000 + val2 / 1000;
- if (t <= 0)
+ if (t == 0)
return -EINVAL;
/*
.name = "MPU6050",
.reg = ®_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU6500_WHOAMI_VALUE,
.name = "MPU6500",
.reg = ®_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU6515_WHOAMI_VALUE,
.name = "MPU6515",
.reg = ®_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU6000_WHOAMI_VALUE,
.name = "MPU6000",
.reg = ®_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU9150_WHOAMI_VALUE,
.name = "MPU9150",
.reg = ®_set_6050,
.config = &chip_config_6050,
+ .fifo_size = 1024,
},
{
.whoami = INV_MPU9250_WHOAMI_VALUE,
.name = "MPU9250",
.reg = ®_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_MPU9255_WHOAMI_VALUE,
.name = "MPU9255",
.reg = ®_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_ICM20608_WHOAMI_VALUE,
.name = "ICM20608",
.reg = ®_set_6500,
.config = &chip_config_6050,
+ .fifo_size = 512,
},
{
.whoami = INV_ICM20602_WHOAMI_VALUE,
.name = "ICM20602",
.reg = ®_set_icm20602,
.config = &chip_config_6050,
+ .fifo_size = 1008,
},
};
* @name: name of the chip.
* @reg: register map of the chip.
* @config: configuration of the chip.
+ * @fifo_size: size of the FIFO in bytes.
*/
struct inv_mpu6050_hw {
u8 whoami;
u8 *name;
const struct inv_mpu6050_reg_map *reg;
const struct inv_mpu6050_chip_config *config;
+ size_t fifo_size;
};
/*
"failed to ack interrupt\n");
goto flush_fifo;
}
- /* handle fifo overflow by reseting fifo */
- if (int_status & INV_MPU6050_BIT_FIFO_OVERFLOW_INT)
- goto flush_fifo;
if (!(int_status & INV_MPU6050_BIT_RAW_DATA_RDY_INT)) {
dev_warn(regmap_get_device(st->map),
"spurious interrupt with status 0x%x\n", int_status);
if (result)
goto end_session;
fifo_count = get_unaligned_be16(&data[0]);
+
+ /*
+ * Handle fifo overflow by resetting fifo.
+ * Reset if there is only 3 data set free remaining to mitigate
+ * possible delay between reading fifo count and fifo data.
+ */
+ nb = 3 * bytes_per_datum;
+ if (fifo_count >= st->hw->fifo_size - nb) {
+ dev_warn(regmap_get_device(st->map), "fifo overflow reset\n");
+ goto flush_fifo;
+ }
+
/* compute and process all complete datum */
nb = fifo_count / bytes_per_datum;
inv_mpu6050_update_period(st, pf->timestamp, nb);
udelay(data->cfg->trigger_pulse_us);
gpiod_set_value(data->gpiod_trig, 0);
- /* it cannot take more than 20 ms */
+ /* it should not take more than 20 ms until echo is rising */
ret = wait_for_completion_killable_timeout(&data->rising, HZ/50);
if (ret < 0) {
mutex_unlock(&data->lock);
return -ETIMEDOUT;
}
- ret = wait_for_completion_killable_timeout(&data->falling, HZ/50);
+ /* it cannot take more than 50 ms until echo is falling */
+ ret = wait_for_completion_killable_timeout(&data->falling, HZ/20);
if (ret < 0) {
mutex_unlock(&data->lock);
return ret;
dt_ns = ktime_to_ns(ktime_dt);
/*
- * measuring more than 3 meters is beyond the capabilities of
- * the sensor
+ * measuring more than 6,45 meters is beyond the capabilities of
+ * the supported sensors
* ==> filter out invalid results for not measuring echos of
* another us sensor
*
* formula:
- * distance 3 m
- * time = ---------- = --------- = 9404389 ns
- * speed 319 m/s
+ * distance 6,45 * 2 m
+ * time = ---------- = ------------ = 40438871 ns
+ * speed 319 m/s
*
* using a minimum speed at -20 °C of 319 m/s
*/
- if (dt_ns > 9404389)
+ if (dt_ns > 40438871)
return -EIO;
time_ns = dt_ns;
* with Temp in °C
* and speed in m/s
*
- * use 343 m/s as ultrasonic speed at 20 °C here in absence of the
+ * use 343,5 m/s as ultrasonic speed at 20 °C here in absence of the
* temperature
*
* therefore:
- * time 343
- * distance = ------ * -----
- * 10^6 2
+ * time 343,5 time * 106
+ * distance = ------ * ------- = ------------
+ * 10^6 2 617176
* with time in ns
* and distance in mm (one way)
*
- * because we limit to 3 meters the multiplication with 343 just
+ * because we limit to 6,45 meters the multiplication with 106 just
* fits into 32 bit
*/
- distance_mm = time_ns * 343 / 2000000;
+ distance_mm = time_ns * 106 / 617176;
return distance_mm;
}
struct rmi_2d_sensor_platform_data sensor_pdata;
unsigned long *abs_mask;
unsigned long *rel_mask;
- unsigned long *result_bits;
};
enum f11_finger_state {
/*
** init instance data, fill in values and create any sysfs files
*/
- f11 = devm_kzalloc(&fn->dev, sizeof(struct f11_data) + mask_size * 3,
+ f11 = devm_kzalloc(&fn->dev, sizeof(struct f11_data) + mask_size * 2,
GFP_KERNEL);
if (!f11)
return -ENOMEM;
+ sizeof(struct f11_data));
f11->rel_mask = (unsigned long *)((char *)f11
+ sizeof(struct f11_data) + mask_size);
- f11->result_bits = (unsigned long *)((char *)f11
- + sizeof(struct f11_data) + mask_size * 2);
set_bit(fn->irq_pos, f11->abs_mask);
set_bit(fn->irq_pos + 1, f11->rel_mask);
valid_bytes = f11->sensor.attn_size;
memcpy(f11->sensor.data_pkt, drvdata->attn_data.data,
valid_bytes);
- drvdata->attn_data.data += f11->sensor.attn_size;
- drvdata->attn_data.size -= f11->sensor.attn_size;
+ drvdata->attn_data.data += valid_bytes;
+ drvdata->attn_data.size -= valid_bytes;
} else {
error = rmi_read_block(rmi_dev,
data_base_addr, f11->sensor.data_pkt,
const struct rmi_register_desc_item *data15;
u16 data15_offset;
+
+ unsigned long *abs_mask;
+ unsigned long *rel_mask;
};
static int rmi_f12_read_sensor_tuning(struct f12_data *f12)
valid_bytes = sensor->attn_size;
memcpy(sensor->data_pkt, drvdata->attn_data.data,
valid_bytes);
- drvdata->attn_data.data += sensor->attn_size;
- drvdata->attn_data.size -= sensor->attn_size;
+ drvdata->attn_data.data += valid_bytes;
+ drvdata->attn_data.size -= valid_bytes;
} else {
retval = rmi_read_block(rmi_dev, f12->data_addr,
sensor->data_pkt, sensor->pkt_size);
static int rmi_f12_config(struct rmi_function *fn)
{
struct rmi_driver *drv = fn->rmi_dev->driver;
+ struct f12_data *f12 = dev_get_drvdata(&fn->dev);
+ struct rmi_2d_sensor *sensor;
int ret;
- drv->set_irq_bits(fn->rmi_dev, fn->irq_mask);
+ sensor = &f12->sensor;
+
+ if (!sensor->report_abs)
+ drv->clear_irq_bits(fn->rmi_dev, f12->abs_mask);
+ else
+ drv->set_irq_bits(fn->rmi_dev, f12->abs_mask);
+
+ drv->clear_irq_bits(fn->rmi_dev, f12->rel_mask);
ret = rmi_f12_write_control_regs(fn);
if (ret)
struct rmi_device_platform_data *pdata = rmi_get_platform_data(rmi_dev);
struct rmi_driver_data *drvdata = dev_get_drvdata(&rmi_dev->dev);
u16 data_offset = 0;
+ int mask_size;
rmi_dbg(RMI_DEBUG_FN, &fn->dev, "%s\n", __func__);
+ mask_size = BITS_TO_LONGS(drvdata->irq_count) * sizeof(unsigned long);
+
ret = rmi_read(fn->rmi_dev, query_addr, &buf);
if (ret < 0) {
dev_err(&fn->dev, "Failed to read general info register: %d\n",
return -ENODEV;
}
- f12 = devm_kzalloc(&fn->dev, sizeof(struct f12_data), GFP_KERNEL);
+ f12 = devm_kzalloc(&fn->dev, sizeof(struct f12_data) + mask_size * 2,
+ GFP_KERNEL);
if (!f12)
return -ENOMEM;
+ f12->abs_mask = (unsigned long *)((char *)f12
+ + sizeof(struct f12_data));
+ f12->rel_mask = (unsigned long *)((char *)f12
+ + sizeof(struct f12_data) + mask_size);
+
+ set_bit(fn->irq_pos, f12->abs_mask);
+ set_bit(fn->irq_pos + 1, f12->rel_mask);
+
f12->has_dribble = !!(buf & BIT(3));
if (fn->dev.of_node) {
static const struct vb2_queue rmi_f54_queue = {
.type = V4L2_BUF_TYPE_VIDEO_CAPTURE,
.io_modes = VB2_MMAP | VB2_USERPTR | VB2_DMABUF | VB2_READ,
- .buf_struct_size = sizeof(struct vb2_buffer),
+ .buf_struct_size = sizeof(struct vb2_v4l2_buffer),
.ops = &rmi_f54_queue_ops,
.mem_ops = &vb2_vmalloc_memops,
.timestamp_flags = V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC,
{
struct rmi_driver *drv = fn->rmi_dev->driver;
- drv->set_irq_bits(fn->rmi_dev, fn->irq_mask);
+ drv->clear_irq_bits(fn->rmi_dev, fn->irq_mask);
return 0;
}
/* get sysinfo */
md->si = &cd->sysinfo;
- if (!md->si) {
- dev_err(dev, "%s: Fail get sysinfo pointer from core p=%p\n",
- __func__, md->si);
- goto error_get_sysinfo;
- }
rc = cyttsp4_setup_input_device(cd);
if (rc)
error_init_input:
input_free_device(md->input);
-error_get_sysinfo:
- input_set_drvdata(md->input, NULL);
error_alloc_failed:
dev_err(dev, "%s failed.\n", __func__);
return rc;
if (!path)
return;
+ mutex_lock(&icc_lock);
+
for (i = 0; i < path->num_nodes; i++)
path->reqs[i].tag = tag;
+
+ mutex_unlock(&icc_lock);
}
EXPORT_SYMBOL_GPL(icc_set_tag);
if (!qp)
return -ENOMEM;
- data = devm_kcalloc(dev, num_nodes, sizeof(*node), GFP_KERNEL);
+ data = devm_kzalloc(dev, struct_size(data, nodes, num_nodes),
+ GFP_KERNEL);
if (!data)
return -ENOMEM;
if (!qp)
return -ENOMEM;
- data = devm_kcalloc(&pdev->dev, num_nodes, sizeof(*node), GFP_KERNEL);
+ data = devm_kzalloc(&pdev->dev, struct_size(data, nodes, num_nodes),
+ GFP_KERNEL);
if (!data)
return -ENOMEM;
ignore_updelay = !rcu_dereference(bond->curr_active_slave);
bond_for_each_slave_rcu(bond, slave, iter) {
- slave->new_link = BOND_LINK_NOCHANGE;
- slave->link_new_state = slave->link;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
link_state = bond_check_dev_link(bond, slave->dev, 0);
}
if (slave->delay <= 0) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
continue;
}
slave->delay = 0;
if (slave->delay <= 0) {
- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
commit++;
ignore_updelay = false;
continue;
struct slave *slave, *primary;
bond_for_each_slave(bond, slave, iter) {
- switch (slave->new_link) {
+ switch (slave->link_new_state) {
case BOND_LINK_NOCHANGE:
/* For 802.3ad mode, check current slave speed and
* duplex again in case its port was disabled after
default:
slave_err(bond->dev, slave->dev, "invalid new link %d on slave\n",
- slave->new_link);
- slave->new_link = BOND_LINK_NOCHANGE;
+ slave->link_new_state);
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
continue;
}
bond_for_each_slave_rcu(bond, slave, iter) {
unsigned long trans_start = dev_trans_start(slave->dev);
- slave->new_link = BOND_LINK_NOCHANGE;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, trans_start, 1) &&
bond_time_in_interval(bond, slave->last_rx, 1)) {
- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
slave_state_changed = 1;
/* primary_slave has no meaning in round-robin
if (!bond_time_in_interval(bond, trans_start, 2) ||
!bond_time_in_interval(bond, slave->last_rx, 2)) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
slave_state_changed = 1;
if (slave->link_failure_count < UINT_MAX)
goto re_arm;
bond_for_each_slave(bond, slave, iter) {
- if (slave->new_link != BOND_LINK_NOCHANGE)
- slave->link = slave->new_link;
+ if (slave->link_new_state != BOND_LINK_NOCHANGE)
+ slave->link = slave->link_new_state;
}
if (slave_state_changed) {
}
/* Called to inspect slaves for active-backup mode ARP monitor link state
- * changes. Sets new_link in slaves to specify what action should take
- * place for the slave. Returns 0 if no changes are found, >0 if changes
- * to link states must be committed.
+ * changes. Sets proposed link state in slaves to specify what action
+ * should take place for the slave. Returns 0 if no changes are found, >0
+ * if changes to link states must be committed.
*
* Called with rcu_read_lock held.
*/
int commit = 0;
bond_for_each_slave_rcu(bond, slave, iter) {
- slave->new_link = BOND_LINK_NOCHANGE;
+ bond_propose_link_state(slave, BOND_LINK_NOCHANGE);
last_rx = slave_last_rx(bond, slave);
if (slave->link != BOND_LINK_UP) {
if (bond_time_in_interval(bond, last_rx, 1)) {
- slave->new_link = BOND_LINK_UP;
+ bond_propose_link_state(slave, BOND_LINK_UP);
commit++;
}
continue;
if (!bond_is_active_slave(slave) &&
!rcu_access_pointer(bond->current_arp_slave) &&
!bond_time_in_interval(bond, last_rx, 3)) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
}
if (bond_is_active_slave(slave) &&
(!bond_time_in_interval(bond, trans_start, 2) ||
!bond_time_in_interval(bond, last_rx, 2))) {
- slave->new_link = BOND_LINK_DOWN;
+ bond_propose_link_state(slave, BOND_LINK_DOWN);
commit++;
}
}
struct slave *slave;
bond_for_each_slave(bond, slave, iter) {
- switch (slave->new_link) {
+ switch (slave->link_new_state) {
case BOND_LINK_NOCHANGE:
continue;
continue;
default:
- slave_err(bond->dev, slave->dev, "impossible: new_link %d on slave\n",
- slave->new_link);
+ slave_err(bond->dev, slave->dev,
+ "impossible: link_new_state %d on slave\n",
+ slave->link_new_state);
continue;
}
#define CONTROL_EX_PDR BIT(8)
/* control register */
+#define CONTROL_SWR BIT(15)
#define CONTROL_TEST BIT(7)
#define CONTROL_CCE BIT(6)
#define CONTROL_DISABLE_AR BIT(5)
#define BTR_TSEG2_SHIFT 12
#define BTR_TSEG2_MASK (0x7 << BTR_TSEG2_SHIFT)
+/* interrupt register */
+#define INT_STS_PENDING 0x8000
+
/* brp extension register */
#define BRP_EXT_BRPE_MASK 0x0f
#define BRP_EXT_BRPE_SHIFT 0
IF_MCONT_RCV_EOB);
}
+static int c_can_software_reset(struct net_device *dev)
+{
+ struct c_can_priv *priv = netdev_priv(dev);
+ int retry = 0;
+
+ if (priv->type != BOSCH_D_CAN)
+ return 0;
+
+ priv->write_reg(priv, C_CAN_CTRL_REG, CONTROL_SWR | CONTROL_INIT);
+ while (priv->read_reg(priv, C_CAN_CTRL_REG) & CONTROL_SWR) {
+ msleep(20);
+ if (retry++ > 100) {
+ netdev_err(dev, "CCTRL: software reset failed\n");
+ return -EIO;
+ }
+ }
+
+ return 0;
+}
+
/*
* Configure C_CAN chip:
* - enable/disable auto-retransmission
static int c_can_chip_config(struct net_device *dev)
{
struct c_can_priv *priv = netdev_priv(dev);
+ int err;
+
+ err = c_can_software_reset(dev);
+ if (err)
+ return err;
/* enable automatic retransmission */
priv->write_reg(priv, C_CAN_CTRL_REG, CONTROL_ENABLE_AR);
struct can_berr_counter bec;
switch (error_type) {
+ case C_CAN_NO_ERROR:
+ priv->can.state = CAN_STATE_ERROR_ACTIVE;
+ break;
case C_CAN_ERROR_WARNING:
/* error warning state */
priv->can.can_stats.error_warning++;
ERR_CNT_RP_SHIFT;
switch (error_type) {
+ case C_CAN_NO_ERROR:
+ /* error warning state */
+ cf->can_id |= CAN_ERR_CRTL;
+ cf->data[1] = CAN_ERR_CRTL_ACTIVE;
+ cf->data[6] = bec.txerr;
+ cf->data[7] = bec.rxerr;
+ break;
case C_CAN_ERROR_WARNING:
/* error warning state */
cf->can_id |= CAN_ERR_CRTL;
u16 curr, last = priv->last_status;
int work_done = 0;
- priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG);
- /* Ack status on C_CAN. D_CAN is self clearing */
- if (priv->type != BOSCH_D_CAN)
- priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);
+ /* Only read the status register if a status interrupt was pending */
+ if (atomic_xchg(&priv->sie_pending, 0)) {
+ priv->last_status = curr = priv->read_reg(priv, C_CAN_STS_REG);
+ /* Ack status on C_CAN. D_CAN is self clearing */
+ if (priv->type != BOSCH_D_CAN)
+ priv->write_reg(priv, C_CAN_STS_REG, LEC_UNUSED);
+ } else {
+ /* no change detected ... */
+ curr = last;
+ }
/* handle state changes */
if ((curr & STATUS_EWARN) && (!(last & STATUS_EWARN))) {
/* handle bus recovery events */
if ((!(curr & STATUS_BOFF)) && (last & STATUS_BOFF)) {
netdev_dbg(dev, "left bus off state\n");
- priv->can.state = CAN_STATE_ERROR_ACTIVE;
+ work_done += c_can_handle_state_change(dev, C_CAN_ERROR_PASSIVE);
}
+
if ((!(curr & STATUS_EPASS)) && (last & STATUS_EPASS)) {
netdev_dbg(dev, "left error passive state\n");
- priv->can.state = CAN_STATE_ERROR_ACTIVE;
+ work_done += c_can_handle_state_change(dev, C_CAN_ERROR_WARNING);
+ }
+
+ if ((!(curr & STATUS_EWARN)) && (last & STATUS_EWARN)) {
+ netdev_dbg(dev, "left error warning state\n");
+ work_done += c_can_handle_state_change(dev, C_CAN_NO_ERROR);
}
/* handle lec errors on the bus */
{
struct net_device *dev = (struct net_device *)dev_id;
struct c_can_priv *priv = netdev_priv(dev);
+ int reg_int;
- if (!priv->read_reg(priv, C_CAN_INT_REG))
+ reg_int = priv->read_reg(priv, C_CAN_INT_REG);
+ if (!reg_int)
return IRQ_NONE;
+ /* save for later use */
+ if (reg_int & INT_STS_PENDING)
+ atomic_set(&priv->sie_pending, 1);
+
/* disable all interrupts and schedule the NAPI */
c_can_irq_control(priv, false);
napi_schedule(&priv->napi);
struct net_device *dev;
struct device *device;
atomic_t tx_active;
+ atomic_t sie_pending;
unsigned long tx_dir;
int last_status;
u16 (*read_reg) (const struct c_can_priv *priv, enum reg index);
return;
ret = of_property_read_u32(dn, "max-bitrate", &priv->bitrate_max);
+ of_node_put(dn);
if ((ret && ret != -EINVAL) || (!ret && !priv->bitrate_max))
netdev_warn(dev, "Invalid value for transceiver max bitrate. Ignoring bitrate limit.\n");
}
struct can_frame *cf;
bool rx_errors = false, tx_errors = false;
u32 timestamp;
+ int err;
timestamp = priv->read(®s->timer) << 16;
if (tx_errors)
dev->stats.tx_errors++;
- can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ err = can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ if (err)
+ dev->stats.rx_fifo_errors++;
}
static void flexcan_irq_state(struct net_device *dev, u32 reg_esr)
int flt;
struct can_berr_counter bec;
u32 timestamp;
+ int err;
timestamp = priv->read(®s->timer) << 16;
if (unlikely(new_state == CAN_STATE_BUS_OFF))
can_bus_off(dev);
- can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ err = can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ if (err)
+ dev->stats.rx_fifo_errors++;
}
static inline struct flexcan_priv *rx_offload_to_priv(struct can_rx_offload *offload)
reg_mecr = priv->read(®s->mecr);
reg_mecr &= ~FLEXCAN_MECR_ECRWRDIS;
priv->write(reg_mecr, ®s->mecr);
+ reg_mecr |= FLEXCAN_MECR_ECCDIS;
reg_mecr &= ~(FLEXCAN_MECR_NCEFAFRZ | FLEXCAN_MECR_HANCEI_MSK |
FLEXCAN_MECR_FANCEI_MSK);
priv->write(reg_mecr, ®s->mecr);
return cb_b->timestamp - cb_a->timestamp;
}
-static struct sk_buff *can_rx_offload_offload_one(struct can_rx_offload *offload, unsigned int n)
+/**
+ * can_rx_offload_offload_one() - Read one CAN frame from HW
+ * @offload: pointer to rx_offload context
+ * @n: number of mailbox to read
+ *
+ * The task of this function is to read a CAN frame from mailbox @n
+ * from the device and return the mailbox's content as a struct
+ * sk_buff.
+ *
+ * If the struct can_rx_offload::skb_queue exceeds the maximal queue
+ * length (struct can_rx_offload::skb_queue_len_max) or no skb can be
+ * allocated, the mailbox contents is discarded by reading it into an
+ * overflow buffer. This way the mailbox is marked as free by the
+ * driver.
+ *
+ * Return: A pointer to skb containing the CAN frame on success.
+ *
+ * NULL if the mailbox @n is empty.
+ *
+ * ERR_PTR() in case of an error
+ */
+static struct sk_buff *
+can_rx_offload_offload_one(struct can_rx_offload *offload, unsigned int n)
{
- struct sk_buff *skb = NULL;
+ struct sk_buff *skb = NULL, *skb_error = NULL;
struct can_rx_offload_cb *cb;
struct can_frame *cf;
int ret;
- /* If queue is full or skb not available, read to discard mailbox */
- if (likely(skb_queue_len(&offload->skb_queue) <=
- offload->skb_queue_len_max))
+ if (likely(skb_queue_len(&offload->skb_queue) <
+ offload->skb_queue_len_max)) {
skb = alloc_can_skb(offload->dev, &cf);
+ if (unlikely(!skb))
+ skb_error = ERR_PTR(-ENOMEM); /* skb alloc failed */
+ } else {
+ skb_error = ERR_PTR(-ENOBUFS); /* skb_queue is full */
+ }
- if (!skb) {
+ /* If queue is full or skb not available, drop by reading into
+ * overflow buffer.
+ */
+ if (unlikely(skb_error)) {
struct can_frame cf_overflow;
u32 timestamp;
ret = offload->mailbox_read(offload, &cf_overflow,
×tamp, n);
- if (ret)
- offload->dev->stats.rx_dropped++;
- return NULL;
+ /* Mailbox was empty. */
+ if (unlikely(!ret))
+ return NULL;
+
+ /* Mailbox has been read and we're dropping it or
+ * there was a problem reading the mailbox.
+ *
+ * Increment error counters in any case.
+ */
+ offload->dev->stats.rx_dropped++;
+ offload->dev->stats.rx_fifo_errors++;
+
+ /* There was a problem reading the mailbox, propagate
+ * error value.
+ */
+ if (unlikely(ret < 0))
+ return ERR_PTR(ret);
+
+ return skb_error;
}
cb = can_rx_offload_get_cb(skb);
ret = offload->mailbox_read(offload, cf, &cb->timestamp, n);
- if (!ret) {
+
+ /* Mailbox was empty. */
+ if (unlikely(!ret)) {
kfree_skb(skb);
return NULL;
}
+ /* There was a problem reading the mailbox, propagate error value. */
+ if (unlikely(ret < 0)) {
+ kfree_skb(skb);
+
+ offload->dev->stats.rx_dropped++;
+ offload->dev->stats.rx_fifo_errors++;
+
+ return ERR_PTR(ret);
+ }
+
+ /* Mailbox was read. */
return skb;
}
continue;
skb = can_rx_offload_offload_one(offload, i);
- if (!skb)
- break;
+ if (IS_ERR_OR_NULL(skb))
+ continue;
__skb_queue_add_sort(&skb_queue, skb, can_rx_offload_compare);
}
struct sk_buff *skb;
int received = 0;
- while ((skb = can_rx_offload_offload_one(offload, 0))) {
+ while (1) {
+ skb = can_rx_offload_offload_one(offload, 0);
+ if (IS_ERR(skb))
+ continue;
+ if (!skb)
+ break;
+
skb_queue_tail(&offload->skb_queue, skb);
received++;
}
unsigned long flags;
if (skb_queue_len(&offload->skb_queue) >
- offload->skb_queue_len_max)
- return -ENOMEM;
+ offload->skb_queue_len_max) {
+ kfree_skb(skb);
+ return -ENOBUFS;
+ }
cb = can_rx_offload_get_cb(skb);
cb->timestamp = timestamp;
struct sk_buff *skb)
{
if (skb_queue_len(&offload->skb_queue) >
- offload->skb_queue_len_max)
- return -ENOMEM;
+ offload->skb_queue_len_max) {
+ kfree_skb(skb);
+ return -ENOBUFS;
+ }
skb_queue_tail(&offload->skb_queue, skb);
can_rx_offload_schedule(offload);
if (priv->after_suspend) {
mcp251x_hw_reset(spi);
mcp251x_setup(net, spi);
+ priv->force_quit = 0;
if (priv->after_suspend & AFTER_SUSPEND_RESTART) {
mcp251x_set_normal_mode(spi);
} else if (priv->after_suspend & AFTER_SUSPEND_UP) {
mcp251x_hw_sleep(spi);
}
priv->after_suspend = 0;
- priv->force_quit = 0;
}
if (priv->restart_tx) {
*/
#define HECC_MAX_RX_MBOX (HECC_MAX_MAILBOXES - HECC_MAX_TX_MBOX)
#define HECC_RX_FIRST_MBOX (HECC_MAX_MAILBOXES - 1)
+#define HECC_RX_LAST_MBOX (HECC_MAX_TX_MBOX)
/* TI HECC module registers */
#define HECC_CANME 0x0 /* Mailbox enable */
#define HECC_CANTA 0x10 /* Transmission acknowledge */
#define HECC_CANAA 0x14 /* Abort acknowledge */
#define HECC_CANRMP 0x18 /* Receive message pending */
-#define HECC_CANRML 0x1C /* Remote message lost */
+#define HECC_CANRML 0x1C /* Receive message lost */
#define HECC_CANRFP 0x20 /* Remote frame pending */
#define HECC_CANGAM 0x24 /* SECC only:Global acceptance mask */
#define HECC_CANMC 0x28 /* Master control */
#define HECC_BUS_ERROR (HECC_CANES_FE | HECC_CANES_BE |\
HECC_CANES_CRCE | HECC_CANES_SE |\
HECC_CANES_ACKE)
+#define HECC_CANES_FLAGS (HECC_BUS_ERROR | HECC_CANES_BO |\
+ HECC_CANES_EP | HECC_CANES_EW)
#define HECC_CANMCF_RTR BIT(4) /* Remote transmit request */
hecc_set_bit(priv, HECC_CANMIM, mbx_mask);
}
- /* Prevent message over-write & Enable interrupts */
- hecc_write(priv, HECC_CANOPC, HECC_SET_REG);
+ /* Enable tx interrupts */
+ hecc_set_bit(priv, HECC_CANMIM, BIT(HECC_MAX_TX_MBOX) - 1);
+
+ /* Prevent message over-write to create a rx fifo, but not for
+ * the lowest priority mailbox, since that allows detecting
+ * overflows instead of the hardware silently dropping the
+ * messages.
+ */
+ mbx_mask = ~BIT(HECC_RX_LAST_MBOX);
+ hecc_write(priv, HECC_CANOPC, mbx_mask);
+
+ /* Enable interrupts */
if (priv->use_hecc1int) {
hecc_write(priv, HECC_CANMIL, HECC_SET_REG);
hecc_write(priv, HECC_CANGIM, HECC_CANGIM_DEF_MASK |
{
struct ti_hecc_priv *priv = netdev_priv(ndev);
+ /* Disable the CPK; stop sending, erroring and acking */
+ hecc_set_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
+
/* Disable interrupts and disable mailboxes */
hecc_write(priv, HECC_CANGIM, 0);
hecc_write(priv, HECC_CANMIM, 0);
hecc_set_bit(priv, HECC_CANME, mbx_mask);
spin_unlock_irqrestore(&priv->mbx_lock, flags);
- hecc_clear_bit(priv, HECC_CANMD, mbx_mask);
- hecc_set_bit(priv, HECC_CANMIM, mbx_mask);
hecc_write(priv, HECC_CANTRS, mbx_mask);
return NETDEV_TX_OK;
u32 *timestamp, unsigned int mbxno)
{
struct ti_hecc_priv *priv = rx_offload_to_priv(offload);
- u32 data;
+ u32 data, mbx_mask;
+ int ret = 1;
+ mbx_mask = BIT(mbxno);
data = hecc_read_mbx(priv, mbxno, HECC_CANMID);
if (data & HECC_CANMID_IDE)
cf->can_id = (data & CAN_EFF_MASK) | CAN_EFF_FLAG;
*timestamp = hecc_read_stamp(priv, mbxno);
- return 1;
+ /* Check for FIFO overrun.
+ *
+ * All but the last RX mailbox have activated overwrite
+ * protection. So skip check for overrun, if we're not
+ * handling the last RX mailbox.
+ *
+ * As the overwrite protection for the last RX mailbox is
+ * disabled, the CAN core might update while we're reading
+ * it. This means the skb might be inconsistent.
+ *
+ * Return an error to let rx-offload discard this CAN frame.
+ */
+ if (unlikely(mbxno == HECC_RX_LAST_MBOX &&
+ hecc_read(priv, HECC_CANRML) & mbx_mask))
+ ret = -ENOBUFS;
+
+ hecc_write(priv, HECC_CANRMP, mbx_mask);
+
+ return ret;
}
static int ti_hecc_error(struct net_device *ndev, int int_status,
struct can_frame *cf;
struct sk_buff *skb;
u32 timestamp;
+ int err;
- /* propagate the error condition to the can stack */
- skb = alloc_can_err_skb(ndev, &cf);
- if (!skb) {
- if (printk_ratelimit())
- netdev_err(priv->ndev,
- "%s: alloc_can_err_skb() failed\n",
- __func__);
- return -ENOMEM;
- }
-
- if (int_status & HECC_CANGIF_WLIF) { /* warning level int */
- if ((int_status & HECC_CANGIF_BOIF) == 0) {
- priv->can.state = CAN_STATE_ERROR_WARNING;
- ++priv->can.can_stats.error_warning;
- cf->can_id |= CAN_ERR_CRTL;
- if (hecc_read(priv, HECC_CANTEC) > 96)
- cf->data[1] |= CAN_ERR_CRTL_TX_WARNING;
- if (hecc_read(priv, HECC_CANREC) > 96)
- cf->data[1] |= CAN_ERR_CRTL_RX_WARNING;
- }
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_EW);
- netdev_dbg(priv->ndev, "Error Warning interrupt\n");
- hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
- }
-
- if (int_status & HECC_CANGIF_EPIF) { /* error passive int */
- if ((int_status & HECC_CANGIF_BOIF) == 0) {
- priv->can.state = CAN_STATE_ERROR_PASSIVE;
- ++priv->can.can_stats.error_passive;
- cf->can_id |= CAN_ERR_CRTL;
- if (hecc_read(priv, HECC_CANTEC) > 127)
- cf->data[1] |= CAN_ERR_CRTL_TX_PASSIVE;
- if (hecc_read(priv, HECC_CANREC) > 127)
- cf->data[1] |= CAN_ERR_CRTL_RX_PASSIVE;
+ if (err_status & HECC_BUS_ERROR) {
+ /* propagate the error condition to the can stack */
+ skb = alloc_can_err_skb(ndev, &cf);
+ if (!skb) {
+ if (net_ratelimit())
+ netdev_err(priv->ndev,
+ "%s: alloc_can_err_skb() failed\n",
+ __func__);
+ return -ENOMEM;
}
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_EP);
- netdev_dbg(priv->ndev, "Error passive interrupt\n");
- hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
- }
-
- /* Need to check busoff condition in error status register too to
- * ensure warning interrupts don't hog the system
- */
- if ((int_status & HECC_CANGIF_BOIF) || (err_status & HECC_CANES_BO)) {
- priv->can.state = CAN_STATE_BUS_OFF;
- cf->can_id |= CAN_ERR_BUSOFF;
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_BO);
- hecc_clear_bit(priv, HECC_CANMC, HECC_CANMC_CCR);
- /* Disable all interrupts in bus-off to avoid int hog */
- hecc_write(priv, HECC_CANGIM, 0);
- ++priv->can.can_stats.bus_off;
- can_bus_off(ndev);
- }
- if (err_status & HECC_BUS_ERROR) {
++priv->can.can_stats.bus_error;
cf->can_id |= CAN_ERR_BUSERROR | CAN_ERR_PROT;
- if (err_status & HECC_CANES_FE) {
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_FE);
+ if (err_status & HECC_CANES_FE)
cf->data[2] |= CAN_ERR_PROT_FORM;
- }
- if (err_status & HECC_CANES_BE) {
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_BE);
+ if (err_status & HECC_CANES_BE)
cf->data[2] |= CAN_ERR_PROT_BIT;
- }
- if (err_status & HECC_CANES_SE) {
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_SE);
+ if (err_status & HECC_CANES_SE)
cf->data[2] |= CAN_ERR_PROT_STUFF;
- }
- if (err_status & HECC_CANES_CRCE) {
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_CRCE);
+ if (err_status & HECC_CANES_CRCE)
cf->data[3] = CAN_ERR_PROT_LOC_CRC_SEQ;
- }
- if (err_status & HECC_CANES_ACKE) {
- hecc_set_bit(priv, HECC_CANES, HECC_CANES_ACKE);
+ if (err_status & HECC_CANES_ACKE)
cf->data[3] = CAN_ERR_PROT_LOC_ACK;
- }
+
+ timestamp = hecc_read(priv, HECC_CANLNT);
+ err = can_rx_offload_queue_sorted(&priv->offload, skb,
+ timestamp);
+ if (err)
+ ndev->stats.rx_fifo_errors++;
}
- timestamp = hecc_read(priv, HECC_CANLNT);
- can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ hecc_write(priv, HECC_CANES, HECC_CANES_FLAGS);
return 0;
}
+static void ti_hecc_change_state(struct net_device *ndev,
+ enum can_state rx_state,
+ enum can_state tx_state)
+{
+ struct ti_hecc_priv *priv = netdev_priv(ndev);
+ struct can_frame *cf;
+ struct sk_buff *skb;
+ u32 timestamp;
+ int err;
+
+ skb = alloc_can_err_skb(priv->ndev, &cf);
+ if (unlikely(!skb)) {
+ priv->can.state = max(tx_state, rx_state);
+ return;
+ }
+
+ can_change_state(priv->ndev, cf, tx_state, rx_state);
+
+ if (max(tx_state, rx_state) != CAN_STATE_BUS_OFF) {
+ cf->data[6] = hecc_read(priv, HECC_CANTEC);
+ cf->data[7] = hecc_read(priv, HECC_CANREC);
+ }
+
+ timestamp = hecc_read(priv, HECC_CANLNT);
+ err = can_rx_offload_queue_sorted(&priv->offload, skb, timestamp);
+ if (err)
+ ndev->stats.rx_fifo_errors++;
+}
+
static irqreturn_t ti_hecc_interrupt(int irq, void *dev_id)
{
struct net_device *ndev = (struct net_device *)dev_id;
struct net_device_stats *stats = &ndev->stats;
u32 mbxno, mbx_mask, int_status, err_status, stamp;
unsigned long flags, rx_pending;
+ u32 handled = 0;
int_status = hecc_read(priv,
priv->use_hecc1int ?
return IRQ_NONE;
err_status = hecc_read(priv, HECC_CANES);
- if (err_status & (HECC_BUS_ERROR | HECC_CANES_BO |
- HECC_CANES_EP | HECC_CANES_EW))
+ if (unlikely(err_status & HECC_CANES_FLAGS))
ti_hecc_error(ndev, int_status, err_status);
+ if (unlikely(int_status & HECC_CANGIM_DEF_MASK)) {
+ enum can_state rx_state, tx_state;
+ u32 rec = hecc_read(priv, HECC_CANREC);
+ u32 tec = hecc_read(priv, HECC_CANTEC);
+
+ if (int_status & HECC_CANGIF_WLIF) {
+ handled |= HECC_CANGIF_WLIF;
+ rx_state = rec >= tec ? CAN_STATE_ERROR_WARNING : 0;
+ tx_state = rec <= tec ? CAN_STATE_ERROR_WARNING : 0;
+ netdev_dbg(priv->ndev, "Error Warning interrupt\n");
+ ti_hecc_change_state(ndev, rx_state, tx_state);
+ }
+
+ if (int_status & HECC_CANGIF_EPIF) {
+ handled |= HECC_CANGIF_EPIF;
+ rx_state = rec >= tec ? CAN_STATE_ERROR_PASSIVE : 0;
+ tx_state = rec <= tec ? CAN_STATE_ERROR_PASSIVE : 0;
+ netdev_dbg(priv->ndev, "Error passive interrupt\n");
+ ti_hecc_change_state(ndev, rx_state, tx_state);
+ }
+
+ if (int_status & HECC_CANGIF_BOIF) {
+ handled |= HECC_CANGIF_BOIF;
+ rx_state = CAN_STATE_BUS_OFF;
+ tx_state = CAN_STATE_BUS_OFF;
+ netdev_dbg(priv->ndev, "Bus off interrupt\n");
+
+ /* Disable all interrupts */
+ hecc_write(priv, HECC_CANGIM, 0);
+ can_bus_off(ndev);
+ ti_hecc_change_state(ndev, rx_state, tx_state);
+ }
+ } else if (unlikely(priv->can.state != CAN_STATE_ERROR_ACTIVE)) {
+ enum can_state new_state, tx_state, rx_state;
+ u32 rec = hecc_read(priv, HECC_CANREC);
+ u32 tec = hecc_read(priv, HECC_CANTEC);
+
+ if (rec >= 128 || tec >= 128)
+ new_state = CAN_STATE_ERROR_PASSIVE;
+ else if (rec >= 96 || tec >= 96)
+ new_state = CAN_STATE_ERROR_WARNING;
+ else
+ new_state = CAN_STATE_ERROR_ACTIVE;
+
+ if (new_state < priv->can.state) {
+ rx_state = rec >= tec ? new_state : 0;
+ tx_state = rec <= tec ? new_state : 0;
+ ti_hecc_change_state(ndev, rx_state, tx_state);
+ }
+ }
+
if (int_status & HECC_CANGIF_GMIF) {
while (priv->tx_tail - priv->tx_head > 0) {
mbxno = get_tx_tail_mb(priv);
mbx_mask = BIT(mbxno);
if (!(mbx_mask & hecc_read(priv, HECC_CANTA)))
break;
- hecc_clear_bit(priv, HECC_CANMIM, mbx_mask);
hecc_write(priv, HECC_CANTA, mbx_mask);
spin_lock_irqsave(&priv->mbx_lock, flags);
hecc_clear_bit(priv, HECC_CANME, mbx_mask);
while ((rx_pending = hecc_read(priv, HECC_CANRMP))) {
can_rx_offload_irq_offload_timestamp(&priv->offload,
rx_pending);
- hecc_write(priv, HECC_CANRMP, rx_pending);
}
}
/* clear all interrupt conditions - read back to avoid spurious ints */
if (priv->use_hecc1int) {
- hecc_write(priv, HECC_CANGIF1, HECC_SET_REG);
+ hecc_write(priv, HECC_CANGIF1, handled);
int_status = hecc_read(priv, HECC_CANGIF1);
} else {
- hecc_write(priv, HECC_CANGIF0, HECC_SET_REG);
+ hecc_write(priv, HECC_CANGIF0, handled);
int_status = hecc_read(priv, HECC_CANGIF0);
}
priv->offload.mailbox_read = ti_hecc_mailbox_read;
priv->offload.mb_first = HECC_RX_FIRST_MBOX;
- priv->offload.mb_last = HECC_MAX_TX_MBOX;
+ priv->offload.mb_last = HECC_RX_LAST_MBOX;
err = can_rx_offload_add_timestamp(ndev, &priv->offload);
if (err) {
dev_err(&pdev->dev, "can_rx_offload_add_timestamp() failed\n");
rc);
usb_unanchor_urb(urb);
+ usb_free_urb(urb);
break;
}
netdev_info(priv->netdev, "device disconnected\n");
unregister_candev(priv->netdev);
- free_candev(priv->netdev);
-
mcba_urb_unlink(priv);
+ free_candev(priv->netdev);
}
static struct usb_driver mcba_usb_driver = {
u8 *end;
u8 rec_cnt;
u8 rec_idx;
- u8 rec_data_idx;
+ u8 rec_ts_idx;
struct net_device *netdev;
struct pcan_usb *pdev;
};
}
if ((n & PCAN_USB_ERROR_BUS_LIGHT) == 0) {
/* no error (back to active state) */
- mc->pdev->dev.can.state = CAN_STATE_ERROR_ACTIVE;
- return 0;
+ new_state = CAN_STATE_ERROR_ACTIVE;
+ break;
}
break;
}
if ((n & PCAN_USB_ERROR_BUS_HEAVY) == 0) {
- /* no error (back to active state) */
- mc->pdev->dev.can.state = CAN_STATE_ERROR_ACTIVE;
- return 0;
+ /* no error (back to warning state) */
+ new_state = CAN_STATE_ERROR_WARNING;
+ break;
}
break;
mc->pdev->dev.can.can_stats.error_warning++;
break;
+ case CAN_STATE_ERROR_ACTIVE:
+ cf->can_id |= CAN_ERR_CRTL;
+ cf->data[1] = CAN_ERR_CRTL_ACTIVE;
+ break;
+
default:
/* CAN_STATE_MAX (trick to handle other errors) */
cf->can_id |= CAN_ERR_CRTL;
mc->ptr += PCAN_USB_CMD_ARGS;
if (status_len & PCAN_USB_STATUSLEN_TIMESTAMP) {
- int err = pcan_usb_decode_ts(mc, !mc->rec_idx);
+ int err = pcan_usb_decode_ts(mc, !mc->rec_ts_idx);
if (err)
return err;
+
+ /* Next packet in the buffer will have a timestamp on a single
+ * byte
+ */
+ mc->rec_ts_idx++;
}
switch (f) {
cf->can_dlc = get_can_dlc(rec_len);
- /* first data packet timestamp is a word */
- if (pcan_usb_decode_ts(mc, !mc->rec_data_idx))
+ /* Only first packet timestamp is a word */
+ if (pcan_usb_decode_ts(mc, !mc->rec_ts_idx))
goto decode_failed;
+ /* Next packet in the buffer will have a timestamp on a single byte */
+ mc->rec_ts_idx++;
+
/* read data */
memset(cf->data, 0x0, sizeof(cf->data));
if (status_len & PCAN_USB_STATUSLEN_RTR) {
/* handle normal can frames here */
} else {
err = pcan_usb_decode_data(&mc, sl);
- mc.rec_data_idx++;
}
}
dev = netdev_priv(netdev);
/* allocate a buffer large enough to send commands */
- dev->cmd_buf = kmalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
+ dev->cmd_buf = kzalloc(PCAN_USB_MAX_CMD_LEN, GFP_KERNEL);
if (!dev->cmd_buf) {
err = -ENOMEM;
goto lbl_free_candev;
netdev_info(priv->netdev, "device disconnected\n");
unregister_netdev(priv->netdev);
- free_candev(priv->netdev);
-
unlink_all_urbs(priv);
+ free_candev(priv->netdev);
}
}
static const struct xcan_devtype_data xcan_axi_data = {
.cantype = XAXI_CAN,
- .flags = XCAN_FLAG_TXFEMP,
.bittiming_const = &xcan_bittiming_const,
.btr_ts2_shift = XCAN_BTR_TS2_SHIFT,
.btr_sjw_shift = XCAN_BTR_SJW_SHIFT,
struct bcm_sf2_priv *priv = platform_get_drvdata(pdev);
priv->wol_ports_mask = 0;
+ /* Disable interrupts */
+ bcm_sf2_intr_disable(priv);
dsa_unregister_switch(priv->dev->ds);
bcm_sf2_cfp_exit(priv->dev->ds);
- /* Disable all ports and interrupts */
- bcm_sf2_sw_suspend(priv->dev->ds);
bcm_sf2_mdio_unregister(priv);
return 0;
/* issue soft reset with (rg)mii loopback to ensure a stable rxclk */
bcmgenet_umac_writel(priv, CMD_SW_RESET | CMD_LCL_LOOP_EN, UMAC_CMD);
- udelay(2);
- bcmgenet_umac_writel(priv, 0, UMAC_CMD);
}
static void bcmgenet_intr_disable(struct bcmgenet_priv *priv)
spin_unlock_irq(&priv->lock);
if (status & UMAC_IRQ_PHY_DET_R &&
- priv->dev->phydev->autoneg != AUTONEG_ENABLE)
+ priv->dev->phydev->autoneg != AUTONEG_ENABLE) {
phy_init_hw(priv->dev->phydev);
+ genphy_config_aneg(priv->dev->phydev);
+ }
/* Link UP/DOWN event */
if (status & UMAC_IRQ_LINK_EVENT)
if (priv->internal_phy)
bcmgenet_power_up(priv, GENET_POWER_PASSIVE);
- ret = bcmgenet_mii_connect(dev);
- if (ret) {
- netdev_err(dev, "failed to connect to PHY\n");
- goto err_clk_disable;
- }
-
/* take MAC out of reset */
bcmgenet_umac_reset(priv);
reg = bcmgenet_umac_readl(priv, UMAC_CMD);
priv->crc_fwd_en = !!(reg & CMD_CRC_FWD);
- ret = bcmgenet_mii_config(dev, true);
- if (ret) {
- netdev_err(dev, "unsupported PHY\n");
- goto err_disconnect_phy;
- }
-
bcmgenet_set_hw_addr(priv, dev->dev_addr);
if (priv->internal_phy) {
ret = bcmgenet_init_dma(priv);
if (ret) {
netdev_err(dev, "failed to initialize DMA\n");
- goto err_disconnect_phy;
+ goto err_clk_disable;
}
/* Always enable ring 16 - descriptor ring */
goto err_irq0;
}
+ ret = bcmgenet_mii_probe(dev);
+ if (ret) {
+ netdev_err(dev, "failed to connect to PHY\n");
+ goto err_irq1;
+ }
+
bcmgenet_netif_start(dev);
netif_tx_start_all_queues(dev);
return 0;
+err_irq1:
+ free_irq(priv->irq1, priv);
err_irq0:
free_irq(priv->irq0, priv);
err_fini_dma:
bcmgenet_dma_teardown(priv);
bcmgenet_fini_dma(priv);
-err_disconnect_phy:
- phy_disconnect(dev->phydev);
err_clk_disable:
if (priv->internal_phy)
bcmgenet_power_down(priv, GENET_POWER_PASSIVE);
if (priv->internal_phy)
bcmgenet_power_up(priv, GENET_POWER_PASSIVE);
- phy_init_hw(dev->phydev);
-
bcmgenet_umac_reset(priv);
init_umac(priv);
if (priv->wolopts)
clk_disable_unprepare(priv->clk_wol);
+ phy_init_hw(dev->phydev);
+
/* Speed settings must be restored */
+ genphy_config_aneg(dev->phydev);
bcmgenet_mii_config(priv->dev, false);
bcmgenet_set_hw_addr(priv, dev->dev_addr);
/* MDIO routines */
int bcmgenet_mii_init(struct net_device *dev);
-int bcmgenet_mii_connect(struct net_device *dev);
int bcmgenet_mii_config(struct net_device *dev, bool init);
+int bcmgenet_mii_probe(struct net_device *dev);
void bcmgenet_mii_exit(struct net_device *dev);
void bcmgenet_phy_power_set(struct net_device *dev, bool enable);
void bcmgenet_mii_setup(struct net_device *dev);
bcmgenet_fixed_phy_link_update);
}
-int bcmgenet_mii_connect(struct net_device *dev)
-{
- struct bcmgenet_priv *priv = netdev_priv(dev);
- struct device_node *dn = priv->pdev->dev.of_node;
- struct phy_device *phydev;
- u32 phy_flags = 0;
- int ret;
-
- /* Communicate the integrated PHY revision */
- if (priv->internal_phy)
- phy_flags = priv->gphy_rev;
-
- /* Initialize link state variables that bcmgenet_mii_setup() uses */
- priv->old_link = -1;
- priv->old_speed = -1;
- priv->old_duplex = -1;
- priv->old_pause = -1;
-
- if (dn) {
- phydev = of_phy_connect(dev, priv->phy_dn, bcmgenet_mii_setup,
- phy_flags, priv->phy_interface);
- if (!phydev) {
- pr_err("could not attach to PHY\n");
- return -ENODEV;
- }
- } else {
- phydev = dev->phydev;
- phydev->dev_flags = phy_flags;
-
- ret = phy_connect_direct(dev, phydev, bcmgenet_mii_setup,
- priv->phy_interface);
- if (ret) {
- pr_err("could not attach to PHY\n");
- return -ENODEV;
- }
- }
-
- return 0;
-}
-
int bcmgenet_mii_config(struct net_device *dev, bool init)
{
struct bcmgenet_priv *priv = netdev_priv(dev);
const char *phy_name = NULL;
u32 id_mode_dis = 0;
u32 port_ctrl;
+ int bmcr = -1;
+ int ret;
u32 reg;
+ /* MAC clocking workaround during reset of umac state machines */
+ reg = bcmgenet_umac_readl(priv, UMAC_CMD);
+ if (reg & CMD_SW_RESET) {
+ /* An MII PHY must be isolated to prevent TXC contention */
+ if (priv->phy_interface == PHY_INTERFACE_MODE_MII) {
+ ret = phy_read(phydev, MII_BMCR);
+ if (ret >= 0) {
+ bmcr = ret;
+ ret = phy_write(phydev, MII_BMCR,
+ bmcr | BMCR_ISOLATE);
+ }
+ if (ret) {
+ netdev_err(dev, "failed to isolate PHY\n");
+ return ret;
+ }
+ }
+ /* Switch MAC clocking to RGMII generated clock */
+ bcmgenet_sys_writel(priv, PORT_MODE_EXT_GPHY, SYS_PORT_CTRL);
+ /* Ensure 5 clks with Rx disabled
+ * followed by 5 clks with Reset asserted
+ */
+ udelay(4);
+ reg &= ~(CMD_SW_RESET | CMD_LCL_LOOP_EN);
+ bcmgenet_umac_writel(priv, reg, UMAC_CMD);
+ /* Ensure 5 more clocks before Rx is enabled */
+ udelay(2);
+ }
+
priv->ext_phy = !priv->internal_phy &&
(priv->phy_interface != PHY_INTERFACE_MODE_MOCA);
phy_set_max_speed(phydev, SPEED_100);
bcmgenet_sys_writel(priv,
PORT_MODE_EXT_EPHY, SYS_PORT_CTRL);
+ /* Restore the MII PHY after isolation */
+ if (bmcr >= 0)
+ phy_write(phydev, MII_BMCR, bmcr);
break;
case PHY_INTERFACE_MODE_REVMII:
bcmgenet_ext_writel(priv, reg, EXT_RGMII_OOB_CTRL);
}
- if (init) {
- linkmode_copy(phydev->advertising, phydev->supported);
+ if (init)
+ dev_info(kdev, "configuring instance for %s\n", phy_name);
- /* The internal PHY has its link interrupts routed to the
- * Ethernet MAC ISRs. On GENETv5 there is a hardware issue
- * that prevents the signaling of link UP interrupts when
- * the link operates at 10Mbps, so fallback to polling for
- * those versions of GENET.
- */
- if (priv->internal_phy && !GENET_IS_V5(priv))
- phydev->irq = PHY_IGNORE_INTERRUPT;
+ return 0;
+}
- dev_info(kdev, "configuring instance for %s\n", phy_name);
+int bcmgenet_mii_probe(struct net_device *dev)
+{
+ struct bcmgenet_priv *priv = netdev_priv(dev);
+ struct device_node *dn = priv->pdev->dev.of_node;
+ struct phy_device *phydev;
+ u32 phy_flags = 0;
+ int ret;
+
+ /* Communicate the integrated PHY revision */
+ if (priv->internal_phy)
+ phy_flags = priv->gphy_rev;
+
+ /* Initialize link state variables that bcmgenet_mii_setup() uses */
+ priv->old_link = -1;
+ priv->old_speed = -1;
+ priv->old_duplex = -1;
+ priv->old_pause = -1;
+
+ if (dn) {
+ phydev = of_phy_connect(dev, priv->phy_dn, bcmgenet_mii_setup,
+ phy_flags, priv->phy_interface);
+ if (!phydev) {
+ pr_err("could not attach to PHY\n");
+ return -ENODEV;
+ }
+ } else {
+ phydev = dev->phydev;
+ phydev->dev_flags = phy_flags;
+
+ ret = phy_connect_direct(dev, phydev, bcmgenet_mii_setup,
+ priv->phy_interface);
+ if (ret) {
+ pr_err("could not attach to PHY\n");
+ return -ENODEV;
+ }
}
+ /* Configure port multiplexer based on what the probed PHY device since
+ * reading the 'max-speed' property determines the maximum supported
+ * PHY speed which is needed for bcmgenet_mii_config() to configure
+ * things appropriately.
+ */
+ ret = bcmgenet_mii_config(dev, true);
+ if (ret) {
+ phy_disconnect(dev->phydev);
+ return ret;
+ }
+
+ linkmode_copy(phydev->advertising, phydev->supported);
+
+ /* The internal PHY has its link interrupts routed to the
+ * Ethernet MAC ISRs. On GENETv5 there is a hardware issue
+ * that prevents the signaling of link UP interrupts when
+ * the link operates at 10Mbps, so fallback to polling for
+ * those versions of GENET.
+ */
+ if (priv->internal_phy && !GENET_IS_V5(priv))
+ dev->phydev->irq = PHY_IGNORE_INTERRUPT;
+
return 0;
}
netdev->ethtool_ops = &octeon_mgmt_ethtool_ops;
netdev->min_mtu = 64 - OCTEON_MGMT_RX_HEADROOM;
- netdev->max_mtu = 16383 - OCTEON_MGMT_RX_HEADROOM;
+ netdev->max_mtu = 16383 - OCTEON_MGMT_RX_HEADROOM - VLAN_HLEN;
mac = of_get_mac_address(pdev->dev.of_node);
regulator_disable(fep->reg_phy);
pm_runtime_put(&pdev->dev);
pm_runtime_disable(&pdev->dev);
+ clk_disable_unprepare(fep->clk_ahb);
+ clk_disable_unprepare(fep->clk_ipg);
if (of_phy_is_fixed_link(np))
of_phy_deregister_fixed_link(np);
of_node_put(fep->phy_node);
ring->q = q;
ring->flags = flags;
- spin_lock_init(&ring->lock);
ring->coal_param = q->handle->coal_param;
assert(!ring->desc && !ring->desc_cb && !ring->desc_dma_addr);
/* statistic */
struct ring_stats stats;
- /* ring lock for poll one */
- spinlock_t lock;
-
dma_addr_t desc_dma_addr;
u32 buf_size; /* size for hnae_desc->addr, preset by AE */
u16 desc_num; /* total number of desc */
return u > c ? (h > c && h <= u) : (h > c || h <= u);
}
-/* netif_tx_lock will turn down the performance, set only when necessary */
-#ifdef CONFIG_NET_POLL_CONTROLLER
-#define NETIF_TX_LOCK(ring) spin_lock(&(ring)->lock)
-#define NETIF_TX_UNLOCK(ring) spin_unlock(&(ring)->lock)
-#else
-#define NETIF_TX_LOCK(ring)
-#define NETIF_TX_UNLOCK(ring)
-#endif
-
/* reclaim all desc in one budget
* return error or number of desc left
*/
int head;
int bytes, pkts;
- NETIF_TX_LOCK(ring);
-
head = readl_relaxed(ring->io_base + RCB_REG_HEAD);
rmb(); /* make sure head is ready before touch any data */
- if (is_ring_empty(ring) || head == ring->next_to_clean) {
- NETIF_TX_UNLOCK(ring);
+ if (is_ring_empty(ring) || head == ring->next_to_clean)
return 0; /* no data to poll */
- }
if (!is_valid_clean_head(ring, head)) {
netdev_err(ndev, "wrong head (%d, %d-%d)\n", head,
ring->next_to_use, ring->next_to_clean);
ring->stats.io_err_cnt++;
- NETIF_TX_UNLOCK(ring);
return -EIO;
}
ring->stats.tx_pkts += pkts;
ring->stats.tx_bytes += bytes;
- NETIF_TX_UNLOCK(ring);
-
dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
netdev_tx_completed_queue(dev_queue, pkts, bytes);
int head;
int bytes, pkts;
- NETIF_TX_LOCK(ring);
-
head = ring->next_to_use; /* ntu :soft setted ring position*/
bytes = 0;
pkts = 0;
while (head != ring->next_to_clean)
hns_nic_reclaim_one_desc(ring, &bytes, &pkts);
- NETIF_TX_UNLOCK(ring);
-
dev_queue = netdev_get_tx_queue(ndev, ring_data->queue_index);
netdev_tx_reset_queue(dev_queue);
}
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HNAE3_H
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HNS3_ENET_H
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HCLGE_CMD_H
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HCLGE_DCB_H__
{
struct hclge_pf_rst_done_cmd *req;
struct hclge_desc desc;
+ int ret;
req = (struct hclge_pf_rst_done_cmd *)desc.data;
hclge_cmd_setup_basic_desc(&desc, HCLGE_OPC_PF_RST_DONE, false);
req->pf_rst_done |= HCLGE_PF_RESET_DONE_BIT;
- return hclge_cmd_send(&hdev->hw, &desc, 1);
+ ret = hclge_cmd_send(&hdev->hw, &desc, 1);
+ /* To be compatible with the old firmware, which does not support
+ * command HCLGE_OPC_PF_RST_DONE, just print a warning and
+ * return success
+ */
+ if (ret == -EOPNOTSUPP) {
+ dev_warn(&hdev->pdev->dev,
+ "current firmware does not support command(0x%x)!\n",
+ HCLGE_OPC_PF_RST_DONE);
+ return 0;
+ } else if (ret) {
+ dev_err(&hdev->pdev->dev, "assert PF reset done fail %d!\n",
+ ret);
+ }
+
+ return ret;
}
static int hclge_reset_prepare_up(struct hclge_dev *hdev)
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HCLGE_MAIN_H
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HCLGE_MDIO_H
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
// Copyright (c) 2016-2017 Hisilicon Limited.
#ifndef __HCLGE_TM_H
/* API version 1.7 implements additional link and PHY-specific APIs */
#define I40E_MINOR_VER_GET_LINK_INFO_XL710 0x0007
+/* API version 1.9 for X722 implements additional link and PHY-specific APIs */
+#define I40E_MINOR_VER_GET_LINK_INFO_X722 0x0009
/* API version 1.6 for X722 devices adds ability to stop FW LLDP agent */
#define I40E_MINOR_VER_FW_LLDP_STOPPABLE_X722 0x0006
hw->aq.fw_min_ver < 40)) && hw_link_info->phy_type == 0xE)
hw_link_info->phy_type = I40E_PHY_TYPE_10GBASE_SFPP_CU;
- if (hw->flags & I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE) {
+ if (hw->flags & I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE &&
+ hw->mac.type != I40E_MAC_X722) {
__le32 tmp;
memcpy(&tmp, resp->link_type, sizeof(tmp));
i40e_xdp_ring_update_tail(xdp_ring);
xsk_umem_consume_tx_done(xdp_ring->xsk_umem);
- if (xsk_umem_uses_need_wakeup(xdp_ring->xsk_umem))
- xsk_clear_tx_need_wakeup(xdp_ring->xsk_umem);
}
return !!budget && work_done;
i40e_update_tx_stats(tx_ring, completed_frames, total_bytes);
out_xmit:
- if (xsk_umem_uses_need_wakeup(tx_ring->xsk_umem)) {
- if (tx_ring->next_to_clean == tx_ring->next_to_use)
- xsk_set_tx_need_wakeup(tx_ring->xsk_umem);
- else
- xsk_clear_tx_need_wakeup(tx_ring->xsk_umem);
- }
+ if (xsk_umem_uses_need_wakeup(tx_ring->xsk_umem))
+ xsk_set_tx_need_wakeup(tx_ring->xsk_umem);
xmit_done = i40e_xmit_zc(tx_ring, budget);
q_vector->rx.target_itr = ITR_TO_REG(rx_ring->itr_setting);
q_vector->ring_mask |= BIT(r_idx);
wr32(hw, IAVF_VFINT_ITRN1(IAVF_RX_ITR, q_vector->reg_idx),
- q_vector->rx.current_itr);
+ q_vector->rx.current_itr >> 1);
q_vector->rx.current_itr = q_vector->rx.target_itr;
}
q_vector->tx.target_itr = ITR_TO_REG(tx_ring->itr_setting);
q_vector->num_ringpairs++;
wr32(hw, IAVF_VFINT_ITRN1(IAVF_TX_ITR, q_vector->reg_idx),
- q_vector->tx.target_itr);
+ q_vector->tx.target_itr >> 1);
q_vector->tx.current_itr = q_vector->tx.target_itr;
}
struct ice_aqc_query_txsched_res_resp *buf;
enum ice_status status = 0;
__le16 max_sibl;
- u8 i;
+ u16 i;
if (hw->layer_info)
return status;
* should have been handled by the upper layers.
*/
if (tx_ring->launchtime_enable) {
- ts = ns_to_timespec64(first->skb->tstamp);
- first->skb->tstamp = 0;
+ ts = ktime_to_timespec64(first->skb->tstamp);
+ first->skb->tstamp = ktime_set(0, 0);
context_desc->seqnum_seed = cpu_to_le32(ts.tv_nsec / 32);
} else {
context_desc->seqnum_seed = 0;
* should have been handled by the upper layers.
*/
if (tx_ring->launchtime_enable) {
- ts = ns_to_timespec64(first->skb->tstamp);
- first->skb->tstamp = 0;
+ ts = ktime_to_timespec64(first->skb->tstamp);
+ first->skb->tstamp = ktime_set(0, 0);
context_desc->launch_time = cpu_to_le32(ts.tv_nsec / 32);
} else {
context_desc->launch_time = 0;
if (tx_desc) {
ixgbe_xdp_ring_update_tail(xdp_ring);
xsk_umem_consume_tx_done(xdp_ring->xsk_umem);
- if (xsk_umem_uses_need_wakeup(xdp_ring->xsk_umem))
- xsk_clear_tx_need_wakeup(xdp_ring->xsk_umem);
}
return !!budget && work_done;
if (xsk_frames)
xsk_umem_complete_tx(umem, xsk_frames);
- if (xsk_umem_uses_need_wakeup(tx_ring->xsk_umem)) {
- if (tx_ring->next_to_clean == tx_ring->next_to_use)
- xsk_set_tx_need_wakeup(tx_ring->xsk_umem);
- else
- xsk_clear_tx_need_wakeup(tx_ring->xsk_umem);
- }
+ if (xsk_umem_uses_need_wakeup(tx_ring->xsk_umem))
+ xsk_set_tx_need_wakeup(tx_ring->xsk_umem);
return ixgbe_xmit_zc(tx_ring, q_vector->tx.work_limit);
}
dev->caps.max_rq_desc_sz = dev_cap->max_rq_desc_sz;
/*
* Subtract 1 from the limit because we need to allocate a
- * spare CQE so the HCA HW can tell the difference between an
- * empty CQ and a full CQ.
+ * spare CQE to enable resizing the CQ.
*/
dev->caps.max_cqes = dev_cap->max_cq_sz - 1;
dev->caps.reserved_cqs = dev_cap->reserved_cqs;
MLX5_CAP_GEN(dev, max_flow_counter_15_0);
fdb_max = 1 << MLX5_CAP_ESW_FLOWTABLE_FDB(dev, log_max_ft_size);
- esw_debug(dev, "Create offloads FDB table, min (max esw size(2^%d), max counters(%d), groups(%d), max flow table size(2^%d))\n",
+ esw_debug(dev, "Create offloads FDB table, min (max esw size(2^%d), max counters(%d), groups(%d), max flow table size(%d))\n",
MLX5_CAP_ESW_FLOWTABLE_FDB(dev, log_max_ft_size),
max_flow_counter, ESW_OFFLOADS_NUM_GROUPS,
fdb_max);
u32 port_mask, port_value;
if (MLX5_CAP_ESW_FLOWTABLE(esw->dev, flow_source))
- return spec->flow_context.flow_source == MLX5_VPORT_UPLINK;
+ return spec->flow_context.flow_source ==
+ MLX5_FLOW_CONTEXT_FLOW_SOURCE_UPLINK;
port_mask = MLX5_GET(fte_match_param, spec->match_criteria,
misc_parameters.source_port);
break;
case DR_ACTION_TYP_MODIFY_HDR:
mlx5dr_icm_free_chunk(action->rewrite.chunk);
+ kfree(action->rewrite.data);
refcount_dec(&action->rewrite.dmn->refcount);
break;
default:
if (htbl)
mlx5dr_htbl_put(htbl);
+ kfree(hw_ste_arr);
+
return 0;
free_ste:
struct ocelot_port *ocelot_port = netdev_priv(dev);
int err = 0;
- if (!ocelot_netdevice_dev_check(dev))
- return 0;
-
switch (event) {
case NETDEV_CHANGEUPPER:
if (netif_is_bridge_master(info->upper_dev)) {
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
int ret = 0;
+ if (!ocelot_netdevice_dev_check(dev))
+ return 0;
+
if (event == NETDEV_PRECHANGEUPPER &&
netif_is_lag_master(info->upper_dev)) {
struct netdev_lag_upper_info *lag_upper_info = info->upper_info;
struct netlink_ext_ack *extack;
- if (lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
+ if (lag_upper_info &&
+ lag_upper_info->tx_type != NETDEV_LAG_TX_TYPE_HASH) {
extack = netdev_notifier_info_to_extack(&info->info);
NL_SET_ERR_MSG_MOD(extack, "LAG device using unsupported Tx type");
#define ocelot_write_rix(ocelot, val, reg, ri) __ocelot_write_ix(ocelot, val, reg, reg##_RSZ * (ri))
#define ocelot_write(ocelot, val, reg) __ocelot_write_ix(ocelot, val, reg, 0)
-void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 reg, u32 mask,
+void __ocelot_rmw_ix(struct ocelot *ocelot, u32 val, u32 mask, u32 reg,
u32 offset);
#define ocelot_rmw_ix(ocelot, val, m, reg, gi, ri) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi) + reg##_RSZ * (ri))
#define ocelot_rmw_gix(ocelot, val, m, reg, gi) __ocelot_rmw_ix(ocelot, val, m, reg, reg##_GSZ * (gi))
static void __qede_remove(struct pci_dev *pdev, enum qede_remove_mode mode)
{
struct net_device *ndev = pci_get_drvdata(pdev);
- struct qede_dev *edev = netdev_priv(ndev);
- struct qed_dev *cdev = edev->cdev;
+ struct qede_dev *edev;
+ struct qed_dev *cdev;
+
+ if (!ndev) {
+ dev_info(&pdev->dev, "Device has already been removed\n");
+ return;
+ }
+
+ edev = netdev_priv(ndev);
+ cdev = edev->cdev;
DP_INFO(edev, "Starting qede_remove\n");
if (port->nr_rmnet_devs)
return -EINVAL;
- kfree(port);
-
netdev_rx_handler_unregister(real_dev);
+ kfree(port);
+
/* release reference on real_dev */
dev_put(real_dev);
static int r8168g_mdio_read(struct rtl8169_private *tp, int reg)
{
+ if (reg == 0x1f)
+ return tp->ocp_base == OCP_STD_PHY_BASE ? 0 : tp->ocp_base >> 4;
+
if (tp->ocp_base != OCP_STD_PHY_BASE)
reg -= 0x10;
* bits used depends on the hardware configuration
* selected at core configuration time.
*/
- int bit_nr = bitrev32(~crc32_le(~0, ha->addr,
+ u32 bit_nr = bitrev32(~crc32_le(~0, ha->addr,
ETH_ALEN)) >> (32 - mcbitslog2);
/* The most significant bit determines the register to
* use (H/L) while the other 5 bits determine the bit
writel(low_credit, ioaddr + XGMAC_MTL_TCx_LOCREDIT(queue));
value = readl(ioaddr + XGMAC_MTL_TCx_ETS_CONTROL(queue));
+ value &= ~XGMAC_TSA;
value |= XGMAC_CC | XGMAC_CBS;
writel(value, ioaddr + XGMAC_MTL_TCx_ETS_CONTROL(queue));
}
value |= XGMAC_FILTER_HMC;
netdev_for_each_mc_addr(ha, dev) {
- int nr = (bitrev32(~crc32_le(~0, ha->addr, 6)) >>
+ u32 nr = (bitrev32(~crc32_le(~0, ha->addr, 6)) >>
(32 - mcbitslog2));
mc_filter[nr >> 5] |= (1 << (nr & 0x1F));
}
static int dwxgmac2_get_rx_header_len(struct dma_desc *p, unsigned int *len)
{
- *len = le32_to_cpu(p->des2) & XGMAC_RDES2_HL;
+ if (le32_to_cpu(p->des3) & XGMAC_RDES3_L34T)
+ *len = le32_to_cpu(p->des2) & XGMAC_RDES2_HL;
return 0;
}
dma_cap->eee = (hw_cap & XGMAC_HWFEAT_EEESEL) >> 13;
dma_cap->atime_stamp = (hw_cap & XGMAC_HWFEAT_TSSEL) >> 12;
dma_cap->av = (hw_cap & XGMAC_HWFEAT_AVSEL) >> 11;
- dma_cap->av &= !(hw_cap & XGMAC_HWFEAT_RAVSEL) >> 10;
+ dma_cap->av &= !((hw_cap & XGMAC_HWFEAT_RAVSEL) >> 10);
dma_cap->arpoffsel = (hw_cap & XGMAC_HWFEAT_ARPOFFSEL) >> 9;
dma_cap->rmon = (hw_cap & XGMAC_HWFEAT_MMCSEL) >> 8;
dma_cap->pmt_magic_frame = (hw_cap & XGMAC_HWFEAT_MGKSEL) >> 7;
static void dwxgmac2_qmode(void __iomem *ioaddr, u32 channel, u8 qmode)
{
u32 value = readl(ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
+ u32 flow = readl(ioaddr + XGMAC_RX_FLOW_CTRL);
value &= ~XGMAC_TXQEN;
if (qmode != MTL_QUEUE_AVB) {
writel(0, ioaddr + XGMAC_MTL_TCx_ETS_CONTROL(channel));
} else {
value |= 0x1 << XGMAC_TXQEN_SHIFT;
+ writel(flow & (~XGMAC_RFE), ioaddr + XGMAC_RX_FLOW_CTRL);
}
writel(value, ioaddr + XGMAC_MTL_TXQ_OPMODE(channel));
#define MMC_XGMAC_RX_PKT_SMD_ERR 0x22c
#define MMC_XGMAC_RX_PKT_ASSEMBLY_OK 0x230
#define MMC_XGMAC_RX_FPE_FRAG 0x234
+#define MMC_XGMAC_RX_IPC_INTR_MASK 0x25c
static void dwmac_mmc_ctrl(void __iomem *mmcaddr, unsigned int mode)
{
static void dwxgmac_mmc_intr_all_mask(void __iomem *mmcaddr)
{
- writel(MMC_DEFAULT_MASK, mmcaddr + MMC_RX_INTR_MASK);
- writel(MMC_DEFAULT_MASK, mmcaddr + MMC_TX_INTR_MASK);
+ writel(0x0, mmcaddr + MMC_RX_INTR_MASK);
+ writel(0x0, mmcaddr + MMC_TX_INTR_MASK);
+ writel(MMC_DEFAULT_MASK, mmcaddr + MMC_XGMAC_RX_IPC_INTR_MASK);
}
static void dwxgmac_read_mmc_reg(void __iomem *addr, u32 reg, u32 *dest)
stmmac_set_desc_addr(priv, first, des);
tmp_pay_len = pay_len;
des += proto_hdr_len;
+ pay_len = 0;
}
stmmac_tso_allocator(priv, des, tmp_pay_len, (nfrags == 0), queue);
/* Only the last descriptor gets to point to the skb. */
tx_q->tx_skbuff[tx_q->cur_tx] = skb;
+ /* Manage tx mitigation */
+ tx_q->tx_count_frames += nfrags + 1;
+ if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
+ !((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+ priv->hwts_tx_en)) {
+ stmmac_tx_timer_arm(priv, queue);
+ } else {
+ desc = &tx_q->dma_tx[tx_q->cur_tx];
+ tx_q->tx_count_frames = 0;
+ stmmac_set_tx_ic(priv, desc);
+ priv->xstats.tx_set_ic_bit++;
+ }
+
/* We've used all descriptors we need for this skb, however,
* advance cur_tx so that it references a fresh descriptor.
* ndo_start_xmit will fill this descriptor the next time it's
priv->xstats.tx_tso_frames++;
priv->xstats.tx_tso_nfrags += nfrags;
- /* Manage tx mitigation */
- tx_q->tx_count_frames += nfrags + 1;
- if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
- !(priv->synopsys_id >= DWMAC_CORE_4_00 &&
- (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
- priv->hwts_tx_en)) {
- stmmac_tx_timer_arm(priv, queue);
- } else {
- tx_q->tx_count_frames = 0;
- stmmac_set_tx_ic(priv, desc);
- priv->xstats.tx_set_ic_bit++;
- }
-
if (priv->sarc_type)
stmmac_set_desc_sarc(priv, first, priv->sarc_type);
/* Only the last descriptor gets to point to the skb. */
tx_q->tx_skbuff[entry] = skb;
+ /* According to the coalesce parameter the IC bit for the latest
+ * segment is reset and the timer re-started to clean the tx status.
+ * This approach takes care about the fragments: desc is the first
+ * element in case of no SG.
+ */
+ tx_q->tx_count_frames += nfrags + 1;
+ if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
+ !((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
+ priv->hwts_tx_en)) {
+ stmmac_tx_timer_arm(priv, queue);
+ } else {
+ if (likely(priv->extend_desc))
+ desc = &tx_q->dma_etx[entry].basic;
+ else
+ desc = &tx_q->dma_tx[entry];
+
+ tx_q->tx_count_frames = 0;
+ stmmac_set_tx_ic(priv, desc);
+ priv->xstats.tx_set_ic_bit++;
+ }
+
/* We've used all descriptors we need for this skb, however,
* advance cur_tx so that it references a fresh descriptor.
* ndo_start_xmit will fill this descriptor the next time it's
dev->stats.tx_bytes += skb->len;
- /* According to the coalesce parameter the IC bit for the latest
- * segment is reset and the timer re-started to clean the tx status.
- * This approach takes care about the fragments: desc is the first
- * element in case of no SG.
- */
- tx_q->tx_count_frames += nfrags + 1;
- if (likely(priv->tx_coal_frames > tx_q->tx_count_frames) &&
- !(priv->synopsys_id >= DWMAC_CORE_4_00 &&
- (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) &&
- priv->hwts_tx_en)) {
- stmmac_tx_timer_arm(priv, queue);
- } else {
- tx_q->tx_count_frames = 0;
- stmmac_set_tx_ic(priv, desc);
- priv->xstats.tx_set_ic_bit++;
- }
-
if (priv->sarc_type)
stmmac_set_desc_sarc(priv, first, priv->sarc_type);
if (unlikely(status & dma_own))
break;
- count++;
-
rx_q->cur_rx = STMMAC_GET_ENTRY(rx_q->cur_rx, DMA_RX_SIZE);
next_entry = rx_q->cur_rx;
goto read_again;
if (unlikely(error)) {
dev_kfree_skb(skb);
+ count++;
continue;
}
skb = napi_alloc_skb(&ch->rx_napi, len);
if (!skb) {
priv->dev->stats.rx_dropped++;
+ count++;
continue;
}
priv->dev->stats.rx_packets++;
priv->dev->stats.rx_bytes += len;
+ count++;
}
if (status & rx_not_ls) {
* Author: Jose Abreu <joabreu@synopsys.com>
*/
+#include <linux/bitrev.h>
#include <linux/completion.h>
+#include <linux/crc32.h>
#include <linux/ethtool.h>
#include <linux/ip.h>
#include <linux/phy.h>
return -EOPNOTSUPP;
}
+static bool stmmac_hash_check(struct stmmac_priv *priv, unsigned char *addr)
+{
+ int mc_offset = 32 - priv->hw->mcast_bits_log2;
+ struct netdev_hw_addr *ha;
+ u32 hash, hash_nr;
+
+ /* First compute the hash for desired addr */
+ hash = bitrev32(~crc32_le(~0, addr, 6)) >> mc_offset;
+ hash_nr = hash >> 5;
+ hash = 1 << (hash & 0x1f);
+
+ /* Now, check if it collides with any existing one */
+ netdev_for_each_mc_addr(ha, priv->dev) {
+ u32 nr = bitrev32(~crc32_le(~0, ha->addr, ETH_ALEN)) >> mc_offset;
+ if (((nr >> 5) == hash_nr) && ((1 << (nr & 0x1f)) == hash))
+ return false;
+ }
+
+ /* No collisions, address is good to go */
+ return true;
+}
+
+static bool stmmac_perfect_check(struct stmmac_priv *priv, unsigned char *addr)
+{
+ struct netdev_hw_addr *ha;
+
+ /* Check if it collides with any existing one */
+ netdev_for_each_uc_addr(ha, priv->dev) {
+ if (!memcmp(ha->addr, addr, ETH_ALEN))
+ return false;
+ }
+
+ /* No collisions, address is good to go */
+ return true;
+}
+
static int stmmac_test_hfilt(struct stmmac_priv *priv)
{
- unsigned char gd_addr[ETH_ALEN] = {0x01, 0xee, 0xdd, 0xcc, 0xbb, 0xaa};
- unsigned char bd_addr[ETH_ALEN] = {0x01, 0x01, 0x02, 0x03, 0x04, 0x05};
+ unsigned char gd_addr[ETH_ALEN] = {0xf1, 0xee, 0xdd, 0xcc, 0xbb, 0xaa};
+ unsigned char bd_addr[ETH_ALEN] = {0xf1, 0xff, 0xff, 0xff, 0xff, 0xff};
struct stmmac_packet_attrs attr = { };
- int ret;
+ int ret, tries = 256;
ret = stmmac_filter_check(priv);
if (ret)
if (netdev_mc_count(priv->dev) >= priv->hw->multicast_filter_bins)
return -EOPNOTSUPP;
+ while (--tries) {
+ /* We only need to check the bd_addr for collisions */
+ bd_addr[ETH_ALEN - 1] = tries;
+ if (stmmac_hash_check(priv, bd_addr))
+ break;
+ }
+
+ if (!tries)
+ return -EOPNOTSUPP;
+
ret = dev_mc_add(priv->dev, gd_addr);
if (ret)
return ret;
static int stmmac_test_pfilt(struct stmmac_priv *priv)
{
- unsigned char gd_addr[ETH_ALEN] = {0x00, 0x01, 0x44, 0x55, 0x66, 0x77};
- unsigned char bd_addr[ETH_ALEN] = {0x08, 0x00, 0x22, 0x33, 0x44, 0x55};
+ unsigned char gd_addr[ETH_ALEN] = {0xf0, 0x01, 0x44, 0x55, 0x66, 0x77};
+ unsigned char bd_addr[ETH_ALEN] = {0xf0, 0xff, 0xff, 0xff, 0xff, 0xff};
struct stmmac_packet_attrs attr = { };
- int ret;
+ int ret, tries = 256;
if (stmmac_filter_check(priv))
return -EOPNOTSUPP;
+ if (netdev_uc_count(priv->dev) >= priv->hw->unicast_filter_entries)
+ return -EOPNOTSUPP;
+
+ while (--tries) {
+ /* We only need to check the bd_addr for collisions */
+ bd_addr[ETH_ALEN - 1] = tries;
+ if (stmmac_perfect_check(priv, bd_addr))
+ break;
+ }
+
+ if (!tries)
+ return -EOPNOTSUPP;
ret = dev_uc_add(priv->dev, gd_addr);
if (ret)
return ret;
}
-static int stmmac_dummy_sync(struct net_device *netdev, const u8 *addr)
-{
- return 0;
-}
-
-static void stmmac_test_set_rx_mode(struct net_device *netdev)
-{
- /* As we are in test mode of ethtool we already own the rtnl lock
- * so no address will change from user. We can just call the
- * ndo_set_rx_mode() callback directly */
- if (netdev->netdev_ops->ndo_set_rx_mode)
- netdev->netdev_ops->ndo_set_rx_mode(netdev);
-}
-
static int stmmac_test_mcfilt(struct stmmac_priv *priv)
{
- unsigned char uc_addr[ETH_ALEN] = {0x00, 0x01, 0x44, 0x55, 0x66, 0x77};
- unsigned char mc_addr[ETH_ALEN] = {0x01, 0x01, 0x44, 0x55, 0x66, 0x77};
+ unsigned char uc_addr[ETH_ALEN] = {0xf0, 0xff, 0xff, 0xff, 0xff, 0xff};
+ unsigned char mc_addr[ETH_ALEN] = {0xf1, 0xff, 0xff, 0xff, 0xff, 0xff};
struct stmmac_packet_attrs attr = { };
- int ret;
+ int ret, tries = 256;
if (stmmac_filter_check(priv))
return -EOPNOTSUPP;
- if (!priv->hw->multicast_filter_bins)
+ if (netdev_uc_count(priv->dev) >= priv->hw->unicast_filter_entries)
return -EOPNOTSUPP;
- /* Remove all MC addresses */
- __dev_mc_unsync(priv->dev, NULL);
- stmmac_test_set_rx_mode(priv->dev);
+ while (--tries) {
+ /* We only need to check the mc_addr for collisions */
+ mc_addr[ETH_ALEN - 1] = tries;
+ if (stmmac_hash_check(priv, mc_addr))
+ break;
+ }
+
+ if (!tries)
+ return -EOPNOTSUPP;
ret = dev_uc_add(priv->dev, uc_addr);
if (ret)
- goto cleanup;
+ return ret;
attr.dst = uc_addr;
cleanup:
dev_uc_del(priv->dev, uc_addr);
- __dev_mc_sync(priv->dev, stmmac_dummy_sync, NULL);
- stmmac_test_set_rx_mode(priv->dev);
return ret;
}
static int stmmac_test_ucfilt(struct stmmac_priv *priv)
{
- unsigned char uc_addr[ETH_ALEN] = {0x00, 0x01, 0x44, 0x55, 0x66, 0x77};
- unsigned char mc_addr[ETH_ALEN] = {0x01, 0x01, 0x44, 0x55, 0x66, 0x77};
+ unsigned char uc_addr[ETH_ALEN] = {0xf0, 0xff, 0xff, 0xff, 0xff, 0xff};
+ unsigned char mc_addr[ETH_ALEN] = {0xf1, 0xff, 0xff, 0xff, 0xff, 0xff};
struct stmmac_packet_attrs attr = { };
- int ret;
+ int ret, tries = 256;
if (stmmac_filter_check(priv))
return -EOPNOTSUPP;
- if (!priv->hw->multicast_filter_bins)
+ if (netdev_mc_count(priv->dev) >= priv->hw->multicast_filter_bins)
return -EOPNOTSUPP;
- /* Remove all UC addresses */
- __dev_uc_unsync(priv->dev, NULL);
- stmmac_test_set_rx_mode(priv->dev);
+ while (--tries) {
+ /* We only need to check the uc_addr for collisions */
+ uc_addr[ETH_ALEN - 1] = tries;
+ if (stmmac_perfect_check(priv, uc_addr))
+ break;
+ }
+
+ if (!tries)
+ return -EOPNOTSUPP;
ret = dev_mc_add(priv->dev, mc_addr);
if (ret)
- goto cleanup;
+ return ret;
attr.dst = mc_addr;
cleanup:
dev_mc_del(priv->dev, mc_addr);
- __dev_uc_sync(priv->dev, stmmac_dummy_sync, NULL);
- stmmac_test_set_rx_mode(priv->dev);
return ret;
}
/* read current mtu value from device */
err = usbnet_read_cmd(dev, USB_CDC_GET_MAX_DATAGRAM_SIZE,
USB_TYPE_CLASS | USB_DIR_IN | USB_RECIP_INTERFACE,
- 0, iface_no, &max_datagram_size, 2);
- if (err < 0) {
+ 0, iface_no, &max_datagram_size, sizeof(max_datagram_size));
+ if (err < sizeof(max_datagram_size)) {
dev_dbg(&dev->intf->dev, "GET_MAX_DATAGRAM_SIZE failed\n");
goto out;
}
max_datagram_size = cpu_to_le16(ctx->max_datagram_size);
err = usbnet_write_cmd(dev, USB_CDC_SET_MAX_DATAGRAM_SIZE,
USB_TYPE_CLASS | USB_DIR_OUT | USB_RECIP_INTERFACE,
- 0, iface_no, &max_datagram_size, 2);
+ 0, iface_no, &max_datagram_size, sizeof(max_datagram_size));
if (err < 0)
dev_dbg(&dev->intf->dev, "SET_MAX_DATAGRAM_SIZE failed\n");
{QMI_FIXED_INTF(0x413c, 0x81b6, 8)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81b6, 10)}, /* Dell Wireless 5811e */
{QMI_FIXED_INTF(0x413c, 0x81d7, 0)}, /* Dell Wireless 5821e */
+ {QMI_FIXED_INTF(0x413c, 0x81e0, 0)}, /* Dell Wireless 5821e with eSIM support*/
{QMI_FIXED_INTF(0x03f0, 0x4e1d, 8)}, /* HP lt4111 LTE/EV-DO/HSPA+ Gobi 4G Module */
{QMI_FIXED_INTF(0x03f0, 0x9d1d, 1)}, /* HP lt4120 Snapdragon X5 LTE */
{QMI_FIXED_INTF(0x22de, 0x9061, 3)}, /* WeTelecom WPD-600N */
*fw_vsc_cfg, len);
if (r) {
- devm_kfree(dev, fw_vsc_cfg);
+ devm_kfree(dev, *fw_vsc_cfg);
goto vsc_read_err;
}
} else {
NFC_PROTO_FELICA_MASK;
} else {
kfree_skb(nfcid_skb);
+ nfcid_skb = NULL;
/* P2P in type A */
r = nfc_hci_get_param(hdev, ST21NFCA_RF_READER_F_GATE,
ST21NFCA_RF_READER_F_NFCID1,
struct nvme_ns *ns;
mutex_lock(&ctrl->scan_lock);
+ down_read(&ctrl->namespaces_rwsem);
list_for_each_entry(ns, &ctrl->namespaces, list)
if (nvme_mpath_clear_current_path(ns))
kblockd_schedule_work(&ns->head->requeue_work);
+ up_read(&ctrl->namespaces_rwsem);
mutex_unlock(&ctrl->scan_lock);
}
static void __exit nvme_rdma_cleanup_module(void)
{
+ struct nvme_rdma_ctrl *ctrl;
+
nvmf_unregister_transport(&nvme_rdma_transport);
ib_unregister_client(&nvme_rdma_ib_client);
+
+ mutex_lock(&nvme_rdma_ctrl_mutex);
+ list_for_each_entry(ctrl, &nvme_rdma_ctrl_list, list)
+ nvme_delete_ctrl(&ctrl->ctrl);
+ mutex_unlock(&nvme_rdma_ctrl_mutex);
+ flush_workqueue(nvme_delete_wq);
}
module_init(nvme_rdma_init_module);
* @pctldesc: Pin controller description
* @pctldev: Pointer to the pin controller device
* @chip: GPIO chip in this pin controller
+ * @irqchip: IRQ chip in this pin controller
* @regs: MMIO registers
* @intr_lines: Stores mapping between 16 HW interrupt wires and GPIO
* offset (in GPIO number space)
struct pinctrl_desc pctldesc;
struct pinctrl_dev *pctldev;
struct gpio_chip chip;
+ struct irq_chip irqchip;
void __iomem *regs;
unsigned intr_lines[16];
const struct chv_community *community;
return 0;
}
-static struct irq_chip chv_gpio_irqchip = {
- .name = "chv-gpio",
- .irq_startup = chv_gpio_irq_startup,
- .irq_ack = chv_gpio_irq_ack,
- .irq_mask = chv_gpio_irq_mask,
- .irq_unmask = chv_gpio_irq_unmask,
- .irq_set_type = chv_gpio_irq_type,
- .flags = IRQCHIP_SKIP_SET_WAKE,
-};
-
static void chv_gpio_irq_handler(struct irq_desc *desc)
{
struct gpio_chip *gc = irq_desc_get_handler_data(desc);
intsel >>= CHV_PADCTRL0_INTSEL_SHIFT;
if (intsel >= community->nirqs)
- clear_bit(i, valid_mask);
+ clear_bit(desc->number, valid_mask);
}
}
}
}
- ret = gpiochip_irqchip_add(chip, &chv_gpio_irqchip, 0,
+ pctrl->irqchip.name = "chv-gpio";
+ pctrl->irqchip.irq_startup = chv_gpio_irq_startup;
+ pctrl->irqchip.irq_ack = chv_gpio_irq_ack;
+ pctrl->irqchip.irq_mask = chv_gpio_irq_mask;
+ pctrl->irqchip.irq_unmask = chv_gpio_irq_unmask;
+ pctrl->irqchip.irq_set_type = chv_gpio_irq_type;
+ pctrl->irqchip.flags = IRQCHIP_SKIP_SET_WAKE;
+
+ ret = gpiochip_irqchip_add(chip, &pctrl->irqchip, 0,
handle_bad_irq, IRQ_TYPE_NONE);
if (ret) {
dev_err(pctrl->dev, "failed to add IRQ chip\n");
}
}
- gpiochip_set_chained_irqchip(chip, &chv_gpio_irqchip, irq,
+ gpiochip_set_chained_irqchip(chip, &pctrl->irqchip, irq,
chv_gpio_irq_handler);
return 0;
}
#define PADCFG0_GPIROUTNMI BIT(17)
#define PADCFG0_PMODE_SHIFT 10
#define PADCFG0_PMODE_MASK GENMASK(13, 10)
+#define PADCFG0_PMODE_GPIO 0
#define PADCFG0_GPIORXDIS BIT(9)
#define PADCFG0_GPIOTXDIS BIT(8)
#define PADCFG0_GPIORXSTATE BIT(1)
cfg1 = readl(intel_get_padcfg(pctrl, pin, PADCFG1));
mode = (cfg0 & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
- if (!mode)
+ if (mode == PADCFG0_PMODE_GPIO)
seq_puts(s, "GPIO ");
else
seq_printf(s, "mode %d ", mode);
writel(value, padcfg0);
}
+static int intel_gpio_get_gpio_mode(void __iomem *padcfg0)
+{
+ return (readl(padcfg0) & PADCFG0_PMODE_MASK) >> PADCFG0_PMODE_SHIFT;
+}
+
static void intel_gpio_set_gpio_mode(void __iomem *padcfg0)
{
u32 value;
}
padcfg0 = intel_get_padcfg(pctrl, pin, PADCFG0);
+
+ /*
+ * If pin is already configured in GPIO mode, we assume that
+ * firmware provides correct settings. In such case we avoid
+ * potential glitches on the pin. Otherwise, for the pin in
+ * alternative mode, consumer has to supply respective flags.
+ */
+ if (intel_gpio_get_gpio_mode(padcfg0) == PADCFG0_PMODE_GPIO) {
+ raw_spin_unlock_irqrestore(&pctrl->lock, flags);
+ return 0;
+ }
+
intel_gpio_set_gpio_mode(padcfg0);
+
/* Disable TX buffer and enable RX (this will be input) */
__intel_gpio_set_direction(padcfg0, true);
return stmfx_function_enable(pctl->stmfx, func);
}
-static int stmfx_pinctrl_gpio_init_valid_mask(struct gpio_chip *gc,
- unsigned long *valid_mask,
- unsigned int ngpios)
-{
- struct stmfx_pinctrl *pctl = gpiochip_get_data(gc);
- u32 n;
-
- for_each_clear_bit(n, &pctl->gpio_valid_mask, ngpios)
- clear_bit(n, valid_mask);
-
- return 0;
-}
-
static int stmfx_pinctrl_probe(struct platform_device *pdev)
{
struct stmfx *stmfx = dev_get_drvdata(pdev->dev.parent);
pctl->gpio_chip.ngpio = pctl->pctl_desc.npins;
pctl->gpio_chip.can_sleep = true;
pctl->gpio_chip.of_node = np;
- pctl->gpio_chip.init_valid_mask = stmfx_pinctrl_gpio_init_valid_mask;
ret = devm_gpiochip_add_data(pctl->dev, &pctl->gpio_chip, pctl);
if (ret) {
static const struct pwm_ops iproc_pwm_ops = {
.apply = iproc_pwmc_apply,
.get_state = iproc_pwmc_get_state,
+ .owner = THIS_MODULE,
};
static int iproc_pwmc_probe(struct platform_device *pdev)
* of_reset_simple_xlate - translate reset_spec to the reset line number
* @rcdev: a pointer to the reset controller device
* @reset_spec: reset line specifier as found in the device tree
- * @flags: a flags pointer to fill in (optional)
*
* This simple translation function should be used for reset controllers
* with 1:1 mapping, where reset lines can be indexed by number without gaps.
for (i = 0; i < resets->num_rstcs; i++)
__reset_control_put_internal(resets->rstc[i]);
mutex_unlock(&reset_list_mutex);
+ kfree(resets);
}
/**
}
EXPORT_SYMBOL_GPL(__device_reset);
-/**
+/*
* APIs to manage an array of reset controls.
*/
+
/**
* of_reset_control_get_count - Count number of resets available with a device
*
* ensures no active vp_list traversal while the vport is removed
* from the queue)
*/
- for (i = 0; i < 10 && atomic_read(&vha->vref_count); i++)
- wait_event_timeout(vha->vref_waitq,
- atomic_read(&vha->vref_count), HZ);
+ for (i = 0; i < 10; i++) {
+ if (wait_event_timeout(vha->vref_waitq,
+ !atomic_read(&vha->vref_count), HZ) > 0)
+ break;
+ }
spin_lock_irqsave(&ha->vport_slock, flags);
if (atomic_read(&vha->vref_count)) {
qla2x00_mark_all_devices_lost(vha, 0);
- for (i = 0; i < 10; i++)
- wait_event_timeout(vha->fcport_waitQ, test_fcport_count(vha),
- HZ);
+ for (i = 0; i < 10; i++) {
+ if (wait_event_timeout(vha->fcport_waitQ,
+ test_fcport_count(vha), HZ) > 0)
+ break;
+ }
flush_workqueue(vha->hw->wq);
}
{
unsigned int cmd_size, sgl_size;
- sgl_size = scsi_mq_inline_sgl_size(shost);
+ sgl_size = max_t(unsigned int, sizeof(struct scatterlist),
+ scsi_mq_inline_sgl_size(shost));
cmd_size = sizeof(struct scsi_cmnd) + shost->hostt->cmd_size + sgl_size;
if (scsi_host_get_prot(shost))
cmd_size += sizeof(struct scsi_data_buffer) +
int result = cmd->result;
struct request *rq = cmd->request;
- switch (req_op(rq)) {
- case REQ_OP_ZONE_RESET:
- case REQ_OP_ZONE_RESET_ALL:
-
- if (result &&
- sshdr->sense_key == ILLEGAL_REQUEST &&
- sshdr->asc == 0x24)
- /*
- * INVALID FIELD IN CDB error: reset of a conventional
- * zone was attempted. Nothing to worry about, so be
- * quiet about the error.
- */
- rq->rq_flags |= RQF_QUIET;
- break;
-
- case REQ_OP_WRITE:
- case REQ_OP_WRITE_ZEROES:
- case REQ_OP_WRITE_SAME:
- break;
+ if (req_op(rq) == REQ_OP_ZONE_RESET &&
+ result &&
+ sshdr->sense_key == ILLEGAL_REQUEST &&
+ sshdr->asc == 0x24) {
+ /*
+ * INVALID FIELD IN CDB error: reset of a conventional
+ * zone was attempted. Nothing to worry about, so be
+ * quiet about the error.
+ */
+ rq->rq_flags |= RQF_QUIET;
}
}
};
static struct imx_pm_domain imx_gpc_domains[] = {
- [GPC_PGC_DOMAIN_ARM] {
+ [GPC_PGC_DOMAIN_ARM] = {
.base = {
.name = "ARM",
.flags = GENPD_FLAG_ALWAYS_ON,
},
},
- [GPC_PGC_DOMAIN_PU] {
+ [GPC_PGC_DOMAIN_PU] = {
.base = {
.name = "PU",
.power_off = imx6_pm_domain_power_off,
.reg_offs = 0x260,
.cntr_pdn_bit = 0,
},
- [GPC_PGC_DOMAIN_DISPLAY] {
+ [GPC_PGC_DOMAIN_DISPLAY] = {
.base = {
.name = "DISPLAY",
.power_off = imx6_pm_domain_power_off,
.reg_offs = 0x240,
.cntr_pdn_bit = 4,
},
- [GPC_PGC_DOMAIN_PCI] {
+ [GPC_PGC_DOMAIN_PCI] = {
.base = {
.name = "PCI",
.power_off = imx6_pm_domain_power_off,
menuconfig SOUNDWIRE
tristate "SoundWire support"
+ depends on ACPI || OF
help
SoundWire is a 2-Pin interface with data and clock line ratified
by the MIPI Alliance. SoundWire is used for transporting data
/* Create PCM DAIs */
stream = &cdns->pcm;
- ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, stream->num_in,
+ ret = intel_create_dai(cdns, dais, INTEL_PDI_IN, cdns->pcm.num_in,
off, stream->num_ch_in, true);
if (ret)
return ret;
if (ret)
return ret;
- off += cdns->pdm.num_bd;
+ off += cdns->pdm.num_out;
ret = intel_create_dai(cdns, dais, INTEL_PDI_BD, cdns->pdm.num_bd,
off, stream->num_ch_bd, false);
if (ret)
struct device_node *node;
for_each_child_of_node(bus->dev->of_node, node) {
- int link_id, sdw_version, ret, len;
+ int link_id, ret, len;
+ unsigned int sdw_version;
const char *compat = NULL;
struct sdw_slave_id id;
const __be32 *addr;
{
u32 data;
- pci_read_config_dword(nhi->pdev, VS_CAP_19, &data);
data = (cmd << VS_CAP_19_CMD_SHIFT) & VS_CAP_19_CMD_MASK;
pci_write_config_dword(nhi->pdev, VS_CAP_19, data | VS_CAP_19_VALID);
}
*/
bool tb_dp_port_is_enabled(struct tb_port *port)
{
- u32 data;
+ u32 data[2];
- if (tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1))
+ if (tb_port_read(port, data, TB_CFG_PORT, port->cap_adap,
+ ARRAY_SIZE(data)))
return false;
- return !!(data & (TB_DP_VIDEO_EN | TB_DP_AUX_EN));
+ return !!(data[0] & (TB_DP_VIDEO_EN | TB_DP_AUX_EN));
}
/**
*/
int tb_dp_port_enable(struct tb_port *port, bool enable)
{
- u32 data;
+ u32 data[2];
int ret;
- ret = tb_port_read(port, &data, TB_CFG_PORT, port->cap_adap, 1);
+ ret = tb_port_read(port, data, TB_CFG_PORT, port->cap_adap,
+ ARRAY_SIZE(data));
if (ret)
return ret;
if (enable)
- data |= TB_DP_VIDEO_EN | TB_DP_AUX_EN;
+ data[0] |= TB_DP_VIDEO_EN | TB_DP_AUX_EN;
else
- data &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN);
+ data[0] &= ~(TB_DP_VIDEO_EN | TB_DP_AUX_EN);
- return tb_port_write(port, &data, TB_CFG_PORT, port->cap_adap, 1);
+ return tb_port_write(port, data, TB_CFG_PORT, port->cap_adap,
+ ARRAY_SIZE(data));
}
/* switch utility functions */
if (sw->authorized)
goto unlock;
- /*
- * Make sure there is no PCIe rescan ongoing when a new PCIe
- * tunnel is created. Otherwise the PCIe rescan code might find
- * the new tunnel too early.
- */
- pci_lock_rescan_remove();
-
switch (val) {
/* Approve switch */
case 1:
break;
}
- pci_unlock_rescan_remove();
-
if (!ret) {
sw->authorized = val;
/* Notify status change to the userspace */
extern void c2p_unsupported(void);
-static inline u32 get_mask(unsigned int n)
+static __always_inline u32 get_mask(unsigned int n)
{
switch (n) {
case 1:
* Transpose operations on 8 32-bit words
*/
-static inline void transp8(u32 d[], unsigned int n, unsigned int m)
+static __always_inline void transp8(u32 d[], unsigned int n, unsigned int m)
{
u32 mask = get_mask(n);
* Transpose operations on 4 32-bit words
*/
-static inline void transp4(u32 d[], unsigned int n, unsigned int m)
+static __always_inline void transp4(u32 d[], unsigned int n, unsigned int m)
{
u32 mask = get_mask(n);
* Transpose operations on 4 32-bit words (reverse order)
*/
-static inline void transp4x(u32 d[], unsigned int n, unsigned int m)
+static __always_inline void transp4x(u32 d[], unsigned int n, unsigned int m)
{
u32 mask = get_mask(n);
MODULE_AUTHOR("Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>");
MODULE_DESCRIPTION("BD70528 watchdog driver");
MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:bd70528-wdt");
#include <linux/interrupt.h>
#include <linux/ioport.h>
#include <linux/timer.h>
+#include <linux/compat.h>
#include <linux/slab.h>
#include <linux/mutex.h>
#include <linux/io.h>
return 0;
}
+static long cpwd_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ return cpwd_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
+}
+
static ssize_t cpwd_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
static const struct file_operations cpwd_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = cpwd_ioctl,
- .compat_ioctl = compat_ptr_ioctl,
+ .compat_ioctl = cpwd_compat_ioctl,
.open = cpwd_open,
.write = cpwd_write,
.read = cpwd_read,
{
struct arm_smccc_res res;
+ /*
+ * SCU firmware calculates pretimeout based on current time
+ * stamp instead of watchdog timeout stamp, need to convert
+ * the pretimeout to SCU firmware's timeout value.
+ */
arm_smccc_smc(IMX_SIP_TIMER, IMX_SIP_TIMER_SET_PRETIME_WDOG,
- pretimeout * 1000, 0, 0, 0, 0, 0, &res);
+ (wdog->timeout - pretimeout) * 1000, 0, 0, 0,
+ 0, 0, &res);
if (res.a0)
return -EACCES;
reg = readl(data->reg_base + GXBB_WDT_TCNT_REG);
- return ((reg >> GXBB_WDT_TCNT_CNT_SHIFT) -
- (reg & GXBB_WDT_TCNT_SETUP_MASK)) / 1000;
+ return ((reg & GXBB_WDT_TCNT_SETUP_MASK) -
+ (reg >> GXBB_WDT_TCNT_CNT_SHIFT)) / 1000;
}
static const struct watchdog_ops meson_gxbb_wdt_ops = {
irq = platform_get_irq(pdev, 0);
if (irq > 0) {
- if (devm_request_irq(dev, irq, pm8916_wdt_isr, 0, "pm8916_wdt",
- wdt))
- irq = 0;
+ err = devm_request_irq(dev, irq, pm8916_wdt_isr, 0,
+ "pm8916_wdt", wdt);
+ if (err)
+ return err;
+
+ wdt->wdev.info = &pm8916_wdt_pt_ident;
+ } else {
+ if (irq == -EPROBE_DEFER)
+ return -EPROBE_DEFER;
+
+ wdt->wdev.info = &pm8916_wdt_ident;
}
/* Configure watchdog to hard-reset mode */
return err;
}
- wdt->wdev.info = (irq > 0) ? &pm8916_wdt_pt_ident : &pm8916_wdt_ident,
wdt->wdev.ops = &pm8916_wdt_ops,
wdt->wdev.parent = dev;
wdt->wdev.min_timeout = PM8916_WDT_MIN_TIMEOUT;
u64 start = async_chunk->start;
u64 end = async_chunk->end;
u64 actual_end;
+ u64 i_size;
int ret = 0;
struct page **pages = NULL;
unsigned long nr_pages;
inode_should_defrag(BTRFS_I(inode), start, end, end - start + 1,
SZ_16K);
- actual_end = min_t(u64, i_size_read(inode), end + 1);
+ /*
+ * We need to save i_size before now because it could change in between
+ * us evaluating the size and assigning it. This is because we lock and
+ * unlock the page in truncate and fallocate, and then modify the i_size
+ * later on.
+ *
+ * The barriers are to emulate READ_ONCE, remove that once i_size_read
+ * does that for us.
+ */
+ barrier();
+ i_size = i_size_read(inode);
+ barrier();
+ actual_end = min_t(u64, i_size, end + 1);
again:
will_compress = 0;
nr_pages = (end >> PAGE_SHIFT) - (start >> PAGE_SHIFT) + 1;
commit_transaction = true;
}
if (commit_transaction) {
+ /*
+ * We may have set commit_transaction when logging the new name
+ * in the destination root, in which case we left the source
+ * root context in the list of log contextes. So make sure we
+ * remove it to avoid invalid memory accesses, since the context
+ * was allocated in our stack frame.
+ */
+ if (sync_log_root) {
+ mutex_lock(&root->log_mutex);
+ list_del_init(&ctx_root.list);
+ mutex_unlock(&root->log_mutex);
+ }
ret = btrfs_commit_transaction(trans);
} else {
int ret2;
if (old_ino == BTRFS_FIRST_FREE_OBJECTID)
up_read(&fs_info->subvol_sem);
+ ASSERT(list_empty(&ctx_root.list));
+ ASSERT(list_empty(&ctx_dest.list));
+
return ret;
}
u64 transid;
int ret;
- btrfs_warn(root->fs_info,
- "START_SYNC ioctl is deprecated and will be removed in kernel 5.7");
-
trans = btrfs_attach_transaction_barrier(root);
if (IS_ERR(trans)) {
if (PTR_ERR(trans) != -ENOENT)
{
u64 transid;
- btrfs_warn(fs_info,
- "WAIT_SYNC ioctl is deprecated and will be removed in kernel 5.7");
-
if (argp) {
if (copy_from_user(&transid, argp, sizeof(transid)))
return -EFAULT;
while (ticket->bytes > 0 && ticket->error == 0) {
ret = prepare_to_wait_event(&ticket->wait, &wait, TASK_KILLABLE);
if (ret) {
+ /*
+ * Delete us from the list. After we unlock the space
+ * info, we don't want the async reclaim job to reserve
+ * space for this ticket. If that would happen, then the
+ * ticket's task would not known that space was reserved
+ * despite getting an error, resulting in a space leak
+ * (bytes_may_use counter of our space_info).
+ */
+ list_del_init(&ticket->list);
ticket->error = -EINTR;
break;
}
spin_lock(&space_info->lock);
ret = ticket->error;
if (ticket->bytes || ticket->error) {
+ /*
+ * Need to delete here for priority tickets. For regular tickets
+ * either the async reclaim job deletes the ticket from the list
+ * or we delete it ourselves at wait_reserve_ticket().
+ */
list_del_init(&ticket->list);
if (!ret)
ret = -ENOSPC;
}
spin_unlock(&space_info->lock);
ASSERT(list_empty(&ticket->list));
+ /*
+ * Check that we can't have an error set if the reservation succeeded,
+ * as that would confuse tasks and lead them to error out without
+ * releasing reserved space (if an error happens the expectation is that
+ * space wasn't reserved at all).
+ */
+ ASSERT(!(ticket->bytes == 0 && ticket->error));
return ret;
}
static int check_dev_item(struct extent_buffer *leaf,
struct btrfs_key *key, int slot)
{
- struct btrfs_fs_info *fs_info = leaf->fs_info;
struct btrfs_dev_item *ditem;
- u64 max_devid = max(BTRFS_MAX_DEVS(fs_info), BTRFS_MAX_DEVS_SYS_CHUNK);
if (key->objectid != BTRFS_DEV_ITEMS_OBJECTID) {
dev_item_err(leaf, slot,
key->objectid, BTRFS_DEV_ITEMS_OBJECTID);
return -EUCLEAN;
}
- if (key->offset > max_devid) {
- dev_item_err(leaf, slot,
- "invalid devid: has=%llu expect=[0, %llu]",
- key->offset, max_devid);
- return -EUCLEAN;
- }
ditem = btrfs_item_ptr(leaf, slot, struct btrfs_dev_item);
if (btrfs_device_id(leaf, ditem) != key->offset) {
dev_item_err(leaf, slot,
} else if (type & BTRFS_BLOCK_GROUP_SYSTEM) {
max_stripe_size = SZ_32M;
max_chunk_size = 2 * max_stripe_size;
+ devs_max = min_t(int, devs_max, BTRFS_MAX_DEVS_SYS_CHUNK);
} else {
btrfs_err(info, "invalid chunk type 0x%llx requested",
type);
dout("__ceph_remove_cap %p from %p\n", cap, &ci->vfs_inode);
+ /* remove from inode's cap rbtree, and clear auth cap */
+ rb_erase(&cap->ci_node, &ci->i_caps);
+ if (ci->i_auth_cap == cap)
+ ci->i_auth_cap = NULL;
+
/* remove from session list */
spin_lock(&session->s_cap_lock);
if (session->s_cap_iterator == cap) {
spin_unlock(&session->s_cap_lock);
- /* remove from inode list */
- rb_erase(&cap->ci_node, &ci->i_caps);
- if (ci->i_auth_cap == cap)
- ci->i_auth_cap = NULL;
-
if (removed)
ceph_put_cap(mdsc, cap);
{
int valid = 0;
struct dentry *parent;
- struct inode *dir;
+ struct inode *dir, *inode;
if (flags & LOOKUP_RCU) {
parent = READ_ONCE(dentry->d_parent);
dir = d_inode_rcu(parent);
if (!dir)
return -ECHILD;
+ inode = d_inode_rcu(dentry);
} else {
parent = dget_parent(dentry);
dir = d_inode(parent);
+ inode = d_inode(dentry);
}
dout("d_revalidate %p '%pd' inode %p offset %lld\n", dentry,
- dentry, d_inode(dentry), ceph_dentry(dentry)->offset);
+ dentry, inode, ceph_dentry(dentry)->offset);
/* always trust cached snapped dentries, snapdir dentry */
if (ceph_snap(dir) != CEPH_NOSNAP) {
dout("d_revalidate %p '%pd' inode %p is SNAPPED\n", dentry,
- dentry, d_inode(dentry));
+ dentry, inode);
valid = 1;
- } else if (d_really_is_positive(dentry) &&
- ceph_snap(d_inode(dentry)) == CEPH_SNAPDIR) {
+ } else if (inode && ceph_snap(inode) == CEPH_SNAPDIR) {
valid = 1;
} else {
valid = dentry_lease_is_valid(dentry, flags);
if (valid == -ECHILD)
return valid;
if (valid || dir_lease_is_valid(dir, dentry)) {
- if (d_really_is_positive(dentry))
- valid = ceph_is_any_caps(d_inode(dentry));
+ if (inode)
+ valid = ceph_is_any_caps(inode);
else
valid = 1;
}
err = ceph_security_init_secctx(dentry, mode, &as_ctx);
if (err < 0)
goto out_ctx;
+ } else if (!d_in_lookup(dentry)) {
+ /* If it's not being looked up, it's negative */
+ return -ENOENT;
}
/* do the open */
if (ceph_test_mount_opt(src_fsc, NOCOPYFROM))
return -EOPNOTSUPP;
+ /*
+ * Striped file layouts require that we copy partial objects, but the
+ * OSD copy-from operation only supports full-object copies. Limit
+ * this to non-striped file layouts for now.
+ */
if ((src_ci->i_layout.stripe_unit != dst_ci->i_layout.stripe_unit) ||
- (src_ci->i_layout.stripe_count != dst_ci->i_layout.stripe_count) ||
- (src_ci->i_layout.object_size != dst_ci->i_layout.object_size))
+ (src_ci->i_layout.stripe_count != 1) ||
+ (dst_ci->i_layout.stripe_count != 1) ||
+ (src_ci->i_layout.object_size != dst_ci->i_layout.object_size)) {
+ dout("Invalid src/dst files layout\n");
return -EOPNOTSUPP;
+ }
if (len < src_ci->i_layout.object_size)
return -EOPNOTSUPP; /* no remote copy will be done */
dout(" final dn %p\n", dn);
} else if ((req->r_op == CEPH_MDS_OP_LOOKUPSNAP ||
req->r_op == CEPH_MDS_OP_MKSNAP) &&
+ test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) &&
!test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) {
struct inode *dir = req->r_parent;
}
break;
case Opt_fscache_uniq:
+#ifdef CONFIG_CEPH_FSCACHE
kfree(fsopt->fscache_uniq);
fsopt->fscache_uniq = kstrndup(argstr[0].from,
argstr[0].to-argstr[0].from,
return -ENOMEM;
fsopt->flags |= CEPH_MOUNT_OPT_FSCACHE;
break;
- /* misc */
+#else
+ pr_err("fscache support is disabled\n");
+ return -EINVAL;
+#endif
case Opt_wsize:
if (intval < (int)PAGE_SIZE || intval > CEPH_MAX_WRITE_SIZE)
return -EINVAL;
fsopt->flags &= ~CEPH_MOUNT_OPT_INO32;
break;
case Opt_fscache:
+#ifdef CONFIG_CEPH_FSCACHE
fsopt->flags |= CEPH_MOUNT_OPT_FSCACHE;
kfree(fsopt->fscache_uniq);
fsopt->fscache_uniq = NULL;
break;
+#else
+ pr_err("fscache support is disabled\n");
+ return -EINVAL;
+#endif
case Opt_nofscache:
fsopt->flags &= ~CEPH_MOUNT_OPT_FSCACHE;
kfree(fsopt->fscache_uniq);
struct create_context ccontext;
__u8 Name[8];
struct durable_reconnect_context_v2 dcontext;
+ __u8 Pad[4];
} __packed;
/* See MS-SMB2 2.2.13.2.5 */
}
target_sd->s_links++;
spin_unlock(&configfs_dirent_lock);
- ret = configfs_get_target_path(item, item, body);
+ ret = configfs_get_target_path(parent_item, item, body);
if (!ret)
ret = configfs_create_link(target_sd, parent_item->ci_dentry,
dentry, body);
spin_unlock(&inode->i_lock);
/*
- * A dying wb indicates that the memcg-blkcg mapping has changed
- * and a new wb is already serving the memcg. Switch immediately.
+ * A dying wb indicates that either the blkcg associated with the
+ * memcg changed or the associated memcg is dying. In the first
+ * case, a replacement wb should already be available and we should
+ * refresh the wb immediately. In the second case, trying to
+ * refresh will keep failing.
*/
- if (unlikely(wb_dying(wbc->wb)))
+ if (unlikely(wb_dying(wbc->wb) && !css_is_dying(wbc->wb->memcg_css)))
inode_switch_wbs(inode, wbc->wb_id);
}
EXPORT_SYMBOL_GPL(wbc_attach_and_unlock_inode);
return 0;
}
-static int ocfs2_prepare_inode_for_refcount(struct inode *inode,
- struct file *file,
- loff_t pos, size_t count,
- int *meta_level)
+static int ocfs2_inode_lock_for_extent_tree(struct inode *inode,
+ struct buffer_head **di_bh,
+ int meta_level,
+ int overwrite_io,
+ int write_sem,
+ int wait)
{
- int ret;
- struct buffer_head *di_bh = NULL;
- u32 cpos = pos >> OCFS2_SB(inode->i_sb)->s_clustersize_bits;
- u32 clusters =
- ocfs2_clusters_for_bytes(inode->i_sb, pos + count) - cpos;
+ int ret = 0;
- ret = ocfs2_inode_lock(inode, &di_bh, 1);
- if (ret) {
- mlog_errno(ret);
+ if (wait)
+ ret = ocfs2_inode_lock(inode, NULL, meta_level);
+ else
+ ret = ocfs2_try_inode_lock(inode,
+ overwrite_io ? NULL : di_bh, meta_level);
+ if (ret < 0)
goto out;
+
+ if (wait) {
+ if (write_sem)
+ down_write(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ down_read(&OCFS2_I(inode)->ip_alloc_sem);
+ } else {
+ if (write_sem)
+ ret = down_write_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ ret = down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem);
+
+ if (!ret) {
+ ret = -EAGAIN;
+ goto out_unlock;
+ }
}
- *meta_level = 1;
+ return ret;
- ret = ocfs2_refcount_cow(inode, di_bh, cpos, clusters, UINT_MAX);
- if (ret)
- mlog_errno(ret);
+out_unlock:
+ brelse(*di_bh);
+ ocfs2_inode_unlock(inode, meta_level);
out:
- brelse(di_bh);
return ret;
}
+static void ocfs2_inode_unlock_for_extent_tree(struct inode *inode,
+ struct buffer_head **di_bh,
+ int meta_level,
+ int write_sem)
+{
+ if (write_sem)
+ up_write(&OCFS2_I(inode)->ip_alloc_sem);
+ else
+ up_read(&OCFS2_I(inode)->ip_alloc_sem);
+
+ brelse(*di_bh);
+ *di_bh = NULL;
+
+ if (meta_level >= 0)
+ ocfs2_inode_unlock(inode, meta_level);
+}
+
static int ocfs2_prepare_inode_for_write(struct file *file,
loff_t pos, size_t count, int wait)
{
int ret = 0, meta_level = 0, overwrite_io = 0;
+ int write_sem = 0;
struct dentry *dentry = file->f_path.dentry;
struct inode *inode = d_inode(dentry);
struct buffer_head *di_bh = NULL;
+ u32 cpos;
+ u32 clusters;
/*
* We start with a read level meta lock and only jump to an ex
* if we need to make modifications here.
*/
for(;;) {
- if (wait)
- ret = ocfs2_inode_lock(inode, NULL, meta_level);
- else
- ret = ocfs2_try_inode_lock(inode,
- overwrite_io ? NULL : &di_bh, meta_level);
+ ret = ocfs2_inode_lock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ overwrite_io,
+ write_sem,
+ wait);
if (ret < 0) {
- meta_level = -1;
if (ret != -EAGAIN)
mlog_errno(ret);
goto out;
*/
if (!wait && !overwrite_io) {
overwrite_io = 1;
- if (!down_read_trylock(&OCFS2_I(inode)->ip_alloc_sem)) {
- ret = -EAGAIN;
- goto out_unlock;
- }
ret = ocfs2_overwrite_io(inode, di_bh, pos, count);
- brelse(di_bh);
- di_bh = NULL;
- up_read(&OCFS2_I(inode)->ip_alloc_sem);
if (ret < 0) {
if (ret != -EAGAIN)
mlog_errno(ret);
* set inode->i_size at the end of a write. */
if (should_remove_suid(dentry)) {
if (meta_level == 0) {
- ocfs2_inode_unlock(inode, meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);
meta_level = 1;
continue;
}
ret = ocfs2_check_range_for_refcount(inode, pos, count);
if (ret == 1) {
- ocfs2_inode_unlock(inode, meta_level);
- meta_level = -1;
-
- ret = ocfs2_prepare_inode_for_refcount(inode,
- file,
- pos,
- count,
- &meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);
+ ret = ocfs2_inode_lock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ overwrite_io,
+ 1,
+ wait);
+ write_sem = 1;
+ if (ret < 0) {
+ if (ret != -EAGAIN)
+ mlog_errno(ret);
+ goto out;
+ }
+
+ cpos = pos >> OCFS2_SB(inode->i_sb)->s_clustersize_bits;
+ clusters =
+ ocfs2_clusters_for_bytes(inode->i_sb, pos + count) - cpos;
+ ret = ocfs2_refcount_cow(inode, di_bh, cpos, clusters, UINT_MAX);
}
if (ret < 0) {
- mlog_errno(ret);
+ if (ret != -EAGAIN)
+ mlog_errno(ret);
goto out_unlock;
}
trace_ocfs2_prepare_inode_for_write(OCFS2_I(inode)->ip_blkno,
pos, count, wait);
- brelse(di_bh);
-
- if (meta_level >= 0)
- ocfs2_inode_unlock(inode, meta_level);
+ ocfs2_inode_unlock_for_extent_tree(inode,
+ &di_bh,
+ meta_level,
+ write_sem);
out:
return ret;
}
#endif /* __arch_get_clock_mode */
-#ifndef __arch_use_vsyscall
-static __always_inline int __arch_use_vsyscall(struct vdso_data *vdata)
-{
- return 1;
-}
-#endif /* __arch_use_vsyscall */
-
#ifndef __arch_update_vsyscall
static __always_inline void __arch_update_vsyscall(struct vdso_data *vdata,
struct timekeeper *tk)
*/
unsigned int pages_use_count;
+ /**
+ * @madv: State for madvise
+ *
+ * 0 is active/inuse.
+ * A negative value is the object is purged.
+ * Positive values are driver specific and not used by the helpers.
+ */
int madv;
+
+ /**
+ * @madv_list: List entry for madvise tracking
+ *
+ * Typically used by drivers to track purgeable objects
+ */
struct list_head madv_list;
/**
void drm_self_refresh_helper_alter_state(struct drm_atomic_state *state);
void drm_self_refresh_helper_update_avg_times(struct drm_atomic_state *state,
- unsigned int commit_time_ms);
+ unsigned int commit_time_ms,
+ unsigned int new_self_refresh_mask);
int drm_self_refresh_helper_init(struct drm_crtc *crtc);
void drm_self_refresh_helper_cleanup(struct drm_crtc *crtc);
void bpf_map_put(struct bpf_map *map);
int bpf_map_charge_memlock(struct bpf_map *map, u32 pages);
void bpf_map_uncharge_memlock(struct bpf_map *map, u32 pages);
-int bpf_map_charge_init(struct bpf_map_memory *mem, size_t size);
+int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size);
void bpf_map_charge_finish(struct bpf_map_memory *mem);
void bpf_map_charge_move(struct bpf_map_memory *dst,
struct bpf_map_memory *src);
-void *bpf_map_area_alloc(size_t size, int numa_node);
+void *bpf_map_area_alloc(u64 size, int numa_node);
void bpf_map_area_free(void *base);
void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
struct device_attribute *attr, char *buf);
extern ssize_t cpu_show_mds(struct device *dev,
struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_tsx_async_abort(struct device *dev,
+ struct device_attribute *attr,
+ char *buf);
+extern ssize_t cpu_show_itlb_multihit(struct device *dev,
+ struct device_attribute *attr, char *buf);
extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
static inline int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval) { return 0; }
#endif
-/*
- * These are used for a global "mitigations=" cmdline option for toggling
- * optional CPU mitigations.
- */
-enum cpu_mitigations {
- CPU_MITIGATIONS_OFF,
- CPU_MITIGATIONS_AUTO,
- CPU_MITIGATIONS_AUTO_NOSMT,
-};
-
-extern enum cpu_mitigations cpu_mitigations;
-
-/* mitigations=off */
-static inline bool cpu_mitigations_off(void)
-{
- return cpu_mitigations == CPU_MITIGATIONS_OFF;
-}
-
-/* mitigations=auto,nosmt */
-static inline bool cpu_mitigations_auto_nosmt(void)
-{
- return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
-}
+extern bool cpu_mitigations_off(void);
+extern bool cpu_mitigations_auto_nosmt(void);
#endif /* _LINUX_CPU_H_ */
* is convenient for a "not found" value.
*/
#define idr_for_each_entry(idr, entry, id) \
- for (id = 0; ((entry) = idr_get_next(idr, &(id))) != NULL; ++id)
+ for (id = 0; ((entry) = idr_get_next(idr, &(id))) != NULL; id += 1U)
/**
* idr_for_each_entry_ul() - Iterate over an IDR's elements of a given type.
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);
bool kvm_is_reserved_pfn(kvm_pfn_t pfn);
+bool kvm_is_zone_device_pfn(kvm_pfn_t pfn);
struct kvm_irq_ack_notifier {
struct hlist_node link;
}
#endif /* CONFIG_HAVE_KVM_VCPU_RUN_PID_CHANGE */
+typedef int (*kvm_vm_thread_fn_t)(struct kvm *kvm, uintptr_t data);
+
+int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
+ uintptr_t data, const char *name,
+ struct task_struct **thread_ptr);
+
#endif
extern void kvfree(const void *addr);
-static inline atomic_t *compound_mapcount_ptr(struct page *page)
-{
- return &page[1].compound_mapcount;
-}
-
static inline int compound_mapcount(struct page *page)
{
VM_BUG_ON_PAGE(!PageCompound(page), page);
#endif
} _struct_page_alignment;
+static inline atomic_t *compound_mapcount_ptr(struct page *page)
+{
+ return &page[1].compound_mapcount;
+}
+
/*
* Used for sizing the vmemmap region on some architectures
*/
*
* Unlike PageTransCompound, this is safe to be called only while
* split_huge_pmd() cannot run from under us, like if protected by the
- * MMU notifier, otherwise it may result in page->_mapcount < 0 false
+ * MMU notifier, otherwise it may result in page->_mapcount check false
* positives.
+ *
+ * We have to treat page cache THP differently since every subpage of it
+ * would get _mapcount inc'ed once it is PMD mapped. But, it may be PTE
+ * mapped in the current process so comparing subpage's _mapcount to
+ * compound_mapcount to filter out PTE mapped case.
*/
static inline int PageTransCompoundMap(struct page *page)
{
- return PageTransCompound(page) && atomic_read(&page->_mapcount) < 0;
+ struct page *head;
+
+ if (!PageTransCompound(page))
+ return 0;
+
+ if (PageAnon(page))
+ return atomic_read(&page->_mapcount) < 0;
+
+ head = compound_head(page);
+ /* File THP is PMD mapped and not PTE mapped */
+ return atomic_read(&page->_mapcount) ==
+ atomic_read(compound_mapcount_ptr(head));
}
/*
}
/**
- * radix_tree_iter_find - find a present entry
- * @root: radix tree root
- * @iter: iterator state
- * @index: start location
- *
- * This function returns the slot containing the entry with the lowest index
- * which is at least @index. If @index is larger than any present entry, this
- * function returns NULL. The @iter is updated to describe the entry found.
- */
-static inline void __rcu **
-radix_tree_iter_find(const struct radix_tree_root *root,
- struct radix_tree_iter *iter, unsigned long index)
-{
- radix_tree_iter_init(iter, index);
- return radix_tree_next_chunk(root, iter, 0);
-}
-
-/**
* radix_tree_iter_retry - retry this chunk of the iteration
* @iter: iterator state
*
struct reset_controller_dev;
/**
- * struct reset_control_ops
+ * struct reset_control_ops - reset controller driver callbacks
*
* @reset: for self-deasserting resets, does all necessary
* things to reset the device
* @provider: name of the reset controller device controlling this reset line
* @index: ID of the reset controller in the reset controller device
* @dev_id: name of the device associated with this reset line
- * @con_id name of the reset line (can be NULL)
+ * @con_id: name of the reset line (can be NULL)
*/
struct reset_control_lookup {
struct list_head list;
* If this function is called more than once for the same reset_control it will
* return -EBUSY.
*
- * See reset_control_get_shared for details on shared references to
+ * See reset_control_get_shared() for details on shared references to
* reset-controls.
*
* Use of id names is optional.
}
}
+static inline u32 sk_msg_iter_dist(u32 start, u32 end)
+{
+ return end >= start ? end - start : end + (MAX_MSG_FRAGS - start);
+}
+
#define sk_msg_iter_var_prev(var) \
do { \
if (var == 0) \
if (sk_msg_full(msg))
return MAX_MSG_FRAGS;
- return msg->sg.end >= msg->sg.start ?
- msg->sg.end - msg->sg.start :
- msg->sg.end + (MAX_MSG_FRAGS - msg->sg.start);
+ return sk_msg_iter_dist(msg->sg.start, msg->sg.end);
}
static inline struct scatterlist *sk_msg_elem(struct sk_msg *msg, int which)
unsigned long target_last_arp_rx[BOND_MAX_ARP_TARGETS];
s8 link; /* one of BOND_LINK_XXXX */
s8 link_new_state; /* one of BOND_LINK_XXXX */
- s8 new_link;
u8 backup:1, /* indicates backup slave. Value corresponds with
BOND_STATE_ACTIVE and BOND_STATE_BACKUP */
inactive:1, /* indicates inactive slave */
static inline void bond_commit_link_state(struct slave *slave, bool notify)
{
- if (slave->link == slave->link_new_state)
+ if (slave->link_new_state == BOND_LINK_NOCHANGE)
return;
slave->link = slave->link_new_state;
fq->limit = 8192;
fq->memory_limit = 16 << 20; /* 16 MBytes */
- fq->flows = kcalloc(fq->flows_cnt, sizeof(fq->flows[0]), GFP_KERNEL);
+ fq->flows = kvcalloc(fq->flows_cnt, sizeof(fq->flows[0]), GFP_KERNEL);
if (!fq->flows)
return -ENOMEM;
for (i = 0; i < fq->flows_cnt; i++)
fq_flow_reset(fq, &fq->flows[i], free_func);
- kfree(fq->flows);
+ kvfree(fq->flows);
fq->flows = NULL;
}
{
unsigned long now = jiffies;
- if (neigh->used != now)
- neigh->used = now;
+ if (READ_ONCE(neigh->used) != now)
+ WRITE_ONCE(neigh->used, now);
if (!(neigh->nud_state&(NUD_CONNECTED|NUD_DELAY|NUD_PROBE)))
return __neigh_event_send(neigh, skb);
return 0;
*/
struct nft_expr {
const struct nft_expr_ops *ops;
- unsigned char data[];
+ unsigned char data[]
+ __attribute__((aligned(__alignof__(u64))));
};
static inline void *nft_expr_priv(const struct nft_expr *expr)
#include <linux/mutex.h>
#include <linux/rwsem.h>
#include <linux/atomic.h>
+#include <linux/hashtable.h>
#include <net/gen_stats.h>
#include <net/rtnetlink.h>
#include <net/flow_offload.h>
bool deleting;
refcount_t refcnt;
struct rcu_head rcu;
+ struct hlist_node destroy_ht_node;
};
struct qdisc_skb_cb {
struct list_head filter_chain_list;
} chain0;
struct rcu_head rcu;
+ DECLARE_HASHTABLE(proto_destroy_ht, 7);
+ struct mutex proto_destroy_lock; /* Lock for proto_destroy hashtable. */
};
#ifdef CONFIG_PROVE_LOCKING
return kt;
#else
- return sk->sk_stamp;
+ return READ_ONCE(sk->sk_stamp);
#endif
}
sk->sk_stamp = kt;
write_sequnlock(&sk->sk_stamp_seq);
#else
- sk->sk_stamp = kt;
+ WRITE_ONCE(sk->sk_stamp, kt);
#endif
}
#include <linux/socket.h>
#include <linux/tcp.h>
#include <linux/skmsg.h>
+#include <linux/mutex.h>
#include <linux/netdevice.h>
#include <linux/rcupdate.h>
bool in_tcp_sendpages;
bool pending_open_record_frags;
+
+ struct mutex tx_lock; /* protects partially_sent_* fields and
+ * per-type TX fields
+ */
unsigned long flags;
/* cache cold stuff */
-/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* SPDX-License-Identifier: ((GPL-2.0-only WITH Linux-syscall-note) OR BSD-3-Clause) */
/*
* linux/can.h
*
-/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* SPDX-License-Identifier: ((GPL-2.0-only WITH Linux-syscall-note) OR BSD-3-Clause) */
/*
* linux/can/bcm.h
*
-/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* SPDX-License-Identifier: ((GPL-2.0-only WITH Linux-syscall-note) OR BSD-3-Clause) */
/*
* linux/can/error.h
*
-/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* SPDX-License-Identifier: ((GPL-2.0-only WITH Linux-syscall-note) OR BSD-3-Clause) */
/*
* linux/can/gw.h
*
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
/*
* j1939.h
*
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
/*
* linux/can/netlink.h
*
-/* SPDX-License-Identifier: ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) */
+/* SPDX-License-Identifier: ((GPL-2.0-only WITH Linux-syscall-note) OR BSD-3-Clause) */
/*
* linux/can/raw.h
*
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
#ifndef _UAPI_CAN_VXCAN_H
#define _UAPI_CAN_VXCAN_H
__u32 cdw14;
__u32 cdw15;
__u32 timeout_ms;
+ __u32 rsvd2;
__u64 result;
};
* sent when the child exits.
* @stack: Specify the location of the stack for the
* child process.
+ * Note, @stack is expected to point to the
+ * lowest address. The stack direction will be
+ * determined by the kernel and set up
+ * appropriately based on @stack_size.
* @stack_size: The size of the stack for the child process.
* @tls: If CLONE_SETTLS is set, the tls descriptor
* is set to tls.
return false;
switch (off) {
- case offsetof(struct bpf_sysctl, write):
+ case bpf_ctx_range(struct bpf_sysctl, write):
if (type != BPF_READ)
return false;
bpf_ctx_record_field_size(info, size_default);
return bpf_ctx_narrow_access_ok(off, size, size_default);
- case offsetof(struct bpf_sysctl, file_pos):
+ case bpf_ctx_range(struct bpf_sysctl, file_pos):
if (type == BPF_READ) {
bpf_ctx_record_field_size(info, size_default);
return bpf_ctx_narrow_access_ok(off, size, size_default);
return map;
}
-void *bpf_map_area_alloc(size_t size, int numa_node)
+void *bpf_map_area_alloc(u64 size, int numa_node)
{
/* We really just want to fail instead of triggering OOM killer
* under memory pressure, therefore we set __GFP_NORETRY to kmalloc,
const gfp_t flags = __GFP_NOWARN | __GFP_ZERO;
void *area;
+ if (size >= SIZE_MAX)
+ return NULL;
+
if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
area = kmalloc_node(size, GFP_USER | __GFP_NORETRY | flags,
numa_node);
atomic_long_sub(pages, &user->locked_vm);
}
-int bpf_map_charge_init(struct bpf_map_memory *mem, size_t size)
+int bpf_map_charge_init(struct bpf_map_memory *mem, u64 size)
{
u32 pages = round_up(size, PAGE_SIZE) >> PAGE_SHIFT;
struct user_struct *user;
this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
}
-enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
+/*
+ * These are used for a global "mitigations=" cmdline option for toggling
+ * optional CPU mitigations.
+ */
+enum cpu_mitigations {
+ CPU_MITIGATIONS_OFF,
+ CPU_MITIGATIONS_AUTO,
+ CPU_MITIGATIONS_AUTO_NOSMT,
+};
+
+static enum cpu_mitigations cpu_mitigations __ro_after_init =
+ CPU_MITIGATIONS_AUTO;
static int __init mitigations_parse_cmdline(char *arg)
{
return 0;
}
early_param("mitigations", mitigations_parse_cmdline);
+
+/* mitigations=off */
+bool cpu_mitigations_off(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_OFF;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_off);
+
+/* mitigations=auto,nosmt */
+bool cpu_mitigations_auto_nosmt(void)
+{
+ return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT;
+}
+EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt);
return 0;
}
-static bool clone3_args_valid(const struct kernel_clone_args *kargs)
+/**
+ * clone3_stack_valid - check and prepare stack
+ * @kargs: kernel clone args
+ *
+ * Verify that the stack arguments userspace gave us are sane.
+ * In addition, set the stack direction for userspace since it's easy for us to
+ * determine.
+ */
+static inline bool clone3_stack_valid(struct kernel_clone_args *kargs)
+{
+ if (kargs->stack == 0) {
+ if (kargs->stack_size > 0)
+ return false;
+ } else {
+ if (kargs->stack_size == 0)
+ return false;
+
+ if (!access_ok((void __user *)kargs->stack, kargs->stack_size))
+ return false;
+
+#if !defined(CONFIG_STACK_GROWSUP) && !defined(CONFIG_IA64)
+ kargs->stack += kargs->stack_size;
+#endif
+ }
+
+ return true;
+}
+
+static bool clone3_args_valid(struct kernel_clone_args *kargs)
{
/*
* All lower bits of the flag word are taken.
kargs->exit_signal)
return false;
+ if (!clone3_stack_valid(kargs))
+ return false;
+
return true;
}
* @type: Type of irqchip_fwnode. See linux/irqdomain.h
* @name: Optional user provided domain name
* @id: Optional user provided id if name != NULL
- * @data: Optional user-provided data
+ * @pa: Optional user-provided physical address
*
* Allocate a struct irqchip_fwid, and return a poiner to the embedded
* fwnode_handle (or NULL on failure).
task_rq_unlock(rq, p, &rf);
}
+#ifdef CONFIG_UCLAMP_TASK_GROUP
static inline void
uclamp_update_active_tasks(struct cgroup_subsys_state *css,
unsigned int clamps)
css_task_iter_end(&it);
}
-#ifdef CONFIG_UCLAMP_TASK_GROUP
static void cpu_util_update_eff(struct cgroup_subsys_state *css);
static void uclamp_update_root_tg(void)
{
}
restart:
+#ifdef CONFIG_SMP
/*
- * Ensure that we put DL/RT tasks before the pick loop, such that they
- * can PULL higher prio tasks when we lower the RQ 'priority'.
+ * We must do the balancing pass before put_next_task(), such
+ * that when we release the rq->lock the task is in the same
+ * state as before we took rq->lock.
+ *
+ * We can terminate the balance pass as soon as we know there is
+ * a runnable task of @class priority or higher.
*/
- prev->sched_class->put_prev_task(rq, prev, rf);
- if (!rq->nr_running)
- newidle_balance(rq, rf);
+ for_class_range(class, prev->sched_class, &idle_sched_class) {
+ if (class->balance(rq, prev, rf))
+ break;
+ }
+#endif
+
+ put_prev_task(rq, prev);
for_each_class(class) {
p = class->pick_next_task(rq, NULL, NULL);
for_each_class(class) {
next = class->pick_next_task(rq, NULL, NULL);
if (next) {
- next->sched_class->put_prev_task(rq, next, NULL);
+ next->sched_class->put_prev_task(rq, next);
return next;
}
}
resched_curr(rq);
}
+static int balance_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+{
+ if (!on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
+ /*
+ * This is OK, because current is on_cpu, which avoids it being
+ * picked for load-balance and preemption/IRQs are still
+ * disabled avoiding further scheduler activity on it and we've
+ * not yet started the picking loop.
+ */
+ rq_unpin_lock(rq, rf);
+ pull_dl_task(rq);
+ rq_repin_lock(rq, rf);
+ }
+
+ return sched_stop_runnable(rq) || sched_dl_runnable(rq);
+}
#endif /* CONFIG_SMP */
/*
pick_next_task_dl(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct sched_dl_entity *dl_se;
+ struct dl_rq *dl_rq = &rq->dl;
struct task_struct *p;
- struct dl_rq *dl_rq;
WARN_ON_ONCE(prev || rf);
- dl_rq = &rq->dl;
-
- if (unlikely(!dl_rq->dl_nr_running))
+ if (!sched_dl_runnable(rq))
return NULL;
dl_se = pick_next_dl_entity(rq, dl_rq);
BUG_ON(!dl_se);
-
p = dl_task_of(dl_se);
-
set_next_task_dl(rq, p);
-
return p;
}
-static void put_prev_task_dl(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+static void put_prev_task_dl(struct rq *rq, struct task_struct *p)
{
update_curr_dl(rq);
update_dl_rq_load_avg(rq_clock_pelt(rq), rq, 1);
if (on_dl_rq(&p->dl) && p->nr_cpus_allowed > 1)
enqueue_pushable_dl_task(rq, p);
-
- if (rf && !on_dl_rq(&p->dl) && need_pull_dl_task(rq, p)) {
- /*
- * This is OK, because current is on_cpu, which avoids it being
- * picked for load-balance and preemption/IRQs are still
- * disabled avoiding further scheduler activity on it and we've
- * not yet started the picking loop.
- */
- rq_unpin_lock(rq, rf);
- pull_dl_task(rq);
- rq_repin_lock(rq, rf);
- }
}
/*
.set_next_task = set_next_task_dl,
#ifdef CONFIG_SMP
+ .balance = balance_dl,
.select_task_rq = select_task_rq_dl,
.migrate_task_rq = migrate_task_rq_dl,
.set_cpus_allowed = set_cpus_allowed_dl,
{
remove_entity_load_avg(&p->se);
}
+
+static int
+balance_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+{
+ if (rq->nr_running)
+ return 1;
+
+ return newidle_balance(rq, rf) != 0;
+}
#endif /* CONFIG_SMP */
static unsigned long wakeup_gran(struct sched_entity *se)
int new_tasks;
again:
- if (!cfs_rq->nr_running)
+ if (!sched_fair_runnable(rq))
goto idle;
#ifdef CONFIG_FAIR_GROUP_SCHED
/*
* Account for a descheduled task:
*/
-static void put_prev_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+static void put_prev_task_fair(struct rq *rq, struct task_struct *prev)
{
struct sched_entity *se = &prev->se;
struct cfs_rq *cfs_rq;
.check_preempt_curr = check_preempt_wakeup,
.pick_next_task = pick_next_task_fair,
-
.put_prev_task = put_prev_task_fair,
.set_next_task = set_next_task_fair,
#ifdef CONFIG_SMP
+ .balance = balance_fair,
.select_task_rq = select_task_rq_fair,
.migrate_task_rq = migrate_task_rq_fair,
{
return task_cpu(p); /* IDLE tasks as never migrated */
}
+
+static int
+balance_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+{
+ return WARN_ON_ONCE(1);
+}
#endif
/*
resched_curr(rq);
}
-static void put_prev_task_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+static void put_prev_task_idle(struct rq *rq, struct task_struct *prev)
{
}
.set_next_task = set_next_task_idle,
#ifdef CONFIG_SMP
+ .balance = balance_idle,
.select_task_rq = select_task_rq_idle,
.set_cpus_allowed = set_cpus_allowed_common,
#endif
resched_curr(rq);
}
+static int balance_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+{
+ if (!on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
+ /*
+ * This is OK, because current is on_cpu, which avoids it being
+ * picked for load-balance and preemption/IRQs are still
+ * disabled avoiding further scheduler activity on it and we've
+ * not yet started the picking loop.
+ */
+ rq_unpin_lock(rq, rf);
+ pull_rt_task(rq);
+ rq_repin_lock(rq, rf);
+ }
+
+ return sched_stop_runnable(rq) || sched_dl_runnable(rq) || sched_rt_runnable(rq);
+}
#endif /* CONFIG_SMP */
/*
pick_next_task_rt(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
struct task_struct *p;
- struct rt_rq *rt_rq = &rq->rt;
WARN_ON_ONCE(prev || rf);
- if (!rt_rq->rt_queued)
+ if (!sched_rt_runnable(rq))
return NULL;
p = _pick_next_task_rt(rq);
-
set_next_task_rt(rq, p);
-
return p;
}
-static void put_prev_task_rt(struct rq *rq, struct task_struct *p, struct rq_flags *rf)
+static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
{
update_curr_rt(rq);
*/
if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
enqueue_pushable_task(rq, p);
-
- if (rf && !on_rt_rq(&p->rt) && need_pull_rt_task(rq, p)) {
- /*
- * This is OK, because current is on_cpu, which avoids it being
- * picked for load-balance and preemption/IRQs are still
- * disabled avoiding further scheduler activity on it and we've
- * not yet started the picking loop.
- */
- rq_unpin_lock(rq, rf);
- pull_rt_task(rq);
- rq_repin_lock(rq, rf);
- }
}
#ifdef CONFIG_SMP
.set_next_task = set_next_task_rt,
#ifdef CONFIG_SMP
+ .balance = balance_rt,
.select_task_rq = select_task_rq_rt,
-
.set_cpus_allowed = set_cpus_allowed_common,
.rq_online = rq_online_rt,
.rq_offline = rq_offline_rt,
struct task_struct * (*pick_next_task)(struct rq *rq,
struct task_struct *prev,
struct rq_flags *rf);
- void (*put_prev_task)(struct rq *rq, struct task_struct *p, struct rq_flags *rf);
+ void (*put_prev_task)(struct rq *rq, struct task_struct *p);
void (*set_next_task)(struct rq *rq, struct task_struct *p);
#ifdef CONFIG_SMP
+ int (*balance)(struct rq *rq, struct task_struct *prev, struct rq_flags *rf);
int (*select_task_rq)(struct task_struct *p, int task_cpu, int sd_flag, int flags);
void (*migrate_task_rq)(struct task_struct *p, int new_cpu);
static inline void put_prev_task(struct rq *rq, struct task_struct *prev)
{
WARN_ON_ONCE(rq->curr != prev);
- prev->sched_class->put_prev_task(rq, prev, NULL);
+ prev->sched_class->put_prev_task(rq, prev);
}
static inline void set_next_task(struct rq *rq, struct task_struct *next)
#else
#define sched_class_highest (&dl_sched_class)
#endif
+
+#define for_class_range(class, _from, _to) \
+ for (class = (_from); class != (_to); class = class->next)
+
#define for_each_class(class) \
- for (class = sched_class_highest; class; class = class->next)
+ for_class_range(class, sched_class_highest, NULL)
extern const struct sched_class stop_sched_class;
extern const struct sched_class dl_sched_class;
extern const struct sched_class fair_sched_class;
extern const struct sched_class idle_sched_class;
+static inline bool sched_stop_runnable(struct rq *rq)
+{
+ return rq->stop && task_on_rq_queued(rq->stop);
+}
+
+static inline bool sched_dl_runnable(struct rq *rq)
+{
+ return rq->dl.dl_nr_running > 0;
+}
+
+static inline bool sched_rt_runnable(struct rq *rq)
+{
+ return rq->rt.rt_queued > 0;
+}
+
+static inline bool sched_fair_runnable(struct rq *rq)
+{
+ return rq->cfs.nr_running > 0;
+}
#ifdef CONFIG_SMP
{
return task_cpu(p); /* stop tasks as never migrate */
}
+
+static int
+balance_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+{
+ return sched_stop_runnable(rq);
+}
#endif /* CONFIG_SMP */
static void
static struct task_struct *
pick_next_task_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
{
- struct task_struct *stop = rq->stop;
-
WARN_ON_ONCE(prev || rf);
- if (!stop || !task_on_rq_queued(stop))
+ if (!sched_stop_runnable(rq))
return NULL;
- set_next_task_stop(rq, stop);
-
- return stop;
+ set_next_task_stop(rq, rq->stop);
+ return rq->stop;
}
static void
BUG(); /* the stop task should never yield, its pointless. */
}
-static void put_prev_task_stop(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
+static void put_prev_task_stop(struct rq *rq, struct task_struct *prev)
{
struct task_struct *curr = rq->curr;
u64 delta_exec;
.set_next_task = set_next_task_stop,
#ifdef CONFIG_SMP
+ .balance = balance_stop,
.select_task_rq = select_task_rq_stop,
.set_cpus_allowed = set_cpus_allowed_common,
#endif
*/
preempt_disable();
read_unlock(&tasklist_lock);
- preempt_enable_no_resched();
cgroup_enter_frozen();
+ preempt_enable_no_resched();
freezable_schedule();
cgroup_leave_frozen(true);
} else {
struct stacktrace_cookie c = {
.store = store,
.size = size,
- .skip = skipnr + 1,
+ /* skip this function if they are tracing us */
+ .skip = skipnr + !!(current == tsk),
};
if (!try_get_task_stack(tsk))
struct stack_trace trace = {
.entries = store,
.max_entries = size,
- .skip = skipnr + 1,
+ /* skip this function if they are tracing us */
+ .skip = skipnr + !!(current == task),
};
save_stack_trace_tsk(task, &trace);
nsec = nsec + tk->wall_to_monotonic.tv_nsec;
vdso_ts->sec += __iter_div_u64_rem(nsec, NSEC_PER_SEC, &vdso_ts->nsec);
- if (__arch_use_vsyscall(vdata))
- update_vdso_data(vdata, tk);
+ update_vdso_data(vdata, tk);
__arch_update_vsyscall(vdata, tk);
{
struct vdso_data *vdata = __arch_get_k_vdso_data();
- if (__arch_use_vsyscall(vdata)) {
- vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
- vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;
- }
+ vdata[CS_HRES_COARSE].tz_minuteswest = sys_tz.tz_minuteswest;
+ vdata[CS_HRES_COARSE].tz_dsttime = sys_tz.tz_dsttime;
__arch_sync_vdso_data(vdata);
}
config HAS_IOMEM
bool
depends on !NO_IOMEM
- select GENERIC_IO
default y
config HAS_IOPORT_MAP
was_locked = 1;
} else {
local_irq_restore(flags);
- cpu_relax();
+ /*
+ * Wait for the lock to release before jumping to
+ * atomic_cmpxchg() in order to mitigate the thundering herd
+ * problem.
+ */
+ do { cpu_relax(); } while (atomic_read(&dump_lock) != -1);
goto retry;
}
EXPORT_SYMBOL(idr_for_each);
/**
- * idr_get_next() - Find next populated entry.
+ * idr_get_next_ul() - Find next populated entry.
* @idr: IDR handle.
* @nextid: Pointer to an ID.
*
* to the ID of the found value. To use in a loop, the value pointed to by
* nextid must be incremented by the user.
*/
-void *idr_get_next(struct idr *idr, int *nextid)
+void *idr_get_next_ul(struct idr *idr, unsigned long *nextid)
{
struct radix_tree_iter iter;
void __rcu **slot;
}
if (!slot)
return NULL;
- id = iter.index + base;
-
- if (WARN_ON_ONCE(id > INT_MAX))
- return NULL;
- *nextid = id;
+ *nextid = iter.index + base;
return entry;
}
-EXPORT_SYMBOL(idr_get_next);
+EXPORT_SYMBOL(idr_get_next_ul);
/**
- * idr_get_next_ul() - Find next populated entry.
+ * idr_get_next() - Find next populated entry.
* @idr: IDR handle.
* @nextid: Pointer to an ID.
*
* to the ID of the found value. To use in a loop, the value pointed to by
* nextid must be incremented by the user.
*/
-void *idr_get_next_ul(struct idr *idr, unsigned long *nextid)
+void *idr_get_next(struct idr *idr, int *nextid)
{
- struct radix_tree_iter iter;
- void __rcu **slot;
- unsigned long base = idr->idr_base;
unsigned long id = *nextid;
+ void *entry = idr_get_next_ul(idr, &id);
- id = (id < base) ? 0 : id - base;
- slot = radix_tree_iter_find(&idr->idr_rt, &iter, id);
- if (!slot)
+ if (WARN_ON_ONCE(id > INT_MAX))
return NULL;
-
- *nextid = iter.index + base;
- return rcu_dereference_raw(*slot);
+ *nextid = id;
+ return entry;
}
-EXPORT_SYMBOL(idr_get_next_ul);
+EXPORT_SYMBOL(idr_get_next);
/**
* idr_replace() - replace pointer for given ID.
offset = radix_tree_find_next_bit(node, IDR_FREE,
offset + 1);
start = next_index(start, node, offset);
- if (start > max)
+ if (start > max || start == 0)
return ERR_PTR(-ENOSPC);
while (offset == RADIX_TREE_MAP_SIZE) {
offset = node->offset + 1;
XA_BUG_ON(xa, !xa_empty(xa));
}
+static noinline void check_move_tiny(struct xarray *xa)
+{
+ XA_STATE(xas, xa, 0);
+
+ XA_BUG_ON(xa, !xa_empty(xa));
+ rcu_read_lock();
+ XA_BUG_ON(xa, xas_next(&xas) != NULL);
+ XA_BUG_ON(xa, xas_next(&xas) != NULL);
+ rcu_read_unlock();
+ xa_store_index(xa, 0, GFP_KERNEL);
+ rcu_read_lock();
+ xas_set(&xas, 0);
+ XA_BUG_ON(xa, xas_next(&xas) != xa_mk_index(0));
+ XA_BUG_ON(xa, xas_next(&xas) != NULL);
+ xas_set(&xas, 0);
+ XA_BUG_ON(xa, xas_prev(&xas) != xa_mk_index(0));
+ XA_BUG_ON(xa, xas_prev(&xas) != NULL);
+ rcu_read_unlock();
+ xa_erase_index(xa, 0);
+ XA_BUG_ON(xa, !xa_empty(xa));
+}
+
static noinline void check_move_small(struct xarray *xa, unsigned long idx)
{
XA_STATE(xas, xa, 0);
xa_destroy(xa);
+ check_move_tiny(xa);
+
for (i = 0; i < 16; i++)
check_move_small(xa, 1UL << i);
if (!xas_frozen(xas->xa_node))
xas->xa_index--;
+ if (!xas->xa_node)
+ return set_bounds(xas);
if (xas_not_node(xas->xa_node))
return xas_load(xas);
if (!xas_frozen(xas->xa_node))
xas->xa_index++;
+ if (!xas->xa_node)
+ return set_bounds(xas);
if (xas_not_node(xas->xa_node))
return xas_load(xas);
anon_vma_lock_write(vma->anon_vma);
- pte = pte_offset_map(pmd, address);
- pte_ptl = pte_lockptr(mm, pmd);
-
mmu_notifier_range_init(&range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
address, address + HPAGE_PMD_SIZE);
mmu_notifier_invalidate_range_start(&range);
+
+ pte = pte_offset_map(pmd, address);
+ pte_ptl = pte_lockptr(mm, pmd);
+
pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
/*
* After this gup_fast can't run anymore. This also removes
unsigned long ino = 0;
rcu_read_lock();
- if (PageHead(page) && PageSlab(page))
+ if (PageSlab(page) && !PageTail(page))
memcg = memcg_from_slab_page(page);
else
memcg = READ_ONCE(page->mem_cgroup);
}
/*
+ * Memcg doesn't have a dedicated reserve for atomic
+ * allocations. But like the global atomic pool, we need to
+ * put the burden of reclaim on regular allocation requests
+ * and let these go through as privileged allocations.
+ */
+ if (gfp_mask & __GFP_ATOMIC)
+ goto force;
+
+ /*
* Unlike in global OOM situations, memcg is not in a physical
* memory shortage. Allow dying and OOM-killed tasks to
* bypass the last charges so that they can exit quickly and
{
int node;
- /*
- * Flush percpu vmstats and vmevents to guarantee the value correctness
- * on parent's and all ancestor levels.
- */
- memcg_flush_percpu_vmstats(memcg, false);
- memcg_flush_percpu_vmevents(memcg);
for_each_node(node)
free_mem_cgroup_per_node_info(memcg, node);
free_percpu(memcg->vmstats_percpu);
static void mem_cgroup_free(struct mem_cgroup *memcg)
{
memcg_wb_domain_exit(memcg);
+ /*
+ * Flush percpu vmstats and vmevents to guarantee the value correctness
+ * on parent's and all ancestor levels.
+ */
+ memcg_flush_percpu_vmstats(memcg, false);
+ memcg_flush_percpu_vmevents(memcg);
__mem_cgroup_free(memcg);
}
zone->spanned_pages;
/* No need to lock the zones, they can't change. */
+ if (!zone->spanned_pages)
+ continue;
+ if (!node_end_pfn) {
+ node_start_pfn = zone->zone_start_pfn;
+ node_end_pfn = zone_end_pfn;
+ continue;
+ }
+
if (zone_end_pfn > node_end_pfn)
node_end_pfn = zone_end_pfn;
if (zone->zone_start_pfn < node_start_pfn)
mn->ops->invalidate_range_start, _ret,
!mmu_notifier_range_blockable(range) ? "non-" : "");
WARN_ON(mmu_notifier_range_blockable(range) ||
- ret != -EAGAIN);
+ _ret != -EAGAIN);
ret = _ret;
}
}
wait_for_completion(&pgdat_init_all_done_comp);
/*
+ * The number of managed pages has changed due to the initialisation
+ * so the pcpu batch and high limits needs to be updated or the limits
+ * will be artificially small.
+ */
+ for_each_populated_zone(zone)
+ zone_pcp_update(zone);
+
+ /*
* We initialized the rest of the deferred pages. Permanently disable
* on-demand struct page initialization.
*/
static void warn_alloc_show_mem(gfp_t gfp_mask, nodemask_t *nodemask)
{
unsigned int filter = SHOW_MEM_FILTER_NODES;
- static DEFINE_RATELIMIT_STATE(show_mem_rs, HZ, 1);
-
- if (!__ratelimit(&show_mem_rs))
- return;
/*
* This documents exceptions given to allocations in certain
{
struct va_format vaf;
va_list args;
- static DEFINE_RATELIMIT_STATE(nopage_rs, DEFAULT_RATELIMIT_INTERVAL,
- DEFAULT_RATELIMIT_BURST);
+ static DEFINE_RATELIMIT_STATE(nopage_rs, 10*HZ, 1);
if ((gfp_mask & __GFP_NOWARN) || !__ratelimit(&nopage_rs))
return;
WARN(count != 0, "%d pages are still in use!\n", count);
}
-#ifdef CONFIG_MEMORY_HOTPLUG
/*
* The zone indicated has a new number of managed_pages; batch sizes and percpu
* page high values need to be recalulated.
per_cpu_ptr(zone->pageset, cpu));
mutex_unlock(&pcp_batch_high_lock);
}
-#endif
void zone_pcp_reset(struct zone *zone)
{
* Expects a pointer to a slab page. Please note, that PageSlab() check
* isn't sufficient, as it returns true also for tail compound slab pages,
* which do not have slab_cache pointer set.
- * So this function assumes that the page can pass PageHead() and PageSlab()
- * checks.
+ * So this function assumes that the page can pass PageSlab() && !PageTail()
+ * check.
*
* The kmem_cache can be reparented asynchronously. The caller must ensure
* the memcg lifetime, e.g. by taking rcu_read_lock() or cgroup_mutex.
unsigned long freecount = 0;
struct free_area *area;
struct list_head *curr;
+ bool overflow = false;
area = &(zone->free_area[order]);
- list_for_each(curr, &area->free_list[mtype])
- freecount++;
- seq_printf(m, "%6lu ", freecount);
+ list_for_each(curr, &area->free_list[mtype]) {
+ /*
+ * Cap the free_list iteration because it might
+ * be really large and we are under a spinlock
+ * so a long time spent here could trigger a
+ * hard lockup detector. Anyway this is a
+ * debugging tool so knowing there is a handful
+ * of pages of this order should be more than
+ * sufficient.
+ */
+ if (++freecount >= 100000) {
+ overflow = true;
+ break;
+ }
+ }
+ seq_printf(m, "%s%6lu ", overflow ? ">" : "", freecount);
+ spin_unlock_irq(&zone->lock);
+ cond_resched();
+ spin_lock_irq(&zone->lock);
}
seq_putc(m, '\n');
}
#endif
#ifdef CONFIG_PROC_FS
proc_create_seq("buddyinfo", 0444, NULL, &fragmentation_op);
- proc_create_seq("pagetypeinfo", 0444, NULL, &pagetypeinfo_op);
+ proc_create_seq("pagetypeinfo", 0400, NULL, &pagetypeinfo_op);
proc_create_seq("vmstat", 0444, NULL, &vmstat_op);
proc_create_seq("zoneinfo", 0444, NULL, &zoneinfo_op);
#endif
ebt_dnat_tg(struct sk_buff *skb, const struct xt_action_param *par)
{
const struct ebt_nat_info *info = par->targinfo;
- struct net_device *dev;
if (skb_ensure_writable(skb, ETH_ALEN))
return EBT_DROP;
else
skb->pkt_type = PACKET_MULTICAST;
} else {
- if (xt_hooknum(par) != NF_BR_BROUTING)
- dev = br_port_get_rcu(xt_in(par))->br->dev;
- else
+ const struct net_device *dev;
+
+ switch (xt_hooknum(par)) {
+ case NF_BR_BROUTING:
dev = xt_in(par);
+ break;
+ case NF_BR_PRE_ROUTING:
+ dev = br_port_get_rcu(xt_in(par))->br->dev;
+ break;
+ default:
+ dev = NULL;
+ break;
+ }
+
+ if (!dev) /* NF_BR_LOCAL_OUT */
+ return info->target;
if (ether_addr_equal(info->mac, dev->dev_addr))
skb->pkt_type = PACKET_HOST;
j1939_netdev_stop(priv);
}
+ kfree(jsk->filters);
sock_orphan(sk);
sock->sk = NULL;
memset(serr, 0, sizeof(*serr));
switch (type) {
case J1939_ERRQUEUE_ACK:
- if (!(sk->sk_tsflags & SOF_TIMESTAMPING_TX_ACK))
+ if (!(sk->sk_tsflags & SOF_TIMESTAMPING_TX_ACK)) {
+ kfree_skb(skb);
return;
+ }
serr->ee.ee_errno = ENOMSG;
serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
state = "ACK";
break;
case J1939_ERRQUEUE_SCHED:
- if (!(sk->sk_tsflags & SOF_TIMESTAMPING_TX_SCHED))
+ if (!(sk->sk_tsflags & SOF_TIMESTAMPING_TX_SCHED)) {
+ kfree_skb(skb);
return;
+ }
serr->ee.ee_errno = ENOMSG;
serr->ee.ee_origin = SO_EE_ORIGIN_TIMESTAMPING;
static void
j1939_xtp_rx_eoma_one(struct j1939_session *session, struct sk_buff *skb)
{
+ struct j1939_sk_buff_cb *skcb = j1939_skb_to_cb(skb);
+ const u8 *dat;
+ int len;
+
if (j1939_xtp_rx_cmd_bad_pgn(session, skb))
return;
+ dat = skb->data;
+
+ if (skcb->addr.type == J1939_ETP)
+ len = j1939_etp_ctl_to_size(dat);
+ else
+ len = j1939_tp_ctl_to_size(dat);
+
+ if (session->total_message_size != len) {
+ netdev_warn_once(session->priv->ndev,
+ "%s: 0x%p: Incorrect size. Expected: %i; got: %i.\n",
+ __func__, session, session->total_message_size,
+ len);
+ }
+
netdev_dbg(session->priv->ndev, "%s: 0x%p\n", __func__, session);
session->pkt.tx_acked = session->pkt.total;
skcb = j1939_skb_to_cb(skb);
memcpy(skcb, rel_skcb, sizeof(*skcb));
- session = j1939_session_new(priv, skb, skb->len);
+ session = j1939_session_new(priv, skb, size);
if (!session) {
kfree_skb(skb);
return NULL;
msg->sg.data[i].length -= trim;
sk_mem_uncharge(sk, trim);
+ /* Adjust copybreak if it falls into the trimmed part of last buf */
+ if (msg->sg.curr == i && msg->sg.copybreak > msg->sg.data[i].length)
+ msg->sg.copybreak = msg->sg.data[i].length;
out:
- /* If we trim data before curr pointer update copybreak and current
- * so that any future copy operations start at new copy location.
+ sk_msg_iter_var_next(i);
+ msg->sg.end = i;
+
+ /* If we trim data a full sg elem before curr pointer update
+ * copybreak and current so that any future copy operations
+ * start at new copy location.
* However trimed data that has not yet been used in a copy op
* does not require an update.
*/
- if (msg->sg.curr >= i) {
+ if (!msg->sg.size) {
+ msg->sg.curr = msg->sg.start;
+ msg->sg.copybreak = 0;
+ } else if (sk_msg_iter_dist(msg->sg.start, msg->sg.curr) >=
+ sk_msg_iter_dist(msg->sg.start, msg->sg.end)) {
+ sk_msg_iter_var_prev(i);
msg->sg.curr = i;
msg->sg.copybreak = msg->sg.data[i].length;
}
- sk_msg_iter_var_next(i);
- msg->sg.end = i;
}
EXPORT_SYMBOL_GPL(sk_msg_trim);
RCU_INIT_POINTER(newinet->inet_opt, rcu_dereference(ireq->ireq_opt));
newinet->mc_index = inet_iif(skb);
newinet->mc_ttl = ip_hdr(skb)->ttl;
- newinet->inet_id = jiffies;
+ newinet->inet_id = prandom_u32();
if (dst == NULL && (dst = inet_csk_route_child_sock(sk, newsk, req)) == NULL)
goto put_and_exit;
int ret = 0;
unsigned int hash = fib_laddr_hashfn(local);
struct hlist_head *head = &fib_info_laddrhash[hash];
+ int tb_id = l3mdev_fib_table(dev) ? : RT_TABLE_MAIN;
struct net *net = dev_net(dev);
- int tb_id = l3mdev_fib_table(dev);
struct fib_info *fi;
if (!fib_info_laddrhash || local == 0)
{
struct __rt6_probe_work *work = NULL;
const struct in6_addr *nh_gw;
+ unsigned long last_probe;
struct neighbour *neigh;
struct net_device *dev;
struct inet6_dev *idev;
nh_gw = &fib6_nh->fib_nh_gw6;
dev = fib6_nh->fib_nh_dev;
rcu_read_lock_bh();
+ last_probe = READ_ONCE(fib6_nh->last_probe);
idev = __in6_dev_get(dev);
neigh = __ipv6_neigh_lookup_noref(dev, nh_gw);
if (neigh) {
__neigh_set_probe_once(neigh);
}
write_unlock(&neigh->lock);
- } else if (time_after(jiffies, fib6_nh->last_probe +
+ } else if (time_after(jiffies, last_probe +
idev->cnf.rtr_probe_interval)) {
work = kmalloc(sizeof(*work), GFP_ATOMIC);
}
- if (work) {
- fib6_nh->last_probe = jiffies;
+ if (!work || cmpxchg(&fib6_nh->last_probe,
+ last_probe, jiffies) != last_probe) {
+ kfree(work);
+ } else {
INIT_WORK(&work->work, rt6_probe_deferred);
work->target = *nh_gw;
dev_hold(dev);
int err;
fib6_nh->fib_nh_family = AF_INET6;
+#ifdef CONFIG_IPV6_ROUTER_PREF
+ fib6_nh->last_probe = jiffies;
+#endif
err = -ENODEV;
if (cfg->fc_ifindex) {
ieee80211_remove_interfaces(local);
fail_rate:
rtnl_unlock();
- ieee80211_led_exit(local);
fail_flows:
+ ieee80211_led_exit(local);
destroy_workqueue(local->workqueue);
fail_workqueue:
wiphy_unregister(local->hw.wiphy);
{
struct ieee80211_sta_rx_stats *stats = sta_get_last_rx_stats(sta);
- if (time_after(stats->last_rx, sta->status_stats.last_ack))
+ if (!sta->status_stats.last_ack ||
+ time_after(stats->last_rx, sta->status_stats.last_ack))
return stats->last_rx;
return sta->status_stats.last_ack;
}
if (unlikely(!flag_nested(nla)))
return -IPSET_ERR_PROTOCOL;
- if (nla_parse_nested_deprecated(tb, IPSET_ATTR_IPADDR_MAX, nla, ipaddr_policy, NULL))
+ if (nla_parse_nested(tb, IPSET_ATTR_IPADDR_MAX, nla,
+ ipaddr_policy, NULL))
return -IPSET_ERR_PROTOCOL;
if (unlikely(!ip_set_attr_netorder(tb, IPSET_ATTR_IPADDR_IPV4)))
return -IPSET_ERR_PROTOCOL;
if (unlikely(!flag_nested(nla)))
return -IPSET_ERR_PROTOCOL;
- if (nla_parse_nested_deprecated(tb, IPSET_ATTR_IPADDR_MAX, nla, ipaddr_policy, NULL))
+ if (nla_parse_nested(tb, IPSET_ATTR_IPADDR_MAX, nla,
+ ipaddr_policy, NULL))
return -IPSET_ERR_PROTOCOL;
if (unlikely(!ip_set_attr_netorder(tb, IPSET_ATTR_IPADDR_IPV6)))
return -IPSET_ERR_PROTOCOL;
/* Without holding any locks, create private part. */
if (attr[IPSET_ATTR_DATA] &&
- nla_parse_nested_deprecated(tb, IPSET_ATTR_CREATE_MAX, attr[IPSET_ATTR_DATA], set->type->create_policy, NULL)) {
+ nla_parse_nested(tb, IPSET_ATTR_CREATE_MAX, attr[IPSET_ATTR_DATA],
+ set->type->create_policy, NULL)) {
ret = -IPSET_ERR_PROTOCOL;
goto put_out;
}
}
}
+static const struct nla_policy
+ip_set_dump_policy[IPSET_ATTR_CMD_MAX + 1] = {
+ [IPSET_ATTR_PROTOCOL] = { .type = NLA_U8 },
+ [IPSET_ATTR_SETNAME] = { .type = NLA_NUL_STRING,
+ .len = IPSET_MAXNAMELEN - 1 },
+ [IPSET_ATTR_FLAGS] = { .type = NLA_U32 },
+};
+
static int
dump_init(struct netlink_callback *cb, struct ip_set_net *inst)
{
ip_set_id_t index;
int ret;
- ret = nla_parse_deprecated(cda, IPSET_ATTR_CMD_MAX, attr,
- nlh->nlmsg_len - min_len,
- ip_set_setname_policy, NULL);
+ ret = nla_parse(cda, IPSET_ATTR_CMD_MAX, attr,
+ nlh->nlmsg_len - min_len,
+ ip_set_dump_policy, NULL);
if (ret)
return ret;
memcpy(&errmsg->msg, nlh, nlh->nlmsg_len);
cmdattr = (void *)&errmsg->msg + min_len;
- ret = nla_parse_deprecated(cda, IPSET_ATTR_CMD_MAX, cmdattr,
- nlh->nlmsg_len - min_len,
- ip_set_adt_policy, NULL);
+ ret = nla_parse(cda, IPSET_ATTR_CMD_MAX, cmdattr,
+ nlh->nlmsg_len - min_len, ip_set_adt_policy,
+ NULL);
if (ret) {
nlmsg_free(skb2);
use_lineno = !!attr[IPSET_ATTR_LINENO];
if (attr[IPSET_ATTR_DATA]) {
- if (nla_parse_nested_deprecated(tb, IPSET_ATTR_ADT_MAX, attr[IPSET_ATTR_DATA], set->type->adt_policy, NULL))
+ if (nla_parse_nested(tb, IPSET_ATTR_ADT_MAX,
+ attr[IPSET_ATTR_DATA],
+ set->type->adt_policy, NULL))
return -IPSET_ERR_PROTOCOL;
ret = call_ad(ctnl, skb, set, tb, adt, flags,
use_lineno);
nla_for_each_nested(nla, attr[IPSET_ATTR_ADT], nla_rem) {
if (nla_type(nla) != IPSET_ATTR_DATA ||
!flag_nested(nla) ||
- nla_parse_nested_deprecated(tb, IPSET_ATTR_ADT_MAX, nla, set->type->adt_policy, NULL))
+ nla_parse_nested(tb, IPSET_ATTR_ADT_MAX, nla,
+ set->type->adt_policy, NULL))
return -IPSET_ERR_PROTOCOL;
ret = call_ad(ctnl, skb, set, tb, adt,
flags, use_lineno);
if (!set)
return -ENOENT;
- if (nla_parse_nested_deprecated(tb, IPSET_ATTR_ADT_MAX, attr[IPSET_ATTR_DATA], set->type->adt_policy, NULL))
+ if (nla_parse_nested(tb, IPSET_ATTR_ADT_MAX, attr[IPSET_ATTR_DATA],
+ set->type->adt_policy, NULL))
return -IPSET_ERR_PROTOCOL;
rcu_read_lock_bh();
[IPSET_CMD_LIST] = {
.call = ip_set_dump,
.attr_count = IPSET_ATTR_CMD_MAX,
- .policy = ip_set_setname_policy,
+ .policy = ip_set_dump_policy,
},
[IPSET_CMD_SAVE] = {
.call = ip_set_dump,
}
req_version->version = IPSET_PROTOCOL;
- ret = copy_to_user(user, req_version,
- sizeof(struct ip_set_req_version));
+ if (copy_to_user(user, req_version,
+ sizeof(struct ip_set_req_version)))
+ ret = -EFAULT;
goto done;
}
case IP_SET_OP_GET_BYNAME: {
} /* end of switch(op) */
copy:
- ret = copy_to_user(user, data, copylen);
+ if (copy_to_user(user, data, copylen))
+ ret = -EFAULT;
done:
vfree(data);
(skb_mac_header(skb) + ETH_HLEN) > skb->data)
return -EINVAL;
- if (opt->flags & IPSET_DIM_ONE_SRC)
+ if (opt->flags & IPSET_DIM_TWO_SRC)
ether_addr_copy(e.ether, eth_hdr(skb)->h_source);
else
ether_addr_copy(e.ether, eth_hdr(skb)->h_dest);
[IPSET_ATTR_IP_TO] = { .type = NLA_NESTED },
[IPSET_ATTR_CIDR] = { .type = NLA_U8 },
[IPSET_ATTR_TIMEOUT] = { .type = NLA_U32 },
+ [IPSET_ATTR_LINENO] = { .type = NLA_U32 },
[IPSET_ATTR_CADT_FLAGS] = { .type = NLA_U32 },
[IPSET_ATTR_BYTES] = { .type = NLA_U64 },
[IPSET_ATTR_PACKETS] = { .type = NLA_U64 },
[IPSET_ATTR_CIDR] = { .type = NLA_U8 },
[IPSET_ATTR_CIDR2] = { .type = NLA_U8 },
[IPSET_ATTR_TIMEOUT] = { .type = NLA_U32 },
+ [IPSET_ATTR_LINENO] = { .type = NLA_U32 },
[IPSET_ATTR_CADT_FLAGS] = { .type = NLA_U32 },
[IPSET_ATTR_BYTES] = { .type = NLA_U64 },
[IPSET_ATTR_PACKETS] = { .type = NLA_U64 },
if (nlh->nlmsg_flags & NLM_F_REPLACE)
return -EOPNOTSUPP;
+ flags |= chain->flags & NFT_BASE_CHAIN;
return nf_tables_updchain(&ctx, genmask, policy, flags);
}
struct nft_trans *trans;
int err;
- if (!obj->ops->update)
- return -EOPNOTSUPP;
-
trans = nft_trans_alloc(ctx, NFT_MSG_NEWOBJ,
sizeof(struct nft_trans_obj));
if (!trans)
obj = nft_trans_obj(trans);
newobj = nft_trans_obj_newobj(trans);
- obj->ops->update(obj, newobj);
+ if (obj->ops->update)
+ obj->ops->update(obj, newobj);
kfree(newobj);
}
switch (trans->msg_type) {
case NFT_MSG_NEWCHAIN:
- if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD))
+ if (!(trans->ctx.chain->flags & NFT_CHAIN_HW_OFFLOAD) ||
+ nft_trans_chain_update(trans))
continue;
policy = nft_trans_chain_policy(trans);
const struct nft_expr *expr)
{
const struct nft_bitwise *priv = nft_expr_priv(expr);
+ struct nft_offload_reg *reg = &ctx->regs[priv->dreg];
if (memcmp(&priv->xor, &zero, sizeof(priv->xor)) ||
- priv->sreg != priv->dreg)
+ priv->sreg != priv->dreg || priv->len != reg->len)
return -EOPNOTSUPP;
- memcpy(&ctx->regs[priv->dreg].mask, &priv->mask, sizeof(priv->mask));
+ memcpy(®->mask, &priv->mask, sizeof(priv->mask));
return 0;
}
u8 *mask = (u8 *)&flow->match.mask;
u8 *key = (u8 *)&flow->match.key;
- if (priv->op != NFT_CMP_EQ)
+ if (priv->op != NFT_CMP_EQ || reg->len != priv->len)
return -EOPNOTSUPP;
memcpy(key + reg->offset, &priv->data, priv->len);
local = nfc_llcp_find_local(dev);
if (!local) {
- nfc_put_device(dev);
rc = -ENODEV;
goto exit;
}
local = nfc_llcp_find_local(dev);
if (!local) {
- nfc_put_device(dev);
rc = -ENODEV;
goto exit;
}
#include <linux/slab.h>
#include <linux/idr.h>
#include <linux/rhashtable.h>
+#include <linux/jhash.h>
#include <net/net_namespace.h>
#include <net/sock.h>
#include <net/netlink.h>
/* Protects list of registered TC modules. It is pure SMP lock. */
static DEFINE_RWLOCK(cls_mod_lock);
+static u32 destroy_obj_hashfn(const struct tcf_proto *tp)
+{
+ return jhash_3words(tp->chain->index, tp->prio,
+ (__force __u32)tp->protocol, 0);
+}
+
+static void tcf_proto_signal_destroying(struct tcf_chain *chain,
+ struct tcf_proto *tp)
+{
+ struct tcf_block *block = chain->block;
+
+ mutex_lock(&block->proto_destroy_lock);
+ hash_add_rcu(block->proto_destroy_ht, &tp->destroy_ht_node,
+ destroy_obj_hashfn(tp));
+ mutex_unlock(&block->proto_destroy_lock);
+}
+
+static bool tcf_proto_cmp(const struct tcf_proto *tp1,
+ const struct tcf_proto *tp2)
+{
+ return tp1->chain->index == tp2->chain->index &&
+ tp1->prio == tp2->prio &&
+ tp1->protocol == tp2->protocol;
+}
+
+static bool tcf_proto_exists_destroying(struct tcf_chain *chain,
+ struct tcf_proto *tp)
+{
+ u32 hash = destroy_obj_hashfn(tp);
+ struct tcf_proto *iter;
+ bool found = false;
+
+ rcu_read_lock();
+ hash_for_each_possible_rcu(chain->block->proto_destroy_ht, iter,
+ destroy_ht_node, hash) {
+ if (tcf_proto_cmp(tp, iter)) {
+ found = true;
+ break;
+ }
+ }
+ rcu_read_unlock();
+
+ return found;
+}
+
+static void
+tcf_proto_signal_destroyed(struct tcf_chain *chain, struct tcf_proto *tp)
+{
+ struct tcf_block *block = chain->block;
+
+ mutex_lock(&block->proto_destroy_lock);
+ if (hash_hashed(&tp->destroy_ht_node))
+ hash_del_rcu(&tp->destroy_ht_node);
+ mutex_unlock(&block->proto_destroy_lock);
+}
+
/* Find classifier type by string name */
static const struct tcf_proto_ops *__tcf_proto_lookup_ops(const char *kind)
static void tcf_chain_put(struct tcf_chain *chain);
static void tcf_proto_destroy(struct tcf_proto *tp, bool rtnl_held,
- struct netlink_ext_ack *extack)
+ bool sig_destroy, struct netlink_ext_ack *extack)
{
tp->ops->destroy(tp, rtnl_held, extack);
+ if (sig_destroy)
+ tcf_proto_signal_destroyed(tp->chain, tp);
tcf_chain_put(tp->chain);
module_put(tp->ops->owner);
kfree_rcu(tp, rcu);
struct netlink_ext_ack *extack)
{
if (refcount_dec_and_test(&tp->refcnt))
- tcf_proto_destroy(tp, rtnl_held, extack);
+ tcf_proto_destroy(tp, rtnl_held, true, extack);
}
static int walker_check_empty(struct tcf_proto *tp, void *fh,
static void tcf_block_destroy(struct tcf_block *block)
{
mutex_destroy(&block->lock);
+ mutex_destroy(&block->proto_destroy_lock);
kfree_rcu(block, rcu);
}
mutex_lock(&chain->filter_chain_lock);
tp = tcf_chain_dereference(chain->filter_chain, chain);
+ while (tp) {
+ tp_next = rcu_dereference_protected(tp->next, 1);
+ tcf_proto_signal_destroying(chain, tp);
+ tp = tp_next;
+ }
+ tp = tcf_chain_dereference(chain->filter_chain, chain);
RCU_INIT_POINTER(chain->filter_chain, NULL);
tcf_chain0_head_change(chain, NULL);
chain->flushing = true;
return ERR_PTR(-ENOMEM);
}
mutex_init(&block->lock);
+ mutex_init(&block->proto_destroy_lock);
init_rwsem(&block->cb_lock);
flow_block_init(&block->flow_block);
INIT_LIST_HEAD(&block->chain_list);
mutex_lock(&chain->filter_chain_lock);
+ if (tcf_proto_exists_destroying(chain, tp_new)) {
+ mutex_unlock(&chain->filter_chain_lock);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
+ return ERR_PTR(-EAGAIN);
+ }
+
tp = tcf_chain_tp_find(chain, &chain_info,
protocol, prio, false);
if (!tp)
mutex_unlock(&chain->filter_chain_lock);
if (tp) {
- tcf_proto_destroy(tp_new, rtnl_held, NULL);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
tp_new = tp;
} else if (err) {
- tcf_proto_destroy(tp_new, rtnl_held, NULL);
+ tcf_proto_destroy(tp_new, rtnl_held, false, NULL);
tp_new = ERR_PTR(err);
}
return;
}
+ tcf_proto_signal_destroying(chain, tp);
next = tcf_chain_dereference(chain_info.next, chain);
if (tp == chain->filter_chain)
tcf_chain0_head_change(chain, next);
err = -EINVAL;
goto errout_locked;
} else if (t->tcm_handle == 0) {
+ tcf_proto_signal_destroying(chain, tp);
tcf_chain_tp_remove(chain, &chain_info, tp);
mutex_unlock(&chain->filter_chain_lock);
goto done;
}
- taprio_offload_config_changed(q);
-
done:
taprio_offload_free(offload);
call_rcu(&admin->rcu, taprio_free_sched_cb);
spin_unlock_irqrestore(&q->current_entry_lock, flags);
+
+ if (FULL_OFFLOAD_IS_ENABLED(taprio_flags))
+ taprio_offload_config_changed(q);
}
new_admin = NULL;
return 0;
error:
- if (pnetelem->ndev)
- dev_put(pnetelem->ndev);
return rc;
}
int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
unsigned char record_type = TLS_RECORD_TYPE_DATA;
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
int rc;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
if (unlikely(msg->msg_controllen)) {
out:
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return rc;
}
int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
struct iov_iter msg_iter;
char *kaddr = kmap(page);
struct kvec iov;
if (flags & MSG_SENDPAGE_NOTLAST)
flags |= MSG_MORE;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
if (flags & MSG_OOB) {
out:
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return rc;
}
void tls_device_write_space(struct sock *sk, struct tls_context *ctx)
{
- if (!sk->sk_write_pending && tls_is_partially_sent_record(ctx)) {
+ if (tls_is_partially_sent_record(ctx)) {
gfp_t sk_allocation = sk->sk_allocation;
+ WARN_ON_ONCE(sk->sk_write_pending);
+
sk->sk_allocation = GFP_ATOMIC;
tls_push_partial_record(sk, ctx,
MSG_DONTWAIT | MSG_NOSIGNAL |
memzero_explicit(&ctx->crypto_send, sizeof(ctx->crypto_send));
memzero_explicit(&ctx->crypto_recv, sizeof(ctx->crypto_recv));
+ mutex_destroy(&ctx->tx_lock);
if (sk)
kfree_rcu(ctx, rcu);
if (!ctx)
return NULL;
+ mutex_init(&ctx->tx_lock);
rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
ctx->sk_proto = sk->sk_prot;
return ctx;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL))
return -ENOTSUPP;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
- /* Wait till there is any pending write on socket */
- if (unlikely(sk->sk_write_pending)) {
- ret = wait_on_pending_writer(sk, &timeo);
- if (unlikely(ret))
- goto send_end;
- }
-
if (unlikely(msg->msg_controllen)) {
ret = tls_proccess_cmsg(sk, msg, &record_type);
if (ret) {
ret = sk_stream_error(sk, msg->msg_flags, ret);
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return copied ? copied : ret;
}
eor = !(flags & (MSG_MORE | MSG_SENDPAGE_NOTLAST));
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
- /* Wait till there is any pending write on socket */
- if (unlikely(sk->sk_write_pending)) {
- ret = wait_on_pending_writer(sk, &timeo);
- if (unlikely(ret))
- goto sendpage_end;
- }
-
/* Call the sk_stream functions to manage the sndbuf mem. */
while (size > 0) {
size_t copy, required_size;
int tls_sw_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags)
{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
int ret;
if (flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL |
MSG_SENDPAGE_NOTLAST | MSG_SENDPAGE_NOPOLICY))
return -ENOTSUPP;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
ret = tls_sw_do_sendpage(sk, page, offset, size, flags);
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
return ret;
}
if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
return;
+ mutex_lock(&tls_ctx->tx_lock);
lock_sock(sk);
tls_tx_records(sk, -1);
release_sock(sk);
+ mutex_unlock(&tls_ctx->tx_lock);
}
void tls_sw_write_space(struct sock *sk, struct tls_context *ctx)
struct tls_sw_context_tx *tx_ctx = tls_sw_ctx_tx(ctx);
/* Schedule the transmission if tx list is ready */
- if (is_tx_ready(tx_ctx) && !sk->sk_write_pending) {
- /* Schedule the transmission */
- if (!test_and_set_bit(BIT_TX_SCHEDULED,
- &tx_ctx->tx_bitmask))
- schedule_delayed_work(&tx_ctx->tx_work.work, 0);
- }
+ if (is_tx_ready(tx_ctx) &&
+ !test_and_set_bit(BIT_TX_SCHEDULED, &tx_ctx->tx_bitmask))
+ schedule_delayed_work(&tx_ctx->tx_work.work, 0);
}
void tls_sw_strparser_arm(struct sock *sk, struct tls_context *tls_ctx)
if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SHUTDOWN_SEND)
vsk->peer_shutdown |= SEND_SHUTDOWN;
if (vsk->peer_shutdown == SHUTDOWN_MASK &&
- vsock_stream_has_data(vsk) <= 0) {
- sock_set_flag(sk, SOCK_DONE);
- sk->sk_state = TCP_CLOSING;
+ vsock_stream_has_data(vsk) <= 0 &&
+ !sock_flag(sk, SOCK_DONE)) {
+ (void)virtio_transport_reset(vsk, NULL);
+
+ virtio_transport_do_close(vsk, true);
}
if (le32_to_cpu(pkt->hdr.flags))
sk->sk_state_change(sk);
KBUILD_HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/
KBUILD_HOSTCFLAGS += -I$(srctree)/tools/lib/ -I$(srctree)/tools/include
KBUILD_HOSTCFLAGS += -I$(srctree)/tools/perf
+KBUILD_HOSTCFLAGS += -DHAVE_ATTR_TEST=0
HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
attrs[n]['name'].string(): attrs[n]['address']
for n in range(int(sect_attrs['nsections']))}
args = []
- for section_name in [".data", ".data..read_mostly", ".rodata", ".bss"]:
+ for section_name in [".data", ".data..read_mostly", ".rodata", ".bss",
+ ".text", ".text.hot", ".text.unlikely"]:
address = section_name_to_address.get(section_name)
if address:
args.append(" -s {name} {addr}".format(
local mod_file=`echo $@ | sed -e 's/\.ko/\.mod/'`
local ns_deps_file=`echo $@ | sed -e 's/\.ko/\.ns_deps/'`
if [ ! -f "$ns_deps_file" ]; then return; fi
- local mod_source_files=`cat $mod_file | sed -n 1p \
+ local mod_source_files="`cat $mod_file | sed -n 1p \
| sed -e 's/\.o/\.c/g' \
- | sed "s|[^ ]* *|${srctree}/&|g"`
+ | sed "s|[^ ]* *|${srctree}/&|g"`"
for ns in `cat $ns_deps_file`; do
echo "Adding namespace $ns to module $mod_name (if needed)."
- generate_deps_for_ns $ns $mod_source_files
+ generate_deps_for_ns $ns "$mod_source_files"
# sort the imports
for source_file in $mod_source_files; do
sed '/MODULE_IMPORT_NS/Q' $source_file > ${source_file}.tmp
{
/* first let's check the buffer parameter's */
if (params->buffer.fragment_size == 0 ||
- params->buffer.fragments > INT_MAX / params->buffer.fragment_size ||
+ params->buffer.fragments > U32_MAX / params->buffer.fragment_size ||
params->buffer.fragments == 0)
return -EINVAL;
goto unlock;
}
if (!list_empty(&timer->open_list_head)) {
- timeri = list_entry(timer->open_list_head.next,
+ struct snd_timer_instance *t =
+ list_entry(timer->open_list_head.next,
struct snd_timer_instance, open_list);
- if (timeri->flags & SNDRV_TIMER_IFLG_EXCLUSIVE) {
+ if (t->flags & SNDRV_TIMER_IFLG_EXCLUSIVE) {
err = -EBUSY;
- timeri = NULL;
goto unlock;
}
}
#define SAFFIRE_CLOCK_SOURCE_SPDIF 1
/* clock sources as returned from register of Saffire Pro 10 and 26 */
+#define SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK 0x000000ff
+#define SAFFIREPRO_CLOCK_SOURCE_DETECT_MASK 0x0000ff00
#define SAFFIREPRO_CLOCK_SOURCE_INTERNAL 0
#define SAFFIREPRO_CLOCK_SOURCE_SKIP 1 /* never used on hardware */
#define SAFFIREPRO_CLOCK_SOURCE_SPDIF 2
map = saffirepro_clk_maps[1];
/* In a case that this driver cannot handle the value of register. */
+ value &= SAFFIREPRO_CLOCK_SOURCE_SELECT_MASK;
if (value >= SAFFIREPRO_CLOCK_SOURCE_COUNT || map[value] < 0) {
err = -EIO;
goto end;
/* Delay enabling the HP amp, to let the mic-detection
* state machine run.
*/
- cancel_delayed_work_sync(&spec->unsol_hp_work);
+ cancel_delayed_work(&spec->unsol_hp_work);
schedule_delayed_work(&spec->unsol_hp_work, msecs_to_jiffies(500));
tbl = snd_hda_jack_tbl_get(codec, cb->nid);
if (tbl)
return intel_hsw_common_init(codec, 0x02, map, ARRAY_SIZE(map));
}
+static int patch_i915_tgl_hdmi(struct hda_codec *codec)
+{
+ /*
+ * pin to port mapping table where the value indicate the pin number and
+ * the index indicate the port number with 1 base.
+ */
+ static const int map[] = {0x4, 0x6, 0x8, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf};
+
+ return intel_hsw_common_init(codec, 0x02, map, ARRAY_SIZE(map));
+}
+
+
/* Intel Baytrail and Braswell; with eld notifier */
static int patch_i915_byt_hdmi(struct hda_codec *codec)
{
HDA_CODEC_ENTRY(0x8086280c, "Cannonlake HDMI", patch_i915_glk_hdmi),
HDA_CODEC_ENTRY(0x8086280d, "Geminilake HDMI", patch_i915_glk_hdmi),
HDA_CODEC_ENTRY(0x8086280f, "Icelake HDMI", patch_i915_icl_hdmi),
+HDA_CODEC_ENTRY(0x80862812, "Tigerlake HDMI", patch_i915_tgl_hdmi),
HDA_CODEC_ENTRY(0x80862880, "CedarTrail HDMI", patch_generic_hdmi),
HDA_CODEC_ENTRY(0x80862882, "Valleyview2 HDMI", patch_i915_byt_hdmi),
HDA_CODEC_ENTRY(0x80862883, "Braswell HDMI", patch_i915_byt_hdmi),
return;
}
- snd_hdac_ext_bus_link_put(hdev->bus, hlink);
pm_runtime_disable(&hdev->dev);
+ snd_hdac_ext_bus_link_put(hdev->bus, hlink);
}
static const struct snd_soc_dapm_route hdac_hda_dapm_routes[] = {
uint8_t eld[MAX_ELD_BYTES];
struct snd_pcm_chmap *chmap_info;
unsigned int chmap_idx;
- struct mutex lock;
+ unsigned long busy;
struct snd_soc_jack *jack;
unsigned int jack_status;
};
struct hdmi_codec_priv *hcp = snd_soc_dai_get_drvdata(dai);
int ret = 0;
- ret = mutex_trylock(&hcp->lock);
- if (!ret) {
+ ret = test_and_set_bit(0, &hcp->busy);
+ if (ret) {
dev_err(dai->dev, "Only one simultaneous stream supported!\n");
return -EINVAL;
}
err:
/* Release the exclusive lock on error */
- mutex_unlock(&hcp->lock);
+ clear_bit(0, &hcp->busy);
return ret;
}
hcp->chmap_idx = HDMI_CODEC_CHMAP_IDX_UNKNOWN;
hcp->hcd.ops->audio_shutdown(dai->dev->parent, hcp->hcd.data);
- mutex_unlock(&hcp->lock);
+ clear_bit(0, &hcp->busy);
}
static int hdmi_codec_hw_params(struct snd_pcm_substream *substream,
return -ENOMEM;
hcp->hcd = *hcd;
- mutex_init(&hcp->lock);
-
daidrv = devm_kcalloc(dev, dai_count, sizeof(*daidrv), GFP_KERNEL);
if (!daidrv)
return -ENOMEM;
/* Power on device */
if (gpio_is_valid(max98373->reset_gpio)) {
- ret = gpio_request(max98373->reset_gpio, "MAX98373_RESET");
+ ret = devm_gpio_request(&i2c->dev, max98373->reset_gpio,
+ "MAX98373_RESET");
if (ret) {
dev_err(&i2c->dev, "%s: Failed to request gpio %d\n",
__func__, max98373->reset_gpio);
- gpio_free(max98373->reset_gpio);
return -EINVAL;
}
gpio_direction_output(max98373->reset_gpio, 0);
};
static const char *const adc2_mux_text[] = { "ZERO", "INP2", "INP3" };
-static const char *const rdac2_mux_text[] = { "ZERO", "RX2", "RX1" };
+static const char *const rdac2_mux_text[] = { "RX1", "RX2" };
static const char *const hph_text[] = { "ZERO", "Switch", };
static const struct soc_enum hph_enum = SOC_ENUM_SINGLE_VIRT(
/* RDAC2 MUX */
static const struct soc_enum rdac2_mux_enum = SOC_ENUM_SINGLE(
- CDC_D_CDC_CONN_HPHR_DAC_CTL, 0, 3, rdac2_mux_text);
+ CDC_D_CDC_CONN_HPHR_DAC_CTL, 0, 2, rdac2_mux_text);
static const struct snd_kcontrol_new spkr_switch[] = {
SOC_DAPM_SINGLE("Switch", CDC_A_SPKR_DAC_CTL, 7, 1, 0)
return PTR_ERR(priv->clk);
}
- err = clk_prepare_enable(priv->clk);
- if (err < 0)
- return err;
-
priv->extclk = devm_clk_get(&pdev->dev, "extclk");
if (IS_ERR(priv->extclk)) {
if (PTR_ERR(priv->extclk) == -EPROBE_DEFER)
}
}
+ err = clk_prepare_enable(priv->clk);
+ if (err < 0)
+ return err;
+
/* Some sensible defaults - this reflects the powerup values */
priv->ctl_play = KIRKWOOD_PLAYCTL_SIZE_24;
priv->ctl_rec = KIRKWOOD_RECCTL_SIZE_24;
priv->ctl_rec |= KIRKWOOD_RECCTL_BURST_128;
}
- err = devm_snd_soc_register_component(&pdev->dev, &kirkwood_soc_component,
+ err = snd_soc_register_component(&pdev->dev, &kirkwood_soc_component,
soc_dai, 2);
if (err) {
dev_err(&pdev->dev, "snd_soc_register_component failed\n");
{
struct kirkwood_dma_data *priv = dev_get_drvdata(&pdev->dev);
+ snd_soc_unregister_component(&pdev->dev);
if (!IS_ERR(priv->extclk))
clk_disable_unprepare(priv->extclk);
clk_disable_unprepare(priv->clk);
struct snd_soc_jack *jack = (struct snd_soc_jack *)data;
struct snd_soc_dapm_context *dapm = &jack->card->dapm;
- if (event & SND_JACK_MICROPHONE)
+ if (event & SND_JACK_MICROPHONE) {
snd_soc_dapm_force_enable_pin(dapm, "MICBIAS");
- else
+ snd_soc_dapm_force_enable_pin(dapm, "SHDN");
+ } else {
snd_soc_dapm_disable_pin(dapm, "MICBIAS");
+ snd_soc_dapm_disable_pin(dapm, "SHDN");
+ }
snd_soc_dapm_sync(dapm);
#define RDMA_SSI_I_N(addr, i) (addr ##_reg - 0x00300000 + (0x40 * i) + 0x8)
#define RDMA_SSI_O_N(addr, i) (addr ##_reg - 0x00300000 + (0x40 * i) + 0xc)
-#define RDMA_SSIU_I_N(addr, i, j) (addr ##_reg - 0x00441000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400))
+#define RDMA_SSIU_I_N(addr, i, j) (addr ##_reg - 0x00441000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400) - (0x4000 * ((i) / 9) * ((j) / 4)))
#define RDMA_SSIU_O_N(addr, i, j) RDMA_SSIU_I_N(addr, i, j)
-#define RDMA_SSIU_I_P(addr, i, j) (addr ##_reg - 0x00141000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400))
+#define RDMA_SSIU_I_P(addr, i, j) (addr ##_reg - 0x00141000 + (0x1000 * (i)) + (((j) / 4) * 0xA000) + (((j) % 4) * 0x400) - (0x4000 * ((i) / 9) * ((j) / 4)))
#define RDMA_SSIU_O_P(addr, i, j) RDMA_SSIU_I_P(addr, i, j)
#define RDMA_SRC_I_N(addr, i) (addr ##_reg - 0x00500000 + (0x400 * i))
*/
dentry = file->f_path.dentry;
if (strcmp(dentry->d_name.name, "ipc_flood_count") &&
- strcmp(dentry->d_name.name, "ipc_flood_duration_ms"))
- return -EINVAL;
+ strcmp(dentry->d_name.name, "ipc_flood_duration_ms")) {
+ ret = -EINVAL;
+ goto out;
+ }
if (!strcmp(dentry->d_name.name, "ipc_flood_duration_ms"))
flood_duration_test = true;
* Workaround to address a known issue with host DMA that results
* in xruns during pause/release in capture scenarios.
*/
- if (!IS_ENABLED(SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
+ if (!IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
if (stream && direction == SNDRV_PCM_STREAM_CAPTURE)
snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
HDA_VS_INTEL_EM2,
spin_unlock_irq(&bus->reg_lock);
/* Enable DMI L1 entry if there are no capture streams open */
- if (!IS_ENABLED(SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
+ if (!IS_ENABLED(CONFIG_SND_SOC_SOF_HDA_ALWAYS_ENABLE_DMI_L1))
if (!active_capture_stream)
snd_sof_dsp_update_bits(sdev, HDA_DSP_HDA_BAR,
HDA_VS_INTEL_EM2,
else
err = sof_get_ctrl_copy_params(cdata->type, partdata, cdata,
sparams);
- if (err < 0)
+ if (err < 0) {
+ kfree(partdata);
return err;
+ }
msg_bytes = sparams->msg_bytes;
pl_size = sparams->pl_size;
struct soc_bytes_ext *sbe = (struct soc_bytes_ext *)kc->private_value;
int max_size = sbe->max;
- if (le32_to_cpu(control->priv.size) > max_size) {
+ /* init the get/put bytes data */
+ scontrol->size = sizeof(struct sof_ipc_ctrl_data) +
+ le32_to_cpu(control->priv.size);
+
+ if (scontrol->size > max_size) {
dev_err(sdev->dev, "err: bytes data size %d exceeds max %d.\n",
- control->priv.size, max_size);
+ scontrol->size, max_size);
return -EINVAL;
}
- /* init the get/put bytes data */
- scontrol->size = sizeof(struct sof_ipc_ctrl_data) +
- le32_to_cpu(control->priv.size);
scontrol->control_data = kzalloc(max_size, GFP_KERNEL);
cdata = scontrol->control_data;
if (!scontrol->control_data)
return 0;
}
+/* No support of mmap in S/PDIF mode */
+static const struct snd_pcm_hardware stm32_sai_pcm_hw_spdif = {
+ .info = SNDRV_PCM_INFO_INTERLEAVED,
+ .buffer_bytes_max = 8 * PAGE_SIZE,
+ .period_bytes_min = 1024,
+ .period_bytes_max = PAGE_SIZE,
+ .periods_min = 2,
+ .periods_max = 8,
+};
+
static const struct snd_pcm_hardware stm32_sai_pcm_hw = {
.info = SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_MMAP,
.buffer_bytes_max = 8 * PAGE_SIZE,
};
static const struct snd_dmaengine_pcm_config stm32_sai_pcm_config_spdif = {
- .pcm_hardware = &stm32_sai_pcm_hw,
+ .pcm_hardware = &stm32_sai_pcm_hw_spdif,
.prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
.process = stm32_sai_pcm_process_spdif,
};
config->chan_names[0] = txdmachan;
config->chan_names[1] = rxdmachan;
- return devm_snd_dmaengine_pcm_register(dev, config, 0);
+ return devm_snd_dmaengine_pcm_register(dev, config, flags);
}
EXPORT_SYMBOL_GPL(sdma_pcm_platform_register);
bindir ?= /usr/bin
-ifeq ($(srctree),)
+# This will work when gpio is built in tools env. where srctree
+# isn't set and when invoked from selftests build, where srctree
+# is set to ".". building_out_of_srctree is undefined for in srctree
+# builds
+ifndef building_out_of_srctree
srctree := $(patsubst %/,%,$(dir $(CURDIR)))
srctree := $(patsubst %/,%,$(dir $(srctree)))
endif
void test_attr__open(struct perf_event_attr *attr, pid_t pid, int cpu,
int fd, int group_fd, unsigned long flags);
-#define HAVE_ATTR_TEST
+#ifndef HAVE_ATTR_TEST
+#define HAVE_ATTR_TEST 1
+#endif
static inline int
sys_perf_event_open(struct perf_event_attr *attr,
fd = syscall(__NR_perf_event_open, attr, pid, cpu,
group_fd, flags);
-#ifdef HAVE_ATTR_TEST
+#if HAVE_ATTR_TEST
if (unlikely(test_attr__enabled))
test_attr__open(attr, pid, cpu, fd, group_fd, flags);
#endif
return 0;
}
-static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
+static int64_t hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
{
struct hists *hists = a->hists;
struct perf_hpp_fmt *fmt;
static int perl_generate_script(struct tep_handle *pevent, const char *outfile)
{
+ int i, not_first, count, nr_events;
+ struct tep_event **all_events;
struct tep_event *event = NULL;
struct tep_format_field *f;
char fname[PATH_MAX];
- int not_first, count;
FILE *ofp;
sprintf(fname, "%s.pl", outfile);
}\n\n\
");
+ nr_events = tep_get_events_count(pevent);
+ all_events = tep_list_events(pevent, TEP_EVENT_SORT_ID);
- while ((event = trace_find_next_event(pevent, event))) {
+ for (i = 0; all_events && i < nr_events; i++) {
+ event = all_events[i];
fprintf(ofp, "sub %s::%s\n{\n", event->system, event->name);
fprintf(ofp, "\tmy (");
static int python_generate_script(struct tep_handle *pevent, const char *outfile)
{
+ int i, not_first, count, nr_events;
+ struct tep_event **all_events;
struct tep_event *event = NULL;
struct tep_format_field *f;
char fname[PATH_MAX];
- int not_first, count;
FILE *ofp;
sprintf(fname, "%s.py", outfile);
fprintf(ofp, "def trace_end():\n");
fprintf(ofp, "\tprint(\"in trace_end\")\n\n");
- while ((event = trace_find_next_event(pevent, event))) {
+ nr_events = tep_get_events_count(pevent);
+ all_events = tep_list_events(pevent, TEP_EVENT_SORT_ID);
+
+ for (i = 0; all_events && i < nr_events; i++) {
+ event = all_events[i];
fprintf(ofp, "def %s__%s(", event->system, event->name);
fprintf(ofp, "event_name, ");
fprintf(ofp, "context, ");
return tep_parse_event(pevent, buf, size, sys);
}
-struct tep_event *trace_find_next_event(struct tep_handle *pevent,
- struct tep_event *event)
-{
- static int idx;
- int events_count;
- struct tep_event *all_events;
-
- all_events = tep_get_first_event(pevent);
- events_count = tep_get_events_count(pevent);
- if (!pevent || !all_events || events_count < 1)
- return NULL;
-
- if (!event) {
- idx = 0;
- return all_events;
- }
-
- if (idx < events_count && event == (all_events + idx)) {
- idx++;
- if (idx == events_count)
- return NULL;
- return (all_events + idx);
- }
-
- for (idx = 1; idx < events_count; idx++) {
- if (event == (all_events + (idx - 1)))
- return (all_events + idx);
- }
- return NULL;
-}
-
struct flag {
const char *name;
unsigned long long value;
ssize_t trace_report(int fd, struct trace_event *tevent, bool repipe);
-struct tep_event *trace_find_next_event(struct tep_handle *pevent,
- struct tep_event *event);
unsigned long long read_size(struct tep_event *event, void *ptr, int size);
unsigned long long eval_flag(const char *flag);
.descr = "ctx:file_pos sysctl:read read ok narrow",
.insns = {
/* If (file_pos == X) */
+#if __BYTE_ORDER == __LITTLE_ENDIAN
BPF_LDX_MEM(BPF_B, BPF_REG_7, BPF_REG_1,
offsetof(struct bpf_sysctl, file_pos)),
- BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0, 2),
+#else
+ BPF_LDX_MEM(BPF_B, BPF_REG_7, BPF_REG_1,
+ offsetof(struct bpf_sysctl, file_pos) + 3),
+#endif
+ BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 4, 2),
/* return ALLOW; */
BPF_MOV64_IMM(BPF_REG_0, 1),
.attach_type = BPF_CGROUP_SYSCTL,
.sysctl = "kernel/ostype",
.open_flags = O_RDONLY,
+ .seek = 4,
.result = SUCCESS,
},
{
}
}
+static void
+test_mutliproc(struct __test_metadata *_metadata, struct _test_data_tls *self,
+ bool sendpg, unsigned int n_readers, unsigned int n_writers)
+{
+ const unsigned int n_children = n_readers + n_writers;
+ const size_t data = 6 * 1000 * 1000;
+ const size_t file_sz = data / 100;
+ size_t read_bias, write_bias;
+ int i, fd, child_id;
+ char buf[file_sz];
+ pid_t pid;
+
+ /* Only allow multiples for simplicity */
+ ASSERT_EQ(!(n_readers % n_writers) || !(n_writers % n_readers), true);
+ read_bias = n_writers / n_readers ?: 1;
+ write_bias = n_readers / n_writers ?: 1;
+
+ /* prep a file to send */
+ fd = open("/tmp/", O_TMPFILE | O_RDWR, 0600);
+ ASSERT_GE(fd, 0);
+
+ memset(buf, 0xac, file_sz);
+ ASSERT_EQ(write(fd, buf, file_sz), file_sz);
+
+ /* spawn children */
+ for (child_id = 0; child_id < n_children; child_id++) {
+ pid = fork();
+ ASSERT_NE(pid, -1);
+ if (!pid)
+ break;
+ }
+
+ /* parent waits for all children */
+ if (pid) {
+ for (i = 0; i < n_children; i++) {
+ int status;
+
+ wait(&status);
+ EXPECT_EQ(status, 0);
+ }
+
+ return;
+ }
+
+ /* Split threads for reading and writing */
+ if (child_id < n_readers) {
+ size_t left = data * read_bias;
+ char rb[8001];
+
+ while (left) {
+ int res;
+
+ res = recv(self->cfd, rb,
+ left > sizeof(rb) ? sizeof(rb) : left, 0);
+
+ EXPECT_GE(res, 0);
+ left -= res;
+ }
+ } else {
+ size_t left = data * write_bias;
+
+ while (left) {
+ int res;
+
+ ASSERT_EQ(lseek(fd, 0, SEEK_SET), 0);
+ if (sendpg)
+ res = sendfile(self->fd, fd, NULL,
+ left > file_sz ? file_sz : left);
+ else
+ res = send(self->fd, buf,
+ left > file_sz ? file_sz : left, 0);
+
+ EXPECT_GE(res, 0);
+ left -= res;
+ }
+ }
+}
+
+TEST_F(tls, mutliproc_even)
+{
+ test_mutliproc(_metadata, self, false, 6, 6);
+}
+
+TEST_F(tls, mutliproc_readers)
+{
+ test_mutliproc(_metadata, self, false, 4, 12);
+}
+
+TEST_F(tls, mutliproc_writers)
+{
+ test_mutliproc(_metadata, self, false, 10, 2);
+}
+
+TEST_F(tls, mutliproc_sendpage_even)
+{
+ test_mutliproc(_metadata, self, true, 6, 6);
+}
+
+TEST_F(tls, mutliproc_sendpage_readers)
+{
+ test_mutliproc(_metadata, self, true, 4, 12);
+}
+
+TEST_F(tls, mutliproc_sendpage_writers)
+{
+ test_mutliproc(_metadata, self, true, 10, 2);
+}
+
TEST_F(tls, control_msg)
{
if (self->notls)
flags |= MAP_SHARED;
break;
case 'H':
- flags |= MAP_HUGETLB;
+ flags |= (MAP_HUGETLB | MAP_ANONYMOUS);
break;
default:
return -1;
#include <linux/bsearch.h>
#include <linux/io.h>
#include <linux/lockdep.h>
+#include <linux/kthread.h>
#include <asm/processor.h>
#include <asm/ioctl.h>
return 0;
}
+bool kvm_is_zone_device_pfn(kvm_pfn_t pfn)
+{
+ /*
+ * The metadata used by is_zone_device_page() to determine whether or
+ * not a page is ZONE_DEVICE is guaranteed to be valid if and only if
+ * the device has been pinned, e.g. by get_user_pages(). WARN if the
+ * page_count() is zero to help detect bad usage of this helper.
+ */
+ if (!pfn_valid(pfn) || WARN_ON_ONCE(!page_count(pfn_to_page(pfn))))
+ return false;
+
+ return is_zone_device_page(pfn_to_page(pfn));
+}
+
bool kvm_is_reserved_pfn(kvm_pfn_t pfn)
{
+ /*
+ * ZONE_DEVICE pages currently set PG_reserved, but from a refcounting
+ * perspective they are "normal" pages, albeit with slightly different
+ * usage rules.
+ */
if (pfn_valid(pfn))
- return PageReserved(pfn_to_page(pfn));
+ return PageReserved(pfn_to_page(pfn)) &&
+ !kvm_is_zone_device_pfn(pfn);
return true;
}
return 0;
}
+/*
+ * Called after the VM is otherwise initialized, but just before adding it to
+ * the vm_list.
+ */
+int __weak kvm_arch_post_init_vm(struct kvm *kvm)
+{
+ return 0;
+}
+
+/*
+ * Called just after removing the VM from the vm_list, but before doing any
+ * other destruction.
+ */
+void __weak kvm_arch_pre_destroy_vm(struct kvm *kvm)
+{
+}
+
static struct kvm *kvm_create_vm(unsigned long type)
{
struct kvm *kvm = kvm_arch_alloc_vm();
BUILD_BUG_ON(KVM_MEM_SLOTS_NUM > SHRT_MAX);
+ if (init_srcu_struct(&kvm->srcu))
+ goto out_err_no_srcu;
+ if (init_srcu_struct(&kvm->irq_srcu))
+ goto out_err_no_irq_srcu;
+
+ refcount_set(&kvm->users_count, 1);
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
struct kvm_memslots *slots = kvm_alloc_memslots();
goto out_err_no_arch_destroy_vm;
}
- refcount_set(&kvm->users_count, 1);
r = kvm_arch_init_vm(kvm, type);
if (r)
goto out_err_no_arch_destroy_vm;
INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
#endif
- if (init_srcu_struct(&kvm->srcu))
- goto out_err_no_srcu;
- if (init_srcu_struct(&kvm->irq_srcu))
- goto out_err_no_irq_srcu;
-
r = kvm_init_mmu_notifier(kvm);
if (r)
+ goto out_err_no_mmu_notifier;
+
+ r = kvm_arch_post_init_vm(kvm);
+ if (r)
goto out_err;
mutex_lock(&kvm_lock);
return kvm;
out_err:
- cleanup_srcu_struct(&kvm->irq_srcu);
-out_err_no_irq_srcu:
- cleanup_srcu_struct(&kvm->srcu);
-out_err_no_srcu:
+#if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
+ if (kvm->mmu_notifier.ops)
+ mmu_notifier_unregister(&kvm->mmu_notifier, current->mm);
+#endif
+out_err_no_mmu_notifier:
hardware_disable_all();
out_err_no_disable:
kvm_arch_destroy_vm(kvm);
- WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
out_err_no_arch_destroy_vm:
+ WARN_ON_ONCE(!refcount_dec_and_test(&kvm->users_count));
for (i = 0; i < KVM_NR_BUSES; i++)
kfree(kvm_get_bus(kvm, i));
for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++)
kvm_free_memslots(kvm, __kvm_memslots(kvm, i));
+ cleanup_srcu_struct(&kvm->irq_srcu);
+out_err_no_irq_srcu:
+ cleanup_srcu_struct(&kvm->srcu);
+out_err_no_srcu:
kvm_arch_free_vm(kvm);
mmdrop(current->mm);
return ERR_PTR(r);
mutex_lock(&kvm_lock);
list_del(&kvm->vm_list);
mutex_unlock(&kvm_lock);
+ kvm_arch_pre_destroy_vm(kvm);
+
kvm_free_irq_routing(kvm);
for (i = 0; i < KVM_NR_BUSES; i++) {
struct kvm_io_bus *bus = kvm_get_bus(kvm, i);
void kvm_set_pfn_dirty(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn)) {
+ if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn)) {
struct page *page = pfn_to_page(pfn);
SetPageDirty(page);
void kvm_set_pfn_accessed(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn))
+ if (!kvm_is_reserved_pfn(pfn) && !kvm_is_zone_device_pfn(pfn))
mark_page_accessed(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
kvm_vfio_ops_exit();
}
EXPORT_SYMBOL_GPL(kvm_exit);
+
+struct kvm_vm_worker_thread_context {
+ struct kvm *kvm;
+ struct task_struct *parent;
+ struct completion init_done;
+ kvm_vm_thread_fn_t thread_fn;
+ uintptr_t data;
+ int err;
+};
+
+static int kvm_vm_worker_thread(void *context)
+{
+ /*
+ * The init_context is allocated on the stack of the parent thread, so
+ * we have to locally copy anything that is needed beyond initialization
+ */
+ struct kvm_vm_worker_thread_context *init_context = context;
+ struct kvm *kvm = init_context->kvm;
+ kvm_vm_thread_fn_t thread_fn = init_context->thread_fn;
+ uintptr_t data = init_context->data;
+ int err;
+
+ err = kthread_park(current);
+ /* kthread_park(current) is never supposed to return an error */
+ WARN_ON(err != 0);
+ if (err)
+ goto init_complete;
+
+ err = cgroup_attach_task_all(init_context->parent, current);
+ if (err) {
+ kvm_err("%s: cgroup_attach_task_all failed with err %d\n",
+ __func__, err);
+ goto init_complete;
+ }
+
+ set_user_nice(current, task_nice(init_context->parent));
+
+init_complete:
+ init_context->err = err;
+ complete(&init_context->init_done);
+ init_context = NULL;
+
+ if (err)
+ return err;
+
+ /* Wait to be woken up by the spawner before proceeding. */
+ kthread_parkme();
+
+ if (!kthread_should_stop())
+ err = thread_fn(kvm, data);
+
+ return err;
+}
+
+int kvm_vm_create_worker_thread(struct kvm *kvm, kvm_vm_thread_fn_t thread_fn,
+ uintptr_t data, const char *name,
+ struct task_struct **thread_ptr)
+{
+ struct kvm_vm_worker_thread_context init_context = {};
+ struct task_struct *thread;
+
+ *thread_ptr = NULL;
+ init_context.kvm = kvm;
+ init_context.parent = current;
+ init_context.thread_fn = thread_fn;
+ init_context.data = data;
+ init_completion(&init_context.init_done);
+
+ thread = kthread_run(kvm_vm_worker_thread, &init_context,
+ "%s-%d", name, task_pid_nr(current));
+ if (IS_ERR(thread))
+ return PTR_ERR(thread);
+
+ /* kthread_run is never supposed to return NULL */
+ WARN_ON(thread == NULL);
+
+ wait_for_completion(&init_context.init_done);
+
+ if (!init_context.err)
+ *thread_ptr = thread;
+
+ return init_context.err;
+}