Bjorn Helgaas [Thu, 28 Nov 2019 14:54:52 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/vmd'
- Add VMD bus 224-255 restriction decode (Jon Derrick)
- Add VMD 8086:9A0B device ID (Jon Derrick)
- Remove Keith from VMD maintainer list (Keith Busch)
* remotes/lorenzo/pci/vmd:
MAINTAINERS: Remove Keith from VMD maintainer
PCI: vmd: Add device id for VMD device 8086:9A0B
PCI: vmd: Add bus 224-255 restriction decode
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:51 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/uniphier'
- Set uniphier to host (RC) mode always (Kunihiko Hayashi)
* remotes/lorenzo/pci/uniphier:
PCI: uniphier: Set mode register to host mode
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:50 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/tegra'
- Fix Tegra CLKREQ dependency programming (Vidya Sagar)
* remotes/lorenzo/pci/tegra:
PCI: tegra: Fix CLKREQ dependency programming
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:49 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/rockchip'
- Make rockchip 0V9 and 1V8 power regulators non-optional (Robin Murphy)
* remotes/lorenzo/pci/rockchip:
PCI: rockchip: Make some regulators non-optional
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:48 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/rcar'
- Clear bit 0 of MACCTLR before PCIETCTLR.CFINIT per manual (Yoshihiro
Shimoda)
- Remove unnecessary header include from rcar (Andrew Murray)
- Tighten register index checking for rcar inbound range programming
(Marek Vasut)
- Fix rcar inbound range alignment calculation to improve packing of
multiple entries (Marek Vasut)
- Update rcar MACCTLR setting to match documentation (Yoshihiro Shimoda)
* remotes/lorenzo/pci/rcar:
PCI: rcar: Fix missing MACCTLR register setting in initialization sequence
PCI: rcar: Recalculate inbound range alignment for each controller entry
PCI: rcar: Move the inbound index check
PCI: rcar: Remove unnecessary header include (../pci.h)
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:47 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/mobiveil'
- Change mobiveil csr_read()/write() function names that conflict with
riscv arch functions (Kefeng Wang)
* remotes/lorenzo/pci/mobiveil:
PCI: mobiveil: Fix csr_read()/write() build issue
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:46 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/meson'
- Fix meson PERST# GPIO polarity problem (Remi Pommarel)
- Add DT bindings for Amlogic Meson G12A (Neil Armstrong)
- Fix meson clock names to match DT bindings (Neil Armstrong)
- Add meson support for Amlogic G12A SoC with separate shared PHY (Neil
Armstrong)
- Add meson extended PCIe PHY functions for Amlogic G12A USB3+PCIe combo
PHY (Neil Armstrong)
- Add arm64 DT for Amlogic G12A PCIe controller node (Neil Armstrong)
- Add commented-out description of VIM3 USB3/PCIe mux in arm64 DT (Neil
Armstrong)
* remotes/lorenzo/pci/meson:
arm64: dts: khadas-vim3: add commented support for PCIe
arm64: dts: meson-g12a: Add PCIe node
phy: meson-g12a-usb3-pcie: Add support for PCIe mode
PCI: amlogic: meson: Add support for G12A
PCI: amlogic: Fix probed clock names
dt-bindings: pci: amlogic, meson-pcie: Add G12A bindings
PCI: amlogic: Fix reset assertion via gpio descriptor
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:45 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/layerscape'
- Add layerscape LS1028a support (Xiaowei Bao)
* remotes/lorenzo/pci/layerscape:
PCI: layerscape: Add LS1028a support
dt-bindings: pci: layerscape-pci: add compatible strings "fsl, ls1028a-pcie"
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:44 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/iproc'
- Invalidate iProc PAXB address mapping before programming it (Abhishek
Shah)
* remotes/lorenzo/pci/iproc:
PCI: iproc: Invalidate PAXB address mapping before programming it
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:43 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/hv'
- Add hibernation support for Hyper-V virtual PCI devices (Dexuan Cui)
- Track Hyper-V pci_protocol_version per-hbus, not globally (Dexuan Cui)
- Avoid kmemleak false positive on hv hbus buffer (Dexuan Cui)
* remotes/lorenzo/pci/hv:
PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer
PCI: hv: Change pci_protocol_version to per-hbus
PCI: hv: Add hibernation support
PCI: hv: Reorganize the code in preparation of hibernation
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:42 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/endpoint'
- Fix endpoint driver sign extension problem when shifting page number to
phys_addr_t (Alan Mikhak)
* remotes/lorenzo/pci/endpoint:
PCI: endpoint: Cast the page number to phys_addr_t
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:41 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/dwc'
- Fix dwc find_next_bit() usage (Niklas Cassel)
* remotes/lorenzo/pci/dwc:
PCI: dwc: Fix find_next_bit() usage
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:40 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/cadence'
- Refactor Cadence PCIe host controller to use as a library for both host
and endpoint (Tom Joseph)
* remotes/lorenzo/pci/cadence:
PCI: cadence: Move all files to per-device cadence directory
PCI: cadence: Refactor driver to use as a core library
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:39 +0000 (08:54 -0600)]
Merge branch 'remotes/lorenzo/pci/aardvark'
- Use LTSSM state to build link training flag since Aardvark doesn't
implement the Link Training bit (Remi Pommarel)
- Delay before training Aardvark link in case PERST# was asserted before
the driver probe (Remi Pommarel)
- Fix Aardvark issues with Root Control reads and writes (Remi Pommarel)
- Don't rely on jiffies in Aardvark config access path since interrupts
may be disabled (Remi Pommarel)
- Fix Aardvark big-endian support (Grzegorz Jaszczyk)
- Fix bridge emulation big-endian support (Grzegorz Jaszczyk)
* remotes/lorenzo/pci/aardvark:
PCI: pci-bridge-emul: Fix big-endian support
PCI: aardvark: Fix big endian support
PCI: aardvark: Don't rely on jiffies while holding spinlock
PCI: aardvark: Fix PCI_EXP_RTCTL register configuration
PCI: aardvark: Wait for endpoint to be ready before training link
PCI: aardvark: Use LTSSM state to build link training flag
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:38 +0000 (08:54 -0600)]
Merge branch 'pci/virtualization'
- Fix erroneous intel-iommu dependency on CONFIG_AMD_IOMMU (Bjorn
Helgaas)
- Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI (Bjorn Helgaas)
- Allow VFs to use PRI (the PF PRI is shared by the VFs, but the code
previously didn't recognize that) (Kuppuswamy Sathyanarayanan)
- Allow VFs to use PASID (the PF PASID capability is shared by the VFs,
but the code previously didn't recognize that) (Kuppuswamy
Sathyanarayanan)
- Disconnect PF and VF ATS enablement, since ATS in PFs and associated
VFs can be enabled independently (Kuppuswamy Sathyanarayanan)
- Cache PRI and PASID capability offsets (Kuppuswamy Sathyanarayanan)
- Cache the PRI PRG Response PASID Required bit (Bjorn Helgaas)
- Consolidate ATS declarations in linux/pci-ats.h (Krzysztof Wilczynski)
- Remove unused PRI and PASID stubs (Bjorn Helgaas)
- Removed unnecessary EXPORT_SYMBOL_GPL() from ATS, PRI, and PASID
interfaces that are only used by built-in IOMMU drivers (Bjorn Helgaas)
- Hide PRI and PASID state restoration functions used only inside the PCI
core (Bjorn Helgaas)
- Fix the UPDCR register address in the Intel ACS quirk (Steffen
Liebergeld)
- Add a DMA alias quirk for the Intel VCA NTB (Slawomir Pawlowski)
- Serialize sysfs sriov_numvfs reads vs writes (Pierre Crégut)
- Update Cavium ACS quirk for ThunderX2 and ThunderX3 (George Cherian)
- Unify ACS quirk implementations (Bjorn Helgaas)
* pci/virtualization:
PCI: Unify ACS quirk desired vs provided checking
PCI: Make ACS quirk implementations more uniform
PCI: Apply Cavium ACS quirk to ThunderX2 and ThunderX3
PCI/IOV: Serialize sysfs sriov_numvfs reads vs writes
PCI: Add DMA alias quirk for Intel VCA NTB
PCI: Fix Intel ACS quirk UPDCR register address
PCI/ATS: Make pci_restore_pri_state(), pci_restore_pasid_state() private
PCI/ATS: Remove unnecessary EXPORT_SYMBOL_GPL()
PCI/ATS: Remove unused PRI and PASID stubs
PCI/ATS: Consolidate ATS declarations in linux/pci-ats.h
PCI/ATS: Cache PRI PRG Response PASID Required bit
PCI/ATS: Cache PASID Capability offset
PCI/ATS: Cache PRI Capability offset
PCI/ATS: Disable PF/VF ATS service independently
PCI/ATS: Handle sharing of PF PASID Capability with all VFs
PCI/ATS: Handle sharing of PF PRI Capability with all VFs
PCI/ATS: Move pci_prg_resp_pasid_required() to CONFIG_PCI_PRI
iommu/vt-d: Select PCI_PRI for INTEL_IOMMU_SVM
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:37 +0000 (08:54 -0600)]
Merge branch 'pci/switchtec'
- Read all 64 bits of Switchtec part_event_bitmap (Logan Gunthorpe)
* pci/switchtec:
PCI/switchtec: Read all 64 bits of part_event_bitmap
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:36 +0000 (08:54 -0600)]
Merge branch 'pci/resource'
- Protect pci_reassign_bridge_resources() against concurrent
addition/removal (Benjamin Herrenschmidt)
- Fix bridge dma_ranges resource list cleanup (Rob Herring)
- Add PCI_STD_NUM_BARS for the number of standard BARs (Denis Efremov)
- Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters to control the
MMIO and prefetchable MMIO window sizes of hotplug bridges
independently (Nicholas Johnson)
- Fix MMIO/MMIO_PREF window assignment that assigned more space than
desired (Nicholas Johnson)
- Only enforce bus numbers from bridge EA if the bridge has EA devices
downstream (Subbaraya Sundeep)
* pci/resource:
PCI: Do not use bus number zero from EA capability
PCI: Avoid double hpmemsize MMIO window assignment
PCI: Add "pci=hpmmiosize" and "pci=hpmmioprefsize" parameters
PCI: Add PCI_STD_NUM_BARS for the number of standard BARs
PCI: Fix missing bridge dma_ranges resource list cleanup
PCI: Protect pci_reassign_bridge_resources() against concurrent addition/removal
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:35 +0000 (08:54 -0600)]
Merge branch 'pci/pm'
- Always return devices to D0 when thawing to fix hibernation with
drivers like mlx4 that used legacy power management (previously we only
did it for drivers with new power management ops) (Dexuan Cui)
- Clear PCIe PME Status even for legacy power management (Bjorn Helgaas)
- Fix PCI PM documentation errors (Bjorn Helgaas)
- Use dev_printk() for more power management messages (Bjorn Helgaas)
- Apply D2 delay as milliseconds, not microseconds (Bjorn Helgaas)
- Convert xen-platform from legacy to generic power management (Bjorn
Helgaas)
- Removed unused .resume_early() and .suspend_late() legacy power
management hooks (Bjorn Helgaas)
- Rearrange power management code for clarity (Rafael J. Wysocki)
- Decode power states more clearly ("4" or "D4" really refers to
"D3cold") (Bjorn Helgaas)
- Notice when reading PM Control register returns an error (~0) instead
of interpreting it as being in D3hot (Bjorn Helgaas)
- Add missing link delays required by the PCIe spec (Mika Westerberg)
* pci/pm:
PCI/PM: Move pci_dev_wait() definition earlier
PCI/PM: Add missing link delays required by the PCIe spec
PCI/PM: Add pcie_wait_for_link_delay()
PCI/PM: Return error when changing power state from D3cold
PCI/PM: Decode D3cold power state correctly
PCI/PM: Fold __pci_complete_power_transition() into its caller
PCI/PM: Avoid exporting __pci_complete_power_transition()
PCI/PM: Fold __pci_start_power_transition() into its caller
PCI/PM: Use pci_power_up() in pci_set_power_state()
PCI/PM: Move power state update away from pci_power_up()
PCI/PM: Remove unused pci_driver.suspend_late() hook
PCI/PM: Remove unused pci_driver.resume_early() hook
xen-platform: Convert to generic power management
PCI/PM: Simplify pci_set_power_state()
PCI/PM: Expand PM reset messages to mention D3hot (not just D3)
PCI/PM: Apply D2 delay as milliseconds, not microseconds
PCI/PM: Use pci_WARN() to include device information
PCI/PM: Use PCI dev_printk() wrappers for consistency
PCI/PM: Wrap long lines in documentation
PCI/PM: Note that PME can be generated from D0
PCI/PM: Make power management op coding style consistent
PCI/PM: Run resume fixups before disabling wakeup events
PCI/PM: Clear PCIe PME Status even for legacy power management
PCI/PM: Correct pci_pm_thaw_noirq() documentation
PCI/PM: Always return devices to D0 when thawing
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:34 +0000 (08:54 -0600)]
Merge branch 'pci/msi'
- Remove unused pci_irq_get_node() Greg Kroah-Hartman)
- Move power state check out of pci_msi_supported() (Bjorn Helgaas)
- Fix incorrect MSI-X masking on resume and revert related nvme quirk for
Kingston NVME SSD running FW E8FK11.T (Jian-Hong Pan)
- Make asm/msi.h mandatory and simplify PCI_MSI_IRQ_DOMAIN Kconfig
(Palmer Dabbelt, Michal Simek)
* pci/msi:
PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist
asm-generic: Make msi.h a mandatory include/asm header
Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"
PCI/MSI: Fix incorrect MSI-X masking on resume
PCI/MSI: Move power state check out of pci_msi_supported()
PCI/MSI: Remove unused pci_irq_get_node()
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:32 +0000 (08:54 -0600)]
Merge branch 'pci/misc'
- Add NumaChip SPDX header (Krzysztof Wilczynski)
- Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski)
- Remove unused includes (Krzysztof Wilczynski)
- Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on USB
2.0 or 1.1 connect events (Kai-Heng Feng)
- Removed unused sysfs attribute groups (Ben Dooks)
- Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas)
- Add PCIe Link Control 2 register field definitions to replace magic
numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas)
- Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and Radeon
CIK/SI PCIe Gen3 link training (Bjorn Helgaas)
- Use pcie_capability_read_word() instead of pci_read_config_word() in
AMDGPU and Radeon CIK/SI (Frederick Lawler)
* pci/misc:
drm/radeon: Prefer pcie_capability_read_word()
drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
drm/radeon: Correct Transmit Margin masks
drm/amdgpu: Prefer pcie_capability_read_word()
drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
drm/amdgpu: Correct Transmit Margin masks
PCI: Add #defines for Enter Compliance, Transmit Margin
PCI: Allow building PCIe things without PCIEPORTBUS
PCI: Remove PCIe Kconfig dependencies on PCI
PCI/ASPM: Remove dependency on PCIEPORTBUS
PCI/PTM: Remove dependency on PCIEPORTBUS
PCI/PTM: Remove spurious "d" from granularity message
PCI: sysfs: Remove unused attribute groups
x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
PCI: Remove unused includes and superfluous struct declaration
x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y
x86/PCI: Correct SPDX comment style
x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:31 +0000 (08:54 -0600)]
Merge branch 'pci/hotplug'
- Avoid returning prematurely from sysfs requests to enable or disable a
PCIe hotplug slot (Lukas Wunner)
- Don't disable interrupts twice when suspending hotplug ports (Mika
Westerberg)
- Fix deadlocks when PCIe ports are hot-removed while suspended (Mika
Westerberg)
- Fix boot-time Embedded Controller GPE storm caused by incorrect
resource assignment after ACPI Bus Check Notification (Mika Westerberg)
* pci/hotplug:
ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
PCI: pciehp: Prevent deadlock on disconnect
PCI: pciehp: Do not disable interrupt twice on suspend
PCI: pciehp: Refactor infinite loop in pcie_poll_cmd()
PCI: pciehp: Avoid returning prematurely from sysfs requests
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:30 +0000 (08:54 -0600)]
Merge branch 'pci/enumeration'
- Warn if a host bridge has no NUMA info (Yunsheng Lin)
* pci/enumeration:
PCI: Warn if no host bridge NUMA node info
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:29 +0000 (08:54 -0600)]
Merge branch 'pci/aspm'
- Remove unnecessary ASPM locking (Bjorn Helgaas)
- Add support for disabling L1 PM Substates (Heiner Kallweit)
- Allow re-enabling Clock PM after it has been disabled (Heiner Kallweit)
- Add sysfs attributes for controlling ASPM link states (Heiner Kallweit)
- Remove CONFIG_PCIEASPM_DEBUG, including "link_state" and "clk_ctl"
sysfs files (Heiner Kallweit)
* pci/aspm:
PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code
PCI/ASPM: Add sysfs attributes for controlling ASPM link states
PCI/ASPM: Add pcie_aspm_get_link()
PCI/ASPM: Allow re-enabling Clock PM
PCI/ASPM: Add L1 PM substate support to pci_disable_link_state()
PCI/ASPM: Remove pcie_aspm_enabled() unnecessary locking
Bjorn Helgaas [Thu, 28 Nov 2019 14:54:28 +0000 (08:54 -0600)]
Merge branch 'pci/aer'
- Restore AER capability after resume (Mayurkumar Patel)
- Add PoisonTLPBlocked AER counter (Rajat Jain)
- Use for_each_set_bit() to simplify AER code (Andy Shevchenko)
- Fix AER kernel-doc (Andy Shevchenko)
- Add "pcie_ports=dpc-native" parameter to allow native use of DPC even
if platform didn't grant control over AER (Olof Johansson)
* pci/aer:
PCI/DPC: Add "pcie_ports=dpc-native" to allow DPC without AER control
PCI/AER: Fix kernel-doc warnings
PCI/AER: Use for_each_set_bit() to simplify code
PCI/AER: Add PoisonTLPBlocked to Uncorrectable error counters
PCI/AER: Save AER Capability for suspend/resume
Palmer Dabbelt [Fri, 25 Oct 2019 06:10:38 +0000 (08:10 +0200)]
PCI: Remove PCI_MSI_IRQ_DOMAIN architecture whitelist
The only apparent reason for the PCI_MSI_IRQ_DOMAIN architecture
whitelist was that it requires msi.h. Now that msi.h is mandatory in
asm-generic/Kbuild, every arch should have at least the default version,
so remove the whitelist.
Built for all the architectures that play nice with make.cross, but not
boot tested anywhere.
Link: https://lore.kernel.org/r/514e7b040be8ccd69088193aba260da1b89e919c.1571983829.git.michal.simek@xilinx.com
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Waiman Long <longman@redhat.com>
Michal Simek [Fri, 25 Oct 2019 06:10:37 +0000 (08:10 +0200)]
asm-generic: Make msi.h a mandatory include/asm header
msi.h is generic for all architectures except x86, which has its own
version. Enabling MSI by adding msi.h to every architecture's Kbuild is
just an additional step which doesn't need to be done.
Make msi.h mandatory in the asm-generic/Kbuild so we don't have to do it
for each architecture.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Link: https://lore.kernel.org/r/c991669e29a79b1a8e28c3b4b3a125801a693de8.1571983829.git.michal.simek@xilinx.com
Tested-by: Paul Walmsley <paul.walmsley@sifive.com> # build only, rv32/rv64
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Waiman Long <longman@redhat.com>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # arch/riscv
Jian-Hong Pan [Thu, 31 Oct 2019 09:34:09 +0000 (17:34 +0800)]
Revert "nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"
Since
e045fa29e893 ("PCI/MSI: Fix incorrect MSI-X masking on resume") is
merged, we can revert the previous quirk now.
This reverts commit
19ea025e1d28c629b369c3532a85b3df478cc5c6.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204887
Fixes:
19ea025e1d28 ("nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T")
Link: https://lore.kernel.org/r/20191031093408.9322-1-jian-hong@endlessm.com
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Jian-Hong Pan [Tue, 8 Oct 2019 03:42:39 +0000 (11:42 +0800)]
PCI/MSI: Fix incorrect MSI-X masking on resume
When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector
Control register for each vector and saves it in desc->masked. Each
register is 32 bits and bit 0 is the actual Mask bit.
When we restored these registers during resume, we previously set the Mask
bit if *any* bit in desc->masked was set instead of when the Mask bit
itself was set:
pci_restore_state
pci_restore_msi_state
__pci_restore_msix_state
for_each_pci_msi_entry
msix_mask_irq(entry, entry->masked) <-- entire u32 word
__pci_msix_desc_mask_irq(desc, flag)
mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT
if (flag) <-- testing entire u32, not just bit 0
mask_bits |= PCI_MSIX_ENTRY_CTRL_MASKBIT
writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL)
This means that after resume, MSI-X vectors were masked when they shouldn't
be, which leads to timeouts like this:
nvme nvme0: I/O 978 QID 3 timeout, completion polled
On resume, set the Mask bit only when the saved Mask bit from suspend was
set.
This should remove the need for
19ea025e1d28 ("nvme: Add quirk for Kingston
NVME SSD running FW E8FK11.T").
[bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887
Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com
Fixes:
f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code")
Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Bjorn Helgaas [Mon, 14 Oct 2019 21:17:05 +0000 (16:17 -0500)]
PCI/MSI: Move power state check out of pci_msi_supported()
27e20603c54b ("PCI/MSI: Move D0 check into pci_msi_check_device()")
moved the power state check into pci_msi_check_device(), which was
subsequently renamed to pci_msi_supported(). This didn't change the
behavior, since both callers checked the power state.
However, it doesn't fit the current "pci_msi_supported()" name, which
should return what the device is capable of, independent of the power
state.
Move the power state check back into the callers for readability. No
functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Greg Kroah-Hartman [Mon, 14 Oct 2019 10:04:52 +0000 (12:04 +0200)]
PCI/MSI: Remove unused pci_irq_get_node()
The function pci_irq_get_node() is not used by anyone in the tree, so just
delete it.
Link: https://lore.kernel.org/r/20191014100452.GA6699@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Dexuan Cui [Mon, 25 Nov 2019 05:33:54 +0000 (21:33 -0800)]
PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer
With the recent
59bb47985c1d ("mm, sl[aou]b: guarantee natural
alignment for kmalloc(power-of-two)"), kzalloc() is able to allocate
a 4KB buffer that is guaranteed to be 4KB-aligned. Here the size and
alignment of hbus is important because hbus's field
retarget_msi_interrupt_params must not cross a 4KB page boundary.
Here we prefer kzalloc to get_zeroed_page(), because a buffer
allocated by the latter is not tracked and scanned by kmemleak, and
hence kmemleak reports the pointer contained in the hbus buffer
(i.e. the hpdev struct, which is created in new_pcichild_device() and
is tracked by hbus->children) as memory leak (false positive).
If the kernel doesn't have
59bb47985c1d, get_zeroed_page() *must* be
used to allocate the hbus buffer and we can avoid the kmemleak false
positive by using kmemleak_alloc() and kmemleak_free() to ask
kmemleak to track and scan the hbus buffer.
Reported-by: Lili Deng <v-lide@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Dexuan Cui [Mon, 25 Nov 2019 05:33:53 +0000 (21:33 -0800)]
PCI: hv: Change pci_protocol_version to per-hbus
A VM can have multiple Hyper-V hbus. It's incorrect to set the global
variable 'pci_protocol_version' when *every* hbus is initialized in
hv_pci_protocol_negotiation(). This is not an issue in practice since
every hbus should have the same value of hbus->protocol_version, but
we should make the variable per-hbus, so in case we have busses
with different protocol versions, the driver can still work correctly.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Dexuan Cui [Mon, 25 Nov 2019 05:33:52 +0000 (21:33 -0800)]
PCI: hv: Add hibernation support
Add suspend() and resume() functions so that Hyper-V virtual PCI devices
are handled properly when the VM hibernates and resumes from
hibernation.
Note that the suspend() function must make sure there are no pending
work items before calling vmbus_close(), since it runs in a process
context as a callback in dpm_suspend(). When it starts to run, the
channel callback hv_pci_onchannelcallback(), which runs in a tasklet
context, can be still running concurrently and scheduling new work items
onto hbus->wq in hv_pci_devices_present() and hv_pci_eject_device(), and
the work item handlers can access the vmbus channel, which can be being
closed by hv_pci_suspend(), e.g. the work item handler
pci_devices_present_work() -> new_pcichild_device() writes to the vmbus
channel.
To eliminate the race, hv_pci_suspend() disables the channel callback
tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the
tasklet. This way, when hv_pci_suspend() proceeds, it knows that no new
work item can be scheduled, and then it flushes hbus->wq and safely
closes the vmbus channel.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Dexuan Cui [Mon, 25 Nov 2019 05:33:51 +0000 (21:33 -0800)]
PCI: hv: Reorganize the code in preparation of hibernation
There is no functional change. This is just preparatory for a later
patch which adds the hibernation support for the pci-hyperv driver.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Keith Busch [Fri, 22 Nov 2019 16:25:01 +0000 (01:25 +0900)]
MAINTAINERS: Remove Keith from VMD maintainer
I no longer work in this capacity on the VMD driver.
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Jon Derrick <jonathan.derrick@intel.com>
Heiner Kallweit [Sat, 5 Oct 2019 12:08:52 +0000 (14:08 +0200)]
PCI/ASPM: Remove PCIEASPM_DEBUG Kconfig option and related code
Previously, CONFIG_PCIEASPM_DEBUG enabled "link_state" and "clk_ctl" sysfs
files that controlled ASPM. We believe these files were rarely if ever
used.
We recently added sysfs ASPM controls that are always present, so the debug
code is no longer needed. Removing this debug code has been discussed for
quite some time, see e.g. [0].
Remove PCIEASPM_DEBUG and the related code.
[0] https://lore.kernel.org/lkml/
20180727202619.GD173328@bhelgaas-glaptop.roam.corp.google.com/
Link: https://lore.kernel.org/r/ec935d8e-c084-3938-f1d1-748617596b25@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Heiner Kallweit [Sat, 5 Oct 2019 12:07:56 +0000 (14:07 +0200)]
PCI/ASPM: Add sysfs attributes for controlling ASPM link states
Add sysfs attributes to Endpoints and other Upstream Ports to control ASPM,
Clock PM, and L1 PM Substates. The new attributes are:
/sys/devices/pci*/.../link/clkpm
/sys/devices/pci*/.../link/l0s_aspm
/sys/devices/pci*/.../link/l1_aspm
/sys/devices/pci*/.../link/l1_1_aspm
/sys/devices/pci*/.../link/l1_2_aspm
/sys/devices/pci*/.../link/l1_1_pcipm
/sys/devices/pci*/.../link/l1_2_pcipm
An attribute is only visible if both ends of the Link leading to the device
support the state. Writing y/1/on to the file enables the state; n/0/off
disables it.
These attributes can be used to tune the power/performance tradeoff for
individual devices.
[bhelgaas: commit log, rename directory to "link"]
Link: https://lore.kernel.org/r/b1c83f8a-9bf6-eac5-82d0-cf5b90128fbf@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Frederick Lawler [Mon, 18 Nov 2019 00:35:13 +0000 (18:35 -0600)]
drm/radeon: Prefer pcie_capability_read_word()
Commit
8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.
Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().
Link: https://lore.kernel.org/r/20191118003513.10852-1-fred@fredlawl.com
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bjorn Helgaas [Thu, 21 Nov 2019 13:24:24 +0000 (07:24 -0600)]
drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions
Replace hard-coded magic numbers with the descriptive PCI_EXP_LNKCTL2
definitions. No functional change intended.
Link: https://lore.kernel.org/r/20191112173503.176611-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bjorn Helgaas [Wed, 20 Nov 2019 23:54:13 +0000 (17:54 -0600)]
drm/radeon: Correct Transmit Margin masks
Previously we masked PCIe Link Control 2 register values with "7 << 9",
which was apparently intended to be the Transmit Margin field, but instead
was the high order bit of Transmit Margin, the Enter Modified Compliance
bit, and the Compliance SOS bit.
Correct the mask to "7 << 7", which is the Transmit Margin field.
Link: https://lore.kernel.org/r/20191112173503.176611-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Frederick Lawler [Mon, 18 Nov 2019 00:35:13 +0000 (18:35 -0600)]
drm/amdgpu: Prefer pcie_capability_read_word()
Commit
8c0d3a02c130 ("PCI: Add accessors for PCI Express Capability")
added accessors for the PCI Express Capability so that drivers didn't
need to be aware of differences between v1 and v2 of the PCI
Express Capability.
Replace pci_read_config_word() and pci_write_config_word() calls with
pcie_capability_read_word() and pcie_capability_write_word().
[bhelgaas: fix a couple remaining instances in cik.c]
Link: https://lore.kernel.org/r/20191118003513.10852-1-fred@fredlawl.com
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Kunihiko Hayashi [Thu, 7 Nov 2019 04:58:14 +0000 (13:58 +0900)]
PCI: uniphier: Set mode register to host mode
Set the mode register to host(RC) mode so that the host controller
mode is set-up consistently across SoCs.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
[lorenzo.pieralisi@arm.com: updated log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Bjorn Helgaas [Thu, 21 Nov 2019 13:23:41 +0000 (07:23 -0600)]
drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions
Replace hard-coded magic numbers with the descriptive PCI_EXP_LNKCTL2
definitions. No functional change intended.
Link: https://lore.kernel.org/r/20191112173503.176611-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bjorn Helgaas [Wed, 20 Nov 2019 23:52:48 +0000 (17:52 -0600)]
drm/amdgpu: Correct Transmit Margin masks
Previously we masked PCIe Link Control 2 register values with "7 << 9",
which was apparently intended to be the Transmit Margin field, but instead
was the high order bit of Transmit Margin, the Enter Modified Compliance
bit, and the Compliance SOS bit.
Correct the mask to "7 << 7", which is the Transmit Margin field.
Link: https://lore.kernel.org/r/20191112173503.176611-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bjorn Helgaas [Tue, 12 Nov 2019 17:07:36 +0000 (11:07 -0600)]
PCI: Add #defines for Enter Compliance, Transmit Margin
Add definitions for the Enter Compliance and Transmit Margin fields of the
PCIe Link Control 2 register.
Link: https://lore.kernel.org/r/20191112173503.176611-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Bjorn Helgaas [Wed, 6 Nov 2019 22:13:43 +0000 (16:13 -0600)]
PCI: Allow building PCIe things without PCIEPORTBUS
Some things in drivers/pci/pcie (aspm.c and ptm.c) do not depend on the
PCIe portdrv, so we should be able to build them even if PCIEPORTBUS is not
selected. Remove the PCIEPORTBUS guard from building pcie/.
Link: https://lore.kernel.org/r/20191106222420.10216-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Bjorn Helgaas [Wed, 6 Nov 2019 22:09:40 +0000 (16:09 -0600)]
PCI: Remove PCIe Kconfig dependencies on PCI
drivers/pci/pcie/Kconfig is only sourced by drivers/pci/Kconfig, and only
when PCI is defined, so there's no need to depend on PCI again. Remove the
unnecessary dependencies.
Link: https://lore.kernel.org/r/20191106222420.10216-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Bjorn Helgaas [Wed, 6 Nov 2019 22:07:36 +0000 (16:07 -0600)]
PCI/ASPM: Remove dependency on PCIEPORTBUS
The ASPM support does not depend on the portdrv, so remove the Kconfig
dependency.
Link: https://lore.kernel.org/r/20191106222420.10216-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Bjorn Helgaas [Wed, 6 Nov 2019 22:01:15 +0000 (16:01 -0600)]
PCI/PTM: Remove dependency on PCIEPORTBUS
The PTM support does not depend on the portdrv, so remove the Kconfig
dependency.
Link: https://lore.kernel.org/r/20191106222420.10216-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Cc: Jonathan Yong <jonathan.yong@intel.com>
Bjorn Helgaas [Wed, 6 Nov 2019 21:30:48 +0000 (15:30 -0600)]
PCI/PTM: Remove spurious "d" from granularity message
The granularity message has an extra "d":
pci 0000:02:00.0: PTM enabled, 4dns granularity
Remove the "d" so the message is simply "PTM enabled, 4ns granularity".
Fixes:
8b2ec318eece ("PCI: Add PTM clock granularity information")
Link: https://lore.kernel.org/r/20191106222420.10216-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Cc: Jonathan Yong <jonathan.yong@intel.com>
Ben Dooks [Wed, 16 Oct 2019 08:03:24 +0000 (09:03 +0100)]
PCI: sysfs: Remove unused attribute groups
56c1af4606f0 ("PCI: Add sysfs max_link_speed/width,
current_link_speed/width, etc") added the following objects, but they are
unused, so remove them:
pci_bridge_group
pci_bridge_groups
pcie_dev_group
pcie_dev_groups
This fixes the following warnings from sparse:
drivers/pci/pci-sysfs.c:1546:30: warning: symbol 'pci_bridge_groups' was not declared. Should it be static?
drivers/pci/pci-sysfs.c:1555:30: warning: symbol 'pcie_dev_groups' was not declared. Should it be static?
Link: https://lore.kernel.org/r/20191016080324.12864-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Kai-Heng Feng [Mon, 2 Sep 2019 14:52:52 +0000 (22:52 +0800)]
x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect
The AMD FCH USB XHCI Controller advertises support for generating PME#
while in D0. When in D0, it does signal PME# for USB 3.0 connect events,
but not for USB 2.0 or USB 1.1 connect events, which means the controller
doesn't wake correctly for those events.
00:10.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller [1022:7914] (rev 20) (prog-if 30 [XHCI])
Subsystem: Dell FCH USB XHCI Controller [1028:087e]
Capabilities: [50] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
Clear PCI_PM_CAP_PME_D0 in dev->pme_support to indicate the device will not
assert PME# from D0 so we don't rely on it.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203673
Link: https://lore.kernel.org/r/20190902145252.32111-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Krzysztof Wilczynski [Tue, 3 Sep 2019 11:30:59 +0000 (13:30 +0200)]
PCI: Remove unused includes and superfluous struct declaration
Remove <linux/pci.h> and <linux/msi.h> from being included directly as part
of the include/linux/of_pci.h, and remove superfluous declaration of struct
of_phandle_args.
Move users of include <linux/of_pci.h> to include <linux/pci.h> and
<linux/msi.h> directly rather than rely on both being included transitively
through <linux/of_pci.h>.
Link: https://lore.kernel.org/r/20190903113059.2901-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Krzysztof Wilczynski [Mon, 19 Aug 2019 06:05:32 +0000 (08:05 +0200)]
x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y
Update arch/x86/pci/Makefile replacing the deprecated EXTRA_CFLAGS with the
ccflags-y matching recommendation per section 3.7 of
Documentation/kbuild/makefiles.txt.
Link: https://lore.kernel.org/r/20190819060532.17093-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Krzysztof Wilczynski [Wed, 28 Aug 2019 13:53:22 +0000 (15:53 +0200)]
x86/PCI: Correct SPDX comment style
Change:
drivers/pci/controller/pcie-cadence.h
drivers/pci/controller/pcie-rockchip.h
to use the correct SPDX comment style per section 2 of
Documentation/process/license-rules.rst.
These resolve the following checkpatch.pl warning:
WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/20190828135322.10370-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Krzysztof Wilczynski [Mon, 30 Sep 2019 22:48:09 +0000 (17:48 -0500)]
x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate
Add SPDX GPL-2.0 to numachip.c, which referred to the kernel default
"COPYING" file, which specifies GPL version 2.
Remove the boilerplate language referring to the GPL and "COPYING", relying
on the assertion in
b24413180f56 ("License cleanup: add SPDX GPL-2.0
license identifier to files with no license") that the SPDX identifier may
be used instead of the full boilerplate text.
[bhelgaas: split to separate patch]
Link: https://lore.kernel.org/r/20190828135322.10370-1-kw@linux.com
Signed-off-by: Krzysztof Wilczynski <kw@linux.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Vidya Sagar [Wed, 20 Nov 2019 05:17:42 +0000 (10:47 +0530)]
PCI/PM: Move pci_dev_wait() definition earlier
Move the definition of pci_dev_wait() above pci_power_up() so that it can
be called from the latter with no change in functionality. This is a pure
code move with no functional change.
Link: https://lore.kernel.org/r/20191120051743.23124-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Mika Westerberg [Tue, 12 Nov 2019 09:16:17 +0000 (12:16 +0300)]
PCI/PM: Add missing link delays required by the PCIe spec
Currently Linux does not follow PCIe spec regarding the required delays
after reset. A concrete example is a Thunderbolt add-in-card that consists
of a PCIe switch and two PCIe endpoints:
+-1b.0-[01-6b]----00.0-[02-6b]--+-00.0-[03]----00.0 TBT controller
+-01.0-[04-36]-- DS hotplug port
+-02.0-[37]----00.0 xHCI controller
\-04.0-[38-6b]-- DS hotplug port
The root port (1b.0) and the PCIe switch downstream ports are all PCIe Gen3
so they support 8GT/s link speeds.
We wait for the PCIe hierarchy to enter D3cold (runtime):
pcieport 0000:00:1b.0: power state changed by ACPI to D3cold
When it wakes up from D3cold, according to the PCIe 5.0 section 5.8 the
PCIe switch is put to reset and its power is re-applied. This means that we
must follow the rules in PCIe 5.0 section 6.6.1.
For the PCIe Gen3 ports we are dealing with here, the following applies:
With a Downstream Port that supports Link speeds greater than 5.0 GT/s,
software must wait a minimum of 100 ms after Link training completes
before sending a Configuration Request to the device immediately below
that Port. Software can determine when Link training completes by polling
the Data Link Layer Link Active bit or by setting up an associated
interrupt (see Section 6.7.3.3).
Translating this into the above topology we would need to do this (DLLLA
stands for Data Link Layer Link Active):
0000:00:1b.0: wait for 100 ms after DLLLA is set before access to 0000:01:00.0
0000:02:00.0: wait for 100 ms after DLLLA is set before access to 0000:03:00.0
0000:02:02.0: wait for 100 ms after DLLLA is set before access to 0000:37:00.0
I've instrumented the kernel with some additional logging so we can see the
actual delays performed:
pcieport 0000:00:1b.0: power state changed by ACPI to D0
pcieport 0000:00:1b.0: waiting for D3cold delay of 100 ms
pcieport 0000:00:1b.0: waiting for D3hot delay of 10 ms
pcieport 0000:02:01.0: waiting for D3hot delay of 10 ms
pcieport 0000:02:04.0: waiting for D3hot delay of 10 ms
For the switch upstream port (01:00.0 reachable through 00:1b.0 root port)
we wait for 100 ms but not taking into account the DLLLA requirement. We
then wait 10 ms for D3hot -> D0 transition of the root port and the two
downstream hotplug ports. This means that we deviate from what the spec
requires.
Performing the same check for system sleep (s2idle) transitions it turns
out to be even worse. None of the mandatory delays are performed. If this
would be S3 instead of s2idle then according to PCI FW spec 3.2 section
4.6.8. there is a specific _DSM that allows the OS to skip the delays but
this platform does not provide the _DSM and does not go to S3 anyway so no
firmware is involved that could already handle these delays.
On this particular platform these delays are not actually needed because
there is an additional delay as part of the ACPI power resource that is
used to turn on power to the hierarchy but since that additional delay is
not required by any of standards (PCIe, ACPI) it is not present in the
Intel Ice Lake, for example where missing the mandatory delays causes
pciehp to start tearing down the stack too early (links are not yet
trained). Below is an example how it looks like when this happens:
pcieport 0000:83:04.0: pciehp: Slot(4): Card not present
pcieport 0000:87:04.0: PME# disabled
pcieport 0000:83:04.0: pciehp: pciehp_unconfigure_device: domain:bus:dev = 0000:86:00
pcieport 0000:86:00.0: Refused to change power state, currently in D3
pcieport 0000:86:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x201ff)
pcieport 0000:86:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
...
There is also one reported case (see the bugzilla link below) where the
missing delay causes xHCI on a Titan Ridge controller fail to runtime
resume when USB-C dock is plugged. This does not involve pciehp but instead
the PCI core fails to runtime resume the xHCI device:
pcieport 0000:04:02.0: restoring config space at offset 0xc (was 0x10000, writing 0x10020)
pcieport 0000:04:02.0: restoring config space at offset 0x4 (was 0x100000, writing 0x100406)
xhci_hcd 0000:39:00.0: Refused to change power state, currently in D3
xhci_hcd 0000:39:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x1ff)
xhci_hcd 0000:39:00.0: restoring config space at offset 0x38 (was 0xffffffff, writing 0x0)
...
Add a new function pci_bridge_wait_for_secondary_bus() that is called on
PCI core resume and runtime resume paths accordingly if the bridge entered
D3cold (and thus went through reset).
This is second attempt to add the missing delays. The previous solution in
c2bf1fc212f7 ("PCI: Add missing link delays required by the PCIe spec") was
reverted because of two issues it caused:
1. One system become unresponsive after S3 resume due to PME service
spinning in pcie_pme_work_fn(). The root port in question reports that
the xHCI sent PME but the xHCI device itself does not have PME status
set. The PME status bit is never cleared in the root port resulting
the indefinite loop in pcie_pme_work_fn().
2. Slows down resume if the root/downstream port does not support Data
Link Layer Active Reporting because pcie_wait_for_link_delay() waits
1100 ms in that case.
This version should avoid the above issues because we restrict the delay to
happen only if the port went into D3cold.
Link: https://lore.kernel.org/linux-pci/SL2P216MB01878BBCD75F21D882AEEA2880C60@SL2P216MB0187.KORP216.PROD.OUTLOOK.COM/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203885
Link: https://lore.kernel.org/r/20191112091617.70282-3-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Mika Westerberg [Tue, 12 Nov 2019 09:16:16 +0000 (12:16 +0300)]
PCI/PM: Add pcie_wait_for_link_delay()
Add pcie_wait_for_link_delay(). Similar to pcie_wait_for_link() but allows
passing custom activation delay in milliseconds.
Link: https://lore.kernel.org/r/20191112091617.70282-2-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Thu, 1 Aug 2019 16:50:56 +0000 (11:50 -0500)]
PCI/PM: Return error when changing power state from D3cold
pci_raw_set_power_state() uses the Power Management capability to change a
device's power state. The capability is in config space, which is
accessible in D0, D1, D2, and D3hot, but not in D3cold.
If we call pci_raw_set_power_state() on a device that's in D3cold, config
reads fail and return ~0 data, which we erroneously interpreted as "the
device is in D3hot", leading to messages like this:
pcieport 0000:03:00.0: Refused to change power state, currently in D3
The PCI_PM_CTRL has several RsvdP fields, so ~0 is never a valid register
value. If we get that value, print a more informative message and return
an error.
Changing the power state of a device from D3cold must be done by a platform
power management method or some other non-config space mechanism.
Link: https://lore.kernel.org/r/20190822200551.129039-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Bjorn Helgaas [Fri, 2 Aug 2019 23:47:22 +0000 (18:47 -0500)]
PCI/PM: Decode D3cold power state correctly
Use pci_power_name() to print pci_power_t correctly. This changes:
"state 0" or "D0" to "D0"
"state 1" or "D1" to "D1"
"state 2" or "D2" to "D2"
"state 3" or "D3" to "D3hot"
"state 4" or "D4" to "D3cold"
Changes dmesg logging only, no other functional change intended.
Link: https://lore.kernel.org/r/20190822200551.129039-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki [Tue, 5 Nov 2019 16:32:08 +0000 (17:32 +0100)]
PCI/PM: Fold __pci_complete_power_transition() into its caller
Because pci_set_power_state() has become the only caller of
__pci_complete_power_transition(), there is no need for the latter to
be a separate function any more, so fold it into the former, drop a
redundant check and reduce the number of lines of code somewhat.
Code rearrangement, no intentional functional impact.
Link: https://lore.kernel.org/r/15576968.k611qn3UU0@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki [Tue, 5 Nov 2019 10:30:36 +0000 (11:30 +0100)]
PCI/PM: Avoid exporting __pci_complete_power_transition()
Notice that radeon_set_suspend(), which is the only caller of
__pci_complete_power_transition() outside of pci.c, really only
cares about the pci_platform_power_transition() invoked by it,
so export the latter instead of it, update the radeon driver to
call pci_platform_power_transition() directly and make
__pci_complete_power_transition() static.
Code rearrangement, no intentional functional impact.
Link: https://lore.kernel.org/r/1731661.ykamz2Tiuf@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki [Tue, 5 Nov 2019 10:29:16 +0000 (11:29 +0100)]
PCI/PM: Fold __pci_start_power_transition() into its caller
Because pci_power_up() has become the only caller of
__pci_start_power_transition(), there is no need for the latter to
be a separate function any more, so fold it into the former, drop a
redundant check and reduce the number of lines of code somewhat.
Code rearrangement, no intentional functional impact.
Link: https://lore.kernel.org/r/3458080.lsoDbfkST9@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki [Tue, 5 Nov 2019 10:27:49 +0000 (11:27 +0100)]
PCI/PM: Use pci_power_up() in pci_set_power_state()
Make it explicitly clear that the code to put devices into D0 in
pci_set_power_state() and in pci_pm_default_resume_early() is the
same by making the latter use pci_power_up() for transitions into D0.
Code rearrangement, no intentional functional impact.
Link: https://lore.kernel.org/r/2520019.OZ1nXS5aSj@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Rafael J. Wysocki [Tue, 5 Nov 2019 10:13:43 +0000 (11:13 +0100)]
PCI/PM: Move power state update away from pci_power_up()
Move the invocation of pci_update_current_state() from pci_power_up() to
pci_pm_default_resume_early(), which is the only caller of that function.
Preparatory change, no functional impact.
Link: https://lore.kernel.org/r/37482337.udjOGdOKNb@kreacher
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Bjorn Helgaas [Thu, 31 Oct 2019 22:37:54 +0000 (17:37 -0500)]
PCI/PM: Remove unused pci_driver.suspend_late() hook
The struct pci_driver.suspend_late() hook is one of the legacy PCI power
management callbacks, and there are no remaining users of it. Remove it.
Link: https://lore.kernel.org/r/20191101204558.210235-7-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Thu, 31 Oct 2019 22:53:04 +0000 (17:53 -0500)]
PCI/PM: Remove unused pci_driver.resume_early() hook
The struct pci_driver.resume_early() hook is one of the legacy PCI power
management callbacks, and there are no remaining users of it. Remove it.
Link: https://lore.kernel.org/r/20191101204558.210235-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Fri, 1 Nov 2019 13:20:40 +0000 (08:20 -0500)]
xen-platform: Convert to generic power management
Convert xen-platform from the legacy PCI power management callbacks to the
generic operations. This is one step towards removing support for the
legacy PCI callbacks.
The generic .resume_noirq() operation is called by pci_pm_resume_noirq() at
the same point the legacy PCI .resume_early() callback was, so this patch
should not change the xen-platform behavior.
Link: https://lore.kernel.org/r/20191101204558.210235-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: KarimAllah Ahmed <karahmed@amazon.de>
Bjorn Helgaas [Wed, 16 Oct 2019 20:23:20 +0000 (15:23 -0500)]
PCI/PM: Simplify pci_set_power_state()
Check for the PCI_DEV_FLAGS_NO_D3 quirk early, before calling
__pci_start_power_transition(). This way all the cases where we don't need
to do anything at all are checked up front.
This doesn't fix anything because if the caller requested D3hot or D3cold,
__pci_start_power_transition() is a no-op. But calling it is pointless and
makes the code harder to analyze.
Link: https://lore.kernel.org/r/20191101204558.210235-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 28 Oct 2019 13:27:00 +0000 (08:27 -0500)]
PCI/PM: Expand PM reset messages to mention D3hot (not just D3)
pci_pm_reset() resets a device by putting it in D3hot and bringing it back
to D0. Clarify related messages to mention "D3hot" explicitly instead of
just "D3".
Link: https://lore.kernel.org/r/20191101204558.210235-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Wed, 23 Oct 2019 22:40:52 +0000 (17:40 -0500)]
PCI/PM: Apply D2 delay as milliseconds, not microseconds
PCI_PM_D2_DELAY is defined as 200, which is milliseconds, but previously we
used udelay(), which only waited for 200 microseconds. Use msleep()
instead so we wait the correct amount of time. See PCIe r5.0, sec 5.9.
Link: https://lore.kernel.org/r/20191101204558.210235-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 7 Oct 2019 12:52:28 +0000 (07:52 -0500)]
PCI/PM: Use pci_WARN() to include device information
Add and use pci_WARN() wrappers so warnings include device information.
Link: https://lore.kernel.org/r/20191017212851.54237-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 7 Oct 2019 12:55:18 +0000 (07:55 -0500)]
PCI/PM: Use PCI dev_printk() wrappers for consistency
Use the PCI dev_printk() wrappers for consistency with the rest of the PCI
core. No functional change intended.
Link: https://lore.kernel.org/r/20191017212851.54237-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Tue, 8 Oct 2019 20:25:23 +0000 (15:25 -0500)]
PCI/PM: Wrap long lines in documentation
Documentation/power/pci.rst is wrapped to fit in 80 columns, but directory
structure changes made a few lines longer. Wrap them so they all fit in 80
columns again.
Link: https://lore.kernel.org/r/20191014230016.240912-7-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Tue, 8 Oct 2019 20:28:00 +0000 (15:28 -0500)]
PCI/PM: Note that PME can be generated from D0
Per PCIe r5.0 sec 7.5.2.1, PME may be generated from D0, so update
Documentation/power/pci.rst to reflect that.
Link: https://lore.kernel.org/r/20191016194450.68959-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 14 Oct 2019 18:46:50 +0000 (13:46 -0500)]
PCI/PM: Make power management op coding style consistent
Some of the power management ops use this style:
struct device_driver *drv = dev->driver;
if (drv && drv->pm && drv->pm->prepare(dev))
drv->pm->prepare(dev);
while others use this:
const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
if (pm && pm->runtime_resume)
pm->runtime_resume(dev);
Convert the first style to the second so they're all consistent. Remove
local "error" variables when unnecessary. No functional change intended.
Link: https://lore.kernel.org/r/20191014230016.240912-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Sat, 12 Oct 2019 22:15:57 +0000 (17:15 -0500)]
PCI/PM: Run resume fixups before disabling wakeup events
pci_pm_resume() and pci_pm_restore() call pci_pm_default_resume(), which
runs resume fixups before disabling wakeup events:
static void pci_pm_default_resume(struct pci_dev *pci_dev)
{
pci_fixup_device(pci_fixup_resume, pci_dev);
pci_enable_wake(pci_dev, PCI_D0, false);
}
pci_pm_runtime_resume() does both of these, but in the opposite order:
pci_enable_wake(pci_dev, PCI_D0, false);
pci_fixup_device(pci_fixup_resume, pci_dev);
We should always use the same ordering unless there's a reason to do
otherwise. Change pci_pm_runtime_resume() to call pci_pm_default_resume()
instead of open-coding this, so the fixups are always done before disabling
wakeup events.
pci_pm_default_resume() is called from pci_pm_runtime_resume(), which is
under #ifdef CONFIG_PM. If SUSPEND and HIBERNATION are disabled, PM_SLEEP
is disabled also, so move pci_pm_default_resume() from #ifdef
CONFIG_PM_SLEEP to #ifdef CONFIG_PM.
Link: https://lore.kernel.org/r/20191014230016.240912-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Thu, 10 Oct 2019 21:54:36 +0000 (16:54 -0500)]
PCI/PM: Clear PCIe PME Status even for legacy power management
Previously, pci_pm_resume_noirq() cleared the PME Status bit in the Root
Status register only if the device had no driver or the driver did not
implement legacy power management. It should clear PME Status regardless
of what sort of power management the driver supports, so do this before
checking for legacy power management.
This affects Root Ports and Root Complex Event Collectors, for which the
usual driver is the PCIe portdrv, which implements new power management, so
this change is just on principle, not to fix any actual defects.
Fixes:
a39bd851dccf ("PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver")
Link: https://lore.kernel.org/r/20191014230016.240912-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Bjorn Helgaas [Mon, 14 Oct 2019 19:14:06 +0000 (14:14 -0500)]
PCI/PM: Correct pci_pm_thaw_noirq() documentation
According to the documentation, pci_pm_thaw_noirq() did not put the device
into the full-power state and restore its standard configuration registers.
This is incorrect, so update the documentation to match the code.
Link: https://lore.kernel.org/r/20191014230016.240912-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Dexuan Cui [Wed, 14 Aug 2019 01:06:55 +0000 (01:06 +0000)]
PCI/PM: Always return devices to D0 when thawing
pci_pm_thaw_noirq() is supposed to return the device to D0 and restore its
configuration registers, but previously it only did that for devices whose
drivers implemented the new power management ops.
Hibernation, e.g., via "echo disk > /sys/power/state", involves freezing
devices, creating a hibernation image, thawing devices, writing the image,
and powering off. The fact that thawing did not return devices with legacy
power management to D0 caused errors, e.g., in this path:
pci_pm_thaw_noirq
if (pci_has_legacy_pm_support(pci_dev)) # true for Mellanox VF driver
return pci_legacy_resume_early(dev) # ... legacy PM skips the rest
pci_set_power_state(pci_dev, PCI_D0)
pci_restore_state(pci_dev)
pci_pm_thaw
if (pci_has_legacy_pm_support(pci_dev))
pci_legacy_resume
drv->resume
mlx4_resume
...
pci_enable_msix_range
...
if (dev->current_state != PCI_D0) # <---
return -EINVAL;
which caused these warnings:
mlx4_core a6d1:00:02.0: INTx is not supported in multi-function mode, aborting
PM: dpm_run_callback(): pci_pm_thaw+0x0/0xd7 returns -95
PM: Device a6d1:00:02.0 failed to thaw: error -95
Return devices to D0 and restore config registers for all devices, not just
those whose drivers support new power management.
[bhelgaas: also call pci_restore_state() before pci_legacy_resume_early(),
update comment, add stable tag, commit log]
Link: https://lore.kernel.org/r/KU1P153MB016637CAEAD346F0AA8E3801BFAD0@KU1P153MB0166.APCP153.PROD.OUTLOOK.COM
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # v4.13+
Robin Murphy [Sat, 16 Nov 2019 12:54:19 +0000 (12:54 +0000)]
PCI: rockchip: Make some regulators non-optional
The 0V9 and 1V8 supplies power the PCIe block in the SoC itself, and
are thus fundamental to PCIe being usable at all. As such, it makes
sense to treat them as non-optional and rely on dummy regulators if
not explicitly described.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Jon Derrick [Tue, 12 Nov 2019 12:47:53 +0000 (05:47 -0700)]
PCI: vmd: Add device id for VMD device 8086:9A0B
This patch adds support for this VMD device which supports the bus
restriction mode.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Jon Derrick [Tue, 12 Nov 2019 12:47:52 +0000 (05:47 -0700)]
PCI: vmd: Add bus 224-255 restriction decode
VMD bus restrictions are required when IO fabric is multiplexed such
that VMD cannot use the entire bus range. This patch adds another bus
restriction decode bit that can be set by firmware to restrict the VMD
bus range to between 224-255.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Bjorn Helgaas [Fri, 6 Sep 2019 23:36:06 +0000 (18:36 -0500)]
PCI: Unify ACS quirk desired vs provided checking
Most of the ACS quirks have a similar pattern of:
acs_flags &= ~( <controls provided by this device> );
return acs_flags ? 0 : 1;
Pull this out into a helper function to simplify the quirks slightly. The
helper function is also a convenient place for comments about what the list
of ACS controls means. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Bjorn Helgaas [Thu, 5 Sep 2019 22:54:42 +0000 (17:54 -0500)]
PCI: Make ACS quirk implementations more uniform
The ACS quirks differ in needless ways, which makes them look more
different than they really are.
Reorder the ACS flags in order of definitions in the spec:
PCI_ACS_SV Source Validation
PCI_ACS_TB Translation Blocking
PCI_ACS_RR P2P Request Redirect
PCI_ACS_CR P2P Completion Redirect
PCI_ACS_UF Upstream Forwarding
PCI_ACS_EC P2P Egress Control
PCI_ACS_DT Direct Translated P2P
(PCIe r5.0, sec 7.7.8.2) and use similar code structure in all. No
functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Subbaraya Sundeep [Mon, 4 Nov 2019 06:57:44 +0000 (12:27 +0530)]
PCI: Do not use bus number zero from EA capability
As per PCIe r5.0, sec 7.8.5.2, fixed bus numbers of a bridge must be zero
when no function that uses EA is located behind it. Hence, if EA supplies
bus numbers of zero, assign bus numbers normally. A secondary bus can
never have a bus number of zero, so setting a bridge's Secondary Bus Number
to zero makes downstream devices unreachable.
[bhelgaas: retain bool return value so "zero is invalid" logic is local]
Fixes:
2dbce5901179 ("PCI: Assign bus numbers present in EA capability for bridges")
Link: https://lore.kernel.org/r/1572850664-9861-1-git-send-email-sundeep.lkml@gmail.com
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v5.2+
Nicholas Johnson [Wed, 13 Nov 2019 15:25:28 +0000 (15:25 +0000)]
PCI: Avoid double hpmemsize MMIO window assignment
Previously, the kernel sometimes assigned more MMIO or MMIO_PREF space than
desired. For example, if the user requested 128M of space with
"pci=realloc,hpmemsize=128M", we sometimes assigned 256M:
pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0xa00fffff] = 256M
pci 0000:06:04.0: BAR 14: assigned [mem 0xa0200000-0xb01fffff] = 256M
With this patch applied:
pci 0000:06:01.0: BAR 14: assigned [mem 0x90100000-0x980fffff] = 128M
pci 0000:06:04.0: BAR 14: assigned [mem 0x98200000-0xa01fffff] = 128M
This happened when in the first pass, the MMIO_PREF succeeded but the MMIO
failed. In the next pass, because MMIO_PREF was already assigned, the
attempt to assign MMIO_PREF returned an error code instead of success
(nothing more to do, already allocated). Hence, the size which was actually
allocated, but thought to have failed, was placed in the MMIO window.
The bug resulted in the MMIO_PREF being added to the MMIO window, which
meant doubling if MMIO_PREF size = MMIO size. With a large MMIO_PREF, the
MMIO window would likely fail to be assigned altogether due to lack of
32-bit address space.
Change find_free_bus_resource() to do the following:
- Return first unassigned resource of the correct type.
- If there is none, return first assigned resource of the correct type.
- If none of the above, return NULL.
Returning an assigned resource of the correct type allows the caller to
distinguish between already assigned and no resource of the correct type.
Add checks in pbus_size_io() and pbus_size_mem() to return success if
resource returned from find_free_bus_resource() is already allocated.
This avoids pbus_size_io() and pbus_size_mem() returning error code to
__pci_bus_size_bridges() when a resource has been successfully assigned in
a previous pass. This fixes the existing behaviour where space for a
resource could be reserved multiple times in different parent bridge
windows.
Link: https://lore.kernel.org/lkml/20190531171216.20532-2-logang@deltatee.com/T/#u
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203243
Link: https://lore.kernel.org/r/PS2P216MB075563AA6AD242AA666EDC6A80760@PS2P216MB0755.KORP216.PROD.OUTLOOK.COM
Reported-by: Kit Chow <kchow@gigaio.com>
Reported-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Mika Westerberg [Wed, 30 Oct 2019 15:05:45 +0000 (18:05 +0300)]
ACPI / hotplug / PCI: Allocate resources directly under the non-hotplug bridge
Valerio and others reported that commit
84c8b58ed3ad ("ACPI / hotplug /
PCI: Don't scan bridges managed by native hotplug") prevents some recent
LG and HP laptops from booting with endless loop of:
ACPI Error: No handler or method for GPE 08, disabling event (
20190215/evgpe-835)
ACPI Error: No handler or method for GPE 09, disabling event (
20190215/evgpe-835)
ACPI Error: No handler or method for GPE 0A, disabling event (
20190215/evgpe-835)
...
What seems to happen is that during boot, after the initial PCI enumeration
when EC is enabled the platform triggers ACPI Notify() to one of the root
ports. The root port itself looks like this:
pci 0000:00:1b.0: PCI bridge to [bus 02-3a]
pci 0000:00:1b.0: bridge window [mem 0xc4000000-0xda0fffff]
pci 0000:00:1b.0: bridge window [mem 0x80000000-0xa1ffffff 64bit pref]
The BIOS has configured the root port so that it does not have I/O bridge
window.
Now when the ACPI Notify() is triggered ACPI hotplug handler calls
acpiphp_native_scan_bridge() for each non-hotplug bridge (as this system is
using native PCIe hotplug) and pci_assign_unassigned_bridge_resources() to
allocate resources.
The device connected to the root port is a PCIe switch (Thunderbolt
controller) with two hotplug downstream ports. Because of the hotplug ports
__pci_bus_size_bridges() tries to add "additional I/O" of 256 bytes to each
(DEFAULT_HOTPLUG_IO_SIZE). This gets further aligned to 4k as that's the
minimum I/O window size so each hotplug port gets 4k I/O window and the
same happens for the root port (which is also hotplug port). This means
3 * 4k = 12k I/O window.
Because of this pci_assign_unassigned_bridge_resources() ends up opening a
I/O bridge window for the root port at first available I/O address which
seems to be in range 0x1000 - 0x3fff. Normally this range is used for ACPI
stuff such as GPE bits (below is part of /proc/ioports):
1800-1803 : ACPI PM1a_EVT_BLK
1804-1805 : ACPI PM1a_CNT_BLK
1808-180b : ACPI PM_TMR
1810-1815 : ACPI CPU throttle
1850-1850 : ACPI PM2_CNT_BLK
1854-1857 : pnp 00:05
1860-187f : ACPI GPE0_BLK
However, when the ACPI Notify() happened this range was not yet reserved
for ACPI/PNP (that happens later) so PCI gets it. It then starts writing to
this range and accidentally stomps over GPE bits among other things causing
the endless stream of messages about missing GPE handler.
This problem does not happen if "pci=hpiosize=0" is passed in the kernel
command line. The reason is that then the kernel does not try to allocate
the additional 256 bytes for each hotplug port.
Fix this by allocating resources directly below the non-hotplug bridges
where a new device may appear as a result of ACPI Notify(). This avoids the
hotplug bridges and prevents opening the additional I/O window.
Fixes:
84c8b58ed3ad ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=203617
Link: https://lore.kernel.org/r/20191030150545.19885-1-mika.westerberg@linux.intel.com
Reported-by: Valerio Passini <passini.valerio@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org
Mika Westerberg [Tue, 29 Oct 2019 17:00:22 +0000 (20:00 +0300)]
PCI: pciehp: Prevent deadlock on disconnect
This addresses deadlocks in these common cases in hierarchies containing
two switches:
- All involved ports are runtime suspended and they are unplugged. This
can happen easily if the drivers involved automatically enable runtime
PM (xHCI for example does that).
- System is suspended (e.g., closing the lid on a laptop) with a dock +
something else connected, and the dock is unplugged while suspended.
These cases lead to the following deadlock:
INFO: task irq/126-pciehp:198 blocked for more than 120 seconds.
irq/126-pciehp D 0 198 2 0x80000000
Call Trace:
schedule+0x2c/0x80
schedule_timeout+0x246/0x350
wait_for_completion+0xb7/0x140
kthread_stop+0x49/0x110
free_irq+0x32/0x70
pcie_shutdown_notification+0x2f/0x50
pciehp_remove+0x27/0x50
pcie_port_remove_service+0x36/0x50
device_release_driver+0x12/0x20
bus_remove_device+0xec/0x160
device_del+0x13b/0x350
device_unregister+0x1a/0x60
remove_iter+0x1e/0x30
device_for_each_child+0x56/0x90
pcie_port_device_remove+0x22/0x40
pcie_portdrv_remove+0x20/0x60
pci_device_remove+0x3e/0xc0
device_release_driver_internal+0x18c/0x250
device_release_driver+0x12/0x20
pci_stop_bus_device+0x6f/0x90
pci_stop_bus_device+0x31/0x90
pci_stop_and_remove_bus_device+0x12/0x20
pciehp_unconfigure_device+0x88/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140
INFO: task irq/190-pciehp:2288 blocked for more than 120 seconds.
irq/190-pciehp D 0 2288 2 0x80000000
Call Trace:
__schedule+0x2a2/0x880
schedule+0x2c/0x80
schedule_preempt_disabled+0xe/0x10
mutex_lock+0x2c/0x30
pci_lock_rescan_remove+0x15/0x20
pciehp_unconfigure_device+0x4d/0x140
pciehp_disable_slot+0x6a/0x110
pciehp_handle_presence_or_link_change+0x263/0x400
pciehp_ist+0x1c9/0x1d0
irq_thread_fn+0x24/0x60
irq_thread+0xeb/0x190
kthread+0x120/0x140
What happens here is that the whole hierarchy is runtime resumed and the
parent PCIe downstream port, which got the hot-remove event, starts
removing devices below it, taking pci_lock_rescan_remove() lock. When the
child PCIe port is runtime resumed it calls pciehp_check_presence() which
ends up calling pciehp_card_present() and pciehp_check_link_active(). Both
of these use pcie_capability_read_word(), which notices that the underlying
device is already gone and returns PCIBIOS_DEVICE_NOT_FOUND with the
capability value set to 0. When pciehp gets this value it thinks that its
child device is also hot-removed and schedules its IRQ thread to handle the
event.
The deadlock happens when the child's IRQ thread runs and tries to acquire
pci_lock_rescan_remove() which is already taken by the parent and the
parent waits for the child's IRQ thread to finish.
Prevent this from happening by checking the return value of
pcie_capability_read_word() and if it is PCIBIOS_DEVICE_NOT_FOUND stop
performing any hot-removal activities.
[bhelgaas: add common scenarios to commit log]
Link: https://lore.kernel.org/r/20191029170022.57528-2-mika.westerberg@linux.intel.com
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Mika Westerberg [Tue, 29 Oct 2019 17:00:21 +0000 (20:00 +0300)]
PCI: pciehp: Do not disable interrupt twice on suspend
We try to keep PCIe hotplug ports runtime suspended when entering system
suspend. Because the PCIe portdrv sets the DPM_FLAG_NEVER_SKIP flag, the PM
core always calls system suspend/resume hooks even if the device is left
runtime suspended. Since PCIe hotplug driver re-used the same function for
both runtime suspend and system suspend, it ended up disabling hotplug
interrupt twice and the second time following was printed:
pciehp 0000:03:01.0:pcie204: pcie_do_write_cmd: no response from device
Prevent this from happening by checking whether the device is already
runtime suspended when the system suspend hook is called.
Fixes:
9c62f0bfb832 ("PCI: pciehp: Implement runtime PM callbacks")
Link: https://lore.kernel.org/r/20191029170022.57528-1-mika.westerberg@linux.intel.com
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Yoshihiro Shimoda [Tue, 5 Nov 2019 10:51:29 +0000 (19:51 +0900)]
PCI: rcar: Fix missing MACCTLR register setting in initialization sequence
The R-Car Gen2/3 manual - available at:
https://www.renesas.com/eu/en/products/microcontrollers-microprocessors/rz/rzg/rzg1m.html#documents
"RZ/G Series User's Manual: Hardware" section
strictly enforces the MACCTLR inizialization value - 39.3.1 - "Initial
Setting of PCI Express":
"Be sure to write the initial value (= H'80FF 0000) to MACCTLR before
enabling PCIETCTLR.CFINIT".
To avoid unexpected behavior and to match the SW initialization sequence
guidelines, this patch programs the MACCTLR with the correct value.
Note that the MACCTLR.SPCHG bit in the MACCTLR register description
reports that "Only writing 1 is valid and writing 0 is invalid" but this
"invalid" has to be interpreted as a write-ignore aka "ignored", not
"prohibited".
Reported-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Fixes:
c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver")
Fixes:
be20bbcb0a8c ("PCI: rcar: Add the initialization of PCIe link in resume_noirq()")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: <stable@vger.kernel.org> # v5.2+
Andy Shevchenko [Fri, 8 Nov 2019 11:18:55 +0000 (13:18 +0200)]
PCI: pciehp: Refactor infinite loop in pcie_poll_cmd()
Infinite timeout loops are hard to read. Refactor it to plausible 'do {}
while ()'.
Note, the supplied timeout can't be negative for current use, though if
it's not dividable to 10, we may go below 0, that's why type of the
parameter is int. And thus, we may move the check to the loop condition.
No functional change intended.
Link: https://lore.kernel.org/r/20191108111855.85866-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
George Cherian [Mon, 11 Nov 2019 02:43:03 +0000 (02:43 +0000)]
PCI: Apply Cavium ACS quirk to ThunderX2 and ThunderX3
Enhance the ACS quirk for Cavium Processors. Add the root port vendor IDs
for ThunderX2 and ThunderX3 series of processors.
[bhelgaas: add Fixes: and stable tag]
Fixes:
f2ddaf8dfd4a ("PCI: Apply Cavium ThunderX ACS quirk to more Root Ports")
Link: https://lore.kernel.org/r/20191111024243.GA11408@dc5-eodlnx05.marvell.com
Signed-off-by: George Cherian <george.cherian@marvell.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Robert Richter <rrichter@marvell.com>
Cc: stable@vger.kernel.org # v4.12+
Tom Joseph [Mon, 11 Nov 2019 12:30:44 +0000 (12:30 +0000)]
PCI: cadence: Move all files to per-device cadence directory
Cadence core library files may be used by various platform drivers.
Add a new directory "cadence" to group all the Cadence core library files
and the platforms using Cadence core library.
Signed-off-by: Tom Joseph <tjoseph@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Tom Joseph [Mon, 11 Nov 2019 12:30:43 +0000 (12:30 +0000)]
PCI: cadence: Refactor driver to use as a core library
Cadence PCIe host and endpoint IP may be embedded into a variety of
SoCs/platforms. Let's extract the platform related APIs/Structures in the
current driver to a separate file (pcie-cadence-plat.c), such that the
common functionality can be used by future platforms.
Signed-off-by: Tom Joseph <tjoseph@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Marek Vasut [Sat, 26 Oct 2019 18:26:59 +0000 (20:26 +0200)]
PCI: rcar: Recalculate inbound range alignment for each controller entry
Due to hardware constraints, the size of each inbound range entry
populated into the controller cannot be larger than the alignment
of the entry's start address. Currently, the alignment for each
"dma-ranges" inbound range is calculated only once for each range
and the increment for programming the controller is also derived
from it only once. Thus, a "dma-ranges" entry describing a memory
at 0x48000000 and size 0x38000000 would lead to multiple controller
entries, each 0x08000000 long.
This is inefficient, especially considering that by adding the size
to the start address, the alignment increases. This patch moves the
alignment calculation into the loop populating the controller entries,
thus updating the alignment for each controller entry.
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: linux-renesas-soc@vger.kernel.org
Marek Vasut [Sat, 26 Oct 2019 18:26:58 +0000 (20:26 +0200)]
PCI: rcar: Move the inbound index check
Since the 'idx' variable value is stored across multiple calls to
rcar_pcie_inbound_ranges() function, and the 'idx' value is used to
index registers which are written, subsequent calls might cause
the 'idx' value to be high enough to trigger writes into nonexistent
registers.
Fix this by moving the 'idx' value check to the beginning of the loop.
Signed-off-by: Marek Vasut <marek.vasut+renesas@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: linux-renesas-soc@vger.kernel.org
Andrew Murray [Thu, 5 Sep 2019 15:05:28 +0000 (16:05 +0100)]
PCI: rcar: Remove unnecessary header include (../pci.h)
Remove unnecessary header include (../pci.h) since it doesn't
provide any needed symbols.
Signed-off-by: Andrew Murray <andrew.murray@arm.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Xiaowei Bao [Mon, 2 Sep 2019 03:43:19 +0000 (11:43 +0800)]
PCI: layerscape: Add LS1028a support
Add support for the LS1028a PCIe controller.
Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com>
Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <Andrew.Murray@arm.com>
Acked-by: Minghuan Lian <minghuan.Lian@nxp.com>