platform/kernel/linux-rpi.git
3 years ago  misc: lattice-ecp3-config: Fix task hung when firmware load failed
Wei Yongjun [Tue, 28 Dec 2021 12:55:22 +0000 (12:55 +0000)]
misc: lattice-ecp3-config: Fix task hung when firmware load failed

When the firmware load fails, the kernel reports a hung task as follows:

INFO: task xrun:5191 blocked for more than 147 seconds.
      Tainted: G        W         5.16.0-rc5-next-20211220+ #11
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:xrun            state:D stack:    0 pid: 5191 ppid:   270 flags:0x00000004
Call Trace:
 __schedule+0xc12/0x4b50 kernel/sched/core.c:4986
 schedule+0xd7/0x260 kernel/sched/core.c:6369 (discriminator 1)
 schedule_timeout+0x7aa/0xa80 kernel/time/timer.c:1857
 wait_for_completion+0x181/0x290 kernel/sched/completion.c:85
 lattice_ecp3_remove+0x32/0x40 drivers/misc/lattice-ecp3-config.c:221
 spi_remove+0x72/0xb0 drivers/spi/spi.c:409

lattice_ecp3_remove() waits for a completion signalled by the firmware
loading code, but when the load fails, firmware_load() never signals it.
This causes device removal to hang. Fix it by signalling the completion
even if the load fails.

Fixes: 781551df57c7 ("misc: Add Lattice ECP3 FPGA configuration via SPI")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Link: https://lore.kernel.org/r/20211228125522.3122284-1-weiyongjun1@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
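
The lattice-ecp3-config fix above can be illustrated with a small userspace model. This is only a sketch: `completion_model`, `fpga_model` and the `*_model` functions are hypothetical stand-ins, not the kernel's `struct completion` API or the driver's real code. It shows the pattern the commit message describes, where every exit path of the firmware callback signals the completion so that the remove path cannot wait forever.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct completion (hypothetical). */
struct completion_model {
	bool done;
};

void complete_model(struct completion_model *c)
{
	c->done = true;
}

/* Hypothetical driver state: only the completion matters here. */
struct fpga_model {
	struct completion_model fw_loaded;
};

/*
 * Model of the firmware callback. fw == NULL stands for a failed load.
 * Both the error path and the success path reach the "out" label, so the
 * waiter is always released.
 */
int firmware_load_model(const void *fw, struct fpga_model *dev)
{
	int ret = 0;

	if (fw == NULL) {
		ret = -ENOENT;
		goto out;
	}
	/* ... the FPGA would be programmed here ... */
out:
	complete_model(&dev->fw_loaded);
	return ret;
}

/* remove() can only finish once the completion has been signalled. */
bool remove_would_return(const struct fpga_model *dev)
{
	return dev->fw_loaded.done;
}
```

Calling `firmware_load_model(NULL, &dev)` returns an error and still marks the completion as done, which is the behaviour the fix introduces.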
3 years ago  Merge tag 'phy-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux...
Greg Kroah-Hartman [Thu, 30 Dec 2021 13:02:16 +0000 (14:02 +0100)]
Merge tag 'phy-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy into char-misc-next

Vinod writes:

phy-for-5.17

  - New support:
        - Qualcomm eDP PHY driver
        - Qualcomm SM8450 UFS, USB2, USB3, PCIe0 and PCIe1 phy support
        - Lan966x ethernet serdes PHY driver
        - Support for uniphier NXI & Pro4 SoC
        - Qualcomm SM6350 USB2 support
        - Amlogic Meson8 HDMI TX PHY driver
        - Rockchip rk3568 usb2 support
        - Intel Thunder Bay eMMC PHY driver
        - Freescale IMX8 PCIe phy driver

  - Updates:
        - Cadence Sierra driver updates for multilink configurations
        - Bcm usb2 updates for Phy reg space

* tag 'phy-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (72 commits)
  phy: cadence: Sierra: Add support for derived reference clock output
  dt-bindings: phy: cadence-sierra: Add clock ID for derived reference clock
  phy: cadence: Sierra: Add PCIe + QSGMII PHY multilink configuration
  phy: cadence: Sierra: Add support for PHY multilink configurations
  phy: cadence: Sierra: Fix to get correct parent for mux clocks
  phy: cadence: Sierra: Update single link PCIe register configuration
  phy: cadence: Sierra: Check PIPE mode PHY status to be ready for operation
  phy: cadence: Sierra: Check cmn_ready assertion during PHY power on
  phy: cadence: Sierra: Add PHY PCS common register configurations
  phy: cadence: Sierra: Rename some regmap variables to be in sync with Sierra documentation
  phy: cadence: Sierra: Add support to get SSC type from device tree
  dt-bindings: phy: cadence-sierra: Add binding to specify SSC mode
  dt-bindings: phy: cadence-torrent: Rename SSC macros to use generic names
  phy: cadence: Sierra: Prepare driver to add support for multilink configurations
  phy: cadence: Sierra: Use of_device_get_match_data() to get driver data
  phy: mediatek: Fix missing check in mtk_mipi_tx_probe
  phy: uniphier-usb3ss: fix unintended writing zeros to PHY register
  phy: phy-mtk-tphy: use new io helpers to access register
  phy: phy-mtk-xsphy: use new io helpers to access register
  phy: mediatek: add helpers to update bits of registers
  ...

3 years ago  Merge tag 'soundwire-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul...
Greg Kroah-Hartman [Thu, 30 Dec 2021 13:00:44 +0000 (14:00 +0100)]
Merge tag 'soundwire-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire into char-misc-next

Vinod writes:

soundwire updates for 5.17-rc1

 - Remove redundant version number read in qcom driver

* tag 'soundwire-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire:
  soundwire: qcom: remove redundant version number read

3 years ago  Merge tag 'iio-for-5.17b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23...
Greg Kroah-Hartman [Wed, 29 Dec 2021 18:29:20 +0000 (19:29 +0100)]
Merge tag 'iio-for-5.17b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

2nd set of new device support etc for IIO in the 5.17 cycle.

A small additional set of things that just missed the previous
pull request and have mostly been through plenty of review before the
holiday period began (or are trivial).  I've not taken some other series
on the list to allow for more eyes after the holiday period.

New device support
* adi,admv1013
  - New driver for this wideband microwave upconverter including dt-bindings
    and some device specific ABI due to the need to describe phase calibrations
    of a differential channel on both i and q phases. Previously we could
    do differential or i/q but not both on the same channel. The driver
    ABI uses a workaround for core support which will do until we know if
    this is a common requirement for which a more generic solution is
    needed.

MAINTAINERS:
* Add Haibo Chen as a maintainer for various NXP SoC ADCs.

Minor cleanup:
* sunrise_co2
  - Make sure an uninitialized value isn't used to set *val in read_raw().
    Not a real bug, but a compiler or reviewer can't tell that based
    on what they can see locally.

* tag 'iio-for-5.17b' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio:
  iio: chemical: sunrise_co2: set val parameter only on success
  dt-bindings:iio:adc: update the maintainer of vf610-adc
  MAINTAINERS: add imx7d/imx6sx/imx6ul/imx8qxp and vf610 adc maintainer
  Documentation:ABI:testing:admv1013: add ABI docs
  dt-bindings: iio: frequency: add admv1013 doc
  iio: frequency: admv1013: add support for ADMV1013

3 years ago  cxl: use default_groups in kobj_type
Greg Kroah-Hartman [Tue, 28 Dec 2021 13:13:50 +0000 (14:13 +0100)]
cxl: use default_groups in kobj_type

There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the cxl code to use default_groups field which has been the
preferred way since aa30f47cf666 ("kobject: Add support for default
attribute groups to kobj_type") so that we can soon get rid of the
obsolete default_attrs field.

Cc: Frederic Barrat <fbarrat@linux.ibm.com>
Cc: Andrew Donnellan <ajd@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20211228131350.249532-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
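
The default_attrs to default_groups conversion described above can be sketched in userspace C. The struct definitions and the `ATTRIBUTE_GROUPS_MODEL()` macro below are simplified stand-ins written for this example, not the kernel headers, and the `example_*` attributes are hypothetical. The sketch only shows the shape of the change: the attribute array is wrapped in a group, and the kobj_type points at a NULL-terminated list of groups.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real headers. */
struct attribute { const char *name; };
struct attribute_group { struct attribute **attrs; };
struct kobj_type { const struct attribute_group **default_groups; };

/* Stand-in for the kernel's ATTRIBUTE_GROUPS() helper macro. */
#define ATTRIBUTE_GROUPS_MODEL(_name)					\
	static const struct attribute_group _name##_group = {		\
		.attrs = _name##_attrs,					\
	};								\
	static const struct attribute_group *_name##_groups[] = {	\
		&_name##_group, NULL,					\
	}

/* Hypothetical attributes, for illustration only. */
static struct attribute example_name_attr = { .name = "name" };
static struct attribute example_size_attr = { .name = "size" };

static struct attribute *example_attrs[] = {
	&example_name_attr, &example_size_attr, NULL,
};
ATTRIBUTE_GROUPS_MODEL(example);

/* Before: .default_attrs = example_attrs. After the conversion: */
struct kobj_type example_ktype = {
	.default_groups = example_groups,
};

/* Count the files that would be created by walking every group. */
size_t count_default_files(const struct kobj_type *kt)
{
	size_t n = 0;

	for (const struct attribute_group **g = kt->default_groups; g && *g; g++)
		for (struct attribute **a = (*g)->attrs; a && *a; a++)
			n++;
	return n;
}
```

The same attribute array is reused unchanged; only the indirection through a group list is new, which is why the conversion is mechanical.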
3 years ago  UIO: use default_groups in kobj_type
Greg Kroah-Hartman [Tue, 28 Dec 2021 13:13:19 +0000 (14:13 +0100)]
UIO: use default_groups in kobj_type

There are currently 2 ways to create a set of sysfs files for a
kobj_type, through the default_attrs field, and the default_groups
field.  Move the UIO code to use default_groups field which has been the
preferred way since aa30f47cf666 ("kobject: Add support for default
attribute groups to kobj_type") so that we can soon get rid of the
obsolete default_attrs field.

Link: https://lore.kernel.org/r/20211228131319.249324-1-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago  iio: chemical: sunrise_co2: set val parameter only on success
Tom Rix [Fri, 24 Dec 2021 15:08:33 +0000 (07:08 -0800)]
iio: chemical: sunrise_co2: set val parameter only on success

Clang static analysis reports this representative warning

sunrise_co2.c:410:9: warning: Assigned value is garbage or undefined
  *val = value;
       ^ ~~~~~

The earlier call to sunrise_read_word() can fail without setting
value.  So defer setting val until we know the read was successful.

Fixes: c397894e24f1 ("iio: chemical: Add Senseair Sunrise 006-0-007 driver")
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20211224150833.3278236-1-trix@redhat.com
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
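
A minimal userspace sketch of the sunrise_co2 fix described above, assuming a reader function that leaves its output untouched on failure. `read_word_model()` and `read_raw_model()` are hypothetical names, and the value 412 is an arbitrary placeholder; the point is only that the caller's output is written after the read has succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for sunrise_read_word(): on failure it returns a
 * negative errno and does not write to *value.
 */
int read_word_model(bool fail, uint16_t *value)
{
	if (fail)
		return -EIO;
	*value = 412;	/* arbitrary placeholder reading */
	return 0;
}

int read_raw_model(bool fail, int *val)
{
	uint16_t value;	/* not initialized, as in the reported warning */
	int ret;

	ret = read_word_model(fail, &value);
	if (ret)
		return ret;	/* *val is left untouched on failure */

	*val = value;		/* only reached when value was written */
	return 0;
}
```

With this ordering a failed read returns the error and the caller's `*val` keeps whatever it held before the call.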
3 years ago  phy: cadence: Sierra: Add support for derived reference clock output
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:37 +0000 (07:01 +0100)]
phy: cadence: Sierra: Add support for derived reference clock output

Sierra has derived differential reference clock output which is sourced
after the spread spectrum generation has been added. Add support to drive
derived reference clock out of serdes. Model this derived clock as a
"clock" so that platforms using this can enable it.

Sierra Main LC VCO PLL divider 1 clock is programmed to output 100MHz
clock output.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-16-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  dt-bindings: phy: cadence-sierra: Add clock ID for derived reference clock
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:36 +0000 (07:01 +0100)]
dt-bindings: phy: cadence-sierra: Add clock ID for derived reference clock

Add clock ID for Sierra derived reference clock.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211223060137.9252-15-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Add PCIe + QSGMII PHY multilink configuration
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:35 +0000 (07:01 +0100)]
phy: cadence: Sierra: Add PCIe + QSGMII PHY multilink configuration

Add register sequences for PCIe + QSGMII PHY multilink configuration.
PHY configuration for multi-link operation is done in two steps.
For example, consider a 4-lane PHY with PCIe using 2 lanes and QSGMII
using the other 2 lanes. Sierra PHY has 2 PLLs, viz. PLLLC and PLLLC1.
So in this case, PLLLC is used for PCIe and PLLLC1 is used for QSGMII.

PHY is configured in two steps as described below.

[1] For first step, the register values are selected as
    [TYPE_PCIE][TYPE_QSGMII][ssc].
    This will configure PHY registers associated for PCIe involving PLLLC
    registers and registers for first 2 lanes of PHY.
[2] In second step, the register values are selected as
    [TYPE_QSGMII][TYPE_PCIE][ssc].
    This will configure PHY registers associated for QSGMII involving
    PLLLC1 registers and registers for other 2 lanes of PHY.

This completes the PHY configuration for multilink operation.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-14-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
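
The two-step selection described in the multilink commit above can be sketched as a table lookup with swapped indices. This is a userspace illustration only: the enum names, `reg_seq_model` and the table contents are assumptions made for the example, and the strings stand in for real register sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative names; the table entries are placeholders, not registers. */
enum phy_type_model { TYPE_PCIE, TYPE_QSGMII, TYPE_MAX };
enum ssc_mode_model { NO_SSC, EXTERNAL_SSC, INTERNAL_SSC, SSC_MAX };

struct reg_seq_model {
	const char *pll;	/* PLL programmed by this sequence */
	const char *lanes;	/* lanes programmed by this sequence */
};

/* Indexed as [type of this link][type of the other link][ssc]. */
static const struct reg_seq_model multilink_regs[TYPE_MAX][TYPE_MAX][SSC_MAX] = {
	[TYPE_PCIE][TYPE_QSGMII] = {
		[NO_SSC]       = { "PLLLC", "first 2 lanes" },
		[EXTERNAL_SSC] = { "PLLLC", "first 2 lanes" },
		[INTERNAL_SSC] = { "PLLLC", "first 2 lanes" },
	},
	[TYPE_QSGMII][TYPE_PCIE] = {
		[NO_SSC]       = { "PLLLC1", "other 2 lanes" },
		[EXTERNAL_SSC] = { "PLLLC1", "other 2 lanes" },
		[INTERNAL_SSC] = { "PLLLC1", "other 2 lanes" },
	},
};

/* Returns NULL for combinations that have no table entry. */
const struct reg_seq_model *select_regs(enum phy_type_model this_link,
					enum phy_type_model other_link,
					enum ssc_mode_model ssc)
{
	const struct reg_seq_model *seq =
		&multilink_regs[this_link][other_link][ssc];

	return seq->pll ? seq : NULL;
}

/* Step 1 uses [link0][link1][ssc]; step 2 swaps the first two indices. */
int configure_multilink(enum phy_type_model link0, enum phy_type_model link1,
			enum ssc_mode_model ssc,
			const struct reg_seq_model *steps[2])
{
	steps[0] = select_regs(link0, link1, ssc);
	steps[1] = select_regs(link1, link0, ssc);
	return (steps[0] && steps[1]) ? 0 : -1;
}
```

Keying the table on both link types lets each step pick the sequence for its own PLL and lanes while still knowing which protocol shares the PHY.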
3 years ago  phy: cadence: Sierra: Add support for PHY multilink configurations
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:34 +0000 (07:01 +0100)]
phy: cadence: Sierra: Add support for PHY multilink configurations

Add support for multilink configuration of Sierra PHY. Currently, a
maximum of two links is supported.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-13-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Fix to get correct parent for mux clocks
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:33 +0000 (07:01 +0100)]
phy: cadence: Sierra: Fix to get correct parent for mux clocks

Fix get_parent() callback to return the correct index of the parent for
PLL_CMNLC1 clock. Add a separate table of register values corresponding
to the parent index for PLL_CMNLC1. Update set_parent() callback
accordingly.

Fixes: 28081b72859f ("phy: cadence: Sierra: Model PLL_CMNLC and PLL_CMNLC1 as clocks (mux clocks)")
Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-12-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Update single link PCIe register configuration
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:32 +0000 (07:01 +0100)]
phy: cadence: Sierra: Update single link PCIe register configuration

Add single link PCIe register configurations for no SSC and internal
SSC. Also, add missing PMA lane registers for external SSC.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-11-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Check PIPE mode PHY status to be ready for operation
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:31 +0000 (07:01 +0100)]
phy: cadence: Sierra: Check PIPE mode PHY status to be ready for operation

PIPE phy status is used to communicate the completion of several PHY
functions. Check if PHY is ready for operation while configured for
PIPE mode during startup.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-10-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Check cmn_ready assertion during PHY power on
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:30 +0000 (07:01 +0100)]
phy: cadence: Sierra: Check cmn_ready assertion during PHY power on

Check if PMA cmn_ready is set indicating the startup process is complete.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-9-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Add PHY PCS common register configurations
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:29 +0000 (07:01 +0100)]
phy: cadence: Sierra: Add PHY PCS common register configurations

Add PHY PCS common register configuration sequences for single link.
Update single link PCIe register sequence accordingly.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-8-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Rename some regmap variables to be in sync with Sierra document...
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:28 +0000 (07:01 +0100)]
phy: cadence: Sierra: Rename some regmap variables to be in sync with Sierra documentation

No functional change. Rename some regmap variables as mentioned in Sierra
register description documentation.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-7-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Add support to get SSC type from device tree
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:27 +0000 (07:01 +0100)]
phy: cadence: Sierra: Add support to get SSC type from device tree

Add support to get SSC type from DT.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-6-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  dt-bindings: phy: cadence-sierra: Add binding to specify SSC mode
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:26 +0000 (07:01 +0100)]
dt-bindings: phy: cadence-sierra: Add binding to specify SSC mode

Add binding to specify Spread Spectrum Clocking mode used.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211223060137.9252-5-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  dt-bindings: phy: cadence-torrent: Rename SSC macros to use generic names
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:25 +0000 (07:01 +0100)]
dt-bindings: phy: cadence-torrent: Rename SSC macros to use generic names

Rename SSC macros to use generic names instead of PHY specific names,
so that they can be used to specify SSC modes for both Torrent and
Sierra. Renaming the macros should not affect anything, as they are
not being used in any DTS file yet.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Acked-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20211223060137.9252-4-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Prepare driver to add support for multilink configurations
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:24 +0000 (07:01 +0100)]
phy: cadence: Sierra: Prepare driver to add support for multilink configurations

Sierra driver currently supports single link configurations only. Prepare
driver to support multilink multiprotocol configurations along with
different SSC modes.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-3-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: cadence: Sierra: Use of_device_get_match_data() to get driver data
Swapnil Jakhade [Thu, 23 Dec 2021 06:01:23 +0000 (07:01 +0100)]
phy: cadence: Sierra: Use of_device_get_match_data() to get driver data

Use of_device_get_match_data() to get driver data instead of boilerplate
code.

Signed-off-by: Swapnil Jakhade <sjakhade@cadence.com>
Reviewed-by: Aswath Govindraju <a-govindraju@ti.com>
Link: https://lore.kernel.org/r/20211223060137.9252-2-sjakhade@cadence.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  phy: mediatek: Fix missing check in mtk_mipi_tx_probe
Miaoqian Lin [Fri, 24 Dec 2021 08:21:03 +0000 (08:21 +0000)]
phy: mediatek: Fix missing check in mtk_mipi_tx_probe

The of_device_get_match_data() function may return NULL.
Add check to prevent potential null dereference.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Link: https://lore.kernel.org/r/20211224082103.7658-1-linmq006@gmail.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
3 years ago  mei: cleanup status before client dma setup call
Alexander Usyskin [Thu, 23 Dec 2021 09:47:05 +0000 (11:47 +0200)]
mei: cleanup status before client dma setup call

The upper layer may retry the call to mei_cl_dma_alloc_and_map(). In
that case the client status may be non-zero after the previous call,
and the wait condition will be true immediately.
Set cl->status to zero to allow waiting for an actual result
from the firmware.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20211223094705.204624-2-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
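
The mei status cleanup above can be modelled in a few lines of userspace C. This is a sketch under assumptions: `cl_model` is a hypothetical stand-in for the client structure, and the wait condition "mapped or status is non-zero" is assumed from the commit text rather than copied from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the client state used by the DMA setup call. */
struct cl_model {
	int status;		/* result reported for the last request */
	bool dma_mapped;	/* set once the firmware confirms the mapping */
};

/* Assumed wait condition: stop waiting on success or on any status. */
bool wait_condition(const struct cl_model *cl)
{
	return cl->dma_mapped || cl->status != 0;
}

/*
 * Start a new request. Returns true if the wait would finish immediately,
 * i.e. before the firmware has had a chance to answer.
 */
bool dma_setup_begin(struct cl_model *cl, bool reset_status)
{
	if (reset_status)
		cl->status = 0;	/* drop the result of a previous attempt */
	return wait_condition(cl);
}
```

With a stale status left over from a failed attempt, a retry without the reset sees the condition as already true; with the reset it waits for a fresh answer.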
3 years ago  mei: add POWERING_DOWN into device state print
Alexander Usyskin [Thu, 23 Dec 2021 09:47:04 +0000 (11:47 +0200)]
mei: add POWERING_DOWN into device state print

The POWERING_DOWN state string was missing from
the device states list, add it.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Link: https://lore.kernel.org/r/20211223094705.204624-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago  Merge tag 'misc-habanalabs-next-2021-12-27' of https://git.kernel.org/pub/scm/linux...
Greg Kroah-Hartman [Mon, 27 Dec 2021 09:28:13 +0000 (10:28 +0100)]
Merge tag 'misc-habanalabs-next-2021-12-27' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-next

Oded writes:

This tag contains habanalabs driver changes for v5.17:

- Support reset-during-reset. In case the f/w notifies the driver
  that the f/w is going to reset the device, the driver should
  support that even if it is in the middle of doing another
  reset

- Support events from f/w that arrive during device resets.
  These events would be ignored which is bad as critical errors
  would not be reported and treated by the driver.

- Don't kill processes that hold the control device open during
  hard-reset of the device. The control device operations can't
  crash if done during hard-reset. And usually, only monitoring
  applications are using the control device, so killing them
  defies their purpose.

- Fix handling of hwmon nodes when working with legacy f/w

- Change the compute context pointer to be boolean. This pointer
  was abused by multiple code paths that wanted fast access to
  the compute context structure.

- Add uapi to fetch historical errors. This is necessary as errors
  sometimes result in hard-reset where the user application is
  being terminated.

- Optimize GAUDI's MMU cache invalidation.

- Add support for loading the latest f/w.

- Add uapi to fetch HBM replacement and pending rows information.

- Multiple bug fixes to the reset code.

- Multiple bug fixes for Multi-CS ioctl code.

- Multiple bug fixes for wait-for-interrupt ioctl code.

- Many small bug fixes and cleanups.

* tag 'misc-habanalabs-next-2021-12-27' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (70 commits)
  habanalabs: support hard-reset scheduling during soft-reset
  habanalabs: add a lock to protect multiple reset variables
  habanalabs: refactor reset information variables
  habanalabs: handle skip multi-CS if handling not done
  habanalabs: add CPU-CP packet for engine core ASID cfg
  habanalabs: replace some -ENOTTY with -EINVAL
  habanalabs: fix comments according to kernel-doc
  habanalabs: fix endianness when reading cpld version
  habanalabs: change wait_for_interrupt implementation
  habanalabs: prevent wait if CS in multi-CS list completed
  habanalabs: modify cpu boot status error print
  habanalabs: clean MMU headers definitions
  habanalabs: expose soft reset sysfs nodes for inference ASIC
  habanalabs: sysfs support for two infineon versions
  habanalabs: keep control device alive during hard reset
  habanalabs: fix hwmon handling for legacy f/w
  habanalabs: add current PI value to cpu packets
  habanalabs: remove in_debug check in device open
  habanalabs: return correct clock throttling period
  habanalabs: wait again for multi-CS if no CS completed
  ...

3 years ago  Merge tag 'extcon-next-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git...
Greg Kroah-Hartman [Mon, 27 Dec 2021 09:27:01 +0000 (10:27 +0100)]
Merge tag 'extcon-next-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon into char-misc-next

Chanwoo writes:

Update extcon next for v5.17

Detailed description for this pull request:
1. Remove duplicate code in extcon_set_state_sync() in extcon core
2. Fix non-kernel-doc comment for extcon-usb-gpio.c

* tag 'extcon-next-for-5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/extcon:
  extcon: Deduplicate code in extcon_set_state_sync()
  extcon: usb-gpio: fix a non-kernel-doc comment

3 years ago  Merge tag 'icc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov...
Greg Kroah-Hartman [Mon, 27 Dec 2021 09:23:35 +0000 (10:23 +0100)]
Merge tag 'icc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next

Georgi writes:

interconnect changes for 5.17

Here are the interconnect changes for the 5.17-rc1 merge window
consisting of new drivers, minor changes and fixes.

New drivers:
 - New driver for MSM8996 platforms
 - New driver for SC7280 EPSS L3 hardware
 - New driver for QCM2290 platforms
 - New driver for SM8450 platforms

Driver changes:
 - dt-bindings: interconnect: Combine SDM660 bindings into RPM schema
 - icc-rpm: Add support for bus power domain
 - icc-rpm: Use NOC_QOS_MODE_INVALID for qos_mode check
 - icc-rpm: Define ICC device type
 - icc-rpm: Add QNOC type QoS support
 - icc-rpm: Support child NoC device probe
 - icc-rpm: Prevent integer overflow in rate
 - icc-rpmh: Add BCMs to commit list in pre_aggregate

Signed-off-by: Georgi Djakov <djakov@kernel.org>
* tag 'icc-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
  interconnect: qcom: Add QCM2290 driver support
  dt-bindings: interconnect: Add Qualcomm QCM2290 NoC support
  interconnect: icc-rpm: Support child NoC device probe
  interconnect: icc-rpm: Add QNOC type QoS support
  interconnect: icc-rpm: Define ICC device type
  interconnect: qcom: Add SM8450 interconnect provider driver
  dt-bindings: interconnect: Add Qualcomm SM8450 DT bindings
  interconnect: qcom: rpm: Prevent integer overflow in rate
  interconnect: icc-rpm: Use NOC_QOS_MODE_INVALID for qos_mode check
  interconnect: qcom: icc-rpmh: Add BCMs to commit list in pre_aggregate
  interconnect: qcom: Add MSM8996 interconnect provider driver
  dt-bindings: interconnect: Add Qualcomm MSM8996 DT bindings
  interconnect: icc-rpm: Add support for bus power domain
  dt-bindings: interconnect: Combine SDM660 bindings into RPM schema
  interconnect: qcom: Add EPSS L3 support on SC7280
  dt-bindings: interconnect: Add EPSS L3 DT binding on SC7280

3 years ago  habanalabs: support hard-reset scheduling during soft-reset
Ofir Bitton [Tue, 23 Nov 2021 14:34:28 +0000 (16:34 +0200)]
habanalabs: support hard-reset scheduling during soft-reset

As hard-reset can be requested during soft-reset, driver must allow
it or else critical events received during soft-reset will be
ignored.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: add a lock to protect multiple reset variables
Ofir Bitton [Tue, 23 Nov 2021 13:15:22 +0000 (15:15 +0200)]
habanalabs: add a lock to protect multiple reset variables

Atomic operations during reset are replaced by a spinlock in order
to have the ability to protect more than a single variable.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: refactor reset information variables
Ofir Bitton [Tue, 23 Nov 2021 13:15:22 +0000 (15:15 +0200)]
habanalabs: refactor reset information variables

Unify variables related to device reset, which will help us to
add some new reset functionality in future patches.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: handle skip multi-CS if handling not done
Ohad Sharabi [Mon, 20 Dec 2021 11:30:35 +0000 (13:30 +0200)]
habanalabs: handle skip multi-CS if handling not done

This patch fixes an issue in which multi-CS times out although the CS
in the list has actually completed.

Example scenario (the two threads are marked WAIT for the thread that
handles wait_for_multi_cs and CMPL for the thread that signals
completion for both CS and multi-CS):
1. Submit CS with sequence X
2. [WAIT]: call wait_for_multi_cs with single CS X
3. [CMPL]: CS X invokes complete_all for both CS and multi-CS
           (multi_cs_completion_done still false)
4. [WAIT]: enter poll_fences, reinit the completion and find the CS
           as completed when asking on the fence, but because
           multi_cs_done is still false it returns that no CS
           actually completed
5. [CMPL]: set multi_cs_handling_done as true
6. [WAIT]: wait for completion, but there is no CS left to wake the
           wait context, hence it waits until timeout

Solution: if a CS is detected as completed in poll_fences but
          multi_cs_done is still false, invoke complete_all on the
          multi-CS completion so that it will not go to sleep in
          wait_for_completion but rather will have a "second chance"
          to wait for multi_cs_completion_done.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: add CPU-CP packet for engine core ASID cfg
Tomer Tayar [Thu, 16 Dec 2021 14:31:18 +0000 (16:31 +0200)]
habanalabs: add CPU-CP packet for engine core ASID cfg

In some cases the driver cannot configure ASID of some engines due to
the security level of the relevant registers.
For this a new CPU-CP packet is introduced, which will allow the driver
to ask the F/W to do this configuration instead.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: replace some -ENOTTY with -EINVAL
Oded Gabbay [Sun, 19 Dec 2021 14:06:59 +0000 (16:06 +0200)]
habanalabs: replace some -ENOTTY with -EINVAL

-ENOTTY is returned in case of an error in the ioctl arguments
themselves, such as a function that doesn't exist.

In all other cases, where the error is in the arguments of the custom
data structures that we define that are passed in the various ioctls,
we need to return -EINVAL.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
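
The error-code rule in the commit above can be sketched as a small dispatcher. The command numbers, `wait_args_model` and the flag mask are hypothetical and exist only for this example; the sketch shows an unknown ioctl function yielding -ENOTTY while a bad field inside the driver-defined argument structure yields -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers and argument layout. */
enum { CMD_INFO, CMD_WAIT, CMD_COUNT };

struct wait_args_model {
	unsigned int flags;
};

#define WAIT_FLAGS_VALID 0x3u	/* hypothetical mask of accepted flags */

int ioctl_model(unsigned int nr, const struct wait_args_model *args)
{
	/* The requested function does not exist: error in the ioctl itself. */
	if (nr >= CMD_COUNT)
		return -ENOTTY;

	/* The function exists, but its custom arguments are invalid. */
	if (nr == CMD_WAIT &&
	    (args == NULL || (args->flags & ~WAIT_FLAGS_VALID)))
		return -EINVAL;

	return 0;
}
```

Keeping the two codes apart lets user space distinguish "this driver has no such call" from "the call exists but was given bad data".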
3 years ago  habanalabs: fix comments according to kernel-doc
Ofir Bitton [Sun, 19 Dec 2021 09:38:01 +0000 (11:38 +0200)]
habanalabs: fix comments according to kernel-doc

Fix missing fields, descriptions not according to kernel-doc style.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago  habanalabs: fix endianness when reading cpld version
Ofir Bitton [Wed, 15 Dec 2021 12:48:27 +0000 (14:48 +0200)]
habanalabs: fix endianness when reading cpld version

Current sysfs implementation does not take endianness into
consideration when dumping the cpld version.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
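
A userspace sketch of the kind of conversion the cpld version fix above calls for. It assumes the version field is stored little-endian, which the commit message does not state, and `le32_decode()` and `format_cpld_version()` are hypothetical helpers, not the driver's functions. Decoding byte by byte gives the same result on little-endian and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode a 32-bit little-endian value independently of host byte order. */
uint32_t le32_decode(const uint8_t b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}

/* Format the decoded version the way a sysfs "show" routine might. */
int format_cpld_version(char *buf, size_t len, const uint8_t raw[4])
{
	return snprintf(buf, len, "0x%08x\n", (unsigned int)le32_decode(raw));
}
```

For the raw bytes 04 03 02 01 the helper produces the text `0x01020304`, whatever the endianness of the machine running it.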
3 years ago  habanalabs: change wait_for_interrupt implementation
farah kassabri [Tue, 2 Nov 2021 09:34:18 +0000 (11:34 +0200)]
habanalabs: change wait_for_interrupt implementation

Currently the cq counters are allocated in userspace memory,
and mapped by the driver to the device address space.

A new requirement, part of a future API related to this one, is that
cq counters be allocated in kernel memory.

We leverage the existing cb_create API with KERNEL_MAPPED flag set to
allocate this memory.

That way we gain two things:
1. The memory cannot be freed while in use since it's protected
by refcount in driver.

2. No need to wake up the user thread upon each interrupt from CQ,
because the kernel has direct access to the counter. Therefore,
it can make comparison with the target value in the interrupt
handler and wake up the user thread only if the counter reaches the
target value. This replaces waking the thread up to copy the counter
value from user space and then sleeping again if the target value
wasn't reached.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
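
The second benefit listed in the wait_for_interrupt commit above can be sketched as follows. `irq_waiter_model` and `irq_handler_model()` are hypothetical userspace stand-ins: because the counter lives in memory the handler can read directly, the comparison happens in the handler and the waiter is woken only once the target is reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record of one thread waiting on a CQ counter. */
struct irq_waiter_model {
	const uint64_t *cq_counter;	/* counter in kernel-accessible memory */
	uint64_t target;		/* value the user asked to wait for */
	bool woken;			/* set when the waiter is released */
};

/*
 * Called on every interrupt from the CQ. The waiter is woken only when
 * the counter has reached the target; otherwise it keeps sleeping.
 */
bool irq_handler_model(struct irq_waiter_model *w)
{
	if (*w->cq_counter >= w->target)
		w->woken = true;
	return w->woken;
}
```

Interrupts that arrive before the counter reaches the target leave the waiter asleep, which avoids the wake, copy and sleep round trip described in the commit message.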
3 years ago  habanalabs: prevent wait if CS in multi-CS list completed
Ohad Sharabi [Tue, 7 Dec 2021 12:30:20 +0000 (14:30 +0200)]
habanalabs: prevent wait if CS in multi-CS list completed

By the original design we assumed that if we "miss" multi CS completion
it is of no severe consequence as we'll just call wait_for_multi_cs
again.

Sequence of events for such a scenario:
1. user submits CS with sequence N
2. user calls wait for multi-CS with only CS #N in the list
3. the multi CS call starts with a poll of the CSs but finds that none
   completed (as CS #N has not completed yet)
4. now, multi CS #N completes but the multi CS CTX was not yet created
   for the above multi-CS. so, the attempt to complete multi-CS fails
   (as no multi CS CTX exists)
5. wait_for_multi_cs call now does init_wait_multi_cs_completion (and
   for this creates the multi-CS CTX)
6. wait_for_multi_cs waits on completion but will not get one as CS #N
   already completed

To fix the issue we initialize the multi-CS CTX prior to polling the
fences.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: modify cpu boot status error print
Ofir Bitton [Mon, 13 Dec 2021 13:43:06 +0000 (15:43 +0200)]
habanalabs: modify cpu boot status error print

As BTL can be replaced by ROM we should modify relevant error print.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: clean MMU headers definitions
Ohad Sharabi [Wed, 8 Dec 2021 07:06:03 +0000 (09:06 +0200)]
habanalabs: clean MMU headers definitions

During the MMU development the MMU header files were left with unclean
definitions:

- MMU "version specific" definitions that were left in the mmu_general
  file
- unused definitions

This patch attempts, where possible, to keep definitions that can serve
multiple MMU versions (but that are not tightly bound with specific MMU
arch) in the mmu_general header file (e.g. different definitions for
number of HOPs).

Otherwise, move MMU version specific definitions (e.g. HOPs masks and
shifts) to the specific MMU version file.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: expose soft reset sysfs nodes for inference ASIC
Ofir Bitton [Sun, 12 Dec 2021 15:46:21 +0000 (17:46 +0200)]
habanalabs: expose soft reset sysfs nodes for inference ASIC

As we allow soft-reset to be performed only on inference devices,
having the sysfs nodes may cause confusion. Hence, we remove those
nodes on training ASICs.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: sysfs support for two infineon versions
Ofir Bitton [Wed, 8 Dec 2021 13:00:10 +0000 (15:00 +0200)]
habanalabs: sysfs support for two infineon versions

Currently sysfs supports dumping a single infineon version. In
future ASICs we will have two infineon versions.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: keep control device alive during hard reset
Dani Liberman [Wed, 8 Dec 2021 07:52:03 +0000 (09:52 +0200)]
habanalabs: keep control device alive during hard reset

Need to allow the user to retrieve data during reset and afterwards
without the need to reopen the device.
This is done by separating the user processes list into two lists:
1. fpriv_list which contains list of user processes that opened
   the device (currently only one).
2. fpriv_ctrl_list which contains list of user processes that opened
   the control device. The processes in this list shall not be
   killed during reset, only when the device is suddenly removed from
   PCI chain.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix hwmon handling for legacy f/w
Oded Gabbay [Sun, 12 Dec 2021 14:40:24 +0000 (16:40 +0200)]
habanalabs: fix hwmon handling for legacy f/w

In legacy f/w that uses the old hwmon.h file, the values of the hwmon
enums differ from the values that are in newer kernels (5.6
and above).

Therefore, to support working with those f/w, we need to do some
fixup before registering with the hwmon subsystem and also when
calling the functions that communicate with the f/w to retrieve
sensors information.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add current PI value to cpu packets
Ofir Bitton [Wed, 8 Dec 2021 19:46:29 +0000 (21:46 +0200)]
habanalabs: add current PI value to cpu packets

In order to increase cpucp messaging reliability we will add
the current PI value to the descriptor sent to F/W.
F/W will wait for the PI value as an indication of a valid packet.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: remove in_debug check in device open
Oded Gabbay [Wed, 8 Dec 2021 14:25:07 +0000 (16:25 +0200)]
habanalabs: remove in_debug check in device open

The driver supports only a single user anyway, so there is no point
in checking whether we are in_debug state when a user tries to open
the device, because if we are in_debug, it means a user is already
using the device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: return correct clock throttling period
Ofir Bitton [Tue, 7 Dec 2021 09:20:46 +0000 (11:20 +0200)]
habanalabs: return correct clock throttling period

Current clock throttling period returned from driver was wrong due
to wrong time comparison.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: wait again for multi-CS if no CS completed
Ohad Sharabi [Wed, 1 Dec 2021 08:52:27 +0000 (10:52 +0200)]
habanalabs: wait again for multi-CS if no CS completed

The original multi-CS design assumption that stream masters are used
exclusively (i.e. multi-CS with set of stream master QIDs will not get
completed by CS not from the multi-CS set) is inaccurate.

Thus multi-CS behavior is now modified not to treat such case as an
error.

Instead, if we have multi-CS completion but we detect that no CS from
the list is actually completed we will do another multi-CS wait (with
modified timeout).
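
The message does not say how the timeout is modified. One plausible
reading, sketched here purely as an assumption, is that the second wait
only gets the time left from the original timeout. The function name and
the microsecond unit are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time left for a repeated wait, assuming the "modified timeout" is the
 * original timeout minus the time already spent waiting.
 */
static uint64_t remaining_timeout_us(uint64_t timeout_us, uint64_t elapsed_us)
{
	return elapsed_us >= timeout_us ? 0 : timeout_us - elapsed_us;
}

/* Usage example with values that can be checked by hand. */
static void remaining_timeout_examples(void)
{
	assert(remaining_timeout_us(1000, 400) == 600);
	assert(remaining_timeout_us(1000, 1000) == 0);
	assert(remaining_timeout_us(1000, 1500) == 0); /* never underflows */
}
```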

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: remove compute context pointer
Oded Gabbay [Tue, 30 Nov 2021 21:08:21 +0000 (23:08 +0200)]
habanalabs: remove compute context pointer

It was an error to save the compute context's pointer in the device
structure, as it allowed its use without proper ref-cnt.

Change the variable to a flag that only indicates whether there is
an active compute context. Code that needs the pointer will now
be forced to use proper internal APIs to get the pointer.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add helper to get compute context
Oded Gabbay [Tue, 30 Nov 2021 21:02:21 +0000 (23:02 +0200)]
habanalabs: add helper to get compute context

There are multiple places where the code needs to get the context's
pointer and increment its ref cnt. This is the proper way instead
of using the compute context pointer in the device structure.
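
A rough user-space sketch of the get/put pattern such a helper
provides. The types and names are hypothetical, a plain integer stands
in for the kernel's reference counter, and the locking a real driver
would need is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OPEN_SKETCH 4

struct ctx_sketch { int refcount; };
struct open_file_sketch { struct ctx_sketch *ctx; };
struct device_sketch { struct open_file_sketch *open_files[MAX_OPEN_SKETCH]; };

/*
 * Look the context up through the open-file entries and take a
 * reference before handing it out. Returns NULL if no context exists.
 */
static struct ctx_sketch *get_compute_ctx_sketch(struct device_sketch *dev)
{
	size_t i;

	assert(dev != NULL);

	for (i = 0; i < MAX_OPEN_SKETCH; i++) {
		struct open_file_sketch *f = dev->open_files[i];

		if (f != NULL && f->ctx != NULL) {
			f->ctx->refcount++;
			return f->ctx;
		}
	}
	return NULL;
}

/* Every successful get must be paired with a put. */
static void put_ctx_sketch(struct ctx_sketch *ctx)
{
	assert(ctx != NULL && ctx->refcount > 0);
	ctx->refcount--;
}
```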

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix etr asid configuration
Oded Gabbay [Tue, 30 Nov 2021 20:32:13 +0000 (22:32 +0200)]
habanalabs: fix etr asid configuration

Pass the user's context pointer into the etr configuration function
to extract its ASID.

Using the compute_ctx pointer is an error as it is just an indication
of whether a user has opened the compute device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: save ctx inside encaps signal
Oded Gabbay [Tue, 30 Nov 2021 13:28:23 +0000 (15:28 +0200)]
habanalabs: save ctx inside encaps signal

Compute context pointer in hdev shouldn't be used for fetching the
context's pointer.

If an object needs the context's pointer, it should get it while
incrementing its kref, and when the object is released, put it.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: remove redundant check on ctx_fini
Oded Gabbay [Tue, 30 Nov 2021 15:04:13 +0000 (17:04 +0200)]
habanalabs: remove redundant check on ctx_fini

The driver supports only a single context. Therefore, no need to check
if the user context that is closed is the compute context. The user
context, if exists, is always the compute context.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: free signal handle on failure
Oded Gabbay [Tue, 30 Nov 2021 12:54:53 +0000 (14:54 +0200)]
habanalabs: free signal handle on failure

Fix a bug where in case of failure to allocate idr, the handle's
memory wasn't freed as part of the error handling code.
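
The shape of such a fix, sketched in user space under assumptions:
malloc()/free() stand in for the kernel allocators, alloc_id_sketch()
is a hypothetical stand-in for the ID allocation that can fail, and the
live_handles counter exists only to make the leak observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct handle_sketch { int id; };

static int live_handles; /* allocations that have not been freed yet */

/* Hypothetical ID allocator: returns a negative value on failure. */
static int alloc_id_sketch(bool fail)
{
	return fail ? -1 : 7;
}

static struct handle_sketch *handle_create_sketch(bool fail_id)
{
	struct handle_sketch *h = malloc(sizeof(*h));

	if (h == NULL)
		return NULL;
	live_handles++;

	h->id = alloc_id_sketch(fail_id);
	if (h->id < 0) {
		/* The error path must release the handle's memory too. */
		free(h);
		live_handles--;
		return NULL;
	}
	return h;
}

static void handle_destroy_sketch(struct handle_sketch *h)
{
	assert(h != NULL);
	free(h);
	live_handles--;
}
```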

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add missing kernel-doc comments for hl_device fields
Tomer Tayar [Mon, 29 Nov 2021 09:20:27 +0000 (11:20 +0200)]
habanalabs: add missing kernel-doc comments for hl_device fields

Add missing kernel-doc comments for the "last_error" and
"stream_master_qid_arr" fields of the "hl_device" structure.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: pass reset flags to reset thread
Tomer Tayar [Mon, 22 Nov 2021 10:29:22 +0000 (12:29 +0200)]
habanalabs: pass reset flags to reset thread

The reset flags used by the reset thread are currently a mix of
hard-coded values and a specific flag which is passed from the context
that initiates the reset.
To make it easier to pass more flags in future from this context to the
reset thread, modify it to pass all the original reset flags to the
thread.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: enable access to info ioctl during hard reset
Dani Liberman [Mon, 22 Nov 2021 19:47:30 +0000 (21:47 +0200)]
habanalabs: enable access to info ioctl during hard reset

Because info ioctl is used to retrieve data, some of its opcodes may be
used during hard reset.
Other ioctls should be blocked while device is not operational.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add SOB information to signal submission uAPI
Dani Liberman [Tue, 9 Nov 2021 09:33:28 +0000 (11:33 +0200)]
habanalabs: add SOB information to signal submission uAPI

For debug purpose, add SOB address and SOB initial counter value
before current submission to uAPI output.

Using SOB address and initial counter, user can calculate how much of
the submission has been completed.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: skip read fw errors if dynamic descriptor invalid
Ohad Sharabi [Mon, 22 Nov 2021 10:23:51 +0000 (12:23 +0200)]
habanalabs: skip read fw errors if dynamic descriptor invalid

Reporting FW errors involves reading of the error registers.

In case we have a corrupted FW descriptor we cannot do that since the
dynamic scratchpad is potentially corrupted as well and may cause a
kernel crash when attempting to access a corrupted register offset.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: handle events during soft-reset
Ofir Bitton [Sun, 21 Nov 2021 14:02:32 +0000 (16:02 +0200)]
habanalabs: handle events during soft-reset

Driver should handle events during soft-reset as F/W is not
going through reset and it keeps sending events towards host.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: change misleading IRQ warning during reset
Ofir Bitton [Thu, 18 Nov 2021 06:46:15 +0000 (08:46 +0200)]
habanalabs: change misleading IRQ warning during reset

Currently we dump the physical IRQ line index in host if an event
is received during reset. This ID is confusing as it means nothing
to the user.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add power information type to POWER_GET packet
Tomer Tayar [Thu, 18 Nov 2021 08:44:05 +0000 (10:44 +0200)]
habanalabs: add power information type to POWER_GET packet

In new f/w versions, it is required to explicitly indicate the power
information type when querying the F/W for power info.
When getting the current power level it should be set to power_input.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add more info ioctls support during reset
Ofir Bitton [Mon, 15 Nov 2021 17:36:25 +0000 (19:36 +0200)]
habanalabs: add more info ioctls support during reset

Some info ioctls can be served even if the device is disabled or
in reset. Hence, we enable more info ioctls during reset, as these
ioctls do not require any H/W nor F/W communication.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix race condition in multi CS completion
Dani Liberman [Wed, 17 Nov 2021 07:59:10 +0000 (09:59 +0200)]
habanalabs: fix race condition in multi CS completion

Race example scenario:
1. User has 2 threads that wait on multi CS:
   - thread_0 waits on QID 0 and uses multi CS context 0.
   - thread_1 waits on QID 1 and uses multi CS context 1.
2. thread_1 gets completion and releases multi CS context 1.
3. CS related to multi CS of thread_0 starts executing
   complete_multi_cs function, the first iteration of the loop
   completes the multi CS of thread_0, hence multi CS context 0
   is released.
4. thread_1 waits on QID 1 and uses multi CS context 0.
5. thread_0 waits on QID 0 and uses multi CS context 1.
6. The second iteration of the loop (from step 3) starts, which
   means, start checking multi CS context 1:
   - multi CS context 1 is being used by thread_0 waiting on QID 0.
   - The fence of the CS (still CS from step 3) has QID map the same
     as the multi CS context 1.
   - multi CS context 1 (thread_0) gets completion on a CS that already
     triggered thread_0 (with multi CS context 0) and is no longer
     being waited on.

Fixed by exiting the loop in complete_multi_cs after getting completion.
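
A simplified sketch of the loop change, with hypothetical names. It
does not reproduce the race itself (contexts are not re-assigned
mid-loop here); it only shows that with the early exit one CS can
complete at most one multi CS context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct multi_cs_ctx_sketch {
	bool used;                /* a thread is waiting on this context */
	uint32_t qid_map;         /* QIDs the waiting thread cares about */
	unsigned int completions; /* completions delivered to this context */
};

/* Returns how many contexts were completed by this single CS. */
static unsigned int complete_multi_cs_sketch(struct multi_cs_ctx_sketch *ctxs,
					     size_t n, uint32_t cs_qid_map,
					     bool stop_at_first)
{
	unsigned int completed = 0;
	size_t i;

	assert(ctxs != NULL);

	for (i = 0; i < n; i++) {
		if (!ctxs[i].used || !(ctxs[i].qid_map & cs_qid_map))
			continue;

		ctxs[i].completions++;
		completed++;

		if (stop_at_first)
			break; /* the fix: leave the loop after one completion */
	}
	return completed;
}
```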

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: move device boot warnings to the correct location
Ofir Bitton [Tue, 16 Nov 2021 13:48:42 +0000 (15:48 +0200)]
habanalabs: move device boot warnings to the correct location

As device boot warnings clear the indication from the error mask,
they must be located together before the unknown error validation.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: return EPERM on non hard-reset
Oded Gabbay [Tue, 16 Nov 2021 08:30:26 +0000 (10:30 +0200)]
habanalabs/gaudi: return EPERM on non hard-reset

GAUDI supports only hard-reset. Therefore, this function should
return an error of operation not permitted.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: rename late init after reset function
Oded Gabbay [Tue, 16 Nov 2021 07:59:32 +0000 (09:59 +0200)]
habanalabs: rename late init after reset function

The ASIC-specific soft_reset_late_init() is now called after either
soft-reset or reset-upon-device-release. Therefore, it needs a more
appropriate name.

No need to split it to two functions, as an ASIC either supports
soft-reset or reset-upon-device-release.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix soft reset accounting
Oded Gabbay [Tue, 16 Nov 2021 07:46:02 +0000 (09:46 +0200)]
habanalabs: fix soft reset accounting

Reset upon device release is not a soft-reset from user/system point
of view. As such, we shouldn't count that reset in the statistics we
gather and expose to the monitoring applications.

We also shouldn't print soft-reset when doing the reset upon device
release.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: Move frequency change thread to goya_late_init
Rajaravi Krishna Katta [Thu, 5 Aug 2021 07:24:16 +0000 (10:24 +0300)]
habanalabs: Move frequency change thread to goya_late_init

Changing the frequency automatically is only done in Goya. In future
ASICs this is done inside the firmware. Therefore, move the common code
into the Goya specific files.

Main changes as part of the commit are:
    1. The thread for setting frequency is moved from device_late_init
       to goya_late_init
    2. hl_device_set_frequency is removed from hl_device_open as it is
       not relevant for other ASICs and for Goya it is taken care of by
       the thread
    3. hl_device_set_frequency is renamed as goya_set_frequency

Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: abort reset on invalid request
Oded Gabbay [Mon, 15 Nov 2021 15:13:37 +0000 (17:13 +0200)]
habanalabs: abort reset on invalid request

Hard-reset is mutually exclusive with reset-on-device-release.
Therefore, if such a request arrives at the reset function, abort
the reset and return an error to the caller.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix possible deadlock in cache invl failure
Ofir Bitton [Tue, 9 Nov 2021 11:12:38 +0000 (13:12 +0200)]
habanalabs: fix possible deadlock in cache invl failure

Currently there is a deadlock in driver in scenarios where MMU
cache invalidation fails. The issue is basically device reset
being performed without releasing the MMU mutex.
The solution is to skip device reset as it is not necessary.
In addition we introduce a slight code refactor that prints the
invalidation error from a single location.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: skip PLL freq fetch
Ohad Sharabi [Sun, 14 Nov 2021 07:37:33 +0000 (09:37 +0200)]
habanalabs: skip PLL freq fetch

Getting the used PLL index with which to send the CPUCP packet relies on
the CPUCP info packet.

In case CPU queues are not enabled getting the PLL index will issue an
error and in some ASICs will also fail the driver load.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: align debugfs documentation to alphabetical order
Tomer Tayar [Sun, 14 Nov 2021 07:29:48 +0000 (09:29 +0200)]
habanalabs: align debugfs documentation to alphabetical order

Move an entry in the debugfs documentation to align with the
alphabetical order which is kept in this file.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: prevent false heartbeat message
Oded Gabbay [Sat, 13 Nov 2021 15:58:43 +0000 (17:58 +0200)]
habanalabs: prevent false heartbeat message

If a device reset has started, there is a chance that the heartbeat
function will fail because the device is disabled at the beginning
of the reset function.

In that case, we don't want the error message to appear in the log.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add support for fetching historic errors
Dani Liberman [Wed, 3 Nov 2021 08:09:59 +0000 (10:09 +0200)]
habanalabs: add support for fetching historic errors

A new uAPI is added for debug purposes of the user-space to retrieve
error-related data from the previous session (before device reset was
performed).

Information is filled when a razwi or CS timeout happens and can
contain one of the following:

1. Retrieve timestamp of last time the device was opened and razwi or
   CS timeout happened.
2. Retrieve information about last CS timeout.
3. Retrieve information about last razwi error.

This information doesn't contain user data, so no danger of data
leakage between users.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: handle device TPM boot error as warning
Ofir Bitton [Wed, 10 Nov 2021 09:41:43 +0000 (11:41 +0200)]
habanalabs: handle device TPM boot error as warning

As a TPM error indication is not fatal, the driver should dump a warning
and continue booting.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: debugfs support for larger I2C transactions
Ofir Bitton [Tue, 12 Oct 2021 17:52:46 +0000 (20:52 +0300)]
habanalabs: debugfs support for larger I2C transactions

I2C debugfs support is limited to 1 byte. We extend functionality
to more than 1 byte by using one of the pad fields as a length.
No backward compatibility issues as new F/W versions will treat 0
length as a 1 byte length transaction.
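
The compatibility rule can be written as a small helper. The function
name and the 8-bit width of the length field are assumptions made for
illustration; only the "0 means 1 byte" rule comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective transaction length for a descriptor whose former pad field
 * now carries a length. Old senders leave the field at 0, which is
 * treated as a 1 byte transaction.
 */
static uint8_t i2c_effective_len_sketch(uint8_t len_field)
{
	return len_field == 0 ? 1 : len_field;
}

/* Usage example with values that can be checked by hand. */
static void i2c_len_examples(void)
{
	assert(i2c_effective_len_sketch(0) == 1); /* legacy 1 byte request */
	assert(i2c_effective_len_sketch(1) == 1);
	assert(i2c_effective_len_sketch(4) == 4);
}
```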

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: make hdev creation code more readable
Oded Gabbay [Thu, 4 Nov 2021 07:48:22 +0000 (09:48 +0200)]
habanalabs: make hdev creation code more readable

Divide the code into 3 different parts:
- Copy kernel parameters
- Setting device behavior per asic
- Fixup of various device parameters according to the device behavior.

In addition, remove non-relevant code for upstream (simulator support).

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add new opcodes for INFO IOCTL
farah kassabri [Sun, 24 Oct 2021 16:02:32 +0000 (19:02 +0300)]
habanalabs: add new opcodes for INFO IOCTL

Add implementation for new opcodes in the INFO IOCTL:
1. Retrieve the replaced DRAM rows from f/w.
2. Retrieve the pending DRAM rows from f/w.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: refactor wait-for-user-interrupt function
Bharat Jauhari [Wed, 8 Sep 2021 14:32:54 +0000 (17:32 +0300)]
habanalabs: refactor wait-for-user-interrupt function

Refactor the wait-for-user-interrupt routine to make it more
generic for re-use for other user exposed h/w interfaces in future
ASICs.

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: Fix collective wait bug
farah kassabri [Wed, 3 Nov 2021 11:15:55 +0000 (13:15 +0200)]
habanalabs/gaudi: Fix collective wait bug

In Signaling-From-Graph case, the driver didn't set the hw_sob pointer
at the right place, which is needed for the cs completion
check before sending all the master/slaves jobs to the device.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: expand clock throttling information uAPI
Ofir Bitton [Mon, 25 Oct 2021 06:47:04 +0000 (09:47 +0300)]
habanalabs: expand clock throttling information uAPI

In addition to the clock throttling reason, user should be able
to obtain also the start time and the duration of the throttling
event.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: change wait for interrupt timeout to 64 bit
Dani Liberman [Thu, 14 Oct 2021 19:38:41 +0000 (22:38 +0300)]
habanalabs: change wait for interrupt timeout to 64 bit

In order to increase maximum wait-for-interrupt timeout, change it
to 64 bit variable. This wait is used only by newer ASICs, so no
problem in changing this interface at this time.

Signed-off-by: Dani Liberman <dliberman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: rename reset flags
Bharat Jauhari [Thu, 16 Sep 2021 11:00:38 +0000 (14:00 +0300)]
habanalabs: rename reset flags

Rename reset flags for better readability as compared to
HL_RESET_CAUSE* enum shared with the f/w.

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add dedicated message towards f/w to set power
Rajaravi Krishna Katta [Tue, 26 Oct 2021 11:11:06 +0000 (14:11 +0300)]
habanalabs: add dedicated message towards f/w to set power

CPUCP_PACKET_POWER_GET packet type was used for both
hl_get_power() and hl_set_power().

To align with other sensor functions hl_set_power()
should use CPUCP_PACKET_POWER_SET.

This packet will only be used with newer ASICs, so need to add
a compatibility flag to the asic properties to indicate whether to use
this packet or the GET packet.

Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: handle abort scenario for user interrupt
Bharat Jauhari [Wed, 8 Sep 2021 14:16:51 +0000 (17:16 +0300)]
habanalabs: handle abort scenario for user interrupt

In case of device reset, the driver does a force trigger on all waiting
users to release them from waiting. However, the driver does not handle
error scenario while waiting.

hl_interrupt_wait_ioctl() now exits the wait in case of an error with
abort status.

Signed-off-by: Bharat Jauhari <bjauhari@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: don't clear previous f/w indications
Ohad Sharabi [Tue, 26 Oct 2021 07:42:24 +0000 (10:42 +0300)]
habanalabs: don't clear previous f/w indications

Once we read indication of whether f/w is doing the reset, we don't
want to clear it, until the next time we read this indication.

Otherwise, we might be in a state of wrong indication.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: use variable poll interval for fw loading
Ohad Sharabi [Tue, 26 Oct 2021 12:33:23 +0000 (15:33 +0300)]
habanalabs: use variable poll interval for fw loading

Using a variable poll interval for fw loading allows us to support
much slower environments (emulation) while changing only a single
line in the code, instead of choosing a different interval in each
function that polls.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: adding indication of boot fit loaded
Ohad Sharabi [Thu, 21 Oct 2021 08:24:41 +0000 (11:24 +0300)]
habanalabs: adding indication of boot fit loaded

Up until now the driver stored an indication of whether Linux was loaded
on the device CPU. This was needed in order to coordinate some tasks that
are performed by Linux.

In future ASICs, many of those tasks will be performed by the boot
fit, so now we need the same indication of boot fit load status.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: partly skip cache flush when in PMMU map flow
Yuri Nudelman [Mon, 25 Oct 2021 08:37:25 +0000 (11:37 +0300)]
habanalabs: partly skip cache flush when in PMMU map flow

The PCI MMU cache is two layered. The upper layer, memcache, uses cache
lines, the bottom layer doesn't.

Hence, after PMMU map operation we have to invalidate memcache, to avoid
the situation where the new entry is already in the cache due to its
cache line being fully in the cache.

However, we do not have to invalidate the lower cache, and here we can
optimize, since cache invalidation is time consuming.

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add enum mmu_op_flags
Yuri Nudelman [Thu, 30 Sep 2021 12:52:25 +0000 (15:52 +0300)]
habanalabs: add enum mmu_op_flags

The enum vm_type was abused, used once as a value (indicating
memory type for map) and once as a flag (for cache invalidation).
This makes it hard to add new values and still keep it meaningful,
hence it is better to split into one enum for values and one for flags.
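
A sketch of the split, with made-up enumerator names (the real members
of both enums are not listed in this message). Values are mutually
exclusive and compared with "==", while flags are single bits that can
be OR-ed together and tested with "&".

```c
#include <assert.h>
#include <stdbool.h>

/* Values: exactly one of these describes a mapping. */
enum vm_type_sketch {
	VM_TYPE_USERPTR_SKETCH = 0,
	VM_TYPE_PHYS_PACK_SKETCH = 1
};

/* Flags: independent bits, new ones can be added without renumbering. */
enum mmu_op_flags_sketch {
	MMU_OP_USERPTR_SKETCH = 1 << 0,
	MMU_OP_PHYS_PACK_SKETCH = 1 << 1,
	MMU_OP_SKIP_LOW_CACHE_INV_SKETCH = 1 << 2
};

/* True if the given single-bit flag is set in the flags word. */
static bool mmu_op_has(unsigned int flags, unsigned int flag)
{
	assert(flag != 0);
	return (flags & flag) != 0;
}
```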

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: make last_mask an MMU property
Yuri Nudelman [Thu, 21 Oct 2021 12:08:51 +0000 (15:08 +0300)]
habanalabs: make last_mask an MMU property

Currently LAST_MASK is a global, but really it is specific to the MMU
implementation. We need this change for future ASICs.

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: wrong VA size calculation
Yuri Nudelman [Thu, 14 Oct 2021 07:33:27 +0000 (10:33 +0300)]
habanalabs: wrong VA size calculation

VA blocks are currently stored in an inconsistent way. Sometimes block
end is inclusive, sometimes exclusive. This leads to wrong size
calculations in certain cases, plus could lead to a segmentation fault
in case mapping process fails in the middle and we try to roll it back.
Need to make this consistent - start inclusive till end inclusive.

For example, the regions table may now look like this:
    0x0000 - 0x1fff : allocated
    0x2000 - 0x2fff : free
    0x3000 - 0x3fff : allocated
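
With both ends inclusive, the size of a block is end - start + 1. A
small sketch (the helper name is hypothetical) checked against the
table above:

```c
#include <assert.h>
#include <stdint.h>

/* Size of a VA block whose start and end addresses are both inclusive. */
static uint64_t va_block_size_sketch(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start + 1;
}

/* Usage example using the regions from the table. */
static void va_block_size_examples(void)
{
	assert(va_block_size_sketch(0x0000, 0x1fff) == 0x2000);
	assert(va_block_size_sketch(0x2000, 0x2fff) == 0x1000);
	assert(va_block_size_sketch(0x3000, 0x3fff) == 0x1000);
}
```

Treating the same end address as exclusive would give 0x1fff for the
first block, which is the kind of off-by-one the inconsistency causes.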

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: fix debugfs dma channel selection
Guy Zadicario [Tue, 12 Oct 2021 07:30:28 +0000 (10:30 +0300)]
habanalabs/gaudi: fix debugfs dma channel selection

Do not use a dma channel for debugfs requested transfer if its
QM is not idle.

Signed-off-by: Guy Zadicario <gzadicario@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: revise and document use of boot status flags
Ohad Sharabi [Sun, 17 Oct 2021 06:00:43 +0000 (09:00 +0300)]
habanalabs: revise and document use of boot status flags

The boot status flag "SRAM available" can be set by f/w Linux (in the
general case) or by f/w uboot (in some specific debug scenario) but
never by f/w preboot.

Hence, when polling the boot status flags in the preboot stage we do not
want to poll on "SRAM available".

The special case in which uboot sets this flag is when we are running
a special debug scenario without Linux. In this case, at some point during
the boot, the uboot relocates its code to the DRAM and then sets the
specified flag.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: print va_range in vm node debugfs
Yuri Nudelman [Thu, 14 Oct 2021 09:10:31 +0000 (12:10 +0300)]
habanalabs: print va_range in vm node debugfs

VA range info could assist in debugging VA allocation bugs.

Signed-off-by: Yuri Nudelman <ynudelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: recover from CPU WD event
Oded Gabbay [Thu, 21 Oct 2021 11:02:40 +0000 (14:02 +0300)]
habanalabs/gaudi: recover from CPU WD event

There are rare cases where the device CPU's watchdog has expired and as
a result, the watchdog reset has happened and the CPU will now move to
running its preboot f/w.

When that happens, the driver will only know that a heartbeat failure
occurred. As a result, the driver will send a message to the CPU's main
f/w asking it to reset the device, but because the CPU is now running
preboot, it won't respond and the re-initialization process will later
fail when trying to load the f/w.

The solution is to send the request to the preboot as well, only if the
reset was caused because of HB failure.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: modify wait for boot fit in dynamic FW load
Ohad Sharabi [Sun, 17 Oct 2021 05:40:28 +0000 (08:40 +0300)]
habanalabs: modify wait for boot fit in dynamic FW load

In the dynamic FW load protocol the boot status is updated to
"Ready to Boot" once uboot is active.

Polling on other boot status values is a residue of code duplication
from the static protocol and should be removed.

Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agoextcon: Deduplicate code in extcon_set_state_sync()
Alexander Stein [Tue, 23 Nov 2021 14:53:01 +0000 (15:53 +0100)]
extcon: Deduplicate code in extcon_set_state_sync()

Finding the cable index and checking for changed status is also done
in extcon_set_state(). So calling extcon_set_state_sync() will do these
checks twice. Remove them and use these checks from extcon_set_state().

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
3 years agoextcon: usb-gpio: fix a non-kernel-doc comment
Randy Dunlap [Mon, 15 Nov 2021 03:05:36 +0000 (19:05 -0800)]
extcon: usb-gpio: fix a non-kernel-doc comment

Do not use "/**" to begin a non-kernel-doc comment.
Fixes this build warning:

drivers/extcon/extcon-usb-gpio.c:23:
warning: expecting prototype for drivers/extcon/extcon-usb-gpio.c().

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>