platform/kernel/linux-starfive.git
3 years ago Merge tag 'icc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov...
Greg Kroah-Hartman [Fri, 4 Dec 2020 13:11:20 +0000 (14:11 +0100)]
Merge tag 'icc-5.11-rc1' of git://git./linux/kernel/git/djakov/icc into char-misc-next

Georgi writes:

interconnect changes for 5.11

Here are the interconnect changes for the 5.11-rc1 merge window,
consisting of a new driver and a cleanup.

Driver changes:
- New driver for Samsung Exynos SoCs
- Misc cleanups

Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
* tag 'icc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc:
  MAINTAINERS: Add entry for Samsung interconnect drivers
  interconnect: Add generic interconnect driver for Exynos SoCs
  interconnect: qcom: Simplify the vcd compare function

3 years ago fpga: fpga-mgr: altera-pr-ip: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:27 +0000 (11:51 -0800)]
fpga: fpga-mgr: altera-pr-ip: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-11-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: zynqmp: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:26 +0000 (11:51 -0800)]
fpga: fpga-mgr: zynqmp: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-10-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: xilinx-spi: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:25 +0000 (11:51 -0800)]
fpga: fpga-mgr: xilinx-spi: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-9-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: ts73xx: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:24 +0000 (11:51 -0800)]
fpga: fpga-mgr: ts73xx: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-8-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: socfpga: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:23 +0000 (11:51 -0800)]
fpga: fpga-mgr: socfpga: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-7-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: machxo2-spi: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:22 +0000 (11:51 -0800)]
fpga: fpga-mgr: machxo2-spi: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-6-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: ice40-spi: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:21 +0000 (11:51 -0800)]
fpga: fpga-mgr: ice40-spi: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-5-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: dfl-fme-mgr: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:20 +0000 (11:51 -0800)]
fpga: fpga-mgr: dfl-fme-mgr: Simplify registration

Simplify registration using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-4-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: altera-ps-spi: Simplify registration
Moritz Fischer [Sun, 15 Nov 2020 19:51:19 +0000 (11:51 -0800)]
fpga: fpga-mgr: altera-ps-spi: Simplify registration

Simplify registration by using new devm_fpga_mgr_register() API.

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-3-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years ago fpga: fpga-mgr: Add devm_fpga_mgr_register() API
Moritz Fischer [Sun, 15 Nov 2020 19:51:18 +0000 (11:51 -0800)]
fpga: fpga-mgr: Add devm_fpga_mgr_register() API

Add a devm_fpga_mgr_register() API that can be used to register an FPGA
Manager that was created using devm_fpga_mgr_create().

Introduce a struct fpga_mgr_devres that makes the devres
allocation a little bit more readable and gets reused for
devm_fpga_mgr_create() and devm_fpga_mgr_register().

Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Moritz Fischer <mdf@kernel.org>
Link: https://lore.kernel.org/r/20201115195127.284487-2-mdf@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
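
The idea behind a device-managed ("devm") registration call can be modelled in plain userspace C. The following is only a sketch under stated assumptions: struct device, dev_add_action(), managed_mgr_register() and dev_detach() are hypothetical stand-ins invented for illustration and are not the kernel's devres implementation or the real devm_fpga_mgr_register() signature. It shows why drivers converted to a managed call no longer need an explicit unregister in their remove path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of device-managed (devres-style) cleanup. */
#define MAX_ACTIONS 8

struct device {
	void (*release[MAX_ACTIONS])(void *data);
	void *data[MAX_ACTIONS];
	size_t count;
};

struct fpga_mgr {
	int registered;
};

/* Queue a cleanup action that runs when the device is detached. */
static int dev_add_action(struct device *dev, void (*fn)(void *), void *data)
{
	if (dev->count == MAX_ACTIONS)
		return -1;
	dev->release[dev->count] = fn;
	dev->data[dev->count] = data;
	dev->count++;
	return 0;
}

static void mgr_unregister(void *data)
{
	struct fpga_mgr *mgr = data;

	mgr->registered = 0;
}

/* Managed registration: register now, unregister automatically on detach. */
static int managed_mgr_register(struct device *dev, struct fpga_mgr *mgr)
{
	assert(dev != NULL && mgr != NULL);
	mgr->registered = 1;
	if (dev_add_action(dev, mgr_unregister, mgr) != 0) {
		mgr->registered = 0;	/* roll back if the action cannot be queued */
		return -1;
	}
	return 0;
}

/* Run the queued actions in reverse order, as happens on driver detach. */
static void dev_detach(struct device *dev)
{
	while (dev->count > 0) {
		dev->count--;
		dev->release[dev->count](dev->data[dev->count]);
	}
}
```

With this pattern the caller only registers; the matching teardown is tied to the lifetime of the device rather than to hand-written cleanup code.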
3 years ago MAINTAINERS: Add entry for Samsung interconnect drivers
Sylwester Nawrocki [Thu, 12 Nov 2020 14:09:29 +0000 (15:09 +0100)]
MAINTAINERS: Add entry for Samsung interconnect drivers

Add a maintainers entry for the Samsung SoC interconnect drivers. This
currently includes the Exynos generic interconnect driver.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org>
Link: https://lore.kernel.org/r/20201112140931.31139-4-s.nawrocki@samsung.com
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
3 years ago interconnect: Add generic interconnect driver for Exynos SoCs
Sylwester Nawrocki [Thu, 12 Nov 2020 14:09:28 +0000 (15:09 +0100)]
interconnect: Add generic interconnect driver for Exynos SoCs

This patch adds a generic interconnect driver for Exynos SoCs in order
to provide interconnect functionality for each "samsung,exynos-bus"
compatible device.

The SoC topology is a graph (or more specifically, a tree) and its
edges are described by specifying in the 'interconnects' property
the interconnect consumer path for each interconnect provider DT node.

Each bus is now an interconnect provider and an interconnect node as
well (cf. Documentation/interconnect/interconnect.rst), i.e. every bus
registers itself as a node. Node IDs are not hard coded but rather
assigned dynamically at runtime. This approach allows for using this
driver with various Exynos SoCs.

Frequencies requested via the interconnect API for a given node are
propagated to devfreq using dev_pm_qos_update_request(). Please note
that it is not an error when CONFIG_INTERCONNECT is 'n', in which
case all interconnect API functions are no-op.

The samsung,data-clk-ratio DT property is used to specify the ratio
of the interconnect bandwidth to the minimum data clock frequency
for each bus.

Due to unspecified relative probing order, -EPROBE_DEFER may be
propagated to ensure that the parent is probed before its children.

Reviewed-by: Chanwoo Choi <cw00.choi@samsung.com>
Tested-by: Chanwoo Choi <cw00.choi@samsung.com>
Acked-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Artur Świgoń <a.swigon@samsung.com>
Signed-off-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
Link: https://lore.kernel.org/r/20201112140931.31139-3-s.nawrocki@samsung.com
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
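
The role of the data clock ratio described above can be illustrated with a small arithmetic sketch. This is not the driver's code: min_bus_freq() is a hypothetical helper, and taking the larger of the average and peak bandwidth before dividing is an assumption made for the example. Units are deliberately left abstract.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn a bandwidth request into the minimum bus
 * clock frequency, given the ratio of bandwidth to data clock frequency.
 * Using max(avg_bw, peak_bw) as the input is an assumption of this sketch.
 */
static uint32_t min_bus_freq(uint32_t avg_bw, uint32_t peak_bw,
			     uint32_t data_clk_ratio)
{
	uint32_t bw = avg_bw > peak_bw ? avg_bw : peak_bw;

	assert(data_clk_ratio != 0);	/* a zero ratio would be a DT error */
	return bw / data_clk_ratio;
}
```

A frequency derived this way is what would then be handed to devfreq as a minimum-frequency constraint, as the commit message describes.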
3 years ago interconnect: qcom: Simplify the vcd compare function
Georgi Djakov [Tue, 13 Oct 2020 17:19:23 +0000 (20:19 +0300)]
interconnect: qcom: Simplify the vcd compare function

Let's simplify the cmp_vcd() function and replace the conditionals
with just a single statement, which also improves readability.

Reviewed-by: Mike Tipton <mdtipton@codeaurora.org>
Link: https://lore.kernel.org/r/20201013171923.7351-1-georgi.djakov@linaro.org
Signed-off-by: Georgi Djakov <georgi.djakov@linaro.org>
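
The kind of simplification described in the commit above can be sketched in userspace C with qsort(). The struct bcm type and the function names below are hypothetical and only model the idea of replacing a chain of conditionals with a single subtraction. The subtraction form is safe only when the compared values are small enough that the difference cannot overflow an int, which is assumed here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type; only the sort key matters for the sketch. */
struct bcm {
	int vcd;
};

/* Before: explicit conditionals. */
static int cmp_vcd_verbose(const void *a, const void *b)
{
	const struct bcm *x = a;
	const struct bcm *y = b;

	if (x->vcd < y->vcd)
		return -1;
	else if (x->vcd == y->vcd)
		return 0;
	else
		return 1;
}

/* After: a single statement (assumes the difference cannot overflow). */
static int cmp_vcd_simple(const void *a, const void *b)
{
	return ((const struct bcm *)a)->vcd - ((const struct bcm *)b)->vcd;
}

/* Sort an array of elements by ascending key using the short comparator. */
static void sort_by_vcd(struct bcm *v, size_t n)
{
	assert(v != NULL || n == 0);
	qsort(v, n, sizeof(*v), cmp_vcd_simple);
}
```

Both comparators agree on the sign of the result, which is all a sort routine relies on.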
3 years ago Merge tag 'misc-habanalabs-next-2020-11-30' of ssh://gitolite.kernel.org/pub/scm...
Greg Kroah-Hartman [Mon, 30 Nov 2020 14:38:21 +0000 (15:38 +0100)]
Merge tag 'misc-habanalabs-next-2020-11-30' of ssh://gitolite./linux/kernel/git/ogabbay/linux into char-misc-next

This tag contains habanalabs driver changes for v5.11-rc1:

- Add support for ability to perform collective stream sync. This is basically
  a synchronization between compute and network streams.

- Add initialization of NIC QMANs and security configuration. This is a
  pre-requisite for upstreaming the NIC ETH and RDMA code.

- Add option to scrub all internal memory (SRAM and DRAM) when the user
  closes the file-descriptor

- Support new firmware that provides enhanced device security. This includes
  many changes that basically amount to moving certain configurations to
  the firmware, and to no longer reading registers directly but instead
  receiving the information from the firmware. For example:
  - Retrieve HBM ECC error information
  - Retrieve PLL configuration
  - Configuration of internal credits, rate-limitation

- Support new firmware that performs the GAUDI device reset instead of the
  driver. The driver now asks the firmware to do it.

- Some changes were done as a pre-requisite for future ASICs support:
  - Add option to put the device's PCI MMU page tables on the host memory.
  - Support loading multiple types of firmware.
  - Add option for the user to inquire about the usage counter of a Command buffer.

- Support taking timestamp of Command Submission when it completes and
  providing it to the user.

- Change aggregate cs counters to atomic and fix the cs counters structure
  to support addition of new counters in the future

- Update email address and git repo of the driver in MAINTAINERS

- Many small bug fixes and improvements, such as:
  - Refactoring in MMU code to move code from ASIC-dependent files to
    common code
  - Minimize driver prints when no errors occur
  - Using enums, defines instead of hard-coded values
  - Refactoring of Command Submission flow to make it more readable now that
    we have multiple types of Command Submissions.

* tag 'misc-habanalabs-next-2020-11-30' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: (76 commits)
  habanalabs: Add CB IOCTL opcode to retrieve CB information
  habanalabs: Modify the cs_cnt of a CB to be atomic
  habanalabs: Add mask for CS type bits in CS flags
  habanalabs: change messages to debug level
  habanalabs: free host huge va_range if not used
  habanalabs/gaudi: handle reset when f/w is in preboot
  habanalabs: add missing counter update
  habanalabs: add ull to PLL masks
  habanalabs: add support for cs with timestamp
  habanalabs: indicate to user that a cs is gone
  habanalabs/gaudi: print ECC type field
  habanalabs: update firmware files
  habanalabs: gaudi_ctx_fini() can be static
  habanalabs: goya_reset_sob_group() can be static
  habanalabs: fetch pll frequency from firmware
  habanalabs: mmu map wrapper for sizes larger than a page
  habanalabs: print CS type when it is stuck
  habanalabs/gaudi: align to new FW reset scheme
  habanalabs: firmware returns 64bit argument
  habanalabs: fix MMU debugfs operations
  ...

3 years ago habanalabs: Add CB IOCTL opcode to retrieve CB information
Tomer Tayar [Wed, 2 Sep 2020 10:43:32 +0000 (13:43 +0300)]
habanalabs: Add CB IOCTL opcode to retrieve CB information

Add a new CB IOCTL opcode that enables a user to query about a CB and
get its usage count.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Modify the cs_cnt of a CB to be atomic
Tomer Tayar [Sun, 2 Aug 2020 19:51:31 +0000 (22:51 +0300)]
habanalabs: Modify the cs_cnt of a CB to be atomic

Modify the CS counter of a CB to be atomic, so no locking is required
when it is being modified or read.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
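
A lock-free counter of the kind described above can be sketched with C11 atomics. The kernel uses its own atomic_t API; the userspace version below uses <stdatomic.h> instead, and struct cb together with the cb_* helpers are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical command buffer with an atomic "in use by a CS" counter. */
struct cb {
	atomic_int cs_cnt;
};

static void cb_init(struct cb *cb)
{
	atomic_init(&cb->cs_cnt, 0);
}

/* A command submission starts using the buffer. */
static void cb_cs_get(struct cb *cb)
{
	atomic_fetch_add(&cb->cs_cnt, 1);
}

/* A command submission stops using the buffer. */
static void cb_cs_put(struct cb *cb)
{
	int prev = atomic_fetch_sub(&cb->cs_cnt, 1);

	assert(prev > 0);	/* a put without a matching get is a bug */
	(void)prev;
}

/* Read the counter without taking any lock. */
static int cb_cs_count(struct cb *cb)
{
	return atomic_load(&cb->cs_cnt);
}
```

Because each update is a single atomic read-modify-write, readers and writers do not need a shared lock just to touch the counter.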
3 years ago habanalabs: Add mask for CS type bits in CS flags
Tomer Tayar [Sun, 22 Nov 2020 09:02:50 +0000 (11:02 +0200)]
habanalabs: Add mask for CS type bits in CS flags

hl_cs_sanity_checks() extracts the CS type bits of the CS flags, by
masking out the non-type bits.
To save the need for updating the function whenever new bits for
non-type flags are added, add an explicit mask for the CS type bits.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
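
The difference between masking out the non-type bits and masking in the type bits can be shown with a short sketch. All flag names and values below are hypothetical and are not the driver's uapi definitions. The point is that an explicit type mask keeps working when a new non-type flag is introduced later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CS flag layout, invented for this example. */
#define CS_FLAG_FORCE_RESTORE	0x01u	/* non-type flag */
#define CS_FLAG_TYPE_SIGNAL	0x02u	/* type bit */
#define CS_FLAG_TYPE_WAIT	0x04u	/* type bit */
#define CS_FLAG_TIMESTAMP	0x20u	/* non-type flag added later */

/* Explicit mask of the type bits only. */
#define CS_TYPE_MASK	(CS_FLAG_TYPE_SIGNAL | CS_FLAG_TYPE_WAIT)

static_assert((CS_TYPE_MASK & (CS_FLAG_FORCE_RESTORE | CS_FLAG_TIMESTAMP)) == 0,
	      "type mask must not overlap non-type flags");

/* Robust: keep only the type bits. */
static uint32_t cs_type_bits(uint32_t flags)
{
	return flags & CS_TYPE_MASK;
}

/* Fragile: strip the non-type bits known when the code was written. */
static uint32_t cs_type_bits_fragile(uint32_t flags)
{
	return flags & ~CS_FLAG_FORCE_RESTORE;
}
```

The fragile variant silently lets CS_FLAG_TIMESTAMP through, so it would have to be edited every time a non-type flag is added.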
3 years ago habanalabs: change messages to debug level
Oded Gabbay [Fri, 27 Nov 2020 16:10:20 +0000 (18:10 +0200)]
habanalabs: change messages to debug level

Some messages should be changed to debug mode as we want to keep
minimal prints during normal operation of the device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: free host huge va_range if not used
Ofir Bitton [Thu, 26 Nov 2020 11:01:11 +0000 (13:01 +0200)]
habanalabs: free host huge va_range if not used

If huge range is not valid, driver uses the host range also for
huge page allocations, but driver never frees its allocation.
This introduces a memory leak every time a user closes its context.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs/gaudi: handle reset when f/w is in preboot
Oded Gabbay [Thu, 26 Nov 2020 16:11:05 +0000 (18:11 +0200)]
habanalabs/gaudi: handle reset when f/w is in preboot

Currently, if the f/w is in preboot/u-boot, it doesn't perform the new
reset mechanism. Therefore, the driver needs to reset the device.
To prevent reset of PCI_IF, the driver needs to first configure the
reset units.

If the security is enabled, the driver can't configure the reset units.
In that situation, don't reset the card.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: add missing counter update
Oded Gabbay [Wed, 25 Nov 2020 06:02:40 +0000 (08:02 +0200)]
habanalabs: add missing counter update

The global CS drop-on-reset counter wasn't updated together with
the context counter.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: add ull to PLL masks
Alon Mizrahi [Sun, 22 Nov 2020 20:09:52 +0000 (22:09 +0200)]
habanalabs: add ull to PLL masks

These defines are 64-bit defines so they need ull suffix.

Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
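
Why a 64-bit mask needs the ull suffix can be demonstrated with a small sketch. The macro names and the field position are hypothetical, and the example assumes a platform where unsigned int is 32 bits wide, which the static_assert checks.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit unsigned int");

/* Hypothetical 8-bit field occupying bits 28..35 of a 64-bit register. */
#define FIELD_SHIFT		28
#define FIELD_MASK_NO_SUFFIX	(0xFFu << FIELD_SHIFT)	/* computed in 32 bits */
#define FIELD_MASK_ULL		(0xFFull << FIELD_SHIFT)	/* computed in 64 bits */

/* Extract a field from a 64-bit register value using the given mask. */
static uint64_t extract_field(uint64_t reg, uint64_t mask)
{
	return (reg & mask) >> FIELD_SHIFT;
}
```

Without the suffix the shift is evaluated in 32-bit arithmetic, so the bits above bit 31 are dropped before the value is widened to 64 bits, and the upper half of the field is lost.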
3 years ago habanalabs: add support for cs with timestamp
Ofir Bitton [Tue, 10 Nov 2020 15:26:22 +0000 (17:26 +0200)]
habanalabs: add support for cs with timestamp

Add support for the user to request a timestamp upon
cs completion.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: indicate to user that a cs is gone
Ofir Bitton [Tue, 10 Nov 2020 14:30:53 +0000 (16:30 +0200)]
habanalabs: indicate to user that a cs is gone

We want to indicate to the user that a certain command submission
finished a long time ago and is no longer in the database.
This means no further information regarding this cs can be obtained.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs/gaudi: print ECC type field
Oded Gabbay [Sat, 21 Nov 2020 12:29:25 +0000 (14:29 +0200)]
habanalabs/gaudi: print ECC type field

We have the ECC type field from the firmware but the driver didn't
print it, so we need to add that field to the ECC print message.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: update firmware files
Oded Gabbay [Fri, 20 Nov 2020 19:39:09 +0000 (21:39 +0200)]
habanalabs: update firmware files

Update various firmware header files with new defines.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: gaudi_ctx_fini() can be static
kernel test robot [Thu, 19 Nov 2020 04:25:43 +0000 (12:25 +0800)]
habanalabs: gaudi_ctx_fini() can be static

Make a function in gaudi.c static

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: goya_reset_sob_group() can be static
kernel test robot [Thu, 19 Nov 2020 03:18:30 +0000 (11:18 +0800)]
habanalabs: goya_reset_sob_group() can be static

Make some functions static

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: fetch pll frequency from firmware
Alon Mizrahi [Tue, 17 Nov 2020 12:25:14 +0000 (14:25 +0200)]
habanalabs: fetch pll frequency from firmware

Once firmware security is enabled, driver must fetch pll frequencies
through the firmware message interface instead of reading the registers
directly.

Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: mmu map wrapper for sizes larger than a page
Ofir Bitton [Thu, 22 Oct 2020 12:13:10 +0000 (15:13 +0300)]
habanalabs: mmu map wrapper for sizes larger than a page

We introduce a new wrapper which allows us to mmu map any size
to any host va_range available. In addition, we remove duplicated
code from various places in the driver and use this new wrapper
instead.
This wrapper supports mapping only contiguous physical
memory blocks and will be used for mappings that are done to the
driver ASID.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: print CS type when it is stuck
Oded Gabbay [Mon, 16 Nov 2020 08:25:53 +0000 (10:25 +0200)]
habanalabs: print CS type when it is stuck

We have several types of command submissions and the user wants to know
which type of command submission has not finished in time when that
event occurs. This is very helpful for debug.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs/gaudi: align to new FW reset scheme
Ofir Bitton [Sun, 8 Nov 2020 10:59:04 +0000 (12:59 +0200)]
habanalabs/gaudi: align to new FW reset scheme

As part of the security effort in which FW will be handling
sensitive HW registers, hard reset flow will be done by FW
and will be triggered by driver.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: firmware returns 64bit argument
Alon Mizrahi [Tue, 10 Nov 2020 11:49:10 +0000 (13:49 +0200)]
habanalabs: firmware returns 64bit argument

The F/W message returns a 64bit value, but up until now we cast it to
a 32bit variable instead of receiving 64bit in the first place.

Signed-off-by: Alon Mizrahi <amizrahi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: fix MMU debugfs operations
Moti Haimovski [Tue, 27 Oct 2020 08:55:42 +0000 (10:55 +0200)]
habanalabs: fix MMU debugfs operations

After the MMU-code refactoring, the existing MMU debugfs operations
are no longer working so we need to fix them.
In addition, remove the duplicate code that was in the debugfs code
and use the already existing MMU-code.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: share a single ctx-mutex between all MMUs
Moti Haimovski [Tue, 27 Oct 2020 09:03:32 +0000 (11:03 +0200)]
habanalabs: share a single ctx-mutex between all MMUs

Multiple locks are usually a source of problems, which in the MMU
case can be avoided since it is relatively rare that both MMU
tables are updated at the same time.

Therefore, use a single shared lock instead of two separate ones.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: support reserving aligned va block
Ofir Bitton [Wed, 4 Nov 2020 13:18:55 +0000 (15:18 +0200)]
habanalabs: support reserving aligned va block

Add support for reserving va block with alignment different than
page size. This is a pre-requisite for allocations needed in future
ASICs

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
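
Reserving a block with an alignment larger than the page size boils down to rounding the candidate start address up before checking that the block still fits. The following is a minimal sketch under stated assumptions: align_up() and reserve_aligned() are hypothetical helpers, the alignment must be a power of two, the range end is inclusive, and address 0 is treated as "no block found".

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: find an aligned block of `size` bytes inside the
 * inclusive range [start, end]. Returns the block address, or 0 when no
 * such block exists (0 is assumed not to be a valid address here).
 */
static uint64_t reserve_aligned(uint64_t start, uint64_t end,
				uint64_t size, uint64_t align)
{
	uint64_t base = align_up(start, align);

	if (size == 0 || base < start || base > end)
		return 0;	/* empty request, wrap-around, or out of range */
	if (end - base < size - 1)
		return 0;	/* block would run past the end of the range */
	return base;
}
```

For example, a range starting at 0x1000 with an alignment of 0x8000 yields a block at 0x8000, provided the range is long enough to hold it.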
3 years ago habanalabs: add boot errors prints
Guy Nisan [Thu, 12 Nov 2020 15:52:52 +0000 (17:52 +0200)]
habanalabs: add boot errors prints

Add log prints for security and eFuse boot error bits

Signed-off-by: Guy Nisan <gnisan@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: print message with correct device
Oded Gabbay [Tue, 10 Nov 2020 20:03:43 +0000 (22:03 +0200)]
habanalabs: print message with correct device

During hard-reset, the driver rejects further IOCTL calls and prints
an error message. That error message should be printed with the correct
device instead of using only the control device.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs/gaudi: fetch HBM ecc info from FW
Ofir Bitton [Mon, 5 Oct 2020 10:44:59 +0000 (13:44 +0300)]
habanalabs/gaudi: fetch HBM ecc info from FW

Once FW security is enabled there is no access to HBM ecc registers,
so the driver needs to read the values from FW using a dedicated interface.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: fetch hard reset capability from FW
Ofir Bitton [Sun, 8 Nov 2020 11:10:09 +0000 (13:10 +0200)]
habanalabs: fetch hard reset capability from FW

Driver must fetch FW hard reset capability during boot time,
in order to skip the hard reset flow if necessary.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: move asic property to correct structure
Oded Gabbay [Mon, 9 Nov 2020 07:48:31 +0000 (09:48 +0200)]
habanalabs: move asic property to correct structure

Whether an ASIC has MMU towards its DRAM is an ASIC property, so
move it to the asic fixed properties structure.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: use host va range for internal pools
Ofir Bitton [Thu, 22 Oct 2020 12:04:10 +0000 (15:04 +0300)]
habanalabs: use host va range for internal pools

Instead of using a dedicated va range for each internal pool,
we introduce a new way for reserving a va block from an existing
va range. This is a more generic way of reserving va blocks for
future use.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: improve hard reset procedure
Ofir Bitton [Thu, 8 Oct 2020 07:27:42 +0000 (10:27 +0300)]
habanalabs: improve hard reset procedure

We want to handle the scenario in which the driver was not able
to kill all user processes due to many memory mappings.
We need to retry after some period while releasing the cores.
The devices will be unusable and "in-reset" status during that time.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Rename hw_queues_mirror to cs_mirror
Tomer Tayar [Fri, 30 Oct 2020 09:16:23 +0000 (11:16 +0200)]
habanalabs: Rename hw_queues_mirror to cs_mirror

Future command submission types might be submitted to HW not via the
QMAN queues path. However, it would be still required to have the TDR
mechanism for these CS, and thus the patch renames the TDR fields and
replaces the hw_queues_ prefix with cs_.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: refactor mmu va_range db structure
Ofir Bitton [Thu, 22 Oct 2020 08:05:55 +0000 (11:05 +0300)]
habanalabs: refactor mmu va_range db structure

Use an array of va_ranges instead of keeping each va_range separately.
We do this for better readability and in order to support access to
a specific range in a much more elegant manner.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
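
The refactoring described above, keeping the ranges in an array indexed by type instead of in separate fields, can be sketched as follows. The enum values, struct layouts and helper names are hypothetical and do not reproduce the driver's definitions. The sketch shows how an enum-indexed array lets one function serve every range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range types; the last entry is the array size. */
enum va_range_type {
	VA_RANGE_HOST,
	VA_RANGE_HOST_HUGE,
	VA_RANGE_DRAM,
	VA_RANGE_COUNT
};

struct va_range {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* inclusive */
};

/* One array instead of three separately named fields. */
struct ctx {
	struct va_range va_range[VA_RANGE_COUNT];
};

/* A single helper works for any range type. */
static int va_range_contains(const struct ctx *ctx, enum va_range_type type,
			     uint64_t addr)
{
	assert(type < VA_RANGE_COUNT);
	return addr >= ctx->va_range[type].start &&
	       addr <= ctx->va_range[type].end;
}

/* Iterating over all ranges becomes a plain loop; returns -1 if none match. */
static int va_range_find(const struct ctx *ctx, uint64_t addr)
{
	for (int i = 0; i < VA_RANGE_COUNT; i++)
		if (va_range_contains(ctx, (enum va_range_type)i, addr))
			return i;
	return -1;
}
```

With separate fields, each of these helpers would need one branch or one copy per range.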
3 years ago habanalabs: move HW dirty check to a proper location
Ofir Bitton [Mon, 19 Oct 2020 14:04:20 +0000 (17:04 +0300)]
habanalabs: move HW dirty check to a proper location

Driver must verify if HW is dirty before trying to fetch preboot
information. Hence, we move this validation to a prior stage of
the boot sequence.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: restore vm_pgoff after mmap
Oded Gabbay [Thu, 29 Oct 2020 16:38:31 +0000 (18:38 +0200)]
habanalabs: restore vm_pgoff after mmap

Due to using dma_mmap_coherent() to perform mmap of dma memory, we
had to clear the vm_pgoff field before calling that function.

However, that broke the userspace (profiler tool) as they relied
on searching the /proc/self/maps for these values to correctly
"disassemble" the topology recipe.

To re-enable that functionality, the driver can simply restore the
value of vm_pgoff before returning to userspace but after calling
dma_mmap_coherent().

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
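
The save, clear, call and restore sequence described above can be modelled in userspace. In the sketch below, struct vma is a hypothetical stand-in for the kernel's vm_area_struct, and stub_dma_mmap() is an invented stub that merely mimics a mapping call requiring a zero page offset; neither is the real kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's VMA structure. */
struct vma {
	unsigned long vm_pgoff;
};

/* Invented stub: succeeds only when the page offset has been cleared. */
static int stub_dma_mmap(struct vma *vma)
{
	return vma->vm_pgoff == 0 ? 0 : -1;
}

/* Clear the offset for the mapping call, then restore it for userspace. */
static int cb_mmap(struct vma *vma)
{
	unsigned long saved_pgoff;
	int rc;

	assert(vma != NULL);
	saved_pgoff = vma->vm_pgoff;
	vma->vm_pgoff = 0;		/* required by the mapping call */
	rc = stub_dma_mmap(vma);
	vma->vm_pgoff = saved_pgoff;	/* visible again after the call */
	return rc;
}
```

The restore happens regardless of the return code, so tools that inspect the offset afterwards see the original value in either case.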
3 years ago habanalabs: add 'needs reset' state in driver
Ofir Bitton [Mon, 5 Oct 2020 11:40:10 +0000 (14:40 +0300)]
habanalabs: add 'needs reset' state in driver

The new state indicates that the device should be reset in order
to regain functionality.
This unique state can occur if reset_on_lockup is disabled
and an actual lockup has occurred.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: fix hard reset print and comment
Omer Shpigelman [Sat, 31 Oct 2020 20:03:55 +0000 (22:03 +0200)]
habanalabs: fix hard reset print and comment

One of the first steps of a hard reset flow is to close all open user
contexts. This user process teardown might take some time due to long
cleanup in our driver or some other reason even before our cleanup flow.
Hence fix the relevant print and comment to be more accurate.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs/gaudi: remove pcie_en strap toggle
Igor Grinberg [Thu, 29 Oct 2020 12:06:54 +0000 (14:06 +0200)]
habanalabs/gaudi: remove pcie_en strap toggle

Since the very large grace period is over and this functionality
prevents us from implementing the new reset sequence and applying security
settings, we need to remove the code toggling the PCIE_EN bit in the
straps register.
Remove it for good.

Signed-off-by: Igor Grinberg <igrinberg@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: remove duplicate print
Oded Gabbay [Wed, 28 Oct 2020 19:05:20 +0000 (21:05 +0200)]
habanalabs: remove duplicate print

We print the firmware status regarding security twice, once in
common code and once in asic code. Remove the print in asic code
and leave the common code print.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Separate CS job completion from its deallocation
Tomer Tayar [Mon, 10 Aug 2020 14:30:35 +0000 (17:30 +0300)]
habanalabs: Separate CS job completion from its deallocation

Current CS jobs are no longer needed after their completion.
However, jobs of future workload might be in use even after they are
completed. To allow that, the patch adds a refcount to the job object,
and decouples its completion handling from its deallocation.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
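
Decoupling completion from deallocation with a reference count can be sketched in userspace C. The struct job type, the job_* helpers and the jobs_freed counter are hypothetical and exist only for illustration. The kernel would use its own kref or refcount primitives, and this simplified version is not thread-safe.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical job object with a plain (non-atomic) reference count. */
struct job {
	int refcount;
	int completed;
};

static int jobs_freed;	/* bookkeeping for the example only */

/* A new job starts with one reference, owned by the submission path. */
static struct job *job_create(void)
{
	struct job *job = calloc(1, sizeof(*job));

	if (job)
		job->refcount = 1;
	return job;
}

static void job_get(struct job *job)
{
	assert(job->refcount > 0);
	job->refcount++;
}

/* Drop one reference; the last put frees the object. */
static void job_put(struct job *job)
{
	assert(job->refcount > 0);
	if (--job->refcount == 0) {
		free(job);
		jobs_freed++;
	}
}

/* Completion only marks the job done and drops the submitter's reference. */
static void job_complete(struct job *job)
{
	job->completed = 1;
	job_put(job);
}
```

A user that still needs the job after completion takes its own reference beforehand, so the object survives until that user is done with it.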
3 years ago habanalabs/gaudi: increase MAX CS to 16K
Oded Gabbay [Tue, 27 Oct 2020 07:34:44 +0000 (09:34 +0200)]
habanalabs/gaudi: increase MAX CS to 16K

We need to have the MAX CS be much larger than the size of the
different queues. In GAUDI we have around 8 groups of queues, and each
group has 1K queue size. To prevent head-of-the-line blocking, we need
to make sure there is sufficient number of available CS allocations
even if one or more of those queues are full.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: reset device upon fw read failure
farah kassabri [Wed, 14 Oct 2020 12:17:36 +0000 (15:17 +0300)]
habanalabs: reset device upon fw read failure

A failure in reading the pre-boot version is not handled correctly.
Upon failure we need to reset the device in order to be able
to reinstall the driver.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Move repeatedly included headers to habanalabs.h
Tomer Tayar [Sun, 25 Oct 2020 15:47:22 +0000 (17:47 +0200)]
habanalabs: Move repeatedly included headers to habanalabs.h

Several header files are repeatedly included in many files.
Move these files to habanalabs.h which is included by all.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: release signal if collective wait was dropped
Ofir Bitton [Sun, 25 Oct 2020 07:36:08 +0000 (09:36 +0200)]
habanalabs: release signal if collective wait was dropped

As in standard wait cs, we must release a signal fence once
a collective wait cs was dropped and not submitted.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Skip updating CI of internal queues if not in use
Tomer Tayar [Mon, 27 Jul 2020 21:28:51 +0000 (00:28 +0300)]
habanalabs: Skip updating CI of internal queues if not in use

There are no internal queues if H/W queues are being used.
In this case we can skip the redundant traversal over the queues array,
looking for internal queues.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Small refactoring of cs_do_release()
Tomer Tayar [Mon, 27 Jul 2020 20:49:41 +0000 (23:49 +0300)]
habanalabs: Small refactoring of cs_do_release()

Slightly refactor the cs_do_release() function, to reduce nesting level
and to ease the handling of future CS types.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years ago habanalabs: Small refactoring of CS IOCTL handling
Tomer Tayar [Sun, 19 Jul 2020 18:07:15 +0000 (21:07 +0300)]
habanalabs: Small refactoring of CS IOCTL handling

Refactor the CS IOCTL handling by gathering common code into
sub-functions, in order to ease future additions of new CS types.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: fetch PLL info from FW
Ofir Bitton [Mon, 5 Oct 2020 08:36:00 +0000 (11:36 +0300)]
habanalabs/gaudi: fetch PLL info from FW

Once FW security is enabled, there is no access to the PLL registers,
so the values must be read from FW using a dedicated interface.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: refactor MMU to support dual residency MMU
Moti Haimovski [Mon, 5 Oct 2020 14:59:29 +0000 (17:59 +0300)]
habanalabs: refactor MMU to support dual residency MMU

This commit refactors the MMU code to support PCI MMU page tables
residing on host and DCORE MMU residing on the device DRAM at the
same time.

This is needed for future devices, as GAUDI and GOYA have a single
MMU whose page tables always reside in DRAM.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix MMU print message
Moti Haimovski [Mon, 5 Oct 2020 16:33:10 +0000 (19:33 +0300)]
habanalabs: fix MMU print message

This commit fixes an incorrect error message.

Signed-off-by: Moti Haimovski <mhaimovski@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: scrub all memory upon closing FD
farah kassabri [Wed, 6 May 2020 08:17:38 +0000 (11:17 +0300)]
habanalabs/gaudi: scrub all memory upon closing FD

In multi-tenant setups, administrators may want to prevent data
leakage between users running on the same device one after another.

To do that, the driver can scrub the internal memory (both SRAM and
DRAM) after a user finishes using the memory.

Because in GAUDI the driver allows only one application to use the
device at a time, it can scrub the memory when the user app closes
the FD.

In future devices where we have MMU on the DRAM, we can scrub the DRAM
memory with a finer granularity (page granularity) when the user
allocates the memory.

This feature is not supported in Goya.

To accommodate users who want to debug their applications, we add a
kernel module parameter to load the driver with this feature disabled.

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: add support for FW security
Ofir Bitton [Sun, 4 Oct 2020 14:34:37 +0000 (17:34 +0300)]
habanalabs/gaudi: add support for FW security

Skip relevant HW configurations once FW security is enabled
because these configurations are being performed by FW.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fetch security indication from FW
Ofir Bitton [Sun, 4 Oct 2020 06:09:19 +0000 (09:09 +0300)]
habanalabs: fetch security indication from FW

Add support for fetching security indication from FW.
This indication is needed in order to skip unnecessary
initializations done by FW.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: fix cs counters structure
farah kassabri [Mon, 12 Oct 2020 11:30:26 +0000 (14:30 +0300)]
habanalabs: fix cs counters structure

Fix cs counters structure in uapi to be one flat structure instead
of two instances of the same other structure.
Use atomic read/increment for context counters so we can use
one structure for both aggregated and context counters.
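
A minimal userspace sketch of the idea, using C11 atomics in place of the
kernel's atomic types. The structure and field names are hypothetical; the
point is only that one flat type can serve both the per-context and the
aggregated counters once every field is updated atomically.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flat counter layout; field names are illustrative. */
struct cs_counters {
	atomic_uint_fast64_t out_of_mem_drop_cnt;
	atomic_uint_fast64_t parsing_drop_cnt;
	atomic_uint_fast64_t queue_full_drop_cnt;
};

static void cs_counters_init(struct cs_counters *c)
{
	atomic_init(&c->out_of_mem_drop_cnt, 0);
	atomic_init(&c->parsing_drop_cnt, 0);
	atomic_init(&c->queue_full_drop_cnt, 0);
}

/* Bump the same counter in the context instance and in the aggregate. */
static void count_queue_full_drop(struct cs_counters *ctx,
				  struct cs_counters *aggregate)
{
	atomic_fetch_add(&ctx->queue_full_drop_cnt, 1);
	atomic_fetch_add(&aggregate->queue_full_drop_cnt, 1);
}

static uint64_t read_queue_full_drops(struct cs_counters *c)
{
	return (uint64_t)atomic_load(&c->queue_full_drop_cnt);
}
```

With two contexts sharing one aggregate instance, each context keeps its own
count while the aggregate holds the sum.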

Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: advanced FW loading
Ofir Bitton [Tue, 20 Oct 2020 07:45:37 +0000 (10:45 +0300)]
habanalabs: advanced FW loading

Today the driver is able to load a whole FW binary into a specific
location on the ASIC. We add support for loading sections from the
same FW binary into different locations.
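
As a rough illustration of section-based loading, the sketch below copies
several sections of one image to different offsets in a destination buffer.
The fw_section descriptor and the function name are hypothetical, and plain
memcpy() on a byte array stands in for whatever I/O copy the driver
actually uses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one section of a firmware image. */
struct fw_section {
	size_t src_offset;	/* offset inside the firmware binary */
	size_t dst_offset;	/* offset inside device memory */
	size_t size;		/* number of bytes to copy */
};

/* Returns 0 on success, -1 if any section falls outside either buffer. */
static int load_fw_sections(const uint8_t *fw, size_t fw_size,
			    uint8_t *dev_mem, size_t dev_size,
			    const struct fw_section *sec, size_t count)
{
	size_t i;

	/* Validate everything first so a bad entry causes no partial load. */
	for (i = 0; i < count; i++) {
		if (sec[i].src_offset > fw_size ||
		    sec[i].size > fw_size - sec[i].src_offset)
			return -1;
		if (sec[i].dst_offset > dev_size ||
		    sec[i].size > dev_size - sec[i].dst_offset)
			return -1;
	}

	for (i = 0; i < count; i++)
		memcpy(dev_mem + sec[i].dst_offset, fw + sec[i].src_offset,
		       sec[i].size);

	return 0;
}
```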

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: initialize variable before use
Oded Gabbay [Tue, 20 Oct 2020 15:37:56 +0000 (18:37 +0300)]
habanalabs: initialize variable before use

GCC 7.3.1 20180303 (Red Hat 7.3.1-5) complains that collective_engine_id
might be used uninitialized.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: remove unreachable code
Ofir Bitton [Mon, 19 Oct 2020 13:52:00 +0000 (16:52 +0300)]
habanalabs/gaudi: remove unreachable code

Remove unreachable code in gaudi collective flow.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: make sure cs type is valid in cs_ioctl_signal_wait
Oded Gabbay [Mon, 19 Oct 2020 06:06:18 +0000 (09:06 +0300)]
habanalabs: make sure cs type is valid in cs_ioctl_signal_wait

Although we get a valid cs type from the caller, in case new values
are added in the future, it is best to check the expected values
in that function.
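
The defensive check can be sketched as a switch with a rejecting default
case. The enum values and the helper name below are assumptions made for
illustration; they are not copied from the driver.

```c
#include <errno.h>

/* Illustrative CS types; the real enum may differ. */
enum cs_type {
	CS_TYPE_DEFAULT,
	CS_TYPE_SIGNAL,
	CS_TYPE_WAIT,
	CS_TYPE_COLLECTIVE_WAIT
};

/*
 * Accept only the types a signal/wait handler knows how to process.
 * Any value added to the enum later is rejected until it is listed here.
 */
static int check_signal_wait_cs_type(enum cs_type type)
{
	switch (type) {
	case CS_TYPE_SIGNAL:
	case CS_TYPE_WAIT:
	case CS_TYPE_COLLECTIVE_WAIT:
		return 0;
	default:
		return -EINVAL;
	}
}
```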

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: monitor device memory usage
Oded Gabbay [Sun, 18 Oct 2020 12:32:23 +0000 (15:32 +0300)]
habanalabs/gaudi: monitor device memory usage

In GAUDI we don't have an MMU towards the HBM device memory. Therefore,
the user accesses that memory directly through physical addresses (via
the different engines) without the need to go through the driver to
allocate/free memory on the HBM.

For system monitoring purposes, the driver will keep track of the HBM
usage. This can be done as long as the user accurately reports the
allocations and releases of HBM memory, through the existing MEMORY
IOCTL uapi.
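
A minimal sketch of such bookkeeping, assuming the user reports every
allocation and release. The hbm_usage structure and both helpers are
hypothetical, and a real driver would also need to make the updates safe
against concurrent callers.

```c
#include <stdint.h>

/* Hypothetical accounting state for the device memory. */
struct hbm_usage {
	uint64_t total;	/* size of the HBM in bytes */
	uint64_t used;	/* bytes the user has reported as allocated */
};

/* Record a reported allocation; returns -1 if it exceeds the HBM size. */
static int hbm_report_alloc(struct hbm_usage *u, uint64_t size)
{
	if (size > u->total - u->used)
		return -1;
	u->used += size;
	return 0;
}

/* Record a reported release; returns -1 if more is freed than is in use. */
static int hbm_report_free(struct hbm_usage *u, uint64_t size)
{
	if (size > u->used)
		return -1;
	u->used -= size;
	return 0;
}
```

The numbers are only as accurate as the reports: memory the user touches
without reporting it is invisible to this accounting.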

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: sync stream collective support
Ofir Bitton [Thu, 10 Sep 2020 07:56:26 +0000 (10:56 +0300)]
habanalabs: sync stream collective support

Implement sync stream collective for GAUDI. This requires allocating
additional resources and adding ctx_fini() to clean them up.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: Set DMA5 QMAN internal
Ofir Bitton [Mon, 31 Aug 2020 05:52:56 +0000 (08:52 +0300)]
habanalabs/gaudi: Set DMA5 QMAN internal

The DMA5 QMAN is designated for the reduction process, hence it will
no longer be configured as an external queue.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: sync stream collective infrastructure
Ofir Bitton [Thu, 10 Sep 2020 07:10:55 +0000 (10:10 +0300)]
habanalabs: sync stream collective infrastructure

Define new API for collective wait support and modify sync stream
common flow. In addition add kernel CB allocation support for
internal queues.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: use enum for CB allocation options
Tal Cohen [Wed, 3 Jun 2020 06:25:27 +0000 (09:25 +0300)]
habanalabs: use enum for CB allocation options

In the future there will be situations where queues can accept either
kernel allocated CBs or user allocated CBs, depending on different
states.

Therefore, instead of using a boolean variable of kernel/user allocated
CB, we need to use a bitmask to indicate that, which allows combining
the two options.

Add a flag to the uapi so the user will be able to indicate whether
the CB was allocated by kernel or by user. Of course the driver
validates that.
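
The move from a boolean to a bitmask can be sketched like this. The flag
names, the queue_props structure and the helper are hypothetical and only
show how a queue can advertise one or both options while a request names
exactly one.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flags; a queue may support one option or both. */
enum cb_alloc_flags {
	CB_ALLOC_KERNEL = 1 << 0,
	CB_ALLOC_USER   = 1 << 1
};

struct queue_props {
	uint32_t cb_alloc_flags;	/* OR of the options the queue accepts */
};

/* Validate the allocation source claimed for a CB against the queue. */
static bool queue_accepts_cb(const struct queue_props *q, uint32_t cb_alloc)
{
	/* The request must name exactly one allocation source. */
	if (cb_alloc != CB_ALLOC_KERNEL && cb_alloc != CB_ALLOC_USER)
		return false;

	return (q->cb_alloc_flags & cb_alloc) != 0;
}
```

A boolean could express only "kernel" or "user"; the bitmask also covers a
queue that accepts either.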

Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: add support for NIC QMANs
Oded Gabbay [Mon, 2 Nov 2020 19:10:39 +0000 (21:10 +0200)]
habanalabs/gaudi: add support for NIC QMANs

Initialize the QMANs that are responsible for submitting doorbells to the
NIC engines. Add support for stopping and disabling them, and reset them as
part of the hard-reset procedure of GAUDI. This will allow the user to
submit work to the NICs.

Add support for receiving events on QMAN errors from the firmware.

However, the nic_ports_mask is still initialized to 0. That means this code
won't initialize the QMANs just yet. That will be in a later patch.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: add NIC security configuration
Oded Gabbay [Mon, 2 Nov 2020 19:09:33 +0000 (21:09 +0200)]
habanalabs/gaudi: add NIC security configuration

Configure the security properties of the NIC IP. This is to prevent the
user process from doing something with the NIC that it shouldn't do, e.g.
crash the server or steal data.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: add NIC firmware-related definitions
Oded Gabbay [Mon, 2 Nov 2020 19:07:51 +0000 (21:07 +0200)]
habanalabs/gaudi: add NIC firmware-related definitions

Add new structures and messages that the driver uses to interact with the
firmware to receive information and events (errors) about GAUDI's NIC.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: add NIC QMAN H/W and registers definitions
Oded Gabbay [Mon, 2 Nov 2020 19:00:18 +0000 (21:00 +0200)]
habanalabs/gaudi: add NIC QMAN H/W and registers definitions

Add auto-generated header files that describe the NIC QMANs registers
used by the driver.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: remove duplicate check
Oded Gabbay [Mon, 12 Oct 2020 17:56:33 +0000 (20:56 +0300)]
habanalabs: remove duplicate check

We already check if queue index is smaller than max queues a few lines
above this check so no need to check this again.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: sync stream refactor functions
Ofir Bitton [Thu, 10 Sep 2020 06:43:43 +0000 (09:43 +0300)]
habanalabs: sync stream refactor functions

Refactor sync stream implementation by reducing function length
for better readability.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: add support for multiple SOBs per monitor
Ofir Bitton [Thu, 10 Sep 2020 06:40:35 +0000 (09:40 +0300)]
habanalabs: add support for multiple SOBs per monitor

Support advanced monitor functionality to monitor more than a
single SOB. In addition, extend all CB generation functions
with a buffer offset so that multiple packets generated by
different functions can be placed in the same CB.
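
The buffer-offset convention can be sketched as generators that each write
one packet at a given offset and return the offset just past it. The packet
layouts, opcodes and function names below are invented for illustration and
do not describe the real hardware packets.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet layouts; real packets are device specific. */
struct pkt_monitor_arm {
	uint32_t opcode;
	uint32_t sob_mask;	/* one bit per monitored SOB */
};

struct pkt_fence {
	uint32_t opcode;
	uint32_t fence_id;
	uint32_t target;
};

/*
 * Each generator writes one packet at cb + offset and returns the offset
 * just past it. The caller must ensure the buffer is large enough.
 */
static size_t gen_monitor_arm(uint8_t *cb, size_t offset, uint32_t sob_mask)
{
	struct pkt_monitor_arm p = { .opcode = 0x1, .sob_mask = sob_mask };

	memcpy(cb + offset, &p, sizeof(p));
	return offset + sizeof(p);
}

static size_t gen_fence(uint8_t *cb, size_t offset, uint32_t fence_id,
			uint32_t target)
{
	struct pkt_fence p = {
		.opcode = 0x2, .fence_id = fence_id, .target = target
	};

	memcpy(cb + offset, &p, sizeof(p));
	return offset + sizeof(p);
}
```

Chaining the calls, with each one receiving the offset returned by the
previous one, packs packets from different generators back to back in a
single buffer.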

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: sync stream structures refactor
Ofir Bitton [Thu, 10 Sep 2020 06:17:50 +0000 (09:17 +0300)]
habanalabs: sync stream structures refactor

Refactor sync stream implementation by adding more structures for
better readability. In addition, reduce the allocated resources.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: don't init vm module if no MMU
Oded Gabbay [Sun, 4 Oct 2020 20:00:39 +0000 (23:00 +0300)]
habanalabs: don't init vm module if no MMU

In case we are running without MMU enabled (debug mode), no need to
initialize the VM module in the driver.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: minimize prints when everything is fine
Oded Gabbay [Fri, 2 Oct 2020 21:14:27 +0000 (00:14 +0300)]
habanalabs: minimize prints when everything is fine

No need to print when the driver starts to initialize the H/W. Drivers
should be silent when everything is OK.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: support multiple types of firmwares
Oded Gabbay [Thu, 1 Oct 2020 10:46:37 +0000 (13:46 +0300)]
habanalabs: support multiple types of firmwares

The driver now loads the firmware in two stages. For debugging purposes
we need to support situations where only the first stage firmware is
loaded.

Therefore, use a bitmask to determine which F/W is loaded.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: we need CPU queues for hwmon
Oded Gabbay [Thu, 1 Oct 2020 10:44:22 +0000 (13:44 +0300)]
habanalabs: we need CPU queues for hwmon

F/W can be loaded while the device CPU queues are disabled. In that case,
HWMON should be disabled. This is only relevant when debugging.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs/gaudi: move mmu_prepare to context init
Ofir Bitton [Wed, 30 Sep 2020 12:51:10 +0000 (15:51 +0300)]
habanalabs/gaudi: move mmu_prepare to context init

Currently mmu_prepare is located at context switch.
Since we support a single context, there is no reason to reconfigure
the MMU registers on every context switch.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agohabanalabs: change aggregate cs counters to atomic
Oded Gabbay [Wed, 30 Sep 2020 11:25:55 +0000 (14:25 +0300)]
habanalabs: change aggregate cs counters to atomic

In case we have multiple contexts/processes, we can't just
increment aggregated counters. We need to make them atomic as they can
be incremented by multiple processes.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
3 years agoMAINTAINERS: update email, git repo of habanalabs driver
Oded Gabbay [Mon, 2 Nov 2020 19:15:47 +0000 (21:15 +0200)]
MAINTAINERS: update email, git repo of habanalabs driver

Update the email to my kernel.org email address and update the git
repository address to the one on git.kernel.org.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoMerge 5.10-rc6 into char-misc-next
Greg Kroah-Hartman [Mon, 30 Nov 2020 07:33:06 +0000 (08:33 +0100)]
Merge 5.10-rc6 into char-misc-next

We need the fixes in here as well.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
3 years agoLinux 5.10-rc6
Linus Torvalds [Sun, 29 Nov 2020 23:50:50 +0000 (15:50 -0800)]
Linux 5.10-rc6

3 years agoMerge tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Nov 2020 19:19:26 +0000 (11:19 -0800)]
Merge tag 'locking-urgent-2020-11-29' of git://git./linux/kernel/git/tip/tip

Pull locking fixes from Thomas Gleixner:
 "Two more places which invoke tracing from RCU disabled regions in the
  idle path.

  Similar to the entry path the low level idle functions have to be
  non-instrumentable"

* tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  intel_idle: Fix intel_idle() vs tracing
  sched/idle: Fix arch_cpu_idle() vs tracing

3 years agoMerge tag 'irq-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 29 Nov 2020 19:06:57 +0000 (11:06 -0800)]
Merge tag 'irq-urgent-2020-11-29' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "Two fixes for irqchip drivers:

   - Save and restore the GICV3 ITS state unconditionally on
     suspend/resume to handle firmware which fails to do so.

   - Use the correct index into the fwspec parameters to read the irq
     trigger type in the EXIU chip driver"

* tag 'irq-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Unconditionally save/restore the ITS state on suspend
  irqchip/exiu: Fix the index of fwspec for IRQ type

3 years agoMerge tag 'efi-urgent-for-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Nov 2020 18:18:53 +0000 (10:18 -0800)]
Merge tag 'efi-urgent-for-v5.10-rc5' of git://git./linux/kernel/git/tip/tip

Pull EFI fixes from Borislav Petkov:
 "More EFI fixes forwarded from Ard Biesheuvel:

   - revert efivarfs kmemleak fix again - it was a false positive

   - make CONFIG_EFI_EARLYCON depend on CONFIG_EFI explicitly so it does
     not pull in other dependencies unnecessarily if CONFIG_EFI is not
     set

   - defer attempts to load SSDT overrides from EFI vars until after the
     efivar layer is up"

* tag 'efi-urgent-for-v5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: EFI_EARLYCON should depend on EFI
  efivarfs: revert "fix memory leak in efivarfs_create()"
  efi/efivars: Set generic ops before loading SSDT

3 years agoMerge tag 'x86_urgent_for_v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 29 Nov 2020 18:08:17 +0000 (10:08 -0800)]
Merge tag 'x86_urgent_for_v5.10-rc6' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "A couple of urgent fixes which accumulated this last week:

   - Two resctrl fixes to prevent refcount leaks when manipulating the
     resctrl fs (Xiaochen Shen)

   - Correct prctl(PR_GET_SPECULATION_CTRL) reporting (Anand K Mistry)

   - A fix to not lose already seen MCE severity which determines
     whether the machine can recover (Gabriele Paoloni)"

* tag 'x86_urgent_for_v5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Do not overwrite no_way_out if mce_end() fails
  x86/speculation: Fix prctl() when spectre_v2_user={seccomp,prctl},ibpb
  x86/resctrl: Add necessary kernfs_put() calls to prevent refcount leak
  x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leak

3 years agoMerge tag 'riscv-for-linus-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 28 Nov 2020 23:53:30 +0000 (15:53 -0800)]
Merge tag 'riscv-for-linus-5.10-rc6' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:
 "I've collected a handful of fixes over the past few weeks:

   - A fix to un-break the build-id argument to the vDSO build, which is
     necessary for the LLVM linker.

   - A fix to initialize the jump label subsystem, without which it (and
     all the stuff that uses it) doesn't actually function.

   - A fix to include <asm/barrier.h> from <vdso/processor.h>, without
     which some drivers won't compile"

* tag 'riscv-for-linus-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: fix barrier() use in <vdso/processor.h>
  RISC-V: Add missing jump label initialization
  riscv: Explicitly specify the build id style in vDSO Makefile again

3 years agoMerge tag 'kbuild-fixes-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masah...
Linus Torvalds [Sat, 28 Nov 2020 18:42:30 +0000 (10:42 -0800)]
Merge tag 'kbuild-fixes-v5.10' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Remove unused OBJSIZE variable.

 - Fix rootless deb-pkg build in a setgid directory.

* tag 'kbuild-fixes-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  builddeb: Fix rootless build in setuid/setgid directory
  kbuild: remove unused OBJSIZE

3 years agoMerge tag 'perf-tools-fixes-for-v5.10-2020-11-28' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 28 Nov 2020 18:35:05 +0000 (10:35 -0800)]
Merge tag 'perf-tools-fixes-for-v5.10-2020-11-28' of git://git./linux/kernel/git/acme/linux

Pull perf tool fixes from Arnaldo Carvalho de Melo:

 - Fix die_entrypc() when DW_AT_ranges DWARF attribute not available

 - Cope with broken DWARF (missing DW_AT_declaration) generated by some
   recent gcc versions

 - Do not generate CGROUP metadata events when not asked to in 'perf
   record'

 - Use proper CPU for shadow stats in 'perf stat'

 - Update copy of libbpf's hashmap.c, silencing tools/perf build warning

 - Fix return value in 'perf diff'

* tag 'perf-tools-fixes-for-v5.10-2020-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf probe: Change function definition check due to broken DWARF
  perf probe: Fix to die_entrypc() returns error correctly
  perf stat: Use proper cpu for shadow stats
  perf record: Synthesize cgroup events only if needed
  perf diff: Fix error return value in __cmd_diff()
  perf tools: Update copy of libbpf's hashmap.c