habanalabs/gaudi: fetch TPC/MME ECC errors from F/W
author Ofir Bitton <obitton@habana.ai>
Thu, 5 Aug 2021 14:36:24 +0000 (17:36 +0300)
committer Oded Gabbay <ogabbay@kernel.org>
Wed, 1 Sep 2021 15:38:24 +0000 (18:38 +0300)
When F/W security is enabled, the driver cannot access the ECC registers,
so it must fetch the ECC info from the F/W instead.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
drivers/misc/habanalabs/gaudi/gaudi.c

index a05688c..e49e6b8 100644
@@ -7457,6 +7457,11 @@ static void gaudi_handle_ecc_event(struct hl_device *hdev, u16 event_type,
        bool extract_info_from_fw;
        int rc;
 
+       if (hdev->asic_prop.fw_security_enabled) {
+               extract_info_from_fw = true;
+               goto extract_ecc_info;
+       }
+
        switch (event_type) {
        case GAUDI_EVENT_PCIE_CORE_SERR ... GAUDI_EVENT_PCIE_PHY_DERR:
        case GAUDI_EVENT_DMA0_SERR_ECC ... GAUDI_EVENT_MMU_DERR:
@@ -7529,6 +7534,7 @@ static void gaudi_handle_ecc_event(struct hl_device *hdev, u16 event_type,
                return;
        }
 
+extract_ecc_info:
        if (extract_info_from_fw) {
                ecc_address = le64_to_cpu(ecc_data->ecc_address);
                ecc_syndrom = le64_to_cpu(ecc_data->ecc_syndrom);