habanalabs: enable stop on error for all QMANs and engines
author Ofir Bitton <obitton@habana.ai>
Thu, 11 Feb 2021 09:09:12 +0000 (11:09 +0200)
committer Oded Gabbay <ogabbay@kernel.org>
Fri, 18 Jun 2021 12:23:41 +0000 (15:23 +0300)
If there is an error in the QMAN/engine, there is no point in trying
to continue running the workload. It is better to stop so that the
user can debug the program.

Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
drivers/misc/habanalabs/common/habanalabs_drv.c
index b55dd1c..3a42339 100644
@@ -326,6 +326,7 @@ int create_hdev(struct hl_device **dev, struct pci_dev *pdev,
        hdev->reset_on_lockup = reset_on_lockup;
        hdev->memory_scrub = memory_scrub;
        hdev->boot_error_status_mask = boot_error_status_mask;
+       hdev->stop_on_err = true;
 
        hdev->pldm = 0;
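
The hunk above only sets the flag. How ASIC-specific code consumes it is not part of this diff. As a minimal sketch, assuming a hypothetical error-configuration bit layout (the `EX_*` names, `struct ex_device`, and `ex_qman_err_cfg()` are illustrative and are not the driver's actual identifiers or register fields), a per-device flag like `stop_on_err` could gate a stop-on-error bit when the QMAN/engine error configuration is built:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for illustration; not the real hardware register. */
#define EX_ERR_REPORT_EN   (1u << 0)  /* always report errors */
#define EX_STOP_ON_ERR_EN  (1u << 1)  /* halt the queue/engine on error */

/* Stand-in for the device structure; only the flag from the diff is modeled. */
struct ex_device {
	bool stop_on_err;
};

/* Build the error configuration value for one QMAN/engine. */
static uint32_t ex_qman_err_cfg(const struct ex_device *dev)
{
	uint32_t cfg = EX_ERR_REPORT_EN;

	/* Request a halt on error only when the device flag is set. */
	if (dev->stop_on_err)
		cfg |= EX_STOP_ON_ERR_EN;

	return cfg;
}
```

With the flag set, as this commit does by default, the sketch returns a value with both bits set. With the flag cleared, only error reporting stays enabled.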