net/mlx5: Increase FW pre-init timeout for health recovery
author Gavin Li <gavinl@nvidia.com>
Sun, 27 Mar 2022 14:36:44 +0000 (17:36 +0300)
committer Saeed Mahameed <saeedm@nvidia.com>
Tue, 10 May 2022 05:54:00 +0000 (22:54 -0700)
commit 37ca95e62ee23fa6d2c2c64e3dc40b4a0c0146dc
tree 3c5957b3f12e084ad9b389f3fe772daf965a61ce
parent 8324a02c342a36336114a497130826612ed5520d

Currently, health recovery reloads the driver to recover it from fatal
errors. During the load process, the driver waits up to 120 seconds for FW
to set the pre-init bit; beyond this threshold it aborts the load. In some
cases, such as a FW upgrade on the DPU, this timeout is insufficient, and
the user has no way to recover the host device.

To solve this issue, introduce a new FW pre-init timeout for health
recovery, which is set to 2 hours.

Devlink reload and probe keep the original timeout because they are
user-triggered flows and therefore should not have a significantly long
timeout, during which the user command would hang.
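
The timeout selection described above can be sketched as follows. This is
an illustrative, user-space sketch only, not the driver code: the macro,
enum, and function names are hypothetical, and the wait loop uses a
simulated clock rather than real firmware polling. Only the two timeout
values (120 seconds and 2 hours) come from this commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; only the values come from the commit message. */
#define FW_PRE_INIT_TIMEOUT_MS             (120u * 1000u)            /* 120 seconds */
#define FW_PRE_INIT_ON_RECOVERY_TIMEOUT_MS (2u * 60u * 60u * 1000u)  /* 2 hours */

/* The three load flows named in the commit message. */
enum load_flow {
	LOAD_FLOW_PROBE,
	LOAD_FLOW_DEVLINK_RELOAD,
	LOAD_FLOW_HEALTH_RECOVERY,
};

/* Only health recovery gets the long timeout; user-triggered flows keep
 * the original one so that the user command does not hang for hours. */
static uint32_t fw_pre_init_timeout_ms(enum load_flow flow)
{
	return flow == LOAD_FLOW_HEALTH_RECOVERY ?
	       FW_PRE_INIT_ON_RECOVERY_TIMEOUT_MS : FW_PRE_INIT_TIMEOUT_MS;
}

/* Simulated wait: fw_ready_at_ms is the (assumed) time at which firmware
 * would set the pre-init bit. Returns 0 if the bit is seen within the
 * timeout for this flow, -1 if the load would be aborted. */
static int wait_fw_pre_init(enum load_flow flow, uint32_t fw_ready_at_ms,
			    uint32_t poll_ms)
{
	uint32_t limit = fw_pre_init_timeout_ms(flow);
	uint32_t now;

	if (poll_ms == 0)
		return -1; /* a zero poll interval would never advance the clock */

	for (now = 0; now <= limit; now += poll_ms) {
		if (now >= fw_ready_at_ms)
			return 0;
	}
	return -1;
}
```

With a firmware upgrade that takes, say, 10 minutes, a devlink reload
would time out in this sketch while a health recovery would succeed,
which is the behavior the change aims for.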

Signed-off-by: Gavin Li <gavinl@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
drivers/net/ethernet/mellanox/mlx5/core/devlink.c
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
drivers/net/ethernet/mellanox/mlx5/core/lib/tout.c
drivers/net/ethernet/mellanox/mlx5/core/lib/tout.h
drivers/net/ethernet/mellanox/mlx5/core/main.c
drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h