i40e: Fix to stop tx_timeout recovery if GLOBR fails
authorAlan Brady <alan.brady@intel.com>
Tue, 2 Aug 2022 08:19:17 +0000 (10:19 +0200)
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Thu, 25 Aug 2022 09:40:26 +0000 (11:40 +0200)
commit 57c942bc3bef0970f0b21f8e0998e76a900ea80d upstream.

When a tx_timeout fires, the PF attempts to recover by incrementally
resetting.  First we try a PFR, then CORER and finally a GLOBR.  If the
GLOBR fails, then we keep hitting the tx_timeout and incrementing the
recovery level and issuing dmesgs, which is both annoying to the user
and accomplishes nothing.

If the GLOBR fails, then we're pretty much totally hosed, and there's
not much else we can do to recover, so this makes it such that we just
kill the VSI and stop hitting the tx_timeout in such a case.

Fixes: 41c445ff0f48 ("i40e: main driver core")
Signed-off-by: Alan Brady <alan.brady@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
drivers/net/ethernet/intel/i40e/i40e_main.c

index b07d55c99317ee0b877e89adf36de4d262c14710..536f9198bd47a1690ffa7b0266b3870a20fce1be 100644 (file)
@@ -383,7 +383,9 @@ static void i40e_tx_timeout(struct net_device *netdev, unsigned int txqueue)
                set_bit(__I40E_GLOBAL_RESET_REQUESTED, pf->state);
                break;
        default:
-               netdev_err(netdev, "tx_timeout recovery unsuccessful\n");
+               netdev_err(netdev, "tx_timeout recovery unsuccessful, device is in non-recoverable state.\n");
+               set_bit(__I40E_DOWN_REQUESTED, pf->state);
+               set_bit(__I40E_VSI_DOWN_REQUESTED, vsi->state);
                break;
        }