IB/hfi1: Ignore LNI errors before DC8051 transitions to Polling state
author Kaike Wan <kaike.wan@intel.com>
Wed, 28 Nov 2018 18:19:04 +0000 (10:19 -0800)
committer Jason Gunthorpe <jgg@mellanox.com>
Fri, 7 Dec 2018 02:50:08 +0000 (19:50 -0700)
commit c1a797c0818e0122c7ec8422edd971cfec9b15ea
tree 3d11557a227b00b4f9c0efa75c5b34b34dd76e86
parent 937488a85986faa743d12456970a0cbe83e3b04e

If the DC8051 is asked to move its physical state back to Offline while
the link is still coming up, it sets the ERROR field in the
DC8051_DBG_ERR_INFO_SET_BY_8051 register. The field stays set until the
next time the DC8051 transitions from Offline to Polling. When the host
later asks the DC8051 to go to Polling again, it may therefore receive a
DC8051 interrupt with the stale ERROR field still present in
DC8051_DBG_ERR_INFO_SET_BY_8051. If the host link state has already been
changed to Polling, the stale ERROR forces the host back to Offline,
producing a vicious cycle of Polling -> Offline -> Polling -> Offline. If
the host link state is still Offline when the stale ERROR arrives, the
ERROR is ignored and the link comes up correctly.

This patch implements the correct behavior by changing the host link state
to Polling only after the DC8051 has changed its physical state to Polling.
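
The ordering issue can be illustrated with a small stand-alone model. This
is only a sketch under simplifying assumptions, not the code in chip.c:
every name below (sim_port, sim_request_polling_old, and so on) is
hypothetical, the ERROR field is reduced to a single flag, and the stale
interrupt is assumed to arrive before the DC8051 reaches Polling.

```c
#include <stdbool.h>

/* Hypothetical, simplified states; not the hfi1 driver's real enums. */
enum link_state { LINK_OFFLINE, LINK_POLLING };

struct sim_port {
	enum link_state host_state;	/* host driver's view of the link */
	enum link_state phys_state;	/* DC8051 physical state */
	bool error_latched;		/* stands in for the ERROR field */
};

/*
 * Host-side handling of a DC8051 interrupt: an ERROR seen while the host
 * is in Polling forces the host Offline; seen while Offline, it is ignored.
 */
static void sim_handle_8051_interrupt(struct sim_port *p)
{
	if (p->error_latched && p->host_state == LINK_POLLING)
		p->host_state = LINK_OFFLINE;
}

/* The DC8051 completes Offline -> Polling, which clears the ERROR field. */
static void sim_8051_enter_polling(struct sim_port *p)
{
	p->phys_state = LINK_POLLING;
	p->error_latched = false;
}

/* Old ordering: the host state becomes Polling as soon as it is requested. */
static enum link_state sim_request_polling_old(struct sim_port *p)
{
	p->host_state = LINK_POLLING;
	sim_handle_8051_interrupt(p);	/* stale ERROR is acted upon */
	if (p->host_state == LINK_OFFLINE)
		return p->host_state;	/* bring-up aborted, ERROR stays set */
	sim_8051_enter_polling(p);
	return p->host_state;
}

/* New ordering: the host state follows the DC8051 physical state. */
static enum link_state sim_request_polling_new(struct sim_port *p)
{
	sim_handle_8051_interrupt(p);	/* host still Offline: ignored */
	sim_8051_enter_polling(p);
	p->host_state = LINK_POLLING;	/* set only after DC8051 is Polling */
	return p->host_state;
}
```

Starting both helpers from Offline with the flag set, the old ordering ends
in Offline with the flag still latched, so the next attempt fails the same
way. The new ordering ends in Polling with the flag cleared.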

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Krzysztof Goreczny <krzysztof.goreczny@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
drivers/infiniband/hw/hfi1/chip.c