We received two reports of BUG_ON in bnad_txcmpl_process() where
hw_consumer_index appeared to be ahead of producer_index. Out of order
write/read of these variables could explain these reports.
bnad_start_xmit(), as a producer of tx descriptors, has a few memory
barriers sprinkled around writes to producer_index and the device's
doorbell but they're not paired with anything in bnad_txcmpl_process(), a
consumer.
Since we are synchronizing with a device, we must use mandatory barriers,
not smp_*. Also, I didn't see the purpose of the last smp_mb() in
bnad_start_xmit().
Signed-off-by: Benjamin Poirier <bpoirier@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
return 0;
hw_cons = *(tcb->hw_consumer_index);
+ rmb();
cons = tcb->consumer_index;
q_depth = tcb->q_depth;
BNA_QE_INDX_INC(prod, q_depth);
tcb->producer_index = prod;
- smp_mb();
+ wmb();
if (unlikely(!test_bit(BNAD_TXQ_TX_STARTED, &tcb->flags)))
return NETDEV_TX_OK;
skb_tx_timestamp(skb);
bna_txq_prod_indx_doorbell(tcb);
- smp_mb();
return NETDEV_TX_OK;
}