When vnet_start_xmit() is concurrent with vnet_ack(), we may
have a race that looks like:
thread 1 thread 2
vnet_start_xmit vnet_event_napi -> vnet_rx
__vnet_tx_trigger for some desc X
at this point dr->prod == X
peer sends back a stopped ack for X
we process X, but X == dr->prod
so we bail out in vnet_ack with
!idx_is_pending
update dr->prod
As a result of the fact that we never processed the stopped ack for X,
the Tx path is led to incorrectly believe that the peer is still
"started" and reading, but the peer has stopped reading, which will
ultimately end in flow-control assertions.
The fix is to synchronize the above 2 paths on the netif_tx_lock.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
return 0;
end = pkt->end_idx;
- if (unlikely(!idx_is_pending(dr, end)))
- return 0;
-
vp = port->vp;
dev = vp->dev;
+ netif_tx_lock(dev);
+ if (unlikely(!idx_is_pending(dr, end))) {
+ netif_tx_unlock(dev);
+ return 0;
+ }
+
/* sync for race conditions with vnet_start_xmit() and tell xmit it
* is time to send a trigger.
*/
- netif_tx_lock(dev);
dr->cons = next_idx(end, dr);
desc = vio_dring_entry(dr, dr->cons);
if (desc->hdr.state == VIO_DESC_READY && port->start_cons) {