[vortex-bug] Polling mode race fixed

Bogdan Costescu Bogdan.Costescu@IWR.Uni-Heidelberg.De
Thu, 14 Dec 2000 13:58:27 +0100 (CET)


[ This message summarizes a private thread with Andrew and is posted to
vortex-bug for reference.]

Hi,

I have found a race condition in the driver that uses polling mode from
http://www.uow.edu.au/~andrewm/linux/#3c59x-bc. I have observed Tx
timeouts that appear while the network is under load, but only on
computers with fast CPUs. I could not reproduce it on a PIII-450 and
could hardly reproduce it on a Duron-600, but it appears very easily
(within seconds to minutes) on an Athlon-800 and especially on a
Thunderbird-1000, when the Tx ring is reduced to 2 or 4 entries, the poll
rate is 8 or higher and the network is loaded by a 'ping -f -s 50000'.

The race is between the polling engine and the CPU. Basically, if the CPU
is fast enough to re-fill the whole Tx ring between two polls of the
card, the race appears.

Suppose the card is polling on tx_ring[n].addr, which is zero because
this was the last packet added by start_xmit(). If another packet is added
to the ring, tx_ring[n].addr is set to point to tx_ring[n+1]; but if the
CPU is fast enough to re-fill the whole ring, it will soon be overwritten
by tx_ring[n+TX_RING_SIZE].addr, which is also zero because the ring is
now full and no other packet can be added. The card therefore still sees
an end-of-list marker and never advances. (I have omitted the
% TX_RING_SIZE in the above indices for clarity.)

However, I also observed that polling mode combined with checking
DownListPtr in boomerang_interrupt (instead of the DnComplete bit) is not
subject to the race (Don's current drivers use this combination).
The difference between using the DnComplete bit and checking the value in
DownListPtr is that, for polling, DownListPtr is never zero: it holds
the address of the last DPD (on which the card polls), so the Tx ring is
never considered empty and dirty_tx == cur_tx is never reached, while
checking for DnComplete allows an empty ring.

The solution is to prevent the Tx ring from being filled completely
between two polls of the card. This can be achieved through:

1. a larger Tx ring. This is not good for bonding and, given that faster
CPUs appear every day, it is only a temporary solution.

2. a shorter polling interval. This puts more load on the PCI bus, has a
minimum value imposed by the card, and is also only a temporary solution
because CPU speeds keep increasing.

3. a mechanism that prevents the Tx ring from being re-filled completely
at any time. The mechanism that Don introduced in his drivers, where the
Tx ring is never completely full, is good for this purpose; however,
TX_QUEUE_LEN must always be strictly smaller than (not equal to)
TX_RING_SIZE in order for this mechanism to function properly.

4. maybe something else that I cannot imagine right now... If you have
any idea, please speak up.

A modified driver using the third method has been under load (in the same
conditions as before) for more than 24 hours without any problem (compared
to the earlier seconds-to-minutes failures). It will be available soon
from the above-mentioned address.

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De