0.99L and timeouts

Bogdan Costescu Bogdan.Costescu@IWR.Uni-Heidelberg.De
Sat Apr 8 15:46:19 2000


Hi,

I played today with 0.99L on 2.2.15pre16 (SMP), trying to track down the
Tx timeouts that occur sometimes.
I noticed that if I set TX_RING_SIZE to 2, I can trigger the Tx timeout
quite often under load (about 5-6 times in 10 minutes); however, I still
don't understand what is going on there. Maybe someone can help?

I put some more printks around and noticed that the test of tbusy at the
beginning of boomerang_start_xmit is done at intervals much longer than
TX_TIMEOUT (5-600 ticks or more), so vortex_tx_timeout is called every
time. Whenever it is called, it always finds cur_tx=dirty_tx and tx_full=1,
which is strange, because cur_tx=dirty_tx means that there are no more
packets in the tx_ring, so tx_full should be 0.
I also found that at the same time inl(ioaddr + DownListPtr)=0, which
means (from what I understand from the driver; I don't have the docs) that
the card's Tx queue is also empty.
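
To make the expected invariant concrete, here is a minimal user-space sketch of the ring bookkeeping. It is not driver code: struct tx_state and the tx_* helpers are hypothetical names, and the rule used for setting and clearing tx_full is a simplification assumed for illustration. Only cur_tx, dirty_tx, tx_full and TX_RING_SIZE come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 2   /* the small value used above to provoke the timeout */

/* Hypothetical, simplified stand-in for the driver's private Tx state. */
struct tx_state {
    unsigned int cur_tx;    /* running count of packets queued */
    unsigned int dirty_tx;  /* running count of packets reclaimed */
    bool tx_full;           /* set when no free ring slot remains */
};

/* Number of ring entries that have been queued but not yet reclaimed. */
static unsigned int tx_in_flight(const struct tx_state *s)
{
    /* Unsigned subtraction stays correct across counter wraparound. */
    return s->cur_tx - s->dirty_tx;
}

/* The state reported above: ring empty, yet still marked full. */
static bool tx_state_is_stale(const struct tx_state *s)
{
    return s->tx_full && tx_in_flight(s) == 0;
}

/* Queue one packet; returns false if the ring is marked full. */
static bool tx_queue(struct tx_state *s)
{
    if (s->tx_full)
        return false;
    s->cur_tx++;
    if (tx_in_flight(s) >= TX_RING_SIZE)
        s->tx_full = true;
    return true;
}

/* Reclaim one completed packet, as an interrupt handler would (assumed rule). */
static void tx_reclaim(struct tx_state *s)
{
    assert(tx_in_flight(s) > 0);
    s->dirty_tx++;
    if (s->tx_full && tx_in_flight(s) < TX_RING_SIZE)
        s->tx_full = false;
}
```

In this model every reclaim that frees a slot also clears tx_full, so tx_state_is_stale() can never become true through tx_queue() and tx_reclaim() alone. Seeing it in the real driver suggests that some path updates the counters and the flag separately, but the sketch cannot say which one.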

So, vortex_tx_timeout does:
- outw(TxReset...) and wait
- tx_errors++
- Cyclone/Tornado is full bus master, so the condition is TRUE
  - cur_tx - dirty_tx > 0 is FALSE (as they are equal)
  - tx_full is 1 and cur_tx - dirty_tx = 0, so tx_full is set to 0 and 
    tbusy is cleared
  - outb(PKT_BUF...)
  - outw(DownUnstall...)
- outw(TxEnable...)
- trans_start = jiffies
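
The steps above can be modelled in user space to see which software state actually changes. This is only a sketch under stated assumptions: struct tx_model, hw_write() and model_tx_timeout() are hypothetical, every out[b,w] call is replaced by a stub that just counts, and the flag-clearing condition covers only the case reported here (the driver's real test may be broader).

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by the timeout path. */
struct tx_model {
    unsigned int cur_tx;
    unsigned int dirty_tx;
    bool tx_full;
    bool tbusy;
    unsigned long tx_errors;
    unsigned long trans_start;
    unsigned int hw_writes;   /* number of stubbed out[b,w] calls */
};

/* Stub standing in for outb()/outw(); it only counts the call. */
static void hw_write(struct tx_model *m)
{
    m->hw_writes++;
}

/* Follows the traced steps for a full bus master card. */
static void model_tx_timeout(struct tx_model *m, unsigned long now)
{
    hw_write(m);                          /* TxReset (the wait is omitted) */
    m->tx_errors++;

    if (m->cur_tx - m->dirty_tx > 0)
        hw_write(m);                      /* not taken when the ring is empty */

    /* Case reported above: marked full although nothing is in flight. */
    if (m->tx_full && m->cur_tx - m->dirty_tx == 0) {
        m->tx_full = false;
        m->tbusy = false;
    }

    hw_write(m);                          /* PKT_BUF... */
    hw_write(m);                          /* DownUnstall */
    hw_write(m);                          /* TxEnable */
    m->trans_start = now;                 /* trans_start = jiffies */
}
```

Starting from cur_tx == dirty_tx with both flags set, the model leaves the ring counters untouched; apart from the error counter and the timestamp, the only software-visible change is that tx_full and tbusy are cleared.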

So, in these circumstances this function does nothing useful (IMO) except
clear tx_full and tbusy. To check this, I removed the out[b,w] calls, so
that the only thing the function did was clear these two flags, and it
still worked.

How is it possible to have tx_full and tbusy set, while cur_tx=dirty_tx?

I was also thinking about testing for the above conditions in
boomerang_start_xmit and, when they hold, only resetting the tx_full and
tbusy flags (without doing any HW Tx reset) and then continuing as normal
(not returning 1), but I'm not sure that the pointers are correct in all
possible situations. This check would allow the packet to be queued in the
current call to boomerang_start_xmit. My understanding is that if it
returns 1, the packet will be retried by the upper level after some delay,
but I don't know what triggers the retry: another packet needing to be
sent, a timer, or whichever happens first?
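
A rough user-space sketch of the check described above, to show the intended control flow only. It is an assumption-laden model, not a patch: struct xmit_model and model_start_xmit() are hypothetical, tbusy is treated as set only when the ring fills (the real driver handles it differently), and nothing here verifies that the ring pointers are valid, which is exactly the open question.

```c
#include <stdbool.h>

#define TX_RING_SIZE 2

/* Hypothetical, simplified stand-in for the driver and device state. */
struct xmit_model {
    unsigned int cur_tx;
    unsigned int dirty_tx;
    bool tx_full;
    bool tbusy;
};

/* Returns 0 if the packet was queued, 1 if the caller should retry later. */
static int model_start_xmit(struct xmit_model *m)
{
    if (m->tbusy) {
        if (m->tx_full && m->cur_tx == m->dirty_tx) {
            /* Stale flags on an empty ring: clear them without any
             * hardware Tx reset and fall through to queue the packet. */
            m->tx_full = false;
            m->tbusy = false;
        } else {
            return 1;   /* genuinely busy: let the upper level retry */
        }
    }

    m->cur_tx++;        /* queue the packet in the next ring slot */
    if (m->cur_tx - m->dirty_tx >= TX_RING_SIZE) {
        m->tx_full = true;
        m->tbusy = true;
    }
    return 0;
}
```

With cur_tx == dirty_tx and both flags set, the model queues the packet in the same call and returns 0; with packets really in flight it still returns 1, as before.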

Thank you for any suggestion!

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De
