FIX: 0.99L and timeouts

Bogdan Costescu Bogdan.Costescu@IWR.Uni-Heidelberg.De
Fri Apr 14 11:54:59 2000



---830399112-1207890389-955727642=:27258
Content-Type: TEXT/PLAIN; charset=US-ASCII



Sorry for the delay, I had some urgent things on my agenda...

On Tue, 11 Apr 2000, Andrew Morton wrote:

> Oh.  I now understand your point.  cur_tx == dirty_tx and tx_full == 1.
> 
> I've looked and looked.  I can't see a logic bug, SMP race or IRQ race
> which could cause this.  I suggest you put some explicit code in various
> places to detect this condition and print something out, so we can
> identify it as near as poss to its cause.  Odd.

I think I've got it, and it is indeed an SMP race! First, I should mention
that I couldn't reproduce it running a UP kernel, but the conditions of
my test (CPU and network load) were perturbed by this, so that result is
not 100% conclusive.

OK, so the race is between the DownComplete branch of vortex_interrupt
(which is the only code updating vp->dirty_tx) and the end of
boomerang_start_xmit, after outw(DownUnstall) and restore_flags (which is
the only code setting vp->tx_full = 1).
There are in fact 2 cases (as shown by my log with TX_RING_SIZE=2; I had
only thought of 1 of them!):

1. in vortex_interrupt, when tx_full == 1 and vp->cur_tx - dirty_tx >
TX_RING_SIZE - 1
(the first and third occurrences in my log): vp->cur_tx = dirty_tx + 2 =>
vp->cur_tx - dirty_tx <= 1 is false, so vp->tx_full and dev->tbusy are not
cleared.
This happens when vp->cur_tx is incremented in boomerang_start_xmit at
about the same time as vp->dirty_tx = dirty_tx is executed in
vortex_interrupt.

2. in boomerang_start_xmit, vp->cur_tx++ is executed and the comparison is
made and is true (again vp->cur_tx = vp->dirty_tx + 2), but vp->tx_full is
not yet set; then in vortex_interrupt vp->dirty_tx is updated while
vp->tx_full is still 0, which means that vp->tx_full and dev->tbusy are not
cleared; then, back in boomerang_start_xmit, vp->tx_full is set.
(the second occurrence in my log)
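
To make case 2 easier to follow, here is a minimal user-space sketch that
replays that interleaving step by step. It is not driver code: struct
tx_state and all function names are made up for illustration, the two paths
are reduced to the three fields discussed above, and the interrupt is
assumed to reclaim every packet queued so far. The counter values are the
ones from the second occurrence in the attached log.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 2u

/* Hypothetical stand-in for the three driver fields involved in the race. */
struct tx_state {
    unsigned int cur_tx;   /* packets queued so far (transmit path) */
    unsigned int dirty_tx; /* packets reclaimed so far (interrupt path) */
    bool tx_full;
};

/* Transmit path, first half: count the new packet and test for a full ring. */
bool xmit_advance(struct tx_state *s)
{
    s->cur_tx++;
    return s->cur_tx - s->dirty_tx > TX_RING_SIZE - 1;
}

/* Transmit path, second half: store the result of the earlier test. */
void xmit_mark_full(struct tx_state *s, bool looked_full)
{
    if (looked_full)
        s->tx_full = true;
}

/* Interrupt path: publish the new dirty_tx, then clear tx_full if there is room. */
void irq_down_complete(struct tx_state *s, unsigned int new_dirty)
{
    s->dirty_tx = new_dirty;
    if (s->tx_full && s->cur_tx - new_dirty <= TX_RING_SIZE - 1)
        s->tx_full = false;
}

/* Full flag set with nothing outstanding: no later interrupt will clear it. */
bool is_stuck(const struct tx_state *s)
{
    return s->tx_full && s->cur_tx == s->dirty_tx;
}

/* Case 2: the interrupt lands between the test and the store of tx_full. */
struct tx_state replay_case2(void)
{
    struct tx_state s = { .cur_tx = 13334, .dirty_tx = 13333, .tx_full = false };
    bool looked_full = xmit_advance(&s); /* cur 13335, dirty 13333 */
    assert(looked_full);
    irq_down_complete(&s, s.cur_tx);     /* dirty 13335, tx_full still 0 */
    xmit_mark_full(&s, looked_full);     /* tx_full set too late */
    return s;
}

/* Same three steps with the interrupt held off until tx_full is stored. */
struct tx_state replay_without_race(void)
{
    struct tx_state s = { .cur_tx = 13334, .dirty_tx = 13333, .tx_full = false };
    bool looked_full = xmit_advance(&s);
    xmit_mark_full(&s, looked_full);
    irq_down_complete(&s, s.cur_tx);     /* sees tx_full == 1 and clears it */
    return s;
}
```

Calling is_stuck() on the result of replay_case2() reports the state seen
at the timeout (full 1, cur == dirty), while the result of
replay_without_race() has tx_full cleared. Case 1 is not replayed here.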

The fix that I found is to move (in boomerang_start_xmit) the lines:

	outw(DownUnstall, ioaddr + EL3_CMD);
	restore_flags(flags);

after
	vp->cur_tx++;
	if (vp->cur_tx ....)

This way, the card will not be able to generate an interrupt before we
actually update vp->cur_tx and vp->tx_full.
I'm not using interrupt mitigation, so I don't know if this change has any
effect on cpu_to_le32(~TxIntrUploaded).
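
The effect of the reordering can be checked with a small user-space model
(a sketch only, under stated assumptions, not a proof about the real
driver). The tail of the transmit path is split into three steps, and the
interrupt is assumed to be unable to run until the DownUnstall /
restore_flags step has executed; after that it may land after any step.
The step names, struct ring and the helper functions are hypothetical, and
the interrupt is assumed to reclaim everything queued so far.

```c
#include <assert.h>
#include <stdbool.h>

#define TX_RING_SIZE 2u
#define NSTEPS 3

/* Hypothetical model of the fields shared by the two code paths. */
struct ring {
    unsigned int cur_tx, dirty_tx;
    bool tx_full;
};

enum step { UNSTALL_AND_RESTORE, ADVANCE_CUR_TX, MARK_FULL };

/* Order of the transmit tail before and after the proposed change. */
const enum step original_order[NSTEPS] = { UNSTALL_AND_RESTORE, ADVANCE_CUR_TX, MARK_FULL };
const enum step fixed_order[NSTEPS]    = { ADVANCE_CUR_TX, MARK_FULL, UNSTALL_AND_RESTORE };

/* Simplified DownComplete handling: reclaim all packets, then maybe clear tx_full. */
void handle_down_complete(struct ring *r)
{
    r->dirty_tx = r->cur_tx;
    if (r->tx_full && r->cur_tx - r->dirty_tx <= TX_RING_SIZE - 1)
        r->tx_full = false;
}

/*
 * Run the three steps in the given order. The interrupt fires once, at the
 * first point at or after step index irq_after where interrupts are enabled.
 * Returns true if tx_full is left set with nothing outstanding.
 */
bool ends_stuck(const enum step order[NSTEPS], int irq_after)
{
    struct ring r = { .cur_tx = 1, .dirty_tx = 0, .tx_full = false };
    bool looked_full = false, irq_enabled = false, irq_done = false;

    assert(irq_after >= 0 && irq_after < NSTEPS);
    for (int i = 0; i < NSTEPS; i++) {
        switch (order[i]) {
        case UNSTALL_AND_RESTORE:
            irq_enabled = true;
            break;
        case ADVANCE_CUR_TX:
            r.cur_tx++;
            looked_full = r.cur_tx - r.dirty_tx > TX_RING_SIZE - 1;
            break;
        case MARK_FULL:
            if (looked_full)
                r.tx_full = true;
            break;
        }
        if (!irq_done && irq_enabled && i >= irq_after) {
            handle_down_complete(&r);
            irq_done = true;
        }
    }
    return r.tx_full && r.cur_tx == r.dirty_tx;
}

/* Count the interrupt positions that leave the queue stuck. */
int count_stuck(const enum step order[NSTEPS])
{
    int n = 0;
    for (int k = 0; k < NSTEPS; k++)
        if (ends_stuck(order, k))
            n++;
    return n;
}
```

In this model count_stuck(original_order) finds one bad interrupt position
(between the increment and the store of tx_full), and
count_stuck(fixed_order) finds none, because the interrupt can only run
once both fields have been updated.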

With this fix, I was not able to reproduce the timeouts any more.

By looking at the code, it seems that 0.99[M,N] have the same race, but I
have not been able to get either of them to work up to now...

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De

---830399112-1207890389-955727642=:27258
Content-Type: TEXT/PLAIN; charset=US-ASCII; name=log1
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.10.10004141754020.27258@kenzo.iwr.uni-heidelberg.de>
Content-Description: log
Content-Disposition: attachment; filename=log1

QXByIDE0IDE1OjMzOjA0IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBE
b3duTGlzdFB0ciBmdWxsOiAwIGN1ciAxMTc5MyBkaXJ0eSAxMTc5Mi4gDQpB
cHIgMTQgMTU6MzM6MDQgbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIERv
d25VbnN0YWxsIGZ1bGw6IDAgY3VyIDExNzkzIGRpcnR5IDExNzkyLiANCkFw
ciAxNCAxNTozMzowNCBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgcmVz
dG9yZV9mbGFncyBmdWxsOiAwIGN1ciAxMTc5MyBkaXJ0eSAxMTc5Mi4gDQpB
cHIgMTQgMTU6MzM6MDQgbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIGN1
cl90eCsrIGZ1bGw6IDAgY3VyIDExNzk0IGRpcnR5IDExNzkzLiANCkFwciAx
NCAxNTozMzowNCBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgaWYgZnVs
bDogMCBjdXIgMTE3OTQgZGlydHkgMTE3OTMuIA0KQXByIDE0IDE1OjMzOjA0
IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBEb3duTGlzdFB0ciBmdWxs
OiAwIGN1ciAxMTc5NCBkaXJ0eSAxMTc5My4gDQpBcHIgMTQgMTU6MzM6MDQg
bm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIERvd25VbnN0YWxsIGZ1bGw6
IDAgY3VyIDExNzk0IGRpcnR5IDExNzkzLiANCkFwciAxNCAxNTozMzowNCBu
b2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgcmVzdG9yZV9mbGFncyBmdWxs
OiAwIGN1ciAxMTc5NCBkaXJ0eSAxMTc5My4gDQpBcHIgMTQgMTU6MzM6MDQg
bm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIGN1cl90eCsrIGZ1bGw6IDAg
Y3VyIDExNzk1IGRpcnR5IDExNzk0LiANCkFwciAxNCAxNTozMzowNCBub2Rl
MDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgaWYgZnVsbDogMCBjdXIgMTE3OTUg
ZGlydHkgMTE3OTQuIA0KQXByIDE0IDE1OjMzOjA0IG5vZGUwMDkga2VybmVs
OiBldGgwOiBBZnRlciBEb3duTGlzdFB0ciBmdWxsOiAwIGN1ciAxMTc5NSBk
aXJ0eSAxMTc5NC4gDQpBcHIgMTQgMTU6MzM6MDQgbm9kZTAwOSBrZXJuZWw6
IGV0aDA6IEFmdGVyIERvd25VbnN0YWxsIGZ1bGw6IDAgY3VyIDExNzk1IGRp
cnR5IDExNzk0LiANCkFwciAxNCAxNTozMzowNCBub2RlMDA5IGtlcm5lbDog
ZXRoMDogQWZ0ZXIgcmVzdG9yZV9mbGFncyBmdWxsOiAwIGN1ciAxMTc5NSBk
aXJ0eSAxMTc5NC4gDQpBcHIgMTQgMTU6MzM6MDQgbm9kZTAwOSBrZXJuZWw6
IGV0aDA6IEFmdGVyIGN1cl90eCsrIGZ1bGw6IDAgY3VyIDExNzk2IGRpcnR5
IDExNzk0LiANCkFwciAxNCAxNTozMzowNCBub2RlMDA5IGtlcm5lbDogZXRo
MDogQWZ0ZXIgaWYgZnVsbDogMSBjdXIgMTE3OTYgZGlydHkgMTE3OTQuIA0K
QXByIDE0IDE1OjMzOjA4IG5vZGUwMDkga2VybmVsOiBldGgwOiB0cmFuc21p
dCB0aW1lZCBvdXQsIHR4X3N0YXR1cyAwMCBzdGF0dXMgZTAwMC4gDQpBcHIg
MTQgMTU6MzM6MDggbm9kZTAwOSBrZXJuZWw6ICAgRmxhZ3M7IGJ1cy1tYXN0
ZXIgMSwgZnVsbCAxOyBkaXJ0eSAxMTc5NiBjdXJyZW50IDExNzk2LiANCkFw
ciAxNCAxNTozMzowOCBub2RlMDA5IGtlcm5lbDogICBUcmFuc21pdCBsaXN0
IDAwMDAwMDAwIHZzLiBjZmZkOGMyMC4gDQpBcHIgMTQgMTU6MzM6MDggbm9k
ZTAwOSBrZXJuZWw6ICAgMDogQGNmZmQ4YzIwICBsZW5ndGggODAwMDAwNDIg
c3RhdHVzIDgwMDEwMDQyIA0KQXByIDE0IDE1OjMzOjA4IG5vZGUwMDkga2Vy
bmVsOiAgIDE6IEBjZmZkOGMzMCAgbGVuZ3RoIDgwMDAwMDQyIHN0YXR1cyA4
MDAxMDA0MiANCkFwciAxNCAxNTozMzowOCBub2RlMDA5IGtlcm5lbDogZXRo
MDogUmVzZXR0aW5nIHRoZSBUeCByaW5nIHBvaW50ZXIuIA0KQXByIDE0IDE1
OjMzOjA4IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBEb3duTGlzdFB0
ciBmdWxsOiAwIGN1ciAxMTc5NiBkaXJ0eSAxMTc5Ni4gDQpBcHIgMTQgMTU6
MzM6MDggbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIERvd25VbnN0YWxs
IGZ1bGw6IDAgY3VyIDExNzk2IGRpcnR5IDExNzk2LiANCkFwciAxNCAxNToz
MzowOCBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgcmVzdG9yZV9mbGFn
cyBmdWxsOiAwIGN1ciAxMTc5NiBkaXJ0eSAxMTc5Ni4gDQpBcHIgMTQgMTU6
MzM6MDkgbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIGN1cl90eCsrIGZ1
bGw6IDAgY3VyIDExNzk3IGRpcnR5IDExNzk2LiANCkFwciAxNCAxNTozMzow
OSBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgaWYgZnVsbDogMCBjdXIg
MTE3OTcgZGlydHkgMTE3OTYuIA0KDQpBcHIgMTQgMTU6MzM6MTIgbm9kZTAw
OSBrZXJuZWw6IGV0aDA6IEFmdGVyIERvd25MaXN0UHRyIGZ1bGw6IDAgY3Vy
IDEzMzM0IGRpcnR5IDEzMzMzLiANCkFwciAxNCAxNTozMzoxMiBub2RlMDA5
IGtlcm5lbDogZXRoMDogQWZ0ZXIgRG93blVuc3RhbGwgZnVsbDogMCBjdXIg
MTMzMzQgZGlydHkgMTMzMzMuIA0KQXByIDE0IDE1OjMzOjEyIG5vZGUwMDkg
a2VybmVsOiBldGgwOiBBZnRlciByZXN0b3JlX2ZsYWdzIGZ1bGw6IDAgY3Vy
IDEzMzM0IGRpcnR5IDEzMzMzLiANCkFwciAxNCAxNTozMzoxMiBub2RlMDA5
IGtlcm5lbDogZXRoMDogQWZ0ZXIgY3VyX3R4KysgZnVsbDogMCBjdXIgMTMz
MzUgZGlydHkgMTMzMzMuIA0KQXByIDE0IDE1OjMzOjEyIG5vZGUwMDkga2Vy
bmVsOiBldGgwOiBBZnRlciBpZiBmdWxsOiAxIGN1ciAxMzMzNSBkaXJ0eSAx
MzMzNS4gDQpBcHIgMTQgMTU6MzM6MTggbm9kZTAwOSBrZXJuZWw6IGV0aDA6
IHRyYW5zbWl0IHRpbWVkIG91dCwgdHhfc3RhdHVzIDAwIHN0YXR1cyBlMDAw
LiANCkFwciAxNCAxNTozMzoxOCBub2RlMDA5IGtlcm5lbDogICBGbGFnczsg
YnVzLW1hc3RlciAxLCBmdWxsIDE7IGRpcnR5IDEzMzM1IGN1cnJlbnQgMTMz
MzUuIA0KQXByIDE0IDE1OjMzOjE4IG5vZGUwMDkga2VybmVsOiAgIFRyYW5z
bWl0IGxpc3QgMDAwMDAwMDAgdnMuIGNmZmQ4YzMwLiANCkFwciAxNCAxNToz
MzoxOCBub2RlMDA5IGtlcm5lbDogICAwOiBAY2ZmZDhjMjAgIGxlbmd0aCA4
MDAwMDVlYSBzdGF0dXMgODAwMTA1ZWEgDQpBcHIgMTQgMTU6MzM6MTggbm9k
ZTAwOSBrZXJuZWw6ICAgMTogQGNmZmQ4YzMwICBsZW5ndGggODAwMDA1ZWEg
c3RhdHVzIDgwMDEwNWVhIA0KQXByIDE0IDE1OjMzOjE4IG5vZGUwMDkga2Vy
bmVsOiBldGgwOiBSZXNldHRpbmcgdGhlIFR4IHJpbmcgcG9pbnRlci4gDQpB
cHIgMTQgMTU6MzM6MTggbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIERv
d25MaXN0UHRyIGZ1bGw6IDAgY3VyIDEzMzM1IGRpcnR5IDEzMzM1LiANCkFw
ciAxNCAxNTozMzoxOCBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgRG93
blVuc3RhbGwgZnVsbDogMCBjdXIgMTMzMzUgZGlydHkgMTMzMzUuIA0KQXBy
IDE0IDE1OjMzOjE4IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciByZXN0
b3JlX2ZsYWdzIGZ1bGw6IDAgY3VyIDEzMzM1IGRpcnR5IDEzMzM1LiANCkFw
ciAxNCAxNTozMzoxOCBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgY3Vy
X3R4KysgZnVsbDogMCBjdXIgMTMzMzYgZGlydHkgMTMzMzUuIA0KQXByIDE0
IDE1OjMzOjE4IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBpZiBmdWxs
OiAwIGN1ciAxMzMzNiBkaXJ0eSAxMzMzNS4gDQoNCkFwciAxNCAxNTozMzoy
NiBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgRG93bkxpc3RQdHIgZnVs
bDogMCBjdXIgMjMyMjcgZGlydHkgMjMyMjYuIA0KQXByIDE0IDE1OjMzOjI2
IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBEb3duVW5zdGFsbCBmdWxs
OiAwIGN1ciAyMzIyNyBkaXJ0eSAyMzIyNi4gDQpBcHIgMTQgMTU6MzM6MjYg
bm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIHJlc3RvcmVfZmxhZ3MgZnVs
bDogMCBjdXIgMjMyMjcgZGlydHkgMjMyMjYuIA0KQXByIDE0IDE1OjMzOjI2
IG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBjdXJfdHgrKyBmdWxsOiAw
IGN1ciAyMzIyOCBkaXJ0eSAyMzIyNy4gDQpBcHIgMTQgMTU6MzM6MjYgbm9k
ZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIGlmIGZ1bGw6IDAgY3VyIDIzMjI4
IGRpcnR5IDIzMjI3LiANCkFwciAxNCAxNTozMzoyNiBub2RlMDA5IGtlcm5l
bDogZXRoMDogQWZ0ZXIgRG93bkxpc3RQdHIgZnVsbDogMCBjdXIgMjMyMjgg
ZGlydHkgMjMyMjcuIA0KQXByIDE0IDE1OjMzOjI2IG5vZGUwMDkga2VybmVs
OiBldGgwOiBBZnRlciBEb3duVW5zdGFsbCBmdWxsOiAwIGN1ciAyMzIyOCBk
aXJ0eSAyMzIyNy4gDQpBcHIgMTQgMTU6MzM6MjYgbm9kZTAwOSBrZXJuZWw6
IGV0aDA6IEFmdGVyIHJlc3RvcmVfZmxhZ3MgZnVsbDogMCBjdXIgMjMyMjgg
ZGlydHkgMjMyMjcuIA0KQXByIDE0IDE1OjMzOjI2IG5vZGUwMDkga2VybmVs
OiBldGgwOiBBZnRlciBjdXJfdHgrKyBmdWxsOiAwIGN1ciAyMzIyOSBkaXJ0
eSAyMzIyNy4gDQpBcHIgMTQgMTU6MzM6MjYgbm9kZTAwOSBrZXJuZWw6IGV0
aDA6IEFmdGVyIGlmIGZ1bGw6IDEgY3VyIDIzMjI5IGRpcnR5IDIzMjI3LiAN
CkFwciAxNCAxNTozMzozMyBub2RlMDA5IGtlcm5lbDogZXRoMDogdHJhbnNt
aXQgdGltZWQgb3V0LCB0eF9zdGF0dXMgMDAgc3RhdHVzIGUwMDAuIA0KQXBy
IDE0IDE1OjMzOjMzIG5vZGUwMDkga2VybmVsOiAgIEZsYWdzOyBidXMtbWFz
dGVyIDEsIGZ1bGwgMTsgZGlydHkgMjMyMjkgY3VycmVudCAyMzIyOS4gDQpB
cHIgMTQgMTU6MzM6MzMgbm9kZTAwOSBrZXJuZWw6ICAgVHJhbnNtaXQgbGlz
dCAwMDAwMDAwMCB2cy4gY2ZmZDhjMzAuIA0KQXByIDE0IDE1OjMzOjMzIG5v
ZGUwMDkga2VybmVsOiAgIDA6IEBjZmZkOGMyMCAgbGVuZ3RoIDgwMDAwMDQy
IHN0YXR1cyA4MDAxMDA0MiANCkFwciAxNCAxNTozMzozMyBub2RlMDA5IGtl
cm5lbDogICAxOiBAY2ZmZDhjMzAgIGxlbmd0aCA4MDAwMDA0MiBzdGF0dXMg
ODAwMTAwNDIgDQpBcHIgMTQgMTU6MzM6MzMgbm9kZTAwOSBrZXJuZWw6IGV0
aDA6IFJlc2V0dGluZyB0aGUgVHggcmluZyBwb2ludGVyLiANCkFwciAxNCAx
NTozMzozMyBub2RlMDA5IGtlcm5lbDogZXRoMDogQWZ0ZXIgRG93bkxpc3RQ
dHIgZnVsbDogMCBjdXIgMjMyMjkgZGlydHkgMjMyMjkuIA0KQXByIDE0IDE1
OjMzOjMzIG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBEb3duVW5zdGFs
bCBmdWxsOiAwIGN1ciAyMzIyOSBkaXJ0eSAyMzIyOS4gDQpBcHIgMTQgMTU6
MzM6MzMgbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIHJlc3RvcmVfZmxh
Z3MgZnVsbDogMCBjdXIgMjMyMjkgZGlydHkgMjMyMjkuIA0KQXByIDE0IDE1
OjMzOjMzIG5vZGUwMDkga2VybmVsOiBldGgwOiBBZnRlciBjdXJfdHgrKyBm
dWxsOiAwIGN1ciAyMzIzMCBkaXJ0eSAyMzIyOS4gDQpBcHIgMTQgMTU6MzM6
MzMgbm9kZTAwOSBrZXJuZWw6IGV0aDA6IEFmdGVyIGlmIGZ1bGw6IDAgY3Vy
IDIzMjMwIGRpcnR5IDIzMjI5LiANCg==
---830399112-1207890389-955727642=:27258--