[eepro100] eepro100 failures

Steinar Hauan steinhau+@andrew.cmu.edu
Thu, 23 Aug 2001 15:50:00 -0400 (EDT)



---559023410-851401618-998596200=:12018
Content-Type: TEXT/PLAIN; charset=US-ASCII

I have a small cluster of dual-CPU P3 machines running RedHat 7.1++
that have network trouble with Intel Pro/100 adapters. Specifically,
a diff of the tcpdump output from the Tx and Rx ends -- reproducible with
both nfs and ftp -- shows that 1 bit is being flipped. The error is quite
rare; only a few specific bit sequences located at specific offsets
produce it. A typical bit pattern is 5-12 bytes long and causes errors if
found at offset j*4+1 in a packet, with j=1, 2, ... , N.
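
To illustrate the kind of comparison described above, here is a minimal
Python sketch (not the procedure actually used) that takes the same payload
as seen on the sending and the receiving side and reports every flipped bit
with its byte offset. The function name find_bit_flips and the sample
payloads are hypothetical, and extracting the payload bytes from the tcpdump
captures is assumed to have been done already.

```python
def find_bit_flips(sent: bytes, received: bytes):
    """Return a list of (byte_offset, bit_index) for every differing bit.

    bit_index 0 is the least significant bit of the byte.
    """
    if len(sent) != len(received):
        raise ValueError("payloads differ in length")
    flips = []
    for offset, (a, b) in enumerate(zip(sent, received)):
        diff = a ^ b  # set bits mark positions where the two bytes disagree
        for bit in range(8):
            if diff & (1 << bit):
                flips.append((offset, bit))
    return flips


# Hypothetical example: one bit flipped in the byte at offset 5 (= 1*4 + 1).
sent = bytes(range(16))
received = bytearray(sent)
received[5] ^= 0x10  # flip bit 4 of that byte
flips = find_bit_flips(sent, bytes(received))
print(flips)  # prints [(5, 4)]
```

A single entry in the result corresponds to the "1 bit flipped" case; checking
(offset - 1) % 4 == 0 on the reported offsets would test the j*4+1 pattern.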

Unfortunately, the error is intermittent. I can construct 1kb files
(see attached script) that "almost always" fail to transfer with
ftp. Occasionally, however, the transfer does succeed.
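
Independently of the attached script, the idea can be sketched roughly as
follows in Python: build a 1 kB buffer of filler bytes and place a short byte
pattern at offset j*4+1. The pattern bytes, the filler value and the name
build_test_payload are placeholders chosen for illustration; they are not the
sequences that actually trigger the error.

```python
def build_test_payload(pattern: bytes, j: int, size: int = 1024,
                       filler: int = 0x00) -> bytes:
    """Return `size` bytes of filler with `pattern` placed at offset j*4 + 1."""
    offset = j * 4 + 1
    if j < 1 or offset + len(pattern) > size:
        raise ValueError("pattern does not fit at the requested offset")
    buf = bytearray([filler]) * size
    # Same-length slice assignment overwrites in place; the size stays fixed.
    buf[offset:offset + len(pattern)] = pattern
    return bytes(buf)


# Placeholder 6-byte pattern; the real failing sequences are not listed here.
pattern = bytes([0xAA, 0x55, 0xAA, 0x55, 0xAA, 0x55])
payload = build_test_payload(pattern, j=3)  # pattern starts at offset 13
```

The resulting bytes can be written to a file opened in binary mode and then
transferred with ftp, comparing the copy on the receiving side against the
original.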

Drivers used:
  - eepro100.c:v1.09t/1.36 (standard in 2.4.9)
  - eepro100.c:v1.15 5/18/2001 (from scyld)
  - eepro100.c:v1.17 7/30/2001

I've also used eepro100-diag.c:v2.05 6/13/2001 to turn off
"sleep mode" in the eeprom of all the cards. In addition, I've manually
locked down the IRQs for the video and network cards and booted both with
and without the "noapic" kernel option.

More equipment specifications:
  - mainboard IWill DVD266 (Via Apollo Pro266 w/ddr, latest bios)
  - other hardware found in lspci.out attachment
  - onboard sound, usb, parallel & serial port disabled in bios
  - 4x256mb crucial ddr memory: memtest86 has run for 48 hrs
    and parallel kernel compiles for 96 hrs without errors
  - switch: HP ProCurve 2524 (w/latest firmware+software)

I've tried about 30 different patch cables (machine made, cat 5e)
and 15 different Intel Pro/100 cards. I've also isolated the machines
on a full-duplex hub and reproduced the errors there. Finally, I've
run the machines off a UPS with only 1 memory module present to
minimize the risk of power trouble.

Now here is the main cause of concern. Yesterday, I went out to my
local computer store and bought 4 new ethernet cards.

  1x 3Com PCI 3c590 Vortex 10Mbps (10Tx-HD)
  1x 3Com 3c905B Cyclone 100baseTx (100Tx-FD)
  1x Intellinet 10/100 PCI network card
  1x SMC 1244TX Rev B (100TX-FD)

  The last two cards use the RealTek RTL8139 chip (100Tx-FD).

Whenever I boot my machines with one of the above cards together
with the "noapic" kernel option, the errors go away. Could this be a
driver error? What else could it be? Why does "noapic" make a difference
when the IRQs are locked to specific PCI slots (with nothing else to
share them with)?

Any input would be appreciated.
--
  Steinar Hauan, dept of ChemE   -   steinhau+@andrew.cmu.edu
  Carnegie Mellon University, Pittsburgh PA, USA

---559023410-851401618-998596200=:12018
Content-Type: TEXT/PLAIN; charset=US-ASCII; name="diags.txt"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.GSO.4.33L-022.0108231550000.12018@unix14.andrew.cmu.edu>
Content-Description: 
Content-Disposition: attachment; filename="diags.txt"

QXVnIDIzIDE1OjQzOjI1IGMxbjA2IGtlcm5lbDogZWVwcm8xMDAuYzp2MS4x
NyA3LzMwLzIwMDEgRG9uYWxkIEJlY2tlciA8YmVja2VyQHNjeWxkLmNvbT4N
CkF1ZyAyMyAxNTo0MzoyNSBjMW4wNiBrZXJuZWw6ICAgaHR0cDovL3d3dy5z
Y3lsZC5jb20vbmV0d29yay9lZXBybzEwMC5odG1sDQpBdWcgMjMgMTU6NDM6
MjUgYzFuMDYga2VybmVsOiBldGgwOiBJbnRlbCBpODI1NTkgcmV2IDggYXQg
MHhmODk1MTAwMCwgMDA6MDI6QjM6M0U6QTk6MkIsIElSUSAxMS4NCkF1ZyAy
MyAxNTo0MzoyNSBjMW4wNiBrZXJuZWw6ICAgUmVjZWl2ZXIgbG9jay11cCBi
dWcgZXhpc3RzIC0tIGVuYWJsaW5nIHdvcmstYXJvdW5kLg0KQXVnIDIzIDE1
OjQzOjI1IGMxbjA2IGtlcm5lbDogICBCb2FyZCBhc3NlbWJseSA3MjEzODMt
MDE3LCBQaHlzaWNhbCBjb25uZWN0b3JzIHByZXNlbnQ6IFJKNDUNCkF1ZyAy
MyAxNTo0MzoyNSBjMW4wNiBrZXJuZWw6ICAgUHJpbWFyeSBpbnRlcmZhY2Ug
Y2hpcCBpODI1NTUgUEhZICMxLg0KQXVnIDIzIDE1OjQzOjI1IGMxbjA2IGtl
cm5lbDogICBHZW5lcmFsIHNlbGYtdGVzdDogcGFzc2VkLg0KQXVnIDIzIDE1
OjQzOjI1IGMxbjA2IGtlcm5lbDogICBTZXJpYWwgc3ViLXN5c3RlbSBzZWxm
LXRlc3Q6IHBhc3NlZC4NCkF1ZyAyMyAxNTo0MzoyNSBjMW4wNiBrZXJuZWw6
ICAgSW50ZXJuYWwgcmVnaXN0ZXJzIHNlbGYtdGVzdDogcGFzc2VkLg0KQXVn
IDIzIDE1OjQzOjI1IGMxbjA2IGtlcm5lbDogICBST00gY2hlY2tzdW0gc2Vs
Zi10ZXN0OiBwYXNzZWQgKDB4MDRmNDUxOGIpLg0KDQpbcm9vdEBjMW4wNiBy
b290XSMgL3N0b3JlL3NiaW4vZWVwcm8xMDAtZGlhZyAtZmFlbQ0KZWVwcm8x
MDAtZGlhZy5jOnYyLjA1IDYvMTMvMjAwMSBEb25hbGQgQmVja2VyIChiZWNr
ZXJAc2N5bGQuY29tKQ0KIGh0dHA6Ly93d3cuc2N5bGQuY29tL2RpYWcvaW5k
ZXguaHRtbA0KSW5kZXggIzE6IEZvdW5kIGEgSW50ZWwgaTgyNTU3LzgvOSBF
dGhlckV4cHJlc3NQcm8xMDAgYWRhcHRlciBhdCAweGQ0MDAuDQppODI1NTcg
Y2hpcCByZWdpc3RlcnMgYXQgMHhkNDAwOg0KICAwMDAwMDA1MCAzNzk1OTE0
MCAwMDAwMDAwMCAwMDA4MDAwMiAxODI1NDFlMSAwMDAwMDYwMA0KICBObyBp
bnRlcnJ1cHQgc291cmNlcyBhcmUgcGVuZGluZy4NCiAgIFRoZSB0cmFuc21p
dCB1bml0IHN0YXRlIGlzICdTdXNwZW5kZWQnLg0KICAgVGhlIHJlY2VpdmUg
dW5pdCBzdGF0ZSBpcyAnUmVhZHknLg0KICBUaGlzIHN0YXR1cyBpcyBub3Jt
YWwgZm9yIGFuIGFjdGl2YXRlZCBidXQgaWRsZSBpbnRlcmZhY2UuDQpJbnRl
bCBFdGhlckV4cHJlc3MgUHJvIDEwLzEwMCBFRVBST00gY29udGVudHM6DQog
IFN0YXRpb24gYWRkcmVzcyAwMDowMjpCMzozRTpBOToyQi4NCiAgQm9hcmQg
YXNzZW1ibHkgNzIxMzgzLTAxNywgUGh5c2ljYWwgY29ubmVjdG9ycyBwcmVz
ZW50OiBSSjQ1DQogIFByaW1hcnkgaW50ZXJmYWNlIGNoaXAgaTgyNTU1IFBI
WSAjMS4NCiBNSUkgUEhZICMxIHRyYW5zY2VpdmVyIHJlZ2lzdGVyczoNCiAg
MzAwMCA3ODJkIDAyYTggMDE1NCAwNWUxIDQxZTEgMDAwMyAwMDAwDQogIDAw
MDAgMDAwMCAwMDAwIDAwMDAgMDAwMCAwMDAwIDAwMDAgMDAwMA0KICAwMjAz
IDAwMDAgMDAwMSA4ZjJiIDAwMDAgMDAwMCA3NzQzIDAwMDANCiAgMDAwMCAw
MDAwIDAwMDAgMDAwMCAwMDAwIDAwMDAgMDAwMCAwMDAwLg0KIE1JSSBQSFkg
IzEgdHJhbnNjZWl2ZXIgcmVnaXN0ZXJzOg0KICAgMzAwMCA3ODJkIDAyYTgg
MDE1NCAwNWUxIDQxZTEgMDAwMSAwMDAwDQogICAwMDAwIDAwMDAgMDAwMCAw
MDAwIDAwMDAgMDAwMCAwMDAwIDAwMDANCiAgIDBhMDMgMDAwMCAwMDAxIDAw
MDAgMDAwMCAwMDAwIDAwMDAgMDAwMA0KICAgMDAwMCAwMDAwIDAwMDAgMDAw
MCAwMDAwIDAwMDAgMDAwMCAwMDAwLg0KIEJhc2ljIG1vZGUgY29udHJvbCBy
ZWdpc3RlciAweDMwMDA6IEF1dG8tbmVnb3RpYXRpb24gZW5hYmxlZC4NCiBC
YXNpYyBtb2RlIHN0YXR1cyByZWdpc3RlciAweDc4MmQgLi4uIDc4MmQuDQog
ICBMaW5rIHN0YXR1czogZXN0YWJsaXNoZWQuDQogICBDYXBhYmxlIG9mICAx
MDBiYXNlVHgtRkQgMTAwYmFzZVR4IDEwYmFzZVQtRkQgMTBiYXNlVC4NCiAg
IEFibGUgdG8gcGVyZm9ybSBBdXRvLW5lZ290aWF0aW9uLCBuZWdvdGlhdGlv
biBjb21wbGV0ZS4NCiBWZW5kb3IgSUQgaXMgMDA6YWE6MDA6LS06LS06LS0s
IG1vZGVsIDIxIHJldi4gNC4NCiAgIE5vIHNwZWNpZmljIGluZm9ybWF0aW9u
IGlzIGtub3duIGFib3V0IHRoaXMgdHJhbnNjZWl2ZXIgdHlwZS4NCiBJJ20g
YWR2ZXJ0aXNpbmcgMDVlMTogRmxvdy1jb250cm9sIDEwMGJhc2VUeC1GRCAx
MDBiYXNlVHggMTBiYXNlVC1GRCAxMGJhc2VUDQogICBBZHZlcnRpc2luZyBu
byBhZGRpdGlvbmFsIGluZm8gcGFnZXMuDQogICBJRUVFIDgwMi4zIENTTUEv
Q0QgcHJvdG9jb2wuDQogTGluayBwYXJ0bmVyIGNhcGFiaWxpdHkgaXMgNDFl
MTogMTAwYmFzZVR4LUZEIDEwMGJhc2VUeCAxMGJhc2VULUZEIDEwYmFzZVQu
DQogICBOZWdvdGlhdGlvbiAgY29tcGxldGVkLg0KDQpbcm9vdEBjMW4wNiBy
b290XSMgbHNwY2kgLXYNCjAwOjAwLjAgSG9zdCBicmlkZ2U6IFZJQSBUZWNo
bm9sb2dpZXMsIEluYy4gVlQ4NjMzIFtBcG9sbG8gUHJvMjY2XSAocmV2IDAx
KQ0KCUZsYWdzOiBidXMgbWFzdGVyLCBtZWRpdW0gZGV2c2VsLCBsYXRlbmN5
IDgNCglNZW1vcnkgYXQgZDE4MDAwMDAgKDMyLWJpdCwgcHJlZmV0Y2hhYmxl
KSBbc2l6ZT00TV0NCglDYXBhYmlsaXRpZXM6IFthMF0gQUdQIHZlcnNpb24g
Mi4wDQoJQ2FwYWJpbGl0aWVzOiBbYzBdIFBvd2VyIE1hbmFnZW1lbnQgdmVy
c2lvbiAyDQoNCjAwOjAxLjAgUENJIGJyaWRnZTogVklBIFRlY2hub2xvZ2ll
cywgSW5jLiBWVDg2MzMgW0Fwb2xsbyBQcm8yNjYgQUdQXSAocHJvZy1pZiAw
MCBbTm9ybWFsIGRlY29kZV0pDQoJRmxhZ3M6IGJ1cyBtYXN0ZXIsIDY2TWh6
LCBtZWRpdW0gZGV2c2VsLCBsYXRlbmN5IDANCglCdXM6IHByaW1hcnk9MDAs
IHNlY29uZGFyeT0wMSwgc3Vib3JkaW5hdGU9MDEsIHNlYy1sYXRlbmN5PTAN
CglDYXBhYmlsaXRpZXM6IFs4MF0gUG93ZXIgTWFuYWdlbWVudCB2ZXJzaW9u
IDINCg0KMDA6MGEuMCBWR0EgY29tcGF0aWJsZSBjb250cm9sbGVyOiBTaWxp
Y29uIEludGVncmF0ZWQgU3lzdGVtcyBbU2lTXSA4NkMzMjYgKHJldiAwYikg
KHByb2ctaWYgMDAgW1ZHQV0pDQoJU3Vic3lzdGVtOiBTaWxpY29uIEludGVn
cmF0ZWQgU3lzdGVtcyBbU2lTXSBTaVM2MzI2IEdVSSBBY2NlbGVyYXRvcg0K
CUZsYWdzOiBidXMgbWFzdGVyLCA2Nk1oeiwgbWVkaXVtIGRldnNlbCwgbGF0
ZW5jeSAzMiwgSVJRIDEwDQoJTWVtb3J5IGF0IGQxMDAwMDAwICgzMi1iaXQs
IHByZWZldGNoYWJsZSkgW3NpemU9OE1dDQoJTWVtb3J5IGF0IGQxZDAwMDAw
ICgzMi1iaXQsIG5vbi1wcmVmZXRjaGFibGUpIFtzaXplPTY0S10NCglJL08g
cG9ydHMgYXQgZDAwMCBbc2l6ZT0xMjhdDQoJRXhwYW5zaW9uIFJPTSBhdCA8
dW5hc3NpZ25lZD4gW2Rpc2FibGVkXSBbc2l6ZT02NEtdDQoJQ2FwYWJpbGl0
aWVzOiBbNDBdIFBvd2VyIE1hbmFnZW1lbnQgdmVyc2lvbiAxDQoNCjAwOjBi
LjAgRXRoZXJuZXQgY29udHJvbGxlcjogSW50ZWwgQ29ycG9yYXRpb24gODI1
NTcgW0V0aGVybmV0IFBybyAxMDBdIChyZXYgMDgpDQoJU3Vic3lzdGVtOiBJ
bnRlbCBDb3Jwb3JhdGlvbiBFdGhlckV4cHJlc3MgUFJPLzEwMCsgTWFuYWdl
bWVudCBBZGFwdGVyDQoJRmxhZ3M6IGJ1cyBtYXN0ZXIsIG1lZGl1bSBkZXZz
ZWwsIGxhdGVuY3kgMzIsIElSUSAxMQ0KCU1lbW9yeSBhdCBkMWQxMDAwMCAo
MzItYml0LCBub24tcHJlZmV0Y2hhYmxlKSBbc2l6ZT00S10NCglJL08gcG9y
dHMgYXQgZDQwMCBbc2l6ZT02NF0NCglNZW1vcnkgYXQgZDFjMDAwMDAgKDMy
LWJpdCwgbm9uLXByZWZldGNoYWJsZSkgW3NpemU9MU1dDQoJRXhwYW5zaW9u
IFJPTSBhdCA8dW5hc3NpZ25lZD4gW2Rpc2FibGVkXSBbc2l6ZT0xTV0NCglD
YXBhYmlsaXRpZXM6IFtkY10gUG93ZXIgTWFuYWdlbWVudCB2ZXJzaW9uIDIN
Cg0KMDA6MTEuMCBJU0EgYnJpZGdlOiBWSUEgVGVjaG5vbG9naWVzLCBJbmMu
IFZUODIzMyBQQ0kgdG8gSVNBIEJyaWRnZQ0KCVN1YnN5c3RlbTogVklBIFRl
Y2hub2xvZ2llcywgSW5jLjogVW5rbm93biBkZXZpY2UgMDAwMA0KCUZsYWdz
OiBidXMgbWFzdGVyLCBzdGVwcGluZywgbWVkaXVtIGRldnNlbCwgbGF0ZW5j
eSAwDQoJQ2FwYWJpbGl0aWVzOiBbYzBdIFBvd2VyIE1hbmFnZW1lbnQgdmVy
c2lvbiAyDQoNCjAwOjExLjEgSURFIGludGVyZmFjZTogVklBIFRlY2hub2xv
Z2llcywgSW5jLiBCdXMgTWFzdGVyIElERSAocmV2IDA2KSAocHJvZy1pZiA4
YSBbTWFzdGVyIFNlY1AgUHJpUF0pDQoJU3Vic3lzdGVtOiBWSUEgVGVjaG5v
bG9naWVzLCBJbmMuIEJ1cyBNYXN0ZXIgSURFDQoJRmxhZ3M6IGJ1cyBtYXN0
ZXIsIG1lZGl1bSBkZXZzZWwsIGxhdGVuY3kgMzINCglJL08gcG9ydHMgYXQg
ZDgwMCBbc2l6ZT0xNl0NCglDYXBhYmlsaXRpZXM6IFtjMF0gUG93ZXIgTWFu
YWdlbWVudCB2ZXJzaW9uIDINCg==
---559023410-851401618-998596200=:12018--