From arthur@levelogic.com Wed, 31 Jan 2001 11:11:22 -0800
Date: Wed, 31 Jan 2001 11:11:22 -0800
From: Arthur M. Kang arthur@levelogic.com
Subject: [eepro100] RE: card reports no resources / RX buffers

I've found your messages regarding the no resources for nic. What I couldn't find was how you fixed your problem.

Do you think you could share with me what the fix/patch/workaround was?

Any help is appreciated.

Arthur

From msox@soulscream.com Wed, 31 Jan 2001 16:35:46 -0500
Date: Wed, 31 Jan 2001 16:35:46 -0500
From: Michael Sox msox@soulscream.com
Subject: [eepro100] eepro100 adapters not listing statistics correctly.

Greetings all,

Two weird problems that I am having with my Compaq DL380s. I have three, all of which were running Redhat Linux 7.0, but I upgraded them to the 2.4.0 and then to the 2.4.1 kernel. All three have dual Intel EtherExpress Pro 100 PCI cards in them. One is the one that comes with the Compaq box (on-board), the other is one that we added from Compaq later.

The first problem is that due to a miscommunication I mistakenly brought both NICs up on all the machines thinking that the second interface had a live network connection. It turns out that the second card on all three Compaqs was not plugged in, but I could not tell because I could ping both IP addresses on both machines on either interface.

My secondary problem though is of more importance. Even though we are forcing traffic through the second interface I am seeing no increase in statistics for that interface in ifconfig. I would appreciate any diagnostics or info that would be helpful in approaching this problem.

Michael Sox
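For the statistics question above, one way to cross-check what ifconfig shows is to read the kernel's per-interface counters directly and compare two snapshots taken around a burst of test traffic. What follows is only a minimal sketch, not a tested diagnostic. It assumes the usual Linux /proc/net/dev layout (two header lines, then one line per interface with 8 receive fields followed by 8 transmit fields). The names parse_net_dev, counter_delta and read_proc_net_dev are made up for illustration and do not belong to any existing tool.

```python
"""Sketch: compare per-interface counters between two /proc/net/dev snapshots.

Assumption: the 16-column /proc/net/dev layout (8 receive fields, then
8 transmit fields). All function names here are hypothetical.
"""


def parse_net_dev(text):
    """Return {iface: {rx_bytes, rx_packets, tx_bytes, tx_packets}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain '|' but no ':'
        name, fields = line.split(":", 1)
        values = fields.split()
        if len(values) < 16:
            continue  # not a counter line in the assumed format
        stats[name.strip()] = {
            "rx_bytes": int(values[0]),
            "rx_packets": int(values[1]),
            "tx_bytes": int(values[8]),
            "tx_packets": int(values[9]),
        }
    return stats


def counter_delta(before, after):
    """Difference of each counter for interfaces present in both snapshots."""
    return {
        iface: {key: after[iface][key] - before[iface][key]
                for key in before[iface]}
        for iface in before
        if iface in after
    }


def read_proc_net_dev(path="/proc/net/dev"):
    """Read the raw counter table; the default path is the usual Linux location."""
    with open(path) as handle:
        return handle.read()


# Typical use (not executed here): take a snapshot, generate traffic that
# is supposed to use the second interface, take another snapshot, then
# print counter_delta(first, second) and see which interface moved.
```

If the delta for the second interface stays at zero while the first one grows, the traffic is most likely leaving through the first interface, which would be consistent with the kernel picking the outgoing interface from the routing table rather than from the address being pinged.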

From aoga@Mail.Linux-Consulting.com Wed, 31 Jan 2001 18:54:05 -0800 (PST)
Date: Wed, 31 Jan 2001 18:54:05 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] RE: card reports no resources / RX buffers

hi arthur...

donno whom you're directing this to....but... maybe there are 3 possible solutions ??
- use "modules" instead of compiling the driver into the kernel
- use the patched eepro driver

x> On Thu, Jan 25, 2001 at 05:44:28AM +0000, Steve Leung wrote:
x> > I have this problem with Redhat 7. When I set up DNS settings
x> > (one primary and one secondary), I get error
x> > eth0: card reports no RX buffers
x> > eth0: card reports no resources
x>
x> It is likely to be a hardware bug, for which a workaround was recently
x> developed by Donald Becker and his colleagues.
x> It consists of inserting
x>     inl(ioaddr + SCBPointer);
x>     udelay(10);
x> before
x>     outb(RxAddrLoad, ioaddr + SCBCmd);
x> in speedo_resume().
x>
x> Best regards
x> Andrey V. Savochkin

c ya
alvin
http://www.linux-1U.net ... 1U Raid5 ... 500Gb each ..

On Wed, 31 Jan 2001, Arthur M. Kang wrote:
> I've found your messages regarding the no resources for nic. What I
> couldn't find was how you fixed your problem.
>
> Do you think you could share with me what the fix/patch/workaround was?
>
> Any help is appreciated.
>
> Arthur
>
>
> _______________________________________________
> eepro100 mailing list
> eepro100@scyld.com
> http://www.scyld.com/mailman/listinfo/eepro100
>

From gdavide@mclink.it Sat, 03 Feb 2001 14:50:40 +0100
Date: Sat, 03 Feb 2001 14:50:40 +0100
From: Davide Giunchi gdavide@mclink.it
Subject: [eepro100] again on disconnection

Hi.
i posted a message two weeks ago about some disconnections on hp servers with eepro100. it seems that i still have the problem even after downloading the latest driver and using it as a module. i've used mii-diag to check the status of the ethernet: it works at 10BaseT like my net, but sometimes i get disconnections on some clients. I have 6 servers like this, all in separate networks and places, and all with the same problem, some less but some more frequently... i have some questions for you:

1) how can i debug whether it's a driver problem?
2) there's 10BaseT-HD and 10BaseT-FD, what's the difference and which should i select?
3) speaking with an hp operator, he told me that the nic in these servers is an intel eepro100 modified by hp, but it should work fine with redhat (i use redhat but i know that it's the same for all distributions). could it be that i must use a different module? or is there a patch by hp? (i don't know... it could be that they are the same).

Thanks for your help.
Regards.

--
(o_  Davide Giunchi.
//\  Member of FoLUG (Forlí Linux User Group) - http://folug.linux.it
V_/_ GPG Key available on http://www.keyserver.net

From sgcarr@civeng.adelaide.edu.au Mon, 5 Feb 2001 08:51:40 +1030
Date: Mon, 5 Feb 2001 08:51:40 +1030
From: Stephen Carr sgcarr@civeng.adelaide.edu.au
Subject: [eepro100] epro100 problems

Dear members of the list.

For your information, I had what I suspect to be a problem performing backups over a network using the eepro100 driver. The driver used is the one in the 2.4.1 kernel, and I was using dump with the card set to 100 Mb/s full duplex.

The server would lock up hard and require a reset. The error message on the console was a scsi timeout on the disc. Furthermore, when this occurred you could not telnet etc. to the server. One characteristic seemed to be the volume being backed up. At the moment the clients are still using the eepro100 driver.

I switched to the latest Intel driver and had no problems backing up about 10GB.

Yours
Stephen Carr

Stephen Carr
Computing Officer
Department of Civil and Environmental Engineering
University of Adelaide
Adelaide, South Australia, Australia 5005
Phone +618 8303-4313
Fax +618 8303-4359
Email sgcarr@civeng.adelaide.edu.au

From aoga@Mail.Linux-Consulting.com Sun, 4 Feb 2001 21:59:31 -0800 (PST)
Date: Sun, 4 Feb 2001 21:59:31 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] asus cur_dls + onboard nic

hi ya

donno if the asus cur_dls is supposed to work with the eepro100 driver.... the driver is recognized during bootup but thats it... no routing and no ifconfig entries

asus cur_dls is supposedly a serverworks chipset but also does seem to have 82559 compatibility ??
- donno..am clueless
- due to time constraints...wound up using a 3c905C card in a pci slot to get the server up and running with rh-7.0 and scsi3 disks w/ the Asus cur_dls motherboard

have fun linuxing
alvin

From becker@scyld.com Mon, 5 Feb 2001 06:26:50 -0500 (EST)
Date: Mon, 5 Feb 2001 06:26:50 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] asus cur_dls + onboard nic

On Sun, 4 Feb 2001, Alvin Oga wrote:
> donno if the asus cur_dls supposed to work with the
> eepro100 driver.... the driver is recognized during bootup
> but thats it... no routing and no ifconfig entries

If the driver recognizes the card, it should work.
What is the detection message?

The IP address and route entries are configured by the distribution. They don't magically appear. If you installed your distribution incorrectly the interface will not be configured.

> asus cur_dls is supposedly a serverworks chipset but
> also does seem to have 82559 compatibility ??
> - donno..am clueless

It likely has an i82559 chip on the motherboard.

> - due to time constraints...wound up using a 3c905C card
> in a pci slot to get the server up and running with
> rh-7.0 and scsi3 disks w/ the Asus cur_dls motherboard

That implies that your interface configuration is correct.
What is the card detection message?

Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

From lmcbekh@lmc.ericsson.se Mon, 5 Feb 2001 10:59:39 -0500
Date: Mon, 5 Feb 2001 10:59:39 -0500
From: Behrang Khoshnood (LMC) lmcbekh@lmc.ericsson.se
Subject: [eepro100] eepro100:"card reports no resources"

Hi,

We get "card reports no resources" all the time, with the frequency apparently related to load (or rather, we see them more frequently during our peak times). The only kernel message being logged repeatedly is "card reports no resources". We are sure that there is enough memory and that the load is not too much for a 100baseT line.

We are using eepro100.c:v1.091 with 4 CPUs. The current value for RX_RING_SIZE is 32. I read in one of the emails that increasing this value might solve the problem. If yes, what value should I use?
Is it possible that using 4 CPUs has anything to do with this problem?

Thank you.
Behrang Khoshnood
Software Engineer
Ericsson, Canada

From louism@infosat.net Mon, 5 Feb 2001 18:06:54 +0200 (SAST)
Date: Mon, 5 Feb 2001 18:06:54 +0200 (SAST)
From: Louis Mandelstam louism@infosat.net
Subject: [eepro100] eepro100:"card reports no resources"

On Mon, 5 Feb 2001, Behrang Khoshnood (LMC) wrote:
> Hi,
>
> We get "card reports no resources" all the time, with the frequency
> apparently related to load (or rather, we see them more frequently
> during our peak times). The only kernel message being logged repeatedly is
> "card reports no resources". We are sure that there is enough memory and
> the load is not too much for 100baseT line.

Wow, that description sounds EXTREMELY familiar. :)

> we are using eepro100.c:v1.091 with 4 cpu. The current value for
> RX_RING_SIZE is 32. I read in one of the emails that increasing this value
> might solve the problem. If yes, what value should i use?
> Is it possible that using 4 cpu has anything to do with this problem ?

I eventually got rid of the problem by upgrading to v1.21 of the eepro100 driver.

-------------------------------------------------------------------------
Louis Mandelstam
Technical Manager, Infoline (Pty) Ltd

From stahlbock@basysprint.de Mon, 05 Feb 2001 18:27:34 +0100
Date: Mon, 05 Feb 2001 18:27:34 +0100
From: Bernd Stahlbock stahlbock@basysprint.de
Subject: [eepro100] Latest Driver

It's strange, I read about newer versions of the eepro driver several times, but when I go to www.scyld.com I can only find the page "ftp://ftp.scyld.com/pub/network/eepro100.c", which contains Version 1.11.

What is the latest Version of Donald's driver, and where can I get it?

best regards
Bernd Stahlbock
--
stahlbock@basysprint.de, http://www.basysprint.de
basysPrint GmbH, Guelzer Str.
15, 19258 Boizenburg, Germany
Tel.: ++49-38847-99-150, Fax: ++49-38847-99-192

From johnchabalko@hotmail.com Mon, 05 Feb 2001 11:43:41 -0800
Date: Mon, 05 Feb 2001 11:43:41 -0800
From: John Chabalko johnchabalko@hotmail.com
Subject: [eepro100] EEpro Compile time options?

hi - i'm running v1.11 of the eepro100 driver and i had a couple of questions...
 
first off - i have several dual-nic machines and i'd like to be able to force them all - both nics - to 100 Meg Full-Duplex, problem is that i've reached the limit on the number of characters i can pass through LILO, and i've only got the first interface set. could someone let me know how i can compile these options into the driver? the kernel i'm using doesn't accept modules right now and i'd sort of like to keep it that way. i've looked around in the source and i seem to be missing something...
 
also - i saw this question asked a week or so ago but i never saw the answer... with this driver the machine identifies 8 different eth interfaces ( 0 - 7 ) when really there are only two, is there a way i can correct this? it's not causing problems - i'm just interested in what the root of the problem is - or if it's even a problem.
 
thanks very much
-john


From arielexc@yahoo.com Mon, 5 Feb 2001 15:03:31 -0800 (PST)
Date: Mon, 5 Feb 2001 15:03:31 -0800 (PST)
From: Ariel Cohen arielexc@yahoo.com
Subject: [eepro100] Integrated Pro/100 on Intel D815EEA

Hi,

The eepro100 driver in 2.4 kernels seems to have problems with the integrated Pro/100 on Intel's D815EEA motherboard (based on the 815E chipset).

One issue is that the driver detects two NICs: eth0 which is detected as an "Intel Corporation 82820 820 (Camino 2) Chipset Ethernet", and eth1 which is detected as "Intel Corporation 82557 [Ethernet Pro 100]". Only eth1 works. This happens regardless of whether the driver is compiled into the kernel or as a module. This detection problem didn't exist with the 2.2 kernels that I tried (2.2.16 and 2.2.18).

Another issue is that the driver in kernel 2.4.1 complains about "eth1: card reports no resources" once in a while, and the network connectivity becomes flaky. I didn't encounter this problem with the driver in kernel 2.4.0. Again, this is with the D815EEA integrated Pro/100. I know the driver in 2.4.1 is patched to solve the "resources" problem, but this patch seems to actually produce this problem with the D815EEA, while the problem doesn't appear with the 2.4.0 driver!

Any ideas how to fix these issues?

Thanks,

Ariel Cohen
arielexc@yahoo.com

From arielexc@yahoo.com Mon, 5 Feb 2001 15:49:14 -0800 (PST)
Date: Mon, 5 Feb 2001 15:49:14 -0800 (PST)
From: Ariel Cohen arielexc@yahoo.com
Subject: [eepro100] Re: Integrated Pro/100 on Intel D815EEA

Hi,

Oops, it looks like there actually were two NICs in the box. One was the integrated Pro/100 and the other was a Pro/100 PCI card. I guess the 2.2.x kernels didn't detect the integrated NIC which is why I didn't notice this issue before I started using 2.4.x.

The question about "card reports no resources" still remains.
I was using the PCI card, not the integrated NIC. My feeling is that there was some conflict between the two NICs (both shared IRQ 11, by the way). I just tried disabling the integrated NIC, and I'll see what happens...

Ariel

From aoga@Mail.Linux-Consulting.com Mon, 5 Feb 2001 18:02:25 -0800 (PST)
Date: Mon, 5 Feb 2001 18:02:25 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] asus cur_dls + onboard nic

hi donald..

thanx for your reply....

unfortunately... i do NOT have the info you want.. "the detection message" from the bootup sequence ( ?? )

- it all looked normal for the bootup recognizing the onboard nic.... ( only odd thing i noticed was "IRQ=0" )
  eepro100 and version and other info is listed... about 10 lines or so of it..

- after booting... it was not properly "configured" ... ifconfig -v does NOT show eth0 as an interface
  ==
  == the big issue/question ....
  ==
  so route -nv is obviously empty too

- configured many onboard nics...esp D815EEAAL and CA810EAL redhat, suse, slackware, etc...

- sometimes... i noticed that even if the ifconfig and routing table is correct... that the nic driver does NOT always work... that an older or newer nic driver would work ...

- so tried it with a known good eepro100.c that i patched to work with D815EEAAL and still not working with the asus mb ... oh well...

- since they wanted it working "today"... i tried a simple test with the 3c905C and it worked...

- since i dont have the cur_dls motherboard in stock... i cant provide any more debugging info... and am low on cash for "experimenting"

- but i suppose we could go to Frys and buy it and return it in 21 days or something...
  ( bad idea )

thanx
alvin

On Mon, 5 Feb 2001, Donald Becker wrote:
> On Sun, 4 Feb 2001, Alvin Oga wrote:
>
> > donno if the asus cur_dls supposed to work with the
> > eepro100 driver.... the driver is recognized during bootup
> > but thats it... no routing and no ifconfig entries
>
> If the driver recognizes the card, it should work.
>
> What is the detection message?
> The IP address and route entries are configured by the distribution.
>
> They don't magically appear. If you installed your distribution
> incorrectly the interface will not be configured.
>
> > asus cur_dls is supposedly a serverworks chipset but
> > also does seem to have 82559 compatibility ??
> > - donno..am clueless
>
> It likely has an i82559 chip on the motherboard.
>
> > - due to time constraints...wound up using a 3c905C card
> > in a pci slot to get the server up and running with
> > rh-7.0 and scsi3 disks w/ the Asus cur_dls motherboard
>
> That implies that your interface configuration is correct.
> What is the card detection message?
>
> Donald Becker                   becker@scyld.com
> Scyld Computing Corporation     http://www.scyld.com
> 410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
> Annapolis MD 21403              410-990-9993
>

From aoga@Mail.Linux-Consulting.com Mon, 5 Feb 2001 18:10:33 -0800 (PST)
Date: Mon, 5 Feb 2001 18:10:33 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] Integrated Pro/100 on Intel D815EEA

hi ariel

seems odd.... i've used many 2.4.0-testxx kernels on D815EEAAL and redhat-7.0 and have never seen more than one interface

eth0 only as expected... and lo ...

and similarly for 2.2.16 and 2.2.18.. with patches as needed for eepro100.c

only bad thing about 2.4.0 kernels on D815EEAAL is no X11 on it... agpgart is broken....but after a day or two of tweaking...got XFree-4.x working.. i think..

c ya
alvin
http://www.Linux-1U.net ... 1U Raid5 ... 500Gb each ...

On Mon, 5 Feb 2001, Ariel Cohen wrote:
> Hi,
>
> The eepro100 driver in 2.4 kernels seems to have
> problems with the integrated Pro/100 on Intel's
> D815EEA motherboard (based on the 815E chipset).
>
> One issue is that the driver detects two NICs: eth0
> which is detected as an "Intel Corporation 82820 820
> (Camino 2) Chipset Ethernet", and eth1 which is
> detected as "Intel Corporation 82557 [Ethernet Pro
> 100]". Only eth1 works. This happens regardless of
> whether the driver is compiled into the kernel or as a
> module. This detection problem didn't exist with the
> 2.2 kernels that I tried (2.2.16 and 2.2.18).
>
> Another issue is that the driver in kernel 2.4.1
> complains about "eth1: card reports no resources" once
> in a while, and the network connectivity becomes
> flaky. I didn't encounter this problem with the driver
> in kernel 2.4.0. Again, this is with the D815EEA
> integrated Pro/100. I know the driver in 2.4.1 is
> patched to solve the "resources" problem, but this
> patch seems to actually produce this problem with the
> D815EEA, while the problem doesn't appear with the
> 2.4.0 driver!
>
> Any ideas how to fix these issues?
>
> Thanks,
>
> Ariel Cohen
> arielexc@yahoo.com
>

From saw@saw.sw.com.sg Tue, 6 Feb 2001 11:50:45 +0800
Date: Tue, 6 Feb 2001 11:50:45 +0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: Integrated Pro/100 on Intel D815EEA

Hello,

On Mon, Feb 05, 2001 at 03:03:31PM -0800, Ariel Cohen wrote:
[snip]
> Another issue is that the driver in kernel 2.4.1
> complains about "eth1: card reports no resources" once
> in a while, and the network connectivity becomes
> flaky.
> I didn't encounter this problem with the driver
> in kernel 2.4.0. Again, this is with the D815EEA
> integrated Pro/100. I know the driver in 2.4.1 is
> patched to solve the "resources" problem, but this
> patch seems to actually produce this problem with the
> D815EEA, while the problem doesn't appear with the
> 2.4.0 driver!

The "card reports no resources" message means that the card indicates a shortage of receive buffers. The problem that was fixed is a hardware bug that shows up at initialization time and produces the same message as a consequence of the failed initialization.

In your case, the message seems to have its natural original meaning. The shortage of receive buffers may be caused by
- very bursty network traffic (because of the role of the computer or the character of your network, or, maybe, faulty switches)
- hardware interrupts being disabled for too long periods of time
  Check what other hardware and its drivers may cause the interrupts to be disabled for a long time (IDE, video devices/X, USB etc)

You may confirm whether the shortage of receive buffers is the reason for the problem by increasing RX_RING_SIZE in the driver, which should decrease the frequency of the messages.

Best regards
Andrey

From becker@scyld.com Mon, 5 Feb 2001 22:49:46 -0500 (EST)
Date: Mon, 5 Feb 2001 22:49:46 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] asus cur_dls + onboard nic

On Mon, 5 Feb 2001, Alvin Oga wrote:

> unfortunately... i do NOT have the info you want.. "the detection
> message" from the bootup sequence ( ?? )
>
> - it all looked normal for the bootup recognizing the onboard
> nic.... ( only odd thing i noticed was "IRQ=0" )

Ahhh, read
http://www.scyld.com/expert/irq-conflict.html

The 3c905C worked because it had an on-board boot ROM that activated the board before boot.

Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave.
Suite 210                       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

From arielexc@yahoo.com Mon, 5 Feb 2001 23:07:38 -0800 (PST)
Date: Mon, 5 Feb 2001 23:07:38 -0800 (PST)
From: Ariel Cohen arielexc@yahoo.com
Subject: [eepro100] Re: Integrated Pro/100 on Intel D815EEA

Hi,

Thanks to everyone who responded to my message. It looks like I was getting those "card reports no resources" messages as a result of some network problems (probably a malfunctioning or misconfigured device on the network generating some crazy traffic).

Ariel

From aoga@Mail.Linux-Consulting.com Tue, 6 Feb 2001 00:05:45 -0800 (PST)
Date: Tue, 6 Feb 2001 00:05:45 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] asus cur_dls + onboard nic

hi donald..

thanx for the info...

but the Asus Cur_DLS is the next generation asus motherboard.... newer than D815EEAAL generation motherboards...

sounds like asus messed up since the irq is wrong ???
* i didnt worry about the irq=0 part of the messages since nothing we can do about it usually...

and there were zero PCI cards used... so donno why it defaults to wacky irq=0....oh well...

just posted for those that might run into the asus cur_dls motherboard.... w/ scsi w/ rh-7.0

hope i get to work on another customers cur-dls board without time constraints to check into it more

thanx
alvin

On Mon, 5 Feb 2001, Donald Becker wrote:
> On Mon, 5 Feb 2001, Alvin Oga wrote:
>
> > unfortunately... i do NOT have the info you want.. "the detection
> > message" from the bootup sequence ( ?? )
> >
> > - it all looked normal for the bootup recognizing the onboard
> > nic.... ( only odd thing i noticed was "IRQ=0" )
>
> Ahhh, read
> http://www.scyld.com/expert/irq-conflict.html
>
> The 3c905C worked because it had an on-board boot ROM that activated the
> board before boot.
>
> Donald Becker                   becker@scyld.com
> Scyld Computing Corporation     http://www.scyld.com
> 410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
> Annapolis MD 21403              410-990-9993
>

From becker@scyld.com Tue, 6 Feb 2001 11:24:42 -0500 (EST)
Date: Tue, 6 Feb 2001 11:24:42 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] Latest Driver

On Mon, 5 Feb 2001, Bernd Stahlbock wrote:

> It's strange, I read about newer versions of the eepro driver several
> times, but when I go to www.scyld.com I can only find page
> "ftp://ftp.scyld.com/pub/network/eepro100.c", which contains Version
> 1.11.
>
> What is the latest Version of Donald's driver, and where can I get it?

The latest public release was in the test/ directory:
ftp://www.scyld.com/pub/network/test/eepro100.c

I've updated the version in the regular directory to the current version:
ftp://www.scyld.com/pub/network/test/eepro100.c
to
"eepro100.c:v1.13 1/9/2001 Donald Becker \n";

Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

From christoph.plattner@alcatel.at Wed, 07 Feb 2001 15:04:51 +0100
Date: Wed, 07 Feb 2001 15:04:51 +0100
From: Christoph Plattner christoph.plattner@alcatel.at
Subject: [eepro100] Developer software manual for i82559er (or family)

Hello EEPRO100 hackers.

For porting/developing a driver for an 82559er ethernet chip (eepro100) I am searching for a manual describing the handling of the chip. I want to have this in addition to the 2 source codes handling the chip, one from Donald Becker, the other from Intel themselves.

I only have the data sheets, which do not describe the functional use of the chip!

Can anybody help me on this point? At Intel I was not able to get such documentation....

With friendly regards
Christoph P.

-----------------------------------------------------------------
private: christoph.plattner@dot.at
company: christoph.plattner@alcatel.at

From becker@scyld.com Tue, 6 Feb 2001 14:32:19 -0500 (EST)
Date: Tue, 6 Feb 2001 14:32:19 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] asus cur_dls + onboard nic

On Tue, 6 Feb 2001, Alvin Oga wrote:

> but the Asus Cur_DLS is the next generation asus motherboard....
> newer than D815EEAAL generation motherboards...
>
> sounds like asus messed up since the irq is wrong ???
> * i didnt worry about the irq=0 part of the messages
> since nothing we can do about it usually...

Check your BIOS setup for the OS in use.

> and there were zero PCI cards used...

The chip is a PCI device, it doesn't matter if it's a slot or directly on the motherboard.

Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave. Suite 210       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

From becker@scyld.com Thu, 8 Feb 2001 01:22:49 -0500 (EST)
Date: Thu, 8 Feb 2001 01:22:49 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] Developer software manual for i82559er (or family)

On Wed, 7 Feb 2001, Christoph Plattner wrote:

> For a porting/driver developing of a 82559er ethernet chip
> (eepro100) I am searching for a manual describing the handling
> of the chip. I want to have this in addition to the 2 source codes
> handling the chip, one from Donald Becker, the other from
> Intel themselves.
>
> I only have the data sheets not describing the functional use
> of the chip !
>
> Can anybody help me in this point. At intel I was not able to
> get such documentation....

The Intel manuals are available only from Intel, and usually only under an NDA. I have a copy of the i82557 and i82558 manuals, but can't get a copy of the '559 manual.

Donald Becker                   becker@scyld.com
Scyld Computing Corporation     http://www.scyld.com
410 Severn Ave.
Suite 210                       Second Generation Beowulf Clusters
Annapolis MD 21403              410-990-9993

From Antwerpen@netsquare.org Thu, 8 Feb 2001 09:02:15 +0100
Date: Thu, 8 Feb 2001 09:02:15 +0100
From: Antwerpen, Oliver Antwerpen@netsquare.org
Subject: [eepro100] e100 vs. eepro100

Hi out there,

I have a linux box here running 2.2.16 and got these strange timeouts just like others on the list, too. I have already installed the freshest modules, but that didn't help much. I get timeouts between 3 and 20 times a day, always on eth2. About once a week the box stops working altogether.

I am now about to try the drivers supplied on Intel's website. Does anyone have experience with these?

Olli

From christoph.plattner@alcatel.at Thu, 08 Feb 2001 10:35:03 +0100
Date: Thu, 08 Feb 2001 10:35:03 +0100
From: Christoph Plattner christoph.plattner@alcatel.at
Subject: [eepro100] e100 vs. eepro100

I am also interested in this question. I want to test both, because for my porting/development job I want to know what to take as a base, or how to mix, etc....

Cheers
Christoph

"Antwerpen, Oliver" wrote:
>
> Hi out there,
>
> I have a linux box here running 2.2.16 and got these strange timeouts just
> like others on the list, too. I have already installed the freshest modules,
> but that didn't help much. I get timeouts between 3 and 20 times a day,
> always on eth2. About once a week the box stops working at all.
>
> I am now about to try the drivers supplied on intel's website. Does anyone
> have experience with these?
>
> Olli
>

-----------------------------------------------------------------
private: christoph.plattner@dot.at
company: christoph.plattner@alcatel.at

From d.mueller@elsoft.ch Thu, 08 Feb 2001 10:50:21 +0100
Date: Thu, 08 Feb 2001 10:50:21 +0100
From: David Müller (ELSOFT AG) d.mueller@elsoft.ch
Subject: [eepro100] Buggy 82559 chips

Hello

I remember a discussion on this list a long time ago where the topic was a bug in the silicon and its impact on the eepro100 driver.

Does anyone have any information on what exactly the problem was and how to identify the affected chips (date code, revision, test software, ...)?

TIA
Dave

From aoga@Mail.Linux-Consulting.com Thu, 8 Feb 2001 02:13:27 -0800 (PST)
Date: Thu, 8 Feb 2001 02:13:27 -0800 (PST)
From: Alvin Oga aoga@Mail.Linux-Consulting.com
Subject: [eepro100] e100 vs. eepro100

hi ya...

to test e100 and eepro100.... if it is dual NIC system... ( i'd use modules )

/etc/modules.conf
    alias eth0 e100
    alias eth1 eepro100

and send the tests to both eth0 and eth1 and see the performance

c ya
alvin

On Thu, 8 Feb 2001, Christoph Plattner wrote:
> I am also interested in this question. I want to test both,
> because for my porting/development job I want to know,
> what to take as base, or how to mix, etc....
>
> Cheers
> Christoph
>
> "Antwerpen, Oliver" wrote:
> >
> > Hi out there,
> >
> > I have a linux box here running 2.2.16 and got these strange timeouts just
> > like others on the list, too. I have already installed the freshest modules,
> > but that didn't help much. I get timeouts between 3 and 20 times a day,
> > always on eth2. About once a week the box stops working at all.
> >
> > I am now about to try the drivers supplied on intel's website. Does anyone
> > have experience with these?

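When two drivers are run side by side like this, it may help to count the transmit timeouts each interface logs per day instead of eyeballing the log. The following is a minimal sketch under stated assumptions: it expects syslog lines shaped like `Feb 8 15:25:28 hunter kernel: eth2: Transmit timed out: ...`, which is the form the eepro100 timeout messages take in this thread; other drivers may word the message differently. The name count_timeouts and the regular expression are illustrative, not part of any existing tool.

```python
import re
from collections import Counter

# Assumed line shape (taken from the eepro100 messages in this thread):
#   "Feb  8 15:25:28 hunter kernel: eth2: Transmit timed out: status ..."
TIMEOUT_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:\s+"
    r"(?P<iface>eth\d+): Transmit timed out"
)


def count_timeouts(lines):
    """Count timeout messages, keyed by (month, day, interface)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            key = (match.group("month"),
                   int(match.group("day")),
                   match.group("iface"))
            counts[key] += 1
    return counts


# Small demonstration on embedded sample lines (hypothetical data
# modelled on the log format above).
SAMPLE = [
    "Feb  8 15:25:28 hunter kernel: eth2: Transmit timed out: status 0050 0080",
    "Feb  8 15:25:28 hunter kernel: eth2: Tx ring dump, Tx queue 328089 / 328083:",
    "Feb  8 17:02:11 hunter kernel: eth2: Transmit timed out: status 0050 0080",
    "Feb  9 03:14:00 hunter kernel: eth0: Transmit timed out: status 0050 0080",
]

for (month, day, iface), total in sorted(count_timeouts(SAMPLE).items()):
    print(month, day, iface, total)
```

Feeding it the lines of the system log (for example by iterating over an open file) gives one count per day and interface, which makes it easier to see whether a driver change actually reduced the timeouts.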
From becker@scyld.com Thu, 8 Feb 2001 08:22:16 -0500 (EST)
Date: Thu, 8 Feb 2001 08:22:16 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] e100 vs. eepro100

On Thu, 8 Feb 2001, Antwerpen, Oliver wrote:

> I have a linux box here running 2.2.16 and got these strange timeouts just
> like others on the list, too. I have already installed the freshest modules,
> but that didn't help much. I get timeouts between 3 and 20 times a day,
> always on eth2. About once a week the box stops working at all.

Please be more specific: what version are you using? The phrase
"freshest modules" isn't very useful.

What is the message in the timeout?

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993

From Antwerpen@netsquare.org Thu, 8 Feb 2001 16:48:32 +0100
Date: Thu, 8 Feb 2001 16:48:32 +0100
From: Antwerpen, Oliver Antwerpen@netsquare.org
Subject: [eepro100] e100 vs. eepro100

Hi Donald,

some information for you...

> From: Donald Becker [mailto:becker@scyld.com]
>
> On Thu, 8 Feb 2001, Antwerpen, Oliver wrote:
>
> > I have a linux box here running 2.2.16 and got these strange timeouts just
> > like others on the list, too. I have already installed the freshest modules,
> > but that didn't help much. I get timeouts between 3 and 20 times a day,
> > always on eth2. About once a week the box stops working at all.
>
> Please be more specific: what version are you using? The phrase
> "freshest modules" isn't very useful.

Feb 6 06:30:21 hunter kernel: eepro100.c:v1.13 1/9/2001 Donald Becker
Feb 6 06:30:21 hunter kernel: http://www.scyld.com/network/eepro100.html
Feb 6 06:30:21 hunter kernel: eth0: Intel PCI EtherExpress Pro100 at 0xe0027000, 00:02:B3:23:04:40, IRQ 15.
Feb 6 06:30:21 hunter kernel: Receiver lock-up bug exists -- enabling work-around.
Feb 6 06:30:21 hunter kernel: Board assembly 721383-016, Physical connectors present: RJ45 Feb 6 06:30:21 hunter kernel: Primary interface chip i82555 PHY #1. Feb 6 06:30:21 hunter kernel: General self-test: passed. Feb 6 06:30:21 hunter kernel: Serial sub-system self-test: passed. Feb 6 06:30:21 hunter kernel: Internal registers self-test: passed. Feb 6 06:30:21 hunter kernel: ROM checksum self-test: passed (0x04f4518b). Feb 6 06:30:21 hunter kernel: eth1: Intel PCI EtherExpress Pro100 at 0xe0029000, 00:02:B3:23:8F:CB, IRQ 11. Feb 6 06:30:21 hunter kernel: Receiver lock-up bug exists -- enabling work-around. Feb 6 06:30:21 hunter kernel: Board assembly 721383-016, Physical connectors present: RJ45 Feb 6 06:30:21 hunter kernel: Primary interface chip i82555 PHY #1. Feb 6 06:30:21 hunter kernel: General self-test: passed. Feb 6 06:30:21 hunter kernel: Serial sub-system self-test: passed. Feb 6 06:30:21 hunter kernel: Internal registers self-test: passed. Feb 6 06:30:21 hunter kernel: ROM checksum self-test: passed (0x04f4518b). Feb 6 06:30:21 hunter kernel: eth2: Intel PCI EtherExpress Pro100 at 0xe002b000, 00:02:B3:2A:04:E1, IRQ 10. Feb 6 06:30:21 hunter kernel: Receiver lock-up bug exists -- enabling work-around. Feb 6 06:30:21 hunter kernel: Board assembly 721383-016, Physical connectors present: RJ45 Feb 6 06:30:21 hunter kernel: Primary interface chip i82555 PHY #1. Feb 6 06:30:21 hunter kernel: General self-test: passed. Feb 6 06:30:21 hunter kernel: Serial sub-system self-test: passed. Feb 6 06:30:21 hunter kernel: Internal registers self-test: passed. Feb 6 06:30:21 hunter kernel: ROM checksum self-test: passed (0x04f4518b). > What is the message in the timeout? Feb 8 15:25:28 hunter kernel: eth2: Transmit timed out: status 0050 0080 at 328083/328089 commands 000c0000 000c0000 000c0000. Feb 8 15:25:28 hunter kernel: eth2: Tx ring dump, Tx queue 328089 / 328083: Feb 8 15:25:28 hunter kernel: eth2: 0 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 1 000ca000. 
Feb 8 15:25:28 hunter kernel: eth2: 2 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 3 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 4 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 5 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 6 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 7 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 8 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 9 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 10 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 11 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 12 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 13 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 14 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 15 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 16 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 17 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 18 000ca000. Feb 8 15:25:28 hunter kernel: eth2: * 19 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 20 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 21 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 22 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 23 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 24 400c0000. Feb 8 15:25:28 hunter kernel: eth2: =25 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 26 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 27 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 28 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 29 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 30 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 31 000ca000. Feb 8 15:25:28 hunter kernel: eth2:Printing Rx ring (next to receive into 888009). Feb 8 15:25:28 hunter kernel: Rx ring entry 0 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 1 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 2 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 3 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 4 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 5 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 6 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 7 00000001. 
Feb 8 15:25:28 hunter kernel: Rx ring entry 8 c0000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 9 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 10 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 11 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 12 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 13 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 14 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 15 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 16 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 17 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 18 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 19 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 20 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 21 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 22 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 23 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 24 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 25 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 26 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 27 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 28 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 29 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 30 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 31 00000001. Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000. Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 7829. Feb 8 15:25:28 hunter kernel: PHY index 1 register 2 is 02a8. Feb 8 15:25:28 hunter kernel: PHY index 1 register 3 is 0154. Feb 8 15:25:28 hunter kernel: PHY index 1 register 4 is 05e1. Feb 8 15:25:28 hunter kernel: PHY index 1 register 5 is 45e1. Feb 8 15:25:28 hunter kernel: PHY index 1 register 21 is 0000. Feb 8 15:25:28 hunter kernel: eth2: Tx ring dump, Tx queue 328089 / 328083: Feb 8 15:25:28 hunter kernel: eth2: 0 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 1 000ca000. 
Feb 8 15:25:28 hunter kernel: eth2: 2 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 3 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 4 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 5 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 6 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 7 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 8 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 9 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 10 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 11 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 12 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 13 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 14 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 15 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 16 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 17 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 18 000ca000. Feb 8 15:25:28 hunter kernel: eth2: * 19 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 20 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 21 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 22 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 23 000c0000. Feb 8 15:25:28 hunter kernel: eth2: 24 400c0000. Feb 8 15:25:28 hunter kernel: eth2: =25 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 26 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 27 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 28 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 29 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 30 000ca000. Feb 8 15:25:28 hunter kernel: eth2: 31 000ca000. Feb 8 15:25:28 hunter kernel: eth2:Printing Rx ring (next to receive into 888009). Feb 8 15:25:28 hunter kernel: Rx ring entry 0 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 1 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 2 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 3 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 4 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 5 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 6 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 7 00000001. 
Feb 8 15:25:28 hunter kernel: Rx ring entry 8 c0000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 9 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 10 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 11 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 12 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 13 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 14 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 15 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 16 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 17 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 18 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 19 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 20 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 21 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 22 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 23 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 24 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 25 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 26 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 27 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 28 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 29 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 30 00000001. Feb 8 15:25:28 hunter kernel: Rx ring entry 31 00000001. Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000. Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 782d. Feb 8 15:25:28 hunter kernel: PHY index 1 register 2 is 02a8. Feb 8 15:25:28 hunter kernel: PHY index 1 register 3 is 0154. Feb 8 15:25:28 hunter kernel: PHY index 1 register 4 is 05e1. Feb 8 15:25:28 hunter kernel: PHY index 1 register 5 is 45e1. Feb 8 15:25:28 hunter kernel: PHY index 1 register 21 is 0000. From becker@scyld.com Thu, 8 Feb 2001 11:25:06 -0500 (EST) Date: Thu, 8 Feb 2001 11:25:06 -0500 (EST) From: Donald Becker becker@scyld.com Subject: [eepro100] e100 vs. 
eepro100

On Thu, 8 Feb 2001, Antwerpen, Oliver wrote:

> some information for you...
..
> Feb 6 06:30:21 hunter kernel: eepro100.c:v1.13 1/9/2001 Donald Becker
>

Good.

...
> Feb 8 15:25:28 hunter kernel: eth2: Transmit timed out: status 0050 0080
> at 328083/328089 commands 000c0000 000c0000 000c0000.
...
> Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000.
> Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 7829.

This one _looks_ easy -- you lost link beat at some point. That's not
necessarily the problem, but might be.

> Feb 8 15:25:28 hunter kernel: PHY index 1 register 5 is 45e1.

Your link partner is advertising flow control. It's possible that the
i82559 chip has paused transmission due to flow control. I should print
the current flow control timeout as well. Was the network exceptionally
busy at this time?

> Feb 8 15:25:28 hunter kernel: eth2: Tx ring dump, Tx queue 328089 /
> 328083:
> Feb 8 15:25:28 hunter kernel: eth2: 18 000ca000.
> Feb 8 15:25:28 hunter kernel: eth2: * 19 000c0000.
> Feb 8 15:25:28 hunter kernel: eth2: 20 000c0000.
> Feb 8 15:25:28 hunter kernel: eth2: 21 000c0000.
> Feb 8 15:25:28 hunter kernel: eth2: 22 000c0000.
> Feb 8 15:25:28 hunter kernel: eth2: 23 000c0000.
> Feb 8 15:25:28 hunter kernel: eth2: 24 400c0000.
> Feb 8 15:25:28 hunter kernel: eth2: =25 000ca000.
> Feb 8 15:25:28 hunter kernel: eth2: 28 000ca000.

Did the interface resume correct operation at this point?

> Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000.
> Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 782d.

Link beat has returned.

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993

From dhobbit@midearth.net Thu, 8 Feb 2001 13:34:38 -0500
Date: Thu, 8 Feb 2001 13:34:38 -0500
From: Derek Harkness dhobbit@midearth.net
Subject: [eepro100] Server Problem

Here's the setup: a Compaq DL380 with three i82555 chips.
The first is on the mainboard and works great; the other two are on a Compaq NC3134 controller and worked just fine under kernel 2.2.17. I upgraded to kernel 2.4.x and they stopped receiving data.

Error generated:
eth1: Transmit timeouted out: status e050 0x00 at 0/28 command 0001a000

Any ideas? Thanks!

Derek

From Antwerpen@netsquare.org Fri, 9 Feb 2001 08:17:13 +0100
Date: Fri, 9 Feb 2001 08:17:13 +0100
From: Antwerpen, Oliver Antwerpen@netsquare.org
Subject: [eepro100] e100 vs. eepro100

Hi Donald,

> From: Donald Becker [mailto:becker@scyld.com]
>
> On Thu, 8 Feb 2001, Antwerpen, Oliver wrote:
>
> > some information for you...
> ..
> > Feb 6 06:30:21 hunter kernel: eepro100.c:v1.13 1/9/2001 Donald Becker
> >
>
> Good.

*phew*

> ...
> > Feb 8 15:25:28 hunter kernel: eth2: Transmit timed out: status 0050 0080
> > at 328083/328089 commands 000c0000 000c0000 000c0000.
> ...
> > Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000.
> > Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 7829.
>
> This one _looks_ easy -- you lost link beat at some point. That's not
> necessarily the problem, but might be.

Yes, right. Now I remember the numbers... *g*

> > Feb 8 15:25:28 hunter kernel: PHY index 1 register 5 is 45e1.
>
> Your link partner is advertising flow control. It's possible that the
> i82559 chip has paused transmission due to flow control. I should print
> the current flow control timeout as well. Was the network exceptionally
> busy at this time?

No, it also happens at night, when there's almost no traffic. As far as I can tell, there is no relationship between traffic and the problems.

> > Feb 8 15:25:28 hunter kernel: eth2: Tx ring dump, Tx queue 328089 /
> > 328083:
> > Feb 8 15:25:28 hunter kernel: eth2: 18 000ca000.
> > Feb 8 15:25:28 hunter kernel: eth2: * 19 000c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: 20 000c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: 21 000c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: 22 000c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: 23 000c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: 24 400c0000.
> > Feb 8 15:25:28 hunter kernel: eth2: =25 000ca000.
> > Feb 8 15:25:28 hunter kernel: eth2: 28 000ca000.
>
> Did the interface resume correct operation at this point?

Yes, in this case it did.

> > Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000.
> > Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 782d.
>
> Link beat has returned.

OK. So does this mean that I should not use autonegotiation, but pass options=... to the module instead? Is it certain that this will help? Otherwise I will try out the Intel e100 module. My customer isn't really happy right now...

Thanks so far!

Olli

From becker@scyld.com Fri, 9 Feb 2001 12:04:47 -0500 (EST)
Date: Fri, 9 Feb 2001 12:04:47 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] e100 vs. eepro100

On Fri, 9 Feb 2001, Antwerpen, Oliver wrote:

> > > Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 7829.
> >
> > This one _looks_ easy -- you lost link beat at some point. That's not
> > necessarily the problem, but might be.
>
> Yes, right. Now I remember the numbers... *g*

That "7829" status isn't perfectly obvious? I was *certain* that they
covered that in kindergarten! ;->

Understanding MII Transceiver Status Info
http://www.scyld.com/diag/mii-status.html

> > > Feb 8 15:25:28 hunter kernel: PHY index 1 register 5 is 45e1.
> >
> > Your link partner is advertising flow control. It's possible that the
> > i82559 chip has paused transmission due to flow control. I should print
> > the current flow control timeout as well. Was the network exceptionally
> > busy at this time?
> No, it also happens at night, when there's almost no traffic. For me
> there seems to be no relationship between traffic and problems.

> > Did the interface resume correct operation at this point?
>
> Yes, in this case it did.

Are there cases where the driver doesn't recover?
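
For anyone who skipped kindergarten: the two register 1 values in this thread, 7829 and 782d, differ in a single bit. Here is a minimal sketch of the decoding, assuming the standard IEEE 802.3 MII bit layout for the basic mode status register (register 1) and the link partner ability register (register 5). The table and function names are made up for illustration; they are not taken from eepro100.c or from the diagnostic tools.

```python
# Bit meanings follow the standard MII register layout (IEEE 802.3).
# BMSR_BITS, ABILITY_BITS, decode() and has_link() are illustrative names,
# not identifiers from the eepro100 driver.
BMSR_BITS = {
    0x4000: "100baseTx-FD capable",
    0x2000: "100baseTx capable",
    0x1000: "10baseT-FD capable",
    0x0800: "10baseT capable",
    0x0020: "autonegotiation complete",
    0x0008: "autonegotiation capable",
    0x0004: "link beat present",
    0x0001: "extended register set",
}

ABILITY_BITS = {
    0x4000: "acknowledge",
    0x0400: "flow control (pause)",
    0x0100: "100baseTx-FD",
    0x0080: "100baseTx",
    0x0040: "10baseT-FD",
    0x0020: "10baseT",
}


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table.items() if value & mask]


def has_link(bmsr):
    """True if the link status bit (0x0004) of MII register 1 is set."""
    return bool(bmsr & 0x0004)


if __name__ == "__main__":
    for reg1 in (0x7829, 0x782D):
        state = "up" if has_link(reg1) else "DOWN"
        print(f"register 1 = {reg1:04x}: link {state}")
    print("register 5 = 45e1:", ", ".join(decode(0x45E1, ABILITY_BITS)))
```

The link bit in register 1 is latched, so a 7829 reading says the link dropped at some point since the register was last read, not necessarily that it is down right now. That matches the second dump showing 782d.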
> > > Feb 8 15:25:28 hunter kernel: PHY index 1 register 0 is 3000.
> > > Feb 8 15:25:28 hunter kernel: PHY index 1 register 1 is 782d.
> >
> > Link beat has returned.
>
> OK. So this means to me, that I should not use auto negotiation, but pass
> options=... to the module? Is it sure, that this will help? Otherways I will
> try out the intel e100 module. My customer isn't really happy right now...

No, forcing the media type won't fix the problem. The chip stops
transmitting when it doesn't have link beat even with the forced media
type.

The driver is configured to take action quickly in an attempt to restore
operation, assuming that something has gone wrong. Look for v1.14 soon,
which will
   reset a transceiver left set to fixed 10baseT (Ref: Alpha SRM boot bug)
   emit a few more details if the driver resets the chip

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993

From arthur@cal040041.student.utwente.nl Sat, 10 Feb 2001 22:05:06 +0100 (CET)
Date: Sat, 10 Feb 2001 22:05:06 +0100 (CET)
From: Arthur Rinkel arthur@cal040041.student.utwente.nl
Subject: [eepro100] eepro100 performance

Hi,

I'm getting somewhat disappointing throughput with an eepro100 (i82557) on a Pentium 133 system and I'm hoping someone can help me out a bit. The actual throughput is about 2 MB/s peak; about 1.4 MB/s is normal.

I ran some tests with a program called ttcp and transmitted a large file over the loopback. Does this give a good indication of how fast the processor is able to transmit (or receive) packets? I'm not sure, but the throughput during this test was 4.7 MB/s.

A friend of mine suggested that the NIC might be sitting in a non-busmastering PCI slot. So I tried another slot, and the throughput (not with ttcp) increased to 2.2 MB/s peak. I only did one test with the NIC in the other PCI slot, so the increase may mean nothing. Anyway, there's not much difference...
Furthermore, I tried switching between half and full duplex, but that didn't change anything either. The switch this NIC is connected to supports both half and full duplex.

Here are some stats about the NIC collected over the past week:

          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:19542477 errors:3608 dropped:0 overruns:0 frame:88730
          TX packets:267371 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100

There are no errors or warnings from the kernel.

Can anybody tell whether there's really a problem here, or is 2 MB/s the best throughput I can get on a P133?

One other question. Besides the eepro100, this P133 system has a 3Com NIC which, of course, requires another driver. Both drivers are compiled into the (Linux) kernel, and the 3Com NIC requires some parameters to be detected by the kernel during booting; eepro is eth0, 3Com is eth1. But it seems those parameters don't reach the setup routine of the 3Com driver, since this driver doesn't get loaded at all. Is it possible that the parameters for the 3Com NIC get sent to the eepro100 driver, which fails, and the 3Com NIC gets ignored? If so, how does one send the 3Com parameters to its driver (without making the 3Com driver a module)?

Grtz,
Arthur

From becker@scyld.com Sun, 11 Feb 2001 16:35:34 -0500 (EST)
Date: Sun, 11 Feb 2001 16:35:34 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] eepro100 performance

On Sat, 10 Feb 2001, Arthur Rinkel wrote:

> I'm having somewhat disappointing throughput with an eepro100 (i82557) on
> a Pentium 133 system and I'm hoping someone can help me out a bit. The
...
> RX packets:19542477 errors:3608 dropped:0 overruns:0 frame:88730
> TX packets:267371 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:100

That's a horrible Rx error rate!

It appears that the local interface is set to full duplex mode, while
the remote end is set to half duplex.

Don't force the duplex on either side.
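
To put a number on that error rate, here is a rough sketch that pulls the counters out of the quoted ifconfig RX line and computes the fraction of frame errors. The regular expression and the helper name rx_counters are assumptions made for illustration; only the counter values come from the original post.

```python
import re


def rx_counters(line):
    """Parse 'name:value' pairs from one ifconfig RX/TX statistics line."""
    return {key: int(value) for key, value in re.findall(r"(\w+):(\d+)", line)}


# Counter line copied from the post above.
RX_LINE = "RX packets:19542477 errors:3608 dropped:0 overruns:0 frame:88730"

if __name__ == "__main__":
    rx = rx_counters(RX_LINE)
    ratio = rx["frame"] / rx["packets"]
    print(f"frame errors per received packet: {ratio:.4%}")
```

That works out to roughly one framing error for every 220 received packets on a switched link with zero collisions, which is the pattern the duplex-mismatch diagnosis above is based on. Treat the ratio as a rough indicator rather than a threshold.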

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993

From cheech@pixelmetrix.com Mon, 12 Feb 2001 12:03:29 +0800
Date: Mon, 12 Feb 2001 12:03:29 +0800
From: CheeChun Kok cheech@pixelmetrix.com
Subject: [eepro100] card reports no resources

Hi,

I have been facing the problem of the card reporting no resources, and have tried various steps suggested by various contributors, notably Andrey Savochkin.

Our setup is as follows:

. Dedicated network monitoring equipment running kernel 2.2.14 (rpm obtained from RedHat 6.2) on an 800MHz PIII.
. 256M RAM
. 5 Intel PRO/100 S (reported as i82557 PCI Speedo by the driver)

Other than the above, the components are those of a typical PC.

The PC essentially runs a single process which continually reads IP packets from the NICs to process the data carried in them.

The error message starts appearing intermittently during operation. We have not seen it occurring immediately after startup (which would rule out the receiver bug as the cause ??)

We have not seen the message "can't fill rx buffer" using v1.09j-t Revision: 1.18 $ 1999/12/29 Modified by Andrey V. Savochkin. This suggests that we are not running short of kernel memory (??)

The version of the driver is 1.13 dated 1/9/2001 (uses pci-scan), which supposedly fixed the occurrence of the same messages during startup.

I have made a couple of changes to the source:

. increased RX_RING_SIZE to 64
. added a kernel message where it used to occur in Andrey's code
  if ((status & 0x003c) == 0x0008) { /* No resources (why?!) */

Here are some kernel/driver messages which might be helpful.

dmesg
=====
eth0: OEM i82557/i82558 10/100 Ethernet at 0xd002f000, 00:D0:B7:44:E4:8B, IRQ 9.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 734938-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x04f4518b).

free -m
=======
             total       used       free     shared    buffers     cached
Mem:           251        247          4         38         93         82
-/+ buffers/cache:         71        180
Swap:          133          5        128

/proc/sys/vm/freepages
======================
256 512 768

I have just turned on the debug level and am currently monitoring the messages.

I see the comment "No resources (why?!)" in the code. Does Intel have an explanation for this, or are they keeping mum about it? I wonder if this happens in their Windows (urgh!!) driver.

Changing the kernel version and the NIC in our equipment is an option, but a last resort because of the installed base.

Thanks in advance
CheeChun

From Antwerpen@netsquare.org Mon, 12 Feb 2001 08:29:55 +0100
Date: Mon, 12 Feb 2001 08:29:55 +0100
From: Antwerpen, Oliver Antwerpen@netsquare.org
Subject: [eepro100] e100 vs. eepro100

Hi,

> From: Donald Becker [mailto:becker@scyld.com]
>
> > > Did the interface resume correct operation at this point?
> >
> > Yes, in this case it did.
>
> Are there cases where the driver doesn't recover?

Yes. The server hangs completely about 2 times a week. The only obvious problem is the NIC. There are no other messages.

> operation, assuming that something has gone wrong. Look for v1.14 soon,
> which will
>    reset a transceiver left set to fixed 10baseT (Ref: Alpha SRM boot bug)
>    emit a few more details if the driver resets the chip

Okay, I loaded the e100 module for the NICs yesterday at 21:45. Today at 4:05 I got errors from the driver (but Intel's module doesn't even report which NIC...).

Feb 12 04:07:31 hunter kernel: e100_wait_exec_cmd: Wait failed.
scb cmd=0x70
Feb 12 04:07:35 hunter last message repeated 5 times
Feb 12 04:09:20 hunter last message repeated 8 times
Feb 12 04:09:44 hunter last message repeated 8 times
Feb 12 04:11:06 hunter last message repeated 6 times
Feb 12 04:12:12 hunter last message repeated 15 times
Feb 12 04:12:23 hunter last message repeated 3 times

As this message looks similar, it seems to me that I really do have a hardware problem. I'll swap that NIC in the next few days and we'll see.

Thanks so far!

Olli

From cheech@pixelmetrix.com Mon, 12 Feb 2001 17:41:01 +0800
Date: Mon, 12 Feb 2001 17:41:01 +0800
From: CheeChun Kok cheech@pixelmetrix.com
Subject: [eepro100] card reports no resources

Hi,

Here's more info that I've collected from my debugging:

1. Changed RX_RING_SIZE to 256. (Is this value too high? If so, is there a maximum?)

   After this was done, the error disappeared, or maybe it has merely been delayed. Anyway, another error message appeared. This time it is 'Too much work at interrupt, status = 0x4050', where the status is decoded from Intel's driver code as

   0x4000 - one frame is received
   0x0040 - CU suspended
   0x0010 - RU ready

2. I then proceeded to change max_interrupt_work to match RX_RING_SIZE. (Is there a reason why they are not the same in the original code, with max_interrupt_work = 20 and RX_RING_SIZE = 32?)

   However, this causes 'card reports no resources' to recur.

Here's the output from eepro100-diag. I do not see anything out of the ordinary.

Index #1: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at 0xb000.
MII PHY #1 transceiver registers:
  3000 782d 02a8 0154 05e1 45e1 0003 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0203 0000 0001 1ada 0000 0000 1a58 0000
  0000 0000 0000 0000 0000 0000 0000 0000.
MII PHY #1 transceiver registers:
  3000 782d 02a8 0154 05e1 45e1 0001 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0a03 0000 0001 0000 0000 0000 0000 0000
  0000 0000 0000 0000 0000 0000 0000 0000.
Basic mode control register 0x3000: Auto-negotiation enabled.
Basic mode status register 0x782d ... 782d. Link status: established. Capable of 100baseTx-FD 100baseTx 10baseT-FD 10baseT. Able to perform Auto-negotiation, negotiation complete. Vendor ID is 00:aa:00:--:--:--, model 21 rev. 4. No specific information is known about this transceiver type. I'm advertising 05e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT Advertising no additional info pages. IEEE 802.3 CSMA/CD protocol. Link partner capability is 45e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT. Negotiation completed. Thanks CheeChun From jeffrey.hundstad@mnsu.edu Mon, 12 Feb 2001 10:29:37 -0600 Date: Mon, 12 Feb 2001 10:29:37 -0600 From: Jeffrey Hundstad jeffrey.hundstad@mnsu.edu Subject: [eepro100] eepro100 performance With all due respect to Mr. Becker, I agree with his conclusion that your duplex is not matching. But in my experience, if you are using CISCO equipment with the eepro cards you MUST force duplex on your card side and the switch side. We get 100% mismatch between card and switch if we allow the duplex to autoset. -- jeffrey hundstad minnesota state university, mankato Donald Becker wrote: > On Sat, 10 Feb 2001, Arthur Rinkel wrote: > > > I'm having somewhat disappointing throughput with an eepro100 (i82557) on > > a Pentium 133 system and I'm hoping someone can help me out a bit. The > ... > > RX packets:19542477 errors:3608 dropped:0 overruns:0 frame:88730 > > TX packets:267371 errors:0 dropped:0 overruns:0 carrier:0 > > collisions:0 txqueuelen:100 > > That's a horrible Rx error rate! > > It appears that the local interface is set to full duplex mode, while > the remote end is set to half duplex. > > Don't force the duplex on either side. > > Donald Becker becker@scyld.com > Scyld Computing Corporation http://www.scyld.com > 410 Severn Ave. 
Suite 210 Second Generation Beowulf Clusters
> Annapolis MD 21403 410-990-9993
>
> _______________________________________________
> eepro100 mailing list
> eepro100@scyld.com
> http://www.scyld.com/mailman/listinfo/eepro100

From arthur@cal040041.student.utwente.nl Mon, 12 Feb 2001 21:32:21 +0100 (CET)
Date: Mon, 12 Feb 2001 21:32:21 +0100 (CET)
From: Arthur Rinkel arthur@cal040041.student.utwente.nl
Subject: [eepro100] eepro100 performance

On Sun, 11 Feb 2001, Donald Becker wrote:

> > I'm having somewhat disappointing throughput with an eepro100 (i82557) on
> > a Pentium 133 system and I'm hoping someone can help me out a bit. The
>
> It appears that the local interface is set to full duplex mode, while
> the remote end is set to half duplex.

The local interface is set to FD, but I don't know about the switch on the other end. Though the switch is capable of 100baseTx-FD, 100baseTx, 10baseT-FD and 10baseT, I've been advised not to use auto-negotiation because it doesn't always work correctly between switch and NIC. I'm told that forcing the NIC to HD should always work, which I have tried, but it doesn't improve performance.

> Don't force the duplex on either side.

I have no control over the switch this eepro100 is connected to, but if auto-negotiation fails the switch uses HD.

Grtz,
Arthur

From arthur@cal040041.student.utwente.nl Mon, 12 Feb 2001 21:44:08 +0100 (CET)
Date: Mon, 12 Feb 2001 21:44:08 +0100 (CET)
From: Arthur Rinkel arthur@cal040041.student.utwente.nl
Subject: [eepro100] eepro100 performance

On Mon, 12 Feb 2001, Jeffrey Hundstad wrote:

> With all due respect to Mr. Becker, I agree with his conclusion that your
> duplex is not matching. But in my experience, if you are using CISCO
> equipment with the eepro cards you MUST force duplex on your card side and the
> switch side. We get 100% mismatch between card and switch if we allow the
> duplex to autoset.

The equipment is an HP Procurve Switch 4000M, if it matters. Or is this another Cisco device?
;)

Greetz,
Arthur

From becker@scyld.com Tue, 13 Feb 2001 20:32:59 -0500 (EST)
Date: Tue, 13 Feb 2001 20:32:59 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] eepro100 performance

On Mon, 12 Feb 2001, Jeffrey Hundstad wrote:

> With all due respect to Mr. Becker, I agree with his conclusion that your
> duplex is not matching. But in my experience, if you are using CISCO
> equipment with the eepro cards you MUST force duplex on your card side and the
> switch side. We get 100% mismatch between card and switch if we allow the
> duplex to autoset.

I'll be blunt about the Cisco situation:
   Old Cisco switches had broken autonegotiation.
   Cisco recommended duplex be forced to hide this bug.
   New Cisco switches, with functional autonegotiation, tend to be set
   to forced duplex due to this recommendation.

The best institutional policy is to never force full duplex. The minor
performance gain is not worth the problems caused. If you have an older
switch with broken autonegotiation, just disable autonegotiation and
allow the link to default to half duplex.

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993

From dstein2203@t-online.de 14 Feb 2001 05:26 GMT
Date: 14 Feb 2001 05:26 GMT
From: dstein2203@t-online.de dstein2203@t-online.de
Subject: [eepro100] Alias interface eth0:1 on eepro100

Hi

I am new to this list. I went through the archives to see whether someone had asked my question before, but couldn't find an answer.

I am trying to get two logical interfaces working on one physical interface, eth0 and eth0:0 (or eth0:1, as you like it). This seems to work on every 3com card, but I can't get it working on an eepro100 (the card shipped with the Compaq DL380).
Command ifconfig produces the following output (ifconfig eth0:1 149.208.7.183):

SIOCSIFADDR: No such device
SIOCSIFFLAGS: No such device

My module configuration is just

alias eth0 eepro100 options=0x20 full_duplex=1

But I think it isn't read, because I had included the eepro100 module in the ramdisk...

Question: is this impossible with eepro100 chips or is there a special option for it?

Dietmar

From Alexander.Stoll@fh-albsig.de Wed, 14 Feb 2001 16:00:10 GMT
Date: Wed, 14 Feb 2001 16:00:10 GMT
From: Alexander.Stoll@fh-albsig.de Alexander.Stoll@fh-albsig.de
Subject: [eepro100] Alias interface eth0:1 on eepro100

> Hi
>
> I am new to this list, but went through the archives if someone had
> asked my question before but can't find any answer.
>
> I am trying to get two logical interfaces working on one physical
> interface, eth0 and eth0:0 (or eth0:1, as you like it). This seems to work on
> every 3com card, but I can't get it working on a eepro100 (card shipped
> with Compaq DL380).
>
> Command ifconfig produces the following output (ifconfig eth0:1
> 149.208.7.183):
>
> SIOCSIFADDR: No such device
> SIOCSIFFLAGS: No such device
>
> My module configuration is just
>
> alias eth0 eepro100 options=0x20 full_duplex=1
>
> But I think is isn't read, while I had bound eepro100 module within the
> ramdisk...
>
> Question: is this impossible with eepro100 chips or is there a special
> option for it?

IP aliasing is not a specific feature of your NIC...
add aliasing support to your kernel, e.g. load the corresponding module...

regards, AS

-------------------------------------------
http://www.student.fh-albsig.de/ -> MailMan

From dstein2203@t-online.de 15 Feb 2001 05:18 GMT
Date: 15 Feb 2001 05:18 GMT
From: dstein2203@t-online.de dstein2203@t-online.de
Subject: [eepro100] Alias interface eth0:1 on eepro100

> IP aliasing is not a specific feature of your NIC...
> add aliasing support to your kernel, e.g. load the corresponding module...
>

Oh... yes, I remember.
I am sorry, but I am not a specialist in networking. Thanks for your help.

Dietmar

From subscriptions@graphon.com Thu, 15 Feb 2001 13:27:56 -0800
Date: Thu, 15 Feb 2001 13:27:56 -0800
From: Nate Amsden subscriptions@graphon.com
Subject: [eepro100] eepro100 82559 problems

hi

after seeing this message posted by Antwerpen@netsquare.org:
http://www.scyld.com/pipermail/eepro100/2001-February/001509.html

i figured i should post because i have a very similar problem.

We have 3 identical 1U systems running Supermicro S370SSE motherboards (at least i'm 99.99999% sure it is, i can't be 100% sure without taking the system apart). They have dual onboard Intel 82559 NICs.

(somewhat related..)
When using OpenBSD 2.8 on one of them, the machine seemed to crash after about 5 minutes of use (firewalling/port forwarding under very low load, maybe 10kb/s at best).

I have since replaced OpenBSD 2.8 with Debian GNU/Linux 2.2r2 and kernel 2.2.17 + many patches, including modules for eepro100 v1.11a. this machine has been operating perfectly for the past 68 days 20 hours. At another location on the other side of the country we are trying to deploy the 2nd of 3 systems, using a similar configuration (kernel and modules are identical, bios settings match etc) and since we deployed it (on monday, i think it was) it has consistently locked up hard every night. Today we synched the bios settings between the unit here and there and things seemed to be going better; however, the errors are still showing up, something that has never shown up in the logs of the unit here.

sample log entry:

Feb 10 05:37:11 gate-nh kernel: eth1: Transmit timed out: status 0050 0080 at 59/61 commands 000c0000 400c0000 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: Tx ring dump, Tx queue 61 / 59:
Feb 10 05:37:11 gate-nh kernel: eth1: 0 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 1 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 2 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 3 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 4 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 5 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 6 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 7 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 8 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 9 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 10 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 11 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 12 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 13 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 14 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 15 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 16 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 17 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 18 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 19 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 20 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 21 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 22 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 23 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 24 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 25 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 26 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: * 27 000c0000.
Feb 10 05:37:11 gate-nh kernel: eth1: 28 400c0000.
Feb 10 05:37:11 gate-nh kernel: eth1: =29 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 30 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 31 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1:Printing Rx ring (next to receive into 143).
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 0 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 1 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 2 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 3 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 4 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 5 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 6 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 7 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 8 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 9 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 10 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 11 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 12 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 13 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 14 c0000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 15 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 16 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 17 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 18 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 19 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 20 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 21 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 22 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 23 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 24 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 25 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 26 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 27 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 29 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 30 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 31 00000001.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 0 is 3100.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 1 is 782d.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 2 is 02a8.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 3 is 0320.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 4 is 05e1.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 5 is 0021.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 21 is 0000.
Feb 10 05:37:11 gate-nh kernel: eth1: Tx ring dump, Tx queue 61 / 59:
Feb 10 05:37:11 gate-nh kernel: eth1: 0 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 1 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 2 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 3 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 4 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 5 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 6 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 7 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 8 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 9 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 10 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 11 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 12 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 13 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 14 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 15 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 16 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 17 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 18 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 19 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 20 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 21 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 22 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 23 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 24 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 25 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 26 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: * 27 000c0000.
Feb 10 05:37:11 gate-nh kernel: eth1: 28 400c0000.
Feb 10 05:37:11 gate-nh kernel: eth1: =29 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 30 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1: 31 000ca000.
Feb 10 05:37:11 gate-nh kernel: eth1:Printing Rx ring (next to receive into 143)
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 0 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 1 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 2 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 3 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 4 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 5 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 6 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 7 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 8 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 9 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 10 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 11 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 12 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 13 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 14 c0000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 15 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 16 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 17 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 18 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 19 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 20 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 21 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 22 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 23 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 24 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 25 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 26 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 27 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 29 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 30 00000001.
Feb 10 05:37:11 gate-nh kernel: Rx ring entry 31 00000001.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 0 is 3100.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 1 is 782d.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 2 is 02a8.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 3 is 0320.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 4 is 05e1.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 5 is 0021.
Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 21 is 0000.

I'm not sure what kind of machine was replaced by this one but I could find out.. it was a redhat machine and it ran for about the past year until we decided to replace it with a racked debian box. any idea what could cause this? It only happened once we started using the new system. And I bet the OpenBSD crashes on my end here were the result of something similar. however, in OpenBSD it didn't give any errors, it just dumped to the debugger and sat there until i rebooted it. buggy chip? buggy driver? hard to imagine the driver is to blame, as this other system has been running for over 2 months without a single problem.

running ifconfig on both systems shows:

(on broken system)
4:27pm up 5:48, 1 user, load average: 0.00, 0.00, 0.00
eth0      Link encap:Ethernet HWaddr 00:30:48:11:02:D8
          inet addr:192.168.100.2 Bcast:192.168.100.255 Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:417948 errors:0 dropped:0 overruns:0 frame:0
          TX packets:184575 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:11 Base address:0xb000

eth1      Link encap:Ethernet HWaddr 00:30:48:11:12:16
          inet addr:XX.XX.XX.XX Bcast:XX.255.255.255 Mask:255.255.255.XXX
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:207215 errors:0 dropped:0 overruns:0 frame:0
          TX packets:192204 errors:2 dropped:0 overruns:0 carrier:0
          collisions:157 txqueuelen:100
          Interrupt:5 Base address:0xd000

eth1:0    Link encap:Ethernet HWaddr 00:30:48:11:12:16
          inet addr:XX.XX.XX.XXX Bcast:XX.255.255.255 Mask:255.255.255.XXX
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          Interrupt:5 Base address:0xd000

(on working system)
1:24pm up 68 days, 20:42, 1 user, load average: 0.00, 0.04, 0.06
eth0      Link encap:Ethernet HWaddr 00:30:48:11:02:D9
          inet addr:XX.XX.XX.XX Bcast:XX.XX.XXX.XXX Mask:255.255.255.XXX
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:101475510 errors:0 dropped:0 overruns:0 frame:1
          TX packets:117209873 errors:0 dropped:0 overruns:0 carrier:116
          collisions:15698167 txqueuelen:100
          Interrupt:11 Base address:0x9000

eth0:1    Link encap:Ethernet HWaddr 00:30:48:11:02:D9
          inet addr:XX.XX.XX.XXX Bcast:XX.XX.XX.255 Mask:255.255.255.XXX
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          Interrupt:11 Base address:0x9000

eth1      Link encap:Ethernet HWaddr 00:30:48:11:12:17
          inet addr:192.168.50.20 Bcast:192.168.50.255 Mask:255.255.255.XX
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:123049598 errors:0 dropped:0 overruns:0 frame:0
          TX packets:104000322 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:5 Base address:0xb000

I point this out because eth1 on the broken system had 2 TX errors, while after all the weeks and packets that have gone through the working system there is not a single TX error. there are a lot of collisions, but it is hooked up to a $10 hub...

here is the kernel log for the broken system when the kernel loaded the driver:

Feb 15 02:38:45 gate-nh kernel: eepro100.c:v1.11a 7/31/2000 Donald Becker
Feb 15 02:38:45 gate-nh kernel: http://www.scyld.com/network/eepro100.html
Feb 15 02:38:45 gate-nh kernel: eth0: OEM i82557/i82558 10/100 Ethernet at 0xc808b000, 00:30:48:11:02:D8, IRQ 11.
Feb 15 02:38:45 gate-nh kernel: Receiver lock-up bug exists -- enabling work-around.
Feb 15 02:38:45 gate-nh kernel: Board assembly 000000-000, Physical connectors present: RJ45
Feb 15 02:38:45 gate-nh kernel: Primary interface chip i82555 PHY #1.
Feb 15 02:38:45 gate-nh kernel: General self-test: passed.
Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed.
Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed.
Feb 15 02:38:45 gate-nh kernel: Internal registers self-test: passed.
Feb 15 02:38:45 gate-nh kernel: ROM checksum self-test: passed (0x04f4518b).
Feb 15 02:38:45 gate-nh kernel: eth1: OEM i82557/i82558 10/100 Ethernet at 0xc808d000, 00:30:48:11:12:16, IRQ 5.
Feb 15 02:38:45 gate-nh kernel: Receiver lock-up bug exists -- enabling work-around.
Feb 15 02:38:45 gate-nh kernel: Board assembly a19716-001, Physical connectors present: RJ45
Feb 15 02:38:45 gate-nh kernel: Primary interface chip i82555 PHY #1.
Feb 15 02:38:45 gate-nh kernel: General self-test: passed.
Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed.
Feb 15 02:38:45 gate-nh kernel: Internal registers self-test: passed.
Feb 15 02:38:45 gate-nh kernel: ROM checksum self-test: passed (0x04f4518b).

I imagine it is similar for the working system; however, the bootup logs are cycled and overwritten after a month of uptime.

network load on both systems is extremely light. MRTG reports average network traffic of 2.9kB/s both ways over the past 5 weeks for the broken system. the working one averages 13-14kB/s both ways for the past 5 weeks. both systems are on 1Mbit dsl connections.

the 3rd is sitting on a shelf waiting for someone to get the time to set it up. it's in another state so i don't have access to it.

The machines themselves are single P3-733MHz, 128MB RAM, using that Supermicro motherboard and a single 20GB Quantum IDE drive.

any ideas would be appreciated :) i have a feeling it will lock up again tonight.

thanks!

nate

--
Nate Amsden
System Administrator
GraphOn
http://www.graphon.com

From natea@graphon.com Fri, 16 Feb 2001 11:09:47 -0800
Date: Fri, 16 Feb 2001 11:09:47 -0800
From: Nate Amsden natea@graphon.com
Subject: [eepro100] eepro100 82559 problems

looks like we may have solved the problem. we switched the configurations of eth0 and eth1 on the other machine so that the sub interface is on eth0 and not eth1. the system didn't lock up last night and has not had a single error since.

so.. what could this mean? tia.
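For anyone trying to read the PHY register values in the log dump earlier in this thread, here is a minimal sketch that decodes them using the standard MII management register layout (IEEE 802.3 clause 22). It is illustrative only: the table and function names (BMCR_BITS, BMSR_BITS, ABILITY_BITS, decode) are made up for this example and are not part of the eepro100 driver or mii-diag.

```python
# Decode MII PHY registers 0, 1, 4 and 5 from a hex dump.
# Bit meanings follow the standard MII register layout; the names of the
# tables and of decode() are hypothetical, chosen only for this sketch.

BMCR_BITS = [  # register 0: basic mode control
    (0x8000, "reset"),
    (0x4000, "loopback"),
    (0x2000, "speed select 100Mb/s"),
    (0x1000, "autonegotiation enabled"),
    (0x0800, "power down"),
    (0x0400, "isolate"),
    (0x0200, "restart autonegotiation"),
    (0x0100, "full duplex"),
]

BMSR_BITS = [  # register 1: basic mode status
    (0x8000, "100baseT4 capable"),
    (0x4000, "100baseX-FD capable"),
    (0x2000, "100baseX-HD capable"),
    (0x1000, "10baseT-FD capable"),
    (0x0800, "10baseT-HD capable"),
    (0x0020, "autonegotiation complete"),
    (0x0010, "remote fault"),
    (0x0008, "autonegotiation capable"),
    (0x0004, "link up"),
    (0x0002, "jabber detected"),
    (0x0001, "extended capability"),
]

ABILITY_BITS = [  # register 4 (our advertisement) and 5 (link partner)
    (0x8000, "next page"),
    (0x4000, "acknowledge"),
    (0x2000, "remote fault"),
    (0x0400, "pause"),
    (0x0200, "100baseT4"),
    (0x0100, "100baseTx-FD"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
]


def decode(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]


if __name__ == "__main__":
    # Values copied from the "PHY index 1 register N is ..." log lines.
    dump = {0: 0x3100, 1: 0x782D, 4: 0x05E1, 5: 0x0021}
    tables = {0: BMCR_BITS, 1: BMSR_BITS, 4: ABILITY_BITS, 5: ABILITY_BITS}
    for reg, value in dump.items():
        print(f"register {reg} = {value:04x}: {', '.join(decode(value, tables[reg]))}")
```

Read this way, register 4 (05e1) shows the NIC advertising all four 10/100 modes plus pause, while register 5 (0021) lists only 10baseT half duplex for the link partner, with no acknowledge bit set. That would be consistent with a partner that does not autonegotiate, such as a plain 10baseT hub, though the dump alone does not prove it.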
nate Nate Amsden wrote: > > hi > > after seeing this message posted by Antwerpen@netsquare.org: > http://www.scyld.com/pipermail/eepro100/2001-February/001509.html > > i figured i should post because i have a very similar problem. > > We have 3 identical 1U systems running Supermicro S370SSE motherboards > (at least im 99.99999% sure it is, i cant be 100% sure without taking > the system apart). They have dual onboard Intel 82559 NICs. > > (somewhat related..) > When using OpenBSD 2.8 on one of them, the machine seemed to crash > after about 5 minutes of use(firewaling/port forwarding under > very low load maybe 10kb/s at best). > > I have since replaced OpenBSD 2.8 with Debian GNU/Linux 2.2r2 and > kernel 2.2.17+many patches including modules for eepro100 v1.11a. > this machine has been operating perfectly for the past 68 days > 20 hours. At another location on the other side of the country > we are trying to deploy the 2nd of 3 systems, using a similar > configuration(kernel and modules are identical, bios settings > match etc) and since we deployed it on monday i think it was > it has consistantly locked up hard every night. Today we > synched the bios settings between the unit here and there and > things seemed to be going better however the errors are still > showing up. something that has never shown up in the logs in > the unit here. > > sample log entry: > > Feb 10 05:37:11 gate-nh kernel: eth1: Transmit timed out: status 0050 0080 at > 59/61 commands 000c0000 400c0000 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: Tx ring dump, Tx queue 61 / 59: > Feb 10 05:37:11 gate-nh kernel: eth1: 0 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 1 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 2 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 3 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 4 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 5 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 6 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 7 000ca000. 
> Feb 10 05:37:11 gate-nh kernel: eth1: 8 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 9 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 10 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 11 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 12 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 13 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 14 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 15 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 16 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 17 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 18 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 19 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 20 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 21 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 22 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 23 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 24 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 25 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 26 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: * 27 000c0000. > Feb 10 05:37:11 gate-nh kernel: eth1: 28 400c0000. > Feb 10 05:37:11 gate-nh kernel: eth1: =29 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 30 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 31 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1:Printing Rx ring (next to receive into > 143). > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 0 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 1 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 2 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 3 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 4 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 5 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 6 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 7 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 8 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 9 00000001. 
> Feb 10 05:37:11 gate-nh kernel: Rx ring entry 10 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 11 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 12 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 13 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 14 c0000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 15 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 16 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 17 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 18 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 19 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 20 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 21 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 22 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 23 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 24 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 25 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 26 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 27 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 29 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 30 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 31 00000001. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 0 is 3100. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 1 is 782d. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 2 is 02a8. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 3 is 0320. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 4 is 05e1. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 5 is 0021. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 21 is 0000. > Feb 10 05:37:11 gate-nh kernel: eth1: Tx ring dump, Tx queue 61 / 59: > Feb 10 05:37:11 gate-nh kernel: eth1: 0 000ca000. 
> Feb 10 05:37:11 gate-nh kernel: eth1: 1 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 2 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 3 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 4 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 5 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 6 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 7 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 8 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 9 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 10 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 11 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 12 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 13 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 14 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 15 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 16 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 17 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 18 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 19 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 20 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 21 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 22 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 23 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 24 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 25 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 26 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: * 27 000c0000. > Feb 10 05:37:11 gate-nh kernel: eth1: 28 400c0000. > Feb 10 05:37:11 gate-nh kernel: eth1: =29 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 30 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1: 31 000ca000. > Feb 10 05:37:11 gate-nh kernel: eth1:Printing Rx ring (next to receive into 143) > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 0 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 1 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 2 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 3 00000001. 
> Feb 10 05:37:11 gate-nh kernel: Rx ring entry 4 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 5 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 6 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 7 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 8 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 9 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 10 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 11 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 12 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 13 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 14 c0000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 15 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 16 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 17 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 18 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 19 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 20 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 21 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 22 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 23 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 24 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 25 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 26 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 27 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 28 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 29 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 30 00000001. > Feb 10 05:37:11 gate-nh kernel: Rx ring entry 31 00000001. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 0 is 3100. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 1 is 782d. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 2 is 02a8. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 3 is 0320. 
> Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 4 is 05e1. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 5 is 0021. > Feb 10 05:37:11 gate-nh kernel: PHY index 1 register 21 is 0000. > > I'm not sure what kind of machine was replaced by this one but I > could find out..it was a redhat machine and it ran for about the > past year until we decided to replace it with a racked debian > box. any idea what could cause this? It only happened once we > started using the new system. And I bet the OpenBSD crashes > on my end here were the result of something similar. however, > in OpenBSD it didn't give any errors, it just dumped to the > debugger and sat there until i rebooted it. buggy chip? > buggy driver? hard to imagine the driver is to blame as > this other system has been running for over 2 months without > a single problem. > > running ifconfig on both systems shows: > (on broken system) > 4:27pm up 5:48, 1 user, load average: 0.00, 0.00, 0.00 > eth0 Link encap:Ethernet HWaddr 00:30:48:11:02:D8 > inet addr:192.168.100.2 Bcast:192.168.100.255 Mask:255.255.255.0 > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:417948 errors:0 dropped:0 overruns:0 frame:0 > TX packets:184575 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:100 > Interrupt:11 Base address:0xb000 > > eth1 Link encap:Ethernet HWaddr 00:30:48:11:12:16 > inet addr:XX.XX.XX.XX Bcast:XX.255.255.255 Mask:255.255.255.XXX > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:207215 errors:0 dropped:0 overruns:0 frame:0 > TX packets:192204 errors:2 dropped:0 overruns:0 carrier:0 > collisions:157 txqueuelen:100 > Interrupt:5 Base address:0xd000 > > eth1:0 Link encap:Ethernet HWaddr 00:30:48:11:12:16 > inet addr:XX.XX.XX.XXX Bcast:XX.255.255.255 Mask:255.255.255.XXX > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > Interrupt:5 Base address:0xd000 > > (on working system) > 1:24pm up 68 days, 20:42, 1 user, load average: 0.00, 0.04, 0.06 > eth0 Link 
encap:Ethernet HWaddr 00:30:48:11:02:D9 > inet addr:XX.XX.XX.XX Bcast:XX.XX.XXX.XXX Mask:255.255.255.XXX > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:101475510 errors:0 dropped:0 overruns:0 frame:1 > TX packets:117209873 errors:0 dropped:0 overruns:0 carrier:116 > collisions:15698167 txqueuelen:100 > Interrupt:11 Base address:0x9000 > > eth0:1 Link encap:Ethernet HWaddr 00:30:48:11:02:D9 > inet addr:XX.XX.XX.XXX Bcast:XX.XX.XX.255 Mask:255.255.255.XXX > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > Interrupt:11 Base address:0x9000 > > eth1 Link encap:Ethernet HWaddr 00:30:48:11:12:17 > inet addr:192.168.50.20 Bcast:192.168.50.255 Mask:255.255.255.XX > UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 > RX packets:123049598 errors:0 dropped:0 overruns:0 frame:0 > TX packets:104000322 errors:0 dropped:0 overruns:0 carrier:0 > collisions:0 txqueuelen:100 > Interrupt:5 Base address:0xb000 > > I point this out because eth1 on the broken system had 2 TX errors, > and after all the weeks and packets that have gone through the working > system not a single error. although a lot of collisions, but it is > hooked up to a $10 hub... > > here is the kernel log for the broken system when the kernel loaded > the driver: > > Feb 15 02:38:45 gate-nh kernel: eepro100.c:v1.11a 7/31/2000 Donald Becker > > Feb 15 02:38:45 gate-nh kernel: http://www.scyld.com/network/eepro100.html > Feb 15 02:38:45 gate-nh kernel: eth0: OEM i82557/i82558 10/100 Ethernet at > 0xc808b000, 00:30:48:11:02:D8, IRQ 11. > Feb 15 02:38:45 gate-nh kernel: Receiver lock-up bug exists -- enabling > work-around. > Feb 15 02:38:45 gate-nh kernel: Board assembly 000000-000, Physical connectors > present: RJ45 > Feb 15 02:38:45 gate-nh kernel: Primary interface chip i82555 PHY #1. > Feb 15 02:38:45 gate-nh kernel: General self-test: passed. > Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed. > Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed. 
> Feb 15 02:38:45 gate-nh kernel: Internal registers self-test: passed. > Feb 15 02:38:45 gate-nh kernel: ROM checksum self-test: passed (0x04f4518b). > Feb 15 02:38:45 gate-nh kernel: eth1: OEM i82557/i82558 10/100 Ethernet at > 0xc808d000, 00:30:48:11:12:16, IRQ 5. > Feb 15 02:38:45 gate-nh kernel: Receiver lock-up bug exists -- enabling > work-around. > Feb 15 02:38:45 gate-nh kernel: Board assembly a19716-001, Physical connectors > present: RJ45 > Feb 15 02:38:45 gate-nh kernel: Primary interface chip i82555 PHY #1. > Feb 15 02:38:45 gate-nh kernel: General self-test: passed. > Feb 15 02:38:45 gate-nh kernel: Serial sub-system self-test: passed. > Feb 15 02:38:45 gate-nh kernel: Internal registers self-test: passed. > Feb 15 02:38:45 gate-nh kernel: ROM checksum self-test: passed (0x04f4518b). > > I imagine the same is similar for the working system however the bootup > logs are cycled and overwritten after a month of uptime. > > network load on both systems is extremely light, MRTG reports over > the past 5 weeks average network traffic 2.9kB/s both ways for the > broken system. the working one averages 13-14kB/s both ways for > the past 5 weeks. both systems are on 1Mbit dsl connections. > > the 3rd is sitting on a shelf waiting for someone to get the time > to set it up. its in another state so i don't have access to it. > > The machines themselves are Single P3-733Mhz 128MB ram, using > that Supermicro motherboard, a single 20GB quantum IDE drive. > > any ideas would be appreciated :) i have a feeling it will > lockup again tonight. > > thanks! 
>
> nate
>
> --
> Nate Amsden
> System Administrator
> GraphOn
> http://www.graphon.com
>
> _______________________________________________
> eepro100 mailing list
> eepro100@scyld.com
> http://www.scyld.com/mailman/listinfo/eepro100

--
Nate Amsden
System Administrator
GraphOn
http://www.graphon.com

From emil@baymountain.com Fri, 16 Feb 2001 23:39:15 -0500
Date: Fri, 16 Feb 2001 23:39:15 -0500
From: Emil Briggs emil@baymountain.com
Subject: [eepro100] Info on strange eepro100 lockup

SuperMicro Quad motherboard (Serverworks HE chipset)

The box is at a remote facility and the network will
lockup while I'm logged in working. I call someone at
the facility and the console has a message on it

eth0: card reports no resources

as soon as someone up there types something on the keyboard the
network comes back up. The problem is very consistent
and repeatable with the eepro100 locking up a few minutes
after the last typing was done on the keyboard. The kernel
is 2.2.14 and I'm going to upgrade it to get a more up-to-date
version of the eepro100 driver when I get onsite but I thought
the symptoms might be of some interest.

Emil

From gdavide@mclink.it Sun, 18 Feb 2001 10:39:21 +0100
Date: Sun, 18 Feb 2001 10:39:21 +0100
From: Davide Giunchi gdavide@mclink.it
Subject: [eepro100] eepro on netserver disconnection

Hi all.

I'm writing again because I've tried to fix this problem several times but I
haven't resolved anything...
The problem is that with eepro100 i82557 on some HP Netserver machines the
users get "disconnect" during their telnet session; the NIC is a HP NetServer
10/100TX PCI (D5013A).
I get this error on RedHat6.0 and RedHat 6.0 but I think it would be the same
with other distributions; in the redhat errata there's no problem like this.
I've tried to force the speed to 10 Mbps with mii-diag but that didn't solve
the problem, and there's no irq conflict...
It happens on more than one network (no router in the middle) so it isn't a
network cabling problem... what can I do?
I'm very worried about this problem because I can't figure it out and my
users are getting annoyed.

Regards.

From saw@saw.sw.com.sg Mon, 19 Feb 2001 17:06:41 -0800
Date: Mon, 19 Feb 2001 17:06:41 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: card reports no resources

Hello,

On Mon, Feb 12, 2001 at 12:03:29PM +0800, CheeChun Kok wrote:
[snip]
> The PC essentially runs a single process which continually
> reads IP packets from the NICs to process the data carried
> in them. The error message starts appearing intermittently
> during operation. We have not seen them occurring immediately
> after startup (hence ruling out receiver bug being the
> cause ??)
>
> We have not seen the message "can't fill rx buffer" using
> v1.09j-t Revision: 1.18 $ 1999/12/29 Modified by Andrey V.
> Savochkin. This suggests that we are not running short of
> kernel memory (??)

So, the "no resource" message means that the card thinks that there is
a receive buffer shortage. The natural reason for it is big traffic bursts.

On Mon, Feb 12, 2001 at 05:41:01PM +0800, CheeChun Kok wrote:
> 1. Changed RX_RING_SIZE to 256 (Is this a value too high, if so,
> is there a max?)
> After this was done, the error disappeared or maybe it has
> merely been delayed.

You increase the number of buffers, and the messages disappear. Reasonable.

> Anyway, another error message appeared. This time it is
> 'Too much work at interrupt, status = 0x4050'
> where the status is decoded from Intel's driver code as

It means that you have really big bursts (or really bad interrupt latency
because some other driver disables interrupts for too long).

[snip]
>
> 2. I then proceeded to change max_interrupt_work to match
> RX_RING_SIZE. (Is there a reason why they are not the same
> in the original set of codes? with max_interrupt_work = 20 and
> RX_RING_SIZE = 32)
> However, this causes 'card reports no resources' to reoccur.

This way it should work.
The only reason that may explain it is that you have horrible interrupt latency.
For example, the frame buffer on some chipsets has this property.

Best regards
Andrey

From saw@saw.sw.com.sg Mon, 19 Feb 2001 17:26:22 -0800
Date: Mon, 19 Feb 2001 17:26:22 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: eepro on netserver disconnection

On Sun, Feb 18, 2001 at 10:39:21AM +0100, Davide Giunchi wrote:
> I'm writing again because I've tried to fix this problem several times but I
> haven't resolved anything...
> The problem is that with eepro100 i82557 on some HP Netserver machines the
> users get "disconnect" during their telnet session; the NIC is a HP NetServer
> 10/100TX PCI (D5013A).
> I get this error on RedHat6.0 and RedHat 6.0 but I think it would be the same
> with other distributions; in the redhat errata there's no problem like this.
> I've tried to force the speed to 10 Mbps with mii-diag but that didn't solve
> the problem, and there's no irq conflict...
> It happens on more than one network (no router in the middle) so it isn't a
> network cabling problem... what can I do?
> I'm very worried about this problem because I can't figure it out and my
> users are getting annoyed.

Capture and send me tcpdump's of disconnecting sessions (-s1600 -n -S).

Best regards
Andrey

From saw@saw.sw.com.sg Mon, 19 Feb 2001 17:30:54 -0800
Date: Mon, 19 Feb 2001 17:30:54 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: Server Problem

On Thu, Feb 08, 2001 at 01:34:38PM -0500, Derek Harkness wrote:
> Compaq DL380 three i82555 chips. The first is on the mainboard and works
> great, the other two are on a compaq NC3134 controller, and worked under
> kernel 2.2.17 just fine. I upgraded to kernel 2.4.x and they stopped
> receiving data.
>
> Error generated:
> eth1: Transmit timeouted out: status e050 0x00 at 0/28 command 0001a000

1. Pass "debug=3" option to the driver.
2. Send all messages produced by the driver, starting from its greeting message.
Best regards
Andrey

From saw@saw.sw.com.sg Mon, 19 Feb 2001 17:33:52 -0800
Date: Mon, 19 Feb 2001 17:33:52 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: Info on strange eepro100 lockup

On Fri, Feb 16, 2001 at 11:39:15PM -0500, Emil Briggs wrote:
> The box is at a remote facility and the network will
> lockup while I'm logged in working. I call someone at
> the facility and the console has a message on it
>
> eth0: card reports no resources
>
> as soon as someone up there types something on the keyboard the
> network comes back up. The problem is very consistent
> and repeatable with the eepro100 locking up a few minutes
> after the last typing was done on the keyboard. The kernel
> is 2.2.14 and I'm going to upgrade it to get a more up-to-date
> version of the eepro100 driver when I get onsite but I thought
> the symptoms might be of some interest.

Interesting phenomenon...
If the computer doesn't have a big network load, you may run the driver under
high debug level and send the output. I'll check what's going on.

Best regards
Andrey

From johnzero-eepro100@johnzero.hu Tue, 20 Feb 2001 23:29:03 +0100 (CET)
Date: Tue, 20 Feb 2001 23:29:03 +0100 (CET)
From: Noll Janos johnzero-eepro100@johnzero.hu
Subject: [eepro100] Integrated Pro/100 on Intel D815EEA - serious problem

Hi!

I've tried to get informed about this problem. Using Deja.com-s search, I've
found that many people are having problems with the Intel D815EEA motherboard
(which has an integrated ethernet card).

The system seems to run "fine" until it gets some (not even too large) net
load...

I've tried Linux 2.4.1, I've tried the -ac patches, etc. but nothing so far.
kernel logs:
---------------------------------------
CPU: Intel Pentium III (Coppermine) stepping 03
[...]
BIOS Vendor: Intel Corp.
BIOS Version: EA81510A.86A.0040.P09.0011141019
BIOS Release: 11/14/2000
[...]
eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/d
eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin

> Hi!
>
> I've tried to get informed about this problem. Using Deja.com-s search, I've
> found that many people are having problems with the Intel D815EEA motherboard
> (which has an integrated ethernet card).
>
> The system seems to run "fine" until it gets some (not even too large) net
> load...
>
> I've tried Linux 2.4.1, I've tried the -ac patches, etc. but nothing so far.
> kernel logs:
> ---------------------------------------
> CPU: Intel Pentium III (Coppermine) stepping 03
> [...]
> BIOS Vendor: Intel Corp.
> BIOS Version: EA81510A.86A.0040.P09.0011141019
> BIOS Release: 11/14/2000
> [...]
> eepro100.c:v1.09j-t 9/29/99 Donald Becker http://cesdis.gsfc.nasa.gov/linux/d
> eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin
> PCI: Found IRQ 11 for device 01:08.0
> eth0: Intel Corporation 82820 820 (Camino 2) Chipset Ethernet, 00:03:47:3E:44
> Board assembly 000000-000, Physical connectors present: RJ45
> Primary interface chip i82555 PHY #1.
> General self-test: passed.
> Serial sub-system self-test: passed.
> Internal registers self-test: passed.
> ROM checksum self-test: passed (0x04f4518b).
> [...]
> eepro100: wait_for_cmd_done timeout!
> last message repeated 5 times
> last message repeated 6 times
> last message repeated 6 times
> last message repeated 6 times
> backup last message repeated 2 times
> [...]
> eth0: Transmit timed out: status 0050 0c80 at 3995/4023 command 000c0000.
> Tx ring dump, Tx queue 4023 / 3995:
> 0 200c0000.
> 1 000c0000.
> 2 000c0000.
> 3 000c0000.
> 4 000c0000.
> 5 000c0000.
> 6 000c0000.
> 7 000c0000.
> 8 200c0000.
> 9 000c0000.
> 10 000c0000.
> 11 000c0000.
> 12 000c0000.
> 13 000c0000.
> 14 000c0000.
> 15 000c0000.
> 16 200c0000.
> 17 000c0000.
> 18 000c0000.
> 19 000c0000.
> 20 000c0000.
> 21 000c0000.
> 22 400c0000.
> =23 000ca000.
> 24 200ca000.
> 25 000ca000.
> 26 000ca000.
> * 27 000c0000.
> 28 000c0000.
> 29 000c0000.
> 30 000c0000.
> 31 000c0000.
> [...]
> --------------------------------
>
> From what I've read, this problem exists with the 2.2.18 kernel too, though I
> haven't tried that yet.
>
> Noll Janos

--
-------------------------------------------------------------------------
private: christoph.plattner@dot.at
company: christoph.plattner@alcatel.at

From jlundell@pobox.com Tue, 20 Feb 2001 16:07:32 -0800
Date: Tue, 20 Feb 2001 16:07:32 -0800
From: Jonathan Lundell jlundell@pobox.com
Subject: [eepro100] various eepro100 drivers

> I think the first answer you will get from the experts here, is:
>Use the current version of the eepro driver from the server of Donald
>Becker.
>There is a version of this year !

I'd be grateful if someone would either direct me to a FAQ or explain
briefly the relationships among the various eepro100 drivers floating
around. I don't mean simple revisions, but at least the version in the
Linux release, the Becker driver, and the Intel driver. XyzBSD optional.
--
/Jonathan Lundell.

From saw@saw.sw.com.sg Tue, 20 Feb 2001 18:36:18 -0800
Date: Tue, 20 Feb 2001 18:36:18 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: Integrated Pro/100 on Intel D815EEA - serious problem

--dDRMvlgZJXvWKvBx
Content-Type: text/plain; charset=us-ascii

Hi,

On Tue, Feb 20, 2001 at 11:29:03PM +0100, Noll Janos wrote:
> I've tried to get informed about this problem. Using Deja.com-s search, I've
> found that many people are having problems with the Intel D815EEA motherboard
> (which has an integrated ethernet card).
>
> The system seems to run "fine" until it gets some (not even too large) net
> load...
[snip]
> [...]
> eepro100: wait_for_cmd_done timeout!

It's interesting to know which command caused this timeout.
I debugged some similar problems with the attached patch.

Best regards
Andrey

--dDRMvlgZJXvWKvBx
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename=diff

--- versions/eepro100.c.R1.33 Fri May 26 19:33:47 2000
+++ patches/eepro100.c-testwait Tue Jun 20 03:07:45 2000
@@ -363,19 +363,6 @@
 #define outl writel
 #endif
 
-/* How to wait for the command unit to accept a command.
-   Typically this takes 0 ticks. */
-static inline void wait_for_cmd_done(long cmd_ioaddr)
-{
-	int wait = 1000;
-	do ;
-	while(inb(cmd_ioaddr) && --wait >= 0);
-#ifndef final_version
-	if (wait < 0)
-		printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
-#endif
-}
-
 /* Offsets to the various registers.
    All accesses need not be longword aligned. */
 enum speedo_offsets {
@@ -532,6 +519,7 @@
 	unsigned int full_duplex:1;	/* Full-duplex operation requested. */
 	unsigned int flow_ctrl:1;	/* Use 802.3x flow control. */
 	unsigned int rx_bug:1;	/* Work around receiver hang errata. */
+	unsigned int dumpstat;	/* Last command was CUDumpStat. */
 	unsigned char default_port:8;	/* Last dev->if_port value. */
 	unsigned char rx_ring_state;	/* RX ring status flags. */
 	unsigned short phy[2];	/* PHY media interfaces available. */
@@ -1036,6 +1039,77 @@
 	return 0;
 }
 
+/* How to wait for the command unit to accept a command.
+   Typically this takes 0 ticks. */
+
+static int show_trace(int dummy)
+{
+	int i;
+	unsigned long *stack, addr, module_start, module_end;
+#define MODULE_RANGE (8*1024*1024)
+	printk("CPU: %d\nEIP: [<%08lx>]\n",
+		smp_processor_id(), (unsigned long)__builtin_return_address(0));
+	printk("Stack: ");
+	stack = (unsigned long *) &dummy;
+	for(i = 0; i < 24; i++) {
+		if (((long) stack & (THREAD_SIZE-1)) == 0)
+			break;
+		if (i && ((i % 8) == 0))
+			printk("\n ");
+		printk("%08lx ", *stack++);
+	}
+	printk("\nCall Trace: ");
+	stack = (unsigned long *) &dummy;
+	i = 1;
+	module_start = PAGE_OFFSET + (max_mapnr << PAGE_SHIFT);
+	module_start = ((module_start + VMALLOC_OFFSET) & ~(VMALLOC_OFFSET-1));
+	module_end = module_start + MODULE_RANGE;
+	while (((long) stack & (THREAD_SIZE-1)) != 0) {
+		extern char _stext, _etext;
+		addr = *stack++;
+		/*
+		 * If the address is either in the text segment of the
+		 * kernel, or in the region which contains vmalloc'ed
+		 * memory, it *may* be the address of a calling
+		 * routine; if so, print it so that someone tracing
+		 * down the cause of the crash will be able to figure
+		 * out the call path that was taken.
+		 */
+		if (((addr >= (unsigned long) &_stext) &&
+		     (addr <= (unsigned long) &_etext)) ||
+		    ((addr >= module_start) && (addr <= module_end))) {
+			if (i && ((i % 8) == 0))
+				printk("\n ");
+			printk("[<%08lx>] ", addr);
+			i++;
+		}
+	}
+	return 0;
+}
+
+#if 0
+static void wait_for_cmd_done(long cmd_ioaddr)
+#endif
+#define wait_for_cmd_done(cmd_ioaddr) \
+do \
+{ \
+	int wait = 1000; \
+	do ; \
+	while(inb(cmd_ioaddr) && --wait >= 0); \
+	if (wait < 0) { \
+		printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n"); \
+		/* DEBUG */ \
+		show_trace(0); \
+		printk(KERN_INFO \
+			"eepro100: w-f-c-d-t, cmd=%04x, stat=%04x, %d.\n", \
+			inw(cmd_ioaddr), inw(cmd_ioaddr + SCBStatus - SCBCmd), \
+			sp->dumpstat); \
+	} else { \
+		sp->dumpstat = 0; \
+	} \
+} \
+while (0)
+
 /* Start the chip hardware after a full reset. */
 static void speedo_resume(struct net_device *dev)
 {
@@ -1934,6 +1999,7 @@
 		spin_lock_irqsave(&sp->lock, flags);
 		wait_for_cmd_done(ioaddr + SCBCmd);
 		outb(CUDumpStats, ioaddr + SCBCmd);
+		sp->dumpstat = 2;
 		spin_unlock_irqrestore(&sp->lock, flags);
 	}
 }

--dDRMvlgZJXvWKvBx--

From saw@saw.sw.com.sg Tue, 20 Feb 2001 18:46:32 -0800
Date: Tue, 20 Feb 2001 18:46:32 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: various eepro100 drivers

On Tue, Feb 20, 2001 at 04:07:32PM -0800, Jonathan Lundell wrote:
> > I think the first answer you will get from the experts here, is:
> >Use the current version of the eepro driver from the server of Donald
> >Becker.
> >There is a version of this year !
>
> I'd be grateful if someone would either direct me to a FAQ or explain
> briefly the relationships among the various eepro100 drivers floating
> around. I don't mean simple revisions, but at least the version in the
> Linux release, the Becker driver, and the Intel driver. XyzBSD optional.

As you've stated, there are 3 major branches:
- Donald's one
- mainstream kernel driver (maintained by me)
- Intel's one

The second driver forked from the first one, Intel's driver is completely
independent.
In each driver authors fix problems they face.

Intel knows a lot about the hardware, but they do terrible things like
disabling interrupts for _seconds_. It doesn't compile for 2.4 kernels.
It is reported to be less stable on SMP machines. It is reported to have less
performance.

Donald knows and cares more than me about workarounds for different hardware
defects.

I care about clear interface to kernel (and remove all legacy stuff), I also
care about work under high load and, especially, memory pressure.

Also, drivers change with time, and checking changelogs (or reading patches)
is useful to understand the directions of development.
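
To put the remark about interrupts being disabled for _seconds_ in
perspective, here is a rough back-of-the-envelope sketch in userland C. It is
not driver code. It only estimates how quickly a receive ring would fill at
100 Mbit/s line rate if nothing serviced it. The ring sizes 32 and 256 are the
RX_RING_SIZE values mentioned earlier in this thread; the function names are
made up for illustration, and the model assumes back-to-back frames with the
standard Ethernet wire overhead (8-byte preamble plus 12-byte inter-frame gap).

```c
#include <stdio.h>

/* Per-frame wire overhead not counted in the frame size:
   8-byte preamble/SFD plus 12-byte minimum inter-frame gap. */
#define WIRE_OVERHEAD_BYTES 20

/* Frames per second when frames of frame_bytes (header through CRC)
   arrive back to back on a link of link_bits_per_s. */
double frames_per_second(double link_bits_per_s, int frame_bytes)
{
	return link_bits_per_s / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8.0);
}

/* Microseconds until ring_size empty receive buffers are used up
   if nothing refills the ring in the meantime. */
double ring_fill_time_us(int ring_size, double link_bits_per_s, int frame_bytes)
{
	return ring_size / frames_per_second(link_bits_per_s, frame_bytes) * 1e6;
}

/* Print the fill time for both ring sizes at the two frame-size extremes
   (hypothetical helper, only for eyeballing the numbers). */
void print_ring_fill_table(void)
{
	static const int ring_sizes[] = { 32, 256 };
	static const int frame_sizes[] = { 64, 1518 };
	int r, f;

	for (r = 0; r < 2; r++)
		for (f = 0; f < 2; f++)
			printf("ring %3d, %4d-byte frames: %.0f us\n",
			       ring_sizes[r], frame_sizes[f],
			       ring_fill_time_us(ring_sizes[r], 100e6, frame_sizes[f]));
}
```

Under these assumptions a 32-entry ring lasts roughly 215 microseconds with
64-byte frames and about 3.9 ms with 1518-byte frames; a 256-entry ring lasts
about 1.7 ms and 31.5 ms respectively. Real traffic rarely runs at line rate,
so these are worst-case figures, but they suggest why long interrupt blackouts
and small rings do not mix.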
Best regards
Andrey

From johnzero-eepro100@johnzero.hu Wed, 21 Feb 2001 21:49:12 +0100 (CET)
Date: Wed, 21 Feb 2001 21:49:12 +0100 (CET)
From: Noll Janos johnzero-eepro100@johnzero.hu
Subject: [eepro100] Re: Integrated Pro/100 on Intel D815EEA - serious problem

Hi!

On 21-Feb-2001 Andrey Savochkin wrote:
>> [...]
>> eepro100: wait_for_cmd_done timeout!
>
> It's interesting to know which command caused this timeout.
> I debugged some similar problems with the attached patch.

After the patch, it says:
*** Unresolved symbols in /lib/modules/2.4.1/kernel/drivers/net/eepro100.o
The symbols were _stext and _etext

Maybe I patched the eepro100.c too much (I tried many patches, to solve the
problem.)

Anyway, currently I can't do more testing, as we placed another network card
in the machine, and put it to production.

What I can say, is that we used the machine in two setups
- when it was connected to a 100 Mbps hub - we had no problem
- when it was connected to a (Cisco) switch - timeout errors

Don't know if this helps...

If you need people to do testing with your "debug" patch, just search on
"eepro100 wait_for_cmd_done timeout" on Deja.com, you'll find plenty.

Noll Janos

From cerrito@centromultimediale.it Thu, 22 Feb 2001 13:57:36 +0100
Date: Thu, 22 Feb 2001 13:57:36 +0100
From: Andrea Cerrito cerrito@centromultimediale.it
Subject: [eepro100] Too many nics

Hi to all.
I'm using Intel Pro/100S Desktop and I switched from eepro100 1.09j to
eepro100 1.13 right now, because of an old version bug (cmd_wait
for(0xffffff90) timedout with(0xffffff90)!).
Recompiled the kernel, booted it up and *SURPRISE*! My NICs have grown from
three to nine... :)
Here is the boot kernel msg:

=========
eth1: Intel PCI EtherExpress Pro100 at 0xfc802000, 00:02:B3:27:B4:83, IRQ 9.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eth2: Intel PCI EtherExpress Pro100 at 0xfc804000, 00:02:B3:27:B3:E9, IRQ 10.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eth3: Intel PCI EtherExpress Pro100 at 0xfc806000, 00:02:B3:27:9C:BB, IRQ 11.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eepro100.c:v1.13 1/9/2001 Donald Becker
http://www.scyld.com/network/eepro100.html
eth4: Intel PCI EtherExpress Pro100 at 0xfc808000, 00:02:B3:27:B4:83, IRQ 9.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eth5: Intel PCI EtherExpress Pro100 at 0xfc80a000, 00:02:B3:27:B3:E9, IRQ 10.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eth6: Intel PCI EtherExpress Pro100 at 0xfc80c000, 00:02:B3:27:9C:BB, IRQ 11.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eepro100.c:v1.13 1/9/2001 Donald Becker
http://www.scyld.com/network/eepro100.html
eth7: Intel PCI EtherExpress Pro100 at 0xfc80e000, 00:02:B3:27:B4:83, IRQ 9.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
early initialization of device eth8 is deferred
eth8: Intel PCI EtherExpress Pro100 at 0xfc810000, 00:02:B3:27:B3:E9, IRQ 10.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
early initialization of device eth9 is deferred
eth9: Intel PCI EtherExpress Pro100 at 0xfc812000, 00:02:B3:27:9C:BB, IRQ 11.
Receiver lock-up bug exists -- enabling work-around.
Board assembly 751767-003, Physical connectors present: RJ45
Primary interface chip i82555 PHY #1.
Secondary interface chip i82555.
General self-test: passed.
Serial sub-system self-test: passed.
Internal registers self-test: passed.
ROM checksum self-test: passed (0x3258698e).
eepro100.c:v1.13 1/9/2001 Donald Becker
http://www.scyld.com/network/eepro100.html
================

As you can see, it appears as if eepro100.c was called three times instead of
just once, and the third time it reports an (?) error: "early
initialization of device eth9 is deferred".
While writing this mail, I'm recompiling kernel 2.2.18 (make mrproper, make
menuconfig, make dep, make clean, make bzImage), to be sure there were no
problems during the last "make". Done, same again.

To compile the new driver, I overwrote the old one, copied pci-scan.c/h and
kern_compat.h into drivers/net and modified the makefile as described in
"Building updated drivers into the kernel"
(http://www.scyld.com/network/updates.html): I think it is all correct,
because I get no errors during make bzImage.

Can someone help?

---
Cordiali saluti / Best regards
Andrea Cerrito
^^^^^^^^^^^^^^
admin @ Centro MultiMediale di Terni SpA
P.zzale Bosco 1A
05100 Terni IT
Tel. 0744 / 5441330
Fax. 0744 / 5441372

From dave@zuka.net Sat, 24 Feb 2001 12:40:57 -0500
Date: Sat, 24 Feb 2001 12:40:57 -0500
From: Dave Filchak dave@zuka.net
Subject: [eepro100] EEPro100 on Intel D815AAEEL Motherboard

I am trying to complete an installation on the following machine: PIII 800 /
256 MB Ram / 15 gig ATA100 HD using Intel D815AAEEL motherboard with built
in IntelPro100 Ethernet card. The machine also has a SCSI subsystem using
an ATTO EXPRESS Ultra SCSI card. All works well except I am not able to get
RH7 to recognize the Ethernet card using the eepro100 module. I read that
this MB has had problems with this and can be fixed by either updating the
eepro100 module or 'repatching' it. Another possibility is an IRQ conflict.
Can anyone explain to me how to update and/or re-patch the eepro100 module
and/or check and reassign the IRQ under RH7?

Thanks in advance for all your help.

Dave

David Filchak - President
Zuka Interactive Services Inc.
119 Spadina Ave., Level 5
Toronto, ON, M5V 2L1
PH:416.591.0882•FX:416.591.0828
http://www.zuka.net

From walke@usna.edu Sat, 24 Feb 2001 20:05:38 -0500
Date: Sat, 24 Feb 2001 20:05:38 -0500
From: Vann H. Walke walke@usna.edu
Subject: [eepro100] EEPro100 on Intel D815AAEEL Motherboard

Dave,

Hopefully someone from the list will give you a more complete answer
shortly - Here is my limited knowledge of the situation:

I just set up RH7 on a Micron ClientPro with the D815EEA motherboard. Like
you, I couldn't get RH7 to recognize the network card. I downloaded and
installed the latest netdriver package (SRPM) maintained by Donald Becker:

http://www.scyld.com/network/updates.html

This got everything working, but periodically the system would hang hard
(reboot required). This seems to be the same problem others are having (see
recent messages on the list). As I needed the computer for work, I just
yanked a $20 linksys card out of one of my old computers. Since then I've
had no problems.

Supposedly Intel has a driver but I couldn't find it on the web site. I
trust the open source and kernel developers a bit more to make good
drivers anyway.

Good Luck,
Vann

LT V. H. Walke
USNA Computer Science

From drewes@interstice.com Sat, 24 Feb 2001 17:30:02 -0800
Date: Sat, 24 Feb 2001 17:30:02 -0800
From: Rich Drewes drewes@interstice.com
Subject: [eepro100] EEPro100 on Intel D815AAEEL Motherboard

"Vann H. Walke" wrote:
> Supposedly Intel has a driver but I couldn't find it on the web site. I
> trust the open source and kernel developers a bit more to make good
> drivers anyway.

The e100.o driver you are referring to *is included* in the RH 7.0
release. I have been using it successfully for some time on the Intel
815EEAL and Intel 810EEAL mainboards. I believe it's even accessible from
the RH 7.0 network boot floppy, so network installs are possible (though
you will have to specify it explicitly, since the install program's
autodetect logic will select the eepro100 driver and not the e100 driver).

BTW, the Intel driver is also open source, but it isn't supported by either
of the other Linux driver maintainers (the Becker fork or the kernel source
fork drivers).
It's sort of supported by Intel.

Rich

--
"Atheists do look for answers to existence itself. They just don't make
them up." --Teller

From walke@usna.edu Sat, 24 Feb 2001 21:44:45 -0500
Date: Sat, 24 Feb 2001 21:44:45 -0500
From: Vann H. Walke walke@usna.edu
Subject: [eepro100] EEPro100 on Intel D815AAEEL Motherboard

Rich Drewes wrote
>
>> Supposedly Intel has a driver but I couldn't find it on the web site. I
>> trust the open source and kernel developers a bit more to make good
>> drivers anyway.
>
>
> The e100.o driver you are referring to *is included* in the RH 7.0
> release. I have been using it successfully for some time on the Intel
> 815EEAL and Intel 810EEAL mainboards. I believe it's even accessible from
> the RH 7.0 network boot floppy, so network installs are possible (though
> you will have to specify it explicitly, since the install program's
> autodetect logic will select the eepro100 driver and not the e100 driver).

Ok, it may be included, but there was some detection problem with at least
some motherboards. The RH7 packaged drivers couldn't find the network
device. I saw some messages and patches for such a problem dated a few
months ago (OCT/NOV?). Rather than try and patch the driver, I just
installed the latest netdriver SRPM, which got it working. Then I got the
fail on high load problem.

>
> BTW, the Intel driver is also open source, but it isn't supported by either
> of the other Linux driver maintainers (the Becker fork or the kernel source
> fork drivers). It's sort of supported by Intel.

Thanks for the correction. I don't want to come across as slighting the
Intel people - I just feel more confident about the integration (and long
term maintenance) of the other drivers.
Vann

From saw@saw.sw.com.sg Sat, 24 Feb 2001 19:22:24 -0800
Date: Sat, 24 Feb 2001 19:22:24 -0800
From: Andrey Savochkin saw@saw.sw.com.sg
Subject: [eepro100] Re: EEPro100 on Intel D815AAEEL Motherboard

On Sat, Feb 24, 2001 at 12:40:57PM -0500, Dave Filchak wrote:
> I am trying to complete an installation on the following machine: PIII 800 /
> 256 MB Ram / 15 gig ATA100 HD using Intel D815AAEEL motherboard with built
> in IntelPro100 Ethernet card. The machine also has a SCSI subsystem using
> an ATTO EXPRESS Ultra SCSI card. All works well except I am not able to get
> RH7 to recognize the Ethernet card using the eepro100 module. I read that
> this MB has had problems with this and can be fixed by either updating the
> eepro100 module or 'repatching' it. Another possibility is an IRQ conflict.
> Can anyone explain to me how to update and/or re-patch the eepro100 module
> and/or check and reassign the IRQ under RH7?

Please, send `lspci -n' output, I'll check why your card isn't recognized.

Best regards
Andrey V. Savochkin

From dsalas@vt.edu Mon, 26 Feb 2001 14:54:05 -0500
Date: Mon, 26 Feb 2001 14:54:05 -0500
From: dsalas dsalas@vt.edu
Subject: [eepro100] eepro100 driver problems

I am having problems with the Intel EtherExpressPro100B card and was
wondering if you could point me in the right direction. I'm running Red Hat
Linux 7.0, kernel 2.2.17-14enterprise on a Dell PowerEdge 2400. The card
works fine in 10baseT mode, but when I switch my connection to
100baseTx-FD I'm not getting a network connection.
I've tried to download Donald's eepro100 driver, but I'm getting numerous
errors while compiling:

# gcc -DMODULE -D__KERNEL__ -O6 -c eepro100.c
eepro100.c: In function `speedo_open':
eepro100.c:847: structure has no member named `tbusy'
eepro100.c:848: structure has no member named `interrupt'
eepro100.c:849: structure has no member named `start'
eepro100.c: In function `speedo_start_xmit':
eepro100.c:1159: structure has no member named `tbusy'
eepro100.c:1208: structure has no member named `tbusy'
eepro100.c: In function `speedo_interrupt':
eepro100.c:1237: structure has no member named `interrupt'
eepro100.c:1323: structure has no member named `tbusy'
eepro100.c:1325: `NET_BH' undeclared (first use in the function)
eepro100.c:1325: (Each undeclared identifier is repoted only once
eepro100.c:1325: for each function it appears in.)
eepro100.c:1343: structure has no member named `interrupt'
eepro100.c: In function `speedo_close':
eepro100.c:1466: structure has no member named `start'
eepro100.c:1467: structure has no member named `tbusy'
eepro100.c: In function `speedo_get_stats':
eepro100.c:1548: structure has no member named `start'

If I try to ping an IP while plugged in to the 10baseT it works, but when I
switch to the 100baseT line I only get Destination Host Unreachable.

I have also tried to use Andrey Savochkin's driver, but his gives me:

eepro100.o was compiled for kernel version 2.4.0-0.26
while this kernel is version 2.2.17-14.enterprise

Any help you can give me would be great.

Thanks in advance,

Damian Salas
Virginia Tech Athletics
dsalas@vt.edu

From ml@centromultimediale.it Mon, 26 Feb 2001 21:57:26 +0100
Date: Mon, 26 Feb 2001 21:57:26 +0100
From: Andrea Cerrito ml@centromultimediale.it
Subject: Re: [eepro100] eepro100 driver problems

Have a look here: http://www.scyld.com/network/updates.html, "Special
instructions for Red Hat 7.0".
Good luck

---
Cordiali saluti / Best regards
Andrea Cerrito
^^^^^^^^^^^^^^
Net.Admin @ Centro MultiMediale di Terni SpA
P.zzale Bosco 3A
05100 Terni IT
Tel. 0744 / 5441330
Fax. 0744 / 5441372

-----Original Message-----
From: eepro100-admin@scyld.com [mailto:eepro100-admin@scyld.com] On Behalf Of dsalas
Sent: Monday, 26 February 2001 20:54
To: eepro100@scyld.com
Subject: [eepro100] eepro100 driver problems

I am having problems with the Intel EtherExpressPro100B card and was
wondering if you could point me in the right direction. I'm running Red Hat
Linux 7.0, kernel 2.2.17-14enterprise on a Dell PowerEdge 2400. The card
works fine in 10baseT mode, but when I switch my connection to
100baseTx-FD I'm not getting a network connection. I've tried to download
Donald's eepro100 driver, but I'm getting numerous errors while compiling:

# gcc -DMODULE -D__KERNEL__ -O6 -c eepro100.c
eepro100.c: In function `speedo_open':
eepro100.c:847: structure has no member named `tbusy'
eepro100.c:848: structure has no member named `interrupt'
eepro100.c:849: structure has no member named `start'
eepro100.c: In function `speedo_start_xmit':
eepro100.c:1159: structure has no member named `tbusy'
eepro100.c:1208: structure has no member named `tbusy'
eepro100.c: In function `speedo_interrupt':
eepro100.c:1237: structure has no member named `interrupt'
eepro100.c:1323: structure has no member named `tbusy'
eepro100.c:1325: `NET_BH' undeclared (first use in the function)
eepro100.c:1325: (Each undeclared identifier is repoted only once
eepro100.c:1325: for each function it appears in.)
eepro100.c:1343: structure has no member named `interrupt'
eepro100.c: In function `speedo_close':
eepro100.c:1466: structure has no member named `start'
eepro100.c:1467: structure has no member named `tbusy'
eepro100.c: In function `speedo_get_stats':
eepro100.c:1548: structure has no member named `start'

If I try to ping an IP while plugged in to the 10baseT line it works, but when I switch to the 100baseT line I only get Destination Host Unreachable.

I have also tried to use Andrey Savochkin's driver, but his gives me:

eepro100.o was compiled for kernel version 2.4.0-0.26
while this kernel is version 2.2.17-14.enterprise

Any help you can give me would be great.

Thanks in advance,
Damian Salas
Virginia Tech Athletics
dsalas@vt.edu

_______________________________________________
eepro100 mailing list
eepro100@scyld.com
http://www.scyld.com/mailman/listinfo/eepro100

From arthur@cal040041.student.utwente.nl Mon, 26 Feb 2001 22:56:46 +0100 (CET)
Date: Mon, 26 Feb 2001 22:56:46 +0100 (CET)
From: Arthur Rinkel arthur@cal040041.student.utwente.nl
Subject: [eepro100] Performance prob. (again)

Hi,

I'm having some disappointing results with an eepro100 (with i82557) and although I've tried several things to fix it, nothing helped so far...

The (Rx) speed doesn't seem to exceed 2MB/s or so; the NIC is installed in a Pentium system running 133MHz. A TCP/IP stack "test" with a prg called ttcp resulted in a (top) speed of 5,8MB/s, so I'm guessing the CPU is capable of doing 4-5MB/s of network throughput. Or is this incorrect?

Tried different UTP cable...no change.
Tried Intel driver...no change.
Played with module parms (Donald's driver)...small improvement.
Forced NIC 100Mb-only...connecting switch sees 100Mb-FD.
Auto-neg. NIC...connecting switch sees 100Mb-FD, good, but still 2MB/s.
'ifconfig' reports...RX packets:214660 errors:21 dropped:0 overruns:0 frame:504.

Any other thing there is to test?
I have some debugging info too, but that's about 80 lines, so I'll only post it if it's necessary. Below, however, is the output of 'eepro100-diag -aaeemf':

eepro100-diag.c:v2.02 7/19/2000 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557 (or i82558) EtherExpressPro100B adapter at 0xe400.
i82557 chip registers at 0xe400:
  00000050 021af010 00000000 00080002 182541e1 00000600
  No interrupt sources are pending.
   The transmit unit state is 'Suspended'.
   The receive unit state is 'Ready'.
  This status is normal for an activated but idle interface.
EEPROM contents, size 64x16:
    00: a000 a6c9 81a8 0000 0000 0101 4701 0000
  0x08: 6784 0001 4000 0001 8086 0000 0000 0000
      ...
  0x38: 0000 0000 0000 0000 0000 0000 0000 823b
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:A0:C9:A6:A8:81.
  Receiver lock-up bug exists. (The driver work-around *is* implemented.)
  Board assembly 678400-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
 MII PHY #1 transceiver registers:
  3000 782d 02a8 0150 01e1 41e1 0001 ffff
  ffff ffff ffff ffff ffff ffff ffff ffff
  0a03 0000 0001 0000 0000 0000 0000 0000
  0000 0000 16f6 0000 ffff ffff ffff ffff.

Any ideas?

Grtz,
Arthur

From subscriptions@graphon.com Mon, 26 Feb 2001 14:15:29 -0800
Date: Mon, 26 Feb 2001 14:15:29 -0800
From: Nate Amsden subscriptions@graphon.com
Subject: [eepro100] Performance prob. (again)

Arthur Rinkel wrote:
>
> Hi,
>
> I'm having some disappointing results with an eepro100 (with i82557) and
> although I've tried several things to fix it, nothing helped so far...
>
> The (Rx) speed doesn't seem to exceed 2MB/s or so; the NIC is installed in
> a Pentium system running 133MHz. A TCP/IP stack "test" with a prg called
> ttcp resulted in a (top) speed of 5,8MB/s, so I'm guessing the CPU is
> capable of doing 4-5MB/s of network throughput. Or is this incorrect?

what are you doing that shows only 2MB/s?
if it's doing a file transfer i'd bet it's the HDD that is the bottleneck. what are the rest of the specs on that system? ram? hd? etc..

try running bonnie on the system or something else to see how fast your disk can write (bonnie is a hdd benchmark). 2MB/s i would consider fast for an IDE disk (during writes; reads are faster)

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
          100  2084 78.5 11371 37.5  7752 57.3  2402 87.4 83453 91.3 7557.8 98.3

that's what i get on one of my systems. P3-800 512MB ECC ram, and that particular drive is a 10.1GB IBM 7200RPM IDE drive. no special hdparm tweaks or anything.

nate

--
Nate Amsden
System Administrator
GraphOn
http://www.graphon.com

From becker@scyld.com Mon, 26 Feb 2001 19:52:32 -0500 (EST)
Date: Mon, 26 Feb 2001 19:52:32 -0500 (EST)
From: Donald Becker becker@scyld.com
Subject: [eepro100] Performance prob. (again)

On Mon, 26 Feb 2001, Arthur Rinkel wrote:

> 'ifconfig' reports...RX packets:214660 errors:21 dropped:0 overruns:0
> frame:504.

Very bad -- if your duplex has been negotiated correctly, frame errors usually indicate bad cables. Usually mis-paired cables. (My favorite is "they can't be mispaired, I made them myself". The pairing only makes sense if you think like a phone guy.)

> MII PHY #1 transceiver registers:
> 3000 782d 02a8 0150 01e1 41e1 0001 ffff

Looks pretty much OK to me.

> ffff ffff ffff ffff ffff ffff ffff ffff
> 0a03 0000 0001 0000 0000 0000 0000 0000
                      ^^^^^^^^^^^^^^^^^^^
> 0000 0000 16f6 0000 ffff ffff ffff ffff.

Hmmm, there are no symbol errors reported in registers 20..23. I would expect symbol errors to result in non-zero counts.

Donald Becker becker@scyld.com
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave.
Suite 210 Second Generation Beowulf Clusters
Annapolis MD 21403 410-990-9993

From basil.hussain@specialreserve.net Tue, 27 Feb 2001 09:20:39 +0000
Date: Tue, 27 Feb 2001 09:20:39 +0000
From: Basil Hussain basil.hussain@specialreserve.net
Subject: [eepro100] Performance prob. (again)

Hi,

> Very bad -- if your duplex has been negotiated correctly, frame errors
> usually indicate bad cables. Usually mis-paired cables. (My favorite is
> "they can't be mispaired, I made them myself". The pairing only makes
> sense if you think like a phone guy.)

Yes, it's surprising the number of mis-made DIY cables you see. A colleague of mine specialises in making the worst UTP cables you could imagine. :) Stranger still, they sometimes work!

If anyone needs a good reference to UTP cabling standards (in particular, pairing patterns), here is some material:

http://www.lanshack.com/highlights/makepatch.htm
http://www.lanshack.com/highlights/cat5notes.htm
http://www.duxcw.com/digest/Howto/network/cable/index.htm
http://directory.google.com/Top/Computers/Hardware/Cables/Category_5_Information/

Alternatively, just enter "EIA/TIA 568A 568B" into any good search engine.

Regards,

------------------------------------------------
Basil Hussain (basil.hussain@specialreserve.net)

From arthur@cal040041.student.utwente.nl Tue, 27 Feb 2001 21:52:26 +0100 (CET)
Date: Tue, 27 Feb 2001 21:52:26 +0100 (CET)
From: Arthur Rinkel arthur@cal040041.student.utwente.nl
Subject: [eepro100] Performance prob. (again)

On Mon, 26 Feb 2001, Nate Amsden wrote:

> > The (Rx) speed doesn't seem to exceed 2MB/s or so; the NIC is installed in
> > a Pentium system running 133MHz. A TCP/IP stack "test" with a prg called
> > ttcp resulted in a (top) speed of 5,8MB/s, so I'm guessing the CPU is
> > capable of doing 4-5MB/s of network throughput. Or is this incorrect?
>
> what are you doing that shows only 2MB/s? if it's doing a file transfer
> i'd bet it's the HDD that is the bottleneck.
> what are the rest of the specs
> on that system? ram? hd? etc..

Asus P/I-P55TP4XE with 256kB cache, 40MB RAM. During testing the system is in runlevel 3 (Linux 2.2.13) and I'm downloading a 23MB file from a (very) near site. Tried big files from other near sites too.

> try running bonnie on the system or something else to see how fast
> your disk can write(bonnie is a hdd benchmark). 2MB/s i would consider

2x Seagate Barracuda 4GB (SCSI-UWD) in RAID0 (chunk size = 4kB, block size = 4kB), tested in Single User mode with Bonnie++ 1.0.

     ------Sequential Output------ --Sequential Input- --Random-
     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
  MB K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
 100  1946  98 13227  87  5010  72  2103  99 16053  90  39.8   2

But I don't think the disks are the bottleneck...maybe the small amount of RAM?

Grtz,
Arthur

From james@fsck.co.uk Wed, 28 Feb 2001 18:13:24 +0000 (GMT)
Date: Wed, 28 Feb 2001 18:13:24 +0000 (GMT)
From: A James Lewis james@fsck.co.uk
Subject: [eepro100] EEpro100 interfaces identified 3 times during boot???

My ethernet adapters are recognised 3 times each during boot... is this normal? Can anyone suggest why this would happen?

eth0: OEM i82557/i82558 10/100 Ethernet at 0xd3804000, 00:90:27:4C:6F:65, IRQ 17.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eth1: OEM i82557/i82558 10/100 Ethernet at 0xd3806000, 00:90:27:4C:6F:61, IRQ 18.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eth2: OEM i82557/i82558 10/100 Ethernet at 0xd3808000, 00:90:27:50:D2:83, IRQ 19.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eepro100.c:v1.13 1/9/2001 Donald Becker
  http://www.scyld.com/network/eepro100.html
eth3: OEM i82557/i82558 10/100 Ethernet at 0xd380a000, 00:90:27:4C:6F:65, IRQ 17.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eth4: OEM i82557/i82558 10/100 Ethernet at 0xd380c000, 00:90:27:4C:6F:61, IRQ 18.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eth5: OEM i82557/i82558 10/100 Ethernet at 0xd380e000, 00:90:27:50:D2:83, IRQ 19.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eepro100.c:v1.13 1/9/2001 Donald Becker
  http://www.scyld.com/network/eepro100.html
eth6: OEM i82557/i82558 10/100 Ethernet at 0xd3810000, 00:90:27:4C:6F:65, IRQ 17.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eth7: OEM i82557/i82558 10/100 Ethernet at 0xd3812000, 00:90:27:4C:6F:61, IRQ 18.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
early initialization of device eth8 is deferred
eth8: OEM i82557/i82558 10/100 Ethernet at 0xd3814000, 00:90:27:50:D2:83, IRQ 19.
  Board assembly 701637-001, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
  General self-test: passed.
  Serial sub-system self-test: passed.
  Internal registers self-test: passed.
  ROM checksum self-test: passed (0x24c9f043).
  Receiver lock-up workaround activated.
eepro100.c:v1.13 1/9/2001 Donald Becker
  http://www.scyld.com/network/eepro100.html

A. James Lewis (james@fsck.co.uk)
Redmond.... The Penguin has landed.
That's one small step for man.....
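
Archive note on the boot log above: one way to make the repetition visible is to group the driver probe lines by MAC address. What follows is a minimal Python sketch and not part of the original thread. The function name `group_by_mac`, the regular expression, and the `SAMPLE` text (the nine `ethN:` lines copied from the log) are assumptions made for illustration. It only shows which interface names share a MAC address; it does not diagnose why the probe ran more than once.

```python
import re
from collections import defaultdict

# Matches driver probe lines of the form seen in the log above, e.g.
#   eth0: OEM i82557/i82558 10/100 Ethernet at 0xd3804000, 00:90:27:4C:6F:65, IRQ 17.
# The pattern is an assumption based on those lines only.
PROBE_RE = re.compile(
    r"^\s*(eth\d+): .* at (0x[0-9a-fA-F]+), "
    r"((?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}), IRQ (\d+)\."
)


def group_by_mac(log_text):
    """Return {MAC: [(interface, address, irq), ...]} for every probe line."""
    groups = defaultdict(list)
    for line in log_text.splitlines():
        match = PROBE_RE.match(line)
        if match:
            iface, addr, mac, irq = match.groups()
            groups[mac.upper()].append((iface, addr, int(irq)))
    return dict(groups)


# The nine probe lines from the posted boot log (detail lines omitted).
SAMPLE = """\
eth0: OEM i82557/i82558 10/100 Ethernet at 0xd3804000, 00:90:27:4C:6F:65, IRQ 17.
eth1: OEM i82557/i82558 10/100 Ethernet at 0xd3806000, 00:90:27:4C:6F:61, IRQ 18.
eth2: OEM i82557/i82558 10/100 Ethernet at 0xd3808000, 00:90:27:50:D2:83, IRQ 19.
eth3: OEM i82557/i82558 10/100 Ethernet at 0xd380a000, 00:90:27:4C:6F:65, IRQ 17.
eth4: OEM i82557/i82558 10/100 Ethernet at 0xd380c000, 00:90:27:4C:6F:61, IRQ 18.
eth5: OEM i82557/i82558 10/100 Ethernet at 0xd380e000, 00:90:27:50:D2:83, IRQ 19.
eth6: OEM i82557/i82558 10/100 Ethernet at 0xd3810000, 00:90:27:4C:6F:65, IRQ 17.
eth7: OEM i82557/i82558 10/100 Ethernet at 0xd3812000, 00:90:27:4C:6F:61, IRQ 18.
eth8: OEM i82557/i82558 10/100 Ethernet at 0xd3814000, 00:90:27:50:D2:83, IRQ 19.
"""

if __name__ == "__main__":
    for mac, entries in group_by_mac(SAMPLE).items():
        names = ", ".join(iface for iface, _, _ in entries)
        print(f"{mac}: registered {len(entries)} time(s) as {names}")
```

On the posted log this groups eth0/eth3/eth6, eth1/eth4/eth7 and eth2/eth5/eth8 under three MAC addresses, each on the same IRQ but at a different mapped address, which matches the three appearances of the eepro100.c version banner.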