[Beowulf] Intel 82574L problems with newer kernels?

Thu Dec 13 02:08:15 PST 2012

Hi all,

I have a cluster with Supermicro X8DTT-H motherboards, which have the Intel 
Corporation 82574L Gigabit Network Connection, running Debian Linux (Squeeze) 
with kernel 2.6.32-5-amd64.

The only time I had problems with the network card was when it stopped working 
altogether. Upgrading the BIOS to the latest available for that motherboard 
has cured that problem. I am not using any Intel drivers here but the native 
drivers for the card (i.e. the ones in the kernel). 
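
A minimal sketch of how one might confirm which driver and version a port is 
actually bound to: `ethtool -i <interface>` prints "key: value" lines, and the 
Python below parses that text into a dictionary. The interface name eth0, the 
function names, and the sample output (including the version string) are 
illustrative assumptions, not values taken from the cluster described here.

```python
import subprocess


def parse_ethtool_info(text):
    """Parse the 'key: value' lines printed by `ethtool -i` into a dict."""
    info = {}
    for line in text.splitlines():
        # Split at the first colon only: values such as bus-info contain colons.
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that have no colon at all
            info[key.strip()] = value.strip()
    return info


def driver_info(interface="eth0"):
    """Run `ethtool -i` for one interface (requires ethtool to be installed)."""
    result = subprocess.run(
        ["ethtool", "-i", interface],
        capture_output=True, text=True, check=True,
    )
    return parse_ethtool_info(result.stdout)


# Hypothetical sample output, for illustration only.
SAMPLE = """driver: e1000e
version: 1.2.3-k
firmware-version: 1.8-0
bus-info: 0000:02:00.0
"""

info = parse_ethtool_info(SAMPLE)
print(info["driver"], info["version"])
```

On a live node, driver_info("eth0") would return the same kind of dictionary 
for the real interface, which makes it easy to compare nodes before and after 
a driver or BIOS change.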

I agree with upgrading to the latest kernels to get the most out of the 
Sandy Bridge CPUs, and as I have two of these boxes waiting to be installed, 
I am curious about your problem as well.

Have you tried upgrading the BIOS, if possible? Does that solve the problem?

The sourceforge link Alex posted seems to point in the same direction: a 
change to the card's firmware. 

On a different note: do you build your own kernel or do you use the 
distribution provided one?

All the best from a sunny but frosty London

Jörg

On Wednesday 12 December 2012 17:44:27 Alex Chekholko wrote:
> Hi,
> 
> I had an issue with 82574L before, and the Intel guys had to buy the same
> hardware for their lab in order to find and fix the issue.  It was a new
> SuperMicro motherboard:
> http://comments.gmane.org/gmane.linux.drivers.e1000.devel/6734
> 
> Are you using the latest driver from Intel?  I see you already link to the
> sourceforge fix:
> http://sourceforge.net/projects/e1000/files/e1000e%20stable/eeprom_fix_82574_or_82583/
> 
> I would follow up with the Intel folks, get them to test with the new
> kernel.
> 
> Regards,
> Alex
> 
> On Tue, Dec 11, 2012 at 6:21 PM, Bill Broadley <bill at cse.ucdavis.edu> wrote:
> > Anyone have some working tweaks to get an Intel E1000e driver + 82574L
> > chip to behave with linux 3.5 or 3.7 kernels?  Not sure if this is a
> > problem for all 82574Ls or just ones on recent Supermicro motherboards.
> > 
> > I noticed stuttering, occasional high latencies, and a continuously
> > increasing dropped packets count from ifconfig:
> >   RX packets:13437889 errors:0 dropped:14185 overruns:0 frame:0
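
To put that counter in proportion, here is a minimal Python sketch that parses 
the old-style ifconfig "RX packets:" line quoted above and computes dropped 
packets relative to the RX packets counter. The function name is made up for 
illustration, and the regular expression assumes this exact "name:value" 
layout; newer ifconfig versions format the line differently.

```python
import re


def rx_drop_ratio(rx_line):
    """Return dropped / packets for an old-style ifconfig 'RX packets:' line."""
    # Collect every name:number pair, e.g. packets, errors, dropped.
    fields = {name: int(value) for name, value in re.findall(r"(\w+):(\d+)", rx_line)}
    packets = fields["packets"]
    return fields["dropped"] / packets if packets else 0.0


line = "RX packets:13437889 errors:0 dropped:14185 overruns:0 frame:0"
ratio = rx_drop_ratio(line)
print(f"{ratio:.4%}")  # roughly 0.1 percent of the RX packets counter
```

A single snapshot only gives the cumulative ratio. To see whether the counter 
is still climbing, the same line would have to be sampled twice and the 
difference taken.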
> > 
> > Even something simple like ping -c 100 would show at least one packet
> > with over 1 second latencies.
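
A rough sketch of how such outliers could be picked out of saved ping output 
automatically. The function name, the 1000 ms default threshold, and the 
sample lines are assumptions for illustration; the pattern accepts both the 
icmp_seq= and icmp_req= spellings that different iputils versions print.

```python
import re

# Matches e.g. "icmp_seq=2 ttl=64 time=1204 ms" (or icmp_req= on older iputils).
REPLY = re.compile(r"icmp_(?:seq|req)=(\d+).*?time=([\d.]+) ms")


def slow_replies(ping_output, threshold_ms=1000.0):
    """Return (sequence number, time in ms) for replies slower than the threshold."""
    slow = []
    for line in ping_output.splitlines():
        match = REPLY.search(line)
        if match and float(match.group(2)) > threshold_ms:
            slow.append((int(match.group(1)), float(match.group(2))))
    return slow


# Hypothetical ping output, for illustration only.
SAMPLE = """64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.181 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=1204 ms
64 bytes from 10.0.0.2: icmp_seq=3 ttl=64 time=0.175 ms
"""

slow = slow_replies(SAMPLE)
print(slow)
```

Replies that never arrive produce no line at all, so lost packets would have 
to be read from ping's summary rather than from this function.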
> > 
> > Several discussions mention that some of the errors are not logged, so
> > it may well be significantly worse than you'd think from the dropped
> > packet count.
> > 
> > Replacing the cables, switch, or even the entire node doesn't seem to
> > make any difference.  Googling "linux 82574L dropped" finds quite a few
> > discussions about it.  Most that I found that provide details mention
> > Supermicro motherboards.
> > 
> > There seems to be a solution for CentOS 6, but I'm having problems
> > getting that fix to work with newer kernels.
> > 
> > Some of the discussions:
> > 
> > http://www.linuxquestions.org/questions/linux-hardware-18/intel-82574l-gigabit-network-card-issues-and-resolution-831364/
> > 
> > http://www.doxer.org/learn-linux/resolved-intel-e1000e-driver-bug-on-82574l-ethernet-controller-causing-network-blipping/
> > 
> > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1018561
> > 
> > http://sourceforge.net/projects/e1000/files/e1000e%20stable/eeprom_fix_82574_or_82583/
> > 
> > This is mostly a concern for me because it's a noticeable performance
> > problem and Supermicro-based resellers seem to be winning the cluster
> > bids recently.  With newer Sandy Bridge, Ivy Bridge, Bulldozer, and
> > Piledriver CPUs it seems worthwhile to run a relatively new kernel.
> > 
> > Has anyone been successful with getting the 82574L to work as expected?
> > With a Supermicro motherboard?
> > 
> > I've tried all the discussed fixes, including but not limited to updating
> > the driver, upgrading from a 3.5 kernel to a 3.7 kernel,
> > turning pcie_aspm off, various e1000e.IntMode settings,
> > e1000e.InterruptThrottleRate, acpi=off, disabling various features with
> > ethtool, and patching the e1000e firmware.
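
With this many tweaks in play it is easy to lose track of which ones a given 
boot actually used. A minimal Python sketch, assuming the options are passed 
on the kernel command line: it splits a command line into options and reports 
which of the ones named above are present. The option names are spelled as 
the kernel and the e1000e driver spell them (acpi, InterruptThrottleRate); 
the function names and the sample command line are hypothetical.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {option: value}; bare flags map to None."""
    options = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        options[key] = value if sep else None
    return options


# Options named in the message above.
TWEAKS = ["pcie_aspm", "acpi", "e1000e.IntMode", "e1000e.InterruptThrottleRate"]


def active_tweaks(cmdline, names=TWEAKS):
    """Return the subset of the named options that appear on the command line."""
    options = parse_cmdline(cmdline)
    return {name: options[name] for name in names if name in options}


# Hypothetical command line; on a live system read /proc/cmdline instead.
SAMPLE = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet pcie_aspm=off e1000e.IntMode=1"

print(active_tweaks(SAMPLE))
```

The whitespace split does not handle quoted values that contain spaces, and 
options set through modprobe configuration files would not show up here at all.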
> > 
> > For such a popular chip and driver I'm surprised that problems seem to
> > be lingering.  Then again I suspect most people are happy when a network
> > provides connectivity and care less about performance.  Thus my email
> > to the beowulf list.

-- 
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ 

email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html