[Beowulf] ifplugd messages on linux when network is busy?

David Mathog mathog at mendel.bio.caltech.edu
Wed Jun 22 09:11:02 PDT 2005

The Linux boxes in our cluster log messages like these:

  Jun 22 08:35:31 monkey01 ifplugd(eth1)[1607]: Link beat lost.
  Jun 22 08:35:32 monkey02 ifplugd(eth1)[1607]: Link beat detected.

when the network is very busy. However, ifconfig isn't showing any
significant problems on the interface:

eth1      Link encap:Ethernet  HWaddr 00:E0:81:22:2F:E7  
          inet addr:  Bcast:  Mask:
          inet6 addr: fe80::2e0:81ff:fe22:2fe7/64 Scope:Link
          RX packets:86210540 errors:0 dropped:0 overruns:8692 frame:0
          TX packets:63733464 errors:0 dropped:0 overruns:0 carrier:1
          collisions:0 txqueuelen:1000 
          RX bytes:2190672913 (2089.1 Mb)  TX bytes:425465611 (405.7 Mb)
          Interrupt:19 Base address:0x2480 
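
To make the nonzero counters in output like the above easier to spot,
here is a minimal Python sketch. It assumes the old net-tools ifconfig
layout shown here ("RX packets:... errors:..."); the function name
nonzero_counters is made up for illustration, and newer ifconfig
versions print a different format that this will not match.

```python
import re

# Matches only the "RX packets:..." / "TX packets:..." counter lines
# of the old net-tools ifconfig format (an assumption of this sketch).
COUNTER_LINE = re.compile(r"^\s*(RX|TX) packets:")

def nonzero_counters(ifconfig_text):
    """Return {"RX overruns": 8692, ...} for every nonzero error-type counter."""
    found = {}
    for line in ifconfig_text.splitlines():
        m = COUNTER_LINE.match(line)
        if not m:
            continue
        direction = m.group(1)
        # Each counter appears as name:value; skip the plain packet count.
        for name, value in re.findall(r"(\w+):(\d+)", line):
            if name != "packets" and int(value) > 0:
                found[f"{direction} {name}"] = int(value)
    return found

sample = """\
          RX packets:86210540 errors:0 dropped:0 overruns:8692 frame:0
          TX packets:63733464 errors:0 dropped:0 overruns:0 carrier:1
"""
print(nonzero_counters(sample))
# prints {'RX overruns': 8692, 'TX carrier': 1}
```

Running it periodically and comparing successive results would show
whether a counter climbs at the same moments the ifplugd messages
appear.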

I suspect this is not a bad switch (in that case I'd expect nonzero
errors and dropped counts) but rather the result of ifplugd running
at the same priority (16) as every other network process, so that it
occasionally cannot access the network interface within its
one-second poll interval.
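
One way to test this guess is to measure how long each reported
outage lasts. If every gap is about one poll interval, a missed poll
looks more plausible than a real loss of link. Below is a minimal
Python sketch that pairs each "Link beat lost" with the next "Link
beat detected" from the same host and interface. It assumes the
syslog line format shown above; the function name link_flaps and the
year default are made up for illustration, since syslog timestamps
carry no year.

```python
import re
from datetime import datetime

# Assumed syslog layout: "Jun 22 08:35:31 host ifplugd(eth1)[pid]: Link beat lost."
EVENT = re.compile(
    r"^\s*(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+"
    r"ifplugd\((\w+)\)\[\d+\]:\s+Link beat (lost|detected)\."
)

def link_flaps(lines, year=2005):
    """Return (host, iface, lost_time, seconds_down) for each lost/detected pair."""
    pending = {}   # (host, iface) -> time of the most recent unmatched "lost"
    flaps = []
    for line in lines:
        m = EVENT.match(line)
        if not m:
            continue
        stamp, host, iface, state = m.groups()
        # Syslog omits the year, so one is supplied by the caller.
        when = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
        key = (host, iface)
        if state == "lost":
            pending[key] = when
        elif key in pending:
            lost = pending.pop(key)
            flaps.append((host, iface, lost, (when - lost).total_seconds()))
    return flaps

# Hypothetical log lines in the same format, both from one host.
log = [
    "Jun 22 08:35:31 monkey01 ifplugd(eth1)[1607]: Link beat lost.",
    "Jun 22 08:35:32 monkey01 ifplugd(eth1)[1607]: Link beat detected.",
]
for host, iface, lost, secs in link_flaps(log):
    print(host, iface, lost.time(), secs)
# prints: monkey01 eth1 08:35:31 1.0
```

Events are matched per host and interface, so a combined log from
several nodes can be fed in as-is. The resolution is limited to the
one-second granularity of the syslog timestamps.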

Has anybody else seen these messages?  If so, was the cause a failing
switch or an ifplugd / driver / network stack problem on Linux?


David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech

