Multicasting problem on shut down

Robert Schwartz
Mon Oct 4 19:00:16 1999

Hi Jeff,

No, we did not do any testing with power management. The problem we saw
happened when we told Linux to shut down but didn't power off the
system, or when we rmmod'ed the module and left it removed. After a few
seconds (maybe 10-30), the eepro100 would start to multicast 8808 frames
(not every time, but about 1 time in 5). This in turn told all the other
eepro100s on the network to stop transmitting, leaving the network in
bad shape until the machine transmitting the 8808s was shut down or yanked.

We have added two lines (as suggested by Don B.) to the eepro100.c source
and have distributed the fix to most of our systems (we're creating a
Linux distro ourselves, so the latest beta CDs going out to everyone
in-house have the new driver), and we have not seen the problem since.

For fun, is the multicast destination address you're seeing
01:80:c2:00:00:01? If so, this is exactly what we are seeing.
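
For anyone checking a packet capture for the same symptom, here is a minimal
sketch in C. It assumes the "8808" above is EtherType 0x8808 (MAC Control)
and that the frames are 802.3x PAUSE frames (opcode 0x0001) sent to the
multicast address quoted above. The function name is hypothetical and is not
part of eepro100.c.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from eepro100.c): returns 1 if the raw Ethernet
 * frame looks like a PAUSE frame sent to 01:80:c2:00:00:01, else 0.
 * Assumes `frame` starts at the destination MAC with no VLAN tag. */
static int is_pause_frame(const uint8_t *frame, size_t len)
{
    static const uint8_t pause_dst[6] = {0x01, 0x80, 0xc2, 0x00, 0x00, 0x01};

    /* dst(6) + src(6) + EtherType(2) + opcode(2) + pause time(2) */
    if (len < 18)
        return 0;
    /* Destination must be the flow-control multicast address. */
    if (memcmp(frame, pause_dst, sizeof pause_dst) != 0)
        return 0;
    /* EtherType 0x8808, big-endian on the wire. */
    if (frame[12] != 0x88 || frame[13] != 0x08)
        return 0;
    /* MAC Control opcode 0x0001 = PAUSE. */
    return frame[14] == 0x00 && frame[15] == 0x01;
}
```

A box that keeps emitting frames matching this check after its driver has
been unloaded would fit the behaviour described here.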

Here is Don's email to us on the problem:

### start of msg ###

The new "kern-2.3" version might fix this, but it doesn't do it explicitly.
It puts the chip into ACPI-D2 mode in order to reduce the idle power. (The
Sony Vaio Z505 laptop which I borrowed from the NYU guys for two weeks
uses the eepro100.  An OK box, but it just sucks down battery power.)

Hmmm, an explicit fix would be the following.  In speedo_close() we can send
a selective reset instead of stopping the chip nicely.

-       /* Disable interrupts, and stop the chip's Rx process. */
-       outw(SCBMaskAll, ioaddr + SCBCmd);
-       outw(SCBMaskAll | RxAbort, ioaddr + SCBCmd);
+       /* Bonk the chip on the head. */
+       outl(PortPartialReset, ioaddr + SCBPort);

Note: Do not change this to PortReset!  The chip might violate the PCI
protocol and hang the PCI bus!

I'll make this change in my development version.

Could someone that can reproduce the problem please verify that it's a fix?

### end of msg ###

Let me, the list, or Don know if this helps you out too.



Jeff Groman wrote:

> We seem to have experienced a similar problem last week with a RH 6.0
> box on our network.  We found that it was responsible for bringing down
> most of our network.  We didn't understand why, but reading your
> messages this morning gave us some more insight.  We also saw
> the 8808 frames and multicast traffic.  The confusing thing is that this
> box has been on the network for several weeks, and only exhibited this
> behavior this one time; seemingly out of the blue.  However, we had a
> similar event happen a few months back with another RH 6.0 box, but it
> showed the problem almost immediately after putting it on the network.
> (Both boxes have Intel NICs in them.)  We found that by disabling
> power management in the bios of the pc, the problem went away.  We also
> noted that this time around, the RH box was also running with power
> management turned on.  Our problem is that we have other RH boxes
> running power management without problem.  We also have been unable to
> reproduce the problem from last week.  We tried rmmod-ing eepro, to no
> avail.  Also, when I halt the box, it automatically turns itself off, so
> that test didn't work either.
> Have you found any connection on your end with power management and this
> multicast problem?
> I'm sorry for the long-winded message, but any help you might have would
> be great!
> Thanks,
> Jeff
> --
> Jeff Groman
> IS Department,  Childrens Hospital, Denver
> 303 864 5671

--
Robert Schwartz
Network Applications Analyst
Emerging Technologies Group, Corel Corporation
1600 Carling Ave., Ottawa, Ontario K1Z 8R7, Canada
Tel: +1 613 728-0826 x1499
Fax: +1 613 761-9338