Problem with mass-collisions

Georg Koester georgk@bigfoot.de
Sat Aug 29 10:54:30 1998


Hi again!

Donald Becker wrote:
> > > This could be either a duplex mismatch or the bridge dropping packets.
> 
> I'm guessing packet corruption.  Check /proc/net/dev.

No errors. Nothing. Everything looks fine.
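
For anyone who wants to repeat the check, here is a rough Python sketch
that pulls the error and collision counters out of /proc/net/dev. It
assumes the usual two-line header with the column names in the second
line; the function names and the sample numbers are made up for
illustration and are not output from my machines.

```python
# Sample text in the 2.0-style layout; the numbers are invented.
SAMPLE = """\
Inter-|   Receive                  |  Transmit
 face |packets errs drop fifo frame|packets errs drop fifo colls carrier
    lo:     10    0    0    0    0       10    0    0    0     0    0
  eth0:  52310    0    0    0    0    48122    0    0    0  4711    0
"""


def parse_proc_net_dev(text):
    """Return {iface: {"rx_<col>": int, "tx_<col>": int}} from /proc/net/dev text."""
    lines = text.splitlines()
    # Second header line looks like " face |<rx columns>|<tx columns>".
    _, rx_part, tx_part = lines[1].split("|")
    columns = ["rx_" + c for c in rx_part.split()]
    columns += ["tx_" + c for c in tx_part.split()]
    stats = {}
    for line in lines[2:]:
        if ":" not in line:
            continue
        # Split on the colon, since the first number may follow it directly.
        name, rest = line.split(":", 1)
        values = [int(v) for v in rest.split()]
        stats[name.strip()] = dict(zip(columns, values))
    return stats


def suspicious_counters(stats, iface):
    """Return only the non-zero error-like counters for one interface."""
    keys = ("rx_errs", "rx_drop", "rx_frame", "tx_errs", "tx_colls", "tx_carrier")
    return {k: v for k, v in stats[iface].items() if k in keys and v > 0}


stats = parse_proc_net_dev(SAMPLE)
print(suspicious_counters(stats, "eth0"))
```

On a real machine the text would come from open("/proc/net/dev").read()
instead of SAMPLE. Collisions are counted in their own column, so an
interface can show zero errors and still have a large collision count.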

> > I tried your tool, messages above. But I could not change the
> > autonegotiation things. It didn't know the option.
> 
> Try running 'mii-diag -A 10baseTx-HD', or just 'mii-diag' to see the current
> settings.  It does work..

Great. My fault. I tried eepro100-diag.c! Now with mii-diag.c it
works. Yes, it's running at 10baseT-HD. When I set it to 10baseT-FD, it
won't get a link.

> The EEPro100 driver gets more of a workout than most.  It is in constant use
> at work: One of my desktop test machines is a PR440FX, usually connected to
> a very busy 10mbps repeater, but sometimes to my local test network on a
> 100baseTx repeater. I have 40 PR440FX machines in a cluster talking
> 100baseTx-FD, and the cluster connection to the outside world is through a
> 10baseT switch.
 
That sounds great! But in your case there is only one type of
Ethernet device on a repeater, and even only one type of machine and
architecture. Here I also have the same architecture (i486), but
different speeds: one is a 486-133 (overclocked to 160) and the other a
486-DX33. So what if the first is sending too fast? What if the first is
already sending again because it considers the packet lost, and the
second is sending its reply at that very moment? That would explain the
collisions (I believe). I can try to picture the situation during an FTP
download between the machines:

                1 sec                 2 sec
  /\       /\       /\       /\
 /  \     /  \     /  \     /  \
/    \---/    \---/    \---/    \. . .(and so on)
start,    fast
 collisions collisions
      quiet    quiet

That's the situation. The first letter of each description marks the
point in the diagram above where it takes place; for example, the
collisions are located at the peaks, when it just starts being fast. So
I think that maybe the second computer is too slow to get all the data
off the wire at that speed. But why the collisions? And why is the
situation different when I use socket buffers in ttcp? If there is at
least sockbufsize=1 there are no collisions (anywhere; it doesn't matter
whether the first or the last machine is sending). And what about NFS?
When the machines are copying files over an NFS link there are also no
collisions. I don't know what the problem could be related to. Should I
try another kernel? These machines are all running 2.0.35. Maybe a 2.1?
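
As far as I understand it, the socket buffer option in ttcp ends up as a
setsockopt() call for SO_SNDBUF (or SO_RCVBUF on the receiving side).
Here is a small Python sketch of the same request, just to show what
changes on the socket. The function names and the 8192-byte value are
only examples, and the kernel is free to round or clamp the size it
actually grants.

```python
import socket


def make_sender_socket(sndbuf_bytes):
    """Create a TCP socket and ask the kernel for a send buffer of this size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    return sock


def effective_sndbuf(sock):
    """Read back the send buffer size the kernel actually granted."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


sock = make_sender_socket(8192)  # example value, not a recommendation
granted = effective_sndbuf(sock)
sock.close()
print("requested 8192, kernel reports", granted)
```

Comparing the requested and the reported value on both machines would
at least show whether the two ends really run with the same buffering.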
ciao Georg

-- 

        From: Georg Koester e-Mail: georgk@bigfoot.de