[eepro100] High collision rate

Donald Becker becker@scyld.com
Sat, 27 Jan 2001 11:07:50 -0500 (EST)


On Fri, 26 Jan 2001, Bernd Stahlbock wrote:

> I've noticed a problem in the internal network of our industrial
> machines: We use an NT workstation control PC with GUI (normally SMC
> Etherpower II or 3Com NICs) and one or two Linux embedded PCs with the Intel
> 82558B chip and Donald Becker's driver "eepro100.c:v1.11 7/19/2000 Donald
> Becker <becker@scyld.com>\n";.
> If the machine has one embedded PC, it's a direct crossover connection to
> the NT PC. If there are two embedded PCs, we use a small 100 Mbit hub. The
> problem is the same on all machines:

Hmmm, a cross-over cable should result in negotiated full duplex.
Only the repeater connection should show collisions.
You might want to check the link status with 'mii-diag'.
     http://www.scyld.com/diag/index.html
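
To illustrate what the link status implies for duplex, here is a rough
Python sketch of how autonegotiation picks a mode from the ability bits
in MII registers 4 (our advertisement) and 5 (link partner ability).
The function name and the example register values are made up for
illustration; this is not taken from mii-diag.

```python
# Ability bits shared by MII register 4 (advertisement) and
# register 5 (link partner ability), highest priority first.
ABILITIES = [
    (0x0100, "100baseTx-FD"),
    (0x0200, "100baseT4"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
]


def negotiated_mode(advertised, partner):
    """Return the best mode both ends advertise, or None if there is none."""
    common = advertised & partner
    for bit, name in ABILITIES:
        if common & bit:
            return name
    return None


# Hypothetical register values, not read from real hardware.
print(negotiated_mode(0x01E1, 0x01E1))  # both ends offer everything
print(negotiated_mode(0x01E1, 0x0081))  # partner offers only 100baseTx-HD
print(negotiated_mode(0x01E1, 0x0000))  # partner does not negotiate
```

When the partner does not negotiate at all, the transceiver falls back
to detecting the speed only and assumes half duplex.  If the other end
is forced to full duplex, that is the classic duplex mismatch.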

> If we produce network traffic (no more than 800 kbyte/sec) on a TCP
> stream while the machine is operating, we get a collision rate of up to
> 55%, almost regardless of whether one or two embedded PCs are written
> to at the same time.

The eepro100 driver is counting the total number of collisions, not the
number of packets with collisions.  A single packet might accumulate
up to 15 collisions.

Other Ethernet hardware can only report whether a packet encountered a
collision.  Up to 15 collisions are counted as just one.
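
A small Python sketch of how much the two counting methods can differ.
The packet and collision numbers are invented for illustration, not
measured on any real network.

```python
def collision_rate(collisions, tx_packets):
    """Return collisions per transmitted packet, as a percentage."""
    if tx_packets == 0:
        return 0.0
    return 100.0 * collisions / tx_packets


# Hypothetical traffic: 1000 packets sent, 100 of them collided,
# and those 100 packets needed 550 collisions in total to get through.
tx_packets = 1000
packets_with_collisions = 100
total_collision_events = 550

# Counting every collision event:
print(collision_rate(total_collision_events, tx_packets))
# Counting only packets that saw at least one collision:
print(collision_rate(packets_with_collisions, tx_packets))
```

The same traffic reads as 55% with the first method and 10% with the
second, so rates from different drivers are not directly comparable.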

> Now, I read something about an interframe gap (IFG), which may be set too
> small by the driver, for reasons of performance. The NT Intel-Etherpower
> driver for the 82558 chip has a special registry setting for this.
> (normal gap is 96 bit times: 9.6 microseconds at 10 Mbit/s,
> 0.96 microseconds at 100 Mbit/s)
> Gaps that are too small lead to a high collision rate.
>
> Now my question: is Donald's driver written according to the IEEE rule
> for 100TX?

Yes.  Several other chips permit lowering the IFG to more aggressive
values.  The primary purpose seems to be getting an edge in performance
against cards that follow the spec.
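
For reference, the standard minimum IFG is 96 bit times, so its
duration depends on the line rate.  A quick Python sketch of the
arithmetic (the function name is just for illustration):

```python
IFG_BITS = 96  # minimum interframe gap in bit times (IEEE 802.3)


def ifg_microseconds(bits_per_second, ifg_bits=IFG_BITS):
    """Return the duration of the interframe gap in microseconds."""
    return ifg_bits / bits_per_second * 1e6


for name, rate in (("10 Mbit/s", 10e6), ("100 Mbit/s", 100e6)):
    print(f"{name}: {ifg_microseconds(rate):.2f} us")
```

This gives 9.60 us at 10 Mbit/s and 0.96 us at 100 Mbit/s.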

> Is it possible to adjust this value to achieve a better
> throughput/collision ratio?

In theory, yes.  But not without modifying the source.  And it only
works properly in an environment where all hosts are modified to a
smaller IFG, and they are tested to reliably receive packets with that
smaller IFG.  Many receivers will reject packets if the IFG shrinks too
much.

> I remember that in earlier times we didn't have these problems. We
> already changed the SMC drivers back to the old ones and also used old
> cards, with no effect. So I assume that there has been a change in the
> Linux-side drivers.

No.
Check your network for a duplex mismatch, or errors reported in
/proc/net/dev. 
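
A minimal Python sketch for pulling the error and collision counters
out of /proc/net/dev, assuming the usual layout of two header lines
followed by 16 counters per interface.  The function name and the
sample text are made up; the numbers are not from a real machine.

```python
def parse_proc_net_dev(text):
    """Parse /proc/net/dev text into {interface: {counter: value}}."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)     # handles "eth0:1234" with no space
        fields = data.split()
        if len(fields) < 16:
            continue
        values = [int(v) for v in fields[:16]]
        stats[name.strip()] = {
            "rx_packets": values[1],
            "rx_errs": values[2],
            "rx_frame": values[5],
            "tx_packets": values[9],
            "tx_errs": values[10],
            "tx_colls": values[13],
            "tx_carrier": values[14],
        }
    return stats


# Invented sample in the /proc/net/dev layout.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    3    0    0     3          0         0  8000000    6000    0    0    0  3300       0          0
"""

for iface, counters in parse_proc_net_dev(SAMPLE).items():
    print(iface, counters)
```

On a live system, pass open("/proc/net/dev").read() instead of SAMPLE.
Receive frame errors on one end together with collisions on the other
are a typical sign of a duplex mismatch.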

Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993