Sat, 25 Nov 2000 16:45:52 +0100 (CET)
On Fri, 24 Nov 2000, Donald Becker wrote:
> Yes, everyone please keep this in mind.
> Register polling should never go on longer than about 50 microseconds.
I fully agree; however, I still can't see how to poll reliably.
Is there any way to get a precise measurement of the time spent within a
loop? With that we could at least find out whether there is a PCI problem
(in which case I really don't know what to do) or a NIC problem (if the
time needed for completion of the operation is much larger than specified
in the docs -> hardware bug?).
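
A minimal user-space sketch of a time-bounded polling loop that also
reports how long the wait took. This is not driver code: the callback
type poll_done_fn, the helper names and the simulated "device" callbacks
are assumptions made for illustration, and a kernel driver would use the
kernel's own timing facilities instead of clock_gettime().

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback: returns true once the operation has completed. */
typedef bool (*poll_done_fn)(void *ctx);

/* Monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Call done() repeatedly until it returns true or budget_ns has elapsed.
 * done() is always called at least once.
 * Returns 0 on completion and -1 on timeout; *elapsed_ns receives the
 * time spent in the loop, so a slow completion can be logged.
 */
int poll_with_deadline(poll_done_fn done, void *ctx,
                       uint64_t budget_ns, uint64_t *elapsed_ns)
{
    assert(done != NULL && elapsed_ns != NULL);

    uint64_t start = now_ns();
    int rc = -1;

    do {
        if (done(ctx)) {
            rc = 0;
            break;
        }
    } while (now_ns() - start < budget_ns);

    *elapsed_ns = now_ns() - start;
    return rc;
}

/* Simulated device: ctx is an int counter; done when it reaches zero. */
bool done_after_n_polls(void *ctx)
{
    int *left = ctx;
    if (*left > 0)
        (*left)--;
    return *left == 0;
}

/* Simulated device that never completes. */
bool never_done(void *ctx)
{
    (void)ctx;
    return false;
}
```

With a 50 microsecond budget (50000 ns), a timeout together with the
measured elapsed time would at least show how far the operation overran
the figure given in the docs.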
I think I was bitten by this "bug" until now, because my cluster nodes
would suddenly die after 2-3 weeks of perfect functioning, while now, with
the increased counts, they have been running happily for more than 2
months. But waiting 2-3 weeks for an event to happen is a very unreliable
way of reproducing things for debugging. If Richard is able to produce it
at will, we might be able to get some more data easily.
> The manual might report that the transceiver is at address 24, but we should
> still scan. The chip supports external MII transceivers (e.g. for the
> 100baseT4 product), and even 3Com doesn't keep track of what designs are
> using the chip.
Sure, I didn't say "set it at 24 and that's all!".
> Here are a few MII rules.
> A plug-in transceiver, via a MII connector, should be at address 0.
> On-board transceivers are addressed 1..31
> The scan order is 1,2,..30,31, 0
> A transceiver jumpered for address 0 should power up disconnected from the
> data lines. The driver might need to explicitly disable the on-board
> transceivers before activating the external one.
These rules make sense to me. But is this an "official" standard or just
your experience derived from all the drivers that you've written?
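
The scan order quoted above (1, 2, ..., 31, then 0) can be sketched as
follows. The probe callback type mii_probe_fn, the function names and the
bitmask-based simulated bus are hypothetical and exist only to make the
example self-contained; they are not taken from any driver.

```c
#include <assert.h>
#include <stddef.h>

#define MII_ADDR_COUNT 32

/* Hypothetical probe: nonzero if a transceiver answers at addr. */
typedef int (*mii_probe_fn)(int addr, void *ctx);

/* Map scan step 0..31 to a PHY address in the order 1, 2, ..., 31, 0. */
int mii_scan_addr(int step)
{
    assert(step >= 0 && step < MII_ADDR_COUNT);
    return (step + 1) % MII_ADDR_COUNT;
}

/*
 * Probe all 32 addresses in scan order and record up to max_found
 * responding addresses in found[]. Returns the number recorded.
 */
int mii_scan(mii_probe_fn probe, void *ctx,
             unsigned char *found, int max_found)
{
    assert(probe != NULL && found != NULL && max_found > 0);

    int n = 0;
    for (int step = 0; step < MII_ADDR_COUNT && n < max_found; step++) {
        int addr = mii_scan_addr(step);
        if (probe(addr, ctx))
            found[n++] = (unsigned char)addr;
    }
    return n;
}

/* Simulated bus: ctx points to a bitmask of populated PHY addresses. */
int probe_from_mask(int addr, void *ctx)
{
    const unsigned long *mask = ctx;
    return (int)((*mask >> addr) & 1ul);
}
```

With transceivers present at addresses 0 and 24, this scan reports 24
first and 0 last, so an on-board transceiver is seen before a plug-in one.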
> I don't know. I suspect that it's just a bug in the address matching
> design. ...
Is there any possibility that PCI problems affect the MDIO operations?
After all, MDIO operations happen through the PCI bus...
> I really don't like always using #24, but it looks as if we might have to
> do that with some boards. It shouldn't be based only on IS_TORNADO, since that
> would break with 100baseT4 and 100baseFx boards. There are also HomePNA and
> radio transceivers with MII interfaces that might someday show up connected
> to 3Com chips.
AFAIK, these are the first boards where scanning for MII interfaces fails.
Do you know of any others?
What I wanted to say was "if this is Tornado, we _know_ that we have at
least one NWAY transceiver on-board, at PHYAD 24" (because the chip itself
carries it and AFAIK there is no possibility to disable it). After that we
start searching (possibly skipping 24, or purposely looking for it as
some kind of check) and add all the others. Now you say that we should use
the lowest numbered transceiver; is this also a standard?
If there is a requirement that phys[] is ordered, we can start with
phys[0]=24, then if we find another, we set phys[0]=x and phys[1]=24 and
so on, only moving 24 if one lower than it is found. But in
vortex_timer() only phys[0] is used; actually I cannot find anything
that treats more than one transceiver, except the scanning itself...
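
The ordering idea above (start with 24 and move it only when a lower
address is found) could look roughly like this. It is only a sketch: the
function phys_insert_sorted, the constant TORNADO_INTERNAL_PHY and the
caller-supplied list size are assumptions for illustration, and plain
numeric ordering is assumed, so address 0 would sort first.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed address of the transceiver built into the Tornado chip. */
#define TORNADO_INTERNAL_PHY 24

/*
 * Insert addr into the ascending list phys[0..*count-1], which can hold
 * at most max entries. Duplicates are ignored. If the list is already
 * full, the highest entry is dropped to make room for a lower address.
 */
void phys_insert_sorted(unsigned char *phys, int *count, int max,
                        unsigned char addr)
{
    assert(phys != NULL && count != NULL);
    assert(max > 0 && *count >= 0 && *count <= max);

    int i;
    for (i = 0; i < *count; i++) {
        if (phys[i] == addr)
            return;             /* already listed */
        if (phys[i] > addr)
            break;              /* addr belongs before phys[i] */
    }
    if (i >= max)
        return;                 /* list full and addr is the highest */

    /* Shift higher entries up by one slot, then store addr. */
    int last = (*count < max) ? *count : max - 1;
    for (int j = last; j > i; j--)
        phys[j] = phys[j - 1];
    phys[i] = addr;
    if (*count < max)
        (*count)++;
}
```

Seeding the list with 24 and then feeding it every address that answers
during the scan keeps the lowest numbered transceiver in the first slot.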
IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868