dual AMD clusters

Martin Siegert siegert at sfu.ca
Fri Jun 22 12:04:38 PDT 2001


On Thu, Jun 21, 2001 at 16:45:27 -0400, Dan Kirkpatrick wrote:
> We're about to build our PoP (Pile of Pentiums) beowulf cluster #2 and are 
> trying to build this one even better since our budget is 4x that of the 1st 
> one...
> There's a wide variety of needs, some small code, some large (200mb-1gb+).
> Mostly threaded, some in the future may be parallel.
> 
> We've got a few questions for the list...
<snip>
> 3. Processor choice
> What about Dual Athlon's?  Are they actually available and reliable?  We've 
> gotten the feeling that they're fairly new, run hotter (more cooling/larger 
> case needed), and more expensive, but for the calculations we're doing, it 
> may be more power for the money in the end if they are actually available 
> and reliable.

I have almost completed my test of a dual AMD cluster. The test cluster
had two dual nodes. The master node has 5 NICs (2 onboard 3c980, 3 3c905B),
the internal node has 4 NICs (2 onboard 3c980, 2 3c905B). 3 NICs are used
in a channel-bonded configuration, one is used for NFS traffic. The
remaining one on the master node is for the internet connection.

I am using a 2.4.5 kernel on an otherwise RH7.1 system.

I have encountered a few problems, all but one of which have been solved:
1) Pay close attention to the memory chips that Tyan has approved. Other
   chips may not work.
2) The latest BIOS upgrade solves problems with booting the nodes.
3) There is a bug in the 2.4.5 kernel (somewhere in the apic code) that
   brings network connections to a grinding halt. Using the -ac versions
   (e.g., 2.4.5-ac17) solves this problem.
4) lm_sensors fails to recognize the hardware monitoring chips on the
   Tyan motherboard.

Otherwise the system has been rock solid: no crashes, very good performance.
Furthermore, Tyan is going to release a no-SCSI version of the motherboard
soon - this will make the dual AMD system very competitive - I finally
made up my mind to go that way.
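
For item 3 in the list above, a node could be checked for the affected
kernel before it joins the cluster. The following is only a minimal sketch:
the function names are hypothetical, and it assumes the release string has
the usual "2.4.5" or "2.4.5-ac17" shape and that any -ac level is enough
(only 2.4.5-ac17 is reported above as tested).

```python
import re

# Splits a release string such as "2.4.5-ac17" into numbers and a suffix.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def parse_release(release):
    """Return ((major, minor, patch), suffix) for a kernel release string."""
    match = _RELEASE_RE.match(release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch)), suffix


def needs_ac_patch(release):
    """True for a 2.4.5 kernel that does not carry an -ac suffix.

    Assumption: every -ac level counts as patched; only -ac17 was
    actually reported as working in the post above.
    """
    version, suffix = parse_release(release)
    return version == (2, 4, 5) and not suffix.startswith("-ac")
```

On a running node the string returned by platform.release() from the
Python standard library could be passed to needs_ac_patch().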

If only somebody could show me how to patch lm_sensors to detect the
hardware monitoring chips on the Tyan motherboard:
1) I was able to insert the i2c-amd756.o module (after changing the
   PCI_DEVICE_ID to 7413).
2) sensors-detect now shows a Winbond W83782D chip, but "modprobe -k w83781d"
   brings the box to a full stop (only a hard reset helps). This obviously
   needs some work - does anybody know more about this?
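
Before changing PCI_DEVICE_ID in a driver as in step 1, it may help to
confirm which ID the board really reports. The sketch below scans text in
the format printed by "lspci -n" for a vendor:device pair. The function
name and the sample lines are made up for illustration (they were not
captured from the Tyan board); 1022 is AMD's PCI vendor ID and 7413 is the
device ID mentioned above.

```python
import re

AMD_VENDOR_ID = 0x1022  # PCI vendor ID assigned to AMD

# A vendor:device pair as printed by `lspci -n`, e.g. "1022:7413".
_ID_RE = re.compile(r"\b([0-9a-fA-F]{4}):([0-9a-fA-F]{4})\b")

# Illustrative input only; not output from the machine described above.
SAMPLE = (
    "00:07.3 Class 0680: 1022:7413 (rev 01)\n"
    "00:08.0 Class 0200: 10b7:9055 (rev 30)\n"
)


def find_devices(lspci_n_output, vendor_id, device_id):
    """Return the slot (first field) of each line listing vendor_id:device_id."""
    slots = []
    for line in lspci_n_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        for vendor, device in _ID_RE.findall(line):
            if int(vendor, 16) == vendor_id and int(device, 16) == device_id:
                slots.append(fields[0])
                break
    return slots
```

With the sample text, find_devices(SAMPLE, AMD_VENDOR_ID, 0x7413) returns
the slot of the first line. An empty list would suggest the driver needs a
different ID than the one patched in.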

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada  V5A 1S6
========================================================================



