problem & questions
Dr. Balasubramanian Sundaram
bala at jncasr.ac.in
Thu Aug 17 22:26:11 PDT 2000
dear ioannis,
Using a PXE-enabled network card to build a cluster is
a rather painful exercise. Three months ago, we built an
8-node diskless cluster using Intel EtherExpress Pro network
cards with PXE support (i.e., we did not have to burn the
EEPROM). You are using a 3Com card, but the underlying
problems, I suppose, are similar.
We have shared our experience on our homepage. You can
view it at:
http://www.jncasr.ac.in/kamadhenu
Also, Intel has some rudimentary PXE support documentation for
Linux. One such document is:
ftp://download.intel.com/ial/wfm/pxesdklinux.pdf
It gives some indication of the things one needs to take care
of when mounting the root partition.
A few simple things to do while building a diskless cluster
(pardon me if I am telling you things that you know already):
1) Work only with the master (disked machine), and ONE slave to start
with.
2) Connect a spare monitor and a spare keyboard to the slave as well, so
that you can see what is going on in the slave.
3) Run bootpd with the debug option.
4) Do "tail -f /var/log/messages" on the master, to keep track of
any error messages.
5) As root, run "tcpdump -i eth0" (if eth0 is on the beowulf-private
network) on the master, to look at communication between master &
slave.
I consider myself a beginner in Linux, but even so the
output of these commands can be interpreted with simple common sense,
and it can help you solve simple routing problems, for example.
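
Steps 3 to 5 could be sketched roughly as below. This is only an
illustration: the IFACE variable, the helper function name, the bootpd
flags (-s for standalone, -d 4 for debug level) and the port filter
(67/68 for BOOTP/DHCP, 69 for TFTP) are assumptions, not taken from our
actual setup, so check the man pages on your distribution. The script
only prints the commands; run each one as root in its own terminal on
the master.

```shell
#!/bin/sh
# Sketch only: names and flags below are assumptions, not from the original post.
IFACE="${IFACE:-eth0}"                  # interface on the beowulf-private network
FILTER="port 67 or port 68 or port 69"  # BOOTP/DHCP (67, 68) and TFTP (69)

# Hypothetical helper: print the commands to run on the master,
# one per terminal.
show_debug_commands() {
    echo "bootpd -s -d 4"                # step 3: bootpd standalone with debug output
    echo "tail -f /var/log/messages"     # step 4: follow the system log
    echo "tcpdump -n -i $IFACE $FILTER"  # step 5: watch boot traffic only
}

show_debug_commands
```

Restricting tcpdump to the boot-related ports keeps unrelated traffic
out of the way; drop the filter if you want to see everything on the
private network.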
good luck,
bala
Ioannis F Sotiriadis wrote:
>
> i am putting a beowulf cluster together (nodes diskless)
> using the new 3com bootable lancard and have nothing but
> problems AND all the HOWTOs take too much for granted :-(
>
> I am using on the card BOOT METHOD: TCP/IP and PROTOCOL: BOOTP
> (available methods TCP/IP, NETWARE, BPL and PXE - available
> protocols (under TCP/IP) DHCP and BOOTP and (under netware)
> 802.2 802.3 and EthII)
>
> RESULTS: when the node boots the message is "UNDI
> initialization failed" and boots again - on the server the
> message (every time the node boots) "eth0:BTL8139: Interrupt
> line blocked,status 1"
>
> What am i doing wrong and/or what is missing?
>
> --
> Best regards,
> Ioannis mailto: admin at elcomsoft.gr