Advice for 2nd cluster installation
Kwan Wing Keung
hcxckwk at hkucc.hku.hk
Thu Jan 9 18:33:38 PST 2003
We installed a 32-node cluster of dual 1 GHz P3 machines a year ago. Its
performance is excellent, and the system stability has been fairly good.
Due to the increase in system loading, we are going to install
additional nodes in the coming half year. I would like to have more
hints from our experts on the following questions:
(1) Most of the major vendors now propose P4 nodes of at least
2.2 GHz. Given the difference in processor speed, the new nodes
will certainly have to be kept separate from the old ones.
We can either create new batch queues to accommodate these new nodes
(but still sharing the old file system), or build an entire
new cluster with its own front-node and file system.
What will be the relative advantages/disadvantages?
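As a rough illustration of why the speed difference matters, here is a
minimal Python sketch. It assumes a parallel job whose work is split
evenly across nodes and which waits for every node at each step, and it
uses clock speed (1.0 GHz vs 2.2 GHz) as a crude stand-in for node
speed. Both are simplifying assumptions of this sketch, not
measurements, and the function name is made up for illustration.

```python
def step_time(work_units, node_speeds_ghz):
    """Wall time of one synchronised step when work is split evenly.

    Every node gets the same share, so the step finishes only when the
    slowest node has finished its share.
    """
    share = work_units / len(node_speeds_ghz)
    return max(share / speed for speed in node_speeds_ghz)


# Same 16-node job on three hypothetical node mixes.
old_only = step_time(16.0, [1.0] * 16)             # all 1 GHz nodes
mixed = step_time(16.0, [1.0] * 8 + [2.2] * 8)     # half old, half new
new_only = step_time(16.0, [2.2] * 16)             # all 2.2 GHz nodes

print(old_only, mixed, new_only)
```

Under these assumptions the mixed job is no faster than the all-old job,
since the faster nodes simply idle while waiting for the slower ones.
Separate queues (or a separate cluster) avoid that case.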
(2) If we really intend to build a completely new cluster to house all the
new nodes, i.e. with its own front-node and file system, is it
possible for us to build some backup resilience between the 2
clusters as well?
During the last year, we experienced around 8 failures of
processor nodes. All of them were caused by failure of the
fan in the power supply. Losing 1-2 nodes for around 4 hours never
affected the overall operation of the cluster.
However, on one occasion the master RAID system failed. No user
could log in for almost 12 hours, as "/home" was
totally unavailable during this period.
One possible way is a SAN approach in which the two file systems
are always mirrored. Will it be very expensive?
Another way is simply to cross-mount the two file servers.
Most likely the postgrads will be on the old server while
the researchers will be on the new one. During normal operation,
each cluster will use only its local file system, but the two
servers will be synchronised with rsync every night.
In case a file system is inaccessible, all users will be allowed
to access the remaining available file system (after the sysadm. has
done some work).
This sounds complicated, but should be much cheaper. Does anyone
have experience with such a setup?
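To make the nightly synchronisation idea concrete, here is a minimal
Python sketch that builds and runs an rsync command mirroring one
server's "/home" onto the other. I am assuming rsync over ssh is what
would be used. The host name "fs-new", the destination path
"/mirror/home-old" and the function names are hypothetical; only the
rsync options (-a, --delete, -e) are real.

```python
import subprocess


def nightly_sync_command(src_dir, peer_host, dest_dir):
    """Build an rsync command that mirrors src_dir onto peer_host:dest_dir."""
    return [
        "rsync",
        "-a",                        # archive mode: keep permissions, owners, times
        "--delete",                  # drop files on the mirror that were deleted locally
        "-e", "ssh",                 # use ssh as the transport
        src_dir.rstrip("/") + "/",   # trailing slash: copy the contents, not the directory
        f"{peer_host}:{dest_dir}",
    ]


def run_nightly_sync(src_dir, peer_host, dest_dir):
    """Run the sync; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(nightly_sync_command(src_dir, peer_host, dest_dir), check=True)


# Hypothetical names: old server's /home mirrored onto the new server.
cmd = nightly_sync_command("/home", "fs-new", "/mirror/home-old")
print(" ".join(cmd))
```

Something like run_nightly_sync() could be called from cron on each
server. Note that with a once-a-night sync the mirror can be up to a
day out of date when a failover happens, and --delete will also
propagate accidental deletions to the mirror.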
(3) Most of the major vendors proposed a blade server approach as an
alternative to the conventional 1U server. By fitting 14
processor boards into a blade centre (actually just another type of
rack-mounted chassis occupying 7U), the "processor density" can be
increased.
However, when I asked them the question below, they could not
give me a definite answer (or at least I was not convinced by their
answers).
The question is: will there be a timing difference when a processor
in the 3rd blade, inside the 3rd chassis, communicates
(through MPI) with a processor in another blade within the SAME chassis,
as compared to a processor in a blade housed within ANOTHER
chassis?
A salesperson from one vendor told me that there should be some difference,
as communication within a blade centre goes through the
back-plane. Once it leaves the blade centre, the communication
has to go through an inter-chassis switch, and should therefore show some
timing difference. He further told me that this is the beauty of
"InfiniBand", with which I don't have any experience.
However, another salesperson told me that they should be the same,
because all processors within the entire "rack" have distinct
IP addresses, and the communication between any 2 processors should
be fair and equal.
Which is right?
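One way to settle this would be to time an MPI ping-pong between pairs
of blades and compare the two cases. The Python sketch below covers
only the bookkeeping: it labels a pair as intra-chassis or
inter-chassis and averages whatever round-trip times are fed in. It
assumes blades are numbered 0, 1, 2, ... consecutively, chassis by
chassis, with 14 blades per chassis; that numbering and all function
names are hypothetical, and the timings themselves would have to come
from a real measurement.

```python
BLADES_PER_CHASSIS = 14  # figure quoted by the vendors


def chassis_of(blade_index):
    """0-based chassis number, assuming blades are numbered chassis by chassis."""
    return blade_index // BLADES_PER_CHASSIS


def classify_pair(blade_a, blade_b):
    """Label a pair of blades by whether they share a chassis."""
    if chassis_of(blade_a) == chassis_of(blade_b):
        return "intra-chassis"
    return "inter-chassis"


def mean_latency_by_class(samples):
    """Average measured times per class.

    samples: iterable of (blade_a, blade_b, seconds) tuples taken from
    a ping-pong run on the real hardware.
    """
    totals = {}
    for blade_a, blade_b, seconds in samples:
        key = classify_pair(blade_a, blade_b)
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + seconds, count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


# 3rd blade in the 3rd chassis (0-based: chassis 2, slot 2).
src = 2 * BLADES_PER_CHASSIS + 2
print(classify_pair(src, src + 1))   # neighbour in the same chassis
print(classify_pair(src, 0))         # first blade of the first chassis
```

If the two averages differ clearly over many samples, the first
salesperson is right for that hardware; if not, the second is.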
THANKS for all expert advice.
University of Hong Kong