MPICH hangs in redhat kernel-2.2.14-5.0
Woo Chat Ming
cmwoo at hkusua.hku.hk
Tue Feb 13 11:22:59 PST 2001
Dear beowulf friends and experts,
MPICH is having problems on my Linux cluster. Would any Linux
or MPI expert please help me solve the problem?
I have installed MPICH-1.2.1 on my Red Hat Linux 6.2 cluster, which contains
8 dual-CPU nodes. Each node is running kernel-2.2.14smp. The nodes
are connected through Intel Gigabit Ethernet cards.
However, some programs in the NAS Parallel Benchmarks version 2.3
hang frequently. These programs use MPICH.
The hangs seem most likely to occur when the data size is large.
I am sure I have enough RAM, because each program uses around
100MB and each node has 1GB of physical memory. When a program hangs, I
can still ping the nodes from each other, so I think it is not a hardware
problem with the network cards.
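Since ping only exchanges small packets by default, a successful ping may not
say much about large transfers. As a rough way to check large transfers without
MPICH involved, something like the sketch below could be tried. It is only an
illustration: the function names (echo_once, round_trip, local_demo) are
hypothetical, the sizes are arbitrary, and the demo runs over loopback. On the
cluster, the echo side would have to run on one node and round_trip on another.

```python
import socket
import threading


def _recv_exact(conn, nbytes):
    """Read exactly nbytes from conn, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = conn.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed before all data arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def echo_once(server_sock, nbytes):
    """Accept one connection, read nbytes, and send them straight back."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(_recv_exact(conn, nbytes))


def round_trip(host, port, nbytes, timeout=30.0):
    """Return True if nbytes sent to the echo server come back intact.

    A timeout (a possible sign of a stalled transfer) is reported as False.
    """
    pattern = bytes(range(251))
    payload = (pattern * (nbytes // len(pattern) + 1))[:nbytes]
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(payload)
            return _recv_exact(s, nbytes) == payload
    except socket.timeout:
        return False


def local_demo(sizes=(1024, 1 << 20, 8 << 20)):
    """Run one echo round trip per size over loopback (assumed test setup)."""
    results = {}
    for n in sizes:
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        t = threading.Thread(target=echo_once, args=(server, n), daemon=True)
        t.start()
        results[n] = round_trip("127.0.0.1", port, n)
        t.join(timeout=5)
        server.close()
    return results


if __name__ == "__main__":
    for size, ok in local_demo().items():
        print(size, "ok" if ok else "FAILED")
```

If large sizes fail or time out between two nodes while small ones pass, that
would point toward the network driver or kernel rather than MPICH itself, but
this is only a guess on my part.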
I don't know why this happens. Did I set anything up incorrectly?
Do I need to get a new version of the kernel? Does the SMP kernel have
Thanks in advance for any help.
Woo Chat Ming Computer Centre,
TEL (852)-2857-8632 Room 1-34A, Old Library Building,
FAX (852)-2559-7904 The University of Hong Kong.
Email cmwoo at hku.hk http://www.hku.hk/cc/sp2/index.html
Email cswcm95 at engsvr.ust.hk