MPICH hangs on Red Hat kernel-2.2.14-5.0

Woo Chat Ming cmwoo at hkusua.hku.hk
Tue Feb 13 11:22:59 PST 2001


Dear beowulf friends and experts,

  MPICH is having problems on my Linux cluster. Could any Linux
or MPI expert please help me solve them?
  I have installed MPICH-1.2.1 on my Red Hat Linux 6.2 cluster, which
contains 8 dual-CPU nodes. Each node is running kernel-2.2.14smp. The
nodes are connected through Intel Gigabit Ethernet cards.
  However, some programs in the NAS Parallel Benchmarks version 2.3
hang frequently. These programs use MPICH. The hangs seem most
likely to occur when the data size is large.
  I am sure I have enough RAM, because each program uses around
100MB while each node has 1GB of physical memory. When a program hangs,
I can still ping the nodes from each other, so I do not think it is a
hardware problem with the network cards.
  I don't know why this happens. Did I set anything up wrong?
  Do I need a newer version of the kernel? Does the SMP kernel have a
problem?

  Thanks in advance for any help.

Regards,
Ming.
-------------
Woo Chat Ming                 Computer Centre,
TEL   (852)-2857-8632         Room 1-34A, Old Library Building,
FAX   (852)-2559-7904         The University of Hong Kong.
Email cmwoo at hku.hk            http://www.hku.hk/cc/sp2/index.html
Email cswcm95 at engsvr.ust.hk





