[Beowulf] MPI and Redhat9 NFS slow down

Laurence Liew laurenceliew at yahoo.com.sg
Fri Aug 27 16:19:44 PDT 2004


You are welcome, but do note the dangers of async: as I understand it, 
data corruption may occur if, for example, the server crashes or a HDD 
fails before the data reaches disk. Not that it happens often.
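
For reference, the change discussed in this thread is made in /etc/exports on the NFS server. The fragment below is only a sketch: the export path /data and the 192.168.1.0 private network are hypothetical, since the thread gives neither the exported directory nor the cluster's addressing.

```
# /etc/exports on the NFS server (path and network are assumptions)

# sync: the server replies only after the data is committed to stable
# storage (safer, slower)
/data  192.168.1.0/255.255.255.0(rw,sync)

# async: the server may reply before the data reaches disk (faster, but
# unwritten data is lost if the server crashes)
# /data  192.168.1.0/255.255.255.0(rw,async)
```

After editing the file, running `exportfs -ra` on the server re-exports the entries without restarting NFS.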

You may wish to consider using one of the RHEL-based distros to get 
better enterprise features and NFS performance. If you like building 
your cluster from scratch, explore CAOS, TAO, or White Box Linux; 
otherwise use something like ROCKS (www.rocksclusters.org), which 
automates most of the work for you.

Scalable Systems

Jack Chen wrote:
> Hi Laurence,
> Thanks for the suggestion.  Changing the export from sync to async made 
> a HUGE difference.
> The same job finished in 252 seconds, compared with 10407 seconds before
> the change.
> The sync option is the default export setting.
> Jack
> Laurence Liew wrote:
>> hi,
>> Try adding async to the NFS export options.
>> It speeds up the IO on our RHEL V3 cluster by an order of magnitude,
>> though I am not too sure the RH9 kernel and NFS support async.
>> laurence
>> Jack Chen wrote:
>>> Hi all,
>>> I'm not sure if this is the right place to post this question.  If it
>>> is not, please tell me where's the best place to get help on this,
>>> thanks..
>>> We recently built an 8-node PC Linux cluster running RedHat 9 (kernel:
>>> 2.4.20-8smp #1 SMP).  We use this system to run EPA's CMAQ
>>> photochemical grid model.  I have installed the latest MPICH 1.2.6
>>> with Portland Group Compiler (5.2-1) using ssh.  Everything worked
>>> fine with the mpi example programs (cpi, pi3p, etc.) and 'make testing'.
>>> However, when I tried to run any program that writes output to other
>>> NFS-mounted drives, I got a very long delay.  I'm not sure where the problem
>>> is.  I know the NFS automount is working fine because if I start the
>>> job with just one processor (mpirun -np 1), I don't experience the
>>> slowdown.
>>> For example: If I start the job on master node using 4 processors 
>>> (mpirun -np 4)
>>> and write to the master node (master2 0),
>>> PIxxx file:
>>> master2 0 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
>>> node103 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
>>> node103 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
>>> node104 1 /master2/home/chenj/CMAQ_v4.3/Run/cctm/CCTM_e2a
>>> the run takes 168 sec
>>> If I start the same job but write the output to any other nfs mounted
>>> drives besides the master node, the job will be extremely slow.  In
>>> this case the same job took 10962 sec.
>>> I have tried to mount the drive using different parameters (rw,soft
>>> and rw,hard,bg,intr,noac) and increased the number of nfsd daemons
>>> from 8 to 16 on the NFS server, but nothing changed.
>>> If you have any idea on what is going on, please help!
>>> Any help or suggestions are greatly appreciated.
>>> Jack
>>>  Jack Chen
>>>  Laboratory for Atmospheric Research
>>>  Dept.of Civil & Environmental Engineering
>>>  Washington State University
>>>  Pullman, WA 99164-2910
>>>  509.335.5738
>>>  509.335.7632 (FAX)