[Beowulf] WRF model on linux cluster: Mpi problem

Fri Jul 1 00:38:26 PDT 2005

Yes,

I will remove openmosix. 
I patched the kernel with openmosix because I also used the cluster for
other, smaller applications, so the load balancing was useful to me.

I already tried to switch off openmosix with

> service openmosix stop

but nothing seems to change...

Do you think it would make a difference to remove it completely, replacing
the kernel with a new one without the openmosix patch?

thanks...

federico
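
For reference, the per-timestep timings quoted further down in this thread (about 10 s on one cpu, about 3 s at best on 4 cpus, 60 s or more at worst) can be turned into speedup and parallel efficiency figures. The following is only a small sketch of that arithmetic; the function names are made up for illustration and are not part of WRF or mpich.

```python
def speedup(t_serial, t_parallel):
    """Ratio of serial to parallel time per step; values below 1 mean a slowdown."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_cpus):
    """Speedup divided by cpu count; 1.0 would be perfect scaling."""
    return speedup(t_serial, t_parallel) / n_cpus


if __name__ == "__main__":
    t_serial = 10.0   # seconds per timestep on one cpu (from the thread)
    n_cpus = 4
    # 3 s is the good case and 60 s the bad case reported for np=4.
    for label, t_parallel in (("good step", 3.0), ("bad step", 60.0)):
        s = speedup(t_serial, t_parallel)
        e = efficiency(t_serial, t_parallel, n_cpus)
        print(f"{label}: speedup {s:.2f}x, efficiency {e:.0%}")
```

With these numbers a good step is roughly 3.3 times faster than serial, while a bad step is about six times slower, which is why the overall run ends up longer than on a single node.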

On Thu, 30-06-2005 at 12:10 -0700, Michael Will wrote:
> Vincent is on target here:
> 
> If your application already uses MPI as middleware assuming
> distributed memory, then you should definitely use a beowulf-style
> setup rather than openmosix with its pseudo-shared memory model.
> 
> Look at rocks 4.0.0 http://www.rocksclusters.org/Rocks/ which
> is free and based on CentOS 4, which in turn is a free version of RHEL4.
> 
> Michael
> 
> Vincent Diepeveen wrote:
> 
> >At 02:34 PM 6/30/2005 +0200, Federico Ceccarelli wrote:
> >  
> >
> >>Thanks for your answer, Vincent,
> >>
> >>my network cards are Intel Pro 1000, Gigabit.
> >>
> >>Yes, I ran a 72h (real time) simulation that took 20h on 4 cpus... same
> >>behaviour...
> >>
> >>I'm thinking about a bandwidth problem...
> >>
> >>...maybe due to a hardware failure of some network card, or of the switch
> >>(3com Baseline switch 2824).
> >>
> >>Or the pci-risers for the network cards (I have a 2 unit rack, so I
> >>cannot mount the network cards directly in the pci slots)...
> >>    
> >>
> >
> >Because gigabit cards have such horrible one-way ping-pong latencies
> >compared to the high-end cards (myri, dolphin, quadrics and, relatively
> >speaking, also infiniband), the pci bus is not your biggest problem here.
> >
> >The specifications of the card are so limited that the pci bus is not
> >the problem at all.
> >
> >There are many benchmarks out there for testing this. You should try a
> >one-way ping-pong test.
> >
> >By the way, the reason I run neither openmosix nor similar single-image
> >systems is that they have such an ugly effect on latencies, and
> >the way they page shared-memory communication between nodes is really
> >slow and bad for this type of software. There is also something called
> >OpenSSI, which is under pretty active development. It has the same problem.
> >
> >Vincent
> >
> >  
> >
> >>Have you experienced problems with pci-risers?
> >>
> >>Can you suggest a bandwidth benchmark?
> >>
> >>thanks again...
> >>
> >>federico
> >>
> >>On Thu, 30-06-2005 at 12:44 +0200, Vincent Diepeveen wrote:
> >>    
> >>
> >>>Hello Federico,
> >>>
> >>>Hope you can find contacts among colleagues.
> >>>
> >>>A few questions.
> >>>  a) what kind of interconnect does the cluster have (network cards, and
> >>>     which type)?
> >>>  b) if you run a simulation that takes a few hours instead of a few
> >>>     seconds, do you get the same speed difference?
> >>>
> >>>I see the program is pretty big for open source numerical software, about
> >>>1.9MB of fortran code, so it is a bit time-consuming to figure out for
> >>>someone who isn't a meteorological expert.
> >>>
> >>>E:\wrf>dir *.f* /s /p
> >>>..
> >>>     Total Files Listed:
> >>>             141 File(s)      1,972,938 bytes
> >>>
> >>>Best regards,
> >>>Vincent
> >>>
> >>>At 06:56 PM 6/29/2005 +0200, federico.ceccarelli wrote:
> >>>      
> >>>
> >>>>Hi!
> >>>>
> >>>>I would like to get in touch with people running numerical meteorological
> >>>>models on a linux cluster (16 cpus), distributed memory (1GB per node),
> >>>>diskless nodes, Gigabit lan, mpich and openmosix.
> >>>>
> >>>>I'm trying to run the WRF model, but the mpi version parallelized on 4, 8,
> >>>>or 16 nodes runs slower than the single-node one! It runs correctly, but
> >>>>so slowly...
> >>>>When I run wrf.exe on a single processor, the cpu time for every timestep
> >>>>is about 10s for my configuration.
> >>>>
> >>>>When I switch to np=4, 8 or 16, the cpu time for a single step is sometimes
> >>>>faster (as it always should be, for example 3s for 4 cpus), but often it is
> >>>>much slower (60s and more!). The overall time of the simulation is
> >>>>longer than for the single-node run...
> >>>>
> >>>>Has anyone experienced the same problem?
> >>>>
> >>>>thanks in advance to everybody...
> >>>>
> >>>>federico
> >>>>
> >>>>
> >>>>
> >>>>Dr. Federico Ceccarelli (PhD)
> >>>>-----------------------------
> >>>>    TechCom snc
> >>>>Via di Sottoripa 1-18
> >>>>16124 Genova - Italia
> >>>>Tel: +39 010 860 5664
> >>>>Fax: +39 010 860 5691
> >>>>http://www.techcom.it
> >>>>
> >>>>_______________________________________________
> >>>>Beowulf mailing list, Beowulf at beowulf.org
> >>>>To change your subscription (digest mode or unsubscribe) visit
> >>>>http://www.beowulf.org/mailman/listinfo/beowulf
> 
>
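
Following up on the one-way ping-pong test suggested in the quoted reply above, here is a minimal sketch of what such a test measures. It is an illustration under loud assumptions: it uses plain TCP sockets rather than MPI, it runs over loopback by default, all function names are hypothetical, and one-way latency is only estimated as half the round-trip time. It is not a replacement for a proper MPI-level benchmark.

```python
import socket
import threading
import time


def echo_server(listener):
    """Accept one connection and echo every byte back until the peer closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def recv_exact(sock, nbytes):
    """Read exactly nbytes from sock (TCP may deliver a message in pieces)."""
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        remaining -= len(chunk)


def ping_pong(host, port, msg_size=1, rounds=1000):
    """Return the mean round-trip time in seconds for msg_size-byte messages."""
    payload = b"x" * msg_size
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            sock.sendall(payload)        # ping
            recv_exact(sock, msg_size)   # pong
        elapsed = time.perf_counter() - start
    return elapsed / rounds


def run_local_demo(msg_size=1, rounds=200):
    """Loopback self-test: start the echo server in a thread and ping it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()
    rtt = ping_pong("127.0.0.1", port, msg_size, rounds)
    server.join(timeout=5)
    listener.close()
    return rtt


if __name__ == "__main__":
    for size in (1, 1024, 16384):
        rtt = run_local_demo(size)
        # Half the round trip approximates the one-way latency.
        print(f"{size:6d} bytes: one-way ~{rtt / 2 * 1e6:8.1f} us, "
              f"~{2 * size / rtt / 1e6:8.2f} MB/s")
```

To test a real link, the echo half would run on one node and `ping_pong` on another, with that node's hostname in place of 127.0.0.1. Small messages expose latency and large ones expose bandwidth, which is the distinction the thread is circling around.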