[Beowulf] WRF model on linux cluster: Mpi problem
diep at xs4all.nl
Thu Jun 30 05:46:36 PDT 2005
At 02:34 PM 6/30/2005 +0200, Federico Ceccarelli wrote:
>Thanks for your answer, Vincent,
>my network cards are Intel Pro 1000, Gigabit.
Aha, built-in cards.
If you want to, just forward me your test and I can compile it for you on a
Quadrics QM500 network of dual K7 machines that I built at home.
The poor PCI bus of the K7 machines limits the bandwidth
(only about 350 MB/s, whereas the cards can easily reach 1.1 GB/s on a better PCI bus),
but the latency is still great: about 2.8 us one-way pingpong when
I remove a certain PCI card. This is on very slow 66 MHz PCI (I guess it is
2 times better on PCI-X at 133 MHz).
Your gigabit cards will have perhaps 125 MB/s bandwidth, so not noticeably
worse, I suspect, as far as this benchmark is concerned. However, the latency
here is about a factor of 30 better than what your Intel cards will give in practice.
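
To put those numbers side by side, here is a rough sketch in Python of the usual "latency + size/bandwidth" estimate for one message. It is only a back-of-the-envelope model, not a measurement. The Quadrics figures are the ones quoted above. The gigabit latency is simply my "factor 30" guess multiplied out, and the names `transfer_time` and `ratio` are made up for this illustration.

```python
def transfer_time(msg_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimate the one-way time for one message: latency + size / bandwidth."""
    return latency_s + msg_bytes / bandwidth_bytes_per_s


# Figures quoted in this message (K7 machines on 66 MHz PCI).
QUADRICS = {"latency_s": 2.8e-6, "bandwidth_bytes_per_s": 350e6}
# Assumption: 125 MB/s, and latency taken as 30 times the Quadrics figure.
GIGABIT = {"latency_s": 30 * 2.8e-6, "bandwidth_bytes_per_s": 125e6}


def ratio(msg_bytes):
    """How many times slower the gigabit estimate is than the Quadrics one."""
    return transfer_time(msg_bytes, **GIGABIT) / transfer_time(msg_bytes, **QUADRICS)


for size in (8, 1024, 64 * 1024, 1024 * 1024):
    print(f"{size:>8} bytes: gigabit is about {ratio(size):.1f}x slower")
```

In this model small messages are dominated by latency (the ratio stays near 30), while large messages are dominated by bandwidth (the ratio falls toward 350/125 = 2.8). So a code that exchanges many small messages per time step would be hurt far more than the bandwidth figures alone suggest.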
So if the only problem is one-way pingpong latency, this is easy to
determine by running the same test you did.
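
For reference, a minimal sketch of what a one-way pingpong timing looks like, written in Python over plain TCP sockets. This is an assumption-laden stand-in: it is not MPI and not the Quadrics test, and as written it runs both ends on the loopback interface, so it only measures the software stack. To test the real network, the echo side would have to run on a second node. The function names are made up for this example.

```python
import socket
import threading
import time


def _recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf.extend(chunk)
    return bytes(buf)


def _echo_server(listener, msg_bytes, rounds):
    """Accept one connection and send every message straight back."""
    conn, _ = listener.accept()
    with conn:
        conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(rounds):
            conn.sendall(_recv_exact(conn, msg_bytes))


def pingpong(msg_bytes=8, rounds=1000, host="127.0.0.1"):
    """Return the mean one-way time in seconds (round-trip time / 2)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=_echo_server,
                              args=(listener, msg_bytes, rounds))
    server.start()
    payload = b"x" * msg_bytes
    with socket.create_connection((host, port)) as sock:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(rounds):
            sock.sendall(payload)
            _recv_exact(sock, msg_bytes)
        elapsed = time.perf_counter() - start
    server.join()
    listener.close()
    return elapsed / rounds / 2


if __name__ == "__main__":
    print(f"one-way time, 8 bytes: {pingpong(8) * 1e6:.1f} us")
```

Calling `pingpong` with a small message (8 bytes) gives a latency figure; calling it with a large message and dividing the size by the returned time gives a rough bandwidth figure.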
So there is at least one factor where the gigabit cards are weak (one-way
pingpong latency) compared to high-end cards, that we can easily determine the
>Yes, I did a 72h (real time) simulation that lasted 20h on 4 cpus...same
>I'm thinking about a bandwidth problem...
>....maybe due to hardware failure of some network card, or switch (3com
>-Baseline switch 2824).
>Or the PCI risers for the network card (I have a 2 unit rack so that I
>cannot mount network cards directly on the PCI slot)...
>Did you experience problems with PCI risers?
>Can you suggest a bandwidth benchmark?
>On Thu, 30-06-2005 at 12:44 +0200, Vincent Diepeveen wrote:
>> Hello Federico,
>> Hope you can find contacts with colleagues.
>> A few questions.
>> a) what kind of interconnect does the cluster have (network cards, and
>> which type?)
>> b) if you run a simulation that eats a few hours instead of a few
>> do you get the same speed outcome difference?
>> I see the program is pretty big for open source calculating software, about
>> 1.9MB of Fortran code, so it is a bit time-consuming to figure out for someone who
>> isn't a meteorological expert.
>> E:\wrf>dir *.f* /s /p
>> Total Files Listed:
>> 141 File(s) 1,972,938 bytes
>> Best regards,
>> At 06:56 PM 6/29/2005 +0200, federico.ceccarelli wrote:
>> >I would like to get in touch with people running numerical meteorological
>> >models on a linux cluster (16cpu) , distributed memory (1Gb every node),
>> >diskless nodes, Gigabit lan, mpich and openmosix.
>> >I'm trying to run the WRF model but the mpi version parallelized on 4, 8, or 16
>> >nodes runs slower than the single node one! It runs correctly but so
>> >When I run wrf.exe on a single processor the cpu time for every
>> >about 10s for my configuration.
>> >When I switch to np=4, 8 or 16 the cpu time for a single step is sometimes
>> >faster (as it always should be, for example 3 sec for 4 cpus) but often
>> >much slower (60 sec and more!). The overall time of the simulation is
>> >longer than for the single node run...
>> >Has anyone experienced the same problem?
>> >thanks in advance to everybody...
>> >Dr. Federico Ceccarelli (PhD)
>> > TechCom snc
>> >Via di Sottoripa 1-18
>> >16124 Genova - Italia
>> >Tel: +39 010 860 5664
>> >Fax: +39 010 860 5691