[Beowulf] WRF model on linux cluster: Mpi problem
Federico Ceccarelli
federico.ceccarelli at techcom.it
Mon Jul 4 12:13:39 PDT 2005
Hi,
I ran the Pallas benchmark after removing openMosix; here are the
PingPong and PingPing results for two processes.
What do you think of them?
Why does the bandwidth rise and fall so many times as the message size
(#bytes) grows?
Thanks again,
federico
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
#---------------------------------------------------
# Date : Mon Jul 4 15:20:32 2005
# Machine : i686
# System : Linux
# Release : 2.4.26-om1
# Version : #3 Wed Feb 23 04:32:26 CET 2005
#
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# PingPong
# PingPing
# Sendrecv
# Exchange
# Allreduce
# Reduce
# Reduce_scatter
# Allgather
# Allgatherv
# Alltoall
# Bcast
# Barrier
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
      #bytes  #repetitions       t[usec]    Mbytes/sec
           0          1000        109.00          0.00
           1          1000        109.43          0.01
           2          1000        138.81          0.01
           4          1000        238.29          0.02
           8          1000        246.77          0.03
          16          1000        246.26          0.06
          32          1000        273.79          0.11
          64          1000        250.73          0.24
         128          1000        250.98          0.49
         256          1000        250.73          0.97
         512          1000        250.74          1.95
        1024          1000        250.23          3.90
        2048          1000        251.99          7.75
        4096          1000        256.01         15.26
        8192          1000        500.27         15.62
       16384          1000        785.51         19.89
       32768          1000      15087.75          2.07
       65536           640      33256.60          1.88
      131072           320       5399.92         23.15
      262144           160      95577.23          2.62
      524288            80     102396.36          4.88
     1048576            40     529898.21          1.89
     2097152            20      89600.72         22.32
     4194304            10     794578.55          5.03
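
(For reference, a minimal sketch in C of the pattern IMB's PingPong
times, assuming exactly two ranks: rank 0 sends and rank 1 echoes the
message back, so t[usec] is half the averaged round trip and
Mbytes/sec is (bytes/2^20)/(t/10^6). This is only an illustration with
an arbitrary message size, not the IMB source.)

/*
 * PingPong sketch: rank 0 sends, rank 1 echoes; the reported time is
 * half the averaged round trip.  Size and loop count are arbitrary.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, i, nbytes = 4096, reps = 1000;   /* arbitrary example size */
    char *buf;
    double t0, t1, oneway;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        oneway = (t1 - t0) / reps / 2.0;        /* one-way time in sec */
        printf("%d bytes: %.2f usec  %.2f Mbytes/sec\n", nbytes,
               oneway * 1e6, (nbytes / 1048576.0) / oneway);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}
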
#---------------------------------------------------
# Benchmarking PingPing
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
      #bytes  #repetitions       t[usec]    Mbytes/sec
           0          1000         94.45          0.00
           1          1000         94.92          0.01
           2          1000         94.07          0.02
           4          1000         95.82          0.04
           8          1000         95.33          0.08
          16          1000        105.89          0.14
          32          1000        117.57          0.26
          64          1000        120.45          0.51
         128          1000        124.39          0.98
         256          1000        136.02          1.79
         512          1000        171.28          2.85
        1024          1000        185.80          5.26
        2048          1000        238.80          8.18
        4096          1000        256.54         15.23
        8192          1000        381.98         20.45
       16384          1000      13932.86          1.12
       32768          1000      42027.47          0.74
       65536           640      45166.66          1.38
      131072           320       9002.89         13.88
      262144           160     194274.79          1.29
      524288            80     773914.26          0.65
     1048576            40      85866.48         11.65
     2097152            20     839526.30          2.38
     4194304            10     310144.00         12.90
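
(PingPing differs in that both ranks transmit at the same time, so the
link carries traffic in both directions at once and the per-message
time is not halved. A sketch of that pattern, under the same caveats:
arbitrary size, two ranks assumed, not the IMB source.)

/*
 * PingPing sketch: each rank posts a non-blocking send and then
 * receives, so the two messages cross on the wire.  Unlike PingPong,
 * the measured per-message time is not divided by two.
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, peer, i, nbytes = 4096, reps = 1000; /* arbitrary size */
    char *sbuf, *rbuf;
    double t0, t1, t;
    MPI_Request req;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;                    /* assumes exactly two ranks */
    sbuf = malloc(nbytes);
    rbuf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        MPI_Isend(sbuf, nbytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &req);
        MPI_Recv(rbuf, nbytes, MPI_BYTE, peer, 0, MPI_COMM_WORLD, &st);
        MPI_Wait(&req, &st);
    }
    t1 = MPI_Wtime();

    if (rank == 0) {
        t = (t1 - t0) / reps;                    /* per message, sec */
        printf("%d bytes: %.2f usec  %.2f Mbytes/sec\n", nbytes,
               t * 1e6, (nbytes / 1048576.0) / t);
    }
    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}
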
On Mon, 2005-07-04 at 08:48 +0100, John Hearns wrote:
> On Fri, 2005-07-01 at 09:38 +0200, Federico Ceccarelli wrote:
> > yes,
> >
> > I will remove openMosix.
> > I patched the kernel with openMosix because I also use the cluster
> > for other, smaller applications, so the load balancing was useful
> > to me.
> >
> > I have already tried to switch off openMosix with
> >
> > > service openmosix stop
> From my small amount of openMosix experience, that should work.
>
> Have you used the little graphical tool to display the loads on each
> node? (can't remember the name).
>
> Anyway, I go along with the earlier advice to look at the network card
> performance.
> Do an lspci -vv on all nodes to check that your riser cards are running
> at full speed.
>
> What I would do is break this problem down.
> Start by running the Pallas benchmark on one node, then two, then
> four, etc. See if a pattern develops.
> Do the same with your model, if it is possible to cut down the
> problem size: run on one node (two processors), then two, then four.