[Beowulf] Can one Infiniband net support MPI and a parallel filesystem?

Gus Correa gus at ldeo.columbia.edu
Thu Aug 14 08:58:20 PDT 2008


Hello Chris and list

Chris Samuel wrote:

>----- "Gus Correa" <gus at ldeo.columbia.edu> wrote:
>
>  
>
>>One reason not mentioned is serial programs.
>>Well a cluster is to run parallel jobs.
>>    
>>
>
>Hmm, a cluster is to run HPC codes, there are plenty of
>legitimate single CPU codes to solve embarrassingly
>parallel problems! :-)
>
>  
>
We seem to agree on this.

An example:
Millions of cross-correlations of micro-earthquake seismograms (time series),
used to locate the foci precisely and to produce a high-resolution map of
geologic faults and potential hazard in California,
took several days using shared nodes.
The code is serial, and the size of each calculation doesn't justify parallelism,
but the large number of them requires massive computational resources.
I wouldn't bother the people who wrote the program to parallelize it
(in the sense of using MPI or OpenMP).
The script that launched the tons of serial jobs was the
"embarrassingly parallel" component of it.
Some people would say it is a waste to run this type of program
on a cluster with Myrinet.
If we had a farm of serial computers we would have used it,
but we don't have one.
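
To make the "launcher script" idea concrete, here is a minimal sketch in Python,
not the script we actually used. It runs one independent serial process per pair
of traces and collects the outputs. Everything named here is hypothetical:
build_command, run_pair and run_all are illustration-only helpers, and the
"serial program" is a stand-in one-liner that just echoes its two arguments.
On a real cluster each command would normally be handed to the batch scheduler
rather than run locally like this.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def build_command(trace_a, trace_b):
    # Hypothetical stand-in for the real serial cross-correlation program:
    # a tiny Python one-liner that only echoes the two trace names.
    echo = "import sys; print(sys.argv[1], sys.argv[2])"
    return [sys.executable, "-c", echo, trace_a, trace_b]


def run_pair(pair):
    # Each pair is an independent serial job; no communication between jobs.
    result = subprocess.run(
        build_command(*pair), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def run_all(traces, max_workers=4):
    # Every unordered pair of traces becomes one serial job.
    pairs = list(combinations(traces, 2))
    # Threads only wait on the child processes; the work runs in the children.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_pair, pairs))
```

The only "parallelism" is in the launcher: the serial code itself is untouched,
which is why nobody had to rewrite it with MPI or OpenMP.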

>[...]
>  
>
>>and an average of about 75% use of its maximum capacity
>>    
>>
>[..]
>  
>
>>I couldn't find usage data of other public, academic, or industry 
>>machines to compare.
>>    
>>
>
>It appears we've averaged almost 77% utilisation
>since the beginning of 2004 (when our current usage
>system records begin).
>
>  
>
Thank you very much for the data point!

I've insisted here that above 70% utilization is very good,
given the random nature of demand and of jobs in the queues in academia, etc.
However, some folks would want more than 90% efficiency to be happy.
I had to resort to the Second Law of Thermodynamics,
compare our efficiency with Carnot cycles,
with the efficiency of thermal engines, of biological systems,
of the atmosphere and ocean heat transport, etc, to make my point,
and the theoretical argument almost jeopardized my job ... :)


>cheers,
>Chris
>  
>

Cheers,
Gus Correa

-- 
---------------------------------------------------------------------
Gustavo J. Ponce Correa, PhD - Email: gus at ldeo.columbia.edu
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------



