[Beowulf] An annoying MPI problem

Joe Landman landman at scalableinformatics.com
Thu Jul 10 08:02:12 PDT 2008


Lombard, David N wrote:

>> I'll try all the usual things (reduce the optimization level, etc).
>> Sage words of advice (and clue sticks) welcome.
> 
> Not trying to sound like an ad...
> 
> The currently shipping Intel Trace Collector and Analyzer (7.1), includes
> message correctness checking.  An option is available that adds a
> library to an Intel MPI build that checks messages during the run.
> You can then view any errors it found in the Intel Trace Analyzer.
> 
> This may find there's a problem that has only just started to trip the
> code up.  I certainly have welts from those; I suspect others do too.

Actually, Intel MPI and its related tools are among the things we want
to try.  The user may be open to that (especially if it is less painful
than the alternative).

We now have reliable, functional execution on multiple machines when
running without the shared-memory (sm) and InfiniBand (ib) transports.
A new code drop is coming, so we have to wait on that.  Once we have it,
we'll be doing more testing.

Joe


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615


