[Beowulf] Intel MPI 2.0 mpdboot and large clusters, slow to start up, sometimes not at all

Bill Bryce bill at platform.com
Fri Sep 29 06:38:35 PDT 2006



Hi Mark, 

We are going through a similar experience at one of our customer sites.
They are trying to run Intel MPI on more than 1,000 nodes.  Are you
experiencing problems starting the MPD ring?  We have noticed that
starting the ring takes a very long time, especially when the node count
is large, and sometimes it fails to start at all.

Regards, 

Bill.

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of Mark Hahn
Sent: Friday, September 29, 2006 8:47 AM
To: Clements, Brent M (SAIC)
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] Intel MPI 2.0 mpdboot and large clusters, slow
to start up, sometimes not at all

> Does anyone have any experience running Intel MPI over 1000 nodes and
> do you have any tips to speed up task execution? Any tips to solve this
> issue?

it's not uncommon for someone to write naive select() code that fails
when the number of open file descriptors hits 1024...  yes, even in 
the internals of major MPI implementations.
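
To make the select() point concrete: select() can only watch descriptors
numbered below FD_SETSIZE, which is commonly 1024, so a process holding
one socket per node runs into trouble at roughly 1,000 nodes.  Below is
a minimal sketch in Python of waiting on sockets without that limit.  It
is not taken from any MPI implementation; wait_readable is a
hypothetical helper name, and the sketch assumes a platform where the
standard selectors module can use poll, epoll or kqueue rather than
falling back to select().

```python
import selectors
import socket


def wait_readable(socks, timeout=1.0):
    """Return the sockets in `socks` that become readable within `timeout` seconds.

    Hypothetical helper for illustration. DefaultSelector picks the best
    mechanism the platform offers (epoll, kqueue, poll, ...) and only
    falls back to select() when nothing else is available.
    """
    sel = selectors.DefaultSelector()
    try:
        for s in socks:
            sel.register(s, selectors.EVENT_READ)
        # select() here is the selector's method, not the select(2) call.
        return [key.fileobj for key, _events in sel.select(timeout)]
    finally:
        sel.close()


# Small demonstration with a local socket pair.
left, right = socket.socketpair()
right.send(b"x")                 # make `left` readable
ready = wait_readable([left])    # `left` should be reported as ready
left.close()
right.close()
```

The same idea applies in C: replace select() with poll() or epoll, or at
least check each descriptor against FD_SETSIZE before calling FD_SET.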
_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
