cluster frustrations
Patrick Geoffray
patrick at myri.com
Wed Jan 16 16:35:43 PST 2002
Joachim,
Joachim Worringen wrote:
> But they don't get it to run reliably with
> the current Linux/GM/MPICH versions which of course should run faster,
> better, nicer. I don't blame Linux or Myrinet for these problems -
Obviously, you do. Inciting another flame war?
I searched the Myricom support log and found one help ticket from
Ulrich Detert (help ticket #7197, Wed Jun 13 15:07:59 2001) with the
configuration that you describe and a piece of MPI code supposed to
trigger a malfunction. This code was run the same day with recent GM
and MPICH-GM releases and showed no problem whatsoever. I tried again
a few minutes ago with the current software, and again no problem.
The help ticket was closed Fri Sep 14 10:13:28 2001 with no reply
from the customer.
So if you really experienced problems with this machine, please
contact help at myri.com; this is the first step toward happiness.
> just want to show that even people capable of running Crays, SP-2s,
> Paragon, any kind of workstations etc. have a hard time setting up and
> maintaining a Linux cluster. And the next update is usually the next
You cannot compare Crays/SP2 with do-it-yourself Linux clusters. When
you buy a Cray or an SP(2,3), you get a machine that experts build for
you, you get software that experts install for you, and you often get
someone on-site to hold your hand for the first month or even for the
life of the machine. The only problem is that you pay a lot for that.
Linux clusters are not easy to install; it's wrong to believe they are.
Having access to the Myricom support archive, I can tell you that a
large number of problems come from customers trying the
do-it-yourself way with no cluster experience and only a Windows
background, who do not know exactly what a kernel is or how to compile
one, who believe that Red Hat is pure Linux, and who have never heard
of the MPI specs.
Not surprisingly, customers using a third party, either a big vendor
like IBM or a small one like many people on this list, where people
know what they are doing, usually have a much smoother experience.
But you still have to pay a little for that.
So, the do-it-yourself way? Why not, but if it fails, call the mechanics.
Patrick
----------------------------------------------------------
| Patrick Geoffray, Ph.D. patrick at myri.com
| Myricom, Inc. http://www.myri.com
| Cell: 865-389-8852 685 Emory Valley Rd (B)
| Phone: 865-425-0978 Oak Ridge, TN 37830
----------------------------------------------------------