[Beowulf] OS for 64 bit AMD

Laurence Liew laurence at scalablesystems.com
Thu Mar 31 17:58:01 PST 2005


I would like to expand on Joe's argument for a commercially supported 
distro for *production* sites.

Most of our customers run expensive commercial applications like Fluent, 
Cadence, etc., which are CERTIFIED to run on, say, SLES9 or RHEL3/4. 
Often the costs of these apps are much, much more than a commercial 
distro like RHEL.

What the cluster IT admin wants is that whatever he puts on his 
cluster is known to work and is supported by the expensive software 
he is running. He knows he can call their support line and request help.

While we all know Debian runs fine, CentOS runs fine, and FC2,3,X will 
run fine, most cluster IT admins basically want to do their job 9am 
- 5pm and go home sleeping soundly, knowing that if anything breaks 
or does not work they can call/ask for help and the vendor MUST 
support them because they paid for it.

Most of us on this list can support a cluster running on a 
non-commercial OS, but at most commercial and a large number of 
academic sites, the IT admins or researchers just want to get their 
job done and research done. They don't want to bother or care much 
about the OS, which is most of the time a small fraction of the cost 
of the cluster.

If you look at Red Hat's HPC pricing, at US$79 per node (before any 
discount), it is a fraction of the cost of a 2-way server. If you have 
a 128-node cluster (~US$300-500K), your electrical/heating/support costs 
would be much, much more than the US$10,112/year for RHEL HPC Edition.
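
For anyone who wants to redo the arithmetic with their own numbers, here 
is a rough sketch. It only uses the figures quoted above (US$79 per node, 
128 nodes, a cluster costing roughly US$300-500K); the function and 
constant names are made up for illustration, and it compares a yearly 
subscription against a one-time purchase price, so treat the percentages 
as a first-year ballpark only.

```python
# Back-of-envelope check of the per-node pricing quoted in this post.
# All names here are illustrative; the numbers come from the post itself.

PRICE_PER_NODE_USD = 79                        # RHEL HPC list price per node per year
NODES = 128                                    # cluster size used in the example
CLUSTER_COST_RANGE_USD = (300_000, 500_000)    # rough hardware cost range


def annual_license_cost(nodes: int, price_per_node: int) -> int:
    """Total yearly subscription cost for the whole cluster."""
    return nodes * price_per_node


def license_share(license_cost: int, cluster_cost: int) -> float:
    """Yearly subscription cost as a fraction of the cluster purchase price."""
    return license_cost / cluster_cost


cost = annual_license_cost(NODES, PRICE_PER_NODE_USD)
low, high = CLUSTER_COST_RANGE_USD

print(f"Yearly subscription: US${cost:,}")
print(f"Share of a US${low:,} cluster: {license_share(cost, low):.1%}")
print(f"Share of a US${high:,} cluster: {license_share(cost, high):.1%}")
```

With the figures above this works out to US$10,112 per year, or somewhere 
around 2-3.4% of the hardware purchase price.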

Saving US$10K to buy an additional 2 - 3 nodes just does not make sense 
when you are running a large production system supporting expensive 
hardware (SANs, tape backups) and expensive commercial software.

All of our commercial customers and most of our academic customers 
recognise the value of a commercial distro and pay for it willingly (or 
unwillingly)... but at least they know who they can choke when things go 


Joe Landman wrote:
> Commercial support of apps, drivers, connectivity.  Corporate IT staff 
> are (unless they are empowered), unlikely to support things they are 
> unfamiliar with, or when they have go/no-go decision authority, unlikely 
> to give the go-ahead to a distro that does not have a 1800-help-me 
> number attached to it.  For academic staff, they will likely pick the 
> distro they are comfortable with, or one that they see lots of cluster 
> people using.
> At the end of the day, the cluster admin is going to be asked whether or 
> not they trust their production computing system to distribution X. 
> Distribution X needs to support all their mix of stuff, which likely 
> supports Redhat, SuSE, and possibly a third.  Anything else other than 
> those and they are on their own with their support community.
> It is awfully lonely using Stampede Linux on a production cluster, and 
> running into a problem with IB, when it comes time to asking a question 
> and getting help/support.
> Jamie Rollins wrote:
>> Any decent distro should support kernel 2.6 with amd64.  But can some one
>> give me one good reason why you would use anything other than a
>> streamlined distro like Debian?  Why pay for all the bloat in something
>> like redhat when your nodes are probably going to be running a single
>> process anyway?
>> jamie.
>> On Thu, 31 Mar 2005, Joe Landman wrote:
>>> Actually Redhat now has HPC pricing per node.  There are other good
>>> reasons to look elsewhere for HPC distributions though, specifically due
>>> to their lack of good high performance/scalable per-node file system.
>>> SuSE at least makes XFS and JFS available, and you can build/install a
>>> system with these.  Redhat prefers that you use ext3.  Another issue for
>>> the RHEL3 were the ancient kernels with many backports of advanced
>>> functionality from modern kernels.  Additionally adding modules for new
>>> hardware support into their boot process is a minor nightmare...
>>> Chris Dagdigian wrote:
>>>> I second Joe's comments.
>>>> All of our Opteron systems run Suse 9.2 by default and we use Centos-4
>>>> for "Redhat" compatible functionality when required since Redhat has
>>>> explicitly chosen to price themselves out of the cluster market for
>>>> everything except 2-way 32bit boxes.
>>>> Suse 9.1/9.2 on Opteron and Suse Enterprise Linux (SLES 8/9) on Itanium
>>>> Systems (meaning our SGI Altix) have been extremely stable and useful in
>>>> our work. Highly recommended.
>>>> -Chris
>>>> Joe Landman wrote:
>>>>> Hi Mike:
>>>>>  Opterons will do better with a 2.6 kernel (2.6.9 and higher).  If
>>>>> you are going to use RHEL, you might want to look at Rocks (RHEL3
>>>>> based) or Warewulf which should be able to sit atop RHEL4.  If you
>>>>> want to use a Redhat work-alike, you might want to look closely at
>>>>> Centos4.
>>>>>  I am sure others will take issue with this, but I would strongly
>>>>> advise against using a rolling beta OS (FC-x) as the basis for a
>>>>> production cycle machine.  If it is a purely experimental cluster, go
>>>>> for it.  If it is supposed to provide cycles to a wide group, you
>>>>> might look more closely at a supported/supportable distribution.
>>>>>  We have had good luck with SuSE 9.x (x>=1), RHEL3, CentosX on
>>>>> clusters using a variety of meta-distributions (warewulf, Rocks,
>>>>> others).  Most of our customers seem to prefer the RHEL series, so we
>>>>> tend to work with that more than others, but YMMV.
>>>>> joe
>>>>> Mike Davis wrote:
>>>>>> What OSes are Opteron clusters out there running? Is anyone running
>>>>>> FC2 on opterons?
>>>>>> I'm looking at opterons for our next cluster, but I'm not sure about
>>>>>> what OS to run. Thus far we've been with RH and or RHAS. But, the
>>>>>> next cluster will be big and I'm just not sure what we should run.
>>>>>> Mike Davis
>>>>>> _______________________________________________
>>>>>> Beowulf mailing list, Beowulf at beowulf.org
>>>>>> To change your subscription (digest mode or unsubscribe) visit
>>>>>> http://www.beowulf.org/mailman/listinfo/beowulf
>>> -- 
>>> Joseph Landman, Ph.D
>>> Founder and CEO
>>> Scalable Informatics LLC,
>>> email: landman at scalableinformatics.com
>>> web  : http://www.scalableinformatics.com
>>> phone: +1 734 786 8423
>>> fax  : +1 734 786 8452
>>> cell : +1 734 612 4615

Laurence Liew, CTO		Email: laurence at scalablesystems.com
Scalable Systems Pte Ltd	Web  : http://www.scalablesystems.com
(Reg. No: 200310328D)
7 Bedok South Road		Tel  : 65 6827 3953
Singapore 469272		Fax  : 65 6827 3922
