[Beowulf] What services do you run on your cluster nodes?

Prentice Bisbal prentice at ias.edu
Tue Sep 23 05:36:38 PDT 2008



Gerry Creager wrote:
> Eric Thibodeau wrote:
>> Prentice Bisbal wrote:
>>> The more services you run on your cluster nodes (gmond, sendmail, etc.),
>>> the less performance is available for number crunching, but the fewer
>>> you run, the harder administration becomes. For example, if you turn off
>>> postfix/sendmail, you'll no longer get automated e-mails from your
>>> system to alert you to a problem.
>>>
>>> My question is this: how extreme do you go in disabling non-essential
>>> services on your cluster nodes? Do you turn off *everything* that's not
>>> absolutely necessary, or do you leave some things running to make
>>> administration easier?
>>>   
>> Everything is turned off and, most of the time, a quick glance at
>> ganglia brings out problems. Simple scripts can be built to perform
>> cyclic checks on the nodes and would be less disruptive IMHO.
>>> I'm curious to see how everyone else has their cluster(s) configured.
>>>   
>> The only actual research I found on OS interference in HPC is titled
>> "A measurement and simulation methodology for parallel computing
>> performance studies" by Matthew Joseph Sottile. I would be curious to
>> know if anyone else has dipped into the subject and come up with
>> conclusive results.
> 
> As a slightly OT question: I've recently heard it posited that ganglia
> induces severe communications overhead ("It's chatty") and thus
> shouldn't be used.  What's the conventional wisdom thereof?

I believe it. I remember debugging a ganglia communication problem once,
and saw LOTS of traffic with tcpdump. I would imagine that gmond
produces a decent amount of load on the node every time it polls the
system. I could be wrong on that last part, since I don't understand how
it works internally.
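
A rough way to test that guess instead of assuming it: sample the CPU time
a daemon has used at two points and divide by the elapsed time. The sketch
below is only an illustration, not anything from ganglia itself. It assumes
a Linux node with /proc mounted, the function names are made up for this
example, and you have to supply the gmond PID yourself (from pidof, for
instance).

```python
import os
import time


def cpu_ticks(stat_line):
    """Return utime + stime, in clock ticks, from one /proc/<pid>/stat line."""
    # The command name sits in parentheses and may contain spaces,
    # so split on the last ')' before breaking the rest into fields.
    fields = stat_line.rsplit(")", 1)[1].split()
    # fields[0] is the process state (field 3 of the full line), so
    # utime and stime (fields 14 and 15) land at indexes 11 and 12.
    return int(fields[11]) + int(fields[12])


def cpu_fraction(ticks_before, ticks_after, elapsed_seconds, ticks_per_second):
    """Fraction of one CPU consumed between the two samples."""
    return (ticks_after - ticks_before) / (ticks_per_second * elapsed_seconds)


def sample_process(pid, interval=60.0):
    """Measure one process over `interval` seconds (Linux only, needs /proc)."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return cpu_fraction(before, after, interval, ticks_per_second)
```

Calling sample_process() with the daemon's PID on an otherwise idle node
gives a number to argue about. It says nothing about network traffic or
about how the interruptions affect a tightly coupled job, only how much
CPU time the daemon itself consumes.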

If ganglia produces a lot of traffic and some overhead, is it really
that bad to leave postfix/sendmail running so I can receive system
e-mails?

-- 
Prentice


