[Beowulf] What services do you run on your cluster nodes?
Eric Thibodeau
kyron at neuralbs.com
Mon Sep 22 12:44:24 PDT 2008
Ashley Pittman wrote:
> On Mon, 2008-09-22 at 14:56 -0400, Eric Thibodeau wrote:
>
>>> My question is this: how extreme do you go in disabling non-essential
>>> services on your cluster nodes? Do you turn off *everything* that's not
>>> absolutely necessary, or do you leave some things running to make
>>> administration easier?
>>>
>
> If it were up to me I'd turn *everything* possible off except sshd and
> ntp. The problem, however, is the maintenance cost of doing this: it's
> fine if you've only got one cluster and one app, but as soon as you try
> to support multiple users on multiple distributions, the cost of ensuring
> everything is shut down on all of them skyrockets and it becomes easier
> to stick with the status quo :(
>
O_o...you mean you're still using local OS installations ... ew!
>
>> Everything is turned off and, most of the time, a quick glance at
>> ganglia brings out problems. Simple scripts can be built to perform
>> cyclic checks on the nodes and would be less disruptive IMHO.
>>
>>> I'm curious to see how everyone else has their cluster(s) configured
Well, while we're at it, here are my node's services (I built this one
3 years ago; the new images are different now):
thinkbig1 ~ # rc-status
Runlevel: unionfs
 ntp-client                [ started ]
 ntpd                      [ started ]
 sshd                      [ started ]
 acpid                     [ started ]
 gmond                     [ started ]
 portmap                   [ started ]
 autofs                    [ started ]
 nfsmount                  [ started ]
 netmount                  [ started ]
 vixie-cron                [ started ]
 local                     [ started ]
Runlevel: UNASSIGNED
 fsck                      [ started ]
 rpc.statd                 [ started ]
 udev-postmount            [ started ]
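
For what it's worth, the kind of simple cyclic check mentioned earlier in the thread could look roughly like the sketch below. It only parses rc-status style text and reports started services that are not on an allowlist. The allowlist contents, the function names and the sample text are all assumptions made up for illustration, and real rc-status output may carry colour codes or other fields that this regex does not handle.

```python
import re

# Hypothetical allowlist (an assumption, not anyone's real policy):
# the bare minimum from the thread plus time sync and monitoring.
ALLOWED = {"sshd", "ntpd", "ntp-client", "gmond"}

# Matches "<service> [ <state> ]". Whitespace between the name and the
# bracket may include a newline, so flattened output parses as well.
STATUS_RE = re.compile(r"(\S+)\s*\[\s*(\w+)\s*\]")


def started_services(rc_status_text):
    """Return the set of service names reported as 'started'."""
    return {
        name
        for name, state in STATUS_RE.findall(rc_status_text)
        if state == "started"
    }


def unexpected_services(rc_status_text, allowed=ALLOWED):
    """Return a sorted list of started services that are not in 'allowed'."""
    return sorted(started_services(rc_status_text) - set(allowed))


# Made-up sample in the same shape as the listing above.
SAMPLE = """\
Runlevel: unionfs
 ntp-client   [ started ]
 sshd         [ started ]
 acpid        [ started ]
 vixie-cron   [ stopped ]
"""

if __name__ == "__main__":
    for name in unexpected_services(SAMPLE):
        print("unexpected service running:", name)
```

In practice the text would come from running rc-status on each node (over ssh from the head node, say) on a cron schedule, and only the nodes with a non-empty result would need a look.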
>> The only actual research I found on OS interference in HPC is titled
>> "A measurement and simulation methodology for parallel computing
>> performance studies" by Matthew Joseph Sottile. I would be curious to
>> know if anyone else has dipped into the subject and come up with
>> conclusive results.
>>
>
> At medium to large scales it becomes hugely important,
> http://www.sc-conference.org/sc2003/paperpdfs/pap301.pdf
>
> Also look at "whatelse" from http://www.c3.lanl.gov/pal/software.shtml
>
> Ashley Pittman.
>
Hey, thanks for the links. I've coded my own whatelse (flimsy, but it does
the trick) and I'll read the article.
Eric