[Beowulf] first cluster

Reuti reuti at staff.uni-marburg.de
Tue Jul 13 05:07:27 PDT 2010


On 13.07.2010 at 06:29, Rahul Nabar wrote:

> On Mon, Jul 12, 2010 at 2:02 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:
>> Consider disk for:
>> 
>> A) swap space (say, if the user programs are large,
>> or you can't buy a lot of RAM, etc);
> 
> Out of curiosity, is there the possibility of running a "swapless"
> compute-node? I mean most HPC nodes already have fairly generous RAM
> and once swapping to disk starts performance is degraded (severely?).
> Are there non-problem scenarios where one does desire swapping to
> disks?

As already said: yes, it's possible, and you can even switch swap on and off during normal operation (`swapon` and `swapoff`). The disadvantage, of course, is that when the system runs out of memory, the OOM killer will look for an eligible process to kill in order to free up some memory. As you mentioned, the application should fit into the physically installed RAM, and you may just want 2 GB or so of swap as a last resort, to swap out parts of the OS that are currently not in use.
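
To check from a script whether a node is currently running swapless, one option is to read the `SwapTotal` and `SwapFree` fields from `/proc/meminfo`. The following is only a minimal sketch: the function names `parse_meminfo` and `swap_summary` are made up for illustration, and it assumes the usual Linux `Name: value kB` layout of that file.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values (kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[name.strip()] = int(fields[0])
    return info


def swap_summary(text):
    """Summarize swap state from the contents of /proc/meminfo (hypothetical helper)."""
    info = parse_meminfo(text)
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {
        "swap_enabled": total > 0,  # False on a swapless node
        "total_kb": total,
        "used_kb": total - free,
    }

# Usage on a Linux node (not executed here):
#   with open("/proc/meminfo") as fh:
#       print(swap_summary(fh.read()))
```

A node where `swap_enabled` is false relies entirely on the OOM killer once memory is exhausted, so a check like this could be part of a node health script.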

You may want more swap if you want to set up some kind of preemption using a job scheduler. E.g. GridEngine can suspend a low-priority job once an urgent one comes in, but resources like memory are not freed automatically (the job is still on the node; you would need some kind of checkpointing to free the node completely). If you set up the queuing system so that all running applications fit into physical memory, swapping out the suspended application is a one-time cost and won't affect the ongoing computation.
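
As a rough back-of-the-envelope check for the preemption case, the swap needed is whatever part of the combined memory demand does not fit into RAM, provided the running jobs alone do fit. This is only a sketch: the function name `swap_needed_gb` and the 1 GB default OS reserve are assumptions of mine, and it ignores details such as shared pages and what the kernel actually chooses to page out.

```python
def swap_needed_gb(ram_gb, running_gb, suspended_gb, os_reserve_gb=1.0):
    """Estimate the swap (GB) needed to hold suspended jobs during preemption.

    Assumes the running jobs plus an OS reserve must stay in physical RAM;
    anything beyond RAM is taken to be paged out from the suspended jobs.
    """
    if running_gb + os_reserve_gb > ram_gb:
        raise ValueError("running jobs do not fit into physical RAM")
    demand = running_gb + suspended_gb + os_reserve_gb
    return max(0.0, demand - ram_gb)


# Example: 16 GB node, urgent job needs 12 GB, suspended job holds 10 GB.
print(swap_needed_gb(16, 12, 10))  # prints 7.0
```

If the result is larger than the configured swap, the suspended job can't be paged out completely, and the OOM killer may step in.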


>> D) Most current node chassis have hot-swappable disks, not hard to replace,
>> in case of failure.
> 
> Hot-swappable disks are great on head nodes but on compute-nodes
> whenever I hear "redundant" or "hot swappable", I see it as an
> inefficiency. Or an excessive feature that could be traded off for a
> cost saving. (of course, sometimes hands are tied if the server comes
> with that feature "standard") What do others think?

Correct. Often it's included in the chassis by default, although you can't make much use of it when you use e.g. RAID0 on a node for performance reasons and have to reinstall the node anyway after a disk failure. It just saves you from having to switch off the node completely and remove it from the rack to access its inner parts. But there might be chassis where you can access the drive from the front without hot-swap capability, but with a big label: don't remove during operation.

-- Reuti

> 
> -- 
> Rahul