[Beowulf] Security issues

Tim Cutts tjrc at sanger.ac.uk
Fri Oct 24 08:50:09 PDT 2008

On 24 Oct 2008, at 3:09 pm, Joe Landman wrote:

> Carsten Aulbert wrote:
>> Hi Jon
>> Jon Aquilina wrote:
>>> but why waste time sifting through all 26,000+ pkgs in the repos when u
>>> can have a distro with repos focused on clustering pkgs?
>> Because you might/will save time later when you hit user requests which
>> want packages which are not pre-packaged in your cluster distro.
> Allow me to expand on this.
> Some distro packaged stuff is garbage, and broken.  Perl in RHEL4  
> and RHEL5 is notoriously bad (long discussions on this on a few  
> other lists I lurk on).  The rationale for keeping it bad is  
> compatibility.  Which curiously leads to many developers building  
> their own base tools trees.

We do that to an extent, mainly so that machines running different  
OSes share a consistent Perl environment, for example.  But we  
don't do it because of breakages in the upstream distro.  If distros  
are that broken, we tend to not use them at all.  We abandoned pretty  
much all Red Hat flavours years ago for that reason.  For years, large  
parts of Red Hat were not 64-bit file aware, which was massively  
infuriating, and as you say, the kernel is in a world of its own  
(which of course leads to all sorts of fun problems with ISV software,  
which only supports Red Hat, and then doesn't work on other distros  
because it's been ported specifically to the Red Hat Broken View of  
the World).

> You can only trust the distro supplied tools so far.  Apache2 has  
> greatly improved in RHEL, and Debian/Ubuntu as compared to Apache in  
> RHEL.  PHP is ancient, as are MySQL, PostgreSQL, etc.

That's always going to happen with any distribution.  Ubuntu is,  
thanks to its 6-month release cycle, usually rather more current than  
Debian.  But it's a trivial matter, usually, if you want something  
more up to date, to grab the source package from the distro's  
development tree, and build it on the current stable release.  Indeed,  
there are public repositories (such as etch-backports) where  
communities are doing just that.  But it's easy to do yourself if you  
want finer control.

However, for things like mysql, we tend to do as you describe, and  
install the versions directly obtained from upstream.

> The issue is that any cluster distribution based upon a base  
> distribution inherits all of the underlying issues of the base.  And  
> some of those issues are really pretty annoying.  In some cases,  
> they are broken.

I can't think of any real show-stoppers in the five or so years we've  
been running Debian.  The closest we came to a major snafu there was  
when Debian made their cock-up with SSH key security.  But that was  
easy enough to put right, and fortunately we hadn't migrated to Etch  
wholesale when it came to light, so we weren't badly affected.

> This is why we tend to prefer underlying-OS insensitive systems.  As  
> long as the underlying OS works, we don't care what it is.  When it  
> doesn't work, this is when we care, and have to figure out if the  
> cost of making it work is worth the effort.  The cost is time in  
> this case.

I agree wholeheartedly with that - time is the most important cost.  I  
also try not to care too much what the underlying OS is, but I also  
want to minimise the amount of software stack maintenance I have to  
do, so I tend to ask myself the following questions of the piece of  
software I'm considering:

1)  Does it need to be installed on every machine?
2)  Is the precise version present on the machine important?
3)  Is the software being rapidly developed, and consequently likely  
to be out of date in distros?
4)  Do I have an ISV support matrix to consider?

If the answer is yes to questions 1 and 4, or no to questions 2 and 3,  
then I tend to lean towards using the distro's packaging.  If the  
answers are the opposite to those, I will tend to use a copy I build  
and maintain myself, preferably on a central NFS server so I don't  
have to synchronise it everywhere.  There's no hard-and-fast answer to  
which approach is always best; it's very dependent on the situation.
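
As a rough illustration, the four questions above could be encoded as a
simple vote count.  This is only a sketch: the Package fields, the
function names and the "more than two votes" threshold are assumptions
made up for the example, and the real decision stays a judgement call.

```python
from dataclasses import dataclass


@dataclass
class Package:
    """Answers to the four questions for one piece of software (hypothetical)."""
    name: str
    needed_on_every_machine: bool  # question 1
    exact_version_matters: bool    # question 2
    rapidly_developed: bool        # question 3
    isv_support_matrix: bool       # question 4


def distro_votes(pkg: Package) -> int:
    """Count how many of the four answers favour the distro's packaging."""
    return sum([
        pkg.needed_on_every_machine,    # yes to 1 favours the distro
        not pkg.exact_version_matters,  # no to 2 favours the distro
        not pkg.rapidly_developed,      # no to 3 favours the distro
        pkg.isv_support_matrix,         # yes to 4 favours the distro
    ])


def recommend(pkg: Package) -> str:
    """Lean one way or the other; an even split is left to the admin."""
    votes = distro_votes(pkg)
    if votes > 2:  # assumed threshold: a clear majority of the answers
        return "distro package"
    if votes < 2:
        return "own build on central NFS"
    return "judgement call"


if __name__ == "__main__":
    # Made-up answers for two made-up packages.
    stable_tool = Package("stable-tool", True, False, False, False)
    fast_mover = Package("fast-mover", False, True, True, False)
    for pkg in (stable_tool, fast_mover):
        print(pkg.name, "->", recommend(pkg))
```

An even split deliberately returns "judgement call" rather than forcing
an answer, in keeping with there being no hard-and-fast rule.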


 The Wellcome Trust Sanger Institute is operated by Genome Research 
 Limited, a charity registered in England with number 1021457 and a 
 company registered in England with number 2742969, whose registered 
 office is 215 Euston Road, London, NW1 2BE. 
