[Beowulf] Which distro for the cluster?

Tue Jan 9 04:30:46 PST 2007

Joe Landman <landman at scalableinformatics.com> writes:

> I think there are two different issues.  First: security is meant to be
> an access control and throttle/choke point.  Second: is how you view your
> cluster.  Is it "one-big-machine" in some sense (not necessarily Scyld,
> but with a security model such that if you are on the access node you
> are on the machine), or is it really a collection of individual machines
> each with their own administrative domain?  One of these models works
> really well for "cluster" use.

I don't think it's quite that black and white. You can have the
cluster appear as a single security domain to the user, while still
maintaining some internal barriers. Even if you have passwordless ssh
access across the cluster for ordinary users, you probably should have
restrictions for root access - even if an attacker can root the login
node, he shouldn't be able to just ssh as root to any machine. Yes, if
he can root the login node, he can probably root the other nodes as
well, but let him work for it.

>>> It all boils down to a CBA (as everything does).  Upgrading carries
>>> risk, no matter who does it, and how carefully things are packaged.  The
>>> CBA equation should look something like this:
>>>
>>> 	value_of_upgrade = positive_benefits_of_upgrade -
>>> 			   potential_risks_of_upgrade
>> 
>> With the security benefits being really hard to quantify. 
>
> Not really.  If you have a huge gaping hole that needs patching (OpenSSL
> off-by-one or weakness), the benefits are easy.

The security benefits of the upgrade (or, rather, the costs of *not*
performing the upgrade) are something like

  benefits = potential_damages_from_exploit * risk_of_exploit

Trying to estimate the risk of somebody exploiting a particular
vulnerability can be very hard. 
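
To make the sensitivity concrete, here is a minimal Python sketch that
plugs the expected-damage term into the CBA equation quoted above. The
function names and all the numbers are hypothetical, chosen purely for
illustration; nobody in this thread has measured them.

```python
def expected_benefit(potential_damages: float, risk_of_exploit: float) -> float:
    """benefits = potential_damages_from_exploit * risk_of_exploit"""
    if not 0.0 <= risk_of_exploit <= 1.0:
        raise ValueError("risk_of_exploit must be a probability in [0, 1]")
    return potential_damages * risk_of_exploit


def value_of_upgrade(positive_benefits: float, potential_risks: float) -> float:
    """value_of_upgrade = positive_benefits_of_upgrade - potential_risks_of_upgrade"""
    return positive_benefits - potential_risks


if __name__ == "__main__":
    # Hypothetical figures, in arbitrary cost units.
    damages = 50_000.0      # assumed cost of a successful exploit
    upgrade_risk = 2_000.0  # assumed expected cost of the upgrade going wrong

    # The same upgrade looks pointless or essential depending on the
    # probability estimate, which is the part that is hard to pin down.
    for p in (0.01, 0.05, 0.25):
        value = value_of_upgrade(expected_benefit(damages, p), upgrade_risk)
        print(f"risk_of_exploit={p:.2f} -> value_of_upgrade={value:+.0f}")
```

With these made-up inputs the sign of the result flips somewhere
between a 1% and a 25% estimated risk, so the decision hinges on the
one quantity we are worst at estimating.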

>> I don't get this. What's the point of having a "secure" frontend if
>> the systems behind it are insecure? OK, there's one big point -
>> hopefully you can buy some time - but other than that? 
>
> It's the model of how you use the machine.  If you lock all the doors
> tight with impenetrable seals, and the attacker goes through the weaker
> windows, those impenetrable seals haven't done much for you.

Exactly. But you seem to propose to seal the login node tight, but
leave the windows on the compute nodes ajar.

> The idea is you minimize the exposed footprint of the machine to threat
> facing access.

Yeah. But we seem to have different opinions on where the threat is.
It isn't just the Internet connected login node that is exposed to
threat. Even if you think you can trust your users, each and every
remote login session might actually be a hijacked account.

It's all very well saying that "If your system can be keylogged, it
should never ever be on a network, anywhere.", but I'm afraid that's
just wishful thinking. In actuality a huge proportion of Windows
systems are malware-infested *AND* there have been large password
theft attacks against Unix systems in the last few years, using ssh
trojans and X-based keyloggers. This is what reality looks like, and
we have to deal with it.

We all know it's impossible to lock down a system completely. You
always have to make trade-offs and risk assessments. I'm not arguing
for a system where the users have to turn up in person and deliver
their jobs on punchcards. Rather, I think my two main points are:

a) Defense-in-depth. Relying on perimeter defense is so 20th century.
The Windows world is starting to discover this, and I think we should
learn from them. Putting all your effort into one big barrier is the
wrong way to build security. The attackers should have an uphill
struggle *all* the way - "we shall fight on the beaches, we shall
fight on the landing grounds, we shall fight in the fields and in the
streets, we shall fight in the hills; we shall never surrender"

b) The main threat has changed. You still have to protect yourself
against remote exploits, but for a cluster that exposes few services
this is no big problem. Instead, our main headache is now protection
against *local* attacks through identity theft.

-- 
Leif Nixon                       -            Systems expert
------------------------------------------------------------
National Supercomputer Centre    -      Linkoping University
------------------------------------------------------------