[Beowulf] Which distro for the cluster?

Tue Jan 9 07:14:59 PST 2007

Leif Nixon wrote:
> Joe Landman <landman at scalableinformatics.com> writes:
> 
>> I think there are two different issues.  First: security is meant to be
>> an access control and throttle/choke point.  Second: is how you view your
>> cluster.  Is it "one-big-machine" in some sense (not necessarily Scyld,
>> but with a security model such that if you are on the access node you
>> are on the machine), or is it really a collection of individual machines
>> each with their own administrative domain?  One of these models works
>> really well for "cluster" use.
> 
> I don't think it's quite that black and white. You can have the
> cluster appear as a single security domain to the user, while still
> maintaining some internal barriers. Even if you have passwordless ssh
> access across the cluster for ordinary users, you probably should have
> restrictions for root access - even if an attacker can root the login
> node, he shouldn't be able to just ssh as root to any machine. Yes, if
> he can root the login node, he can probably root the other nodes as
> well, but let him work for it.

I think we are delving into design areas now.

Login nodes are not and should not be administrative nodes.  That is, do
not trust login nodes for non-end-user accounts.  This is a nice idea,
and sadly, not implemented in most practice.  Rocks and other cluster
distros happily enable end user login to the cluster administrative
node.  A login node is like a compute node, though bad-people (tm) can
get to it.  Which means you should trust it less.  If you fuse this with
an admin node, then you increase your risk.

>>>> It all boils down to a CBA (as everything does).  Upgrading carries
>>>> risk, no matter who does it, and how carefully things are packaged.  The
>>>> CBA equation should look something like this:
>>>>
>>>> 	value_of_upgrade = positive_benefits_of_upgrade -
>>>> 			   potential_risks_of_upgrade
>>> With the security benefits being really hard to quantify. 
>> Not really.  If you have a huge gaping hole that needs patching (OpenSSL
>> off-by-one or weakness), the benefits are easy.
> 
> The security benefits of the upgrade (or, rather, the costs of *not*
> performing the upgrade) are something like
> 
>   benefits = potential_damages_from_exploit * risk_of_exploit

Yes, exactly.  This is precisely correct.

> 
> Trying to estimate the risk of somebody exploiting a particular
> vulnerability can be very hard. 

No.  Follow the cert/secunia/... lists.  See what is being exploited in
the wild.  Won't be perfect, but if it is not being exploited (not that
cert et al are perfect or reliable ahead of time), or is very hard if
not impossible to exploit (e.g. the cache based back channel attack on
SMP systems), then your risk is low.  Risk scales with the ease of
exploit.  The easier it is to exploit, the higher the risk.
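
As a rough illustration of the two formulas quoted in this thread, here
is a minimal Python sketch.  The function names and every number in it
are hypothetical placeholders, not measurements; the only point is that
the same upgrade can come out worthwhile or not depending on the
estimated risk of exploit.

```python
# Illustrative sketch only: all figures below are made-up placeholders.

def exploit_benefit(potential_damages, risk_of_exploit):
    """benefits = potential_damages_from_exploit * risk_of_exploit"""
    if not 0.0 <= risk_of_exploit <= 1.0:
        raise ValueError("risk_of_exploit must be a probability in [0, 1]")
    return potential_damages * risk_of_exploit


def value_of_upgrade(positive_benefits, potential_risks):
    """value_of_upgrade = positive_benefits - potential_risks"""
    return positive_benefits - potential_risks


# Hypothetical hole that is being exploited in the wild (high risk).
in_the_wild = value_of_upgrade(
    exploit_benefit(potential_damages=100000, risk_of_exploit=0.5),
    potential_risks=5000,
)

# Hypothetical hole that is very hard to exploit (low risk).
hard_to_exploit = value_of_upgrade(
    exploit_benefit(potential_damages=100000, risk_of_exploit=0.001),
    potential_risks=5000,
)

print("in the wild:    ", in_the_wild)
print("hard to exploit:", hard_to_exploit)
```

With these assumed inputs the first upgrade has a positive value and the
second a negative one, which is the whole argument in miniature.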

>>> I don't get this. What's the point of having a "secure" frontend if
>>> the systems behind it are insecure? OK, there's one big point -
>>> hopefully you can buy some time - but other than that? 
>> It's the model of how you use the machine.  If you lock all the doors
>> tight with impenetrable seals, and the attacker goes through the weaker
>> windows, those impenetrable seals haven't done much for you.
> 
> Exactly. But you seem to propose to seal the login node tight, but
> leave the windows on the compute nodes ajar.

Nope, the analogy is incorrect and inaccurate, as is your
characterization of what I am writing.  What I have pointed out is that
no matter how good you *think* your security model is,

a) it isn't that good
b) it is attackable
c) it is being attacked
d) it is being attacked in a way you didn't consider
e) your super-duper-ultra-fantastic model X security system on the door
does absolutely nothing for you if they come in through the air duct.

Sooner or later, unless you cut the tx line on the network card, someone
is going to compromise your system.  The only way around that is

a) never to transmit anything back to the user
b) never allow writing of bits anywhere for any reason.

Again, I have seen people guffaw at security threats while relying
heavily upon a single measure or several measures, all of them similar
(e.g. firewalls), without a good appreciation that there are many more
attack vectors than through the front door.

The idea is, again, don't over-fortify one section believing that it
will be the only method of getting in.  Reduce your security footprint
and vulnerability.  Keep your risks low.  Understand that they will
eventually beat what you have in place, so your only real option is to
minimize the damage they can do.

>> The idea is you minimize the exposed footprint of the machine to threat
>> facing access.
> 
> Yeah. But we seem to have different opinions on where the threat is.
> It isn't just the Internet connected login node that is exposed to
> threat. Even if you think you can trust your users, each and every
> remote login session might actually be a hijacked account.

Yes.  It may be.  We might have different opinions of where the threat
is.  My belief is that the threat can come from any possible attack
vector.  This suggests one should contain the potential attack vectors
if possible.  If rsh is an external attack vector, then ask yourself if
you really need it.  If your ssh is being hit by dictionary attacks day
in and day out, ask yourself how it is being used, and contain the
operational modes to the minimum set you can (no ssh1, ...)

Hijacked accounts do happen.  It used to be from telnet/ftp/rsh access to
remote systems.  Nowadays it is from Windows malware and keyloggers.

Worse is when it is purposeful insider attacks.  You cannot protect
against all attack vectors, but you can protect against destruction of
data or configuration.  Data theft is harder to protect against.

Perimeter defenses do little for the insider attacks.

> It's all very well saying that "If your system can be keylogged, it
> should never ever be on a network, anywhere.", but I'm afraid that's
> just wishful thinking. In actuality a huge proportion of Windows
> systems are malware infested *AND* there have been large password

Yes, a huge proportion are infested.  Wishful thinking, no.  Security
begins with good security practices, and again, limiting damage
potential at the local level.  This in part means running in
least-privilege mode.  Unfortunately for some of the systems, it is not
possible to do this, and have a useful system, due in large part to its
(mis)design.

> theft attacks against Unix systems in the last few years, using ssh
> trojans and X-based keyloggers. This is what reality looks like, and
> we have to deal with it.

Heck, most of the password theft against Unix systems comes from open
telnet, ftp, pop, and imap servers, and a little network sniffer.  As
late as mid last year, I was asked to show a customer what happens if
you stick a little packet sniffer on their net.  Doesn't even need an
IP.  Just have that going while they are on it, and then have them look
at the screen as they log into their mail.  They were convinced their
big (switch manufacturer's name elided) switch would save them due to its
advanced security features.

You rely upon a perimeter defense and the (intelligent) attackers will
choose non-perimeter vectors.  This has the impact of rendering your
perimeter defense useless.

I like telling people that systems designed to fail often do.

Perimeter defenses are Maginot lines
(http://en.wikipedia.org/wiki/Maginot_Line).  They are the definition,
the poster child, of a failed *total* defense design.  A perimeter
defense is a speed bump to a determined hacker; it is a defensive
element, not a defense in and of itself.  As long as you accept that
they *will* get through these, you have to think about adding depth to
the speed bump.  My point is, adding additional perimeters may not be
the best approach to add this depth.  I think this is where we disagree,
as the sense I get is that you may believe that additional perimeters
are great for defensive depth.

> We all know it's impossible to lock down a system completely. You
> always have to make trade-offs and risk assessments. I'm not arguing

Yes.  This is/was my point.  You cannot *ever* lock it down.  You must
assume it will be compromised at some point.

> for a system where the users have to turn up in person and deliver
> their jobs on punchcards. Rather, I think my two main points are:
> 
> a) Defense-in-depth. Relying on perimeter defense is so 20th century.

Yes.  Agreed.  I am not arguing perimeter defenses.  I am pointing out
that enabling attack vectors by increasing your exposure footprint is
anathema to your ability to contain your risk.  Keep your perimeter as
small as possible.  Keep your attackable footprint as small as possible.
Don't firewall rsh/telnet/etc... don't install them.  If they are not
there, they cannot be used as attack targets.
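
One way to check this on a node is to see whether anything still answers
on the well-known ports of those legacy services.  The sketch below is
an assumption-laden illustration, not a complete audit: the helper names
are hypothetical, it only probes TCP on one host, and a closed port does
not prove the package is absent (it may simply be disabled).

```python
import socket

# IANA well-known TCP ports for the legacy cleartext services.
LEGACY_PORTS = {"ftp": 21, "telnet": 23, "rlogin": 513, "rsh": 514}


def is_listening(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def exposed_legacy_services(host="127.0.0.1", probe=is_listening):
    """Return the sorted names of legacy services that accept connections."""
    return sorted(
        name for name, port in LEGACY_PORTS.items() if probe(host, port)
    )

# Example use on a node (hypothetical):
#   for name in exposed_legacy_services():
#       print("still listening, consider removing:", name)
```

The probe function is passed in as a parameter so the logic can be
exercised without touching the network.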

> The Windows world is starting to discover this, and I think we should
> learn from them. Putting all your effort into one big barrier is the
> wrong way to build security. The attackers should have an uphill

[scratch scratch]  Who is arguing for building a heavy door?  I am
arguing for minimizing the maximum damage.  In part you do this by
reducing your exposed surface area.  Keep as few threat-facing systems
as possible, and keep them patched, up to date, and don't trust them.

> struggle *all* the way - "we shall fight on the beaches, we shall
> fight on the landing grounds, we shall fight in the fields and in the
> streets, we shall fight in the hills; we shall never surrender"

Uh...  ok.   You seemed (maybe I misread or misunderstood you) to say that
multiple perimeters are the way to go.  I disagree with this.  I am also
of the opinion that "force", as it were, is best applied where it makes
the most sense.  Making the end users slog through using a system along
with the nasties seems not to be a solution that most would like.  There
are other alternatives, some very good, that limit the maximum possible
damage a user can do.

> b) The main threat has changed. You still have to protect yourself
> against remote exploits, but for a cluster that exposes few services
> this is no big problem. 

Complacency or a lack of profound paranoia is the first step down the
slope of (mistakenly) believing your systems are secure.  Keeping as few
services as possible on the exposed net limits the attack vectors.  But
it does not
make it secure.  Limiting the damage that can be done also doesn't make
it secure.  It just reduces the impact of the cleanup.

> Instead, our main headache is now protection
> against *local* attacks through identity theft.

Yeah...  well, it is my understanding that the insider attacks
(committed and trusted people with nefarious intent) as well as identity
theft are the fastest growing crimes.  Basing a security model upon an
identity that can be stolen (say from a USB key inserted into a
compromised machine, or keylogged, or ...) is problematic.  Including
additional factors that require possession of multiple critical
elements, including those that are never linked together (SecurID and
similar cards), is much better.  Unfortunately you cannot use such things
to protect against the determined internal attacker.  Your employee is
annoyed that someone else got a raise/promotion/attaboy, so they decide
to steal and sell your design for super-duper-widget to your competitor.
This employee is trusted.  How do you prevent this?  Or, you use single
factor (ssh key) authentication to get in.  Someone's keys are stolen
through a USB fob they think is secure that they run putty from, yet was
inserted into a zombified PC at a university.  Now the bad-guys (tm)
have access in.  They can use resources, delete files, alter content.

If someone can explain precisely how to protect against these scenarios
without using multi-factor (disconnected) authentication methods, I
would love to hear it.  But even if the new protection scheme fails (and
it will), how do you limit the damage?

If they can blast past the defenses, and they have ownership of the
cluster, the game is over.  You prevent this from happening by
limiting the maximum damage that can ever be done (no, it won't be
perfect, but it is a heck-of-a-lot better than not having it).

Again, it sounds like we may be agreeing more than we disagree.  I am
not advocating a perimeter model.  I am advocating forcing attacks to
use fewer vectors (smaller defense perimeter).  I am also
advocating reducing the potential damage an attacker who breaks through
can do.  Force them to channel their attacks, and limit their prize
should they win.  Sun Tzu explained this in his book, and it is worth
taking into consideration.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615