[Beowulf] [upgrade strategy] Intel CPU design bug & security flaw - kernel fix imposes performance penalty
Jörg Saßmannshausen
sassy-work at sassy.formativ.net
Thu Jan 4 15:48:20 PST 2018
Dear all,
that was the question I was pondering all day today, and I tried to read
and digest any information I could get.
In the end, I contacted my friend at CERT and proposed the following:
- upgrade the headnode/login node (call it what you like), as that one is
exposed to the outside world via ssh
- do not upgrade the compute nodes for now, until we have more information
about the impact of the patch(es).
It would not be the first time a patch opens up another can of worms. What
I am hoping for is a middle way between security and performance. IF
the patch(es) are safe to apply, I can still roll them out to the compute
nodes without losing too much uptime. IF there is a performance problem,
it only affects the headnode, which I can ignore on that cluster.
As always, your mileage will vary, especially as different clusters have
different purposes.
What I would like to know is: how about compensation? For me this is the same
as the VW emissions scandal. We, the users, have been deceived. Especially if
the 30% performance losses that have been mooted are not special corner cases
but turn out to be common in HPC. Some of the chemistry codes I support rely
on disk I/O, others on InfiniBand, and others again run entirely in
memory.
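To get a rough feeling for how exposed a given node is, one could time a cheap system call against pure user-space work, before and after patching. This is only a minimal sketch, not a proper benchmark: the helper names (time_per_call, syscall_heavy, user_only) are made up for illustration, and the numbers include Python interpreter overhead, so only the before/after difference on the same machine means anything.

```python
import os
import time


def time_per_call(fn, n=200_000):
    """Return the mean wall-clock seconds per call of fn over n calls."""
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return (time.perf_counter() - start) / n


def syscall_heavy():
    # os.getppid() enters the kernel on every call, so it is a cheap
    # stand-in for syscall-bound work (hypothetical choice of call).
    os.getppid()


def user_only():
    # Baseline: same call overhead in Python, but no kernel entry.
    pass


if __name__ == "__main__":
    base = time_per_call(user_only)
    sysc = time_per_call(syscall_heavy)
    print(f"user-space baseline: {base * 1e9:8.1f} ns/call")
    print(f"with system call:    {sysc * 1e9:8.1f} ns/call")
    print(f"difference:          {(sysc - base) * 1e9:8.1f} ns/call")
```

Run it on an unpatched and a patched kernel on the same hardware and compare the "difference" line. Codes that spend most of their time in memory should barely notice; codes doing many small I/O calls are the ones to watch.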
These are my 2 cents. If somebody has a better idea, please let me know.
All the best from a rainy and windy London
Jörg
On Wednesday, 3 January 2018, 13:56:50 GMT, Remy Dernat wrote:
> Hi,
> I renamed that thread because IMHO there is another issue related to that
> threat: should we upgrade our systems and lose a significant amount of
> XFlops? What should be considered:
> - the risk
> - your user population (size / type / average "knowledge" of hacking
>   techniques...)
> - the isolation level from the outside (internet)
>
> So here is my question: if this is not confidential, what will you do?
> I would not patch our little local cluster, contrary to all of our other
> servers. Indeed, there is another "little" risk. If our strategy is to
> always upgrade/patch, in this particular case you can lose many users who
> will complain about performance. So another question: what is your global
> strategy for upgrades on your clusters? Do you upgrade as often as
> you can? One upgrade every X months (due to the downtime issue)?
>
> Thanks,
> Best regards,
> Rémy.
>
> -------- Original message --------
> From: John Hearns via Beowulf <beowulf at beowulf.org>
> Date: 03/01/2018 09:48 (GMT+01:00)
> To: Beowulf Mailing List <beowulf at beowulf.org>
> Subject: Re: [Beowulf] Intel CPU design bug & security flaw - kernel fix
> imposes performance penalty
>
> Thanks Chris.
> In the past there have been Intel CPU 'bugs' trumpeted, but generally these
> are fixed with a microcode update. This looks different, as it is a
> fundamental part of the chip's architecture. However, the Register article
> says: "It allows normal user programs [...] to discern to some extent the
> layout or contents of protected kernel memory areas". I guess the phrase "to
> some extent" is the vital one here. Are there any security exploits which
> use this information? I guess it is inevitable that one will be engineered
> now that this is known about. The question I am really asking is: should we
> worry about this for real-world systems? And I guess the answer is that if
> the kernel developers are worried enough, then yes, we should be too.
> Comments please.
>
>
>
> On 3 January 2018 at 06:56, Greg Lindahl <lindahl at pbm.com> wrote:
>
> On Wed, Jan 03, 2018 at 02:46:07PM +1100, Christopher Samuel wrote:
> > There appears to be no microcode fix possible and the kernel fix will
> > incur a significant performance penalty, people are talking about in the
> > range of 5%-30% depending on the generation of the CPU. :-(
>
> The performance hit (at least for the current patches) is related to
> system calls, which HPC programs using networking gear like OmniPath
> or Infiniband don't do much of.
>
> -- greg