[Beowulf] Docker vs KVM paper by IBM
Ellis H. Wilson III
ellis at cse.psu.edu
Wed Jan 28 12:05:40 PST 2015
On 01/28/2015 02:16 PM, Gavin W. Burris wrote:
> Didn't mean to upset you there, Ellis. I'm talking about every other
> discipline that isn't CSE. I encourage researchers to NOT be their own
> IT department, so that their time is freed up to do research. Obviously
> if your research IS the system, that is the exception.
I've only been on one side of the fence, but here is my perspective on
the computational sciences and sysadmin relationship:
There's effectively a bell curve of users. With perfectly average
sysadmins, the bell curve looks pretty normal. On the far left tail
you've got your users whose programs and research operate PERFECTLY
under the current regime...err...toolchain provided by said sysadmin.
The next quartile up, on the left side of the perfect average, you have
a good bulk of researchers who truly don't want to be sysadmins and who
are willing to change their programs to fit into the available
toolchain. The cost in this case falls on the researcher, who has to
change her programs and spend hours all over the interwebs figuring out
why such-and-such compilation failed.
Just over the line in the third quartile we have another bulk of the
researchers who are just savvy enough to work around the toolchains of
the sysadmins, either via homedir path and lib manipulation, chroots, or
downright bribing/stealing root somehow and installing into public
paths. The cost generally lands on the IT budget, which pays for
sysadmins to clean up this "just savvy enough to be dangerous" user's
crap, and on other researchers whose code now doesn't compile or run
because the toolchain has been mucked with.
In the last, far right-most tiny quartile, we have those researchers who
actually enjoy some amount of being sysadmins and are roughly as
capable as the departmentally paid ones. It's faster for them to just
handle things themselves. They WILL get around you, no matter what you
do, they'll enjoy doing so, and they'll have the wherewithal to know
that if nobody finds out, all the better. If you resist, they'll just
make things painful for everyone, and no amount of stick-wielding will
dissuade them.
On the two far tails, the aggregate costs are generally low. In the
middle, costs tend to be high. Offering multiple toolchains on a single
machine is non-trivial, and dealing with those who force multiple
toolchains/drivers/kernels/whatever into such a setup is expensive to
correct.
So, the obvious answer here is to provide your "standard operating
environments" in the form of containerized/VM/whatever images that
quartiles 1 and 2 can use, and allow quartiles 3 and 4 to spin up their own.
Multiple environments mean quartile 2 can probably just try their
program A on environments X, Y, and Z, and find one that "just works."
This reduces their time futzing with compilers or fixing other
researchers' crappy code that breaks on GCC > 4.x. Quartile 3 can spin
up their own absolutely crap environment and think they're L33t and not
screw over their fellow researchers. Quartiles 1 and 4 are basically
untouched, since they were as fine before as they are now.
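To make the "try program A on environments X, Y, and Z" idea concrete,
here is a rough sketch of what a quartile 2 user might script, assuming
the environments are published as Docker images. The image names
(env-x, env-y, env-z), the /src mount point, and the function names are
all made up for illustration; only the docker run flags (--rm, -v, -w)
are real.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def docker_build_command(image: str, source_dir: str, build_cmd: str) -> List[str]:
    """Return the docker CLI invocation that runs build_cmd inside image."""
    return [
        "docker", "run", "--rm",
        "-v", f"{source_dir}:/src",  # mount the program's source tree (hypothetical path)
        "-w", "/src",                # run the build from the mounted directory
        image,
        "sh", "-c", build_cmd,
    ]


def run_quietly(cmd: Sequence[str]) -> int:
    """Run cmd, discard its output, and return its exit status."""
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def first_working_image(
    images: Sequence[str],
    source_dir: str,
    build_cmd: str,
    runner: Callable[[Sequence[str]], int] = run_quietly,
) -> Optional[str]:
    """Try each image in order; return the first where the build exits 0."""
    for image in images:
        if runner(docker_build_command(image, source_dir, build_cmd)) == 0:
            return image
    return None  # nothing "just worked"; time to ask a sysadmin
```

Something like first_working_image(["env-x", "env-y", "env-z"],
"/home/me/program-a", "make") would then report which environment, if
any, builds the program without any futzing. The runner argument is
only there so the loop can be exercised without Docker installed.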
Everybody wins, probably most of all the IT department.
Best,
ellis