[Beowulf] how Google warps your brain
Ellis H. Wilson III
ellis at runnersroll.com
Thu Oct 21 20:40:23 PDT 2010
On 10/21/10 06:43, Eugen Leitl wrote:
> The cloud is real. The idea that you need a physical machine close by to get
> any work done is completely out the window at this point. My only machine at
> Google is a Mac laptop (with a big honking monitor and wireless keyboard and
> trackpad when I am at my desk). I do all of my development work on a virtual
> Linux machine running in a datacenter somewhere -- I am not sure exactly
> where, not that it matters. I ssh into the virtual machine to do pretty much
> everything: edit code, fire off builds, run tests, etc. The systems I build
> are running in various datacenters and I rarely notice or care where they are
> physically located. Wide-area network latencies are low enough that this
> works fine for interactive use, even when I'm at home on my cable modem.
The fact that the author is using a Mac and doing development work on a
virtual Linux machine in an unknown location highlights the underlying
theme of the article, the resultant thread, and perhaps even this entire
mailing list: different setups work better for different workloads.
Clearly, the author feels that the inconvenience incurred by having to
use a virtual Linux machine to perform his development is less than the
inconvenience of running Linux as his main OS. Otherwise, he would
simply use Linux on his machine and sit at home in his pajamas, sipping
a hot cup of Earl Grey and working out his HPC problem locally.
Nonetheless, there are numerous examples of workloads in the scientific
community (used here in reference to the physical sciences) and in HPC
development that unfortunately do not play nicely with such a remote
and fluctuating setup.
For instance, in my research, it is far easier to own the machines one
runs on (or at least have root access) to develop and test breakthroughs
in systems development. Often messing with the kernel, toolchain, or
arbitrary libraries in the distribution is required to effect and test
the change in which one is interested. Needless to say, we have quite a
bit of difficulty convincing "IT" types (even though we are computer
science people ourselves) that this is a reasonable thing to do on the
average cluster, even in a university setting. Certainly, in "the cloud"
alterations at this level are not tolerated. Further, it is
extremely rare to find clusters tailored to system development such that
they have master nodes that reboot all the slave nodes with new images
and new file systems (and however many more system specific parameters,
specified by the developer) for every run.
That said, I do recognize that system development is a small sector in
HPC and not by any means the most influential customer. However, I do
feel it furthers the mantra that we should not all be pigeon-holed into
one particularly "efficient" setup just because it works well for the
common case (or because Google "invented it").
As totally off-topic points: it is great to see that RGB-Bot has been
rebooted (even if only with limited bandwidth), and I absolutely have no
idea why Eugen Leitl posted a blog entry from Matt Welsh. I scanned and
scanned for some comment from Eugen, or for some way in which it tied
into recent conversation, but at this point I'm lost on why it
originally got posted (besides being quite the fire-starter).