[Beowulf] KVM to a compute node - ssh

Thu Jun 3 05:57:30 PDT 2004

On Wed, 2 Jun 2004, Guy Coates wrote:

> 
> > any serial console solutions will ever be terribly desirable.  Serial
> > terminal servers aren't cheap
> 
> 
> I suppose it's a question of scale. I have to look after lots of nodes,
> and the machine room that they sit in is on the other side of the site to
> my office. That colours my judgement on the "just go and press the button /
> plug in a monitor" approach.
> 
> Furthermore, my time is not free.  Whilst it is true that terminal servers
> are not cheap, I still think they pay for themselves in wear-and-tear of
> your sys-admin on medium-large installations. "Total cost of ownership"
> may be a much abused term, but in my experience, remote console access
> really does lower it. A remote serial console is scriptable and
> automatable. I can spend my time drinking tea or helping users rather than
> having to be in a machine room poking hardware.

I agree (and mentioned) that if remote management is an issue serial
consoles can be useful and terminal servers in those circumstances can
be cost effective.  I used to use this myself a decade plus ago in a
mostly Sparc Sun LAN (where Suns at the time were very deliberately
engineered with a fairly powerful serial console).

With PCs, though, it isn't so easy.  The problems I've encountered with
the Tyan bios-based serial console on the 2466 dual Athlons have
somewhat soured me on the whole idea.  The default bios configuration is
no serial console, so to get a serial console you have to open each
chassis, put in a video card, and boot it at least once with a keyboard
and video (KV) attached.  If you ever reflash the bios, you have to do
it again.  There are certain commands and states the systems can get
into where they respond only to a real keyboard, not the serial
keyboard, to the point where I generally plug a real keyboard into each
box while working on it via serial console.

PCs, even "server class" motherboard PCs, alas, have never been designed
to have a "real" serial console, as video cards are viewed as
essential, not optional.  LinuxBIOS may be a good, real solution to
that (I haven't tried it), but so far I haven't been impressed with at
least the Tyan attempt at a serial console on a PC.  At this point I
think the issue is moot.  I personally think a serial interface is
obsolete and that it may disappear altogether from PCs over the next few
years, supplanted by USB and smarter network interfaces.  And better
ways of dealing with the problem seem to be emerging.

The most important of these is WOL/network boot, which I think is likely
the way to go for remote manageability.  One can initiate a remote boot
strictly from the network.  In addition, one can start up e.g. a network
console interface that lets one run the "boot console" remotely from
fairly early in the boot process, early enough to fix minor problems
short of a reinstall.  I agree that so far this likely stops short of
being able to select PXE or grub boot options or set bios options.
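
As a concrete illustration of initiating a boot strictly from the
network, here is a minimal Python sketch that builds and broadcasts a
WOL "magic packet".  The packet layout (six 0xFF bytes followed by the
target MAC address repeated sixteen times) is the standard one; the
function names, the default broadcast address and the choice of UDP
port 9 are assumptions made for illustration, and the node's NIC and
bios have to have WOL enabled for any of it to work.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 6 octets in MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the target MAC repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling send_wol("00:11:22:33:44:55") from a management host on the same
broadcast domain is the whole "press the button" step; what the node
does next depends on its bios boot order.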

Still, this may be enough.  A primary use of a remote console interface
is to bounce a locked up box, since if it isn't locked up you can ssh
in, and if you can ssh in you have full access to the system.  WOL/PXE
"should" resolve this problem. One can even control the particular boot
image sent to the particular host on the bounce and boot it into e.g.  a
single user mode with network console and a diskless/repair boot image
if it fails to boot back into good health with its regular/default
image.  This is a BETTER solution than a serial interface for this
purpose, as serial interfaces in my experience have a real problem
interpreting e.g. three finger salutes in all possible runtime states.
Doing so requires at least a true BIOS interface and a system that
cannot be put into a runtime state where it ignores or cannot process
input from the serial port, which I think is a bit oxymoronic,
unfortunately, since serial interfaces tend to require the CPU to
manage.  WOL "should" bypass enough of the runtime state of the system
that it will work even if the box is locked six ways from Sunday (acting
more or less like a hard reset), although I confess that I haven't
tried it.
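
The per-host choice of boot image described above can be sketched as
follows, ASSUMING that PXELINUX is the network bootloader (nothing in
this thread says which one is in use).  PXELINUX looks in pxelinux.cfg/
for a file named after the client's MAC address, prefixed with the ARP
hardware type "01" and written in lowercase hex with dashes, so dropping
a file there before the bounce selects the image for that one node.  The
kernel and initrd file names and the function names are hypothetical.

```python
from pathlib import Path


def pxelinux_config_name(mac: str) -> str:
    """Per-host PXELINUX config file name: '01-' plus the MAC, lowercase, dashed."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"expected 6 two-digit octets, got {mac!r}")
    return "01-" + "-".join(octets)


def repair_config(kernel: str = "vmlinuz-repair",
                  initrd: str = "initrd-repair.img") -> str:
    """Config text that boots a repair image into single user mode.

    The kernel and initrd names are hypothetical placeholders.
    """
    return (
        "DEFAULT repair\n"
        "LABEL repair\n"
        f"  KERNEL {kernel}\n"
        f"  APPEND initrd={initrd} single\n"
    )


def assign_repair_image(tftp_root: Path, mac: str) -> Path:
    """Write the per-host config under <tftp_root>/pxelinux.cfg/ and return its path."""
    cfg_dir = tftp_root / "pxelinux.cfg"
    cfg_dir.mkdir(parents=True, exist_ok=True)
    path = cfg_dir / pxelinux_config_name(mac)
    path.write_text(repair_config())
    return path
```

Removing the per-host file afterwards lets the node fall back to
whatever default configuration the server offers on its next boot.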

Dealing with a box that won't bounce (or that bounces into the repair
image whose use reveals e.g. a disk crash), and is therefore broken by
assumption or observation, requires a trip to the server room no matter
what.  Given that the box will need to be benched and messed with, the
issue of interface is moot.  We generally stick a video card into
benched serial console systems anyway, as the bios is happier that way
and one doesn't want to complicate things on a broken box.

Don't get me wrong, at one point in time I liked the idea of a serial
interface, and if serial consoles weren't the one piece of hardware that
hasn't gotten any cheaper over the last decade plus, and if serial
interfaces really worked, I'd like the solution you describe.  In the
meantime, given that neither of these has happened and that WOL/PXE has
made something of an end run around the problem, what I'd REALLY like at
this point is a true network-based console -- one where one can connect
to a brand new box's bios and console via the network.

This isn't crazy impossible.  WOL/PXE and the ability to send very
specific boot images (including the boot image of what amounts to a
network-aware bootloader) make this very definitely an engineerable
option.  Scyld probably supports something very close to this right now
with two-kernel monte.  COD (cluster on demand) has features along these
lines as well. There are likely programs that do this already that I
just don't know about.

In a very few years, though, systems and LAN managers if not cluster
managers may well have worked out plug-n-play solutions to the point
where one can (and in general will by default) choose to WOL/PXE boot a
very specific boot image that securely connects to a network-based
rooted boot control application.  I could likely hack something out
myself using existing tools as a spanning base if I had time or
inclination.  The one issue that is still sticky AFAIK is accessing the
bios from such an interface -- for that, and to control PXE itself, I
suspect that physical presence may be required for at least the original
bios/PXE configuration and first real boot.

It's really too bad that one cannot easily control the BIOS of most PCs
via a standard interface.  On a Sun (in the old days and probably today)
one could read or write pretty much any bios option from a root
application.  This was actually very useful.  I miss it.  Of course Sun
owned both OS and hardware, so they could deliberately design this to
work -- PC motherboard and BIOS manufacturers haven't really cared
because they still think of PCs as "personal" with strictly video
interfaces and not as mass-managed workstations.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu