[Beowulf] Remote console management
John Hearns
john.hearns at streamline-computing.com
Fri Sep 23 00:08:01 PDT 2005
On Thu, 2005-09-22 at 16:35 -0500, Bruce Allen wrote:
> We're getting ready to put together our next large Linux compute cluster.
> This time around, we'd like to be able to interact with the machines
> remotely. By this I mean that if a machine is locked up, we'd like to be
> able to see what's on the console, power cycle it, mess with BIOS
> settings, and so on, WITHOUT having to drive to work, go into the cluster
> room, etc.
>
> One possible solution is to buy nodes that have IPMI cards. These
> piggyback on the ethernet LAN and let you interact with the machine even
> in the absence of an OS. With the appropriate tools running on a remote
> machine, you can interact with the nodes even if they have no OS on them
> or are hung.
IPMI cards are a good idea. I work with them all the time.
We use IPMI for remote monitoring of systems, and for power cycling.
IPMI cards for Supermicro nodes are not expensive, and are IPMI 2.0
compliant.
IPMI does support Serial-over-LAN, but I don't have experience with it,
and I'm not sure whether Linux supports it.
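As a rough illustration of the remote power cycling described above,
here is a minimal Python sketch that wraps the ipmitool command line
client. The BMC host name (node01-bmc) and user (admin) are made up for
the example, and the helper names ipmi_command and run_ipmi are
hypothetical. It defaults to a dry run so nothing gets power cycled by
accident.

```python
import subprocess

def ipmi_command(host, user, *action, interface="lan"):
    """Build an ipmitool argument list for the BMC at `host`."""
    # -I picks the interface ("lan" for IPMI 1.5, "lanplus" for IPMI 2.0),
    # -H is the BMC address, -U the user name, and -E tells ipmitool to
    # read the password from the IPMI_PASSWORD environment variable.
    return ["ipmitool", "-I", interface, "-H", host, "-U", user, "-E", *action]

def run_ipmi(argv, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(argv))
    if dry_run:
        return None
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return result.stdout

# Hypothetical BMC host name and user, for illustration only.
status = ipmi_command("node01-bmc", "admin", "chassis", "power", "status")
cycle = ipmi_command("node01-bmc", "admin", "chassis", "power", "cycle")
run_ipmi(status)
run_ipmi(cycle)
```

A Serial-over-LAN session would go through the lanplus interface with
the 'sol activate' action, but that is interactive and is left out of
this sketch.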
Other manufacturers have similar offerings. The Sun Service Processors
are IPMI compliant, and in addition allow remote access via ssh. You can
do a terminal redirect and get full access to the BIOS, i.e. you ssh
into the SP, type 'system console', and get a serial console on the node.
> Another solution is to use the DB9 serial ports of the nodes. You have an
> 'administrative' box containing lots of high-port-count serial cards (eg,
> Cyclades 32 or 64 port cards) and then run a serial cable from each node
> to this box. By remotely logging into this admin box you can access the
> serial ports of the machines, and if the BIOS has the right
> settings/support, this lets you have keyboard/console access.
>
> Or one can do both IPMI + remote serial port access.
Cyclades terminal servers are very good.
That's a dedicated rackmount box (running Linux) with multiple serial
lines. You telnet to a specific port and get the serial screen.
I would spec one of these for a new cluster.
We ship them on many of our clusters, as an option, and customers are
always happy with them.
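To make the "telnet to a specific port" idea concrete, here is a small
Python sketch that opens a TCP connection to a terminal server and
reads whatever the serial console prints. The mapping from serial line
to TCP port (base port 7001 plus the line number) is an assumption for
the example; the real numbering depends on how the terminal server is
configured. The function names are hypothetical.

```python
import socket

def tcp_port_for_line(line, base_port=7001):
    """Map a 1-based serial line number to a TCP port.

    base_port=7001 is an assumption; check the terminal server's own
    configuration for the real value.
    """
    if line < 1:
        raise ValueError("serial lines are numbered from 1")
    return base_port + line - 1

def read_console(host, line, timeout=5.0, max_bytes=4096):
    """Connect to the port for `line` and return the first bytes received."""
    port = tcp_port_for_line(line)
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"\r")  # send a carriage return so the console redraws
        try:
            # Raw bytes; these may include telnet option negotiation.
            return conn.recv(max_bytes)
        except socket.timeout:
            return b""
```

For example, read_console("termserver", 12) would try port 7012 under
the assumed numbering. For interactive work a normal telnet or ssh
client is the better tool; this only shows what the connection amounts
to.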
The alternative is to use a single flying lead from the head node.
I would spec your new cluster with IPMI cards, plus a Cyclades.
Have your systems supplier set all the BIOSes to do a serial redirect,
and enable serial consoles at the Linux boot stage.
This is standard with all our clusters.
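For the Linux side of that, the serial console is enabled with
console= arguments on the kernel command line. The Python sketch below
just builds that string; the defaults (first serial port, 9600 baud, no
parity, 8 bits) are assumptions and have to match whatever the BIOS
redirect and the terminal server are set to. The function name is
hypothetical.

```python
def console_args(port=0, baud=9600, parity="n", bits=8, keep_vga=True):
    """Build kernel command line arguments for a serial console."""
    args = []
    if keep_vga:
        # Keep kernel messages on the local screen as well.
        args.append("console=tty0")
    # Listed last so that the serial port becomes /dev/console.
    args.append(f"console=ttyS{port},{baud}{parity}{bits}")
    return " ".join(args)

print(console_args())
print(console_args(port=1, baud=115200, keep_vga=False))
```

The resulting string goes on the kernel line in the boot loader
configuration. Getting a login prompt on the serial port also needs a
getty running on it, which is a separate step not shown here.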
I'll give you some help if you email me off-list.