[Beowulf] USB flash drive bootable distro to check cluster health.
Joe Landman
joe.landman at gmail.com
Fri Jan 11 08:06:00 PST 2019
On 1/11/19 7:59 AM, Richard Chang wrote:
> Hi,
> I would like to know if we have, or can prepare, a USB-bootable OS
> that we can boot on a cluster and its nodes to test all of its
> functionality.
>
> The purpose of this is to boot a new or existing cluster to check its
> health, including Infiniband network, any cards, local hard disks,
> memory etc, so that I don't have to disturb the existing OS and its
> configuration.
>
> If possible, it would be nice to boot the compute nodes from the
> master node.
>
> Does anyone know of a pre-existing distribution that will do the job?
> Or know how to do it with CentOS or Ubuntu?
FWIW: this is one of the use cases of
https://github.com/joelandman/nyble . It works with CentOS, Debian, and
Ubuntu (though I've not pushed the 18.04.1 changes yet).
I have a rudimentary USB target that I was going to clean up soon. The
images can also be booted centrally from a PXE server, and can pull and
run scripts post boot.
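
As a rough illustration of the kind of script a node could pull and run
post boot, here is a minimal Python sketch. It is not part of nyble or
tiburon. The function names and the report layout are assumptions made up
for this example. It relies only on the standard Linux locations
/proc/meminfo, /sys/block and /sys/class/infiniband, and covers just
memory size, local disks and InfiniBand port state.

```python
#!/usr/bin/env python3
"""Hypothetical post-boot node health report (illustrative, not part of nyble)."""
import json
import os
import socket


def parse_meminfo(text):
    """Return MemTotal in kB from /proc/meminfo-style text, or None if absent."""
    for line in text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    return None


def read_text(path):
    """Read a small file, returning None if it is missing or unreadable."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def list_dir(path):
    """List a directory in sorted order, returning [] if it does not exist."""
    try:
        return sorted(os.listdir(path))
    except OSError:
        return []


def block_devices(sys_block="/sys/block"):
    """Map each block device to its size in bytes (sysfs counts 512-byte sectors)."""
    sizes = {}
    for dev in list_dir(sys_block):
        raw = read_text(os.path.join(sys_block, dev, "size"))
        if raw is not None and raw.isdigit():
            sizes[dev] = int(raw) * 512
    return sizes


def infiniband_ports(sys_ib="/sys/class/infiniband"):
    """Map 'device/port' to its state string as reported by sysfs."""
    states = {}
    for dev in list_dir(sys_ib):
        ports_dir = os.path.join(sys_ib, dev, "ports")
        for port in list_dir(ports_dir):
            state = read_text(os.path.join(ports_dir, port, "state"))
            if state is not None:
                states[f"{dev}/{port}"] = state
    return states


def collect_report():
    """Gather a small per-node report; the layout is an assumption for this sketch."""
    meminfo = read_text("/proc/meminfo") or ""
    return {
        "host": socket.gethostname(),
        "mem_total_kb": parse_meminfo(meminfo),
        "block_devices": block_devices(),
        "infiniband_ports": infiniband_ports(),
    }


if __name__ == "__main__":
    print(json.dumps(collect_report(), indent=2))
```

Each node would print one JSON report, which the master could collect and
compare across nodes to spot a missing disk, a short DIMM count, or a port
that is not active. An empty infiniband_ports map simply means no
InfiniBand driver is loaded on that node.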
It runs in RAM, and you can modify the distributions to your heart's
content. I have a few private repos here which have NVIDIA + MLNX + other
drivers and related bits already built in.
I've set up many systems with this, tying it together with
https://github.com/joelandman/tiburon for boot control. This was
originally used at Scalable Informatics when we were alive, and has
evolved significantly since then.
If you want a simple pure USB distro for this, try SystemRescueCD,
though I don't think it supports InfiniBand or includes most of these
drivers.
--
Joe Landman
e: joe.landman at gmail.com
t: @hpcjoe
w: https://scalability.org
g: https://github.com/joelandman
l: https://www.linkedin.com/in/joelandman