Node cloning
Felix Rauch
rauch at inf.ethz.ch
Fri Apr 6 00:39:59 PDT 2001
On Thu, 5 Apr 2001, Robert G. Brown wrote:
[Copying /dev/hda to /dev/hd?]
> One of many possible problems, actually. This approach to cloning
> makes me shudder -- things like the devices in /dev generally have to
> be built, not copied; there are issues with the boot blocks and bad block
> lists and the bad blocks themselves on both target and host. Raw
> devices are dangerous things to use as if they were flatfiles.
Unfortunately I'm not an expert in disk technology, so I might be
wrong here... but I thought that the bad block lists were maintained
by the disks themselves and were not visible to the OS.
In any case, we have not had any stability problems due to cloning in
the last few years.
[...]
> One reason I gave up cloning (after investing many months writing a
> first generation cloning tool for nodes (which booted a diskless
> configuration, formatted a local disk, and cloned itself onto the local
> disk) and started a second generation GUI-driven one) was that just
> cloning isn't enough. There is all sorts of stuff that needs to be done
> to the clones to give them a unique identity (even something as simple
> as their own ssh keys), one needs to rerun lilo, it requires that you
> keep one "pristine" host to use as the master to clone or you have the
> very host configuration creep you set out to avoid. Either way you end
> up inevitably having to upgrade all the nodes or install security or
> functionality updates.
Let me just add a few insights from our years of experience here:
- We use DHCP to assign (fixed) IP addresses to nodes. The only
problem is getting the list of all MAC addresses in the first
place.
- We use the same SSH hostkey for all nodes in our cluster (not for
the server and our personal workstations though).
- When we clone whole disks or whole partitions, we don't need to run
lilo, fdisk or whatever. The disks are identical after the clone,
including partition tables and boot sectors.
- An additional boot script called "personalize" customizes the
machines during the first boot-up. Based on the hostname, the script
mounts additional external disk drives, configures additional
network interfaces, etc.
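As an illustration only, a hostname-driven "personalize" step could be
sketched as below. The actual script is not shown in this post, so the
table HOST_EXTRAS, the host names, the devices and the addresses are all
hypothetical. The sketch only prints the commands it would run instead of
executing them.

```python
import socket

# Hypothetical per-host extras; names, devices and addresses are made up.
HOST_EXTRAS = {
    "node01": {
        "mounts": [("/dev/sdb1", "/scratch")],
        "interfaces": [("eth1", "10.0.1.1")],
    },
    "node02": {
        "mounts": [],
        "interfaces": [("eth1", "10.0.1.2")],
    },
}


def personalize_commands(hostname, extras=HOST_EXTRAS):
    """Return the commands a first-boot script might run for this host."""
    short = hostname.split(".")[0]  # drop any domain part
    cfg = extras.get(short, {})     # unknown hosts get no extras
    cmds = []
    for device, mountpoint in cfg.get("mounts", []):
        cmds.append(["mount", device, mountpoint])
    for iface, addr in cfg.get("interfaces", []):
        cmds.append(["ifconfig", iface, addr, "up"])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands rather than executing them.
    for cmd in personalize_commands(socket.gethostname()):
        print(" ".join(cmd))
```

Keeping the per-host differences in one table means the cloned image stays
identical on every node; only the lookup result differs.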
To conclude: If we want to update our cluster, we update a master
machine, boot all machines into a small maintenance Linux via PXE, run
Dolly on all machines to clone them, reboot, done. There are no
post-cloning operations required, but as usual, YMMV.
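For the DHCP point above: once the list of MAC addresses exists, the
fixed-address entries can be generated rather than typed by hand. The
following is a minimal sketch that assumes the ISC dhcpd "host" declaration
format; the function name, the node naming scheme, the MAC addresses and
the IP range are assumptions for illustration, not our configuration.

```python
import ipaddress


def dhcpd_host_entries(macs, first_ip="192.168.1.101", prefix="node"):
    """Build ISC dhcpd 'host' declarations, one fixed address per MAC."""
    base = ipaddress.IPv4Address(first_ip)
    entries = []
    for i, mac in enumerate(macs):
        name = f"{prefix}{i + 1:02d}"  # node01, node02, ...
        address = str(base + i)         # consecutive addresses
        entries.append(
            f"host {name} {{\n"
            f"  hardware ethernet {mac.lower()};\n"
            f"  fixed-address {address};\n"
            f"}}"
        )
    return "\n".join(entries)


if __name__ == "__main__":
    # Made-up MAC addresses; in practice they are collected from the nodes.
    print(dhcpd_host_entries(["00:11:22:33:44:55", "00:11:22:33:44:56"]))
```

The output is meant to be pasted into (or included from) the DHCP server
configuration; collecting the MAC addresses remains the manual part.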
Of course there might be better ways to install your cluster,
depending on your needs, configuration, experience, etc. For
(mostly) homogeneous mid-sized clusters (we have 16 to 24 nodes in our
clusters), cloning works well.
- Felix
--
Felix Rauch | Email: rauch at inf.ethz.ch
Institute for Computer Systems | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H18 | Phone: ++41 1 632 7489
CH - 8092 Zuerich / Switzerland | Fax: ++41 1 632 1307