How to mirror a harddisk for backup purpose - options

alvin at alvin at
Fri Jun 22 13:54:29 PDT 2001

hi robert...

nice script ( simple ) ...

i like it when people post their "scripts" ... 
( good to see how people do their magic )

-- think we lost the original question/issue of what the guy from hong kong
   wanted to do ... what kind of copying disk to disk as there are various
   methods ... with advantages and disadvantages to each

have fun
alvin .. backup stuff... ..... security stuff ...

- dd -- bit-by-bit copying

- tar -- seems some folks use many options ...  i tend to just use "p"
  during extract to keep the original permissions

- rsync -- good for things like  mirroring "redhat" or other directory
  tree and stuff

- raid1 mirroring
	- exact copy ( partitions, directories, files ) of one disk to the
	other ...  ( not inodes, boot records, etc )

- think using  tar | ( rsh  tar ) is too slow... too many login
  sub-processes that it might try to do in between .. donno for sure...

- NFS mounted copying
	mount remote_disk:/backup /mnt/backup
	cp -par /foo /mnt/backup
	( cp or rcp or scp or ... )

- so what kind of copying does the guy from hong kong wanna do ???

On Fri, 22 Jun 2001, Robert G. Brown wrote:

> On Fri, 22 Jun 2001, David Vos wrote:
> > On Fri, 22 Jun 2001, David Vos wrote:
> > > The problem with tar is that you need disk space to store the intermediate
> > > file between your backup and restore.
> >
> > Correction.  You can use tar without an intermediate file.  I was thinking
> > of a different situation I was messing with yesterday.  Oops.
> Since we are giving recipes, let me give one for a tarpipe.  This one
> presumes that source directory and target directory exist and are
> mounted on the same system (although one could be NFS mounted).  There
> are also recipes that will run over a network.
> To copy the contents of one directory (let's say /var) to another (let's
> say /var_new), as root:
>
> 	cd /var
> 	tar cplsSf - . | (cd /var_new; tar xpf -)
>
> If you like to watch, add a v to xpf (to see the files being unpacked).
> This slows it down a bit.  The flags stand for c(reate), p(reserve
> permissions and times), l(ocal filesystem only), s(ame order), S(parse
> files efficiently), f(ile to write to is) - (stdout).  The unpack
> command is run in a subshell, hence the ().  This will not copy files in
> e.g. /var/spool/mail if it is an NFS mount -- if you want them copied
> you'd need to remove the l option.
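
The tarpipe recipe quoted above can be sketched as a small shell function. This is only an illustration under assumptions: the name tarpipe_copy and the scratch directories are made up for the example, and only the c, p, x and f flags are used. As far as I know, on current GNU tar the letter l no longer means "local filesystem only" (the long option --one-file-system does that job), so it is left out here.

```shell
#!/bin/sh
set -eu

# tarpipe_copy SRC DST  (hypothetical helper name)
# Copy the contents of directory SRC into the existing directory DST.
tarpipe_copy() {
    src=$1
    dst=$2
    # The first tar writes an archive of SRC to stdout; the second tar,
    # run in a subshell that has changed into DST, unpacks it from stdin.
    # p on extract keeps the original permissions.
    (cd "$src" && tar -cf - .) | (cd "$dst" && tar -xpf -)
}

# Demo on scratch directories instead of the real /var and /var_new.
demo=$(mktemp -d)
mkdir -p "$demo/var/log" "$demo/var_new"
echo "hello" > "$demo/var/log/messages"
chmod 640 "$demo/var/log/messages"

tarpipe_copy "$demo/var" "$demo/var_new"
ls -l "$demo/var_new/log/messages"

rm -rf "$demo"
```

Changing into the source directory and archiving "." is what keeps the paths relative, so the files land directly under the target rather than under a copy of the source path.
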
> Another tool that I haven't heard mentioned for keeping filesystems in
> sync that is MUCH more efficient than dd or tar is rsync.  If your
> purpose is to maintain a reasonably accurate archival mirror of a key
> work directory, especially over the network, rsync is a great choice.
> It will run on top of your choice of remote shell, rsh if your site is
> low security or (preferred, in my opinion) ssh.
> rsync makes it very easy to maintain absolutely identical directory
> structures (not partitions or filesystems per se) with minimum effort.
> One recipe for this sort of function goes into my "synccvs" script,
> which I use to keep CVS repositories sync'd across several platforms I
> work on in different networks, e.g. my home network, my laptop, the
> physics department network.  By using this script, I can easily pop an
> exact copy of a working CVS repository on my workstation at Duke onto my
> laptop before a trip, work on the project all I want on the trip
> (checking it in as needed) and then pop an exact copy of the repository
> back from my laptop to all my other CVSROOTs when I get back.
> The script is pretty trivial:
>
> #!/bin/sh
> # Correct command-line invocation usage:
> Usage="Usage: `basename $0` cvs_pkg cvshost"
> # Usage fragment
> if [ $# -ne 2 ]
> then
> 	echo "$Usage" >&2
> 	exit 1
> fi
> CVS_PKG=$1
> CVS_HOST=$2
> # Remote shell for rsync to use (assumed here to be ssh)
> RSYNC_RSH=ssh
> export RSYNC_RSH
> echo "Synchronizing package $CVS_PKG with host $CVS_HOST at `date`"
> # \$CVSROOT is escaped so that it is expanded on the remote host
> rsync -avz --delete $CVSROOT/$CVS_PKG $CVS_HOST:\$CVSROOT
>
> (note that I'm too lazy to even do a proper job of parsing the command
> line) and the recipe for keeping pretty much arbitrary directory
> structures sync'd is obvious.  The only trickery is to NOT rsync
> $CVSROOT to $CVSROOT, or you'll end up with e.g.
> /home/rgb/Src/CVSROOT/CVSROOT -- it copies INTO the target directory,
> not onto the target directory.
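
The "INTO, not onto" behaviour described above can be tried out locally. A minimal sketch, assuming rsync is installed; the helper name sync_into and the scratch directory layout are made up for the example, and no remote host is involved.

```shell
#!/bin/sh
set -eu

# sync_into SRC_DIR DEST_DIR  (hypothetical helper name)
# With no trailing slash on SRC_DIR, rsync places the directory itself
# inside DEST_DIR, i.e. the result is DEST_DIR/<basename of SRC_DIR>.
sync_into() {
    rsync -a "$1" "$2"
}

if command -v rsync >/dev/null 2>&1; then
    demo=$(mktemp -d)
    mkdir -p "$demo/src/CVSROOT" "$demo/good" "$demo/bad/CVSROOT"
    echo "module" > "$demo/src/CVSROOT/modules"

    # Target is the parent directory: ends up as good/CVSROOT/modules.
    sync_into "$demo/src/CVSROOT" "$demo/good"

    # Target is the same-named directory: ends up nested one level too
    # deep, as bad/CVSROOT/CVSROOT/modules.
    sync_into "$demo/src/CVSROOT" "$demo/bad/CVSROOT"

    find "$demo/good" "$demo/bad" -type f
    rm -rf "$demo"
else
    echo "rsync not installed; skipping demo" >&2
fi
```
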
> When rsync runs it starts by doing full directory listings of source and
> target, checks to identify the files it needs to actually update, and
> only updates those files.  So if you've altered only three files (and
> the other 4257 files, occupying 500 MB, are untouched) it only sends
> three files instead of 4260 as tar would or however many bytes that
> there are in the partition as dd would.  Sending the stat information is
> of course generally MUCH cheaper than sending the data -- rsync will
> sync quite large directory structures in a few seconds to a few minutes.
> It also transparently and automagically compresses and decompresses
> files (with the z flag) if doing so makes sense for your network.  Since
> I often rsync through a DSL connection, it makes sense for me.  Over
> 100BT or better it might not, although it probably doesn't really
> matter as syncing is pretty fast regardless.
> The --delete flag tells it to delete any files in the target that no
> longer exist in the source.  Note that tar (as far as I know) does NOT
> remove existing files that do not conflict with anything in the archive.
> In the tarpipe example above, if /var_new/JUNK already existed (but not
> /var/JUNK) you would probably still find /var_new/JUNK there after the
> tarpipe completed.  If you are not careful, using tar to mirror a rapidly
> changing directory structure will end up with a mirror consisting of the
> union of all files and directories that ever existed in the source
> directory.  To avoid this, you have to delete all the files in the
> target before beginning.  This leaves a small window when your mirror
> doesn't exist and a crash in the primary will lose the directory.  I
> therefore think of rsync as much smarter and much safer (there are LOTS
> of options for rsync and it works transparently over the network).
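
The leftover-file behaviour described above is easy to reproduce on scratch directories. A minimal sketch: the helper name tar_mirror and the file names are made up for the example, and the rsync step only runs if rsync happens to be installed.

```shell
#!/bin/sh
set -eu

# tar_mirror SRC DST  (hypothetical helper name)
# Plain tarpipe copy of SRC's contents into DST; nothing is ever deleted.
tar_mirror() {
    (cd "$1" && tar -cf - .) | (cd "$2" && tar -xpf -)
}

work=$(mktemp -d)
mkdir -p "$work/var" "$work/var_new"
echo keep  > "$work/var/current"
echo stale > "$work/var_new/JUNK"    # exists only in the target

tar_mirror "$work/var" "$work/var_new"
echo "after tarpipe:"
ls "$work/var_new"                   # JUNK is still there next to current

if command -v rsync >/dev/null 2>&1; then
    # Trailing slashes: sync the contents of var onto var_new, and
    # remove anything in the target that the source no longer has.
    rsync -a --delete "$work/var/" "$work/var_new/"
    echo "after rsync --delete:"
    ls "$work/var_new"
fi

rm -rf "$work"
```
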
> Both tar and rsync require that a filesystem already exist on the target
> disk.  dd does not.  For example, you can do a poor man's copy of a CD-ROM
> by reading the raw /dev/cdrom into disk_image.iso and then mount it
> via loopback or copy it out onto a CD.  I don't believe dd is
> recommended for writing CD's although in principle it should work -- if
> nothing at all interrupts the write process.  I could definitely be
> wrong about the latter as I've never tried it.
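
The raw-copy idea above can be sketched without a CD drive by letting an ordinary file stand in for the device. Assumptions: the helper name image_copy, the 64k block size and the /mnt/cdrom mount point are choices made for this example only; the real device and mount commands are shown as comments because they need root and actual hardware.

```shell
#!/bin/sh
set -eu

# image_copy SRC DST  (hypothetical helper name)
# Byte-for-byte copy with dd; SRC can be a raw device or a plain file.
# No filesystem needs to exist on the destination.
image_copy() {
    dd if="$1" of="$2" bs=64k 2>/dev/null
}

demo=$(mktemp -d)
# A 1 MB scratch file stands in for the raw device.
dd if=/dev/zero of="$demo/fake_device" bs=1k count=1024 2>/dev/null

image_copy "$demo/fake_device" "$demo/disk_image.iso"
cmp "$demo/fake_device" "$demo/disk_image.iso" && echo "image is identical"

# With a real drive this would be along the lines of (as root):
#   dd if=/dev/cdrom of=disk_image.iso
#   mount -o loop,ro disk_image.iso /mnt/cdrom   # mount point is an assumption

rm -rf "$demo"
```
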
> dump has also been ported to linux, and one can also copy filesystems
> via a dump | restore pipe (I used to do this from time to time on Suns)
> but because dump is far from universal on linux boxen I have fallen back
> to using tar for the same purpose in pretty much the same way.
> -- 
> Robert G. Brown	             
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at
> _______________________________________________
> Beowulf mailing list, Beowulf at