[Beowulf] strange problem with large file moving between server

Jörg Saßmannshausen j.sassmannshausen at ucl.ac.uk
Thu Oct 23 13:06:19 PDT 2014


Dear all,

further to my last email, the problem is sorted. In the end it turned out
that the SCSI HBA was at fault. Trying to update its firmware left the card
completely inoperable. :-(
Fortunately I had a different card, and swapping it in fixed the problem.

Thanks to everybody for their suggestions.

All the best from London

Jörg

On Sunday 21 September 2014 Jörg Saßmannshausen wrote:
> Dear all,
> 
> I have a rather strange problem with one of my file servers, which I
> recently upgraded in order to accommodate more disc space.
> 
> The problem: I copied the files from the old file space to a temporary
> disc storage space using this rsync command:
> 
> rsync -vrltH -pgo --stats -D --numeric-ids -x oldserver:foo  tempspace:baa
> 
> I have been doing this for some years now and never had any problems.
> 
> As always, I ran md5sum afterwards to make sure there is no problem later
> on and the user does not lose data. This time the md5sum of one rather
> large file (around 16 GB) failed after I moved the files from the temp
> space back to the new destination using the same command as above.
> 
> Still having access to the old file space, I decided to copy this file
> again from there. Strangely enough, rsync did not sync the file again,
> so I had to delete it first. Even after deleting the file and re-syncing
> it from the old source, the md5sum is wrong.
> 
> Copying the file to a different file space did not cause this problem,
> i.e. the md5sum is correct.
> As it is a tar.gz file, I simply decided to decompress the original file on
> the different file server. That worked. The copy with the wrong md5sum
> did not decompress on the different file server; gunzip aborted with an
> error message. So the file is broken.
> 
> The setup:
> 
> Originally I was using an old Infortrend box which had old PATA discs in
> it. This box is connected via SCSI to a frontend server which exports the
> file space via iSCSI. The backend, i.e. the machine the users actually
> access, is a XEN guest on a different physical machine. The reason for
> that setup is that the frontend also acts as a backup server and I don't
> want people to have access to it.
> I then exchanged the Infortrend box for a more recent model which has SATA
> capabilities but still has a SCSI connection to the frontend. The frontend
> is the same. I got a new controller for that box as the old one was
> broken. There are no changes in the backend; that is still the same XEN
> guest on the same hardware.
> 
> What I cannot work out is why the old Infortrend box does not have any
> problems with this file while the newer one does. Also, when I copied
> over some files (again using the rsync command above), a few files did
> not copy correctly at first (again according to md5sum) but did so
> later.
> 
> I find that highly alarming, as it means that at least for larger and/or
> some binary files there seems to be a problem. However, I am not sure
> where to look, as I am out of ideas.
> 
> Could it be there is a problem with the 'new' controller?
> In all cases I was using ext4 as a file system and I did not have any
> problems with that.
> 
> Does anybody have any thoughts on this?
> 
> All the best from a sunny London
> 
> Jörg
> 
> P.S. To make things worse, I am off on a work-related trip from Monday
> onwards, and I have been working on this problem since Friday evening.
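
As an aside on the md5sum check described in the quoted message: below is a
minimal sketch of that verification step in Python, assuming the original
tree and the copy are both mounted on the machine doing the check. The
function names and the chunk size are illustrative choices only; the check
actually used was the md5sum command. Files are read in chunks so that an
archive of around 16 GB does not have to fit in memory.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of one file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(source_root, copy_root):
    """Compare each regular file under source_root with the file at the same
    relative path under copy_root. Return the relative paths (as strings)
    that are missing from the copy or whose checksums differ."""
    source_root = Path(source_root)
    copy_root = Path(copy_root)
    bad = []
    for src in sorted(source_root.rglob("*")):
        # Skip directories, symlinks and special files; only file data is hashed
        if src.is_symlink() or not src.is_file():
            continue
        rel = src.relative_to(source_root)
        dst = copy_root / rel
        if not dst.is_file() or md5_of_file(src) != md5_of_file(dst):
            bad.append(str(rel))
    return bad
```

Calling find_mismatches("/old/foo", "/new/baa") (both paths hypothetical)
would return the list of files to look at more closely. Note that a check
like this only detects corruption; as the thread shows, locating the faulty
component still meant comparing results across different hardware paths.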


-- 
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ 

email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html