[Beowulf] Cluster doesn't like being moved

Joe Landman landman at scalableinformatics.com
Tue Mar 10 14:06:39 PDT 2009



Steve Herborn wrote:

> This time it appears that the file system & configuration databases 
> became corrupted after moving the equipment. Several services aren't 
> starting up (LDAP, DHCP, PBS to name a few) and YAST2 hangs any time an 
> attempt is made to use it, for example when adding a printer or 
> software package. My co-worker feels the issue may be related to the 
> ReiserFS file system with AMD processors. The ReiserFS file system was 
> the default presented when I initially installed SLES so I went with it.

Ouch.  Can you boot an OpenSuSE disk in rescue mode and fsck the file 
system?
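
Before running the check it helps to know which partitions are actually 
ReiserFS. A minimal sketch, assuming Python is available in the rescue 
environment and that the installed system's fstab uses the usual 
whitespace-separated layout (device, mount point, type, ...). The 
function name reiserfs_entries is made up for illustration, and the 
sketch only reads text; it does not touch the disk.

```python
def reiserfs_entries(fstab_text):
    """Return (device, mount_point) pairs for reiserfs lines in fstab-style text."""
    entries = []
    for line in fstab_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # Field 0 is the device, field 1 the mount point, field 2 the type.
        if len(fields) >= 3 and fields[2] == "reiserfs":
            entries.append((fields[0], fields[1]))
    return entries
```

Feed it the contents of the installed system's /etc/fstab (wherever the 
rescue environment exposes it), then fsck each listed device while it 
is unmounted.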

I have had two severe data losses in my work on Linux: one with ext2, 
the other with ReiserFS.  I wouldn't recommend either one in production 
for data that needs long-term viability.

> Do you know of any issues with using the ReiserFS file system on AMD 
> based systems or have any other ideas what I may be facing?

Yes, ReiserFS may have been silently accumulating errors that only 
became apparent upon restart.  Or its fsck munged the file system.

If you can move off of it, I would urge you to do that.  It is likely 
that the configuration data that your non-starting services depend upon 
is lost.  Can you rebuild it (or pay someone to do so) without too 
much pain?

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615



