[Beowulf] XFS/EVMS/kernel 2.6.11
Suvendra Nath Dutta
sdutta at cfa.harvard.edu
Wed Mar 15 13:46:10 PST 2006
I am using the above combination. At boot I get loads of the
following:
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
<4>device-mapper: error adding target to table
<3>device-mapper: dm-linear: Device lookup failed
Then, after the system had been running fine for a while, I suddenly started to get:
Mar 15 14:09:54 sauron kernel: XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1610 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff80277e49
Mar 15 14:09:54 sauron kernel:
Mar 15 14:09:54 sauron kernel: Call Trace:<ffffffff80276383>{xfs_free_ag_extent+1251} <ffffffff80277e49>{xfs_free_extent+185}
Mar 15 14:09:54 sauron kernel: <ffffffff802a1994>{xfs_efd_init+68} <ffffffff802856bd>{xfs_bmap_finish+253}
Mar 15 14:09:54 sauron kernel: <ffffffff802aad30>{xfs_itruncate_finish+416} <ffffffff802bc479>{xfs_trans_alloc+217}
Mar 15 14:09:54 sauron kernel: <ffffffff802c1adf>{xfs_inactive+591} <ffffffff80155887>{find_get_pages+119}
Mar 15 14:09:54 sauron kernel: <ffffffff801607e3>{truncate_inode_pages+435} <ffffffff802d12df>{vn_rele+95}
Mar 15 14:09:54 sauron kernel: <ffffffff802cfc32>{linvfs_clear_inode+18} <ffffffff8019324e>{clear_inode+142}
Mar 15 14:09:54 sauron kernel: <ffffffff80193865>{generic_delete_inode+165} <ffffffff8019262e>{iput+126}
Mar 15 14:09:54 sauron kernel: <ffffffff8018a206>{sys_unlink+262} <ffffffff8018c078>{sys_getdents+232}
Mar 15 14:09:54 sauron kernel: <ffffffff8018b50f>{sys_fcntl+815} <ffffffff8010d54e>{system_call+126}
Mar 15 14:09:54 sauron kernel:
Mar 15 14:09:54 sauron kernel: xfs_force_shutdown(dm-1,0x8) called from line 4073 of file fs/xfs/xfs_bmap.c. Return address = 0xffffffff802d0e28
Mar 15 14:09:54 sauron kernel: Filesystem "dm-1": Corruption of in-memory data detected. Shutting down filesystem: dm-1
Mar 15 14:09:54 sauron kernel: Please umount the filesystem, and rectify the problem(s)
At this point the following ensues:
sauron:~ # ls /raid3/sdutta
/bin/ls: /raid3/sdutta: Input/output error
On reboot, the boot.msg has this:
<5>XFS mounting filesystem dm-1
<5>Starting XFS recovery on filesystem: dm-1 (dev: dm-1)
<1>XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1583 of file fs/xfs/xfs_alloc.c. Caller 0xffffffff80277e49
<4>
<4>Call Trace:<ffffffff80276383>{xfs_free_ag_extent+1251} <ffffffff80277e49>{xfs_free_extent+185}
<4> <ffffffff802a1994>{xfs_efd_init+68} <ffffffff802bd56b>{xfs_trans_get_efd+43}
<4> <ffffffff802b5f41>{xlog_recover_finish+401} <ffffffff802df7c1>{__up_write+49}
<4> <ffffffff802b1ddb>{xfs_log_mount_finish+27} <ffffffff802b9584>{xfs_mountfs+2612}
<4> <ffffffff802c93fd>{xfs_setsize_buftarg_flags+61} <ffffffff802bf380>{xfs_mount+2432}
<4> <ffffffff802d0020>{linvfs_fill_super+0} <ffffffff802d0b28>{vfs_mount+40}
<4> <ffffffff802d00d3>{linvfs_fill_super+179} <ffffffff802d0020>{linvfs_fill_super+0}
<4> <ffffffff802e12b3>{snprintf+131} <ffffffff80542cb3>{__down_write+51}
<4> <ffffffff802dff1e>{strlcpy+78} <ffffffff8017f7f5>{sget+949}
<4> <ffffffff8017ebc0>{set_bdev_super+0} <ffffffff8017fe34>{get_sb_bdev+276}
<4> <ffffffff8017faff>{do_kern_mount+111} <ffffffff80196daa>{do_mount+1642}
<4> <ffffffff8011e2c2>{do_page_fault+1202} <ffffffff802e1f2e>{_atomic_dec_and_lock+46}
<4> <ffffffff80188f7d>{link_path_walk+3581} <ffffffff8015acd4>{buffered_rmqueue+516}
<4> <ffffffff8015aa80>{__get_free_pages+16} <ffffffff80196ebc>{sys_mount+156}
<4> <ffffffff8010d54e>{system_call+126}
<5>Ending XFS recovery on filesystem: dm-1 (dev: dm-1)
Sooner or later the filesystem on this partition crashes again and does
not recover until the next reboot.
Has anyone else seen this behaviour? Is there a solution?
Suvendra.