[Lustre-discuss] e2fsck mdsdb: DB_NOTFOUND

Michelle Butler mbutler at ncsa.uiuc.edu
Thu Mar 13 08:02:46 PDT 2008


We got past that point by running e2fsck on the individual partitions first.

But I'm sorry to say we are still having problems.
We have an I/O server that is fine until we start
Lustre. It then starts spewing Lustre call traces:

Call 
Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22} 
<ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
        <ffffffff8013327d>{default_wake_function+0} 
<ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
        <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0} 
<ffffffff80110ebb>{child_rip+8}
        <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0} 
<ffffffff80110eb3>{child_rip+0}

ll_ost_io_232 S 000001037d6bbee8     0 26764      1         26765 26763 (L-TLB)
000001037d6bbe58 0000000000000046 0000000100000246 0000000000000003
        0000000000000016 0000000000000001 00000104100bcb20 0000000300000246
        00000103f5470030 000000000001d381
Call 
Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22} 
<ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
        <ffffffff8013327d>{default_wake_function+0} 
<ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
        <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0} 
<ffffffff80110ebb>{child_rip+8}
        <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0} 
<ffffffff80110eb3>{child_rip+0}

ll_ost_io_233 S 00000103de847ee8     0 26765      1         26766 26764 (L-TLB)
00000103de847e58 0000000000000046 0000000100000246 0000000000000001
        0000000000000016 0000000000000001 000001040f83c620 0000000100000246
        00000103e627e030 000000000001d487
Call 
Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22} 
<ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
        <ffffffff8013327d>{default_wake_function+0} 
<ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
        <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0} 
<ffffffff80110ebb>{child_rip+8}
        <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0} 
<ffffffff80110eb3>{child_rip+0}

ll_ost_io_234 S 00000100c4353ee8     0 26766      1         26767 26765 (L-TLB)
00000100c4353e58 0000000000000046 0000000100000246 0000000000000003
        0000000000000016 0000000000000001 00000104100bcc60 0000000300000246
        00000103de81b810 000000000001d945
Call 
Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22} 
<ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
        <ffffffff8013327d>{default_wake_function+0} 
<ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
        <ffffffffa03e0156>{:ptlrpc:ptlrpc_retr [console output garbled here]
<ffffffff8013327d>{default_wake_function+0} 
<ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
        <ffffffffa03e0156>{:ptl

It then panics the kernel. Any idea why?

Michelle Butler

At 02:39 AM 3/13/2008, Andreas Dilger wrote:
>On Mar 12, 2008  06:44 -0500, Karen M. Fernsler wrote:
> > I'm running:
> >
> > e2fsck -y -v --mdsdb mdsdb --ostdb osth3_1 /dev/mapper/27l4
> >
> > and getting:
> >
> > Pass 6: Acquiring information for lfsck
> > error getting mds_hdr (3685469441:8) in /post/cfg/mdsdb: DB_NOTFOUND: No matching key/data pair found
> > e2fsck: aborted
> >
> > Any ideas how to get around this?
>
>Does "mdsdb" actually exist?  This should be created by first running:
>
>e2fsck --mdsdb mdsdb /dev/{mdsdevicename}
>
>before running your above command on the OST.
>
>Please also try specifying the absolute pathname for the mdsdb and ostdb
>files.
>
>Cheers, Andreas
>--
>Andreas Dilger
>Sr. Staff Engineer, Lustre Group
>Sun Microsystems of Canada, Inc.
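
For reference, the ordering Andreas describes (build the mdsdb on the MDS first, then run the OST pass with absolute paths) could be sketched as the small script below. It only prints the commands and never runs e2fsck. The paths, the device names, and the helper functions check_mdsdb, mds_pass_cmd and ost_pass_cmd are all made-up placeholders for illustration. The only e2fsck options used are the ones already quoted in this thread.

```shell
#!/bin/sh
# Sketch only: every path and device name here is a hypothetical placeholder.
MDSDB=/tmp/lfsck/mdsdb        # absolute path, per the advice above
OSTDB=/tmp/lfsck/ostdb        # absolute path for the per-OST database
MDS_DEV=/dev/MDSDEV           # placeholder for the MDS block device
OST_DEV=/dev/OSTDEV           # placeholder for one OST block device

# Succeed only if the mdsdb path is absolute and the file exists and is non-empty.
check_mdsdb() {
    case "$1" in
        /*) ;;
        *) echo "mdsdb path is not absolute: $1" >&2; return 1 ;;
    esac
    if [ ! -s "$1" ]; then
        echo "mdsdb missing or empty, run the MDS pass first: $1" >&2
        return 1
    fi
    return 0
}

# Print (not run) the MDS pass that creates the mdsdb file.
mds_pass_cmd() {
    echo "e2fsck --mdsdb $1 $2"
}

# Print (not run) the OST pass, using the flags from the original command.
ost_pass_cmd() {
    echo "e2fsck -y -v --mdsdb $1 --ostdb $2 $3"
}

echo "Step 1 (on the MDS): $(mds_pass_cmd "$MDSDB" "$MDS_DEV")"
if check_mdsdb "$MDSDB" 2>/dev/null; then
    echo "Step 2 (on the OST): $(ost_pass_cmd "$MDSDB" "$OSTDB" "$OST_DEV")"
else
    echo "Step 2 skipped: $MDSDB not found yet"
fi
```

The check before step 2 is there because a missing mdsdb file or a relative path are the two causes Andreas suggests ruling out for the DB_NOTFOUND error. It does not verify that the database contents are valid.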




