[Lustre-discuss] e2fsck mdsdb: DB_NOTFOUND
Aaron Knister
aaron at iges.org
Thu Mar 13 13:50:04 PDT 2008
What version of lustre/kernel is running on the problematic server?
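
A quick way to collect both answers on the affected server is sketched below. It is only a sketch: it assumes the server exposes its Lustre version at /proc/fs/lustre/version (an assumption that holds for some Lustre 1.x releases and may not match yours), and the function name collect_versions is made up for illustration.

```python
import platform
from pathlib import Path

# Assumption: this proc path is where the running Lustre modules report
# their version. Adjust it if your release exposes the version elsewhere.
LUSTRE_VERSION_FILE = Path("/proc/fs/lustre/version")


def collect_versions(lustre_file=LUSTRE_VERSION_FILE):
    """Return the running kernel release and, if readable, the Lustre version text."""
    lustre = None
    try:
        lustre = Path(lustre_file).read_text().strip()
    except OSError:
        # The file is absent when the Lustre modules are not loaded
        # (or when the assumed path does not apply to this release).
        pass
    return {"kernel": platform.release(), "lustre": lustre}


if __name__ == "__main__":
    for name, value in collect_versions().items():
        print(f"{name}: {value if value is not None else 'unavailable'}")
```

Since the server reportedly panics once Lustre starts, the kernel release may be the only value you can read safely; the Lustre entry will show as unavailable if the modules are not loaded.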
On Mar 13, 2008, at 11:02 AM, Michelle Butler wrote:
> We got past that point by running e2fsck on the individual partitions first.
>
> But I'm sorry to say we are still having problems. We have an I/O
> server that is fine until we start Lustre. It then starts spewing
> Lustre call traces:
>
> Call Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22}
> <ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
> <ffffffff8013327d>{default_wake_function+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffff80110ebb>{child_rip+8}
> <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0}
> <ffffffff80110eb3>{child_rip+0}
>
> ll_ost_io_232 S 000001037d6bbee8 0 26764 1 26765 26763 (L-TLB)
> 000001037d6bbe58 0000000000000046 0000000100000246 0000000000000003
> 0000000000000016 0000000000000001 00000104100bcb20 0000000300000246
> 00000103f5470030 000000000001d381
> Call Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22}
> <ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
> <ffffffff8013327d>{default_wake_function+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffff80110ebb>{child_rip+8}
> <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0}
> <ffffffff80110eb3>{child_rip+0}
>
> ll_ost_io_233 S 00000103de847ee8 0 26765 1 26766 26764 (L-TLB)
> 00000103de847e58 0000000000000046 0000000100000246 0000000000000001
> 0000000000000016 0000000000000001 000001040f83c620 0000000100000246
> 00000103e627e030 000000000001d487
> Call Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22}
> <ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
> <ffffffff8013327d>{default_wake_function+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffff80110ebb>{child_rip+8}
> <ffffffffa03e0163>{:ptlrpc:ptlrpc_main+0}
> <ffffffff80110eb3>{child_rip+0}
>
> ll_ost_io_234 S 00000100c4353ee8 0 26766 1 26767 26765 (L-TLB)
> 00000100c4353e58 0000000000000046 0000000100000246 0000000000000003
> 0000000000000016 0000000000000001 00000104100bcc60 0000000300000246
> 00000103de81b810 000000000001d945
> Call Trace:<ffffffffa02fa089>{:libcfs:lcw_update_time+22}
> <ffffffffa03e06e3>{:ptlrpc:ptlrpc_main+1408}
> <ffffffff8013327d>{default_wake_function+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
>
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retr
> [console output garbled here]
> <ffffffff8013327d>{default_wake_function+0}
> <ffffffffa03e0156>{:ptlrpc:ptlrpc_retry_rqbds+0}
> <ffffffffa03e0156>{:ptl
>
> It then panics the kernel.
>
> Michelle Butler
>
> At 02:39 AM 3/13/2008, Andreas Dilger wrote:
>> On Mar 12, 2008 06:44 -0500, Karen M. Fernsler wrote:
>>> I'm running:
>>>
>>> e2fsck -y -v --mdsdb mdsdb --ostdb osth3_1 /dev/mapper/27l4
>>>
>>> and getting:
>>>
>>> Pass 6: Acquiring information for lfsck
>>> error getting mds_hdr (3685469441:8) in /post/cfg/mdsdb: DB_NOTFOUND: No matching key/data pair found
>>> e2fsck: aborted
>>>
>>> Any ideas how to get around this?
>>
>> Does "mdsdb" actually exist? This should be created by first
>> running:
>>
>> e2fsck --mdsdb mdsdb /dev/{mdsdevicename}
>>
>> before running your above command on the OST.
>>
>> Please also try specifying the absolute pathname for the mdsdb and
>> ostdb files.
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
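
For reference, the ordering Andreas describes (build the mdsdb from the MDS device first, then run the OST pass with absolute database paths) can be sketched as below. This is only a sketch under stated assumptions: the helper names build_lfsck_db_commands and mdsdb_ready are made up, /dev/mdsdevicename is a placeholder, and the script only prints the command lines. It uses only the e2fsck options that already appear in this thread. A non-empty mdsdb file is a sanity check, not proof that the DB_NOTFOUND error is resolved.

```python
import os


def build_lfsck_db_commands(mds_device, ost_device, mdsdb, ostdb):
    """Build the two e2fsck command lines, in the order they should be run.

    Both database paths are converted to absolute paths, as suggested in
    the quoted reply. Nothing is executed here.
    """
    mdsdb_abs = os.path.abspath(mdsdb)
    ostdb_abs = os.path.abspath(ostdb)
    # Step 1: run against the MDS device to create the mdsdb file.
    mds_cmd = ["e2fsck", "--mdsdb", mdsdb_abs, mds_device]
    # Step 2: run against the OST device, reading mdsdb and writing ostdb.
    ost_cmd = ["e2fsck", "-y", "-v", "--mdsdb", mdsdb_abs,
               "--ostdb", ostdb_abs, ost_device]
    return mds_cmd, ost_cmd


def mdsdb_ready(mdsdb):
    """Return True if the mdsdb file exists and is non-empty."""
    return os.path.isfile(mdsdb) and os.path.getsize(mdsdb) > 0


if __name__ == "__main__":
    # Placeholder MDS device; the other values are taken from this thread.
    mds_cmd, ost_cmd = build_lfsck_db_commands(
        "/dev/mdsdevicename", "/dev/mapper/27l4",
        "/post/cfg/mdsdb", "/post/cfg/osth3_1")
    print(" ".join(mds_cmd))
    if not mdsdb_ready("/post/cfg/mdsdb"):
        print("mdsdb is missing or empty; run the MDS step first")
    print(" ".join(ost_cmd))
```
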
Aaron Knister
Associate Systems Analyst
Center for Ocean-Land-Atmosphere Studies
(301) 595-7000
aaron at iges.org