[Lustre-discuss] OST crashed after slow journal messages

Erik Froese erik.froese at gmail.com
Fri Jan 1 16:20:50 PST 2010


On Thu, Dec 31, 2009 at 4:52 PM, Andreas Dilger <adilger at sun.com> wrote:

> On 2009-12-30, at 08:44, Erik Froese wrote:
>
>> I had an OST crash (actually it made the entire OSS unresponsive to the
>> point where I had to shoot it).  There were messages in /var/log/messages
>> complaining about slow journal performance (we have separate OSTs and
>> journal disks).
>>
>> Dec 28 20:29:02 oss-0-0 kernel: LustreError:filter_commitrw_write())
>> scratch-OST000e: slow direct_io 85s
>> Dec 28 20:29:02 oss-0-0 kernel: LustreError:filter_commitrw_write())
>> Skipped 58 previous similar messages
>> Dec 28 21:50:13 oss-0-0 kernel: LustreError:fsfilt_commit_wait())
>> scratch-OST000e: slow journal start 51s
>> Dec 28 21:50:13 oss-0-0 kernel: LustreError:fsfilt_commit_wait()) Skipped
>> 66 previous similar messages
>>
>
> These are usually a sign that the back-end storage is overloaded, or somehow
> performing very slowly.  Maybe there was a RAID rebuild going on?


I don't see any hardware failures or rebuild messages on the RAID. I do see
periodic media scans going on.
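
To line the slow messages up against the media scan times, something like the
following could summarize them per OST. It is only a sketch: it assumes every
syslog line has the same shape as the ones quoted above (OST name right before
the word "slow", delay as the last "NNs" token), and slow_summary is a helper
name made up for this example.

```shell
# Summarize Lustre "slow ..." messages read from stdin.
# Prints one line per OST and operation: count and worst delay seen.
# Assumes lines shaped like "... scratch-OST000e: slow direct_io 85s".
slow_summary() {
    awk '
    {
        for (i = 2; i < NF; i++) {
            if ($i != "slow") continue
            ost = $(i - 1); sub(/:$/, "", ost)      # strip trailing colon
            op = ""
            for (j = i + 1; j <= NF; j++) {
                if ($j ~ /^[0-9]+s$/) {
                    secs = $j + 0                   # "85s" -> 85
                    key = ost " " op
                    n[key]++
                    if (secs > max[key]) max[key] = secs
                    break
                }
                # words between "slow" and the delay name the operation
                op = (op == "" ? $j : op "_" $j)
            }
            break
        }
    }
    END { for (k in n) printf "%s count=%d max=%ds\n", k, n[k], max[k] }
    '
}
```

Usage would be "slow_summary < /var/log/messages". The "Skipped N previous
similar messages" lines are ignored, so the counts only cover messages that
were actually logged.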


>
>
>> Lustre and e2fsprogs versions:
>>
>> [root at oss-0-0 ~]# rpm -q kernel-lustre
>> kernel-lustre-2.6.18-128.7.1.el5_lustre.1.8.1.1
>> [root at oss-0-0 ~]# rpm -q e2fsprogs
>> e2fsprogs-1.41.6.sun1-0redhat
>>
>>
>> Then there's this interesting message:
>> Dec 29 14:11:32 oss-0-0 kernel: LDISKFS-fs error (device sdz):
>> ldiskfs_lookup: unlinked inode 5384166 in dir #145170469
>> Dec 29 14:11:32 oss-0-0 kernel: Remounting filesystem read-only
>>
>
> This means the ldiskfs code found some corruption on disk, and remounted
> the filesystem read-only to avoid further corruptions on disk.
>
>
>> Whenever I try to mount the OST (known as /dev/dsk/ost24) I get the
>> following messages:
>> Dec 29 19:25:35 oss-0-0 kernel: LDISKFS-fs error (device sdz):
>> ldiskfs_check_descriptors: Checksum for group 16303 failed (64812!=44)
>> Dec 29 19:25:35 oss-0-0 kernel: LDISKFS-fs: group descriptors corrupted!
>>
>>
>> So it looks like the "group descriptors" are corrupted. I'm not sure what
>> those are but e2fsck -n sure enough complains about them. So I tried running
>> it for real.
>>
>> I ran e2fsck -j /dev/$JOURNAL -v -fy -C 0 /dev/$DEVICE.
>>
>> The first time I ran to what looked like completion. It printed a summary
>> and all but then didn't exit. I sent it a kill but that didn't stop it. So I
>> let it run and went back to sleep for 3 hours. When I woke up the process
>> was gone but I still get the same error messages.
>>
>
> Having a log of the e2fsck errors would be helpful.


I don't have the log for the first fsck. First it logged a bunch of messages
about the group descriptors and that it was repairing them. Then messages
about inodes. I'm attaching the logs from the second e2fsck.
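
So that a log is not lost again, a small wrapper along these lines could keep a
copy of everything e2fsck prints together with its exit status. This is a
sketch under my own assumptions: run_logged is a made-up helper name, and the
log path in the usage line is only an example.

```shell
# Run a command, show its output on the terminal, and keep a copy in LOG.
# The exit status of the command is saved in LOG.status and returned,
# since the status of a plain "cmd | tee" pipeline would be that of tee.
run_logged() {
    log=$1
    shift
    { "$@" 2>&1; echo $? > "$log.status"; } | tee "$log"
    return "$(cat "$log.status")"
}
```

For example: run_logged /root/e2fsck.sdz.log e2fsck -j /dev/$JOURNAL -v -fy -C 0 /dev/$DEVICE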




>
>
>> I found this discussion
>> http://lists.lustre.org/pipermail/lustre-discuss/2009-March/009885.html
>> and tried the tune2fs command followed by the e2fsck but it hasn't exited
>> yet (it's a 2.7 TB OST)
>>
>
> It might take an hour or two, depending on how fast your storage is.
>
>
>> The LUN comes from a Sun STK 6140/CSM200 device which isn't reporting any
>> warnings, events, or errors.
>>
>> I deactivated the OST with lctl but it still shows up as active on the
>> clients. Also lfs find /scratch -O scratch-OST000e_UUID HANGS!
>>
>
> You also need to deactivate it on the clients, at which point they will
> get an IO error when accessing files on that OST.


I didn't know that you had to deactivate it on the clients as well. Thanks.
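
For anyone who finds this thread later, here is roughly how I understand the
client-side step. It is a sketch, not a tested procedure: it assumes "lctl dl"
prints the device index in the first column, the type in the third and the
name in the fourth, and osc_devno is a helper name made up for this example.

```shell
# Print the device index of the OSC that talks to the given OST.
# Reads "lctl dl" output on stdin; assumed columns are
#   index  status  type  name  uuid  refcount
osc_devno() {
    awk -v ost="$1" '
        $3 == "osc" && index($4, ost "-osc") == 1 { print $1 }
    '
}

# On each client (hypothetical usage, run as root):
#   devno=$(lctl dl | osc_devno scratch-OST000e)
#   lctl --device "$devno" deactivate
```

After that, access to files with objects on the OST should fail with an IO
error instead of hanging, as Andreas describes above.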

>
>
>> Are we screwed here? Is there a way to run lfs find with the OST disabled?
>> Shouldn't that just be a metadata operation?
>>
>
>
> The size of a file is stored on the OSTs, so it depends on what you are
> trying to do.  "lfs getstripe" can be run with a deactivated OST.
>
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: e2fsck-fy.sdz.out.post-mmp
Type: application/octet-stream
Size: 31819 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100101/49d7d319/attachment.obj>

