[Lustre-discuss] ll_ost thread soft lockup
Tae Young Hong
catchrye at gmail.com
Tue Mar 20 21:17:23 PDT 2012
It worked.
Thank you for the valuable tip. I never expected that such a simple command (e2fsck -fD) could be the solution.
regards,
Taeyoung Hong
On Mar 21, 2012, at 7:43 AM, Bernd Schubert wrote:
> I'm removing lustre-discuss as I'm an FhGFS developer now and I don't
> think my boss would like to see me posting to Lustre lists... Anyway,
> I'm still reading and sometimes helping here.
>
> On 03/20/2012 03:42 PM, Tae Young Hong wrote:
>>
>> Thank you for your information.
>> Today I tested our OSS after reading bugzilla 24264. After patching the kernel (http://review.whamcloud.com/#change,1672), I rebuilt the md device in question with one new disk added (because we had only 9 disks for the RAID6 8+2 array), then reran e2fsck -fn, and finally tried to mount it, but I still saw the ll_ost soft lockup. The call trace messages are the same as before, so I think ours is not the case you described.
>>
>> Anyway, yesterday I tried the simplest test, shown below, to see whether ldiskfs works properly on its own.
>>
>> mount -t ldiskfs -o ro,extents,mballoc /dev/md17 /mnt/kkk
>> find /mnt/kkk -type f | while IFS= read -r f; do echo "$f" >&2; cat "$f" > /dev/null; done
>>
>> I got the following syslog messages while running this "find/cat" command; however, the command finished without any other kernel error or soft lockup.
>>
>> Mar 19 17:57:51 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341259: rec_len is smaller than minimal - offset=806912, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:09 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=978944, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:11 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=282624, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:11 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=290816, inode=0, rec_len=0, name_len=0
>> Mar 19 19:01:15 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341379: rec_len is smaller than minimal - offset=528384, inode=0, rec_len=0, name_len=0
>> Mar 19 19:18:14 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341258: rec_len is smaller than minimal - offset=1196032, inode=0, rec_len=0, name_len=0
>> Mar 19 19:18:14 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341258: rec_len is smaller than minimal - offset=1187840, inode=0, rec_len=0, name_len=0
>> ...
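
The errors quoted above repeat for only a few directory inodes, so a per-directory count makes the damage easier to see. What follows is a minimal sketch, assuming the kernel messages keep exactly the format shown above; the function name summarize_htree_errors is hypothetical and not part of any Lustre or e2fsprogs tool.

```shell
#!/bin/sh
# Hypothetical helper: count "bad entry in directory" errors per directory inode.
# Reads syslog lines on stdin and assumes the message format quoted above.
summarize_htree_errors() {
    # Keep only the htree error lines and extract the number after
    # "directory #", then print "<inode> <count>" sorted by inode number.
    sed -n 's/.*htree_dirblock_to_tree: bad entry in directory #\([0-9][0-9]*\):.*/\1/p' \
        | sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

It could be fed from wherever the kernel log ends up on the OSS, for example `summarize_htree_errors < /var/log/messages` (that path is an assumption). The inode numbers it reports can then be passed to debugfs, e.g. `debugfs -R "ncheck 341259" /dev/md17`, to look up the directory names.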
>
> Ideally, you would upload an e2image somewhere so that we can fix e2fsck.
> If you are willing to do that, the procedure would be:
>
> e2image -r /dev/md17 /path/to/md17.e2image
> tar cvfS image.tar /path/to/md17.e2image
>
> You probably don't want to run the
> "e2image -r /dev/hda1 - | bzip2 > hda1.e2i.bz2" command from the man
> page, as compressing all the zeros with bzip2 usually takes several days.
>
> However, the most recent e2fsprogs versions can now also create qcow2
> images, which should be the fastest way to create the metadata
> image. I'm not sure whether the current e2fsprogs version for Lustre
> can already do that.
>
> Anyway, to fix the htree problem, running "e2fsck -fvD /dev/md17"
> should do the trick. But I would really like to see e2fsck improved so
> that it fixes this automatically, without "-D", some day.
>
> Cheers,
> Bernd
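
For anyone finding this thread later, the check and repair steps mentioned in the messages above can be collected into one place. This is only a sketch under stated assumptions: the ordering is my reading of the thread, the function name print_htree_repair_plan is hypothetical, and it prints the commands instead of running them so that nothing touches a device by accident. The commands and flags themselves are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the check/repair sequence from this thread.
# The ordering is an assumption; the commands and flags are those quoted above.
print_htree_repair_plan() {
    dev=$1   # block device of the unmounted OST, e.g. /dev/md17
    mnt=$2   # scratch mount point, e.g. /mnt/kkk
    echo "e2fsck -fn $dev"      # forced read-only check, changes nothing
    echo "e2fsck -fvD $dev"     # forced check that also rebuilds directories
    echo "mount -t ldiskfs -o ro,extents,mballoc $dev $mnt"   # read-only re-test
}

print_htree_repair_plan /dev/md17 /mnt/kkk
```

After the read-only mount, the "find/cat" loop from the earlier message can be rerun to confirm that no new LDISKFS-fs errors show up in syslog.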