[Lustre-discuss] ll_ost thread soft lockup

Tae Young Hong catchrye at gmail.com
Tue Mar 20 21:17:23 PDT 2012


It worked.
Thank you for the valuable tip. I never expected that such a simple command (e2fsck -fD) could be the solution.

regards,
Taeyoung Hong
 

On Mar 21, 2012, at 7:43 AM, Bernd Schubert wrote:

> I'm removing lustre-discuss as I'm an FhGFS developer now and I don't
> think my boss would like to see me posting to Lustre lists... Anyway,
> I'm still reading and sometimes helping here.
> 
> On 03/20/2012 03:42 PM, Tae Young Hong wrote:
>> 
>> Thank you for the information.
>> Today I tested our OSS after reading bugzilla 24264. After patching the kernel (http://review.whamcloud.com/#change,1672), I rebuilt the md device in question with one new disk added (because we had only 9 disks for RAID6 8+2), then reran e2fsck -fn, and finally tried to mount it, but I still saw the ll_ost soft lockup. The call trace messages are the same as before, so I think ours is not the case you described.
>> 
>> Anyway, yesterday I tried the simple test below to see whether ldiskfs works properly on its own.
>> 
>> mount -t ldiskfs -o  ro,extents,mballoc /dev/md17 /mnt/kkk
>> find /mnt/kkk -type f | while IFS= read -r f; do echo "$f" >&2; cat "$f" > /dev/null; done
>> 
>> I got the following syslog messages while running this "find/cat" command; however, the command finished without any other kernel error or soft lockup.
>> 
>> Mar 19 17:57:51 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341259: rec_len is smaller than minimal - offset=806912, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:09 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=978944, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:11 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=282624, inode=0, rec_len=0, name_len=0
>> Mar 19 18:31:11 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341382: rec_len is smaller than minimal - offset=290816, inode=0, rec_len=0, name_len=0
>> Mar 19 19:01:15 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341379: rec_len is smaller than minimal - offset=528384, inode=0, rec_len=0, name_len=0
>> Mar 19 19:18:14 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341258: rec_len is smaller than minimal - offset=1196032, inode=0, rec_len=0, name_len=0
>> Mar 19 19:18:14 oss19 kernel: LDISKFS-fs error (device md17): htree_dirblock_to_tree: bad entry in directory #341258: rec_len is smaller than minimal - offset=1187840, inode=0, rec_len=0, name_len=0
>> ...
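
The htree errors in the excerpt above repeat for a handful of directory inodes. A minimal sketch for counting them per directory follows. The function name summarize_htree_errors is hypothetical, and /var/log/messages is only an assumption about where syslog writes kernel messages on this host.

```shell
# Hypothetical helper: read syslog lines on stdin and print
# "<count> <directory inode>" for every directory that reported
# an htree_dirblock_to_tree "bad entry" error.
summarize_htree_errors() {
    # Extract the inode number that follows "directory #", then count
    # how often each inode appears.
    sed -n 's/.*htree_dirblock_to_tree: bad entry in directory #\([0-9][0-9]*\):.*/\1/p' \
        | sort -n \
        | uniq -c \
        | awk '{ print $1, $2 }'
}

# Example use (log path is an assumption):
#   summarize_htree_errors < /var/log/messages
```

Each output line is a count followed by an inode number, which makes it easy to check whether the same directories still show up after a repair.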
> 
> Ideally you would upload an e2image somewhere so that we can fix e2fsck.
> If you are willing to do that, the procedure would be:
> 
> e2image -r /dev/md17 /path/to/md17.e2image
> tar cvfS image.tar /path/to/md17.e2image
> 
> You probably don't want to use the
> "e2image -r /dev/hda1 - | bzip2 > hda1.e2i.bz2" command from the man
> page, as compressing all the zeros usually takes several days of bzip2 time.
> 
> However, the most recent e2fsprogs versions can now also create qcow2
> images, which should be the fastest way to create the metadata
> image. I'm not sure, though, whether the current e2fsprogs version for
> Lustre can already do that.
> 
> Anyway, to fix the htree problem, running "e2fsck -fvD /dev/md17"
> should do the trick. But I would really like to see e2fsck fixed so
> that it repairs this automatically without "-D" some day.
> 
> Cheers,
> Bernd
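
Because "e2fsck -fvD" rewrites directory blocks, it should only be run while the device is unmounted. A small guard can be sketched as follows, assuming /proc/mounts lists the device under the same name that is passed in; is_mounted is a hypothetical helper, not part of e2fsprogs.

```shell
# Hypothetical helper: succeed (exit 0) if the device given as $1 appears
# in the first column of the mounts table. The table defaults to
# /proc/mounts; a second argument can override it.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Example use (device name taken from the thread):
#   is_mounted /dev/md17 || e2fsck -fvD /dev/md17
```

The check only compares device names literally, so a device mounted under a different alias (for example a symlink) would not be detected.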



