[Lustre-discuss] BAD last_transno problem

Tue Feb 9 06:55:07 PST 2010

On 2010-02-08, at 06:29, Lu Wang wrote:
> We deactivated the OSC for "besfs-OST0034" on the MDS, and the OSS has
> not crashed since. However, the problem is not solved: the OST
> "besfs-OST0034" cannot be written to now.
>
> ------------------
> Lu Wang
> 2010-02-08
>
> -------------------------------------------------------------
> From: Lu Wang
> Sent: 2010-02-08 20:47:16
> To: lustre-discuss
> Cc:
> Subject: [Lustre-discuss] BAD last_transno problem
>
> Dear list,
> We got a bad last_transno after an OST was remounted.
> cat /proc/fs/lustre/obdfilter/besfs-OST0034/recovery_status
> status: COMPLETE
> recovery_start: 1265630163
> recovery_duration: 466
> completed_clients: 298/298
> replayed_requests: 0
> last_transno: -499056254903072891
>
> Each time the OST finished recovery, the OSS crashed with a kernel
> "Oops" error and reported an error about deleting orphan objects.
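
A negative last_transno like the one above can be spotted mechanically. The following is a minimal sketch, assuming only the recovery_status layout quoted in this message; the function name check_transno is hypothetical and is not a Lustre tool. It scans every token on every line, so it works whether or not the fields share a line.

```shell
#!/bin/sh
# Sketch: flag a suspicious last_transno in a recovery_status dump.
# check_transno is a hypothetical helper, not part of Lustre.
check_transno() {
    # $1: path to a recovery_status file, for example
    #     /proc/fs/lustre/obdfilter/besfs-OST0034/recovery_status
    awk '
        {
            # Fields may share a line, so look at every token for the key.
            for (i = 1; i < NF; i++)
                if ($i == "last_transno:") val = $(i + 1)
        }
        END {
            if (val == "") {
                print "last_transno: not found"
                exit 2
            }
            if (substr(val, 1, 1) == "-") {
                print "BAD last_transno: " val
                exit 1
            }
            print "last_transno looks sane: " val
        }
    ' "$1"
}
```

Calling check_transno on the file shown above would report the value as bad and return a non-zero status, which makes it usable in a monitoring script.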

Having the actual oops output would make commenting on such problems a
lot easier.  It sounds like running "e2fsck -f" on this OST may avoid the
oops, but it won't fix the transno error.  You can mount the OST
filesystem as ldiskfs and delete the "last_rcvd" file to clear the
transno.
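
The steps above can be sketched as a short script. This is an illustration under stated assumptions, not a tested procedure: the device path, mount point and backup path are hypothetical placeholders, the OST is assumed to be unmounted already, and the backup copy of last_rcvd is an extra precaution that was not part of the advice. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only. OST_DEV, OST_MNT and BACKUP are hypothetical placeholders;
# replace them with the real device and paths for the affected OST.
OST_DEV=${OST_DEV:-/dev/sdX}
OST_MNT=${OST_MNT:-/mnt/ost_ldiskfs}
BACKUP=${BACKUP:-/root/last_rcvd.bak}
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands only, 0 = execute them

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

clear_last_rcvd() {
    # Forced filesystem check. e2fsck exit codes below 4 mean the
    # filesystem is clean or that errors were corrected.
    run e2fsck -f "$OST_DEV" || [ $? -lt 4 ] || return 1
    # Mount the OST backing filesystem directly as ldiskfs.
    run mount -t ldiskfs "$OST_DEV" "$OST_MNT" || return 1
    # Keep a copy before deleting (an added precaution, see lead-in).
    run cp -p "$OST_MNT/last_rcvd" "$BACKUP" || return 1
    # Delete last_rcvd to clear the transno.
    run rm "$OST_MNT/last_rcvd" || return 1
    run umount "$OST_MNT"
}
```

Running clear_last_rcvd with the defaults prints the five commands prefixed with "+"; setting DRY_RUN=0 would execute them, and the mount point must exist beforehand.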

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.