[Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong "Target" value
Roger Sersted
rs1@aps.anl.gov
Thu Jul 15 07:55:51 PDT 2010
OK. This looks bad. It appears that I should have upgraded the backing filesystem
from ext3 to ext4. I found instructions for that:
tune2fs -O extents,uninit_bg,dir_index /dev/XXX
fsck -pf /dev/XXX
Is the above correct? I'd like to move our systems to ext4. I didn't know those
steps were necessary.
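A safe way to trial those two commands first is against a scratch image rather than the live OST (everything below is illustrative: the image path is made up, and it assumes e2fsprogs is installed):

```shell
# Scratch image standing in for the real OST block device -- run this
# against a throwaway file, never the live target.
dd if=/dev/zero of=/tmp/ost.img bs=1M count=32

# Make an ext3 filesystem, as a 1.6.6-era ldiskfs target would have.
mkfs.ext3 -F -q /tmp/ost.img

# Enable the ext4 features in question, then force a full fsck pass.
tune2fs -O extents,uninit_bg,dir_index /tmp/ost.img
# fsck exits non-zero when it repairs group descriptor checksums after
# enabling uninit_bg; that is expected on this scratch image.
fsck -pf /tmp/ost.img || true

# Verify the features actually took.
dumpe2fs -h /tmp/ost.img | grep -i 'features'
```

The dumpe2fs line should list extents, uninit_bg and dir_index among the filesystem features before you commit to doing the same on the real device.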
Other answers listed below.
Wojciech Turek wrote:
> Hi Roger,
>
> Sorry for the delay. From the ldiskfs messages it seems to me that you are
> using ext4-based ldiskfs
> (Jun 26 17:54:30 puppy7 kernel: ldiskfs created from ext4-2.6-rhel5).
> If you are upgrading from 1.6.6, your ldiskfs is ext3-based, so I think that in
> lustre-1.8.3 you should use the ext3-based ldiskfs rpm.
>
> Can you also tell us a bit more about your setup? From what you wrote
> so far I understand you have 2 OSS servers and each server has one OST
> device. In addition to that you have a third server which acts as a
> MGS/MDS, is that right?
>
> The logs you provided seem to be only from one server called puppy7 so
> it does not give a whole picture of the situation. The timeout messages
> may indicate a problem with communication between the servers but it is
> really difficult to say without seeing the whole picture or at least
> more elements of it.
>
> To check if you have correct rpms installed can you please run 'rpm -qa
> | grep lustre' on both OSS servers and the MDS?
>
> Also please provide output from command 'lctl list_nids' run on both
> OSS servers, MDS and a client?
puppy5 (MDS/MGS)
172.17.2.5@o2ib
172.16.2.5@tcp
puppy6 (OSS)
172.17.2.6@o2ib
172.16.2.6@tcp
puppy7 (OSS)
172.17.2.7@o2ib
172.16.2.7@tcp
>
> In addition to above please run following command on all lustre targets
> (OSTs and MDT) to display your current lustre configuration
>
> tunefs.lustre --dryrun --print /dev/<ost_device>
puppy5 (MDS/MGS)
Read previous values:
Target: lustre1-MDT0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x405
(MDT MGS )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: lov.stripesize=125K lov.stripecount=2
mdt.group_upcall=/usr/sbin/l_getgroups mdt.group_upcall=NONE mdt.group_upcall=NONE
Permanent disk data:
Target: lustre1-MDT0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x405
(MDT MGS )
Persistent mount opts: errors=remount-ro,iopen_nopriv,user_xattr
Parameters: lov.stripesize=125K lov.stripecount=2
mdt.group_upcall=/usr/sbin/l_getgroups mdt.group_upcall=NONE mdt.group_upcall=NONE
exiting before disk write.
----------------------------------------------------
puppy6
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: lustre1-OST0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17.2.5@o2ib
Permanent disk data:
Target: lustre1-OST0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17.2.5@o2ib
--------------------------------------------------
puppy7 (this is the broken OSS. The "Target" should be "lustre1-OST0001")
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: lustre1-OST0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17.2.5@o2ib
Permanent disk data:
Target: lustre1-OST0000
Index: 0
Lustre FS: lustre1
Mount type: ldiskfs
Flags: 0x2
(OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=172.17.2.5@o2ib
exiting before disk write.
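If the on-disk index on puppy7 really is wrong, my understanding is that tunefs.lustre can rewrite it. A sketch only (untested here, device path illustrative, and note that --writeconf regenerates the configuration logs on the next mount, which affects the whole filesystem):

```shell
# Back up CONFIGS/mountdata first, then set the correct index on the
# mis-labelled OST (device path below is a placeholder).
tunefs.lustre --index=1 --writeconf /dev/XXX

# Confirm the new Target/Index before mounting:
tunefs.lustre --dryrun --print /dev/XXX
```

After that, "Target:" should read lustre1-OST0001 with "Index: 1" in the printed output.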
>
> If possible please attach syslog from each machine from the time you
> mounted lustre targets (OST and MDT).
>
> Best regards,
>
> Wojciech
>
> On 14 July 2010 20:46, Roger Sersted <rs1@aps.anl.gov> wrote:
>
>
> Any additional info?
>
> Thanks,
>
> Roger S.
>
>
>
>
> --
> Wojciech Turek
>
>