[Lustre-discuss] Kernel panic on mounting an OST

Wojciech Turek wjt27 at cam.ac.uk
Thu Dec 13 08:22:15 PST 2007


Hi,

Sorry for the long delay in replying; I was away at the Christmas lunch.
Changing the index may help, but I am not certain of that. It is definitely
odd that you don't have OST0000 but do have OST0030, and this may
cause problems later with quotas when you try to turn them off. I
think Lustre will expect OST0000 to exist, and if it doesn't find it,
Lustre will complain and quotas will not work.
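
For reference, the target name is the filesystem name followed by the index written as four hex digits, which is why index 48 shows up as OST0030 in the tunefs.lustre output quoted further down. A minimal sketch of that mapping; the function name ost_label is made up for illustration and is not a Lustre tool:

```shell
#!/bin/sh
# ost_label is a hypothetical helper, not part of Lustre.
# Usage: ost_label <fsname> <decimal index>
# Prints the OST target name: fsname, "-OST", index as 4 hex digits.
ost_label() {
    printf '%s-OST%04x\n' "$1" "$2"
}

ost_label lustre 48   # index reported before the change
ost_label lustre 0    # index requested with --index 0
```

The two calls correspond to the "Read previous values" and "Permanent disk data" sections of the quoted log.
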
However, if you do change the indexes, I think you need to do it in a
certain way. I suggest the following:
# Unmount all OSTs and MDTs, then run for each target:
tunefs.lustre --reformat --index=<index> --writeconf /dev/<block_device_name>
# This needs to be done on all OSSs and on the MDS.
# Mount each target as an ldiskfs file system. This needs to be
# done on all OSSs and on the MDS.
# For example:
mount -t ldiskfs /dev/dm-0 /mnt/mdt
# Delete the file /mnt/mdt/last_rcvd
# Mount the filesystem

After that you can do a writeconf for each target, then start the MGS/MDT
target, and then start the OST targets one by one, beginning with
mpath0.
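
The per-target steps above could be reviewed as a dry run before anything is touched. This is only a sketch: plan_target is a made-up helper, the tunefs.lustre line is copied as written above, and the scratch mount point /mnt/ost0 and the final umount are my assumptions, since the steps leave them implicit. It prints commands and runs none of them.

```shell
#!/bin/sh
# Dry run only: prints the commands from the steps above, executes nothing.
# plan_target is a hypothetical helper; the umount line is an assumption
# (the target has to be released from ldiskfs before it is mounted again).
# Usage: plan_target <block device> <index> <scratch mount point>
plan_target() {
    dev=$1
    index=$2
    mnt=$3
    echo "tunefs.lustre --reformat --index=$index --writeconf $dev"
    echo "mount -t ldiskfs $dev $mnt"
    echo "rm $mnt/last_rcvd"
    echo "umount $mnt"
}

# Example: the OST from the quoted log; /mnt/ost0 is a hypothetical path.
plan_target /dev/mpath/mpath0 0 /mnt/ost0
```

Printing the plan for every target on every OSS and on the MDS first makes it easier to check that each device gets the index you intend.
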

Also have a look at our /etc/multipath.conf.
As you can see, it is very static, but we can be sure that each
dm-<number> device always points to the same LUN.

defaults {
         udev_dir                /dev
         polling_interval        10
         selector                "round-robin 0"
         path_grouping_policy    failover
         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
         prio_callout            /bin/true
         path_checker            tur
         rr_min_io               100
         rr_weight               priorities
         failback                immediate
         no_path_retry           fail
         user_friendly_name      yes
         prio_callout            "/sbin/mpath_prio_my %n"
}
devnode_blacklist {
         devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st|sda)[0-9]*"
         devnode "^hd[a-z]"
         devnode "^cciss!c[0-9]d[0-9]*"
}

multipaths {
         multipath {
                 wwid                    360001ff007e6173300000800001d1c17
                 alias                   dm-0
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173401000800001d1c17
                 alias                   dm-1
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173502000800001d1c17
                 alias                   dm-2
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173906000800001d1c17
                 alias                   dm-3
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173a07000800001d1c17
                 alias                   dm-4
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173b08000800001d1c17
                 alias                   dm-5
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173603000800001d1c17
                 alias                   dm-6
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173704000800001d1c17
                 alias                   dm-7
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173805000800001d1c17
                 alias                   dm-8
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173c09000800001d1c17
                 alias                   dm-9
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173d0a000800001d1c17
                 alias                   dm-10
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
         multipath {
                 wwid                    360001ff007e6173e0b000800001d1c17
                 alias                   dm-11
                 path_grouping_policy    failover
                 path_checker            tur
                 path_selector           "round-robin 0"
                 failback                immediate
                 rr_weight               priorities
                 no_path_retry           5
         }
}
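
A static map like this only helps if every alias and every wwid appears exactly once, so it may be worth checking the file after editing it by hand. The sketch below is one way to do that with awk. check_multipath_map is a made-up name and not part of multipath-tools, and it assumes each "wwid" and "alias" keyword sits on the same line as its value, with wwid coming before alias inside each multipath block.

```shell
#!/bin/sh
# check_multipath_map is a hypothetical helper, not part of multipath-tools.
# It prints "alias wwid" for each pair found in the given file and returns
# non-zero if any alias or any wwid occurs more than once.
# Assumption: keyword and value are on one line; wwid precedes alias.
# Usage: check_multipath_map <path to multipath.conf>
check_multipath_map() {
    awk '
        $1 == "wwid"  { wwid = $2 }
        $1 == "alias" {
            print $2, wwid
            if (seen_alias[$2]++) dup = 1
            if (seen_wwid[wwid]++) dup = 1
        }
        END { exit dup + 0 }
    ' "$1"
}
```

For example, `check_multipath_map /etc/multipath.conf || echo "duplicate alias or wwid"` lists the pairs and warns if two blocks collide.
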


I hope this helps

Wojciech

On 13 Dec 2007, at 15:18, Ludovic Francois wrote:

> On Dec 13, 3:12 pm, Ludovic Francois <lfranc... at gmail.com> wrote:
>> On Dec 13, 2:59 pm, "Ludovic Francois" <lfranc... at gmail.com> wrote:
>>
>>> Do you think it's possible  someone overwrote the "label" with a  
>>> tunefs command?
>>
>> or the system
>>
>>> I already saw it with some other file system.
>
> We recreated Target and Index with the tunefs.lustre command:
>
> --8<---------------cut here---------------start------------->8---
> [root@oss01 ~]# tunefs.lustre --writeconf --index 0 /dev/mpath/mpath0
> checking for existing Lustre data: found CONFIGS/mountdata
> Reading CONFIGS/mountdata
>
> Read previous values:
> Target: lustre-OST0030
> Index: 48
> Lustre FS: lustre
> Mount type: ldiskfs
> Flags: 0x2
> (OST )
> Persistent mount opts: errors=remount-ro,extents,mballoc
> Parameters: mgsnode=10.143.0.5@tcp mgsnode=10.143.0.6@tcp
> failover.node=10.143.0.2@tcp sys.timeout=80 mgsnode=10.143.0.5@tcp
> mgsnode=10.143.0.6@tcp failover.node=10.143.0.2@tcp sys.timeout=80
>
>
> Permanent disk data:
> Target: lustre-OST0000
> Index: 0
> Lustre FS: lustre
> Mount type: ldiskfs
> Flags: 0x102
> (OST writeconf )
> Persistent mount opts: errors=remount-ro,extents,mballoc
> Parameters: mgsnode=10.143.0.5@tcp mgsnode=10.143.0.6@tcp
> failover.node=10.143.0.2@tcp sys.timeout=80 mgsnode=10.143.0.5@tcp
> mgsnode=10.143.0.6@tcp failover.node=10.143.0.2@tcp sys.timeout=80
>
> Writing CONFIGS/mountdata
> [root@oss01 ~]#
> --8<---------------cut here---------------end--------------->8---
>
> But now  we have some problems  to remount the file  system, could you
> confirm us this command just rewrite the index?
>
> Best Regards, Ludo
>
> --
> Ludovic Francois                 +33 (0)6 14 77 26 93
> System Engineer                  DataDirect Networks
>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-discuss

Mr Wojciech Turek
Assistant System Manager
University of Cambridge
High Performance Computing service
email: wjt27 at cam.ac.uk
tel. +441223763517


