[lustre-discuss] Cannot mount lustre filesystem anymore

Stefano Turolla turolla at genzentrum.lmu.de
Wed Apr 26 07:57:53 PDT 2017


I found the problem:
somehow the client configuration log on the OST (CONFIGS/seqdata-client) was corrupted.
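
For readers following the kernel log quoted further down: the "rc = -N" values in LustreError lines are negated Linux errno codes. -14 is EFAULT (Bad address), reported while reading the log header, and -2 is ENOENT, which is where the "No such file or directory" from mount.lustre comes from. A minimal sketch for decoding them; the function names errno_name and decode_rc are made up for this example, and only the two codes seen in this thread are mapped:

```shell
#!/bin/sh
# Map a negated errno value, as printed in Lustre kernel messages,
# to its Linux name. Only the two codes from this thread are listed;
# extend the case statement as needed.
errno_name() {
    case "$1" in
        -2)  echo "ENOENT (No such file or directory)" ;;
        -14) echo "EFAULT (Bad address)" ;;
        *)   echo "unknown ($1)" ;;
    esac
}

# Read log text on stdin, extract every "rc = -N", and print one
# decoded line per distinct code.
decode_rc() {
    grep -o 'rc = -[0-9][0-9]*' | sed 's/rc = //' | sort -u |
    while read -r code; do
        printf '%s -> %s\n' "$code" "$(errno_name "$code")"
    done
}
```

Something like `dmesg | decode_rc` would then summarise the codes in the current kernel log.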

I found this post very useful:
http://lustre-discuss.lustre.narkive.com/Z5s6LU8B/lustre-2-5-2-unable-to-mount-ost


I followed the same steps:

unmount the OST and the MGT/MDT
mount the OST and the MGS with ldiskfs
llog_reader crashed on the OST's client configuration file, so I replaced
it with the copy from the MGS
unmount all the filesystems
run a writeconf on the MDS and all OSTs
reboot the system

After the reboot the OST mounted, and after a long check the client
could mount the volume as well.
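
To confirm from the server side that a target really came back, the device listing can be checked. This is only a sketch: device_is_up is a made-up helper name, and it assumes the mounted OST shows up in /proc/fs/lustre/devices under the name seqdata-OST0000, the same way the MDT appears as seqdata-MDT0000 in the listing quoted below.

```shell
#!/bin/sh
# Read a Lustre device listing (the format of /proc/fs/lustre/devices)
# on stdin and succeed only if a device with exactly the given name is
# listed in the UP state.
# Columns: index, state, type, name, uuid, refcount.
device_is_up() {
    awk -v name="$1" '$2 == "UP" && $4 == name { found = 1 } END { exit !found }'
}
```

Usage would be along the lines of `device_is_up seqdata-OST0000 < /proc/fs/lustre/devices && echo "OST is up"`. The match is on the whole name column, so the osp device seqdata-OST0000-osc-MDT0000 on the MDS does not count as the OST.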

Thanks in any case

ciao
Stefano

P.S. The steps in detail were:
# umount /mnt/lustre-ost
# mount -t ldiskfs /dev/sda1 /mnt/lustre-ost

# umount /mnt/lustre-mdt-mds/
# mount -t ldiskfs /dev/mapper/seqdata   /mnt/lustre-mdt-mds
Verify that llog_reader crashes on the OST copy:
# llog_reader /mnt/lustre-ost/CONFIGS/seqdata-client

Back up the corrupted file, then copy the intact one from the MGS:
# cp    /mnt/lustre-ost/CONFIGS/seqdata-client /mnt/lustre-ost/CONFIGS/seqdata-client.ORIG
# cp -f /mnt/lustre-mdt-mds/CONFIGS/seqdata-client /mnt/lustre-ost/CONFIGS/seqdata-client
# tunefs.lustre --verbose --writeconf /dev/mapper/seqdata
# reboot
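
The sequence above can be collected into a dry-run script. This is a sketch under stated assumptions, not a tested recovery tool: run and recover_client_log are made-up names, nothing is executed unless DRY_RUN=0 is set, and devices and mount points are passed in rather than hard-coded. The fstab quoted later in the thread lists /dev/sda1 as the MDT and /dev/mapper/seqdata as the OST, the reverse of the P.S. commands, so check which device holds which target before running anything. Following the summary earlier in this mail, the sketch unmounts both targets before the writeconf and runs it on both devices.

```shell
#!/bin/sh
# Print each command by default; execute only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

# Arguments: OST device, OST mount point, MDT device, MDT mount point,
# name of the configuration log (for example seqdata-client).
recover_client_log() {
    ost_dev=$1; ost_mnt=$2; mdt_dev=$3; mdt_mnt=$4; log=$5

    # Remount both targets as plain ldiskfs so CONFIGS/ is reachable.
    run umount "$ost_mnt"
    run umount "$mdt_mnt"
    run mount -t ldiskfs "$ost_dev" "$ost_mnt"
    run mount -t ldiskfs "$mdt_dev" "$mdt_mnt"

    # Keep the damaged log, then overwrite it with the MGS copy.
    run cp "$ost_mnt/CONFIGS/$log" "$ost_mnt/CONFIGS/$log.ORIG"
    run cp -f "$mdt_mnt/CONFIGS/$log" "$ost_mnt/CONFIGS/$log"

    # Unmount everything, then regenerate the configuration logs
    # (MDT first, then the OST; this order is an assumption here).
    run umount "$ost_mnt"
    run umount "$mdt_mnt"
    run tunefs.lustre --verbose --writeconf "$mdt_dev"
    run tunefs.lustre --verbose --writeconf "$ost_dev"
}
```

With the values from the P.S., `recover_client_log /dev/sda1 /mnt/lustre-ost /dev/mapper/seqdata /mnt/lustre-mdt-mds seqdata-client` prints the ten commands for review. The final reboot is left out.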


On 04/26/2017 10:27 AM, Stefano Turolla wrote:
> Actually, that is what I thought too, but all the files and directories are there:
>
> [root@newmaster ~]# ls -lrtd /dev/mapper/seqdata /mnt/lustre-ost /dev/dm-0
> drwxr-xr-x 2 root root      6 Dec  2 13:35 /mnt/lustre-ost
> brw-rw---- 1 root disk 253, 0 Apr 25 11:59 /dev/dm-0
> lrwxrwxrwx 1 root root      7 Apr 25 11:59 /dev/mapper/seqdata -> ../dm-0
>
> Besides that, the file system seems to be there
> [root@newmaster ~]# e2fsck /dev/mapper/seqdata
> e2fsck 1.42.13.wc5 (15-Apr-2016)
> seqdata-OST0000: clean, 974805/31538944 files, 3578657570/8073955508 blocks
>
> Stefano
>
> On 04/25/2017 11:49 PM, Brett Lee wrote:
>>
>> "No such file or directory."
>>
>> Could that be the cause?
>>
>> Brett
>> --
>> Protect Yourself Against Cybercrime
>> PDS Software Solutions LLC
>> https://www.TrustPDS.com
>>
>> On Apr 25, 2017 4:25 AM, "Stefano Turolla" <turolla at genzentrum.lmu.de> wrote:
>>
>>     Dear all, I am new to Lustre. I set up a simple
>>     configuration to mount a filesystem from a Dell PowerVault
>>     MD3800i (iSCSI + multipath enabled).
>>     It was working properly but, after the last reboot, I cannot mount
>>     the Lustre filesystem anymore.
>>     I am running lustre 3.10.0 on Scientific Linux 7.3.
>>     I put the MDT/MDS on the same server as the OST.
>>
>>     Here is the relevant /etc/fstab
>>
>>     # Lustre MDT / MDS (Manage filenames, directories etc and Block devices)
>>
>>     /dev/sda1                       /mnt/lustre-mdt-mds    lustre          noauto,_netdev        0 0
>>
>>     /dev/mapper/seqdata             /mnt/lustre-ost        lustre          noauto,_netdev        0 0
>>
>>     # Lustre Client
>>
>>     master-mds@tcp:/seqdata         /seq_data              lustre          noauto,_netdev        0 0
>>
>>
>>     I can mount the /mnt/lustre-mdt-mds filesystem but not the OST,
>>     and therefore no client either.
>>
>>
>>     here are the devices
>>     [root@newmaster lustre]# cat /proc/fs/lustre/devices
>>       0 UP osd-ldiskfs seqdata-MDT0000-osd seqdata-MDT0000-osd_UUID 9
>>       1 UP mgs MGS MGS 5
>>       2 UP mgc MGC10.163.85.99@tcp 69e92317-78f6-eef7-1764-57da5aadafe2 5
>>       3 UP mds MDS MDS_uuid 3
>>       4 UP lod seqdata-MDT0000-mdtlov seqdata-MDT0000-mdtlov_UUID 4
>>       5 UP mdt seqdata-MDT0000 seqdata-MDT0000_UUID 5
>>       6 UP mdd seqdata-MDD0000 seqdata-MDD0000_UUID 4
>>       7 UP qmt seqdata-QMT0000 seqdata-QMT0000_UUID 4
>>       8 UP osp seqdata-OST0000-osc-MDT0000 seqdata-MDT0000-mdtlov_UUID 5
>>       9 UP lwp seqdata-MDT0000-lwp-MDT0000 seqdata-MDT0000-lwp-MDT0000_UUID 5
>>
>>
>>
>>     Here are the errors when I try to mount the OST
>>
>>     [root@newmaster lustre]# mount /mnt/lustre-ost
>>
>>     Apr 25 12:13:41 newmaster kernel: LDISKFS-fs (dm-0): file extents enabled, maximum tree depth=5
>>
>>     Apr 25 12:13:42 newmaster kernel: LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(llog_osd.c:246:llog_osd_read_header()) seqdata-OST0000-osd: error reading [0xa:0x14:0x0] log header size 8192: rc = -14
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(llog_osd.c:246:llog_osd_read_header()) Skipped 1 previous similar message
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(mgc_request.c:1832:mgc_llog_local_copy()) MGC10.163.85.99@tcp: failed to copy remote log seqdata-client: rc = -14
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 13a-8: Failed to get MGS log seqdata-client and no local copy.
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 15c-8: MGC10.163.85.99@tcp: The configuration from log 'seqdata-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount_server.c:1369:server_start_targets()) seqdata-OST0000: failed to start LWP: -2
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount_server.c:1844:server_fill_super()) Unable to start targets: -2
>>
>>     Apr 25 12:13:42 newmaster kernel: Lustre: Failing over seqdata-OST0000
>>
>>     Apr 25 12:13:42 newmaster kernel: Lustre: server umount seqdata-OST0000 complete
>>
>>     Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount.c:1449:lustre_fill_super()) Unable to mount  (-2)
>>
>>     mount.lustre: mount /dev/mapper/seqdata at /mnt/lustre-ost failed: No such file or directory
>>
>>     Is the MGS specification correct?
>>
>>     Is the filesystem name correct?
>>
>>     If upgrading, is the copied client log valid? (see upgrade docs)
>>
>>
>>
>>     Here is the LNet configuration; currently only the server is listed:
>>     [root@newmaster lustre]# cat /etc/modprobe.d/lustre.conf
>>
>>     options lnet networks=tcp0(eth2)
>>
>>
>>     I searched the mailing list for an explanation of
>>     what happened but could not find one.
>>     Could you please help me debug the problem?
>>     Thanks a lot in advance
>>     Stefano Turolla
>>
>>     _______________________________________________
>>     lustre-discuss mailing list
>>     lustre-discuss at lists.lustre.org
>>     http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
> -- 
> Stefano Turolla
> Gene Center
> University of Munich
> Feodor-Lynen-Str. 25 81377 Munich, Germany
> tel. +49 (0)89 2180 71055
> email. turolla at genzentrum.lmu.de
>

-- 
Stefano Turolla
Gene Center
University of Munich
Feodor-Lynen-Str. 25 81377 Munich, Germany
tel. +49 (0)89 2180 71055
email. turolla at genzentrum.lmu.de
