[lustre-discuss] Cannot mount lustre filesystem anymore
Stefano Turolla
turolla at genzentrum.lmu.de
Wed Apr 26 07:57:53 PDT 2017
I found the problem:
somehow the copy of the client configuration log (seqdata-client) stored on the OST was corrupted.
I found this post very useful:
http://lustre-discuss.lustre.narkive.com/Z5s6LU8B/lustre-2-5-2-unable-to-mount-ost
I followed the same steps:
- unmount the OST and the MGT/MDT
- mount the OST and the MGS with ldiskfs
- llog_reader crashed on the client configuration file, so I copied over
  the one from the MGS
- unmount all the filesystems
- run a writeconf on the MDS and all OSTs
- reboot the system
After the reboot the OST mounted, and after a long check the client
could also mount the volume.
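
For anyone matching this against the kernel log quoted further down: the rc values there (-14, then -2) are negated Linux errno numbers. That is why mount.lustre ends with "No such file or directory" (-2 is ENOENT) even though the device and the mount point both exist. A minimal sketch for decoding them follows; decode_rc is a hypothetical helper written for this note, not a Lustre tool, and it only knows a handful of codes.

```shell
#!/bin/sh
# Translate a Lustre/kernel return code (e.g. "rc = -14") into its errno name.
# decode_rc is a hypothetical helper; only a few codes are listed here.
decode_rc() {
    # Drop the leading minus sign, if present.
    code=${1#-}
    case "$code" in
        2)  echo "ENOENT (No such file or directory)" ;;
        5)  echo "EIO (Input/output error)" ;;
        14) echo "EFAULT (Bad address)" ;;
        *)  echo "unknown errno $code" ;;
    esac
}

# The two codes that appear in the log quoted below.
decode_rc -14
decode_rc -2
```

Read this way, the first failure is the -14 while reading the log header; the -2 results are follow-on errors from the configuration log not being usable.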
Thanks in any case
ciao
Stefano
P.S. The steps in detail were:
# umount /mnt/lustre-ost
# mount -t ldiskfs /dev/sda1 /mnt/lustre-ost
# umount /mnt/lustre-mdt-mds/
# mount -t ldiskfs /dev/mapper/seqdata /mnt/lustre-mdt-mds
Verify that llog_reader crashes on the client log:
# llog_reader /mnt/lustre-ost/CONFIGS/seqdata-client
Back up the corrupted file, then copy the intact one from the MGS:
# cp /mnt/lustre-ost/CONFIGS/seqdata-client /mnt/lustre-ost/CONFIGS/seqdata-client.ORIG
# cp -f /mnt/lustre-mdt-mds/CONFIGS/seqdata-client /mnt/lustre-ost/CONFIGS/seqdata-client
# tunefs.lustre --verbose --writeconf /dev/mapper/seqdata
# reboot
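
The same sequence can be written as a dry-run sketch that only prints the commands, so the plan can be reviewed before anything touches a live system. The function name, the argument order and the device placeholders are assumptions made for illustration. The unmount before writeconf and the writeconf on every target follow the summary at the top of this message ("unmount all the filesystems", "MDS and all OSTs"); running it on the MGS/MDT device first and the OST second is my reading of the usual Lustre procedure, so please check it against the manual for your version.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands, executes none of them.
# print_recovery_plan and all paths passed to it are placeholders.
print_recovery_plan() {
    fsname=$1      # Lustre filesystem name, e.g. seqdata
    bad_dev=$2     # target whose CONFIGS/<fsname>-client is corrupted
    bad_mnt=$3     # its mount point
    good_dev=$4    # device holding the intact copy (the MGS)
    good_mnt=$5    # its mount point
    log="CONFIGS/${fsname}-client"

    # Stop Lustre on both targets, then mount them as plain ldiskfs.
    echo "umount $bad_mnt"
    echo "umount $good_mnt"
    echo "mount -t ldiskfs $bad_dev $bad_mnt"
    echo "mount -t ldiskfs $good_dev $good_mnt"
    # Check the suspect log, keep a backup, copy the intact one over it.
    echo "llog_reader $bad_mnt/$log"
    echo "cp $bad_mnt/$log $bad_mnt/$log.ORIG"
    echo "cp -f $good_mnt/$log $bad_mnt/$log"
    # Unmount again before regenerating the configuration logs.
    echo "umount $bad_mnt"
    echo "umount $good_mnt"
    echo "tunefs.lustre --writeconf $good_dev"
    echo "tunefs.lustre --writeconf $bad_dev"
}

# Example with placeholder device names (not the ones from this system).
print_recovery_plan seqdata /dev/OSTDEV /mnt/lustre-ost /dev/MGSDEV /mnt/lustre-mdt-mds
```

Printing instead of executing is deliberate: a writeconf regenerates the configuration logs for the whole filesystem, so it is worth reading the plan twice before running it by hand.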
On 04/26/2017 10:27 AM, Stefano Turolla wrote:
> Actually, this is what I thought, but all files and directories are there:
>
> [root@newmaster ~]# ls -lrtd /dev/mapper/seqdata /mnt/lustre-ost /dev/dm-0
> drwxr-xr-x 2 root root 6 Dec 2 13:35 /mnt/lustre-ost
> brw-rw---- 1 root disk 253, 0 Apr 25 11:59 /dev/dm-0
> lrwxrwxrwx 1 root root 7 Apr 25 11:59 /dev/mapper/seqdata -> ../dm-0
>
> Besides that, the file system seems to be there:
> [root@newmaster ~]# e2fsck /dev/mapper/seqdata
> e2fsck 1.42.13.wc5 (15-Apr-2016)
> seqdata-OST0000: clean, 974805/31538944 files, 3578657570/8073955508 blocks
>
> Stefano
>
> On 04/25/2017 11:49 PM, Brett Lee wrote:
>>
>> "No such file or directory."
>>
>> Could that be the cause?
>>
>> Brett
>> --
>> Protect Yourself Against Cybercrime
>> PDS Software Solutions LLC
>> https://www.TrustPDS.com
>>
>> On Apr 25, 2017 4:25 AM, "Stefano Turolla" <turolla at genzentrum.lmu.de> wrote:
>>
>> Dear all, I am new to Lustre. I set up a simple
>> configuration to mount a filesystem from a Dell PowerVault
>> MD3800i (iSCSI + multipath enabled).
>> It was working properly but, after the last reboot, I cannot mount
>> the Lustre filesystem anymore.
>> I am running Lustre 3.10.0 on Scientific Linux 7.3.
>> I put the MDT/MDS on the same server as the OST.
>>
>> Here is the relevant /etc/fstab:
>>
>> # Lustre MDT / MDS (Manage filenames, directories etc and Block devices
>> /dev/sda1 /mnt/lustre-mdt-mds lustre noauto,_netdev 0 0
>> /dev/mapper/seqdata /mnt/lustre-ost lustre noauto,_netdev 0 0
>>
>> # Lustre Client
>> master-mds@tcp:/seqdata /seq_data lustre noauto,_netdev 0 0
>>
>>
>> I can mount the /mnt/lustre-mdt-mds filesystem but not the OST,
>> and therefore no client either.
>>
>>
>> Here are the devices:
>> [root@newmaster lustre]# cat /proc/fs/lustre/devices
>> 0 UP osd-ldiskfs seqdata-MDT0000-osd seqdata-MDT0000-osd_UUID 9
>> 1 UP mgs MGS MGS 5
>> 2 UP mgc MGC10.163.85.99@tcp 69e92317-78f6-eef7-1764-57da5aadafe2 5
>> 3 UP mds MDS MDS_uuid 3
>> 4 UP lod seqdata-MDT0000-mdtlov seqdata-MDT0000-mdtlov_UUID 4
>> 5 UP mdt seqdata-MDT0000 seqdata-MDT0000_UUID 5
>> 6 UP mdd seqdata-MDD0000 seqdata-MDD0000_UUID 4
>> 7 UP qmt seqdata-QMT0000 seqdata-QMT0000_UUID 4
>> 8 UP osp seqdata-OST0000-osc-MDT0000 seqdata-MDT0000-mdtlov_UUID 5
>> 9 UP lwp seqdata-MDT0000-lwp-MDT0000 seqdata-MDT0000-lwp-MDT0000_UUID 5
>>
>>
>>
>> Here are the errors when I try to mount the OST:
>>
>> [root@newmaster lustre]# mount /mnt/lustre-ost
>>
>> Apr 25 12:13:41 newmaster kernel: LDISKFS-fs (dm-0): file extents enabled, maximum tree depth=5
>> Apr 25 12:13:42 newmaster kernel: LDISKFS-fs (dm-0): mounted filesystem with ordered data mode. Opts: ,errors=remount-ro,no_mbcache,nodelalloc
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(llog_osd.c:246:llog_osd_read_header()) seqdata-OST0000-osd: error reading [0xa:0x14:0x0] log header size 8192: rc = -14
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(llog_osd.c:246:llog_osd_read_header()) Skipped 1 previous similar message
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(mgc_request.c:1832:mgc_llog_local_copy()) MGC10.163.85.99@tcp: failed to copy remote log seqdata-client: rc = -14
>> Apr 25 12:13:42 newmaster kernel: LustreError: 13a-8: Failed to get MGS log seqdata-client and no local copy.
>> Apr 25 12:13:42 newmaster kernel: LustreError: 15c-8: MGC10.163.85.99@tcp: The configuration from log 'seqdata-client' failed (-2). This may be the result of communication errors between this node and the MGS, a bad configuration, or other errors. See the syslog for more information.
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount_server.c:1369:server_start_targets()) seqdata-OST0000: failed to start LWP: -2
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount_server.c:1844:server_fill_super()) Unable to start targets: -2
>> Apr 25 12:13:42 newmaster kernel: Lustre: Failing over seqdata-OST0000
>> Apr 25 12:13:42 newmaster kernel: Lustre: server umount seqdata-OST0000 complete
>> Apr 25 12:13:42 newmaster kernel: LustreError: 11242:0:(obd_mount.c:1449:lustre_fill_super()) Unable to mount (-2)
>>
>> mount.lustre: mount /dev/mapper/seqdata at /mnt/lustre-ost failed: No such file or directory
>> Is the MGS specification correct?
>> Is the filesystem name correct?
>> If upgrading, is the copied client log valid? (see upgrade docs)
>>
>>
>>
>> Here is the LNet configuration; currently only the server is listed:
>> [root@newmaster lustre]# cat /etc/modprobe.d/lustre.conf
>> options lnet networks=tcp0(eth2)
>>
>> I searched the mailing list for a good explanation of
>> what happened but could not find one.
>> Could you please help me debug the problem?
>> Thanks a lot in advance
>> Stefano Turolla
>>
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>
> --
> Stefano Turolla
> Gene Center
> University of Munich
> Feodor-Lynen-Str. 25 81377 Munich, Germany
> tel. +49 (0)89 2180 71055
> email. turolla at genzentrum.lmu.de
>
--
Stefano Turolla
Gene Center
University of Munich
Feodor-Lynen-Str. 25 81377 Munich, Germany
tel. +49 (0)89 2180 71055
email. turolla at genzentrum.lmu.de