[Lustre-discuss] One Lustre Client lost One Lustre Disk--solved

Ms. Megan Larko dobsonunit@gmail.com
Thu Jul 16 15:31:19 PDT 2009


Hi,

I fixed the problem of the one Lustre client not mounting one Lustre disk.

Truthfully, the problem expanded slightly.  When I rebooted another
client, it also lost contact with this one particular Lustre disk.
The error messages were exactly the same:

[root@crew01 ~]# mount /crew2
mount.lustre: mount ic-mds1@o2ib:/crew2 at /crew2 failed: Invalid argument
This may have multiple causes.
Is 'crew2' the correct filesystem name?
Are the mount options correct?
Check the syslog for more info.
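As the message suggests, the client syslog is worth a look; nothing Lustre-specific is needed, just the usual commands:

>> dmesg | tail -20
>> tail -50 /var/log/messages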

So, I thought something might have gone slightly wrong in the disk
set-up.   I had recently upgraded the other MDT disk to a larger
physical volume; that was done successfully, following the
instructions in the Lustre Manual.   So I thought perhaps the MDT that
I did not change merely needed to be "re-set".

On the MGS, I unmounted the MDT of the problem disk and ran the
following command:
>> tunefs.lustre --writeconf --mgs --mdt  --fsname=crew2 /dev/{sd-whatever}
I then remounted the MDT (which is also the MGS) successfully.
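For reference, the whole MGS-side sequence was roughly as follows (I am using /mnt/mds as a stand-in for my actual mount point, and /dev/{sd-whatever} for the device):

>> umount /mnt/mds
>> tunefs.lustre --writeconf --mgs --mdt --fsname=crew2 /dev/{sd-whatever}
>> mount -t lustre /dev/{sd-whatever} /mnt/mds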

On the OSS, I first unmounted the OST disks and then I issued the command:
>> tunefs.lustre --writeconf --ost /dev/{sd-whatever}
This was issued for each and every OST.   I mounted my OSTs again successfully.
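With several OSTs, a shell loop saves some typing; roughly (the device names and mount points below are placeholders for my real ones):

>> umount /mnt/ost0 /mnt/ost1
>> for dev in /dev/sdb /dev/sdc; do tunefs.lustre --writeconf --ost $dev; done
>> mount -t lustre /dev/sdb /mnt/ost0
>> mount -t lustre /dev/sdc /mnt/ost1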

On my clients, I issued the mount command for the /crew2 lustre disk
and it now succeeded.  No more "Invalid argument" message.
One client did give me a "Transport endpoint not connected" message,
so that client will require a reboot to remount this lustre disk
(unless anyone can tell me how to do the re-mount without a reboot of
this client; my untested guess is below).
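What I would try is a forced unmount followed by a normal mount, along the lines of:

>> umount -f /crew2
>> mount /crew2

but I have not verified this on the affected client, so corrections are welcome.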

So--- I am guessing that when I did the hardware disk-size upgrade on
the non-MGS lustre disk a few weeks ago, the other lustre disk, which
functions as the MGS, was left in a state such that clients could no
longer pick up that disk after a reboot.  Is this an accurate guess?
If it is, then one may wish to add to the Lustre Manual (Ch. 15 in the
1.6.x versions, on restoring metadata to an MDT disk) that the MGS
disk may require an update using tunefs.lustre --writeconf even if it
was not the disk which was restored.
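For anyone who wants to inspect the on-disk configuration before deciding whether --writeconf is needed, I believe tunefs.lustre can print it without changing anything:

>> tunefs.lustre --print /dev/{sd-whatever}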

I may be wrong in my guess, but the above procedure did get my lustre
disk back onto my clients successfully.

Cheers!
megan


