[lustre-discuss] MDT quota problem / MDS crash 2.5.3

Thomas Roth t.roth at gsi.de
Tue Jul 19 14:24:58 PDT 2016


Hi,

you could check the still-existing parameters with
 > tunefs.lustre --dryrun <dev>

Actually writing the parameters back into <dev> might require the options "--erase-params --writeconf" in addition to everything you put into the 
original mkfs.lustre command. The manual covers all of that.
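
As a rough sketch of that sequence (untested on your system): /dev/mdt and the NID below are placeholders, and the option list in step 2 is only the part mentioned in this thread, so the remaining options from your original mkfs.lustre command would have to be added. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: DEV and MGSNID are placeholders, not values from this thread.
DEV="${DEV:-/dev/mdt}"
MGSNID="${MGSNID:-10.0.0.1@tcp}"

# Print each command by default; set RUN=1 to really execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# 1. Show what is currently stored on the device, without changing anything.
run tunefs.lustre --dryrun "$DEV"

# 2. Rewrite the parameters. Assumption: --mgs/--mgsnode is NOT the complete
#    option set; repeat whatever the original mkfs.lustre command contained.
run tunefs.lustre --erase-params --writeconf --mgs --mgsnode="$MGSNID" "$DEV"
```

I would compare the output of step 1 before and after step 2, and only then try to mount again.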

Cheers,
Thomas

On 07/19/2016 01:47 PM, Gary Molenkamp wrote:
> We have used this process successfully several times in the past on our 2.5.3-based system.  However, yesterday this occurred again and e2fsck
> corrected significant errors, e.g.:
>
> Inode 1465064204 was part of the orphaned inode list.  FIXED.
> Inode 269985884 ref count is 2, should be 1.  Fix? yes
> Unattached inode 269985885
> Connect to /lost+found? yes
> ...
>
> Now when I try to mount the MDS/MGS filesystem, I get:
>
> mount.lustre: missing option mgsnode=<nid>
>
> I mounted the filesystem as ldiskfs and it looks OK, as far as my limited experience lets me judge.  I should be able to re-add the NID for the MGS with:
>
> tunefs.lustre --mgs --mgsnode=<nid> <dev>
>
> correct?  Is there any way to check for other corruption? The OSTs are still up and running; do they cache a copy of the MGS data for
> restoration?
>
> Any assistance would be appreciated.
>
> Thanks.
> Gary.
>
>
>
> On 19/07/16 06:24 AM, Dilger, Andreas wrote:
>> On Jul 14, 2016, at 04:13, Thomas Roth <t.roth at gsi.de> wrote:
>>> Hi Guido,
>>>
>>> thanks for the tip, that was successful, with the exact same commands,
>>>
>>>   tune2fs -O ^quota /dev/mdt     (took about ~3min)
>>>   tunefs.lustre --quota /dev/mdt (took about ~30min with ~200M used inodes)
>> Note that this can also be achieved by running "e2fsck -f" on the filesystem.
>> That is probably faster, and has the added benefit that it verifies the
>> consistency of the filesystem before recreating the quota files.
>>
>> Cheers, Andreas
>>
>>> A subsequent
>>>   lctl lfsck_start -M nyx-MDT0000
>>> ran for less than 45 min and seems to have cleaned up the mdt mess.
>>>
>>> Cheers,
>>> Thomas
>>>
>>>
>>> On 07/12/2016 11:18 PM, Guido Laubender wrote:
>>>> English version (sorry for my previous mail in German; it was meant as a personal mail to Thomas only :( ):
>>>>
>>>> We were recently able to fix wrong Lustre inode quotas by disabling and re-enabling quota support on the MDT by 'tune2fs -O ^quota /dev/mdt' and
>>>> 'tunefs.lustre --quota /dev/mdt'.
>>>>
>>>> Maybe it helps here as well.
>>>>
>>>>
>>>> On Tue, 12 Jul 2016, Guido Laubender wrote:
>>>>
>>>>> Our inode quotas were recently incorrect; by disabling and then re-enabling quota support on the MDT (with 'tune2fs
>>>>> -O ^quota' followed by 'tunefs.lustre --quota') we were able to repair them.
>>>>>
>>>>> Maybe that helps in your case as well...
>>>>>
>>>>> On Tue, 12 Jul 2016, Thomas Roth wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> we are running Lustre 2.5.3 on our servers. OSTs are on ZFS, MDS is on ldiskfs.
>>>>>>
>>>>>> After an MDS crash and a run of e2fsck 1.42.9.wc1 on the partition, the MDS mounts but produces high-frequency log entries
>>>>>>
>>>>>> Jul 12 16:06:38 lxmds12 kernel: VFS: find_free_dqentry(): Data block full but it shouldn't.
>>>>>> Jul 12 16:06:38 lxmds12 kernel: VFS: Error -5 occurred while creating quota.
>>>>>>
>>>>>> interspersed with
>>>>>>
>>>>>> Jul 12 16:06:38 lxmds12 kernel: LustreError: 13159:0:(qsd_handler.c:1155:qsd_op_adjust()) nyx-MDT0000: fail to locate lqe for id:6763, type:0
>>>>>> Jul 12 16:06:38 lxmds12 kernel: LustreError: 13159:0:(qsd_handler.c:1155:qsd_op_adjust()) Skipped 4973 previous similar messages
>>>>>>
>>>>>> or
>>>>>>
>>>>>> Jul 12 15:59:26 lxmds12 kernel: LustreError: 13414:0:(qsd_entry.c:211:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3 qsd:nyx-MDT0000
>>>>>> qtype:usr id:7408 enforced:0 granted:0 pending:0 waiting:0 req:0 usage:0 qunit:0 qtune:0 edquot:0
>>>>>> Jul 12 15:59:26 lxmds12 kernel: LustreError: 13414:0:(qsd_entry.c:211:qsd_refresh_usage()) Skipped 5166 previous similar messages
>>>>>>
>>>>>>
>>>>>> According to our experience from the last few days, this will eventually bring all Lustre operations to a halt.
>>>>>>
>>>>>>
>>>>>> Both the web and the e2fsck-messages ([QUOTA WARNING] Usage inconsistent for ID 7989:actual (278528000, 738675) != expected (222507008, 531071))
>>>>>> hint towards quota issues.
>>>>>>
>>>>>> Therefore, we 'switched off' quota with "lctl conf_param fsname.quota.ost|mdt=u|g|ug|none", restarted, unmounted, 'switched on' quota again,
>>>>>> restarted, and unmounted.
>>>>>>
>>>>>> -> The VFS errors still appear.
>>>>>>
>>>>>> Is there anything else we could do?
>>>>>> Mount the MDT as ldiskfs and do nasty things on the disk?
>>>>>> Is there any command that recalculates / rewrites the quota files on the MDT?
>>>>>>
>>>>>>
>>>>>>
>>>>>> (As long as Lustre is still accessible, 'lfs quota' gives results for both users and groups, but at least the file count is entirely wrong: all of
>>>>>> my own Lustre files amount to exactly 0 files.
>>>>>>
>>>>>> And the update of the usage numbers does not work either - I managed to copy a 1GB file and still had the same kbytes used...)
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Thomas
>>>>>>
>>>>>> --
>>>>>> --------------------------------------------------------------------
>>>>>> Thomas Roth
>>>>>> Department: Informationstechnologie
>>>>>> Location: SB3 1.250
>>>>>> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
>>>>>>
>>>>>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>>>>>> Planckstraße 1
>>>>>> 64291 Darmstadt
>>>>>> www.gsi.de
>>>>>>
>>>>>> Gesellschaft mit beschränkter Haftung
>>>>>> Sitz der Gesellschaft: Darmstadt
>>>>>> Handelsregister: Amtsgericht Darmstadt, HRB 1528
>>>>>>
>>>>>> Geschäftsführung: Ursula Weyrich
>>>>>> Professor Dr. Karlheinz Langanke
>>>>>> Jörg Blaurock
>>>>>>
>>>>>> Vorsitzende des Aufsichtsrates: St Dr. Georg Schütte
>>>>>> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
>>>>>>
>>>>>> _______________________________________________
>>>>>> lustre-discuss mailing list
>>>>>> lustre-discuss at lists.lustre.org
>>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

-- 
--------------------------------------------------------------------
Thomas Roth
Department: HPC
Location: SB3 1.262
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986

GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1
64291 Darmstadt
www.gsi.de

Gesellschaft mit beschränkter Haftung
Sitz der Gesellschaft: Darmstadt
Handelsregister: Amtsgericht Darmstadt, HRB 1528

Geschäftsführung: Professor Dr. Karlheinz Langanke
Ursula Weyrich
Jörg Blaurock

Vorsitzender des Aufsichtsrates: St Dr. Georg Schütte
Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt

