[lustre-discuss] MDT quota problem / MDS crash 2.5.3

Dilger, Andreas andreas.dilger at intel.com
Tue Jul 19 03:24:55 PDT 2016


On Jul 14, 2016, at 04:13, Thomas Roth <t.roth at gsi.de> wrote:
> 
> Hi Guido,
> 
> thanks for the tip, that was successful, with the exact same commands,
> 
>  tune2fs -O ^quota /dev/mdt     (took about 3 min)
>  tunefs.lustre --quota /dev/mdt (took about 30 min with ~200M used inodes)

Note that this can also be achieved by running "e2fsck -f" on the filesystem.
That is probably faster, and has the added benefit that it verifies the
consistency of the filesystem before recreating the quota files.
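
A minimal sketch of the two repair paths mentioned in this thread, for comparison. It assumes the MDT is unmounted before either path is run, and it only prints the commands unless DRY_RUN=0 is set. The helper names (run, rebuild_quota_e2fsck, rebuild_quota_toggle, start_lfsck) and the DRY_RUN, MDT_DEV and MDT_NAME variables are made up for illustration; /dev/mdt and nyx-MDT0000 are the names used in the quoted mails and need to be replaced with the real device and target.

```shell
#!/bin/sh
# Sketch only, not a tested recovery procedure.
# Assumption: the MDT is unmounted while e2fsck / tune2fs / tunefs.lustre run.

MDT_DEV="${MDT_DEV:-/dev/mdt}"          # placeholder device name from the thread
MDT_NAME="${MDT_NAME:-nyx-MDT0000}"     # target name from the thread
DRY_RUN="${DRY_RUN:-1}"                 # hypothetical switch: 1 = print only

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Path A: forced full check, which also verifies filesystem consistency
# before the quota files are recreated.
rebuild_quota_e2fsck() {
    run e2fsck -f "$MDT_DEV"
}

# Path B: drop the quota feature, then re-enable it (the commands Thomas ran).
rebuild_quota_toggle() {
    run tune2fs -O '^quota' "$MDT_DEV"
    run tunefs.lustre --quota "$MDT_DEV"
}

# Optional follow-up once the MDT is mounted again.
start_lfsck() {
    run lctl lfsck_start -M "$MDT_NAME"
}
```

With the defaults, calling rebuild_quota_toggle prints the two commands without touching the device, which makes it easy to check the device name before setting DRY_RUN=0.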

Cheers, Andreas

> A subsequent
>  lctl lfsck_start -M nyx-MDT0000
> ran for less than 45 min and seems to have cleaned up the mdt mess.
> 
> Cheers,
> Thomas
> 
> 
> On 07/12/2016 11:18 PM, Guido Laubender wrote:
>> 
>> English version (sorry for my previous mail in German; it was meant as a personal mail to Thomas only :( ):
>> 
>> We were recently able to fix wrong Lustre inode quotas by disabling and re-enabling quota support on the MDT by 'tune2fs -O ^quota /dev/mdt' and
>> 'tunefs.lustre --quota /dev/mdt'.
>> 
>> Maybe it helps here as well.
>> 
>> 
>> On Tue, 12 Jul 2016, Guido Laubender wrote:
>> 
>>> Our inode quotas were recently incorrect; we were able to repair them by disabling and then re-enabling quota support on the MDT (using 'tune2fs
>>> -O ^quota' followed by 'tunefs.lustre --quota').
>>> 
>>> Maybe that helps in your case too...
>>> 
>>> On Tue, 12 Jul 2016, Thomas Roth wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> we are running Lustre 2.5.3 on our servers. OSTs are on ZFS, MDS is on ldiskfs.
>>>> 
>>>> After an MDS crash and a run of e2fsck 1.42.9.wc1 on the partition, the MDS mounts but produces high-frequency log entries
>>>> 
>>>> Jul 12 16:06:38 lxmds12 kernel: VFS: find_free_dqentry(): Data block full but it shouldn't.
>>>> Jul 12 16:06:38 lxmds12 kernel: VFS: Error -5 occurred while creating quota.
>>>> 
>>>> interspersed with
>>>> 
>>>> Jul 12 16:06:38 lxmds12 kernel: LustreError: 13159:0:(qsd_handler.c:1155:qsd_op_adjust()) nyx-MDT0000: fail to locate lqe for id:6763, type:0
>>>> Jul 12 16:06:38 lxmds12 kernel: LustreError: 13159:0:(qsd_handler.c:1155:qsd_op_adjust()) Skipped 4973 previous similar messages
>>>> 
>>>> or
>>>> 
>>>> Jul 12 15:59:26 lxmds12 kernel: LustreError: 13414:0:(qsd_entry.c:211:qsd_refresh_usage()) $$$ failed to read disk usage, rc:-3 qsd:nyx-MDT0000
>>>> qtype:usr id:7408 enforced:0 granted:0 pending:0 waiting:0 req:0 usage:0 qunit:0 qtune:0 edquot:0
>>>> Jul 12 15:59:26 lxmds12 kernel: LustreError: 13414:0:(qsd_entry.c:211:qsd_refresh_usage()) Skipped 5166 previous similar messages
>>>> 
>>>> 
>>>> According to our experience from the last few days, this will eventually bring all Lustre operations to a halt.
>>>> 
>>>> 
>>>> Both the web and the e2fsck-messages ([QUOTA WARNING] Usage inconsistent for ID 7989:actual (278528000, 738675) != expected (222507008, 531071))
>>>> hint towards quota issues.
>>>> 
>>>> Therefore, we 'switched off' quota by "lctl conf_param fsname.quota.ost|mdt=u|g|ug|none", restarted, unmounted and 'switched on' quota again,
>>>> restarted, unmounted.
>>>> 
>>>> -> The VFS errors still appear.
>>>> 
>>>> Is there anything else we could do?
>>>> Mount the MDT as ldiskfs and do nasty things on the disk?
>>>> Is there any command that recalculates / rewrites the quota files on the MDT?
>>>> 
>>>> 
>>>> 
>>>> (As long as Lustre is still accessible, 'lfs quota' gives results for both users and groups, but at least the file count is entirely wrong (all of
>>>> my own Lustre files amount to exactly 0 files).
>>>> 
>>>> The usage numbers are not updated either: I managed to copy a 1 GB file and still had the same kbytes used...)
>>>> 
>>>> 
>>>> Regards,
>>>> Thomas
>>>> 
>>>> --
>>>> --------------------------------------------------------------------
>>>> Thomas Roth
>>>> Department: Informationstechnologie
>>>> Location: SB3 1.250
>>>> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
>>>> 
>>>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>>>> Planckstraße 1
>>>> 64291 Darmstadt
>>>> www.gsi.de
>>>> 
>>>> Gesellschaft mit beschränkter Haftung
>>>> Sitz der Gesellschaft: Darmstadt
>>>> Handelsregister: Amtsgericht Darmstadt, HRB 1528
>>>> 
>>>> Geschäftsführung: Ursula Weyrich
>>>> Professor Dr. Karlheinz Langanke
>>>> Jörg Blaurock
>>>> 
>>>> Vorsitzende des Aufsichtsrates: St Dr. Georg Schütte
>>>> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
>>>> 
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>> 
>> 
>> 
>> 
> 
> -- 
> --------------------------------------------------------------------
> Thomas Roth
> Department: HPC
> Location: SB3 1.262
> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
> 
> GSI Helmholtzzentrum für Schwerionenforschung GmbH
> Planckstraße 1
> 64291 Darmstadt
> www.gsi.de
> 
> Gesellschaft mit beschränkter Haftung
> Sitz der Gesellschaft: Darmstadt
> Handelsregister: Amtsgericht Darmstadt, HRB 1528
> 
> Geschäftsführung: Professor Dr. Karlheinz Langanke
> Ursula Weyrich
> Jörg Blaurock
> 
> Vorsitzender des Aufsichtsrates: St Dr. Georg Schütte
> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt


