[Lustre-discuss] lfsck_start

Mohr Jr, Richard Frank (Rick Mohr) rmohr at utk.edu
Mon Jan 12 11:13:15 PST 2015


Brian,

With the newer versions of Lustre (starting with 2.4 I believe), quotas are automatically enabled for space accounting purposes even if you don't apply quotas to any user accounts.  (Quota enforcement is a separate setting: in 2.4 and later it is turned on or off with "lctl conf_param $FSNAME.quota.mdt|ost=u|g|ug|none", which replaced the older "lfs quotaon|quotaoff" commands.)

We recently had similar errors on several of our OSTs following a failure of one of our storage controllers.  We did not find any indication that anything else on the file system was inconsistent, so I am still not sure why the quotas would have been affected.  Nevertheless, we decided to just regenerate the quota info to be safe.  The manual doesn't clearly state how to regenerate the info, but when we upgraded to 2.4, we had to run "tunefs.lustre --quota $DEVICE" against the MDT/OSTs.  The manual seems to indicate that this will set the QUOTA flag in the superblock and then invoke e2fsck to generate the disk usage database.  However, when we tried running this, it did not seem to regenerate the quota info.  I am not sure if this was supposed to work or not.  But in any case, we found a Lustre ticket that recommended these commands which ended up working for us:

tune2fs -O ^quota $DEVICE
tune2fs -O quota $DEVICE

The first command truncates the quota databases and the second one forces the info to be recalculated. (On several of our OSTs, we noticed that e2fsck reported errors about some inodes in addition to the quota warnings.  But when we regenerated the quotas, the quota warnings and the errors about the inodes both went away.)
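
For anyone who wants to script the sequence above, here is a rough sketch rather than a tested procedure. It assumes the target is unmounted, that DEVICE is a placeholder you replace with the real block device, and that a final read-only "e2fsck -fn" pass (my addition, not part of the ticket's recipe) is an acceptable way to confirm the warnings are gone. The run and regen_quota helper names are made up for illustration. By default it only prints the commands:

```shell
#!/bin/sh
# Sketch: regenerate ldiskfs quota accounting on one UNMOUNTED target.
# DEVICE is a placeholder; DRY_RUN=1 (the default) only prints the commands.
DEVICE="${DEVICE:-/dev/sdX}"
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the two tune2fs calls from the message above.
regen_quota() {
    dev="$1"
    run tune2fs -O '^quota' "$dev"  # drop the quota feature (truncates the quota files)
    run tune2fs -O quota "$dev"     # re-enable it so usage is recalculated
    run e2fsck -fn "$dev"           # assumption: read-only check to confirm the result
}

regen_quota "$DEVICE"
```

Setting DRY_RUN=0 makes the script execute the commands for real, so check the printed output against the right device first.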

As far as running lfsck goes, we ran a namespace check after our 1.8->2.4 upgrade using this command:

  lctl lfsck_start -M medusa-MDT0000 -t namespace

The OI scrub happened automatically.  I know that Lustre 2.6 introduced some more options for lfsck_start, so YMMV.
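
If it helps, below is a small sketch of starting the namespace check and then reading back its state. It assumes the command is run on the MDS, that the progress is exposed through the "mdd.<target>.lfsck_namespace" parameter, and that its output contains a line of the form "status: <value>"; please verify both against your version. The lfsck_status helper and the RUN_LFSCK switch are names invented for this example:

```shell
#!/bin/sh
# Sketch: start a namespace LFSCK on the MDS and report its status.
# MDT is the target name from the message above; adjust for your file system.
MDT="${MDT:-medusa-MDT0000}"

# Hypothetical helper: pull the value of the "status:" line out of the
# text read on stdin (assumed format: "status: <value>").
lfsck_status() {
    awk -F': *' '$1 == "status" { print $2; exit }'
}

# Nothing is executed unless RUN_LFSCK=1 is set explicitly.
if [ "${RUN_LFSCK:-0}" = "1" ]; then
    lctl lfsck_start -M "$MDT" -t namespace
    lctl get_param -n "mdd.$MDT.lfsck_namespace" | lfsck_status
fi
```

Run it as "RUN_LFSCK=1 sh script.sh" on the MDS; without that variable it only defines the helper.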

-- 
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


On Jan 12, 2015, at 12:56 PM, "Andrus, Brian Contractor" <bdandrus at nps.edu> wrote:

> All,
>  
> I am running the latest lustre rpms (lustre-2.6.0-2.6.32_431.20.3.el6_lustre.x86_64.x86_64).
> One of our OSTs is throwing errors whenever e2fsck is run:
> 
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (2767 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (1429438464 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [QUOTA WARNING] Usage inconsistent for ID 0:actual (482955264, 8813504) != expected (0, 0)
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (14001 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (103 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (2064384 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (17508 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [QUOTA WARNING] Usage inconsistent for ID 0:actual (482955264, 8813504) != expected (137438953472, 0)
> [QUOTA WARNING] Usage inconsistent for ID 0:actual (482955264, 8813504) != expected (0, 0)
> [QUOTA WARNING] Usage inconsistent for ID 0:actual (482955264, 8813504) != expected (0, 0)
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (12288 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (19053 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (10087 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> [ERROR] quotaio_tree.c:590:check_reference:: Illegal reference (4218437632 >= 54) in user quota file. Quota file is probably corrupted.
> Please run e2fsck (8) to fix it.
> Signal (11) SIGSEGV si_code=SEGV_MAPERR fault addr=0x1ff91020
>  
> We are not running quotas on the filesystem.
>  
> I thought it may be a good idea to run lfsck on the system, but things have changed and now I get a message to use lfsck_start.
> Now, I am having great difficulty finding documentation on running it although I do see quite a few code bits and bugs about it.
> All I find is the standard help message put in the documentation, but nothing that goes into any detail about any of the options. It also seems no matter what options I try, I get:
>  
> Fail to start LFSCK: Operation not supported
>  
> 1) What could be causing such an error on an OST?
> 2) Are there any detailed descriptions/examples of running lfsck_start for the 2.6 code?
>  
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
>  
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss




