[Lustre-discuss] Huge Sparse files in ROOT partition of MDT
Nirmal Seenu
nirmal at fnal.gov
Fri Mar 6 05:26:13 PST 2009
While trying to figure out why LVM2 snapshots were failing on our
MDT server, I found that there are a lot of sparse files on the MDT
volume. The file size reported by ls on the MDT is the same as the
real file size. At this point the tar runs for a few hours, even if
I use the --sparse option in the tar command.
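For reference, a sparse file can be spotted by comparing the apparent size (ls/stat) against the allocated blocks (du), and GNU tar's --sparse option scans each file for holes so only the real data is archived. A minimal sketch, using a hypothetical demo file rather than a real MDT path:

```shell
# Create a 1 GiB file that is entirely a hole (hypothetical demo path)
truncate -s 1G /tmp/sparse_demo

stat -c '%s bytes apparent' /tmp/sparse_demo  # reports the full 1 GiB
du -k /tmp/sparse_demo                        # allocated blocks: near 0

# With --sparse, GNU tar archives only the non-hole regions
tar --sparse -cf /tmp/sparse_demo.tar -C /tmp sparse_demo
du -k /tmp/sparse_demo.tar                    # tiny archive despite 1 GiB apparent size
```

Note that --sparse makes tar read-scan every file for holes, which is itself slow when the apparent sizes run to tens of GB; that would be consistent with the hours-long tar runs described here.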
The total MDT partition usage itself is only about 500MB (as reported
by df), and it used to take me less than 10 minutes to create an LVM2
snapshot and tar it up when I was running the servers on Lustre 1.6.5
with no quotas enabled.
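For context, the backup procedure described above (an LVM2 snapshot of the MDT volume, then a tar of its contents) can be sketched roughly as follows. The volume group and LV names are assumed from the /dev/mapper/lustre1_volume-mds_lv device path mentioned later in this message; the snapshot size, snapshot name, mount point, and backup destination are all placeholders:

```shell
# Snapshot the MDT logical volume (VG/LV names assumed from
# /dev/mapper/lustre1_volume-mds_lv; snapshot size is a guess)
lvcreate --snapshot --size 1G --name mds_snap /dev/lustre1_volume/mds_lv

# Mount the snapshot read-only and archive it; an ldiskfs-formatted
# MDT may need '-t ldiskfs' (or ext3) here
mkdir -p /mnt/mds_snap
mount -o ro /dev/lustre1_volume/mds_snap /mnt/mds_snap
tar --sparse -czf /backup/mdt-$(date +%F).tar.gz -C /mnt/mds_snap .

# Clean up the snapshot once the archive is written
umount /mnt/mds_snap
lvremove -f /dev/lustre1_volume/mds_snap
```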
I recently upgraded my Lustre servers to 1.6.7 and tried to enable
quotas on the MDT and OST with the following commands:
tunefs.lustre --erase-params --mdt --mgsnode=iblustre1@tcp1 \
    --param lov.stripecount=1 --writeconf \
    --param mdt.quota_type=ug /dev/mapper/lustre1_volume-mds_lv

tunefs.lustre --erase-params --ost --mgsnode=iblustre1@tcp1 \
    --param ost.quota_type=ug --writeconf /dev/sdc1
I was never able to run "lfs quotacheck" successfully due to LBUGs.
I was able to create an LVM2 snapshot and look at the contents of the
MDT. Every directory except ROOT seems to have the correct content.
Some of the files under ROOT still show 0-byte usage while others
show 35GB (the file itself is sparse, as seen from od output):
-rw-r--r-- 1 ***** *** 0 Jan 24 08:36
l48144f21b747m0036m018-Coul_000505
-rw-rw-r-- 1 ***** *** 36691775424 Feb 22 19:56
prop_WALL_pbc_m0.033_LS16_t0_002080
At this point I am curious to know whether this is the expected
behaviour of the MDT or some corruption of the MDT file system. Do we
have to live with the fact that backups via LVM2 snapshots take a few
hours to complete when quotas are enabled?
Thanks for your help in advance.
Nirmal