[lustre-discuss] Disk usage / quota discrepancy

David Schanzenbach davidls at hawaii.edu
Wed Nov 19 22:05:06 PST 2025


Hi Shane,

If you are using Lustre 2.15.7 + ZFS with PFL, you may be hitting issue
LU-19193 (https://jira.whamcloud.com/browse/LU-19193).
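
A quick way to confirm you are in that configuration (a rough sketch; the
exact parameter layout can differ a little between releases) is to check the
running Lustre version and whether the OSTs are ZFS-backed:

# On a client or server: report the running Lustre version
lctl get_param version
# On an OSS: ZFS-backed OSTs expose osd-zfs.* parameters
# (no matching parameters suggests ldiskfs OSDs instead)
lctl get_param -N osd-zfs.*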

At our site, until we applied the patch provided in LU-19193, we saw newly
written files balloon to 4-10 times their uncompressed size unless they were
written to a single stripe.
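
The quick check we used (a minimal sketch; "bigfile" below is just a
placeholder name) was to compare allocated space against apparent size on a
file whose layout has a multi-stripe component initialized, which is the same
comparison you show below:

# Show the layout, including per-component stripe counts
lfs getstripe -y bigfile
# Allocated blocks vs. logical size; affected files report du far above --apparent-size
du -sh bigfile
du -sh --apparent-size bigfile

In principle, rewriting an affected file to a single stripe (for example with
'lfs migrate -c 1 bigfile') should bring the allocation back in line, since
for us the inflation only showed up on multi-stripe writes, but applying the
LU-19193 patch was the real fix.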


Thanks,
David


On 11/19/2025 5:56 AM, Nehring, Shane R [ITS] via lustre-discuss wrote:
> Hello All,
>
> As background, this Lustre volume currently has only 1 MDT and 5 OSTs; the MDT and OSTs are using ZFS with compression enabled. We also have a default PFL layout that stripes any file larger than 1T across 2 OSTs.
>
> Recently a user created a swath of large files, and I had to reach out to them because the volume was filling up pretty quickly. The files compress very nicely, so they are compressing them now, which reclaimed the used space very quickly. It all sounds innocuous, but the rate at which the space was consumed and then freed was somewhat suspicious, and the reported usage was also much higher than the user expected. So I got to looking.
>
> The usage as reported by lfs quota for this project directory is:
> # lfs quota -h -p 212496 /lustre/hdd
> Disk quotas for prj 212496 (pid 212496):
>       Filesystem    used   quota   limit   grace   files   quota   limit   grace
>      /lustre/hdd  260.3T      0k      0k       -    7205       0       0       -
> At its highest point it was around 660T. The file count isn't too high, so on average the files are quite large.
>
> A du on the directory reports numbers in line with the quota report:
> # du -sh /lustre/hdd/LAS/<directory>
> 261T    /lustre/hdd/LAS/<directory>
>
> The bulk of this space is used by a single directory containing multiple large (3T+ before compression) files.
>
> A du from within it looks like:
> # du -sh .
> 253T    .
>
> However, that number does not appear realistic when you roughly sum the file sizes reported by ls:
> -rw-rw----+ 1 <user> <group> 470G Nov  1 01:34 run2025.ZmChr10.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 150K Nov  1 01:34 run2025.ZmChr10.all.vcf.idx
> -rw-r-----+ 1 <user> <group> 259G Oct 21 20:10 run2025.ZmChr10.variant.vcf.gz
> -rw-r-----+ 1 <user> <group> 150K Oct 21 17:49 run2025.ZmChr10.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 942G Nov 11 20:21 run2025.ZmChr1.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 302K Nov 11 20:21 run2025.ZmChr1.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 830G Nov 10 01:10 run2025.ZmChr1.variant.vcf.gz
> -rw-rw----+ 1 <user> <group> 302K Nov 10 01:10 run2025.ZmChr1.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 736G Nov 10 22:02 run2025.ZmChr2.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 239K Nov 10 22:02 run2025.ZmChr2.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 407G Oct 31 21:16 run2025.ZmChr2.variant.vcf.gz
> -rw-rw----+ 1 <user> <group> 239K Oct 31 21:16 run2025.ZmChr2.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 732G Nov 10 19:34 run2025.ZmChr3.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 233K Nov 10 19:34 run2025.ZmChr3.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 401G Nov  9 22:45 run2025.ZmChr3.variant.vcf.gz
> -rw-rw----+ 1 <user> <group> 233K Nov  9 22:45 run2025.ZmChr3.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 792G Nov 11 06:22 run2025.ZmChr4.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 245K Nov 11 06:22 run2025.ZmChr4.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 437G Oct 31 21:33 run2025.ZmChr4.variant.vcf.gz
> -rw-rw----+ 1 <user> <group> 245K Oct 31 21:33 run2025.ZmChr4.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 6.9T Oct 31 19:36 run2025.ZmChr5.all.vcf
> -rw-------+ 1 <user> <group> 295G Nov 19 09:15 run2025.ZmChr5.all.vcf.gz
> -rw-rw----+ 1 <user> <group> 222K Oct 31 19:36 run2025.ZmChr5.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 3.0T Oct 31 22:47 run2025.ZmChr5.variant.vcf
> -rw-rw----+ 1 <user> <group> 222K Oct 31 22:47 run2025.ZmChr5.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 5.5T Nov  9 16:19 run2025.ZmChr6.all.vcf
> -rw-rw----+ 1 <user> <group> 178K Nov  9 16:19 run2025.ZmChr6.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 2.3T Oct 30 15:51 run2025.ZmChr6.variant.vcf
> -rw-rw----+ 1 <user> <group> 178K Oct 30 15:51 run2025.ZmChr6.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 5.7T Nov  9 10:58 run2025.ZmChr7.all.vcf
> -rw-rw----+ 1 <user> <group> 182K Nov  9 10:58 run2025.ZmChr7.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 2.5T Oct 31 10:04 run2025.ZmChr7.variant.vcf
> -rw-rw----+ 1 <user> <group> 182K Oct 31 10:04 run2025.ZmChr7.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 5.6T Oct 29 19:34 run2025.ZmChr8.all.vcf
> -rw-rw----+ 1 <user> <group> 179K Oct 29 19:34 run2025.ZmChr8.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 2.4T Oct 29 16:11 run2025.ZmChr8.variant.vcf
> -rw-rw----+ 1 <user> <group> 179K Oct 29 16:11 run2025.ZmChr8.variant.vcf.idx
> -rw-rw----+ 1 <user> <group> 5.0T Oct 30 00:25 run2025.ZmChr9.all.vcf
> -rw-rw----+ 1 <user> <group> 160K Oct 30 00:25 run2025.ZmChr9.all.vcf.idx
> -rw-rw----+ 1 <user> <group> 2.2T Oct 29 18:05 run2025.ZmChr9.variant.vcf
> -rw-rw----+ 1 <user> <group> 160K Oct 29 18:05 run2025.ZmChr9.variant.vcf.idx
>
> What more closely reflects reality is the result of du with --apparent-size added:
> # du -sh --apparent-size .
> 47T     .
>
> Looking at an individual file we see similar discrepancies:
> # du -sh run2025.ZmChr5.all.vcf
> 47T     run2025.ZmChr5.all.vcf
> # du -sh --apparent-size run2025.ZmChr5.all.vcf
> 6.9T    run2025.ZmChr5.all.vcf
> # ls -lah run2025.ZmChr5.all.vcf
> -rw-rw----+ 1 <user> <group> 6.9T Oct 31 19:36 run2025.ZmChr5.all.vcf
>
> I'm used to this being the reverse, with --apparent-size showing the 'logical usage' of a file before any transparent compression that might be in place (we use ZFS a lot).
>
> This appears to be caused by the PFL layout striping to two OSTs, as I don't see a discrepancy on files with a single stripe. Here's one of the files that has already been compressed (and is under 1T, so not striped):
> # du -h --apparent-size run2025.ZmChr10.all.vcf.gz
> 470G    run2025.ZmChr10.all.vcf.gz
> # du -h  run2025.ZmChr10.all.vcf.gz
> 470G    run2025.ZmChr10.all.vcf.gz
>
> I copied this file to another volume with a different default striping rule, and you can see the discrepancy caused by striping:
> # du -h run2025.ZmChr10.variant.vcf.gz
> 418G    run2025.ZmChr10.variant.vcf.gz
> # du -h --apparent-size run2025.ZmChr10.variant.vcf.gz
> 259G    run2025.ZmChr10.variant.vcf.gz
>
> The same file in the same volume but with the striping set to only one stripe:
> # du -h run2025.ZmChr10.variant.vcf.gz.copy
> 259G    run2025.ZmChr10.variant.vcf.gz.copy
> # du -h --apparent-size run2025.ZmChr10.variant.vcf.gz.copy
> 259G    run2025.ZmChr10.variant.vcf.gz.copy
>
> getstripe for the first large striped file:
> # lfs getstripe -y run2025.ZmChr5.all.vcf
>    lcm_layout_gen:    3
>    lcm_mirror_count:  1
>    lcm_entry_count:   2
>    component0:
>      lcme_id:             1
>      lcme_mirror_id:      0
>      lcme_flags:          init
>      lcme_extent.e_start: 0
>      lcme_extent.e_end:   1099511627776
>      sub_layout:
>        lmm_stripe_count:  1
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    0
>        lmm_stripe_offset: 1
>        lmm_objects:
>        - l_ost_idx: 1
>          l_fid:     0x100010000:0x1869cb06:0x0
>
>    component1:
>      lcme_id:             2
>      lcme_mirror_id:      0
>      lcme_flags:          init
>      lcme_extent.e_start: 1099511627776
>      lcme_extent.e_end:   EOF
>      sub_layout:
>        lmm_stripe_count:  2
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    65535
>        lmm_stripe_offset: 0
>        lmm_objects:
>        - l_ost_idx: 0
>          l_fid:     0x100000000:0x18ad637a:0x0
>        - l_ost_idx: 1
>          l_fid:     0x100010000:0x18717a96:0x0
>
> getstripe for the second single striped file:
> # lfs getstripe -y run2025.ZmChr10.all.vcf.gz
>    lcm_layout_gen:    2
>    lcm_mirror_count:  1
>    lcm_entry_count:   2
>    component0:
>      lcme_id:             1
>      lcme_mirror_id:      0
>      lcme_flags:          init
>      lcme_extent.e_start: 0
>      lcme_extent.e_end:   1099511627776
>      sub_layout:
>        lmm_stripe_count:  1
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    0
>        lmm_stripe_offset: 1
>        lmm_objects:
>        - l_ost_idx: 1
>          l_fid:     0x100010000:0x18d70b19:0x0
>
>    component1:
>      lcme_id:             2
>      lcme_mirror_id:      0
>      lcme_flags:          0
>      lcme_extent.e_start: 1099511627776
>      lcme_extent.e_end:   EOF
>      sub_layout:
>        lmm_stripe_count:  2
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    0
>        lmm_stripe_offset: -1
>
> getstripe of the smaller file copied to another volume with a smaller striping threshold:
> # lfs getstripe -y run2025.ZmChr10.variant.vcf.gz
>    lcm_layout_gen:    3
>    lcm_mirror_count:  1
>    lcm_entry_count:   2
>    component0:
>      lcme_id:             1
>      lcme_mirror_id:      0
>      lcme_flags:          init
>      lcme_extent.e_start: 0
>      lcme_extent.e_end:   107374182400
>      sub_layout:
>        lmm_stripe_count:  1
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    0
>        lmm_stripe_offset: 1
>        lmm_objects:
>        - l_ost_idx: 1
>          l_fid:     0x100010000:0x11ae544:0x0
>
>    component1:
>      lcme_id:             2
>      lcme_mirror_id:      0
>      lcme_flags:          init
>      lcme_extent.e_start: 107374182400
>      lcme_extent.e_end:   EOF
>      sub_layout:
>        lmm_stripe_count:  2
>        lmm_stripe_size:   1048576
>        lmm_pattern:       raid0
>        lmm_layout_gen:    0
>        lmm_stripe_offset: 0
>        lmm_objects:
>        - l_ost_idx: 0
>          l_fid:     0x100000000:0x1689eaa:0x0
>        - l_ost_idx: 1
>          l_fid:     0x100010000:0x11ae545:0x0
>
> getstripe of the same file copied with the striping disabled:
>
> # lfs getstripe -y run2025.ZmChr10.variant.vcf.gz.copy
> lmm_stripe_count:  1
> lmm_stripe_size:   1048576
> lmm_pattern:       raid0
> lmm_layout_gen:    0
> lmm_stripe_offset: 0
> lmm_objects:
>        - l_ost_idx: 0
>          l_fid:     0x100000000:0x1689eab:0x0
>
> Is this behavior expected or is something strange going on?
>
> Shane
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org