[lustre-discuss] [EXTERNAL] [BULK] Files created in append mode don't obey directory default stripe count

Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] darby.vicker-1 at nasa.gov
Mon Apr 29 18:01:06 PDT 2024


Very interesting, thanks for the info and history on this.  The reason for the different behavior makes sense after reading about the history and use cases.

From: Andreas Dilger <adilger at whamcloud.com>
Date: Monday, April 29, 2024 at 12:29 PM
To: Simon Guilbault <simon.guilbault at calculquebec.ca>
Cc: Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] <darby.vicker-1 at nasa.gov>, lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: Re: [lustre-discuss] [EXTERNAL] [BULK] Files created in append mode don't obey directory default stripe count


Simon is exactly correct.  This is expected behavior for files opened with O_APPEND, at least until LU-12738 is implemented.  Since O_APPEND writes are (by definition) entirely serialized, having multiple stripes on such files is mostly useless and just adds overhead.
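
For anyone following the reproduction quoted further down, the only difference between "echo > aaa" and "echo >> bbb" is that the shell opens the second file with O_APPEND. A minimal sketch of how one might confirm that on a Linux client is below. It relies on /proc/self/fdinfo and assumes O_APPEND is octal 02000 (true on x86 and ARM Linux, not on every architecture); the helper names fd3_flags and has_append are made up for illustration.

```shell
#!/bin/sh
# Sketch (Linux only): show that ">>" opens a file with O_APPEND while ">" does not.
tmp=$(mktemp -d)

# Print the open flags (octal) of fd 3, read from /proc by the awk child process.
fd3_flags() { awk '/^flags:/ {print $2}' /proc/self/fdinfo/3; }

flags_trunc=$(fd3_flags 3> "$tmp/aaa")     # same open mode as: echo > aaa
flags_append=$(fd3_flags 3>> "$tmp/bbb")   # same open mode as: echo >> bbb

# Assumption: O_APPEND is octal 02000 on this architecture.
has_append() { [ $(( $1 & 02000 )) -ne 0 ]; }

for f in "$flags_trunc" "$flags_append"; do
    if has_append "$f"; then
        echo "$f: O_APPEND set"
    else
        echo "$f: O_APPEND not set"
    fi
done

rm -rf "$tmp"
```

Any program that opens its log file this way (not just shell redirection) should get the O_APPEND striping rather than the directory default.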

Feel free to read https://jira.whamcloud.com/browse/LU-9341 for the very lengthy saga on the history of this behavior.

Cheers, Andreas


On Apr 29, 2024, at 10:42, Simon Guilbault <simon.guilbault at calculquebec.ca> wrote:

This is the expected behaviour. In the original implementation of PFL, when a file was opened in append mode, the lock from 0 to EOF instantiated all stripes of the PFL file. We have a PFL layout on our system with 1 stripe up to 1 GB, then 4 stripes, and then 32 stripes once the file gets very large. This was a problem with software that creates 4 KB log files (like slurm.out), because append mode caused those files to be created with more than 32 stripes. This was patched a few releases ago. The behaviour can be changed, but I would recommend keeping 1 stripe for files that use append mode.

From the manual:
O_APPEND mode. When files are opened for append, they instantiate all uninitialized components expressed in the layout. Typically, log files are opened for append, and complex layouts can be inefficient.
Note
The mdd.*.append_stripe_count and mdd.*.append_pool options can be used to specify special default striping for files created with O_APPEND.
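
As a rough sketch of how those two tunables might be inspected, something like the following could be run on an MDS node. The parameter names come from the manual excerpt above. The commented-out set_param line rests on my understanding that a value of 0 turns off the special case so that appended files follow the directory default; treat that as an assumption and check the manual for your release before using it.

```shell
#!/bin/sh
# Sketch: inspect the O_APPEND striping defaults on a Lustre MDS.
# The guard lets the script exit cleanly on machines without Lustre tools.
if command -v lctl >/dev/null 2>&1; then
    # Show the current values of the two tunables named in the manual.
    lctl get_param "mdd.*.append_stripe_count"
    lctl get_param "mdd.*.append_pool"

    # Assumption: 0 disables the special O_APPEND handling, so appended files
    # use the directory default layout. Uncomment only after verifying this.
    # lctl set_param "mdd.*.append_stripe_count=0"
    status="queried"
else
    echo "lctl not found; run this on a Lustre MDS node" >&2
    status="skipped"
fi
```

Note that a plain set_param does not persist across a remount of the MDT.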

On Mon, Apr 29, 2024 at 11:21 AM Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.] via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
Wow, I would say that is definitely not expected.  I can recreate this on both of our LFS’s.  One is community Lustre 2.14, the other is a DDN EXAScaler.  Shown below is our community Lustre, but we also have a 3-segment PFL on our EXAScaler and the behavior is the same there.

$ echo > aaa
$ echo >> bbb
$ lfs getstripe aaa bbb
aaa
  lcm_layout_gen:    3
  lcm_mirror_count:  1
  lcm_entry_count:   3
    lcme_id:             1
    lcme_mirror_id:      0
    lcme_flags:          init
    lcme_extent.e_start: 0
    lcme_extent.e_end:   33554432
      lmm_stripe_count:  1
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: 6
      lmm_objects:
      - 0: { l_ost_idx: 6, l_fid: [0x100060000:0xace8112:0x0] }

    lcme_id:             2
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 33554432
    lcme_extent.e_end:   10737418240
      lmm_stripe_count:  4
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1

    lcme_id:             3
    lcme_mirror_id:      0
    lcme_flags:          0
    lcme_extent.e_start: 10737418240
    lcme_extent.e_end:   EOF
      lmm_stripe_count:  8
      lmm_stripe_size:   4194304
      lmm_pattern:       raid0
      lmm_layout_gen:    0
      lmm_stripe_offset: -1

bbb
lmm_stripe_count:  1
lmm_stripe_size:   2097152
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 3
        obdidx           objid           objid           group
             3       179773949      0xab721fd                0


From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf of Otto, Frank via lustre-discuss <lustre-discuss at lists.lustre.org>
Date: Monday, April 29, 2024 at 8:33 AM
To: lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
Subject: [EXTERNAL] [BULK] [lustre-discuss] Files created in append mode don't obey directory default stripe count

See subject. Is it a known issue? Is it expected? Easy to reproduce:


# lfs getstripe .
.
stripe_count:  4 stripe_size:   1048576 pattern:       raid0 stripe_offset: -1

# echo > aaa
# echo >> bbb
# lfs getstripe .
.
stripe_count:  4 stripe_size:   1048576 pattern:       raid0 stripe_offset: -1

./aaa
lmm_stripe_count:  4
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 0
        obdidx           objid           objid           group
             0            2830          0xb0e                0
             1            2894          0xb4e                0
             2            2831          0xb0f                0
             3            2895          0xb4f                0

./bbb
lmm_stripe_count:  1
lmm_stripe_size:   1048576
lmm_pattern:       raid0
lmm_layout_gen:    0
lmm_stripe_offset: 4
        obdidx           objid           objid           group
             4            2831          0xb0f                0



As you see, file "bbb" is created with stripe count 1 instead of 4.
Observed in Lustre 2.12.x and Lustre 2.15.4.

Thanks,
Frank

--
Dr. Frank Otto
Senior Research Infrastructure Developer
UCL Centre for Advanced Research Computing
Tel: 020 7679 1506
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Whamcloud





