[lustre-discuss] lustre-discuss Digest, Vol 238, Issue 8
John Bauer
bauerj at iodoctors.com
Tue Jan 13 15:59:05 PST 2026
Rick,
I suspected it would be something along the lines of what you described. I am not very
familiar with those aspects of Lustre. Your comment about Persistent
Client Cache is interesting, but I believe it to be unavailable on my
client nodes. I'm going to throw one more image into the discussion as
an argument for needing a mechanism to bypass Hybrid I/O. This
image is the File Position Activity plot from a run during which the
other application's checkpoint occurred. Notice that my job did not stall
at all while the checkpoint was in progress. Should someone think it reasonable to
implement a mechanism to bypass the Hybrid I/O function, I will suggest
that it be done via the PFL mechanism, allowing selected PFL
components to be flagged for legacy buffered I/O. I assume that every
read or write must already be checked to determine which component and OST
it belongs to; once Lustre has determined the component, it could also flag the
request for the non-hybrid path.
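As a rough sketch of what I have in mind (the layout options below are real PFL syntax, but the "buffered" component flag is purely hypothetical and does not exist today, and the pool name "ssd" and file path are just examples):

    # today: one component, 4 stripes, 16 MB stripe size, on an SSD pool
    lfs setstripe -E -1 -c 4 -S 16M -p ssd /mnt/lustre/myfile

    # hypothetical: the same component flagged so the client uses the
    # legacy buffered path instead of Hybrid I/O for that component
    # (exact option syntax aside)
    lfs setstripe -E -1 -c 4 -S 16M -p ssd --comp-flags=buffered /mnt/lustre/myfile
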
John
Image 5:
https://www.dropbox.com/scl/fi/a6jaf6piq4p7z42x5h4x5/buffered_with_no_cp_affect.png?rlkey=4mbl5ysmm5xokkremnn63f1qk&st=zlhqf2k5&dl=0
On 1/13/2026 3:02 PM, lustre-discuss-request at lists.lustre.org wrote:
>
> Today's Topics:
>
> 1. Re: [EXTERNAL] Dramatic loss of performance when another
> application does writing. (Mohr, Rick)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 13 Jan 2026 20:01:09 +0000
> From: "Mohr, Rick"<mohrrf at ornl.gov>
> To: John Bauer<bauerj at iodoctors.com>,
> "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
> Subject: Re: [lustre-discuss] [EXTERNAL] Dramatic loss of performance
> when another application does writing.
> Message-ID: <06BC13C1-B5E9-40F9-AC9E-1891D94336FA at ornl.gov>
> Content-Type: text/plain; charset="utf-8"
>
> John,
>
> I wonder if this could be a credit issue. Do you know the size of the other job that is doing the checkpointing? It sounds like your job is just a single-client job, so it is going to have a limited number of credits (the default used to be 8, but I don't know if that is still the case). If the other job is using 100 nodes (just as an example), it could have 100x more outstanding IO requests than your job can have. The spike in the server load makes me think that IO requests are getting backed up.
>
> Lustre has a limit on peer_credits, which is the number of outstanding IO requests per client; it helps prevent any one client from monopolizing a Lustre server. But the nodes themselves also have a limit on the total number of credits, which caps the number of outstanding IO requests on the server (I think that number is related to the limitations of the network fabric, but it can also serve as a way to limit the number of requests queued on the server and keep the server from getting overloaded). If a large job is checkpointing, then maybe that job is chewing up the server's credits, so that your application is only getting a small number of IO requests added to a very large queue of outstanding requests. My knowledge of credits may be flawed or out of date (perhaps someone else on the list can correct me if so), but it's one way that contention could exist on a server even when there isn't contention on the OSTs themselves.
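>
> Not definitive, but one way to sanity-check the credit numbers on a client is to look at the LNet tunables. The module name below assumes an o2ib (InfiniBand) network; substitute ksocklnd for a TCP network:
>
>     # per-NI credits / peer_credits as LNet reports them
>     lnetctl net show -v
>
>     # module-level defaults for the o2iblnd LND (assumes an o2ib fabric)
>     cat /sys/module/ko2iblnd/parameters/peer_credits
>     cat /sys/module/ko2iblnd/parameters/credits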
>
> If your application is using a single client which has some local SSD storage, the Persistent Client Cache (PCC) feature might be of some benefit to you (if it's available on your file system).
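>
> For reference, a very rough sketch of how PCC is typically set up (the paths, archive ID, and projid rule below are placeholders, and the exact syntax depends on the Lustre version):
>
>     # on the client: register a local SSD directory as a read-write PCC backend
>     lctl pcc add /mnt/lustre /mnt/nvme/pcc --param "projid={500} rwid=2"
>
>     # attach a file to the cache and check its state
>     lfs pcc attach -i 2 /mnt/lustre/myfile
>     lfs pcc state /mnt/lustre/myfile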
>
> --Rick
>
>
> On 1/12/26, 7:52 PM, "lustre-discuss on behalf of John Bauer via lustre-discuss" <lustre-discuss-bounces at lists.lustre.org> wrote:
>
>
> All,
> My questions of recent are related to my trying to understand the following issue. I have an application that is writing, reading forwards, and reading backwards, a single file multiple times ( as seen in bottom frame of Image 1). The file is striped 4x16M on 4 ssd OSTs on 2 OSS. Everything runs along just great with transfer rates in the 5GB/s range. At some point, another application triggers approximately 135 GB of writes to each of the 32 hdd OSTs on the16 OSSs of the file system. When this happens my applications performance drops to 4.8 MB/s, a 99.9% loss of performance for the 33+ second duration of the other application's writes. My application is doing 16MB preads and pwrites in parallel using 4 pthreads, with O_DIRECT on the client. The main question I have is: "Why do the writes from the other application affect my application so dramatically?" I am making demands of the 2 OSS of about the same order of magnitude, 2.5GB/s each from 2 OSS, as the other application is gettin
> g from the same 2 OSS, about 4 GB/s each. There should be no competition for the OSTs, as I am using ssd and the other application is using hdd. If both applications are triggering Direct I/O on the OSSs, I would think there would be minimal competition for compute resources on the OSSs. But as seen below in Image 3, there is a huge spike in cpu load during the other application's writes. This is not a one-off event. I see this about 2 out of every 3 times I run this job. I suspect the other application is one that checkpoints on a regular interval, but I am a non-root user and have no way to determine. I am using PCP/pmapi to get the OSS data during my run. If the images get removed from the email, I have used alternate text with links to Dropbox for the images.
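>
> For what it's worth, a minimal sketch of the read side of that pattern (not the actual application; only the 16 MB request size, 4 threads, and O_DIRECT come from the description above, and the file path and everything else are assumptions):
>
> #define _GNU_SOURCE            /* for O_DIRECT */
> #include <fcntl.h>
> #include <pthread.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> #define NTHREADS 4
> #define XFER (16UL * 1024 * 1024)        /* 16 MB per request */
>
> struct worker { int fd; off_t start; off_t end; };
>
> static void *reader(void *arg)
> {
>     struct worker *w = arg;
>     void *buf;
>
>     /* O_DIRECT requires aligned buffers; 4 KB covers the usual page size */
>     if (posix_memalign(&buf, 4096, XFER))
>         return NULL;
>     /* each thread strides through the file, one 16 MB pread at a time */
>     for (off_t off = w->start; off < w->end; off += XFER * NTHREADS) {
>         if (pread(w->fd, buf, XFER, off) < 0) {
>             perror("pread");
>             break;
>         }
>     }
>     free(buf);
>     return NULL;
> }
>
> int main(int argc, char **argv)
> {
>     const char *path = argc > 1 ? argv[1] : "/mnt/lustre/testfile";
>     int fd = open(path, O_RDONLY | O_DIRECT);
>     if (fd < 0) { perror("open"); return 1; }
>
>     off_t size = lseek(fd, 0, SEEK_END);
>     pthread_t tid[NTHREADS];
>     struct worker w[NTHREADS];
>
>     for (int i = 0; i < NTHREADS; i++) {
>         w[i] = (struct worker){ fd, (off_t)i * XFER, size };
>         pthread_create(&tid[i], NULL, reader, &w[i]);
>     }
>     for (int i = 0; i < NTHREADS; i++)
>         pthread_join(tid[i], NULL);
>     close(fd);
>     return 0;
> }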
> Thanks,
> John