[lustre-discuss] Lustre caching behavior
Andreas Dilger
adilger at thelustrecollective.com
Tue Mar 24 14:08:18 PDT 2026
Yes, the stat() is using functionality called a "glimpse" lock, which is a kind of "try lock" mechanism for fetching the file size.
Lustre doesn't _require_ holding locks on the client when the stat() call is done. The stat() information itself is ephemeral (i.e. the file attributes, and particularly the size, could be outdated as soon as the system call returns to userspace, and probably earlier, since any locks would have to be dropped before the attributes are returned). Also, holding DLM locks on all the objects of a file on each client doing the stat() would be punishing in a large cluster with concurrent writes.
So instead of unconditionally grabbing the DLM locks for all the file objects to get the size and then dropping them again, the client doing the stat() will "take a glimpse" of the file size and block count without taking any locks, asking the OSTs what the size/blocks of each object are. In turn, if the objects are in use by another client, the OSTs will "take a glimpse" of the size of the file on the client(s) holding the DLM lock closest to the end of the file without cancelling the DLM lock, and return the largest size to the original stat() client.
*However*, if the client holding the DLM lock hasn't been using that lock/file for some (configurable) time, it will notice the glimpse and take that as a hint that some other client is interested in the DLM lock and cancel it if unused. That is probably what you are seeing.
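To make the sequence above concrete, here is a toy Python model of it. This is only a sketch and not Lustre code: the class names (ClientLock, OstObject), the flush-on-cancel step, and the IDLE_LIMIT constant are all made up for illustration, and the 20s value is just the from-memory figure mentioned in this thread.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed idle threshold in seconds (hypothetical; not a verified Lustre default).
IDLE_LIMIT = 20.0


@dataclass
class ClientLock:
    """Toy stand-in for a client's DLM write lock on one object."""
    cached_size: int   # object size as seen by the lock holder (may include dirty data)
    last_used: float   # time the holder last did I/O under this lock
    held: bool = True

    def on_glimpse(self, now: float) -> int:
        # Report the size without being forced to give up the lock...
        size = self.cached_size
        # ...but treat the glimpse as a hint and cancel a lock that has sat idle.
        if now - self.last_used > IDLE_LIMIT:
            self.held = False
        return size


@dataclass
class OstObject:
    """Toy stand-in for one object of a file on an OST."""
    size_on_disk: int
    writer: Optional[ClientLock] = None

    def glimpse(self, now: float) -> int:
        # The stat() client is granted no lock; the OST only reports a size.
        if self.writer is None or not self.writer.held:
            return self.size_on_disk
        size = max(self.size_on_disk, self.writer.on_glimpse(now))
        if not self.writer.held:
            # In this model, an idle holder flushes its data and drops the lock.
            self.size_on_disk = size
            self.writer = None
        return size
```

In this model a glimpse that arrives 5s after the writer's last I/O returns the writer's size and leaves the lock alone, while one that arrives 30s after it returns the same size but also costs the writer its lock, and with it the cached data.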
I thought there was a tunable to control how long a DLM lock is idle before a glimpse will cancel it, but I can't find it. AFAIR it is about 20s. I also haven't checked whether the glimpse cancellation only applies to write locks (which _should_ be the case), or also to read locks (which would be counterproductive), and whether it would be possible to downgrade the DLM write locks to read locks while preserving the cached data on the client(s).
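If you want to probe the idle threshold empirically, one option is to stat() the file from the front-end node at a fixed interval and compare the caching behavior on the compute node for different intervals. A minimal sketch, using only the Python standard library; the function name poll_size and its parameters are made up for this example:

```python
import os
import time
from typing import List


def poll_size(path: str, interval_s: float, count: int) -> List[int]:
    """stat() `path` `count` times, sleeping `interval_s` seconds between calls.

    Returns the st_size value seen by each stat() call, in order.
    """
    sizes = []
    for i in range(count):
        sizes.append(os.stat(path).st_size)
        if i + 1 < count:
            time.sleep(interval_s)
    return sizes
```

For example, running poll_size("/path/to/file", 5.0, 100) in one test and poll_size("/path/to/file", 60.0, 10) in another, with the path replaced by the file of interest, should show whether a stat() that arrives after a longer idle period disturbs the cache when a more frequent one does not.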
Cheers, Andreas
On Mar 24, 2026, at 09:39, John Bauer via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:
All,
Is it a known effect that a stat() of a file on a given node will change the caching behavior on a second node? I have noticed that the caching behavior of Lustre on a dedicated compute node will change significantly if I do intermittent stats of the file of interest from the front-end node.
Thanks,
John
_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org