[lustre-discuss] NFS kernel server does not seem to trigger refreshes
Andreas Dilger
adilger at thelustrecollective.com
Fri Apr 24 17:14:39 PDT 2026
Hi Peter,
there were some fixes from Oleg recently related to the kNFS cache of Lustre files, so that the NFS cache is invalidated when Lustre DLM locks are removed from a client.
It looks like LU-19237 and LU-19238 are the related tickets.
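To compare behaviour before and after those fixes, the lag could be measured rather than judged by eye with 'tail -f'. Below is a minimal Python sketch, assuming the clocks on "W" and "R" are synchronised (e.g. via NTP). The function names are made up for illustration and are not part of any Lustre or NFS tooling.

```python
import os
import time


def append_stamped_line(path):
    """Writer side (node "W"): append one line holding the current wall-clock time."""
    with open(path, "a") as f:
        f.write(f"{time.time():.3f}\n")
        f.flush()
        os.fsync(f.fileno())  # push the data out of the local page cache


def newest_line_lag(path, now=None):
    """Reader side (node "R"): reopen the file and return the age, in seconds,
    of the newest stamped line, or None if the file has no lines yet.

    The file is reopened on every call on purpose, since the reported lag
    seemed to depend on how often the log is reopened.
    """
    now = time.time() if now is None else now
    with open(path) as f:
        lines = f.read().splitlines()
    if not lines:
        return None
    return now - float(lines[-1])
```

Calling append_stamped_line() every 3 seconds on "W" and newest_line_lag() every second on "R" should give a lag that stays near the write interval when cache invalidation works, and that grows when the NFS server keeps serving stale data.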
Cheers, Andreas
> On Apr 23, 2026, at 16:06, Peter Grandi via lustre-discuss <lustre-discuss at lists.Lustre.org> wrote:
>
> So the context is EL8 4.18 kernel and in-kernel NFS server,
> Lustre client 2.16.1, and for comparison NFS Ganesha 5.7;
> using NFS protocol version 4.2 with both.
>
> The issue is the vexed one of inter-node consistency in a
> special case:
>
> * Client "W" writing to a Lustre filesystem a log-file (every 3
> seconds).
>
> * Lustre client "S" with an NFS server re-exporting the Lustre
> filesystem.
>
> * NFS client "R" reading from NFS the log-file.
>
> What I observe:
>
> * A 'tail -f' of the log-file on "S" itself has little to no
> perceivable lag thanks to the Lustre DLM.
>
> * A 'tail -f' of the log file on "R" has a few seconds of lag if
> I use the NFS Ganesha server on "S" (the lag depends on some
> NFS Ganesha caching parameters).
>
> * A 'tail -f' of the log file on "R" can have a rather long lag if I
> use the NFS kernel server on "S" but it is erratic (seems to
> depend on how often I reopen the log).
>
> * A 'tail -f' of the log file on "R" has only a small lag if I
> use the NFS kernel server on "S" and I write to the log-file
> on the "S" itself instead of on "W" (this seems to indicate
> that there is no issue with lag on the NFS side, because of
> presence or absence of delegations or various NFS side caching
> timeouts and I have done several tests).
>
> Note: in the latter case neither the 'mtime' of the i-node nor
> the contents of the file get updated.
>
> The difference seems to be that:
>
> * User-program level access to the log-file on Lustre does trigger
> the DLM to refresh its cached state.
>
> * Kernel-level access to the log-file on Lustre does not seem to
> trigger the DLM to refresh its cached state.
>
> I would be happy to just use the NFS Ganesha server but it has
> another flaw; it just hangs every 1-3 days during periods of
> concurrent access on the NFS client (which I suspect is due to some
> incompatibility between the Linux NFS kernel client and the NFS
> Ganesha server).
>
> One way to work around the issue would be to limit the
> time-to-live of cached Lustre file contents and attributes and
> various web searches indicated that some Lustre client versions
> used to have some caching timeouts but looking at 'lctl' and
> under '/sys/' and '/proc/' in 2.16.1 I cannot see anything
> relevant but the 'inode_cache' and 'xattr_cache' toggles and
> those seem a bit drastic.
>
> But I can see that on an NFS-Lustre server where there is no
> local access to Lustre files other than from the NFS server that
> might be an option.
>
> Any suggestions or workarounds?
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
---
Andreas Dilger
Principal Lustre Architect
adilger at thelustrecollective.com