[lustre-discuss] Relaxing read consistency, from other node write

Patrick Farrell paf at cray.com
Mon May 9 07:42:10 PDT 2016


Hans,

It sounds like you want the data to be available to the other nodes 
before it's safely 'down' on the server.  Is that correct?

If so, then I believe there's no way to do that currently in Lustre.

If you are willing to accept the possibility of incorrect reads, then 
you could check out group locking. It lets clients get a special type 
of shared lock, which allows every client to think it has a lock on the 
whole file. That lets them read while someone else is writing, with 
the caveats that they can get out-of-date data and that one client's 
cache is not invalidated when another client updates part of the file.

It's a pretty significant relaxation of the POSIX consistency semantics, 
and is tricky to use safely without very well-defined behavior from 
clients. It might not be what you need, but I think it's what's 
available.
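
To make the stale-read caveat of group locking concrete, here is a toy 
model in Python. It is only a sketch and makes no Lustre calls: the 
names SharedFile, Client, read_page, write_page and drop_cache are 
hypothetical, the 4-byte page size is arbitrary, and the write-through 
behavior is an assumption made to keep the example small. It shows only 
that a reader whose cache is never invalidated keeps returning old data 
after another client writes.

```python
# Toy model of stale reads when client caches are never invalidated.
# This is NOT Lustre code; every name here is hypothetical.

class SharedFile:
    """Stands in for the file contents held on the server."""

    def __init__(self, data, page_size=4):
        self.page_size = page_size
        self.data = bytearray(data)

    def load(self, idx):
        start = idx * self.page_size
        return bytes(self.data[start:start + self.page_size])

    def store(self, idx, page):
        if len(page) != self.page_size:
            raise ValueError("page must be exactly page_size bytes")
        start = idx * self.page_size
        self.data[start:start + self.page_size] = page


class Client:
    """A client with a private page cache that nobody else invalidates."""

    def __init__(self, shared):
        self.shared = shared
        self.cache = {}  # page index -> cached bytes

    def read_page(self, idx):
        # Serve from the local cache if present; otherwise fetch and cache.
        if idx not in self.cache:
            self.cache[idx] = self.shared.load(idx)
        return self.cache[idx]

    def write_page(self, idx, page):
        # Assumption: write-through, so the shared copy is updated at once.
        # Other clients' caches are deliberately left untouched.
        self.shared.store(idx, page)
        self.cache[idx] = bytes(page)

    def drop_cache(self):
        # The reader has to discard cached pages itself to see new data.
        self.cache.clear()


def demo():
    shared = SharedFile(b"AAAABBBB")
    writer, reader = Client(shared), Client(shared)

    before = reader.read_page(1)      # reader caches page 1
    writer.write_page(1, b"CCCC")     # writer updates page 1
    stale = reader.read_page(1)       # still served from the reader's cache
    reader.drop_cache()
    fresh = reader.read_page(1)       # refetched from the shared copy
    return before, stale, fresh


if __name__ == "__main__":
    print(demo())
```

In this model the reader sees the update only after it drops its own 
cache. That is the kind of well-defined client behavior an application 
would have to arrange for itself.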

Cheers,
- Patrick

On 05/09/2016 02:15 AM, Hans Henrik Happe wrote:
> Hi,
>
> Some users have found that when reading a log file written on another 
> node, the read of the last bytes is sometimes delayed by tens of 
> seconds. This happens when other processes are writing heavily.
>
> It seems that the data needs to be committed to persistent storage, 
> before the reading node can have it. That makes sense since the 
> writing node and the server could die, taking with them all knowledge 
> about the write. Is this a correct description?
>
> I'm wondering if there is a way to relax this. I.e. ignore this 
> failure scenario or treat the cache entries in writing node and server 
> as enough redundancy?
>
> WRT why we see these long delays I think I tracked it down to a ZFS 
> issue (https://github.com/zfsonlinux/zfs/issues/4603), but I'm only a 
> layman when it comes to the internals of ZFS and Lustre.
>
> We are at 2.7.64, so we have to update to 2.8 soon. Going through the 
> commits I couldn't find anything that relates, but that might just be 
> my ignorance.
>
> Cheers,
> Hans Henrik


