[lustre-discuss] Relaxing read consistency, from other node write

Hans Henrik Happe happe at nbi.ku.dk
Mon May 9 08:51:54 PDT 2016



On 09-05-2016 16:42, Patrick Farrell wrote:
> Hans,
>
> It sounds like you want the data to be available to the other nodes
> before it's safely 'down' on the server.  Is that correct?

Yes, while the data is in the server cache the other clients should be 
allowed to read it. I guess you could treat it as a replica of the 
writing client's cache, so it could be replayed in case of server failure.

I was just wondering if there was a quick workaround for my current problem.

If the system could squeeze in these small writes without seconds of 
latency while large writes are hitting the drives hard, I would be happy.


> If so, then I believe there's no way to do that currently in Lustre.
>
> If you were willing to accept the possibility of incorrect reads, then
> you could check out group locking - it lets clients get a special type
> of shared lock, which allows every client to think it has a lock on the
> whole file.  That lets them do reads while someone else is writing, with
> the caveat that they can get out-of-date data, and that one client's
> cache is not invalidated when another client updates part of the file.
>
> It's a pretty significant relaxation of the POSIX consistency semantics,
> and is tricky to use safely without very well-defined behavior from
> clients.  And it might not be what you need...  But I think it's what's
> available.
>
> Cheers,
> - Patrick
>
> On 05/09/2016 02:15 AM, Hans Henrik Happe wrote:
>> Hi,
>>
>> Some users experienced that, when reading a log file written on another
>> node, the last bytes were sometimes delayed by tens of seconds. This
>> happens when other processes are writing heavily.
>>
>> It seems that the data needs to be committed to persistent storage,
>> before the reading node can have it. That makes sense since the
>> writing node and the server could die, taking with them all knowledge
>> about the write. Is this a correct description?
>>
>> I'm wondering if there is a way to relax this, i.e. either ignore this
>> failure scenario or treat the cache entries on the writing node and the
>> server as sufficient redundancy?
>>
>> As to why we see these long delays, I think I tracked it down to a ZFS
>> issue (https://github.com/zfsonlinux/zfs/issues/4603), but I'm only a
>> layman when it comes to the internals of ZFS and Lustre.
>>
>> We are at 2.7.64, so we have to update to 2.8 soon. Going through the
>> commits I couldn't find anything related, but that might just be my
>> ignorance.
>>
>> Cheers,
>> Hans Henrik
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

