[Lustre-discuss] brief 'hangs' on file operations

Tina Friedrich Tina.Friedrich at diamond.ac.uk
Fri Sep 3 03:04:16 PDT 2010


And another quick question - would this be more likely to be the journal 
on the MDS, or the OSS servers?
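
One rough way to narrow this down might be to time the open, write and close calls separately from a client, much like the strace runs mentioned below. This is only a sketch, on the assumption that open and close are mostly metadata operations handled by the MDS while the written data goes to the OSSs. The function name, the probe file names and the 0.5s threshold are made up for illustration and are not part of any Lustre tool.

```python
import os
import time


def time_file_ops(directory, iterations=1000, block_size=1 << 20,
                  threshold=0.5, sync=False):
    """Create, write and close small probe files in `directory`, timing
    each of the three calls separately.

    Returns a dict keyed by operation name with the number of calls slower
    than `threshold` seconds and the worst latency seen.
    """
    payload = b"\0" * block_size
    stats = {op: {"slow": 0, "worst": 0.0} for op in ("open", "write", "close")}

    def record(op, elapsed):
        if elapsed > threshold:
            stats[op]["slow"] += 1
        if elapsed > stats[op]["worst"]:
            stats[op]["worst"] = elapsed

    for i in range(iterations):
        # Hypothetical probe file name; each file is removed after use.
        path = os.path.join(directory, "latency_probe_%d.tmp" % i)

        start = time.perf_counter()
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        record("open", time.perf_counter() - start)

        start = time.perf_counter()
        view = memoryview(payload)
        while len(view) > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        if sync:
            # Without this, the write may only reach the client cache.
            os.fsync(fd)
        record("write", time.perf_counter() - start)

        start = time.perf_counter()
        os.close(fd)
        record("close", time.perf_counter() - start)

        os.unlink(path)

    return stats
```

Calling time_file_ops() on a directory inside the Lustre mount and printing the result shows which of the three calls collects the slow outliers. If they cluster in open and close, that would point more towards the MDS; if they cluster in write (particularly with sync=True), more towards the OSSs. That is a hint at best, not proof.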

On 02/09/10 17:38, Tina Friedrich wrote:
> Hello,
>
> On 02/09/10 17:28, Tina Friedrich wrote:
>> Hi Andreas,
>>
>> thanks for your answer.
>>
>>>> Causing most grief at the moment is that we sometimes see delays
>>>> writing files. From the writing clients end, it simply looks as if I/O
>>>> stops for a while (we've seen 'pauses' of anything up to 10 seconds).
>>>> This appears to be independent of what client does the writing, and
>>>> software doing the writing. We investigated this a bit using strace and
>>>> dd; the 'slow' calls appear to always be either open, write, or close
>>>> calls. Usually, these take well below 0.001s; in around 0.5% or 1% of
>>>> cases, they take up to multiple seconds. It does not seem to be
>>>> associated with any specific OST, OSS, client or anything; there is
>>>> nothing in any log files or any exceptional load on MDS or OSS or
>>>> any of the clients.
>>>
>>> This is most likely associated with delays in committing the journal
>>> on the MDT or OST, which can happen if the journal fills completely.
>>> Having larger journals can help, if you have enough RAM to keep them
>>> all in memory and not overflow. Alternately, if you make the journals
>>> small it will limit the latency, at the cost of reducing overall
>>> performance. A third alternative might be to use SSDs for the journal
>>> devices.
>>
>> Just to double check - that would be the file system journal, I assume?
>>
>> That makes a lot of sense; is there a way to verify that this is the
>> issue we're having?
>>
>> Journal size appears to be 400M - if we were to try increasing it, how
>> would we determine what best to set it to?
>
> That was meant to be 'if we were to try increasing or decreasing it' -
> sounds to us as if decreasing might be the better option (as in, if this
> is the journal flushing, having less journal to flush would probably be
> better - or is that the wrong idea?)
>
>
>>>> The other issue is that we frequently see delays when trying to read a
>>>> file. It sometimes takes more than 60s for a file to be visible on a
>>>> machine after the initial write on a different machine has completed
>>>> (both machines being Lustre clients). Again, there is nothing in the
>>>> logs, nor exceptional load on any of the machines.
>>>
>>> This is probably just a manifestation of the first problem. The issue
>>> likely isn't in the read, but a delay in flushing the data from the
>>> cache of the writing client. There were fixes made in 1.8 to increase
>>> the IO priority for clients writing data under a lock that other
>>> clients are waiting on.
>>
>> We kind of suspected them to be related, yes.
>>
>> Tina
>>
>
>


-- 
Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
Diamond House, Harwell Science and Innovation Campus - 01235 77 8442


