[lustre-discuss] Meaning of 'slow creates' messages on MDS
oleg.drokin at intel.com
Tue May 30 09:20:52 PDT 2017
On May 28, 2017, at 3:09 PM, Russell Dekema wrote:
> We have been having various kinds of trouble with our Lustre
> filesystem lately; right now the main problem we are having is
> intermittent severe slowness (such as 30 seconds for an 'ls' of a
> directory containing 100 files to return) when 'cd' and 'ls'ing around
> our Lustre filesystem.
> As far as I can tell [although I don't think we have perfect
> visibility into this], our underlying metadata and object storage
> arrays are not overloaded, either in general or specifically when we
> see the (presumably) metadata-related slowdowns.
> That said, during the slow periods, the load average on our metadata
> server is usually in the low single digits, and the load averages on
> our OSSes tend to be in the hundreds.
> I have noticed a number of error messages like the following in the
> system log on the metadata server, but I don't know quite how to
> interpret them:
> May 28 15:00:40 scr-mds0 kernel: : Lustre:
> scratch-OST001e-osc-MDT0000: slow creates,
> last=[0x1001e0000:0x58858e1:0x0], next=[0x1001e0000:0x58858e1:0x0],
> reserved=0, syn_changes=173, syn_rpc_in_progress=100, status=0
This means exactly what it says.
This OST is slow to create new objects (for object preallocation).
If all of your OSTs are slow to create objects, then when you create a lot
of files you eventually run out of preallocated OST objects (or, when
striping is used, run out on just a few of the OSTs) and the MDT threads
doing creates start to block.
If all MDT threads block, no new requests can be processed and you'd
see general filesystem slowness. Otherwise the slowness would be localized
to the directories where the files are being created.
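One way to tell which of the two cases you are in is to tally the
'slow creates' messages per OST. Below is a minimal sketch in Python. It
assumes the syslog line looks like the message quoted above once it is
joined back onto a single line. The function name count_slow_creates is
made up for illustration, and the second sample line is a made-up
non-matching line, not real Lustre output.

```python
import re
from collections import Counter

# Assumed format, taken from the message quoted above:
#   "... Lustre: scratch-OST001e-osc-MDT0000: slow creates, last=..."
# The capture group is the filesystem name plus the OST index.
SLOW_CREATES = re.compile(
    r"Lustre:\s+(\S+-OST[0-9a-fA-F]{4})-osc-MDT[0-9a-fA-F]{4}: slow creates"
)


def count_slow_creates(lines):
    """Count 'slow creates' messages per OST (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = SLOW_CREATES.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    # The message from the original report, joined onto one line.
    "May 28 15:00:40 scr-mds0 kernel: : Lustre: "
    "scratch-OST001e-osc-MDT0000: slow creates, "
    "last=[0x1001e0000:0x58858e1:0x0], next=[0x1001e0000:0x58858e1:0x0], "
    "reserved=0, syn_changes=173, syn_rpc_in_progress=100, status=0",
    # Made-up unrelated line, only to show that non-matching input is skipped.
    "May 28 15:00:41 scr-mds0 kernel: : some unrelated message",
]

for ost, n in count_slow_creates(sample).most_common():
    print(ost, n)
```

If the counts are spread over most OSTs, that points to the first case
(all OSTs slow); if only a handful of OSTs show up, it is more likely the
second.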
Overall, it looks like your OSTs are overloaded. You might want to try
faster storage, or decrease the number of OST I/O threads to ease the
(parallel) load.
Hope this helps.