[Lustre-discuss] fseeks on lustre
Ronald K Long
rklong at usgs.gov
Fri Apr 16 10:26:21 PDT 2010
After doing some more digging, it looks as though a bug was reported on
this back in 2007:
https://bugzilla.lustre.org/show_bug.cgi?id=12739
We have loaded the Lustre patch attached to this bug; however, when
running the set_param command I get the following error:
lctl set_param llite*.*.stat_blksize=4096
error: set_param: /proc/{fs,sys}/{lnet,lustre}/llite/lustre*/stat_blksize:
No such process
Is this patch still valid for 2.6.9-78.0.22.EL_lustre.1.6.7.2smp?
Thanks again
Rocky
From:
Andreas Dilger <andreas.dilger at oracle.com>
To:
Ronald K Long <rklong at usgs.gov>
Cc:
"Brian J. Murrell" <Brian.Murrell at Sun.COM>,
lustre-discuss at lists.lustre.org, lustre-discuss-bounces at lists.lustre.org
Date:
04/14/2010 02:13 PM
Subject:
Re: [Lustre-discuss] fseeks on lustre
On 2010-04-14, at 11:08, Ronald K Long wrote:
> We've narrowed down the problem quite a bit.
>
> The problematic code snippet is not actually doing any reads or writes;
> it's just doing a massive number of fseek() operations within a couple
> of nested loops. (Note: The production code is doing some I/O, but this
> snippet was narrowed down to the bare minimum example that exhibited the
> problem, which was how we discovered that fseek was the culprit.)
>
> The issue appears to be the behavior of the glibc implementation of
> fseek(). Apparently, a call to fseek() on a buffered file stream causes
> glibc to flush the stream (regardless of whether a flush is actually
> needed). If we modify the snippet to call setvbuf() and disable
> buffering on the file stream before any of the fseek() calls, then it
> finishes more or less instantly, as you would expect.
I'd encourage you to file a bug (preferably with a patch) against
glibc to fix this. I've had reasonable success in getting problems
like this fixed upstream.
> The problem is that this offending code is actually buried deep within a
> COTS library that we're using to do image processing (the Hierarchical
> Data Format (HDF) library). While we do have access to the source code
> for this library and could conceivably modify it, this is a large and
> complex library, and a change of this nature would require us to do a
> large amount of regression testing to ensure that nothing was broken.
>
> So at the end of the day this is really not a "Lustre problem" per se,
> though we would still be interested in any suggestions as to how we can
> minimize the effects of this glibc "flush penalty". This penalty is not
> particularly onerous when reading and writing to local disk, but is
> obviously more of an issue with a distributed filesystem.
Similarly, HDF + Lustre usage is very common, and I expect the HDF
developers would be interested in fixing this if possible.
> On Wed, 2010-04-14 at 07:08 -0500, Ronald K Long wrote:
> >
> > Andreas - Here is a snippet of the strace output.
> >
> > read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2097152) = 2097152
>
> As Andreas suspected, your application is doing 2MB reads every time.
> Does it really need 2MB of data on each read? If not, can you fix your
> application to only read as much data as it actually wants?
Cheers, Andreas
--
Andreas Dilger
Principal Engineer, Lustre Group
Oracle Corporation Canada Inc.