[Lustre-discuss] fseeks on lustre

Andreas Dilger andreas.dilger at oracle.com
Tue Apr 13 21:36:43 PDT 2010


On 2010-04-13, at 07:59, Ronald K Long wrote:
> We are doing SEEK_SET
>
> fseek(fp, offset[i], SEEK_SET);
>
> We were running into this same issue on our SAN file system until we
> set the dma_cache_read_ahead to match our buffer size of 256k.  Just
> wondering if there is a way to set that within Lustre.  We are
> running 1.8 on the MDS and OSS, and the clients running the fseek are
> running 1.6.

Sorry, I didn't read enough into your question.  When you said
"opening a large file and doing fseek()" I thought that was the only
thing you were doing, but really you are doing IOs after the fseek(),
and that is presumably what is taking a long time.

It's true that if you are doing random reads it may cause sub-optimal  
performance because the client cannot do readahead to mitigate the  
network/disk latency.

One problem that we saw some time ago at another customer was that  
their application doing random IO was getting the "IO blocksize" from  
the file via {f,}stat() and reading from the file in chunks of  
st_blksize.  Since Lustre returns st_blksize = 2MB, the
application wanting 4kB chunks of random data from the file was
seeking and reading 2MB of extra data for each seek.

It would be worthwhile to strace your application to see if it is  
doing the same thing.
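
As a rough illustration of the pattern described above, here is a minimal
C sketch.  It is not taken from any real application: the helper names
preferred_io_size() and read_record() are made up for this example, and
it uses pread() instead of fseek()/fread() so that exactly the requested
number of bytes is asked for at each offset, regardless of st_blksize.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose pread() under strict compiler modes */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: return the "preferred I/O size" the filesystem
 * advertises for a file via st_blksize, or -1 if stat() fails. */
long preferred_io_size(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (long)st.st_blksize;
}

/* Hypothetical helper: read up to `len` bytes at `offset` from `fd`.
 * Only the bytes asked for are requested, so a 4kB record costs a 4kB
 * read request rather than a full st_blksize-sized one.
 * Returns the number of bytes read (short only at EOF), or -1 on error. */
ssize_t read_record(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

Comparing the value from preferred_io_size() with the record size the
application actually needs should show quickly whether each seek is
followed by a much larger read than necessary.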

> Andreas Dilger <andreas.dilger at oracle.com> wrote:
>> On 2010-04-07, at 14:09, Ronald K Long wrote:
>> > I am having an issue with our lustre file system.  In our current
>> > environment on a san file system opening a large file and doing
>> > fseeks completes in under 2 seconds.  Running that same routine on
>> > our lustre file system the routine actually never finishes.
>>
>> Doing fseek() itself is only a client-side operation, so it should
>> have no performance impact, UNLESS you are doing SEEK_END, which
>> requires that the actual file size be computed on the client.  That
>> causes lock revocation from all of the clients and is an expensive
>> operation.  Using SEEK_CUR or SEEK_SET has no cost at all.
>>
>> > Are there any tunable parameters in lustre that can alleviate this
>> > problem?
>>
>> It depends on what the problem really is.


Cheers, Andreas
--
Andreas Dilger
Principal Engineer, Lustre Group
Oracle Corporation Canada Inc.



