[Lustre-discuss] exceedingly slow lstats

wangdi di.wang at whamcloud.com
Sat Jan 21 10:53:55 PST 2012


Hello,

This is probably because heavy load on the OSSs is slowing down stat, 
which has to fetch the file size from them. Did you see any slow I/O 
messages on the OSSs?
See LU-15, http://jira.whamcloud.com/browse/LU-15. Disabling the read-only 
cache might be the first thing you need to do.
There is some detailed discussion there about what you can do to mitigate 
this problem (see the comments around April 15th).
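
To check from a client whether lstat itself is slow without going through 
strace, something along these lines may help. It is only a rough sketch and 
not something taken from the ticket: the function name time_lstats and the 
directory argument are placeholders, and it simply times one os.lstat() call 
per directory entry and prints the slowest ones.

```python
import os
import sys
import time


def time_lstats(directory):
    """Return a list of (name, seconds) pairs, one lstat() per directory entry."""
    timings = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        start = time.perf_counter()
        try:
            os.lstat(path)
        except OSError:
            # Entry disappeared or is unreadable; skip it rather than abort.
            continue
        timings.append((name, time.perf_counter() - start))
    return timings


if __name__ == "__main__":
    # Hypothetical usage: pass the directory to test, e.g. one on the Lustre mount.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    results = time_lstats(target)
    # Show the ten slowest calls first.
    for name, seconds in sorted(results, key=lambda t: t[1], reverse=True)[:10]:
        print(f"{seconds:.6f}s  {name}")
    if results:
        total = sum(seconds for _, seconds in results)
        print(f"{len(results)} entries, mean {total / len(results):.6f}s per lstat")
```

A second run over the same directory may well be faster if the client still 
has the attributes cached, so the first run after a remount is probably the 
more telling number.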

Thanks
WangDi



On 01/20/2012 02:35 PM, John White wrote:
> Well, I was reading the strace wrong anyway:
> lstat("../403/a323", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<0.134326>
> getxattr("../403/a323", "system.posix_acl_access", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported)<0.000018>
> lstat("../403/a330", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<0.158898>
> getxattr("../403/a330", "system.posix_acl_access", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported)<0.000019>
> lstat("../403/a331", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<0.239466>
> getxattr("../403/a331", "system.posix_acl_access", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported)<0.000012>
> lstat("../403/a332", {st_mode=S_IFREG|0644, st_size=0, ...}) = 0<0.130146>
> getxattr("../403/a332", "system.posix_acl_access", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported)<0.000012>
>
> The getxattr takes an incredibly short amount of time; it's the lstat itself that's taking 0.1+ s.
> ----------------
> John White
> HPC Systems Engineer
> (510) 486-7307
> One Cyclotron Rd, MS: 50C-3396
> Lawrence Berkeley National Lab
> Berkeley, CA 94720
>
> On Jan 20, 2012, at 2:28 PM, Mark Hahn wrote:
>
>>> I'm sorry, I'm not quite understanding what you're asking.  I don't have
>>> ACLs specifically enabled anywhere (and would expect the default is
>>> disabled).
>> I guess what I was suggesting is that you could try a simple experiment:
>> mount a client requesting the acl mount option.  (I don't know whether the "mount -oremount,acl ..." trick will work with Lustre.)
>>
>> if the problem goes away, you're done.
>>
>> you also mentioned OSS load - I don't see how that could be related,
>> since OSSs are not involved in metadata operations like lstat or getxattr.
>> (though depending on Lustre version, they could be involved in fetching
>> actual size of files, which is especially salient on striped files...)
>>
>>
>>
>>>
>>> On Jan 20, 2012, at 12:49 PM, Mark Hahn wrote:
>>>
>>>>>    0.916908 getxattr("/global/scratch/jwhite/backuptest/highcount/3/a5", "system.posix_acl_access", 0x0, 0) = -1 EOPNOTSUPP (Operation not supported)
>>>> are your clients mounting with the acl option
>>>> and acl isn't missing on the mds mount?
>> -- 
>> operator may differ from spokesperson.	            hahn at mcmaster.ca



