[Lustre-discuss] question about size on MDS (MDT) for lustre-1.8

Robin Humble robin.humble+lustre at anu.edu.au
Thu Jan 27 23:34:19 PST 2011

On Thu, Jan 13, 2011 at 05:28:23PM -0500, Kit Westneat wrote:
>> It would probably be better to set:
>> lctl conf_param fsname-OST00XX.ost.readcache_max_filesize=32M
>> or similar, to limit the read cache to files 32MB in size or less (or whatever you consider "small" files at your site). That allows the read cache to work for config files and such, without thrashing the cache when accessing large files.
>> We should probably change this to be the default, but at the time the read cache was introduced we didn't know what should be considered a small vs. large file, and the amount of RAM, the number of OSTs on an OSS, and the usage patterns vary so much that it is difficult to pick a single correct value for this.
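For reference, the setting above can be applied either persistently via the MGS or live on an OSS. This is a command sketch, not output from a real system; "testfs" and the OST index are placeholders for your own filesystem and OST names:

```shell
# Persistent setting via the MGS: limit the OSS read cache to files <= 32MB.
# "testfs" and OST0000 are placeholders.
lctl conf_param testfs-OST0000.ost.readcache_max_filesize=32M

# Or set it live on an OSS (does not survive a restart):
lctl set_param obdfilter.*.readcache_max_filesize=32M

# Check the current value:
lctl get_param obdfilter.*.readcache_max_filesize
```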

limiting the total amount of OSS cache used in order to leave room for
inodes/dentries might be more useful. the data cache will always fill
up and push out inodes otherwise.
Nathan's approach of turning off the caches entirely is extreme, but if
it gives us back some metadata performance then it might be worth it.

or is there a Lustre or VM setting to limit overall OSS cache size?

I presume that Lustre's OSS caches are subject to normal Linux VM
pagecache tweakables, but I don't think such a knob exists in Linux at
the moment...

>I was looking through the Linux vm settings and saw vfs_cache_pressure - 
>has anyone tested performance with this parameter? Do you know if this 
>would have any effect on file caching vs. ext4 metadata caching?
>For us, Linux/Lustre would ideally push out data before the metadata, as 
>the performance penalty for doing 4k reads on the s2a far outweighs any 
>benefits of data caching.

good idea. if all inodes are always cached on OSS's then the fs should
be far more responsive to stat loads... 4k/inode shouldn't use up too
much of the OSS's ram (probably more like 1 or 2k/inode really).
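To put a number on that back-of-envelope estimate (the object count below is hypothetical, and the per-object cost is the pessimistic end of the 1-2k/inode guess):

```shell
# Rough slab-memory estimate for keeping every OST object's inode cached.
# Assumptions: 1M objects behind one OSS, ~2KB of inode+dentry slab each.
objects=1000000
bytes_each=2048
echo "$(( objects * bytes_each / 1024 / 1024 )) MB"   # ~1953 MB, i.e. ~2GB
```

so even with a few million objects per OSS, pinning the inodes costs a few GB of RAM, which is small next to what the data cache would otherwise consume.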

anyway, following your idea, we tried vfs_cache_pressure=50 on our
OSS's a week or so ago, but hit this within a couple of hours -
could have been a coincidence, I guess.
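for anyone who wants to repeat the experiment, the knob is a plain sysctl (sketch; 50 biases reclaim away from inodes/dentries, values above 100 bias toward reclaiming them, and the slab names to watch vary by backing filesystem):

```shell
# Bias VM reclaim away from the inode/dentry caches (default is 100;
# lower values keep metadata cached longer at the expense of pagecache).
sysctl -w vm.vfs_cache_pressure=50

# To make it persistent across reboots, add to /etc/sysctl.conf:
#   vm.vfs_cache_pressure = 50

# Watch inode/dentry slab usage to see whether it helps:
grep -E '(inode_cache|^dentry)' /proc/slabinfo
```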

did anyone else give it a try?

BTW, we recently had the opposite problem on a client that scans the
filesystem - too many inodes were cached leading to low memory problems
on the client. we've had vfs_cache_pressure=150 set on that machine for
the last month or so and it seems to help, although a more effective
setting in this case was limiting ldlm locks, e.g. from the Lustre manual:
  lctl set_param ldlm.namespaces.*osc*.lru_size=10000
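the lock cap and the current lock usage can be inspected together on the client (sketch; 10000 is just the value from the manual, tune to taste):

```shell
# Cap the per-OSC ldlm lock LRU at 10000 locks; unused locks above
# the cap are cancelled, freeing client memory.
lctl set_param ldlm.namespaces.*osc*.lru_size=10000

# See how many locks each namespace currently holds:
lctl get_param ldlm.namespaces.*osc*.lock_count

# lru_size=0 returns to the default dynamic (age-based) LRU behaviour.
```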

Dr Robin Humble, HPC Systems Analyst, NCI National Facility
