[Lustre-discuss] lctl set_param /proc/fs/lustre/llite/lusfs0*/max_cached_mb ???
Andreas Dilger
adilger at sun.com
Wed Jun 10 17:10:44 PDT 2009
On Jun 04, 2009 17:52 -0500, Andrea Rucks wrote:
> What I'm seeing is that WPS eventually takes all 15.5 GB of available
> memory (or tries to) and then my server will hang and show an out of
> memory error on the console:
>
> Call Trace:
> [<ffffffff802bc998>] out_of_memory+0x8b/0x203
> Free pages: 6040kB (0kB HighMem)
> active:8348340kB inactive:7937820kB present:16785408kB
So about 8GB is inactive cached memory, which should be reclaimable
under memory pressure.
> pages_scanned:375028407 all_unreclaimable? yes
> lowmem_reserve[]: 0 0 0 0
> DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB
> inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
> inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
> active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
This _should_ mean that some pages are reclaimable; I'm not sure why
they are not being freed.
> If we stop WPS as it begins chewing through RAM, we still see a lot of
> memory in use (Lustre client cache). As I unmount each Lustre filesystem,
> I gain back a significant portion of memory (about 7 GB back total).
That by itself isn't indicative of a problem: Linux/Lustre keeps cached
data that isn't in use (inactive) in case it is needed again later.
> I'd like to now limit the Lustre clients to the following, but I'm not
> sure if doing so will mess things up:
>
> lctl set_param /proc/fs/lustre/llite/lusfs01*/max_cached_mb 2048 # Lustre
> Default is 12288
Note: you can use "lctl set_param llite.*.max_cached_mb=2048" as a
shortcut for this. Also be aware that many separate caches (i.e.
multiple filesystems) are less efficient than a single large filesystem.
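As a sketch of that shortcut (the "lusfs01" filesystem name is taken from
the command quoted above; adapt the glob to your own mounts):

```shell
# Check the current per-filesystem limit first
lctl get_param llite.*.max_cached_mb

# Cap the client page cache at 2GB for every mounted Lustre filesystem...
lctl set_param llite.*.max_cached_mb=2048

# ...or only for one filesystem, e.g. lusfs01
lctl set_param llite.lusfs01*.max_cached_mb=2048
```

Keep in mind that set_param changes are runtime-only and do not persist
across a remount of the filesystem.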
> So, here are my questions. Why is 75% the default for max_cached_mb?
Just a reasonable maximum amount of cached data. Something has to be
kept available for application use.
> What will happen if I "lctl set_param llite.*.max_cached_mb 2048" instead
> of 12288 where it is today for each of those filesystems mentioned above?
It should cap the cached data at 2GB per filesystem.
> How am I affecting the performance of the client by making that change?
That depends on how much your applications re-use cached data.
> Is this a bad thing to do or no big deal?
For Lustre, no big deal. Depends again on how much cached data affects
your application performance.
> Some filesystems are more heavily used than others, should I give them
> more memory?
Seems reasonable.
> Some filesystems have large files that I'm sure end up sitting up in
> memory, should I give them more memory?
Depends if your application re-uses files or not.
> I know the ldlm lru_size can be used to flush cache, but I don't
> think that's a wise thing to do, people might lose...locks (true/false?)
Clearing all of the locks will in turn flush all of your caches, so it
is only a short-term fix unless you put a hard limit on the number of
locks for each filesystem. Getting that right is hard.
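Both approaches look roughly like the following (check the exact lru_size
semantics against your Lustre version's manual before relying on them):

```shell
# One-shot flush: drop all locks on every namespace, which in turn
# drops the cached data protected by those locks
lctl set_param ldlm.namespaces.*.lru_size=clear

# Hard limit: cap each namespace at a fixed number of locks
# (setting 0 restores the default dynamic LRU sizing)
lctl set_param ldlm.namespaces.*.lru_size=400
```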
> on files they're downloading or something, right? Is there another cache
> tunable where I can flush cached things that are two hours or more old,
> but leave the newer stuff (a max_cache_time parameter perhaps)?
Yes, there is the ldlm.namespaces.*.lru_max_age parameter you could tune.
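For a two-hour maximum lock age, something like the following should
work; note that the units of lru_max_age have differed between Lustre
releases (older ones take milliseconds), so inspect the current value
first rather than assuming:

```shell
# Inspect the current setting and infer the units from its magnitude
lctl get_param ldlm.namespaces.*.lru_max_age

# Two hours, if the parameter is in milliseconds on your release
lctl set_param ldlm.namespaces.*.lru_max_age=7200000
```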
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.