[Lustre-discuss] Lustre client memory usage very high
Andreas Dilger
adilger at sun.com
Wed Jul 29 15:46:27 PDT 2009
On Jul 22, 2009 11:45 +0200, Guillaume Demillecamps wrote:
> Lustre 1.8.0 on all servers / clients involved in this. OS is SLES 10
> SP2 with an un-patched kernel on the clients. I have, however, put the
> same kernel revision downloaded from suse.com on the clients as the
> version used in the Lustre-patched MGS/MDS/OSS servers. The file
> system is only several GBs, with ~500000 files. All inter-connections
> are through TCP.
>
> We have some "manual" replication of an active Lustre file system to a
> passive Lustre file system. We have "sync" clients that basically just
> mount both file systems and run large sync jobs from the active
> Lustre to the passive Lustre. So far, so good (apart from it being
> quite a slow process). However, my issue is that Lustre's memory usage
> climbs so high that rsync cannot get enough RAM to finish its job
> before kswapd kicks in and slows things down drastically.
> Up to now, I have succeeded in fine-tuning things using the following
> steps in my rsync script:
> ########
> umount /opt/lustre_a
> umount /opt/lustre_z
> mount /opt/lustre_a
> mount /opt/lustre_z
> for i in `ls /proc/fs/lustre/osc/*/max_dirty_mb`; do echo 4 > $i ; done
> for i in `ls /proc/fs/lustre/ldlm/namespaces/*/lru_max_age`; do echo 30 > $i ; done
> for i in `ls /proc/fs/lustre/llite/*/max_cached_mb`; do echo 64 > $i ; done
> echo 64 > /proc/sys/lustre/max_dirty_mb
Note that you can do these more easily with
lctl set_param osc.*.max_dirty_mb=4
lctl set_param ldlm.namespaces.*.lru_max_age=30
lctl set_param llite.*.max_cached_mb=64
lctl set_param max_dirty_mb=64
> lctl set_param ldlm.namespaces.*osc*.lru_size=100
> sysctl -w lnet.debug=0
This can also be "lctl set_param debug=0".
> What I still don't understand is that even when putting a max limit of
> a few MB on the read cache (max_cached_mb / max_dirty_mb) and putting
> the write cache (lru_max_age, is that correct?) to a very limited
> number, memory usage still sky-rockets to several GBs in
> /proc/sys/lustre/memused?
Can you please check /proc/slabinfo to see what kind of memory is being
allocated the most?  The max_cached_mb/max_dirty_mb settings are only
limits on the cached/dirty data pages, not on metadata structures.
Also, in 30s I expect a LOT of inodes can be traversed, so that might
be your problem, and even then lock cancellation does not necessarily
force the kernel dentries/inodes out of memory.
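As a sketch of that check, something like the following ranks slab caches
by approximate memory footprint (the field positions assume the usual
"slabinfo - version: 2.x" layout, and `top_slabs` is just an illustrative
name; /proc/slabinfo is typically root-readable only):

```shell
# Rank slab caches by approximate memory footprint (num_objs * objsize).
# Column layout assumed: name active_objs num_objs objsize ...
top_slabs() {
    awk 'NR > 2 { kib = $3 * $4 / 1024
                  printf "%-28s %10d objs %7d B/obj %10.0f KiB\n", $1, $3, $4, kib }' \
        "${1:-/proc/slabinfo}" | sort -k6 -rn | head -10
}
```

Running "top_slabs" as root then shows which caches (e.g. Lustre inode or
ldlm lock slabs) dominate the memused number.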
Getting total lock counts would also help:
lctl get_param ldlm.namespaces.*.resource_count
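For a single total rather than per-namespace numbers, the counts can be
summed, e.g. over lock_count (a sketch; lock_count sits next to
resource_count in the same namespaces directory):

```shell
# Total LDLM locks currently held by this client, summed across all
# namespaces; prints 0 if lctl is unavailable or no locks are held.
lctl get_param -n 'ldlm.namespaces.*.lock_count' 2>/dev/null \
    | awk '{ sum += $1 } END { print sum + 0 }'
```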
You might be able to tweak some of the "normal" (not Lustre-specific)
/proc parameters to flush the inodes from cache more quickly, or
increase the rate at which kswapd flushes unused inodes.
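Concretely, on a stock SLES 10 (2.6.16+) kernel the generic VM knobs
might look like this; the values are illustrative assumptions, not
tested recommendations, and both commands need root:

```shell
# Bias reclaim toward dentry/inode caches (default is 100; higher means
# the VM reclaims them more aggressively relative to pagecache).
sysctl -w vm.vfs_cache_pressure=200
# Between rsync jobs, clean reclaimable slab objects (dentries and
# inodes) can also be dropped outright:
sync
echo 2 > /proc/sys/vm/drop_caches
```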
> And as soon as I un-mount the disks, it drops. The memused number,
> however, will not decrease even if the client remains idle for several
> days with no I/O from/to any Lustre file system. Note that splitting
> the rsync job into more, smaller jobs does not help.
There is a test program called "memhog" that could force memory to be
flushed between jobs, but that is a sub-standard solution.
> Unless I start un-mounting and re-mounting the Lustre file systems
> between each job (which is nevertheless what I may have to plan if no
> other parameter helps me)!
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.