[Lustre-discuss] MDS memory usage
Frederik Ferner
frederik.ferner at diamond.ac.uk
Tue Aug 24 09:35:30 PDT 2010
Hi List,
on our MDS we have noticed that all of the memory seems to be in use,
and as far as I can tell it is not just normal buffers/cache.
When we put load on the machine, kswapd starts using a lot of CPU
time, sometimes up to 100% of one CPU core. Examples of such load are
starting rsync on a few clients, which generates file lists to copy
data from Lustre to local disks, or just running an MDT backup locally,
using dd/gzip to copy an LVM snapshot to a remote server.
This is on a Lustre 1.6.7.2.ddn3.5 based file system with about 200TB,
the MDT is 800GB with 200M inodes, ACLs enabled.
Most of the memory seems to be used by the kernel, and according to
slabtop quite a lot of it is in the ldlm_locks and ldlm_resources slab
caches. Some details are below, but our main question now is whether
this is normal and expected.
Is there a tunable to restrict Lustre to a bit less slab memory than
it currently uses?
Will adding more memory to this machine solve the problem of there
seemingly not being enough memory to run normal processes, or will it
just delay the problem?
Kind regards,
Frederik
Memory details:
<snip>
[root@cs04r-sc-mds01-01 proc]# free
             total       used       free     shared    buffers     cached
Mem:      16497436   16146416     351020          0     257624      17836
-/+ buffers/cache:   15870956     626480
Swap:      2031608     322768    1708840
[root@cs04r-sc-mds01-01 proc]# cat /proc/meminfo
MemTotal: 16497436 kB
MemFree: 352004 kB
Buffers: 256084 kB
Cached: 17688 kB
SwapCached: 149544 kB
Active: 200764 kB
Inactive: 255344 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 16497436 kB
LowFree: 352004 kB
SwapTotal: 2031608 kB
SwapFree: 1708840 kB
Dirty: 268 kB
Writeback: 0 kB
AnonPages: 182272 kB
Mapped: 17528 kB
Slab: 15248816 kB
PageTables: 6984 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
CommitLimit: 10280324 kB
Committed_AS: 1321284 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 330740 kB
VmallocChunk: 34359394255 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
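
To put the /proc/meminfo figures above in proportion, here is a minimal
Python sketch that computes what share of MemTotal is slab and what
share is buffers plus page cache. The numbers are copied from the
output above; the helper names (parse_meminfo, percent_of_total) are
made up for this illustration and are not part of any Lustre or kernel
tooling.

```python
# Figures copied from the /proc/meminfo output above (all in kB).
# On a live system the same text could be read from /proc/meminfo.
MEMINFO_TEXT = """\
MemTotal: 16497436 kB
MemFree: 352004 kB
Buffers: 256084 kB
Cached: 17688 kB
Slab: 15248816 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each field name to its value in kB."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def percent_of_total(fields, *names):
    """Percentage of MemTotal taken by the sum of the named fields."""
    return 100.0 * sum(fields[n] for n in names) / fields["MemTotal"]


info = parse_meminfo(MEMINFO_TEXT)
slab_pct = percent_of_total(info, "Slab")
page_cache_pct = percent_of_total(info, "Buffers", "Cached")

print(f"Slab:             {slab_pct:.1f}% of MemTotal")
print(f"Buffers + Cached: {page_cache_pct:.1f}% of MemTotal")
```

With these figures slab accounts for roughly 92% of MemTotal, while
buffers and cache together are under 2%, which is consistent with the
observation that this is not ordinary buffer/cache usage.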
[root@cs04r-sc-mds01-01 proc]# slabtop --once |head -15
Active / Total Objects (% used) : 30350433 / 38705406 (78.4%)
Active / Total Slabs (% used) : 3801362 / 3801369 (100.0%)
Active / Total Caches (% used) : 114 / 168 (67.9%)
Active / Total Size (% used) : 12325021.07K / 14610074.85K (84.4%)
Minimum / Average / Maximum Object : 0.02K / 0.38K / 128.00K
    OBJS   ACTIVE  USE OBJ SIZE   SLABS OBJ/SLAB CACHE SIZE NAME
15657800 14362022  91%    0.50K 1957225        8   7828900K ldlm_locks
10165900  9719990  95%    0.38K 1016590       10   4066360K ldlm_resources
 3650979  1038530  28%    0.06K   61881       59    247524K size-64
 3646620  3159662  86%    0.12K  121554       30    486216K size-128
 3099906   863841  27%    0.21K  172217       18    688868K dentry_cache
 1679436   859267  51%    0.83K  419859        4   1679436K ldiskfs_inode_cache
  460725   133164  28%    0.25K   30715       15    122860K size-256
  122440    65022  53%    0.09K    3061       40     12244K buffer_head
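
As a rough cross-check of the slabtop listing above, the following
Python sketch adds up the two ldlm caches and relates them to the Slab
and MemTotal values reported by /proc/meminfo. All sizes are copied
from the output in this mail; the variable names are only illustrative.

```python
# CACHE SIZE column from the slabtop output above (values in K).
CACHES_KB = {
    "ldlm_locks": 7828900,
    "ldlm_resources": 4066360,
    "size-64": 247524,
    "size-128": 486216,
    "dentry_cache": 688868,
    "ldiskfs_inode_cache": 1679436,
    "size-256": 122860,
    "buffer_head": 12244,
}

# Slab and MemTotal as reported by /proc/meminfo (kB).
SLAB_TOTAL_KB = 15248816
MEM_TOTAL_KB = 16497436

# Sum every cache whose name starts with "ldlm_".
ldlm_kb = sum(size for name, size in CACHES_KB.items()
              if name.startswith("ldlm_"))

ldlm_pct_of_slab = 100.0 * ldlm_kb / SLAB_TOTAL_KB
ldlm_pct_of_mem = 100.0 * ldlm_kb / MEM_TOTAL_KB

print(f"ldlm caches: {ldlm_kb} K")
print(f"  {ldlm_pct_of_slab:.1f}% of Slab")
print(f"  {ldlm_pct_of_mem:.1f}% of MemTotal")
```

By this arithmetic the two ldlm caches hold about 11.9 million K,
roughly 78% of all slab memory and about 72% of the machine's RAM.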
--
Frederik Ferner
Computer Systems Administrator phone: +44 1235 77 8624
Diamond Light Source Ltd. mob: +44 7917 08 5110
(Apologies in advance for the lines below. Some bits are a legal
requirement and I have no control over them.)