[Lustre-discuss] MDS memory usage

Frederik Ferner frederik.ferner at diamond.ac.uk
Tue Aug 24 09:35:30 PDT 2010


Hi List,

On our MDS we noticed that nearly all memory appears to be in use, and
as far as I can tell it is not just the normal buffers/cache.

When we put load on the machine, kswapd starts using a lot of CPU time,
sometimes up to 100% of one CPU core. Examples of such load are starting
rsync on a few clients (which generates file lists to copy data from
Lustre to local disks) or just running an MDT backup locally, using
dd/gzip to copy an LVM snapshot to a remote server.

This is a Lustre 1.6.7.2.ddn3.5 based file system of about 200TB; the
MDT is 800GB with 200M inodes and ACLs enabled.

Most of the memory seems to be used by the kernel, and according to
slabtop much of it is in the ldlm_locks and ldlm_resources slab caches.
Some details are below, but our main question is whether or not this is
normal and expected.
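
As a rough check on the "mostly used by the kernel" impression, here is
a minimal Python sketch that computes the slab share of total memory.
The sample values are copied from the /proc/meminfo output in this post;
the function names parse_meminfo and slab_share are made up for
illustration and are not part of any Lustre or kernel tool.

```python
# Values copied from the /proc/meminfo output in this post (all in kB).
MEMINFO_SAMPLE = """\
MemTotal:     16497436 kB
MemFree:        352004 kB
Buffers:        256084 kB
Cached:          17688 kB
Slab:         15248816 kB
"""


def parse_meminfo(text):
    """Return a dict mapping each /proc/meminfo field name to its number."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def slab_share(fields):
    """Fraction of MemTotal currently held in kernel slab caches."""
    return fields["Slab"] / fields["MemTotal"]


info = parse_meminfo(MEMINFO_SAMPLE)
page_cache_kb = info["Buffers"] + info["Cached"]
print(f"Slab:           {slab_share(info):.1%} of MemTotal")  # 92.4%
print(f"Buffers+Cached: {page_cache_kb / info['MemTotal']:.1%} of MemTotal")  # 1.7%
```

On a live machine the same function could be fed the contents of
/proc/meminfo instead of the pasted sample.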

Is there a tunable to make Lustre use less slab memory than it
currently does?

Would adding more memory to this machine solve the problem that there
does not seem to be enough memory left to run normal processes, or
would it just delay the problem?

Kind regards,
Frederik

Memory details:
<snip>
[root@cs04r-sc-mds01-01 proc]# free
              total       used       free     shared    buffers     cached
Mem:      16497436   16146416     351020          0     257624      17836
-/+ buffers/cache:   15870956     626480
Swap:      2031608     322768    1708840
[root@cs04r-sc-mds01-01 proc]# cat /proc/meminfo
MemTotal:     16497436 kB
MemFree:        352004 kB
Buffers:        256084 kB
Cached:          17688 kB
SwapCached:     149544 kB
Active:         200764 kB
Inactive:       255344 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     16497436 kB
LowFree:        352004 kB
SwapTotal:     2031608 kB
SwapFree:      1708840 kB
Dirty:             268 kB
Writeback:           0 kB
AnonPages:      182272 kB
Mapped:          17528 kB
Slab:         15248816 kB
PageTables:       6984 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  10280324 kB
Committed_AS:  1321284 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    330740 kB
VmallocChunk: 34359394255 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
[root@cs04r-sc-mds01-01 proc]# slabtop --once | head -15
  Active / Total Objects (% used)    : 30350433 / 38705406 (78.4%)
  Active / Total Slabs (% used)      : 3801362 / 3801369 (100.0%)
  Active / Total Caches (% used)     : 114 / 168 (67.9%)
  Active / Total Size (% used)       : 12325021.07K / 14610074.85K (84.4%)
  Minimum / Average / Maximum Object : 0.02K / 0.38K / 128.00K

   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
15657800 14362022  91%    0.50K 1957225        8   7828900K ldlm_locks
10165900 9719990  95%    0.38K 1016590       10   4066360K ldlm_resources
3650979 1038530  28%    0.06K  61881       59    247524K size-64
3646620 3159662  86%    0.12K 121554       30    486216K size-128
3099906 863841  27%    0.21K 172217       18    688868K dentry_cache
1679436 859267  51%    0.83K 419859        4   1679436K ldiskfs_inode_cache
460725 133164  28%    0.25K  30715       15    122860K size-256
122440  65022  53%    0.09K   3061       40     12244K buffer_head
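
To put a number on how much of the slab is lock related, the slabtop
rows above can be summed with a short Python sketch. The rows and the
total slab size are copied from the output in this post; the function
name parse_slabtop_rows and the 8-column layout check are assumptions
made for this illustration, not an official slabtop parser.

```python
# Rows copied from the slabtop output in this post.
SLABTOP_SAMPLE = """\
15657800 14362022  91%    0.50K 1957225        8   7828900K ldlm_locks
10165900 9719990  95%    0.38K 1016590       10   4066360K ldlm_resources
3650979 1038530  28%    0.06K  61881       59    247524K size-64
3646620 3159662  86%    0.12K 121554       30    486216K size-128
3099906 863841  27%    0.21K 172217       18    688868K dentry_cache
1679436 859267  51%    0.83K 419859        4   1679436K ldiskfs_inode_cache
460725 133164  28%    0.25K  30715       15    122860K size-256
122440  65022  53%    0.09K   3061       40     12244K buffer_head
"""

# Total from the "Active / Total Size" header line, in K.
TOTAL_SLAB_KB = 14610074.85


def parse_slabtop_rows(text):
    """Map cache name to CACHE SIZE in K, assuming 8 whitespace-separated columns."""
    sizes = {}
    for line in text.splitlines():
        cols = line.split()
        # Skip headers and anything that does not look like a data row.
        if len(cols) != 8 or not cols[6].endswith("K"):
            continue
        sizes[cols[7]] = int(cols[6][:-1])
    return sizes


sizes = parse_slabtop_rows(SLABTOP_SAMPLE)
ldlm_kb = sum(kb for name, kb in sizes.items() if name.startswith("ldlm_"))
print(f"ldlm caches: {ldlm_kb}K, {ldlm_kb / TOTAL_SLAB_KB:.1%} of total slab size")
```

With the figures above, the two ldlm caches add up to 11895260K, which
is a little over 81% of the total slab size that slabtop reports.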


-- 
Frederik Ferner
Computer Systems Administrator		phone: +44 1235 77 8624
Diamond Light Source Ltd.		mob:   +44 7917 08 5110



