[lustre-discuss] ZFS backed OSS out of memory
Carlson, Timothy S
Timothy.Carlson at pnnl.gov
Thu Jun 23 09:13:51 PDT 2016
Folks,
I've done my fair share of googling and run across some good information on ZFS-backed Lustre tuning, including this:
http://lustre.ornl.gov/ecosystem-2016/documents/tutorials/Stearman-LLNL-ZFS.pdf
and various discussions around how to limit (or not) the ARC and clear it if needed.
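For reference, the cap those threads describe is just the zfs_arc_max module parameter; a minimal sketch (32 GiB is an illustrative value only, not what these OSSes are running):

# /etc/modprobe.d/zfs.conf -- persistent cap, picked up at module load
options zfs zfs_arc_max=34359738368   # 32 GiB, example value only

# same knob at runtime on a live OSS
echo 34359738368 > /sys/module/zfs/parameters/zfs_arc_max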
That being said, here is my configuration.
RHEL 6
Kernel 2.6.32-504.3.3.el6.x86_64
ZFS 0.6.3
Lustre 2.5.3 with a couple of patches
Single OST per OSS with 4 x RAIDZ2 4TB SAS drives
Log and Cache on separate SSDs
These OSSes are beefy, with 128 GB of memory and dual E5-2630 v2 CPUs
About 30 OSSes in all, serving mostly a standard HPC cluster over FDR IB with a sprinkle of 10G
# more /etc/modprobe.d/lustre.conf
options lnet networks=o2ib9,tcp9(eth0)
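A quick way to sanity-check what lnet brings up with that line is lctl list_nids on the OSS and lctl ping from a client (the NID below is hypothetical):

lctl list_nids              # NIDs this OSS is serving on
lctl ping 10.9.0.1@o2ib9    # from a client; hypothetical NID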
ZFS backed MDS with same software stack.
The problem I am having is that the OOM killer is whacking away at system processes on a few of the OSSes.
"top" shows all my memory is in use with very little Cache or Buffer usage.
Tasks: 1429 total, 5 running, 1424 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 2.9%sy, 0.0%ni, 94.0%id, 3.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132270088k total, 131370888k used, 899200k free, 1828k buffers
Swap: 61407100k total, 7940k used, 61399160k free, 10488k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+  COMMAND
   47 root      RT   0     0    0    0 S 30.0  0.0  372:57.33 migration/11
I had done zero tuning so I am getting the default ARC size of 1/2 the memory.
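That default is easy to confirm from the ARC kstats (standard ZFS-on-Linux path; this is also where arcstat.py pulls its numbers from):

grep -E '^(size|c_max|c_min) ' /proc/spl/kstat/zfs/arcstats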
[root at lzfs18b ~]# arcstat.py 1
    time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
09:11:50     0     0      0     0    0     0    0     0    0    63G   63G
09:11:51  6.2K  2.6K     41   206    6  2.4K   71     0    0    63G   63G
09:11:52   21K  4.0K     18   305    2  3.7K   34    18    0    63G   63G
The question is: if I have 128 GB of RAM and the ARC is only taking 63 GB, where did the rest go, and how can I get it back so that the OOM killer stops killing me?
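For completeness, the places one would normally look for the rest of it on a ZoL 0.6.x box (a sketch; paths are the stock SPL/ZFS locations):

# kernel-side picture: is it slab rather than page cache?
grep -E 'MemFree|^Slab|SReclaimable|SUnreclaim' /proc/meminfo

# SPL keeps its own slab accounting for the ZFS caches
head -40 /proc/spl/kmem/slab

# biggest kernel slab consumers
slabtop -o -s c | head -20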
Thanks!
Tim