[lustre-discuss] Lustre client memory and MemoryAvailable
Nathan Dauchy - NOAA Affiliate
nathan.dauchy at noaa.gov
Wed Apr 24 14:56:07 PDT 2019
On Mon, Apr 15, 2019 at 9:18 PM Jacek Tomaka <jacekt at dug.com> wrote:
>
> > signal_cache should have one entry for each process (or thread-group).
>
> That is what I thought as well; looking at the kernel source, allocations
> from signal_cache happen only during fork.
>
I was recently chasing an issue with clients suffering from low memory, and
saw that "signal_cache" was a major consumer. But the workload on those
clients was not doing a lot of forking (and I don't *think* much threading
either); rather, it was doing a LOT of metadata read operations.
You can see the symptoms with a simple "du" on a Lustre file system:
# grep signal_cache /proc/slabinfo
signal_cache 967 1092 1152 28 8 : tunables 0 0 0 : slabdata 39 39 0
# du -s /mnt/lfs1/projects/foo
339744908 /mnt/lfs1/projects/foo
# grep signal_cache /proc/slabinfo
signal_cache 164724 164724 1152 28 8 : tunables 0 0 0 : slabdata 5883 5883 0
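That one du grew signal_cache from 967 to 164724 objects; at 1152 bytes per
object that is 164724 * 1152 bytes ≈ 181 MB, consistent with the 188256K
cache size that slabtop shows: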
# slabtop -s c -o | head -n 20
Active / Total Objects (% used) : 3660791 / 3662863 (99.9%)
Active / Total Slabs (% used) : 93019 / 93019 (100.0%)
Active / Total Caches (% used) : 72 / 107 (67.3%)
Active / Total Size (% used) : 836474.91K / 837502.16K (99.9%)
Minimum / Average / Maximum Object : 0.01K / 0.23K / 12.75K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
164724 164724 100% 1.12K 5883 28 188256K signal_cache
331712 331712 100% 0.50K 10366 32 165856K ldlm_locks
656896 656896 100% 0.12K 20528 32 82112K kmalloc-128
340200 339971 99% 0.19K 8100 42 64800K kmalloc-192
162838 162838 100% 0.30K 6263 26 50104K osc_object_kmem
744192 744192 100% 0.06K 11628 64 46512K kmalloc-64
205128 205128 100% 0.19K 4884 42 39072K dentry
4268 4256 99% 8.00K 1067 4 34144K kmalloc-8192
162978 162978 100% 0.17K 3543 46 28344K vvp_object_kmem
162792 162792 100% 0.16K 6783 24 27132K kvm_mmu_page_header
162825 162825 100% 0.16K 6513 25 26052K sigqueue
16368 16368 100% 1.02K 528 31 16896K nfs_inode_cache
20385 20385 100% 0.58K 755 27 12080K inode_cache
Repeating that for more (and bigger) directories, the slab caches added up
to more than half the memory on this 24GB node.
This is with CentOS-7.6 and lustre-2.10.5_ddn6.
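For anyone who wants to watch the growth as it happens, a trivial shell loop
works (run the du from another shell; the 5-second interval is arbitrary):
# while :; do grep -E 'signal_cache|ldlm_locks' /proc/slabinfo; sleep 5; done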
I worked around the problem by tackling the "ldlm_locks" memory usage with:
# lctl set_param ldlm.namespaces.lfs*.lru_max_age=10000
...but I did not find a way to reduce the "signal_cache".
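A couple of related knobs, for the archives (neither is specific to this
problem, and I am not claiming they fix it): the client lock LRU can also be
flushed immediately instead of waiting for the shorter lru_max_age to kick in:
# lctl set_param ldlm.namespaces.lfs*.lru_size=clear
And "echo 2 > /proc/sys/vm/drop_caches" will reclaim dentries and inodes,
but signal_cache is not a reclaimable cache, so I would not expect it to
shrink that.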
Regards,
Nathan