[lustre-discuss] Lustre client memory and MemoryAvailable

Jacek Tomaka jacekt at dug.com
Sun Apr 28 22:06:28 PDT 2019


Wow, thanks Nathan and NeilBrown.
It is great to learn about slub merging. It is awesome to have a
reproducer.
I have yet to trigger my original problem with slub_nomerge, but the
slabinfo tool (in the kernel sources) can actually show merged caches:
kernel/3.10.0-693.5.2.el7/tools/slabinfo  -a

:t-0000112   <- sysfs_dir_cache kernfs_node_cache blkdev_integrity
task_delay_info
:t-0000144   <- flow_cache cl_env_kmem
:t-0000160   <- sigqueue lov_object_kmem
:t-0000168   <- lovsub_object_kmem osc_extent_kmem
:t-0000176   <- vvp_object_kmem nfsd4_stateids
:t-0000192   <- ldlm_resources kiocb cred_jar inet_peer_cache key_jar
file_lock_cache kmalloc-192 dmaengine-unmap-16 bio_integrity_payload
:t-0000216   <- vvp_session_kmem vm_area_struct
:t-0000256   <- biovec-16 ip_dst_cache bio-0 ll_file_data kmalloc-256
sgpool-8 filp request_sock_TCP rpc_tasks request_sock_TCPv6
skbuff_head_cache pool_workqueue lov_thread_kmem
:t-0000264   <- osc_lock_kmem numa_policy
:t-0000328   <- osc_session_kmem taskstats
:t-0000576   <- kioctx xfrm_dst_cache vvp_thread_kmem
:t-0001152   <- signal_cache lustre_inode_cache
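
(If I read the SLUB sysfs layout correctly, the same aliasing can also be seen
without building the slabinfo tool: each merged cache name under
/sys/kernel/slab should just be a symlink to the shared ":t-..." cache, so
something like

ls -l /sys/kernel/slab/signal_cache /sys/kernel/slab/lustre_inode_cache

should show both pointing at the same :t-0001152 entry. I have not
double-checked this on 3.10, so treat it as a sketch.)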

This is not from a machine that had the problem I described before, but the
kernel version is the same, so I am assuming the cache merges are the same.

Looks like signal_cache is merged with lustre_inode_cache, so the Lustre
inode allocations end up accounted under signal_cache.
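
One way to double-check that (untested on my side, and it throws away useful
caches, so only on a quiet node): lustre_inode_cache objects are inodes and
should be reclaimable, while real signal_cache entries (one per process) are
not. So something like

grep signal_cache /proc/slabinfo
sync; echo 2 > /proc/sys/vm/drop_caches
grep signal_cache /proc/slabinfo

should shrink the merged cache dramatically if the growth really is Lustre
inodes, and barely move it if it is genuine signal_cache.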
Regards.
Jacek Tomaka


On Thu, Apr 25, 2019 at 7:42 AM NeilBrown <neilb at suse.com> wrote:

>
> Hi,
>  you seem to be able to reproduce this fairly easily.
>  If so, could you please boot with the "slub_nomerge" kernel parameter
>  and then reproduce the (apparent) memory leak.
>  I'm hoping that this will show some other slab that is actually using
>  the memory - a slab with very similar object-size to signal_cache that
>  is, by default, being merged with signal_cache.
>
> Thanks,
> NeilBrown
>
>
> On Wed, Apr 24 2019, Nathan Dauchy - NOAA Affiliate wrote:
>
> > On Mon, Apr 15, 2019 at 9:18 PM Jacek Tomaka <jacekt at dug.com> wrote:
> >
> >>
> >> >signal_cache should have one entry for each process (or thread-group).
> >>
> >> That is what I thought as well; looking at the kernel source,
> >> allocations from signal_cache happen only during fork.
> >>
> >>
> > I was recently chasing an issue with clients suffering from low memory
> > and saw that "signal_cache" was a major player.  But the workload on
> > those clients was not doing a lot of forking (and I don't *think*
> > threading either).  Rather it was a LOT of metadata read operations.
> >
> > You can see the symptoms by a simple "du" on a Lustre file system:
> >
> > # grep signal_cache /proc/slabinfo
> > signal_cache         967   1092   1152   28    8 : tunables    0    0    0 : slabdata     39     39      0
> >
> > # du -s /mnt/lfs1/projects/foo
> > 339744908 /mnt/lfs1/projects/foo
> >
> > # grep signal_cache /proc/slabinfo
> > signal_cache      164724 164724   1152   28    8 : tunables    0    0    0 : slabdata   5883   5883      0
> >
> > # slabtop -s c -o | head -n 20
> >  Active / Total Objects (% used)    : 3660791 / 3662863 (99.9%)
> >  Active / Total Slabs (% used)      : 93019 / 93019 (100.0%)
> >  Active / Total Caches (% used)     : 72 / 107 (67.3%)
> >  Active / Total Size (% used)       : 836474.91K / 837502.16K (99.9%)
> >  Minimum / Average / Maximum Object : 0.01K / 0.23K / 12.75K
> >
> >   OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > 164724 164724 100%    1.12K   5883       28    188256K signal_cache
> > 331712 331712 100%    0.50K  10366       32    165856K ldlm_locks
> > 656896 656896 100%    0.12K  20528       32     82112K kmalloc-128
> > 340200 339971  99%    0.19K   8100       42     64800K kmalloc-192
> > 162838 162838 100%    0.30K   6263       26     50104K osc_object_kmem
> > 744192 744192 100%    0.06K  11628       64     46512K kmalloc-64
> > 205128 205128 100%    0.19K   4884       42     39072K dentry
> >   4268   4256  99%    8.00K   1067        4     34144K kmalloc-8192
> > 162978 162978 100%    0.17K   3543       46     28344K vvp_object_kmem
> > 162792 162792 100%    0.16K   6783       24     27132K kvm_mmu_page_header
> > 162825 162825 100%    0.16K   6513       25     26052K sigqueue
> >  16368  16368 100%    1.02K    528       31     16896K nfs_inode_cache
> >  20385  20385 100%    0.58K    755       27     12080K inode_cache
> >
> > Repeat that for more (and bigger) directories and the slab cache added up
> > to more than half the memory on this 24GB node.
> >
> > This is with CentOS-7.6 and lustre-2.10.5_ddn6.
> >
> > I worked around the problem by tackling the "ldlm_locks" memory usage
> > with:
> > # lctl set_param ldlm.namespaces.lfs*.lru_max_age=10000
> >
> > ...but I did not find a way to reduce the "signal_cache".
> >
> > Regards,
> > Nathan
>
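
P.S. For the record, I added the parameter on these CentOS 7 nodes with
grubby (the exact mechanism will of course depend on the boot setup):

grubby --update-kernel=ALL --args=slub_nomerge

followed by a reboot. And Nathan, in case it helps as a stop-gap next to
lru_max_age: the client lock LRU can also be flushed on demand with

lctl set_param ldlm.namespaces.*.lru_size=clear

though I have not checked how that behaves on 2.10.5_ddn6 specifically.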


-- 
*Jacek Tomaka*
Geophysical Software Developer

*DownUnder GeoSolutions*
76 Kings Park Road
West Perth 6005 WA, Australia
*tel* +61 8 9287 4143
jacekt at dug.com
*www.dug.com*