[lustre-discuss] SLUB: Unable to allocate memory on node -1
Julien Rey
julien.rey at univ-paris-diderot.fr
Fri Oct 29 06:05:11 PDT 2021
Hello,
This may not be directly related to Lustre, but here's what I get when I
try to mount our Lustre filesystem on one of our compute nodes, running
CentOS 7:
Oct 29 14:30:20 gpu-node8 kernel: SLUB: Unable to allocate memory on
node -1 (gfp=0x8050)
Oct 29 14:30:20 gpu-node8 kernel: cache: dm_rq_target_io, object size:
136, buffer size: 136, default order: 0, min order: 0
Oct 29 14:30:20 gpu-node8 kernel: node 1: slabs: 2, objs: 60, free: 0
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3097:0:(niobuf.c:994:ptlrpc_register_rqbd()) LNetMDAttach failed: -12;
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3097:0:(service.c:2551:ptlrpc_main()) Failed to post rqbd for ldlm_cbd
on CPT 0: -1
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(service.c:2917:ptlrpc_start_threads()) cannot start ldlm_cb
thread #0_0: rc -1
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(service.c:837:ptlrpc_register_service()) Failed to start threads
for service ldlm_cbd: -1
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(ldlm_lockd.c:3077:ldlm_setup()) failed to start service
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(ldlm_lib.c:462:client_obd_setup()) ldlm_get_ref failed: -1
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(obd_config.c:559:class_setup()) setup MGC10.0.1.70 at tcp failed (-1)
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(obd_mount.c:202:lustre_start_simple()) MGC10.0.1.70 at tcp setup
error -1
Oct 29 14:30:20 gpu-node8 kernel: LustreError:
3091:0:(obd_mount.c:1608:lustre_fill_super()) Unable to mount (-1)
I've been scratching my head on this one. It could just be a kernel bug,
but we have 3 other identical servers running the exact same versions of
CentOS 7 and the Lustre client, and I have no problems with them.
Some more info:
[root at gpu-node8 ~]# uname -r
3.10.0-1160.el7.x86_64
[root at gpu-node8 ~]# lctl --version
lctl 2.12.7
[root at gpu-node8 ~]# vmstat -m |grep dm_rq_target_io
dm_rq_target_io 60 60 136 30
[root at gpu-node8 ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:            31G        1.4G         29G         10M        117M         29G
Swap:           15G          0B         15G
I've tried tweaking some sysctl parameters, though I don't really know
what I'm doing, and it made no difference anyway:
sysctl vm.overcommit_memory=1
sysctl vm.min_free_kbytes=90112
sysctl vm.overcommit_kbytes=90112
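Since `free` shows plenty of memory, a "SLUB: Unable to allocate memory on node -1" failure can instead point at per-node exhaustion or fragmentation. These diagnostic commands are a sketch of things worth comparing between the failing node and one of the healthy ones (they are my suggestion, not commands from the original thread; `numactl` may not be installed):

```shell
# Per-zone free-page counts by allocation order; mostly-zero higher-order
# columns mean larger contiguous allocations can fail despite free RAM.
cat /proc/buddyinfo

# Overall slab usage; an unusually large Slab/SUnreclaim value can starve
# other kernel allocations.
grep -E 'Slab|SReclaimable|SUnreclaim' /proc/meminfo

# NUMA layout, if numactl is available; "node -1" in the SLUB message
# means the allocation was not pinned to a specific node.
numactl --hardware 2>/dev/null || echo "numactl not installed"
```

If the failing node looks more fragmented than its siblings, dropping caches (`echo 3 > /proc/sys/vm/drop_caches`) before retrying the mount is a blunt but common way to test that theory.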
Any help would be greatly appreciated.
Thanks!
--
Julien Rey
Plate-forme RPBS
Unité BFA - CMPLI
Université de Paris
tel: 01 57 27 83 95