[lustre-discuss] [EXTERNAL] No port 988?

Mohr, Rick mohrrf at ornl.gov
Tue Sep 26 10:13:55 PDT 2023


What error do you get when you run "modprobe lnet"?

--Rick

On 9/26/23, 12:29 PM, "lustre-discuss on behalf of Jan Andersen" <lustre-discuss-bounces at lists.lustre.org on behalf of jan at comind.io> wrote:


I have come a bit further with this problem - it seems the lnet module 
can't load:


[root@rocky8 lustre-release]# depmod lnet
depmod: ERROR: Bad version passed lnet
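
A note on the command above: depmod takes an optional kernel version as its positional argument, not a module name, so "depmod lnet" is read as a request to index a kernel version called "lnet". That is consistent with the "Bad version passed" message, so the error by itself may not say anything about whether lnet can load. A minimal sketch of a check that looks at the module instead, assuming the kmod tools (modinfo, modprobe) are installed; module_available is a hypothetical helper name used only here:

```shell
#!/bin/sh
# Sketch only: module_available is a hypothetical helper, not a Lustre or kmod tool.
# It succeeds if modinfo can find the named module for the running kernel.
module_available() {
    modinfo "$1" >/dev/null 2>&1
}

# To rebuild the module index for the running kernel (as root), one would
# run plain "depmod -a"; it is left out here because it writes to /lib/modules.

if module_available lnet; then
    # Dry run: -n makes modprobe show what it would load without loading it.
    modprobe -n -v lnet || echo "modprobe dry run for lnet failed"
else
    echo "lnet not found under /lib/modules/$(uname -r)"
fi
```

If the dry run looks sane, running "modprobe -v lnet" as root and then checking dmesg should give the actual load error, if there is one.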


I deleted the VMs and reinstalled Rocky 8.8, then built Lustre 2.15.3 
and installed it, all without any error messages. I haven't been able 
to find out what this message means through Google, but I assume it 
means that the kernel source doesn't match the running kernel? If so, 
how closely must they match? This is my running kernel:


[root@rocky8 lustre]# uname -r
4.18.0-477.10.1.el8_8.x86_64


And this is the kernel source:


[root@rocky8 lustre]# ll /usr/src/kernels
total 4
drwxr-xr-x. 23 root root 4096 Sep 26 12:34 4.18.0-477.27.1.el8_8.x86_64/


IOW, they diverge just after '477.' - is that the problem?
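
To make the comparison above repeatable: modules are normally installed under /lib/modules/<release> for the release they were built against, so a first check is whether a source tree exists for exactly the release that "uname -r" reports. The sketch below does only that. have_kernel_source is a hypothetical helper name, /usr/src/kernels is the path from the listing above, and a mismatch is a hint, not proof, that it explains the lnet problem.

```shell
#!/bin/sh
# Sketch only: have_kernel_source is a hypothetical helper, not part of Lustre.
# Usage: have_kernel_source RELEASE [SRCDIR]
# Succeeds if SRCDIR contains a directory named exactly RELEASE.
have_kernel_source() {
    release=$1                       # e.g. the output of `uname -r`
    srcdir=${2:-/usr/src/kernels}    # default taken from the listing above
    if [ -d "$srcdir/$release" ]; then
        echo "match: $srcdir/$release"
    else
        echo "no source tree for $release in $srcdir"
        return 1
    fi
}

# Check the running kernel; '|| true' keeps a mismatch from aborting a script.
have_kernel_source "$(uname -r)" || true
```

With the values shown above, the running release 4.18.0-477.10.1.el8_8.x86_64 has no directory of that name, since the only tree present is 4.18.0-477.27.1.el8_8.x86_64.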


/jan


Hi,


I've built and installed Lustre on two VirtualBox VMs running Rocky 8.8 
and formatted one as the MGS/MDS and the other as the OSS, following a 
presentation from Oak Ridge National Laboratory: "Creating a Lustre Test 
System from Source with Virtual Machines" (sorry, no link; I downloaded 
it a while ago).


I can mount the filesystems on the MDS, but when I try from the OSS, it 
just times out - from dmesg:


[root@oss1 log]# dmesg | grep -i lustre
[ 564.028680] Lustre: Lustre: Build Version: 2.15.58_42_ga54a206
[ 625.567672] LustreError: 15f-b: lustre-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?
[ 625.567767] LustreError: 1789:0:(tgt_mount.c:2216:server_fill_super()) Unable to start targets: -110
[ 625.567851] LustreError: 1789:0:(tgt_mount.c:1752:server_put_super()) no obd lustre-OST0000
[ 625.567894] LustreError: 1789:0:(tgt_mount.c:132:server_deregister_mount()) lustre-OST0000 not registered
[ 625.588244] Lustre: server umount lustre-OST0000 complete
[ 625.588251] LustreError: 1789:0:(tgt_mount.c:2365:lustre_tgt_fill_super()) Unable to mount (-110)
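
For reading the log above: the "rc = -110" values are negated Linux errno numbers, and 110 is ETIMEDOUT, which fits the mount timing out and not being actively refused. A small sketch that pulls the number out of such a line and names it; rc_from_line and errno_name are hypothetical helper names, and the table covers only a handful of Linux errno values, not the full list.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names, not Lustre tools.

# rc_from_line: print the positive errno from a "rc = -N" log line
# (prints nothing if the line has no such field).
rc_from_line() {
    printf '%s\n' "$1" | sed -n 's/.*rc = -\([0-9][0-9]*\).*/\1/p'
}

# errno_name: map a few Linux errno numbers to their symbolic names.
errno_name() {
    case "$1" in
        2)   echo ENOENT ;;
        19)  echo ENODEV ;;
        107) echo ENOTCONN ;;
        110) echo ETIMEDOUT ;;
        111) echo ECONNREFUSED ;;
        113) echo EHOSTUNREACH ;;
        *)   echo "errno $1 (not in this table)" ;;
    esac
}

line='LustreError: 15f-b: lustre-OST0000: cannot register this server with the MGS: rc = -110. Is the MGS running?'
errno_name "$(rc_from_line "$line")"    # prints ETIMEDOUT
```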


Both 'nmap' and 'netstat -nap' show that there is nothing listening on 
port 988:


[root@mds ~]# netstat -nap | grep -i listen
tcp        0      0 0.0.0.0:111     0.0.0.0:*       LISTEN      1/systemd
tcp        0      0 0.0.0.0:22      0.0.0.0:*       LISTEN      806/sshd
tcp6       0      0 :::111          :::*            LISTEN      1/systemd
tcp6       0      0 :::22           :::*            LISTEN      806/sshd


What should be listening on 988?
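
For context on the question above: 988 is the default TCP port of the LNet acceptor, and that listener is a kernel socket, so netstat shows it without an owning process once the lnet module is loaded and a TCP network is configured. A sketch of a scripted version of the check; has_listener is a hypothetical helper name, and it assumes "ss -ltn" or "netstat -ltn" style lines on standard input.

```shell
#!/bin/sh
# Sketch only: has_listener is a hypothetical helper, not a Lustre tool.
# Usage: <listing> | has_listener PORT
# Succeeds if any whitespace-separated field on stdin ends in ":PORT".
has_listener() {
    awk -v port="$1" '
        { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
        END { exit (found ? 0 : 1) }'
}

# Check the local machine for a listener on the LNet default port.
if ss -ltn 2>/dev/null | has_listener 988; then
    echo "something is listening on port 988"
else
    echo "nothing is listening on port 988"
fi
```

Fed the netstat listing above, the helper reports a listener for 22 and 111 but none for 988.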


/jan


_______________________________________________
lustre-discuss mailing list
lustre-discuss at lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org




