[Lustre-discuss] Problems adding new OSS to existing Lustre filesystem -- Refusing connection, No matching NI
Michael D. Seymour
seymour at cita.utoronto.ca
Fri Apr 24 08:53:35 PDT 2009
Hi,
We are having a problem adding a new OSS (roc06, 10.5.203.6) to an existing
Lustre file system (raid-cita) on the 10.5 network. SELinux and iptables are
disabled. The OSS is multi-homed, on the 10.4 and 10.5 networks.
Once the file system is mounted, clients try to connect to it via the 10.4
network, even though everything is set up to use the 10.5 network. The clients
do not see the new space on the file system either: df shows 23T as opposed to
the > 27T it should show.
lfs quota hangs as well.
We did suffer some problems with the MDS file system: it was fsck'ed, the
kernel was downgraded to the Lustre 1.6.6 one, and it was remounted.
Many messages like this exist in /var/log/messages on the new OSS:
Apr 24 10:01:07 roc06 kernel: LustreError: 120-3: Refusing connection from
10.4.1.52 for 10.4.203.6@tcp: No matching NI
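One way to summarize these messages is to pull out which NID each peer asked for and compare it with what `lctl list_nids` reports on the OSS. The following is only a minimal Python sketch, assuming the message format quoted above; the function name `refused_targets` and the regular expression are illustrative and are not part of Lustre.

```python
import re

# Matches the "Refusing connection" message quoted above. Mail archives
# often render "@" as " at ", so both spellings are accepted.
REFUSED = re.compile(
    r"Refusing connection from (?P<peer>[\d.]+) "
    r"for (?P<target>[\d.]+)(?:@| at )(?P<net>\w+): No matching NI"
)


def refused_targets(log_lines, local_nids):
    """Return {requested_nid: {peer_ip, ...}} for NIDs not configured locally.

    log_lines  -- iterable of syslog lines (one message per line)
    local_nids -- NIDs as printed by `lctl list_nids` on this node
    """
    local = {nid.replace(" at ", "@") for nid in local_nids}
    result = {}
    for line in log_lines:
        m = REFUSED.search(line)
        if not m:
            continue
        nid = f"{m['target']}@{m['net']}"
        if nid not in local:
            # The peer asked for a NID this node does not serve.
            result.setdefault(nid, set()).add(m["peer"])
    return result


if __name__ == "__main__":
    line = ("Apr 24 10:01:07 roc06 kernel: LustreError: 120-3: Refusing "
            "connection from 10.4.1.52 for 10.4.203.6@tcp: No matching NI")
    print(refused_targets([line], ["10.5.203.6@tcp"]))
```

With the values from this report, the result shows the client at 10.4.1.52 asking for a 10.4 NID that roc06 does not currently list.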
On the multi-homed client 10.4.1.52:
[root@tpb52-chroot ~]# uname -a; cat /etc/redhat-release
Linux tpb52 2.6.18-92.1.17.el5_lustre.1.6.7smp #1 SMP Mon Feb 9 19:56:55 MST 2009 x86_64 x86_64 x86_64 GNU/Linux
CentOS release 5 (Final)
[root@tpb52-chroot ~]# df -h /mnt/raid-cita/
Filesystem            Size  Used Avail Use% Mounted on
10.5.203.250@tcp:/roc
                       23T   11T   12T  47% /mnt/raid-cita
[root@tpb52-chroot ~]# lctl list_nids
10.5.2.12@tcp
[root@tpb52-chroot ~]# grep lnet /etc/modprobe.conf
options lnet networks=tcp0(eth1)
[root@tpb52-chroot ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:15:C5:EC:FA:8C
          inet addr:10.5.2.12  Bcast:10.5.255.255  Mask:255.255.0.0
On the OSS roc06:
[root@roc06 lustre]# uname -a; cat /etc/redhat-release
Linux roc06 2.6.18-92.1.17.el5_lustre.1.6.7.1smp #1 SMP Mon Apr 13 16:13:00 MDT 2009 x86_64 x86_64 x86_64 GNU/Linux
CentOS release 5.3 (Final)
[root@roc06 lustre]# lctl list_nids
10.5.203.6@tcp
[root@roc06 ~]# grep lnet /etc/modprobe.conf
options lnet networks=tcp0(eth1)
[root@roc06 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:22:19:05:90:F2
          inet addr:10.5.203.6  Bcast:10.5.255.255  Mask:255.255.0.0
The OSS was formatted with the following:
mkfs.lustre --verbose --reformat --fsname=roc --ost --mgsnode=10.5.203.250@tcp0 \
    --mkfsoptions="-m 0 -E stride=32" /dev/md2
I believe this was done before "options lnet networks=tcp0(eth1)" was included
in modprobe.conf.
[root@roc06 ~]# tunefs.lustre --print /dev/md2
Permanent disk data:
Target:     roc-OST0005
Index:      5
Lustre FS:  roc
Mount type: ldiskfs
Flags:      0x402
            (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.203.250@tcp ost.quota_type=u
For comparison, the OSS roc05:
[root@roc05 ~]# uname -a; cat /etc/redhat-release
Linux roc05 2.6.18-92.1.17.el5_lustre.1.6.7smp #1 SMP Mon Feb 9 19:56:55 MST 2009 x86_64 x86_64 x86_64 GNU/Linux
CentOS release 5 (Final)
[root@roc05 ~]# lctl list_nids
10.5.203.5@tcp
[root@roc05 ~]# grep lnet /etc/modprobe.conf
options lnet networks=tcp0(eth1)
[root@roc05 ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:1C:23:D5:F5:4F
          inet addr:10.5.203.5  Bcast:10.5.255.255  Mask:255.255.0.0
[root@roc05 ~]# tunefs.lustre --print /dev/md2
Permanent disk data:
Target:     roc-OST0004
Index:      4
Lustre FS:  roc
Mount type: ldiskfs
Flags:      0x402
            (OST )
Persistent mount opts: errors=remount-ro,extents,mballoc
Parameters: mgsnode=10.5.203.250@tcp ost.quota_type=u
On the MDS (rocpile):
[root@rocpile ~]# uname -a; cat /etc/redhat-release
Linux rocpile 2.6.18-92.1.10.el5_lustre.1.6.6smp #1 SMP Tue Aug 26 12:16:17 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
CentOS release 5 (Final)
[root@rocpile ~]# lctl list_nids
10.5.203.250@tcp
[root@rocpile ~]# grep lnet /etc/modprobe.conf
options lnet networks=tcp(eth1)
[root@rocpile ~]# ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:15:C5:EC:F6:88
          inet addr:10.5.203.250  Bcast:10.5.255.255  Mask:255.255.0.0
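The MDS line reads networks=tcp(eth1) while the OSS and client lines read networks=tcp0(eth1). In LNET an omitted network number defaults to 0, so both spellings should name the same network. The following is a minimal Python sketch for comparing such lines across nodes; the helper `parse_lnet_networks` is hypothetical, handles only the simple "name(interfaces)" form seen in this post, and is not a Lustre tool.

```python
import re


def parse_lnet_networks(option_line):
    """Parse an 'options lnet networks=...' line from modprobe.conf.

    Returns a list of (network, [interfaces]) tuples, with a missing
    network number normalized to 0 (so 'tcp' becomes 'tcp0').
    """
    m = re.search(r"networks=(\S+)", option_line)
    if not m:
        return []
    value = m.group(1).strip("\"'")
    nets = []
    # Each entry looks like 'tcp0(eth1)' or just 'tcp'.
    for name, ifaces in re.findall(r"([a-z][a-z0-9]*)(?:\(([^)]*)\))?", value):
        if not name[-1].isdigit():
            name += "0"  # assumption: an omitted number means network 0
        nets.append((name, [i for i in ifaces.split(",") if i]))
    return nets


if __name__ == "__main__":
    mds = parse_lnet_networks("options lnet networks=tcp(eth1)")
    oss = parse_lnet_networks("options lnet networks=tcp0(eth1)")
    print(mds, oss, mds == oss)
```

Under that assumption the two option lines compare equal, so the difference in spelling between rocpile and the other nodes is unlikely to matter by itself.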
Any suggestions?
Thanks,
Mike
--
Michael D. Seymour Phone: 416-978-1776
Scientific Computing Support Fax: 416-978-3921
Canadian Institute for Theoretical Astrophysics, University of Toronto