[Lustre-discuss] two multi-homed cluster

Patrice Hamelin patrice.hamelin at ec.gc.ca
Mon Dec 19 04:57:41 PST 2011


Cliff,

   Maybe our configuration is a bit special.  We are running two
InfiniBand partitions, one for storage and the other for TCP over IB.
The two clusters are named IB3 and IB4.

I have 4 OSSs on cluster IB3, which are configured like this:

bond0     Link encap:InfiniBand  HWaddr 80:00:00:4B:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
           inet addr:10.10.135.115  Bcast:10.10.135.255  Mask:255.255.255.0
           inet6 addr: fe80::202:c903:e:8bc6/64 Scope:Link
           UP BROADCAST RUNNING MASTER MULTICAST  MTU:65520  Metric:1
           RX packets:6 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:336 (336.0 b)  TX bytes:0 (0.0 b)

eth0      Link encap:Ethernet  HWaddr E4:1F:13:60:93:C0
           inet addr:10.10.132.115  Bcast:10.10.132.255  Mask:255.255.255.0
           inet6 addr: fe80::e61f:13ff:fe60:93c0/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
           RX packets:85 errors:0 dropped:0 overruns:0 frame:0
           TX packets:91 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:1000
           RX bytes:10707 (10.4 KiB)  TX bytes:10607 (10.3 KiB)
           Interrupt:169 Memory:92000000-92012800

ib0.8001  Link encap:InfiniBand  HWaddr 80:00:00:4A:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:65520  Metric:1
           RX packets:3 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:256
           RX bytes:168 (168.0 b)  TX bytes:0 (0.0 b)

ib1.8001  Link encap:InfiniBand  HWaddr 80:00:00:4B:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:65520  Metric:1
           RX packets:3 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:256
           RX bytes:168 (168.0 b)  TX bytes:0 (0.0 b)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:8 errors:0 dropped:0 overruns:0 frame:0
           TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:560 (560.0 b)  TX bytes:560 (560.0 b)

[root@ib3-st01 ~]# cat /etc/modprobe.conf
alias eth0 bnx2
alias eth1 bnx2
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptsas
alias scsi_hostadapter2 ata_piix
alias scsi_hostadapter3 qla2xxx
alias usb0 cdc_ether
alias bond0 bonding
options bond0 miimon=100 mode=1
options lnet networks="o2ib(bond0),tcp(eth0)"
options ost oss_num_threads=24

I formatted the MGS/MDT like:

mkfs.lustre --mgs --mdt --fsname=sata --reformat /dev/mpath/emcssd-1

And the 8 OSTs like:

mkfs.lustre --fsname sata --reformat --ost --mgsnode=10.10.135.115@o2ib \
    --mgsnode=10.10.132.115@tcp /dev/mpath/colosse4-lun53-sata


[root@ib3-st01 ~]# cat /etc/ha.d/haresources
ib3-st01 Filesystem::/dev/mpath/emcssd-1::/mnt/mdt-colosse::lustre
ib3-st01 Filesystem::/dev/mpath/colosse4-lun53-sata::/mnt/data/clun53::lustre
ib3-st02 Filesystem::/dev/mpath/colosse4-lun54-sata::/mnt/data/clun54::lustre
ib3-st03 Filesystem::/dev/mpath/colosse4-lun55-sata::/mnt/data/clun55::lustre
ib3-st04 Filesystem::/dev/mpath/colosse4-lun56-sata::/mnt/data/clun56::lustre
ib3-st01 Filesystem::/dev/mpath/colosse4-lun57-sata::/mnt/data/clun57::lustre
ib3-st02 Filesystem::/dev/mpath/colosse4-lun58-sata::/mnt/data/clun58::lustre
ib3-st03 Filesystem::/dev/mpath/colosse4-lun59-sata::/mnt/data/clun59::lustre
ib3-st04 Filesystem::/dev/mpath/colosse4-lun60-sata::/mnt/data/clun60::lustre

[root@ib3-st01 ~]# lctl list_nids
10.10.135.115@o2ib
10.10.132.115@tcp
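
The two NIDs above line up with the two entries in the networks= option:
each entry pairs an LNET network name with an interface, and the NID is
that interface's address followed by @ and the network name.  A minimal
sketch of that mapping, using only the values shown in this message; the
function name expected_nids and the iface=address argument form are made
up for illustration and are not part of any Lustre tool:

```shell
# Print the NIDs one would expect from a networks= string.
# $1      : the networks= value, e.g. 'o2ib(bond0),tcp(eth0)'
# $2 ...  : hypothetical "iface=address" pairs describing the node
expected_nids() {
    networks=$1; shift
    printf '%s\n' "$networks" | tr ',' '\n' | while IFS= read -r entry; do
        net=${entry%%"("*}        # text before "("  -> network name
        iface=${entry#*"("}       # text after  "("  -> "iface)"
        iface=${iface%")"}        # drop the trailing ")"
        for pair in "$@"; do
            if [ "${pair%%=*}" = "$iface" ]; then
                printf '%s@%s\n' "${pair#*=}" "$net"
            fi
        done
    done
}
```

Called as expected_nids 'o2ib(bond0),tcp(eth0)' bond0=10.10.135.115
eth0=10.10.132.115, it prints the same two NIDs that lctl list_nids
reports on ib3-st01.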

service heartbeat start


Client on cluster IB3:
ib3-bc3e41-be01:~# ifconfig
ib0.8001  Link encap:UNSPEC  HWaddr 80-00-00-51-FE-80-00-00-00-00-00-00-00-00-00-00
           inet addr:10.10.135.74  Bcast:10.10.135.255  Mask:255.255.255.0
           inet6 addr: fe80::224:e890:97fe:fc91/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:5580 errors:0 dropped:0 overruns:0 frame:0
           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:2048
           RX bytes:430797 (430.7 KB)  TX bytes:0 (0.0 B)

ib0.8608  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
           inet addr:10.10.133.74  Bcast:10.10.133.255  Mask:255.255.255.0
           inet6 addr: fe80::224:e890:97fe:fc91/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:209527 errors:0 dropped:0 overruns:0 frame:0
           TX packets:99270 errors:0 dropped:2 overruns:0 carrier:0
           collisions:0 txqueuelen:2048
           RX bytes:20774987 (20.7 MB)  TX bytes:16029957 (16.0 MB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:157814 errors:0 dropped:0 overruns:0 frame:0
           TX packets:157814 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:7262472 (7.2 MB)  TX bytes:7262472 (7.2 MB)

ib3-bc3e41-be01:/proc/fs/lustre/osc# cat /etc/modprobe.d/lustre.conf
options lnet networks="o2ib(ib0.8001),tcp(ib0.8608)"

I am able to mount over both o2ib and tcp (strange, but it works!):

ib3-bc3e41-be01:/proc/fs/lustre/osc# mount -t lustre
10.10.135.115@o2ib:/sata on /mnt/sata type lustre (rw)
10.10.132.115@tcp:/sata on /mnt/sata type lustre (rw)

The same goes for clients on cluster IB4.

What I would like to achieve is a TCP mount from cluster IB4 to cluster IB3.

Clients on cluster IB4 are configured like this:
ib4-bc1f82-be01:~# ifconfig
ib0.8003  Link encap:UNSPEC  HWaddr 80-00-00-50-FE-80-00-00-00-00-00-00-00-00-00-00
           inet addr:10.10.142.26  Bcast:10.10.142.255  Mask:255.255.255.0
           inet6 addr: fe80::224:e890:97fe:fca9/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:2530 errors:0 dropped:0 overruns:0 frame:0
           TX packets:280 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:2048
           RX bytes:609159 (609.1 KB)  TX bytes:16936 (16.9 KB)

ib0.8613  Link encap:UNSPEC  HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
           inet addr:10.10.140.26  Bcast:10.10.140.255  Mask:255.255.255.0
           inet6 addr: fe80::224:e890:97fe:fca9/64 Scope:Link
           UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
           RX packets:4218 errors:0 dropped:0 overruns:0 frame:0
           TX packets:3196 errors:0 dropped:1 overruns:0 carrier:0
           collisions:0 txqueuelen:2048
           RX bytes:570916 (570.9 KB)  TX bytes:1665488 (1.6 MB)

lo        Link encap:Local Loopback
           inet addr:127.0.0.1  Mask:255.0.0.0
           inet6 addr: ::1/128 Scope:Host
           UP LOOPBACK RUNNING  MTU:16436  Metric:1
           RX packets:1455 errors:0 dropped:0 overruns:0 frame:0
           TX packets:1455 errors:0 dropped:0 overruns:0 carrier:0
           collisions:0 txqueuelen:0
           RX bytes:69554 (69.5 KB)  TX bytes:69554 (69.5 KB)

ib4-bc1f82-be01:~# cat /etc/modprobe.d/lustre.conf
options lnet networks="o2ib(ib0.8003),tcp(ib0.8613)"

ib4-bc1f82-be01:~# lctl ping 10.10.132.115@tcp
12345-0@lo
12345-10.10.135.115@o2ib
12345-10.10.132.115@tcp
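
The ping reply lists every NID the server advertises, so a possible next
step is to ping each of those NIDs individually from the IB4 client and
see which ones answer.  A minimal sketch, assuming the leading "12345-"
is the LNET process id prefix and that 0@lo (loopback) can be skipped.
The function names and the LCTL override variable are hypothetical
helpers, only "lctl ping <nid>" itself is a real command:

```shell
# Read `lctl ping` output on stdin and print the peer's NIDs, one per
# line, dropping the "<pid>-" prefix and the loopback entry (0@lo).
peer_nids() {
    sed -n 's/^[0-9][0-9]*-//p' | grep -v '@lo$'
}

# Ping every NID given on stdin and report whether it answered.
# LCTL is a hypothetical override so the loop can be tried on a
# machine without Lustre installed; by default it runs the real lctl.
check_nids() {
    while IFS= read -r nid; do
        if "${LCTL:-lctl}" ping "$nid" </dev/null >/dev/null 2>&1; then
            printf '%s reachable\n' "$nid"
        else
            printf '%s NOT reachable\n' "$nid"
        fi
    done
}
```

On the client this would be run as
"lctl ping 10.10.132.115@tcp | peer_nids | check_nids".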

ib4-bc1f82-be01:~# mount -t lustre 10.10.132.115@tcp:/sata /mnt/sata

That hangs, and the log file says:

Dec 19 12:43:50 ib4-bc1f82-be01 kernel: [ 1649.617429] Lustre: 2420:0:(import.c:517:import_select_connection()) sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing latency to 1s
Dec 19 12:45:05 ib4-bc1f82-be01 kernel: [ 1724.492699] Lustre: 2420:0:(import.c:517:import_select_connection()) sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing latency to 4s
Dec 19 12:45:05 ib4-bc1f82-be01 kernel: [ 1724.492705] Lustre: 2420:0:(import.c:517:import_select_connection()) Skipped 2 previous similar messages
Dec 19 12:47:35 ib4-bc1f82-be01 kernel: [ 1874.243747] Lustre: 2420:0:(import.c:517:import_select_connection()) sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing latency to 10s
Dec 19 12:47:35 ib4-bc1f82-be01 kernel: [ 1874.243754] Lustre: 2420:0:(import.c:517:import_select_connection()) Skipped 5 previous similar messages
Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742386] Lustre: 2420:0:(import.c:517:import_select_connection()) sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing latency to 21s
Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742393] Lustre: 2420:0:(import.c:517:import_select_connection()) Skipped 10 previous similar messages
Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742544] Lustre: 2419:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request x1388626094064659 sent from sata-MDT0000-mdc-ffff880c3a9e6400 to NID 10.10.135.115@o2ib 0s ago has failed due to network error (26s prior to deadline).
Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742547]   req@ffff880c3b0e6400 x1388626094064659/t0 o38->sata-MDT0000_UUID@10.10.135.115@o2ib:12/10 lens 368/584 e 0 to 1 dl 1324299181 ref 1 fl Rpc:N/0/0 rc 0/0
Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742554] Lustre: 2419:0:(client.c:1487:ptlrpc_expire_one_request()) Skipped 23 previous similar messages


It seems I have a network error from
"sata-MDT0000-mdc-ffff880c3a9e6400" to NID "10.10.135.115@o2ib".

The same behaviour is observed if I try to mount the IB4 Lustre
partitions from IB3 clients.
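
Since the log shows the connect request going to the o2ib NID even
though the mount was given the tcp NID, one thing that may be worth
checking is which of the server's NIDs sit on a network name that the
IB4 client also declares.  A minimal sketch using only values from this
message; the function names are hypothetical, the comparison is a plain
string match (it does not treat "o2ib" and "o2ib0" as equal), and
whether a shared name is what makes the client pick that NID is an
assumption, not something confirmed here:

```shell
# Print the LNET network names declared in a networks= string,
# e.g. 'o2ib(ib0.8003),tcp(ib0.8613)' -> "o2ib" and "tcp", one per line.
declared_nets() {
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/(.*$//'
}

# $1     : the client's networks= value
# $2 ... : server NIDs
# Print each server NID whose network name the client also declares.
shared_net_nids() {
    client_nets=$1; shift
    for nid in "$@"; do
        net=${nid#*@}             # text after the first "@"
        if declared_nets "$client_nets" | grep -qxF "$net"; then
            printf '%s\n' "$nid"
        fi
    done
}
```

With the IB4 client's setting 'o2ib(ib0.8003),tcp(ib0.8613)' and the two
server NIDs 10.10.135.115@o2ib and 10.10.132.115@tcp, both NIDs are
printed: the IB4 client declares a network called o2ib as well, although
its IB interface (10.10.142.26) is on a different subnet from the
server's (10.10.135.115).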

What am I missing here?

Thanks.


On 12/16/11 22:27, Cliff White wrote:
> You can do this, simply define networks for both devices.
> Assuming ib0, and eth0, you would have
> options lnet networks="tcp0(eth0),o2ib0(ib0)"
>
> The IB clients will mount using a @o2ib0 NID, and the ethernet clients 
> will mount using @tcp0 NIDs. Since you are explicitly specifying the 
> network, the hop rule doesn't apply.
> cliffw
>
>
> On Fri, Dec 16, 2011 at 9:49 AM, Patrice Hamelin 
> <patrice.hamelin at ec.gc.ca <mailto:patrice.hamelin at ec.gc.ca>> wrote:
>
>     Hi,
>
>       I have two InfiniBand clusters, each in a separate location, with
>     solid Ethernet connectivity between them.  Say they are named
>     cluster A and cluster B.  All members of each cluster have both IB
>     and Ethernet networks available to them; the IB network is not
>     routed between cluster A and B, but Ethernet is.  On each cluster,
>     I have 4 OSSs serving FC disks.  Clients on cluster A mount Lustre
>     disks from their local cluster, and the same goes for cluster B,
>     both on InfiniBand NIDs.
>
>       What I would like to achieve is for clients on cluster A to mount
>     disks from OSSs on cluster B over the Ethernet connection, and
>     likewise for clients on cluster B to mount disks from OSSs on
>     cluster A.
>
>       From my reading of the Lustre 1.8.7 manual, I got:
>
>     7.1.1 Modprobe.conf
>     Options under modprobe.conf are used to specify the networks
>     available to a node.
>     You have the choice of two different options – the networks
>     option, which explicitly
>     lists the networks available and the ip2nets option, which
>     provides a list-matching
>     lookup. Only one option can be used at any one time. The order of
>     LNET lines in
>     modprobe.conf is important when configuring multi-homed servers.
>     *If a server
>     node can be reached using more than one network, the first network
>     specified in
>     modprobe.conf will be used.*
>
>     Does the last sentence mean that I cannot do that?
>
>     Thanks.
>
>     -- 
>     Patrice Hamelin
>     Specialiste sénior en systèmes d'exploitation | Senior OS specialist
>     Environnement Canada | Environment Canada
>     2121, route Transcanadienne | 2121 Transcanada Highway
>     Dorval, QC H9P 1J3
>     Gouvernement du Canada | Government of Canada
>
>
> -- 
> cliffw
> Support Guy
> WhamCloud, Inc.
> www.whamcloud.com <http://www.whamcloud.com>
>
>

-- 
Patrice Hamelin
Specialiste sénior en systèmes d'exploitation | Senior OS specialist
Environnement Canada | Environment Canada
2121, route Transcanadienne | 2121 Transcanada Highway
Dorval, QC H9P 1J3
Téléphone | Telephone 514-421-5303
Télécopieur | Facsimile 514-421-7231
Gouvernement du Canada | Government of Canada
