[Lustre-discuss] two multi-homed cluster
Patrice Hamelin
patrice.hamelin at ec.gc.ca
Mon Dec 19 06:16:43 PST 2011
OK! Found the solution (it came from a Lustre user). So simple!
Quote:
---
I think the possible solution to your problem lies in differentiating
the two different IB networks - by changing the lustre lnet device names.
This means that each separate cluster would have different non-default
"o2ib" naming convention in modprobe.conf.
The IB3 lustre servers might call it:
options lnet networks="o2ib3(bond0),tcp(eth0)"
and the IB4 lustre servers might call it:
options lnet networks="o2ib4(bond0),tcp(eth0)"
---
That solution works perfectly.
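For anyone reading this later, here is a minimal Python sketch of why the rename helps. It is a simplified model, not LNet's actual selection code: it assumes a client treats a server NID as directly reachable whenever the NID's network name matches one of the client's own networks. The function names are hypothetical, and the "o2ib4" name on the IB4 client is my assumption about how the clients end up configured.

```python
def normalize(net):
    """Return the LNet network name with an explicit number ("tcp" -> "tcp0")."""
    return net if net[-1].isdigit() else net + "0"


def parse_networks(option_value):
    """Map each network in an lnet 'networks' string to its interface.

    Example input: 'o2ib3(bond0),tcp(eth0)'
    """
    nets = {}
    for entry in option_value.split(","):
        name, _, iface = entry.strip().partition("(")
        nets[normalize(name)] = iface.rstrip(")")
    return nets


def locally_matching_nids(client_networks, server_nids):
    """Return the server NIDs whose network name exists on the client.

    Simplifying assumption: the client treats such NIDs as directly reachable.
    """
    local = parse_networks(client_networks)
    return [nid for nid in server_nids
            if normalize(nid.partition("@")[2]) in local]


# Before the rename: both clusters use the default "o2ib" name, so the
# IB4 client believes the IB3 server's InfiniBand NID is on its own fabric.
before = locally_matching_nids(
    "o2ib(ib0.8003),tcp(ib0.8613)",               # IB4 client
    ["10.10.135.115@o2ib", "10.10.132.115@tcp"],  # IB3 server NIDs
)

# After the rename: the IB3 servers use "o2ib3"; the IB4 client is assumed
# to use "o2ib4", so only the tcp NID matches.
after = locally_matching_nids(
    "o2ib4(ib0.8003),tcp(ib0.8613)",
    ["10.10.135.115@o2ib3", "10.10.132.115@tcp"],
)

print(before)
print(after)
```

In this model the unreachable o2ib NID is a candidate before the rename and disappears after it, which matches the "network error" to 10.10.135.115@o2ib seen in the client log below.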
Thanks to everyone who replied!
Season's Greetings all!
On 12/19/11 12:57, Patrice Hamelin wrote:
> Cliff,
>
> Maybe our configuration is a bit special. We are running two
> Infiniband partitions, one for storage and the other for TCP over IB.
> The two clusters are named IB3 and IB4.
>
> I have 4 OSS on cluster IB3, which are configured like:
>
> bond0 Link encap:InfiniBand HWaddr
> 80:00:00:4B:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> inet addr:10.10.135.115 Bcast:10.10.135.255 Mask:255.255.255.0
> inet6 addr: fe80::202:c903:e:8bc6/64 Scope:Link
> UP BROADCAST RUNNING MASTER MULTICAST MTU:65520 Metric:1
> RX packets:6 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:336 (336.0 b) TX bytes:0 (0.0 b)
>
> eth0 Link encap:Ethernet HWaddr E4:1F:13:60:93:C0
> inet addr:10.10.132.115 Bcast:10.10.132.255 Mask:255.255.255.0
> inet6 addr: fe80::e61f:13ff:fe60:93c0/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:85 errors:0 dropped:0 overruns:0 frame:0
> TX packets:91 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:10707 (10.4 KiB) TX bytes:10607 (10.3 KiB)
> Interrupt:169 Memory:92000000-92012800
>
> ib0.8001 Link encap:InfiniBand HWaddr
> 80:00:00:4A:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> UP BROADCAST RUNNING SLAVE MULTICAST MTU:65520 Metric:1
> RX packets:3 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:256
> RX bytes:168 (168.0 b) TX bytes:0 (0.0 b)
>
> ib1.8001 Link encap:InfiniBand HWaddr
> 80:00:00:4B:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> UP BROADCAST RUNNING SLAVE MULTICAST MTU:65520 Metric:1
> RX packets:3 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:256
> RX bytes:168 (168.0 b) TX bytes:0 (0.0 b)
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> inet6 addr: ::1/128 Scope:Host
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:8 errors:0 dropped:0 overruns:0 frame:0
> TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:560 (560.0 b) TX bytes:560 (560.0 b)
>
> [root@ib3-st01 ~]# cat /etc/modprobe.conf
> alias eth0 bnx2
> alias eth1 bnx2
> alias scsi_hostadapter mptbase
> alias scsi_hostadapter1 mptsas
> alias scsi_hostadapter2 ata_piix
> alias scsi_hostadapter3 qla2xxx
> alias usb0 cdc_ether
> alias bond0 bonding
> options bond0 miimon=100 mode=1
> options lnet networks="o2ib(bond0),tcp(eth0)"
> options ost oss_num_threads=24
>
> I formatted the MGS/MDT like:
>
> mkfs.lustre --mgs --mdt --fsname=sata --reformat /dev/mpath/emcssd-1
>
> And the 8 OST's like:
>
> mkfs.lustre --fsname sata --reformat --ost \
>   --mgsnode=10.10.135.115@o2ib --mgsnode=10.10.132.115@tcp \
>   /dev/mpath/colosse4-lun53-sata
>
>
> [root@ib3-st01 ~]# cat /etc/ha.d/haresources
> ib3-st01 Filesystem::/dev/mpath/emcssd-1::/mnt/mdt-colosse::lustre
> ib3-st01
> Filesystem::/dev/mpath/colosse4-lun53-sata::/mnt/data/clun53::lustre
> ib3-st02
> Filesystem::/dev/mpath/colosse4-lun54-sata::/mnt/data/clun54::lustre
> ib3-st03
> Filesystem::/dev/mpath/colosse4-lun55-sata::/mnt/data/clun55::lustre
> ib3-st04
> Filesystem::/dev/mpath/colosse4-lun56-sata::/mnt/data/clun56::lustre
> ib3-st01
> Filesystem::/dev/mpath/colosse4-lun57-sata::/mnt/data/clun57::lustre
> ib3-st02
> Filesystem::/dev/mpath/colosse4-lun58-sata::/mnt/data/clun58::lustre
> ib3-st03
> Filesystem::/dev/mpath/colosse4-lun59-sata::/mnt/data/clun59::lustre
> ib3-st04
> Filesystem::/dev/mpath/colosse4-lun60-sata::/mnt/data/clun60::lustre
>
> [root@ib3-st01 ~]# lctl list_nids
> 10.10.135.115@o2ib
> 10.10.132.115@tcp
>
> service heartbeat start
>
>
> Client on cluster IB3
> ib3-bc3e41-be01:~# ifconfig
> ib0.8001 Link encap:UNSPEC HWaddr
> 80-00-00-51-FE-80-00-00-00-00-00-00-00-00-00-00
> inet addr:10.10.135.74 Bcast:10.10.135.255 Mask:255.255.255.0
> inet6 addr: fe80::224:e890:97fe:fc91/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
> RX packets:5580 errors:0 dropped:0 overruns:0 frame:0
> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:2048
> RX bytes:430797 (430.7 KB) TX bytes:0 (0.0 B)
>
> ib0.8608 Link encap:UNSPEC HWaddr
> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
> inet addr:10.10.133.74 Bcast:10.10.133.255 Mask:255.255.255.0
> inet6 addr: fe80::224:e890:97fe:fc91/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
> RX packets:209527 errors:0 dropped:0 overruns:0 frame:0
> TX packets:99270 errors:0 dropped:2 overruns:0 carrier:0
> collisions:0 txqueuelen:2048
> RX bytes:20774987 (20.7 MB) TX bytes:16029957 (16.0 MB)
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> inet6 addr: ::1/128 Scope:Host
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:157814 errors:0 dropped:0 overruns:0 frame:0
> TX packets:157814 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:7262472 (7.2 MB) TX bytes:7262472 (7.2 MB)
>
> ib3-bc3e41-be01:/proc/fs/lustre/osc# cat /etc/modprobe.d/lustre.conf
> options lnet networks="o2ib(ib0.8001),tcp(ib0.8608)"
>
> I am able to mount over both o2ib and tcp (strange, but it works!)
>
> ib3-bc3e41-be01:/proc/fs/lustre/osc# mount -t lustre
> 10.10.135.115@o2ib:/sata on /mnt/sata type lustre (rw)
> 10.10.132.115@tcp:/sata on /mnt/sata type lustre (rw)
>
> The same goes for clients on cluster IB4.
>
> What I would like to achieve is a TCP mount from cluster IB4 to cluster IB3.
>
> Clients on cluster IB4 are like:
> ib4-bc1f82-be01:~# ifconfig
> ib0.8003 Link encap:UNSPEC HWaddr
> 80-00-00-50-FE-80-00-00-00-00-00-00-00-00-00-00
> inet addr:10.10.142.26 Bcast:10.10.142.255 Mask:255.255.255.0
> inet6 addr: fe80::224:e890:97fe:fca9/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
> RX packets:2530 errors:0 dropped:0 overruns:0 frame:0
> TX packets:280 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:2048
> RX bytes:609159 (609.1 KB) TX bytes:16936 (16.9 KB)
>
> ib0.8613 Link encap:UNSPEC HWaddr
> 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
> inet addr:10.10.140.26 Bcast:10.10.140.255 Mask:255.255.255.0
> inet6 addr: fe80::224:e890:97fe:fca9/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:65520 Metric:1
> RX packets:4218 errors:0 dropped:0 overruns:0 frame:0
> TX packets:3196 errors:0 dropped:1 overruns:0 carrier:0
> collisions:0 txqueuelen:2048
> RX bytes:570916 (570.9 KB) TX bytes:1665488 (1.6 MB)
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> inet6 addr: ::1/128 Scope:Host
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:1455 errors:0 dropped:0 overruns:0 frame:0
> TX packets:1455 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:69554 (69.5 KB) TX bytes:69554 (69.5 KB)
>
> ib4-bc1f82-be01:~# cat /etc/modprobe.d/lustre.conf
> options lnet networks="o2ib(ib0.8003),tcp(ib0.8613)"
>
> ib4-bc1f82-be01:~# lctl ping 10.10.132.115@tcp
> 12345-0@lo
> 12345-10.10.135.115@o2ib
> 12345-10.10.132.115@tcp
>
> ib4-bc1f82-be01:~# mount -t lustre 10.10.132.115@tcp:/sata /mnt/sata
>
> That hangs, and the log file says:
>
> Dec 19 12:43:50 ib4-bc1f82-be01 kernel: [ 1649.617429] Lustre:
> 2420:0:(import.c:517:import_select_connection())
> sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing
> latency to 1s
> Dec 19 12:45:05 ib4-bc1f82-be01 kernel: [ 1724.492699] Lustre:
> 2420:0:(import.c:517:import_select_connection())
> sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing
> latency to 4s
> Dec 19 12:45:05 ib4-bc1f82-be01 kernel: [ 1724.492705] Lustre:
> 2420:0:(import.c:517:import_select_connection()) Skipped 2 previous
> similar messages
> Dec 19 12:47:35 ib4-bc1f82-be01 kernel: [ 1874.243747] Lustre:
> 2420:0:(import.c:517:import_select_connection())
> sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing
> latency to 10s
> Dec 19 12:47:35 ib4-bc1f82-be01 kernel: [ 1874.243754] Lustre:
> 2420:0:(import.c:517:import_select_connection()) Skipped 5 previous
> similar messages
> Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742386] Lustre:
> 2420:0:(import.c:517:import_select_connection())
> sata-MDT0000-mdc-ffff880c3a9e6400: tried all connections, increasing
> latency to 21s
> Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742393] Lustre:
> 2420:0:(import.c:517:import_select_connection()) Skipped 10 previous
> similar messages
> Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742544] Lustre:
> 2419:0:(client.c:1487:ptlrpc_expire_one_request()) @@@ Request
> x1388626094064659 sent from sata-MDT0000-mdc-ffff880c3a9e6400 to NID
> 10.10.135.115@o2ib 0s ago has failed due to network error (26s prior
> to deadline).
> Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742547]
> req@ffff880c3b0e6400 x1388626094064659/t0
> o38->sata-MDT0000_UUID@10.10.135.115@o2ib:12/10 lens 368/584 e 0 to 1
> dl 1324299181 ref 1 fl Rpc:N/0/0 rc 0/0
> Dec 19 12:52:35 ib4-bc1f82-be01 kernel: [ 2173.742554] Lustre:
> 2419:0:(client.c:1487:ptlrpc_expire_one_request()) Skipped 23 previous
> similar messages
>
>
> Seems like I have a network error from
> "sata-MDT0000-mdc-ffff880c3a9e6400" to NID "10.10.135.115@o2ib"
>
> The same phenomenon is observed if I try to mount IB4
> Lustre partitions from IB3 clients.
>
> What am I missing here?
>
> Thanks.
>
>
> On 12/16/11 22:27, Cliff White wrote:
>> You can do this, simply define networks for both devices.
>> Assuming ib0, and eth0, you would have
>> options lnet networks="tcp0(eth0),o2ib0(ib0)"
>>
>> The IB clients will mount using a @o2ib0 NID, and the ethernet
>> clients will mount using @tcp0 NIDs. Since you are explicitly
>> specifying the network, the hop rule doesn't apply.
>> cliffw
>>
>>
>> On Fri, Dec 16, 2011 at 9:49 AM, Patrice Hamelin
>> <patrice.hamelin at ec.gc.ca <mailto:patrice.hamelin at ec.gc.ca>> wrote:
>>
>> Hi,
>>
>> I have two Infiniband clusters, each in a separate location,
>> with solid ethernet connectivity between them. Say
>> they are named cluster A and cluster B. All members of each
>> cluster have both IB and eth networks available to them, and the
>> IB network is not routed between cluster A and B, but ethernet
>> is. On each cluster, I have 4 OSS's serving FC disks. Clients
>> on cluster A mount Lustre disks from their local cluster, and the
>> same goes for cluster B, both on Infiniband NIDs.
>>
>> What I would like to achieve is for clients on cluster A to mount
>> disks from OSS's on cluster B over the ethernet connection. The
>> same goes for clients in cluster B mounting disks from OSS's
>> on cluster A.
>>
>> From my reading of the Lustre 1.8.7 manual, I got:
>>
>> 7.1.1 Modprobe.conf
>> Options under modprobe.conf are used to specify the networks
>> available to a node.
>> You have the choice of two different options – the networks
>> option, which explicitly
>> lists the networks available and the ip2nets option, which
>> provides a list-matching
>> lookup. Only one option can be used at any one time. The order of
>> LNET lines in
>> modprobe.conf is important when configuring multi-homed servers.
>> *If a server
>> node can be reached using more than one network, the first
>> network specified in
>> modprobe.conf will be used.*
>>
>> Does the last sentence mean that I cannot do that?
>>
>> Thanks.
>>
>> --
>> Patrice Hamelin
>> Specialiste sénior en systèmes d'exploitation | Senior OS specialist
>> Environnement Canada | Environment Canada
>> 2121, route Transcanadienne | 2121 Transcanada Highway
>> Dorval, QC H9P 1J3
>> Gouvernement du Canada | Government of Canada
>>
>> --
>> cliffw
>> Support Guy
>> WhamCloud, Inc.
>> www.whamcloud.com <http://www.whamcloud.com>
>>
>>
--
Patrice Hamelin
Specialiste sénior en systèmes d'exploitation | Senior OS specialist
Environnement Canada | Environment Canada
2121, route Transcanadienne | 2121 Transcanada Highway
Dorval, QC H9P 1J3
Téléphone | Telephone 514-421-5303
Télécopieur | Facsimile 514-421-7231
Gouvernement du Canada | Government of Canada