[lustre-discuss] luster client mount issues

sohamm sohamm at gmail.com
Wed Jul 20 11:09:16 PDT 2016


Hi

Any guidance/help on this is greatly appreciated.

Thanks

On Mon, Jul 18, 2016 at 7:25 PM, sohamm <sohamm at gmail.com> wrote:

> Hi Ben
> Both networks have a netmask of 255.255.255.0.
>
> Thanks
>
> On Mon, Jul 18, 2016 at 10:08 AM, Ben Evans <bevans at cray.com> wrote:
>
>> What do your netmasks look like on each network?
>>
>> From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on behalf
>> of sohamm <sohamm at gmail.com>
>> Date: Monday, July 18, 2016 at 1:56 AM
>> To: "lustre-discuss at lists.lustre.org" <lustre-discuss at lists.lustre.org>
>> Subject: Re: [lustre-discuss] lustre-discuss Digest, Vol 124, Issue 17
>>
>> Hi Thomas
>> Below are the results of the commands you suggested.
>>
>> *From Client*
>> [root@dev1 ~]# lctl ping 192.168.200.52@o2ib
>> failed to ping 192.168.200.52@o2ib: Input/output error
>> [root@dev1 ~]# lctl ping 192.168.111.52@tcp
>> 12345-0@lo
>> 12345-192.168.200.52@o2ib
>> 12345-192.168.111.52@tcp
>> [root@dev1 ~]# mount -t lustre 192.168.111.52@tcp:/mylustre /lustre
>> mount.lustre: mount 192.168.111.52@tcp:/mylustre at /lustre failed: Input/output error
>> Is the MGS running?
>> mount: mounting 192.168.111.52@tcp:/mylustre on /lustre failed: Invalid argument
>>
>> cat /var/log/messages | tail
>> Jul 18 01:37:04 dev1 user.warn kernel: [2250504.401397] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
>> Jul 18 01:37:26 dev1 user.warn kernel: [2250526.257309] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)
>> Jul 18 01:37:36 dev1 user.warn kernel: [2250536.481862] ib1: multicast join failed for ff12:401b:ffff:0000:0000:0000:ffff:ffff, status -22
>> Jul 18 01:41:53 dev1 user.warn kernel: [2250792.947299] LNet: No route to 12345-192.168.200.52@o2ib via <?> (all routers down)
>>
>>
>> *From MGS*
>> [root@lustre_mgs01_vm03 ~]# lctl ping 192.168.111.102@tcp
>> 12345-0@lo
>> 12345-192.168.111.102@tcp
>>
>> Please let me know what else I can try. It looks like I am missing
>> something in the IB configuration. Do I need a router set up as part of
>> LNet? If I can ping the MGS from the client over the tcp network,
>> shouldn't the mount still work?
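
The `lctl ping` output above lists every NID the MGS advertises. As a small sketch, assuming the output format shown in this message, the following Python helper groups those NIDs by LNet network. The function name `nids_by_network` is hypothetical and not part of any Lustre tool.

```python
import re


def nids_by_network(lctl_ping_output):
    """Group the NIDs printed by 'lctl ping' by LNet network name."""
    networks = {}
    for line in lctl_ping_output.splitlines():
        # Assumed line shape: '<pid>-<address>@<network>'; other lines are skipped.
        match = re.fullmatch(r"\d+-(\S+)@(\S+)", line.strip())
        if match is None:
            continue
        address, network = match.groups()
        networks.setdefault(network, []).append(address)
    return networks


# Sample taken from the client output in this message.
# NIDs are written with '@'; the list archive renders it as ' at '.
sample = """12345-0@lo
12345-192.168.200.52@o2ib
12345-192.168.111.52@tcp
"""

peer = nids_by_network(sample)
for network in sorted(peer):
    print(network, peer[network])
```

Here the MGS answers on tcp but also advertises an o2ib NID, and the client's kernel log reports "No route to" that o2ib NID. Whether that is the cause of the mount failure is not established in this thread.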
>>
>> Thanks
>>
>>
>> On Sun, Jul 17, 2016 at 1:07 PM, <lustre-discuss-request at lists.lustre.org
>> > wrote:
>>
>>>
>>> Today's Topics:
>>>
>>>    1. llapi_file_get_stripe() and /proc/fs/lustre/osc/  entries
>>>       (John Bauer)
>>>    2. luster client mount issues (sohamm)
>>>    3. Re: luster client mount issues (Thomas Roth)
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Sat, 16 Jul 2016 15:11:22 -0500
>>> From: John Bauer <bauerj at iodoctors.com>
>>> To: "lustre-discuss at lists.lustre.org"
>>>         <lustre-discuss at lists.lustre.org>
>>> Subject: [lustre-discuss] llapi_file_get_stripe() and
>>>         /proc/fs/lustre/osc/    entries
>>> Message-ID: <03ceaaa0-b004-ae43-eaa1-437da2a5b717 at iodoctors.com>
>>> Content-Type: text/plain; charset="utf-8"; Format="flowed"
>>>
>>> I am using *llapi_file_get_stripe()* to get the OST indexes that a file
>>> is striped on.  That part is working fine. But there are multiple Lustre
>>> file systems on the node, resulting in multiple *OST0000* entries in the
>>> directory /proc/fs/lustre/osc.  Is there something in the *struct
>>> lov_user_ost_data* or *struct lov_user_md* that would indicate which of
>>> the following directories pertains to the file's OST?
>>>
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp1-OST0000-osc-ffff880287ae4c00
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp2-OST0000-osc-ffff881034d99000
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp6-OST0000-osc-ffff881003cd7800
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp7-OST0000-osc-ffff880ffe051c00
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp8-OST0000-osc-ffff880ffe054c00
>>> dr-xr-xr-x 2 root root 0 Jul 16 12:31 nbp9-OST0000-osc-ffff880fcf179400
>>>
>>> Thanks
>>>
>>> --
>>> I/O Doctors, LLC
>>> 507-766-0378
>>> bauerj at iodoctors.com
>>>
>>> ------------------------------
>>>
>>> Message: 2
>>> Date: Sat, 16 Jul 2016 14:34:35 -0700
>>> From: sohamm <sohamm at gmail.com>
>>> To: lustre-discuss at lists.lustre.org
>>> Subject: [lustre-discuss] luster client mount issues
>>> Message-ID:
>>>         <
>>> CAKGc+eBQ+MCdbSrc7Ft4gd+zmZ6FbAZHaVhSQtpgOshYrJqhhw at mail.gmail.com>
>>> Content-Type: text/plain; charset="utf-8"
>>>
>>> Hi
>>>
>>> I am trying to mount a Lustre client. Below are the steps and the
>>> relevant information about the issue. Please let me know if I am
>>> missing something.
>>>
>>> Thanks
>>> Div
>>>
>>> *Mgs:*
>>>
>>> [root@lustre_mgs01_vm03 ~]# cat /etc/modprobe.d/lustre.conf
>>> options lnet networks=o2ib(ib0),tcp0(eth0)
>>>
>>> [root@lustre_mgs01_vm03 ~]# modprobe lnet
>>> [root@lustre_mgs01_vm03 ~]# lsmod | grep lnet
>>> lnet                  449065  0
>>> libcfs                405839  1 lnet
>>> [root@lustre_mgs01_vm03 ~]# lctl network up
>>> LNET configured
>>> [root@lustre_mgs01_vm03 ~]# lctl list_nids
>>> 192.168.200.52@o2ib
>>> 192.168.111.52@tcp
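
The `networks=` option above maps each LNet network to an interface. As a minimal sketch (not part of the original post), the following Python helper splits such a value into (network, interface) pairs. The function name `parse_lnet_networks` is hypothetical, and the parser assumes exactly one interface per network, which is narrower than what LNet itself accepts.

```python
import re


def parse_lnet_networks(option_value):
    """Split an LNet 'networks=' value into (network, interface) pairs.

    Assumes the simple 'net(iface),net(iface)' form used in this thread.
    """
    pairs = []
    for entry in option_value.split(","):
        match = re.fullmatch(r"\s*([a-z0-9]+)\((\w+)\)\s*", entry)
        if match is None:
            raise ValueError(f"unrecognized networks entry: {entry!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


# Value taken from the MGS lustre.conf shown above.
mgs_networks = parse_lnet_networks("o2ib(ib0),tcp0(eth0)")
print(mgs_networks)
```

In LNet, `tcp` and `tcp0` name the same network, which is why `lctl list_nids` prints `tcp` for a network configured as `tcp0`.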
>>>
>>> *On Client:*
>>> I am able to ping the MGS on both the tcp and ib networks.
>>>
>>> [root@dev1~]# ping 192.168.111.52
>>> PING 192.168.111.52 (192.168.111.52) 56(84) bytes of data.
>>> 64 bytes from 192.168.111.52: icmp_req=1 ttl=64 time=5.81 ms
>>> 64 bytes from 192.168.111.52: icmp_req=2 ttl=64 time=0.802 ms
>>> 64 bytes from 192.168.111.52: icmp_req=3 ttl=64 time=0.780 ms
>>> ^C
>>> --- 192.168.111.52 ping statistics ---
>>> 3 packets transmitted, 3 received, 0% packet loss, time 2000ms
>>> rtt min/avg/max/mdev = 0.780/2.464/5.811/2.366 ms
>>>
>>> [root@dev1 ~]# ping 192.168.200.52
>>> PING 192.168.200.52 (192.168.200.52) 56(84) bytes of data.
>>> 64 bytes from 192.168.200.52: icmp_req=1 ttl=64 time=24.4 ms
>>> 64 bytes from 192.168.200.52: icmp_req=2 ttl=64 time=2.14 ms
>>> 64 bytes from 192.168.200.52: icmp_req=3 ttl=64 time=0.782 ms
>>> 64 bytes from 192.168.200.52: icmp_req=4 ttl=64 time=9.30 ms
>>> ^C
>>> --- 192.168.200.52 ping statistics ---
>>> 4 packets transmitted, 4 received, 0% packet loss, time 3005ms
>>>
>>>
>>> *client mount commands*
>>>
>>> mount -t lustre 192.168.111.52@tcp:/mylustre /lustre (or)
>>> mount -t lustre 192.168.111.52@tcp0:/mylustre /lustre (or)
>>> mount -t lustre 192.168.200.52@ob2:/mylustre /lustre
>>>
>>>
>>> *cat /var/log/messages | tail -40*
>>>
>>> Jul 16 17:03:17 dev1 user.err kernel: [2133277.466013] LustreError: 162-5: Missing mount data: check that /sbin/mount.lustre is installed.
>>> Jul 16 17:03:17 dev1 user.err kernel: [2133277.466064] LustreError: 13627:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount  (-22)
>>> Jul 16 17:03:23 dev1 user.warn kernel: [2133282.680519] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1468702998/real 1468702998] req@ffff8801e0bc3c00 x1539427524411444/t0(0) o250->MGC192.168.111.52
>>> Jul 16 17:03:24 dev1 user.err kernel: [2133283.680193] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411448/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>> Jul 16 17:03:31 dev1 user.err kernel: [2133290.760978] LustreError: 13657:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801b7159800 x1539427524411456/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>> Jul 16 17:03:43 dev1 user.warn kernel: [2133302.681412] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703023/real 1468703023] req@ffff8801d6bfc800 x1539427524411460/t0(0) o250->MGC192.168.111
>>> Jul 16 17:04:08 dev1 user.warn kernel: [2133327.681402] Lustre: 12364:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has failed due to network error: [sent 1468703048/real 1468703048] req@ffff8801d6bfec00 x1539427524411464/t0(0) o250->MGC192.168.111
>>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680175] LustreError: 13628:0:(client.c:1083:ptlrpc_import_delay_req()) @@@ send limit expired req@ffff8801e0bc7000 x1539427524411452/t0(0) o101->MGC192.168.111.52@tcp@192.168.111.52@tcp:26/25 lens 328/344 e 0 to 0 dl
>>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680316] LustreError: 15c-8: MGC192.168.111.52@tcp: The configuration from log 'mylustre-client' failed (-5). This may be the result of communication errors between this node and the MGS, a bad configuration, or other e
>>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.680357] LustreError: 13628:0:(llite_lib.c:1046:ll_fill_super()) Unable to process log: -5
>>> Jul 16 17:04:15 dev1 user.warn kernel: [2133334.680881] Lustre: Unmounted mylustre-client
>>> Jul 16 17:04:15 dev1 user.err kernel: [2133334.731730] LustreError: 13628:0:(obd_mount.c:1325:lustre_fill_super()) Unable to mount  (-5)
>>>
>>> ------------------------------
>>>
>>> Message: 3
>>> Date: Sun, 17 Jul 2016 10:19:18 +0200
>>> From: Thomas Roth <t.roth at gsi.de>
>>> To: <lustre-discuss at lists.lustre.org>
>>> Subject: Re: [lustre-discuss] luster client mount issues
>>> Message-ID: <578B3F86.1060203 at gsi.de>
>>> Content-Type: text/plain; charset="windows-1252"; format=flowed
>>>
>>> Hi,
>>>
>>> Try 'lctl ping' from your clients to the MDS to check whether you get
>>> through on LNet, e.g.
>>>
>>> lctl ping 192.168.200.52@o2ib
>>>
>>> or
>>>
>>> lctl ping 192.168.111.52@tcp
>>>
>>>
>>> and vice versa from the MDS to the clients' NIDs.
>>>
>>> Regards,
>>> Thomas
>>>
>>> --
>>> --------------------------------------------------------------------
>>> Thomas Roth
>>> Department: HPC
>>> Location: SB3 1.262
>>> Phone: +49-6159-71 1453  Fax: +49-6159-71 2986
>>>
>>> GSI Helmholtzzentrum für Schwerionenforschung GmbH
>>> Planckstraße 1
>>> 64291 Darmstadt
>>> www.gsi.de
>>>
>>> Gesellschaft mit beschränkter Haftung
>>> Sitz der Gesellschaft: Darmstadt
>>> Handelsregister: Amtsgericht Darmstadt, HRB 1528
>>>
>>> Geschäftsführung: Professor Dr. Karlheinz Langanke
>>> Ursula Weyrich
>>> Jörg Blaurock
>>>
>>> Vorsitzender des Aufsichtsrates: St Dr. Georg Schütte
>>> Stellvertreter: Ministerialdirigent Dr. Rolf Bernhardt
>>>
>>>
>>> ------------------------------
>>>
>>> End of lustre-discuss Digest, Vol 124, Issue 17
>>> ***********************************************
>>>
>>
>>
>