[lustre-discuss] lustre filesystem in hung state

Anilkumar Naik anilkumar.j.naik at gmail.com
Sun Feb 24 21:45:48 PST 2019


Dear Raj,

Thanks, it's working now.

--
Regards,
A J Naik
Computer Centre, TIFR.
+91 022 22782342 / 2121



On Fri, Feb 22, 2019 at 10:41 AM Raj <rajgautam at gmail.com> wrote:

> Anil,
> Your error message shows
> o8->scratch-OST0003-osc-MDT0000@192.168.1.5@o2ib
> which means it is trying to connect (opcode o8 is OST connect) to OST0003
> of the scratch file system, which is hosted on the OSS node with NID
> 192.168.1.5@o2ib, but the client has lost its connection to that OSS node.
> This looks like a network issue to me. You can ping the server NID from
> the client and try to troubleshoot the network:
>
> client# lctl ping 192.168.1.5@o2ib
>
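The request field decoded in the reply above can also be split mechanically. What follows is a minimal Python sketch, not a Lustre tool: the regular expression, the names parse_request_field and ping_command, and the one-entry opcode table are assumptions made for illustration. Only opcode 8 is named, because that is the only one the reply identifies. The pattern accepts both the original "@" separator and the " at " that the list archive substitutes for it.

```python
import re

# Matches the "o<opcode>-><target>@<nid>:" field of a ptlrpc request dump.
# The first separator may appear as "@" or as " at " (archive rewriting).
REQUEST_RE = re.compile(
    r"o(?P<opcode>\d+)->(?P<target>[\w-]+)(?:@| at )(?P<nid>[\w.]+@\w+):"
)

# Hypothetical lookup table; only the opcode named in the thread is listed.
OPCODE_NAMES = {8: "OST connect"}


def parse_request_field(text):
    """Return opcode, target and NID from a log line, or None if absent."""
    match = REQUEST_RE.search(text)
    if match is None:
        return None
    opcode = int(match.group("opcode"))
    return {
        "opcode": opcode,
        "opcode_name": OPCODE_NAMES.get(opcode, "unknown"),
        "target": match.group("target"),
        "nid": match.group("nid"),
    }


def ping_command(nid):
    """Build the 'lctl ping' argument list for a NID (does not run it)."""
    return ["lctl", "ping", nid]


if __name__ == "__main__":
    field = "o8->scratch-OST0003-osc-MDT0000@192.168.1.5@o2ib:28/4"
    info = parse_request_field(field)
    print(info)
    print(" ".join(ping_command(info["nid"])))
```

The sketch only builds the lctl ping command line; whether to run it, and from which node, is left to the administrator.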
>
> On Tue, Feb 19, 2019 at 11:44 PM Anilkumar Naik <
> anilkumar.j.naik at gmail.com> wrote:
>
>> Dear All,
>>
>> Our Lustre file system goes into a hung state and we are unable to
>> identify the exact cause. Please find the details below and help us find
>> a fix for the file system/kernel hang.
>>
>> Cluster Details:
>>
>> The OSS node/server has the targets below mounted. We are able to mount
>> the home file system on the clients, and it works for some time. After
>> 10-15 minutes all the clients hang and the OSS node reboots. Kindly help.
>>
>> /dev/mapper/mdt-mgt        19G  446M   17G   3% /mdt-mgt
>> /dev/mapper/mdt-home      140G  2.8G  128G   3% /mdt-home
>> /dev/mapper/mdt-scratch   140G  759M  130G   1% /mdt-scratch
>> /dev/mapper/ost-home      3.7T  2.4T  1.1T  69% /ost-home
>>
>> The Lustre packages below are installed on the OSS node.
>> ==============
>> kernel-devel-2.6.32-431.23.3.el6_lustre.x86_64
>> lustre-debuginfo-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> kernel-firmware-2.6.32-431.23.3.el6_lustre.x86_64
>> lustre-iokit-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> kernel-2.6.32-431.23.3.el6_lustre.x86_64
>> lustre-modules-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> lustre-tests-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> kernel-debuginfo-common-x86_64-2.6.32-431.23.3.el6_lustre.x86_64
>> lustre-osd-ldiskfs-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64
>> kernel-debuginfo-2.6.32-431.23.3.el6_lustre.x86_64
>> =====================
>>
>> Lustre errors:
>> =====
>> Feb 20 06:22:06 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 06:29:11 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 06:29:11 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 06:29:11 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 06:32:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550624551/real 0]  req at ffff880800be1000
>> x1625913123994836/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550624562 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 06:32:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 06:39:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 06:39:36 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 06:39:36 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 06:43:12 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550625151/real 0]  req at ffff880800dcd000
>> x1625913123996040/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550625192 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 06:43:12 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 06:50:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 06:50:01 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 06:50:01 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 06:53:57 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550625826/real 0]  req at ffff881005e88800
>> x1625913123997352/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550625837 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 06:53:57 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 07:00:51 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 07:00:51 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:00:51 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 07:04:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550626426/real 0]  req at ffff880fc2b02800
>> x1625913123998556/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550626472 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:04:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 07:12:06 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:12:06 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 07:12:06 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:12:06 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 07:14:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550627076/real 0]  req at ffff880fb1d50000
>> x1625913123999836/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550627082 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:14:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 16 previous
>> similar messages
>> Feb 20 07:23:21 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:23:21 oss1 kernel: LustreError: Skipped 20 previous similar
>> messages
>> Feb 20 07:23:21 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:23:21 oss1 kernel: LustreError: Skipped 20 previous similar
>> messages
>> Feb 20 07:25:51 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550627726/real 1550627751]
>> req at ffff880fd292c000 x1625913124001124/t0(0)
>> o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550627782 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:25:51 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 07:33:46 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:33:46 oss1 kernel: LustreError: Skipped 14 previous similar
>> messages
>> Feb 20 07:33:46 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:33:46 oss1 kernel: LustreError: Skipped 14 previous similar
>> messages
>> Feb 20 07:35:52 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550628326/real 0]  req at ffff880fb23b4c00
>> x1625913124002296/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550628352 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:35:52 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 07:44:36 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:44:36 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 07:44:36 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:44:36 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 07:47:31 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550629001/real 1550629051]
>> req at ffff880fb1fa6000 x1625913124003644/t0(0)
>> o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550629056 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:47:31 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 07:55:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 07:55:01 oss1 kernel: LustreError: Skipped 14 previous similar
>> messages
>> Feb 20 07:55:01 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 07:55:01 oss1 kernel: LustreError: Skipped 14 previous similar
>> messages
>> Feb 20 07:57:56 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550629651/real 1550629676]
>> req at ffff880fe52fd800 x1625913124004920/t0(0)
>> o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550629682 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 07:57:56 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 08:05:51 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 08:05:51 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 08:05:51 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 08:05:51 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 08:08:46 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550630276/real 1550630326]
>> req at ffff880fb1f31400 x1625913124006160/t0(0)
>> o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550630332 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 08:08:46 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 08:16:16 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 08:16:16 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 08:16:16 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 08:16:16 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 08:18:47 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550630901/real 0]  req at ffff880fb0c00000
>> x1625913124007400/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550630927 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 08:18:47 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 08:27:31 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 08:27:31 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 08:27:31 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 08:27:31 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 08:30:26 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550631576/real 1550631626]
>> req at ffff880fb0ebb400 x1625913124008720/t0(0)
>> o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550631632 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 08:30:26 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 08:38:21 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 08:38:21 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 08:38:21 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 08:38:21 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 08:40:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550632201/real 0]  req at ffff880fb0fd6000
>> x1625913124009960/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550632232 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 08:40:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 08:49:36 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 08:49:36 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 08:49:36 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 08:49:36 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 08:50:51 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550632801/real 1550632851]
>> req at ffff880fab861000 x1625913124011132/t0(0)
>> o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550632857 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 08:50:51 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 13 previous
>> similar messages
>> Feb 20 09:00:01 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:00:01 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:00:01 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:00:01 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:01:02 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550633426/real 0]  req at ffff880fb0f20000
>> x1625913124012372/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550633462 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:01:02 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 09:11:16 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550634026/real 1550634076]
>> req at ffff881016296400 x1625913124013544/t0(0)
>> o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550634082 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:11:16 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 13 previous
>> similar messages
>> Feb 20 09:11:41 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:11:41 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 09:11:41 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:11:41 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 09:21:52 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550634676/real 0]  req at ffff880fb1dae400
>> x1625913124014832/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550634712 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:21:52 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 09:22:06 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:22:06 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:22:06 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:22:06 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:32:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550635351/real 0]  req at ffff880fb0e9f800
>> x1625913124016140/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550635362 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:32:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 09:32:56 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:32:56 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 09:32:56 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:32:56 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 09:43:12 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550635951/real 0]  req at ffff880fb0e79c00
>> x1625913124017344/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550635992 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:43:12 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 09:43:21 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:43:21 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:43:21 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:43:21 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 09:53:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550636601/real 0]  req at ffff880fc2466800
>> x1625913124018600/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550636612 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 09:53:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 13 previous
>> similar messages
>> Feb 20 09:53:46 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:53:46 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 09:53:46 oss1 kernel: LustreError: Skipped 13 previous similar
>> messages
>> Feb 20 09:53:46 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 09:53:46 oss1 kernel: LustreError: Skipped 13 previous similar
>> messages
>> Feb 20 10:03:37 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550637176/real 0]  req at ffff880fb1f71400
>> x1625913124019756/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550637217 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:03:37 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 10:04:11 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:04:11 oss1 kernel: LustreError: Skipped 17 previous similar
>> messages
>> Feb 20 10:04:11 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:04:11 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 10:13:57 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550637826/real 0]  req at ffff880fb0d45c00
>> x1625913124021024/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550637837 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:13:57 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 15 previous
>> similar messages
>> Feb 20 10:15:01 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:15:01 oss1 kernel: LustreError: Skipped 14 previous similar
>> messages
>> Feb 20 10:15:01 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:15:01 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 10:24:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550638426/real 0]  req at ffff880fb0d2b000
>> x1625913124022228/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550638472 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:24:32 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 17 previous
>> similar messages
>> Feb 20 10:26:16 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:26:16 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 10:26:16 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:26:16 oss1 kernel: LustreError: Skipped 18 previous similar
>> messages
>> Feb 20 10:34:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550639076/real 0]  req at ffff880fb1cd2800
>> x1625913124023492/t0(0) o8->scratch-OST0001-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550639082 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:34:42 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 14 previous
>> similar messages
>> Feb 20 10:36:41 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:36:41 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 10:36:41 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:36:41 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 10:45:01 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1550639676/real 1550639701]
>> req at ffff880fb0e35000 x1625913124024684/t0(0)
>> o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4 lens 400/544 e 0
>> to 1 dl 1550639731 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:45:01 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 16 previous
>> similar messages
>> Feb 20 10:47:06 oss1 kernel: LustreError: 137-5: scratch-OST0003_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:47:06 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 10:47:06 oss1 kernel: LustreError: 11-0:
>> scratch-OST0003-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:47:06 oss1 kernel: LustreError: Skipped 15 previous similar
>> messages
>> Feb 20 10:55:27 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has
>> timed out for sent delay: [sent 1550640301/real 0]  req at ffff880fb0e70000
>> x1625913124025896/t0(0) o8->scratch-OST0003-osc-MDT0000 at 192.168.1.5@o2ib:28/4
>> lens 400/544 e 0 to 1 dl 1550640327 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
>> Feb 20 10:55:27 oss1 kernel: Lustre:
>> 6285:0:(client.c:1918:ptlrpc_expire_one_request()) Skipped 14 previous
>> similar messages
>> Feb 20 10:57:56 oss1 kernel: LustreError: 137-5: scratch-OST0001_UUID:
>> not available for connect from 0 at lo (no target). If you are running an
>> HA pair check that the target is mounted on the other server.
>> Feb 20 10:57:56 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> Feb 20 10:57:56 oss1 kernel: LustreError: 11-0:
>> scratch-OST0001-osc-MDT0000: Communicating with 0 at lo, operation
>> ost_connect failed with -19.
>> Feb 20 10:57:56 oss1 kernel: LustreError: Skipped 16 previous similar
>> messages
>> =================
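The repeated "failed with -19" entries in the log above can be tallied per device to see which targets are affected. This is a minimal Python sketch under stated assumptions: the regular expression and the function name summarize_connect_failures are illustrative, and the sample lines are log entries from this message, unwrapped onto one line with "0@lo" restored from the archive's "0 at lo". The return code is mapped to its symbolic name with the standard errno module; on Linux, 19 is ENODEV ("no such device"), which is consistent with the "(no target)" text in the 137-5 messages.

```python
import errno
import re
from collections import Counter

# Matches the "LustreError: 11-0" lines that report a failed operation.
FAILURE_RE = re.compile(
    r"LustreError: 11-0: (?P<device>[\w-]+): "
    r".*operation (?P<op>\w+) failed with (?P<rc>-?\d+)"
)


def summarize_connect_failures(lines):
    """Count failures per (device, operation, errno name)."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match is None:
            continue
        rc = int(match.group("rc"))
        # Kernel return codes are negative errno values.
        name = errno.errorcode.get(abs(rc), "unknown")
        counts[(match.group("device"), match.group("op"), name)] += 1
    return counts


SAMPLE = [
    "Feb 20 06:29:11 oss1 kernel: LustreError: 11-0: scratch-OST0001-osc-MDT0000: "
    "Communicating with 0@lo, operation ost_connect failed with -19.",
    "Feb 20 06:39:36 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT0000: "
    "Communicating with 0@lo, operation ost_connect failed with -19.",
    "Feb 20 06:50:01 oss1 kernel: LustreError: 11-0: scratch-OST0003-osc-MDT0000: "
    "Communicating with 0@lo, operation ost_connect failed with -19.",
]

if __name__ == "__main__":
    for key, count in sorted(summarize_connect_failures(SAMPLE).items()):
        print(key, count)
```

In the full log only scratch-OST0001 and scratch-OST0003 appear in these messages, and the mount listing earlier in the message shows no scratch OST mounted on this node, so checking where those two targets are supposed to be mounted may be a reasonable next step alongside the network check.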
>>
>> --
>> Regards,
>> A J Naik
>> Computer Centre, TIFR.
>> +91 022 22782342 / 2121
>