[lustre-discuss] network error on bulk WRITE/bad log
Stepan Nassyr
s.nassyr at fz-juelich.de
Thu Aug 18 02:46:11 PDT 2022
Hello Chris,
Thank you for the insight. I have in fact already updated the firmware
on the InfiniBand adapters. The cards are ConnectX-5 VPI MCX555A-ECAT,
and the firmware version is:
[snassyr at io01 ~]$ sudo mstflint -d 81:00.0 q
Image type: FS4
FW Version: 16.34.1002
FW Release Date: 26.7.2022
Product Version: 16.34.1002
Rom Info: type=UEFI version=14.27.14 cpu=AMD64
type=PXE version=3.6.700 cpu=AMD64
Description: UID GuidsNumber
Base GUID: 0c42a1030054820a 4
Base MAC: 0c42a154820a 4
Image VSD: N/A
Device VSD: N/A
PSID: MT_0000000010
Security Attributes: N/A
This is the latest firmware release from NVIDIA/Mellanox, dated 2022-08-02.
I am not using an external driver distribution and the system packages
are up-to-date.
Best regards,
Stepan
On 17.08.22 16:28, Horn, Chris wrote:
>
> [66494.575431] LNetError:
> 20017:0:(o2iblnd.c:1880:kiblnd_fmr_pool_map()) Failed to map mr 1/8
> elements
> [66494.575446] LNetError:
> 20017:0:(o2iblnd_cb.c:613:kiblnd_fmr_map_tx()) Can't map 32768 bytes
> (8/8)s: -22
>
> These errors originate from a call to ib_map_mr_sg() which is part of
> the kernel verbs API.
>
> n = ib_map_mr_sg(mr, tx->tx_frags, rd->rd_nfrags, NULL, PAGE_SIZE);
> if (unlikely(n != rd->rd_nfrags)) {
>         CERROR("Failed to map mr %d/%d elements\n",
>                n, rd->rd_nfrags);
>         return n < 0 ? n : -EINVAL;
> }
>
> Your errors mean that we wanted to map 8 fragments to the memory
> region, but we were only able to map one of them.
>
> As a first step, I would recommend ensuring that you have the latest
> firmware for your network cards, and if you’re using an external
> driver distribution (like mlnx-ofa_kernel) then upgrade to the latest
> version. There could be some bug in the o2iblnd driver code but it is
> best to first rule out any issue with firmware/drivers.
>
> Chris Horn
>
> *From: *lustre-discuss <lustre-discuss-bounces at lists.lustre.org> on
> behalf of Stepan Nassyr via lustre-discuss
> <lustre-discuss at lists.lustre.org>
> *Date: *Tuesday, August 16, 2022 at 8:26 AM
> *To: *Peter Jones <pjones at whamcloud.com>,
> lustre-discuss at lists.lustre.org <lustre-discuss at lists.lustre.org>
> *Subject: *Re: [lustre-discuss] network error on bulk WRITE/bad log
>
> Hello Peter,
>
> Thank you for the reply. I have upgraded Lustre to 2.15.1. The errors
> persist, however; now I am also seeing a new error on io02:
>
> [ 1749.396942] LustreError:
> 9216:0:(mdt_handler.c:7499:mdt_iocontrol()) storage-MDT0001: Not
> supported cmd = 1074292357, rc = -95
>
> I'm not entirely sure how to look up the cmd code, and rc -95 seems to
> just be EOPNOTSUPP, so there is no additional information there.
>
> Is there a way to look up what the cmd value means?
>
> On 15.08.22 14:50, Peter Jones wrote:
>
> Stepan
>
> 2.14.56 is not a version of Lustre – it is an interim dev build.
> Even if it does not resolve this specific issue, I would strongly
> recommend switching to the recently released Lustre 2.15.1
>
> Peter
>
> *From: *lustre-discuss <lustre-discuss-bounces at lists.lustre.org>
> <mailto:lustre-discuss-bounces at lists.lustre.org> on behalf of
> Stepan Nassyr via lustre-discuss <lustre-discuss at lists.lustre.org>
> <mailto:lustre-discuss at lists.lustre.org>
> *Reply-To: *Stepan Nassyr <s.nassyr at fz-juelich.de>
> <mailto:s.nassyr at fz-juelich.de>
> *Date: *Monday, August 15, 2022 at 1:35 AM
> *To: *"lustre-discuss at lists.lustre.org"
> <mailto:lustre-discuss at lists.lustre.org>
> <lustre-discuss at lists.lustre.org>
> <mailto:lustre-discuss at lists.lustre.org>
> *Subject: *[lustre-discuss] network error on bulk WRITE/bad log
>
> Hi all,
>
> In May I had a failure on a small cluster and asked here
> (http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2022-May/018073.html).
> Due to time constraints I just recreated the filesystem back then.
>
> Now the failure has happened again. This time I have more time to
> investigate, and I haven't done anything destructive yet.
>
> I use the following versions:
>
> 1. lustre 2.14.56
> 2. zfs 2.0.7 (previously used 2.1.2, but I was told that 2.1.x is
> not well tested with Lustre)
> 3. Nodes are running Rocky Linux 8.6
> 4. uname -r: 4.18.0-372.19.1.el8_6.aarch64
>
> There are two IO nodes (io01 and io02); both act as MDS and OSS,
> and one of them is also the MGS. Here are the devices:
>
> [snassyr at io02 ~]$ sudo lctl dl
> 0 UP osd-zfs storage-MDT0001-osd storage-MDT0001-osd_UUID 8
> 1 UP mgc MGC10.31.7.61 at o2ib a087e05e-d57c-4561-ad75-6827d4428f54 4
> 2 UP mds MDS MDS_uuid 2
> 3 UP lod storage-MDT0001-mdtlov storage-MDT0001-mdtlov_UUID 3
> 4 UP mdt storage-MDT0001 storage-MDT0001_UUID 8
> 5 UP mdd storage-MDD0001 storage-MDD0001_UUID 3
> 6 UP osp storage-MDT0000-osp-MDT0001 storage-MDT0001-mdtlov_UUID 4
> 7 UP osp storage-OST0000-osc-MDT0001 storage-MDT0001-mdtlov_UUID 4
> 8 UP osp storage-OST0001-osc-MDT0001 storage-MDT0001-mdtlov_UUID 4
> 9 UP lwp storage-MDT0000-lwp-MDT0001
> storage-MDT0000-lwp-MDT0001_UUID 4
> 10 UP osd-zfs storage-OST0001-osd storage-OST0001-osd_UUID 4
> 11 UP ost OSS OSS_uuid 2
> 12 UP obdfilter storage-OST0001 storage-OST0001_UUID 6
> 13 UP lwp storage-MDT0000-lwp-OST0001
> storage-MDT0000-lwp-OST0001_UUID 4
> 14 UP lwp storage-MDT0001-lwp-OST0001
> storage-MDT0001-lwp-OST0001_UUID 4
>
> [snassyr at io01 ~]$ sudo lctl dl
> 0 UP osd-zfs MGS-osd MGS-osd_UUID 4
> 1 UP mgs MGS MGS 6
> 2 UP mgc MGC10.31.7.61 at o2ib 9f351a51-0232-4306-a66d-cecee8629329 4
> 3 UP osd-zfs storage-MDT0000-osd storage-MDT0000-osd_UUID 9
> 4 UP mds MDS MDS_uuid 2
> 5 UP lod storage-MDT0000-mdtlov storage-MDT0000-mdtlov_UUID 3
> 6 UP mdt storage-MDT0000 storage-MDT0000_UUID 12
> 7 UP mdd storage-MDD0000 storage-MDD0000_UUID 3
> 8 UP qmt storage-QMT0000 storage-QMT0000_UUID 3
> 9 UP osp storage-MDT0001-osp-MDT0000 storage-MDT0000-mdtlov_UUID 4
> 10 UP osp storage-OST0000-osc-MDT0000 storage-MDT0000-mdtlov_UUID 4
> 11 UP osp storage-OST0001-osc-MDT0000 storage-MDT0000-mdtlov_UUID 4
> 12 UP lwp storage-MDT0000-lwp-MDT0000
> storage-MDT0000-lwp-MDT0000_UUID 4
> 13 UP osd-zfs storage-OST0000-osd storage-OST0000-osd_UUID 4
> 14 UP ost OSS OSS_uuid 2
> 15 UP obdfilter storage-OST0000 storage-OST0000_UUID 6
> 16 UP lwp storage-MDT0000-lwp-OST0000
> storage-MDT0000-lwp-OST0000_UUID 4
> 17 UP lwp storage-MDT0001-lwp-OST0000
> storage-MDT0001-lwp-OST0000_UUID 4
>
> On io01 I see repeating errors mentioning a network error:
>
> [65922.582578] LustreError:
> 20017:0:(ldlm_lib.c:3540:target_bulk_io()) Skipped 11 previous
> similar messages
> [66494.575431] LNetError:
> 20017:0:(o2iblnd.c:1880:kiblnd_fmr_pool_map()) Failed to map mr
> 1/8 elements
> [66494.575442] LNetError:
> 20017:0:(o2iblnd.c:1880:kiblnd_fmr_pool_map()) Skipped 11 previous
> similar messages
> [66494.575446] LNetError:
> 20017:0:(o2iblnd_cb.c:613:kiblnd_fmr_map_tx()) Can't map 32768
> bytes (8/8)s: -22
> [66494.575448] LNetError:
> 20017:0:(o2iblnd_cb.c:613:kiblnd_fmr_map_tx()) Skipped 11 previous
> similar messages
> [66494.575452] LNetError:
> 20017:0:(o2iblnd_cb.c:1725:kiblnd_send()) Can't setup PUT src for
> 10.31.7.62 at o2ib: -22
> [66494.575454] LNetError:
> 20017:0:(o2iblnd_cb.c:1725:kiblnd_send()) Skipped 11 previous
> similar messages
> [66494.575458] LustreError:
> 20017:0:(events.c:477:server_bulk_callback()) event type 5, status
> -5, desc 00000000cdd4e797
> [66494.575460] LustreError:
> 20017:0:(events.c:477:server_bulk_callback()) Skipped 11 previous
> similar messages
> [66546.574314] LustreError:
> 20017:0:(ldlm_lib.c:3540:target_bulk_io()) @@@ network error on
> bulk WRITE req at 0000000070b8f1ab x1740960836990720/t0(0)
> o1000->storage-MDT0001-mdtlov_UUID at 10.31.7.62@o2ib:522/0 lens
> 336/33016 e 0 to 0 dl 1660376137 ref 1 fl Interpret:/0/0 rc 0/0 job:''
>
> On io02 I see repeating errors mentioning a bad log:
>
> [66582.856444] LustreError:
> 14905:0:(llog_osd.c:264:llog_osd_read_header())
> storage-MDT0000-osp-MDT0001: bad log [0x200000401:0x1:0x0] header
> magic: 0x0 (expected 0x10645539)
> [66582.856450] LustreError:
> 14905:0:(llog_osd.c:264:llog_osd_read_header()) Skipped 11
> previous similar messages
>
> I can't make sense of these error messages. How can I recover?
>
> (I have the full dmesg/lctl dk log, but they are too big to
> attach, is it ok to upload them somewhere and put a link in a reply?)
>
> Thank you and best regards,
> Stepan
>
>
>
> ------------------------------------------------------------------------------------------------
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Registered office: Juelich
> Registered in the commercial register of the Amtsgericht Dueren, No. HR B 3498
> Chairman of the Supervisory Board: MinDir Volker Rieke
> Board of Management: Prof. Dr.-Ing. Wolfgang Marquardt (Chairman),
> Karsten Beneke (Deputy Chairman), Prof. Dr. Astrid Lambrecht,
> Prof. Dr. Frauke Melchior
> ------------------------------------------------------------------------------------------------
>
>
> Curious visitors are warmly welcome on Sunday, 21 August 2022,
> from 10:00 to 17:00. More information at:
> https://www.tagderneugier.de
>
>