[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband

Sébastien Buisson sebastien.buisson at bull.net
Mon Feb 21 04:45:41 PST 2011


Hi,

The important bit is:
LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB interface ib0: it's down

Lustre requires IPoIB interfaces to be set up on all Lustre nodes. This
does not mean Lustre will use the IP stack on top of InfiniBand to
transfer data; the IPoIB addresses are only used as identifiers to
establish the initial InfiniBand connections (Queue Pairs and so on).
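
To check a node's dmesg output for this condition automatically, a
minimal Python sketch could look like the following. The function name
down_ipoib_interfaces and the sample text are illustrative assumptions,
not part of Lustre; the pattern is taken directly from the error
message above.

```python
import re

# Pattern built from the o2iblnd startup error shown above. The message may
# be wrapped across lines, so any whitespace is accepted between the words.
IPOIB_DOWN = re.compile(
    r"Can't\s+query\s+IPoIB\s+interface\s+(\S+):\s+it's\s+down"
)

def down_ipoib_interfaces(dmesg_text):
    """Return a sorted list of interface names that o2iblnd reported as down."""
    return sorted(set(IPOIB_DOWN.findall(dmesg_text)))

# Sample input copied from the log in this thread (hypothetical usage).
sample = (
    "LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB\n"
    "interface ib0: it's down\n"
    "LustreError: 105-4: Error -100 starting up LNI o2ib\n"
)

if __name__ == "__main__":
    print(down_ipoib_interfaces(sample))
```

If an interface such as ib0 is reported here, it needs to be brought up
with an IP address before the Lustre modules are loaded.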

Cheers,
Sebastien.


On 21/02/2011 13:27, Arya Mazaheri wrote:
> Hi there,
> I have configured and run Lustre 2.0 over TCP (OSS and MDS on the
> same server) without problems. Now I am trying to run Lustre with
> InfiniBand support, but whenever I mount the MDT storage on the server,
> the process ends with the following error:
> kernel panic - not syncing: fatal exception
>
> My /etc/modprobe.conf is:
> options lnet networks="o2ib0(ib0)"
>
> Last lines of dmesg:
> ----------------------------------------------------
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sda2, internal journal
> LDISKFS-fs: recovery complete.
> LDISKFS-fs: mounted filesystem with ordered data mode.
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sda2, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB
> interface ib0: it's down
> LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Skipped 1 previous
> similar message
> eth0: no IPv6 routers present
> LustreError: 105-4: Error -100 starting up LNI o2ib
> LustreError: Skipped 1 previous similar message
> LustreError: 6041:0:(events.c:731:ptlrpc_init_portals()) network
> initialisation failed
> LustreError: 158-c: Can't load module 'mgs'
> LustreError: 6035:0:(genops.c:286:class_newdev()) OBD: unknown type: mgs
> LustreError: 6035:0:(obd_config.c:300:class_attach()) Cannot create
> device MGS of type mgs : -19
> LustreError: 6035:0:(obd_mount.c:502:lustre_start_simple()) MGS attach
> error -19
> LustreError: 15e-a: Failed to start MGS 'MGS' (-19). Is the 'mgs' module
> loaded?
> LustreError: 6035:0:(obd_mount.c:1492:server_put_super()) no obd
> lustre-MDTffff
> LustreError: 6035:0:(obd_mount.c:137:server_deregister_mount())
> lustre-MDTffff not registered
> Lustre: server umount lustre-MDTffff complete
> LustreError: 6035:0:(obd_mount.c:2136:lustre_fill_super()) Unable to
> mount  (-19)
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb1, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb1, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LustreError: 6117:0:(events.c:731:ptlrpc_init_portals()) network
> initialisation failed
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
> breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb2, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb2, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LustreError: 6193:0:(events.c:731:ptlrpc_init_portals()) network
> initialisation failed
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
> breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb3, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb3, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LustreError: 6269:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB
> interface ib0: it's down
> LustreError: 6269:0:(o2iblnd.c:2501:kiblnd_startup()) Skipped 2 previous
> similar messages
> LustreError: 105-4: Error -100 starting up LNI o2ib
> LustreError: Skipped 2 previous similar messages
> LustreError: 6269:0:(events.c:731:ptlrpc_init_portals()) network
> initialisation failed
> LustreError: 158-c: Can't load module 'mgc'
> LustreError: Skipped 2 previous similar messages
> LustreError: 6263:0:(genops.c:286:class_newdev()) OBD: unknown type: mgc
> LustreError: 6263:0:(genops.c:286:class_newdev()) Skipped 2 previous
> similar messages
> LustreError: 6263:0:(obd_config.c:300:class_attach()) Cannot create
> device MGC0 at lo of type mgc : -19
> LustreError: 6263:0:(obd_config.c:300:class_attach()) Skipped 2 previous
> similar messages
> LustreError: 6263:0:(obd_mount.c:502:lustre_start_simple()) MGC0 at lo
> attach error -19
> LustreError: 6263:0:(obd_mount.c:502:lustre_start_simple()) Skipped 2
> previous similar messages
> LustreError: 6263:0:(obd_mount.c:1492:server_put_super()) no obd
> lustre-OST0002
> LustreError: 6263:0:(obd_mount.c:1492:server_put_super()) Skipped 2
> previous similar messages
> LustreError: 6263:0:(obd_mount.c:137:server_deregister_mount())
> lustre-OST0002 not registered
> LustreError: 6263:0:(obd_mount.c:137:server_deregister_mount()) Skipped
> 2 previous similar messages
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
> breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> Lustre: server umount lustre-OST0002 complete
> Lustre: Skipped 2 previous similar messages
> LustreError: 6263:0:(obd_mount.c:2136:lustre_fill_super()) Unable to
> mount  (-19)
> LustreError: 6263:0:(obd_mount.c:2136:lustre_fill_super()) Skipped 2
> previous similar messages
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb4, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> kjournald starting.  Commit interval 5 seconds
> LDISKFS-fs warning: maximal mount count reached, running e2fsck is
> recommended
> LDISKFS FS on sdb4, internal journal
> LDISKFS-fs: mounted filesystem with ordered data mode.
> LDISKFS-fs: file extents enabled
> LDISKFS-fs: mballoc enabled
> LustreError: 6345:0:(events.c:731:ptlrpc_init_portals()) network
> initialisation failed
> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
> breaks, 0 lost
> LDISKFS-fs: mballoc: 0 generated and it took 0
> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
> ------------------------------------------------------------------------------------
>
> Any ideas?
>
>
>


