[Lustre-discuss] configuring lustre network get kernel panic

Sébastien Buisson sebastien.buisson at bull.net
Wed Apr 23 06:19:50 PDT 2008


Hi Christian,

We had the same problem a few months ago. See bugzilla 14988 for all the 
details, and comment #21 
(https://bugzilla.lustre.org/show_bug.cgi?id=14988#c21) for the solution.
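
As a quick check on your side: the 'no version for "ib_fmr_pool_unmap"' line in your log usually suggests that ko2iblnd was built without a CRC entry for that symbol, for instance because the OFED Module.symvers was not picked up at build time. Below is a minimal sketch for looking a symbol up in a Module.symvers file. The function name symbol_in_symvers is hypothetical, and I am assuming each line of the file starts with "<crc> <symbol> <module>":

```shell
# Hypothetical helper: print the CRC and owning module of a symbol listed
# in a Module.symvers file; return non-zero if the symbol is not listed.
# Assumes each line starts with: <crc> <symbol> <module>
symbol_in_symvers() {
    sym="$1"
    symvers="$2"
    awk -v s="$sym" '$2 == s { found = 1; print $1, $3 } END { exit !found }' "$symvers"
}

# Example call. The path is an assumption based on the --with-o2ib
# directory quoted below; adjust it to wherever your OFED build left
# its Module.symvers.
# symbol_in_symvers ib_fmr_pool_unmap /usr/src/ofa-kernel-1.2.5.5/Module.symvers
```

If the symbol is missing from the file the Lustre build used, or is listed with a different CRC than the one in the OFED tree, that would be consistent with the warning, though it does not prove it is the cause of the panic.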

Regards,
Sebastien.


Christian Gajan wrote:
> Hi,
> 
> I am trying to configure Lustre 1.6.4.3 + OFED 1.2.5.5 + RHEL5u1 (2.6.18-53.1.13).
> 
> The compilation and installation steps all complete without problems:
> 
> - build kernel 2.6.18-53.1.13 + lustre patch OK
> - boot with new kernel
> - build OFED 1.2.5.5 with new kernel
> - install OFED
> - boot again with ofed drivers
> - build the Lustre 1.6.4.3 RPMs with --with-o2ib=/usr/src/ofa-kernel-1.2.5.5
>   and --with-linux=/usr/src/linux-2.6.18-53.1.13.el5.lustre-1.6.4.3
> - install the Lustre RPMs in order (1. lustre-ldiskfs, 2. lustre-module,
>   3. lustre) without any warning
> 
> When I start to configure the Lustre network, I get a kernel panic:
> 
> # ifconfig ib1
> ib1       Link encap:InfiniBand  HWaddr 80:00:04:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>         inet addr:192.168.1.16  Bcast:192.168.1.255  Mask:255.255.255.0
>         inet6 addr: fe80::203:ba00:100:5142/64 Scope:Link
>         UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
>         RX packets:6 errors:0 dropped:0 overruns:0 frame:0
>         TX packets:42 errors:0 dropped:2 overruns:0 carrier:0
>         collisions:0 txqueuelen:128
>         RX bytes:336 (336.0 b)  TX bytes:8186 (7.9 KiB)
> # cat /etc/modprobe.conf
> ...
> options lnet ip2nets="o2ib0(ib1) 192.168.1.[16-19]"
> # modprobe lnet
> # lctl network configure
> ko2iblnd: no version for "ib_fmr_pool_unmap" found: kernel tainted.
> general protection fault: 0000 [1] SMP
> last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
> CPU 1
> Modules linked in: ko2iblnd(U) lnet(U) libcfs(U) nfs(U) lockd(U) 
> fscache(U) nfs_acl(U) autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) 
> sunrpc(U) rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) 
> ib_uverbs(U) ib_umad(U) ib_ipath(U) mlx4_ib(U) mlx4_core(U) ib_ipoib(U) 
> ib_cm(U) ib_sa(U) ipv6(U) dm_mirror(U) dm_multipath(U) dm_mod(U) 
> video(U) sbs(U) backlight(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) 
> acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) qla2xxx(U) 
> shpchp(U) scsi_transport_fc(U) ide_cd(U) cdrom(U) forcedeth(U) 
> i2c_nforce2(U) i2c_core(U) k8temp(U) hwmon(U) tg3(U) k8_edac(U) 
> edac_mc(U) serio_raw(U) ib_mthca(U) ib_mad(U) ib_core(U) pcspkr(U) sg(U) 
> sata_nv(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) ehci_hcd(U) 
> ohci_hcd(U) uhci_hcd(U)
> Pid: 11797, comm: lctl Tainted: GF     2.6.18-53.1.13.el5_lustre.1.6.4.3.v2 #1
> RIP: 0010:[<ffffffff88703f1a>]  [<ffffffff88703f1a>] :ko2iblnd:kiblnd_map_tx_descs+0xea/0x180
> RSP: 0000:ffff81022ba85808  EFLAGS: 00010282
> RAX: ffffffff881446df RBX: ffffc20000071000 RCX: 0000000000000001
> RDX: 0000000000001000 RSI: ffff81022be20000 RDI: ffff81023fd95000
> RBP: ffff81023b9fc640 R08: ffff81022ba84000 R09: 000000000000003f
> R10: ffff810107f60008 R11: 0000000000000100 R12: 0000000000000001
> R13: 0000000000000001 R14: 0000000000000000 R15: ffff81023b9fc668
> FS:  00002aaaaaaec360(0000) GS:ffff810107e99440(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000034e7e95770 CR3: 000000043d90e000 CR4: 00000000000006e0
> Process lctl (pid: 11797, threadinfo ffff81022ba84000, task ffff81023a9247a0)
> Stack:  ffff81023b9fc340 ffff81023f9fba00 ffff81023b9fc340 ffff81023b9fc640
> ffff81023b9fc2c0 ffff81022c2c5746 ffff81022e2cdac0 ffffffff887075fa
> ffff81022ba85858 ffffffff886c7e0a 0000000000000000 ffffffff886f9018
> Call Trace:
> [<ffffffff887075fa>] :ko2iblnd:kiblnd_startup+0x9fa/0xb10
> [<ffffffff886c7e0a>] :lnet:lnet_trimwhite+0x2a/0x60
> [<ffffffff886c5738>] :lnet:lnet_startup_lndnis+0x128/0x630
> [<ffffffff88692fe8>] :libcfs:cfs_alloc+0x28/0x60
> [<ffffffff886c635e>] :lnet:LNetNIInit+0xfe/0x1e9
> [<ffffffff8009d458>] ktime_get_ts+0x1a/0x4e
> [<ffffffff800625bf>] __down_read+0x12/0x92
> [<ffffffff800c360f>] zone_statistics+0x3e/0x6d
> [<ffffffff886d40b3>] :lnet:lnet_configure+0x33/0x60
> [<ffffffff88698790>] :libcfs:libcfs_ioctl+0x490/0x550
> [<ffffffff800c360f>] zone_statistics+0x3e/0x6d
> [<ffffffff8000afce>] __find_get_block+0x15c/0x16c
> [<ffffffff8011cf2e>] selinux_ipc_permission+0x0/0x2f
> [<ffffffff80128931>] constraint_expr_eval+0x298/0x45d
> [<ffffffff80128931>] constraint_expr_eval+0x298/0x45d
> [<ffffffff8012546c>] avtab_search_node+0x38/0x6a
> [<ffffffff80128d3f>] context_struct_compute_av+0x249/0x2ba
> [<ffffffff8011b7dc>] avc_alloc_node+0x3a/0x187
> [<ffffffff8011bb31>] avc_has_perm_noaudit+0x208/0x36b
> [<ffffffff8011c85f>] avc_has_perm+0x43/0x55
> [<ffffffff8011c85f>] avc_has_perm+0x43/0x55
> [<ffffffff8011d396>] inode_has_perm+0x56/0x63
> [<ffffffff88695b42>] :libcfs:libcfs_ioctl+0x142/0x170
> [<ffffffff8011d437>] file_has_perm+0x94/0xa3
> [<ffffffff80062bfd>] lock_kernel+0x1b/0x32
> [<ffffffff8003fc46>] do_ioctl+0x55/0x6b
> [<ffffffff8002fc81>] vfs_ioctl+0x248/0x261
> [<ffffffff8004a230>] sys_ioctl+0x59/0x78
> [<ffffffff8005b28d>] tracesys+0xd5/0xe0
> 
> 
> Code: ff 50 08 eb 18 90 48 8b 05 d9 16 d3 f7 b9 01 00 00 00 ba 00
> RIP  [<ffffffff88703f1a>] :ko2iblnd:kiblnd_map_tx_descs+0xea/0x180
> RSP <ffff81022ba85808>
> <0>Kernel panic - not syncing: Fatal exception
> 
> 
> Is there a mistake in my procedure, or is this a known issue?
> 
> Regards,
> 
> Christian
> 


