[Lustre-discuss] Lustre 1.6.7 kernel panics when umounting MDTs

Marcus Schull c.schull at imb.uq.edu.au
Wed Apr 1 19:21:12 PDT 2009


Lustre Admins,


We have recently had two odd events on which I am keen to get some
feedback.

1) A filesystem went read-only (noticed on the client) due to an
"inode out of bounds" error reported in the server logs.  Please see
below for the log and fsck details.

2) In order to correct the issue (i.e. remount the device
read-write), I attempted to umount that device, intending to run a
quick fsck - the device is the filesystem's MDT - and then remount
it.  However, the umount itself caused a kernel panic.
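
For reference, the recovery sequence I was attempting looks roughly
like the following (the mount point is illustrative; the device is
the MDT backing device that appears in the fsck output below):

  # stop the MDT so its backing device can be checked
  # (/mnt/qfab-mdt is an illustrative mount point)
  umount /mnt/qfab-mdt

  # quick forced check of the MDT backing device
  e2fsck -fp /dev/mapper/mpath11p1

  # bring the target back up read-write
  mount -t lustre /dev/mapper/mpath11p1 /mnt/qfab-mdt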

I have experienced these kernel panics quite a few times in the past
when umounting MDTs and OSTs in similar situations (i.e. after one or
more targets have gone read-only), but this is the first occurrence
with the 1.6.7 kernel.
The resulting panic required a reboot and an fsck of every mounted
LUN - at present about 15 TB across a few separate filesystems.
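
After the reboot, the post-panic checks amount to something like the
following, run against each Lustre backing device before remounting
(the device glob and flags below are illustrative):

  # force-check every Lustre backing device after the crash
  # (substitute the real device list for the illustrative glob)
  for dev in /dev/mapper/mpath*p1; do
      e2fsck -fy "$dev"
  done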

The server currently hosts the MGS, MDT and OST partitions for all of
our Lustre filesystems.  Unfortunately we have still not had the
chance to separate those roles onto different servers.
The server runs 64-bit RHEL 5.2 with the latest Lustre kernel (1.6.7)
and associated packages, on a Sun blade with 4 AMD cores and 16 GB of
RAM.
[ kernel version: 2.6.18-92.1.17.el5_lustre.1.6.7smp #1 SMP Mon Feb 9  
19:56:55 MST 2009 x86_64 x86_64 x86_64 GNU/Linux ]
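
(For what it's worth, the combined layout is easy to confirm on the
server itself; the following lists every target - MGS, MDT and OSTs -
configured on this one node:)

  # list all Lustre devices configured on this node
  lctl dl
  # equivalently:
  cat /proc/fs/lustre/devices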

All clients are also running RHEL 5.2/5.3 with unpatched kernels and
the latest Lustre packages (1.6.7).
[ client kernel: 2.6.18-92.1.17.el5 #1 SMP Wed Oct 22 04:19:38 EDT  
2008 x86_64 x86_64 x86_64 GNU/Linux ]


I look forward to any advice regarding the unusual error in (1), and
to any procedures we should follow to avoid the kernel panics in (2).


Thanks in advance.

Marcus Schull,
Systems Administrator
IMB, University of Queensland






Lustre server /var/log/messages:

Apr  1 19:03:04 lustre1 kernel: LDISKFS-fs error (device dm-37): ldiskfs_add_entry: bad entry in directory #12686675: inode out of bounds - offset=0, inode=2807365078, rec_len=4096, name_len=36
Apr  1 19:03:04 lustre1 kernel: Remounting filesystem read-only
Apr  1 19:03:04 lustre1 kernel: LDISKFS-fs error (device dm-37) in start_transaction: Readonly filesystem
Apr  1 19:03:04 lustre1 kernel: LustreError: 8059:0:(fsfilt-ldiskfs.c:1231:fsfilt_ldiskfs_write_record()) can't start transaction for 18 blocks (128 bytes)
Apr  1 19:03:04 lustre1 kernel: LustreError: 8059:0:(mds_reint.c:230:mds_finish_transno()) @@@ wrote trans #0 rc -5 client 5142a23c-5adc-d4e8-4375-1b1028fe8e7d at idx 0: err = -30  req at ffff8101771ba000 x28445861/t0 o101->5142a23c-5adc-d4e8-4375-1b1028fe8e7d at NET_0x200008266732c_UUID:0/0 lens 512/568 e 0 to 0 dl 1238576684 ref 1 fl Interpret:/0/0 rc 0/0
Apr  1 19:03:04 lustre1 kernel: LustreError: 8059:0:(mds_reint.c:238:mds_finish_transno()) wrote objids: err = 0
Apr  1 19:03:04 lustre1 kernel: LustreError: 11330:0:(fsfilt-ldiskfs.c:280:fsfilt_ldiskfs_start()) error starting handle for op 8 (33 credits): rc -30
Apr  1 19:03:04 lustre1 kernel: LustreError: 11330:0:(fsfilt-ldiskfs.c:280:fsfilt_ldiskfs_start()) error starting handle for op 8 (33 credits): rc -30
Apr  1 19:03:04 lustre1 kernel: LustreError: 11330:0:(mds_reint.c:154:mds_finish_transno()) fsfilt_start: -30
Apr  1 19:03:04 lustre1 kernel: LustreError: 11337:0:(mds_reint.c:154:mds_finish_transno()) fsfilt_start: -30
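
As far as I can tell, the "inode out of bounds" message indicates
that the directory entry references an inode number (2807365078)
larger than the filesystem's inode count, which can be sanity-checked
against the superblock, e.g.:

  # compare the reported inode number against the MDT's inode count
  dumpe2fs -h /dev/mapper/mpath11p1 | grep 'Inode count'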



fsck output for the filesystem with the error reported above:

[root at lustre1 ~]# e2fsck /dev/mapper/mpath11p1
e2fsck 1.40.11.sun1 (17-June-2008)
qfab-MDT0000: recovering journal
qfab-MDT0000 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Inode 12686675, i_size is 28672, should be 65536.  Fix<y>? yes

Pass 2: Checking directory structure
Problem in HTREE directory inode 12686675: node (0) has an unordered  
hash table
Clear HTree index<y>? yes

Entry '(=fM-'^\^@ ^@M-~^JM-XM-'p^H$^@:{M-XM-'^H^G$^@TM-O^IM-(x^L(^@t^W 
$M-(' in /ROOT/data1/offindex (12686675) has invalid inode #:  
2807365078.
Clear<y>? yes

Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Unattached inode 12681396
Connect to /lost+found<y>? yes

Inode 12681396 ref count is 2, should be 1.  Fix<y>? yes

Unattached inode 12681397
Connect to /lost+found<y>? yes

... etc
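
(For reference, a forced read-only re-check of the device - one that
attempts no repairs - can be run like this before the target is
remounted:)

  # -f forces a full check, -n answers 'no' to every repair prompt
  e2fsck -fn /dev/mapper/mpath11p1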




-------

Out of interest, we are also getting occasional "soft CPU lockups",
although these don't appear to be causing problems at present...

Mar 31 08:57:36 lustre1 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [ll_mdt_rdpg_11:11332]
Mar 31 08:57:36 lustre1 kernel: CPU 0:
Mar 31 08:57:36 lustre1 kernel: Modules linked in: obdfilter(U) ost(U) mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ip_conntrack_netbios_ns(U) ipt_REJECT(U) xt_tcpudp(U) xt_state(U) ip_conntrack(U) nfnetlink(U) iptable_filter(U) ip_tables(U) x_tables(U) cpufreq_ondemand(U) dm_round_robin(U) dm_multipath(U) video(U) sbs(U) backlight(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sr_mod(U) joydev(U) cdrom(U) pata_acpi(U) libata(U) i2c_nforce2(U) k8temp(U) shpchp(U) pcspkr(U) i2c_core(U) hwmon(U) k8_edac(U) sg(U) forcedeth(U) edac_mc(U) serio_raw(U) e1000e(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_mod(U) usb_storage(U) qla2xxx(U) scsi_transport_fc(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Mar 31 08:57:36 lustre1 kernel: Pid: 11332, comm: ll_mdt_rdpg_11 Tainted: G      2.6.18-92.1.17.el5_lustre.1.6.7smp #1
Mar 31 08:57:36 lustre1 kernel: RIP: 0010:[<ffffffff800649e6>]  [<ffffffff800649e6>] _write_lock+0x7/0xf
Mar 31 08:57:36 lustre1 kernel: RSP: 0018:ffff81024f0e3928  EFLAGS: 00000246
Mar 31 08:57:36 lustre1 kernel: RAX: ffff81005ab05600 RBX: 0000000000002b10 RCX: 00000000000053b7
Mar 31 08:57:36 lustre1 kernel: RDX: fffffffffffff5e8 RSI: ffffffff802f0d80 RDI: ffffc20000d1810c
Mar 31 08:57:36 lustre1 kernel: RBP: 0000000000000286 R08: ffff810001006360 R09: ffff8100610996c0
Mar 31 08:57:36 lustre1 kernel: R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: 0000000000000286
Mar 31 08:57:36 lustre1 kernel: R13: ffff81024f0e38d0 R14: ffff810061099718 R15: ffff8100610996c0
Mar 31 08:57:36 lustre1 kernel: FS:  00002b6a51d6d220(0000) GS:ffffffff803ea000(0000) knlGS:00000000f7f326d0
Mar 31 08:57:36 lustre1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 31 08:57:36 lustre1 kernel: CR2: 00002b5047841000 CR3: 0000000000201000 CR4: 00000000000006e0
Mar 31 08:57:36 lustre1 kernel:
Mar 31 08:57:36 lustre1 kernel: Call Trace:
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff885893fe>] :obdclass:lustre_hash_for_each_empty+0x20e/0x290
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8858fbf8>] :obdclass:class_disconnect+0x378/0x400
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff886055b0>] :ptlrpc:ldlm_cancel_locks_for_export_cb+0x0/0xc0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff888fc05b>] :mds:mds_disconnect+0x11b/0x310
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8003d514>] lock_timer_base+0x1b/0x3c
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8004a824>] try_to_del_timer_sync+0x51/0x5a
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8858bce4>] :obdclass:class_fail_export+0x384/0x4c0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff888f4751>] :mds:mds_readpage+0x1571/0x18a0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8853a5f0>] :lnet:LNetMDBind+0x2c0/0x420
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8863ef33>] :ptlrpc:ptl_send_buf+0x3f3/0x5b0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff88645004>] :ptlrpc:lustre_msg_set_timeout+0x34/0x110
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8008abb9>] default_wake_function+0x0/0xe
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff888f870b>] :mds:mds_handle+0x216b/0x4d30
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff800d3fc7>] free_block+0x11c/0x13b
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff80143809>] __next_cpu+0x19/0x28
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff80143809>] __next_cpu+0x19/0x28
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff800898e3>] find_busiest_group+0x20d/0x621
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff886435a5>] :ptlrpc:lustre_msg_get_conn_cnt+0x35/0xf0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8864bcfa>] :ptlrpc:ptlrpc_server_request_get+0x6a/0x150
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8864db7d>] :ptlrpc:ptlrpc_check_req+0x1d/0x110
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff88650103>] :ptlrpc:ptlrpc_server_handle_request+0xa93/0x1150
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff80062f4b>] thread_return+0x0/0xdf
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8006d8a2>] do_gettimeofday+0x40/0x8f
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff884f37c6>] :libcfs:lcw_update_time+0x16/0x100
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff800891f6>] __wake_up_common+0x3e/0x68
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff886535f8>] :ptlrpc:ptlrpc_main+0x1218/0x13e0
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8008abb9>] default_wake_function+0x0/0xe
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Mar 31 08:57:36 lustre1 kernel:  [<ffffffff886523e0>] :ptlrpc:ptlrpc_main+0x0/0x13e0
Mar 31 08:57:37 lustre1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
Mar 31 08:57:37 lustre1 kernel:
Mar 31 08:57:46 lustre1 kernel: BUG: soft lockup - CPU#0 stuck for 10s! [ll_mdt_rdpg_11:11332]
Mar 31 08:57:46 lustre1 kernel: CPU 0:




