[Lustre-discuss] 1.8.1.1

Papp Tamás tompos at martos.bme.hu
Tue Dec 1 13:51:31 PST 2009


Andreas Dilger wrote, On 2009. 11. 28. 0:22:
> On 2009-11-27, at 03:13, Papp Tamás wrote:
>> Craig Prescott wrote, On 2009. 11. 19. 20:42:
>>> We had the same problem with 1.8.x.x.
>>>
>>> We set lnet.printk=0 on our OSS nodes and it has helped us 
>>> dramatically - we have not seen the 'soft lockup' problem since.
>>>
>>> sysctl -w lnet.printk=0
>>>
>>> This will turn off all but 'emerg' messages from lnet.
>>>
>>> It would be interesting to know if this avoided the lockups for you, 
>>> too.
>>
>> Well, this definitely helped, but it didn't resolve the root of the 
>> problem.
>>
>> A few minutes ago we were not able to reach our cluster from clients.
>>
>> On MDS I see a lot of this:
>>
>>
>> Nov 27 10:52:17 meta1 kernel: BUG: soft lockup - CPU#3 stuck for 10s! 
>> [ll_evictor:6123]
>> Nov 27 10:52:18 meta1 kernel: Call Trace:
>> :obdclass:lustre_hash_for_each_empty+0x237/0x2b0
>> :obdclass:class_disconnect+0x398/0x420
>> :mds:mds_disconnect+0x121/0xe40
>> :obdclass:class_fail_export+0x384/0x4c0
>> :ptlrpc:ping_evictor_main+0x4f8/0x7e0
>> default_wake_function+0x0/0xe
>> :ptlrpc:ping_evictor_main+0x0/0x7e0
>
> This looks like the server evicting a client that has a lot of locks.
> One thing to try is in lustre_hash_for_each_empty() add a call to 
> cond_resched(), since it seems this function could run a long time if
> func() doesn't ever cause a reschedule:
>
> lustre_hash_for_each_empty(lustre_hash_t *lh, lh_for_each_cb func,
> {
>                         read_unlock(&lh->lh_rwlock);
>                         func(obj, data);
>                         (void)lh_put(lh, hnode);
> +                       cond_resched();
>                         goto restart;
>
> I'm not sure this is the root cause, but you could check the DLM lock 
> stats in /proc/fs/lustre/ldlm/namespaces/*/lock_count on some clients, 
> to see how many locks they are holding, or the same on the MDS, which 
> will be the total number of locks currently granted to all clients.
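
To check the lock counts you mention, I will run something like this on a few 
clients and on the MDS (it just prints every lock_count entry above with its 
current value):

grep . /proc/fs/lustre/ldlm/namespaces/*/lock_count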

Over the weekend we upgraded the firmware on the OSSes' RAID controllers 
(SmartArray P410) and the messages are gone. Of course I can't be sure it 
was because of the upgrade, but we are hopeful and still watching it.

On the MDS the messages are rarer; in two days there were only two of them:


Dec  1 09:19:18 meta1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) mptctl(U) mptbase(U) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) lockd(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pcspkr(U) shpchp(U) igb(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) usb_storage(U) ata_piix(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) raid1(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Dec  1 09:19:18 meta1 kernel: Pid: 4470, comm: ll_evictor Tainted: G      2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Dec  1 09:19:18 meta1 kernel: RIP: 0010:[<ffffffff80064aee>]  [<ffffffff80064aee>] _write_lock+0x7/0xf
Dec  1 09:19:18 meta1 kernel: RSP: 0018:ffff81022e503c78  EFLAGS: 00000246
Dec  1 09:19:18 meta1 kernel: RAX: 0000000000007fff RBX: 00000000000013d6 RCX: 0000000000006f89
Dec  1 09:19:18 meta1 kernel: RDX: 0000000000000216 RSI: ffffffff802fae80 RDI: ffffc2001085cd6c
Dec  1 09:19:18 meta1 kernel: RBP: 0000000000000286 R08: ffff81000101f3e0 R09: 5a5a5a5a5a5a5a5a
Dec  1 09:19:18 meta1 kernel: R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: 0000000000000286
Dec  1 09:19:18 meta1 kernel: R13: ffff81022e503c20 R14: ffff8101cea7f418 R15: ffff8101cea7f3c0
Dec  1 09:19:18 meta1 kernel: FS:  00002ae6ba0b1230(0000) GS:ffff810107f61540(0000) knlGS:0000000000000000
Dec  1 09:19:18 meta1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec  1 09:19:18 meta1 kernel: CR2: 00002aaaae52d000 CR3: 00000002350d0000 CR4: 00000000000006e0
Dec  1 09:19:18 meta1 kernel:
Dec  1 09:19:18 meta1 kernel: Call Trace:
Dec  1 09:19:18 meta1 kernel:  [<ffffffff884ccba7>] :obdclass:lustre_hash_for_each_empty+0x237/0x2b0
Dec  1 09:19:19 meta1 kernel:  [<ffffffff884d3ae8>] :obdclass:class_disconnect+0x398/0x420
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8886f5e1>] :mds:mds_disconnect+0x121/0xe40
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8014b87a>] snprintf+0x44/0x4c
Dec  1 09:19:19 meta1 kernel:  [<ffffffff884cf994>] :obdclass:class_fail_export+0x384/0x4c0
Dec  1 09:19:19 meta1 kernel:  [<ffffffff885a0238>] :ptlrpc:ping_evictor_main+0x4f8/0x7e0
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8008a3f3>] default_wake_function+0x0/0xe
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8859fd40>] :ptlrpc:ping_evictor_main+0x0/0x7e0
Dec  1 09:19:19 meta1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11


Dec  1 10:08:38 meta1 kernel: BUG: soft lockup - CPU#1 stuck for 10s! [ldlm_elt:4466]
Dec  1 10:08:38 meta1 kernel: CPU 1:
Dec  1 10:08:38 meta1 kernel: Modules linked in: mds(U) fsfilt_ldiskfs(U) mgs(U) mgc(U) ldiskfs(U) crc16(U) lustre(U) lov(U) mdc(U) lquota(U) osc(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) mptctl(U) mptbase(U) ipmi_devintf(U) ipmi_si(U) ipmi_msghandler(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) autofs4(U) lockd(U) sunrpc(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) i2c_core(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) pcspkr(U) shpchp(U) igb(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) usb_storage(U) ata_piix(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) raid1(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Dec  1 10:08:38 meta1 kernel: Pid: 4466, comm: ldlm_elt Tainted: G      2.6.18-128.7.1.el5_lustre.1.8.1.1 #1
Dec  1 10:08:38 meta1 kernel: RIP: 0010:[<ffffffff80064aee>]  [<ffffffff80064aee>] _write_lock+0x7/0xf
Dec  1 10:08:38 meta1 kernel: RSP: 0018:ffff81022efa3ca8  EFLAGS: 00000246
Dec  1 10:08:38 meta1 kernel: RAX: 0000000000007fff RBX: 0000000000004cc7 RCX: 0000000000005e96
Dec  1 10:08:38 meta1 kernel: RDX: ffffffffffffa6d8 RSI: ffffffff802fae80 RDI: ffffc20010611c7c
Dec  1 10:08:38 meta1 kernel: RBP: 0000000000000282 R08: ffff81000100e8e0 R09: 5a5a5a5a5a5a5a5a
Dec  1 10:08:38 meta1 kernel: R10: 5a5a5a5a5a5a5a5a R11: 5a5a5a5a5a5a5a5a R12: 0000000000000282
Dec  1 10:08:38 meta1 kernel: R13: ffff81022efa3c50 R14: ffff8101b9d2eb98 R15: ffff8101b9d2eb40
Dec  1 10:08:38 meta1 kernel: FS:  00002ae6ba0b1230(0000) GS:ffff810107ed96c0(0000) knlGS:0000000000000000
Dec  1 10:08:38 meta1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec  1 10:08:38 meta1 kernel: CR2: 00000031d329a3a0 CR3: 0000000237251000 CR4: 00000000000006e0
Dec  1 10:08:38 meta1 kernel:
Dec  1 10:08:38 meta1 kernel: Call Trace:
Dec  1 10:08:38 meta1 kernel:  [<ffffffff884ccba7>] :obdclass:lustre_hash_for_each_empty+0x237/0x2b0
Dec  1 10:08:38 meta1 kernel:  [<ffffffff884d3ae8>] :obdclass:class_disconnect+0x398/0x420
Dec  1 10:08:38 meta1 kernel:  [<ffffffff8886f5e1>] :mds:mds_disconnect+0x121/0xe40
Dec  1 10:08:38 meta1 kernel:  [<ffffffff80089d8d>] enqueue_task+0x41/0x56
Dec  1 10:08:38 meta1 kernel:  [<ffffffff80089bdd>] dequeue_task+0x18/0x37
Dec  1 10:08:38 meta1 kernel:  [<ffffffff80063098>] thread_return+0x62/0xfe
Dec  1 10:08:38 meta1 kernel:  [<ffffffff884cf994>] :obdclass:class_fail_export+0x384/0x4c0
Dec  1 10:08:39 meta1 kernel:  [<ffffffff8003b59a>] remove_wait_queue+0x1c/0x2c
Dec  1 10:08:39 meta1 kernel:  [<ffffffff8857224f>] :ptlrpc:expired_lock_main+0x68f/0x8b0
Dec  1 10:08:39 meta1 kernel:  [<ffffffff8008a3f3>] default_wake_function+0x0/0xe
Dec  1 10:08:39 meta1 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
Dec  1 10:08:39 meta1 kernel:  [<ffffffff88571bc0>] :ptlrpc:expired_lock_main+0x0/0x8b0
Dec  1 10:08:39 meta1 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11

I'm afraid it's not the same problem. The MDT is not on a SmartArray; it's 
on an md RAID1 array, I don't know if that matters.
I think these happened around the time users switched on their workstations.

Anyway, the problem with the cluster has not been reported again.

I'm sorry, it's been a bit long...

Do you still think I should recompile the kernel with the change above?
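
Just to make sure I understand it, the inner loop of 
lustre_hash_for_each_empty() would then look roughly like this (only a sketch 
based on your snippet; the surrounding code may differ in the 1.8.1.1 source):

                        /* drop the hash lock before running the callback,
                         * then yield the CPU before restarting the scan,
                         * so the soft-lockup watchdog does not fire while
                         * evicting an export that holds many locks */
                        read_unlock(&lh->lh_rwlock);
                        func(obj, data);
                        (void)lh_put(lh, hnode);
                        cond_resched();
                        goto restart;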

Thank you,

tamas




