[Lustre-discuss] BUG: using smp_processor_id() in preemptible [00000000] code: modprobe/6024

Gregory Matthews greg.matthews at diamond.ac.uk
Tue Mar 23 04:59:06 PDT 2010


If it helps, I have longer stack traces:

Mar 21 05:31:42 dec055 kernel: BUG: scheduling while atomic: 
cat/4439/0x00000002
Mar 21 05:31:42 dec055 kernel: Modules linked in:<3>BUG: scheduling 
while atomic: cat/4508/0x00000002
Mar 21 05:31:42 dec055 kernel:  mgc lustre lov mdc lquotaModules linked 
in: mgc lustre lov mdc lquota osc ksocklnd osc ksocklnd ptlrpc obdclass 
lnet ptlrpc obdclass lnet lvfs libcfs lvfs libcfs gsd gsd autofs4 nfs 
autofs4 nfs lockd lockd nfs_acl nfs_acl sunrpc iptable_filter ip_tables 
ip6table_filter sunrpc ip6_tables x_tables ipv6 microcode iptable_filter 
ip_tables ip6table_filter ip6_tables x_tables ipv6 microcode loop loop 
dm_mod dm_mod rtc_cmos rtc_cmos usb_storage iTCO_wdt rtc_core 
iTCO_vendor_support bnx2 sr_mod rtc_lib cdrom serio_raw shpchp 
usb_storage pci_hotplug button iTCO_wdt rtc_core iTCO_vendor_support 
bnx2 sr_mod rtc_lib cdrom serio_raw shpchp pci_hotplug button joydev 
dcdbas sg usbhid hid ff_memless ehci_hcd uhci_hcd usbcore sd_mod mptsas 
mptscsih mptbase joydev dcdbas sg usbhid hid ff_memless ehci_hcd 
uhci_hcd usbcore sd_mod mptsas mptscsih mptbase scsi_transport_sas edd 
ext3 mbcache jbd fan megaraid_sas ata_piix libata scsi_mod dock 
scsi_transport_sas edd ext3 mbcache jbd fan megaraid_sas ata_p
Mar 21 05:31:42 dec055 kernel: iix libata scsi_mod dock thermal 
processor thermal_sys [last unloaded: libcfs]
Mar 21 05:31:42 dec055 kernel: Pid: 4439, comm: cat Not tainted 
2.6.27.39-default #2
Mar 21 05:31:42 dec055 kernel:  thermal processor thermal_sys [last 
unloaded: libcfs]
Mar 21 05:31:42 dec055 kernel: Pid: 4508, comm: cat Not tainted 
2.6.27.39-default #2
Mar 21 05:31:42 dec055 kernel:
Mar 21 05:31:42 dec055 kernel: Call Trace:
Mar 21 05:31:42 dec055 kernel:
Mar 21 05:31:42 dec055 kernel: Call Trace:
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80422c6a>] schedule+0xf7/0x7bd
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa062e08b>] 
osc_queue_async_io+0x63b/0x10c0 [osc]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804236a9>] 
schedule_timeout+0x1e/0xad
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80423e63>] __down+0x62/0x8f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff802495ca>] down+0x27/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06a8b57>] 
lov_putref+0x37/0xf90 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8042490f>] _spin_lock+0xe/0x24
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80229776>] task_rq_lock+0x40/0x79
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8022bf33>] 
try_to_wake_up+0x188/0x19a
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06c9113>] 
lov_stripe_number+0x213/0x280 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80422c6a>] schedule+0xf7/0x7bd
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06bf0e1>] 
lov_get_info+0x151/0x2370 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04255e2>] cfs_alloc+0x52/0xb0 
[libcfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80316a65>] 
__percpu_counter_add+0x74/0x9b
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804236a9>] 
schedule_timeout+0x1e/0xad
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa07319ce>] 
llap_from_page_with_lockh+0x42e/0x2670 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80423e63>] __down+0x62/0x8f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff802495ca>] down+0x27/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06a88c0>] 
lov_getref+0x20/0x40 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06bf046>] 
lov_get_info+0xb6/0x2370 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04c1893>] oig_init+0xa3/0x2c0 
[obdclass]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8022982f>] 
set_next_entity+0x18/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa073d2e3>] 
ll_readpage+0xd63/0x1f60 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06cd3dc>] 
lov_fini_cancel_set+0x1ac/0x290 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0534e54>] 
ldlm_lock_remove_from_lru+0x44/0x100 [ptlrpc]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8026b087>] 
generic_file_aio_read+0x3c9/0x551
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa070d1fa>] 
ll_file_aio_read+0xf1a/0x2350 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424a04>] 
_spin_lock_irqsave+0x18/0x34
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0712e79>] 
ll_file_read+0xb9/0xd0 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04255e2>] cfs_alloc+0x52/0xb0 
[libcfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04c1893>] oig_init+0xa3/0x2c0 
[obdclass]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8022982f>] 
set_next_entity+0x18/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa073d2e3>] 
ll_readpage+0xd63/0x1f60 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06cd3dc>] 
lov_fini_cancel_set+0x1ac/0x290 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0534e54>] 
ldlm_lock_remove_from_lru+0x44/0x100 [ptlrpc]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8026b087>] 
generic_file_aio_read+0x3c9/0x551
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa070d1fa>] 
ll_file_aio_read+0xf1a/0x2350 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424a04>] 
_spin_lock_irqsave+0x18/0x34
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0712e79>] 
ll_file_read+0xb9/0xd0 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80246516>] 
autoremove_wake_function+0x0/0x2e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294570>] 
rw_verify_area+0x7f/0x9f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294c33>] vfs_read+0xaa/0x133
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294f18>] sys_read+0x45/0x6e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8020bf8b>] 
system_call_fastpath+0x16/0x1b
Mar 21 05:31:42 dec055 kernel:
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80246516>] 
autoremove_wake_function+0x0/0x2e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294570>] 
rw_verify_area+0x7f/0x9f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294c33>] vfs_read+0xaa/0x133
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294f00>] sys_read+0x2d/0x6e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294f18>] sys_read+0x45/0x6e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8020bf8b>] 
system_call_fastpath+0x16/0x1b
Mar 21 05:31:42 dec055 kernel:
Mar 21 05:31:42 dec055 kernel: BUG: scheduling while atomic: 
cat/4439/0x00000002
Mar 21 05:31:42 dec055 kernel: Modules linked in: mgc lustre lov mdc 
lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs gsd autofs4 nfs 
lockd nfs_acl sunrpc iptable_filter ip_tables ip6table_filter ip6_tables 
x_tables ipv6 microcode loop dm_mod rtc_cmos usb_storage iTCO_wdt 
rtc_core iTCO_vendor_support bnx2 sr_mod rtc_lib cdrom serio_raw shpchp 
pci_hotplug button joydev dcdbas sg usbhid hid ff_memless ehci_hcd 
uhci_hcd usbcore sd_mod mptsas mptscsih mptbase scsi_transport_sas edd 
ext3 mbcache jbd fan megaraid_sas ata_piix libata scsi_mod dock thermal 
processor thermal_sys [last unloaded: libcfs]
Mar 21 05:31:42 dec055 kernel: Pid: 4439, comm: cat Not tainted 
2.6.27.39-default #2
Mar 21 05:31:42 dec055 kernel:
Mar 21 05:31:42 dec055 kernel: Call Trace:
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80422c6a>] schedule+0xf7/0x7bd
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804236a9>] 
schedule_timeout+0x1e/0xad
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80423e63>] __down+0x62/0x8f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff802495ca>] down+0x27/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06a8b57>] 
lov_putref+0x37/0xf90 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80231b1b>] 
check_preempt_wakeup+0x190/0x19d
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8022bf33>] 
try_to_wake_up+0x188/0x19a
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06c9113>] 
lov_stripe_number+0x213/0x280 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06bf0e1>] 
lov_get_info+0x151/0x2370 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04255e2>] cfs_alloc+0x52/0xb0 
[libcfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04c1893>] oig_init+0xa3/0x2c0 
[obdclass]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8022982f>] 
set_next_entity+0x18/0x36
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa073d2e3>] 
ll_readpage+0xd63/0x1f60 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa06cd3dc>] 
lov_fini_cancel_set+0x1ac/0x290 [lov]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0534e54>] 
ldlm_lock_remove_from_lru+0x44/0x100 [ptlrpc]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8026b087>] 
generic_file_aio_read+0x3c9/0x551
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa070d1fa>] 
ll_file_aio_read+0xf1a/0x2350 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80424a04>] 
_spin_lock_irqsave+0x18/0x34
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04552e1>] 
lprocfs_counter_add+0xb1/0x120 [lvfs]
Mar 21 05:31:42 dec055 kernel:  [<ffffffffa0712e79>] 
ll_file_read+0xb9/0xd0 [lustre]
Mar 21 05:31:42 dec055 kernel:  [<ffffffff804233cf>] thread_return+0x9f/0xc7
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80246516>] 
autoremove_wake_function+0x0/0x2e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294570>] 
rw_verify_area+0x7f/0x9f
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294c33>] vfs_read+0xaa/0x133
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294f00>] sys_read+0x2d/0x6e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff80294f18>] sys_read+0x45/0x6e
Mar 21 05:31:42 dec055 kernel:  [<ffffffff8020bf8b>] 
system_call_fastpath+0x16/0x1b
Mar 21 05:31:42 dec055 kernel:
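
The traces above were wrapped by the mail client, so many frames are split across two lines, and the first report is interleaved with output from a second process (pid 4508). A minimal sketch of how the wrapped lines might be rejoined and the symbol/module pairs pulled out is below. The helper names (unwrap, frames) are hypothetical, and the sketch assumes every real syslog line starts with the "Mon DD HH:MM:SS host kernel:" prefix seen here. It does not try to separate the two interleaved traces.

```python
import re

# A syslog line as it appears in this post: "Mar 21 05:31:42 dec055 kernel: ..."
PREFIX_RE = re.compile(r"^[A-Z][a-z]{2} +\d+ \d\d:\d\d:\d\d \S+ kernel:")

# One stack frame: [<address>] symbol+0xoff/0xsize, optionally followed by [module]
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their syslog line."""
    out = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if PREFIX_RE.match(line) or not out:
            out.append(line)          # start of a new syslog record
        else:
            out[-1] += " " + line     # continuation of the previous record
    return out


def frames(lines):
    """Yield (symbol, module) pairs; module is None for core-kernel symbols."""
    for line in unwrap(lines):
        m = FRAME_RE.search(line)
        if m:
            yield m.group(2), m.group(3)


sample = [
    "Mar 21 05:31:42 dec055 kernel:  [<ffffffffa062e08b>] ",
    "osc_queue_async_io+0x63b/0x10c0 [osc]",
    "Mar 21 05:31:42 dec055 kernel:  [<ffffffff804236a9>] ",
    "schedule_timeout+0x1e/0xad",
    "Mar 21 05:31:42 dec055 kernel:  [<ffffffffa04255e2>] cfs_alloc+0x52/0xb0 ",
    "[libcfs]",
]

result = list(frames(sample))
for symbol, module in result:
    print(symbol, module or "(kernel)")
```

Filtering the result on a non-None module is one way to see only the Lustre frames (osc, lov, lustre, ptlrpc and so on).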



Andreas Dilger wrote:
> It definitely wouldn't have anything to do with cat itself.  
> Unfortunately, I can't see anywhere in that call stack where we are 
> scheduling while atomic.  It appears (from what I can make of the stack 
> trace) we are in osc_queue_async_io() and the only place we grab a 
> spinlock is in a very isolated piece of code.
> 
>> BUG: scheduling while atomic: cat/4439/0x00000002
>>  Call Trace:
>>   [<ffffffff80422c6a>] schedule+0xf7/0x7bd
>>   [<ffffffff80424c84>] _spin_unlock+0x10/0x2b
>>   [<ffffffffa062e08b>] osc_queue_async_io+0x63b/0x10c0 [osc]
>>   [<ffffffffa06add05>] lov_queue_async_io+0x165/0x4b0 [lov]
>>   [<ffffffffa073d2e3>] ll_readpage+0xd63/0x1f60 [lustre]
>>   [<ffffffff8026b087>] generic_file_aio_read+0x3c9/0x551
>>   [<ffffffffa070d1fa>] ll_file_aio_read+0xf1a/0x2350 [lustre]
>>   [<ffffffffa0712e79>] ll_file_read+0xb9/0xd0 [lustre]
>>   [<ffffffff80294c33>] vfs_read+0xaa/0x133
>>   [<ffffffff80294f18>] sys_read+0x45/0x6e
> 
> 
> Cheers, Andreas
> -- 
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
> 
> 
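
On the question of where the atomic context comes from: the last field of the BUG line is the task's preempt_count at the time schedule() was called, printed as comm/pid/preempt_count. The sketch below splits that value into its fields. The bit layout (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count in bits 16-27) is an assumption based on include/linux/hardirq.h in 2.6.27-era kernels and should be checked against the header of the kernel actually running. The function name decode_bug_line is hypothetical.

```python
import re

# Assumed layout from include/linux/hardirq.h (2.6.27-era); verify for your kernel.
PREEMPT_MASK = 0x000000FF   # nesting depth of preempt_disable()/spinlocks
SOFTIRQ_MASK = 0x0000FF00   # softirq nesting
HARDIRQ_MASK = 0x0FFF0000   # hardirq nesting

BUG_RE = re.compile(
    r"BUG: scheduling while atomic: (\S+)/(\d+)/0x([0-9a-fA-F]+)"
)


def decode_bug_line(line):
    """Return comm, pid and the decoded preempt_count fields, or None."""
    m = BUG_RE.search(line)
    if m is None:
        return None
    count = int(m.group(3), 16)
    return {
        "comm": m.group(1),
        "pid": int(m.group(2)),
        "preempt_depth": count & PREEMPT_MASK,
        "softirq_count": (count & SOFTIRQ_MASK) >> 8,
        "hardirq_count": (count & HARDIRQ_MASK) >> 16,
    }


info = decode_bug_line("BUG: scheduling while atomic: cat/4439/0x00000002")
print(info)
```

For the line in this thread the decoded preempt depth is 2 and the softirq and hardirq counts are both 0. Under the assumed layout that would point to preemption having been disabled twice in process context (for example two held spinlocks or preempt_disable() calls on a preemptible kernel) rather than to interrupt context, but it does not by itself say which code path took them.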


-- 
Greg Matthews            01235 778658
Senior Computer Systems Administrator
Diamond Light Source, Oxfordshire, UK


