[Lustre-discuss] Lustre v2.1 RHEL 6.1 build does not work

Oleg Drokin green at whamcloud.com
Fri Jun 24 11:43:19 PDT 2011


Hello!

On Jun 23, 2011, at 9:51 PM, Jon Zhu wrote:

> I still got some crashes when running further I/O tests with the build. Here are some system messages containing call stack info that may be useful to you for finding the bug:

> Jun 23 21:46:12 ip-10-112-59-173 kernel: ------------[ cut here ]------------
> Jun 23 21:46:12 ip-10-112-59-173 kernel: WARNING: at kernel/sched.c:7087 __cond_resched_lock+0x8e/0xb0() (Not tainted)
> Jun 23 21:46:12 ip-10-112-59-173 kernel: Modules linked in: lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ldiskfs(U) sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt autofs4 ipv6 microcode xen_netfront ext4 mbcache jbd2 xen_blkfront dm_mod [last unloaded: scsi_wait_scan]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: Pid: 1421, comm: mount.lustre Not tainted 2.6.32.lustre21 #6
> Jun 23 21:46:12 ip-10-112-59-173 kernel: Call Trace:
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81069c37>] ? warn_slowpath_common+0x87/0xc0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81007671>] ? __raw_callee_save_xen_save_fl+0x11/0x1e
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81069c8a>] ? warn_slowpath_null+0x1a/0x20
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff810654fe>] ? __cond_resched_lock+0x8e/0xb0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811a53b7>] ? shrink_dcache_for_umount_subtree+0x187/0x340
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811a55a6>] ? shrink_dcache_for_umount+0x36/0x60
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118f4ff>] ? generic_shutdown_super+0x1f/0xe0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118f5f1>] ? kill_block_super+0x31/0x50
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811906b5>] ? deactivate_super+0x85/0xa0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811ac5af>] ? mntput_no_expire+0xbf/0x110
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0273f8e>] ? unlock_mntput+0x3e/0x60 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0277a98>] ? server_kernel_mount+0x268/0xe80 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280d40>] ? lustre_fill_super+0x0/0x1290 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0279070>] ? lustre_init_lsi+0xd0/0x5b0 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff810ac71d>] ? lock_release+0xed/0x220
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280fd0>] ? lustre_fill_super+0x290/0x1290 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118ee20>] ? set_anon_super+0x0/0x110
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0280d40>] ? lustre_fill_super+0x0/0x1290 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8119035f>] ? get_sb_nodev+0x5f/0xa0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffffa0272885>] ? lustre_get_sb+0x25/0x30 [obdclass]
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8118ffbb>] ? vfs_kern_mount+0x7b/0x1b0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff81190162>] ? do_kern_mount+0x52/0x130
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811ae647>] ? do_mount+0x2e7/0x870
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff811aec60>] ? sys_mount+0x90/0xe0
> Jun 23 21:46:12 ip-10-112-59-173 kernel: [<ffffffff8100b132>] ? system_call_fastpath+0x16/0x1b
> Jun 23 21:46:12 ip-10-112-59-173 kernel: ---[ end trace a8fb737c71bfba13 ]---

This is not a crash. It is just a warning, I guess about scheduling in an inappropriate context (note the "WARNING: at kernel/sched.c:7087" line), and the kernel will continue to work.
Interestingly, I have never seen anything like that on RHEL5 Xen kernels, so perhaps it is something specific to RHEL 6.1 Xen?
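
To tell a warning like this apart from a real crash when scanning logs, something along these lines might help. It is only a rough sketch: the function name classify_kernel_log and the list of "fatal" markers are my own assumptions based on common kernel log conventions, not anything from Lustre or an exhaustive list, and some of those markers (soft lockup reports, for example) are not always fatal either.

```python
import re

# Assumed markers, based on common Linux kernel log conventions.
# WARNING_RE captures source file, line number, and function name.
WARNING_RE = re.compile(r"kernel: WARNING: at (\S+):(\d+) (\S+?)\+0x")
# Lines that typically (not always) indicate a real crash.
FATAL_RE = re.compile(r"kernel: (BUG: |kernel BUG at |Kernel panic|Oops)")


def classify_kernel_log(lines):
    """Return (status, details) where status is 'fatal', 'warning' or 'clean'."""
    warnings = []
    for line in lines:
        if FATAL_RE.search(line):
            # Stop at the first fatal-looking line and report it as-is.
            return "fatal", [line.strip()]
        match = WARNING_RE.search(line)
        if match:
            warnings.append((match.group(1), int(match.group(2)), match.group(3)))
    if warnings:
        return "warning", warnings
    return "clean", []


sample = [
    "Jun 23 21:46:12 ip-10-112-59-173 kernel: ------------[ cut here ]------------",
    "Jun 23 21:46:12 ip-10-112-59-173 kernel: WARNING: at kernel/sched.c:7087 "
    "__cond_resched_lock+0x8e/0xb0() (Not tainted)",
    "Jun 23 21:46:12 ip-10-112-59-173 kernel: ---[ end trace a8fb737c71bfba13 ]---",
]

status, details = classify_kernel_log(sample)
print(status, details)
```

For the trace quoted above this reports a warning at kernel/sched.c line 7087 in __cond_resched_lock, and nothing fatal.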

Bye,
    Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.



