[Lustre-discuss] LBUG encountered in Lustre 1.8.2 - rw.c:1948:ras_stride_increase_window() ASSERTION

Ashley Nicholls ashley at maxeler.com
Mon Sep 27 09:23:59 PDT 2010


Hello all,

We have been running Lustre for almost a month with no problems. However,
about a week ago, while running our application, we encountered the following
LBUG:

Sep 14 14:08:20 max13 kernel: LustreError:
12346:0:(rw.c:1948:ras_stride_increase_window())
ASSERTION(ras->ras_window_start + ras->ras_window_len >=
ras->ras_stride_offset) failed: window_start 34816, window_len 0
stride_offset 34825
Sep 14 14:08:20 max13 kernel: LustreError:
12346:0:(rw.c:1948:ras_stride_increase_window()) LBUG
Sep 14 14:08:20 max13 kernel: Lustre:
12346:0:(linux-debug.c:264:libcfs_debug_dumpstack()) showing stack for
process 12346
Sep 14 14:08:20 max13 kernel: stepcrsgs     R  running task       0 12346
 12344         12452       (NOTLB)
Sep 14 14:08:20 max13 kernel:  0000000000000020 0000000000000001
0000000500000000 0000000000000001
Sep 14 14:08:20 max13 kernel:  0000000000000092 ffffffff80047152
3830303437313532 ffffffff801bf903
Sep 14 14:08:20 max13 kernel:  0000000500000000 0000000000000000
0000000000000011 0000000000000096
Sep 14 14:08:20 max13 kernel: Call Trace:
Sep 14 14:08:20 max13 kernel:  [<ffffffff801bf903>]
serial8250_console_putchar+0x3f/0xa5
Sep 14 14:08:20 max13 last message repeated 2 times
Sep 14 14:08:20 max13 kernel:  [<ffffffff80091d2d>] printk+0x52/0xbd
Sep 14 14:08:20 max13 kernel:  [<ffffffff80091d2d>] printk+0x52/0xbd
Sep 14 14:08:20 max13 kernel:  [<ffffffff800a74cd>]
get_symbol_offset+0x1d/0x3c
Sep 14 14:08:20 max13 kernel:  [<ffffffff800a7b2e>]
kallsyms_lookup+0xe6/0x1ae
Sep 14 14:08:20 max13 kernel:  [<ffffffff80091c8f>] vprintk+0x2cb/0x317
Sep 14 14:08:20 max13 last message repeated 3 times
Sep 14 14:08:20 max13 kernel:  [<ffffffff8006bc3b>] printk_address+0x9f/0xab
Sep 14 14:08:20 max13 kernel:  [<ffffffff80064b50>]
_spin_unlock_irqrestore+0x8/0x9
Sep 14 14:08:20 max13 kernel:  [<ffffffff800a54d2>]
module_text_address+0x33/0x3c
Sep 14 14:08:20 max13 kernel:  [<ffffffff8009e65b>]
kernel_text_address+0x1a/0x26
Sep 14 14:08:20 max13 kernel:  [<ffffffff8006b921>] dump_trace+0x206/0x22f
Sep 14 14:08:20 max13 kernel:  [<ffffffff8006b97e>] show_trace+0x34/0x47
Sep 14 14:08:20 max13 kernel:  [<ffffffff8006ba83>] _show_stack+0xdb/0xea
Sep 14 14:08:20 max13 kernel:  [<ffffffff88740b1a>]
:libcfs:lbug_with_loc+0x7a/0xd0
Sep 14 14:08:20 max13 kernel:  [<ffffffff88a8061f>]
:lustre:ll_readpage+0x129f/0x1e40
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000c6dd>]
add_to_page_cache+0xaa/0xc1
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000c2cb>]
do_generic_mapping_read+0x208/0x354
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000d0b6>]
file_read_actor+0x0/0x159
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000c563>]
__generic_file_aio_read+0x14c/0x198
Sep 14 14:08:20 max13 kernel:  [<ffffffff800c5be8>]
generic_file_readv+0x8f/0xa8
Sep 14 14:08:20 max13 kernel:  [<ffffffff800a00be>]
autoremove_wake_function+0x0/0x2e
Sep 14 14:08:20 max13 kernel:  [<ffffffff88a8d3b7>]
:lustre:our_vma+0x117/0x1d0
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000b984>] touch_atime+0x67/0xaa
Sep 14 14:08:20 max13 kernel:  [<ffffffff88a5263b>]
:lustre:ll_file_readv+0x1e4b/0x2130
Sep 14 14:08:20 max13 kernel:  [<ffffffff80045d65>] do_sock_read+0xcf/0x110
Sep 14 14:08:20 max13 kernel:  [<ffffffff88a5293a>]
:lustre:ll_file_read+0x1a/0x20
Sep 14 14:08:20 max13 kernel:  [<ffffffff8000b695>] vfs_read+0xcb/0x171
Sep 14 14:08:20 max13 kernel:  [<ffffffff80011b35>] sys_read+0x45/0x6e
Sep 14 14:08:20 max13 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 14 14:08:20 max13 kernel:
Sep 14 14:08:20 max13 kernel: LustreError: dumping log to
/tmp/lustre-log.1284466101.12346

Due to the distributed nature of the application, it has been
difficult, if not impossible, to reproduce this.

Has anyone else experienced this? Does anyone know whether a) this has been
fixed in a newer version of Lustre, or b) how I can go about providing enough
information to document this bug?
I would prefer not to upgrade Lustre, as this version has been working very
well with our current application.

Thanks,
Ashley Nicholls

