[Lustre-discuss] ras_stride_increase_window() ASSERTION failed
Tom.Wang
Tom.Wang at sun.com
Mon Apr 5 12:54:51 PDT 2010
Hello,
You need the patch attached to bug 17197:
https://bugzilla.lustre.org/attachment.cgi?id=28672
and probably also the patch in bug 22385:
https://bugzilla.lustre.org/show_bug.cgi?id=22385
Thanks
WangDi
Christopher J. Walker wrote:
> I see the following error in the logs on some of my lustre clients:
>
> Mar 29 20:58:43 cn507 kernel: LustreError:
> 18750:0:(rw.c:1948:ras_stride_increase_window())
> ASSERTION(ras->ras_window_start + ras->ras_window_len >=
> ras->ras_stride_offset) failed:
> window_start 1792, window_len 0 stride_offset 2017
>
> Several processes seem to be blocking on this machine in state DN.
>
> Is this a known issue? I've looked in bugzilla and not found anything
> obvious (but this is the first time I've looked in your bugzilla).
> I've found
> http://www.nersc.gov/hypermail/nersc-io/att-0612/summary.pdf and had a
> quick flick through, but it refers to mpi-io, which we are not doing,
> and a 1.6 kernel, whereas we are running 1.8.
>
> I'm running 1.8.2 servers (downloaded from Sun/Oracle), and 1.8.2
> clients compiled from source on a Scientific Linux 2.6.18-164.15.1.el5
> kernel.
>
> /var/log/messages says:
>
>> Mar 29 20:58:43 cn507 kernel: LustreError:
>> 18750:0:(rw.c:1948:ras_stride_increase_window())
>> ASSERTION(ras->ras_window_start + ras->ras_window_len >=
>> ras->ras_stride_offset) failed:
>> window_start 1792, window_len 0 stride_offset 2017
>> Mar 29 20:58:43 cn507 kernel: LustreError:
>> 18750:0:(rw.c:1948:ras_stride_increase_window()) LBUG
>> Mar 29 20:58:43 cn507 kernel: Pid: 18750, comm: athena.py
>> Mar 29 20:58:43 cn507 kernel: Call Trace:
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8844d6a1>]
>> libcfs_debug_dumpstack+0x51/0x60 [libcfs]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8844dbda>]
>> lbug_with_loc+0x7a/0xd0 [libcfs]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8878d63f>]
>> ll_readpage+0x129f/0x1e40 [lustre]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000c707>]
>> add_to_page_cache+0xaa/0xc1
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000c2f5>]
>> do_generic_mapping_read+0x208/0x354
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000d0e0>]
>> file_read_actor+0x0/0x159
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000c58d>]
>> __generic_file_aio_read+0x14c/0x198
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff800c5d8f>]
>> generic_file_readv+0x8f/0xa8
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff800a0307>]
>> autoremove_wake_function+0x0/0x2e
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8879a427>]
>> our_vma+0x117/0x1d0 [lustre]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000b984>]
>> touch_atime+0x67/0xaa
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8875f65b>]
>> ll_file_readv+0x1e4b/0x2130 [lustre]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8875f95a>]
>> ll_file_read+0x1a/0x20 [lustre]
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8000b695>] vfs_read+0xcb/0x171
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff80011b60>] sys_read+0x45/0x6e
>> Mar 29 20:58:43 cn507 kernel: [<ffffffff8006149d>]
>> sysenter_do_call+0x1e/0x76
>> Mar 29 20:58:43 cn507 kernel:
>> LustreError: dumping log to /tmp/lustre-log.1269892723.18750
>
> Thanks,
>
> Chris
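
For readers following the numbers in the log above, here is a minimal sketch of the comparison that the ASSERTION message prints, evaluated with the three values it reports. It is not the Lustre source: the class `RasSketch` and the function `window_reaches_stride` are hypothetical names used only for illustration, and only the three fields shown in the log are modelled.

```python
from dataclasses import dataclass


@dataclass
class RasSketch:
    """Hypothetical stand-in for the three fields printed in the log message."""
    window_start: int
    window_len: int
    stride_offset: int


def window_reaches_stride(ras: RasSketch) -> bool:
    # Same comparison as the ASSERTION text in the log:
    # window_start + window_len >= stride_offset
    return ras.window_start + ras.window_len >= ras.stride_offset


# Values copied from the log line:
# "window_start 1792, window_len 0 stride_offset 2017"
ras = RasSketch(window_start=1792, window_len=0, stride_offset=2017)
window_end = ras.window_start + ras.window_len

# Prints the window end, whether the condition holds, and the shortfall.
print(window_end, window_reaches_stride(ras), ras.stride_offset - window_end)
```

With a zero-length window the sum is just window_start, so the condition cannot hold whenever the stride offset lies beyond the window start, which is the situation the log reports.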