[Lustre-discuss] ras_stride_increase_window() ASSERTION failed

Christopher J. Walker C.J.Walker at qmul.ac.uk
Mon Apr 5 12:41:50 PDT 2010


I see the following error in the logs on some of my Lustre clients:

	Mar 29 20:58:43 cn507 kernel: LustreError:
	18750:0:(rw.c:1948:ras_stride_increase_window())
	ASSERTION(ras->ras_window_start + ras->ras_window_len >=
	ras->ras_stride_offset) failed:
	window_start 1792, window_len 0 stride_offset 2017
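
For what it's worth, plugging the logged numbers into the asserted condition shows why it trips: 1792 + 0 = 1792, which is less than 2017. Below is a minimal userspace sketch of that arithmetic only. The struct and function names are made up for illustration and mirror just the three fields named in the message; this is not the real Lustre code from rw.c, and it says nothing about how the readahead state got into this condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in holding only the three fields named in the log
 * message. This is NOT the real Lustre readahead state structure. */
struct ras_sketch {
    unsigned long ras_window_start;
    unsigned long ras_window_len;
    unsigned long ras_stride_offset;
};

/* Returns true when the condition quoted in the ASSERTION line holds. */
static bool ras_window_covers_stride(const struct ras_sketch *ras)
{
    return ras->ras_window_start + ras->ras_window_len
           >= ras->ras_stride_offset;
}

/* Values copied from the log line on cn507. */
static const struct ras_sketch logged = {
    .ras_window_start  = 1792,
    .ras_window_len    = 0,
    .ras_stride_offset = 2017,
};

/* Self-check: 1792 + 0 = 1792 < 2017, so the condition is false for the
 * logged values, which matches the assertion failure in the log. */
static void sketch_selfcheck(void)
{
    assert(!ras_window_covers_stride(&logged));
}
```

Calling sketch_selfcheck() from a small main() passes silently. With a window_len of 225 or more the same check would hold, since 1792 + 225 = 2017.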

Several processes on this machine appear to be blocked in state DN.

Is this a known issue? I've searched Bugzilla and not found anything 
obvious (but this is the first time I've used your Bugzilla). I 
found http://www.nersc.gov/hypermail/nersc-io/att-0612/summary.pdf and 
skimmed it, but it refers to MPI-IO, which we are not 
using, and a 1.6 kernel, whereas we are running 1.8.

I'm running 1.8.2 servers (downloaded from Sun/Oracle), and 1.8.2 
clients compiled from source on a Scientific Linux 2.6.18-164.15.1.el5 
kernel.

/var/log/messages says:

> Mar 29 20:58:43 cn507 kernel: LustreError: 18750:0:(rw.c:1948:ras_stride_increase_window()) ASSERTION(ras->ras_window_start + ras->ras_window_len >= ras->ras_stride_offset) failed: window_start 1792, window_len 0 stride_offset 2017
> Mar 29 20:58:43 cn507 kernel: LustreError: 18750:0:(rw.c:1948:ras_stride_increase_window()) LBUG
> Mar 29 20:58:43 cn507 kernel: Pid: 18750, comm: athena.py
> Mar 29 20:58:43 cn507 kernel: 
> Mar 29 20:58:43 cn507 kernel: Call Trace:
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8844d6a1>] libcfs_debug_dumpstack+0x51/0x60 [libcfs]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8844dbda>] lbug_with_loc+0x7a/0xd0 [libcfs]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8878d63f>] ll_readpage+0x129f/0x1e40 [lustre]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000c707>] add_to_page_cache+0xaa/0xc1
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000c2f5>] do_generic_mapping_read+0x208/0x354
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000d0e0>] file_read_actor+0x0/0x159
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000c58d>] __generic_file_aio_read+0x14c/0x198
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff800c5d8f>] generic_file_readv+0x8f/0xa8
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff800a0307>] autoremove_wake_function+0x0/0x2e
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8879a427>] our_vma+0x117/0x1d0 [lustre]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000b984>] touch_atime+0x67/0xaa
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8875f65b>] ll_file_readv+0x1e4b/0x2130 [lustre]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8875f95a>] ll_file_read+0x1a/0x20 [lustre]
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8000b695>] vfs_read+0xcb/0x171
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff80011b60>] sys_read+0x45/0x6e
> Mar 29 20:58:43 cn507 kernel:  [<ffffffff8006149d>] sysenter_do_call+0x1e/0x76
> Mar 29 20:58:43 cn507 kernel: 
> Mar 29 20:58:43 cn507 kernel: LustreError: dumping log to /tmp/lustre-log.1269892723.18750

Thanks,

Chris
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lustre-error.txt
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20100405/28086260/attachment.txt>

