[Lustre-discuss] csum errors

Stuart Midgley sdm900@gmail.com
Thu Aug 28 18:07:44 PDT 2008


Thanks for the information, greatly appreciated.

We are keeping an eye on the client causing these errors and running a  
few tests.  The mmap issue is interesting.  The code producing these  
errors runs across the entire cluster, so I assume that if it were  
mmap-ing we would be seeing this sort of random error from more than  
just one node.
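For anyone following along, the race Andreas describes below can be
reproduced outside Lustre in a few lines of Python (this is just an
illustration, not Lustre code): checksum an mmap'ed page, modify it,
and checksum it again.

```python
import mmap
import os
import tempfile
import zlib

# Hypothetical illustration: if a writer touches an mmap'ed page after
# the first checksum is computed, a second checksum of the same page no
# longer matches -- the same race the BAD WRITE CHECKSUM message reports.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"A" * 4096)           # one page of data
    os.close(fd)
    with open(path, "r+b") as f:
        mm = mmap.mmap(f.fileno(), 4096)
        csum_original = zlib.crc32(mm)  # "original client csum"
        mm[0:1] = b"B"                  # page modified via mmap mid-flight
        csum_now = zlib.crc32(mm)       # "client csum now"
        mm.close()
    print(csum_original != csum_now)    # True: the checksums disagree
finally:
    os.unlink(path)
```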

If the problems persist, we will turn on the extra debugging and see  
where we stand.
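For the record, my reading of the extra checking Andreas describes
below, as shell (a sketch only; the proc glob may match several mounts,
so loop rather than redirect into a glob):

```shell
# enable full in-memory page checksumming (also implies wire checksums)
for f in /proc/fs/lustre/llite/*/checksum_pages; do
    echo 1 > "$f"
done

# caveat per Andreas: turning checksum_pages off also disables the
# on-wire checksums; re-enable those separately if needed:
for f in /proc/fs/lustre/osc/*/checksums; do
    echo 1 > "$f"
done
```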


-- 
Dr Stuart Midgley
sdm900@gmail.com



On 29/08/2008, at 1:25 AM, Andreas Dilger wrote:

> On Aug 28, 2008  21:49 +0800, Stuart Midgley wrote:
>> for completeness, here are the logs from 172.16.4.93
>>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 132-0: BAD WRITE
>> CHECKSUM: changed on the client after we checksummed it - likely  
>> false
>> positive due to mmap IO (bug 11742): from 172.16.0.25@tcp inum
>
> This is the important part to note - if you are doing mmap writes then
> the VM doesn't protect the pages from being modified while they are
> being checksummed and sent over the wire.
>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
>> 1162:check_write_checksum()) original client csum 2dbc1696 (type 2),
>> server csum 9d081697 (type 2), client csum now 9d081697
>
> This means the data changed after the initial checksum was computed,
> but now it has settled down.  In some cases the "client csum now" can
> have changed again, depending on whether the process is rewriting the
> same file repeatedly.
>
>> Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
>> 1372:osc_brw_redo_request()) @@@ redo for recoverable error
>> req@ffff81012c434600 x4720217/t820873 o4->p1-
>> OST0018_UUID@172.16.0.25@tcp:6/4 lens 384/480 e 0 to 100 dl  
>> 1219794694
>> ref 2 fl Interpret:R/0/0 rc 0/0
>
> Here it tells you it is resending the RPC.
>
>>> always from the same cluster node...  Should we be worried?  I
>>> suspect this means we shouldn't turn check summing off?  I assume
>>> these are rejected and resent from the client?
>
> If you are NOT doing mmap IO (just normal read/write) then it is  
> possible
> your node has memory corruption.  There is an extra check that can be
> done to checksum the pages while they are in memory, instead of just
> over the wire.  More overhead of course, but can help isolate the  
> problem.
>
> 	echo 1 > /proc/fs/lustre/llite/*/checksum_pages
>
> This will also enable on-wire checksumming, which is already on
> by default.  One caveat is that turning off checksum_pages will
> also turn off the on-wire checksums (which can be re-enabled via
> /proc/fs/lustre/osc/*/checksums)...  Blame Phil.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>



