[Lustre-discuss] csum errors
Stuart Midgley
sdm900 at gmail.com
Thu Aug 28 06:49:52 PDT 2008
for completeness, here are the logs from 172.16.4.93
Aug 27 07:49:55 clus093 kernel: LustreError: 132-0: BAD WRITE
CHECKSUM: changed on the client after we checksummed it - likely false
positive due to mmap IO (bug 11742): from 172.16.0.25 at tcp inum
24522277/1605841060 object 12021/0 extent [10485760-11534335]
Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
1162:check_write_checksum()) original client csum 2dbc1696 (type 2),
server csum 9d081697 (type 2), client csum now 9d081697
Aug 27 07:49:55 clus093 kernel: LustreError: 28573:0:(osc_request.c:
1372:osc_brw_redo_request()) @@@ redo for recoverable error
req at ffff81012c434600 x4720217/t820873 o4->p1-
OST0018_UUID at 172.16.0.25@tcp:6/4 lens 384/480 e 0 to 100 dl 1219794694
ref 2 fl Interpret:R/0/0 rc 0/0
--
Dr Stuart Midgley
sdm900 at gmail.com
On 28/08/2008, at 11:57 AM, Stuart Midgley wrote:
> We recently upgraded from 1.4.10.1 to 1.6.5.1 (clients and servers)
> and now we are seeing errors like
>
>
> Aug 27 07:49:54 oss025 kernel: LustreError: 3738:0:(ost_handler.c:
> 1163:ost_brw_write()) client csum 2dbc1696, server csum 9d081697
> Aug 27 07:49:54 oss025 kernel: LustreError: 168-f: p1-OST0018: BAD
> WRITE CHECKSUM: changed in transit before arrival at OST from
> 12345-172.16.4.93 at tcp inum 24522277/426969871 object 12021/0 extent
> [10485760-11534335]
> Aug 27 07:49:55 oss025 kernel: LustreError: 3738:0:(ost_handler.c:
> 1225:ost_brw_write()) client csum 2dbc1696, original server csum
> 9d081697, server csum now 9d081697
>
>
> always from the same cluster node... Should we be worried? I
> suspect this means we shouldn't turn check summing off? I assume
> these are rejected and resent from the client?
>
>
> --
> Dr Stuart Midgley
> sdm900 at gmail.com
>
>
>
More information about the lustre-discuss
mailing list