[Lustre-discuss] Lustre 1.8.0.50 + Xen + kernel 2.6.22.17

Lukas Hejtmanek xhejtman at ics.muni.cz
Mon Apr 20 11:23:32 PDT 2009


Hello,

I'm using Lustre 1.8.0.50 on both the server and the clients. The server runs
a native 2.6.22.19 kernel and hosts 1 MDS and 3 OSDs (all on one server).

The client runs as a DomU under Xen with the SUSE 2.6.22.17 kernel. The
client is patchless.

I have an application that reads TIFF images stored on the Lustre file
system. libtiff uses mmap to read the TIFFs. Unfortunately, I get bus errors
(SIGBUS) from libtiff.

The core looks like this:
#1  0x00002b7d5ff825a2 in DumpModeDecode (tif=0x58cdd0, buf=0xf7f7f7f5f5f5f6f6 <Address 0xf7f7f7f5f5f5f6f6 out of bounds>, cc=76800, s=2016)
    at tif_dumpmode.c:85
(gdb) up
#2  0x00002b7d5ff9f745 in TIFFReadEncodedStrip (tif=0x58cdd0, strip=53, buf=0x5923a0, size=76800) at tif_read.c:160

The tif and buf arguments are passed to DumpModeDecode directly from
TIFFReadEncodedStrip, and cc and size are the same value. While the tif and
cc/size values are preserved, buf is obviously corrupted. I tried building
with -fstack-protector-all, but no check fires.

It does not happen if I copy the TIFFs to a local disk or if I disable mmap
usage.
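
One way to narrow this down without libtiff would be to compare what a
read-only mmap of a file returns against what plain read() returns for the
same file. The following is only a minimal sketch under assumptions: the
function name count_mmap_read_mismatches is hypothetical, and the
PROT_READ/MAP_SHARED flags are an assumption about how the file gets mapped,
not something taken from the libtiff build in use. If the mapping itself is
broken, touching it may raise SIGBUS and kill the process before any
comparison result is returned, which would also be informative.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libtiff or Lustre).
 * Maps the whole file read-only, reads the same file with read(),
 * and returns the number of bytes that differ between the two views.
 * Returns -1 if the file cannot be opened, is empty, or any call fails.
 */
long count_mmap_read_mismatches(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* Assumed mapping flags: read-only, shared. */
    unsigned char *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    unsigned char *buf = malloc(len);
    if (buf == NULL) {
        munmap(map, len);
        close(fd);
        return -1;
    }

    /* Read the whole file through the ordinary read() path. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = read(fd, buf + done, len - done);
        if (n <= 0)
            break;
        done += (size_t)n;
    }

    long mismatches = -1;
    if (done == len) {
        mismatches = 0;
        /* Touching map[i] is where a SIGBUS would show up. */
        for (size_t i = 0; i < len; i++) {
            if (map[i] != buf[i])
                mismatches++;
        }
    }

    free(buf);
    munmap(map, len);
    close(fd);
    return mismatches;
}
```

Calling this on the same TIFF once from the Lustre mount and once from a
local copy, on the Xen DomU and on a native client, would show whether the
problem follows mmap on Lustre in general or only the Xen client.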

There is no chance that the TIFFs are being modified in parallel in the
background. My application is the only application and the only client
accessing the Lustre fs.

Is this a known problem? Is it related only to Xen, or to some bug in the
2.6.22 kernel?

-- 
Lukáš Hejtmánek


