[Lustre-discuss] HW RAID - fragmented I/O

Kevin Van Maren kevin.van.maren at oracle.com
Mon Jun 13 12:38:23 PDT 2011

Did you printk the SGE value in the driver to make sure it was being set?

sg_tablesize may be limited elsewhere, although the kernel patches 
in v1.8.5 should prevent that.

Check the following:

# cat /sys/class/scsi_host/host*/sg_tablesize
This should be 256.  If not, then this is still the issue.

# cat /sys/block/sd*/queue/max_hw_sectors_kb
This should be >= 1024

# cat /sys/block/sd*/queue/max_sectors_kb
This should be 1024 (Lustre mount sets it to max_hw_sectors_kb)
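
The three checks above can be scripted so that every host and disk is compared in one pass. This is only a sketch: the helper names (report_min, check_io_limits) and the optional sysfs root argument are made up for illustration, and the thresholds are simply the values stated above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Lustre or the driver): compare the sysfs
# limits discussed above against the values needed for 1MB I/O.
# Usage: check_io_limits [sysfs_root]    (sysfs_root defaults to /sys)

# Print "ok" or "LOW" for every readable file among the given paths.
report_min() {
    min=$1; shift
    for f in "$@"; do
        [ -r "$f" ] || continue          # glob did not match anything
        v=$(cat "$f")
        if [ "$v" -ge "$min" ] 2>/dev/null; then s=ok; else s=LOW; fi
        echo "$s $f = $v (want >= $min)"
    done
    return 0
}

check_io_limits() {
    root=${1:-/sys}
    report_min 256  "$root"/class/scsi_host/host*/sg_tablesize
    report_min 1024 "$root"/block/sd*/queue/max_hw_sectors_kb
    report_min 1024 "$root"/block/sd*/queue/max_sectors_kb
}

check_io_limits
```

Any line starting with LOW points at the limit that is still capping the request size.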

_base_allocate_memory_pools prints a bunch of helpful info, some of it 
through dinitprintk (MPT_DEBUG_INIT flag).  To see it, turn up kernel 
verbosity and set the module parameter logging_level=0x20
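
One possible way to apply that, as a sketch only. It assumes the driver module is mpt2sas (the driver named in the quoted .config) and that the module can be reloaded with the OSTs unmounted. mpt_debug_option is a made-up helper that just prints a modprobe options line, and the file name under /etc/modprobe.d is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print a modprobe options line that sets logging_level
# for a driver. 0x20 is the value given above for the MPT_DEBUG_INIT output.
mpt_debug_option() {
    printf 'options %s logging_level=%s\n' "$1" "$2"
}

mpt_debug_option mpt2sas 0x20

# To apply it (as root; the target file name is an assumption):
#   mpt_debug_option mpt2sas 0x20 > /etc/modprobe.d/mpt2sas-debug.conf
#   dmesg -n 8                          # let debug-level messages reach the console
#   rmmod mpt2sas && modprobe mpt2sas   # reload so the init code runs again
#   dmesg | grep -i mpt2sas
```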

If you still don't have an answer, then look at the values passed to 
these calls:

        blk_queue_max_hw_segments(q, shost->sg_tablesize);
        blk_queue_max_phys_segments(q, SCSI_MAX_PHYS_SEGMENTS);
        blk_queue_max_sectors(q, shost->max_sectors);
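
As a rough illustration of why the segment limit matters: in the worst case every scatter-gather entry maps one physically discontiguous page, so the largest request that can be built is the segment count times the page size. A sketch assuming 4 KB pages; max_io_kb is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: worst-case maximum request size in KB.
# $1 = scatter-gather entries (sg_tablesize), $2 = page size in KB
max_io_kb() {
    echo $(( $1 * $2 ))
}

max_io_kb 128 4   # prints 512
max_io_kb 256 4   # prints 1024
```

With 128 entries the worst case caps requests at 512 KB, which is consistent with needing 256 entries for 1MB I/O. The estimate ignores merging of physically contiguous pages, which can let a request exceed it.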


Wojciech Turek wrote:
> Hi Kevin,
> Unfortunately still no luck with 1MB I/O. I have forced my OSS to do 
> 512KB I/O following your suggestion by setting max_sectors_kb to 512. I 
> also recreated my HW RAID with 64KB chunks to align it for 512KB 
> chunks. I can see from the brw_stats and controller statistics that 
> it does indeed do twice as many IOPS as the throughput in MB/s, but 
> performance isn't any better than before.
> From the sgpdd-survey I know that this controller can do around 3GB/s 
> write and 4GB/s read. Also, when running sgpdd-survey, the controller 
> stats show that I/O is not fragmented (number of IOPS = throughput in 
> MB/s). I also tried to bypass the multipath layer by mounting the sd 
> devices directly, but that did not make any difference.
> If you have any more suggestions I will be happy to try them out.
> Best regards,
> Wojciech
> On 13 June 2011 15:13, Kevin Van Maren <kevin.van.maren at oracle.com> 
> wrote:
>     Did you get it doing 1MB IOs?
>     Kevin
>     Kevin Van Maren wrote:
>         Wojciech Turek wrote:
>             Hi Kevin,
>             In my kernel .config I find the following lines:
>             CONFIG_SCSI_MPT2SAS=m
>             CONFIG_SCSI_MPT2SAS_MAX_SGE=128
>             I changed the SGE value to 256.
>             Do I need to recompile the kernel before building the new
>             module based on that .config?
>         No, but you do need to do something like "make oldconfig" to
>         propagate the change in .config to the header files, and then
>         rebuild the driver.
>         Kevin
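
For the rebuild step discussed in the quoted exchange, a quick check that the edited value is actually in .config before rebuilding can save a build cycle. This is a sketch: kconfig_value is a made-up helper, and the make targets in the comments are one common way to rebuild a single in-tree module, not necessarily the flow used in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a CONFIG_ option from a kernel
# .config file, or nothing if the option is not set there.
# $1 = path to .config, $2 = option name
kconfig_value() {
    sed -n "s/^$2=//p" "$1"
}

# Only runs when started from a kernel source tree that has a .config.
if [ -f .config ]; then
    kconfig_value .config CONFIG_SCSI_MPT2SAS_MAX_SGE
fi

# Then, in the kernel source tree (assumed flow; adjust to your tree):
#   make oldconfig                        # accept the edited .config
#   make modules_prepare                  # regenerate the config headers
#   make M=drivers/scsi/mpt2sas modules   # rebuild only the driver
```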
