[Lustre-discuss] Lustre IOkit newbie: sgpdd-survey

Ms. Megan Larko dobsonunit at gmail.com
Tue Jul 29 12:38:18 PDT 2008


Hello,

I'm back to working on benchmarking the hardware prior to lustre fs
installation.

I have used the LSI 8888ELP WebBIOS utility to destroy the previous
RAID configuration on one of our 16-bay JBOD units.  A new set of
RAID arrays was set up for benchmarking purposes: one RAID6 array
(dual parity), one RAID5 array (single parity), and one single
spindle passed directly to the OSS.  On boot, CentOS 5
(2.6.18-53.1.13.el5_lustre.1.6.4.3smp) detects these new block
devices as /dev/sdh, /dev/sdi and /dev/sdj respectively.  The
"sg_map" command detects the raw units, but the Lustre IOkit is
still unable to find them.  The module "sg" is indeed loaded, as is
megaraid_sas for the LSI 8888ELP card.  Can the Lustre IOkit
actually benchmark anything on an LSI 8888ELP card?  Should I use
another tool like IOzone?
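
As a sanity check (a sketch, assuming the sg3_utils package that
provides sg_map is installed), the sg node that sgpdd-survey will
open can be verified directly before involving the script at all:

  # Show the sg-to-block mapping with SCSI addresses and types:
  sg_map -x

  # An sg node is a character device, major number 21, root-owned:
  ls -l /dev/sg16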

Thanks,
megan

From dmesg:
sd 2:2:1:0: Attached scsi generic sg12 type 0
  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
  Type:   Direct-Access                      ANSI SCSI revision: 05
sdg : very big device. try to use READ CAPACITY(16).
SCSI device sdg: 11707023360 512-byte hdwr sectors (5993996 MB)
sdg: Write Protect is off
sdg: Mode Sense: 1f 00 00 08
SCSI device sdg: drive cache: write back, no read (daft)
 sdg: unknown partition table
sd 2:2:2:0: Attached scsi disk sdg

sd 2:2:2:0: Attached scsi generic sg13 type 0
  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
  Type:   Direct-Access                      ANSI SCSI revision: 05
sdh : very big device. try to use READ CAPACITY(16).
SCSI device sdh: 11707023360 512-byte hdwr sectors (5993996 MB)
sdh: Write Protect is off
sdh: Mode Sense: 1f 00 00 08
SCSI device sdh: drive cache: write back, no read (daft)
 sdh: unknown partition table
sd 2:2:3:0: Attached scsi disk sdh

sd 2:2:3:0: Attached scsi generic sg14 type 0
  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sdi: 1951170560 512-byte hdwr sectors (998999 MB)
sdi: Write Protect is off
sdi: Mode Sense: 1f 00 00 08
SCSI device sdi: drive cache: write back, no read (daft)
SCSI device sdi: 1951170560 512-byte hdwr sectors (998999 MB)
sdi: Write Protect is off
sdi: Mode Sense: 1f 00 00 08
SCSI device sdi: drive cache: write back, no read (daft)
 sdi: unknown partition table
sd 2:2:4:0: Attached scsi disk sdi
sd 2:2:4:0: Attached scsi generic sg15 type 0

[root at oss4 ~]# sg_map
/dev/sg0  /dev/sda
/dev/sg1  /dev/scd0
/dev/sg2
/dev/sg3
/dev/sg4
/dev/sg5  /dev/sdb
/dev/sg6  /dev/sdc
/dev/sg7  /dev/sdd
/dev/sg8
/dev/sg9
/dev/sg10
/dev/sg11  /dev/sde
/dev/sg12  /dev/sdf
/dev/sg13  /dev/sdg
/dev/sg14  /dev/sdh
/dev/sg15  /dev/sdi
/dev/sg16  /dev/sdj
/dev/sg17  /dev/sdk

Lustre-IOkit sgpdd-survey:
scsidevs=/dev/sg16
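
(For reference, the script is driven by environment variables, so a
fuller invocation looks something like the sketch below; the variable
names are as documented for the Lustre 1.6 IOkit, and the header of
the sgpdd-survey script itself lists the exact set a given version
expects.)

  # size is MB transferred per device; crg*/thr* sweep the number of
  # concurrent regions and threads; rslt is the output file prefix.
  size=8192 crglo=1 crghi=16 thrlo=1 thrhi=32 \
  scsidevs="/dev/sg16" rslt=/var/tmp/sgpdd \
  sgpdd-survey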

Result of sgpdd-survey:
[root at oss4 log]# sgpdd-survey
Can't find SG device for /dev/sg16, testing for partition
Can't find SG device /dev/sg1.
Do you have the sg module configured for your kernel?
[root at oss4 log]# lsmod | grep sg
sg                     70056  0
scsi_mod              187192  11
ib_iser,libiscsi,scsi_transport_iscsi,ib_srp,sr_mod,libata,megaraid_sas,sg,3w_9xxx,usb_storage,sd_mod

This is still with the "-16" option added to each of the two
sg_readcap calls in the sgpdd-survey script:
sg_readcap -b -16
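
The same call can be made by hand to separate an sg-layer failure
from a script-detection failure (a sketch; /dev/sg16 is the device
named on the scsidevs line above):

  # -16 forces READ CAPACITY(16), needed for LUNs over 2 TB;
  # -b prints the block count and block size in brief form.
  sg_readcap -b -16 /dev/sg16

If that prints a sane block count when run as root but sgpdd-survey
still reports "Can't find SG device", the failure is in the script's
device detection rather than in the sg driver itself.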

********************************************************************
On Thu, Jul 24, 2008 at 3:45 PM, Andreas Dilger <adilger at sun.com> wrote:
> On Jul 24, 2008  14:52 -0400, Ms. Megan Larko wrote:
>> Okay.   So I have a JBOD with 16 1Tb Hitachi Ultrastar sATA drives in
>> it connected to the OSS via an LSI 8888ELP controller card (with
>> Battery Back-up).  The JBOD is passed to the OSS as raw space.  My
>> partitions cannot exceed 8Tb, but splitting into two 7Tb will hurt
>> performance....
>
> The ideal layout would be to have two RAID-5 arrays with 8 data + 1 parity
> disks using 64kB or 128kB RAID chunk size, but you are short two disks...
>
> You may also consider using RAID-6 with 6 data + 2 parity disks.  Having
> the RAID stripe width be 15+1 is probably quite bad because a single
> 1MB RPC will always need to recalculate the parity for that IO.
>
>> What do I do for Lustre set-up?  I thought the fewer partitions the
>> better because one does has less "overhead" space.  Do I put them out
>> as 16 single Tb partitions?????   That seems like extra work for a
>> file system to track.
>
> Please see section 10.1 in the Lustre manual for more tips:
> http://manual.lustre.org/manual/LustreManual16_HTML/RAID.html
>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>


