[Lustre-discuss] Lustre IOkit newbie: sgpdd-survey

Ms. Megan Larko dobsonunit at gmail.com
Tue Jul 29 12:43:48 PDT 2008


Hi,

Additional info.

If I use "scsidevs=/dev/sdj" in /usr/bin/sgpdd-survey in place of the
/dev/sg16 I receive the following result:
Tue Jul 29 15:40:47 EDT 2008 sgpdd-survey on /dev/sdj from oss4.crew.local
total_size 17487872K rsz 1024 crg     1 thr     4 write 1 failed read 1 failed
total_size 17487872K rsz 1024 crg     1 thr     8 write 1 failed read 1 failed
total_size 17487872K rsz 1024 crg     1 thr    16 write 1 failed read 1 failed
total_size 17487872K rsz 1024 crg     2 thr     4 write 2 failed read 2 failed
total_size 17487872K rsz 1024 crg     2 thr     8 write 2 failed read 2 failed
total_size 17487872K rsz 1024 crg     2 thr    16 write 2 failed read 2 failed
total_size 17487872K rsz 1024 crg     2 thr    32 write 2 failed read 2 failed
total_size 17487872K rsz 1024 crg     4 thr     4 write 4 failed read 4 failed
total_size 17487872K rsz 1024 crg     4 thr     8 write 4 failed read 4 failed
total_size 17487872K rsz 1024 crg     4 thr    16 write 4 failed read 4 failed
total_size 17487872K rsz 1024 crg     4 thr    32 write 4 failed read 4 failed
total_size 17487872K rsz 1024 crg     4 thr    64 write 4 failed read 4 failed
total_size 17487872K rsz 1024 crg     8 thr     8 write 8 failed read 8 failed
total_size 17487872K rsz 1024 crg     8 thr    16 write 8 failed read 8 failed
total_size 17487872K rsz 1024 crg     8 thr    32 write 8 failed read 8 failed
total_size 17487872K rsz 1024 crg     8 thr    64 write 8 failed read 8 failed
total_size 17487872K rsz 1024 crg    16 thr    16 write 16 failed read 16 failed
total_size 17487872K rsz 1024 crg    16 thr    32 write 16 failed read 16 failed
total_size 17487872K rsz 1024 crg    16 thr    64 write 16 failed read 16 failed
total_size 17487872K rsz 1024 crg    32 thr    32 write 32 failed read 32 failed
total_size 17487872K rsz 1024 crg    32 thr    64 write 32 failed read 32 failed
total_size 17487872K rsz 1024 crg    64 thr    64 write 64 failed read 64 failed

All writes and reads fail, but it does indicate that it found the device.
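
If it helps narrow this down, one thing I can try is running sgp_dd (the
sg3_utils tool that sgpdd-survey drives underneath) by hand against the sg
node, so the real error gets printed instead of just "failed".  Something
like the following; the device path is just the /dev/sg16 from the sg_map
output below, the sizes are arbitrary, and the write pass of course
destroys data on that spindle:

  # ~1GB write in 1MB transfers (bs=512 x bpt=2048), 4 threads, print timing
  sgp_dd if=/dev/zero of=/dev/sg16 bs=512 bpt=2048 count=2097152 thr=4 time=1
  # read the same region back
  sgp_dd if=/dev/sg16 of=/dev/null bs=512 bpt=2048 count=2097152 thr=4 time=1

Whatever sgp_dd prints there (permission errors, command failures, etc.) is
presumably what sgpdd-survey is reporting only as "failed".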

megan

On Tue, Jul 29, 2008 at 3:38 PM, Ms. Megan Larko <dobsonunit at gmail.com> wrote:
> Hello,
>
> I'm back to working on benchmarking the hardware prior to lustre fs
> installation.
>
> I have used the LSI 8888ELP WebBIOS utility to destroy the previous
> RAID configuration on one of our 16-bay JBOD units.  A new set of
> RAID arrays was set up for benchmarking purposes: one RAID6 (with its
> parity disks), one RAID5 (with its parity disk), and one single
> spindle passed directly to the OSS.  The boot of CentOS 5
> (2.6.18-53.1.13.el5_lustre.1.6.4.3smp) detects these new block devices
> as /dev/sdh, /dev/sdi and /dev/sdj respectively.  The "sg_map" command
> detects the raw units, but lustre-IOkit is still unable to find
> them.  The "sg" module is indeed loaded, as is megaraid_sas for the
> LSI 8888ELP card.  Can lustre-IOkit actually benchmark anything on an
> LSI 8888ELP card, or should I use another tool like IOzone?
>
> Thanks,
> megan
>
> From dmesg:
> sd 2:2:1:0: Attached scsi generic sg12 type 0
>  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
>  Type:   Direct-Access                      ANSI SCSI revision: 05
> sdg : very big device. try to use READ CAPACITY(16).
> SCSI device sdg: 11707023360 512-byte hdwr sectors (5993996 MB)
> sdg: Write Protect is off
> sdg: Mode Sense: 1f 00 00 08
> SCSI device sdg: drive cache: write back, no read (daft)
>  sdg: unknown partition table
> sd 2:2:2:0: Attached scsi disk sdg
>
> sd 2:2:2:0: Attached scsi generic sg13 type 0
>  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
>  Type:   Direct-Access                      ANSI SCSI revision: 05
> sdh : very big device. try to use READ CAPACITY(16).
> SCSI device sdh: 11707023360 512-byte hdwr sectors (5993996 MB)
> sdh: Write Protect is off
> sdh: Mode Sense: 1f 00 00 08
> SCSI device sdh: drive cache: write back, no read (daft)
>  sdh: unknown partition table
> sd 2:2:3:0: Attached scsi disk sdh
>
> sd 2:2:3:0: Attached scsi generic sg14 type 0
>  Vendor: LSI       Model: MegaRAID 8888ELP  Rev: 1.20
>  Type:   Direct-Access                      ANSI SCSI revision: 05
> SCSI device sdi: 1951170560 512-byte hdwr sectors (998999 MB)
> sdi: Write Protect is off
> sdi: Mode Sense: 1f 00 00 08
> SCSI device sdi: drive cache: write back, no read (daft)
> SCSI device sdi: 1951170560 512-byte hdwr sectors (998999 MB)
> sdi: Write Protect is off
> sdi: Mode Sense: 1f 00 00 08
> SCSI device sdi: drive cache: write back, no read (daft)
>  sdi: unknown partition table
> sd 2:2:4:0: Attached scsi disk sdi
> sd 2:2:4:0: Attached scsi generic sg15 type 0
>
> [root at oss4 ~]# sg_map
> /dev/sg0  /dev/sda
> /dev/sg1  /dev/scd0
> /dev/sg2
> /dev/sg3
> /dev/sg4
> /dev/sg5  /dev/sdb
> /dev/sg6  /dev/sdc
> /dev/sg7  /dev/sdd
> /dev/sg8
> /dev/sg9
> /dev/sg10
> /dev/sg11  /dev/sde
> /dev/sg12  /dev/sdf
> /dev/sg13  /dev/sdg
> /dev/sg14  /dev/sdh
> /dev/sg15  /dev/sdi
> /dev/sg16  /dev/sdj
> /dev/sg17  /dev/sdk
>
> Lustre-IOkit sgpdd-survey:
> scsidevs=/dev/sg16
>
> Result of sgpdd-survey:
> [root at oss4 log]# sgpdd-survey
> Can't find SG device for /dev/sg16, testing for partition
> Can't find SG device /dev/sg1.
> Do you have the sg module configured for your kernel?
> [root at oss4 log]# lsmod | grep sg
> sg                     70056  0
> scsi_mod              187192  11
> ib_iser,libiscsi,scsi_transport_iscsi,ib_srp,sr_mod,libata,megaraid_sas,sg,3w_9xxx,usb_storage,sd_mod
>
> This is still using the "-16" option on each of the two sg_readcap calls, i.e.:
> sg_readcap -b -16
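>
> As a sanity check that the device answers READ CAPACITY(16) at all, the
> same call can be made by hand (sg_readcap is part of sg3_utils); something
> like this against the sg node from sg_map:
>
>   sg_readcap -b -16 /dev/sg16
>
> If that prints a sensible block count and block size, the device side
> looks fine and the problem is more likely in how the script is picking up
> the device name.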
>
> ********************************************************************
> On Thu, Jul 24, 2008 at 3:45 PM, Andreas Dilger <adilger at sun.com> wrote:
>> On Jul 24, 2008  14:52 -0400, Ms. Megan Larko wrote:
>>> Okay.  So I have a JBOD with 16 1TB Hitachi Ultrastar SATA drives in
>>> it, connected to the OSS via an LSI 8888ELP controller card (with
>>> Battery Back-up).  The JBOD is passed to the OSS as raw space.  My
>>> partitions cannot exceed 8TB, but splitting into two 7TB partitions
>>> will hurt performance...
>>
>> The ideal layout would be to have two RAID-5 arrays with 8 data + 1 parity
>> disks using 64kB or 128kB RAID chunk size, but you are short two disks...
>>
>> You may also consider using RAID-6 with 6 data + 2 parity disks.  Having
>> the RAID stripe width be 15+1 is probably quite bad because a single
>> 1MB RPC will always need to recalculate the parity for that IO.
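>>
>> For example, with 8 data + 1 parity at a 128kB chunk the full stripe is
>> 8 * 128kB = 1MB, so each 1MB RPC maps onto exactly one full stripe and
>> the parity can be computed from the data being written.  With 15 data
>> disks the stripe is 15 * 64kB = 960kB (or 1920kB at 128kB chunks), so a
>> 1MB RPC never lines up with a stripe boundary and the controller has to
>> read back old data and parity to do the update.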
>>
>>> What do I do for Lustre set-up?  I thought the fewer partitions the
>>> better, because one then has less "overhead" space.  Do I put them out
>>> as 16 single-TB partitions?  That seems like extra work for a
>>> file system to track.
>>
>> Please see section 10.1 in the Lustre manual for more tips:
>> http://manual.lustre.org/manual/LustreManual16_HTML/RAID.html
>>
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> Sr. Staff Engineer, Lustre Group
>> Sun Microsystems of Canada, Inc.
>>
>>
>


