[Lustre-discuss] Lustre IOkit newbie: sgpdd-survey
Ms. Megan Larko
dobsonunit at gmail.com
Thu Jul 31 11:12:18 PDT 2008
From megan:
Comments in-line.
Date: Wed, Jul 30 2008 7:12 am
From: "Brian J. Murrell"
On Tue, 2008-07-29 at 15:43 -0400, Ms. Megan Larko wrote:
> Hi,
>
> Additional info.
>
> If I use "scsidevs=/dev/sdj" in /usr/bin/sgpdd-survey in place of the
> /dev/sg16
Yes, this is the correct syntax.
> I receive the following result:
> Tue Jul 29 15:40:47 EDT 2008 sgpdd-survey on /dev/sdj from oss4.crew.local
> total_size 17487872K rsz 1024 crg 1 thr 4 write 1 failed read 1 failed
> total_size 17487872K rsz 1024 crg 1 thr 8 write 1 failed read 1 failed
> total_size 17487872K rsz 1024 crg 1 thr 16 write 1 failed read 1 failed
> total_size 17487872K rsz 1024 crg 2 thr 4 write 2 failed read 2 failed
> total_size 17487872K rsz 1024 crg 2 thr 8 write 2 failed read 2 failed
> total_size 17487872K rsz 1024 crg 2 thr 16 write 2 failed read 2 failed
> total_size 17487872K rsz 1024 crg 2 thr 32 write 2 failed read 2 failed
> total_size 17487872K rsz 1024 crg 4 thr 4 write 4 failed read 4 failed
> total_size 17487872K rsz 1024 crg 4 thr 8 write 4 failed read 4 failed
> total_size 17487872K rsz 1024 crg 4 thr 16 write 4 failed read 4 failed
> total_size 17487872K rsz 1024 crg 4 thr 32 write 4 failed read 4 failed
> total_size 17487872K rsz 1024 crg 4 thr 64 write 4 failed read 4 failed
> total_size 17487872K rsz 1024 crg 8 thr 8 write 8 failed read 8 failed
> total_size 17487872K rsz 1024 crg 8 thr 16 write 8 failed read 8 failed
> total_size 17487872K rsz 1024 crg 8 thr 32 write 8 failed read 8 failed
> total_size 17487872K rsz 1024 crg 8 thr 64 write 8 failed read 8 failed
> total_size 17487872K rsz 1024 crg 16 thr 16 write 16 failed read 16 failed
> total_size 17487872K rsz 1024 crg 16 thr 32 write 16 failed read 16 failed
> total_size 17487872K rsz 1024 crg 16 thr 64 write 16 failed read 16 failed
> total_size 17487872K rsz 1024 crg 32 thr 32 write 32 failed read 32 failed
> total_size 17487872K rsz 1024 crg 32 thr 64 write 32 failed read 32 failed
> total_size 17487872K rsz 1024 crg 64 thr 64 write 64 failed read 64 failed
>
> All writes and reads fail but it indicates that it found the device....
Indeed. So the question is, why are the reads and writes failing?
Do you have any files in /tmp named:
/tmp/sgpdd_survey_$(date)_$(uname -n).detail
If so, can you paste one here?
megan: I am attaching the file from
/tmp/sgpdd_survey_2008-07-29 at 15:40_oss4.crew.local.detail
The complaint seems to be that the memory cannot be accessed.
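For anyone following along, a minimal sketch of how the most recent detail file could be located and skimmed. It assumes the files match /tmp/sgpdd_survey_*.detail as in the path above; the helper name latest_detail is made up for illustration and is not part of the Lustre IOkit.

```shell
# Sketch: print the tail of the newest sgpdd-survey detail file.
# latest_detail is a hypothetical helper, not something shipped with the IOkit.
latest_detail() {
    dir="${1:-/tmp}"
    # ls -t lists newest first; stderr is silenced when nothing matches
    ls -t "$dir"/sgpdd_survey_*.detail 2>/dev/null | head -n 1
}

f="$(latest_detail /tmp)"
if [ -n "$f" ]; then
    # The last lines usually carry the error text from the failed runs
    tail -n 40 "$f"
else
    echo "no sgpdd-survey detail files found in /tmp"
fi
```

The error text near the end of that file is usually the part worth pasting to the list.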
Alternatively you can try using sgp_dd to read a device. The following
should work:
# sgp_dd if=/dev/sg16 of=/dev/null count=10 bs=512 time=1
and paste the result here.
megan: Pasting result--
[root@oss4 ~]# sgp_dd of=/dev/sg16 if=/dev/null count=10 bs=512 time=1
time to transfer data was 0.000121 secs
remaining block count=10
0+0 records in
0+0 records out
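One thing worth checking in the run above: it has if=/dev/null and of=/dev/sg16, so it reads from /dev/null. That source returns end-of-file immediately, which would be consistent with "0+0 records" and a remaining block count of 10. A read test in the direction suggested earlier in the thread might look like the sketch below. The function name read_test is hypothetical and /dev/sg16 is only the device named in this thread; substitute whichever sg node maps to the disk under test.

```shell
# Sketch of a read-only sgp_dd smoke test (read_test is a made-up helper name).
read_test() {
    dev="$1"
    if [ ! -c "$dev" ]; then
        echo "skip: $dev is not a character device"
        return 1
    fi
    # if= is the source (the sg device) and of= is the sink (/dev/null),
    # so nothing is written to the disk.
    sgp_dd if="$dev" of=/dev/null bs=512 count=10 time=1
}

# /dev/sg16 is the device from this thread; adjust for your system.
read_test /dev/sg16 || true
```

If this reads 10 blocks cleanly, the device node itself is reachable and the failures are more likely to come from the survey parameters than from the device mapping.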
Note that "cat /proc/meminfo" shows 16 GB of RAM on the machine oss4.
[root@oss4 ~]# cat /proc/meminfo
MemTotal: 16439328 kB
MemFree: 16101332 kB
Buffers: 32260 kB
Cached: 205820 kB
---snip---
BTW, I am running iozone v. 3.283 on the OS drive, on a RAID6 JBOD disk
formatted ext3, and on one of our existing Lustre disks, and the Lustre
system is doing well under iozone.
Thanks,
megan
b.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sgpdd_survey_2008-07-29 at 15:40_oss4.crew.local.detail
Type: application/octet-stream
Size: 33332 bytes
Desc: not available
URL: <http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/attachments/20080731/a61eb44d/attachment.obj>