[Lustre-discuss] Errors in output from sgpdd-survey (sgp_dd.c Cannot allocate memory)

Heald, Nathan T. nheald at indiana.edu
Tue Dec 14 09:20:10 PST 2010


Hi everyone,
I have been running sgpdd-survey on some DDN 9550s and am getting some
errors. I'm using what I believe to be the latest version of the I/O Kit
(lustre-iokit-1.2-200709210921). I have 4 OSSes attached and run
sgpdd-survey against all the disks from each host, one host at a time. Every
host gets these errors, but not identically. I've found several threads on
the mailing list from people reporting this same error, but no resolutions
were posted. One post suggested that modifying the flags for "sg_readcap" in
the script could resolve these errors, but making that change did not fix
the issue. It looks like sgp_dd is having intermittent problems:

16384+0 records out
sg starting in command at "sgp_dd.c":827: Cannot allocate memory
sg starting in command at "sgp_dd.c":827: Cannot allocate memory
sg starting in command at "sgp_dd.c":827: Cannot allocate memory
sg starting in command at "sgp_dd.c":827: Cannot allocate memory
sg starting in command at "sgp_dd.c":827: Cannot allocate memory
sg starting in command at "sgp_dd.c":827: Cannot allocate memory


Output from sgpdd-survey:

Wed Dec  1 10:55:55 EST 2010 sgpdd-survey on /dev/sdp /dev/sdo /dev/sdn /dev/sdw /dev/sdv /dev/sdu /dev/sdt /dev/sds /dev/sdy /dev/sdr /dev/sdx /dev/sdq  from oss1
...
total_size 100663296K rsz 1024 crg   384 thr   768 write  388.20 MB/s   384 x   1.01 =  388.18 MB/s read  387.16 MB/s   384 x   1.01 =  388.18 MB/s
total_size 100663296K rsz 1024 crg   384 thr  1536 write 1 failed read  385.72 MB/s   384 x   1.01 =  388.18 MB/s
total_size 100663296K rsz 1024 crg   384 thr  3072 write 140 failed read 121 failed
total_size 100663296K rsz 1024 crg   384 thr  6144 ENOMEM
total_size 100663296K rsz 1024 crg   768 thr   768 write 1 failed read  387.28 MB/s   768 x   0.51 =  388.18 MB/s
total_size 100663296K rsz 1024 crg   768 thr  1536 write  388.23 MB/s   768 x   0.51 =  388.18 MB/s read  386.76 MB/s   768 x   0.51 =  388.18 MB/s
total_size 100663296K rsz 1024 crg   768 thr  3072 write 42 failed read 31 failed
total_size 100663296K rsz 1024 crg   768 thr  6144 ENOMEM
total_size 100663296K rsz 1024 crg   768 thr 12288 ENOMEM
...
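
For anyone triaging similar output, here is a minimal Python sketch that
tabulates which crg/thr combinations failed. It is not part of
lustre-iokit. The function names (join_wrapped, parse_row) are made up for
illustration, and the row layout is assumed from the output pasted above
rather than from the sgpdd-survey source, so it may need adjusting for
other versions.

```python
import re

# Leading fields of one result row, as they appear in the output above.
ROW = re.compile(
    r"total_size\s+(\d+)K\s+rsz\s+(\d+)\s+crg\s+(\d+)\s+thr\s+(\d+)\s*(.*)"
)


def join_wrapped(lines):
    """Rejoin rows that a mail client hard-wrapped across several lines."""
    records = []
    for line in lines:
        text = line.strip()
        if text.startswith("total_size"):
            records.append(text)
        elif records and text and not text.startswith("..."):
            # Continuation of the previous row.
            records[-1] += " " + text
    return records


def parse_row(record):
    """Return a dict describing one row, or None if it is not a result row."""
    match = ROW.match(" ".join(record.split()))  # collapse column padding
    if match is None:
        return None
    rest = match.group(5)
    row = {
        "crg": int(match.group(3)),
        "thr": int(match.group(4)),
        "enomem": rest.strip() == "ENOMEM",
        "write_failed": 0,
        "read_failed": 0,
    }
    for op in ("write", "read"):
        # "write 140 failed" means 140 sgp_dd instances failed (assumed meaning).
        failed = re.search(op + r" (\d+) failed", rest)
        if failed:
            row[op + "_failed"] = int(failed.group(1))
    return row


def failing_rows(lines):
    """Yield only the rows that report ENOMEM or at least one failure."""
    for record in join_wrapped(lines):
        row = parse_row(record)
        if row and (row["enomem"] or row["write_failed"] or row["read_failed"]):
            yield row
```

Feeding it the saved survey output, for example
failing_rows(open(path).read().splitlines()) with path pointing at that
file, yields one dict per failing row, which makes it easier to see at
which thread count the failures start on each host.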

Any suggestions are welcome.

Thanks,
-Nathan




