[Lustre-discuss] Getting weird disk errors, no apparent impact
LaoTsao
laotsao at gmail.com
Fri Aug 13 09:05:31 PDT 2010
You mean the STK 2540?
IIRC, the drivers can be downloaded from the Oracle/Sun site.
------- Original message -------
> From: David Noriega <tsk133 at my.utsa.edu>
> To: lustre-discuss at lists.lustre.org
> Sent: 13.8.'10, 11:51
>
> We have three Sun StorageTek 2150, one connected to the metadata
> server and two cross-connected to the two data storage nodes. They are
> connected via fiber using the qla2xxx driver that comes with CentOS
> 5.5. The multipath daemon has the following config:
>
> defaults {
>         udev_dir                /dev
>         polling_interval        10
>         selector                "round-robin 0"
>         path_grouping_policy    multibus
>         getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
>         prio_callout            "/sbin/mpath_prio_rdac /dev/%n"
>         path_checker            rdac
>         rr_min_io               100
>         max_fds                 8192
>         rr_weight               priorities
>         failback                immediate
>         no_path_retry           fail
>         user_friendly_names     yes
> }
>
> Commented out in the multipath.conf file:
>
> blacklist {
>         devnode "*"
> }
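
A blanket devnode "*" blacklist tells multipathd to ignore every device, so it matters whether that section is really commented out. A minimal sketch of such a check follows. The function name and the one-level parsing are illustrative assumptions, not part of multipath-tools, and nested sections such as device { } inside blacklist are not handled.

```python
import re


def active_blacklist_patterns(conf_text):
    """Return the devnode patterns found in uncommented blacklist sections.

    Hypothetical helper: handles only flat `name { ... }` sections.
    """
    patterns = []
    section = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):
            section = line[:-1].strip()      # e.g. "defaults" or "blacklist"
        elif line == "}":
            section = None
        elif section == "blacklist":
            match = re.match(r'devnode\s+"?([^"]+)"?$', line)
            if match:
                patterns.append(match.group(1))
    return patterns
```

Run against the text of /etc/multipath.conf, an empty result means no devnode blacklist is active; a result containing "*" means every device is still being excluded.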
>
>
> On Fri, Aug 13, 2010 at 4:31 AM, Wojciech Turek <wjt27 at cam.ac.uk> wrote:
>> Hi David,
>>
>> I have seen similar errors with some storage arrays. They were
>> caused by arrays exporting volumes via more than a single path without a
>> multipath driver installed or configured properly. Sometimes the array
>> controllers require a special driver to be installed on the Linux host (for
>> example the RDAC mpp driver) to properly present and handle configured volumes
>> in the OS. What sort of disk RAID array are you using?
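
To make the multiple-path point concrete: a volume exported over more than one path shows up as several sdX devices that share one SCSI identifier. The sketch below groups devices by identifier, assuming the identifiers have already been collected per device (for example with the scsi_id call from the multipath config quoted above). The function name and the sample identifiers are made up for illustration.

```python
from collections import defaultdict


def multipath_candidates(wwid_by_device):
    """Group block devices by identifier; keep groups with more than one path."""
    groups = defaultdict(list)
    for device, wwid in wwid_by_device.items():
        groups[wwid].append(device)
    return {wwid: sorted(devs) for wwid, devs in groups.items() if len(devs) > 1}


# Hypothetical input: sdb and sdc are two paths to the same volume.
sample = {"sdb": "wwid-A", "sdc": "wwid-A", "sdd": "wwid-B"}
shared = multipath_candidates(sample)
```

Any identifier that appears in the result is reachable over more than one path and is something the multipath layer should be managing.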
>>
>> Best regards,
>>
>> Wojciech
>>
>> On 12 August 2010 17:58, David Noriega <tsk133 at my.utsa.edu> wrote:
>>>
>>> We just set up a Lustre system, and all looks good, but there is this
>>> nagging error that's floating about. When I reboot any of the nodes, be
>>> it an OSS or MDS, I will get this:
>>>
>>> [root@meta1 ~]# dmesg | grep sdc
>>> sdc : very big device. try to use READ CAPACITY(16).
>>> SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
>>> sdc: Write Protect is off
>>> sdc: Mode Sense: 77 00 10 08
>>> SCSI device sdc: drive cache: write back w/ FUA
>>> sdc : very big device. try to use READ CAPACITY(16).
>>> SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
>>> sdc: Write Protect is off
>>> sdc: Mode Sense: 77 00 10 08
>>> SCSI device sdc: drive cache: write back w/ FUA
>>> sdc:end_request: I/O error, dev sdc, sector 0
>>> Buffer I/O error on device sdc, logical block 0
>>> end_request: I/O error, dev sdc, sector 0
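
The log above carries two separate pieces of information: the reported size (4878622720 sectors of 512 bytes, which no longer fits the 32-bit block address of READ CAPACITY(10), hence the "very big device" hint) and the failed reads of sector 0. A rough sketch for pulling both out of dmesg text follows. The regular expressions are written against the lines quoted here and are an assumption about this kernel's message format; the function names are illustrative.

```python
import re

SECTORS_RE = re.compile(r"SCSI device (\w+): (\d+) (\d+)-byte hdwr sectors")
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize(dmesg_text):
    """Return ({dev: (sector_count, sector_size)}, {dev: set of failed sectors})."""
    sizes, errors = {}, {}
    for line in dmesg_text.splitlines():
        match = SECTORS_RE.search(line)
        if match:
            sizes[match.group(1)] = (int(match.group(2)), int(match.group(3)))
        match = IO_ERROR_RE.search(line)
        if match:
            errors.setdefault(match.group(1), set()).add(int(match.group(2)))
    return sizes, errors


def needs_read_capacity_16(sector_count):
    """True when the sector count does not fit in 32 bits."""
    return sector_count > 0xFFFFFFFF


log = """SCSI device sdc: 4878622720 512-byte hdwr sectors (2497855 MB)
sdc:end_request: I/O error, dev sdc, sector 0
end_request: I/O error, dev sdc, sector 0"""
sizes, errors = summarize(log)
count, sector_size = sizes["sdc"]
size_mb = round(count * sector_size / 10**6)  # decimal megabytes
```

For sdc this gives 2497854832640 bytes, which rounds to the 2497855 MB the kernel prints, and a single failing sector, 0, which is also the read that pvdisplay and lvdisplay trip over.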
>>>
>>> This doesn't seem to affect anything. fdisk -l doesn't even report the
>>> device. The same thing (though of course with a different block device,
>>> sdd or sde, on the OSSs) happens on all the nodes.
>>>
>>> If I run pvdisplay or lvdisplay, I'll get this:
>>> /dev/sdc: read failed after 0 of 4096 at 0: Input/output error
>>>
>>> Any ideas?
>>> David
>>> --
>>> Personally, I liked the university. They gave us money and facilities,
>>> we didn't have to produce anything! You've never been out of college!
>>> You don't know what it's like out there! I've worked in the private
>>> sector. They expect results. -Ray Ghostbusters
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>>
>>
>> --
>> Wojciech Turek
>>
>> Senior System Architect
>>
>> High Performance Computing Service
>> University of Cambridge
>> Email: wjt27 at cam.ac.uk
>> Tel: (+)44 1223 763517
>>
>
>
>