[Lustre-discuss] ost_brw_write()

Mag Gam magawake at gmail.com
Tue Dec 30 19:06:05 PST 2008


I have done the tuning but still occasionally get CSUM errors, about
200 per day. Considering that we probably transfer somewhere between
500 GB and 1 TB of data a day, that is not that bad.
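
To put those figures in perspective, here is a rough sketch in Python
that turns 200 errors per day into a rate per gigabyte and per bulk
RPC. The 1 MiB RPC size is an assumption used only for illustration
(it is not stated anywhere in this thread), and the function name is
made up for the example.

```python
def checksum_error_rate(errors_per_day, bytes_per_day, rpc_bytes=1 << 20):
    """Return the checksum error rate per GB and per bulk RPC.

    rpc_bytes is an ASSUMED bulk RPC size (1 MiB); adjust it to match
    the actual configuration.
    """
    gigabytes = bytes_per_day / 1e9
    rpcs = bytes_per_day / rpc_bytes
    return {
        "errors_per_gb": errors_per_day / gigabytes,
        "errors_per_rpc": errors_per_day / rpcs,
    }


# The two ends of the daily volume quoted above.
for label, volume in (("500 GB", 500e9), ("1 TB", 1e12)):
    rate = checksum_error_rate(200, volume)
    print(f"{label}/day: {rate['errors_per_gb']:.2f} errors per GB, "
          f"about 1 in {1 / rate['errors_per_rpc']:.0f} RPCs")
```

Under that assumption the rate works out to a fraction of an error per
gigabyte, which is small but still far more than a healthy network
path should produce.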

I did the tuning on the e1000 card but I am not sure what else to do.
The network guys found nothing wrong with their switch, and the cables
are fine (we even had them replaced).
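
One way to check the NIC side independently of the switch is to sample
the kernel's per-interface receive counters before and after a large
transfer. The following is a minimal sketch, assuming a Linux host that
exposes these counters under /sys/class/net/<iface>/statistics. The
helper names and the interface name are illustrative and do not come
from this thread.

```python
from pathlib import Path

# Standard Linux per-interface receive counters of interest here.
COUNTERS = ("rx_packets", "rx_dropped", "rx_errors", "rx_crc_errors")


def read_rx_counters(iface, sysfs_root="/sys/class/net"):
    """Read the receive counters for one interface from sysfs.

    Counters that are missing on this system are skipped, so the
    result may contain fewer keys than COUNTERS.
    """
    stats_dir = Path(sysfs_root) / iface / "statistics"
    counters = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


def counter_deltas(before, after):
    """Return how much each counter grew between two samples."""
    return {name: after[name] - before[name]
            for name in after if name in before}
```

Typical use would be to call read_rx_counters("eth0") before a
transfer (substituting the real interface name), call it again
afterwards, and pass both results to counter_deltas. A growing
rx_dropped or rx_crc_errors during the transfer points at the host or
the link rather than at Lustre.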

Since Lustre has its own checksumming, I suppose I am in good shape...




On Sat, Nov 15, 2008 at 10:59 AM, Mag Gam <magawake at gmail.com> wrote:
> Brian. Thanks for getting back to me.
>
> Yes. The contents matched, but I am getting RX drops, which is kind of
> scary. I am using the same machine when doing the test.
>
> I have already looked at the Lnet tests
>
> http://manual.lustre.org/manual/LustreManual16_HTML/LustreIOKit.html#50642990_pgfId-1290255
>
> For some reason, "lst add_group servers ipaddrs_of_OSS_and_MDS" gives
> me an RPC error, but it seems my 5 servers get added. Weird. Is there
> better documentation, or perhaps an example, for the LNET tests? I am
> curious to try them.
>
> BTW, I am very happy to see this
> http://manual.lustre.org/manual/LustreManual16_HTML/LustreTuning.html#50642992_24952
> (last section, regarding CRC). Where can I read more about this?
>
>
>
> Keep in mind, I am using e1000 NICs, and I think there is some tuning
> I should be doing (but I am not certain if I am doing the right
> tuning)
>
> TIA
>
> On Fri, Nov 14, 2008 at 7:11 AM, Brian J. Murrell <Brian.Murrell at sun.com> wrote:
>> On Thu, 2008-11-13 at 21:32 -0500, Mag Gam wrote:
>>> OK.
>>>
>>> It seems Lustre FS is dropping the packets.
>>
>> No.  Nobody said anything about packets being dropped.  They are failing
>> checksum.
>>
>>>  I did multiple FTPs and
>>> they were very large files (10GB each), and no packet drops
>>
>> Did you verify the contents of what you ftp'd matched the original?  Are
>> you using the same machines in your ftp tests that are reporting
>> checksum failures with Lustre?
>>
>> You might want to look in our test suite and see if there is a checksum
>> unit test.  I'd be surprised if there is not.  Maybe run that and see
>> what the results are.  I'm afraid I don't have a lustre source tree very
>> handy at the moment to check for you.
>>
>> b.
>>
>


