[Lustre-discuss] ost_brw_write()

Mag Gam magawake at gmail.com
Wed Dec 31 12:25:13 PST 2008


Kevin:

Thanks for the response.

What do I need to change using ethtool? BTW, I am using ethernet
bonding to increase bandwidth. I suspect this could be causing the
problem...
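
For reference, a rough sketch of what turning off NIC offloading could look like on a bonded setup. It assumes the bond device is called "bond0" (a placeholder name) and that offload has to be switched off on each slave interface; rx, tx, sg and tso are standard `ethtool -K` keywords, but which ones a given e1000 driver accepts is not verified here. The script only prints the commands and does not run them.

```python
from pathlib import Path

# Offload features to turn off; these are standard `ethtool -K` keywords.
OFFLOADS = ("rx", "tx", "sg", "tso")


def bond_slaves(bond, sysfs_root="/sys/class/net"):
    """Return the slave interfaces of a bonding device, or [] if unreadable."""
    path = Path(sysfs_root) / bond / "bonding" / "slaves"
    try:
        return path.read_text().split()
    except OSError:
        return []


def offload_off_commands(interfaces, features=OFFLOADS):
    """Build one `ethtool -K <iface> <feature> off ...` command per interface."""
    commands = []
    for iface in interfaces:
        args = ["ethtool", "-K", iface]
        for feature in features:
            args += [feature, "off"]
        commands.append(args)
    return commands


if __name__ == "__main__":
    # "bond0" is an assumed device name; use it directly if no slaves are found.
    slaves = bond_slaves("bond0") or ["bond0"]
    for cmd in offload_off_commands(slaves):
        print(" ".join(cmd))  # print only; review, then run by hand as root
```

The current settings can be inspected beforehand with `ethtool -k <iface>` (lowercase k), and the same change would need to be made on both the clients and the servers.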

I am not sure if my applications are using mmap(). I am not aware of
an easy way to determine if they are.
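
One possible way to check, sketched under assumptions: on Linux, /proc/<pid>/maps lists every file a process currently has mapped, so filtering that listing by the Lustre mount point shows whether a running application is using mmap() on Lustre files. The mount point "/mnt/lustre" and the helper names below are made up for illustration.

```python
import sys


def mapped_files(maps_text, prefix):
    """Return (perms, path) pairs for file mappings whose path starts with prefix."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname
        perms, path = fields[1], fields[5].strip()
        if path.startswith(prefix):
            found.append((perms, path))
    return found


def read_maps(pid):
    """Read the raw mapping table of a running process."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_mmap.py <pid> [mount_point]
    mount = sys.argv[2] if len(sys.argv) > 2 else "/mnt/lustre"  # assumed mount point
    for perms, path in mapped_files(read_maps(sys.argv[1]), mount):
        print(perms, path)
```

A perms field ending in "s" (for example rw-s) marks a shared mapping, which is the case where a page can change after its checksum has been computed. Files accessed only through read() and write() do not appear in this listing.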



On Wed, Dec 31, 2008 at 12:34 PM, Kevin Van Maren
<Kevin.Vanmaren at sun.com> wrote:
> I have previously observed cases where an RX checksum offload NIC would
> pass packets up to Linux as "good" if the Ethernet CRC was valid, even
> though the UDP checksum failed. For some reason it appeared that
> something (the sender?) was corrupting a byte in the payload after
> calculating the UDP csum, but before the Ethernet CRC was calculated.
>
> So disable any NIC offloading on both sides (ethtool) and see if the Lustre
> csum errors go away.
>
> Also note that if you are using mmap'd files, it is _expected_ that the csum
> might not match, as the page can be modified between when the csum is
> calculated by Lustre and when the page is actually transmitted.
>
> Kevin
>
>
> Mag Gam wrote:
>>
>> I have done the tuning but still occasionally get a CSUM error, about
>> 200 per day. Considering we probably transfer close to 500 GB to 1 TB
>> of data a day, that is not that bad.
>>
>> I did the tuning on the e1000 card but I am not sure what else to do.
>> The network guys have found nothing wrong with their switch, and the
>> cables are fine (we even got them replaced).
>>
>> Since Lustre has its own checksumming, I suppose I am in good shape...
>>
>>
>>
>>
>> On Sat, Nov 15, 2008 at 10:59 AM, Mag Gam <magawake at gmail.com> wrote:
>>
>>>
>>> Brian. Thanks for getting back to me.
>>>
>>> Yes. The contents matched, but I am getting RX drops, which is kind of
>>> scary. I am using the same machine when doing the test.
>>>
>>> I have already looked at the Lnet tests
>>>
>>>
>>> http://manual.lustre.org/manual/LustreManual16_HTML/LustreIOKit.html#50642990_pgfId-1290255
>>>
>>> For some reason, "lst add_group servers ipaddrs_of_OSS_and_MDS" gets
>>> me an RPC error, but it seems my 5 servers get added. Weird. Is there
>>> better documentation, or perhaps an example, for the LNET tests? I am
>>> curious to try them.
>>>
>>> BTW, I am very happy to see this
>>>
>>> http://manual.lustre.org/manual/LustreManual16_HTML/LustreTuning.html#50642992_24952
>>> (last section, regarding CRC). Where can I read more about this?
>>>
>>>
>>>
>>> Keep in mind, I am using e1000 NICs, and I think there is some tuning
>>> I should be doing (but I am not certain I am doing the right
>>> tuning).
>>>
>>> TIA
>>>
>>> On Fri, Nov 14, 2008 at 7:11 AM, Brian J. Murrell <Brian.Murrell at sun.com>
>>> wrote:
>>>
>>>>
>>>> On Thu, 2008-11-13 at 21:32 -0500, Mag Gam wrote:
>>>>
>>>>>
>>>>> OK.
>>>>>
>>>>> It seems Lustre FS is dropping the packets.
>>>>>
>>>>
>>>> No.  Nobody said anything about packets being dropped.  They are failing
>>>> checksum.
>>>>
>>>>
>>>>>
>>>>>  I did multiple FTPs and
>>>>> they were very large files (10GB each), and no packet drops
>>>>>
>>>>
>>>> Did you verify the contents of what you ftp'd matched the original?  Are
>>>> you using the same machines in your ftp tests that are reporting
>>>> checksum failures with Lustre?
>>>>
>>>> You might want to look in our test suite and see if there is a checksum
>>>> unit test.  I'd be surprised if there is not.  Maybe run that and see
>>>> what the results are.  I'm afraid I don't have a lustre source tree very
>>>> handy at the moment to check for you.
>>>>
>>>> b.
>>>>
>>>>
>>>> _______________________________________________
>>>> Lustre-discuss mailing list
>>>> Lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>>>
>>>>
>>
>>
>
>


