[Lustre-discuss] Slow read performance after ~700MB-2GB

Wed Jun 9 12:45:33 PDT 2010

On 2010-06-09, at 13:41, Alexander Oltu wrote:
> On Wed, 9 Jun 2010 10:29:36 -0700 Jason Rappleye wrote:
>> Is vm.zone_reclaim_mode set to a value other than zero?
> 
> Yes, it was 1. As soon as I set it to 0, the problem disappeared.
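
For anyone wanting to check the same setting on their own nodes, here is a minimal sketch that reads the sysctl through /proc and decodes its bits. The bit meanings follow the kernel's sysctl documentation; the function names are made up for illustration, and the script only reads the value, it does not change it.

```python
from pathlib import Path

# Bit meanings as documented for the vm.zone_reclaim_mode sysctl.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim enabled",
    2: "zone reclaim writes out dirty pages",
    4: "zone reclaim swaps pages",
}


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Return the current mode as an int, or None if the file is absent."""
    p = Path(path)
    if not p.exists():
        return None  # e.g. a kernel built without NUMA support
    return int(p.read_text().strip())


def describe_mode(mode):
    """List the behaviours switched on by the given bitmask."""
    return [text for bit, text in ZONE_RECLAIM_BITS.items() if mode & bit]


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("vm.zone_reclaim_mode is not available on this system")
    else:
        print(f"vm.zone_reclaim_mode = {mode}: {describe_mode(mode) or 'disabled'}")
```

A value of 0 corresponds to the setting that made the problem go away in this thread.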

Interesting, I have never heard of this problem before.  Is this a client-side parameter, or on the server?

>> This sounds a lot like a problem we recently experienced when a BIOS
>> upgrade changed the ACPI SLIT table, which specifies the
>> distances between NUMA nodes in a system. It put the remote node
>> distance over the threshold the kernel uses to decide whether or not
>> to enable the inline zone reclaim path. At least on SLES, Lustre
>> doesn't seem to be able to free up pages in the page cache in this
>> path, and performance dropped to 2-4MB/s. In my test case I was
>> issuing 2MB I/Os and the kernel only let 1-2 I/Os trickle out per
>> second, so that matches up with what you're seeing.
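
The distance check described above can be approximated from userspace. The sketch below reads the per-node distance rows that the kernel exports under /sys/devices/system/node and reports whether any remote distance is above a threshold. The threshold of 20 is an assumption (it was the usual default for RECLAIM_DISTANCE in kernels of this era, but it is architecture- and version-dependent), and the helper names are hypothetical.

```python
import glob
import re

# Assumption: 20 was a common default RECLAIM_DISTANCE at the time;
# check the kernel source for the value on your architecture.
ASSUMED_RECLAIM_DISTANCE = 20


def parse_distance_row(text):
    """Parse one sysfs distance file, e.g. '10 21' -> [10, 21]."""
    return [int(tok) for tok in text.split()]


def remote_distances(matrix):
    """Yield (from_index, to_index, distance) for every off-diagonal entry."""
    for i, row in enumerate(matrix):
        for j, dist in enumerate(row):
            if i != j:
                yield i, j, dist


def exceeds_threshold(matrix, threshold=ASSUMED_RECLAIM_DISTANCE):
    """True if any remote node distance is strictly above the threshold."""
    return any(dist > threshold for _, _, dist in remote_distances(matrix))


def read_distance_matrix(pattern="/sys/devices/system/node/node*/distance"):
    """Read one distance row per NUMA node, ordered by node number."""
    def node_id(path):
        return int(re.search(r"node(\d+)", path).group(1))

    rows = []
    for path in sorted(glob.glob(pattern), key=node_id):
        with open(path) as fh:
            rows.append(parse_distance_row(fh.read()))
    return rows


if __name__ == "__main__":
    matrix = read_distance_matrix()
    for i, j, dist in remote_distances(matrix):
        print(f"node index {i} -> {j}: distance {dist}")
    if exceeds_threshold(matrix):
        print("a remote distance is above the assumed threshold; "
              "the kernel may have enabled zone reclaim at boot")
```

Comparing the output before and after a BIOS upgrade would show whether the SLIT values changed in the way described here.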

Jason, do you have enough of an understanding of this codepath to know why Lustre is not freeing pages in this case?  Is it because Lustre just doesn't have a VM callback that frees pages at all, or is it somehow ignoring the requests from the kernel to free up the pages?

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.