[Lustre-discuss] Slow read performance after ~700MB-2GB

Jason Rappleye jason.rappleye at nasa.gov
Wed Jun 9 10:29:36 PDT 2010


Is vm.zone_reclaim_mode set to a value other than zero?
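
You can check with sysctl; any non-zero value enables the inline zone
reclaim path. A quick check and workaround, assuming you have root on
the login node:

# show the current setting (0 = inline zone reclaim disabled)
sysctl vm.zone_reclaim_mode
# as a test/workaround, disable it at runtime
sysctl -w vm.zone_reclaim_mode=0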

This sounds a lot like a problem we recently experienced, where a BIOS
upgrade changed the ACPI SLIT table, which specifies the distances
between NUMA nodes in a system. It put the remote node distance over
the threshold the kernel uses to decide whether or not to enable the
inline zone reclaim path. At least on SLES, Lustre doesn't seem to be
able to free up page cache pages in this path, and performance
dropped to 2-4MB/s. In my test case I was issuing 2MB I/Os and the
kernel only let 1-2 I/Os trickle out per second, so that matches up
with what you're seeing.
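
If you want to confirm what the kernel sees, the SLIT distances are
visible from userspace. A minimal check (assuming numactl is
installed; the RECLAIM_DISTANCE threshold is 20 on kernels of that
vintage, as far as I know):

# node distances as reported by the firmware's SLIT table
numactl --hardware
# or read them directly; a remote distance above RECLAIM_DISTANCE
# causes the kernel to enable zone reclaim at boot
cat /sys/devices/system/node/node*/distance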

j


On Jun 9, 2010, at 7:44 AM, Alexander Oltu wrote:

> We are experiencing strange behavior on our Lustre setup. The problem
> appears only while reading files. It is a Cray XT4 machine and this
> problem is reproducible only on login nodes, while compute nodes are
> fine.
>
> To reproduce it we run:
> cd /lustrefs
> dd if=/dev/zero of=test.file bs=1024k count=5000
> # drop cache
> echo 1 > /proc/sys/vm/drop_caches
> dd if=test.file of=/dev/null bs=1024k &
> # check dd status:
> kill -USR1 %1
>
> After 700MB-2GB of reading, the speed drops to 1-2 MB/s and iowait
> grows to 50% (one full CPU):
> hexgrid:~ # vmstat 3
> procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
>  r  b   swpd    free   buff   cache   si   so   bi   bo   in    cs  us sy id wa st
>  0  1      0 4016432      0 3349580    0    0    0    0  253   109   0  0 99  0  0
>  0  1      0 4016788      0 3349240    0    0    0    0  515  2322   9  6 38 47  0
>  0  1      0 4016836      0 3349512    0    0    0    0  505    90   0  0 50 50  0
>  0  1      0 4016860      0 3349716    0    0    0    0  505    82   0  0 50 50  0
>  0  1      0 4016860      0 3349784    0    0    0    0  505    85   0  0 50 50  0
>  0  1      0 4016856      0 3349920    0    0    0    0  505    85   0  0 50 50  0
>  0  1      0 4017048      0 3349852    0    0    0    0  505    86   0  0 50 50  0
>
> hexgrid:~ # mpstat -P ALL 3
> Linux 2.6.16.60-0.39_1.0102.4784.2.2.48B-ss (hexgrid.bccs.uib.no)  05/11/2010
>
> 03:35:12 PM  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal   %idle  intr/s
> 03:35:15 PM  all   0.00   0.00  0.00    49.92  0.00   0.00    0.00   50.08  504.67
> 03:35:15 PM    0   0.00   0.00  0.00     0.00  0.00   0.00    0.00  100.00    0.00
> 03:35:15 PM    1   0.00   0.00  0.00    99.67  0.00   0.00    0.00    0.00  504.67
>
> 03:35:15 PM  CPU  %user  %nice  %sys  %iowait  %irq  %soft  %steal   %idle  intr/s
> 03:35:18 PM  all   0.00   0.00  0.00    50.00  0.00   0.00    0.00   50.00  503.99
> 03:35:18 PM    0   0.00   0.00  0.00     0.00  0.00   0.00    0.00  100.00    0.00
> 03:35:18 PM    1   0.00   0.00  0.00   100.00  0.00   0.00    0.00    0.00  503.99
>
> I am not sure where to look, or whether it is a Lustre or a hardware
> problem. There are no messages in dmesg.
>
> Thanks,
> Alex.

--
Jason Rappleye
System Administrator
NASA Advanced Supercomputing Division
NASA Ames Research Center
Moffett Field, CA 94035
