[Lustre-discuss] slow direct_io , slow journal .. in OST log
Lex
lexluthor87 at gmail.com
Sun Jan 24 18:42:20 PST 2010
Hi list
I have one OSS with the following hardware:
CPU: Intel(R) Xeon E5420, 2.5 GHz
Chipset: Intel 5000P
RAM: 8 GB
On this OSS we use 2 RAID-5 arrays as OSTs (each has 4 x 1.5 TB hard
drives behind an Adaptec 5805 RAID controller).
It ran quite smoothly before, but about 2 weeks ago I started seeing many
warnings (at least I think they are warnings) like this in /var/log/messages:
Jan 25 08:41:23 OST6 kernel: Lustre: 9587:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 35s
Jan 25 08:41:34 OST6 kernel: Lustre: 9608:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 41s
Jan 25 08:41:34 OST6 kernel: Lustre: 9608:0:(filter_io_26.c:706:filter_commitrw_write()) Skipped 2 previous similar messages
Jan 25 08:41:35 OST6 kernel: Lustre: 9645:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 43s
Jan 25 08:58:10 OST6 kernel: Lustre: 9646:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 31s
Jan 25 08:59:39 OST6 kernel: Lustre: 9609:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 30s
Jan 25 09:01:05 OST6 kernel: Lustre: 9587:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 33s
Jan 25 09:03:23 OST6 kernel: Lustre: 9633:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 32s
Jan 25 09:11:25 OST6 kernel: Lustre: 9585:0:(filter_io_26.c:706:filter_commitrw_write()) lustre-OST0006: slow direct_io 36s
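To get a quick overview of how often these warnings occur and how long the delays are, something like the following Python sketch could be used. It assumes the messages have exactly the shape shown in the excerpt above ("<target>: slow <operation> <N>s") and may be wrapped across lines. The names SLOW_RE and summarize_slow_ops are hypothetical helpers for illustration, not part of Lustre.

```python
import re
from collections import defaultdict

# Assumed message shape, taken from the log excerpt above:
#   "... lustre-OST0006: slow direct_io 35s"
SLOW_RE = re.compile(r"(?P<target>\S+): slow (?P<op>\w+) (?P<secs>\d+)s")


def summarize_slow_ops(log_text):
    """Return {(target, op): {"count", "max_s", "avg_s"}} for slow-op warnings."""
    # Log lines pasted into mail are often wrapped, so collapse all
    # whitespace into single spaces before matching.
    flat = " ".join(log_text.split())
    delays = defaultdict(list)
    for m in SLOW_RE.finditer(flat):
        delays[(m.group("target"), m.group("op"))].append(int(m.group("secs")))
    return {
        key: {"count": len(v), "max_s": max(v), "avg_s": sum(v) / len(v)}
        for key, v in delays.items()
    }
```

Feeding it the contents of /var/log/messages (read into a string) gives one summary entry per OST and operation. "Skipped N previous similar messages" lines are not counted, so the real number of slow operations may be higher than the count reported.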
I googled around and found that this can be caused by a problem with
oss_num_threads. The formula I found in the 1.8 manual
(thread_number = RAM * CPU cores / 128 MB) gives 256 for this machine, but
I brought it down to 64 anyway:
options ost oss_num_threads=64
It still didn't help.
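For reference, the arithmetic behind the 256 figure can be sketched as below. This assumes the 8 GB of RAM listed above and 4 CPU cores (a single quad-core E5420, which is an assumption on my part but matches the result of 256). The function name suggested_oss_threads is made up for illustration.

```python
def suggested_oss_threads(ram_mb, cpu_cores, mb_per_thread=128):
    """thread_number = RAM * CPU cores / 128 MB, as quoted from the 1.8 manual."""
    return ram_mb * cpu_cores // mb_per_thread


# Assumed inputs: 8 GB RAM, 4 cores (one quad-core E5420).
print(suggested_oss_threads(8 * 1024, 4))  # 8192 * 4 / 128 = 256
```

So the configured value of 64 is a quarter of what the formula suggests for this hardware.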
I thought it was only a harmless warning, but I may be wrong: our performance
has dropped quite heavily (there may be other reasons, but for now the slow
direct_io problem is my only suspect).
iostat -m 1 1
Linux 2.6.18-92.1.17.el5_lustre.1.8.0custom (OST6)   01/25/2010

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
            0.01    0.02    2.86   25.01    0.00   72.10

Device:         tps   MB_read/s   MB_wrtn/s    MB_read   MB_wrtn
sda            1.30        0.01        0.00      11386      3469
sdb            1.30        0.01        0.00      11531      3469
sdc          131.50       12.40        0.26   11793218    249934
sdd          178.46       18.00        0.26   17124065    250334
md2            3.33        0.02        0.00      22915      2634
md1            0.00        0.00        0.00          0         0
md0            0.00        0.00        0.00          0         0
drbd3        480.10       12.39        0.26   11789047    249639
drbd6        565.85       14.89        0.26   14168452    249211
So, could anyone please tell me whether this warning affects our system
performance or not? If it does, please give me a solution or some advice
on how to resolve it.
Best regards