[Lustre-discuss] High I/O write pb

hindisvik at gmail.com
Thu Mar 6 00:56:49 PST 2008


Hi,

I have been using Lustre 1.6 with 3 OSTs and 1 MDS for about a year. Everything
has been working fine, but for some weeks now, without any increase in traffic
from my web clients (7 Linux servers), the load average on the MDS has been
growing higher and higher.
When I run top I see:

top - 09:50:53 up 205 days,  6:57,  1 user,  load average: 3.63, 3.29, 3.23
Tasks: 191 total,   3 running, 188 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.2% us,  4.5% sy,  0.0% ni, 94.2% id,  0.6% wa,  0.1% hi,  0.5% si
Mem:   1033480k total,   919444k used,   114036k free,   562172k buffers
Swap:  1052216k total,      540k used,  1051676k free,    14932k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3298 root      15   0     0    0    0 R 11.5  0.0   4012:54 socknal_sd00
 3407 root      15   0     0    0    0 S  1.9  0.0 383:01.30 ldlm_cn_02
 3408 root      15   0     0    0    0 S  1.9  0.0 383:30.09 ldlm_cn_03
 3448 root      15   0     0    0    0 S  1.9  0.0   1157:19 ll_mdt_07
 3449 root      15   0     0    0    0 S  1.9  0.0   1156:52 ll_mdt_08
 3450 root      15   0     0    0    0 S  1.9  0.0   1156:50 ll_mdt_09
 3451 root      15   0     0    0    0 S  1.9  0.0   1156:45 ll_mdt_10
 3455 root      15   0     0    0    0 S  1.9  0.0   1157:36 ll_mdt_14
 3459 root      15   0     0    0    0 S  1.9  0.0   1156:17 ll_mdt_18
 3461 root      15   0     0    0    0 S  1.9  0.0   1157:09 ll_mdt_20
 3464 root      15   0     0    0    0 S  1.9  0.0   1156:53 ll_mdt_23
 3466 root      15   0     0    0    0 S  1.9  0.0   1157:21 ll_mdt_25
 3467 root      16   0     0    0    0 S  1.9  0.0   1157:04 ll_mdt_26
28702 root      15   0     0    0    0 R  1.9  0.0  24:16.25 ll_mdt_rdpg_11
    1 root      15   0  1744  568  492 S  0.0  0.1   0:01.58 init
    ...
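Most of that CPU time is spread across the ll_mdt_* service threads. A rough sketch of how to total them up (my own one-liner, not a Lustre tool, and it assumes the thread names start with ll_mdt):

```shell
# Sum the CPU% of all Lustre MDS service threads (ll_mdt_*) as reported by ps,
# to see their combined share at a glance. Prints 0.0 on a machine without them.
ps -eo comm,pcpu | awk '$1 ~ /^ll_mdt/ { total += $2 }
                        END { printf "ll_mdt total %%CPU: %.1f\n", total + 0 }'
```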

Four weeks ago this load average (now 3.63) was about 1.00, which is strange.
I'd like to know what is making the load on my MDS grow like that, but how?
Is there a command for this?
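This is the kind of sampling I'm after, assuming Lustre 1.6 exposes per-operation MDS counters under /proc/fs/lustre (the exact path is a guess on my part and may vary by version):

```shell
# Hedged sketch: sample the MDS stats file twice, a few seconds apart, to see
# which operations (open, close, getattr, setattr, ...) are increasing fastest.
STATS=$(ls /proc/fs/lustre/mds/*/stats 2>/dev/null | head -1)
if [ -n "$STATS" ]; then
    cat "$STATS"
    sleep 10
    cat "$STATS"
else
    echo "no Lustre MDS stats file found under /proc/fs/lustre"
fi
```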

It seems to be I/O writes, because when I run iostat I see more and more
writes:
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22.15       140.14       124.43 2470320067 2193374254

Writes are now about 124.43 blocks written/s; they were around 50 blocks
written/s some weeks ago.
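Assuming iostat is reporting in its default 512-byte blocks, that write rate works out to roughly:

```shell
# 124.43 blocks/s * 512 bytes/block, converted to KB/s (512-byte blocks assumed)
awk 'BEGIN { printf "%.1f KB/s\n", 124.43 * 512 / 1024 }'
# prints 62.2 KB/s
```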

Are there any commands to find out what is making the load grow like that,
and which files are being written on my MDS?
Thanks in advance for ANY suggestions.

Best regards,

R. Hindisvik


