[Lustre-discuss] strange slow parallel writes
Papp Tamás
tompos at martos.bme.hu
Mon Feb 18 13:39:18 PST 2008
Dear All,
I have a strange problem, and I have reached the point where I have no
idea what is happening.
The cluster has 2 meta servers (meta1 and meta2) and 6 nodes (node1-6).
The meta servers run CentOS 5, the nodes run CentOS 4.
Node1, 5 and 6 run kernel 2.6.9-55.0.9.EL_lustre.1.6.4.1smp, the others run
2.6.9-42.0.10.EL_lustre-1.6.0.1custom-drbd.
The nodes are paired as DRBD peers: node1-2, and so on.
The nodes have 8 SATA disks on Adaptec 2610S and 2620S RAID adapters, and 3
NICs (main network, LNET, DRBD).
These are the symptoms:
Parallel reads are OK, fast and quiet. A single write is OK.
Parallel writes with a few (for example 3-4) clients are slow; above that
they get stuck.
The load on one or two nodes is high and growing, and the kernel is in
io-wait. Usually these two nodes are node4 and node3 (with file striping);
node4 has a load of, for example, 30-40-50, and node3 has approximately
half of that.
The problem is that this worked fine half a year ago.
Do you have any ideas or tips?
Thanks,
tamas