[Lustre-discuss] Balancing I/O Load
Charles Taylor
taylor at hpc.ufl.edu
Thu Nov 29 09:55:44 PST 2007
We are seeing some disturbing (probably due to our ignorance)
behavior from Lustre 1.6.3 right now. We have 8 OSSs with 3 OSTs
per OSS (24 physical LUNs). We just created a brand-new Lustre file
system across this configuration using the default mkfs.lustre
formatting options. The file system is mounted across 400
clients.
At the moment, we have 63 IOzone threads running on roughly 60
different clients. The balance among the OSSs is terrible, and
within each OSS, the balance across the OSTs (LUNs) is even worse.
One OSS has a load of 100 while another is not being
touched at all. On several of the OSSs, only one OST (LUN) is being
used while the other two are ignored entirely.
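For anyone wanting to reproduce this observation, per-OST usage and object placement can be inspected from any client with the lfs utility; a minimal sketch, where /lustre and the test file path are hypothetical and should be replaced with your own:

```shell
# Show per-OST capacity and usage for the mounted Lustre file system
# (/lustre is a hypothetical mount point; substitute your own)
lfs df -h /lustre

# Show which OST(s) hold a given file's objects, to confirm whether
# striping is actually spreading that file's I/O across OSTs
lfs getstripe /lustre/testdir/testfile
```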
This is really just a bunch of random I/O (both large and small
block) from a bunch of random clients (as will occur in real life),
and our Lustre implementation is not making very good use of the
available resources. Can this be tuned? What are we doing
wrong? The 1.6 operations manual (version 1.9) does not say a lot
about options for balancing the workload among OSSs/OSTs.
Shouldn't Lustre be doing a better job (by default) of distributing
the workload?
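One thing worth noting is that the default stripe count is 1, so each file's objects land on a single OST; with only ~63 active files across 24 OSTs, the allocator's per-file placement can easily look unbalanced. A possible knob to experiment with (a sketch only, assuming the option-style lfs syntax; /lustre/scratch is a hypothetical directory):

```shell
# Widen the default stripe count on a directory so that new files
# created under it are striped across all available OSTs (-c -1).
# Stripe size and starting index are left at their defaults.
lfs setstripe -c -1 /lustre/scratch

# Verify the directory's new default striping
lfs getstripe /lustre/scratch
```

Wider striping spreads a single file's I/O across OSTs at the cost of more OST locks and RPCs per file, so it may or may not help a workload of many independent small files.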
Charlie Taylor
UF HPC Center
FWIW, the servers are dual-processor, dual-core Opterons (275s) with
4GB RAM each. They are running CentOS 5 with a
2.6.18-8.1.14.el5Lustre kernel (patched Lustre, SMP) and the deadline
I/O scheduler. If it matters, our OSTs sit atop LVM2 volumes (for
management). The back-end storage is all Fibre Channel RAID
(Xyratex). We have tuned the servers and know that we can get
roughly 500MB/s per server across a striped *local* file system.