[Lustre-discuss] low performance maybe related to quota
gregoire.pichon at bull.net
Mon Jan 30 02:09:37 PST 2012
Hi,
If someone could have a look, this would be very helpful; I have no idea
what else to check.
I am running a performance test (ES4) on a Lustre filesystem installed
with Lustre 2.1 plus a few Bull patches, and I observe very low throughput
compared to what I usually measure on the same hardware.
Write bandwidth varies between 150 MB/s and 500 MB/s when running as a
standard user. With the exact same parameters and configuration, but
running as root, I get around 2000 MB/s write bandwidth, which is what I
usually observe.
Profiling the Lustre client shows that more than 50% of the time is spent
in the osc_quota_chkdq() routine. So this seems related to the quota
subsystem, and it certainly explains why the root user is not impacted by
the problem.
Quotas are disabled on the client:
# lfs quota /b9
user quotas are not enabled.
group quotas are not enabled
There are no quota parameters stored on the MDT or on any of the 15 OSTs:
# tunefs.lustre /dev/loop1
checking for existing Lustre data: found CONFIGS/mountdata
Reading CONFIGS/mountdata
Read previous values:
Target: b9-MDT0000
Index: 0
Lustre FS: b9
Mount type: ldiskfs
Flags: 0x1
(MDT )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
lov.stripecount=2 lov.stripesize=1048576 network=o2ib0
# for dev in `mount -t lustre | cut -d' ' -f1`; do tunefs.lustre $dev | grep "^Parameters" | sort -u; done
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.37@o2ib failover.node=60.64.0.39@o2ib
failover.node=60.64.0.36@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.37@o2ib failover.node=60.64.0.39@o2ib
failover.node=60.64.0.36@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2
failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2
failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.39@o2ib1 failover.node=160.64.0.36@o2ib1
failover.node=160.64.0.37@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.36@o2ib3 failover.node=161.64.0.37@o2ib3
failover.node=161.64.0.39@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.37@o2ib1 failover.node=160.64.0.39@o2ib1
failover.node=160.64.0.36@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=60.64.0.39@o2ib failover.node=60.64.0.36@o2ib
failover.node=60.64.0.37@o2ib network=o2ib0
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.36@o2ib1 failover.node=160.64.0.37@o2ib1
failover.node=160.64.0.39@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.36@o2ib2 failover.node=61.64.0.37@o2ib2
failover.node=61.64.0.39@o2ib2 network=o2ib2
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=160.64.0.37@o2ib1 failover.node=160.64.0.39@o2ib1
failover.node=160.64.0.36@o2ib1 network=o2ib1
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.36@o2ib3 failover.node=161.64.0.37@o2ib3
failover.node=161.64.0.39@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.37@o2ib3 failover.node=161.64.0.39@o2ib3
failover.node=161.64.0.36@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=161.64.0.39@o2ib3 failover.node=161.64.0.36@o2ib3
failover.node=161.64.0.37@o2ib3 network=o2ib3
Parameters:
mgsnode=60.64.2.84@o2ib,160.64.2.84@o2ib1,61.64.2.84@o2ib2,161.64.2.84@o2ib3
failover.node=61.64.0.39@o2ib2 failover.node=61.64.0.36@o2ib2
failover.node=61.64.0.37@o2ib2 network=o2ib2
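One aside on the command above: because the `sort -u` runs inside the loop, it deduplicates only within a single device's output, so identical parameter blocks from different targets still repeat in the listing. A minimal sketch of deduplicating across all targets, using inline sample records (hypothetical values) in place of real tunefs.lustre output:

```shell
# Collect one parameter record per target, then sort -u once over the
# whole stream so duplicates across targets collapse. The printf lines
# are stand-ins for the per-device tunefs.lustre output.
printf '%s\n' \
  'Parameters: mgsnode=60.64.2.84@o2ib failover.node=60.64.0.37@o2ib network=o2ib0' \
  'Parameters: mgsnode=60.64.2.84@o2ib failover.node=60.64.0.37@o2ib network=o2ib0' \
  'Parameters: mgsnode=60.64.2.84@o2ib failover.node=61.64.0.36@o2ib2 network=o2ib2' \
  | sort -u
# prints only the two unique records
```

With the real loop, the same effect comes from moving `sort -u` after `done` so it sees the concatenated output of every device.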
Thanks in advance,
Grégoire.
--
Grégoire PICHON
Software Developer, Lustre - Extreme Computing R&D
Bull, Architect of an Open World
Phone: +33 4 76 29 70 63
http://www.bull.com