[Lustre-discuss] Large directory performance

Andreas Dilger andreas.dilger at oracle.com
Fri Sep 10 15:26:54 PDT 2010


On 2010-09-10, at 12:11, Michael Robbert wrote:
> Create performance is a flat line of ~150 files/sec across the board. Delete performance is all over the place, but no higher than 3,000 files/sec... Then yesterday I was browsing the Lustre Operations Manual and found section 33.8, which says Lustre is tested with directories as large as 10 million files in a single directory and still gets lookups at a rate of 5,000 files/sec. That leaves me wondering two things: how can we get 5,000 files/sec for anything, and why is our performance dropping off so suddenly after 20k files?
> 
> Here is our setup:
> All IO servers are Dell PowerEdge 2950s. 2 8-core sockets with X5355 @ 2.66GHz and 16GB of RAM.
> The data is on DDN S2A 9550s with 8+2 RAID configuration connected directly with 4Gb Fibre Channel.

Are you using the DDN 9550s for the MDT?  That would be a bad configuration, because they can only be configured with RAID-6, and it would explain the poor performance you are seeing.  For the MDT you always want RAID-1+0 storage.  With RAID-6, every 512-byte inode written to disk can potentially require writing many times that much data inside the array to keep the parity correct.
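
As a rough illustration of the parity overhead described above, here is a minimal Python sketch of the textbook read-modify-write model for a single small update.  The function name small_write_cost and the 4 KiB block size are assumptions made for illustration only; they are not DDN 9550 specifics, and a real controller may behave differently (write-back cache, full-stripe writes).

```python
INODE_BYTES = 512  # inode size mentioned in the text


def small_write_cost(raid_level, block_bytes=4096):
    """Return (disk_ios, bytes_moved) for one small in-place update.

    Textbook model only. block_bytes is a hypothetical minimum I/O
    unit of the array, not a measured value for any real controller.
    """
    if raid_level == "raid10":
        reads, writes = 0, 2   # write the block to both mirror copies
    elif raid_level == "raid6":
        reads, writes = 3, 3   # read old data, P, Q; write new data, P, Q
    else:
        raise ValueError(f"unsupported RAID level: {raid_level}")
    ios = reads + writes
    return ios, ios * block_bytes


if __name__ == "__main__":
    for level in ("raid10", "raid6"):
        ios, moved = small_write_cost(level)
        print(f"{level}: {ios} disk I/Os, "
              f"{moved // INODE_BYTES}x the inode size moved")
```

Under these assumptions the RAID-6 update costs three times as many disk operations as the mirrored one, and half of them are reads that must complete before the writes can be issued.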

For large filesystems, sites have used 12 or 24 small SAS disks (15k RPM) in RAID-1+0 to get high IOPS performance for the MDT.
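
To see why spindle count matters for the MDT, a back-of-the-envelope estimate of aggregate random-write IOPS can be sketched as below.  The per-disk figure of 180 IOPS is an assumed ballpark for a 15k RPM drive, and the function name mdt_write_iops and the write penalties (2 for RAID-1+0, 6 for RAID-6 read-modify-write) are illustrative textbook values, not measurements from any particular array.

```python
def mdt_write_iops(n_disks, per_disk_iops=180, write_penalty=2):
    """Estimate aggregate small random-write IOPS for an array.

    per_disk_iops is an assumed ballpark for one 15k RPM disk.
    write_penalty is the number of disk I/Os per logical write:
    2 for RAID-1+0, 6 for RAID-6 read-modify-write (textbook model).
    """
    if n_disks <= 0 or per_disk_iops <= 0 or write_penalty <= 0:
        raise ValueError("all arguments must be positive")
    return n_disks * per_disk_iops // write_penalty


if __name__ == "__main__":
    for n in (12, 24):
        print(f"RAID-1+0, {n} disks: ~{mdt_write_iops(n)} write IOPS")
    # Same model applied to a single 8+2 RAID-6 group for comparison.
    print(f"RAID-6, 10 disks: ~{mdt_write_iops(10, write_penalty=6)} write IOPS")
```

The absolute numbers depend entirely on the assumed per-disk rate; the point of the sketch is the ratio between the two layouts, not the totals.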

> We have as many as 1.4 million files in a single directory and we now have half a billion files that we need to deal with in one way or another.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
