[Lustre-discuss] lustre and small files overhead

Andreas Dilger adilger at sun.com
Mon Mar 3 11:16:30 PST 2008


Joe Barjo <jobarjo78 at yahoo.fr> wrote:
> Turning off debugging made it much better.
> It went from 1m54 down to 25 seconds, but still 85% system CPU time...
> I really think you should turn off debugging by default, or make it appear
> as a BIG warning message.
>
> People trying Lustre for the first time are not going to be debugging Lustre.
> Also, although the debugging was documented, it was not clear from the
> documentation that disabling it would improve performance so much.
> 
> I will now run more tests and see how the coherency holds up...
> Thanks for your support.

What version of Lustre are you using?  We have turned down the default
debugging level in more recent versions of Lustre.
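
If you are on an older release, you can disable the debug logging
yourself with the same sysctl mentioned below.  A minimal sketch (the
/etc/sysctl.conf step is just the standard way to persist any sysctl,
nothing Lustre-specific):

    # disable all Lustre/LNET debug logging at runtime
    sysctl -w lnet.debug=0

    # keep it disabled across reboots
    echo "lnet.debug = 0" >> /etc/sysctl.conf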

> Andreas Dilger wrote:
> 	On Feb 29, 2008  15:37 +0100, Joe Barjo wrote:
>
> 		We have a (small) 30-node SGE-based cluster running CentOS 4, which will
> 		grow to a maximum of 50 Core Duos.
> 		We use custom software based on gmake to launch parallel compilation
> 		and computations with lots of small files and some large files.
> 		We currently use NFS and have a lot of problems with cache coherency
> 		between nodes.
> 		
> 		I'm currently evaluating Lustre and have some questions about Lustre's
> 		overhead with small files.
> 		I successfully installed the RPMs on a test machine and launched the
> 		local llmount.sh script.
>
> 	Note that if you are using the unmodified llmount.sh script this is running
> 	on loopback files in /tmp, so the performance is likely quite bad.  For
> 	a realistic performance measure, put the MDT and OST on separate disks.
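> 	
> 	As a minimal sketch (the device names and MGS address below are
> 	hypothetical placeholders for your own hardware):
> 	
> 	# combined MGS/MDT on its own disk (placeholder /dev/sdb)
> 	mkfs.lustre --fsname=testfs --mgs --mdt /dev/sdb
> 	mount -t lustre /dev/sdb /mnt/mdt
> 	
> 	# OST on a separate disk, pointed at the MGS node
> 	mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.0.1@tcp0 /dev/sdc
> 	mount -t lustre /dev/sdc /mnt/ost0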
>
> 		The first thing I tried was an svn checkout into it (lots of small
> 		files...).
> 		It takes 1m54 from our local svn server, versus 15s into a local ext3
> 		filesystem and 50s over NFS.
> 		During the checkout, the processor (amd64 3200) is busy at 90% system time.
> 		
> 		Why is there so much system CPU usage?
>
> 	Have you turned off debugging (sysctl -w lnet.debug=0)?
> 	Have you increased the DLM lock LRU sizes?
> 	
> 	for L in /proc/fs/lustre/ldlm/namespaces/*/lru_size; do
> 	    echo 10000 > $L
> 	done
> 	
> 	In 1.6.5/1.8.0 it will be possible to use a new command to set
> 	this kind of parameter more easily:
> 	
> 	lctl set_param ldlm.namespaces.*.lru_size=10000
>
> 		Is there something to tweak to lower this overhead?
> 		Is there a specific tweak for small files?
>
> 	Not really, this isn't Lustre's strongest point.
>
> 		Using multiple server nodes, will the performance be better?
>
> 	Partly.  There can only be a single MDT per filesystem, but it can
> 	scale quite well with multiple clients.  There can be many OSTs,
> 	but it isn't clear whether you are IO-bound.  It probably wouldn't
> 	hurt to have a few OSTs to get a higher aggregate IOPS rate.
> 	
> 	Note that increasing the OST count also allows clients to cache more
> 	dirty data by default (32MB/OST).  You can change this manually; the
> 	default is tuned for very large clusters (1000's of nodes).
> 	
> 	for C in /proc/fs/lustre/osc/*/max_dirty_mb; do
> 	    echo 256 > $C
> 	done
> 	
> 	Similarly, in 1.6.5/1.8.0 it will be possible to do:
> 	
> 	lctl set_param osc.*.max_dirty_mb=256
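> 	
> 	To verify that the new values took effect, you can simply read the
> 	same /proc files back:
> 	
> 	cat /proc/fs/lustre/osc/*/max_dirty_mb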
> 	
> 	Cheers, Andreas
> 	--
> 	Andreas Dilger
> 	Sr. Staff Engineer, Lustre Group
> 	Sun Microsystems of Canada, Inc.



Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



