[Lustre-discuss] Lustre HA Experiences
Jason Rappleye
jason.rappleye at nasa.gov
Wed May 4 14:16:28 PDT 2011
On May 4, 2011, at 10:05 AM, Charles Taylor wrote:
>
> We are dipping our toes into the waters of Lustre HA using
> pacemaker. We have 16 7.2 TB OSTs across 4 OSSs (4 OSTs each).
> The four OSSs are broken out into two dual-active pairs running Lustre
> 1.8.5. Mostly, the water is fine, but we've encountered a few
> surprises.
>
> 1. An 8-client iozone write test in which we write 64 files of 1.7
> TB each seems to go well until the end, at which point iozone
> finishes successfully and begins its "cleanup" phase, removing all
> 64 large files. At this point, the ll_ost threads go bananas,
> consuming all available CPU cycles on all 8 cores of each server.
> This seems to block the corosync "totem" exchange long enough to
> initiate a "stonith" request.
Running oprofile or profile.pl (the latter is possibly only included in SGI's respin of PerfSuite; the original is at http://perfsuite.ncsa.illinois.edu/) is useful in situations where you have one or more threads consuming a lot of CPU. It should point to which functions the offending threads are spending their time in. From there, bugzilla/jira or the mailing list should be able to help further.
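For the ll_ost case specifically, something along these lines should do it with oprofile (a rough sketch; the vmlinux path assumes the kernel debuginfo package is installed and varies by distro):

    # load the oprofile module and point it at the kernel image
    opcontrol --init
    opcontrol --vmlinux=/usr/lib/debug/lib/modules/$(uname -r)/vmlinux
    opcontrol --start

    # ... reproduce the iozone cleanup phase here ...

    opcontrol --stop
    opcontrol --dump
    # per-symbol breakdown; the functions the ll_ost threads are
    # burning time in should be at the top
    opreport --symbols | head -30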
> 2. We have found that re-mounting the OSTs, either via the HA agent
> or manually, can often take a *very* long time - on the order of
> four or five minutes. We have not figured out why yet. An strace of
> the mount process has not yielded much; the mount just seems to be
> waiting for something, but we can't tell what.
Could be bz 18456.
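In the meantime, since strace can't see time spent blocked inside the kernel, a task dump should show what the mount is actually waiting on. A quick sketch (requires sysrq to be enabled):

    # dump the kernel stacks of all tasks to the kernel log
    echo t > /proc/sysrq-trigger
    # pick out the mount thread's trace
    dmesg | grep -A 15 mount

    # on kernels that support it, /proc/<pid>/stack is more direct
    cat /proc/$(pgrep -x mount)/stack

If the stack shows the thread sitting in ldiskfs/jbd journal replay, that alone could account for a multi-minute mount after an unclean failover.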
Jason
--
Jason Rappleye
System Administrator
NASA Advanced Supercomputing Division
NASA Ames Research Center
Moffett Field, CA 94035