[Lustre-discuss] Large scale delete results in lag on clients

Arden Wiebe albert682 at yahoo.com
Fri Aug 7 13:11:38 PDT 2009



--- On Fri, 8/7/09, Jim McCusker <james.mccusker at yale.edu> wrote:

> From: Jim McCusker <james.mccusker at yale.edu>
> Subject: Re: [Lustre-discuss] Large scale delete results in lag on clients
> To: "lustre-discuss" <lustre-discuss at lists.lustre.org>
> Date: Friday, August 7, 2009, 5:25 AM
> On Fri, Aug 7, 2009 at 6:45 AM, Arden
> Wiebe <albert682 at yahoo.com>
> wrote:
> > --- On Thu, 8/6/09, Andreas Dilger <adilger at sun.com>
> wrote:
> > > Jim McCusker wrote:
> 
> > > > We have a 15 TB Lustre volume across 4 OSTs and we recently
> > > > deleted over 4 million files from it in order to free up the
> > > > 80 GB MDT/MDS (going from 100% capacity on it to 81%). As a
> > > > result, after the rm completed, there is significant lag on
> > > > most file system operations (but fast access once it occurs),
> > > > even after the two servers that host the targets were rebooted.
> > > > It seems to clear up for a little while after reboot, but comes
> > > > back after some time.
> > > >
> > > > Any ideas?
> > >
> > > The Lustre unlink processing is somewhat asynchronous, so you may
> > > still be catching up with unlinks.  You can check this by looking
> > > at the OSS service RPC stats file to see if there are still object
> > > destroys being processed by the OSTs.  You could also just check
> > > the system load/io on the OSTs to see how busy they are in a
> > > "no load" situation.
> > >
> > >
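For anyone following along, here is roughly how to run the checks Andreas describes. Treat this as a sketch: the stats path and counter names vary between Lustre versions (on 1.6/1.8 they live under /proc/fs/lustre), so adjust for your install.

```shell
# On each OSS node: look for object destroy activity in the OSS
# service stats (exact path and counter name vary by Lustre version).
grep -i destroy /proc/fs/lustre/ost/OSS/ost/stats

# Watch OST disk activity while clients are idle; sustained I/O here
# suggests the asynchronous unlinks are still being processed.
iostat -x 5

# Quick look at system load on the OSS.
uptime
```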
> > > > For the curious, we host a large image archive (almost 400k
> > > > images) and do research on processing them. We had a lot of
> > > > intermediate files that we needed to clean up:
> > > >
> > > >  http://krauthammerlab.med.yale.edu/imagefinder
> > > > (currently laggy and unresponsive due to this problem)
> > > >
> >
> > Jim, from the web side perspective it seems responsive.  Are you
> > actually serving the images from the lustre cluster?  I have run a
> > few searches looking for "Purified HIV Electron Microscope" and
> > your project returns 15 pages of results with great links to full
> > abstracts almost instantly, but obviously none with real purified
> > HIV electron microscope images similar to a real pathogenic virus
> > like http://krauthammerlab.med.yale.edu/imagefinder/Figure.external?sp=62982&state:Figure=BrO0ABXcRAAAAAQAACmRvY3VtZW50SWRzcgARamF2YS5sYW5nLkludGVnZXIS4qCk94GHOAIAAUkABXZhbHVleHIAEGphdmEubGFuZy5OdW1iZXKGrJUdC5TgiwIAAHhwAAD2Cg%3D%3D
> 
> The images and the lucene index are both served from the lustre
> cluster (as is just about everything else on our network). I think
> Andreas is right, it seems to have cleared itself up. You're seeing
> typical performance. If you don't find what you're looking for, you
> can expand your search to the full text, abstract, or title using
> the checkboxes below the search box. Of course, the lack of images
> in search has more to do with the availability of open access
> papers on the topic than the performance of lustre. :-)
> 

Yeah, I was all over the full-text check box as soon as I ran one query.  Great project, by the way, as there really is no way for any researcher or doctor to read the volumes of scientific journals the pharmaceutical industry pays for every month.  Sad how mass consensus has replaced the actual scientific method, all for capitalism.

> > Have you physically separated your MDS/MDT from the MGS portion on
> > different servers?  I somehow doubt you overlooked this, but if you
> > didn't for some reason, this could be a cause of unresponsiveness
> > on the client side.  Again, if you're serving up the images from
> > the cluster, I find it works great.
> 
> This server started life as a 1.4.x server, so the MGS is still on
> the same partition as MDS/MDT. We have one server with the MGS,
> MDS/MDT, and two OSTs, and another server with two more OSTs. The
> first server also provides NFS and SMB services for the volume in
> question. I know that we're not supposed to mount the volume on a
> server that provides it, but limited budget means limited servers,
> and performance has been excellent except for this one problem.
> 
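For reference, on a fresh format (rather than a 1.4.x upgrade) the MGS can live on its own device. A hypothetical layout, with placeholder device names, fsname, and MGS NID, might look something like this; check mkfs.lustre(8) for your version before trusting any of it:

```shell
# Hypothetical fresh-format layout with a standalone MGS.
# Device names, the fsname, and the MGS NID are placeholders.
mkfs.lustre --mgs /dev/sda1
mkfs.lustre --fsname=images --mdt --mgsnode=mgs@tcp0 /dev/sdb1
mkfs.lustre --fsname=images --ost --mgsnode=mgs@tcp0 /dev/sdc1
```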

I roll the same way at http://oil-gas.ca/phpsysinfo and http://linuxguru.ca/phpsysinfo, with the OSTs actually providing TCP routing and DNS service for the network that leads surfers to the internal lustre-powered webservers.  At the moment I'm only serving one file via a symlink from the (physically separated by block device) lustre cluster, at http://workwanted.ca/images/3689011.avi (let me know how fast it downloads for you), but I am tempted to symlink the entire /var/www/html directory for a few domains over to the lustre filesystem.  I also run other services like smb, and of course apache and mysql.
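If anyone wants to try the same thing, the symlink approach is a one-liner. The sketch below uses throwaway /tmp paths in place of a real Lustre mount and web docroot:

```shell
# Stand-ins for a Lustre-backed directory and an apache docroot.
mkdir -p /tmp/lustre-demo/www
# -sfn: symbolic, replace any existing link, don't follow it.
ln -sfn /tmp/lustre-demo/www /tmp/docroot
readlink /tmp/docroot
```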

The fact remains that lustre can be built with off-the-shelf hardware and is very robust, dependable, and obviously upgradable if you're coming from 1.4.x servers.  I believe the "Lustre Product" could be used by more people, but given its stigma as a High Performance Computing filesystem, it will take a little magic to market it to more of the masses.  Like you, budget is a concern at times, although I can see a solid-state OST in the near future, even if that one box deployed costs five thousand.

> Jim
> --
> Jim McCusker
> Programmer Analyst
> Krauthammer Lab, Pathology Informatics
> Yale School of Medicine
> james.mccusker at yale.edu
> | (203) 785-6330
> http://krauthammerlab.med.yale.edu
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 
