[Lustre-discuss] WARNING: Potential directory corruptions on the MDS with 1.6.7

Andrea Rucks andrea.rucks at us.lawson.com
Thu Apr 9 12:42:48 PDT 2009


Hi Andreas and Lustre folks,

After asking the noob question on how to patch Lustre, I figured out how 
to apply bug patch 18695 and compile Lustre 1.6.7 with the source code 
last night.  I read up on how the patch command works, applied the 4 
patches and then compiled it.  The only issue I encountered was with 
18695-client-patch.patch: hunk #1 of 2 wouldn't apply to file.c, so I 
manually edited the file and added the one-liner where the patch wanted 
it.  I hated to do it that way, but I'm working on a tight deadline to turn 
this over for testing.  For peace of mind, as we hadn't migrated our data into 
our filesystems yet, I re-ran my mkfs.lustre --reformat scripts on all my 
MGS / MDS / OSS filesystems.  Our RHEL 5.3 XEN systems are loading the 
patched MGS / MDS / OSS server and client Lustre modules and mounting the 
filesystems.  We're crossing fingers hoping we won't have those data 
corruption issues the folks at Tokyo University experienced.

For any new Lustre admin who's interested in what I did to apply the 
patch, here's a high-level view:

1.      Copied bug18695.patch, bug18695_LASSERT.patch, and 
18695-regression-test.patch to the server where I'd loaded the Lustre 
source code.

2.      Took a tarball of the existing /usr/src/lustre-1.6.7 directory (in 
case I messed things up)

3.      Searched for more information on patch, as I'm definitely not a 
developer (read the man page, found this blog article from Sun that has a 
blurb about patch: http://blogs.sun.com/mberg/entry/building_lustre_1_6_4, 
and found this helpful, old doc that talks about unified diffs, rcsdiff 
and patch: http://www.linuxjournal.com/article/1237)

4.      Found the files being patched and did the following for the Lustre 
server side source code:

                SYNTAX: patch <file-that-needs-patching> <patch-that-fixes-file>

                cd /usr/src/lustre-1.6.7
                patch lustre/mds/mds_open.c /tmp/bug18695.patch
                patch lustre/lvfs/fsfilt_ext3.c /tmp/bug18695_LASSERT.patch
                patch lustre/tests/sanityN.sh /tmp/18695-regression-test.patch

5.      Followed the walkthrough from Joey Jablonski's blog for compiling 
Lustre (very helpful and similar to the Lustre Quick Start Guide):

http://mergingbusinessandit.blogspot.com/2008/10/building-lustre-1651-against-latest.html
 
6.      Uninstalled my old lustre-1.6.7*rpms:

                umount /<lustre filesystem(s)>
                modprobe -r lustre
                rpm -qa | grep lustre
                rpm -e --nodeps <old lustre-1.6.7*rpm(s)>

7.      Installed the newly compiled bug 18695 patched lustre-1.6.7*rpms 
(I already had a compiled and patched Lustre XEN kernel and Sun's 
e2fsprogs in place from before):

                rpm -ivh <newly compiled lustre-1.6.7*rpms>
                depmod -a
                modprobe lustre

8.      Followed the walkthrough from Joey's blog on how to build a Lustre 
filesystem, mounted it up and ran some file creation loops through it:

http://mergingbusinessandit.blogspot.com/2008/10/building-new-lustre-filesystem.html
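
The file creation loops in step 8 can be sketched roughly as below.  This 
is only an illustration, not the exact commands that were run: the 
create_files and count_files helpers, the file.N naming, the count of 100 
and the TESTDIR variable are all assumptions.  TESTDIR would point at a 
directory on the mounted Lustre filesystem; it falls back to a scratch 
directory so the sketch can run anywhere.

```shell
#!/bin/sh
# Hypothetical smoke test: create a batch of small files and confirm
# that they all exist afterwards. None of these names come from Lustre.

create_files() {
    # $1 = target directory, $2 = number of files to create
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        echo "test file $i" > "$dir/file.$i" || return 1
        i=$((i + 1))
    done
}

count_files() {
    # Count every entry in the directory (assumes it started out empty).
    ls -1 "$1" | wc -l | tr -d ' '
}

# Assumption: TESTDIR is an empty directory on the filesystem under test.
TESTDIR=${TESTDIR:-$(mktemp -d)}
create_files "$TESTDIR" 100
echo "created $(count_files "$TESTDIR") files in $TESTDIR"
```

Running it with TESTDIR set to a fresh directory on the new mount, and 
then comparing the reported count with the number requested, gives a quick 
sanity check before real data goes in.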

Joey's blogs were real life savers.  Hope this helps someone else!
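
For step 4, a dry run is one way to spot a hunk that will not apply before 
any source file is changed.  The sketch below is a demo under stated 
assumptions, not what was actually run: it builds a throwaway file and 
patch in a scratch directory (demo.c, demo.c.new and fix.patch are made-up 
names with nothing to do with the Lustre sources) and relies on the 
--dry-run option of GNU patch.

```shell
#!/bin/sh
# Sketch: check that a patch applies cleanly before touching the source.
# Every file name here is made up for the demo.

WORK=$(mktemp -d)

# A stand-in source file and an edited copy of it.
printf 'line one\nline two\nline three\n' > "$WORK/demo.c"
sed 's/line two/line two, fixed/' "$WORK/demo.c" > "$WORK/demo.c.new"

# Build a unified diff. diff exits with status 1 when the files differ,
# which is expected here and not an error.
diff -u "$WORK/demo.c" "$WORK/demo.c.new" > "$WORK/fix.patch"

# --dry-run (GNU patch) reports what would happen without changing demo.c.
if patch --dry-run "$WORK/demo.c" "$WORK/fix.patch"; then
    # Keep a copy of the original, then apply the patch for real.
    cp "$WORK/demo.c" "$WORK/demo.c.orig"
    patch "$WORK/demo.c" "$WORK/fix.patch"
else
    echo "patch would not apply cleanly; demo.c left untouched" >&2
fi
```

If the dry run reports a failed hunk, the source is still untouched, which 
leaves room to look at the rejected hunk before deciding whether to edit 
the file by hand.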

Cheers,

Ms. Andrea D. Rucks
Sr. Unix Systems Administrator,
Lawson ITS Unix Server Team
_____________________________

Lawson
380 St. Peter Street
St. Paul, MN 55102
Tel: 651-767-6252
http://www.lawson.com

__________________

On Apr 08, 2009  16:58 -0500, Andrea Rucks wrote:
> For a new production system, we downloaded Lustre 1.6.7 a couple weeks ago
> and have installed and configured it.  I just read this warning:
>
> >A bug has been identified in 1.6.7 that can cause directory corruptions
> >on the MDT. A patch and full details are in bug 18695 -
> >https://bugzilla.lustre.org/show_bug.cgi?id=18695
>
> >We recommend to anyone running 1.6.7 on the MDS to unmount the MDT, run
> >e2fsck against the MDT device and apply the patch from bug 18695 as soon
> >as possible.
>
> I've visited this page, but am uncertain as to how to "apply the patch"
> (does it require compiling?).  Are there any instructions available, or
> perhaps someone could point me to a FAQ?

I would instead suggest downgrading the RPMs on your MDS to 1.6.6 until
the 1.6.7.1 packages are available.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
