[Lustre-discuss] system disk with external journals for OSTs formatted

Alexander Bugl alexander.bugl at zmaw.de
Mon Nov 1 14:13:14 PDT 2010


Hi!

On Wednesday 27 October 2010 14:46:41 Andreas Dilger wrote:
> On 2010-10-27, at 15:02, Alexander Bugl wrote:
> >> Trying to run e2fsck -n yields:
> > [root@soss10 ~]# e2fsck -fn /dev/md14
> > e2fsck 1.41.10.sun2 (24-Feb-2010)
> > e2fsck: Group descriptors look bad... trying backup blocks...
> > Error writing block 1 (Attempt to write block from filesystem resulted in
> > short write).  Ignore error? no
> > Error writing block 2 (Attempt to write block from filesystem resulted in
> > short write).  Ignore error? no
> > Error writing block 3 (Attempt to write block from filesystem resulted in
> > short write).  Ignore error? no
> > [... and all integers between]
> > Error writing block 463 (Attempt to write block from filesystem resulted
> > in short write).  Ignore error? no
> > Error writing block 464 (Attempt to write block from filesystem resulted
> > in short write).  Ignore error? no
> 
> I don't know what these errors are, possibly trying to write into the
> broken journal device?  The rest of the filesystem errors are very minor.
> You should probably delete the journal device via "tune2fs -O
> ^has_journal", run a full "e2fsck -f" and then recreate the journal with
> "tune2fs -J size=400".
> 
> Cheers, Andreas

Thanks for all the tips. We had already started the e2fsck a few minutes
before Andreas' mail arrived, without deleting the journal first:

# e2fsck -fp /dev/md14
squall-OST001d: Note: if several inode or block bitmap blocks or part
of the inode table require relocation, you may wish to try
running e2fsck with the '-b 32768' option first.  The problem
may lie only with the primary block group descriptors, and
the backup block group descriptors may be OK.
squall-OST001d: Block bitmap for group 1920 is not in group.  (block 268482810)
squall-OST001d: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

# e2fsck -fp -b 32768 /dev/md14
squall-OST001d: One or more block group descriptor checksums are invalid.  FIXED.
squall-OST001d: Group descriptor 0 checksum is invalid.  
squall-OST001d: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

# e2fsck -fy /dev/md14
e2fsck 1.41.10.sun2 (24-Feb-2010)
One or more block group descriptor checksums are invalid.  Fix? yes
Group descriptor 0 checksum is invalid.  FIXED.
Group descriptor 1 checksum is invalid.  FIXED.
Group descriptor 2 checksum is invalid.  FIXED.
[... continuing]

But luckily the file system checks on the OSTs finished, and the OSTs could be 
mounted as ldiskfs and as lustre.

It also looks like no file has been mangled or deleted: we did not find any
problems after careful checking, and our users have not reported any,
either.

Next time we know how to remove and re-add the external journals, so thank you 
again for all the tips.
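
For reference, the sequence Andreas suggested could be sketched as a small
shell function. This is only a sketch under assumptions: the names `run`,
`rebuild_journal` and `DRY_RUN` are made up for illustration, the device must
be unmounted, and tune2fs takes journal parameters such as size= through its
-J option. As written, size=400 creates an internal journal; re-adding an
external journal would need different journal options, which are not shown.

```shell
#!/bin/sh
# Sketch only: drop the journal, run a full fsck, then recreate the journal.
# run, rebuild_journal and DRY_RUN are hypothetical names, not Lustre tools.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rebuild_journal() {
    dev=$1

    # Step 1: remove the journal feature from the filesystem.
    run tune2fs -O '^has_journal' "$dev" || return 1

    # Step 2: full check. e2fsck exits with 1 or 2 when it corrected errors,
    # so only exit codes of 4 and above are treated as failure here.
    run e2fsck -f "$dev"
    rc=$?
    [ "$rc" -lt 4 ] || return "$rc"

    # Step 3: recreate a journal; size=400 (megabytes) is the value from
    # the thread.
    run tune2fs -J size=400 "$dev"
}
```

Calling `rebuild_journal /dev/md14` with the default DRY_RUN=1 only prints the
three commands; setting DRY_RUN=0 would execute them.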

With regards, Alex

-- 
Alexander Bugl,  Central IT Services, ZMAW
Max  Planck  Institute   for   Meteorology
Bundesstrasse 53, D-20146 Hamburg, Germany
tel +49-40-41173-351, fax -298, room PE048


