[Lustre-discuss] files/directories are temporarily unavailable on patchless clients

Harald van Pee pee at hiskp.uni-bonn.de
Wed Mar 12 03:07:10 PDT 2008


On Wednesday 05 March 2008 07:42 am, Andreas Dilger wrote:
> On Mar 05, 2008  01:19 +0100, Harald van Pee wrote:
> > On Wednesday 05 March 2008 01:06 am, Andreas Dilger wrote:
> > > On Mar 04, 2008  19:52 +0100, Harald van Pee wrote:
<snip>
> > >
> > > If you are expecting to fix the filesystem, it is best to just unmount
> > > everything and run e2fsck in parallel.  Alternately, you can just force
> > > unmount the MDT+OST filesystems and let the clients hang until the
> > > MDT+OSTs are restarted, but this can be more troublesome in some cases.
> >
> > o.k. thanks,
> >  then I will unmount all clients first, then
> > unmount all OSTs,
> > and the MDT last.
>
> Actually, it is better to unmount clients, then MDT, then OSTs last,
> because the MDT is a "client" on the OSTs.
>
> > If it is possible should I try to avoid the -f flag?
>
> You shouldn't need to use -f if you unmount in the above order.
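
For reference, the order suggested above (clients first, then the MDT, then the OSTs) can be sketched as a small dry-run helper. This is only an illustration: the function name unmount_plan and all mount points are hypothetical, and on a real system each umount has to be run on the node that holds that mount.

```shell
#!/bin/sh
# Print umount commands in the order: clients, then MDT, then OSTs.
# Each argument is ROLE:MOUNTPOINT, where ROLE is client, mdt, or ost.
# The mount points below are hypothetical examples, not taken from this thread.
unmount_plan() {
    for role in client mdt ost; do
        for entry in "$@"; do
            case "$entry" in
                "$role":*) echo "umount ${entry#*:}" ;;   # strip the ROLE: prefix
            esac
        done
    done
}

# Dry run: only prints the commands, nothing is unmounted.
unmount_plan ost:/mnt/ost0 mdt:/mnt/mdt client:/mnt/lustre ost:/mnt/ost1
```

The helper only sorts by role, so entries with the same role keep the order in which they were given.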

o.k. done!
e2fsck reported no errors on the MDS and only a single error on some of the OSTs:
Inode 2 has a extra size (2) which is invalid
Fix? yes

Then I wanted to run lfsck, but on Debian etch I got the following error:
lfsck -n -v --mdsdb ./mdsdb \
--ostdb ./ostdb0 ./ostdb1 ./ostdb2 ./ostdb3 ./ostdb4 ./ostdb5 ./ostdb6 ./ostdb7 ./ostdb8 ./ostdb9 /mnt/
lfsck 1.40.4.cfs1 (31-Dec-2007)
MDSDB: ./mdsdb
OSTDB[0]: ./ostdb0
OSTDB[1]: ./ostdb1
OSTDB[2]: ./ostdb2
OSTDB[3]: ./ostdb3
OSTDB[4]: ./ostdb4
OSTDB[5]: ./ostdb5
OSTDB[6]: ./ostdb6
OSTDB[7]: ./ostdb7
OSTDB[8]: ./ostdb8
OSTDB[9]: ./ostdb9
MOUNTPOINT: /mnt/
lfsck: symbol lookup error: lfsck: undefined symbol: db_env_create
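
An undefined symbol at startup usually means the shared library that was loaded does not export that name. db_env_create is a Berkeley DB function, so one thing that might be worth checking (an assumption, not something confirmed in this thread) is which libdb the lfsck binary picks up and whether it exports the symbol. A minimal sketch, where exports_symbol is a hypothetical helper and the library path in the comment is only a placeholder:

```shell
#!/bin/sh
# Succeed if the symbol named in $1 is defined (not merely referenced)
# in "nm -D" output read from stdin.
exports_symbol() {
    awk -v sym="$1" '
        NF >= 2 {
            name = $NF
            sub(/@.*/, "", name)          # drop a version suffix such as @@Base
            if (name == sym && $(NF-1) != "U") found = 1
        }
        END { exit !found }'
}

# Example use; the library path is a placeholder, take the real one from
# "ldd" run against the lfsck binary:
#   nm -D /path/to/libdb.so | exports_symbol db_env_create && echo "exported"
```

If the symbol is missing, the binary was probably built against a different Berkeley DB version than the one installed.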

On Red Hat I got:
lfsck -n -v --mdsdb ./mdsdb \
--ostdb ./ostdb0 ./ostdb1 ./ostdb2 ./ostdb3 ./ostdb4 ./ostdb5 ./ostdb6 ./ostdb7 ./ostdb8 ./ostdb9 /mnt/
lfsck 1.40.4.cfs1 (31-Dec-2007)
MDSDB: ./mdsdb
OSTDB[0]: ./ostdb0
OSTDB[1]: ./ostdb1
OSTDB[2]: ./ostdb2
OSTDB[3]: ./ostdb3
OSTDB[4]: ./ostdb4
OSTDB[5]: ./ostdb5
OSTDB[6]: ./ostdb6
OSTDB[7]: ./ostdb7
OSTDB[8]: ./ostdb8
OSTDB[9]: ./ostdb9
MOUNTPOINT: /mnt/
error: can't get lov name.: Inappropriate ioctl for device (25)
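
"Inappropriate ioctl for device" may simply mean that /mnt/ was not a mounted Lustre client filesystem when lfsck ran, since everything had been unmounted for e2fsck. That is an assumption, not a confirmed diagnosis. A quick way to check is to look for the mount point in the mounts table; the helper name is_lustre_mount below is hypothetical:

```shell
#!/bin/sh
# Succeed if the given path is listed as a mount point of type "lustre".
# $1: path to check; $2: mounts table to read (defaults to /proc/mounts).
is_lustre_mount() {
    target=${1%/}                      # drop one trailing slash: "/mnt/" -> "/mnt"
    [ -n "$target" ] || target=/
    awk -v mp="$target" '
        $2 == mp && $3 == "lustre" { found = 1 }
        END { exit !found }' "${2:-/proc/mounts}"
}

if is_lustre_mount /mnt/; then
    echo "/mnt/ is a Lustre client mount"
else
    echo "/mnt/ is not a Lustre client mount"
fi
```

The second argument exists only so the check can be tried against a saved copy of the mounts table.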

Any ideas?

Harald

>
> > > > On Monday 21 January 2008 11:55 pm, Andreas Dilger wrote:
> > > > > On Jan 21, 2008  18:55 +0100, Harald van Pee wrote:
> > > > > > The directory is just not there! Directory or file not found.
> > > > > >
> > > > > > > In my opinion there is no error message on the clients which is
> > > > > > > directly related to the problem. On our node0010 I have seen
> > > > > > > this problem several times today. Mostly the directory is not seen!
> > > > > > > Probably all of the other directories can be accessed at the same
> > > > > > > time.
> > > > > > >
> > > > > > > And here are all Lustre-related messages from the last days (others
> > > > > > > are mostly timestamps!)
> > > > > >
> > > > > >
> > > > > >
> > > > > > Jan 17 07:41:16 node0010 kernel: Lustre: 5723:0:
> > > > > > (namei.c:235:ll_mdc_blocking_ast()) More than 1 alias dir
> > > > > > 133798800 alias
> > > > >
> > > > > A quick search in bugzilla for this error message shows bug 12123,
> > > > > which is fixed in the 1.6.1 release, and also has a patch.
> > > > >
> > > > > Cheers, Andreas
> > > > > --
> > > > > Andreas Dilger
> > > > > Sr. Staff Engineer, Lustre Group
> > > > > Sun Microsystems of Canada, Inc.
> > >
> >
> > --
> > Harald van Pee
> >
> > Helmholtz-Institut fuer Strahlen- und Kernphysik der Universitaet Bonn
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
>

-- 
Harald van Pee

Helmholtz-Institut fuer Strahlen- und Kernphysik der Universitaet Bonn


