[Lustre-discuss] forced umount of OST in failover case?

Andreas Dilger adilger at sun.com
Mon May 19 22:59:46 PDT 2008


On May 19, 2008  13:09 -0700, Nathaniel Rutman wrote:
> Erich Focht wrote:
> > the lustre manual says:
> >
> >  2.2.1.5 Stopping a Server
> >  To stop a server:
> >  $ umount -f /mnt/test/ost0
> >  The '-f' flag means "force"; force the server to stop WITHOUT RECOVERY.
> >  Without the '-f' flag, "failover" is implied, meaning the next time the
> >  server is started it goes through the recovery procedure.
> >
> > So we were tempted to use "umount -f" when doing a failover of OSTs, but we
> > see problems (I/O errors on clients) during the failover when doing this.
> > Without the "-f" flag we get no I/O errors.
>
> Yes.  That is the difference between "forced" and unforced.  Forced means
> stop with errors for clients; unforced means take more time and do
> recovery at restart.
> > Is there a recommended way of dealing with the umount at failover?
> >   
> Don't use -f

Perhaps it makes sense to clarify the manual a bit?  It doesn't really
make sense to have the manual specify "-f" as the default action, IMHO,
since this isn't what 99% of users or scripts will do.

Something like:

To stop a server:

# umount /mnt/test/ost0

This preserves the state of the connected clients, and the next time the
server is started it will wait for clients to reconnect and go through
the recovery procedure.

If the '-f' ("force") flag is given, the server will evict all clients and
stop WITHOUT RECOVERY.  The server will not wait for recovery upon restart.
Any currently connected clients will get I/O errors until they reconnect.
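
For a failover script, the distinction above could be wrapped roughly as
follows.  This is only a sketch, not anything from the manual: the
function name stop_target, its "yes" argument, and the DRY_RUN variable
are all made up for illustration.  It defaults to a plain umount (so
clients recover on restart) and only adds -f when explicitly asked.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative only).
# Usage: stop_target <mountpoint> [yes]
#   default       -> plain umount; clients go through recovery on restart
#   second = yes  -> umount -f; clients are evicted, no recovery
# DRY_RUN (assumed variable, default 1) prints the command instead of
# running it, so the script can be checked without touching a mount.
stop_target() {
    mountpoint=$1
    force=${2:-no}

    if [ "$force" = "yes" ]; then
        set -- umount -f "$mountpoint"
    else
        set -- umount "$mountpoint"
    fi

    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Calling "stop_target /mnt/test/ost0" in dry-run mode prints the unforced
command; set DRY_RUN=0 to actually run it.  Making the forced path opt-in
matches the point that the unforced umount is what nearly all users and
scripts want.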

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



