[Lustre-discuss] making a client reconnect to OST

Andreas Dilger adilger at sun.com
Mon Feb 4 11:35:45 PST 2008


On Feb 04, 2008  08:31 -0800, Jim Harm wrote:
> Is there a tool that will really attempt a reconnect from a client to 
> a single OST?
> It would be helpful for those rare cases
> 	when this happens and there is nothing really wrong with either.
> I imagine the original cause could be something as simple as repeated delays
> 	on a very busy network.
> Other OSTs from the same OSS remained connected to the same client
> 	during this problem.
> If umount and mount could be avoided,
> 	it would be less disruptive to other processes on the client.

You can use "echo_client" to perform operations on a single OST.  See
the lustre-iokit obdfilter-survey for usage details.
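
As a rough sketch (not something taken verbatim from this thread), the
lctl commands quoted below can be collected into a small shell helper.
It only prints the commands for one OSC device so they can be reviewed
before being run on a client.  The function name osc_reconnect_cmds and
its argument are made up for illustration, and the device number is
assumed to be the index shown in the first column of "lctl dl".

```shell
#!/bin/sh
# Sketch only: print the lctl commands discussed in this thread for a single
# OSC device.  Nothing is executed here; run the printed lines by hand (as
# root on the Lustre client) once they look right.
# "osc_reconnect_cmds" is a hypothetical helper name, not part of Lustre.
osc_reconnect_cmds() {
    dev="$1"    # assumed: device index from the first column of "lctl dl"

    # Refuse anything that is not a plain non-negative integer.
    case "$dev" in
        ''|*[!0-9]*)
            echo "usage: osc_reconnect_cmds <device-number>" >&2
            return 1
            ;;
    esac

    # First suggestion from the thread: ask the OSC to recover.
    echo "lctl --device $dev recover"

    # Second attempt described in the thread: deactivate, then activate.
    echo "lctl --device $dev deactivate"
    echo "lctl --device $dev activate"
}
```

For example, "osc_reconnect_cmds 7" prints the recover line followed by
the deactivate/activate pair for device 7.  Whether any of them actually
brings the import back depends on why the client was evicted.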

> At 2:10 PM -0800 1/25/08, Jim Harm wrote:
> >On the client I tried lctl --device $number deactivate,
> >which worked,
> >followed by
> >lctl --device $number activate,
> >which I believe should have done the same thing.
> >This failed without any error notice to me.
> >
> >I ended up having to umount and mount, which finally reconnected the OST.
> >
> >At 12:55 PM -0700 1/25/08, Andreas Dilger wrote:
> >>On Jan 24, 2008  10:23 -0500, Brock Palen wrote:
> >>>   I have a client (one of our login nodes) that was evicted by one of
> >>>   the OSTs but not both of them.  So some files are accessible and others
> >>>   are not.  The strange thing is that both OSTs live on the same OSS.
> >>>
> >>>   Is there a way to ask Lustre to restore this?  Up
> >>>   till this point, the client would recover quickly, but this time it's
> >>>   just waiting.
> >>
> >>You could try "lctl --device {OSC device in question} recover".
> >>
> >>Cheers, Andreas
> >>--
> >>Andreas Dilger
> >>Sr. Staff Engineer, Lustre Group
> >>Sun Microsystems of Canada, Inc.
> >>
> >>_______________________________________________
> >>Lustre-discuss mailing list
> >>Lustre-discuss at lists.lustre.org
> >>http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >
> >
> >--
> >}}}===============>>  LLNL
> >James E. Harm (Jim); jharm at llnl.gov
> >System Administrator, ICCD Clusters
> >(925) 422-4018 Page: 423-7705x57152
> 
> 

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



