[Lustre-devel] [RFC] "lctl readonly" modification proposal

Nathaniel Rutman Nathan.Rutman at Sun.COM
Thu Aug 28 10:11:03 PDT 2008


Andreas Dilger wrote:
> On Aug 22, 2008  19:45 +0400, Alexander Zarochentsev wrote:
>   
>> On 20 August 2008 23:29:22 Andreas Dilger wrote:
>>     
>>> On Aug 20, 2008  11:39 -0600, Peter J. Braam wrote:
>>>       
>>>> If I remember correctly the flush is only there to try to reduce
>>>> rollback. However, given that failover may happen on a system where
>>>> the software is not fully responsive, one could question the wisdom
>>>> of this reduction.  In any case having more replay due to more
>>>> rollback is harmless.
>>>>         
IIRC, one other reason for the flush is that loopback disks tend not to 
"really" flush everything to disk when asked, and additional sync calls 
seem to help.  So beware when running loopback disks...
>>> One major caveat is that with mountconf we ALWAYS mark the device as
>>> "readonly" when it is being unmounted.  If we don't have the sync
>>> there I fear data loss after a clean server unmount, when all clients
>>> are also being unmounted and cannot do replay.
>>>
>>> I'd be thrilled if this was fixed so a normal shutdown did not do a
>>> "force" unmount and set the device read-only, because that would also
>>> avoid leaving the journal needing recovery.
>>>       
>
umount does either a force or a failover shutdown; failover sets readonly 
but force does not.  Test-framework regularly does both.  Andreas, if 
you want to avoid journal recovery, use umount -f.
Really, read-only is intended to simulate a power loss, so I think a sync 
before it is a bit of a cheat.  Having said that, I think there were 
real issues that prompted us to include the sync in the first place, and 
some heavy recovery testing (including loopback devs) is in order if it 
is removed.
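
For reference, a rough sketch of the two unmount modes described above. 
The helper name and the /mnt/ost0 mount point are made up for 
illustration, and the function only prints the command it would run, so 
it is safe to paste anywhere.  It assumes a plain umount is the failover 
path and -f is the force path.

```shell
# Hypothetical helper: print the umount command for a given shutdown mode.
# /mnt/ost0 is an assumed default mount point, not one taken from this thread.
lustre_umount_cmd() {
    mode="$1"
    target="${2:-/mnt/ost0}"
    case "$mode" in
        # force: device is not set read-only, so no journal recovery is left behind
        force)    echo "umount -f $target" ;;
        # failover: device is set read-only before the unmount
        failover) echo "umount $target" ;;
        *)        echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}
```

So "lustre_umount_cmd force" gives the command to use for a normal 
shutdown, and "lustre_umount_cmd failover" the one that exercises the 
read-only path.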


