[Lustre-discuss] Heartbeat, LVM and Lustre

Andreas Dilger adilger at sun.com
Thu Dec 10 12:52:15 PST 2009

On 2009-12-10, at 13:29, Brian J. Murrell wrote:
> Looking a little further, the LVM script has both "start" and "stop"
> actions which presumably heartbeat invokes to (dis-)"own" a resource.
> These two actions do:
> vgscan; vgchange -a y $1
> and
> vgchange -a n $1
> respectively.  That implies that heartbeat wants to own an entire VG
> or nothing.  It would appear you cannot have multiple volumes from a
> single VG owned by different nodes.  As I said, I do this myself and
> have found no issues, but am not at all a heavy, or what I would call
> "production", user.

A VG is like a filesystem in that regard, even though the layout
changes much less frequently.  If two nodes had a VG imported, and
then one did a grow of an LV (let's say a raw volume for simplicity),
the allocation of that volume would consume PEs from the VG, which
changes the layout on disk.  The node that did the resize would
reflect the new size, but the other node has no reason to re-read the
VG layout from disk and would still see the old size and PE allocation
maps.  If it then resized a different LV, it would lead to corruption
of the VG.
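To illustrate the race, here is a toy Python model (not real LVM code;
all names are invented for the sketch): each "node" snapshots the VG's
free-extent map when it imports the VG and never re-reads it, so two
nodes that both grow an LV end up handing out the same physical
extents twice.

```python
# Toy model of why importing one VG on two nodes corrupts it:
# each node caches the extent-allocation map at import time, so a
# stale cache allocates extents the other node already consumed.

class Node:
    def __init__(self, on_disk_map):
        # Snapshot of the VG metadata, taken when the VG is imported.
        self.cached_free = set(on_disk_map["free"])
        self.disk = on_disk_map

    def grow_lv(self, lv, extents):
        # Allocate from the *cached* free map, then write the result
        # back to disk -- with no lock and no re-read of the metadata.
        grabbed = sorted(self.cached_free)[:extents]
        self.cached_free -= set(grabbed)
        self.disk["free"] -= set(grabbed)
        self.disk["lvs"].setdefault(lv, []).extend(grabbed)
        return grabbed

disk = {"free": set(range(8)), "lvs": {}}
a, b = Node(disk), Node(disk)   # VG imported on two nodes at once

a.grow_lv("lv0", 3)             # node A grows lv0 using extents 0-2
b.grow_lv("lv1", 3)             # node B's stale cache hands out 0-2 again

overlap = set(disk["lvs"]["lv0"]) & set(disk["lvs"]["lv1"])
print(sorted(overlap))          # → [0, 1, 2]: both LVs claim the same extents
```

With a single owner (or with clustered LVM's locking), the second
grow would have re-read the metadata and allocated from extents 3-7
instead.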

>> and heartbeat on both mds nodes does not start any resource (even
>> after waiting for 35 minutes).
> Well, it would seem that heartbeat has found a condition it considers
> dangerous and is stopping there so as not to cause any damage.  From
> the looks of things, you will need to disable the operating system's
> LVM startup code and leave it to heartbeat to manage, if you buy into
> their assumptions.  Might be worth a question or two on the LVM list
> to see if the assumptions are valid or not -- or resign yourself to
> allowing heartbeat to operate LVM resource ownership at the VG level
> and not the LV level.

No, the heartbeat code is correct.  The whole VG should be under
control of the HA agent, unless you are using the clustered LVM
extensions that Red Hat wrote for GFS2.  I'm not sure if they are
public or not, but in any case, since Lustre/ldiskfs expects sole
ownership of the LVs (and the filesystems therein) there isn't any
benefit to having them imported on 2 nodes at once, but a lot of risk.
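In practice that means the VG and the Lustre target on it fail over
together as one resource group.  A minimal sketch in heartbeat v1
haresources syntax (the node, VG, LV, and mount-point names here are
illustrative, not from the thread):

```
# /etc/ha.d/haresources -- heartbeat v1 syntax; names are examples only.
# The whole VG "mdtvg" is activated/deactivated as a unit by the LVM
# resource script, then the Lustre target on one of its LVs is mounted.
mds1 LVM::mdtvg Filesystem::/dev/mdtvg/mdt0::/mnt/mdt0::lustre
```

Since heartbeat's LVM script does `vgchange -a y`/`-a n` on the whole
VG, any LV you want on a different node has to live in its own VG.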

Cheers, Andreas
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
