[Lustre-discuss] Failover & recovery issues / questions
Jeffrey Bennett
jab at sdsc.edu
Mon Mar 30 17:52:56 PDT 2009
Thanks, Kevin, for clearing this up.
So when the manual mentions "Load-balanced Active/Active configuration", what does that mean? I have never tried it, but I expected it to be different from the "Active/Passive" configuration on the MDS.
jab
> -----Original Message-----
> From: Kevin.Vanmaren at Sun.COM [mailto:Kevin.Vanmaren at Sun.COM]
> Sent: Monday, March 30, 2009 5:29 PM
> To: Jeffrey Bennett
> Cc: Adam Gandelman; lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] Failover & recovery issues / questions
>
> You can NOT have an OST mounted on both. You can use
> heartbeat to mount different OSTs on each, and to mount them
> all on one node when the other node goes down.
>
> Kevin
>
>
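For reference, the per-node arrangement Kevin describes is usually expressed in heartbeat v1 as two resource groups, one homed on each node. A minimal sketch of /etc/ha.d/haresources for the setup described in this thread; the DRBD resource names (r0, r1) and mount points are assumptions, not taken from the thread:

```
# /etc/ha.d/haresources -- heartbeat v1 resource groups (sketch).
# DRBD resource names (r0, r1) and mount points are assumptions.
lus-oss0 drbddisk::r0 Filesystem::/dev/drbd1::/mnt/ost0::lustre
lus-oss1 drbddisk::r1 Filesystem::/dev/drbd2::/mnt/ost1::lustre
```

Each node normally runs its own group; when one node fails, heartbeat starts that node's group on the survivor, so both OSTs end up mounted there.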
> On Mar 30, 2009, at 6:16 PM, Jeffrey Bennett <jab at sdsc.edu> wrote:
>
> > Hi, I am not familiar with using heartbeat on the OSS; I have only
> > used it on the MDS for failover, since you can't have an
> > active/active configuration on the MDS. However, you can have
> > active/active on the OSS, so I don't understand why you would want
> > to use heartbeat to unmount the OSTs on one system if you can have
> > them mounted on both.
> >
> > Now, when you say you "kill" heartbeat, what do you mean by that?
> > You can't test heartbeat functionality by killing it; you have to
> > use the provided tools to fail over to the other node. The tool
> > usage and parameters depend on which version of heartbeat you are
> > using.
> >
> > Do you have a serial connection or a crossover cable between these
> > machines for heartbeat, or do you use the regular network?
> >
> > jab
> >
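A sketch of the kind of provided tools Jeffrey refers to, assuming a heartbeat v1-style installation; the install path and the auto_failback setting are assumptions, not details from the thread:

```shell
# Graceful failover test: run on the node that should give up its
# resources; heartbeat hands all local resources to the peer.
/usr/lib/heartbeat/hb_standby

# Or run on the node that should take over, pulling resources from the peer:
/usr/lib/heartbeat/hb_takeover

# With auto_failback enabled in ha.cf, resources migrate back
# automatically when the failed node rejoins the cluster.
```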
> >> -----Original Message-----
> >> From: lustre-discuss-bounces at lists.lustre.org
> >> [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Adam
> >> Gandelman
> >> Sent: Monday, March 30, 2009 4:38 PM
> >> To: lustre-discuss at lists.lustre.org
> >> Subject: [Lustre-discuss] Failover & recovery issues / questions
> >>
> >> Hi-
> >>
> >> I'm new to Lustre and am running into some issues with failover
> >> and recovery that I can't seem to find answers to in the Lustre
> >> manual (v1.14). If anyone can fill me in on what is going on (or
> >> not going on), or point me toward some documentation that goes
> >> into more detail, it would be greatly appreciated.
> >>
> >> It's a simple cluster at the moment:
> >>
> >> MDT/MGS data is collocated on node LUS-MDT
> >>
> >> LUS-OSS0 and LUS-OSS1 are set up in an active/active failover
> >> configuration. LUS-OSS0 is primary for /dev/drbd1 and backup for
> >> /dev/drbd2; LUS-OSS1 is primary for /dev/drbd2 and backup for
> >> /dev/drbd1. I have heartbeat configured to monitor and handle
> >> failover; however, I run into the same problems when manually
> >> testing failover.
> >>
> >> When heartbeat is killed on either OSS and resources fail over to
> >> the backup, or when the filesystem is manually unmounted and
> >> remounted on the backup node, the migrated OST either (1) goes
> >> into a state of endless recovery, or (2) doesn't seem to go into
> >> recovery at all and becomes inactive on the cluster entirely. If I
> >> bring the OST's primary back up and fail back the resources, the
> >> OST goes into recovery, completes, and comes back online as it
> >> should.
> >>
> >> For example, if I take down OSS0, the OST fails over to its
> >> backup; however, it never makes it past this state and never
> >> recovers:
> >>
> >> [root at lus-oss0 ~]# cat
> >> /proc/fs/lustre/obdfilter/lustre-OST0000/recovery_status
> >> status: RECOVERING
> >> recovery_start: 0
> >> time_remaining: 0
> >> connected_clients: 0/4
> >> completed_clients: 0/4
> >> replayed_requests: 0/??
> >> queued_requests: 0
> >> next_transno: 2002
> >>
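The fields above can be checked from a script rather than by eye. A small sketch that extracts the recovery state and the connected/expected client counts; it is fed a copy of the sample output above rather than reading the live /proc file:

```shell
# Copy of the sample recovery_status output above; on a live OSS you
# would read /proc/fs/lustre/obdfilter/lustre-OST0000/recovery_status.
status='status: RECOVERING
recovery_start: 0
time_remaining: 0
connected_clients: 0/4
completed_clients: 0/4
queued_requests: 0
next_transno: 2002'

# Pull out the recovery state and the connected/expected client counts.
state=$(printf '%s\n' "$status" | awk '/^status:/ {print $2}')
connected=$(printf '%s\n' "$status" | awk -F'[ /]' '/^connected_clients:/ {print $2}')
expected=$(printf '%s\n' "$status" | awk -F'[ /]' '/^connected_clients:/ {print $3}')
echo "state=$state clients=$connected/$expected"
```

Recovery is healthy when the state leaves RECOVERING and connected_clients reaches the expected count; a state stuck at RECOVERING with 0 connected clients, as here, means no client ever reached the failover node.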
> >> In some instances, /proc/fs/lustre/obdfilter/lustre-OST0000/
> >> is empty.
> >> Like I said, when the primary node comes back online and resources
> >> are migrated back, the OST goes into recovery fine, completes, and
> >> comes back online.
> >>
> >> Here is the log output on the secondary node after failover.
> >>
> >> Lustre: 13290:0:(filter.c:867:filter_init_server_data()) RECOVERY:
> >> service lustre-OST0000, 4 recoverable clients, last_rcvd 2001
> >> Lustre: lustre-OST0000: underlying device drbd2 should be tuned
> >> for larger I/O requests: max_sectors = 64 could be up to
> >> max_hw_sectors=255
> >> Lustre: OST lustre-OST0000 now serving dev
> >> (lustre-OST0000/1ff44d23-d13a-b0c6-48e1-36c104ea6752), but will be
> >> in recovery for at least 5:00, or until 4 clients reconnect.
> >> During this time new clients will not be allowed to connect.
> >> Recovery progress can be monitored by watching
> >> /proc/fs/lustre/obdfilter/lustre-OST0000/recovery_status.
> >> Lustre: Server lustre-OST0000 on device /dev/drbd2 has started
> >> Lustre: Request x8184 sent from lustre-OST0000-osc-c6cedc00 to NID
> >> 192.168.10.23 at tcp 100s ago has timed out (limit 100s).
> >> Lustre: lustre-OST0000-osc-c6cedc00: Connection to service
> >> lustre-OST0000 via nid 192.168.10.23 at tcp was lost; in progress
> >> operations using this service will wait for recovery to complete.
> >> Lustre: 3983:0:(import.c:410:import_select_connection())
> >> lustre-OST0000-osc-c6cedc00: tried all connections, increasing
> >> latency to 6s
> >>
> >>
> >> _______________________________________________
> >> Lustre-discuss mailing list
> >> Lustre-discuss at lists.lustre.org
> >> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >>
>