[Lustre-discuss] lustre 1.6.5.1 panic on failover
Brock Palen
brockp at umich.edu
Fri Aug 1 08:39:43 PDT 2008
Yes, it is consistent. I looked up how to induce a panic using sysrq:
echo c > /proc/sysrq-trigger
That works correctly: the machine cycles, the second node takes over, and
all is well.
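A minimal sketch of that crash test as a reusable helper. The function name crash_node and its optional path argument are hypothetical, added only so the helper can be dry-run against a scratch file; writing to the real trigger requires root and crashes the node immediately.

```shell
#!/bin/sh
# Sketch only: crash_node is a hypothetical helper, not part of heartbeat or Lustre.
# Writing 'c' to the sysrq trigger makes the kernel crash on the spot.
crash_node() {
    # Optional argument: alternate target file, useful for a harmless dry run.
    trigger="${1:-/proc/sysrq-trigger}"
    printf 'c\n' > "$trigger"
}

# On the node that currently has the MDS/MGS mounted, as root:
#   crash_node
# Dry run that only writes to a scratch file:
#   crash_node /tmp/fake-trigger
```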
If, instead of crashing the node, I run 'killall -9 heartbeat',
I can get the panic every time. I even edited the external/ipmi
script to change 'power reset' to 'power cycle', but that didn't help.
It's kind of unstable: if heartbeat dies, the whole MDS/MGS server setup
would lock up, but if the server panics I will be OK. I don't like this
spot.
I am looking at grabbing a crash dump. I think it's a race: heartbeat
is mounting the filesystems before the first node is totally dead.
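One way to probe the race hypothesis would be to hold off the mount until the peer's BMC reports the chassis as off. The following is a sketch under stated assumptions, not how heartbeat or the external/ipmi plugin actually behave: the helper name wait_for_power_off is hypothetical, and it assumes the status command prints "Chassis Power is off", as `ipmitool chassis power status` does. It only makes sense if fencing powers the node off rather than resetting it, since a reset may never report "off".

```shell
#!/bin/sh
# Sketch only: wait_for_power_off is a hypothetical helper.
# Usage: wait_for_power_off TRIES COMMAND [ARGS...]
# Polls COMMAND once per second, up to TRIES times, until its output
# contains "Chassis Power is off". Returns 0 on success, 1 on timeout.
wait_for_power_off() {
    tries="$1"
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" 2>/dev/null | grep -q 'Chassis Power is off'; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example call; PEER_BMC, IPMI_USER, the device and the mount point are
# placeholders, and -E makes ipmitool read the password from IPMI_PASSWORD:
#   wait_for_power_off 30 ipmitool -I lan -H "$PEER_BMC" -U "$IPMI_USER" -E \
#       chassis power status && mount -t lustre /dev/MDT_DEVICE /mnt/mdt
```

If the panic disappears with a guard like this in place, that would point toward the takeover racing the shutdown of the first node.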
Does it hurt to run MMP on the MGS file system as well?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
On Jul 31, 2008, at 5:28 PM, Klaus Steden wrote:
>
> Hi Brock,
>
> I've been using Sun X2200s with Lustre in a similar configuration (IPMI,
> STONITH, Linux-HA, FC storage) and haven't had any issues like this
> (although I would typically panic the primary node during testing using
> Sysrq) ... is the behaviour consistent?
>
> Klaus
>
> On 7/31/08 1:57 PM, "Brock Palen" <brockp at umich.edu> did etch on stone
> tablets:
>
>> I have two machines I am setting up as my first MDS failover pair.
>>
>> The two Sun X4100s are connected to an FC disk array. I have set up
>> heartbeat with IPMI for STONITH.
>>
>> The problem is that when I run 'killall -9 heartbeat' as a test on the
>> host that currently has the MDS/MGS mounted, I see the IPMI shutdown,
>> and when the second X4100 tries to mount the filesystem, its kernel
>> panics.
>>
>> Has anyone else seen this behavior? Is there something I am running
>> into? If I do an 'hb_takeover' or shut down heartbeat cleanly, all is
>> well. This happens only if I simulate heartbeat failing. Note that I
>> have not tried yanking power yet, but I wanted to simulate an MDS in a
>> semi-dead state and ran into this.
>>
>>
>> Brock Palen
>> www.umich.edu/~brockp
>> Center for Advanced Computing
>> brockp at umich.edu
>> (734)936-1985