[Lustre-discuss] new OST not mounting up

Mag Gam magawake at gmail.com
Sat Feb 28 05:39:45 PST 2009


(Sorry, adding the entire list for Evan's response.)

Thank you for getting back to me on this.
So, when I try to mount the **new** OST I keep getting these messages.

For some reason the new OST shows up as active on the MGS side, and I
am not sure why.  I think I made a mistake by trying to mount a new
OST while clients were still active.


When I try to activate the bad OST, I get these messages:

Lustre: 11647:0:(ldlm_lib.c:736:target_handle_connect())
lfs001-OST0005: cookie lfs001-mdtlov_UUID seen on new NID
mds_ip_addr@tcp when existing NID 0@lo is already connected
Feb 27 11:59:01 oss_server kernel: Lustre:
11647:0:(ldlm_lib.c:736:target_handle_connect()) Skipped 4 previous
similar messages
Feb 27 11:59:01 mds_server kernel: Lustre:
3426:0:(import.c:411:import_select_connection()) lfs001-OST0005-osc:
tried all connections, increasing latency to 51s
Feb 27 11:59:01 oss_server kernel: LustreError:
11647:0:(ldlm_lib.c:1614:target_send_reply_msg()) @@@ processing error
(-114)  req@ffff8104251a4400 x388745/t0 o8-><?>@<?>:0/0 lens 240/144 e
0 to 0 dl 1235754041 ref 1 fl Interpret:/0/0 rc -114/0
Feb 27 11:59:01 mds_server kernel: Lustre:
3426:0:(import.c:411:import_select_connection()) Skipped 6 previous
similar messages
Feb 27 11:59:01 mds_server kernel: LustreError: 11-0: an error
occurred while communicating with oss_ip@tcp. The ost_connect
operation failed with -114
Feb 27 11:59:01 mds_server kernel: LustreError: Skipped 12 previous
similar messages


 oss_server kernel: LustreError:
11556:0:(ldlm_lib.c:1614:target_send_reply_msg()) @@@ processing error
(-114)  req@ffff81042150a000 x388953/t0 o8-><?>@<?>:0/0 lens 240/144 e
0 to 0 dl 1235754240 ref 1 fl Interpret:/0/0 rc -114/0
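
For reference, the -114 in these messages looks like a negated Linux
errno value. Below is a minimal Python sketch that pulls the code out
of log fragments like the ones above and looks it up in the local
errno tables. The names LOG_FRAGMENTS, RC_PATTERN, extract_errnos and
describe are only illustrative, and the lookup reflects whatever
platform the script runs on, so it is meant to be run on a Linux host.

```python
import errno
import os
import re

# Fragments copied from the kernel messages above; only the numeric code matters.
LOG_FRAGMENTS = [
    "ref 1 fl Interpret:/0/0 rc -114/0",
    "The ost_connect operation failed with -114",
]

# Matches "rc -114/0" or "failed with -114" and captures the positive value.
RC_PATTERN = re.compile(r"(?:\brc|failed with) -(\d+)")


def extract_errnos(lines):
    """Return the sorted, de-duplicated errno values found in the log lines."""
    found = set()
    for line in lines:
        for match in RC_PATTERN.finditer(line):
            found.add(int(match.group(1)))
    return sorted(found)


def describe(code):
    """Map an errno value to (symbolic name, message) using the local tables."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


codes = extract_errnos(LOG_FRAGMENTS)
for code in codes:
    name, message = describe(code)
    print(f"-{code}: {name} ({message})")
```

On Linux, errno 114 is EALREADY, "Operation already in progress",
which is the same text mount.lustre printed when the mount failed.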



Also, I was wondering if there is a way to reset the state of my OST.
It keeps thinking it's already mounted, even after a reboot. Any way to
say "hey, I am not mounted"? :-)

TIA

On Sat, Feb 28, 2009 at 8:38 AM, Mag Gam <magawake at gmail.com> wrote:
> Would a writeconf help on the OST? I am hesitant to run one on it.
>
> TIA
>
> On Fri, Feb 27, 2009 at 11:41 AM, Evan Felix <evan.felix at pnl.gov> wrote:
>> Mag,
>>
>> Can you send us the output from your kernel log after you try the mount
>> command that is failing?
>>
>> Just run 'dmesg' and send us the last 20 lines or so..
>>
>> evan
>>
>>
>> On 2/26/09 7:12 PM, "Mag Gam" <magawake at gmail.com> wrote:
>>
>>> Any ideas?
>>>
>>> I am still unable to mount this new OST.  I stopped the client hang
>>> problem by disabling the OST via lctl, but it is a crazy problem indeed.
>>>
>>>
>>> I would love to know how to activate the OST.
>>>
>>>
>>>
>>> On Wed, Feb 25, 2009 at 4:43 PM, Mag Gam <magawake at gmail.com> wrote:
>>>> Hello.
>>>>
>>>>
>>>> We created an OST on an OSS, but when I try to mount the OST, it
>>>> keeps saying:
>>>>
>>>>
>>>> mount.lustre /dev/vg/ost002 /vol/srv1/ost002
>>>> mount.lustre: mount /dev/vg/ost002 at /vol/srv1/ost002 failed:
>>>> Operation already in progress
>>>> The target service is already running. (/dev/vg/ost002)
>>>>
>>>>
>>>> However,
>>>> mount | grep -i ost002
>>>> Nothing is mounted up....
>>>>
>>>> lctl is even showing this OST and also the client is able to see it.
>>>> lfs df -h
>>>> ...
>>>> lfs001-OST0005_UUID     492.2G    445.2G     22.0G   90%
>>>> /lfs/srv5/lfs001[OST:5]
>>>> ...
>>>>
>>>>
>>>> The MDS/OSS Version:
>>>> lustre: 1.6.5.52
>>>> kernel: patchless
>>>> build:
>>>> 1.6.5.52-19691231190000-PRISTINE-.var.tmp.linux-2.6.18.x86_64-2.6.18-prep
>>>> 2.6.18 = Kernel Version
>>>>
>>>> I don't think it's Bugzilla 11564, because my Lustre fs name is only 6
>>>> characters long.
>>>>
>>>>
>>>> Also, when the client tries to access the new OST's space, it simply
>>>> hangs. It placed it in "bloc
>>>>
>>>> Any thoughts about this?
>>>>
>>>> TIA
>>>>
>>> _______________________________________________
>>> Lustre-discuss mailing list
>>> Lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>>
>>
>


