[lustre-discuss] Deactivate an OST for new file write operations

Wolfgang Baudler wbaudler at gb.nrao.edu
Tue May 5 07:12:25 PDT 2015


Yes, I have reported this in LU-4294 back in 2013 for version 2.4.

Unfortunately, recent Lustre versions still seem to behave the same way,
even though the status of this bug report shows "Resolved". Only the
documentation was updated, I think.
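
For anyone who wants to script the two checks discussed in this thread,
here is a minimal Python sketch, not an official tool. It assumes that
target_obd lines look like "<index>: <UUID> ACTIVE|INACTIVE" and that
"lfs getstripe" prints the obdidx table in the layout quoted below. The
function names and the sample target_obd text are made up for
illustration.

```python
import re

# Assumed line format of /proc/fs/lustre/lov/*/target_obd:
#   "<index>: <target UUID> <ACTIVE|INACTIVE>"
TARGET_LINE = re.compile(r"^\s*(\d+):\s+(\S+)\s+(ACTIVE|INACTIVE)\s*$")


def inactive_targets(target_obd_text):
    """Return {index: uuid} for every target marked INACTIVE."""
    inactive = {}
    for line in target_obd_text.splitlines():
        match = TARGET_LINE.match(line)
        if match and match.group(3) == "INACTIVE":
            inactive[int(match.group(1))] = match.group(2)
    return inactive


def stripe_indices(getstripe_text):
    """Return the obdidx values from the table in 'lfs getstripe' output."""
    indices = []
    in_table = False
    for line in getstripe_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            in_table = True  # rows of the object table follow this header
            continue
        if in_table:
            if not fields or not fields[0].isdigit():
                break  # a blank or non-numeric line ends the table
            indices.append(int(fields[0]))
    return indices


# Hypothetical target_obd content; only the line format matters here.
SAMPLE_TARGET_OBD = """\
0: umt3B-OST0000_UUID ACTIVE
11: umt3B-OST000b_UUID INACTIVE
"""

# 'lfs getstripe' output as quoted further down in this thread.
SAMPLE_GETSTRIPE = """\
mytestfile
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
          obdidx           objid           objid           group
               0         3363788       0x3353cc                0
"""

requested_index = 11
if requested_index not in stripe_indices(SAMPLE_GETSTRIPE):
    print("file was not created on the requested OST index", requested_index)
```

On a real system the two text inputs would come from reading the
target_obd file on the MDS and from running "lfs getstripe" on a client,
so the silent fallback to another OST can at least be detected.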

Wolfgang

> Ah, OK, so I'm not imagining things then.  Thank you, Roland.  This is
> definitely poor behavior, very user unfriendly.
>
> bob
>
> On 5/5/2015 3:40 AM, Laifer, Roland (SCC) wrote:
>> Hello Bob,
>>
>> Some time ago I made the same observations and was also astonished:
>> 1. Trying to write to an inactive OST silently writes to another OST.
>> 2. For an inactive OST, "lctl dl" on the MDS no longer reports IN.
>>
>> When I discussed item #2 with support, they pointed me to LU-4294,
>> i.e. you can use "cat /proc/fs/lustre/lov/*/target_obd" instead of
>> "lctl dl". I also requested a documentation change, and LUDOC-218
>> was created, which is still open.
>>
>> Regards,
>>   Roland
>>
>>
>> On 04.05.2015 at 19:54, Bob Ball wrote:
>>> Hmm, an interesting addendum to this: section 18.3.4 of the Lustre
>>> manual shows how to create a file on a given OST.  If I try that on the
>>> disabled OST, the file is silently created on a different index:
>>>
>>> [ball@umt3int01:ball]$ lfs setstripe --index 11 mytestfile
>>> [ball@umt3int01:ball]$ lfs getstripe mytestfile
>>> mytestfile
>>> lmm_stripe_count:   1
>>> lmm_stripe_size:    1048576
>>> lmm_pattern:        1
>>> lmm_layout_gen:     0
>>> lmm_stripe_offset:  0
>>>           obdidx           objid           objid           group
>>>                0         3363788       0x3353cc                0
>>>
>>> Attempts to set a non-existent index fail as expected, and attempts
>>> to write to another, enabled index are fine.
>>>
>>> bob
>>>
>>>
>>> On 5/4/2015 1:39 PM, Bob Ball wrote:
>>>> We just built a new 2.7.0 Lustre file system.  Overall I'm happy with
>>>> performance, but something is confusing me.
>>>>
>>>> We have a combined MGS/MDT datastore.  On that server I issue:
>>>> ...
>>>>   20 UP osp umt3B-OST000b-osc-MDT0000 umt3B-MDT0000-mdtlov_UUID 5
>>>> [root@mdtmgs ~]# lctl --device 20 deactivate
>>>>
>>>> /var/log/messages logs the change
>>>> 2015-05-04T13:22:38-04:00 mdtmgs.aglt2.org kernel: [4051367.295627]
>>>> Lustre: setting import umt3B-OST000b_UUID INACTIVE by administrator
>>>> request
>>>>
>>>> BUT lctl dl still shows the device as UP, not IN, which is what the
>>>> manual and past experience with older Lustre versions had led me to
>>>> expect. On the OSS, this logs:
>>>> 2015-05-04T13:26:21-04:00 umdist09.aglt2.org kernel: [2694525.765701]
>>>> Lustre: umt3B-OST000b: haven't heard from client
>>>> umt3B-MDT0000-mdtlov_UUID (at 10.10.2.173@tcp) in 227 seconds. I think
>>>> it's dead, and I am evicting it. exp ffff88400a679800, cur 1430760381
>>>> expire 1430760231 last 1430760154
>>>>
>>>> lctl dl on that OSS now shows
>>>>   35 UP osd-zfs umt3B-OST000b-osd umt3B-OST000b-osd_UUID 5
>>>>   36 UP obdfilter umt3B-OST000b umt3B-OST000b_UUID 401
>>>>   37 UP lwp umt3B-MDT0000-lwp-OST000b umt3B-MDT0000-lwp-OST000b_UUID 5
>>>>
>>>> The 401 count is 2 smaller than for the other OST on this server.
>>>>
>>>> Note the time delay in the message logged on the OSS.  So, what is
>>>> wrong with this picture?  Is this OST really de-activated for write
>>>> operations?  And if it is, why does lctl still show it as UP and not
>>>> as IN?
>>>>
>>>> bob
>>>>



