[lustre-discuss] Deactivate an OST for new file write operations

Laifer, Roland (SCC) roland.laifer at kit.edu
Tue May 5 01:09:50 PDT 2015


Hello Bob,

some time ago I made the same observations and was also astonished:
1. Trying to write to an inactive OST silently writes to another OST.
2. For an inactive OST "lctl dl" on the MDS no longer reports IN.
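
Item 1 can be caught from a script by comparing the requested index
with the obdidx column of "lfs getstripe". This is only a minimal
sketch: it assumes the table layout shown in Bob's output quoted
further down, and stripe_osts and check_placement are hypothetical
helper names, not Lustre commands.

```shell
#!/bin/sh
# Print the OST indices (obdidx column) found in "lfs getstripe" output
# read from stdin. Assumes a header line containing "obdidx" followed by
# one four-column row per stripe.
stripe_osts() {
    awk '
        /obdidx/            { in_table = 1; next }  # header row starts the table
        in_table && NF >= 4 { print $1 }            # first column is the OST index
        in_table && NF == 0 { in_table = 0 }        # blank line ends the table
    '
}

# Succeed only if every stripe sits on the requested OST index.
# check_placement is a hypothetical helper, not part of Lustre.
check_placement() {
    wanted=$1
    actual=$(stripe_osts | sort -u | tr '\n' ' ')
    [ "$actual" = "$wanted " ]
}
```

Possible use: lfs getstripe mytestfile | check_placement 11 || echo "not on OST 11"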

After I discussed item #2 with support, they pointed me to LU-4294:
you can use "cat /proc/fs/lustre/lov/*/target_obd" instead of
"lctl dl" to see the state. I also requested a documentation change;
LUDOC-218 was created for it and is still open.
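
A minimal sketch of filtering that file for deactivated targets. It
assumes each target_obd line has the form "index: UUID STATE", for
example "11: umt3B-OST000b_UUID INACTIVE"; please check the format on
your Lustre version. inactive_osts is a hypothetical helper name.

```shell
#!/bin/sh
# List OSTs that the LOV on the MDS currently marks INACTIVE.
# Reads target_obd content from stdin; assumes lines of the form
#   "<index>: <target UUID> <ACTIVE|INACTIVE>"
inactive_osts() {
    awk '$NF == "INACTIVE" { sub(/:$/, "", $1); print $1, $2 }'
}
```

Possible use on the MDS: cat /proc/fs/lustre/lov/*/target_obd | inactive_osts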

Regards,
   Roland


On 04.05.2015 at 19:54, Bob Ball wrote:
> Hmm, an interesting addendum to this: section 18.3.4 of the Lustre
> manual shows how to create a file on a given OST.  If I try that with
> the disabled OST, it silently creates the file on a different index:
>
> [ball@umt3int01:ball]$ lfs setstripe --index 11 mytestfile
> [ball@umt3int01:ball]$ lfs getstripe mytestfile
> mytestfile
> lmm_stripe_count:   1
> lmm_stripe_size:    1048576
> lmm_pattern:        1
> lmm_layout_gen:     0
> lmm_stripe_offset:  0
>           obdidx           objid           objid           group
>                0         3363788       0x3353cc                0
>
> Attempts to set a non-existent index fail as expected; attempts to
> write to another, enabled index are fine.
>
> bob
>
>
> On 5/4/2015 1:39 PM, Bob Ball wrote:
>> We just built a new 2.7.0 Lustre file system.  Overall I'm happy with
>> performance, but something is confusing me.
>>
>> We have a combined mgs/mdt DataStore.  On that server I issue:
>> ...
>>   20 UP osp umt3B-OST000b-osc-MDT0000 umt3B-MDT0000-mdtlov_UUID 5
>> [root@mdtmgs ~]# lctl --device 20 deactivate
>>
>> /var/log/messages logs the change
>> 2015-05-04T13:22:38-04:00 mdtmgs.aglt2.org kernel: [4051367.295627]
>> Lustre: setting import umt3B-OST000b_UUID INACTIVE by administrator
>> request
>>
>> BUT, lctl dl still shows the device UP, not IN, as the manual and
>> past experience with older Lustre versions had led me to expect. On
>> the OSS, this logs:
>> 2015-05-04T13:26:21-04:00 umdist09.aglt2.org kernel: [2694525.765701]
>> Lustre: umt3B-OST000b: haven't heard from client
>> umt3B-MDT0000-mdtlov_UUID (at 10.10.2.173@tcp) in 227 seconds. I think
>> it's dead, and I am evicting it. exp ffff88400a679800, cur 1430760381
>> expire 1430760231 last 1430760154
>>
>> lctl dl on that OSS now shows
>>   35 UP osd-zfs umt3B-OST000b-osd umt3B-OST000b-osd_UUID 5
>>   36 UP obdfilter umt3B-OST000b umt3B-OST000b_UUID 401
>>   37 UP lwp umt3B-MDT0000-lwp-OST000b umt3B-MDT0000-lwp-OST000b_UUID 5
>>
>> The 401 count is 2 smaller than for the other OST on this server.
>>
>> Note the time delay in the message logged on the OSS.  So, what is
>> wrong with this picture?  Is this OST really de-activated for write
>> operations?  And if it is, why does lctl still show it as UP and not
>> as IN?
>>
>> bob
>>
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>