[lustre-discuss] Deactivate an OST for new file write operations

Bob Ball ball at umich.edu
Mon May 4 10:54:04 PDT 2015


Hmm, an interesting addendum to this: section 18.3.4 of the Lustre 
manual shows how to create a file on a given OST.  If I try that on the 
deactivated OST (index 11, i.e. OST000b), it silently creates the file 
on a different index:

[ball@umt3int01:ball]$ lfs setstripe --index 11 mytestfile
[ball@umt3int01:ball]$ lfs getstripe mytestfile
mytestfile
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
         obdidx           objid           objid           group
              0         3363788       0x3353cc                0

Attempts to set a non-existent index fail as expected; attempts to write 
to another, enabled index work fine.
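
A minimal sketch of how this check could be scripted, assuming the 
`lfs getstripe` text layout shown above. The function names are 
hypothetical helpers, not part of any Lustre tooling, and the sample 
text is the output pasted earlier in this message.

```python
import re

def ost_name_to_index(ost_name):
    """Convert a name such as 'umt3B-OST000b' to its decimal OST index.

    Assumes the four characters after 'OST' are the index in hexadecimal.
    """
    match = re.search(r"OST([0-9a-fA-F]{4})", ost_name)
    if match is None:
        raise ValueError("no OST index found in %r" % ost_name)
    return int(match.group(1), 16)

def stripe_indices(getstripe_output):
    """Return the obdidx values listed in `lfs getstripe` text output."""
    indices = []
    in_table = False
    for line in getstripe_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            in_table = True      # header row; object rows follow
            continue
        if in_table:
            if not fields:
                break            # a blank line ends the object table
            if fields[0].isdigit():
                indices.append(int(fields[0]))
    return indices

def landed_on(getstripe_output, requested_index):
    """True if the file has an object on the requested OST index."""
    return requested_index in stripe_indices(getstripe_output)

# Output of `lfs getstripe mytestfile`, copied from this message.
SAMPLE = """mytestfile
lmm_stripe_count:   1
lmm_stripe_size:    1048576
lmm_pattern:        1
lmm_layout_gen:     0
lmm_stripe_offset:  0
         obdidx           objid           objid           group
              0         3363788       0x3353cc                0
"""

if __name__ == "__main__":
    wanted = ost_name_to_index("umt3B-OST000b")
    print("requested index:", wanted)
    print("indices used:", stripe_indices(SAMPLE))
    print("landed on requested OST:", landed_on(SAMPLE, wanted))
```

In practice the text would come from running `lfs getstripe` on the 
test file rather than from a pasted string.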

bob


On 5/4/2015 1:39 PM, Bob Ball wrote:
> We just built a new 2.7.0 Lustre file system.  Overall I'm happy with 
> performance, but something is confusing me.
>
> We have a combined mgs/mdt DataStore.  On that server I issue:
> ...
>  20 UP osp umt3B-OST000b-osc-MDT0000 umt3B-MDT0000-mdtlov_UUID 5
> [root@mdtmgs ~]# lctl --device 20 deactivate
>
> /var/log/messages logs the change
> 2015-05-04T13:22:38-04:00 mdtmgs.aglt2.org kernel: [4051367.295627] 
> Lustre: setting import umt3B-OST000b_UUID INACTIVE by administrator 
> request
>
> BUT, lctl dl still shows the device as UP rather than IN, contrary to 
> what the manual and past experience with older Lustre versions had 
> led me to expect. On the OSS, this logs:
> 2015-05-04T13:26:21-04:00 umdist09.aglt2.org kernel: [2694525.765701] 
> Lustre: umt3B-OST000b: haven't heard from client 
> umt3B-MDT0000-mdtlov_UUID (at 10.10.2.173@tcp) in 227 seconds. I think 
> it's dead, and I am evicting it. exp ffff88400a679800, cur 1430760381 
> expire 1430760231 last 1430760154
>
> lctl dl on that OSS now shows
>  35 UP osd-zfs umt3B-OST000b-osd umt3B-OST000b-osd_UUID 5
>  36 UP obdfilter umt3B-OST000b umt3B-OST000b_UUID 401
>  37 UP lwp umt3B-MDT0000-lwp-OST000b umt3B-MDT0000-lwp-OST000b_UUID 5
>
> The 401 count is 2 smaller than for the other OST on this server.
>
> Note the time delay in the message logged on the OSS.  So, what is 
> wrong with this picture?  Is this OST really de-activated for write 
> operations?  And if it is, why does lctl still show it as UP and not 
> as IN?
>
> bob

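The status check on the `lctl dl` listing quoted above could be scripted 
along these lines. This is only a sketch: it assumes each device line 
has the columns seen in the quoted output (device number, status, type, 
name, UUID, count), and `device_states` is a hypothetical helper, not a 
Lustre API.

```python
def device_states(lctl_dl_output):
    """Map device name to its status column from `lctl dl` text output.

    Assumed column order, taken from the lines quoted in this thread:
    device number, status, type, name, UUID, count.
    """
    states = {}
    for line in lctl_dl_output.splitlines():
        fields = line.split()
        # Skip anything that does not look like a device line.
        if len(fields) >= 4 and fields[0].isdigit():
            states[fields[3]] = fields[1]
    return states

# Device line copied from the quoted message (MDS side).
SAMPLE = " 20 UP osp umt3B-OST000b-osc-MDT0000 umt3B-MDT0000-mdtlov_UUID 5\n"

if __name__ == "__main__":
    for name, status in device_states(SAMPLE).items():
        print(name, status)
```

Comparing the reported status against the one expected after 
`lctl --device 20 deactivate` would flag the mismatch described in the 
quoted message.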

