[lustre-discuss] Experience with resizing MDT

Andreas Dilger adilger at whamcloud.com
Thu Sep 27 10:00:30 PDT 2018


On Sep 27, 2018, at 17:00, Patrick Farrell <paf at cray.com> wrote:
> 
> Andreas,
> 
> Take a closer look.  It doesn't look to be connected to anything (this is current master).  These are all the instances of it I see:
> 
> C symbol: mdt_enable_remote_dir
> 
>  File           Function                  Line
> 0 mdt_internal.h <global>                   251 mdt_enable_remote_dir:1,
> 1 mdt_lproc.c    <global>                   627 LPROC_SEQ_FOPS(mdt_enable_remote_dir);
> 2 mdt_handler.c  mdt_init0                 5057 m->mdt_enable_remote_dir = 0;
> 3 mdt_lproc.c    mdt_enable_remote_dir_seq  606 seq_printf(m, "%u\n", mdt->mdt_enable_remote_dir);
> 4 mdt_lproc.c    mdt_enable_remote_dir_seq  624 mdt->mdt_enable_remote_dir = val;
> 
> It's there.  It's set at init, and it can be read out and set in proc...  But it's not connected to anything any more, unless there's an obscure macro I missed.  The actual checking of it was removed in the patch Cory mentioned:
> https://review.whamcloud.com/#/c/12282/48/lustre/mdt/mdt_reint.c
> 
> mdt_enable_remote_dir_gid still looks to be working as expected.

Ah,
I searched for enable_remote_dir but found mdt_enable_remote_dir_gid in mdt_remote_permission_check() and assumed it was checking both values.

It looks like enable_remote_dir is not strictly needed, and enable_remote_dir_gid alone is controlling access.  Setting it to the numeric ID of a specific group (e.g. that of "wheel" or "admin") will allow that group to create remote/striped directories, while "-1" will allow all users to do this.
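
To make the rule concrete, here is a minimal C sketch of the access rule as described above.  It is a simplified model, not the actual mdt_remote_permission_check() code: the function name remote_dir_allowed(), its parameters, and the REMOTE_DIR_GID_ANY macro are hypothetical, and the sketch deliberately ignores any special handling of privileged users.

```c
#include <stdbool.h>

/* Hypothetical name for the "-1" setting, which means "all users". */
#define REMOTE_DIR_GID_ANY (-1L)

/*
 * Simplified model of the rule described above, NOT the real Lustre code.
 *
 * configured_gid: value assumed to be set via enable_remote_dir_gid
 * user_gid:       group ID of the user trying to create the directory
 *
 * Returns true if the user may create a remote/striped directory.
 */
bool remote_dir_allowed(long configured_gid, long user_gid)
{
	/* -1 opens remote/striped directory creation to everyone. */
	if (configured_gid == REMOTE_DIR_GID_ANY)
		return true;

	/* Otherwise only the configured group is allowed. */
	return configured_gid == user_gid;
}
```

With the setting at 10, a caller with gid 10 is allowed and a caller with gid 1000 is refused; with the setting at -1, any gid is allowed.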

Cheers, Andreas


> On 9/27/18, 4:52 AM, "Andreas Dilger" <adilger at whamcloud.com> wrote:
> 
>    On Sep 27, 2018, at 04:13, Cory Spitz <spitzcor at cray.com> wrote:
>> 
>> Hello, all.
>> 
>>> If you set mdt.*.enable_remote_dir=1 then you can create directories that point back and forth across MDTs
>> 
>> I thought enable_remote_dir would be useful too, but it turns out that it has changed.  Patrick F. pointed out to me that it was gutted when LU-3537 landed for Lustre 2.8.0.  Setting the option does nothing to change the behavior, which defaults to the behavior formerly made possible with enable_remote_dir=1.
>> 
>> Please take a look at LU-11429, which I filed to have the parameter removed.  The assessment may be wrong, please let us know.
> 
>    The LU-3537 patch is only removing the restriction on remote rename and hard links, which were not allowed with DNE1 due to lack of recovery, but are handled correctly with DNE2 distributed transactions.
> 
>    As far as I can see, "mdt_remote_dir" checks still exist in the master code.
> 
>    Cheers, Andreas
> 
>> On 9/21/18, 11:28 PM, "Andreas Dilger" <adilger at whamcloud.com> wrote:
>> 
>>   On Sep 20, 2018, at 16:38, Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu> wrote:
>>> 
>>> 
>>>> On Sep 19, 2018, at 8:09 PM, Colin Faber <cfaber at gmail.com> wrote:
>>>> 
>>>> Why wouldn't you use DNE?
>>> 
>>> I am considering it as an option, but there appear to be some potential drawbacks.
>>> 
>>> If I use DNE1, then I have to manually create directories on specific MDTs.  I will need to monitor MDT usage and make adjustments as necessary (which is not the end of the world, but still involves some additional work).  This might be fine when I am creating new top-level directories for new users/projects, but any existing directories created before we add a new MDT will still only use MDT0.  Since the bulk of our user/project directories will be created early on, we still have the potential issue of running out of inodes on MDT0.
>> 
>>   Note that it is possible to create remote directories at any point in the filesystem.  If you set mdt.*.enable_remote_dir=1 then you can create directories that point back and forth across MDTs.  If you also set mdt.*.enable_remote_dir_gid=-1 then all users can create remote directories.
>> 
>>> Based on that, I think DNE2 would be the better alternative, but it still has similar limitations.  The directories created initially will still only be striped over a single MDT.  When another MDT is added, I would need to recursively adjust all the existing directories to have a stripe count of 2 (or risk having MDT0 run out of inodes).  Based on my understanding of how the striped directories work, all the files in a striped directory are about evenly split across all the MDTs that the directory is striped across (which doesn’t work very well if MDT0 is mostly full and MDT1 is mostly empty).  Most likely we would want to have every directory striped across all MDTs, but there is a note in the Lustre manual explicitly mentioning that it’s not a good idea to do this.
>> 
>>   Yes, since remote and particularly striped directory creation has a non-zero overhead due to distributed transactions and ongoing extra RPC counts to access, it is better to limit remote and striped directories to the ones that need them.
>> 
>>   We're working on automating the use of DNE remote/striped directories.  In 2.12 it is possible to use "lfs mkdir -i -1" and "lfs mkdir -c N" to automatically select one or more "good" MDT(s) (where "good" == least full right now), or "lfs mkdir -i m,n,p,q" to select a disjoint list of MDTs.
>> 
>>> So that is why I was thinking that resizing the MDT might be the simplest approach.  Of course, I might be misunderstanding something about DNE2, and if that is the case, someone can correct me.  Or if there are options I am not considering, I would welcome those too.
>> 
>>   Yes, if you are not pushing the limits of MDT size, then resizing the MDT is a reasonable approach.  This also avoids issues with MDT imbalance, which is not ideal now, but which we are working to improve.
>> 
>>   Cheers, Andreas
>>   ---
>>   Andreas Dilger
>>   CTO Whamcloud
>> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud