[lustre-discuss] Experience with resizing MDT

Andreas Dilger adilger at whamcloud.com
Thu Sep 27 02:51:54 PDT 2018


On Sep 27, 2018, at 04:13, Cory Spitz <spitzcor at cray.com> wrote:
> 
> Hello, all.
> 
>>  If you set mdt.*.enable_remote_dir=1 then you can create directories that point back and forth across MDTs
> 
> I thought enable_remote_dir would be useful too, but it turns out that it has changed.  Patrick F. pointed out to me that it was gutted when LU-3537 landed for L2.8.0.  Setting the option no longer changes anything: the default behavior is now what was formerly possible only with enable_remote_dir=1.
> 
> Please take a look at LU-11429, which I filed to have the parameter removed.  The assessment may be wrong; if so, please let us know.

The LU-3537 patch only removes the restriction on remote rename and hard links, which were not allowed with DNE1 due to lack of recovery, but are handled correctly by DNE2 distributed transactions.

As far as I can see, "mdt_remote_dir" checks still exist in the master code.

Cheers, Andreas

> On 9/21/18, 11:28 PM, "Andreas Dilger" <adilger at whamcloud.com> wrote:
> 
>    On Sep 20, 2018, at 16:38, Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu> wrote:
>> 
>> 
>>> On Sep 19, 2018, at 8:09 PM, Colin Faber <cfaber at gmail.com> wrote:
>>> 
>>> Why wouldn't you use DNE?
>> 
>> I am considering it as an option, but there appear to be some potential drawbacks.
>> 
>> If I use DNE1, then I have to manually create directories on specific MDTs.  I will need to monitor MDT usage and make adjustments as necessary (which is not the end of the world, but still involves some additional work).  This might be fine when I am creating new top-level directories for new users/projects, but any existing directories created before we add a new MDT will still only use MDT0.  Since the bulk of our user/project directories will be created early on, we still have the potential issue of running out of inodes on MDT0.
> 
>    Note that it is possible to create remote directories at any point in the filesystem.  If you set mdt.*.enable_remote_dir=1 then you can create directories that point back and forth across MDTs.  If you also set mdt.*.enable_remote_dir_gid=-1 then all users can create remote directories.
> 
>> Based on that, I think DNE2 would be the better alternative, but it still has similar limitations.  The directories created initially will still be striped over only a single MDT.  When another MDT is added, I would need to recursively adjust all the existing directories to have a stripe count of 2 (or risk having MDT0 run out of inodes).  Based on my understanding of how striped directories work, the files in a striped directory are split about evenly across all the MDTs that the directory is striped over (which doesn’t work very well if MDT0 is mostly full and MDT1 is mostly empty).  Most likely we would want to have every directory striped across all MDTs, but there is a note in the Lustre manual explicitly mentioning that it’s not a good idea to do this.
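To put rough numbers on the even-split concern above, here is a minimal Python sketch.  It assumes one inode per file and a perfectly even distribution of new files over the stripes; the free-inode counts are hypothetical and are not taken from any real filesystem.

```python
def files_until_full(free_inodes):
    """Estimate how many new files an evenly striped directory can accept
    before the fullest MDT runs out of inodes.

    free_inodes: free inode count per MDT the directory is striped over.
    Assumes one inode per file and a perfectly even split (an idealization).
    """
    # With an even split each MDT receives about 1/N of the new files,
    # so the MDT with the fewest free inodes is the limiting factor.
    return min(free_inodes) * len(free_inodes)


# Hypothetical figures: MDT0 is mostly full, MDT1 is mostly empty.
free = [1_000_000, 50_000_000]
usable = files_until_full(free)
stranded = sum(free) - usable  # free inodes left unused once MDT0 is full
print(usable, stranded)
```

Under these assumptions, most of the free inodes on the emptier MDT stay unused once the fuller MDT is exhausted.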
> 
>    Yes, since remote and particularly striped directories have a non-zero overhead, both from distributed transactions at creation and from the extra RPCs needed on later accesses, it is better to limit remote and striped directories to the ones that need it.
> 
>    We're working on automating the use of DNE remote/striped directories.  In 2.12 it is possible to use "lfs mkdir -i -1" and "lfs mkdir -c N" to automatically select one or more "good" MDT(s) (where "good" == least full right now), or "lfs mkdir -i m,n,p,q" to select a disjoint list of MDTs.
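As an illustration of what "good" == least full means, here is a minimal Python sketch that ranks MDTs by the fraction of inodes in use.  It is not the Lustre allocator; the function name, the input layout, and the usage figures are all hypothetical.

```python
def pick_least_full(usage, count=1):
    """Return the indices of the `count` least-full MDTs.

    usage: dict mapping MDT index -> (used_inodes, total_inodes).
    This is only an illustration of "least full right now", not the
    selection logic Lustre itself implements.
    """
    # Rank MDT indices by the fraction of their inodes already in use.
    ranked = sorted(usage, key=lambda idx: usage[idx][0] / usage[idx][1])
    return ranked[:count]


# Hypothetical usage figures for three MDTs.
usage = {0: (90, 100), 1: (10, 100), 2: (40, 100)}
one = pick_least_full(usage)        # analogous to choosing a single MDT
two = pick_least_full(usage, 2)     # analogous to choosing N MDTs
print(one, two)
```

In practice the per-MDT inode counts could be read from "lfs df -i" on a client.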
> 
>> So that is why I was thinking that resizing the MDT might be the simplest approach.  Of course, I might be misunderstanding something about DNE2, and if that is the case, someone can correct me.  Or if there are options I am not considering, I would welcome those too.
> 
>    Yes, if you are not pushing the limits of MDT size, then resizing the MDT is a reasonable approach.  This also avoids issues with MDT imbalance, which is not handled ideally now, but which we are working to improve.
> 
>    Cheers, Andreas
>    ---
>    Andreas Dilger
>    CTO Whamcloud
> 

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud





