[lustre-discuss] Experience with resizing MDT

Andreas Dilger adilger at whamcloud.com
Fri Sep 21 21:28:15 PDT 2018


On Sep 20, 2018, at 16:38, Mohr Jr, Richard Frank (Rick Mohr) <rmohr at utk.edu> wrote:
> 
> 
>> On Sep 19, 2018, at 8:09 PM, Colin Faber <cfaber at gmail.com> wrote:
>> 
>> Why wouldn't you use DNE?
> 
> I am considering it as an option, but there appear to be some potential drawbacks.
> 
> If I use DNE1, then I have to manually create directories on specific MDTs.  I will need to monitor MDT usage and make adjustments as necessary (which is not the end of the world, but still involves some additional work).  This might be fine when I am creating new top-level directories for new users/projects, but any existing directories created before we add a new MDT will still only use MDT0.  Since the bulk of our user/project directories will be created early on, we still have the potential issue of running out of inodes on MDT0.

Note that it is possible to create remote directories at any point in the filesystem.  If you set mdt.*.enable_remote_dir=1, you can create directories that point back and forth across MDTs.  If you also set mdt.*.enable_remote_dir_gid=-1, all users can create remote directories.

> Based on that, I think DNE2 would be the better alternative, but it still has similar limitations.  The directories created initially will still only be striped over a single MDT.  When another MDT is added, I would need to recursively adjust all the existing directories to have a stripe count of 2 (or risk having MDT0 run out of inodes).  Based on my understanding of how striped directories work, the files in a striped directory are split about evenly across all the MDTs that the directory is striped across (which doesn’t work very well if MDT0 is mostly full and MDT1 is mostly empty).  Most likely we would want to have every directory striped across all MDTs, but there is a note in the Lustre manual explicitly mentioning that it’s not a good idea to do this.

Yes.  Creating remote, and particularly striped, directories has non-zero overhead because of distributed transactions, and accessing them costs extra RPCs on an ongoing basis, so it is better to limit remote and striped directories to the ones that need them.

We're working on automating the use of DNE remote/striped directories.  In 2.12 it is possible to use "lfs mkdir -i -1" and "lfs mkdir -c N" to automatically select one or more "good" MDT(s) (where "good" == least full right now), or "lfs mkdir -i m,n,p,q" to select a disjoint list of MDTs.
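
To make the "least full right now" idea concrete, here is a minimal Python sketch.  It is only an illustration, not the selection code Lustre itself uses: the function name pick_least_full_mdts, the use of free inode counts as the measure of fullness, and the numbers are all assumptions made for the example.

```python
# Illustrative sketch only: this is NOT Lustre's actual MDT selection logic.
# "Least full" is assumed here to mean "most free inodes".

def pick_least_full_mdts(free_inodes, count=1):
    """Return the indices of the `count` MDTs with the most free inodes.

    free_inodes maps MDT index -> free inode count (hypothetical values,
    of the kind one might collect from `lfs df -i`).
    """
    if count < 1 or count > len(free_inodes):
        raise ValueError("count must be between 1 and the number of MDTs")
    # Rank by free inodes, descending; break ties by lowest MDT index.
    ranked = sorted(free_inodes, key=lambda idx: (-free_inodes[idx], idx))
    return ranked[:count]


# Hypothetical free inode counts for a four-MDT filesystem.
free = {0: 5_000_000, 1: 90_000_000, 2: 40_000_000, 3: 90_000_000}

print(pick_least_full_mdts(free))           # a single MDT
print(pick_least_full_mdts(free, count=2))  # two MDTs for a striped directory
```

With these made-up numbers, the nearly full MDT0 is never chosen as long as emptier MDTs are available.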

> So that is why I was thinking that resizing the MDT might be the simplest approach.  Of course, I might be misunderstanding something about DNE2, and if that is the case, someone can correct me.  Or if there are options I am not considering, I would welcome those too.

Yes, if you are not pushing the limits of MDT size, then resizing the MDT is a reasonable approach.  This also avoids issues with MDT imbalance, which is not handled ideally right now, but which we are working to improve.
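
As a rough illustration of the imbalance concern, the following Python sketch estimates how many files fit in a directory striped evenly over two unevenly filled MDTs.  It assumes one inode per file and a perfectly even split, and the free inode counts are hypothetical; the helper name files_until_first_mdt_full is made up for the example.

```python
# Back-of-the-envelope sketch under simplifying assumptions:
# one inode per file, and new files split perfectly evenly across MDTs.

def files_until_first_mdt_full(free_inodes):
    """Files that can be created before the fullest MDT runs out of inodes."""
    # Each MDT receives 1/n of the new files, so the MDT with the fewest
    # free inodes limits the total.
    return min(free_inodes) * len(free_inodes)


# Hypothetical numbers: MDT0 is mostly full, MDT1 is newly added.
free = [10_000_000, 100_000_000]

usable = files_until_first_mdt_full(free)
stranded = sum(free) - usable  # free inodes left once MDT0 is exhausted

print(usable, stranded)
```

Under these assumptions, creates that land on MDT0 would start failing after about 20 million new files, even though roughly 90 million inodes remain free on MDT1.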

Cheers, Andreas
---
Andreas Dilger
CTO Whamcloud



