[Lustre-discuss] Running low on inodes

Andreas Dilger adilger at sun.com
Mon Apr 21 10:48:50 PDT 2008


On Apr 21, 2008  08:15 -0700, D. Marc Stearman wrote:
> On Apr 17, 2008, at 5:23 PM, Andreas Dilger wrote:
>> On Apr 17, 2008  10:47 -0700, D. Marc Stearman wrote:
>>> One of our production servers was created with 245M inodes,  which
>>> was enough at the time, but then we doubled the size of the file
>>> system and are now running low on inodes (only 27M available), and
>>> capacity-wise, we are only 60% full.  We ran out of inodes once
>>> already, and did a purge to reduce the number of files in the filesystem.
>>>
>>> I would like to migrate all the data on the MDS to another lun on the
>>> RAID controller.  I can format a new file system with more inodes and
>>> move all the data on the MDS to the new file system.
>>>
>>> Here is my question:  What tools are safe to use to copy all the MDS
>>> data?  Are cp, tar, rsync, etc aware of all the EA data in ldiskfs?
>>> Is there a lustre tool to copy all the information on the MDS over to
>>> a new disk?
>>
>> If you have a very recent (1.17 from FC8) it supports the --xattr option
>> to copy EAs.  Otherwise the recommended procedure is described in a KB
>> article in bugzilla - a search for "backup & restore" gives bug 5716.
>
> I think you left out a command name in there (I'm assuming you mean tar).

Doh, yes, tar...  I'm not aware of any version of rsync that supports
xattrs yet (which would in fact be very handy).
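
For reference, the procedure from the bug 5716 article goes roughly like
this (from memory, so please verify against the article itself; the
device and mount point names below are only examples):

    # with the MDS stopped, mount the MDT device directly as ldiskfs
    mount -t ldiskfs /dev/<mdtdev> /mnt/mdt
    cd /mnt/mdt
    # save the extended attributes and the file data separately
    getfattr -R -d -m '.*' -e hex -P . > /tmp/ea.bak
    tar czf /tmp/mdt_backup.tgz --sparse .
    # restore onto the newly formatted LUN, then replay the EAs
    cd /mnt/newmdt
    tar xzpf /tmp/mdt_backup.tgz
    setfattr --restore=/tmp/ea.bak

The important part is that the getfattr dump and the setfattr restore
are both run from the root of the respective MDT mount, so the relative
paths in the dump line up.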

> So if we have a recent enough version of tar, I can skip the getfattr 
> command?

Probably, yes, assuming you use the right options, which are not the
defaults.  I am trying to test this out myself.  There aren't any
issues with save/restore of xattrs when doing a backup/restore of the
underlying MDT/OST filesystems AFAIK.  There are ordering constraints
when doing backup/restore of the Lustre client files, if you want to
keep the same striping, and that may need some tweaks to tar.
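
If your tar does have the xattr support, the single-pass version would
be something like the following (untested on my side, and the option is
spelled --xattr or --xattrs depending on the build, so check the output
of "tar --help" on your version first):

    # backup straight from the ldiskfs-mounted MDT, keeping the EAs
    cd /mnt/mdt
    tar czf /tmp/mdt_backup.tgz --xattrs --sparse .
    # restore, again telling tar to replay the xattrs
    cd /mnt/newmdt
    tar xzpf /tmp/mdt_backup.tgz --xattrs

Those --xattrs flags are exactly the non-default options I was
referring to above.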

Please let me know your results, so that I can update the backup/restore
documentation.

>> Out of curiosity, how big is the MDS LUN, and how much free space is
>> on the MDS?  We could reduce the default bytes/inode on the MDS to avoid
>> this problem in the future.  On ZFS it will be a non-issue (dynamic inode
>> allocation).
>
> The LUN is 2TB (171.2G in use).  245M inodes, 209M inodes in use.  At the 
> time the file system was created, we didn't expect users to create this 
> many files, and we wanted a reasonable number of inodes to reduce fsck 
> time.  Now that we have the fast fsck feature, we aren't as concerned about 
> the fsck time, and after we doubled the filesystem, and the users started 
> creating many small files, we found ourselves in a bit of a bind.  I have 
> another identical size lun on the DDN controller, so copying the data over 
> is no big deal, I just want to make sure we don't lose the EAs.
>
> I don't think the default bytes/inode is an issue.  We explicitly chose 
> this number of inodes, overriding the default.

What I'd suggest is to make a smaller LUN using the recommended 4kB of
space per inode, or less.  That way, if you ever need to increase the
number of inodes you can get that by increasing the LUN size.  Your current
actual space usage is at 890 bytes/inode, while the MDS was formatted with
the e2fsprogs default 8192 bytes/inode.  While I wouldn't recommend going
too low, even 4096 bytes/inode would give you 256M inodes in 1TB.
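
Something along these lines when formatting the new LUN would do it
(just a sketch - the fsname, MGS NID and device below are placeholders
for your own values):

    # 4096 bytes of space per inode instead of the 8192-byte default;
    # on a 1TB MDT that works out to about 256M inodes, ~512M on 2TB
    mkfs.lustre --mdt --fsname=<fsname> --mgsnode=<mgsnid> \
        --mkfsoptions='-i 4096' /dev/<newmdtdev>

The -i value is passed straight through to mke2fs as the bytes-per-inode
ratio, so you can tune it further if you expect even more small files.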

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.



