[Lustre-discuss] MDS inode allocation question

Andreas Dilger andreas.dilger at oracle.com
Wed Apr 28 19:19:25 PDT 2010


On 2010-04-28, at 7:44, Gary Molenkamp <gary at sharcnet.ca> wrote:

> When I create the MDS, I specified '-i 1024' and I can see (locally)
> 800M inodes, but only part of the available space is allocated.

This is to be expected. There needs to be free space on the MDS for  
directories, striping and other internal usage.

>  Also, when the client mounts the filesystem,  the MDS only has 400M  
> blocks available:
>
> gulfwork-MDT0000_UUID 430781784    500264 387274084    0% /gulfwork[MDT:0]
>
> As we were creating files for testing, I saw that each inode allocation
> on the MDS was consuming 4k of space,

That depends on how you are striping your files.  If the striping is  
larger than will fit inside the inode (13 stripes for 512-byte inodes  
IIRC) then each inode will also consume a block for the striping, and
some step-wise fraction of a block for each directory entry. That is
why 'df -i' will return min(free blocks, free inodes). The common case,
though, is that files do not need an external xattr block for the
striping (see the stripe hint argument for mkfs.lustre), and the number
of 'free' inodes will remain constant as files are being created, until
the number of free blocks exceeds the free inode count.
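
As a rough illustration of the min(free blocks, free inodes) behaviour described above, here is a minimal Python sketch. The function names and the starting counts are hypothetical, and the model is deliberately simplified: it ignores the blocks consumed by directory entries and assumes a widely striped file costs exactly one extra block.

```python
def reported_free_inodes(free_blocks, free_inodes):
    """Free-inode figure as described above: min(free blocks, free inodes)."""
    return min(free_blocks, free_inodes)


def create_files(free_blocks, free_inodes, count, needs_xattr_block):
    """Return (free_blocks, free_inodes) after creating `count` files.

    Simplified model: each file uses one inode, plus one block only if
    its striping does not fit inside the inode.
    """
    blocks_used = count if needs_xattr_block else 0
    return free_blocks - blocks_used, free_inodes - count


# Hypothetical starting point: fewer free blocks than free inodes.
start_blocks, start_inodes = 100, 800

# Striping fits in the inode: the reported figure does not move.
narrow = create_files(start_blocks, start_inodes, 50, needs_xattr_block=False)
print(reported_free_inodes(*narrow))  # 100

# Striping needs an external block: the reported figure drops per file.
wide = create_files(start_blocks, start_inodes, 50, needs_xattr_block=True)
print(reported_free_inodes(*wide))    # 50
```

In the first case the reported value would only start to fall once the real free inode count drops below the free block count.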

> so even though I have 800M inodes available on actual mds partition,  
> it appears that the actual space available was only allowing 100M  
> inodes in the lustre fs.  Am I
> understanding that correctly?

Possibly, yes. If you are striping all files widely by default it can
happen as you describe.
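
A quick back-of-the-envelope check of the "100M inodes" figure, as a Python sketch. It assumes the "Available" column in the quoted df line is in 1 KiB units and that every file really does consume one 4 KiB block, as observed in the test; the function name is hypothetical.

```python
# "Available" column from the quoted df line (assumed to be 1 KiB units).
AVAILABLE_KIB = 387_274_084

# One 4 KiB block consumed per file, as observed during testing.
KIB_PER_FILE = 4


def max_files(available_kib, kib_per_file):
    """Number of files that fit if each one costs a fixed amount of space."""
    return available_kib // kib_per_file


print(max_files(AVAILABLE_KIB, KIB_PER_FILE))  # 96818521
```

That is roughly 97M files, in line with the ~100M estimate, if every file needs an external block.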

> I tried to force the MDS creation to use a smaller size per inode but
> that produced an error:
>
> mkfs.lustre --fsname gulfwork --mdt --mgs --mkfsoptions='-i 1024 -I 1024' --reformat --failnode=10.18.12.1 /dev/sda
> ...
>   mke2fs: inode_size (1024) * inodes_count (860148736) too big for a
>        filesystem with 215037184 blocks, specify higher inode_ratio
>    (-i) or lower inode count (-N).
> ...

You can't fill the filesystem 100% full of inodes (1 inode per 1024
bytes and each inode is 1024 bytes in size). If you ARE striping
widely you may try -i 1536 -I 1024, but please make sure this is
actually needed, or it will reduce your MDS performance due to 2x larger
inodes.
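
The arithmetic behind the mke2fs error can be sketched in Python as follows. The 4096-byte block size is an assumption (it is consistent with the 215037184 blocks in the error message and the 1720297472 512-byte sectors reported for sda), the function name is hypothetical, and mke2fs rounds inode counts per block group, so real figures will differ slightly.

```python
SECTOR_SIZE = 512          # from the quoted "512-byte hdwr sectors" line
BLOCK_SIZE = 4096          # assumption: 4 KiB filesystem blocks
SECTORS = 1_720_297_472    # from the quoted SCSI device line

device_bytes = SECTORS * SECTOR_SIZE
blocks = device_bytes // BLOCK_SIZE
print(blocks)  # 215037184, the block count in the mke2fs error


def inode_table_fraction(bytes_per_inode, inode_size):
    """Fraction of the device taken by inode tables for given -i and -I."""
    inodes = device_bytes // bytes_per_inode
    return inodes * inode_size / device_bytes


print(inode_table_fraction(1024, 512))   # 0.5: half the device is inodes
print(inode_table_fraction(1024, 1024))  # 1.0: no room for anything else
print(inode_table_fraction(1536, 1024))  # about 2/3 of the device
```

Under these assumptions the sector count and the block count describe the same device in different units (8 sectors per block), and -i 1024 -I 1024 would ask for inode tables as large as the whole device.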

> yet the actual drive has many more blocks available:
>
> SCSI device sda: 1720297472 512-byte hdwr sectors (880792 MB)
>
> Is this ext4 setting the block size limit?
>
>
> FYI, I am using:
>  lustre-1.8.2-2.6.18_164.11.1.el5-ext4_lustre.1.8.2.x86_64.rpm
>  lustre-ldiskfs-3.0.9-2.6.18_164.11.1.el5-ext4_lustre.1.8.2.x86_64.rpm
>  e2fsprogs-1.41.6.sun1-0redhat.rhel5.x86_64.rpm
>
>
>
>
> -- 
> Gary Molenkamp            SHARCNET
> Systems Administrator        University of Western Ontario
> gary at sharcnet.ca        http://www.sharcnet.ca
> (519) 661-2111 x88429        (519) 661-4000
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss


