[Lustre-discuss] MDS inode allocation question

Andreas Dilger andreas.dilger at oracle.com
Sat Apr 24 01:43:11 PDT 2010


On 2010-04-23, at 13:30, Kevin Van Maren wrote:
> Not sure if it was fixed, but there was a bug in Lustre returning the 
> wrong values here.  If you create a bunch of files, the number of inodes reported should go up until you get where you expect it to be.

It depends on what you mean by "wrong values".  The number reported by "df" is the number of new files you are guaranteed to be able to create in the filesystem at that time, in the worst-case scenario.  The returned value is limited both by the number of objects on the OSTs and by the number of blocks (for wide-striped files) on the MDT.  As files are created in the MDT filesystem, the number of files that can still be created (i.e. "IFree") will usually stay constant, because the worst case is not the common case.
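
The worst-case calculation can be sketched roughly as follows (a simplified illustrative model, not Lustre's actual code; all names and figures are for illustration):

```python
def client_ifree(mdt_free_inodes, mdt_free_blocks, ost_free_objects):
    """Worst-case number of files still creatable: each new file needs a
    free MDT inode, MDT blocks for its (possibly wide-striped) layout,
    and at least one free object on some OST.  Illustrative model only."""
    return min(mdt_free_inodes, mdt_free_blocks, sum(ost_free_objects))

# With figures like those later in this thread (~860M free MDT inodes but
# only ~107M free OST objects), the client-visible IFree is OST-limited.
print(client_ifree(860_029_548, 860_029_548, [107_324_154]))  # -> 107324154
```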

Since we wanted "IFree" and "IUsed" to reflect the actual values, the "Inodes" value necessarily had to be variable, because the Unix statfs() interface only supplies "Inodes" and "IFree", not "IUsed".
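
For illustration, here is a minimal sketch of the interface constraint (using Python's os.statvfs wrapper around the same kernel call df uses):

```python
# The POSIX statvfs interface reports only the total and free inode
# counts, so "IUsed" must be derived by subtraction -- df does this itself.
import os

st = os.statvfs("/")
inodes = st.f_files     # "Inodes": total count reported by the filesystem
ifree = st.f_ffree      # "IFree": files guaranteed still creatable
iused = inodes - ifree  # "IUsed": never reported directly
print(f"Inodes={inodes} IUsed={iused} IFree={ifree}")
```

Since IUsed is always derived as Inodes - IFree, the only way to keep both IUsed and a worst-case IFree honest is to let the reported Inodes total vary.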

> Note that the number of inodes on the OSTs also limits the number of 
> creatable files:
> each file requires an inode on at least one OST (the number depends on 
> how many OSTs the file is striped across).

Right.  If you don't have enough OST objects, then you will never be able to hit this limit.  However, it is relatively easy to add more OSTs if you ever get close to running out of objects.  Most people run out of space first, but then adding more OSTs for space also gives you proportionately more objects, so the available objects are rarely the issue.
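
As a sanity check on the server-side figure in the original post (illustrative arithmetic only): mke2fs's "-i" option sets bytes-per-inode, so formatting the roughly 880G logical drive with -i 1024 yields on the order of the 860M inodes that "df -i" showed on the MDS, while the smaller client-side number reflects the OST-object limit discussed above.

```python
# Illustrative arithmetic using figures from this thread: mke2fs -i sets
# bytes-per-inode, so the MDT inode count is roughly device_size / 1024.
bytes_per_inode = 1024
device_bytes = 880 * 10**9               # the ~880G logical drive
mdt_inodes = device_bytes // bytes_per_inode
print(mdt_inodes)                        # ~859M, close to the 860160000 from 'df -i'
```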

> Gary Molenkamp wrote:
>> When creating the MDS filesystem, I used  '-i 1024' on a 860GB logical
>> drive to provide approx 800M inodes in the lustre filesystem.  This was
>> then verified with 'df -i' on the server:
>> 
>>  /dev/sda    860160000  130452 860029548    1% /data/mds
>> 
>> Later, after completing the OST creation and mounting the full
>> filesystem on a client, I noticed that 'df -i' on the client mount is
>> only showing 108M inodes in the lfs:
>> 
>> 10.18.12.1@tcp:10.18.12.2@tcp:/gulfwork
>>                     107454606  130452 107324154    1% /gulfwork
>> 
>> A check with 'lfs df -i' shows the MDT only has 108M inodes:
>> 
>> gulfwork-MDT0000_UUID 107454606    130452 107324154    0%
>> 						/gulfwork[MDT:0]
>> 
>> Is there a preallocation mechanism in play here, or did I miss something
>> critical in the initial setup?  My concern is that modifications to the
>> inodes are not reconfigurable, so it must be correct before the
>> filesystem goes into production.
>> 
>> FYI,  the filesystem was created with:
>> 
>> MDS/MGS on 880G logical drive:
>> mkfs.lustre --fsname gulfwork --mdt --mgs --mkfsoptions='-i 1024'
>> 	--failnode=10.18.12.1 /dev/sda
>> 
>> OSSs on 9.1TB logical drives:
>> /usr/sbin/mkfs.lustre --fsname gulfwork --ost --mgsnode=10.18.12.2@tcp
>> 	--mgsnode=10.18.12.1@tcp /dev/cciss/c0d0
>> 
>> Thanks.
>> 
>> 
> 
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss


Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.




More information about the lustre-discuss mailing list