[Lustre-discuss] Size of MDT, used space

Andreas Dilger adilger at sun.com
Wed May 14 12:19:02 PDT 2008


On May 14, 2008  09:41 +0200, Thomas Roth wrote:
> Andreas Dilger wrote:
>> On May 13, 2008  19:41 +0200, Thomas Roth wrote:
>>> I'm still in trouble with numbers: the available, used and necessary 
>>> space on my MDT:
>>> According to "lfs df", I have now filled my file system with 115.3 TB.
>>> All of these files are sized 5 MB. That should be roughly 24 million files.
>>> For the MDT, "lfs df" reports 28.2 GB used.
>>>
>>> Now I believed that creating a file on Lustre means using one inode on 
>>> the MDT. Since all of my Lustre partitions were formatted with the 
>>> default options (all of this is running Lustre v. 1.6.4.3, btw), an inode 
>>> should eat up 4kB on the MDT partition. Of course, 24 million files times 
>>> 4 kB gives you 91 GB rather than 28GB.
>>> Obviously, there is something I missed completely. Perhaps somebody could 
>>> enlighten me here?
>>>
>>> This issue could also be phrased as "How large should my MDT be to 
>>> accommodate n TB of storage space?" The manual's answer boils down to 
>>> "number of files * 4 kB" (times 2, per the recommendation). That's how I 
>>> calculated above - maybe my test system is broken? I can't check on the 
>>> content of these files, it's just 5MB test files created with the 
>>> 'stress' utility.
>>
>> Please provide output of "lfs df" and "lfs df -i" so we can see the
>> actual numbers.
>
> Ok, I'm attaching these two files with the output. You'll notice some OSTs 
> with  a lower fill level: these OSS had some hardware problems during the 
> production of my (now counted) 22.7 million files.
> Greetings,
> Thomas
>

> # lfs df
> UUID                 1K-blocks      Used Available  Use% Mounted on
> gsilust1-MDT0000_UUID 495497804  29575416 465922388    5% /lustre[MDT:0]
> filesystem summary:  137454163496 123851912300 13602251196   90% /lustre

> # lfs df -i
> UUID                    Inodes     IUsed     IFree IUse% Mounted on
> gsilust1-MDT0000_UUID 141590528  22816486 118774042   16% /lustre[MDT:0]
> filesystem summary:  141590528  22816486 118774042   16% /lustre

I finally understand your question now.  To clarify, with ext3 (ldiskfs)
an inode is preallocated, and is counted as part of the filesystem
"overhead", and not in the "used" space.  Creating and deleting files
in ext3 doesn't "consume" any space for the inode, only for the directory
entries.  The 4kB/inode guideline is to ensure that there is enough space
in the MDS for the "overhead" parts of the filesystem.

The default is 512-byte inodes on the MDT, so this gives an overhead of
141590528 * 512 = 67GB, which is not in the "Used" space.  There is also
directory overhead of roughly (12 + filename_length) * num_files * 2 bytes
(the factor of 2 accounts for directory block slack).  I'll guess 16-byte
filenames, so this gives 141590528 * (12 + 16) * 2 = 7.5GB at a minimum.
There is additionally
MDS log file overhead in order to keep the distributed filesystem sane
in the face of a crash, and the ext3 journal (400MB).
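The arithmetic above can be sketched as follows (a back-of-envelope script, not
anything Lustre provides; the 512-byte inode size and 16-byte average filename
are the assumptions stated above):

```python
# Rough MDT space accounting for the numbers quoted above.
total_inodes = 141590528        # "Inodes" column from "lfs df -i"

# Preallocated inode table: 512 bytes per inode on the MDT.
# This space is counted as filesystem overhead, not as "Used".
inode_overhead = total_inodes * 512
print(f"inode table: {inode_overhead / 2**30:.1f} GiB")      # ~67.5 GiB

# Directory entries: roughly (12 + filename_length) bytes each,
# doubled to allow for directory block slack.
avg_name_len = 16               # assumed average filename length
dir_overhead = total_inodes * (12 + avg_name_len) * 2
print(f"directory entries: {dir_overhead / 2**30:.1f} GiB")  # ~7.4 GiB
```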

If you have longer filenames, or small directories, or are striping
files over many OSTs, then using 29GB on the MDT doesn't seem outrageous.
The MDT is using 5% of its space and 16% of its inodes, so there isn't
much to worry about regarding space consumption.  In your case, you are
using 1327 bytes per inode on average.
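That per-inode figure follows directly from the "lfs df" output quoted above
(a quick sanity check, nothing more):

```python
# Average MDT space consumed per allocated inode.
used_kb = 29575416       # MDT "Used" from "lfs df", in 1K blocks
inodes_used = 22816486   # MDT "IUsed" from "lfs df -i"

bytes_per_inode = used_kb * 1024 // inodes_used
print(bytes_per_inode)   # 1327
```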

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
