[lustre-discuss] [EXTERNAL] inodes distribution

Ihsan Ur Rahman ihsanurr67 at gmail.com
Thu Nov 13 03:02:48 PST 2025


Thank you, Mr. Dilger and Mr. Mohr.

Your explanations helped me a lot. Now I can breathe easy knowing that
everything is fine with the file system.

Regards,

Ihsan

On Wed, Nov 12, 2025 at 2:02 AM Andreas Dilger <adilger at dilger.ca> wrote:

> For ZFS OSTs/MDTs, they will not "run out of inodes" before they run out
> of usable space.  In other words, for ZFS the used inode percentage == used
> space percentage, and both will hit 100% when the OST is full.
>
> That is fine (and expected).  It may be that the MDT has free space (==
> free inodes), but this is just a small fraction of the total space in the
> filesystem (a few percent at most).  It is typical to over-provision the
> MDTs a bit, to allow adding more OSTs to the filesystem easily, and to
> store log files (e.g. changelogs).
>
> Having the "lfs df" information as well as "lfs df -i" would be helpful,
> but I don't think there is anything actually wrong here.  As Rick wrote,
> the "free inodes" is just an estimate for ZFS ("free space / average object
> size"), so they will not "run out".
>
> Cheers, Andreas
>
> On Nov 11, 2025, at 3:53 PM, Mohr, Rick via lustre-discuss <
> lustre-discuss at lists.lustre.org> wrote:
> >
> > Ihsan,
> >
> > Lustre doesn't allocate inodes in the same way that you're probably used
> to thinking about, say in the context of an ext4 filesystem.  The inode
> usage you see for each mdt/ost is just the inode usage of the underlying
> filesystem (ldiskfs or zfs).  Lustre itself doesn't have a list of inodes
> that it gives out.  Instead, Lustre identifies a file using a 128-bit FID
> (File Identifier) that is unique for each file in the filesystem.  A new
> FID is allocated when a file is created, and FIDs are not reused.  The
> number of files that Lustre can hold will be limited by capacity and
> numbers of inodes on the individual mdts/osts, but the total number of FIDs
> will be much larger than that (so that Lustre won't run out of FIDs before
> running out of resources on the backend).  Commands like 'ls -i' will list
> an inode number for a Lustre file, but it isn't actually an inode.  Lustre
> just has a way to convert the 128-bit FID into a 64-bit number that
> it displays as the inode number.
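> >
> > As a quick illustration (the file path below is just a placeholder), you
> > can compare a file's FID with the number that ls reports:
> >
> >    lfs path2fid /mnt/lust-das/some/file   # prints the FID, e.g. [0x200000401:0x1:0x0]
> >    ls -i /mnt/lust-das/some/file          # prints the 64-bit value mapped from that FID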
> >
> > Using files with stripe count of 1 (which you are doing) will help to
> conserve ost inodes.  But since you are using zfs for the ost backend, you
> should keep in mind that those inode numbers are just estimates.  ZFS
> doesn't have a fixed number of inodes like ldiskfs does.  So it's possible
> you could have more files than what might be indicated by the ost inode
> usage.  I am not a zfs expert, so I am not sure how that inode estimate is
> calculated or how accurate it might be.
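> >
> > If you want to confirm what a particular file or directory is doing (paths
> > below are placeholders), the stripe settings are easy to query and set:
> >
> >    lfs getstripe -c /mnt/lust-das/some/file   # stripe count of an existing file
> >    lfs setstripe -c 1 /mnt/lust-das/some/dir  # default stripe count for new files in a dir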
> >
> > --Rick
> >
> >
> > On 11/11/25, 2:48 AM, "Ihsan Ur Rahman" <ihsanurr67 at gmail.com> wrote:
> >
> > Thank you Rick for the detailed explanation.
> >
> > For the OSTs we are using ZFS as the backend file system. If the MDT is
> responsible for handing out inodes, then why are OST inodes being consumed?
> If the OSTs are also using inodes to store data, I am afraid that sooner or
> later we will run out of inodes on the OSTs.
> > We are using a stripe count of 1.
> >
> > Regards,
> >
> > Ihsan
> >
> >
> >
> > On Tue, Nov 11, 2025 at 1:41 AM Mohr, Rick <mohrrf at ornl.gov> wrote:
> >
> >
> > Ihsan,
> >
> >
> > Roughly speaking, every file/dir in lustre will consume one inode on the
> mdt that hosts it, and each file will also consume one inode on each ost
> that has a stripe allocated to that file. The exact inode usage can get
> more complicated with advanced features like DNE, PFL, etc., but that is a
> simple estimate of how inodes are used.
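> >
> > One way to see that mapping for a real file (path is illustrative) is to
> > look at its layout, which lists the OST object behind each stripe:
> >
> >    lfs getstripe /mnt/lust-das/some/file
> >    # lmm_stripe_count is how many OST objects (and OST inodes) the file
> >    # uses, and the obdidx column shows which OSTs hold those objects.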
> >
> >
> > Now, how the inode usage is presented is a bit tricky. In your case, the
> mdts have 156.7M and 127.7M inodes used for a combined total of 284.4M
> inodes. Since inode usage for filesystems is usually an indication of how
> many files/dirs exist on the filesystem, the sum of the mdt inode usage is
> reported as the overall filesystem inode usage. (Because even though a file
> with stripe_count=4 might consume 1 inode on an mdt and 4 inodes on 4
> different osts, it still only counts as 1 file. So it only adds 1 to the
> total inode usage and not 5.)
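> >
> > A quick way to check that on your system (assuming the usual column layout
> > of lfs df -i, where IUsed is the third field) is to sum the raw MDT counts:
> >
> >    lfs df -i /mnt/lust-das | awk '/MDT/ {used += $3} END {print used}'
> >
> > That total should line up with the IUsed value in filesystem_summary.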
> >
> >
> > Free inodes are calculated differently. In the simplest case, a file
> with stripe_count=1 would consume 1 mdt inode and 1 ost inode. Since your
> filesystem has a lot more mdt inodes than ost inodes, lustre assumes that
> the number of ost inodes is the limiting factor, so it uses the sum of all
> the free ost inodes as the total number of free inodes remaining. If you
> add up 17.9M+16.6M+..., you will get 173.6M which basically matches the
> number of free inodes. (The total number of filesystem inodes is then the
> sum of the used inodes and free inodes.) Of course, the calculation of
> free inodes can be off depending on the circumstances. If you use DoM, then
> it is possible to have a small file that consumes an inode on an mdt but
> doesn't consume any ost inodes, which means your filesystem could
> accommodate more files than the 173.7M indicated by the number of free
> inodes. On the other hand, if you created files with
> stripe_count=10, you would only be able to create about 16.4M files. Since
> the total inode usage on your mdts is 284.4M, but the total inode usage on
> all your osts is around 125M, I'm guessing maybe you are using DoM for a
> bunch of small files.
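> >
> > The same kind of one-liner verifies the free-inode sum (again assuming
> > IFree is the fourth field of lfs df -i):
> >
> >    lfs df -i /mnt/lust-das | awk '/OST/ {free += $4} END {print free}'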
> >
> >
> > The above explanation assumes you are using ldiskfs for the backend,
> which formats the mdts and osts with a fixed number of inodes. If you are
> using zfs for the backend, then I think the inode values for each mdt/ost
> are merely estimates anyway since zfs doesn't have fixed inodes like
> ldiskfs does.
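> >
> > If you have shell access on the servers, the underlying datasets can also
> > be checked directly (the dataset name below is hypothetical); the inode
> > estimate Lustre shows is derived from the space ZFS reports as available:
> >
> >    zfs get used,available ostpool/ost0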
> >
> >
> > Hope that helps.
> >
> >
> > --Rick
> >
> >
> >
> >
> > On 11/10/25, 6:30 AM, "lustre-discuss on behalf of Ihsan Ur Rahman via
> lustre-discuss" <lustre-discuss-bounces at lists.lustre.org> wrote:
> >
> >
> > Hello lustre folks,
> >
> >
> > In the Lustre file system, who is responsible for allocating inodes? As per
> my understanding, it is the MDS/MGS that allocates inodes.
> > Below is the output of the inode distribution in our lustre file
> system. Is this correct? The osts also appear to be allocating inodes, and
> most of them have used more than 40%.
> >
> >
> > lfs df -ih /mnt/lust-das
> > UUID                    Inodes   IUsed    IFree  IUse%  Mounted on
> > lust-das-MDT0000_UUID   745.2M  156.7M   588.5M    22%  /mnt/lust-das[MDT:0]
> > lust-das-MDT0001_UUID   745.2M  127.7M   617.6M    18%  /mnt/lust-das[MDT:1]
> > lust-das-OST0000_UUID    30.4M   12.5M    17.9M    42%  /mnt/lust-das[OST:0]
> > lust-das-OST0001_UUID    29.1M   12.5M    16.6M    43%  /mnt/lust-das[OST:1]
> > lust-das-OST0002_UUID    30.2M   12.5M    17.7M    42%  /mnt/lust-das[OST:2]
> > lust-das-OST0003_UUID    30.7M   12.5M    18.2M    41%  /mnt/lust-das[OST:3]
> > lust-das-OST0004_UUID    29.7M   12.5M    17.3M    42%  /mnt/lust-das[OST:4]
> > lust-das-OST0005_UUID    29.8M   12.5M    17.3M    42%  /mnt/lust-das[OST:5]
> > lust-das-OST0006_UUID    29.9M   12.5M    17.4M    42%  /mnt/lust-das[OST:6]
> > lust-das-OST0007_UUID    29.8M   12.5M    17.3M    42%  /mnt/lust-das[OST:7]
> > lust-das-OST0008_UUID    28.8M   12.5M    16.4M    44%  /mnt/lust-das[OST:8]
> > lust-das-OST0009_UUID    30.0M   12.5M    17.5M    42%  /mnt/lust-das[OST:9]
> >
> > filesystem_summary:     458.1M  284.4M   173.7M    63%  /mnt/lust-das
> >
> >
> > Regards,
> > Ihsan
> >