[lustre-discuss] Data stored in OST [EXT]

Peter Grandi pg at lustre.list.sabi.co.UK
Thu Jun 15 13:44:34 PDT 2023


>>> What would be the problem with large 'datacenter' type HDD's
>>> for an OST (in raid10 for instance)?

> Very, very low IOPS-per-TB, leading to terrifyingly low speed
> under combined user and maintenance load. [...]

> Our last DDN system has OST's using 14TB disks.

That's quite popular. If single-digit MB/s transfer rates per-HDD
for HPC clusters are the goal, that's ideal :-). Plus those OSTs
from DDN probably use (their slightly better version of) RAID6,
which "complicates" matters.

My guess as to why systems with very low IOPS-per-TB are popular
is that what matters most is IOPS-per-TB *actually used*. During
the initial usage period the HDDs hold less than 1-2TB each, the
data is mostly in the outer cylinders (a kind of spontaneous
"short stroking") and mostly unfragmented, and maintenance
operations like checking, scrubbing, migration and backup are
endlessly procrastinated. So the storage layer seems to perform
well and to be so cheap, making the purchaser look like a genius.
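
To put rough numbers on "IOPS-per-TB actually used", here is a minimal
sketch. The ~100 random IOPS per HDD is an assumed rule-of-thumb figure,
not a measurement from this thread, and `iops_per_tb` is just an
illustrative helper name; only the 14TB disk size comes from the
discussion above.

```python
# Rough sketch: random IOPS available per TB of data on one HDD.
# ASSUMPTION: ~100 random IOPS per HDD is a rule-of-thumb value chosen
# for illustration; it is not a figure from this thread.

def iops_per_tb(disk_iops: float, terabytes: float) -> float:
    """Return the random IOPS available per TB of data held."""
    if terabytes <= 0:
        raise ValueError("terabytes must be positive")
    return disk_iops / terabytes

DISK_IOPS = 100.0  # assumed per-HDD random IOPS
DISK_TB = 14.0     # disk size mentioned in the thread

if __name__ == "__main__":
    # The same disk looks very different depending on how full it is.
    for used_tb in (1.0, 2.0, 7.0, DISK_TB):
        ratio = iops_per_tb(DISK_IOPS, used_tb)
        print(f"{used_tb:5.1f} TB used: {ratio:6.1f} IOPS per TB used")
```

This ignores the extra benefit of short stroking and low fragmentation
on a nearly empty disk, so the real gap between "new" and "full" is
probably wider than this simple division suggests.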

Then the HDDs fill up, data reaches the inner cylinders, and the
shrinking free space becomes heavily fragmented. Latency goes way
up (I have seen Lustre systems with IO latencies of several
*seconds*), user-visible transfer rates go way down (sometimes
below 1MB/s per HDD), and high-IOPS maintenance operations can no
longer be put off. That's usually when I get hired. :-(
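
As a back-of-the-envelope check on how latency turns into transfer
rate, here is a minimal sketch. It assumes a single outstanding request
per HDD at a time and a 1MB request size; both are illustrative
assumptions, and `transfer_rate_mb_s` is a hypothetical helper, not
anything from Lustre.

```python
# Rough sketch: per-HDD transfer rate when requests are served one at a time.
# ASSUMPTIONS: one outstanding request per HDD, and the request size and
# latencies below are illustrative values, not measurements.

def transfer_rate_mb_s(request_mb: float, latency_s: float) -> float:
    """Return MB/s achieved if each request of request_mb takes latency_s."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return request_mb / latency_s

REQUEST_MB = 1.0  # assumed request size

if __name__ == "__main__":
    # From a healthy 10ms up to the multi-second latencies described above.
    for latency_s in (0.01, 0.1, 1.0, 2.0):
        rate = transfer_rate_mb_s(REQUEST_MB, latency_s)
        print(f"latency {latency_s:5.2f} s -> {rate:7.2f} MB/s per HDD")
```

Under these assumptions, a latency of a second or more per request is
enough on its own to bring a HDD down to around 1MB/s or less.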

