[lustre-discuss] OSTs per OSS question

Kurt Strosahl strosahl at jlab.org
Mon Feb 27 07:02:28 PST 2023


Good Morning,

   I'm in the early phases of designing a new Lustre file system.  One thing we are observing on our Lustre file system from 2019 (running 2.12.9-1) is that the OSS systems report a lot of time spent in I/O wait.  We are configured with two OSS systems connected via SAS cables to a pair of JBODs, and are wondering whether it is reasonable to go from two JBODs per OSS (so 6 OSTs per OSS under normal operating conditions, 12 under failover) to 4 shelves (12 OSTs per OSS under normal operating conditions, 24 under failover).  The OSTs are 10 raidz2 vdevs, and we are planning on using 20TB drives in this new file system.

Has anyone tried the 4-shelf / 2-OSS configuration?

Reading through the Lustre manual, I see the following under table 1.2:

"OSS: 1-128 TiB per OST, 1-8 OSTs per OSS"

Is that an indication that more than 8 OSTs per OSS causes problems for the OSS systems?  Our current OSS systems have run at 12 OSTs during failover situations, once for at least a few days due to a hardware failure on one of the OSS systems.
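
For reference, the OST counts in question can be checked with a short calculation. This is only a minimal sketch: the figure of 6 OSTs per shelf is inferred from the numbers above (12 OSTs across the two shared JBODs), the function and constant names are made up for illustration, and the value 8 is simply the upper end of the range quoted from the manual, which may or may not be a hard limit.

```python
# Assumption: 6 OSTs per shelf, inferred from 12 OSTs across 2 shared JBODs.
OSTS_PER_SHELF = 6
# The two OSS systems form a failover pair sharing all shelves.
OSS_PER_PAIR = 2
# Upper end of the "1-8 OSTs per OSS" range quoted from table 1.2.
MANUAL_UPPER_OSTS_PER_OSS = 8


def osts_per_oss(shelves, osts_per_shelf=OSTS_PER_SHELF, oss_count=OSS_PER_PAIR):
    """Return (normal, failover) OST counts served by a single OSS."""
    total = shelves * osts_per_shelf
    normal = total // oss_count   # OSTs split evenly across the pair
    failover = total              # one OSS serves every OST in the pair
    return normal, failover


for shelves in (2, 4):
    normal, failover = osts_per_oss(shelves)
    over_normal = normal > MANUAL_UPPER_OSTS_PER_OSS
    over_failover = failover > MANUAL_UPPER_OSTS_PER_OSS
    print(f"{shelves} shelves: {normal} OSTs/OSS normal "
          f"(above manual range: {over_normal}), "
          f"{failover} under failover (above manual range: {over_failover})")
```

Under these assumptions, the 4-shelf layout sits above the quoted range even in normal operation, while the current 2-shelf layout only exceeds it during failover.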

respectfully,

Kurt J. Strosahl (he/him)
System Administrator: Lustre, HPC
Scientific Computing Group, Thomas Jefferson National Accelerator Facility