[lustre-discuss] The confusion for mds hardware requirement

Andreas Dilger adilger at whamcloud.com
Sun Mar 10 14:47:28 PDT 2024


These numbers are only estimates; use values better suited to your workload.

Similarly, 32-core clients may be on the low side these days.  NVIDIA DGX nodes have 256 cores, though you may not have 1024 of them.

The net answer is that 64GB or more of RAM is inexpensive these days and improves MDS performance, especially compared with the cost of client nodes that would sit idle waiting for filesystem access if the MDS is short of RAM.  It is better to have too much RAM on the MDS than too little.

Cheers, Andreas

On Mar 4, 2024, at 00:56, Amin Brick Mover via lustre-discuss <lustre-discuss at lists.lustre.org> wrote:

Section 5.5.2.1 of the Lustre Manual gives the following example:
For example, for a single MDT on an MDS with 1,024 compute nodes, 12 interactive login nodes, and a
20 million file working set (of which 9 million files are cached on the clients at one time):
Operating system overhead = 4096 MB (RHEL8)
File system journal = 4096 MB
1024 * 32-core clients * 256 files/core * 2KB = 16384 MB
12 interactive clients * 100,000 files * 2KB = 2400 MB
20 million file working set * 1.5KB/file = 30720 MB
I'm curious how the two numbers, 256 files/core and 100,000 files, were determined. What is the reasoning behind them?
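
Since these figures are estimates meant to be adjusted, the manual's arithmetic quoted above can be written as a small parameterised calculation. This is only a sketch: the function name and parameter names are made up for illustration, the defaults are the values from the quoted example, and it assumes 1 MB = 1024 KB. Under that assumption the compute-client term matches the manual's 16384 MB exactly, while the interactive-client and working-set terms come out slightly lower than the manual's 2400 MB and 30720 MB, which appear to be rounded differently.

```python
KB_PER_MB = 1024  # assumption: binary megabytes


def mds_ram_estimate_mb(
    compute_nodes=1024,
    cores_per_node=32,
    files_per_core=256,
    interactive_nodes=12,
    files_per_interactive_node=100_000,
    kb_per_client_file=2,
    working_set_files=20_000_000,
    kb_per_working_set_file=1.5,
    os_overhead_mb=4096,
    journal_mb=4096,
):
    """Return a per-term breakdown (in MB) of the MDS RAM estimate.

    Hypothetical helper: names are illustrative, defaults come from the
    example quoted from the Lustre Manual.
    """
    terms = {
        "os_overhead": os_overhead_mb,
        "journal": journal_mb,
        # files cached by compute clients, at ~2KB of MDS RAM each
        "compute_clients": compute_nodes * cores_per_node * files_per_core
        * kb_per_client_file / KB_PER_MB,
        # files cached by interactive login nodes
        "interactive_clients": interactive_nodes * files_per_interactive_node
        * kb_per_client_file / KB_PER_MB,
        # working set held in MDS cache, at ~1.5KB per file
        "working_set": working_set_files * kb_per_working_set_file / KB_PER_MB,
    }
    terms["total"] = sum(terms.values())
    return terms


if __name__ == "__main__":
    for cores in (32, 256):
        estimate = mds_ram_estimate_mb(cores_per_node=cores)
        print(f"{cores}-core clients:")
        for name, mb in estimate.items():
            print(f"  {name:20s} {mb:12.1f} MB")
```

Changing cores_per_node from 32 to 256 raises the compute-client term from 16384 MB to 131072 MB, which shows how strongly the result depends on the per-client assumptions.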


--
Andreas Dilger
Lustre Principal Architect
Whamcloud






