<div dir="ltr">I'm certainly not Andreas. That said...<div><br></div><div>You're running an MPI simulation, presumably across most or all of your 34 compute nodes. Lustre server operations, their LNet activity, and the backend storage I/O will create a profound imbalance on the few compute nodes you designate to do both server and client work. You also expose yourself to the deadlocks and other risks mentioned earlier. I do not know how performant your login server is, but depending on the file operations of your simulations you could overwhelm it. Also, you generally don't want users logging in on a node as critical as an MDS. </div><div><br></div><div>You would be better served by dedicating two of your compute nodes as Lustre servers, one MDS/OSS and the other an OSS, and running 32 clean client nodes. That would be more stable and cleaner, and in the end probably more productive over time, with fewer technical incidents. </div><div><br></div><div>Just my opinion...others may differ. </div><div><br></div><div>--Jeff</div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Oct 13, 2023 at 12:43 PM Fedele Stabile <<a href="mailto:fedele.stabile@fis.unical.it">fedele.stabile@fis.unical.it</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div>I believe Linux makes it possible to limit the memory and the CPU used by a user, so I can limit resources for a user group. Also, if I put the OSS server in a VM, I suppose I can limit its CPU and memory usage. </div>
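<div>A minimal sketch of the kind of limit described above, assuming the node uses cgroup v2 mounted at /sys/fs/cgroup. The files memory.max and cpu.max are the standard cgroup v2 interface files; the group name passed to apply_limits and the 64 GiB / 8 core figures in the usage note are hypothetical and do not come from this thread.</div>

```python
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # usual cgroup v2 mount point (assumption)


def memory_max_value(gib):
    """Return the byte count to write into memory.max for a limit in GiB."""
    return str(int(gib * 1024**3))


def cpu_max_value(cores, period_us=100_000):
    """Return the 'quota period' string for cpu.max, both in microseconds."""
    quota_us = int(cores * period_us)
    return f"{quota_us} {period_us}"


def apply_limits(group, gib, cores, root=CGROUP_ROOT):
    """Write both limits into an existing cgroup directory (needs root)."""
    cgroup_dir = Path(root) / group
    (cgroup_dir / "memory.max").write_text(memory_max_value(gib))
    (cgroup_dir / "cpu.max").write_text(cpu_max_value(cores))
    return cgroup_dir
```

<div>For example, apply_limits("oss_limits", 64, 8) would cap a hypothetical group named oss_limits at 64 GiB of RAM and 8 cores' worth of CPU time. The cgroup directory must already exist, and the memory and cpu controllers must be enabled for it in the parent's cgroup.subtree_control.</div>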
<div dir="auto">My scenario: I have 34 compute nodes with 512 GB RAM and 34 hard drives of 16 TB each that I can arrange in 9 nodes. I also have a management node that can be used as the Lustre metadata server. The InfiniBand network is 200 Gb/s.</div>
<div dir="auto">We run MHD simulations.</div>
<div dir="auto">What Lustre configuration do you suggest?</div>
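<div>As a rough aid for sizing, the arithmetic behind the numbers above can be sketched as follows. It only uses the figures from this message (34 drives, 16 TB each, 9 nodes) and assumes the drives are spread as evenly as possible; the result is raw capacity, before any RAID or ZFS redundancy and filesystem overhead.</div>

```python
def spread_drives(total_drives, nodes):
    """Spread drives as evenly as possible; returns the drive count per node."""
    base, extra = divmod(total_drives, nodes)
    return [base + 1] * extra + [base] * (nodes - extra)


DRIVES = 34      # from the message above
DRIVE_TB = 16    # from the message above
OSS_NODES = 9    # from the message above

layout = spread_drives(DRIVES, OSS_NODES)
raw_tb = DRIVES * DRIVE_TB

print("drives per node:", layout)
print("raw capacity (TB):", raw_tb)
print("largest per-node raw share (TB):", max(layout) * DRIVE_TB)
```

<div>With these inputs, seven nodes hold four drives and two nodes hold three, for 544 TB raw in total.</div>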
<div dir="auto" id="m_6634747231595961948ms-outlook-mobile-signature"></div>
<div id="m_6634747231595961948mail-editor-reference-message-container" dir="auto"><br>
<hr style="display:inline-block;width:98%">
<div id="m_6634747231595961948divRplyFwdMsg" style="font-size:11pt"><strong>From:</strong> Andreas Dilger <<a href="mailto:adilger@whamcloud.com" target="_blank">adilger@whamcloud.com</a>><br>
<strong>Sent:</strong> Friday, October 13, 2023 7:19:11 PM<br>
<strong>To:</strong> Fedele Stabile <<a href="mailto:fedele.stabile@fis.unical.it" target="_blank">fedele.stabile@fis.unical.it</a>><br>
<strong>Cc:</strong> <a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a> <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>><br>
<strong>Subject:</strong> Re: [lustre-discuss] OSS on compute node<br>
</div>
<br>
<div>
<div>On Oct 13, 2023, at 20:58, Fedele Stabile <<a href="mailto:fedele.stabile@fis.unical.it" target="_blank">fedele.stabile@fis.unical.it</a>> wrote:</div>
<blockquote type="cite"><br>
<div>
<div>Hello everyone,<br>
We are in the process of integrating Lustre into our small HPC cluster, and we would like to know whether the same node can act as an OSS with disks and also serve as a compute node with a Lustre client installed.<br>
I know that the OSS server requires a modified kernel, so I suppose it can be installed in a virtual machine using KVM on a compute node.<br>
</div>
</div>
</blockquote>
<br>
</div>
<div>There isn't really a problem with running a client + OSS on the same node anymore, nor is there a problem with an OSS running inside a VM (if you have SR-IOV and enough CPU+RAM to run the server).</div>
<div><br>
</div>
<div>*HOWEVER*, I don't think it would be good to have the client mounted on the *VM host*, and then run the OSS on a *VM guest*. That could lead to deadlocks and priority inversion if the client becomes busy, but depends on the local OSS to flush dirty data
from RAM and the OSS cannot run in the VM because it doesn't have any RAM...</div>
<div><br>
</div>
<div>If the client and OSS are BOTH run in VMs, or neither run in VMs, or only the client run in a VM, then that should be OK, but may have reduced performance due to the server contending with the client application.</div>
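<div>The placement advice above can be summarized as a small decision table. This is only a sketch of the rule as stated in this message, for a client and an OSS sharing one physical node; the function name and the result strings are illustrative.</div>

```python
def placement_risk(client_in_vm, oss_in_vm):
    """Classify a client/OSS placement on one physical node, per the advice above."""
    if oss_in_vm and not client_in_vm:
        # Client mounted on the VM host while the OSS runs in a VM guest.
        return "avoid: possible deadlock or priority inversion"
    # Both in VMs, neither in a VM, or only the client in a VM.
    return "ok: expect some contention between server and client"


for client_in_vm in (False, True):
    for oss_in_vm in (False, True):
        print(client_in_vm, oss_in_vm, placement_risk(client_in_vm, oss_in_vm))
```

<div>Only the combination of a host-mounted client with a guest OSS is flagged; the other three are acceptable but may still see reduced performance.</div>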
<br>
<div>
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div dir="auto" style="letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none;color:rgb(0,0,0)">
<div>Cheers, Andreas</div>
<div>--</div>
<div>Andreas Dilger</div>
<div>Lustre Principal Architect</div>
<div>Whamcloud</div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
<br>
<br>
</div>
<br>
<br>
</div>
</div>
_______________________________________________<br>
lustre-discuss mailing list<br>
<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a><br>
<a href="http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org" rel="noreferrer" target="_blank">http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org</a><br>
</blockquote></div><br clear="all"><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr">------------------------------<br>Jeff Johnson<br>Co-Founder<br>Aeon Computing<br><br><a href="mailto:jeff.johnson@aeoncomputing.com" target="_blank">jeff.johnson@aeoncomputing.com</a><br><a href="http://www.aeoncomputing.com" target="_blank">www.aeoncomputing.com</a><br>t: 858-412-3810 x1001 f: 858-412-3845<br>m: 619-204-9061<br><br>4170 Morena Boulevard, Suite C - San Diego, CA 92117<div><br></div><div>High-Performance Computing / Lustre Filesystems / Scale-out Storage</div></div></div></div></div>