<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
</head><body style="">
<div>
Hi,
</div>
<div>
<p>I have been benchmarking Lustre with IOR on a 4-node cluster and have run into an issue where the observed write bandwidth is significantly lower than the read bandwidth. Below are the setup details for the cluster:</p>
<ol>
<li>1 MGS/MDS node with:
<ol>
<li>Linux Kernel 4.18.0-513.9.1.el8_lustre.x86_64</li>
<li>800 GB NVMe disk formatted as ldiskfs</li>
<li>Lustre server v2.15.4</li>
</ol></li>
<li>2 OSS nodes, each hosting 1 OST, with:
<ol>
<li>Linux Kernel 4.18.0-513.9.1.el8_lustre.x86_64</li>
<li>800 GB NVMe disk formatted as ldiskfs</li>
<li>Lustre server v2.15.4</li>
</ol></li>
<li>1 Lustre client with:
<ol>
<li>Lustre v2.15.6</li>
<li>Linux Kernel 5.14.0-503.11.1.el9_5.x86_64</li>
</ol></li>
<li>Default stripe settings are used:
<ol>
<li>stripe_count: 1 stripe_size: 1048576 pattern: 0 stripe_offset: -1</li>
</ol></li>
<li>Interconnected using a 56 Gbps Mellanox InfiniBand network</li>
<li>Contents of the /etc/modprobe.d/lustre.conf file:</li>
</ol>
<pre>
options lnet networks="o2ib(ib0)"
options lnet lnet_transaction_timeout=100
options lnet lnet_retry_count=2
options ko2iblnd peer_credits=32
options ko2iblnd peer_credits_hiw=16
options ko2iblnd concurrent_sends=256
options ksocklnd conns_per_peer=0
options ost oss_num_threads=64
</pre>
<p> </p>
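<p>For reference, whether these settings actually took effect can be cross-checked on the nodes with standard commands along the following lines (a generic sketch, not the exact invocations used here):</p>
<pre>
# Show the configured LNet network and o2ib settings (any node)
lnetctl net show -v

# Confirm the loaded ko2iblnd module parameters
cat /sys/module/ko2iblnd/parameters/peer_credits
cat /sys/module/ko2iblnd/parameters/peer_credits_hiw
cat /sys/module/ko2iblnd/parameters/concurrent_sends

# Confirm the OSS I/O thread limit (on the OSS nodes)
lctl get_param ost.OSS.ost_io.threads_max
</pre>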
<p>I conducted individual tests on the OST nodes using obdfilter-survey. For reference, the full summary output of the test is attached.</p>
<ul>
<li> nobjlo=1 nobjhi=512 thrlo=1 thrhi=1024 size=480000 rslt_loc=/var/tmp/obdfilter-survey_out targets="lustrefs-OST0001" case=disk obdfilter-survey</li>
</ul>
<pre>
ost 1 sz 491520000K rsz 1024K obj 16 thr 16 write 3377.62 [1428.92, 186681.32] rewrite 154516.51 [147831.54, 186675.48] read 6977.51 [3370.55, 103311.36]
ost 1 sz 491520000K rsz 1024K obj 16 thr 32 write 3661.83 [1510.83, 192783.49] rewrite 150708.13 [186337.79, 186337.79] read 6951.00 [2917.89, 59171.64]
ost 1 sz 491520000K rsz 1024K obj 16 thr 64 write 3603.10 [1545.90, 213008.56] rewrite 172656.48 [177891.67, 177891.67] read 6984.14 [3352.78, 57702.04]
ost 1 sz 491520000K rsz 1024K obj 16 thr 128 write 3692.16 [1594.80, 13478.11] rewrite 149716.18 [106440.28, 225295.61] read 6850.52 [2804.80, 45156.82]
ost 1 sz 491520000K rsz 1024K obj 16 thr 256 write 3661.13 [1446.88, 223403.23] rewrite 140771.55 [103769.40, 190108.76] read 6964.33 [3357.70, 85623.55]
ost 1 sz 491257856K rsz 1024K obj 16 thr 512 write 3193.67 [1001.90, 205874.24] rewrite 137435.34 [104790.09, 180991.34] read 6938.31 [3358.61, 54319.14]
ost 1 sz 490733568K rsz 1024K obj 16 thr 1024 write 2379.98 [ 454.94, 202684.59] rewrite 130579.85 [100158.02, 161904.29] read 6945.24 [3354.17, 48807.91]
</pre>
<p> </p>
<ul>
<li> nobjlo=1 nobjhi=512 thrlo=1 thrhi=1024 size=480000 rslt_loc=/var/tmp/obdfilter-survey_out targets="lustrefs-OST0000" case=disk obdfilter-survey</li>
</ul>
<pre>
ost 1 sz 491520000K rsz 1024K obj 16 thr 16 write 3747.17 [1393.84, 190306.68] rewrite 156040.83 [148205.37, 188453.46] read 7009.94 [3398.61, 108528.27]
ost 1 sz 491520000K rsz 1024K obj 16 thr 32 write 3745.34 [1393.92, 193273.05] rewrite 154722.13 [177941.31, 177941.31] read 6989.40 [3330.82, 30959.14]
ost 1 sz 491520000K rsz 1024K obj 16 thr 64 write 3760.65 [1367.83, 104560.10] rewrite 162225.64 [148197.30, 148197.30] read 6999.92 [3363.80, 60847.55]
ost 1 sz 491520000K rsz 1024K obj 16 thr 128 write 3754.86 [1379.88, 56060.15] rewrite 147814.31 [104369.56, 217353.01] read 6990.76 [3330.77, 53634.79]
ost 1 sz 491520000K rsz 1024K obj 16 thr 256 write 3705.70 [1358.82, 150706.49] rewrite 138369.68 [101585.51, 182624.29] read 6962.05 [3337.70, 73858.34]
ost 1 sz 491257856K rsz 1024K obj 16 thr 512 write 3612.06 [1275.87, 95958.05] rewrite 134727.11 [105177.20, 172269.61] read 6986.99 [3350.63, 46219.95]
ost 1 sz 490733568K rsz 1024K obj 16 thr 1024 write 2867.46 [ 537.87, 53084.22] rewrite 129812.07 [102830.81, 159936.73] read 6987.93 [3335.35, 79355.00]
</pre>
<p> </p>
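<p>Alongside the survey figures, the OSS-side bulk RPC size histogram can be captured during a write run to show what I/O sizes actually reach the NVMe devices; a rough sketch of the commands on an OSS node (run the write workload in between) is below:</p>
<pre>
# Reset the bulk I/O histogram (writing to the file clears the counters),
# run the write workload, then read the histogram back
lctl set_param obdfilter.*.brw_stats=0
lctl get_param obdfilter.*.brw_stats
</pre>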
<p>Network performance was evaluated across the cluster nodes using lnet_selftest, yielding approximately 6,800 MB/s for both read and write operations.</p>
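<p>The lnet_selftest run followed the usual lst script pattern; the script below is only a sketch of that sequence, with placeholder NIDs and names rather than the real ones (lnet_selftest must be loaded on all participating nodes first):</p>
<pre>
export LST_SESSION=$$
lst new_session rw_test
lst add_group servers 10.0.0.1@o2ib      # OSS side (placeholder NID)
lst add_group clients 10.0.0.2@o2ib      # client side (placeholder NID)
lst add_batch bulk_rw
lst add_test --batch bulk_rw --concurrency 8 --from clients --to servers \
    brw write check=simple size=1M
lst add_test --batch bulk_rw --concurrency 8 --from clients --to servers \
    brw read check=simple size=1M
lst run bulk_rw
lst stat servers &amp; sleep 30; kill $!
lst end_session
</pre>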
<p>I used IOR-4.0.0 to measure the read and write bandwidth of the setup with the following commands; the output is attached for reference.</p>
<ul>
<li>mpirun -genvall -np 16 -ppn 16 -f /path_to_hostfile/hosts_rt05 /path_to_ior_bin/ior -F -w -r -e -g -C -w -b 1g -t 1m -i 4 -D 70 -vv -o ./out</li>
</ul>
<p> Max Write: 1712.75 MiB/sec (1795.95 MB/sec)</p>
<p> Max Read: 83994.25 MiB/sec (88074.36 MB/sec)</p>
<ul>
<li>mpirun -genvall -np 16 -ppn 16 -f /path_to_hostfile/hosts_rt05 /path_to_ior_bin/ior -F -w -r -e -g -C -w -b 256m -t 1m -i 4 -D 70 -vv -o ./out</li>
</ul>
<p> Max Write: 1633.31 MiB/sec (1712.65 MB/sec)</p>
<p> Max Read: 73826.50 MiB/sec (77412.69 MB/sec)</p>
<p> </p>
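<p>In case it is relevant to the write path, I can also share the client-side RPC and caching tunables; below is just a sketch of the commands to read them on the client, not their output:</p>
<pre>
# Per-OST RPC concurrency and dirty-page limit on the client
lctl get_param osc.*.max_rpcs_in_flight
lctl get_param osc.*.max_dirty_mb

# Client page cache limit and whether wire checksums are enabled
lctl get_param llite.*.max_cached_mb
lctl get_param osc.*.checksums

# Histogram of pages per RPC and RPCs in flight seen so far
lctl get_param osc.*.rpc_stats
</pre>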
<p>The observed write bandwidth of roughly 1,800 MB/s is far below the observed read bandwidth of roughly 88,000 MB/s. Are there specific configurations that could help improve write performance? Any suggestions or insights on addressing this disparity would be greatly appreciated.</p>
<p>Thanks</p>
<p>John</p>
</div>
</body></html>