<div dir="ltr">While walking through the code, I found an LNet module parameter, local_nid_dist_zero. Setting it to 0 resolves the issue. I am posting it here in case anyone searches for the same thing in the future. </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 6 Nov 2024 at 13:39, Backer <<a href="mailto:backer.kolo@gmail.com">backer.kolo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Hi Chris,</div><div dir="auto"><br></div><div dir="auto">Thank you for looking into this. I agree. In cloud networks, and in some on-prem networks too, a floating IP is a real mechanism for providing HA, and I am attempting to make it work. Since the IP move happens in under a second in these environments, the failover completes within a few seconds, without any noticeable delay. This optimization is undesirable in certain environments. If no parameter already <span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"> exists to change this behavior, how can I make it work in this environment? Does it require a code change? If so, I could look into it if someone can help with some pointers. 
</span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"><br></span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)">Regards</span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"><br></span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)">Aboo</span></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Nov 6, 2024 at 11:05 AM Horn, Chris <<a href="mailto:chris.horn@hpe.com" target="_blank">chris.horn@hpe.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div lang="EN-US">
<div>
<p class="MsoNormal" style="margin-left:0.5in"><span style="font-size:11pt">Here the failover is designed in such a way that the IP address moves (fails over) with the OST and becomes active on the other server.<u></u><u></u></span></p>
<p class="MsoNormal" style="margin-left:0.5in"><span style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt">This is probably the source of your problem. I would suggest assigning unique IP addresses to each OSS.</span></p></div></div><div lang="EN-US"><div><p class="MsoNormal"><span style="font-size:11pt"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt">Chris Horn<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>
<div id="m_6819826665016080842m_2579605375602447286mail-editor-reference-message-container">
<div>
<div>
<div style="border-width:1pt medium medium;border-style:solid none none;padding:3pt 0in 0in;border-color:rgb(181,196,223) currentcolor currentcolor">
<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="color:black">From:
</span></b><span style="color:black">lustre-discuss <<a href="mailto:lustre-discuss-bounces@lists.lustre.org" target="_blank">lustre-discuss-bounces@lists.lustre.org</a>> on behalf of Backer <<a href="mailto:backer.kolo@gmail.com" target="_blank">backer.kolo@gmail.com</a>><br>
<b>Date: </b>Tuesday, November 5, 2024 at 10:19</span><span style="font-family:Arial,sans-serif;color:black"> </span><span style="color:black">PM<br>
<b>To: </b>Backer via lustre-discuss <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>>, <a href="mailto:lustre-devel@lists.lustre.org" target="_blank">lustre-devel@lists.lustre.org</a> <<a href="mailto:lustre-devel@lists.lustre.org" target="_blank">lustre-devel@lists.lustre.org</a>><br>
<b>Subject: </b>Re: [lustre-discuss] Lustre switching to loop back lnet interface when it is not desired<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal">Any ideas on how to avoid using 0@lo as failover_nids? Please see below. <u></u><u></u></p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<div>
<p class="MsoNormal">On Tue, 5 Nov 2024 at 12:34, Backer <<a href="mailto:backer.kolo@gmail.com" target="_blank">backer.kolo@gmail.com</a>> wrote:<u></u><u></u></p>
</div>
<blockquote style="border-width:medium medium medium 1pt;border-style:none none none solid;padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">
<div>
<p class="MsoNormal">Hi,<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal">I am mounting the Lustre file system as a client on the OSS. Some of the OSTs are locally attached to this OSS.
<br>
<br>
The failover IP on the OST is "10.99.100.152". It is a local LNet address on the OSS. However, when the client mounts the file system, the import automatically changes to 0@lo. That is undesirable here because when this OST fails over to another server, the client keeps trying
to connect to 0@lo even though the OST is no longer on the same host. This makes the client mount hang forever. <br>
<br>
Here the failover is designed in such a way that the IP address moves (fails over) with the OST and becomes active on the other server.
<br>
<br>
How can I make the import point to the real IP and not the loopback, so that the failover works?<u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<p class="MsoNormal"><span style="font-family:'Courier New'">[oss000 ~]$ lfs df<br>
UUID 1K-blocks Used Available Use% Mounted on<br>
fs-MDT0000_UUID 29068444 25692 26422344 1% /mnt/fs[MDT:0]<br>
fs-OST0000_UUID 50541812 30160292 17743696 63% /mnt/fs[OST:0]<br>
fs-OST0001_UUID 50541812 29301740 18602248 62% /mnt/fs[OST:1]<br>
fs-OST0002_UUID 50541812 29356508 18547480 62% /mnt/fs[OST:2]<br>
fs-OST0003_UUID 50541812 8822980 39081008 19% /mnt/fs[OST:3]<br>
<br>
filesystem_summary: 202167248 97641520 93974432 51% /mnt/fs<br>
<br>
[oss000 ~]$ df -h<br>
Filesystem Size Used Avail Use% Mounted on<br>
devtmpfs 30G 0 30G 0% /dev<br>
tmpfs 30G 8.1M 30G 1% /dev/shm<br>
tmpfs 30G 25M 30G 1% /run<br>
tmpfs 30G 0 30G 0% /sys/fs/cgroup<br>
/dev/mapper/ocivolume-root 36G 17G 19G 48% /<br>
/dev/sdc2 1014M 637M 378M 63% /boot<br>
/dev/mapper/ocivolume-oled 10G 2.5G 7.6G 25% /var/oled<br>
/dev/sdc1 100M 5.1M 95M 6% /boot/efi<br>
tmpfs 5.9G 0 5.9G 0% /run/user/987<br>
tmpfs 5.9G 0 5.9G 0% /run/user/0<br>
/dev/sdb 49G 28G 18G 62% /fs-OST0001<br>
/dev/sda 49G 29G 17G 63% /fs-OST0000<br>
tmpfs 5.9G 0 5.9G 0% /run/user/1000<br>
10.99.100.221@tcp1:/fs 193G 94G 90G 51% /mnt/fs<br>
<br>
[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda<br>
checking for existing Lustre data: found<br>
<br>
Read previous values:<br>
Target: fs-OST0000<br>
Index: 0<br>
Lustre FS: fs<br>
Mount type: ldiskfs<br>
Flags: 0x1002<br>
(OST no_primnode )<br>
Persistent mount opts: ,errors=remount-ro<br>
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br>
<br>
<br>
Permanent disk data:<br>
Target: fs-OST0000<br>
Index: 0<br>
Lustre FS: fs<br>
Mount type: ldiskfs<br>
Flags: 0x1002<br>
(OST no_primnode )<br>
Persistent mount opts: ,errors=remount-ro<br>
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br>
<br>
exiting before disk write.<br>
<br>
<br>
[oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import<br>
import:<br>
name: fs-OST0000-osc-ffff89c57672e000<br>
target: fs-OST0000_UUID<br>
state: IDLE<br>
connect_flags: [ write_grant, server_lock, version, request_portal, max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink, full20, layout_lock, 64bithash, object_max_bytes,
jobstats, einprogress, grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags, lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]<br>
connect_data:<br>
flags: 0xa0425af2e3440078<br>
instance: 39<br>
target_version: 2.15.3.0<br>
initial_grant: 8437760<br>
max_brw_size: 4194304<br>
grant_block_size: 4096<br>
grant_inode_size: 32<br>
grant_max_extent_size: 67108864<br>
grant_extent_tax: 24576<br>
cksum_types: 0xf7<br>
max_object_bytes: 17592186040320<br>
import_flags: [ replayable, pingable, connect_tried ]<br>
connection:<br>
failover_nids: [ 0@lo, 0@lo ]<br>
current_connection: 0@lo<br>
connection_attempts: 1<br>
generation: 1<br>
in-progress_invalidations: 0<br>
idle: 36 sec<br>
rpcs:<br>
inflight: 0<br>
unregistering: 0<br>
timeouts: 0<br>
avg_waittime: 2627 usec<br>
service_estimates:<br>
services: 1 sec<br>
network: 1 sec<br>
transactions:<br>
last_replay: 0<br>
peer_committed: 0<br>
last_checked: 0</span><u></u><u></u></p>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote></div></div>
</blockquote></div>
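<div dir="ltr">For anyone landing here later, the fix mentioned at the top of this thread could be applied roughly as follows. This is only a sketch under stated assumptions: it assumes local_nid_dist_zero is read when the lnet module loads (so it has to go into a modprobe options file and takes effect on the next module load), the file name lnet-failover.conf is made up for illustration, and uses_loopback is a hypothetical helper, not a Lustre tool.</div>
<pre>
```shell
#!/bin/sh
# Sketch only, not a tested procedure.
#
# Assumption: the parameter is set at module load time. The options file
# name is illustrative; any file under /etc/modprobe.d/ should work.
#
#   echo "options lnet local_nid_dist_zero=0" | sudo tee /etc/modprobe.d/lnet-failover.conf
#
# After lnet has been reloaded, the value may be readable here
# (assumption: the module exposes the parameter through sysfs):
#
#   cat /sys/module/lnet/parameters/local_nid_dist_zero

# Hypothetical helper: succeeds if the given import file shows the
# loopback NID (0@lo) as its current connection.
uses_loopback() {
    grep -Eq '^[[:space:]]*current_connection:[[:space:]]*0@lo[[:space:]]*$' "$1"
}

# Check every OSC import on this node. The path pattern follows the
# import file quoted in this thread. On a host without Lustre the
# pattern matches nothing and the loop does nothing.
for f in /proc/fs/lustre/osc/*/import; do
    [ -r "$f" ] || continue
    if uses_loopback "$f"; then
        echo "$f still uses 0@lo"
    fi
done
```
</pre>
<div dir="ltr">If the change took effect, the loop should print nothing, and the import should show the real NID (10.99.100.152@tcp1 in the example above) under current_connection instead of 0@lo.</div>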