<div dir="ltr">While walking through the code, I found an lnet module parameter, local_nid_dist_zero. Setting it to 0 resolves the issue. I am posting it here in case anyone searches for the same thing in the future. </div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 6 Nov 2024 at 13:39, Backer <<a href="mailto:backer.kolo@gmail.com">backer.kolo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="auto">Hi Chris,</div><div dir="auto"><br></div><div dir="auto">Thank you for looking into this. I agree.  In cloud and some on-prem networks, a floating IP is a real mechanism for providing HA, and I am attempting to make it work. Since the IP move happens in under a second in these environments, the failover completes within a few seconds without anyone even noticing a delay. This optimization is undesirable in certain environments. If no parameter already <span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"> exists to change this behavior, how can I make it work within this environment?  I wonder whether it requires a code change. If so, I could look into it if someone can help with some pointers. 
</span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"><br></span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)">Regards</span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)"><br></span></div><div dir="auto"><span style="font-family:-apple-system,helveticaneue;background-color:rgba(0,0,0,0);border-color:rgb(0,0,0);color:rgb(0,0,0)">Aboo</span></div><div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Nov 6, 2024 at 11:05 AM Horn, Chris <<a href="mailto:chris.horn@hpe.com" target="_blank">chris.horn@hpe.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<div lang="EN-US">

<div>

<p class="MsoNormal" style="margin-left:0.5in"><span style="font-size:11pt">Here the failover is designed in such a way that the IP address moves (fails over) with OST and becomes active on the other server.<u></u><u></u></span></p>

<p class="MsoNormal" style="margin-left:0.5in"><span style="font-size:11pt"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt">This is probably the source of your problem. I would suggest assigning unique IP addresses to each OSS.</span></p></div></div><div lang="EN-US"><div><p class="MsoNormal"><span style="font-size:11pt"><u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt">Chris Horn<u></u><u></u></span></p>

<p class="MsoNormal"><span style="font-size:11pt"><u></u> <u></u></span></p>

<div id="m_6819826665016080842m_2579605375602447286mail-editor-reference-message-container">

<div>

<div>

<div style="border-width:1pt medium medium;border-style:solid none none;padding:3pt 0in 0in;border-color:rgb(181,196,223) currentcolor currentcolor">

<p class="MsoNormal" style="margin-bottom:12pt"><b><span style="color:black">From:

</span></b><span style="color:black">lustre-discuss <<a href="mailto:lustre-discuss-bounces@lists.lustre.org" target="_blank">lustre-discuss-bounces@lists.lustre.org</a>> on behalf of Backer <<a href="mailto:backer.kolo@gmail.com" target="_blank">backer.kolo@gmail.com</a>><br>

<b>Date: </b>Tuesday, November 5, 2024 at 10:19</span><span style="font-family:Arial,sans-serif;color:black"> </span><span style="color:black">PM<br>

<b>To: </b>Backer via lustre-discuss <<a href="mailto:lustre-discuss@lists.lustre.org" target="_blank">lustre-discuss@lists.lustre.org</a>>, <a href="mailto:lustre-devel@lists.lustre.org" target="_blank">lustre-devel@lists.lustre.org</a> <<a href="mailto:lustre-devel@lists.lustre.org" target="_blank">lustre-devel@lists.lustre.org</a>><br>

<b>Subject: </b>Re: [lustre-discuss] Lustre switching to loop back lnet interface when it is not desired<u></u><u></u></span></p>

</div>

<div>

<p class="MsoNormal">Any ideas on how to avoid using 0@lo as failover_nids? Please see below. <u></u><u></u></p>

</div>

<p class="MsoNormal"><u></u> <u></u></p>

<div>

<div>

<p class="MsoNormal">On Tue, 5 Nov 2024 at 12:34, Backer <<a href="mailto:backer.kolo@gmail.com" target="_blank">backer.kolo@gmail.com</a>> wrote:<u></u><u></u></p>

</div>

<blockquote style="border-width:medium medium medium 1pt;border-style:none none none solid;padding:0in 0in 0in 6pt;margin-left:4.8pt;margin-right:0in;border-color:currentcolor currentcolor currentcolor rgb(204,204,204)">

<div>

<p class="MsoNormal">Hi,<u></u><u></u></p>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<p class="MsoNormal">Mounting the Lustre file system on the OSS. Some of the OSTs are locally attached to the OSS.

<br>

<br>

The failover IP on the OST is "10.99.100.152". It is a local lnet NID on the OSS. However, when the client mounts it, the import automatically changes to 0@lo. This is undesirable here because when this OST fails over to another server, the client still tries to connect to 0@lo even though the OST is no longer on the same host. This makes the client fs mount hang forever.  <br>

<br>

Here the failover is designed in such a way that the IP address moves (fails over) with OST and becomes active on the other server.

<br>

<br>

How can I make the import point to the real IP and not the loopback, so that failover works?<u></u><u></u></p>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<div>

<p class="MsoNormal"><u></u> <u></u></p>

</div>

<p class="MsoNormal"><span style="font-family:&quot;Courier New&quot;">[oss000 ~]$ lfs df<br>

UUID                   1K-blocks        Used   Available Use% Mounted on<br>

fs-MDT0000_UUID     29068444       25692    26422344   1% /mnt/fs[MDT:0]<br>

fs-OST0000_UUID     50541812    30160292    17743696  63% /mnt/fs[OST:0]<br>

fs-OST0001_UUID     50541812    29301740    18602248  62% /mnt/fs[OST:1]<br>

fs-OST0002_UUID     50541812    29356508    18547480  62% /mnt/fs[OST:2]<br>

fs-OST0003_UUID     50541812     8822980    39081008  19% /mnt/fs[OST:3]<br>

<br>

filesystem_summary:    202167248    97641520    93974432  51% /mnt/fs<br>

<br>

[oss000 ~]$ df -h<br>

Filesystem                  Size  Used Avail Use% Mounted on<br>

devtmpfs                     30G     0   30G   0% /dev<br>

tmpfs                        30G  8.1M   30G   1% /dev/shm<br>

tmpfs                        30G   25M   30G   1% /run<br>

tmpfs                        30G     0   30G   0% /sys/fs/cgroup<br>

/dev/mapper/ocivolume-root   36G   17G   19G  48% /<br>

/dev/sdc2                  1014M  637M  378M  63% /boot<br>

/dev/mapper/ocivolume-oled   10G  2.5G  7.6G  25% /var/oled<br>

/dev/sdc1                   100M  5.1M   95M   6% /boot/efi<br>

tmpfs                       5.9G     0  5.9G   0% /run/user/987<br>

tmpfs                       5.9G     0  5.9G   0% /run/user/0<br>

/dev/sdb                     49G   28G   18G  62% /fs-OST0001<br>

/dev/sda                     49G   29G   17G  63% /fs-OST0000<br>

tmpfs                       5.9G     0  5.9G   0% /run/user/1000<br>

10.99.100.221@tcp1:/fs  193G   94G   90G  51% /mnt/fs<br>

<br>

[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda<br>

checking for existing Lustre data: found<br>

<br>

   Read previous values:<br>

Target:     fs-OST0000<br>

Index:      0<br>

Lustre FS:  fs<br>

Mount type: ldiskfs<br>

Flags:      0x1002<br>

              (OST no_primnode )<br>

Persistent mount opts: ,errors=remount-ro<br>

Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br>

<br>

<br>

   Permanent disk data:<br>

Target:     fs-OST0000<br>

Index:      0<br>

Lustre FS:  fs<br>

Mount type: ldiskfs<br>

Flags:      0x1002<br>

              (OST no_primnode )<br>

Persistent mount opts: ,errors=remount-ro<br>

Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br>

<br>

exiting before disk write.<br>

<br>

<br>

[oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import<br>

import:<br>

    name: fs-OST0000-osc-ffff89c57672e000<br>

    target: fs-OST0000_UUID<br>

    state: IDLE<br>

    connect_flags: [ write_grant, server_lock, version, request_portal, max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink, full20, layout_lock, 64bithash, object_max_bytes,

 jobstats, einprogress, grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags, lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]<br>

    connect_data:<br>

       flags: 0xa0425af2e3440078<br>

       instance: 39<br>

       target_version: 2.15.3.0<br>

       initial_grant: 8437760<br>

       max_brw_size: 4194304<br>

       grant_block_size: 4096<br>

       grant_inode_size: 32<br>

       grant_max_extent_size: 67108864<br>

       grant_extent_tax: 24576<br>

       cksum_types: 0xf7<br>

       max_object_bytes: 17592186040320<br>

    import_flags: [ replayable, pingable, connect_tried ]<br>

    connection:<br>

       failover_nids: [ 0@lo, 0@lo ]<br>

       current_connection: 0@lo<br>

       connection_attempts: 1<br>

       generation: 1<br>

       in-progress_invalidations: 0<br>

       idle: 36 sec<br>

    rpcs:<br>

       inflight: 0<br>

       unregistering: 0<br>

       timeouts: 0<br>

       avg_waittime: 2627 usec<br>

    service_estimates:<br>

       services: 1 sec<br>

       network: 1 sec<br>

    transactions:<br>

       last_replay: 0<br>

       peer_committed: 0<br>

       last_checked: 0</span><u></u><u></u></p>

</div>

</blockquote>

</div>

</div>

</div>

</div>

</div>

</div>

</blockquote></div></div>

</blockquote></div>
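<div dir="ltr">For anyone applying the fix reported at the top of this thread (setting the lnet module parameter local_nid_dist_zero to 0), here is a rough sketch of how it might be done and checked. The two helper function names and the conf file name in the note below are made up for illustration. It also assumes the parameter is read when the lnet module loads, like other lnet module options, so please verify on your own system.</div>
<pre>
```shell
#!/bin/sh
# Sketch only. Function names are hypothetical helpers, not Lustre tools.

# Print the modprobe option line for the parameter named in this thread.
# $1 is the desired value (0 is what resolved the issue here).
lnet_option_line() {
    printf 'options lnet local_nid_dist_zero=%s\n' "$1"
}

# Read the text of an OSC "import" file on stdin.
# Exit status 0 if current_connection is the loopback NID 0@lo, 1 otherwise.
import_on_loopback() {
    awk '$1 == "current_connection:" { if ($2 == "0@lo") hit = 1 }
         END { exit (hit ? 0 : 1) }'
}
```
</pre>
<div dir="ltr">Possible usage: pipe the output of "lnet_option_line 0" into "sudo tee /etc/modprobe.d/lnet-local-nid.conf" (file name is an assumption), unmount Lustre and reload the modules, then run "cat /proc/fs/lustre/osc/fs-OST0000-osc-*/import | import_on_loopback". An exit status of 1 would suggest the import is no longer pinned to 0@lo.</div>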