<div dir="ltr">The import still shows the following, even though the OST is formatted with <span style="font-family:monospace">failover.node=10.99.100.152</span> or <span style="font-family:monospace">servicenode=10.99.100.152</span><div><br></div><div><span style="font-family:monospace"> connection:</span><br style="font-family:monospace"><span style="font-family:monospace"> failover_nids: [ 0@lo, 0@lo ]</span><br style="font-family:monospace"><span style="font-family:monospace"> current_connection: 0@lo</span></div><div><span style="font-family:monospace"><br></span></div><div><span style="font-family:monospace"><br></span></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, 5 Nov 2024 at 12:34, Backer <<a href="mailto:backer.kolo@gmail.com">backer.kolo@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi,<div><br></div>I am mounting the Lustre file system on the OSS. Some of the OSTs are locally attached to the OSS. <br><br>The failover IP on the OST is "10.99.100.152". It is a local LNet address on the OSS. However, when the client mounts the file system, the import automatically changes to 0@lo. This is undesirable because when the OST fails over to another server, the client keeps trying to connect to 0@lo even though the OST is no longer on the same host. This makes the client mount hang forever. <br><br>The failover is designed so that the IP address moves (fails over) with the OST and becomes active on the other server. <br><br>How can I make the import point to the real IP and not the loopback? 
(so that the failover works)<div><span style="font-family:monospace"></span></div><div><span style="font-family:monospace"><br></span></div><div><br></div><font face="monospace">[oss000 ~]$ lfs df<br>UUID 1K-blocks Used Available Use% Mounted on<br>fs-MDT0000_UUID 29068444 25692 26422344 1% /mnt/fs[MDT:0]<br>fs-OST0000_UUID 50541812 30160292 17743696 63% /mnt/fs[OST:0]<br>fs-OST0001_UUID 50541812 29301740 18602248 62% /mnt/fs[OST:1]<br>fs-OST0002_UUID 50541812 29356508 18547480 62% /mnt/fs[OST:2]<br>fs-OST0003_UUID 50541812 8822980 39081008 19% /mnt/fs[OST:3]<br><br>filesystem_summary: 202167248 97641520 93974432 51% /mnt/fs<br><br>[oss000 ~]$ df -h<br>Filesystem Size Used Avail Use% Mounted on<br>devtmpfs 30G 0 30G 0% /dev<br>tmpfs 30G 8.1M 30G 1% /dev/shm<br>tmpfs 30G 25M 30G 1% /run<br>tmpfs 30G 0 30G 0% /sys/fs/cgroup<br>/dev/mapper/ocivolume-root 36G 17G 19G 48% /<br>/dev/sdc2 1014M 637M 378M 63% /boot<br>/dev/mapper/ocivolume-oled 10G 2.5G 7.6G 25% /var/oled<br>/dev/sdc1 100M 5.1M 95M 6% /boot/efi<br>tmpfs 5.9G 0 5.9G 0% /run/user/987<br>tmpfs 5.9G 0 5.9G 0% /run/user/0<br>/dev/sdb 49G 28G 18G 62% /fs-OST0001<br>/dev/sda 49G 29G 17G 63% /fs-OST0000<br>tmpfs 5.9G 0 5.9G 0% /run/user/1000<br>10.99.100.221@tcp1:/fs 193G 94G 90G 51% /mnt/fs<br><br>[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda<br>checking for existing Lustre data: found<br><br> Read previous values:<br>Target: fs-OST0000<br>Index: 0<br>Lustre FS: fs<br>Mount type: ldiskfs<br>Flags: 0x1002<br> (OST no_primnode )<br>Persistent mount opts: ,errors=remount-ro<br>Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br><br><br> Permanent disk data:<br>Target: fs-OST0000<br>Index: 0<br>Lustre FS: fs<br>Mount type: ldiskfs<br>Flags: 0x1002<br> (OST no_primnode )<br>Persistent mount opts: ,errors=remount-ro<br>Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1<br><br>exiting before disk write.<br><br><br>[oss000 proc]# cat 
/proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import<br>import:<br> name: fs-OST0000-osc-ffff89c57672e000<br> target: fs-OST0000_UUID<br> state: IDLE<br> connect_flags: [ write_grant, server_lock, version, request_portal, max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize, alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink, full20, layout_lock, 64bithash, object_max_bytes, jobstats, einprogress, grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags, lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]<br> connect_data:<br> flags: 0xa0425af2e3440078<br> instance: 39<br> target_version: 2.15.3.0<br> initial_grant: 8437760<br> max_brw_size: 4194304<br> grant_block_size: 4096<br> grant_inode_size: 32<br> grant_max_extent_size: 67108864<br> grant_extent_tax: 24576<br> cksum_types: 0xf7<br> max_object_bytes: 17592186040320<br> import_flags: [ replayable, pingable, connect_tried ]<br> connection:<br> failover_nids: [ 0@lo, 0@lo ]<br> current_connection: 0@lo<br> connection_attempts: 1<br> generation: 1<br> in-progress_invalidations: 0<br> idle: 36 sec<br> rpcs:<br> inflight: 0<br> unregistering: 0<br> timeouts: 0<br> avg_waittime: 2627 usec<br> service_estimates:<br> services: 1 sec<br> network: 1 sec<br> transactions:<br> last_replay: 0<br> peer_committed: 0<br> last_checked: 0</font></div>
</blockquote></div>
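<div dir="ltr">For checking several OSCs at once, here is a minimal sketch that flags imports whose connection only lists the loopback NID. It is not part of Lustre; the function names <span style="font-family:monospace">parse_connection</span> and <span style="font-family:monospace">only_loopback</span> are made up for illustration, and the sample text is trimmed from the import output quoted above. It assumes the import text keeps the <span style="font-family:monospace">failover_nids</span> / <span style="font-family:monospace">current_connection</span> layout shown there.</div>

```python
import re

LOOPBACK = "0@lo"  # LNet loopback NID, as it appears in the import output


def parse_connection(import_text):
    """Return (failover_nids, current_connection) from an OSC import dump.

    Hypothetical helper: it only understands the two fields shown in
    the quoted output and ignores everything else.
    """
    nids = []
    current = None
    for line in import_text.splitlines():
        line = line.strip()
        m = re.match(r"failover_nids:\s*\[(.*)\]", line)
        if m:
            nids = [n.strip() for n in m.group(1).split(",") if n.strip()]
            continue
        m = re.match(r"current_connection:\s*(\S+)", line)
        if m:
            current = m.group(1)
    return nids, current


def only_loopback(import_text):
    """True when every failover NID and the current connection are 0@lo."""
    nids, current = parse_connection(import_text)
    return bool(nids) and all(n == LOOPBACK for n in nids) and current == LOOPBACK


# Trimmed copy of the import output from the message above.
SAMPLE = """\
import:
    name: fs-OST0000-osc-ffff89c57672e000
    target: fs-OST0000_UUID
    connection:
       failover_nids: [ 0@lo, 0@lo ]
       current_connection: 0@lo
"""

if __name__ == "__main__":
    print(parse_connection(SAMPLE))
    print(only_loopback(SAMPLE))
```

<div dir="ltr">The input could be the text of each <span style="font-family:monospace">import</span> file, for example as printed by <span style="font-family:monospace">lctl get_param osc.*.import</span>, split per target before being passed in. This only detects the symptom; it does not change which NID the client uses.</div>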