[lustre-devel] Lustre switching to loop back lnet interface when it is not desired

Backer backer.kolo@gmail.com
Tue Nov 5 09:34:36 PST 2024


Hi,

I am mounting the Lustre file system on the OSS. Some of the OSTs are
locally attached to the OSS.

The failover IP on the OST is "10.99.100.152". It is a local LNet NID on the
OSS. However, when the client mounts the file system, the import automatically
changes to 0@lo. This is undesirable here because when this OST fails over to
another server, the client keeps trying to connect to 0@lo even though the OST
is no longer on the same host, and the client's filesystem mount hangs forever.

Here the failover is designed so that the IP address moves (fails over) with
the OST and becomes active on the other server.

How can I make the import point to the real IP rather than the loopback, so
that failover works?
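
For reference, this is roughly how I am checking the local NIDs and the import
state from the OSS (commands as I understand them; the raw /proc output is
pasted further below):

# List the NIDs configured locally on the OSS; 10.99.100.152@tcp1 is among
# them, which is presumably why LNet substitutes 0@lo for it.
lctl list_nids

# Show which NID the client-side import is actually using for this OST
lctl get_param osc.fs-OST0000-osc-*.import | grep -E 'failover_nids|current_connection'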


[oss000 ~]$ lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
fs-MDT0000_UUID     29068444       25692    26422344   1% /mnt/fs[MDT:0]
fs-OST0000_UUID     50541812    30160292    17743696  63% /mnt/fs[OST:0]
fs-OST0001_UUID     50541812    29301740    18602248  62% /mnt/fs[OST:1]
fs-OST0002_UUID     50541812    29356508    18547480  62% /mnt/fs[OST:2]
fs-OST0003_UUID     50541812     8822980    39081008  19% /mnt/fs[OST:3]

filesystem_summary:    202167248    97641520    93974432  51% /mnt/fs

[oss000 ~]$ df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                     30G     0   30G   0% /dev
tmpfs                        30G  8.1M   30G   1% /dev/shm
tmpfs                        30G   25M   30G   1% /run
tmpfs                        30G     0   30G   0% /sys/fs/cgroup
/dev/mapper/ocivolume-root   36G   17G   19G  48% /
/dev/sdc2                  1014M  637M  378M  63% /boot
/dev/mapper/ocivolume-oled   10G  2.5G  7.6G  25% /var/oled
/dev/sdc1                   100M  5.1M   95M   6% /boot/efi
tmpfs                       5.9G     0  5.9G   0% /run/user/987
tmpfs                       5.9G     0  5.9G   0% /run/user/0
/dev/sdb                     49G   28G   18G  62% /fs-OST0001
/dev/sda                     49G   29G   17G  63% /fs-OST0000
tmpfs                       5.9G     0  5.9G   0% /run/user/1000
10.99.100.221@tcp1:/fs      193G   94G   90G  51% /mnt/fs

[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda
checking for existing Lustre data: found

   Read previous values:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1


   Permanent disk data:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1

exiting before disk write.
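
For completeness, this is the kind of tunefs.lustre change I was considering
(not applied yet; I am not sure whether switching from failover.node to
--servicenode would actually stop the 0@lo substitution):

# Sketch only: with the OST unmounted, erase the old parameters,
# re-register the MGS and service NIDs, and force a config rewrite.
tunefs.lustre --erase-params \
    --mgsnode=10.99.100.221@tcp1 \
    --servicenode=10.99.100.152@tcp1 \
    --writeconf /dev/sda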


[oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import
import:
    name: fs-OST0000-osc-ffff89c57672e000
    target: fs-OST0000_UUID
    state: IDLE
    connect_flags: [ write_grant, server_lock, version, request_portal,
max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize,
alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink,
full20, layout_lock, 64bithash, object_max_bytes, jobstats, einprogress,
grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags,
lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]
    connect_data:
       flags: 0xa0425af2e3440078
       instance: 39
       target_version: 2.15.3.0
       initial_grant: 8437760
       max_brw_size: 4194304
       grant_block_size: 4096
       grant_inode_size: 32
       grant_max_extent_size: 67108864
       grant_extent_tax: 24576
       cksum_types: 0xf7
       max_object_bytes: 17592186040320
    import_flags: [ replayable, pingable, connect_tried ]
    connection:
       failover_nids: [ 0@lo, 0@lo ]
       current_connection: 0@lo
       connection_attempts: 1
       generation: 1
       in-progress_invalidations: 0
       idle: 36 sec
    rpcs:
       inflight: 0
       unregistering: 0
       timeouts: 0
       avg_waittime: 2627 usec
    service_estimates:
       services: 1 sec
       network: 1 sec
    transactions:
       last_replay: 0
       peer_committed: 0
       last_checked: 0
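
And the LNet side, in case this comes down to peer discovery or NID
configuration; these are the commands I would check with (output omitted here):

# Show the local LNet networks and NIDs configured on the OSS
lnetctl net show

# Show the peers LNet currently knows about and their NIDs
lnetctl peer show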