[lustre-discuss] Error messages (ex: not available for connect from 0@lo) on server boot with Lustre 2.15.3 and 2.15.4-RC1

Backer backer.kolo at gmail.com
Mon Dec 4 11:23:52 PST 2023


I do not want to hijack this thread, but I am checking here before starting
a new one. I am getting similar messages at random. The IP involved is
always a single client IP: multiple OSSes report the message for multiple
OSTs at the same time, and then it stops. These bursts appear occasionally
on multiple OSSes, and each one involves a single client at a time. I
wonder if this is a client-side issue, since this FS has hundreds of
clients and only one client is implicated at a time. Unfortunately, I have
no easy way to check whether that client had an access issue around the
time frame in the log (I have no access to the clients).

Dec  4 18:05:27 oss010 kernel: LustreError: 137-5: fs-OST00b0_UUID: not
available for connect from <client ip>@tcp1 (no target). If you are running
an HA pair check that the target is mounted on the other server.
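When one of these bursts hits, it can help to tally which client NID and which targets appear together across the OSS logs before chasing the client. A minimal sketch (the sample log line mirrors the one above; the NID 10.0.0.42@tcp1 is a made-up placeholder, and in practice you would read /var/log/messages on each OSS rather than a heredoc):

```shell
#!/bin/sh
# Count "not available for connect" (LustreError 137-5) events per
# (client NID, target) pair. The heredoc stands in for /var/log/messages.
cat <<'EOF' > /tmp/sample_oss.log
Dec  4 18:05:27 oss010 kernel: LustreError: 137-5: fs-OST00b0_UUID: not available for connect from 10.0.0.42@tcp1 (no target). If you are running an HA pair check that the target is mounted on the other server.
EOF

# Extract "NID target" from each 137-5 line, then count occurrences.
grep '137-5' /tmp/sample_oss.log |
  sed -n 's/.*137-5: \([^:]*\): not available for connect from \([^ ]*\).*/\2 \1/p' |
  sort | uniq -c
```

If several OSSes all name the same NID in the same minute, that points more at a client-side (network or eviction) problem than a server one.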

On Mon, 4 Dec 2023 at 05:27, Andreas Dilger via lustre-discuss <
lustre-discuss at lists.lustre.org> wrote:

> It wasn't clear from your mail which message(s) you are concerned about.
> These look like normal mount message(s) to me.
>
> The "error" is pretty normal; it just means multiple services were
> starting at once and one wasn't yet ready for the other.
>
>          LustreError: 137-5: lustrevm-MDT0000_UUID: not available for
>          connect from 0@lo (no target). If you are running an HA pair
>          check that the target is mounted on the other server.
>
> It probably makes sense to quiet this message right at mount time to avoid
> this.
>
> Cheers, Andreas
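If the transient message at boot is bothersome, one workaround (my assumption, not something stated in the thread) is to serialize the mounts so the MGS/MDT is fully up before the OSTs try to register. Device paths and mount points below are hypothetical:

```shell
# Hypothetical boot-time mount ordering for a combined MGS/MDT + OSS node.
# Mounting the MGS/MDT first means the OSTs find their target at
# registration time instead of racing it and logging 137-5.
mount -t lustre /dev/sdb /mnt/mdt      # combined MGS+MDT (assumed device)
mount -t lustre /dev/sdc /mnt/ost0     # OSTs afterwards (assumed devices)
mount -t lustre /dev/sdd /mnt/ost1
```

With systemd-managed fstab entries, the same ordering can often be expressed declaratively (e.g. the `x-systemd.requires-mounts-for=` mount option).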
>
> On Dec 1, 2023, at 10:24, Audet, Martin via lustre-discuss <
> lustre-discuss at lists.lustre.org> wrote:
>
> 
>
> Hello Lustre community,
>
>
> Has anyone ever seen messages like these in "/var/log/messages" on a
> Lustre server?
>
> Dec  1 11:26:30 vlfs kernel: Lustre: Lustre: Build Version: 2.15.4_RC1
> Dec  1 11:26:30 vlfs kernel: LDISKFS-fs (sdd): mounted filesystem with
> ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
> Dec  1 11:26:30 vlfs kernel: LDISKFS-fs (sdc): mounted filesystem with
> ordered data mode. Opts: errors=remount-ro,no_mbcache,nodelalloc
> Dec  1 11:26:30 vlfs kernel: LDISKFS-fs (sdb): mounted filesystem with
> ordered data mode. Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
> Dec  1 11:26:36 vlfs kernel: LustreError: 137-5: lustrevm-MDT0000_UUID:
> not available for connect from 0@lo (no target). If you are running an HA
> pair check that the target is mounted on the other server.
> Dec  1 11:26:36 vlfs kernel: Lustre: lustrevm-OST0001: Imperative Recovery
> not enabled, recovery window 300-900
> Dec  1 11:26:36 vlfs kernel: Lustre: lustrevm-OST0001: deleting orphan
> objects from 0x0:227 to 0x0:513
>
> This happens at every boot of a Lustre server named vlfs (an AlmaLinux
> 8.9 VM hosted on VMware) playing the role of both MGS and OSS (it hosts
> an MDT and two OSTs using "virtual" disks). We chose LDISKFS over ZFS.
> Note that the messages appear well before the clients (AlmaLinux 9.3 or
> 8.9 VMs) connect, and even when the clients are powered off. The network
> connecting the clients and the server is a "virtual" 10GbE network (of
> course there is no virtual IB). We saw the same messages previously with
> Lustre 2.15.3 on an AlmaLinux 8.8 server and AlmaLinux 8.8 / 9.2 clients
> (also VMs). Note also that we compile the Lustre RPMs ourselves from the
> sources in the git repository, and we chose to use a patched kernel. Our
> RPM build procedure seems to work well because our real cluster runs
> fine on CentOS 7.9 with Lustre 2.12.9 and IB (MOFED) networking.
>
> So, has anyone seen these messages?
>
> Are they problematic? If so, how do we avoid them?
>
> We would like to make sure our small test system using VMs works well
> before we upgrade our real cluster.
>
> Thanks in advance!
>
> Martin Audet
>
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

