[lustre-discuss] MOFED 4.4-184.108.40.206
Matt.Bidwell at nrel.gov
Fri Aug 3 06:21:19 PDT 2018
If I were troubleshooting this, there are a few things I would look at first.
Mellanox usually publishes a matrix of approved firmware versions for each MOFED release. Are you running one of the approved versions on your cards? A mismatch shouldn't cause failures, but I'd want to make sure I'm running something from their firmware matrix before involving Mellanox.
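A rough sketch of that firmware check in Python, assuming the cards show up in `ibv_devinfo` with the usual `hca_id:` and `fw_ver:` lines. The approved set below is a placeholder, not taken from any real matrix; fill it in from the release notes for the MOFED version in question. The helper names are made up for illustration.

```python
import re

# Placeholder only: replace with the versions listed in the firmware
# matrix of the MOFED release you are running.
APPROVED_FW = {"2.42.5000"}


def firmware_versions(devinfo_text):
    """Map each hca_id to its fw_ver, given the text output of ibv_devinfo."""
    versions = {}
    current = None
    for line in devinfo_text.splitlines():
        m = re.match(r"\s*hca_id:\s*(\S+)", line)
        if m:
            current = m.group(1)
            continue
        m = re.match(r"\s*fw_ver:\s*(\S+)", line)
        if m and current is not None:
            versions[current] = m.group(1)
    return versions


def unapproved(versions, approved):
    """Return the cards whose firmware is not in the approved set."""
    return {hca: fw for hca, fw in versions.items() if fw not in approved}
```

To use it on a node, feed it the output of `subprocess.check_output(["ibv_devinfo"], universal_newlines=True)` and print whatever `unapproved(...)` returns.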
Did you reinstall the Lustre drivers after upgrading MOFED? I usually have to uninstall the running version before I build new RPMs, because I've had issues with the new RPMs referencing parts of the previously installed version when I build while it is still installed.
-Matt
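One way to confirm that nothing from the old Lustre build is still installed before rebuilding is to list the Lustre-related RPMs. This is a minimal sketch; the package-name prefixes are an assumption and may not cover every packaging variant on your nodes, and the function names are illustrative.

```python
import subprocess

# Assumed prefixes for Lustre packages; adjust to match your build.
LUSTRE_PREFIXES = ("lustre", "kmod-lustre")


def lustre_packages(rpm_names):
    """Return the sorted package names that look like Lustre RPMs."""
    return sorted(n for n in rpm_names if n.startswith(LUSTRE_PREFIXES))


def installed_rpm_names():
    """List the names of all installed RPMs via `rpm -qa`."""
    out = subprocess.check_output(
        ["rpm", "-qa", "--queryformat", "%{NAME}\n"],
        universal_newlines=True,
    )
    return out.split()
```

If `lustre_packages(installed_rpm_names())` comes back non-empty on the build host, those are the packages to remove before building against the new MOFED.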
From: lustre-discuss <lustre-discuss-bounces at lists.lustre.org> On Behalf Of Hans Henrik Happe
Sent: Friday, August 03, 2018 5:53 AM
To: lustre-discuss at lists.lustre.org
Subject: [lustre-discuss] MOFED 4.4-220.127.116.11
Did anyone try Mellanox OFED 4.4-18.104.22.168?
With Lustre 2.10.4 on CentOS 6.10 and 6.9 we have issues. With CentOS
6.9 and the previously supported MOFED version there are no problems
(CentOS 6.10 is not supported by the previous version).
We are using ConnectX-3 cards on kernel 2.6.32-696.18.7.el6.x86_64.
First mount after start of openibd fails. Attached 'first.txt' shows the log.
A second mount succeeds ('second.txt'). The OSTs are slowly added after some timeouts. Everything seems to work after this.
After this we can unmount and mount again and everything is normal.
However, after reloading the driver (restart openibd), the mount fails again.
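The sequence above could be scripted roughly as follows so it can be repeated on other kernels or MOFED versions. This is only a sketch: the `target` and `mountpoint` arguments are placeholders for the real MGS specification and mount directory, and `run` is any callable that executes a command list and returns its exit code (for example `subprocess.call`, as root).

```python
def reproduce(run, target, mountpoint):
    """Replay the restart/mount sequence and return (step, exit code) pairs.

    run        -- callable taking a command list and returning its exit code
    target     -- Lustre mount source (placeholder, site specific)
    mountpoint -- directory to mount on (placeholder, site specific)
    """
    mount_cmd = ["mount", "-t", "lustre", target, mountpoint]
    steps = [
        ("restart openibd", ["service", "openibd", "restart"]),
        ("first mount", mount_cmd),    # fails in the report above
        ("second mount", mount_cmd),   # succeeds after some timeouts
        ("unmount", ["umount", mountpoint]),
        ("third mount", mount_cmd),    # normal from here on
    ]
    return [(label, run(cmd)) for label, cmd in steps]
```

Calling `reproduce(subprocess.call, target, mountpoint)` after `import subprocess` returns one exit code per step, which makes it easy to see whether only the first mount after the driver reload is affected.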
I'll have a go at CentOS 7.5 and contact Mellanox next.