[lustre-discuss] Lustre 2.7 deployment issues
jerome.becot at inserm.fr
Fri Dec 4 02:43:57 PST 2015
One consideration first: you are trying version 2.7, which is not the
production release (that is 2.5). From this perspective, whether you run
2.7.0 or 2.7.x won't make much difference; it is the development release.
Then, if I understand correctly, the problem comes from the InfiniBand
driver module, which is buggy in the 2.6.32-504.8.1 kernel, meaning that
you have to update the kernel to fix it. Doing so may mean that the 2.7.0
version on the site, compiled against an older kernel version, will then
refuse to load (kernel modules, i.e. the Lustre ones here, rely on
features that may change between kernel versions).
In any case, you can try to rebuild the 2.7.0 version from source against
your new kernel. The procedure is quite easy:
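A minimal sketch of such a rebuild, assuming the 2.7.0 client source RPM
has already been downloaded (the file name below is hypothetical), that
rpm-build and the kernel-devel package for the new kernel are installed,
and that you have already booted into the new kernel. By default the
script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: rebuild the Lustre 2.7.0 client packages against the running kernel.
# Assumptions (not from the original message): the SRPM file name, the
# default rpmbuild output directory, and the x86_64 architecture.

SRPM="${SRPM:-lustre-client-2.7.0.src.rpm}"   # hypothetical file name
RPMDIR="${RPMDIR:-$HOME/rpmbuild/RPMS/x86_64}" # default rpmbuild output dir

# Print the command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# 1. Rebuild client-only packages (no server code) from the source RPM.
run rpmbuild --rebuild --without servers "$SRPM"

# 2. Install the freshly built client packages.
run rpm -Uvh "$RPMDIR"/lustre-client-*.rpm

# 3. Check that the rebuilt modules load on the new kernel.
run modprobe lustre
```

Running it once without RUN=1 lets you review the commands before
anything touches the system.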
It will regenerate the 2.7.0 client against your newer kernel with the
working InfiniBand modules, but stability is not guaranteed, as the 2.7
branch is under development anyway.
Or, if you can't rebuild, use a precompiled package from the build site
(some nasty bugs in the base 2.x.0 version are fixed in the latest builds).
The one rule is to run exactly the same version on the MDS and OSS, and
the same or a newer version on the clients.
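As an illustration of that rule, here is a small sketch of a check. The
helper name versions_ok is hypothetical, and it assumes you have already
collected the plain version strings from each node (for example from
"lctl get_param version", whose output format varies between releases).
It relies on GNU sort -V for version ordering.

```shell
#!/bin/sh
# Sketch: check the "same version on servers, same or newer on clients" rule.
# versions_ok is a hypothetical helper, not part of Lustre.
# Usage: versions_ok MDS_VERSION OSS_VERSION CLIENT_VERSION

versions_ok() {
    mds="$1"
    oss="$2"
    client="$3"

    # MDS and OSS must run exactly the same version.
    [ "$mds" = "$oss" ] || return 1

    # The client must be the same or newer, i.e. the server version
    # must sort first (or be equal) in version order.
    lowest=$(printf '%s\n%s\n' "$mds" "$client" | sort -V | head -n 1)
    [ "$lowest" = "$mds" ]
}
```

For example, "versions_ok 2.7.0 2.7.0 2.7.52" succeeds, while
"versions_ok 2.7.0 2.7.0 2.5.3" fails because the client is older than
the servers.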
On 03-12-2015 16:13, Ray Muno wrote:
> I am trying to set up a test deployment of Lustre 2.7.
> I pulled RPMs from http://lustre.org/download/ and installed them on a
> set of servers running Scientific Linux 6.6, which seems to be a proper
> OS for deployment. Everything installs and I can format the
> filesystems on the MDS (1) and OSS (2) servers. When I try to mount
> the OST file systems, I get communication errors. I can "lctl ping"
> the servers from each other, but cannot establish communication
> between the MDS and OSS.
> The installation is on servers connected over Infiniband (Qlogic DDR
> In trying to diagnose the issues related to the error messages, I
> found mention in some list discussions that o2ib is broken in the
> 2.6.32-504.8.1 kernel.
> After much frustration, I pulled a nightly build from
> build.hpdd.intel.com (kernel
> 2.6.32-573.8.1.el6_lustre.g8438f2a.x86_64) and tried the same set up.
> Everything worked as I expected.
> Am I missing something? Is the default release pointed to at
> https://downloads.hpdd.intel.com/ for 2.7 broken in some way? Is it
> just the hardware I am trying to deploy against?
> I can provide specifics about the errors I see; I am just posting this
> to make sure I am pulling the Lustre RPMs from the proper source.