[lustre-discuss] Lustre 2.7 deployment issues
tegner at foi.se
Fri Dec 4 03:18:38 PST 2015
Where do you find the 2.7.x-releases? I thought fixes were only released
for the Intel maintenance version?
On 12/04/2015 11:43 AM, jerome.becot at inserm.fr wrote:
> Hello Ray,
> One consideration first: you are trying version 2.7, which is not the
> production release (that is 2.5). From this perspective, whether you run
> 2.7.0 or 2.7.x won't make any big difference; it is the development release.
> Then, if I understand correctly, the problem comes from the InfiniBand
> driver module, which is buggy in the 2.6.32-504.8.1 kernel, meaning that
> you have to update the kernel to fix it. As a result, the 2.7.0 version
> on the site, compiled against an older kernel version, may then refuse
> to load (because kernel modules, i.e. the Lustre ones here, rely on
> features that may change between kernel versions, making them
> incompatible).
> In any case, you can try to rebuild the 2.7.0 version from source
> against your new kernel. The procedure is quite easy:
> It will regenerate the 2.7.0 client on top of your newer kernel with the
> working InfiniBand modules, but stability is not guaranteed, as the
> 2.7 branch is under development anyway.
> Or, if you can't, use a precompiled one from the build site (some nasty
> bugs in the base 2.x.0 version are fixed in the latest builds).
> The only requirement is to stick to the very same version on the MDS and
> OSS, and at least the same or a newer version on the clients.
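The version rule quoted above (same version on the MDS and OSS, clients the
same or newer) can be sketched as a small check. This is only an
illustration in Python: the function names, the host names, and the version
strings in the usage lines are made up for the example, and it compares
plain dotted version numbers rather than querying Lustre itself.

```python
import re


def parse_version(text):
    """Turn a dotted version such as '2.7.0' into a 4-part tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a dotted version: {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad short versions so that '2.7' and '2.7.0' compare as equal.
    parts += [0] * (4 - len(parts))
    return tuple(parts)


def check_versions(mds, oss_versions, client_versions):
    """Return a list of problems; an empty list means the rule holds.

    mds             -- version string of the MDS
    oss_versions    -- dict of hypothetical OSS host name -> version string
    client_versions -- dict of hypothetical client host name -> version string
    """
    problems = []
    server = parse_version(mds)
    for name, version in oss_versions.items():
        # Servers must run the very same version.
        if parse_version(version) != server:
            problems.append(f"OSS {name} runs {version}, MDS runs {mds}")
    for name, version in client_versions.items():
        # Clients may be the same version or newer, never older.
        if parse_version(version) < server:
            problems.append(
                f"client {name} runs {version}, older than servers ({mds})"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical hosts and versions, for illustration only.
    for line in check_versions(
        "2.7.0",
        {"oss1": "2.7.0", "oss2": "2.7.0"},
        {"node01": "2.5.3"},
    ):
        print(line)
```

An empty result only means the version numbers satisfy the rule as stated in
the quoted message; it says nothing about whether the modules match the
running kernel.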
> On 03-12-2015 16:13, Ray Muno wrote:
>> I am trying to set up a test deployment of Lustre 2.7.
>> I pulled RPMs from http://lustre.org/download/ and installed them on a
>> set of servers running Scientific Linux 6.6, which seems to be a proper
>> OS for deployment. Everything installs and I can format the
>> filesystems on the MDS (1) and OSS (2) servers. When I try to mount
>> the OST filesystems, I get communication errors. I can "lctl ping"
>> the servers from each other, but cannot establish communication
>> between the MDS and OSS.
>> The installation is on servers connected over Infiniband (Qlogic DDR
>> In trying to diagnose the issues related to the error messages, I
>> found mention in some list discussions that o2ib is broken in the
>> 2.6.32-504.8.1 kernel.
>> After much frustration, I pulled a nightly build from
>> build.hpdd.intel.com (kernel
>> 2.6.32-573.8.1.el6_lustre.g8438f2a.x86_64) and tried the same set up.
>> Everything worked as I expected.
>> Am I missing something? Is the default release pointed to at
>> https://downloads.hpdd.intel.com/ for 2.7 broken in some way? Is it
>> just the hardware I am trying to deploy against?
>> I can provide specifics about the errors I see; I am just posting this
>> to make sure I am pulling the Lustre RPMs from the proper source.