[lustre-discuss] Lustre 2.7 deployment issues

jerome.becot at inserm.fr jerome.becot at inserm.fr
Fri Dec 4 04:49:00 PST 2015


Hi,


I honestly don't know whether the compiled versions available there are 
meant to be used by everyone, but they are publicly browsable on the 
Intel Jenkins server:

https://build.hpdd.intel.com

Since the source is publicly available from the Whamcloud git 
repository, I don't think there should be any problem.

If you are in production, stick to 2.5.

Regards


On 04-12-2015 12:18, Jon Tegner wrote:
> Hi,
> 
> Where do you find the 2.7.x releases? I thought fixes were only
> released for the Intel maintenance version?
> 
> Regards,
> 
> /jon
> 
> On 12/04/2015 11:43 AM, jerome.becot at inserm.fr wrote:
>> Hello Ray,
>> 
>> One consideration first: you are trying the 2.7 version, which is not 
>> the production one (that is 2.5). From this perspective, whether you 
>> run 2.7.0 or 2.7.x won't make any big difference; it is the 
>> development release.
>> 
>> Then, if I understand correctly, the problem comes from the InfiniBand 
>> driver module, which is buggy in the 2.6.32-504.8.1 kernel, meaning 
>> that you have to update the kernel to fix it. After doing this, the 
>> 2.7.0 version from the site, compiled against an older kernel version, 
>> may refuse to load (because kernel modules, i.e. the Lustre ones here, 
>> rely on kernel interfaces that may change between kernel versions, 
>> making them incompatible).
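
A minimal sketch of the kernel/module mismatch described above, assuming the module's "vermagic" string is available (for example from `modinfo -F vermagic <module>`). The helper names `built_for_kernel` and `module_matches_kernel` are hypothetical, and the sample string is illustrative, built from a kernel name mentioned in this thread. An exact-match test is deliberately conservative: some distributions allow a module to load on a different kernel when the symbols it needs are unchanged, so a mismatch here is a warning sign rather than proof that loading will fail.

```python
import platform


def built_for_kernel(vermagic: str) -> str:
    """Return the kernel release named in a module's vermagic string.

    The first whitespace-separated field of vermagic is the kernel release
    the module was compiled against.
    """
    fields = vermagic.split()
    if not fields:
        raise ValueError("empty vermagic string")
    return fields[0]


def module_matches_kernel(vermagic: str, running_release: str) -> bool:
    """True when the module was built for exactly the running kernel release."""
    return built_for_kernel(vermagic) == running_release


if __name__ == "__main__":
    # Illustrative vermagic value (an assumption, not real modinfo output).
    sample = "2.6.32-573.8.1.el6_lustre.g8438f2a.x86_64 SMP mod_unload modversions"
    print(built_for_kernel(sample))
    # platform.release() reports the release of the kernel currently running.
    print(module_matches_kernel(sample, platform.release()))
```

If the check returns False after a kernel update, rebuilding the client against the new kernel is the likely next step.
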
>> 
>> In any case, you can try to rebuild the 2.7.0 version from source 
>> against your new kernel. The procedure is quite easy:
>> 
>> https://wiki.hpdd.intel.com/display/PUB/Rebuilding+the+Lustre-client+rpms+for+a+new+kernel 
>> It will regenerate the 2.7.0 client on top of your newer kernel with 
>> the working InfiniBand modules, but stability is not guaranteed, as 
>> the 2.7 branch is under development anyway.
>> 
>> Or, if you can't, use a precompiled one from the build site (some 
>> nasty bugs in the base 2.x.0 version are fixed in the latest builds).
>> 
>> The only requirement is to stick to the very same version on the MDS 
>> and OSS, and at least the same or a newer version on the clients.
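
The version rule above can be sketched as a small check. This is only an illustration under stated assumptions: `parse_version` and `check_versions` are hypothetical helpers, versions are assumed to be purely numeric dotted strings with the same number of parts (such as "2.7.0"), and build suffixes are not handled.

```python
from typing import List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.7.0' into a comparable tuple (2, 7, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def check_versions(mds: str, oss: List[str], clients: List[str]) -> List[str]:
    """Return a list of problems; an empty list means the rule holds."""
    problems = []
    server = parse_version(mds)
    # Servers: every OSS must run exactly the MDS version.
    for version in oss:
        if parse_version(version) != server:
            problems.append(f"OSS {version} differs from MDS {mds}")
    # Clients: the same version as the servers, or newer.
    for version in clients:
        if parse_version(version) < server:
            problems.append(f"client {version} is older than servers {mds}")
    return problems
```

For example, `check_versions("2.7.0", ["2.7.0", "2.7.0"], ["2.7.0"])` returns an empty list, while an OSS on a different version or an older client each produce one message.
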
>> 
>> Regards
>> 
>> On 03-12-2015 16:13, Ray Muno wrote:
>>> I am trying to set up a test deployment of Lustre 2.7.
>>> 
>>> I pulled RPMs from http://lustre.org/download/ and installed them on
>>> a set of servers running Scientific Linux 6.6, which seems to be a
>>> proper OS for deployment.  Everything installs and I can format the
>>> filesystems on the MDS (1) and OSS (2) servers. When I try to mount
>>> the OST file systems, I get communication errors. I can "lctl ping"
>>> the servers from each other, but cannot establish communication
>>> between the MDS and OSS.
>>> 
>>> The installation is on servers connected over InfiniBand (Qlogic DDR
>>> 4X).
>>> 
>>> In trying to diagnose the issues related to the error messages, I
>>> found mention in some list discussions that o2ib is broken in the
>>> 2.6.32-504.8.1 kernel.
>>> 
>>> After much frustration, I pulled a nightly build from
>>> build.hpdd.intel.com (kernel
>>> 2.6.32-573.8.1.el6_lustre.g8438f2a.x86_64) and tried the same set up.
>>> Everything worked as I expected.
>>> 
>>> Am I missing something? Is the default release pointed to at
>>> https://downloads.hpdd.intel.com/ for 2.7 broken in some way? Is it
>>> just the hardware I am trying to deploy against?
>>> 
>>> I can provide specifics about the errors I see; I am just posting
>>> this to make sure I am pulling the Lustre RPMs from the proper
>>> source.
>> _______________________________________________
>> lustre-discuss mailing list
>> lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

