[lustre-discuss] Lustre 2.7 deployment issues

Jon Tegner tegner at foi.se
Fri Dec 4 03:18:38 PST 2015


Hi,

Where do you find the 2.7.x releases? I thought fixes were only released
for the Intel maintenance version.

Regards,

/jon

On 12/04/2015 11:43 AM, jerome.becot at inserm.fr wrote:
> Hello Ray,
>
> One consideration first: you are trying version 2.7, which is not the
> production release (that is 2.5). From that perspective, whether you run
> 2.7.0 or 2.7.x won't make much difference; it is the development release.
>
> Then, if I understand correctly, the problem comes from the InfiniBand
> driver module, which is buggy in the 2.6.32-504.8.1 kernel, meaning that
> you have to update the kernel to fix it. After that update, the 2.7.0
> version on the site, compiled against an older kernel version, may
> refuse to load (because kernel modules, i.e. the Lustre ones here, rely
> on kernel features that may change between kernel versions, making them
> incompatible).
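
The mismatch described above can be spotted before trying to load anything. Below is a minimal Python sketch, assuming `modinfo` is available and that the first token of a module's `vermagic` string is the kernel release it was built for. The function names and the sample strings in the comments are illustrative only, not taken from the thread. A differing release is only a hint; whether the module actually loads is decided by the kernel's own compatibility checks.

```python
import platform
import subprocess


def built_for_release(vermagic: str) -> str:
    """Return the kernel release a module was built for.

    Assumes the vermagic string looks like
    '2.6.32-504.8.1.el6.x86_64 SMP mod_unload modversions'
    (illustrative sample), where the first token is the release.
    """
    parts = vermagic.split()
    return parts[0] if parts else ""


def module_matches_kernel(vermagic: str, running_release: str) -> bool:
    """True if the module's build release equals the running kernel release."""
    return built_for_release(vermagic) == running_release


def check_module(name: str = "lustre") -> bool:
    """Query modinfo for the module and compare against the running kernel.

    The default module name is an assumption; pass whichever module you
    want to inspect.
    """
    out = subprocess.run(
        ["modinfo", "-F", "vermagic", name],
        capture_output=True, text=True, check=True,
    ).stdout
    return module_matches_kernel(out, platform.release())
```

Calling `check_module()` on a client after a kernel update gives a quick indication of whether the installed modules were built for the kernel that is now running.
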
>
> In any case, you can try to rebuild the 2.7.0 version from source
> against your new kernel. The procedure is quite easy:
>
> https://wiki.hpdd.intel.com/display/PUB/Rebuilding+the+Lustre-client+rpms+for+a+new+kernel 
>
>
> It will rebuild the 2.7.0 client against your newer kernel with the
> working InfiniBand modules, but stability is not guaranteed, as the
> 2.7 branch is under development anyway.
>
> Or, if you can't rebuild, use a precompiled one from the build site
> (some nasty bugs in the base 2.x.0 version are fixed in the latest builds).
>
> The one requirement is to run the very same version on the MDS and OSS,
> and the same or a newer version on the clients.
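
The version rule above is easy to check mechanically. Here is a minimal Python sketch, assuming versions are purely numeric dotted strings with the same number of parts (e.g. "2.7.0"). The function names and the way versions are collected are hypothetical; this only encodes the rule as stated in the message.

```python
from typing import Iterable, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '2.7.0' into (2, 7, 0). Assumes numeric dotted components only."""
    return tuple(int(part) for part in text.split("."))


def check_versions(mds: str, oss_list: Iterable[str],
                   clients: Iterable[str]) -> List[str]:
    """Return a list of violations of the rule quoted above.

    Servers: every OSS must run exactly the MDS version.
    Clients: each must run the same or a newer version than the servers.
    """
    problems = []
    server = parse_version(mds)
    for version in oss_list:
        if parse_version(version) != server:
            problems.append(f"OSS {version} differs from MDS {mds}")
    for version in clients:
        if parse_version(version) < server:
            problems.append(f"client {version} is older than servers {mds}")
    return problems
```

An empty list means the set of versions satisfies the rule; each entry otherwise names one offending node version.
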
>
> Regards
>
> On 03-12-2015 16:13, Ray Muno wrote:
>> I am trying to set up a test deployment of Lustre 2.7.
>>
>> I pulled RPMs from http://lustre.org/download/ and installed them on a
>> set of servers running Scientific Linux 6.6, which seems to be a proper
>> OS for deployment.  Everything installs and I can format the
>> file systems on the MDS (1) and OSS (2) servers. When I try to mount
>> the OST file systems, I get communication errors. I can "lctl ping"
>> the servers from each other, but cannot establish communication
>> between the MDS and OSS.
>>
>> The installation is on servers connected over InfiniBand (QLogic DDR 4X).
>>
>> In trying to diagnose the issues related to the error messages, I
>> found mention in some list discussions that o2ib is broken in the
>> 2.6.32-504.8.1 kernel.
>>
>> After much frustration, I pulled a nightly build from
>> build.hpdd.intel.com (kernel
>> 2.6.32-573.8.1.el6_lustre.g8438f2a.x86_64) and tried the same set up.
>> Everything worked as I expected.
>>
>> Am I missing something? Is the default release pointed to at
>> https://downloads.hpdd.intel.com/ for 2.7 broken in some way? Is it
>> just the hardware I am trying to deploy against?
>>
>> I can provide specifics about the errors I see; I am just posting this
>> to make sure I am pulling the Lustre RPMs from the proper source.
