[Lustre-discuss] RHEL5's OFED with lustre1.8.2 on IB
Brian J. Murrell
Brian.Murrell at Sun.COM
Thu Apr 8 08:19:25 PDT 2010
On Thu, 2010-04-08 at 10:56 -0400, Lawrence Sorrillo wrote:
> I am about to try to build lustre again as I am getting hangs with the
> lustre mounts in my previous build.
>
> "Apr 7 09:09:30 host0 kernel: LustreError:
> 5270:0:(o2iblnd_cb.c:2883:kiblnd_check_txs()) Timed out tx: active_txs,
> 9 seconds
> Apr 7 09:09:30 host0 kernel: LustreError:
> 5270:0:(o2iblnd_cb.c:2945:kiblnd_check_conns()) Timed out RDMA with
> 172.17.1.108@o2ib (84)"
What makes you think that this is a software problem and that rebuilding
the software stack will resolve it? FWIW, every time I have seen this
type of problem reported, the fabric was flaky.
> Here is the plan. Lustre 1.8.2 on rhel5 x86_64 using the ofed in the rhel5 kernel.
If that's not already what you mean, why not just use the pre-built
packages that we build and test extensively in our QA department?
> I have gathered the following packages from the lustre site:
> e2fsprogs-1.41.6.sun1-0redhat.rhel5.x86_64.rpm
> kernel-2.6.18-164.6.1.0.1.el5.src.rpm
Why do you need a kernel src.rpm?
> lustre-client-1.8.2-2.6.18_164.6.1.0.1.el5_lustre.1.8.2.x86_64.rpm
> lustre-client-modules-1.8.2-2.6.18_164.6.1.0.1.el5_lustre.1.8.2.x86_64.rpm
>
> I want to get the kernel-2.6.18-164.6.1.0.1.el5.x86_64.rpm binary from
> kernel-2.6.18-164.6.1.0.1.el5.src.rpm.
Why not just use the binary kernel we provide instead of rebuilding your
own? It's the *exact* same kernel that we used in our QA testing and
therefore a known quantity.
b.