[Lustre-discuss] Lustre installation and configuration problems

Carlos Santana neubyr at gmail.com
Fri Jun 19 11:51:37 PDT 2009


Guys,

Thanks a lot for all the help.
I was able to build a patchless client from source. The basic verification
tests (Unix commands) were successful.

I did have an issue with the latest CentOS kernel, 2.6.18-128.el5, though. I
started with a minimal install (without gcc) and then installed gcc through
yum, which depends on the kernel-headers package. By default CentOS 5.2
selects that package from the updates repo, so one may end up with
2.6.18-92.el5 for the kernel and 2.6.18-128.el5 for kernel-headers. I also
tried building against the latest 2.6.18-128.el5 kernel, but that ran into
the issue pointed out here:
http://lists.lustre.org/pipermail/lustre-discuss/2009-May/010560.html (bug
fixed: https://bugzilla.lustre.org/show_bug.cgi?id=19024 ).
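
For anyone hitting the same kernel / kernel-headers mismatch, a quick check
before building might look like the sketch below. The function name
kernel_matches_headers is made up for illustration, and it assumes a plain
kernel flavor whose 'uname -r' string has no suffix such as PAE or xen.

```shell
#!/bin/sh
# Hypothetical helper (not part of Lustre or CentOS): compare the running
# kernel release with the version-release of the kernel-headers package.
# Assumes a plain kernel flavor, i.e. no PAE/xen suffix in `uname -r`.
kernel_matches_headers() {
    running=$1   # e.g. the output of: uname -r
    headers=$2   # e.g. the output of: rpm -q --qf '%{VERSION}-%{RELEASE}' kernel-headers
    if [ "$running" = "$headers" ]; then
        echo "match: $running"
        return 0
    fi
    echo "mismatch: running kernel $running, kernel-headers $headers"
    return 1
}

# Typical use on the client (needs rpm and an installed kernel-headers package):
#   kernel_matches_headers "$(uname -r)" \
#       "$(rpm -q --qf '%{VERSION}-%{RELEASE}' kernel-headers)"
```

With the versions from my box, 2.6.18-92.el5 against 2.6.18-128.el5, this
reports a mismatch and returns non-zero, so it can gate a build script.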

Thank you, everyone.
Excited to get started with Lustre.

-
CS.


On Wed, Jun 17, 2009 at 7:36 PM, Arden Wiebe <albert682 at yahoo.com> wrote:

>
> Carlos:
>
> This client of mine works. As a matter of fact, it works on all my clients.
>
> [root at lustreone]# rpm -qa | grep -i lustre
> lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0
>
> Otherwise your output for the same command lists only 2 packages installed
> so you are missing some packages - those being the client packages if you
> don't want to use the patched kernel method of making a client as I have
> done above.  If you issue the rpm commands I mentioned in the very first
> response of this thread you will have a working client.
>
> Arden
>
> --- On Wed, 6/17/09, Carlos Santana <neubyr at gmail.com> wrote:
>
> > From: Carlos Santana <neubyr at gmail.com>
> > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
> > To: "Jerome, Ron" <Ron.Jerome at nrc-cnrc.gc.ca>
> > Cc: lustre-discuss at lists.lustre.org
> > Date: Wednesday, June 17, 2009, 5:10 PM
> > Folks,
> >
> > It has been unsuccessful so far.
> >
> > I made a fresh CentOS 5.2 minimal install (2.6.18-92.el5). Later, I
> > updated the kernel to version 2.6.18-92.1.17. Here is the output from
> > uname and the rpm query:
> >
> > [root at localhost ~]# rpm -qa | grep lustre
> > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > [root at localhost ~]# uname -a
> > Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4 13:45:01 EST 2008 i686 i686 i386 GNU/Linux
> >
> > Other details:
> > --- --- ---
> > [root at localhost ~]# ls -l /lib/modules | grep 2.6
> > drwxr-xr-x 6 root root 4096 Jun 17 18:47 2.6.18-92.1.17.el5
> > drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5
> >
> >
> > [root at localhost modules]# find . | grep lustre
> > ./2.6.18-92.1.17.el5/kernel/net/lustre
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko
> > ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko
> > ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko
> > --- --- ---
> >
> >
> > I am still having the same problem. Am I missing anything?
> > I also tried a source install for a 'patchless client', but I got
> > the same results.
> >
> > Are there any configuration steps needed after the rpm (or source)
> > installation? The one that I know of is restricting interfaces in
> > modprobe.conf; I have tried with and without it, with no success.
> > Could anyone please suggest any debugging steps or tests? What
> > further output can I provide to help diagnose this? Any insights?
> >
> > Also, I have a suggestion. It might be a good idea for the RPM
> > installation to check 'uname -r' for a matching kernel version and,
> > if it does not match, to suggest a source install.
> >
> > Thanks for the help. I really appreciate your patience.
> >
> > -
> > Thanks,
> > CS.
> >
> >
> > On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron <Ron.Jerome at nrc-cnrc.gc.ca> wrote:
> > > I think the problem you have, as Cliff alluded to, is a mismatch
> > > between your kernel version and the kernel version of the Lustre
> > > modules.
> > >
> > > You have kernel “2.6.18-92.el5” and are installing Lustre
> > > “2.6.18_92.1.17.el5”. Note that the “.1.17” is significant, as the
> > > modules will end up in the wrong directory. There is a CentOS update
> > > that brings the kernel to the matching 2.6.18_92.1.17.el5 version;
> > > you can pull it off the CentOS mirror site in the updates directory.
> > >
> > > Ron.
> > >
> > >
> > >
> > > From: lustre-discuss-bounces at lists.lustre.org
> > > [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlos Santana
> > > Sent: June 17, 2009 11:21 AM
> > > To: lustre-discuss at lists.lustre.org
> > > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
> > >
> > >
> > >
> > > And is there any specific installation order for a patchless client?
> > > Could someone please share it with me?
> > >
> > > -
> > > CS.
> > >
> > > On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana <neubyr at gmail.com> wrote:
> > >
> > > Huh... :( Sorry to bug you guys again...
> > >
> > > I am planning to make a fresh start now, as nothing seems to have
> > > worked for me. If you have any comments/feedback, please share them.
> > >
> > > I would like to confirm the installation order before I make a fresh
> > > start. From Arden's experience:
> > > http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html ,
> > > the lustre-modules package is installed last. As I was installing
> > > Lustre 1.8, I was referring to the 1.8 operations manual:
> > > http://manual.lustre.org/index.php?title=Main_Page . The installation
> > > order in the manual is different from what Arden has suggested.
> > >
> > > Will it make a difference in configuration at a later stage? Which one
> > > should I follow now?
> > > Any comments?
> > >
> > > Thanks,
> > > CS.
> > >
> > >
> > >
> > > On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana <neubyr at gmail.com> wrote:
> > >
> > > Thanks Cliff.
> > >
> > > The depmod -a was successful before as well. I am using a CentOS 5.2
> > > box. The following packages are installed:
> > > [root at localhost tmp]# rpm -qa | grep -i lustre
> > > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > >
> > > [root at localhost tmp]# uname -a
> > > Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
> > >
> > > And here is the output from strace for mount:
> > > http://www.heypasteit.com/clip/8WT
> > >
> > > Any further debugging hints?
> > >
> > > Thanks,
> > > CS.
> > >
> > > On 6/16/09, Cliff White <Cliff.White at sun.com> wrote:
> > >> Carlos Santana wrote:
> > >>> The '$ modprobe -l lustre*' did not show any module on a patchless
> > >>> client. modprobe -v returns 'FATAL: Module lustre not found'.
> > >>>
> > >>> How do I install a patchless client?
> > >>> I have tried lustre-client-modules and lustre-client-ver rpm
> > >>> packages in both sequences. Am I missing anything?
> > >>>
> > >>
> > >> Make sure the lustre-client-modules package matches your running
> > >> kernel. Run depmod -a to be sure.
> > >> cliffw
> > >>> Thanks,
> > >>> CS.
> > >>>
> > >>>
> > >>>
> > >>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White <Cliff.White at sun.com> wrote:
> > >>>
> > >>>     Carlos Santana wrote:
> > >>>
> > >>>         The lctl ping and 'net up' failed with the following messages:
> > >>>         --- ---
> > >>>         [root at localhost ~]# lctl ping 10.0.0.42
> > >>>         opening /dev/lnet failed: No such device
> > >>>         hint: the kernel modules may not be loaded
> > >>>         failed to ping 10.0.0.42 at tcp: No such device
> > >>>
> > >>>         [root at localhost ~]# lctl network up
> > >>>         opening /dev/lnet failed: No such device
> > >>>         hint: the kernel modules may not be loaded
> > >>>         LNET configure error 19: No such device
> > >>>
> > >>>
> > >>>     Make sure modules are unloaded, then try modprobe -v.
> > >>>     Looks like you have lnet mis-configured; if your module options
> > >>>     are wrong, you will see an error during the modprobe.
> > >>>     cliffw
> > >>>
> > >>>         --- ---
> > >>>
> > >>>
> > >>>         I tried the lustre_rmmod and depmod commands and they did not
> > >>>         return any error messages. Any further clues? Reinstall the
> > >>>         patchless client again?
> > >>>
> > >>>         -
> > >>>         CS.
> > >>>
> > >>>
> > >>>         On Tue, Jun 16, 2009 at 1:32 PM, Cliff White <Cliff.White at sun.com> wrote:
> > >>>
> > >>>            Carlos Santana wrote:
> > >>>
> > >>>                I was able to run lustre_rmmod and depmod successfully.
> > >>>                The '$lctl list_nids' returned the server ip address and
> > >>>                interface (tcp0).
> > >>>
> > >>>                I tried to mount the file system on a remote client, but
> > >>>                it failed with the following message.
> > >>>                --- ---
> > >>>                [root at localhost ~]# mount -t lustre 10.0.0.42 at tcp0:/lustre /mnt/lustre
> > >>>                mount.lustre: mount 10.0.0.42 at tcp0:/lustre at /mnt/lustre failed: No such device
> > >>>                Are the lustre modules loaded?
> > >>>                Check /etc/modprobe.conf and /proc/filesystems
> > >>>                Note 'alias lustre llite' should be removed from modprobe.conf
> > >>>                --- ---
> > >>>
> > >>>                However, the mounting is successful on a single node
> > >>>                configuration - with client on the same machine as MDS
> > >>>                and OST.
> > >>>                Any clues? Where to look for logs and debug messages?
> > >>>
> > >>>
> > >>>            Syslog || /var/log/messages is the normal place.
> > >>>
> > >>>            You can use 'lctl ping' to verify that the client can reach
> > >>>            the server.
> > >>>            Usually in these cases, it's a network/name misconfiguration.
> > >>>
> > >>>            Run 'tunefs.lustre --print' on your servers, and verify that
> > >>>            mgsnode= is correct.
> > >>>
> > >>>            cliffw
> > >>>
> > >>>
> > >>>                Thanks,
> > >>>                CS.
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>                On Tue, Jun 16, 2009 at 12:16 PM, Cliff White <Cliff.White at sun.com> wrote:
> > >>>
> > >>>                   Carlos Santana wrote:
> > >>>
> > >>>                       Thanks Kevin..
> > >>>
> > >>>                   Please read:
> > >>>                   http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
> > >>>
> > >>>                   Those instructions are identical for 1.6 and 1.8.
> > >>>
> > >>>                   For current lustre, only two commands are used for
> > >>>                   configuration: mkfs.lustre and mount.
> > >>>
> > >>>
> > >>>                   Usually when lustre_rmmod returns that error, you run
> > >>>                   it a second time, and it will clear things. Unless you
> > >>>                   have live mounts or network connections.
> > >>>
> > >>>                   cliffw
> > >>>
> > >>>
> > >>>                       I am referring to the 1.8 manual, but I was also
> > >>>                       referring to the HowTo page on the wiki, which seems
> > >>>                       to be for 1.6. The HowTo page
> > >>>                       http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
> > >>>                       mentions lmc, lconf, and lctl.
> > >>>
> > >>>                       The modules are installed in the right place. The
> > >>>                       '$ lustre_rmmod' resulted in the following output:
> > >>>
> > >>>                       [root at localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
> > >>>                       ERROR: Module obdfilter is in use
> > >>>                       ERROR: Module ost is in use
> > >>>                       ERROR: Module mds is in use
> > >>>                       ERROR: Module fsfilt_ldiskfs is in use
> > >>>                       ERROR: Module mgs is in use
> > >>>                       ERROR: Module mgc is in use by mgs
> > >>>                       ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
> > >>>                       ERROR: Module lov is in use
> > >>>                       ERROR: Module lquota is in use by obdfilter,mds
> > >>>                       ERROR: Module osc is in use
> > >>>                       ERROR: Module ksocklnd is in use
> > >>>                       ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
> > >>>                       ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
> > >>>                       ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
> > >>>                       ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
> > >>>                       ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
> > >>>
> > >>>                       Do I need to shut down these services? How can I do
> > >>>                       that?
> > >>>
> > >>>                       Thanks,
> > >>>                       CS.
> > >>>
> > _______________________________________________
> > Lustre-discuss mailing list
> > Lustre-discuss at lists.lustre.org
> > http://lists.lustre.org/mailman/listinfo/lustre-discuss
> >
>
>
>
>