[Lustre-discuss] Lustre installation and configuration problems

Arden Wiebe albert682 at yahoo.com
Wed Jun 17 17:36:20 PDT 2009


Carlos:

This setup works on this client of mine. As a matter of fact, it works on all my clients.

[root at lustreone]# rpm -qa | grep -i lustre
lustre-ldiskfs-3.0.8-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
kernel-lustre-smp-2.6.18-92.1.17.el5_lustre.1.8.0

Your output for the same command, by contrast, lists only two installed packages, so you are missing some. If you don't want to use the patched-kernel method of building a client, as I have done above, the missing ones are the client packages. If you issue the rpm commands I mentioned in the very first response of this thread, you will have a working client.
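
One quick way to see whether a box has Lustre modules for the kernel it is actually running is to look for lustre.ko under /lib/modules/$(uname -r). Below is a minimal sketch of that check, assuming a POSIX shell. The function names has_lustre_module and report_lustre_kernel_match are made up for illustration and are not part of Lustre or its packages.

```shell
#!/bin/sh
# Sketch only: check whether a lustre.ko exists for a given kernel release.
# Both function names are hypothetical helpers, not Lustre tools.

# has_lustre_module KERNEL_RELEASE [MODULES_ROOT]
# Succeeds if lustre.ko exists somewhere under MODULES_ROOT/KERNEL_RELEASE.
has_lustre_module() {
    release=$1
    root=${2:-/lib/modules}
    # No module directory for this kernel release at all.
    [ -d "$root/$release" ] || return 1
    found=$(find "$root/$release" -name 'lustre.ko' -print 2>/dev/null | head -n 1)
    [ -n "$found" ]
}

# report_lustre_kernel_match [KERNEL_RELEASE] [MODULES_ROOT]
# Defaults to the running kernel (uname -r) and /lib/modules.
report_lustre_kernel_match() {
    release=${1:-$(uname -r)}
    root=${2:-/lib/modules}
    if has_lustre_module "$release" "$root"; then
        echo "OK: lustre.ko found for kernel $release"
    else
        echo "MISSING: no lustre.ko under $root/$release"
        return 1
    fi
}
```

Calling report_lustre_kernel_match with no arguments checks the running kernel. It only looks at files on disk and does not load or unload any module, so a MISSING result just means modprobe would have nothing to find for that kernel.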

Arden

--- On Wed, 6/17/09, Carlos Santana <neubyr at gmail.com> wrote:

> From: Carlos Santana <neubyr at gmail.com>
> Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
> To: "Jerome, Ron" <Ron.Jerome at nrc-cnrc.gc.ca>
> Cc: lustre-discuss at lists.lustre.org
> Date: Wednesday, June 17, 2009, 5:10 PM
> Folks,
> 
> It has been unsuccessful so far.
> 
> I made a fresh CentOS 5.2 minimum install (2.6.18-92.el5). Later, I
> updated the kernel to version 2.6.18-92.1.17. Here is the output from
> uname and an rpm query:
> 
> [root at localhost ~]# rpm -qa | grep lustre
> lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> [root at localhost ~]# uname -a
> Linux localhost.localdomain 2.6.18-92.1.17.el5 #1 SMP Tue Nov 4 13:45:01 EST 2008 i686 i686 i386 GNU/Linux
> 
> Other details:
> --- --- ---
> [root at localhost ~]# ls -l /lib/modules | grep 2.6
> drwxr-xr-x 6 root root 4096 Jun 17 18:47 2.6.18-92.1.17.el5
> drwxr-xr-x 6 root root 4096 Jun 17 17:38 2.6.18-92.el5
> 
> 
> [root at localhost modules]# find . | grep lustre
> ./2.6.18-92.1.17.el5/kernel/net/lustre
> ./2.6.18-92.1.17.el5/kernel/net/lustre/libcfs.ko
> ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet.ko
> ./2.6.18-92.1.17.el5/kernel/net/lustre/ksocklnd.ko
> ./2.6.18-92.1.17.el5/kernel/net/lustre/ko2iblnd.ko
> ./2.6.18-92.1.17.el5/kernel/net/lustre/lnet_selftest.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/osc.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/ptlrpc.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdecho.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/lvfs.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/mgc.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/llite_lloop.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/lov.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/mdc.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/lquota.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/lustre.ko
> ./2.6.18-92.1.17.el5/kernel/fs/lustre/obdclass.ko
> --- --- ---
> 
> 
> I am still having the same problem, and I wonder whether I am
> missing something.
> I also tried a source install of the 'patchless client', but the
> results were the same.
> 
> Are there any configuration steps needed after the rpm (or source)
> installation? The only one I know of is restricting interfaces in
> modprobe.conf, and I have tried with and without it, with no success.
> Could anyone please suggest debugging steps or tests? What further
> output would be most useful for me to provide? Any insights?
> 
> Also, I have a suggestion. It might be a good idea for the RPM
> installation to check 'uname -r' for a matching kernel version and,
> if it does not match, to suggest a source install.
> 
> Thanks for the help. I really appreciate your patience.
> 
> -
> Thanks,
> CS.
> 
> 
> On Wed, Jun 17, 2009 at 10:40 AM, Jerome, Ron <Ron.Jerome at nrc-cnrc.gc.ca> wrote:
> > I think the problem you have, as Cliff alluded to, is a mismatch between
> > your kernel version and the kernel version the Lustre modules were built for.
> >
> > You have kernel “2.6.18-92.el5” and are installing Lustre
> > “2.6.18_92.1.17.el5”. Note that the “.1.17” is significant, as the modules will
> > end up in the wrong directory. There is a CentOS update that brings the
> > kernel to the matching 2.6.18_92.1.17.el5 version; you can pull it from the
> > updates directory on the CentOS mirror site.
> >
> >
> >
> >
> >
> > Ron.
> >
> >
> >
> > From: lustre-discuss-bounces at lists.lustre.org
> > [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlos Santana
> > Sent: June 17, 2009 11:21 AM
> > To: lustre-discuss at lists.lustre.org
> > Subject: Re: [Lustre-discuss] Lustre installation and configuration problems
> >
> >
> >
> > And is there any specific installation order for the patchless client? Could
> > someone please share it with me?
> >
> > -
> > CS.
> >
> > On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana <neubyr at gmail.com> wrote:
> >
> > Huh... :( Sorry to bug you guys again...
> >
> > I am planning to make a fresh start now, as nothing seems to have worked for
> > me. If you have any comments or feedback, please share them.
> >
> > I would like to confirm the installation order before I start over. From
> > Arden's experience
> > (http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html), the
> > lustre-modules package is installed last. As I was installing Lustre 1.8, I was
> > referring to the 1.8 operations manual
> > (http://manual.lustre.org/index.php?title=Main_Page). The installation order
> > in the manual is different from what Arden suggested.
> >
> > Will it make a difference to configuration at a later stage? Which one should
> > I follow now?
> > Any comments?
> >
> > Thanks,
> > CS.
> >
> >
> >
> > On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana <neubyr at gmail.com> wrote:
> >
> > Thanks Cliff.
> >
> > The depmod -a was successful before as well. I am using a CentOS 5.2
> > box. The following packages are installed:
> > [root at localhost tmp]# rpm -qa | grep -i lustre
> > lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> > lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
> >
> > [root at localhost tmp]# uname -a
> > Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
> >
> > And here is the output from strace for mount:
> > http://www.heypasteit.com/clip/8WT
> >
> > Any further debugging hints?
> >
> > Thanks,
> > CS.
> >
> > On 6/16/09, Cliff White <Cliff.White at sun.com> wrote:
> >> Carlos Santana wrote:
> >>> Running '$ modprobe -l lustre*' did not show any modules on a patchless
> >>> client. modprobe -v returns 'FATAL: Module lustre not found'.
> >>>
> >>> How do I install a patchless client?
> >>> I have tried the lustre-client-modules and lustre-client-ver rpm packages in
> >>> both sequences. Am I missing anything?
> >>>
> >>
> >> Make sure the lustre-client-modules package matches your running kernel.
> >> Run depmod -a to be sure.
> >> cliffw
> >>
> >>> Thanks,
> >>> CS.
> >>>
> >>>
> >>>
> >>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White <Cliff.White at sun.com> wrote:
> >>>
> >>>     Carlos Santana wrote:
> >>>
> >>>         The lctl ping and 'net up' failed with the following messages:
> >>>         --- ---
> >>>         [root at localhost ~]# lctl ping 10.0.0.42
> >>>         opening /dev/lnet failed: No such device
> >>>         hint: the kernel modules may not be loaded
> >>>         failed to ping 10.0.0.42 at tcp: No such device
> >>>
> >>>         [root at localhost ~]# lctl network up
> >>>         opening /dev/lnet failed: No such device
> >>>         hint: the kernel modules may not be loaded
> >>>         LNET configure error 19: No such device
> >>>
> >>>
> >>>     Make sure the modules are unloaded, then try modprobe -v.
> >>>     It looks like you have lnet misconfigured; if your module options are
> >>>     wrong, you will see an error during the modprobe.
> >>>     cliffw
> >>>
> >>>         --- ---
> >>>
> >>>
> >>>         I tried the lustre_rmmod and depmod commands, and they did not return
> >>>         any error messages. Any further clues? Should I reinstall the patchless
> >>>         client?
> >>>
> >>>         -
> >>>         CS.
> >>>
> >>>
> >>>         On Tue, Jun 16, 2009 at 1:32 PM, Cliff White <Cliff.White at sun.com> wrote:
> >>>
> >>>            Carlos Santana wrote:
> >>>
> >>>                I was able to run lustre_rmmod and depmod successfully. The
> >>>                '$lctl list_nids' command returned the server IP address and
> >>>                interface (tcp0).
> >>>
> >>>                I tried to mount the file system on a remote client, but it
> >>>                failed with the following message.
> >>>                --- ---
> >>>                [root at localhost ~]# mount -t lustre 10.0.0.42 at tcp0:/lustre /mnt/lustre
> >>>                mount.lustre: mount 10.0.0.42 at tcp0:/lustre at /mnt/lustre failed: No such device
> >>>                Are the lustre modules loaded?
> >>>                Check /etc/modprobe.conf and /proc/filesystems
> >>>                Note 'alias lustre llite' should be removed from modprobe.conf
> >>>                --- ---
> >>>
> >>>                However, mounting succeeds in a single-node configuration,
> >>>                with the client on the same machine as the MDS and OST.
> >>>                Any clues? Where should I look for logs and debug messages?
> >>>
> >>>
> >>>            Syslog || /var/log/messages is the normal place.
> >>>
> >>>            You can use 'lctl ping' to verify that the client can reach the server.
> >>>            Usually in these cases, it's a network/name misconfiguration.
> >>>
> >>>            Run 'tunefs.lustre --print' on your servers, and verify that mgsnode=
> >>>            is correct.
> >>>
> >>>            cliffw
> >>>
> >>>
> >>>                Thanks,
> >>>                CS.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>                On Tue, Jun 16, 2009 at 12:16 PM, Cliff White <Cliff.White at sun.com> wrote:
> >>>
> >>>                   Carlos Santana wrote:
> >>>
> >>>                       Thanks Kevin..
> >>>
> >>>                   Please read:
> >>>
> >>>
> >>>
> >>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
> >>>
> >>>                   Those instructions are identical for 1.6 and 1.8.
> >>>
> >>>                   For current Lustre, only two commands are used for configuration:
> >>>                   mkfs.lustre and mount.
> >>>
> >>>                   Usually when lustre_rmmod returns that error, you run it a second
> >>>                   time, and it will clear things, unless you have live mounts or
> >>>                   network connections.
> >>>
> >>>                   cliffw
> >>>
> >>>
> >>>                       I am referring to the 1.8 manual, but I was also referring to the
> >>>                       HowTo page on the wiki, which seems to be for 1.6. The HowTo page
> >>>                       http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
> >>>                       mentions lmc, lconf, and lctl.
> >>>
> >>>                       The modules are installed in the right place. Running
> >>>                       '$ lustre_rmmod' produced the following output:
> >>>                       [root at localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
> >>>                       ERROR: Module obdfilter is in use
> >>>                       ERROR: Module ost is in use
> >>>                       ERROR: Module mds is in use
> >>>                       ERROR: Module fsfilt_ldiskfs is in use
> >>>                       ERROR: Module mgs is in use
> >>>                       ERROR: Module mgc is in use by mgs
> >>>                       ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
> >>>                       ERROR: Module lov is in use
> >>>                       ERROR: Module lquota is in use by obdfilter,mds
> >>>                       ERROR: Module osc is in use
> >>>                       ERROR: Module ksocklnd is in use
> >>>                       ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
> >>>                       ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
> >>>                       ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
> >>>                       ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
> >>>                       ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
> >>>
> >>>                       Do I need to shut down these services? How can I do that?
> >>>
> >>>                       Thanks,
> >>>                       CS.
> >>>
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 


      


