[Lustre-discuss] Lustre installation and configuration problems
Jerome, Ron
Ron.Jerome at nrc-cnrc.gc.ca
Wed Jun 17 08:40:48 PDT 2009
I think the problem you have, as Cliff alluded to, is a mismatch between
your running kernel version and the kernel version the Lustre modules
were built for.
You have kernel "2.6.18-92.el5" and are installing Lustre
"2.6.18_92.1.17.el5". Note that the ".1.17" is significant, as the
modules will end up in the wrong directory. There is a CentOS update
that brings the kernel to the matching 2.6.18_92.1.17.el5 version; you
can pull it from the updates directory on the CentOS mirror site.
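
The comparison described above can be sketched as a small shell check.
This is only an illustration: kernel_matches is a hypothetical helper,
not part of Lustre, and it assumes a package matches when its kernel
string (with the first "_" turned back into "-") begins with the
running kernel release.

```shell
#!/bin/sh
# kernel_matches RUNNING PKG_KERNEL   (hypothetical helper, not a Lustre tool)
# RUNNING is a kernel release as printed by `uname -r`.
# PKG_KERNEL is the kernel part of a Lustre module package name.
# Returns 0 if the package kernel string starts with the running release.
kernel_matches() {
    running=$1
    # Package names spell the release with '_' where the kernel uses '-'
    # (2.6.18_92... vs 2.6.18-92...); convert the first '_' only, since
    # the later "_lustre" suffix is part of the patched kernel's name.
    pkg_kernel=$(printf '%s\n' "$2" | sed 's/_/-/')
    case $pkg_kernel in
        "$running"*) return 0 ;;   # prefix match (an assumption, see above)
        *)           return 1 ;;
    esac
}

# Values taken from this thread: the running kernel lacks the ".1.17".
kernel_matches "2.6.18-92.el5" "2.6.18_92.1.17.el5_lustre.1.8.0smp" \
    || echo "mismatch: modules were built for a different kernel"
```

On a live system the first argument would be "$(uname -r)" and the
second would come from the name reported by rpm -qa | grep -i lustre.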
Ron.
From: lustre-discuss-bounces at lists.lustre.org
[mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Carlos
Santana
Sent: June 17, 2009 11:21 AM
To: lustre-discuss at lists.lustre.org
Subject: Re: [Lustre-discuss] Lustre installation and configuration
problems
And is there any specific installation order for a patchless client?
Could someone please share it with me?
-
CS.
On Wed, Jun 17, 2009 at 10:18 AM, Carlos Santana <neubyr at gmail.com>
wrote:
Huh... :( Sorry to bug you guys again...
I am planning to make a fresh start now as nothing seems to have worked
for me. If you have any comments/feedback please share them.
I would like to confirm the installation order before I make a fresh
start. From Arden's experience:
http://lists.lustre.org/pipermail/lustre-discuss/2009-June/010710.html ,
the lustre-modules package is installed last. As I was installing
Lustre 1.8, I was referring to the 1.8 operations manual
http://manual.lustre.org/index.php?title=Main_Page . The installation
order in the manual is different from what Arden has suggested.
Will it make a difference in configuration at a later stage? Which one
should I follow now?
Any comments?
Thanks,
CS.
On Wed, Jun 17, 2009 at 12:35 AM, Carlos Santana <neubyr at gmail.com>
wrote:
Thanks Cliff.
The depmod -a was successful before as well. I am using a CentOS 5.2
box. The following packages are installed:
[root@localhost tmp]# rpm -qa | grep -i lustre
lustre-modules-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
[root@localhost tmp]# uname -a
Linux localhost.localdomain 2.6.18-92.el5 #1 SMP Tue Jun 10 18:49:47 EDT 2008 i686 i686 i386 GNU/Linux
And here is the output from strace for mount:
http://www.heypasteit.com/clip/8WT
Any further debugging hints?
Thanks,
CS.
On 6/16/09, Cliff White <Cliff.White at sun.com> wrote:
> Carlos Santana wrote:
>> The '$ modprobe -l lustre*' did not show any module on a patchless
>> client. modprobe -v returns 'FATAL: Module lustre not found'.
>>
>> How do I install a patchless client?
>> I have tried lustre-client-modules and lustre-client-ver rpm packages
>> in both sequences. Am I missing anything?
>>
>
> Make sure the lustre-client-modules package matches your running
> kernel.
> Run depmod -a to be sure.
> cliffw
>
>> Thanks,
>> CS.
>>
>>
>>
>> On Tue, Jun 16, 2009 at 2:28 PM, Cliff White <Cliff.White at sun.com>
>> wrote:
>>
>> Carlos Santana wrote:
>>
>> The lctl ping and 'network up' failed with the following
>> messages:
>> --- ---
>> [root@localhost ~]# lctl ping 10.0.0.42
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> failed to ping 10.0.0.42@tcp: No such device
>>
>> [root@localhost ~]# lctl network up
>> opening /dev/lnet failed: No such device
>> hint: the kernel modules may not be loaded
>> LNET configure error 19: No such device
>>
>>
>> Make sure modules are unloaded, then try modprobe -v.
>> It looks like you have lnet misconfigured. If your module options
>> are wrong, you will see an error during the modprobe.
>> cliffw
>>
>> --- ---
>>
>>
>> I tried the lustre_rmmod and depmod commands and they did not
>> return any error messages. Any further clues? Reinstall the
>> patchless client again?
>>
>> -
>> CS.
>>
>>
>> On Tue, Jun 16, 2009 at 1:32 PM, Cliff White
>> <Cliff.White at sun.com> wrote:
>>
>> Carlos Santana wrote:
>>
>> I was able to run lustre_rmmod and depmod successfully. The
>> '$lctl list_nids' returned the server IP address and interface
>> (tcp0).
>>
>> I tried to mount the file system on a remote client, but it
>> failed with the following message.
>> --- ---
>> [root@localhost ~]# mount -t lustre 10.0.0.42@tcp0:/lustre /mnt/lustre
>> mount.lustre: mount 10.0.0.42@tcp0:/lustre at /mnt/lustre failed: No such device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> --- ---
>>
>> However, the mounting is successful on a single node
>> configuration - with client on the same machine as MDS
>> and OST.
>> Any clues? Where to look for logs and debug messages?
>>
>>
>> Syslog || /var/log/messages is the normal place.
>>
>> You can use 'lctl ping' to verify that the client can reach
>> the server.
>> Usually in these cases, it's a network/name misconfiguration.
>>
>> Run 'tunefs.lustre --print' on your servers, and verify that
>> mgsnode= is correct.
>>
>> cliffw
>>
>>
>> Thanks,
>> CS.
>>
>>
>>
>>
>>
>> On Tue, Jun 16, 2009 at 12:16 PM, Cliff White
>> <Cliff.White at sun.com> wrote:
>>
>> Carlos Santana wrote:
>>
>> Thanks Kevin..
>>
>> Please read:
>>
>> http://manual.lustre.org/manual/LustreManual16_HTML/ConfiguringLustre.html#50401328_pgfId-1289529
>>
>> Those instructions are identical for 1.6 and 1.8.
>>
>> For current Lustre, only two commands are used for
>> configuration: mkfs.lustre and mount.
>>
>>
>> Usually when lustre_rmmod returns that error, you run it a second
>> time, and it will clear things, unless you have live mounts or
>> network connections.
>>
>> cliffw
>>
>>
>> I am referring to the 1.8 manual, but I was also referring to the
>> HowTo page on the wiki, which seems to be for 1.6. The HowTo page
>> http://wiki.lustre.org/index.php/Lustre_Howto#Using_Supplied_Configuration_Tools
>> mentions lmc, lconf, and lctl.
>>
>> The modules are installed in the right place. The '$ lustre_rmmod'
>> resulted in the following output:
>> [root@localhost 2.6.18-92.1.17.el5_lustre.1.8.0smp]# lustre_rmmod
>> ERROR: Module obdfilter is in use
>> ERROR: Module ost is in use
>> ERROR: Module mds is in use
>> ERROR: Module fsfilt_ldiskfs is in use
>> ERROR: Module mgs is in use
>> ERROR: Module mgc is in use by mgs
>> ERROR: Module ldiskfs is in use by fsfilt_ldiskfs
>> ERROR: Module lov is in use
>> ERROR: Module lquota is in use by obdfilter,mds
>> ERROR: Module osc is in use
>> ERROR: Module ksocklnd is in use
>> ERROR: Module ptlrpc is in use by obdfilter,ost,mds,mgs,mgc,lov,lquota,osc
>> ERROR: Module obdclass is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc
>> ERROR: Module lnet is in use by ksocklnd,ptlrpc,obdclass
>> ERROR: Module lvfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ptlrpc,obdclass
>> ERROR: Module libcfs is in use by obdfilter,ost,mds,fsfilt_ldiskfs,mgs,mgc,lov,lquota,osc,ksocklnd,ptlrpc,obdclass,lnet,lvfs
>>
>> Do I need to shut down these services? How can I do that?
>>
>> Thanks,
>> CS.
>>
>>
>> On Tue, Jun 16, 2009 at 11:36 AM, Kevin Van Maren
>> <Kevin.Vanmaren at sun.com> wrote:
>>
>> I think lconf and lmc went away with Lustre 1.6. Are you sure you
>> are looking at the 1.8 manual, and not directions for 1.4?
>>
>> /usr/sbin/lctl should be in the lustre-<version> RPM.
>> Do a:
>> # rpm -q -l lustre-1.8.0-2.6.18_92.1.17.el5_lustre.1.8.0smp
>>
>>
>> Do make sure the modules are installed in the right place:
>> # cd /lib/modules/`uname -r`
>> # find . | grep lustre.ko
>>
>> If it shows up, then do:
>> # lustre_rmmod
>> # depmod
>> and try again.
>>
>> Otherwise, figure out where your modules are installed:
>> # uname -r
>> # cd /lib/modules
>> # find . | grep lustre.ko
>>
>>
>> You can also double-check the NID. On the MDS server, do
>> # lctl list_nids
>>
>> Should show 10.0.0.42@tcp0
>>
>> Kevin
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Lustre-discuss mailing list
>> Lustre-discuss at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-discuss
>
>