[Lustre-discuss] Lustre module not loading on client mount

Michael Robbert mrobbert at mines.edu
Thu Apr 15 12:56:12 PDT 2010


I think I've found the problem: it's the OFED Roll that I'm using. When a node is first built, the Roll recompiles the OFED modules for the current kernel. I'm still deciphering the actual sequence of events, but I think I need to add a reboot at the end of the build process.
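
For anyone hitting the same thing, here is a minimal sketch for pulling the affected module and symbol names out of the kernel log. It is not part of the Roll or of Lustre; the function name list_symbol_mismatches is made up for illustration, and it assumes the log uses the "disagrees about version of symbol" wording seen in the dmesg excerpt quoted below.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Reads kernel log text on stdin and prints one "module symbol" pair per
# line for every symbol version mismatch, de-duplicated and sorted.
list_symbol_mismatches() {
    awk '/: disagrees about version of symbol / {
        # Split the line around the fixed message text.
        split($0, parts, ": disagrees about version of symbol ")
        # The module name is the last word before the message
        # (this skips any timestamp or quote prefix).
        n = split(parts[1], left, " ")
        # The symbol name is the first word after the message.
        split(parts[2], right, " ")
        print left[n], right[1]
    }' | sort -u
}

# Typical use on a live node:
#   dmesg | list_symbol_mismatches
```

Running this right after a node is built and again after a reboot should show whether the mismatches disappear, if the reboot theory is right.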

Mike

On Apr 14, 2010, at 10:21 PM, Kit Westneat wrote:

> Hey Mike,
> 
> That's pretty odd. It looks like the o2ib module has a symbol mismatch 
> with the OFED driver. I'm surprised it works at all. Can you send the 
> dmesg output after modprobe lustre + mounting, as well as the lctl 
> list_nids output?
> 
> Thanks,
> Kit
> 
> On 4/14/2010 1:42 PM, Michael Robbert wrote:
>> Kit,
>> I thought that it might be a timing issue, but I added mount commands to rc.local and it didn't help. The odd thing is that it does seem to work on subsequent reboots, though I haven't tested extensively enough to know whether that holds every time. The other odd thing is that if the filesystems don't mount on boot, a manual mount command does not work unless I run "modprobe lustre" first. This is what I see in that case:
>> 
>> [root@compute-2-1 ~]# mount -a
>> mount.lustre: mount 172.16.34.1@o2ib:/home at /lustre/home failed: No such device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> mount.lustre: mount 172.16.34.1@o2ib:/scratch at /lustre/scratch failed: No such device
>> Are the lustre modules loaded?
>> Check /etc/modprobe.conf and /proc/filesystems
>> Note 'alias lustre llite' should be removed from modprobe.conf
>> 
>> Here are some dmesg entries from a boot that does not mount the FSs:
>> 
>> ADDRCONF(NETDEV_UP): eth0: link is not ready
>> bnx2: eth0 NIC Copper Link is Up, 1000 Mbps full duplex
>> ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
>> ADDRCONF(NETDEV_UP): ib0: link is not ready
>> ADDRCONF(NETDEV_CHANGE): ib0: link becomes ready
>> Lustre: OBD class driver, http://www.lustre.org/
>> Lustre:     Lustre Version: 1.8.2
>> Lustre:     Build Version: 1.8.2-20100122190848-PRISTINE-2.6.18-164.15.1.el5
>> ko2iblnd: disagrees about version of symbol ib_fmr_pool_unmap
>> ko2iblnd: Unknown symbol ib_fmr_pool_unmap
>> ... lots more ko2iblnd errors here (is this part of the problem or a red herring?) ...
>> ko2iblnd: disagrees about version of symbol ib_fmr_pool_map_phys
>> ko2iblnd: Unknown symbol ib_fmr_pool_map_phys
>> LustreError: 3288:0:(api-ni.c:1043:lnet_startup_lndnis()) Can't load LND o2ib, module ko2iblnd, rc=256
>> LustreError: 3288:0:(events.c:729:ptlrpc_init_portals()) network initialisation failed
>> LustreError: 165-2: Nothing registered for client mount! Is the 'lustre' module loaded?
>> LustreError: 3381:0:(obd_mount.c:2042:lustre_fill_super()) Unable to mount  (-19)
>> 
>> 
>> Thanks,
>> Mike
>> 
>> On Apr 12, 2010, at 10:07 PM, Kit Westneat wrote:
>> 
>> 
>>> Hey Mike,
>>> 
>>> Are there any messages in dmesg on boot? I've occasionally seen the IB
>>> interface take a second to actually start. If that's the case, you might
>>> need to add mounts to rc.local, or try to get openibd to start earlier.
>>> 
>>> - Kit
>>> 
>>> On 4/12/2010 7:33 PM, Michael Robbert wrote:
>>> 
>>>> I am trying to configure a Lustre 1.8.2 client on a CentOS 5.4 machine. I have compiled from source into RPMs and all 4 RPMs are installed (lustre, -modules, -tests, and -source). The lustre module loads fine manually with "modprobe lustre", but I cannot get the filesystems to mount automatically at boot. I have added the following to /etc/modprobe.conf:
>>>> 
>>>> options lnet networks=o2ib0(ib0)
>>>> 
>>>> and these are the entries in my /etc/fstab
>>>> 
>>>> 172.16.34.1@o2ib:/home  /lustre/home    lustre  auto,_netdev    1 2
>>>> 172.16.34.1@o2ib:/scratch       /lustre/scratch lustre  auto,_netdev    1 2
>>>> 
>>>> I have a similar setup with Lustre 1.6.7.2 client running on RHEL 4.5 and it loads fine there.
>>>> 
>>>> What am I missing?
>>>> 
>>>> Thanks,
>>>> Mike Robbert
>>>> 
>>> 
>>> -- 
>>> ---
>>> Kit Westneat
>>> kwestneat at datadirectnet.com
>>> 812-484-8485
>>> 
>>> 
>> 
> 
> 
> -- 
> ---
> Kit Westneat
> kwestneat at datadirectnet.com
> 812-484-8485
> 



