[Lustre-discuss] ost mount failed

Tomec Martin tomec.martin at tiscali.cz
Wed Feb 27 15:41:56 PST 2008


Ping to "loopback" is ok:
ping 192.168.2.54
12345-0 at lo
12345-192.168.2.54 at tcp

Ping to other machines:
failed to ping 192.168.2.98 at tcp: Input/output error
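
(For reference, the exact checks are just the following - a sketch, with the
remote NID taken from the failing ping above:)

lctl list_nids                  # on each node: show the local NID(s)
lctl ping 192.168.2.98@tcp      # from this node: ping the remote NID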

And the lctl dk log after the failed ping:
00000400:00000100:0:1204089847.556806:0:3163:0:(linux-tcpip.c:669:libcfs_sock_connect()) 
Error -113 connecting 0.0.0.0/1023 -> 192.168.2.98/988

00000400:00000100:0:1204089847.556864:0:3163:0:(acceptor.c:81:lnet_connect_console_error()) 
Connection to 192.168.2.98@tcp at host 192.168.2.98 was unreachable: the 
network or that node may be down, or Lustre may be misconfigured.

00000800:00000100:0:1204089847.556905:0:3163:0:(socklnd_cb.c:417:ksocknal_txlist_done()) 
Deleting packet type 2 len 0 192.168.2.54@tcp->192.168.2.98@tcp
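
If I read it right, error -113 is EHOSTUNREACH, i.e. the connection to the
LNET acceptor port (988) on 192.168.2.98 never gets through at all. So the
next step here is to rule out a firewall or routing problem between the
nodes, roughly like this (assuming iptables is active on these CentOS 5
boxes; the commands are only a sketch):

nc -zv 192.168.2.98 988       # can the acceptor port be reached at all?
/sbin/iptables -L -n          # on both nodes: any rule affecting port 988?
service iptables stop         # temporarily disable the firewall, then retry the ping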

Aaron Knister wrote:
> When you say they can communicate, did you try lctl ping?
> 
> On Feb 21, 2008, at 4:57 AM, Tomec Martin wrote:
> 
>>
>>
>> Isaac Huang wrote:
>>> On Wed, Feb 20, 2008 at 11:19:06PM +0100, Tomec Martin wrote:
>>>> Following the tutorial, I started the MGS and MDT on the first machine
>>>> (IP 192.168.2.55):
>>>> mkfs.lustre --mgs /dev/sdb1
>>>> mkdir -p /mnt/mgs
>>>> mount -t lustre /dev/sdb1 /mnt/mgs
>>>> mkfs.lustre --fsname=testfs --mdt --mgsnode=192.168.2.55@tcp0 /dev/sdb2
>>>> mkdir -p /mnt/test/mdt
>>>> mount -t lustre /dev/sdb2 /mnt/test/mdt
>>>>
>>>> On the second machine I tried to start the OST with:
>>>> mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.2.55@tcp0 /dev/sdb
>>>> mkdir -p /mnt/test/ost0
>>>> mount -t lustre /dev/sdb /mnt/test/ost0
>>>>
>>>> but got this error:
>>>> mount.lustre: mount /dev/sdb at /mnt/test/ost0 failed: Input/output 
>>>> error
>>>> Is the MGS running?
>>>
>>> Were there any error messages in 'dmesg' on the node?
>>>
>>> Isaac
>>>
>>
>> Yes, the log is below. Maybe it is some incompatibility with CentOS 5 (I
>> used the packages built for Red Hat 5).
>>
>> Lustre: OBD class driver, info at clusterfs.com
>>         Lustre Version: 1.6.4.2
>>         Build Version:
>> 1.6.4.2-19691231190000-PRISTINE-.cache.build.BUILD.lustre-kernel-2.6.18.lustre.linux-2.6.18-8.1.14.el5_lustre.1.6.4.2smp 
>>
>> Lustre: Added LNI 192.168.2.56@tcp [8/256]
>> Lustre: Accept secure, port 988
>> Lustre: Lustre Client File System; info at clusterfs.com
>> kjournald starting.  Commit interval 5 seconds
>> LDISKFS FS on sdb, internal journal
>> LDISKFS-fs: mounted filesystem with ordered data mode.
>> SELinux: initialized (dev sdb, type ldiskfs), not configured for labeling
>> kjournald starting.  Commit interval 5 seconds
>> LDISKFS FS on sdb, internal journal
>> LDISKFS-fs: mounted filesystem with ordered data mode.
>> LDISKFS-fs: file extents enabled
>> LDISKFS-fs: mballoc enabled
>> SELinux: initialized (dev sdb, type ldiskfs), not configured for labeling
>>
>> LustreError: 2891:0:(client.c:975:ptlrpc_expire_one_request()) @@@
>> timeout (sent at 1203589568, 5s ago)  req at cbaa7600 x1/t0
>> o250->MGS@MGC192.168.2.55@tcp_0:26 lens 240/272 ref 1 fl Rpc:/0/0 rc
>> 0/-22
>> LustreError: 2858:0:(obd_mount.c:954:server_register_target())
>> registration with the MGS failed (-5)
>> LustreError: 2858:0:(obd_mount.c:1054:server_start_targets()) Required
>> registration failed for testfs-OSTffff: -5
>> LustreError: 15f-b: Communication error with the MGS.  Is the MGS 
>> running?
>> LustreError: 2858:0:(obd_mount.c:1570:server_fill_super()) Unable to
>> start targets: -5
>> LustreError: 2858:0:(obd_mount.c:1368:server_put_super()) no obd
>> testfs-OSTffff
>> LustreError: 2858:0:(obd_mount.c:119:server_deregister_mount())
>> testfs-OSTffff not registered
>>
>> LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
>> LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0
>> breaks, 0 lost
>> LDISKFS-fs: mballoc: 0 generated and it took 0
>> LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
>> Lustre: server umount testfs-OSTffff complete
>> LustreError: 2858:0:(obd_mount.c:1924:lustre_fill_super()) Unable to
>> mount  (-5)
>>
>>>> The machines can communicate and the MGS is probably running.
>>>> It is Lustre 1.6.4.2 on kernel 2.6.18.
>>>> Do you have any idea where the problem could be?
> 
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
> 
> (301) 595-7000
> aaron at iges.org


