[lustre-discuss] Problems on mds/mgs
Mohr Jr, Richard Frank (Rick Mohr)
rmohr at utk.edu
Wed Apr 22 05:58:38 PDT 2015
Do you have any OSTs formatted and mounted? I believe you will get spurious errors if you mount the MDT without any OSTs, though I can't remember whether it's the same error you are seeing.
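As a rough sketch of what formatting and mounting a first OST on one of the OSS nodes could look like: the fsname and MGS NID below are taken from the output quoted further down, while the OST device path and mount point are made-up placeholders, so adjust them to your setup. The script only prints the commands (mkfs.lustre destroys whatever is on the target device), and I have not verified that this makes the mds_connect message go away.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands to format and
# mount one OST. Nothing here touches a disk.
FSNAME=BIGWORK             # from the mkfs.lustre call quoted below
MGSNID=10.69.100.5@o2ib    # NID that LNet reports in the quoted dmesg

ost_commands() {
    index=$1    # OST index, must be unique per OST in the file system
    device=$2   # ASSUMPTION: hypothetical block device on the OSS
    mnt=$3      # ASSUMPTION: hypothetical mount point on the OSS
    echo "mkfs.lustre --fsname=$FSNAME --ost --index=$index --mgsnode=$MGSNID $device"
    echo "mount -t lustre $device $mnt"
}

# Example: first OST on the first OSS (placeholder paths).
ost_commands 0 /dev/vg_oss/ost0 /lustre/ost0
```

Once the printed commands look right, they can be run by hand on the OSS, repeating with a new index for each further OST.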
-- Rick
> On Apr 22, 2015, at 7:55 AM, Sven Schumacher <schumacher at tfd.uni-hannover.de> wrote:
>
> Hello,
>
> I always get the following error, when doing the things described below:
>> LustreError: 11-0: BIGWORK-MDT0000-lwp-MDT0000: Communicating with
>> 0@lo, operation mds_connect failed with -11.
> If anyone has a helpful hint, I would appreciate it.
>
> Thanks in advance
>
> Sven
>
>
> What I have: 4 servers for Lustre with 2 InfiniBand ports
> (Mellanox ConnectX cards)
> Infiniband is configured on mds:
>> 6: ib0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
>> state UP qlen 256
>> link/infiniband
>> 80:00:00:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:57:e1:c1 brd
>> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>> inet 10.69.100.5/24 brd 10.69.100.255 scope global ib0
>> 7: ib1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc pfifo_fast
>> state UP qlen 256
>> link/infiniband
>> 80:00:00:49:fe:80:00:00:00:00:00:00:f4:52:14:03:00:57:e1:c2 brd
>> 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
>
> What I would like to have:
> MDS/MGS on one server
> OSS on 3 servers, each with 2 OST.
>
> I did the following on mds:
>> # mkfs.lustre --fsname=BIGWORK --mgs --mdt --index=0
>> --mgsnode=10.69.100.5@o2ib0 --reformat /dev/vg_mds/mdsmgs
>> Permanent disk data:
>> Target: BIGWORK:MDT0000
>> Index: 0
>> Lustre FS: BIGWORK
>> Mount type: ldiskfs
>> Flags: 0x65
>> (MDT MGS first_time update )
>> Persistent mount opts: user_xattr,errors=remount-ro
>> Parameters: mgsnode=10.69.100.5@o2ib
>>
>> device size = 1116156MB
>> formatting backing filesystem ldiskfs on /dev/vg_mds/mdsmgs
>> target name BIGWORK:MDT0000
>> 4k blocks 285735936
>> options -J size=400 -I 512 -i 2048 -q -O
>> dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E
>> lazy_journal_init -F
>> mkfs_cmd = mke2fs -j -b 4096 -L BIGWORK:MDT0000 -J size=400 -I 512 -i
>> 2048 -q -O
>> dirdata,uninit_bg,^extents,dir_nlink,quota,huge_file,flex_bg -E
>> lazy_journal_init -F /dev/vg_mds/mdsmgs 285735936
>> Writing CONFIGS/mountdata
>
> And dmesg shows:
>> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode.
>> quota=on. Opts:
>
> But "mount" doesn't show any lustre-filesystem mounted, so I do:
>> # mount -t lustre /dev/vg_mds/mdsmgs /lustre/mdsmgs
>> mount.lustre: set /sys/block/dm-2/queue/max_sectors_kb to 127
>>
>> mount.lustre: set /sys/block/dm-1/queue/max_sectors_kb to 127
>>
>> mount.lustre: set /sys/block/dm-0/queue/max_sectors_kb to 127
>>
>> mount.lustre: set /sys/block/sdc/queue/max_sectors_kb to 32767
>>
>> mount.lustre: set /sys/block/sdd/queue/max_sectors_kb to 32767
>>
>> mount.lustre: set /sys/block/sde/queue/max_sectors_kb to 32767
>>
>> mount.lustre: set /sys/block/sdf/queue/max_sectors_kb to 32767
>
> Now dmesg shows:
>>
>> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode.
>> quota=on. Opts:
>> LNet: HW CPU cores: 24, npartitions: 4
>> padlock: VIA PadLock Hash Engine not detected.
>> Lustre: Lustre: Build Version:
>> 2.5.3.90--CHANGED-2.6.32-431.23.3.el6.lustre
>> LNet: Added LNI 10.69.100.5@o2ib [8/256/0/180]
>> LDISKFS-fs (dm-2): mounted filesystem with ordered data mode.
>> quota=on. Opts:
>> Lustre: ctl-BIGWORK-MDT0000: No data found on store. Initialize space
>> Lustre: BIGWORK-MDT0000: new disk, initializing
>> LustreError: 11-0: BIGWORK-MDT0000-lwp-MDT0000: Communicating with
>> 0@lo, operation mds_connect failed with -11.
>
> So what could be wrong here?
>
> lsmod lists the following modules (which belong to lustre):
>> Module Size Used by
>> osp 242759 1
>> mdd 284205 3
>> lfsck 103130 4
>> lod 263636 3
>> mdt 746013 4
>> mgs 281619 1
>> mgc 82367 2
>> fsfilt_ldiskfs 5865 1
>> osd_ldiskfs 452528 4
>> lquota 345916 11
>> lustre 919263 0
>> mdc 201643 1
>> lov 514967 1
>> osc 392643 1
>> fid 82230 9
>> fld 84131 8
>> ko2iblnd 239245 1
>> ptlrpc 1665273 16
>> obdclass 1263221 77
>> lvfs 16685 19
>> lnet 344978 4
>> sha512_generic 5198 0
>> sha256_generic 10425 0
>> crc32c_intel 2015 0
>> libcfs 495892 21
>> ldiskfs 425708 3
>
>
>
> lsmod lists the following modules (which belong to infiniband):
>> ib_ipoib 80756 0
>> ib_srp 32208 0
>> scsi_transport_srp 5487 1
>> rdma_ucm 16185 0
>> rdma_cm 38340 2
>> ib_addr 6606 2
>> iw_cm 8657 1
>> ib_uverbs 34909 1
>> ib_cm 36936 3
>> ipv6 319905 2
>> ib_umad 11686 0
>> mlx4_ib 126642 0
>> ib_sa 24113 6
>> ib_mad 39070 4
>> ib_core 74419 12
>> mlx4_core 212574 1
>
>
>
>
> --
> Sven Schumacher - Systemadministrator Tel: (0511)762-2753
> Leibniz Universitaet Hannover
> Institut für Turbomaschinen und Fluid-Dynamik - TFD
> Appelstraße 9 - 30167 Hannover
> Institut für Kraftwerkstechnik und Wärmeübertragung - IKW
> Callinstraße 36 - 30167 Hannover
>