[lustre-discuss] Lustre Client: Unable to mount

Rohan Garg rohgarg@ccs.neu.edu
Sun May 6 07:46:06 PDT 2018


Thanks for the reply, Andreas.

> - the local.sh config and llmount.sh are only for basic testing and development. That isn't how you would use lustre for production deployment. 

Yes, I agree. My goal with llmount.sh was to make sure that I had the
VMs and networking set up correctly, and then to use that setup as a
base for further development.

> - after "ping" did you try "lctl ping" between the various VMs? Do you have firewall rules that block connection?

The firewall is disabled on all four VMs. For some reason, 'lctl ping'
was initially giving me unreliable, unreproducible results. (I have
since realized that 'lctl ping' and 'lctl dl' are reliable means of
establishing whether the VMs are correctly set up for Lustre.)
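
For anyone who hits the same issue, the sanity check I'd suggest is
roughly the following (a sketch; the NIDs are from my setup, so
substitute your own):

    # On every node, load the modules and bring LNet up:
    modprobe lustre
    lctl network up

    # From the client, ping each server's NID:
    lctl ping 192.168.50.11@tcp   # MGS
    lctl ping 192.168.50.9@tcp    # MDS
    lctl ping 192.168.50.5@tcp    # OSS

    # On any node, list the local NIDs and the configured devices:
    lctl list_nids
    lctl dl

If 'lctl ping' fails while plain 'ping' succeeds, LNet is most likely
bound to the wrong interface (check the 'networks' option in
/etc/modprobe.d/lnet.conf).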

Anyway, after some trial and error, I managed to get it to work
without llmount.sh: I created new VMs and manually formatted and
mounted the Lustre filesystem on them (rough steps sketched below).

(To avoid disturbing the fragile stability of the setup, I haven't
 tried llmount.sh again.  :-) )
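
In case it helps anyone else, the manual setup went roughly as
follows (a sketch, not my exact history; device names, mount points,
and the fsname match my VMs and will differ elsewhere):

    # MGS (ct-mgs1):
    mkfs.lustre --mgs /dev/sdb
    mkdir -p /mnt/mgs && mount -t lustre /dev/sdb /mnt/mgs

    # MDS (ct-mds1):
    mkfs.lustre --fsname=lustre --mgsnode=192.168.50.11@tcp --mdt --index=0 /dev/sdb
    mkdir -p /mnt/mdt && mount -t lustre /dev/sdb /mnt/mdt

    # OSS (ct-oss1):
    mkfs.lustre --fsname=lustre --mgsnode=192.168.50.11@tcp --ost --index=0 /dev/sdb
    mkdir -p /mnt/ost && mount -t lustre /dev/sdb /mnt/ost

    # Client (ct-client1):
    mkdir -p /mnt/lustre
    mount -t lustre 192.168.50.11@tcp:/lustre /mnt/lustre

After mounting, 'lctl dl' on each server and 'df -t lustre' on the
client can confirm that everything is up.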

On Sun, May 06, 2018 at 12:34:25PM +0000, Dilger, Andreas wrote:
> Two notes:
> - the local.sh config and llmount.sh are only for basic testing and development. That isn't how you would use lustre for production deployment. 
> - after "ping" did you try "lctl ping" between the various VMs? Do you have firewall rules that block connection?
> 
> Cheers, Andreas
> 
> > On May 5, 2018, at 21:14, Rohan Garg <rohgarg@ccs.neu.edu> wrote:
> > 
> > Hi,
> > 
> > I'm trying to set up a virtual cluster (with 4 VirtualBox VMs: 1
> > MGS, 1 MDS, 1 Client, and 1 OSS) using Lustre.  The VMs are running
> > CentOS-7.  I have built Lustre from the master branch.
> > 
> > The VMs have a NAT interface (eth0) and a host-only network
> > interface (eth1).
> > 
> >  Client: eth0: 10.0.2.15, eth1: 192.168.50.7, hostname: ct-client1
> >     OSS: eth0: 10.0.2.15, eth1: 192.168.50.5, hostname: ct-oss1
> >     MDS: eth0: 10.0.2.15, eth1: 192.168.50.9, hostname: ct-mds1
> >     MGS: eth0: 10.0.2.15, eth1: 192.168.50.11, hostname: ct-mgs1
> > 
> > - All the VMs have SELinux disabled.
> > - All the VMs can ping each other and can use password-less ssh among themselves.
> > - All four VMs have the following line in /etc/modprobe.d/lnet.conf:
> > 
> >      options lnet networks="tcp(eth1)"
> > 
> > I modified the cfg/local.sh file and added the following entries to make
> > it use the correct hostnames.
> > 
> >    MDSCOUNT=1
> >    mds_HOST=ct-mds1
> >    MDSDEV1=/dev/sdb
> > 
> >    mgs_HOST=ct-mgs1
> >    MGSDEV=/dev/sdb
> > 
> >    OSTCOUNT=1
> >    ost_HOST=ct-oss1
> >    OSTDEV1=/dev/sdb
> > 
> > The issue is that I can't get the llmount.sh script to mount the
> > filesystem on the client and run successfully. The script exits with the
> > following messages:
> > 
> >  ...
> >  Started lustre-OST0000
> >  Starting client: ct-client1.lfs.local:  -o user_xattr,flock ct-mgs1:/lustre /mnt/lustre
> >  CMD: ct-client1.lfs.local mkdir -p /mnt/lustre
> >  CMD: ct-client1.lfs.local mount -t lustre -o user_xattr,flock ct-mgs1:/lustre /mnt/lustre
> >  mount.lustre: mount ct-mgs1:/lustre at /mnt/lustre failed: No such file or directory
> >  Is the MGS specification correct?
> >  Is the filesystem name correct?
> >  If upgrading, is the copied client log valid? (see upgrade docs)
> > 
> > (Trying to run the last mount command manually also gives the same
> > error.)
> > 
> > After the llmount.sh script exits, I can check the output of "lctl list_nids"
> > on the three server VMs.
> > 
> >    OSS: 192.168.50.5@tcp
> >    MDS: 192.168.50.9@tcp
> >    MGS: 192.168.50.11@tcp
> > 
> > Here's the dmesg output from the client:
> > 
> >  [311.259776] Lustre: Lustre: Build Version: 2.11.51_20_g9ac477c
> >  [312.792145] Lustre: 1836:0:(gss_svc_upcall.c:1185:gss_init_svc_upcall()) Init channel is not opened by lsvcgssd, following request might be dropped until lsvcgssd is active
> >  [312.792162] Lustre: 1836:0:(gss_mech_switch.c:71:lgss_mech_register()) Register gssnull mechanism
> >  [312.792174] Key type lgssc registered
> >  [312.868636] Lustre: Echo OBD driver; http://www.lustre.org/
> >  [325.835994] LustreError: 3302:0:(ldlm_lib.c:488:client_obd_setup()) can't add initial connection
> >  [325.836737] LustreError: 3302:0:(obd_config.c:559:class_setup()) setup MGC192.168.50.11@tcp failed (-2)
> >  [325.837248] LustreError: 3302:0:(obd_mount.c:202:lustre_start_simple()) MGC192.168.50.11@tcp setup error -2
> >  [325.837765] LustreError: 3302:0:(obd_mount.c:1583:lustre_fill_super()) Unable to mount  (-2)
> > 
> > I'm not sure if I'm missing something in my config. Any help is appreciated.
> > 
> > Thanks,
> > Rohan