[lustre-discuss] Install issues on 2.10.0

Cowe, Malcolm J malcolm.j.cowe at intel.com
Tue Jul 25 18:25:55 PDT 2017


Also, for the record, for distros that do not have the genhostid command, there is a fairly simple workaround:

# Byte-reverse the output of hostid into /etc/hostid (gethostid() reads the
# file in native byte order, so this assumes a little-endian host such as x86):
h=`hostid`; a=${h:6:2}; b=${h:4:2}; c=${h:2:2}; d=${h:0:2}
sudo sh -c "echo -ne \"\x$a\x$b\x$c\x$d\" > /etc/hostid"

I’m sure there’s a more elegant way to express the solution, but as a quick bash hack it serves. If you prefer C, genhostid is mostly just a wrapper around the sethostid() glibc function.
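
If you do want the C route, a minimal equivalent of the shell hack above would be something along these lines (a rough, untested sketch; it simply re-uses whatever hostid currently reports rather than generating a fresh value):

#define _DEFAULT_SOURCE    /* feature-test macro for gethostid()/sethostid() */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    /* gethostid() falls back to an IP/hostname-derived value when
     * /etc/hostid does not exist yet. */
    long id = gethostid();

    /* glibc's sethostid() writes the 32-bit id to /etc/hostid; run as root. */
    if (sethostid(id) != 0) {
        perror("sethostid");
        return EXIT_FAILURE;
    }
    printf("hostid set to %08lx\n", id & 0xffffffffUL);
    return EXIT_SUCCESS;
}

Build it with something like "cc -o set_hostid set_hostid.c", run it once as root, and hostid should then return the same value on subsequent boots.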

Malcolm.

On 26/7/17, 3:53 am, "lustre-discuss on behalf of John Casu" <lustre-discuss-bounces at lists.lustre.org on behalf of john at chiraldynamics.com> wrote:

    OK, so I assume this is actually a ZFS/SPL bug and not a Lustre bug.
    Also, thanks, Ben, for the pointer.
    
    many thanks,
    -john
    
    On 7/25/17 10:19 AM, Mannthey, Keith wrote:
    > The host id is for zpool double-import protection.  If a host id is set on a zpool (ZFS does this automatically), then an HA server can't simply import the pool (users have to force the import with 'zpool import -f').  This makes the system a lot safer against double zpool imports.  Call 'genhostid' on your Lustre servers and the warning will go away.
    > 
    > Thanks,
    >   Keith
    > 
    > 
    > 
    > -----Original Message-----
    > From: lustre-discuss [mailto:lustre-discuss-bounces at lists.lustre.org] On Behalf Of Ben Evans
    > Sent: Tuesday, July 25, 2017 10:13 AM
    > To: John Casu <john at chiraldynamics.com>; lustre-discuss at lists.lustre.org
    > Subject: Re: [lustre-discuss] Install issues on 2.10.0
    > 
    > health_check moved to /sys/fs/lustre/ along with a bunch of other things.
    > 
    > -Ben
    > 
    > On 7/25/17, 12:21 PM, "lustre-discuss on behalf of John Casu"
    > <lustre-discuss-bounces at lists.lustre.org on behalf of john at chiraldynamics.com> wrote:
    > 
    >> Just installed the latest Lustre 2.10.0 over ZFS on a vanilla CentOS
    >> 7.3.1611 system, using DKMS.
    >> ZFS is 0.6.5.11 from zfsonlinux.org, installed with yum.
    >>
    >> Not a single problem during installation, but I am having issues
    >> building a lustre filesystem:
    >> 1. Building a separate mgt doesn't seem to work properly, although the
    >>    mgt/mdt combo seems to work just fine.
    >> 2. I get "spl_hostid not set" warnings, which I've never seen before.
    >> 3. /proc/fs/lustre/health_check seems to be missing.
    >>
    >> thanks,
    >> -john c
    >>
    >>
    >>
    >> ---------
    >> Building an mgt by itself doesn't seem to work properly:
    >>
    >>> [root at fb-lts-mds0 x86_64]# mkfs.lustre --reformat --mgs --force-nohostid \
    >>>     --servicenode=192.168.98.113 at tcp --backfstype=zfs mgs/mgt
    >>>
    >>>     Permanent disk data:
    >>> Target:     MGS
    >>> Index:      unassigned
    >>> Lustre FS:
    >>> Mount type: zfs
    >>> Flags:      0x1064
    >>>               (MGS first_time update no_primnode )
    >>> Persistent mount opts:
    >>> Parameters: failover.node=192.168.98.113 at tcp
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>> mkfs_cmd = zfs create -o canmount=off -o xattr=sa mgs/mgt
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>> Writing mgs/mgt properties
    >>>    lustre:failover.node=192.168.98.113 at tcp
    >>>    lustre:version=1
    >>>    lustre:flags=4196
    >>>    lustre:index=65535
    >>>    lustre:svname=MGS
    >>> [root at fb-lts-mds0 x86_64]# mount.lustre mgs/mgt /mnt/mgs
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>>
    >>> mount.lustre FATAL: unhandled/unloaded fs type 0 'ext3'
    >>
    >> If I build the combo mgt/mdt, things go a lot better:
    >>
    >>>
    >>> [root at fb-lts-mds0 x86_64]# mkfs.lustre --reformat --mgs --mdt --force-nohostid \
    >>>     --servicenode=192.168.98.113 at tcp --backfstype=zfs --index=0 --fsname=test meta/meta
    >>>
    >>>     Permanent disk data:
    >>> Target:     test:MDT0000
    >>> Index:      0
    >>> Lustre FS:  test
    >>> Mount type: zfs
    >>> Flags:      0x1065
    >>>               (MDT MGS first_time update no_primnode )
    >>> Persistent mount opts:
    >>> Parameters: failover.node=192.168.98.113 at tcp
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>> mkfs_cmd = zfs create -o canmount=off -o xattr=sa meta/meta
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>> Writing meta/meta properties
    >>>    lustre:failover.node=192.168.98.113 at tcp
    >>>    lustre:version=1
    >>>    lustre:flags=4197
    >>>    lustre:index=0
    >>>    lustre:fsname=test
    >>>    lustre:svname=test:MDT0000
    >>> [root at fb-lts-mds0 x86_64]# mount.lustre meta/meta  /mnt/meta
    >>> WARNING: spl_hostid not set. ZFS has no zpool import protection
    >>> [root at fb-lts-mds0 x86_64]# df
    >>> Filesystem          1K-blocks    Used Available Use% Mounted on
    >>> /dev/mapper/cl-root  52403200 3107560  49295640   6% /
    >>> devtmpfs             28709656       0  28709656   0% /dev
    >>> tmpfs                28720660       0  28720660   0% /dev/shm
    >>> tmpfs                28720660   17384  28703276   1% /run
    >>> tmpfs                28720660       0  28720660   0% /sys/fs/cgroup
    >>> /dev/sdb1             1038336  195484    842852  19% /boot
    >>> /dev/mapper/cl-home  34418260   32944  34385316   1% /home
    >>> tmpfs                 5744132       0   5744132   0% /run/user/0
    >>> meta                 60435328     128  60435200   1% /meta
    >>> meta/meta            59968128    4992  59961088   1% /mnt/meta
    >>> [root at fb-lts-mds0 ~]# ls /proc/fs/lustre/mdt/test-MDT0000/
    >>> async_commit_count     hash_stats               identity_upcall       num_exports         sync_count
    >>> commit_on_sharing      hsm                      instance              recovery_status     sync_lock_cancel
    >>> enable_remote_dir      hsm_control              ir_factor             recovery_time_hard  uuid
    >>> enable_remote_dir_gid  identity_acquire_expire  job_cleanup_interval  recovery_time_soft
    >>> evict_client           identity_expire          job_stats             rename_stats
    >>> evict_tgt_nids         identity_flush           md_stats              root_squash
    >>> exports                identity_info            nosquash_nids         site_stats
    >>
    >> Also, there's no /proc/fs/lustre/health_check
    >>
    >>> [root at fb-lts-mds0 ~]# ls /proc/fs/lustre/
    >>> fld   llite  lod  lwp  mdd  mdt  mgs      osc      osp  seq
    >>> ldlm  lmv    lov  mdc  mds  mgc  nodemap  osd-zfs  qmt  sptlrpc
    >>
    >>
    >>
    >>
    _______________________________________________
    lustre-discuss mailing list
    lustre-discuss at lists.lustre.org
    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
    


