[lustre-discuss] SSK configuration

Jeremy Filizetti jeremy.filizetti at gmail.com
Sun Jun 24 15:46:20 PDT 2018


I have encountered this issue before as well.  Something on the system
creates a new root user session keyring, and keyctl_read fails after that
happens.  For now, my workaround has been to reload the key into the
keyring.  On the client you could mount with the skpath mount option so
that the key is reloaded every time the file system is mounted.  That still
leaves a problem, though: when the session context expires and the keys are
re-established, keyctl_read will fail again if a new keyring has been
created in the meantime.  I'm not sure when I'll have time to put together
a fix for this, but let me know if mounting with the skpath option works.
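
A minimal sketch of the reload workaround described above, assuming it is
run as root.  `lgss_sk -l` and the key descriptions are taken from this
thread; `keyctl search` and `keyctl read` are standard keyutils subcommands.
The function name `reload_ssk_key` is hypothetical.  Note that the check runs
in the caller's own session, so it is only a rough indicator of what
lsvcgssd itself can read.

```shell
#!/bin/sh
# Sketch only: reload a Lustre SSK key if it can no longer be read.
# Assumption: run as root; the caller supplies the key file and description.

# usage: reload_ssk_key KEYFILE DESCRIPTION
reload_ssk_key() {
    keyfile=$1
    desc=$2

    # Look the key up in the user keyring and try to read its payload.
    if key_id=$(keyctl search @u user "$desc" 2>/dev/null) &&
       keyctl read "$key_id" >/dev/null 2>&1; then
        echo "key $desc ($key_id) is readable"
        return 0
    fi

    # Missing or unreadable: load it again from the key file.
    echo "reloading $desc from $keyfile"
    lgss_sk -l "$keyfile"
}
```

On the server from this thread it could be called as
`reload_ssk_key /secure_directory/scratch.server.key lustre:scratch:default`.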

Jeremy

On Sun, Jun 24, 2018 at 4:41 PM, Mark Roper <markroper at gmail.com> wrote:

> Hi Jeremy,
>
> Thanks for taking a look at my question. I have validated that the key on
> the server and the client match and that the client key has the prime
> generated.
>
> When I ssh to the client node and run
>
>   sudo mount -t lustre -o skpath=/secure_directory/scratch.client.key 172.31.46.245@tcp:/scratch /scratch
>
> I see the following in /var/log/messages on the MDS node, with verbosity
> turned up to trace:
>
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: keyctl_read() failed for key 27091278: Permission denied
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: Failed to create sk credentials
>
> As I mentioned, if I remove the option I'm able to mount the FS. I'm using
> Lustre 2.11 server and clients. The server kernel is
> 3.10.0-693.21.1.el7_lustre.x86_64 and the client kernel is
> 3.10.0-693.21.1.el7.x86_64.
>
> I am wondering if this has something to do with linux keyring permissions
> on CentOS.  When I ssh to my server and client nodes as the user `centos`
> and run `sudo lgss_sk -l /secure_directory/scratch.<server | client>.key`
> followed by `keyctl show`, the lustre user key does not appear in the list
> of keys.  If I ssh to the client & server nodes as root and run the same
> two commands, the lustre key shows up on the server as:
>
>  772711346 --alswrv      0     0  keyring: _ses
> 1047091535 --alswrv      0 65534   \_ keyring: _uid.0
>   27091278 --alswrv      0     0       \_ user: lustre:scratch:default
>
> ... and on the client as:
>
> Session Keyring
>  269152212 --alswrv      0     0  keyring: _ses
> 1059491764 --alswrv      0 65534   \_ keyring: _uid.0
>  146272009 --alswrv      0     0       \_ user: lustre:scratch
>
> I'm going to try setting up a 2.10.3 server and client to see if this is
> some kind of regression in 2.11 and not just me fat fingering something.
> I'm also going to dive deeper into keyring permissions and see if I can
> find anything there.  I'll update this thread for those interested if I
> figure it out.
>
> Any additional thoughts would be appreciated!
>
> Cheers,
>
> Mark
>
>
> On Sun, Jun 24, 2018 at 4:02 PM Jeremy Filizetti <
> jeremy.filizetti at gmail.com> wrote:
>
>> GSS error 0x60000 is GSS bad signature which would mean the HMAC was
>> invalid.  Can you verify that your key files have the same shared key?  Do you
>> have any logs for the server side as well?  You can increase server
>> verbosity by adding some extra v's to LSVCGSSDARGS in
>> /etc/sysconfig/lsvcgss.
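
To see why 0x60000 maps to a bad signature, the status word can be split
into its fields.  This is a small sketch that assumes the standard GSS-API
major status layout from RFC 2744 (calling error in bits 24-31, routine
error in bits 16-23, supplementary bits in the low 16).  The function name
`decode_gss_major` is hypothetical.  lgss_keyring logs the value in hex
without a `0x` prefix, so the function adds one.

```shell
#!/bin/sh
# Split a GSS-API major status word (given in hex, no 0x prefix) into fields.
decode_gss_major() {
    status=$((0x$1))
    calling=$(( (status >> 24) & 0xff ))   # calling error field
    routine=$(( (status >> 16) & 0xff ))   # routine error field
    suppl=$(( status & 0xffff ))           # supplementary info bits
    echo "calling=$calling routine=$routine supplementary=$suppl"
}

decode_gss_major 60000
# prints: calling=0 routine=6 supplementary=0
# Routine error 6 is GSS_S_BAD_SIG (also named GSS_S_BAD_MIC).
```
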
>>
>> Jeremy
>>
>> On Fri, Jun 22, 2018 at 3:41 PM, Mark Roper <markroper at gmail.com> wrote:
>>
>>> Hi Lustre Admins,
>>>
>>> I am hoping someone can help me understand what I'm doing wrong with SSK
>>> setup. I have set up a lustre 2.11 server and worked through the steps to
>>> use shared secret keys (SSKs) to encrypt data in transit between client
>>> nodes and the MDT and OSS.  I followed the manual instructions here:
>>> http://doc.lustre.org/lustre_manual.xhtml#idm140687075065344
>>>
>>> Before enabling the encryption settings on the MDT, I can mount the FS
>>> on the client node.  After I turn on the encryption, the mount fails
>>> with a "Connection refused" error:
>>>
>>> mount.lustre: mount 172.31.46.245@tcp:/scratch at /scratch failed: Connection refused
>>>
>>> The keys are definitely distributed to client nodes and server nodes and
>>> the settings have all been made as instructed in the manual (I did this a
>>> few times from scratch to make sure).  I can manually load the keys into
>>> the keyring and see them by running `keyctl show`.  I can also compare the
>>> key files on client and server nodes with the command `lgss_sk --read
>>> /secure_directory/scratch.client.key` and validate that they all match
>>> and that the client has a prime.
>>>
>>> The commands I'm using to enable the encryption are:
>>>
>>>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2mdt=skpi
>>>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2ost=skpi
>>>
>>> I tried tailing /var/log/messages but am not able to interpret the
>>> output.  Does anyone have a hypothesis about what might be wrong, or
>>> instructions for debugging?
>>>
>>> Log output is below!  Many thanks to anyone who can help!
>>>
>>> Mark
>>>
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>> start parsing parameters
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:main(): key
>>> 428863463, desc 0 at 26, ugid 0:0, sring 46159405, coinfo 38:sk:0:0:m:p:2:
>>> 0x20000ac1f2109:scratch-OST1cd0-osc-MDT0000:0x20000ac1f2ef5:1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22250]:TRACE:parse_callout_info(): components: 38,sk,0,0,m,p,2,
>>> 0x20000ac1f2109,scratch-OST1cd0-osc-MDT0000,0x20000ac1f2ef5,1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22250]:DEBUG:parse_callout_info(): parse call out info: secid 38, mech
>>> sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid
>>> 0x20000ac1f2109, tgt scratch-OST1cd0-osc-MDT0000, self nid 0x20000ac1f2ef5,
>>> pid 1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>> parsing parameters OK
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_mech_initialize():
>>> initialize mech sk
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_create_cred():
>>> create a sk cred at 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>> caller's namespace is the same
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22250]:TRACE:lgss_prepare_cred(): preparing sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22250]:INFO:sk_create_cred(): Creating credentials for target:
>>> scratch-OST1cd0-osc-MDT0000 with nodemap: (null)
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22250]:INFO:sk_create_cred(): Searching for key with description:
>>> lustre:scratch
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:prepare_and_instantiate():
>>> instantiated kernel key 198fefe7
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>> forked child 22251
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_kr_negotiate():
>>> child start on behalf of key 198fefe7: cred 0x1ecc2e0, uid 0, svc 2, nid
>>> 20000ac1f2109, uids: 0:0/0:0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:ipv4_nid2hostname():
>>> SOCKLND: net 0x20000, addr 0x9211fac => ip-172-31-33-9.us-west-2.
>>> compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:DEBUG:lgss_get_service_str():
>>> constructed service string: lustre_oss at ip-172-31-33-9.us-
>>> west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:lgss_using_cred(): using sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>> start parsing parameters
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:main(): key
>>> 189483693, desc 0 at 25, ugid 0:0, sring 46159405, coinfo 37:sk:0:0:m:p:2:
>>> 0x20000ac1f2687:scratch-OST2b9d-osc-MDT0000:0x20000ac1f2ef5:1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22253]:TRACE:parse_callout_info(): components: 37,sk,0,0,m,p,2,
>>> 0x20000ac1f2687,scratch-OST2b9d-osc-MDT0000,0x20000ac1f2ef5,1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22253]:DEBUG:parse_callout_info(): parse call out info: secid 37, mech
>>> sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid
>>> 0x20000ac1f2687, tgt scratch-OST2b9d-osc-MDT0000, self nid 0x20000ac1f2ef5,
>>> pid 1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>> parsing parameters OK
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_mech_initialize():
>>> initialize mech sk
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_create_cred():
>>> create a sk cred at 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>> caller's namespace is the same
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22253]:TRACE:lgss_prepare_cred(): preparing sk cred 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22253]:INFO:sk_create_cred(): Creating credentials for target:
>>> scratch-OST2b9d-osc-MDT0000 with nodemap: (null)
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22253]:INFO:sk_create_cred(): Searching for key with description:
>>> lustre:scratch
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:prepare_and_instantiate():
>>> instantiated kernel key 0b4b4aad
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>> forked child 22254
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_kr_negotiate():
>>> child start on behalf of key 0b4b4aad: cred 0x21b02e0, uid 0, svc 2, nid
>>> 20000ac1f2687, uids: 0:0/0:0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:ipv4_nid2hostname():
>>> SOCKLND: net 0x20000, addr 0x87261fac => ip-172-31-38-135.us-west-2.
>>> compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:DEBUG:lgss_get_service_str():
>>> constructed service string: lustre_oss at ip-172-31-38-135.
>>> us-west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:lgss_using_cred(): using sk cred 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:sk_encode_netstring():
>>> Encoded netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_negotiation_manual():
>>> starting gss negotation
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:do_nego_rpc(): start negotiation rpc
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/
>>> init_channel
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:gss_do_ioctl(): to down-write
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:sk_encode_netstring():
>>> Encoded netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_negotiation_manual():
>>> starting gss negotation
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:do_nego_rpc(): start negotiation rpc
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/
>>> init_channel
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:gss_do_ioctl(): to down-write
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len
>>> 0, res 0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_negotiation_manual():
>>> negotiation gss error 60000
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_kr_negotiate_manual():
>>> key 198fefe7: failed to negotiate
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:error_kernel_key(): revoking kernel key 198fefe7
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:INFO:error_kernel_key(): key 198fefe7: revoked
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22251]:TRACE:lgss_release_cred(): releasing sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len
>>> 0, res 0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_negotiation_manual():
>>> negotiation gss error 60000
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_kr_negotiate_manual():
>>> key 0b4b4aad: failed to negotiate
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:error_kernel_key(): revoking kernel key 0b4b4aad
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:INFO:error_kernel_key(): key 0b4b4aad: revoked
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>> [22254]:TRACE:lgss_release_cred(): releasing sk cred 0x21b02e0
>>>
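
When reading the log above, the `nid` values can be decoded by hand to
confirm which server each negotiation targets.  The sketch below assumes
the 64-bit IPv4 NID layout implied by the log lines themselves: network in
the upper 32 bits (0x20000 is reported as SOCKLND) and the IPv4 address in
the lower 32 bits.  The function name `decode_nid` is hypothetical.

```shell
#!/bin/sh
# Decode an IPv4 LNet NID as printed by lgss_keyring, e.g. 0x20000ac1f2109.
# Assumption: upper 32 bits = network, lower 32 bits = IPv4 address.
decode_nid() {
    nid=$(($1))
    net=$(( (nid >> 32) & 0xffffffff ))
    addr=$(( nid & 0xffffffff ))
    printf 'net=0x%x addr=%d.%d.%d.%d\n' "$net" \
        $(( (addr >> 24) & 0xff )) $(( (addr >> 16) & 0xff )) \
        $(( (addr >> 8) & 0xff ))  $(( addr & 0xff ))
}

decode_nid 0x20000ac1f2109
# prints: net=0x20000 addr=172.31.33.9
```

The result agrees with the hostname ip-172-31-33-9 that ipv4_nid2hostname()
resolved in the same log.
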
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> lustre-discuss at lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>>>
>>