[lustre-discuss] SSK configuration

Mark Roper markroper at gmail.com
Mon Jun 25 13:59:30 PDT 2018


Thanks again, Jeremy. This is pretty strange but here goes:  SSK encryption
works end to end if I ssh as root into the server and client nodes to
mount.  If I ssh as another user (say, centos) and `sudo` or `sudo -s` the
same commands with --skpath, the client mount fails.

So it seems like there is something going on with how user and session keys
are loaded into the Linux keyring and later made available, but I haven't
gone further in my investigation than this.
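For anyone chasing the same symptom, a quick first check is whether the expected key is actually visible in the current session's keyring. A minimal sketch (hypothetical helper, not part of any Lustre tooling) that scans `keyctl show` output for a key description; the sample listing is the server-side output quoted later in this thread:

```python
def session_has_key(keyctl_show_output: str, description: str) -> bool:
    """Return True if any line of `keyctl show` output lists a key or
    keyring whose description field matches `description` exactly."""
    for line in keyctl_show_output.splitlines():
        # keyctl show lines end in "<type>: <description>"
        if ": " in line:
            tail = line.rsplit(": ", 1)[-1].strip()
            if tail == description:
                return True
    return False

# Sample server-side listing from this thread
sample = """\
 772711346 --alswrv      0     0  keyring: _ses
1047091535 --alswrv      0 65534   \\_ keyring: _uid.0
  27091278 --alswrv      0     0       \\_ user: lustre:scratch:default
"""
print(session_has_key(sample, "lustre:scratch:default"))  # True
print(session_has_key(sample, "lustre:scratch"))          # False
```

Running this against `keyctl show` output captured as root versus via `sudo` would show whether the lustre key is missing from the sudo session's keyring.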

I was able to get the performance numbers I needed, which was my goal in
setting up SSK encryption. Thanks again!

Mark


On Sun, Jun 24, 2018 at 6:46 PM Jeremy Filizetti <jeremy.filizetti at gmail.com>
wrote:

> I have encountered this issue before as well.  Something on the system is
> creating a new root user session keyring and keyctl_read fails after that
> happens.  For now, reloading the key into the keyring is what I have done.
> For the client, you could mount with the --skpath option so that any time
> it's mounted it reloads the key, but there is still the issue that when the
> session context expires and the keys are re-established, keyctl_read will
> fail again if a new keyring is created.  I'm not sure when I'll have time
> to put together a fix for this, but let me know if mounting with the
> skpath option works.
>
> Jeremy
>
> On Sun, Jun 24, 2018 at 4:41 PM, Mark Roper <markroper at gmail.com> wrote:
>
>> Hi Jeremy,
>>
>> Thanks for taking a look at my question. I have validated that the key on
>> the server and the client match and that the client key has the prime
>> generated.
>>
>> When I ssh to the client node and run
>>   sudo mount -t lustre -o skpath=/secure_directory/scratch.client.key
>> 172.31.46.245@tcp:/scratch /scratch
>> I get the following output in /var/log/messages (with verbosity turned up
>> to trace) on the MDS node:
>>
>> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: keyctl_read() failed
>> for key 27091278: Permission denied
>>
>> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: Failed to create sk
>> credentials
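One small trap when correlating these logs: lsvcgssd and keyctl print key serials in decimal, while lgss_keyring prints them in hex. A tiny helper (an illustrative sketch, assuming the serials are plain 32-bit kernel key IDs) to cross-reference the two notations:

```python
def serial_hex(decimal_serial: int) -> str:
    """Render a kernel key serial the way lgss_keyring logs it (hex, 8 digits)."""
    return f"{decimal_serial:08x}"

def serial_dec(hex_serial: str) -> int:
    """Parse an lgss_keyring-style hex serial back to keyctl's decimal form."""
    return int(hex_serial, 16)

# e.g. the log later in this thread reports both "key 428863463" and
# "instantiated kernel key 198fefe7" -- the same key in two notations:
print(serial_hex(428863463))   # 198fefe7
print(serial_dec("198fefe7"))  # 428863463
```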
>> As I mentioned, if I remove the option I'm able to mount the FS. I'm
>> using Lustre 2.11 server and clients. The server kernel is
>> 3.10.0-693.21.1.el7_lustre.x86_64 and the client kernel is
>> 3.10.0-693.21.1.el7.x86_64.
>>
>> I am wondering if this has something to do with Linux keyring permissions
>> on CentOS.  When I ssh to my server and client nodes as the user `centos`
>> and run `sudo lgss_sk -l /secure_directory/scratch.<server | client>.key`
>> followed by `keyctl show`, the lustre user key does not appear in the list
>> of keys.  If I ssh to the client & server nodes as root and run the same
>> two commands, the lustre key shows up on the server as:
>>
>> 772711346 --alswrv      0     0  keyring: _ses
>> 1047091535 --alswrv      0 65534   \_ keyring: _uid.0
>>  27091278 --alswrv      0     0       \_ user: lustre:scratch:default
>>
>> ... and on the client as:
>>
>> Session Keyring
>>  269152212 --alswrv      0     0  keyring: _ses
>> 1059491764 --alswrv      0 65534   \_ keyring: _uid.0
>>  146272009 --alswrv      0     0       \_ user: lustre:scratch
>>
>> I'm going to try setting up a 2.10.3 server and client to see if this is
>> some kind of regression in 2.11 and not just me fat-fingering something.
>> I'm also going to dive deeper into keyring permissions and see if I can
>> find anything there.  I'll update this thread for those interested if I
>> figure it out.
>>
>> Any additional thoughts would be appreciated!
>>
>> Cheers,
>>
>> Mark
>>
>>
>> On Sun, Jun 24, 2018 at 4:02 PM Jeremy Filizetti <
>> jeremy.filizetti at gmail.com> wrote:
>>
>>> GSS error 0x60000 is GSS bad signature which would mean the HMAC was
>>> invalid.  Can you verify your key files have the same shared key?  Do you
>>> have any logs for the server side as well?  You can increase server
>>> verbosity by adding some extra v's to LSVCGSSDARGS in
>>> /etc/sysconfig/lsvcgss.
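For example, something like the following in /etc/sysconfig/lsvcgss (a sketch; merge with whatever flags your site already sets, and restart the lsvcgss service afterwards; the repeated -v convention is assumed from lsvcgssd's verbosity flag):

```shell
# /etc/sysconfig/lsvcgss
# Each extra -v raises lsvcgssd logging verbosity.
LSVCGSSDARGS="-vvv"
```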
>>>
>>> Jeremy
>>>
>>> On Fri, Jun 22, 2018 at 3:41 PM, Mark Roper <markroper at gmail.com> wrote:
>>>
>>>> Hi Lustre Admins,
>>>>
>>>> I am hoping someone can help me understand what I'm doing wrong with
>>>> SSK setup. I have set up a Lustre 2.11 server and worked through the steps
>>>> to use shared secret keys (SSKs) to encrypt data in transit between client
>>>> nodes and the MDT and OSS.  I followed the manual instructions here:
>>>> http://doc.lustre.org/lustre_manual.xhtml#idm140687075065344
>>>>
>>>> Before enabling the encryption settings on the MDT, I can mount the FS
>>>> on the client node.  After I turn on the encryption, the mount fails with
>>>> a connection-refused error:
>>>>
>>>> mount.lustre: mount 172.31.46.245@tcp:/scratch at /scratch failed:
>>>> Connection refused
>>>>
>>>> The keys are definitely distributed to client nodes and server nodes
>>>> and the settings have all been made as instructed in the manual (I did
>>>> this a few times from scratch to make sure).  I can manually load the keys
>>>> into the keyring and see them by running `keyctl show`, and I can compare
>>>> the key files on client and server nodes with `lgss_sk --read
>>>> /secure_directory/scratch.client.key` to validate that they all match and
>>>> that the client has a prime.
>>>>
>>>> The commands I'm using to enable the encryption are:
>>>>
>>>>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2mdt=skpi
>>>>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2ost=skpi
>>>> I tried tailing /var/log/messages but am not able to interpret the
>>>> output. Does anyone have a hypothesis about what might be wrong, or
>>>> instructions for how to debug?
>>>>
>>>> Log output is below!  Many thanks to anyone who can help!
>>>>
>>>> Mark
>>>>
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>>> start parsing parameters
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:main(): key
>>>> 428863463, desc 0@26, ugid 0:0, sring 46159405, coinfo
>>>> 38:sk:0:0:m:p:2:0x20000ac1f2109:scratch-OST1cd0-osc-MDT0000:0x20000ac1f2ef5:1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:TRACE:parse_callout_info(): components:
>>>> 38,sk,0,0,m,p,2,0x20000ac1f2109,scratch-OST1cd0-osc-MDT0000,0x20000ac1f2ef5,1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:DEBUG:parse_callout_info(): parse call out info: secid 38, mech sk,
>>>> ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid
>>>> 0x20000ac1f2109, tgt scratch-OST1cd0-osc-MDT0000, self nid 0x20000ac1f2ef5,
>>>> pid 1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>>> parsing parameters OK
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:TRACE:lgss_mech_initialize(): initialize mech sk
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:TRACE:lgss_create_cred(): create a sk cred at 0x1ecc2e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>>> caller's namespace is the same
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:TRACE:lgss_prepare_cred(): preparing sk cred 0x1ecc2e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:INFO:sk_create_cred(): Creating credentials for target:
>>>> scratch-OST1cd0-osc-MDT0000 with nodemap: (null)
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:INFO:sk_create_cred(): Searching for key with description:
>>>> lustre:scratch
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22250]:TRACE:prepare_and_instantiate(): instantiated kernel key 198fefe7
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main():
>>>> forked child 22251
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 198fefe7:
>>>> cred 0x1ecc2e0, uid 0, svc 2, nid 20000ac1f2109, uids: 0:0/0:0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x9211fac =>
>>>> ip-172-31-33-9.us-west-2.compute.internal
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:DEBUG:lgss_get_service_str(): constructed service string:
>>>> lustre_oss@ip-172-31-33-9.us-west-2.compute.internal
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:lgss_using_cred(): using sk cred 0x1ecc2e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>>> start parsing parameters
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:main(): key
>>>> 189483693, desc 0@25, ugid 0:0, sring 46159405, coinfo
>>>> 37:sk:0:0:m:p:2:0x20000ac1f2687:scratch-OST2b9d-osc-MDT0000:0x20000ac1f2ef5:1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:TRACE:parse_callout_info(): components:
>>>> 37,sk,0,0,m,p,2,0x20000ac1f2687,scratch-OST2b9d-osc-MDT0000,0x20000ac1f2ef5,1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:DEBUG:parse_callout_info(): parse call out info: secid 37, mech sk,
>>>> ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid
>>>> 0x20000ac1f2687, tgt scratch-OST2b9d-osc-MDT0000, self nid 0x20000ac1f2ef5,
>>>> pid 1
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>>> parsing parameters OK
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:TRACE:lgss_mech_initialize(): initialize mech sk
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:TRACE:lgss_create_cred(): create a sk cred at 0x21b02e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>>> caller's namespace is the same
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:TRACE:lgss_prepare_cred(): preparing sk cred 0x21b02e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:INFO:sk_create_cred(): Creating credentials for target:
>>>> scratch-OST2b9d-osc-MDT0000 with nodemap: (null)
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:INFO:sk_create_cred(): Searching for key with description:
>>>> lustre:scratch
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22253]:TRACE:prepare_and_instantiate(): instantiated kernel key 0b4b4aad
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main():
>>>> forked child 22254
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 0b4b4aad:
>>>> cred 0x21b02e0, uid 0, svc 2, nid 20000ac1f2687, uids: 0:0/0:0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x87261fac =>
>>>> ip-172-31-38-135.us-west-2.compute.internal
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:DEBUG:lgss_get_service_str(): constructed service string:
>>>> lustre_oss@ip-172-31-38-135.us-west-2.compute.internal
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:lgss_using_cred(): using sk cred 0x21b02e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:lgssc_negotiation_manual(): starting gss negotation
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:do_nego_rpc(): start negotiation rpc
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:gss_do_ioctl(): to open
>>>> /proc/fs/lustre/sptlrpc/gss/init_channel
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:gss_do_ioctl(): to down-write
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:lgssc_negotiation_manual(): starting gss negotation
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:do_nego_rpc(): start negotiation rpc
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:gss_do_ioctl(): to open
>>>> /proc/fs/lustre/sptlrpc/gss/init_channel
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:gss_do_ioctl(): to down-write
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len
>>>> 0, res 0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:ERROR:lgssc_kr_negotiate_manual(): key 198fefe7: failed to negotiate
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:error_kernel_key(): revoking kernel key 198fefe7
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:INFO:error_kernel_key(): key 198fefe7: revoked
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22251]:TRACE:lgss_release_cred(): releasing sk cred 0x1ecc2e0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len
>>>> 0, res 0
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:ERROR:lgssc_kr_negotiate_manual(): key 0b4b4aad: failed to negotiate
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:error_kernel_key(): revoking kernel key 0b4b4aad
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:INFO:error_kernel_key(): key 0b4b4aad: revoked
>>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring:
>>>> [22254]:TRACE:lgss_release_cred(): releasing sk cred 0x21b02e0
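For readers decoding logs like the above: an LNet tcp NID packs the IPv4 address into the low 32 bits (big-endian) and the network type and number into the high 32. A hypothetical decoder (not a Lustre tool; it just matches the NIDs seen in this log):

```python
import socket
import struct

def decode_tcp_nid(nid: int) -> str:
    """Decode an LNet socklnd (tcp) NID as seen in these logs:
    high 32 bits = (LND type << 16) | network number,
    low 32 bits  = IPv4 address in big-endian byte order."""
    addr = nid & 0xFFFFFFFF
    net = nid >> 32
    net_num = net & 0xFFFF        # (net >> 16) == 2 means SOCKLND/tcp
    ip = socket.inet_ntoa(struct.pack(">I", addr))
    suffix = "tcp" if net_num == 0 else f"tcp{net_num}"
    return f"{ip}@{suffix}"

# NIDs taken from the log above:
print(decode_tcp_nid(0x20000ac1f2109))  # 172.31.33.9@tcp
print(decode_tcp_nid(0x20000ac1f2ef5))  # 172.31.46.245@tcp
```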
>>>>
>>>> _______________________________________________
>>>> lustre-discuss mailing list
>>>> lustre-discuss at lists.lustre.org
>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>
>>>>
>>>
>