[lustre-discuss] SSK configuration

Andreas Dilger adilger at whamcloud.com
Thu Jun 28 15:51:26 PDT 2018


On Jun 27, 2018, at 06:05, Mark Roper <markroper at gmail.com> wrote:
> 
> Hi Jeremy & All,
> I got a request from Mark Hahn to share the results of my SSK performance investigation with this group, which I'm happy to do!  If you're not interested in the impact on throughput of encrypting client-to-MDS and client-to-OSS communication using the SSK feature, you can stop reading now.
> 
> The tldr is that enabling encryption reduced read throughput 81% and write throughput 53%.  To me this was large but unsurprising given that the server and client nodes are performing software encryption, but I wanted to know the extent of the impact.
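The headline percentages follow directly from the Max bandwidth figures in the two IOR summaries quoted below; a quick check of the arithmetic:

```shell
# Reduction = 1 - encrypted/unencrypted, using the Max write/read
# bandwidth (MiB/s) from the two IOR summaries in this message.
awk 'BEGIN {
  printf "write reduction: %.0f%%\n", (1 - 89.25/189.02) * 100;
  printf "read reduction: %.0f%%\n",  (1 - 95.22/508.37) * 100;
}'
# -> write reduction: 53%
# -> read reduction: 81%
```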

If crypto performance is critical for you, you might consider looking at the Intel QAT PCIe adapter.  While I no longer work at Intel, I learned about this adapter there, and it can definitely improve performance for crypto and compression workloads, as long as you are limited by crypto performance.  It does not quite run fast enough to reach full wire speed on IB/OPA networks, but at the speeds you are reporting it should be a benefit.

For some benchmark results see:

https://www.servethehome.com/intel-quickassist-technology-and-openssl-setup-insights-and-initial-benchmarks/
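If you want to estimate whether software crypto really is the bottleneck before buying hardware, a rough sketch (assuming OpenSSL is installed; aes-256-ctr and sha256 here are stand-ins for whatever the SSK flavor actually negotiates, not a statement of the exact algorithms in use):

```shell
# Measure single-core software cipher and digest throughput.  Compare
# the large-block results against your network's wire speed.
openssl speed -seconds 1 -evp aes-256-ctr
openssl speed -seconds 1 -evp sha256
```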

Cheers, Andreas

> Details are below!
> 
> Mark
> 
> I set up two Lustre file systems on the same virtual machine configuration in AWS, one OSS VM and one MDS VM in each.  I enabled encryption of client and server communication as follows:
> 
> sudo lctl conf_param scratch.srpc.flavor.default.cli2mdt=skpi
> sudo lctl conf_param scratch.srpc.flavor.default.cli2ost=skpi
> 
> I then ran a single IOR benchmark test aimed at evaluating system throughput.  I ran the benchmark on a cluster of 5 clients with the following command:
> srun --tasks-per-node=1 -N 5 ior -a POSIX -o /scratch/demo -z -w -r -F -B -b 1g -t 1m -i 2
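For anyone decoding the command line, the IOR flags used here break down as follows (option meanings per IOR 3.x; a reader's annotation, not part of the original run):

```shell
# srun --tasks-per-node=1 -N 5   launch one IOR task on each of 5 nodes
# ior -a POSIX                   use the POSIX I/O backend
#     -o /scratch/demo           test file path on the Lustre mount
#     -z                         random (not sequential) offsets in each file
#     -w -r                      run both the write and the read phase
#     -F                         file-per-process access
#     -B                         bypass the page cache with O_DIRECT
#     -b 1g -t 1m                1 GiB per task, in 1 MiB transfers
#     -i 2                       repeat the test twice
```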
> 
> The IOR results for the encrypted FS were:
> Summary:
>     api                = POSIX
>     test filename      = /lustre/demo
>     access             = file-per-process
>     ordering in a file = random offsets
>     ordering inter file= no tasks offsets
>     clients            = 5 (1 per node)
>     repetitions        = 2
>     xfersize           = 1 MiB
>     blocksize          = 1 GiB
>     aggregate filesize = 5 GiB
> 
> access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
> ------    ---------  ---------- ---------  --------   --------   --------   --------   ----
> write     89.25      1048576    1024.00    0.003589   57.36      1.22       57.37      0
> read      95.14      1048576    1024.00    0.001952   53.81      0.860698   53.81      0
> remove    -          -          -          -          -          -          0.002120   0
> write     88.16      1048576    1024.00    0.001738   58.08      1.23       58.08      1
> read      95.22      1048576    1024.00    0.001806   53.77      0.989825   53.77      1
> remove    -          -          -          -          -          -          0.001562   1
> 
> Max Write: 89.25 MiB/sec (93.59 MB/sec)
> Max Read:  95.22 MiB/sec (99.84 MB/sec)
> 
> Summary of all tests:
> Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
> write          89.25      88.16      88.70       0.55   57.72177 0 5 1 2 1 0 1 0 0 1 1073741824 1048576 5368709120 POSIX 0
> read           95.22      95.14      95.18       0.04   53.79227 0 5 1 2 1 0 1 0 0 1 1073741824 1048576 5368709120 POSIX 0
> 
> The IOR results of the unencrypted filesystem were:
> 
> Summary:
>     api                = POSIX
>     test filename      = /scratch/demo
>     access             = file-per-process
>     ordering in a file = random offsets
>     ordering inter file= no tasks offsets
>     clients            = 5 (1 per node)
>     repetitions        = 2
>     xfersize           = 1 MiB
>     blocksize          = 1 GiB
>     aggregate filesize = 5 GiB
> 
> access    bw(MiB/s)  block(KiB) xfer(KiB)  open(s)    wr/rd(s)   close(s)   total(s)   iter
> ------    ---------  ---------- ---------  --------   --------   --------   --------   ----
> write     189.02     1048576    1024.00    0.002521   27.09      0.551086   27.09      0
> read      508.37     1048576    1024.00    0.001257   10.07      0.326688   10.07      0
> remove    -          -          -          -          -          -          0.002035   0
> write     187.13     1048576    1024.00    0.001748   27.36      0.928853   27.36      1
> read      502.72     1048576    1024.00    0.001494   10.18      0.356007   10.18      1
> remove    -          -          -          -          -          -          0.001705   1
> 
> Max Write: 189.02 MiB/sec (198.20 MB/sec)
> Max Read:  508.37 MiB/sec (533.07 MB/sec)
> 
> Summary of all tests:
> Operation   Max(MiB)   Min(MiB)  Mean(MiB)     StdDev    Mean(s) Test# #Tasks tPN reps fPP reord reordoff reordrand seed segcnt blksiz xsize aggsize API RefNum
> write         189.02     187.13     188.07       0.95   27.22418 0 5 1 2 1 0 1 0 0 1 1073741824 1048576 5368709120 POSIX 0
> read          508.37     502.72     505.54       2.83   10.12801 0 5 1 2 1 0 1 0 0 1 1073741824 1048576 5368709120 POSIX 0
> 
> 
> On Mon, Jun 25, 2018 at 4:59 PM Mark Roper <markroper at gmail.com> wrote:
> Thanks again, Jeremy.  This is pretty strange, but here goes: SSK encryption works end to end if I ssh as root into the server and client nodes to mount.  If I ssh as another user (say, centos) and `sudo` or `sudo -s` the same commands with --skpath, the client mount fails.
> 
> So it seems like there is something going on with how user and session keys are loaded into the linux keyring and later made available, but I haven't gone further in my investigation than this.
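One way to poke at this is to compare the keyrings the two paths end up with; a diagnostic sketch, assuming keyutils is installed (the behavior of sudo's PAM stack varies by distro, so treat this as exploratory):

```shell
# @s is the session keyring of the current process; under sudo this is
# typically inherited from the invoking user's login session.
sudo keyctl show @s
# @us is the user-session keyring of the current uid (root under sudo),
# which is where keys loaded from a direct root login may live instead.
sudo keyctl show @us
```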
> 
> I was able to get what I needed for performance numbers which was my goal in setting up ssk encryption: thanks again!
> 
> Mark
> 
> 
> On Sun, Jun 24, 2018 at 6:46 PM Jeremy Filizetti <jeremy.filizetti at gmail.com> wrote:
> I have encountered this issue before as well.  Something on the system is creating a new root user session keyring, and keyctl_read fails after that happens.  For now, reloading the key into the keyring is what I have done.  For the client you could mount with the --skpath option so that the key is reloaded any time it's mounted, but there is still the issue that when the session context expires and the keys are re-established, keyctl_read will fail again if a new keyring has been created.  I'm not sure when I'll have time to put together a fix for this, but let me know if mounting with the skpath option works.
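A minimal sketch of that workaround, using the key path and filesystem name from elsewhere in this thread (it needs a node with Lustre's SSK tools installed, so illustrative only):

```shell
# Reload the shared key into the current session keyring (run as root),
# then confirm the kernel can actually see it.  Substitute the client
# key file on client nodes.
lgss_sk -l /secure_directory/scratch.server.key
keyctl show @s
```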
> 
> Jeremy
> 
> On Sun, Jun 24, 2018 at 4:41 PM, Mark Roper <markroper at gmail.com> wrote:
> Hi Jeremy,
> 
> Thanks for taking a look at my question. I have validated that the key on the server and the client match and that the client key has the prime generated.
> 
> When I ssh to the client node and run
>   sudo mount -t lustre -o skpath=/secure_directory/scratch.client.key 172.31.46.245@tcp:/scratch /scratch
> With verbosity turned up to trace on the MDS node, I see the following output in /var/log/messages:
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: keyctl_read() failed for key 27091278: Permission denied
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: Failed to create sk credentials
> As I mentioned, if I remove the option I'm able to mount the FS.  I'm using Lustre 2.11 servers and clients.  The server kernel is 3.10.0-693.21.1.el7_lustre.x86_64 and the client kernel is 3.10.0-693.21.1.el7.x86_64.
> 
> I am wondering if this has something to do with Linux keyring permissions on CentOS.  When I ssh to my server and client nodes as the user `centos` and run `sudo lgss_sk -l /secure_directory/scratch.<server | client>.key` followed by `keyctl show`, the lustre user key does not appear in the list of keys.  If I ssh to the client & server nodes as root and run the same two commands, the lustre key shows up on the server as:
>  772711346 --alswrv      0     0  keyring: _ses
> 1047091535 --alswrv      0 65534   \_ keyring: _uid.0
>   27091278 --alswrv      0     0       \_ user: lustre:scratch:default
> 
> ... and on the client as:
> 
> Session Keyring
>  269152212 --alswrv      0     0  keyring: _ses
> 1059491764 --alswrv      0 65534   \_ keyring: _uid.0
>  146272009 --alswrv      0     0       \_ user: lustre:scratch
> 
> I'm going to try setting up a 2.10.3 server and client to see if this is some kind of regression in 2.11 and not just me fat fingering something. I'm also going to dive deeper into keyring permissions and see if I can find anything there.  I'll update this thread for those interested if I figure it out.
> 
> Any additional thoughts would be appreciated!
> 
> Cheers,
> 
> Mark
> 
> 
> On Sun, Jun 24, 2018 at 4:02 PM Jeremy Filizetti <jeremy.filizetti at gmail.com> wrote:
> GSS error 0x60000 is GSS bad signature, which would mean the HMAC was invalid.  Can you verify your key files have the same shared key?  Do you have any logs from the server side as well?  You can increase server verbosity by adding some extra v's to LSVCGSSDARGS in /etc/sysconfig/lsvcgss.
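Concretely, that might look like the following on a CentOS 7 server (the variable and file names are from the note above; the number of v's is arbitrary, and the sed assumes an LSVCGSSDARGS line already exists in the file):

```shell
# Bump lsvcgssd verbosity, restart the service, and follow the log.
sudo sed -i 's/^LSVCGSSDARGS=.*/LSVCGSSDARGS="-vvv"/' /etc/sysconfig/lsvcgss
sudo systemctl restart lsvcgss
sudo tail -f /var/log/messages
```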
> 
> Jeremy
> 
> On Fri, Jun 22, 2018 at 3:41 PM, Mark Roper <markroper at gmail.com> wrote:
> Hi Lustre Admins,
> 
> I am hoping someone can help me understand what I'm doing wrong with SSK setup. I have set up a lustre 2.11 server and worked through the steps to use shared secret keys (SSKs) to encrypt data in transit between client nodes and the MDT and OSS.  I followed the manual instructions here: http://doc.lustre.org/lustre_manual.xhtml#idm140687075065344
> 
> Before enabling the encryption settings on the MDT, I can mount the FS on the client node.  After I turn on the encryption, the client gets a connection-refused error and cannot mount:
> 
> mount.lustre: mount 172.31.46.245@tcp:/scratch at /scratch failed: Connection refused
> 
> The keys are definitely distributed to the client and server nodes, and the settings have all been made as instructed in the manual (I did this a few times from scratch to make sure).  I can manually load the keys into the keyring and see them by running `keyctl show`, and I can compare the key files on the client and server nodes with `lgss_sk --read /secure_directory/scratch.client.key` to validate that they all match and that the client has a prime.
> 
> The commands I'm using to enable the encryption are:
> 
>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2mdt=skpi
>   mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2ost=skpi
> 
> I tried tailing /var/log/messages but am not able to interpret the output.  Does anyone have a hypothesis about what might be wrong, or suggestions for how to debug?
> 
> Log output is below!  Many thanks to anyone who can help!
> 
> Mark
> 
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): start parsing parameters
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:main(): key 428863463, desc 0 at 26, ugid 0:0, sring 46159405, coinfo 38:sk:0:0:m:p:2:0x20000ac1f2109:scratch-OST1cd0-osc-MDT0000:0x20000ac1f2ef5:1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:parse_callout_info(): components: 38,sk,0,0,m,p,2,0x20000ac1f2109,scratch-OST1cd0-osc-MDT0000,0x20000ac1f2ef5,1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:DEBUG:parse_callout_info(): parse call out info: secid 38, mech sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid 0x20000ac1f2109, tgt scratch-OST1cd0-osc-MDT0000, self nid 0x20000ac1f2ef5, pid 1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): parsing parameters OK
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_mech_initialize(): initialize mech sk
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_create_cred(): create a sk cred at 0x1ecc2e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): caller's namespace is the same
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_prepare_cred(): preparing sk cred 0x1ecc2e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:sk_create_cred(): Creating credentials for target: scratch-OST1cd0-osc-MDT0000 with nodemap: (null)
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:sk_create_cred(): Searching for key with description: lustre:scratch
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:prepare_and_instantiate(): instantiated kernel key 198fefe7
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): forked child 22251
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 198fefe7: cred 0x1ecc2e0, uid 0, svc 2, nid 20000ac1f2109, uids: 0:0/0:0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x9211fac => ip-172-31-33-9.us-west-2.compute.internal
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:DEBUG:lgss_get_service_str(): constructed service string: lustre_oss at ip-172-31-33-9.us-west-2.compute.internal
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgss_using_cred(): using sk cred 0x1ecc2e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): start parsing parameters
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:main(): key 189483693, desc 0 at 25, ugid 0:0, sring 46159405, coinfo 37:sk:0:0:m:p:2:0x20000ac1f2687:scratch-OST2b9d-osc-MDT0000:0x20000ac1f2ef5:1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:parse_callout_info(): components: 37,sk,0,0,m,p,2,0x20000ac1f2687,scratch-OST2b9d-osc-MDT0000,0x20000ac1f2ef5,1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:DEBUG:parse_callout_info(): parse call out info: secid 37, mech sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid 0x20000ac1f2687, tgt scratch-OST2b9d-osc-MDT0000, self nid 0x20000ac1f2ef5, pid 1
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): parsing parameters OK
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_mech_initialize(): initialize mech sk
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_create_cred(): create a sk cred at 0x21b02e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): caller's namespace is the same
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_prepare_cred(): preparing sk cred 0x21b02e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:sk_create_cred(): Creating credentials for target: scratch-OST2b9d-osc-MDT0000 with nodemap: (null)
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:sk_create_cred(): Searching for key with description: lustre:scratch
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:prepare_and_instantiate(): instantiated kernel key 0b4b4aad
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): forked child 22254
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 0b4b4aad: cred 0x21b02e0, uid 0, svc 2, nid 20000ac1f2687, uids: 0:0/0:0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x87261fac => ip-172-31-38-135.us-west-2.compute.internal
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:DEBUG:lgss_get_service_str(): constructed service string: lustre_oss at ip-172-31-38-135.us-west-2.compute.internal
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgss_using_cred(): using sk cred 0x21b02e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_negotiation_manual(): starting gss negotation
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:do_nego_rpc(): start negotiation rpc
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/init_channel
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:gss_do_ioctl(): to down-write
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_negotiation_manual(): starting gss negotation
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:do_nego_rpc(): start negotiation rpc
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/init_channel
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:gss_do_ioctl(): to down-write
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len 0, res 0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_kr_negotiate_manual(): key 198fefe7: failed to negotiate
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:error_kernel_key(): revoking kernel key 198fefe7
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:error_kernel_key(): key 198fefe7: revoked
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgss_release_cred(): releasing sk cred 0x1ecc2e0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len 0, res 0
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_kr_negotiate_manual(): key 0b4b4aad: failed to negotiate
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:error_kernel_key(): revoking kernel key 0b4b4aad
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:error_kernel_key(): key 0b4b4aad: revoked
> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgss_release_cred(): releasing sk cred 0x21b02e0
> 
> _______________________________________________
> lustre-discuss mailing list
> lustre-discuss at lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 

Cheers, Andreas
---
Andreas Dilger
Principal Lustre Architect
Whamcloud






