[Lustre-devel] GSS cross-realm on MDT -> OST

Eric Mei Eric.Mei at Sun.COM
Thu Jul 10 09:45:16 PDT 2008

Benjamin Bennett wrote:
> Eric Mei wrote:
>> Peter Braam wrote:
>>> On 7/8/08 2:38 PM, "Benjamin Bennett" <ben at psc.edu> wrote:
>>>> Peter Braam wrote:
>>>>> Hmm. Perhaps there are implementation issues here that overshadow the
>>>>> architecture.
>>>>> To interact with MDS nodes that are part of one file system, the 
>>>>> MDS needs
>>>>> to be part of a realm.  The MDS performs authorization based on a 
>>>>> principal
>>>>> to MDS (i.e. Lustre) user/group database.   Within one Lustre file 
>>>>> system
>>>>> each MDS MUST HAVE the same user group database.  We will likely 
>>>>> want to
>>>>> place MDSs in a distributed fashion in the longer term, so take clear 
>>>>> note of
>>>>> this: one Kerberos realm owns the entire MDS cluster for a file 
>>>>> system.
>>>> Could you explain more on why this requires a single realm and not just
>>>> consistent mappings across all MDSs?
>>> That MIGHT work ... But how would two domains guarantee consistent 
>>> updates
>>> to the databases?  However, the server - server trust across domains 
>>> we need
>>> is new to me (and I am not sure if/how it works).
>> Practically it's doable, of course. But as Peter pointed out, the user 
>> database must be the same across all MDSs within a Lustre FS. If two 
>> MDSs could share the user database, why bother putting them into 
>> different Kerberos realms? So we assume all MDSs should be in a single 
>> realm. Does TeraGrid have a different requirement?
> TeraGrid has a central database of users which could be used to 
> consistently generate mappings.
> The reason to bother putting MDSs in separate realms is that TeraGrid is 
> composed of distinct organizations.  We are trying to distribute a 
> filesystem across several organizations, not simply implement a 
> centralized fs accessed by several organizations.

I see, thanks for the explanation. I think once the server-membership 
issue is solved, there should be no problem doing that from the 
GSS/Kerberos side.

>>>>> There can be multiple MDS clusters, i.e. Lustre file systems, in a 
>>>>> single
>>>>> realm, each serving their own file system.  Each Lustre file system 
>>>>> can have
>>>>> its own user/group database.  No restrictions here.
>>>> Well, that's the problem with multiple clusters in a single realm, lack
>>>> of restriction... ;-)
>>> Restrict yourself, not me or Lustre :)
>>>>> For a given file system the MDS nodes produce capabilities which 
>>>>> the OSS
>>>>> nodes use for authorization.   It is important that the MDS can make
>>>>> authenticated RPC's to the OSS nodes in its file system and for 
>>>>> this we use
>>>>> Kerberos (this is not a "must have" - it could have been done with a
>>>>> different key sharing mechanism).
>>>> With multiple clusters in a single realm an MDS from any cluster could
>>>> authenticate and authorize as an MDS to an OSS in any cluster.
>>> Good point.  If so that should be a bug.
>>> ===> Eric Mei, what is the story here?
>> Yes, Ben is right: currently, within the same realm, any MDS could 
>> authenticate with any MDS and OSS. But AFAICS the problem has nothing 
>> to do with Kerberos. It's because Lustre currently has no configuration 
>> information about server cluster membership; each server target has no 
>> idea what the other targets are. 
>> To solve this, we can either place the configuration on each MDS/OST 
>> node - as Ben proposed in his last mail - or, probably better, have it 
>> centrally managed by the MGS, so that each MDT/OST can get up-to-date 
>> server cluster information. Would that work?
> Sounds like a good idea.  If I understand correctly...
>   A) An MDT/OST is explicitly given the MGS NID by a trusted entity 
> (administrator) during mkfs.
>   B) The MGS principal name would be derived from its NID (assuming 
> lustre_mgs/mgsnode at REALM).  Realm is determined from the usual kerberos 
> dns -> realm mapping mechanism?
>   C) MDT and OST (or just MDS, OSS) list retrieved via secured MGC -> 
> MGS connection.
>   D) MDS and OSS principal names are derived from MDS and OSS NIDs. Same 
> realm determination as in B?
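The derivation in (B) and (D) could be sketched roughly as below. This is only an illustration of the idea, not actual Lustre code: the function names are hypothetical, and a fixed host-to-realm table stands in for the usual Kerberos DNS -> realm mapping.

```python
# Hypothetical sketch of steps (B)/(D): deriving a Kerberos principal
# name from a Lustre NID. Names here are illustrative only.

def nid_to_principal(nid, service, realm_for_host):
    """Turn a NID like 'mgsnode@tcp' into 'service/host@REALM'."""
    host = nid.split("@", 1)[0]      # strip the LND part, e.g. '@tcp'
    realm = realm_for_host(host)     # stand-in for DNS -> realm mapping
    return "%s/%s@%s" % (service, host, realm)

# Example, with a fixed table standing in for the DNS-based lookup:
realms = {"mgsnode": "EXAMPLE.ORG"}
print(nid_to_principal("mgsnode@tcp", "lustre_mgs", realms.get))
# -> lustre_mgs/mgsnode@EXAMPLE.ORG
```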

Well, I guess you're talking about securing the MGC->MGS connection. 
Yes, we plan to add that in the near future.

As for server membership control, I meant that the sysadmin needs to 
tell the MGS which MDTs/OSTs a Lustre filesystem is comprised of. Then, 
when an MDT/OST mounts, it can get the server list from the MGS, so it 
knows to reject unwanted connections that pretend to come from an MDT.

And I think the membership management had better work both with and 
without Kerberos.
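To make the idea concrete, here is a minimal sketch of that membership check, assuming the MGS keeps a per-filesystem list of server NIDs and a target consults it before trusting a peer that claims to be a server. All class and function names are hypothetical, not actual Lustre APIs.

```python
# Sketch: MGS holds the per-filesystem server list; an MDT/OST refuses
# "server" connections from peers not on the list. Names are hypothetical.

class Mgs:
    def __init__(self):
        self.fs_targets = {}          # fsname -> set of server NIDs

    def register(self, fsname, nid):
        """Sysadmin teaches the MGS which servers make up a filesystem."""
        self.fs_targets.setdefault(fsname, set()).add(nid)

    def server_list(self, fsname):
        return self.fs_targets.get(fsname, set())

def accept_server_connection(mgs, fsname, peer_nid):
    """An MDT/OST checks the MGS-provided list before trusting a peer."""
    return peer_nid in mgs.server_list(fsname)

mgs = Mgs()
mgs.register("lustre1", "mds1@tcp")
mgs.register("lustre1", "oss1@tcp")
print(accept_server_connection(mgs, "lustre1", "mds1@tcp"))   # True
print(accept_server_connection(mgs, "lustre1", "rogue@tcp"))  # False
```

Note the check itself is independent of Kerberos; GSS would only strengthen the proof that the peer really owns the NID/principal it claims.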

>>> The key (which is manually generated) should authenticate an instance 
>>> of an
>>> MDS, not a "cluster".   The only case where this might become 
>>> delicate is if
>>> one MDS node is the server for two file systems.
>> GSS/Kerberos authenticates a certain kind of service on a node, as we 
>> can tell simply from the composition of a Kerberos principal, 
>> "service_name/hostname at REALM". In Lustre, lustre_mds/hostname at REALM 
>> is for the MDS, not for a specific MDT. So if two MDTs on one MDS serve 
>> two different file systems, GSS/Kerberos authentication is performed in 
>> the same way for both; further access control must be handled by 
>> each target (MDT/OST).
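That per-service (not per-target) granularity can be seen directly in the principal's structure. A trivial sketch, with a hypothetical parse function (real GSS libraries expose this differently):

```python
# Sketch: a Kerberos principal names a service instance on a host,
# 'service/host@REALM' - nothing in it identifies an individual MDT.

def parse_principal(principal):
    """Split 'service/host@REALM' into its three components."""
    name, realm = principal.split("@", 1)
    service, host = name.split("/", 1)
    return service, host, realm

svc, host, realm = parse_principal("lustre_mds/mdshost@EXAMPLE.ORG")
# Two MDTs on this node present the same principal; per-target access
# control has to happen after authentication, inside each MDT/OST.
print(svc, host, realm)   # lustre_mds mdshost EXAMPLE.ORG
```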
>>>>  This would allow an MDS in one cluster to change the key used for
>>>> capabilities on the OSSs in another cluster, no?
>>>>> ==> So the first issue you have to become clear about is how you 
>>>>> authorize
>>>>> an MDS to contact one of its OSS nodes, wherever these are place.
>>>> I've changed lsvcgssd on the OSSs to take an arbitrary number of '-M
>>>> lustre_mds/mdshost at REALM' and use this list to determine MDS
>>>> authorization.  Is there a way in which an OSS is already aware of its
>>>> appropriate MDSs?
>>> As you pointed out, we need that, and Eric Mei should help you get that.
>> Yes, that works, probably as a temporary solution. As described above, 
>> the OSS currently doesn't know that info. We may need a more complete, 
>> centrally controlled server-membership authentication, maybe 
>> independent of GSS/Kerberos.
> If you're interested, the patch I have is at [1].
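For readers following along, the '-M'-list scheme Ben describes amounts to something like the sketch below: collect the principals passed via repeated '-M' flags, then authorize an authenticated peer as an MDS only if its principal is on that list. The option handling and names are a rough illustration, not the actual lsvcgssd patch.

```python
# Sketch of explicit MDS authorization via repeated '-M' flags,
# as described for the modified lsvcgssd. Hypothetical, simplified.

def collect_allowed_mds(argv):
    """Gather every principal that follows a '-M' flag."""
    allowed, i = set(), 0
    while i < len(argv):
        if argv[i] == "-M" and i + 1 < len(argv):
            allowed.add(argv[i + 1])
            i += 2
        else:
            i += 1
    return allowed

def authorize_mds(allowed, peer_principal):
    """Authorize only peers whose (already authenticated) principal
    appears on the explicit allow-list."""
    return peer_principal in allowed

allowed = collect_allowed_mds(["-M", "lustre_mds/mds1@SITE-A.ORG",
                               "-M", "lustre_mds/mds2@SITE-B.ORG"])
print(authorize_mds(allowed, "lustre_mds/mds1@SITE-A.ORG"))   # True
print(authorize_mds(allowed, "lustre_mds/evil@SITE-C.ORG"))   # False
```

This also shows why, same-realm or cross-realm, the check closes the hole discussed earlier: an MDS from another cluster authenticates fine but is simply not on the OSS's list.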


>>>>> Similarly the Kerberos connections are used by the clients to 
>>>>> connect to the
>>>>> OSS, but they are not used to authenticate anything (but optionally 
>>>>> the
>>>>> node), they are used merely to provide privacy and/or authenticity for
>>>>> transporting data between the client and the OSS nodes.  With 
>>>>> relatively
>>>>> little effort this could be done without Kerberos at all, on the 
>>>>> other hand,
>>>>> probably using Kerberos for this leads to a more easily understood
>>>>> architecture.
>>>>> So, to repeat,  the authorization uses capabilities, which 
>>>>> authenticate the
>>>>> requestor and contain authorization information, independent of a 
>>>>> server
>>>>> user/group database on the OSS.
>>>>> ==> The second issue you need to be clear about is how you 
>>>>> authenticate
>>>>> client NODES (NOT users) to OSS nodes.
>>>> Client nodes are issued lustre_root/host credentials from their local
>>>> realm.  This works just fine for Client -> OST since the only
>>>> [kerberos-related] authorization check is a "lustre_root" service part.
>>> Good.  Does it work across realms, because it seems we need that in any
>>> case?
>> Yes, Ben had a patch to make it work.
> The foreign lustre_root principals have to be mapped on the MDS to allow 
> mount.  What are your thoughts on authorizing [squashed] mount to all, 
> so as to not require mapping?

The original assumption we made was that "remote realm" means 
"different user database"; that's why a remote-realm user has to be 
remapped to a local user. It seems that in the TeraGrid case this is no 
longer true.

The squashed mount, if I understand it correctly, can be done by 
setting a mapping entry in lustre/idmap.conf to map "*@REALM" from NID 
"*" to a local user "U" - I don't remember the exact syntax though.
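Since the exact idmap.conf syntax isn't recalled here, the sketch below only illustrates the matching logic such a squash entry would imply - wildcard principal and NID patterns mapping to one local user. The entry format and function are entirely hypothetical.

```python
# Sketch of squash-style idmap logic: (principal_pattern, nid_pattern,
# local_user) entries, first match wins. Hypothetical format, not the
# real idmap.conf syntax.

from fnmatch import fnmatch

def map_user(entries, principal, nid):
    for ppat, npat, local in entries:
        if fnmatch(principal, ppat) and fnmatch(nid, npat):
            return local
    return None   # unmapped: the MDS would refuse the mount

entries = [("*@REMOTE.ORG", "*", "squashuser")]
print(map_user(entries, "alice@REMOTE.ORG", "client1@tcp"))  # squashuser
print(map_user(entries, "bob@OTHER.ORG", "client1@tcp"))     # None
```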

As for the user-mapping part, I have never felt confident that the 
current implementation is what people really want, and it is not fully 
tested; that's why I didn't put the UID-mapping information on the 
public wiki. I believe you are the first one outside the Lustre Group to 
try it :) Any opinions are very welcome, but decisions to change it need 
to be made by Peter Braam.

>>> BTW, thank you for trying this all out in detail, that is very helpful.
>>> Perhaps Sheila could talk with you and Eric Mei and get a nice 
>>> writeup done
>>> for the manual.
> np :-)
> --ben
> [1] http://staff.psc.edu/ben/patches/lustre/lustre-explicit-mds-authz.patch

