[Lustre-devel] faking LNET scale
Liang Zhen
Zhen.Liang at Sun.COM
Fri Jun 5 01:57:46 PDT 2009
Hi Nic,
For incoming requests, I think we can share the same network aliases
with outgoing messages (i.e. lnet_t::ln_local_nets in my previous
mail). Matching against the alias list could be embedded in
lnet_ptlcompat_match{net,nid} and lnet_net2ni_locked, so we don't need
to worry about changing code everywhere.
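A minimal sketch of what alias-aware matching could look like, assuming a
simple singly linked alias list. Everything here is a stand-in, not code
from the LNet tree: nid_t, NIDNET, MKNID, struct local_alias, local_nets,
add_local_alias, net_matches_ni and nid_matches_ni are hypothetical names,
and the assumption that a NID carries its network in the upper 32 bits is
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in NID layout (assumption): net in the high 32 bits, address in the low 32. */
typedef uint64_t nid_t;
#define NIDNET(nid)      ((uint32_t)((nid) >> 32))
#define MKNID(net, addr) (((nid_t)(net) << 32) | (uint32_t)(addr))

/* Hypothetical alias entry, loosely modelled on the proposed lnet_localnet_t. */
struct local_alias {
        struct local_alias *la_next;
        uint32_t            la_net;     /* aliased network, e.g. ptl1 */
        uint32_t            la_ni_net;  /* network of the NI serving it, e.g. ptl0 */
};

/* Stand-in for lnet_t::ln_local_nets. */
static struct local_alias *local_nets;

/* Push an alias onto the global list (what "lctl add_local_net" would trigger). */
static void add_local_alias(struct local_alias *la)
{
        assert(la != NULL);
        la->la_next = local_nets;
        local_nets = la;
}

/* Return 1 if 'net' is the NI's own network or an alias registered for it. */
static int net_matches_ni(uint32_t ni_net, uint32_t net)
{
        const struct local_alias *la;

        if (ni_net == net)
                return 1;
        for (la = local_nets; la != NULL; la = la->la_next)
                if (la->la_net == net && la->la_ni_net == ni_net)
                        return 1;
        return 0;
}

/* NID-level match: same address, and the network is equal or aliased. */
static int nid_matches_ni(nid_t ni_nid, nid_t nid)
{
        return (uint32_t)ni_nid == (uint32_t)nid &&
               net_matches_ni(NIDNET(ni_nid), NIDNET(nid));
}
```

With a check like this behind the existing match helpers, callers such as
lnet_accept would not need to know about aliases at all; locking around
the list is left out of the sketch.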
Regards
Liang
Nicholas Henke wrote:
> Liang Zhen wrote:
>> Nic,
>> It's very late at night for me now, and my head is not clear enough
>> to be sure I'm not saying something crazy :)
>> LNet always treats the target as a remote network (one that needs a
>> router) if it can't find an NI with the same network ID. For example,
>> if the local NI is (ptl0) and the caller wants to send a message to
>> (ptl1), then LNet will:
>> 1. Try to find a local NI for ptl1; when that fails:
>> 2. Check whether ptl1 is a remote network and whether there is a
>> router for this network (ptl1)
>>
>> So if you want your server to have only one NI instance that can talk
>> to a set of different networks and, at the same time, talk to other
>> remote networks via routers, I would suggest:
>> 1. Create a new command, for example: lctl add_local_net ptl0
>> ptl[1-N], which means LNet should allow NI (ptl0) to access networks
>> (ptl[1-N]) as local networks.
>> 2. Add a new structure in LNet, i.e.:
>> typedef struct {
>>         struct list_head ln_list;     /* chain on lnet_t::ln_local_nets */
>>         __u32            ln_net;      /* aliased network */
>>         lnet_ni_t       *ln_localni;  /* local NI serving ln_net */
>>         ......
>> } lnet_localnet_t;
>> As you see, it's very much like the current structure
>> lnet_remotenet_t, which is chained on lnet_t::ln_remote_nets; we can
>> create an lnet_localnet_t object and add it to a global list (i.e.
>> lnet_t::ln_local_nets) with the command mentioned above: lctl
>> add_local_net
>> 3. When an upper-layer caller sends a message, lnet_send() should
>> check lnet_t::ln_local_nets first (before treating the target as a
>> remote network and checking lnet_t::ln_remote_nets); if the network
>> is on lnet_t::ln_local_nets, then we can take the local NI from
>> lnet_localnet_t::ln_localni.
>> 4. We need to add a new flag for LNDs; only an LND with the flag can
>> support the lctl add_local_net command.
>> 5. Make the LND not reject messages from different networks.
>> Again, I hope I'm answering what you are asking :)
>
> This is almost working - I'm running into one problem: lnet_accept
> wants to match the ni->ni_nid against the requested NID. It is failing
> as the nets don't match (ptl1 vs ptl0).
>
> It looks like there are a fair number of places like this, most using
> lnet_ptlcompat_match{net,nid}.
>
> How should I handle those? Add another clause like ptlcompat (like
> ln_aliases) and if that is set (we have aliases set), do a search to
> find the alias and see if there is an alias that would allow
> NIDNET(lnet_net) == NIDNET(ptl_net)?
>
> Is there a cleaner way?
>
> Nic