[Lustre-devel] Query to understand the Lustre request/reply message

Vilobh Meshram vilobh.meshram at gmail.com
Thu Oct 14 17:58:02 PDT 2010


Hi Alexey/Nicolas,

Following the approach Nicolas suggested yesterday, I modified the code so
that a fixed-size buffer sent from the client side gets filled in with some
information. I am sending a buffer called "str" (size 16), which the MDS
should update with the string "hello" (size 7, much less than the 16 bytes
of "str"). However, the operation does not succeed and I get this error:

"LustreError: 4209:0:(file.c:3143:ll_inode_revalidate_fini()) failure -14
inode 31257"

It seems to be related to DLM_REPLY_REC_OFF, since I have modified this
offset in my code. Could you please review my code and tell me whether I am
making a mistake? I will be done with my task once I can resolve this
problem.
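
For reference, here is a minimal user-space sketch of the kind of copy I am
attempting, assuming a 16-byte destination. The helper fill_fixed_buf and the
constant STR_BUF_LEN are made up for illustration and are not Lustre code.
Sizing the copy by strlen() + 1 instead of a hard-coded count keeps it inside
both the source literal and the destination ("hello" plus its terminating NUL
is 6 bytes).

```c
#include <stddef.h>
#include <string.h>

#define STR_BUF_LEN 16  /* assumed size of the "str" buffer */

/*
 * Hypothetical helper: copy a NUL-terminated string into a fixed-size
 * buffer and zero-fill the remainder.  Returns the number of bytes used
 * (including the NUL), or 0 if dst is NULL or the string does not fit.
 */
static size_t fill_fixed_buf(char *dst, size_t dst_len, const char *src)
{
        size_t need = strlen(src) + 1;

        if (dst == NULL || need > dst_len)
                return 0;

        memset(dst, 0, dst_len);
        memcpy(dst, src, need);
        return need;
}
```

Called as fill_fixed_buf(buf, STR_BUF_LEN, "hello") with a 16-byte buf, this
copies 6 bytes and leaves the other 10 zeroed.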

Here are my modifications for Lustre 1.8.1.1 at the client and MDS side.
Lines I added or changed carry a trailing "/* changed */" comment.

At client side: lustre/ldlm/ldlm_lockd.c

int ldlm_cli_enqueue(.........)         (line 655 in my tree)
        ...
        __u32 size[] = { [MSG_PTLRPC_BODY_OFF] = sizeof(struct ptlrpc_body),
                         [DLM_LOCKREQ_OFF]     = sizeof(*body),
                         [DLM_REPLY_REC_OFF]   = lvb_len ? lvb_len :
                                                 sizeof(struct ost_lvb),
                                                 16 };      /* changed */
        ...
        if (reqp == NULL || *reqp == NULL) {
                req = ldlm_prep_enqueue_req(exp, 4, size, NULL, 0); /* changed */

which leads to:

struct ptlrpc_request *ldlm_prep_elc_req(.......)   (line 575 in my tree)
        ...
        void *str = NULL;                               /* changed */
        char *bufs[4] = { NULL, NULL, NULL, str };      /* changed */
        ...
        req = ptlrpc_prep_req(class_exp2cliimp(exp), version,
                              opc, bufcount, size, bufs);   /* changed */


At MDS side: lustre/ldlm/ldlm_lockd.c

int ldlm_handle_enqueue(.........)      (line 992 in my tree)
{
        ...
        void *str;                                      /* changed */
        __u32 size[4] = { [MSG_PTLRPC_BODY_OFF] = sizeof(struct ptlrpc_body),
                          [DLM_LOCKREPLY_OFF]   = sizeof(*dlm_rep) }; /* changed */
        char *org = "hello";                            /* changed */
        ...
existing_lock:

        if (flags & LDLM_FL_HAS_INTENT) {
                /* In this case, the reply buffer is allocated deep in
                 * local_lock_enqueue by the policy function. */
                cookie = req;
        } else {
                int buffers = 4;                        /* changed */

                lock_res_and_lock(lock);
                if (lock->l_resource->lr_lvb_len) {
                        size[DLM_REPLY_REC_OFF] =
                                lock->l_resource->lr_lvb_len;   /* changed */
                        buffers = 4;                    /* changed */
                }
                unlock_res_and_lock(lock);

                if (OBD_FAIL_CHECK_ONCE(OBD_FAIL_LDLM_ENQUEUE_EXTENT_ERR))
                        GOTO(out, rc = -ENOMEM);

                str = lustre_msg_buf(req->rq_reqmsg,
                                     DLM_REPLY_REC_OFF + 1, 1); /* changed */
                memcpy(str, org, 7);                    /* changed */
                size[DLM_REPLY_REC_OFF + 1] = 16;       /* changed */
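
To make the size[] convention concrete, here is a toy user-space model of a
message made of several buffers, each described by one entry of a length
array. All names (toy_msg, toy_msg_pack, toy_msg_buf) are hypothetical and the
real ptlrpc code differs; for example, this sketch ignores any alignment
padding between buffers. It only illustrates that a lookup can succeed when
the index was declared at pack time and the buffer is at least min_size bytes
long, and that each message (request or reply) carries its own set of
buffers.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy model only: these names are hypothetical, not Lustre's. */
struct toy_msg {
        uint32_t  bufcount;  /* number of buffers declared at pack time */
        uint32_t *buflens;   /* one length per buffer */
        char     *data;      /* buffers stored back to back, zero-filled */
};

static void toy_msg_free(struct toy_msg *m)
{
        if (m == NULL)
                return;
        free(m->buflens);
        free(m->data);
        free(m);
}

/* Allocate a message whose layout is fixed by lens[0..count-1]. */
static struct toy_msg *toy_msg_pack(uint32_t count, const uint32_t *lens)
{
        struct toy_msg *m = calloc(1, sizeof(*m));
        size_t total = 0;
        uint32_t i;

        if (m == NULL)
                return NULL;

        for (i = 0; i < count; i++)
                total += lens[i];

        m->buflens = calloc(count ? count : 1, sizeof(*m->buflens));
        m->data = calloc(total ? total : 1, 1);
        if (m->buflens == NULL || m->data == NULL) {
                toy_msg_free(m);
                return NULL;
        }

        for (i = 0; i < count; i++)
                m->buflens[i] = lens[i];
        m->bufcount = count;
        return m;
}

/*
 * Return a pointer to buffer n, or NULL if that buffer was never
 * declared or is shorter than min_size.
 */
static void *toy_msg_buf(const struct toy_msg *m, uint32_t n,
                         uint32_t min_size)
{
        size_t off = 0;
        uint32_t i;

        if (m == NULL || n >= m->bufcount || m->buflens[n] < min_size)
                return NULL;

        for (i = 0; i < n; i++)
                off += m->buflens[i];
        return m->data + off;
}
```

With lens = {8, 24, 40, 16}, toy_msg_buf(m, 3, 1) returns a pointer 72 bytes
into the data area, while asking for index 4, or for index 3 with a min_size
of 17, returns NULL.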

Thanks,
Vilobh
*Graduate Research Associate
Department of Computer Science
The Ohio State University Columbus Ohio*


On Thu, Oct 14, 2010 at 12:25 PM, Vilobh Meshram
<vilobh.meshram at gmail.com> wrote:

> Hi Alexey,
>
> That surely helps. Thanks for all the help so far.
>
> Thanks,
> Vilobh
> *Graduate Research Associate
> Department of Computer Science
> The Ohio State University Columbus Ohio*
>
>
> On Thu, Oct 14, 2010 at 11:45 AM, Alexey Lyashkov <
> alexey.lyashkov at clusterstor.com> wrote:
>
>> Hi Vilobh,
>>
>> interop == interoperability between nodes running different versions of
>> the software.
>>
>> In general we have two ways to solve that. For requests with an mdc_body,
>> you can set a flag in the body and check that flag on the server/client
>> side. If you want to add a new operation, the better way is to add a new
>> flag to connect_data (look at how the OBD_CONNECT_* macros are handled).
>> That flag can be checked via export->connect_flags on the client or server
>> side to detect the remote side's features.
>> For example, 1.x and 2.0 use different formats for setattr requests:
>> int mdc_setattr
>> ...
>>         if (mdc_exp_is_2_0_server(exp)) {
>>                 size[REQ_REC_OFF] = sizeof(struct mdt_rec_setattr);
>>                 size[REQ_REC_OFF + 1] = 0; /* capa */
>>                 size[REQ_REC_OFF + 2] = 0; //sizeof (struct mdt_epoch);
>>                 size[REQ_REC_OFF + 3] = ealen;
>>                 size[REQ_REC_OFF + 4] = ea2len;
>>                 size[REQ_REC_OFF + 5] = sizeof(struct ldlm_request);
>>                 offset = REQ_REC_OFF + 5;
>>                 bufcount = 6;
>>                 replybufcount = 6;
>>         } else {
>>                 bufcount = 4;
>>         }
>>
>>
>> An example of checking a client feature is the check for version based
>> recovery support on the client:
>> mds_version_get_check
>> ...
>>         if (inode == NULL || !exp_connect_vbr(req->rq_export))
>>
>> I hope that helps you.
>>
>>
>> On Oct 14, 2010, at 18:29, Vilobh Meshram wrote:
>>
>> Hi Alexey,
>>
>> Thanks again for the reply.
>>
>> Can you briefly give me some pointers about this interop issue, and in
>> which kind of RPC it arises? How should we resolve it, and what kind of
>> flag needs to be set?
>>
>> I went through the Bugzilla entry you mentioned; it seems that RPCs
>> dealing with LDLM may cause this issue. Please correct me if I am wrong.
>>
>> Thanks,
>> Vilobh
>> *Graduate Research Associate
>> Department of Computer Science
>> The Ohio State University Columbus Ohio*
>>
>>
>> On Thu, Oct 14, 2010 at 11:10 AM, Alexey Lyashkov <
>> alexey.lyashkov at clusterstor.com> wrote:
>>
>>> Hi Vilobh,
>>>
>>> As I see it, you touched code related to locking. struct ldlm_request is
>>> used in the lock enqueue process; that is why I mentioned the interop
>>> issue in the ELC code, which was solved with an export flag.
>>> For common mdc requests you can resolve the interop issue with flags in
>>> mdc_body (mdt_body), but that is not possible for ldlm requests.
>>>
>>>
>>> On Oct 14, 2010, at 18:04, Vilobh Meshram wrote:
>>>
>>> Hi Alexey,
>>>
>>> Thanks again for your reply.
>>>
>>> I am trying to embed a buffer in the RPC which will get filled in with
>>> some values that the MDS knows but the client calling the RPC does not.
>>> It has nothing to do with locking. I just want the MDS to fill in the
>>> buffer that I embed in the RPC with some suitable data, and then operate
>>> on that data at the client side. So I think the approach suggested by you
>>> and Nicolas, of just including sizeof(str) [the size of the information
>>> expected from the MDS] in the size[] array, should be fine, as done below:
>>>
>>>
>>>
>>> __u32 size[2] = { [MSG_PTLRPC_BODY_OFF] = sizeof(struct ptlrpc_body),
>>>                   [DLM_LOCKREQ_OFF]     = sizeof(struct ldlm_request) };
>>>
>>> ---->>
>>> __u32 size[3] = { [MSG_PTLRPC_BODY_OFF] = sizeof(struct ptlrpc_body),
>>>                   [DLM_LOCKREQ_OFF]     = sizeof(struct ldlm_request),
>>>                   /* How do I add "char *str=Hello" here? Of course we
>>>                    * will have sizeof(str), but how do I choose the macro
>>>                    * (like DLM_LOCKREQ_OFF), since for a specific kind of
>>>                    * RPC there is a limited number of such macros? */
>>>
>>>
>>> Please correct me if I am wrong, or please guide me if I need to
>>> consider a few corner cases to handle this use case.
>>>
>>> Thanks again.
>>>
>>> Thanks,
>>> Vilobh
>>> *Graduate Research Associate
>>> Department of Computer Science
>>> The Ohio State University Columbus Ohio*
>>>
>>>
>>> On Thu, Oct 14, 2010 at 10:40 AM, Alexey Lyashkov <
>>> alexey.lyashkov at clusterstor.com> wrote:
>>>
>>>> Andreas,
>>>>
>>>> On Oct 14, 2010, at 17:31, Andreas Dilger wrote:
>>>>
>>>> > On 2010-10-13, at 23:18, Nicolas Williams wrote:
>>>> >> On Thu, Oct 14, 2010 at 06:38:16AM +0300, Alexey Lyashkov wrote:
>>>> >>> On Oct 14, 2010, at 03:28, Nicolas Williams wrote:
>>>> >>>> Yes, it's possible to add buffers to requests.  It's not possible
>>>> >>>> to add buffers to _replies_ to existing RPCs unless you know the
>>>> >>>> client expects those additional buffers -- existing clients expect
>>>> >>>> a given maxsize for each reply, and if your reply is bigger then it
>>>> >>>> will get dropped.
>>>> >>> That has been wrong for the last ~1 year.
>>>> >>> About a year ago I added code to the ptlrpc layer which adjusts the
>>>> >>> buffer for the reply and resends the request.
>>>> >>
>>>> >> Ah, I didn't know that was in 1.8.  Are there interop issues (with
>>>> >> older clients) though with sending larger replies than expected?
>>>> >
>>>> > Nico, it has always been possible in the past to increase the size of
>>>> > any buffer in a request, or in a reply (if the total reply size will
>>>> > fit into the pre-allocated reply buffer).  An older peer would just
>>>> > ignore the bytes beyond the known part of the buffer.
>>>> >
>>>> I think the question is not about rebalancing buffer sizes within a
>>>> message; I think it is about sending a large reply into a smaller reply
>>>> buffer. LNet is not able to put a large reply into a small buffer
>>>> (without the truncate flag, which does not exist in older ptlrpc
>>>> versions). Without that flag you will see messages like
>>>>
>>>>                CERROR("Matching packet from %s, match "LPU64
>>>>                       " length %d too big: %d left, %d allowed\n",
>>>>                       libcfs_id2str(src), match_bits, rlength,
>>>>                       md->md_length - offset, mlength);
>>>>
>>>> and LNet will drop the message without notifying PtlRPC.
>>>>
>>>>
>>>> > Is that not true with the 2.x RPC handling?
>>>> >
>>>> 2.x is able to rebalance space between buffers (though it looks like
>>>> this is done by hand), and is able to adjust the reply buffer after a
>>>> truncated reply.
>>>>
>>>>
>>>>
>>>> --------------------------------------
>>>> Alexey Lyashkov
>>>> alexey.lyashkov at clusterstor.com
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>