[lustre-discuss] [PATCH 04/37] staging/lustre: tracefile: use 64-bit seconds

Drokin, Oleg oleg.drokin at intel.com
Thu Sep 24 18:52:37 PDT 2015

On Sep 24, 2015, at 3:46 AM, Arnd Bergmann wrote:

> On Thursday 24 September 2015 04:02:09 Drokin, Oleg wrote:
>>> The lustre tracefile has a timestamp defined as
>>>      __u32 ph_sec;
>>>      __u64 ph_usec;
>>> which seems completely backwards, as the microsecond portion of
>>> a time stamp will always fit into a __u32 value, while the second
>>> portion will overflow in 2038 or 2106 (in case of unsigned seconds).
>>> This rectifies the situation by swapping out the types to have
>>> 64-bit seconds like everything else.
>>> While this constitutes an ABI change, it seems to be reasonable
>>> for a debugging interface to change and is likely what was
>>> originally intended.
>> This is going to wreak some havoc as the old tools would obviously
>> misrepresent this, but the new tools also cannot assume blindly
>> this change is in place, since people tend to stick to old
>> lustre modules for a long time in production for various reasons,
>> while the tools might get upgraded.
>> So I wonder if we should include some sort of a hint somewhere that
>> the lctl could read and see which format it's going to convert from.
>> Either that or we'd need to play with some heuristic in the tools
>> to observe where the leading zeros are (in little ending) in one
>> and the other case (if the year is not quite 2038 yet) and
>> make a decision based on that.
> Ok, I see.
> If you can prove that the user space tools interpret this value
> as a unsigned 32-bit number, that would work until 2106 and we could
> document it as a restriction that way.

Well, let's see.
The same structure definition is used as in the kernel (in fact it's
the same header, just before it got adopted into the staging tree).

The users are in lnet/utils/debug.c in the same lustre tree I pointed you at
(tools live there too):

static int cmp_rec(const void *p1, const void *p2)
        struct dbg_line *d1 = *(struct dbg_line **)p1;
        struct dbg_line *d2 = *(struct dbg_line **)p2;

        if (d1->hdr->ph_sec < d2->hdr->ph_sec)
                return -1;
        if (d1->hdr->ph_sec == d2->hdr->ph_sec &&
            d1->hdr->ph_usec < d2->hdr->ph_usec)
                return -1;
        if (d1->hdr->ph_sec == d2->hdr->ph_sec &&
            d1->hdr->ph_usec == d2->hdr->ph_usec)
                return 0;
        return 1;

               bytes = snprintf(out, sizeof(out),
                                "(%s:%u:%s()) %s",
                                hdr->ph_subsys, hdr->ph_mask,
                                hdr->ph_cpu_id, hdr->ph_type,
                                hdr->ph_flags & PH_FLAG_FIRST_RECORD ? "F" : "",
                                hdr->ph_sec, (unsigned long long)hdr->ph_usec,
                                hdr->ph_stack, hdr->ph_pid, hdr->ph_extern_pid,
                                line->file, hdr->ph_line_num, line->fn,

        fprintf(stderr, "  seconds = %u\n", hdr->ph_sec);

                if (hdr->ph_len > 4094 ||       /* is this header bogus? */
                    hdr->ph_stack > 65536 ||
                    hdr->ph_sec < (1 << 30) ||
                    hdr->ph_usec > 1000000000 ||
                    hdr->ph_line_num > 65536) {

These are all the users.
Seems to be pretty much u32 (or unsigned int when printing) everywhere.

> Another option would be to change the code storing the times there
> to do:
> 	header->ph_sec = (u32)ts.tv_sec;
> 	header->ph_usec = (ts.tv_sec & 0xffffffff00000000ull) | 
> 			   (ts.tv_nsec / NSEC_PER_USEC);
> and do the reverse on the user space side. This would be both endian-
> safe and backwards compatible, although rather ugly.

Indeed, thats' quite ugly.
I think in the next 90 years we will need some other adjustments to the
log format and should be able to include this one with them
(e.g. there are only 16 bits for the cpu, considering we have 260+ core
CPUs on the market today, might as well overflow that quite soon).

More information about the lustre-discuss mailing list