[lustre-devel] caching in Lustre

Patrick Farrell paf at cray.com
Tue Dec 13 05:33:13 PST 2016



Quentin,

I suspect the pages are only kept for the duration of an IO, then discarded.  I haven't dug into the exact mechanics, but when the caches are disabled, the key point is that no CACHING occurs, i.e., nothing can be read back from the cache.  So I assume the pages you see are transiently present for the purpose of performing the IO.  (The data from the disk has to go somewhere.)
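
To make that guess concrete, here is a small userspace sketch of the behaviour I am describing. It is not Lustre or kernel code: every name in it (toy_page, toy_cache, toy_find_or_create, toy_read) is made up for illustration, and it simply assumes that "cache disabled" means the page is dropped once the IO is done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16
#define TOY_SLOTS 8

/* Hypothetical stand-in for a page-cache page; not a kernel structure. */
struct toy_page {
    bool present;                 /* is the page currently in the cache? */
    char data[TOY_PAGE_SIZE];
};

struct toy_cache {
    struct toy_page slots[TOY_SLOTS];
    bool read_cache_enabled;      /* models the read_cache_enable tunable */
    unsigned disk_reads;          /* number of times we went to "disk" */
};

/* Loosely models find_or_create_page(): a page is always handed back,
 * whether it was already cached or had to be created. */
static struct toy_page *toy_find_or_create(struct toy_cache *c, size_t idx,
                                           bool *was_cached)
{
    assert(idx < TOY_SLOTS);
    struct toy_page *p = &c->slots[idx];
    *was_cached = p->present;
    p->present = true;
    return p;
}

/* Read one page. The IO path is identical in both modes; the tunable
 * only decides whether the page survives after the IO completes. */
static void toy_read(struct toy_cache *c, size_t idx, char *out)
{
    bool was_cached;
    struct toy_page *p = toy_find_or_create(c, idx, &was_cached);

    if (!was_cached) {
        /* Pretend to fetch the data from disk. */
        memset(p->data, 'A' + (int)idx, TOY_PAGE_SIZE);
        c->disk_reads++;
    }
    memcpy(out, p->data, TOY_PAGE_SIZE);

    if (!c->read_cache_enabled)
        p->present = false;       /* assumption: page is discarded after the IO */
}
```

With read_cache_enabled set to false, two reads of the same index both go to "disk"; with it set to true, the second read is served from the retained page. In both cases a page is allocated and used for the transfer, which would explain why the code paths look the same.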

- Patrick
________________________________
From: lustre-devel <lustre-devel-bounces at lists.lustre.org> on behalf of quentin.bouget at cea.fr <quentin.bouget at cea.fr>
Sent: Tuesday, December 13, 2016 4:02:06 AM
To: lustre-devel at lists.lustre.org
Subject: [lustre-devel] caching in Lustre


Hi all,

I am currently trying to work out how Lustre behaves when both "read_cache" and "writethrough_cache" are disabled. What I particularly want to know is how writing to the related proc files influences the cache policy.

To me (and perf_event reports it too on a 2.7 setup), the code always gets cache pages (using find_or_create_page() in "lustre/osd-ldiskfs/osd_io.c") even with both cache parameters set to 0.
After that, if caching is disabled, a call to generic_error_remove_page() is issued on the pages that were allocated. This function is described in the kernel sources like this:

/*
 * Used to get rid of pages on hardware memory corruption.
 */
int generic_error_remove_page(struct address_space *mapping, struct page *page)


This does not seem to be the "natural" call to use, but anyway, I can live with that.
What really bothers me is that the behaviour of Lustre from this point on looks exactly the same as if the cache were enabled. I can't find a single branching point that handles things differently: pages are kmapped, written to/read from, kunmapped... I am probably missing something, but I can't figure out what. Could someone please point me in the right direction?

The functions I find the most relevant to study are:
"lustre/ofd/ofd_io.c":
    ofd_preprw() -> ofd_preprw_read() / ofd_preprw_write()

their counterparts:
"lustre/ofd/ofd_io.c":
    ofd_commitrw() -> ofd_commitrw_read() / ofd_commitrw_write()

the handlers of the proc files "/proc/fs/lustre/obdfilter/*/{read,writethrough}_cache_enable":
"lustre/osd-ldiskfs/osd_lproc.c":
    ldiskfs_osd_cache_seq_write(), ldiskfs_osd_wcache_seq_write()

and the only places that use the variables set by the proc files (where generic_error_remove_page() is used):
"lustre/osd-ldiskfs/osd_io.c":
    osd_read_prep(), osd_write_prep()

(I suspect I am missing something really important about what generic_error_remove_page() does)


Regards

Quentin Bouget