[lustre-discuss] Can Linux FS-Cache/CacheFS run on Lustre 2.10.x

Dilger, Andreas andreas.dilger at intel.com
Tue Dec 12 13:59:42 PST 2017

On Dec 11, 2017, at 20:31, Forrest.Wc.Ling at dell.com wrote:
> Dear All:
> The DGX-1 storage best practice for deep learning calls for 4x SSDs as a local cache backend, using Linux cacheFS to improve I/O performance.
> http://docs.nvidia.com/deeplearning/dgx/pdf/Best-Practices.pdf
> 6.1.1. Internal Storage
> The first storage consideration is storage within the DGX-1 itself. For the best possible performance, a NFS read cache has been included in the DGX-1 appliance using the Linux cacheFS capability. It uses four SSD’s in a RAID-0 group. The drives are connected to a dedicated hardware RAID controller.
> I looked at the Red Hat Enterprise Linux manual, which describes FS-Cache as a persistent local cache that file systems can use to take data retrieved over the network and cache it on local disk. This helps minimize network traffic for users accessing data from a file system mounted over the network (for example, NFS).
> https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/ch-fscache
> Can Linux FS-Cache/CacheFS run on Lustre 2.10.x, or will the Persistent Client-side Cache in Lustre 2.12, to be released next year, provide the same function that FS-Cache provides for NFS?

It is not possible to use FSCache/CacheFS with Lustre today (at least there is no public patch that I know of that does this).

The Persistent Client Cache (PCC) feature will give Lustre functionality equivalent to CacheFS/FSCache. We have been discussing whether to use CacheFS/FSCache for Lustre instead of the dedicated PCC code, but there are significant differences in the architecture (file-based vs. block-based cache). We are leaning toward the dedicated PCC code, since it provides better functionality for Lustre as well as the flexibility to modify it to suit our needs.

If you are interested in this, it would be useful if you could test the PCC patch and run the NVIDIA benchmarks to compare Lustre+PCC vs. NFS+CacheFS. I doubt that SSDs in RAID-0 make sense compared with a single PCIe NVMe device such as a P3700 or similar.
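
As a rough starting point for such a comparison (this is not the NVIDIA benchmark, only a minimal sketch), something like the following could time repeated sequential reads of the same file through each mount. The mount points and file name are hypothetical placeholders. The client page cache can serve the repeat reads from RAM, so it would need to be dropped between passes to measure the SSD/NVMe cache itself.

```python
import os
import time


def read_throughput_mib_s(path, block_size=1 << 20):
    """Read `path` sequentially; return (bytes_read, MiB per second)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    # Guard against a zero elapsed time on very small files.
    return total, (total / (1 << 20)) / max(elapsed, 1e-9)


def compare_mounts(paths, passes=2):
    """Time `passes` reads of each file; the first pass is usually the cold one."""
    results = {}
    for label, path in paths.items():
        results[label] = [read_throughput_mib_s(path)[1] for _ in range(passes)]
    return results


if __name__ == "__main__":
    # Hypothetical paths (assumptions, not from this thread); substitute real ones.
    candidates = {
        "lustre+pcc": "/mnt/lustre/dataset/sample.bin",
        "nfs+fscache": "/mnt/nfs/dataset/sample.bin",
    }
    existing = {k: v for k, v in candidates.items() if os.path.exists(v)}
    for label, rates in compare_mounts(existing).items():
        print(label, ["%.1f MiB/s" % r for r in rates])
```

A single large file read like this only approximates a training input pipeline; many small files read in random order would stress the cache quite differently.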

Cheers, Andreas
Andreas Dilger
Lustre Principal Architect
Intel Corporation
