[lustre-discuss] ZFS w/Lustre problem

Steve Thompson smt at vgersoft.com
Fri Nov 6 12:23:08 PST 2020

This may be a question for the ZFS list...

I have Lustre 2.12.5 on CentOS 7.8 with ZFS 0.7.13 and a 10Gb network. I make 
snapshots of the Lustre filesystem with 'lctl snapshot_create' and at a 
later time transfer these snapshots to a backup system with zfs send/recv. 
This works well for everything but the MDT. For the MDT, I find that the 
zfs recv always fails when a little less than 1GB has been transferred 
(this being an incremental send/recv of snapshots taken a day apart):

# zfs send -v -c -i fs0pool/mdt0@03-nov-2020 fs0pool/mdt0@04-nov-2020 | \
 	zfs recv -F backups/fs0pool/mdt0
12:11:18    946M   fs0pool/mdt0@04-nov-2020-01:00
12:11:19    946M   fs0pool/mdt0@04-nov-2020-01:00
12:11:20    946M   fs0pool/mdt0@04-nov-2020-01:00
cannot receive incremental stream: dataset does not exist

while if the data transfer is much smaller, the send/recv works. Once I 
get a failure, it is not possible to complete a send/recv for any 
subsequent day, so I am instead doing a full snapshot send to a file; this 
always works and takes about 5 to 6 minutes for my MDT. When using zfs 
send/recv, the recv is always very slow (several hours to reach the above 
failure point, even when using mbuffer). I am using custom zfs replication 
scripts, but it also fails with the zrep package.
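
For illustration, the full-send-to-file workaround described above could be 
sketched roughly as follows. This is a minimal sketch, not the actual 
replication script: the function name, the DRY_RUN switch and the output 
path are hypothetical, and reusing the -c flag from the incremental command 
is an assumption.

```shell
#!/bin/sh
# Sketch only: full (non-incremental) send of one snapshot to a file.
# full_send_to_file, DRY_RUN and the output path are hypothetical names
# introduced for this example; they are not part of ZFS or Lustre.

full_send_to_file() {
    if [ "$#" -ne 3 ]; then
        echo "usage: full_send_to_file DATASET SNAPSHOT OUTFILE" >&2
        return 2
    fi
    dataset=$1     # e.g. fs0pool/mdt0
    snapshot=$2    # e.g. 04-nov-2020
    outfile=$3     # placeholder path for the stream file

    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "zfs send -c ${dataset}@${snapshot} > ${outfile}"
        return 0
    fi

    # Assumption: -c (compressed stream) is used as in the incremental case.
    zfs send -c "${dataset}@${snapshot}" > "${outfile}"
}
```

With DRY_RUN=1 set, calling 
'full_send_to_file fs0pool/mdt0 04-nov-2020 /some/path/mdt0-full.zfs' 
only prints the command that would be run, which is a cheap way to check 
the dataset and snapshot names before starting a transfer.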

Does anyone know of a possible explanation? Is there any version of ZFS 
0.8 that works with Lustre 2.12.5?

Steve Thompson                 E-mail:      smt AT vgersoft DOT com
Voyager Software LLC           Web:         http://www DOT vgersoft DOT com
3901 N Charles St              VSW Support: support AT vgersoft DOT com
Baltimore MD 21218
   "186,282 miles per second: it's not just a good idea, it's the law"
