[lustre-discuss] OSTs per OSS with ZFS
andreas.dilger at intel.com
Mon Jul 3 01:15:14 PDT 2017
We have seen performance improvements with multiple zpools/OSTs per OSS. However, with only five NVMe devices per OSS you don't have many choices in terms of redundancy, unless you are not using any redundancy at all and just want raw bandwidth?
The other thing to consider is how the network bandwidth compares to the NVMe bandwidth. With similar test systems using NVMe devices without redundancy we've seen multiple GB/s, so if you aren't using an OPA/IB network then the network will likely be your bottleneck. Even if TCP is fast enough, the CPU overhead and data copies will probably kill the performance.
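As a rough back-of-the-envelope check of the network vs. NVMe comparison, something like the following Python sketch could be used. The per-device throughput and link speeds are made-up placeholder numbers, not measurements from any real system, and the function name is hypothetical. It ignores protocol overhead, CPU cost and data copies, so it only gives an upper bound.

```python
def find_bottleneck(n_devices, per_device_gbytes_s, network_gbits_s):
    """Compare aggregate device bandwidth with raw network line rate.

    All inputs are assumptions supplied by the caller; protocol overhead,
    CPU cost and data copies are ignored, so this is only an upper bound.
    """
    storage_gbytes_s = n_devices * per_device_gbytes_s
    network_gbytes_s = network_gbits_s / 8.0  # bits -> bytes
    limit = "network" if network_gbytes_s < storage_gbytes_s else "storage"
    return storage_gbytes_s, network_gbytes_s, limit


if __name__ == "__main__":
    # Placeholder values: 5 devices at an assumed 2 GB/s each,
    # compared against a 10 Gbit/s and a 100 Gbit/s link.
    for link in (10, 100):
        storage, network, limit = find_bottleneck(5, 2.0, link)
        print(f"{link} Gbit/s link: storage {storage:.2f} GB/s, "
              f"network {network:.2f} GB/s, limited by {limit}")
```

With real numbers from your own devices and interconnect, the same comparison tells you whether tuning the zpool layout can matter at all, or whether the link caps throughput first.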
In the end, you can probably test a few configs to see which one gives the best performance: mirror, single RAID-Z, two RAID-Z pools on half-sized partitions, five no-redundancy zpools with one VDEV each, or a single no-redundancy zpool with five VDEVs.
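To make the candidate layouts concrete, here is a minimal Python sketch that prints one possible set of `zpool create` command lines for each of them. The device paths, the pool names (`ost0`, `ost1`, ...) and the `p1`/`p2` partition suffixes are all hypothetical. The "mirror" case assumes two mirrored pairs with the fifth device as a hot spare, which is only one way to read that option. The sketch only builds strings and does not run anything; formatting the pools as Lustre OSTs is left out.

```python
def zpool_layouts(devices, prefix="ost"):
    """Return {layout name: [zpool create command, ...]} for the given devices.

    Pool names and partition suffixes are illustrative assumptions.
    """
    devs = list(devices)
    all_devs = " ".join(devs)

    # Mirror: pair devices up; an odd one out becomes a hot spare (assumption).
    pairs = [devs[i:i + 2] for i in range(0, len(devs) - 1, 2)]
    mirror = f"zpool create {prefix}0 " + " ".join(
        "mirror " + " ".join(pair) for pair in pairs)
    if len(devs) % 2:
        mirror += f" spare {devs[-1]}"

    # Two RAID-Z pools, each built from one half-sized partition per device.
    half_a = " ".join(d + "p1" for d in devs)
    half_b = " ".join(d + "p2" for d in devs)

    return {
        "mirror": [mirror],
        "single raidz": [f"zpool create {prefix}0 raidz {all_devs}"],
        "two raidz on half partitions": [
            f"zpool create {prefix}0 raidz {half_a}",
            f"zpool create {prefix}1 raidz {half_b}",
        ],
        "one pool per device": [
            f"zpool create {prefix}{i} {d}" for i, d in enumerate(devs)
        ],
        "single striped pool": [f"zpool create {prefix}0 {all_devs}"],
    }


if __name__ == "__main__":
    # Hypothetical NVMe device names.
    nvme = [f"/dev/nvme{i}n1" for i in range(5)]
    for name, commands in zpool_layouts(nvme).items():
        print(f"# {name}")
        for cmd in commands:
            print(cmd)
```

Each layout can then be created in turn and benchmarked with the same workload before settling on one.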
PS - there is initial snapshot functionality in the 2.10 release.
> On Jul 2, 2017, at 10:07, Brian Andrus <toomuchit at gmail.com> wrote:
> We have been having some discussion about the best practices when creating OSTs with ZFS.
> The basic question is: What is the best ratio of OSTs per OSS when using ZFS?
> It is easy enough to do a single OST with all disks and have reliable data protection provided by ZFS. It may be a better scenario when snapshots of lfs become a feature as well.
> However, multiple OSTs can mean more stripes and faster reads/writes. I have seen some tests that were done quite some time ago which may not be so valid anymore with the updates to Lustre.
> We have a system for testing that has 5 NVMes each. We can do 1 zfs file system with all or we can separate them into 5 (which would forgo some of the features of zfs).
> Any prior experience/knowledge/suggestions would be appreciated.
> Brian Andrus