OK, I have been working with ldev.conf and failover.
In the past I did not configure the failnode column (I used '-' as directed). Because I reference disks by label, and label each disk with the OST/MDT/MGT it holds, I have the convenience of being able to use the very same ldev.conf on all my nodes.

So I added a reference to a failover node for one of my OSTs.
I noticed that I get an error:
ldev: Fatal: /etc/ldev.conf line 5: local and foreign host not mapped to each other

when running ldev -s (sanity check). I started digging into this and found a few things:

1. You cannot have more than one failnode per device (although you can use tunefs.lustre to set more than one).

2. Each node must map to the same failnode for all of its devices.

3. Every device on a listed failnode must map back to the node that lists it as a failnode.

   a. Nodes must have reciprocal mapping. That is, all devices on Node1 must map Node2 as failnode, and all devices on Node2 must map Node1 as failnode.
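
To make the rules above concrete, here is a rough Python sketch that checks an ldev.conf-style table against them. It only encodes the behaviour as I observed it, not how ldev itself is implemented. The host names (oss1, oss2, oss3), labels, and device paths are made up, and the function names parse and check are mine. I am also assuming that a failnode with no devices of its own listed is acceptable.

```python
from collections import defaultdict

# Hypothetical ldev.conf contents: local, foreign (or '-'), label, device.
SAMPLE = """\
# local  foreign  label          device
oss1     oss2     lustre-OST0000 /dev/disk/by-label/lustre-OST0000
oss1     oss2     lustre-OST0001 /dev/disk/by-label/lustre-OST0001
oss2     oss1     lustre-OST0002 /dev/disk/by-label/lustre-OST0002
oss3     -        lustre-OST0003 /dev/disk/by-label/lustre-OST0003
"""


def parse(text):
    """Return (line number, local host, failnode or None) for each device line."""
    rows = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        local, foreign = fields[0], fields[1]
        rows.append((lineno, local, None if foreign == "-" else foreign))
    return rows


def check(rows):
    """Return (line number, message) for every line that breaks the observed rules."""
    # Collect every failnode value (including None for '-') seen per local host.
    partners = defaultdict(set)
    for _, local, foreign in rows:
        partners[local].add(foreign)

    problems = []
    for lineno, local, foreign in rows:
        if len(partners[local]) > 1:
            # Rule 2: one failnode value for all devices on a node.
            problems.append((lineno, f"{local} lists more than one failnode value"))
        elif foreign is not None and partners.get(foreign, {local}) != {local}:
            # Rule 3: the failnode's own devices must all point back.
            # Assumption: a failnode with no devices of its own is allowed.
            problems.append((lineno, f"{local} and {foreign} not mapped to each other"))
    return problems


if __name__ == "__main__":
    for lineno, message in check(parse(SAMPLE)):
        print(f"line {lineno}: {message}")
```

With the sample table nothing is reported. Changing the oss2 line to use '-' as its failnode makes both oss1 lines fail the reciprocal check, which is the kind of configuration I would like to be able to use.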

I'm sure my circumstances may be contributing to this, but it seems useful to be able to have different OSTs fail to different nodes (or not at all) and to have multiple failnodes listed.

Would it make sense to have ldev use the tunefs.lustre settings to find failover.node settings and not use ldev.conf at all for that?

