[Lustre-discuss] drbd slow I/O with lustre filesystem

Peter Grandi pg_lus at lus.for.sabi.co.UK
Mon Oct 5 07:42:25 PDT 2009


> I'm currently using drbd as a RAID 1 network device solution
> for my lustre storage system. It worked well for 3 months

Well chosen: that is a popular and quite interesting setup; I
think it should be the best/default Lustre setup when some
resilience is desired.

> but now, when I have to reformat our devices and re-synchronize
> the drbd devices (it took about 2 days with a 6TB

That's about 35MB/s, which is a bit low by any standard, but then
it is over a 1Gb/s link. Consider 10Gb/s.
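
For reference, the arithmetic behind that figure:

    6TB / (2 days) = 6,000,000MB / 172,800s =~ 35MB/s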

> raid-5 partition).

DRBD (RAID1) over RAID5? Nahh. Consider http://WWW.BAARF.com/ and
that the storage system of a Lustre pool over DRBD is ideally
suited to RAID10 (with each mirrored pair a DRBD resource; a
sketch of that layout follows). RAID5 may be contributing to your
speed problem below, because of the read-modify-write penalty on
small writes or because the array is itself being rebuilt/resynced.
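
A minimal sketch of that layout, assuming three local disks per
node, each backing its own DRBD resource (the device names are
placeholders, not taken from your setup):

    # Each DRBD device is one network mirror (the RAID1 half);
    # striping (RAID0) across the DRBD devices gives RAID10 overall.
    mdadm --create /dev/md0 --level=0 --raid-devices=3 \
          /dev/drbd0 /dev/drbd1 /dev/drbd2

The OST would then be formatted on /dev/md0 with mkfs.lustre as
usual.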

> After formatting them with lustre format (using mkfs.lustre),
> I start to copy data to my drbd devices, but:

> - Its I/O wait, when I monitor with top or iostat, is too high,
> about 25%

This is not necessarily related to anything... After all you are
doing a lot of I/O, jumping around on the disk, while doing a
restore.
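
As an aside, the extended iostat output is more informative than
the bare %iowait figure, since it shows per-device latency and
utilization:

    # Per-device stats every 5 seconds; watch 'await' (request
    # latency, ms) and '%util' on the DRBD backing disks.
    iostat -x 5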

> - The copy speed from my web client to our OST using drbd
> devices is too low, only about 13MB/s, although client and OST
> are on the same 1Gb Ethernet LAN.

Too few details about this. Things to check:

* Raw network speed: I like 'nuttcp' to check it (see the example
  after this list). The usual tricks (larger send/receive buffers,
  jumbo frames, ...) may help if there are issues. But then you
  were getting 35MB/s above.
    http://lists.centos.org/pipermail/centos/2009-July/079505.html
* If you are using LVM2, bad news.
    http://archives.free.net.ph/message/20070815.091608.fff62ba9.en.html
* Using RAID5 as argued above may be detrimental.
* The DRBD must be configured to allow higher sync speeds (a
  config sketch follows this list):
    http://www.ossramblings.com/drbd_defaults_too_slow
    http://www.linux-ha.org/DRBD/FAQ#head-e09d2c15ba7ff691ecd5d5d7b848a50d25a3c3eb
  Your initial sync however seemed to run at 35MB/s, so I wonder.
  Maybe tune the "unplug" watermark in DRBD or, if you have
  battery backup, enable no-flush mode.
    http://archives.free.net.ph/message/20081219.085301.997727d2.en.html
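
For the raw network check, a quick nuttcp run between the two
hosts (the hostname is a placeholder; nuttcp must be installed on
both ends):

    # On the receiving host (e.g. the OSS):
    nuttcp -S
    # On the sending host: a 30-second TCP throughput test
    nuttcp -T30 oss1.example.com

Anything well below the ~110MB/s that an otherwise idle 1Gb/s link
can carry points at the network rather than at DRBD or Lustre.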
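
And a sketch of the relevant drbd.conf knobs (DRBD 8.x syntax; the
values are starting points to be tuned, not recommendations):

    resource r0 {
      syncer {
        rate 100M;              # let resync use up to ~100MB/s
      }
      net {
        max-buffers      8192;
        max-epoch-size   8192;
        sndbuf-size      512k;
        unplug-watermark 1024;
      }
      disk {
        no-disk-flushes;        # ONLY with battery-backed write cache
        no-md-flushes;          # same caveat applies
      }
    }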

> When I tried using one OST without drbd, it worked quite well

It might mean that it is mainly a DRBD issue. You might want to
get the latest DRBD version, as some earlier versions had
performance problems. If you have RHEL, the ELRepo repository has
fairly recent ones.
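
To check which version you are running now:

    # The first line of /proc/drbd reports the loaded module
    # version, e.g. 'version: 8.3.x'
    cat /proc/drbd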

> So, could anyone please tell me where the problem is? In our
> drbd devices or because of lustre? Is there anyone who has the
> same problem as me? :(

All of the above, probably: max performance here means ensuring
that write requests are issued as fast as possible, so that
back-to-back packets/blocks are then possible both on the network
and on the storage system...

  http://www.gossamer-threads.com/lists/drbd/users/17991
  http://lists.linbit.com/pipermail/drbd-user/2007-August/007256.html
  http://lists.linbit.com/pipermail/drbd-user/2009-January/011165.html
  http://lists.linbit.com/pipermail/drbd-user/2009-January/011198.html

It may conceivably be quicker for you to load all your data first
on the primary half of the pair with the secondary disconnected,
and then reconnect the secondary and let it resync.
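
A sketch of that sequence with drbdadm (the resource name r0 is a
placeholder; DRBD 8.x commands):

    drbdadm disconnect r0   # stop replication to the secondary
    # ... load the data onto the primary ...
    drbdadm connect r0      # reconnect; DRBD resyncs changed blocks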

My impression is that the problem is unlikely to originate on the
Lustre side, but rather in the underlying layers mentioned above.
There is a fair bit of material on DRBD optimization, both on its
own site and, more specifically, around the MySQL community, where
it is very commonly used and performance matters a lot.


