[Lustre-discuss] Tar backup of MDT runs extremely slow, tar pauses on pointers to very large files

Jeff Johnson jeff.johnson at aeoncomputing.com
Mon May 28 22:08:20 PDT 2012


I am aiding in the recovery of a multi-petabyte Lustre filesystem
(1.8.7) that went down hard due to a site-wide power loss. The power
loss put the MDT RAID volume in a critical state. I was able to get
the md RAID-based MDT device mounted read-only, and the MDT itself
mounted read-only as type ldiskfs.

I was able to successfully back up the extended attributes of the MDT.
This process took about 10 minutes.

The tar backup of the MDT is taking a very long time. So far it has
backed up 1.6GB of the 5.0GB used in nine hours. Watching the tar
process, pointers to small or average-size files are backed up quickly
and at a consistent pace. When tar encounters a pointer/inode
belonging to a very large file (100GB+), the process stalls on that
file for a very long time, as if it were trying to archive the file's
full data size rather than just the pointer/inode.
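
One way to test that suspicion would be to compare an entry's apparent
size with the space actually allocated to it. The following is only a
sketch, assuming GNU coreutils (stat, truncate). The function name
report_alloc and the demo file are made up for illustration, and the
demo runs against a scratch directory, not the MDT.

```shell
#!/bin/sh
# Sketch only: compare a file's apparent size with its allocated space.
# Assumes GNU coreutils; report_alloc and the demo file are hypothetical.

# Print "<apparent bytes> <allocated bytes> <path>" for one file.
report_alloc() {
    apparent=$(stat -c %s "$1")   # st_size: the length tar is told the file has
    blocks=$(stat -c %b "$1")     # number of blocks actually allocated
    blksz=$(stat -c %B "$1")      # size in bytes of each block reported by %b
    echo "$apparent $((blocks * blksz)) $1"
}

# Demo on a scratch file: 1 GiB apparent size, no data written.
demo_dir=$(mktemp -d)
truncate -s 1G "$demo_dir/sparse_demo"
report_alloc "$demo_dir/sparse_demo"
```

Run against one of the entries tar stalls on, a very large first number
next to a very small second one would suggest the entry is a sparse
file with a large apparent size and little or no allocated data.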

During this process there are no errors reported by the kernel,
ldiskfs, md or tar, and nothing that would indicate why things are so
slow on pointers to large files. While watching the tar process, its
CPU utilization is at or near 100%, so it is doing something. Running
iostat at the same time shows that while tar is at or near 100% CPU,
there are no reads taking place on the MDT device and no writes to the
device where the tarball is being written.

It appears that the tar process goes to outer space when it encounters
pointers to very large files. Is this expected behavior?

The backup command used is the one from the MDT backup process in the
1.8 manual: 'tar zcvf <tarfile> --sparse .'
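
For anyone wanting to reproduce the behavior off the MDT, the same
invocation can be tried on a scratch directory holding a single sparse
file. This is a sketch only, assuming GNU tar and coreutils. The
directory names and the sparse_demo file are placeholders and are not
part of the manual's procedure.

```shell
#!/bin/sh
# Sketch only: run the same tar invocation against a scratch directory.
# Assumes GNU tar and coreutils; all paths below are hypothetical.

src_dir=$(mktemp -d)      # stands in for the ldiskfs mount point
out_dir=$(mktemp -d)      # keep the tarball outside the tree being archived
tarfile="$out_dir/backup.tgz"

# One fully sparse file: 64 MiB apparent size, no data written.
truncate -s 64M "$src_dir/sparse_demo"

# Same options as quoted above; cd in a subshell so '.' means src_dir.
( cd "$src_dir" && tar zcvf "$tarfile" --sparse . )

# Compare the apparent input size with the size of the resulting archive.
in_bytes=$(stat -c %s "$src_dir/sparse_demo")
tar_bytes=$(stat -c %s "$tarfile")
echo "input: $in_bytes bytes, archive: $tar_bytes bytes"
```

Wrapping the tar line in `time` and increasing the file size would show
how the run time scales with apparent file size for a given tar
version.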

df reports the ldiskfs MDT as 5GB used:
/dev/md0           2636788616   5192372 2455778504   1% /mnt/mdt

df -i reports the ldiskfs MDT as having 10,300,000 inodes used:
/dev/md0           1758199808 10353389 1747846419    1% /mnt/mdt
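
Putting the figures quoted above together gives a rough sense of
scale. This is back-of-the-envelope arithmetic only. It assumes df is
reporting 1K blocks (the header line is not shown) and that the backup
would continue at the average rate seen so far.

```shell
#!/bin/sh
# Back-of-the-envelope arithmetic from the numbers in the post.
# Assumption: df is reporting 1K blocks (5192372 blocks ~ 5GB used).

used_kb=5192372        # "Used" column of df
used_inodes=10353389   # "IUsed" column of df -i

# Average space used per inode, in bytes (integer division).
bytes_per_inode=$(( used_kb * 1024 / used_inodes ))

# Average tar throughput so far: 1.6GB in nine hours, in KB/s.
rate_kbs=$(awk 'BEGIN { printf "%.1f", 1.6 * 1024 * 1024 / (9 * 3600) }')

# Hours left for the remaining 3.4GB if that average rate held.
hours_left=$(awk 'BEGIN { printf "%.1f", (5.0 - 1.6) / 1.6 * 9 }')

echo "bytes per inode: $bytes_per_inode"
echo "average rate:    $rate_kbs KB/s"
echo "hours remaining: $hours_left"
```

That works out to roughly 0.5KB used per inode, an average of about
52KB/s, and on the order of 19 more hours at the current pace.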

Any feedback is appreciated!


Jeff Johnson
Aeon Computing

jeff dot johnson at aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845

4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117
