[Lustre-discuss] Tar backup of MDT runs extremely slow, tar pauses on pointers to very large files

Jeff Johnson jeff.johnson at aeoncomputing.com
Wed May 30 15:57:49 PDT 2012

Following up on my original post. I switched from the /bin/tar that comes 
with RHEL/CentOS 5.x to the Whamcloud patched tar utility. The entire 
backup was successful and took only 12 hours to complete. CPU 
utilization was high (>90%) but only on one core. The process was much 
faster than the standard tar shipped in RHEL/CentOS, and the only 
slowdowns were on file pointers to very large files (100TB+) with large 
stripe counts. The files that were going very slowly when I reported the 
initial problem were backed up instantly with the Whamcloud version of tar.
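
One plausible explanation for the stall (an assumption, not something 
confirmed in this thread) is that the MDT inode records the full file size 
while holding no data blocks, so an archiver that scans the whole apparent 
length for holes has to read through all of it. The sketch below only 
illustrates that gap between apparent and allocated size on an ordinary 
local sparse file. The helper names are hypothetical, and it does not 
touch Lustre or tar.

```python
import os
import tempfile


def make_sparse_file(path, apparent_bytes):
    """Create a file whose size is set without writing any data."""
    with open(path, "wb") as f:
        f.truncate(apparent_bytes)


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # On Linux, st_blocks is counted in 512-byte units,
    # independent of the filesystem block size.
    return st.st_size, st.st_blocks * 512


def is_hole_only(path):
    """True if the file has a nonzero size but no allocated blocks."""
    apparent, allocated = size_report(path)
    return apparent > 0 and allocated == 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 64 * 1024 * 1024)
        apparent, allocated = size_report(path)
        print("apparent bytes: ", apparent)
        print("allocated bytes:", allocated)
        print("hole only:      ", is_hole_only(path))
```

An archiver that checks the allocated size first can skip reading such a 
file entirely. Whether the patched tar does exactly this is not stated here.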

Best part, the MDT was saved and the 4PB filesystem is in production again.


On 5/30/12 3:02 PM, Andreas Dilger wrote:
> On 2012-05-29, at 1:28 PM, Peter Grandi wrote:
>>> The tar backup of the MDT is taking a very long time. So far it has
>>> backed up 1.6GB of the 5.0GB used in nine hours. In watching the tar
>>> process pointers to small or average size files are backed up quickly
>>> and at a consistent pace. When tar encounters a pointer/inode
>>> belonging to a very large file (100GB+) the tar process stalls on that
>>> file for a very long time, as if it were trying to archive the real
>>> filesize amount of data rather than the pointer/inode.
>> If you have stripes on, a 100GiB file will have 100,000 1MiB
>> stripes, and each requires a chunk of metadata. The descriptor
>> for that file will potentially have a very large number of
>> extents, scattered around the MDT block device, depending on how
>> slowly the file grew etc.
> While that may be true for other distributed filesystems, that is
> not true for Lustre at all.  The size of a Lustre object is not
> fixed to a "chunk size" like 32MB or similar, but rather is
> variable depending on the size of the file itself.  The number of
> "stripes" (== objects) on a file is currently fixed at file
> creation time, and the MDS only needs to store the location of
> each stripe (at most one per OST).  The actual blocks/extents of
> the objects are managed inside the OST itself and are never seen
> by the client or the MDS.
> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Lustre Engineer            http://www.whamcloud.com/
> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
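
Andreas's point can be put into rough numbers. The sketch below compares 
the per-file chunk count under Peter's fixed 1MiB-chunk model with a 
layout that stores one entry per stripe. The byte sizes (a 32-byte header 
plus 24 bytes per stripe) are my assumption about the v1 layout record, 
and the stripe count of 160 is only an example value. Neither comes from 
this thread.

```python
# Assumed sizes of the per-file layout record; treat these as illustrative.
LOV_HEADER_BYTES = 32      # assumed fixed header size
LOV_PER_STRIPE_BYTES = 24  # assumed size of one per-stripe (per-object) entry


def layout_bytes(stripe_count):
    """Layout metadata size when one entry is stored per stripe."""
    if stripe_count < 1:
        raise ValueError("stripe_count must be at least 1")
    return LOV_HEADER_BYTES + LOV_PER_STRIPE_BYTES * stripe_count


def fixed_chunk_count(file_bytes, chunk_bytes):
    """Number of chunks if every chunk had a fixed size (ceiling division)."""
    return -(-file_bytes // chunk_bytes)


if __name__ == "__main__":
    GiB = 2 ** 30
    MiB = 2 ** 20
    stripes = 160  # example stripe count, fixed at file creation
    for size in (1 * GiB, 100 * GiB, 10000 * GiB):
        print(
            size // GiB, "GiB:",
            fixed_chunk_count(size, MiB), "fixed 1MiB chunks vs",
            layout_bytes(stripes), "layout bytes",
        )
```

Under these assumptions the layout size depends only on the stripe count, 
so it stays the same as the file grows, while the fixed-chunk count grows 
linearly with file size.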

Jeff Johnson
Aeon Computing

jeff.johnson at aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117
