[Lustre-discuss] IOR writing to a shared file, performance does not scale
Michael Kluge
Michael.Kluge at tu-dresden.de
Fri Feb 10 23:29:03 PST 2012
Hi Kshitij,
I would recommend running sgpdd-survey on the servers, first for a
single disk and then for multiple disks, followed by obdfilter-survey.
That tells you what your storage can deliver. You could then run the
LNET tests as well to see whether the network works properly. If the
disks and the network deliver the expected performance, IOR will most
probably perform well too.
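
As a rough illustration of the layered approach above, the following
Python sketch compares per-layer measurements and reports which layer
caps throughput. The layer figures are hypothetical placeholders, not
measurements from this system; only the IOR value comes from the
output quoted below. The function names are made up for this example.

```python
# Minimal sketch: find the slowest layer in a storage stack and compare
# an application-level result against it. All layer numbers below are
# HYPOTHETICAL placeholders; substitute your own survey results.

def find_bottleneck(layers):
    """Return (name, MiB/s) of the layer with the lowest throughput."""
    return min(layers.items(), key=lambda item: item[1])


def efficiency(app_rate, layers):
    """Return the limiting layer and the fraction of it the app reaches."""
    name, ceiling = find_bottleneck(layers)
    return name, app_rate / ceiling


if __name__ == "__main__":
    layers = {
        "sgpdd-survey (raw disks)": 900.0,     # hypothetical
        "obdfilter-survey (OST stack)": 800.0,  # hypothetical
        "LNET test (network)": 1000.0,          # hypothetical
    }
    ior_shared_file = 233.61  # MiB/s, from the quoted IOR output
    name, fraction = efficiency(ior_shared_file, layers)
    print(f"limiting layer: {name}")
    print(f"IOR reaches {fraction:.0%} of that layer")
```

If IOR lands far below the slowest layer, the storage and network are
probably not the limit and the client side deserves a closer look.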
Please see:
http://wiki.lustre.org/images/4/40/Wednesday_shpc-2009-benchmarking.pdf
Regards, Michael
On 10.02.2012 23:27, Kshitij Mehta wrote:
> We have Lustre 1.6.7 configured with 64 OSTs.
> I am testing the performance using IOR, a file system benchmark.
>
> When I run IOR with MPI such that all processes write to a shared file,
> performance does not scale. I tested with 1, 2, and 4 processes, and
> the performance remains constant at about 230 MBps.
>
> When processes write to separate files, performance improves greatly,
> reaching 475 MBps.
>
> Note that all processes are spawned on a single node.
>
> Here is the output:
> Writing to a shared file:
>
>> Command line used: ./IOR -a POSIX -b 2g -e -t 32m -w -o /fastfs/gabriel/ss_64/km_ior.out
>> Machine: Linux deimos102
>>
>> Summary:
>> api = POSIX
>> test filename = /fastfs/gabriel/ss_64/km_ior.out
>> access = single-shared-file
>> ordering in a file = sequential offsets
>> ordering inter file= no tasks offsets
>> clients = 4 (4 per node)
>> repetitions = 1
>> xfersize = 32 MiB
>> blocksize = 2 GiB
>> aggregate filesize = 8 GiB
>>
>> Operation  Max (MiB)  Min (MiB)  Mean (MiB)  Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)  Std Dev  Mean (s)
>> ---------  ---------  ---------  ----------  -------  ---------  ---------  ----------  -------  --------
>> write         233.61     233.61      233.61     0.00       7.30       7.30        7.30     0.00  35.06771 EXCEL
>>
>> Max Write: 233.61 MiB/sec (244.95 MB/sec)
>
> Writing to separate files:
>
>> Command line used: ./IOR -a POSIX -b 2g -e -t 32m -w -o /fastfs/gabriel/ss_64/km_ior.out -F
>> Machine: Linux deimos102
>>
>> Summary:
>> api = POSIX
>> test filename = /fastfs/gabriel/ss_64/km_ior.out
>> access = file-per-process
>> ordering in a file = sequential offsets
>> ordering inter file= no tasks offsets
>> clients = 4 (4 per node)
>> repetitions = 1
>> xfersize = 32 MiB
>> blocksize = 2 GiB
>> aggregate filesize = 8 GiB
>>
>> Operation  Max (MiB)  Min (MiB)  Mean (MiB)  Std Dev  Max (OPs)  Min (OPs)  Mean (OPs)  Std Dev  Mean (s)
>> ---------  ---------  ---------  ----------  -------  ---------  ---------  ----------  -------  --------
>> write         475.95     475.95      475.95     0.00      14.87      14.87       14.87     0.00  17.21191 EXCEL
>>
>> Max Write: 475.95 MiB/sec (499.07 MB/sec)
>
> I am trying to understand where the bottleneck is, when processes write
> to a shared file.
> Your help is appreciated.
>
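
For reference, the figures in the two IOR summaries quoted above can be
cross-checked with a few lines of arithmetic. The sketch below assumes
that IOR derives its rates from the aggregate file size (4 tasks x 2 GiB)
divided by the mean time, which is consistent with the reported numbers;
the function name is made up for this example.

```python
# Minimal sketch: recompute IOR's summary rates from block size, task
# count, transfer size, and elapsed time taken from the quoted output.

MIB = 1024 ** 2
GIB = 1024 ** 3


def ior_rates(tasks, block_bytes, transfer_bytes, seconds):
    """Return (MiB/s, MB/s, transfers/s) for one IOR write phase."""
    total_bytes = tasks * block_bytes
    mib_per_s = total_bytes / MIB / seconds
    mb_per_s = total_bytes / 1e6 / seconds
    ops_per_s = total_bytes / transfer_bytes / seconds
    return mib_per_s, mb_per_s, ops_per_s


if __name__ == "__main__":
    shared = ior_rates(4, 2 * GIB, 32 * MIB, 35.06771)    # single shared file
    per_file = ior_rates(4, 2 * GIB, 32 * MIB, 17.21191)  # file per process (-F)
    for label, rates in (("shared file", shared), ("file per process", per_file)):
        print(f"{label}: {rates[0]:.2f} MiB/s, {rates[1]:.2f} MB/s, {rates[2]:.2f} ops/s")
    print(f"file-per-process speedup: {per_file[0] / shared[0]:.2f}x")
```

The recomputed values agree with the reported ones to within rounding,
and the file-per-process run comes out roughly twice as fast as the
shared-file run.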
--
Dr.-Ing. Michael Kluge
Technische Universität Dresden
Center for Information Services and
High Performance Computing (ZIH)
D-01062 Dresden
Germany
Contact:
Willersbau, Room WIL A 208
Phone: (+49) 351 463-34217
Fax: (+49) 351 463-37773
e-mail: michael.kluge at tu-dresden.de
WWW: http://www.tu-dresden.de/zih