[Lustre-discuss] Lustre Intelligence?
Brock Palen
brockp at umich.edu
Wed Dec 10 17:24:03 PST 2008
So, a question: I had a user who thought his problem was the disk system; it turns out he was just OOMing the machine. His I/O code, though, looked like this:
/* Each rank opens one file per local index and writes N doubles,
 * one formatted value at a time with fprintf(). */
for (i = 0; i < kloc; i++) {
    /* Patch a per-rank, per-index number into the filename buffer. */
    sprintf(&(buffer[39]), "%d",
            i + (int)floor(rank * (double)k / (double)numprocs) + 1);
    f1 = fopen(buffer, "w");
    for (j = 0; j < N; j++)
        fprintf(f1, "%e\n", u[i][j]);
    fclose(f1);
}
The way I read this, every processor (each calls this function and writes to its own set of files) is writing one double at a time to its files. I/O performance, though, was still quite good.
I enabled extents_stats on rank 0 of this job and ran it. Here is what I ended up with (stats were zeroed, and this was the only job running on the client):
extents          calls   %  cum% | calls   %  cum%
  0K -    4K :      12   4    4 |     0   0    0
  4K -    8K :       0   0    4 |     0   0    0
  8K -   16K :       0   0    4 |     0   0    0
 16K -   32K :       0   0    4 |     0   0    0
 32K -   64K :       0   0    4 |     0   0    0
 64K -  128K :       0   0    4 |     0   0    0
128K -  256K :       0   0    4 |     0   0    0
256K -  512K :       0   0    4 |     0   0    0
512K - 1024K :       0   0    4 |     4   1    1
  1M -    2M :     136  47   51 |   220  98  100
  2M -    4M :     140  48  100 |     0   0  100
So 98% of the writes and reads (the read code is similar and reads in about 2 GB this way) were all 1-4 MB. Is this Lustre showing its preference for 1 MB I/O ops? Even though the code wanted to do 8 bytes at a time, did Lustre clean it up? Or did Linux do this someplace?
Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985