[Lustre-discuss] lustre performance question (contd)
Balagopal Pillai
pillai at mathstat.dal.ca
Tue Dec 25 08:24:11 PST 2007
On Tue, 25 Dec 2007, Aaron Knister wrote:
Looks like there were lots of tiny writes (no reads) at the time this
happened. Can't really blame the tiny writes, though; I was testing by
making a tar file of /usr/lib. Here is the playback from collectl on a
simple tar right after the volume data1 is mounted.
### RECORD 7 >>> lustre1 <<< (1198592822) (Tue Dec 25 10:27:02 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 207 429M 0 0 0 0 0 0 2 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
There is 429M of 1K I/O. Here is the collectl playback from when
data1 starts working normally, about 10 - 15 minutes after mounting.
### RECORD 5 >>> lustre1 <<< (1198598746) (Tue Dec 25 12:05:46 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 18 17K 0 0 1 0 0 0 0 0 17
10 10K 0 0 0 0 0 0 0 0 10
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 6 >>> lustre1 <<< (1198598757) (Tue Dec 25 12:05:57 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 222 3625 7 5 195 15 0 0 0 0 0
1 403 0 0 0 0 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 7 >>> lustre1 <<< (1198598767) (Tue Dec 25 12:06:07 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 53 846 2 1 46 4 0 0 0 0 0
2 1866 0 0 0 0 0 0 0 0 1
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 8 >>> lustre1 <<< (1198598777) (Tue Dec 25 12:06:17 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 11 10K 0 0 1 0 0 0 0 0 10
8 8499 0 0 0 0 0 0 0 0 9
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 9 >>> lustre1 <<< (1198598787) (Tue Dec 25 12:06:27 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 245 3914 2 6 222 15 0 0 0 0 0
2 421 0 0 1 0 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 10 >>> lustre1 <<< (1198598797) (Tue Dec 25 12:06:37 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 105 1678 5 4 84 12 0 0 0 0 0
3 171 0 0 2 1 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 11 >>> lustre1 <<< (1198598807) (Tue Dec 25 12:06:47 2007)
###
This looks normal. I occasionally see the same thing on another
Lustre installation (1.6.0.1), but it goes away after a while.
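As a quick way to quantify what the first record shows, the per-size write
histogram can be folded into a single "tiny I/O" fraction. A minimal sketch
in Python; the helper names and the decimal reading of collectl's K/M count
suffixes are my own assumptions, not part of collectl:

```python
# Rough diagnostic helpers for the collectl BRW histograms shown above.
# This is a sketch, not part of collectl; in particular the assumption
# that collectl's K/M count suffixes are decimal (10^3/10^6) is mine.

def parse_count(token):
    """Turn a collectl-style count token such as '429M' or '17K' into an int."""
    suffixes = {"K": 10**3, "M": 10**6, "G": 10**9}
    if token and token[-1] in suffixes:
        return int(token[:-1]) * suffixes[token[-1]]
    return int(token)

def tiny_io_fraction(buckets):
    """Fraction of I/Os that landed in the smallest (1K) size bucket.

    `buckets` maps size labels to counts, e.g. {"1K": 429_000_000, "128K": 2}.
    """
    total = sum(buckets.values())
    return buckets.get("1K", 0) / total if total else 0.0

# Write buckets from RECORD 7 for data1-OST0000, read straight off the table:
writes = {"1K": parse_count("429M"), "128K": 2}
print(f"{tiny_io_fraction(writes):.6f}")  # almost every write is in the 1K bucket
```

Run over each OST line of a playback, a fraction near 1.0 flags the kind of
tiny-write-dominated interval seen right after the mount.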
Balagopal
> When the volume is sluggish, do you see lots of tiny reads (500k/sec) on the
> full volume? When it slows down, could you run "iostat -k 2" on the OSS in
> question? I think we may be having the same problem. I could find no
> answer/solution and ended up blowing the whole setup away and starting from
> scratch. I'd like to track this down and figure out whether it's actually a
> bug or whether I FUBAR'd something in my setup.
>
> -Aaron
>
> On Dec 25, 2007, at 9:42 AM, Balagopal Pillai wrote:
>
> >Hi,
> >
> > Please ignore the previous email. The problem seemed to solve itself
> >10 - 15 minutes after mounting the filled volume. Now that volume is as
> >fast as the empty ones.
> >
> >Thanks
> >Balagopal
> >
> >---------- Forwarded message ----------
> >Date: Tue, 25 Dec 2007 10:36:28 -0400 (AST)
> >From: Balagopal Pillai <pillai at mathstat.dal.ca>
> >To: <lustre-discuss at clusterfs.com>
> >Subject: lustre performance question
> >
> >Hi,
> >
> > We have one Lustre volume that is getting full and some other
> >volumes that are totally empty. The full one is a little sluggish at
> >times, with the following messages appearing in syslog on the OSS:
> >
> >Lustre: 5809:0:(filter_io_26.c:698:filter_commitrw_write()) data1-OST0001:
> >slow i_mutex 82s
> >Lustre: 5809:0:(filter_io_26.c:711:filter_commitrw_write()) data1-OST0001:
> >slow brw_start 82s
> >Lustre: 5809:0:(filter_io_26.c:763:filter_commitrw_write()) data1-OST0001:
> >slow direct_io 82s
> >Lustre: 5809:0:(filter_io_26.c:776:filter_commitrw_write()) data1-OST0001:
> >slow commitrw commit 82s
> >
> > But the same two OSSes also export the empty volumes, which are
> >very fast on any test (creating a tar file, bonnie, etc.). I also ran
> >the same test on the NFS-exported backup volume of the filled-up Lustre
> >volume (exported from the same OSS server), and it doesn't show any
> >significant slowdown. Is it normal for Lustre volumes to slow down as
> >they get full?
> >
> >Thanks
> >Balagopal
> >
> >_______________________________________________
> >Lustre-discuss mailing list
> >Lustre-discuss at clusterfs.com
> >https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
>
> (301) 595-7000
> aaron at iges.org
>
>
>