[Lustre-discuss] lustre performance question (contd)
Balagopal Pillai
pillai at mathstat.dal.ca
Tue Dec 25 08:24:11 PST 2007
On Tue, 25 Dec 2007, Aaron Knister wrote:
Looks like there were lots of tiny writes (no reads) at the time this
happened. Can't really blame the tiny writes, though; I was testing by
making a tar file of /usr/lib. Here is the playback from collectl on a
simple tar right after the volume data1 is mounted.
### RECORD 7 >>> lustre1 <<< (1198592822) (Tue Dec 25 10:27:02 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 207 429M 0 0 0 0 0 0 2 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
There is 429M of 1K I/O. Here is the collectl playback from when
data1 starts working normally, about 10 - 15 minutes after mounting.
### RECORD 5 >>> lustre1 <<< (1198598746) (Tue Dec 25 12:05:46 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 18 17K 0 0 1 0 0 0 0 0 17
10 10K 0 0 0 0 0 0 0 0 10
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 6 >>> lustre1 <<< (1198598757) (Tue Dec 25 12:05:57 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 222 3625 7 5 195 15 0 0 0 0 0
1 403 0 0 0 0 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 7 >>> lustre1 <<< (1198598767) (Tue Dec 25 12:06:07 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 53 846 2 1 46 4 0 0 0 0 0
2 1866 0 0 0 0 0 0 0 0 1
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 8 >>> lustre1 <<< (1198598777) (Tue Dec 25 12:06:17 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 11 10K 0 0 1 0 0 0 0 0 10
8 8499 0 0 0 0 0 0 0 0 9
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 9 >>> lustre1 <<< (1198598787) (Tue Dec 25 12:06:27 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 245 3914 2 6 222 15 0 0 0 0 0
2 421 0 0 1 0 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 10 >>> lustre1 <<< (1198598797) (Tue Dec 25 12:06:37 2007)
###
# LUSTRE FILESYSTEM SINGLE OST STATISTICS
#Ost Rds RdK 1K 2K 4K 8K 16K 32K 64K 128K 256K
Wrts WrtK 1K 2K 4K 8K 16K 32K 64K 128K 256K
data1-OST0000 105 1678 5 4 84 12 0 0 0 0 0
3 171 0 0 2 1 0 0 0 0 0
data2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data3-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
data4-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch1-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0000 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
scratch2-OST0001 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
### RECORD 11 >>> lustre1 <<< (1198598807) (Tue Dec 25 12:06:47 2007)
###
This looks normal. I occasionally see the same thing on another
Lustre installation (1.6.0.1), but it goes away after a while.
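As a quick way to quantify what the first record shows, the per-size write
histogram can be folded into a single "tiny I/O" fraction. A minimal sketch
in Python; the helper names and the decimal reading of collectl's K/M count
suffixes are my own assumptions, not part of collectl:

```python
# Rough diagnostic helpers for the collectl BRW histograms shown above.
# This is a sketch, not part of collectl; in particular the assumption
# that collectl's K/M count suffixes are decimal (10^3/10^6) is mine.

def parse_count(token):
    """Turn a collectl-style count token such as '429M' or '17K' into an int."""
    suffixes = {"K": 10**3, "M": 10**6, "G": 10**9}
    if token and token[-1] in suffixes:
        return int(token[:-1]) * suffixes[token[-1]]
    return int(token)

def tiny_io_fraction(buckets):
    """Fraction of I/Os that landed in the smallest (1K) size bucket.

    `buckets` maps size labels to counts, e.g. {"1K": 429_000_000, "128K": 2}.
    """
    total = sum(buckets.values())
    return buckets.get("1K", 0) / total if total else 0.0

# Write buckets from RECORD 7 for data1-OST0000, read straight off the table:
writes = {"1K": parse_count("429M"), "128K": 2}
print(f"{tiny_io_fraction(writes):.6f}")  # almost every write is in the 1K bucket
```

Run over each OST line of a playback, a fraction near 1.0 flags the kind of
tiny-write-dominated interval seen right after the mount.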
Balagopal
> When the volume is sluggish, do you see lots of tiny reads (500k/sec) on the
> full volume? When it slows down, could you run "iostat -k 2" on the OSS in
> question? I think we may be having the same problem. I could find no
> answer/solution and ended up blowing the whole setup away and starting from
> scratch. I'd like to track this down and figure out whether it's actually a
> bug or whether I FUBAR'd something in my setup.
>
> -Aaron
>
> On Dec 25, 2007, at 9:42 AM, Balagopal Pillai wrote:
>
> >Hi,
> >
> > Please ignore the previous email. The problem seemed to solve itself
> >10 - 15 minutes after mounting the filled volume. Now that volume is as
> >fast as the empty ones.
> >
> >Thanks
> >Balagopal
> >
> >---------- Forwarded message ----------
> >Date: Tue, 25 Dec 2007 10:36:28 -0400 (AST)
> >From: Balagopal Pillai <pillai at mathstat.dal.ca>
> >To: <lustre-discuss at clusterfs.com>
> >Subject: lustre performance question
> >
> >Hi,
> >
> > We have one Lustre volume that is getting full and some other
> >volumes that are totally empty. The full one is a little sluggish at
> >times, with the following messages appearing in syslog on the OSS:
> >
> >Lustre: 5809:0:(filter_io_26.c:698:filter_commitrw_write()) data1-OST0001:
> >slow i_mutex 82s
> >Lustre: 5809:0:(filter_io_26.c:711:filter_commitrw_write()) data1-OST0001:
> >slow brw_start 82s
> >Lustre: 5809:0:(filter_io_26.c:763:filter_commitrw_write()) data1-OST0001:
> >slow direct_io 82s
> >Lustre: 5809:0:(filter_io_26.c:776:filter_commitrw_write()) data1-OST0001:
> >slow commitrw commit 82s
> >
> > But the same two OSSes also export the empty volumes, which are
> >very fast on any test (creating a tar file, bonnie, etc.). I also ran
> >the same test on the NFS-exported backup volume of the filled-up Lustre
> >volume (exported from the same OSS server), and it doesn't show any
> >significant slowdown. Is it normal for Lustre volumes to slow down as
> >they get full?
> >
> >Thanks
> >Balagopal
> >
> >_______________________________________________
> >Lustre-discuss mailing list
> >Lustre-discuss at clusterfs.com
> >https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
>
> Aaron Knister
> Associate Systems Analyst
> Center for Ocean-Land-Atmosphere Studies
>
> (301) 595-7000
> aaron at iges.org
>
>
>