[Lustre-discuss] odd quota on single OST
Mueller Eric
eric.mueller@id.ethz.ch
Tue Oct 15 23:33:12 PDT 2013
Hello
I have a quota problem with a new Lustre installation, version 2.4.0.
When generating files in a loop for testing, the quota limit is hit much sooner than expected
(and much sooner than configured), but only for some of the files.
I eventually found that, for some unknown reason, the quota on a single OST (OST001b) is corrupt,
so, depending on the striping, every file that has a stripe on OST001b is throttled by it:
[root@node123]# lfs quota -u eric -v /cluster/scratch_xp
Disk quotas for user eric (uid 804):
Filesystem kbytes quota limit grace files quota limit grace
/cluster/scratch_xp
145400 2147483648 2147483648 - 36136 209715 209715 -
prism-MDT0000_UUID
1712 - 0 - 36136 - 65536 -
prism-OST0000_UUID
4516 - 16777216 - - - - -
prism-OST0001_UUID
4516 - 16777216 - - - - -
prism-OST0002_UUID
4520 - 16777216 - - - - -
prism-OST0003_UUID
4520 - 16777216 - - - - -
prism-OST0004_UUID
4508 - 16777216 - - - - -
prism-OST0005_UUID
4512 - 16777216 - - - - -
prism-OST0006_UUID
4520 - 16777216 - - - - -
prism-OST0007_UUID
4512 - 16777216 - - - - -
prism-OST0008_UUID
4512 - 16777216 - - - - -
prism-OST0009_UUID
4516 - 16777216 - - - - -
prism-OST000a_UUID
4520 - 16777216 - - - - -
prism-OST000b_UUID
4512 - 16777216 - - - - -
prism-OST000c_UUID
4516 - 16777216 - - - - -
prism-OST000d_UUID
4520 - 16777216 - - - - -
prism-OST000e_UUID
4516 - 16777216 - - - - -
prism-OST000f_UUID
4516 - 16777216 - - - - -
prism-OST0010_UUID
4516 - 16777216 - - - - -
prism-OST0011_UUID
4520 - 16777216 - - - - -
prism-OST0012_UUID
4516 - 16777216 - - - - -
prism-OST0013_UUID
4516 - 16777216 - - - - -
prism-OST0014_UUID
4516 - 16777216 - - - - -
prism-OST0015_UUID
4516 - 16777216 - - - - -
prism-OST0016_UUID
4516 - 16777216 - - - - -
prism-OST0017_UUID
4520 - 16777216 - - - - -
prism-OST0018_UUID
4520 - 16777216 - - - - -
prism-OST0019_UUID
4516 - 16777216 - - - - -
prism-OST001a_UUID
4512 - 16777216 - - - - -
prism-OST001b_UUID
3680* - 3680 - - - - -
prism-OST001c_UUID
4516 - 16777216 - - - - -
prism-OST001d_UUID
4516 - 16777216 - - - - -
prism-OST001e_UUID
4520 - 16777216 - - - - -
prism-OST001f_UUID
4520 - 16777216 - - - - -
[root@node123]# perl -e 'printf("p-oss0%d\n",0x1b%8+1)'
p-oss04
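(For anyone outside our site: the one-liner just encodes our naming scheme, where the OSTs are spread round-robin over the OSS hosts p-oss01..p-oss08, so index 0x1b = 27 lands on 27 % 8 + 1 = 4. The full mapping, as a sketch of that convention:)

```shell
# Site-specific sketch: 32 OSTs spread round-robin over p-oss01..p-oss08,
# i.e. OST i is served by p-oss(i % 8 + 1).
for i in $(seq 0 31); do
  printf 'OST%04x -> p-oss%02d\n' "$i" $(( i % 8 + 1 ))
done
```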
I did not observe anything unusual in the /var/log/messages files of p-oss04.
[root@p-oss04 quota_slave]# grep -wA1 804 /proc/fs/lustre/osd-ldiskfs/prism-OST001b/quota_slave/limit_user
limit_user:- id: 804
limit_user- limits: { hard: 2147483648, soft: 2147483648, granted: 0, time: 0 }
[root@p-oss04 quota_slave]# cat acct_user
usr_accounting:
- id: 0
usage: { inodes: 3557, kbytes: 1028 }
- id: 804
usage: { inodes: 1132, kbytes: 3680 }
[root@p-oss04 quota_slave]#
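To compare the slave side across all OSTs on one OSS in a single pass (paths as they appear in our 2.4.0 /proc tree; uid and fsname are of course site-specific):

```shell
# Dump the per-OST quota-slave limits for uid 804 on this OSS.
# Paths as observed on our 2.4.0 ldiskfs OSTs; adjust uid/fsname as needed.
for f in /proc/fs/lustre/osd-ldiskfs/prism-OST*/quota_slave/limit_user; do
  echo "== $f"
  grep -wA1 804 "$f"
done
```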
I have also tried turning quotas for user and group off completely and back on again. That did
not help either:
[root@p-mds1 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.prism-MDT0000.quota_slave.info=
target name: prism-MDT0000
pool ID: 0
type: md
quota enabled: ug
conn to master: setup
space acct: ug
user uptodate: glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
[root@p-mds1 ~]# lctl conf_param prism.quota.ost=none
[root@p-mds1 ~]# lctl conf_param prism.quota.mdt=none
[root@p-mds1 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.prism-MDT0000.quota_slave.info=
target name: prism-MDT0000
pool ID: 0
type: md
quota enabled: none
conn to master: setup
space acct: ug
user uptodate: glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
[root@p-mds1 ~]# lctl conf_param prism.quota.mdt=ug
[root@p-mds1 ~]# lctl conf_param prism.quota.ost=ug
[root@p-mds1 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.prism-MDT0000.quota_slave.info=
target name: prism-MDT0000
pool ID: 0
type: md
quota enabled: ug
conn to master: setup
space acct: ug
user uptodate: glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
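One thing I have not yet tried: if I understand the 2.4 quota code correctly, each quota slave exposes a force_reint entry that makes it re-fetch its limits from the master. The parameter name is from memory, so please correct me if it does not exist:

```shell
# Hypothetical recovery attempt (parameter name from memory, unverified):
# force the quota slave on the bad OST to reintegrate with the quota master.
lctl set_param osd-ldiskfs.prism-OST001b.quota_slave.force_reint=1
```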
How can I recover the quota on OST001b?
Thanks!
- Eric